Neural Networks for Machine Learning: Lecture 2 Quiz

Article link: https://blog.csdn.net/GarfieldEr007/article/details/50598130

Question 1

If the output of a model is given by $y = f(x; W)$, then which of the following choices for $f$ are most appropriate when the task is binary classification?

Linear

Binary threshold

Linear threshold

Logistic sigmoid
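
To make the four candidates concrete, here is a minimal Python sketch of each one applied to a pre-activation value z computed from x and W. The function names, the threshold at zero, and the reading of "linear threshold" as a rectified linear unit, max(0, z), are assumptions made for this illustration and are not part of the quiz.

```python
import math

def linear(z):
    # Identity output: unbounded in both directions.
    return z

def binary_threshold(z, theta=0.0):
    # Hard 0/1 decision; theta = 0 is an assumed threshold.
    return 1 if z >= theta else 0

def linear_threshold(z):
    # Assumed to mean a rectified linear unit: zero below 0, linear above.
    return max(0.0, z)

def logistic_sigmoid(z):
    # Smooth output strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, 0.0, 2.0):
    print(z, linear(z), binary_threshold(z), linear_threshold(z), logistic_sigmoid(z))
```

Under these assumptions, the binary threshold returns a hard 0/1 label and the sigmoid returns a value that can be read as a class probability, while the other two outputs are unbounded.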

Question 2

After learning using the Perceptron algorithm, how easy is it to express the learned weight vector in terms of the input vectors and the initial weight vector? Assume the input vectors have real-valued components.

It requires real numbers.

It requires only one integer per training case.

It is impossible.

It requires one bit per training case.
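
The perceptron rule only ever adds or subtracts whole input vectors, so one way to explore this question is to count those additions and subtractions per training case. The sketch below does that on a small made-up dataset; the data, the initial weights, the absence of a bias, and the helper name train_perceptron are all hypothetical choices for illustration.

```python
def train_perceptron(inputs, targets, w0, epochs=20):
    """Perceptron learning without a bias; also returns the net update count per case."""
    w = list(w0)
    counts = [0] * len(inputs)  # +1 each time a case is added, -1 each time it is subtracted
    for _ in range(epochs):
        changed = False
        for i, (x, t) in enumerate(zip(inputs, targets)):
            y = 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else 0
            if y != t:
                sign = 1 if t == 1 else -1
                w = [wj + sign * xj for wj, xj in zip(w, x)]
                counts[i] += sign
                changed = True
        if not changed:
            break
    return w, counts

# Hypothetical real-valued training cases and initial weights.
inputs = [(1.5, 0.5), (0.5, 2.0), (-1.0, -0.5), (-0.5, -1.5)]
targets = [1, 1, 0, 0]
w0 = (0.25, -0.75)

w, counts = train_perceptron(inputs, targets, w0)

# Rebuild the learned weights from w0, the inputs, and one integer per case.
rebuilt = [w0[j] + sum(c * x[j] for c, x in zip(counts, inputs)) for j in range(len(w0))]
print(w, counts, rebuilt)
```

In this sketch the learned vector is recovered from the initial vector, the input vectors, and a single signed integer for each training case.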

Question 3

Suppose we are given three data points:

x → t
(1, 0) → 1
(1, 1) → 1
(0, 1) → 0

Furthermore, we are given the following weight vector (where the bias is set to 0):

$w = (0, -3)$

Let $\|w^{(t)} - w^{(t-1)}\|_2$ be the distance between the weight vectors at iteration $t$ and iteration $t-1$ of the perceptron learning algorithm. Here, for a given 2D vector $v$, $\|v\|_2 = \sqrt{v_1^2 + v_2^2}$ (this is also called the Euclidean norm). What is the maximum amount by which the weight vectors can change between successive iterations? Note that in this example we are not learning the bias.

$2\sqrt{2}$

$\sqrt{2}$
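
A perceptron update adds or subtracts the misclassified input vector, so the size of a single weight change equals the Euclidean norm of that input. The sketch below works this through, reading the three data points as (1, 0) → 1, (1, 1) → 1 and (0, 1) → 0 and assuming the unit outputs 1 when the weighted sum is at least zero; both the reading of the data and the threshold convention are assumptions.

```python
import math

# Assumed reading of the three training cases: (input vector, target).
data = [((1, 0), 1), ((1, 1), 1), ((0, 1), 0)]
w = (0, -3)  # given weight vector, bias fixed at 0

def predict(w, x):
    # Binary threshold unit; "fire at >= 0" is an assumed convention.
    return 1 if w[0] * x[0] + w[1] * x[1] >= 0 else 0

def update_size(w, x, t):
    """Euclidean norm of the weight change caused by one training case."""
    if predict(w, x) == t:
        return 0.0  # correctly classified: weights stay the same
    return math.hypot(x[0], x[1])  # w changes by +x or -x

sizes = [update_size(w, x, t) for x, t in data]
largest_possible = max(math.hypot(*x) for x, _ in data)
print(sizes, largest_possible)
```

Here largest_possible is the norm of the longest input vector, which bounds the change from any single update regardless of the current weights.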

Question 4

Suppose that we have a perceptron with weight vector $w$ and we create a new set of weights $w^* = cw$ by scaling $w$ by some positive constant $c$. Assume that the bias is zero.

True or false: if the perceptron now uses $w^*$ instead, then its classification decisions might change (that is, we have moved the classification boundary).

True

False
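
With a zero bias, the decision depends only on the sign of the weighted sum, and multiplying the weights by a positive constant multiplies that sum by the same constant. The following sketch checks this numerically; the weight vector, the test points, and the scaling constants are hypothetical values chosen for illustration.

```python
def classify(w, x):
    # Binary threshold unit with zero bias.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

w = (2.0, -1.0)  # hypothetical weight vector
points = [(1.0, 1.0), (1.0, 3.0), (-2.0, 0.5), (3.0, -1.0)]  # hypothetical inputs

same = True
for c in (0.1, 1.0, 7.5):  # positive scaling constants
    scaled = tuple(c * wi for wi in w)
    for x in points:
        if classify(w, x) != classify(scaled, x):
            same = False

print("decisions unchanged under positive scaling:", same)
```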

Question 5

Suppose that we have a perceptron with weight vector $w$ and we create a new set of weights $w^* = w + c$ by adding some constant vector $c$ to $w$. Assume that the bias is zero.

True or false: if the perceptron now uses $w^*$ instead, then its classification decisions might change (that is, we have moved the classification boundary).

False

True
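
Adding a vector to the weights can rotate the weight vector, and with it the decision boundary through the origin. The sketch below shows one hypothetical choice of weights, added vector, and input for which the decision flips; all three values are made up for illustration.

```python
def classify(w, x):
    # Binary threshold unit with zero bias.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

w = (1.0, 0.0)   # hypothetical original weights
c = (-2.0, 0.0)  # hypothetical constant vector added to w
w_star = tuple(wi + ci for wi, ci in zip(w, c))

x = (1.0, 0.0)   # hypothetical input
before = classify(w, x)
after = classify(w_star, x)
print(before, after)
```

A single counterexample like this is enough to show that the decisions can change, although for other choices of c they may stay the same.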

Question 6

Suppose we are given four training cases:

x → t
(1, 1) → 1
(1, 0) → 0
(0, 1) → 0
(0, 0) → 1

It is impossible for a binary threshold unit to produce the desired target outputs for all four cases. Now suppose that we add an extra input dimension so that each of the four input vectors consists of three numbers instead of two.
Which of the following ways of setting the value of the extra input will create a set of four input vectors that is linearly separable (i.e. that can be given the right target values by a binary threshold unit with appropriate weights and bias)?

Make the third value be 1 for one of the four input vectors and 0 for the other three.

Make the third value of each input vector be the same as the target value for that input vector.

Make the third value of each input vector be the opposite of the first value (i.e. use 1 if the first value is 0 and 0 if the first value is 1)

Make the third value of each input vector be the same as the first value.
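
One way to experiment with these options is to search a small grid of integer weights and biases for a binary threshold unit that reproduces all four targets. The sketch below reads the four cases as (1, 1) → 1, (1, 0) → 0, (0, 1) → 0 and (0, 0) → 1. The grid range, the helper names, and the choice of the first vector for the "single 1" option are assumptions; finding weights proves separability, whereas finding none on this grid is only evidence against it.

```python
from itertools import product

# Assumed reading of the four training cases: (input vector, target).
cases = [((1, 1), 1), ((1, 0), 0), ((0, 1), 0), ((0, 0), 1)]

def find_weights(points, grid=range(-4, 5)):
    """Return (w1, w2, w3, b) matching every target, or None if the grid has none."""
    for w1, w2, w3, b in product(grid, repeat=4):
        if all((1 if w1 * x[0] + w2 * x[1] + w3 * x[2] + b >= 0 else 0) == t
               for x, t in points):
            return (w1, w2, w3, b)
    return None

def augment(third):
    """Append third(index, x, t) as the extra input of every case."""
    return [((x[0], x[1], third(i, x, t)), t) for i, (x, t) in enumerate(cases)]

options = {
    "1 for a single vector (here the first)": augment(lambda i, x, t: 1 if i == 0 else 0),
    "same as target": augment(lambda i, x, t: t),
    "opposite of first value": augment(lambda i, x, t: 1 - x[0]),
    "same as first value": augment(lambda i, x, t: x[0]),
}

results = {name: find_weights(points) for name, points in options.items()}
for name, found in results.items():
    print(name, "->", found)
```

When the extra input is a function of the first input alone, the weighted sum is still linear in the original two inputs, which is consistent with the search finding nothing for those two options.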

Question 7

Brian wants to use a neural network to predict the price of a stock tomorrow given today's price and the price over the last 10 days. The inputs to this network are the prices over the last 10 days and the output is tomorrow's price. The hidden units in this network receive information from the layer below, transmit information to the layer above, and do not send information within the same layer. Is this an example of a feed-forward network or a recurrent network?

Feed-forward

Recurrent

Question 8

Brian and Andy are having an argument about the perceptron algorithm. They have a dataset that the perceptron cannot seem to classify (that is, it fails to converge to a solution). Andy reasons that if he could collect more examples, that might solve the problem by making the data set linearly separable and then the perceptron algorithm will converge. Brian claims that collecting more examples will not help. Which one of them is correct?

Andy

Brian