Question 1
If the output of a model is given by $y = f(x; W)$, then which of the following choices for $f$ are most appropriate when the task is binary classification?
Question 2
After learning using the Perceptron algorithm, how easy is it to express the learned weight vector in terms of the input vectors and the initial weight vector? Assume the input vectors have real-valued components.
Question 3
Suppose we are given three data points:
x = (1, 0) → t = 1
x = (1, 1) → t = 1
x = (0, 1) → t = 0
Furthermore, we are given the following weight vector (where the bias is set to 0): $w = (0, -3)$
Let $\|w^{(t)} - w^{(t-1)}\|_2$ be the distance between the weight vectors at iteration $t$ and iteration $t-1$ of the perceptron learning algorithm. Here, for a given 2D vector $v$, $\|v\|_2 = \sqrt{v_1^2 + v_2^2}$ (this is also called the Euclidean norm). What is the maximum amount by which the weight vectors can change between successive iterations? Note that in this example we are not learning the bias.
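One way to explore this setup is to run the perceptron learning rule on the three points above and record how far the weights move at each step. The following is only a minimal sketch: it assumes the common convention that the unit outputs 1 when $w \cdot x \ge 0$ and 0 otherwise, and that the training cases are visited in the order listed. The names `perceptron_step` and `weight_changes` are illustrative and not part of the quiz.

```python
import math

def perceptron_step(w, x, t):
    """One perceptron update for a binary threshold unit without a bias."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    y = 1 if z >= 0 else 0  # assumed convention: output 1 when z >= 0
    # (t - y) is +1 (add x), -1 (subtract x), or 0 (leave w unchanged)
    return [wi + (t - y) * xi for wi, xi in zip(w, x)]

def weight_changes(w, data, max_epochs=10):
    """Cycle through the data, recording the Euclidean size of each weight change."""
    changes = []
    for _ in range(max_epochs):
        updated = False
        for x, t in data:
            w_new = perceptron_step(w, x, t)
            changes.append(math.dist(w_new, w))
            updated = updated or (w_new != w)
            w = w_new
        if not updated:  # a full pass with no mistakes: stop
            break
    return w, changes

data = [((1, 0), 1), ((1, 1), 1), ((0, 1), 0)]
w_final, changes = weight_changes([0, -3], data)
print(w_final, max(changes))
```

Because each update adds or subtracts a single input vector, the size of any one change is the Euclidean norm of the misclassified input.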
Question 4
Suppose that we have a perceptron with weight vector $w$ and we create a new set of weights $w^* = cw$ by scaling $w$ by some positive constant $c$. Assume that the bias is zero.
True or false: if the perceptron now uses $w^*$ instead, then its classification decisions might change (that is, we have moved the classification boundary).
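The effect of scaling can be checked empirically with a small sketch such as the one below. The weight vector, the constant `c`, and the grid of test points are hypothetical values chosen for illustration, and the threshold convention (output 1 when $w \cdot x \ge 0$) is an assumption. Integer values are used so that the arithmetic is exact.

```python
def predict(w, x):
    """Binary threshold unit with zero bias (assumed: output 1 when w.x >= 0)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

w = (2, -1)                          # hypothetical weight vector
c = 3                                # hypothetical positive constant
w_star = tuple(c * wi for wi in w)   # scaled weights w* = c w

# Compare the two classifiers on a small grid of integer points.
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
disagreements = [x for x in grid if predict(w, x) != predict(w_star, x)]
print(disagreements)
```

The check rests on the identity $w^* \cdot x = c\,(w \cdot x)$: multiplying by a positive $c$ cannot change the sign of the total input.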
Question 5
Suppose that we have a perceptron with weight vector $w$ and we create a new set of weights $w^* = w + c$ by adding some constant vector $c$ to $w$. Assume that the bias is zero.
True or false: if the perceptron now uses $w^*$ instead, then its classification decisions might change (that is, we have moved the classification boundary).
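A single concrete case is enough to probe what adding a vector can do. The sketch below uses a hypothetical weight vector and a hypothetical offset `c`, and assumes the unit outputs 1 when $w \cdot x \ge 0$. Other choices of `c` (for example the zero vector, or a positive multiple of `w`) would behave differently, so this illustrates one case rather than a general rule.

```python
def predict(w, x):
    """Binary threshold unit with zero bias (assumed: output 1 when w.x >= 0)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

w = (2, -1)                                      # hypothetical weight vector
c = (-3, 3)                                      # hypothetical constant vector
w_star = tuple(wi + ci for wi, ci in zip(w, c))  # shifted weights w* = w + c

# Collect the grid points whose predicted class differs between w and w*.
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
changed = [x for x in grid if predict(w, x) != predict(w_star, x)]
print(len(changed), "of", len(grid), "grid points change class")
```

Adding a vector generally changes the direction of the weight vector, and the direction determines the orientation of the decision boundary through the origin.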
Question 6
Suppose we are given four training cases:
x = (1, 1) → t = 1
x = (1, 0) → t = 0
x = (0, 1) → t = 0
x = (0, 0) → t = 1
It is impossible for a binary threshold unit to produce the desired target outputs for all four cases. Now suppose that we add an extra input dimension so that each of the four input vectors consists of three numbers instead of two.
Which of the following ways of setting the value of the extra input will create a set of four input vectors that is linearly separable (i.e., that can be given the right target values by a binary threshold unit with appropriate weights and bias)?
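A candidate rule for the extra input can be tested by appending it to each training case and checking whether perceptron learning reaches an error-free pass. The sketch below is only illustrative: the helper names `converges` and `extra` are hypothetical, the threshold convention (output 1 when the total input is $\ge 0$) is assumed, and the rule shown (the product of the two original inputs) is an example that is not necessarily among the quiz's answer choices. A `False` result only means that no error-free pass occurred within `max_epochs`; it is not by itself a proof that the data is not linearly separable.

```python
def converges(inputs, targets, max_epochs=1000):
    """Return True if perceptron learning (with a bias) reaches an error-free epoch."""
    w = [0] * len(inputs[0])
    b = 0
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(inputs, targets):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            y = 1 if z >= 0 else 0  # assumed threshold convention
            if y != t:
                errors += 1
                # Add the input when the target is 1, subtract it when the target is 0.
                w = [wi + (t - y) * xi for wi, xi in zip(w, x)]
                b += t - y
        if errors == 0:
            return True
    return False

cases = [(1, 1), (1, 0), (0, 1), (0, 0)]
targets = [1, 0, 0, 1]

def extra(x):
    """Hypothetical rule for the third input: product of the first two."""
    return x[0] * x[1]

augmented = [x + (extra(x),) for x in cases]
print(converges(cases, targets), converges(augmented, targets))
```

Swapping in a different body for `extra` lets each candidate setting be tried in turn.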
Question 7
Brian wants to use a neural network to predict the price of a stock tomorrow given today's price and the prices over the last 10 days. The inputs to this network are the prices over the last 10 days and the output is tomorrow's price. The hidden units in this network receive information from the layer below, transmit information to the layer above, and do not send information within the same layer. Is this an example of a feed-forward network or a recurrent network?
Question 8
Brian and Andy are having an argument about the perceptron algorithm. They have a dataset that the perceptron cannot seem to classify (that is, it fails to converge to a solution). Andy reasons that if he could collect more examples, that might solve the problem by making the dataset linearly separable, and then the perceptron algorithm would converge. Brian claims that collecting more examples will not help. Which one of them is correct?