1. Consider the problem of predicting how well a student does in her second year of college/university, given how well she did in her first year.
Specifically, let x be equal to the number of "A" grades (including A-, A, and A+ grades) that a student receives in their first year of college (freshman year). We would like to predict the value of y, which we define as the number of "A" grades they get in their second year (sophomore year).
Here each row is one training example. Recall that in linear regression, our hypothesis is $h_\theta(x) = \theta_0 + \theta_1 x$, and we use $m$ to denote the number of training examples.
For the training set given above (note that this training set may also be referenced in other questions in this quiz), what is the value of $m$? In the box below, please enter your answer (which should be a number between 0 and 10).
2. For this question, assume that we are using the training set from Q1. Recall that our definition of the cost function was $J(\theta_0,\theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$. What is $J(0,1)$? In the box below, please enter your answer (simplify fractions to decimals when entering your answer, and use '.' as the decimal delimiter, e.g., 1.5).
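As a rough illustration of how the cost function is evaluated, here is a minimal Python sketch. The helper names `hypothesis` and `cost` are made up for this example, and the data points are hypothetical: they are not the quiz's training set, which is not reproduced in this text.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    m = len(xs)
    squared_errors = [
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    ]
    return sum(squared_errors) / (2 * m)


# Hypothetical data for illustration only; NOT the quiz's training set.
example_xs = [1, 2, 3]
example_ys = [2, 2, 4]

# With theta0 = 0 and theta1 = 1 the errors are -1, 0, -1,
# so J = (1 + 0 + 1) / (2 * 3) = 2/6, about 0.333.
example_cost = cost(0, 1, example_xs, example_ys)
print(example_cost)
```

To answer the question itself, the same function would be called with the x and y columns of the quiz's training set in place of the hypothetical lists.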
3. Suppose we set $\theta_0 = -1$, $\theta_1 = 0.5$. What is $h_\theta(4)$?
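The evaluation is a direct substitution into the hypothesis. A small Python sketch (the function name `hypothesis` is just an illustrative choice) makes the arithmetic explicit:

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# theta0 = -1, theta1 = 0.5, x = 4  ->  -1 + 0.5 * 4 = 1.0
prediction = hypothesis(-1, 0.5, 4)
print(prediction)  # prints 1.0
```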
4. In the given figure, the cost function $J(\theta_0,\theta_1)$ has been plotted against $\theta_0$ and $\theta_1$, as shown in 'Plot 2'. The contour plot for the same cost function is given in 'Plot 1'. Based on the figure, choose the correct options (check all that apply).
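Point P (the global minimum of Plot 2) corresponds to point A of Plot 1. (Note: point A in the left plot corresponds to point B in the right plot, and gradient descent eventually converges at A, so the first and third options, A and C, are selected.)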
If we start from point B, gradient descent with a well-chosen learning rate will eventually bring us to or near point A, as the value of the cost function $J(\theta_0,\theta_1)$ is maximum at point A.
If we start from point B, gradient descent with a well-chosen learning rate will eventually bring us to or near point A, as the value of the cost function $J(\theta_0,\theta_1)$ is minimum at point A.
Point P (the global minimum of Plot 2) corresponds to point C of Plot 1.
If we start from point B, gradient descent with a well-chosen learning rate will eventually bring us to or near point C, as the value of the cost function $J(\theta_0,\theta_1)$ is minimum at point C.
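The idea behind these options, that gradient descent with a suitable learning rate moves from a starting point toward the minimum of J, can be sketched in Python. Everything specific here is an assumption made for illustration: the function name `gradient_descent`, the learning rate `alpha=0.1`, the iteration count, the starting point (0, 0), and the data, which was chosen so that the minimum lies at θ0 = 1, θ1 = 2. It does not reproduce the quiz's figure.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x.

    alpha and iterations are illustrative choices, not values from the quiz.
    """
    theta0, theta1 = 0.0, 0.0  # arbitrary starting point
    m = len(xs)
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical data lying exactly on y = 1 + 2x, so J is minimized at (1, 2).
demo_xs = [1, 2, 3]
demo_ys = [3, 5, 7]

fitted_theta0, fitted_theta1 = gradient_descent(demo_xs, demo_ys)
print(fitted_theta0, fitted_theta1)  # values close to 1 and 2
```

A learning rate that is too large for the data would make the updates diverge instead, which is why the options specify a well-chosen learning rate.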
5. Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some $\theta_0$, $\theta_1$ such that $J(\theta_0,\theta_1) = 0$.
Which of the statements below must then be true? (Check all that apply.)
We can perfectly predict the value of y even for new examples that we have not yet seen (e.g., we can perfectly predict prices of even new houses that we have not yet seen). (Note: even if our cost function is 0, we cannot predict the prices of unseen houses with certainty.)
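For these values of $\theta_0$ and $\theta_1$ that satisfy $J(\theta_0,\theta_1) = 0$, we have that $h_\theta(x^{(i)}) = y^{(i)}$ for every training example $(x^{(i)}, y^{(i)})$. (Note: correct by definition, since a sum of squares is zero only when every term is zero.)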
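For this to be true, we must have $\theta_0 = 0$ and $\theta_1 = 0$ so that $h_\theta(x) = 0$. (Note: if both were 0, then $h_\theta(x) = 0$ for every $x$, and $J$ would be 0 only if every $y$ were 0, in which case the prediction would be meaningless.)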
This is not possible: by the definition of $J(\theta_0,\theta_1)$, it is not possible for there to exist $\theta_0$ and $\theta_1$ so that $J(\theta_0,\theta_1) = 0$.
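The options in this question can be checked against a small Python sketch. The training data, the new example, and the helper name `cost` are all hypothetical, chosen only so that a perfect fit on the training set exists with nonzero parameters.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical training set lying exactly on the line y = 1 + 2x.
train_xs = [1, 2, 3]
train_ys = [3, 5, 7]

# Nonzero parameters that fit every training example exactly, so J = 0.
theta0, theta1 = 1, 2
training_cost = cost(theta0, theta1, train_xs, train_ys)
print(training_cost)  # prints 0.0

# A hypothetical unseen example that does not lie on the fitted line.
new_x, new_y = 4, 8.5
new_error = (theta0 + theta1 * new_x) - new_y  # 9 - 8.5
print(new_error)  # prints 0.5
```

In this sketch J = 0 is attainable, the parameters achieving it are not zero, every training example is predicted exactly, and an unseen example is still predicted with some error.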