Coursera Machine Learning Week 1 quiz: Linear Regression with One Variable (answers)

1. Consider the problem of predicting how well a student does in her second year of college/university, given how well she did in her first year.

Specifically, let x be equal to the number of "A" grades (including A-, A and A+ grades) that a student receives in their first year of college (freshman year). We would like to predict the value of y, which we define as the number of "A" grades they get in their second year (sophomore year).

Here each row is one training example. Recall that in linear regression, our hypothesis is  hθ(x)=θ0+θ1x , and we use  m  to denote the number of training examples.

For the training set given above (note that this training set may also be referenced in other questions in this quiz), what is the value of  m ? In the box below, please enter your answer (which should be a number between 0 and 10).


2. For this question, assume that we are using the training set from Q1. Recall our definition of the cost function was $J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$.

What is J(0,1)? In the box below, please enter your answer (simplify fractions to decimals when entering your answer, and use '.' as the decimal delimiter, e.g., 1.5).

Write out the formula and substitute the values; the answer is 0.5.
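As a rough sketch of that substitution, the cost can be computed in a few lines of Python. The quiz's training table is not reproduced in this post, so the four (x, y) pairs below are an assumed stand-in, and the names `hypothesis` and `cost` are illustrative only.

```python
# Assumed stand-in training set (the quiz's own table is not shown in this post).
xs = [3, 1, 0, 4]
ys = [2, 2, 1, 3]


def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum of (h(x_i) - y_i)^2."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


print(cost(0, 1, xs, ys))  # 0.5 for this assumed data
```

With θ0=0 and θ1=1 the hypothesis is simply hθ(x)=x, so each squared error here is 1 and J = 4/(2·4) = 0.5.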

3. Suppose we set θ0=−1, θ1=0.5. What is hθ(4)?

Write out the formula and substitute the values: hθ(4) = −1 + 0.5·4 = 1, so the answer is 1.
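The evaluation can be checked with a one-line function. This is a minimal sketch that assumes θ0 = −1, the value consistent with the stated answer of 1; with θ0 = +1 the same formula would give 3. The function name `hypothesis` is illustrative.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Assumed parameters: theta0 = -1, theta1 = 0.5
print(hypothesis(-1, 0.5, 4))  # 1.0

# For comparison, theta0 = +1 does not reproduce the stated answer
print(hypothesis(1, 0.5, 4))   # 3.0
```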

4. In the given figure, the cost function J(θ0,θ1) has been plotted against θ0 and θ1, as shown in 'Plot 2'. The contour plot for the same cost function is given in 'Plot 1'. Based on the figure, choose the correct options (check all that apply).

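Point P (the global minimum of plot 2) corresponds to point A of Plot 1.

Answer: point A in the left-hand plot corresponds to the minimum P in the right-hand plot, and gradient descent eventually converges at A, so the first and third options are the ones to check.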

If we start from point B, gradient descent with a well-chosen learning rate will eventually help us reach at or near point A, as the value of cost function  J(θ0,θ1)  is maximum at point A.

If we start from point B, gradient descent with a well-chosen learning rate will eventually help us reach at or near point A, as the value of cost function  J(θ0,θ1)  is minimum at A.

Point P (The global minimum of plot 2) corresponds to point C of Plot 1.

If we start from point B, gradient descent with a well-chosen learning rate will eventually help us reach at or near point C, as the value of cost function  J(θ0,θ1)  is minimum at point C.
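The claim that gradient descent with a suitable learning rate moves toward the minimum of J can be illustrated with a short sketch. The training set, the learning rate `alpha=0.1`, the step count, and the starting point (0, 0) are all assumptions chosen for illustration; they are not the data behind the quiz's plots.

```python
# Assumed toy training set; not the data behind the quiz's plots.
xs = [3, 1, 0, 4]
ys = [2, 2, 1, 3]
m = len(xs)


def cost(theta0, theta1):
    """Squared-error cost J(theta0, theta1) on the toy data."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(theta0, theta1, alpha=0.1, steps=2000):
    """Batch gradient descent with simultaneous updates of both parameters."""
    history = [cost(theta0, theta1)]
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Both parameters are updated from the same set of errors.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
        history.append(cost(theta0, theta1))
    return theta0, theta1, history


theta0, theta1, history = gradient_descent(0.0, 0.0)
print(round(theta0, 3), round(theta1, 3))  # parameters after the final step
print(history[0], history[-1])             # cost at the start vs. at the end
```

For this toy data the least-squares solution is θ0 = 1.2, θ1 = 0.4, and the run ends very close to it with a lower cost than it started with, which mirrors moving from a high-cost point such as B toward the minimum at A.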

5. Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some θ0, θ1 such that J(θ0,θ1)=0.

Which of the statements below must then be true? (Check all that apply.)

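We can perfectly predict the value of y even for new examples that we have not yet seen.

(e.g., we can perfectly predict prices of even new houses that we have not yet seen.)

Incorrect: even if the cost function is 0 on the training set, we cannot be sure of predicting the prices of houses we have not yet seen.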

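For these values of θ0 and θ1 that satisfy J(θ0,θ1)=0, we have that hθ(x(i))=y(i) for every training example (x(i),y(i)).

Correct: this follows directly from the definition of J, since a sum of squares is 0 only when every term is 0.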

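For this to be true, we must have θ0=0 and θ1=0 so that hθ(x)=0.

Incorrect: J=0 does not require both parameters to be 0. With θ0=0 and θ1=0 we get hθ(x)=0, which gives J=0 only if every y(i) is 0, and such a prediction would be meaningless.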

This is not possible: By the definition of J(θ0,θ1), it is not possible for there to exist θ0 and θ1 so that J(θ0,θ1)=0.
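A small sketch can make these options concrete. The training set below is hypothetical, constructed to lie exactly on the line y = 1 + 2x, and the "new example" is likewise made up; neither comes from the quiz.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum of (h(x_i) - y_i)^2."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical training set lying exactly on y = 1 + 2x.
train_x = [1, 2, 3]
train_y = [3, 5, 7]
theta0, theta1 = 1, 2

# J = 0 is possible, and it means every training example is fit exactly.
print(cost(theta0, theta1, train_x, train_y))  # 0.0
print(all(theta0 + theta1 * x == y for x, y in zip(train_x, train_y)))  # True

# A hypothetical unseen example need not lie on the same line.
new_x, new_y = 4, 10
print(theta0 + theta1 * new_x, new_y)  # 9 10

# theta0 = theta1 = 0 is not what makes J zero on this data.
print(cost(0, 0, train_x, train_y) > 0)  # True
```

So J=0 is achievable with nonzero parameters, it guarantees hθ(x(i))=y(i) on the training set, and it says nothing certain about examples outside that set.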


