Linear Regression with One Variable
Consider the problem of predicting how well a student does in their second year of college/university, given how well they did in their first year.
Specifically, let x be equal to the number of "A" grades (including A-, A, and A+ grades) that a student receives in their first year of college (freshman year). We would like to predict the value of y, which we define as the number of "A" grades they get in their second year (sophomore year).
Refer to the following training set, a small sample of different students' performances (note that this training set will also be referenced in other questions in this quiz). Here each row is one training example. Recall that in linear regression, our hypothesis is $$h_\theta(x) = \theta_0 + \theta_1x$$, and we use $$m$$ to denote the number of training examples.
For the training set given above, what is the value of $$m$$? In the box below, please enter your answer (which should be a number between 0 and 10).
Consider the following training set of $$m=4$$ training examples:
| x | y   |
|---|-----|
| 1 | 0.5 |
| 2 | 1   |
| 4 | 2   |
| 0 | 0   |
Consider the linear regression model $$h_\theta(x) = \theta_0 + \theta_1x$$. What are the values of $$\theta_0$$ and $$\theta_1$$ that you would expect to obtain upon running gradient descent on this model? (Linear regression will be able to fit this data perfectly; see the sketch after the answer options.)
$$\theta_0 = 0 , \theta_1 = 0.5$$
$$\theta_0 = 1, \theta_1 = 1$$
$$\theta_0 = 0.5, \theta_1 = 0$$
$$\theta_0 = 0.5, \theta_1 = 0.5$$
$$\theta_0 = 1, \theta_1 = 0.5$$
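As a sanity check on the options above, here is a minimal sketch of batch gradient descent on this training set (plain Python; the learning rate, iteration count, and variable names are my own illustrative choices, not part of the quiz). Because the data lie exactly on the line $$y = 0.5x$$, the parameters converge to approximately $$\theta_0 = 0$$, $$\theta_1 = 0.5$$:

```python
# Batch gradient descent for h(x) = theta0 + theta1 * x on the
# four-example training set above. Learning rate and iteration
# count are arbitrary choices for illustration.
xs = [1.0, 2.0, 4.0, 0.0]
ys = [0.5, 1.0, 2.0, 0.0]
m = len(xs)

theta0, theta1 = 0.0, 0.0
alpha = 0.1  # learning rate

for _ in range(5000):
    # Prediction errors h(x_i) - y_i under the current parameters.
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    # Simultaneous update: both gradients use the old parameter values.
    grad0 = sum(errors) / m
    grad1 = sum(e * x for e, x in zip(errors, xs)) / m
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(theta0, theta1)  # ~0.0, ~0.5
```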
Suppose we set $$\theta_0 = -2, \theta_1 = 0.5$$. What is $$h_{\theta}(6)$$?
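As a quick check, substituting the given values into the hypothesis gives

$$h_\theta(6) = \theta_0 + \theta_1 \cdot 6 = -2 + 0.5 \cdot 6 = 1.$$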
Let $$f$$ be some function so that $$f(\theta_0, \theta_1)$$ outputs a number. For this problem, $$f$$ is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so $$f$$ may have local optima). Suppose we use gradient descent to try to minimize $$f(\theta_0, \theta_1)$$ as a function of $$\theta_0$$ and $$\theta_1$$. Which of the following statements are true? (Check all that apply; a short sketch after the statements makes the update rule concrete.)
If $$\theta_0$$ and $$\theta_1$$ are initialized at a local minimum, then one iteration will not change their values.
If $$\theta_0$$ and $$\theta_1$$ are initialized so that $$\theta_0 = \theta_1$$, then by symmetry (because we do simultaneous updates to the two parameters), after one iteration of gradient descent, we will still have $$\theta_0 = \theta_1$$.
If the learning rate is too small, then gradient descent may take a very long time to converge.
Even if the learning rate $$\alpha$$ is very large, every iteration of gradient descent will decrease the value of $$f(\theta_0, \theta_1)$$.
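To make these behaviors concrete, here is a small sketch of the simultaneous update $$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} f(\theta_0, \theta_1)$$ (plain Python; the quadratic choice of $$f$$ and the step sizes are hypothetical, picked only for illustration):

```python
# Gradient descent on an illustrative smooth function
# f(t0, t1) = t0**2 + 2 * t1**2 (a stand-in for the unknown f).
def f(t0, t1):
    return t0 ** 2 + 2 * t1 ** 2

def grad(t0, t1):
    # Analytic partial derivatives of f above.
    return 2 * t0, 4 * t1

def step(t0, t1, alpha):
    g0, g1 = grad(t0, t1)
    # Simultaneous update: both partials use the old (t0, t1).
    return t0 - alpha * g0, t1 - alpha * g1

# Initialized at the minimum (0, 0), the gradient is zero, so one
# iteration leaves the parameters unchanged.
print(step(0.0, 0.0, 0.1))   # (0.0, 0.0)

# Equal initial values need not stay equal after one update.
print(step(1.0, 1.0, 0.1))   # (0.8, 0.6)

# A very large learning rate can overshoot and increase f.
t0, t1 = 1.0, 1.0
print(f(t0, t1))             # 3.0
t0, t1 = step(t0, t1, alpha=1.0)
print(f(t0, t1))             # 19.0 -- larger than before
```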
Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some $$\theta_0$$, $$\theta_1$$ such that $$J(\theta_0, \theta_1) = 0$$. Which of the statements below must then be true? (Check all that apply; a short sketch after the statements shows when $$J$$ can be zero.)
For these values of $$\theta_0$$ and $$\theta_1$$ that satisfy $$J(\theta_0, \theta_1) = 0$$, we have that $$h_\theta(x^{(i)}) = y^{(i)}$$ for every training example $$(x^{(i)}, y^{(i)})$$.
For this to be true, we must have $$\theta_0 = 0$$ and $$\theta_1 = 0$$ so that $$h_\theta(x) = 0$$.
We can perfectly predict the value of $$y$$ even for new examples that we have not yet seen (e.g., we can perfectly predict prices of even new houses that we have not yet seen).
This is not possible: by the definition of $$J(\theta_0, \theta_1)$$, it is not possible for there to exist $$\theta_0$$ and $$\theta_1$$ so that $$J(\theta_0, \theta_1) = 0$$.
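For reference, $$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$ is a sum of squares, so it is zero exactly when every training prediction matches its target. A minimal sketch (plain Python; it reuses the perfectly linear data from the earlier question, and the function name is my own):

```python
# Squared-error cost J(theta0, theta1) for a one-variable model.
def cost(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2
               for x, y in zip(xs, ys)) / (2 * m)

xs = [1.0, 2.0, 4.0, 0.0]  # perfectly linear data: y = 0.5 * x
ys = [0.5, 1.0, 2.0, 0.0]

# J is zero exactly when every prediction equals its target ...
print(cost(0.0, 0.5, xs, ys))   # 0.0
# ... and any mismatch makes J strictly positive.
print(cost(0.0, 0.6, xs, ys))   # 0.02625
```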