Machine Learning - week one

catchingSun

Category: Machine Learning

Article link: https://blog.csdn.net/catchingSun/article/details/63253629

Introduction

5 questions

Linear Regression with One Variable

1. (1 point)

Consider the problem of predicting how well a student does in her second year of college/university, given how well she did in her first year.

Specifically, let $x$ be equal to the number of "A" grades (including A-, A and A+ grades) that a student receives in their first year of college (freshman year). We would like to predict the value of $y$, which we define as the number of "A" grades they get in their second year (sophomore year).

Here each row is one training example. Recall that in linear regression, our hypothesis is $h_\theta(x) = \theta_0 + \theta_1 x$, and we use $m$ to denote the number of training examples.

For the training set given above (note that this training set may also be referenced in other questions in this quiz), what is the value of $m$? In the box below, please enter your answer (which should be a number between 0 and 10).

2. (1 point)

For this question, assume that we are using the training set from Q1. Recall that our definition of the cost function was $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$.

What is $J(0, 1)$? In the box below, please enter your answer (simplify fractions to decimals when entering your answer, and use '.' as the decimal delimiter, e.g., 1.5).
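The cost function above can be sketched in a few lines of Python. The training-set table from Q1 is not reproduced in this post, so the `xs` and `ys` values below are made-up placeholders, and the function names `hypothesis` and `cost` are illustrative choices; only the formula itself comes from the question.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum of squared residuals."""
    m = len(xs)
    squared_errors = sum(
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    )
    return squared_errors / (2 * m)


# Hypothetical training set (NOT the one from the quiz).
xs = [1, 2, 3]
ys = [2, 2, 5]

# With theta0 = 0 and theta1 = 1 the residuals are -1, 0 and -2,
# so J(0, 1) = (1 + 0 + 4) / (2 * 3).
print(cost(0, 1, xs, ys))
```

Substituting the real training set for `xs` and `ys` would give the value the question asks for.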

3. (1 point)

Suppose we set $\theta_0 = -2$, $\theta_1 = 0.5$ in the linear regression hypothesis from Q1. What is $h_\theta(6)$?
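Evaluating the hypothesis is a direct substitution into $h_\theta(x) = \theta_0 + \theta_1 x$. The worked step below uses only the values given in the question:

$$h_\theta(6) = \theta_0 + \theta_1 \cdot 6 = -2 + 0.5 \cdot 6 = 1$$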

4. (1 point)

Let $f$ be some function so that $f(\theta_0, \theta_1)$ outputs a number. For this problem, $f$ is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so $f$ may have local optima). Suppose we use gradient descent to try to minimize $f(\theta_0, \theta_1)$ as a function of $\theta_0$ and $\theta_1$. Which of the following statements are true? (Check all that apply.)

Even if the learning rate $\alpha$ is very large, every iteration of gradient descent will decrease the value of $f(\theta_0, \theta_1)$.

If the learning rate is too small, then gradient descent may take a very long time to converge.

If $\theta_0$ and $\theta_1$ are initialized so that $\theta_0 = \theta_1$, then by symmetry (because we do simultaneous updates to the two parameters), after one iteration of gradient descent, we will still have $\theta_0 = \theta_1$.

If $\theta_0$ and $\theta_1$ are initialized at a local minimum, then one iteration will not change their values.
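These statements can be checked numerically with a small sketch. The function `f` below is a hypothetical smooth function chosen only for illustration (it is not from the quiz), and `gradient_step` is an assumed helper name. It performs one simultaneous update $\theta_j := \theta_j - \alpha \, \partial f / \partial \theta_j$.

```python
def f(t0, t1):
    # Hypothetical smooth function with its minimum at (1, -2).
    return (t0 - 1) ** 2 + (t1 + 2) ** 2


def grad_f(t0, t1):
    # Partial derivatives of f with respect to t0 and t1.
    return 2 * (t0 - 1), 2 * (t1 + 2)


def gradient_step(t0, t1, alpha):
    # Compute both partial derivatives first, then update both
    # parameters together (simultaneous update).
    g0, g1 = grad_f(t0, t1)
    return t0 - alpha * g0, t1 - alpha * g1


# Starting at the minimum: the gradient is zero, so nothing changes.
at_min = gradient_step(1.0, -2.0, alpha=0.1)

# A large learning rate can overshoot and increase f.
overshoot = gradient_step(0.0, 0.0, alpha=1.5)

# Starting with theta0 == theta1 does not keep them equal.
asym = gradient_step(0.0, 0.0, alpha=0.1)

print(at_min, f(*at_min))
print(overshoot, f(0.0, 0.0), f(*overshoot))
print(asym)
```

For this particular `f`, the step with `alpha=1.5` moves from a value of 5 to a value of 20, and the two parameters diverge after one step because their partial derivatives differ.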

5. (1 point)

Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some $\theta_0$, $\theta_1$ such that $J(\theta_0, \theta_1) = 0$.

Which of the statements below must then be true? (Check all that apply.)

Gradient descent is likely to get stuck at a local minimum and fail to find the global minimum.

For this to be true, we must have $\theta_0 = 0$ and $\theta_1 = 0$ so that $h_\theta(x) = 0$.

Our training set can be fit perfectly by a straight line, i.e., all of our training examples lie perfectly on some straight line.

For this to be true, we must have $y^{(i)} = 0$ for every value of $i = 1, 2, \ldots, m$.
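A small numerical check helps separate these options. The data below is a hypothetical training set generated from the line $y = 1 + 2x$ (it does not come from the quiz), and `squared_error_cost` is an illustrative name for the cost function $J$ defined earlier.

```python
def squared_error_cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical points that all lie exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

perfect_fit = squared_error_cost(1, 2, xs, ys)
zero_params = squared_error_cost(0, 0, xs, ys)

print(perfect_fit)  # every residual is zero for theta0 = 1, theta1 = 2
print(zero_params)  # strictly positive here
```

In this example $J = 0$ is reached with nonzero parameters and nonzero targets. Each term of the sum is a square, so the cost can only be zero when every training example lies exactly on the fitted line.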

Linear Algebra

1. (Correct, 1 / 1 point)

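Let two matrices be

$A = \begin{bmatrix} 4 & 3 \\ 6 & 9 \end{bmatrix}, \quad B = \begin{bmatrix} -2 & 9 \\ -5 & 2 \end{bmatrix}$

What is $A + B$?

$\begin{bmatrix} 6 & 12 \\ 11 & 11 \end{bmatrix}$

$\begin{bmatrix} 2 & 9 \\ 1 & 2 \end{bmatrix}$

$\begin{bmatrix} 2 & 12 \\ 1 & 11 \end{bmatrix}$

Correct

To add two matrices, add them element-wise.

$\begin{bmatrix} 6 & -6 \\ 11 & 7 \end{bmatrix}$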
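Element-wise addition can be sketched in plain Python with matrices stored as lists of rows. The helper name `mat_add` is an illustrative choice, and the row layout of `A` and `B` assumes the flattened matrices in the question were extracted column by column.

```python
def mat_add(X, Y):
    """Element-wise sum of two matrices stored as lists of rows."""
    if len(X) != len(Y) or any(len(r) != len(s) for r, s in zip(X, Y)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


# Assumed layout of the matrices from the question.
A = [[4, 3], [6, 9]]
B = [[-2, 9], [-5, 2]]

total = mat_add(A, B)
print(total)  # [[2, 12], [1, 11]]
```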

2. (Correct, 1 / 1 point)

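Let $x = \begin{bmatrix} 5 \\ 5 \\ 2 \\ 7 \end{bmatrix}$

What is $2 * x$?

$\begin{bmatrix} 10 \\ 10 \\ 4 \\ 14 \end{bmatrix}$

Correct

To multiply the vector $x$ by 2, take each element of $x$ and multiply that element by 2.

$\begin{bmatrix} \frac{5}{2} \\ \frac{5}{2} \\ 1 \\ \frac{7}{2} \end{bmatrix}$

$\begin{bmatrix} 10 & 10 & 4 & 14 \end{bmatrix}$

$\begin{bmatrix} \frac{5}{2} & \frac{5}{2} & 1 & \frac{7}{2} \end{bmatrix}$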
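Scalar multiplication follows the same element-by-element pattern. A minimal sketch, where `scale` is a hypothetical helper name and the vector is stored as a flat list:

```python
def scale(c, vec):
    """Multiply every element of vec by the scalar c."""
    return [c * v for v in vec]


x = [5, 5, 2, 7]
doubled = scale(2, x)
print(doubled)  # [10, 10, 4, 14]
```

The result has the same number of elements as `x`; scaling does not change a vector's shape.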

3. (Correct, 1 / 1 point)

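Let $u$ be a 3-dimensional vector, where specifically

$u = \begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix}$

What is $u^T$?

$\begin{bmatrix} 1 \\ 5 \\ 3 \end{bmatrix}$

$\begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix}$

$\begin{bmatrix} 1 & 5 & 3 \end{bmatrix}$

$\begin{bmatrix} 3 & 5 & 1 \end{bmatrix}$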
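Transposition swaps rows and columns, so a 3x1 column vector becomes a 1x3 row vector with the elements in the same order. A small sketch, assuming the vector is stored as a list of one-element rows and using an illustrative helper named `transpose`:

```python
def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


# u as a 3x1 matrix: three rows, one column.
u = [[3], [5], [1]]

u_t = transpose(u)
print(u_t)                    # [[3, 5, 1]]
print(len(u_t), len(u_t[0]))  # 1 3
```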

4. (Correct, 1 / 1 point)

Let $u$ and $v$ be 3-dimensional vectors, where specifically

$u = \begin{bmatrix} 4 \\ -4 \\ -3 \end{bmatrix}$

and

$v = \begin{bmatrix} 4 \\ 2 \\ 4 \end{bmatrix}$

What is $u^T v$?

(Hint: $u^T$ is a 1x3 dimensional matrix, and $v$ can also be seen as a 3x1 matrix. The answer you want can be obtained by taking the matrix product of $u^T$ and $v$.) Do not add brackets to your answer.

-4

Correct answer
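The product of a 1x3 matrix and a 3x1 matrix is a single number: the sum of the pairwise products of the elements. A minimal sketch, with `inner` as a hypothetical helper name and both vectors stored as flat lists:

```python
def inner(u, v):
    """Inner product u^T v of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


u = [4, -4, -3]
v = [4, 2, 4]

# 4*4 + (-4)*2 + (-3)*4 = 16 - 8 - 12
print(inner(u, v))  # -4
```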

5. (Correct, 1 / 1 point)

Let $A$ and $B$ be 3x3 (square) matrices. Which of the following must necessarily hold true? Check all that apply.

$A + B = B + A$

Correct

We add matrices element-wise. So, this must be true.

If $A$ is the 3x3 identity matrix, then $A * B = B * A$

Correct

Even though matrix multiplication is not commutative in general ($A * B \neq B * A$ for general matrices $A, B$), for the special case where $A = I$, we have $A * B = I * B = B$, and also $B * A = B * I = B$. So, $A * B = B * A$.

If $C = A * B$, then $C$ is a 6x6 matrix.

$A * B = B * A$
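The two unselected options can be tested with a concrete counterexample. The matrices `A` and `B` below are hypothetical values chosen for illustration, and `matmul` is an assumed helper name for the standard row-by-column product.

```python
def matmul(X, Y):
    """Matrix product of X (n x k) and Y (k x p), both lists of rows."""
    if len(X[0]) != len(Y):
        raise ValueError("inner dimensions must match")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Hypothetical 3x3 matrices that do not commute.
A = [[1, 2, 0], [0, 1, 0], [0, 0, 1]]
B = [[1, 0, 0], [3, 1, 0], [0, 0, 1]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)                      # [[7, 2, 0], [3, 1, 0], [0, 0, 1]]
print(BA)                      # [[1, 2, 0], [3, 7, 0], [0, 0, 1]]
print(AB == BA)                # False
print(matmul(I, B) == matmul(B, I) == B)  # True
print(len(AB), len(AB[0]))     # 3 3
```

The product of two 3x3 matrices is again 3x3, and a single pair with `AB != BA` is enough to show that commutativity does not hold in general.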
