Homework Week 2, Coursera "Machine Learning" (Andrew Ng, Stanford University): Linear Regression with Multiple Variables, Octave/Matlab Tutorial
Linear Regression with Multiple Variables
Question 1 (answered incorrectly)
1 Suppose m=4 students have taken some class, and the class had a midterm exam and a final exam. You have collected a dataset of their scores on the two exams, which is as follows:
Formula: x_i := (x_i - μ_i) / s_i (mean normalization: subtract the mean, divide by the range)
Solution:
The question states that x_1 is the midterm score and x_2 is (midterm score)^2, so there are two features. The superscript 4 on x means the fourth training example, and the subscript 2 means the second feature, so the value in question is 4761.
So x_i := (x_i - μ_i) / s_i,
where x_i = 4761, μ_i is the mean of the feature, which I computed as 5,925.5,
and s_i is the max-minus-min range, 4,075.
That gives (4761 - 5925.5) / 4075 = -0.2857, or -0.29 to two decimal places.
Final score: 80 points, with one question wrong, namely Question 1.
Hold on. One calculation above is wrong,
and it made the whole answer wrong: the mean. Correctly, μ_i = (7921 + 5184 + 8836 + 4761) / 4 = 6,675.5.
Recomputing gives (4761 - 6675.5) / 4075 = -0.4698159509202454,
which rounds to -0.47. A careless mistake.
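The corrected arithmetic can be double-checked with a few lines of Python; the values are the x_2 column quoted in the correction above.

```python
# Mean normalization for the quiz feature x2 = (midterm score)^2.
# mu is the mean of the feature, s is its max-minus-min range.
x2 = [7921, 5184, 8836, 4761]

mu = sum(x2) / len(x2)        # (7921 + 5184 + 8836 + 4761) / 4 = 6675.5
s = max(x2) - min(x2)         # 8836 - 4761 = 4075
normalized = (4761 - mu) / s  # the fourth example's second feature

print(round(normalized, 2))   # -0.47
```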
Question 2
You run gradient descent for 15 iterations with α = 0.3 and compute J(θ) after each iteration. You find that the value of J(θ) increases over time. Based on this, which of the following conclusions seems most plausible?
Solution: when the cost function keeps increasing, this is the case to suspect: the learning rate is too large, so every step overshoots the minimum.
See the figure below.
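A minimal sketch of why J(θ) grows, using a made-up one-dimensional cost J(θ) = θ² (not the quiz's data) and two hypothetical learning rates:

```python
# Toy 1-D cost J(theta) = theta^2, gradient dJ/dtheta = 2*theta.
# With a small learning rate J shrinks each step; with a large one it blows up.
def run_gd(alpha, theta=1.0, iters=15):
    costs = []
    for _ in range(iters):
        theta = theta - alpha * 2 * theta  # gradient descent update
        costs.append(theta ** 2)           # J(theta) after the update
    return costs

small = run_gd(alpha=0.1)  # J decreases monotonically
large = run_gd(alpha=1.5)  # J increases every iteration: alpha is too big

print(small[-1] < small[0])  # True
print(large[-1] > large[0])  # True
```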
Question 3
Suppose you have m = 14 training examples with n = 3 features (excluding the additional all-ones feature for the intercept term, which you should add).
Solution:
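The quoted question is cut off above, but the shapes it implies can be sketched in Python with numpy: with m = 14 examples and n = 3 features plus the all-ones intercept column, the design matrix X is 14×4 and y is 14×1. The random data here is purely illustrative; only the shapes matter.

```python
import numpy as np

m, n = 14, 3
rng = np.random.default_rng(0)

features = rng.standard_normal((m, n))      # the 3 raw features
X = np.hstack([np.ones((m, 1)), features])  # prepend the all-ones column
y = rng.standard_normal((m, 1))

theta = np.linalg.inv(X.T @ X) @ X.T @ y    # normal equation

print(X.shape, y.shape, theta.shape)        # (14, 4) (14, 1) (4, 1)
```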
Question 4
Suppose you have a dataset with m = 50 examples and n = 15 features for each example. You want to use multivariate linear regression to fit the parameters θ to our data. Should you prefer gradient descent or the normal equation?
Solution: with only n = 15 features, inverting the (n+1)×(n+1) matrix in the normal equation is cheap, so the normal equation is preferable; gradient descent would add the overhead of choosing α and iterating.
Question 5
Which of the following are reasons for using feature scaling?
Solution: feature scaling is a preprocessing / normalization step; it prevents features on very different scales from distorting the cost contours and slowing gradient descent down. See the figure below.
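A small sketch of mean normalization in Python/numpy; the house-style feature values and the roughly [-1, 1] target range are illustrative assumptions, not from the quiz.

```python
import numpy as np

def mean_normalize(X):
    """Scale each column: subtract the column mean, divide by the column range."""
    mu = X.mean(axis=0)
    s = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / s

# Two features on very different scales, e.g. house size vs. number of bedrooms.
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0],
              [1416.0, 2.0]])

X_norm = mean_normalize(X)
print(X_norm.min(), X_norm.max())  # every entry now lies within [-1, 1]
```

After scaling, both columns have zero mean and comparable magnitudes, so gradient descent no longer zig-zags along an elongated contour.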
Octave/Matlab Tutorial
Question 1
Suppose I first execute the following in Octave/Matlab:
Solution:
Given: A is a 3×2 matrix and B is a 2×3 matrix.
Option A: A' is 2×3, and adding it to the 2×3 matrix B is fine.
Option B: 2×3 times 3×2 is a valid product.
Option C: the matrices are not the same size, so they cannot be added; ruled out.
Option D: B' is 3×2, and a 3×2 matrix cannot be multiplied by a 3×2 matrix, so D is also wrong.
Final answer: A and B.
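The shape bookkeeping above can be checked mechanically; here is a numpy sketch with all-ones stand-ins for A and B (the matrices' contents don't matter, only their sizes):

```python
import numpy as np

A = np.ones((3, 2))  # A is 3x2, as stated in the solution
B = np.ones((2, 3))  # B is 2x3

print((A.T + B).shape)  # A' is 2x3; adding 2x3 to 2x3 works -> (2, 3)
print((B @ A).shape)    # 2x3 times 3x2 -> (2, 2)

try:
    A + B               # 3x2 and 2x3 do not match
except ValueError:
    print("addition of mismatched shapes fails")

try:
    A @ B.T             # B' is 3x2; 3x2 times 3x2 is not a valid product
except ValueError:
    print("3x2 times 3x2 fails")
```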
Question 2
Solution:
Select every element in rows 1 through 4 and columns 1 through 2, i.e. A(1:4, 1:2): options A and B.
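For reference, the same selection in Python/numpy; note the shift from Octave's 1-indexed, inclusive ranges to numpy's 0-indexed, end-exclusive slices. The 5×4 example matrix is made up.

```python
import numpy as np

# Octave's A(1:4, 1:2) selects rows 1-4 and columns 1-2 (1-indexed, inclusive).
# The numpy equivalent is A[0:4, 0:2] (0-indexed, end-exclusive).
A = np.arange(1, 21).reshape(5, 4)  # a 5x4 example matrix

sub = A[0:4, 0:2]
print(sub.shape)  # (4, 2)
print(sub)
```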
Question 3
Let A be a 10x10 matrix and x be a 10-element vector. Your friend wants to compute the product Ax and writes the following code:
v = zeros(10, 1);
for i = 1:10
  for j = 1:10
    v(i) = v(i) + A(i, j) * x(j);
  end
end
Solution:
A is a 10×10 matrix and x is a 10-dimensional vector; your friend wants to compute Ax.
Writing Ax directly (option B) would be parsed as a single, undefined variable named Ax, so B is wrong.
Answer: A.
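A quick check that the double loop really computes the same thing as the vectorized product, with numpy standing in for Octave (in Octave the one-liner is v = A * x); the data is random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 10))
x = rng.standard_normal((10, 1))

# The friend's double loop, transcribed from the Octave snippet.
v = np.zeros((10, 1))
for i in range(10):
    for j in range(10):
        v[i] += A[i, j] * x[j]

# The vectorized version the quiz is after.
v_vec = A @ x

print(np.allclose(v, v_vec))  # True
```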
Question 4
Say you have two column vectors v and w, each with 7 elements (i.e., they have dimensions 7x1). Consider the following code:
z = 0;
for i = 1:7
  z = z + v(i) * w(i)
end
Which of the following vectorizations correctly compute z? Check all that apply.
Solution: v and w are 7×1 column vectors with 7 elements each, and the result z should be a scalar.
To get a scalar the left operand must be a row vector [1, 2, 3, …] and the right a column vector [1, 2, 3, …]^T,
so the first vector must be transposed: z = v' * w, which is option B.
v .* w multiplies the two vectors element by element, and sum then adds the products up, giving the same result as B, so option A is also correct.
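Both options can be verified side by side in Python/numpy. The vectors here are arbitrary stand-ins; note that for numpy's 1-D arrays the transpose is a no-op, whereas in Octave v' matters because v is 7×1.

```python
import numpy as np

v = np.arange(1.0, 8.0)        # 7-element stand-ins for the quiz's v and w
w = np.arange(7.0, 0.0, -1.0)

# The loop from the question.
z_loop = 0.0
for i in range(7):
    z_loop += v[i] * w[i]

z_b = v.T @ w       # option B: v' * w (row times column -> scalar)
z_a = np.sum(v * w) # option A: sum(v .* w)

print(z_loop == z_b == z_a)  # True
```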
Question 5
In Octave/Matlab, many functions work on single numbers, vectors, and matrices. For example, the sin function when applied to a matrix will return a new matrix with the sin of each element. But you have to be careful, as certain functions have different behavior. Suppose you have a 7x7 matrix X. You want to compute the log of every element, the square of every element, add 1 to every element, and divide every element by 4. You will store the results in four matrices, A, B, C, D. One way to do so is the following code:
for i = 1:7
  for j = 1:7
    A(i, j) = log(X(i, j));
    B(i, j) = X(i, j) ^ 2;
    C(i, j) = X(i, j) + 1;
    D(i, j) = X(i, j) / 4;
  end
end
Which of the following correctly compute A, B, C or D? Check all that apply.
Solution:
Option D is incorrect: the dot is missing. Element-wise squaring must be written X .^ 2; plain X ^ 2 is the matrix product X * X.
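The element-wise operations translate directly to numpy. This sketch uses a made-up random positive matrix (so the log is defined) and shows that the element-wise square differs from the matrix square, which is exactly the mistake in option D:

```python
import numpy as np

# Random positive 7x7 matrix so np.log is defined everywhere.
X = np.abs(np.random.default_rng(2).standard_normal((7, 7))) + 1.0

A = np.log(X)  # log(X) in Octave: already element-wise
B = X ** 2     # X .^ 2 in Octave; plain X ^ 2 would be the matrix product X * X
C = X + 1      # X + 1
D = X / 4      # X / 4

# The element-wise square is not the matrix square:
print(np.allclose(B, X @ X))  # False
```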
End.