With this set done, my series of solutions to Andrew Ng's ML course exercises comes to an end. Thanks for reading along.
Straight to the point.
Part 1:
1. Implement estimateGaussian, which computes the per-feature mean and variance of data assumed to be Gaussian.
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the mean of the data and the variances
% In particular, mu(i) should contain the mean of
% the data for the i-th feature and sigma2(i)
% should contain variance of the i-th feature.
%
for i = 1:n
    mu(i) = mean(X(:,i));
    sigma2(i) = (X(:,i) - mu(i))' * (X(:,i) - mu(i)) / m;
end
% =============================================================
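The same computation as a NumPy sketch (not the course code; note the population variance, dividing by m rather than m-1, to match the exercise):

```python
import numpy as np

def estimate_gaussian(X):
    """Per-feature mean and (population) variance of X with shape (m, n)."""
    mu = X.mean(axis=0)
    sigma2 = ((X - mu) ** 2).mean(axis=0)  # divides by m, matching the exercise
    return mu, sigma2

# tiny hypothetical data: 3 examples, 2 features
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
mu, sigma2 = estimate_gaussian(X)
# mu -> [2., 20.], sigma2 -> [2/3, 200/3]
```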
2. Implement selectThreshold. The Gaussian model assigns each example a probability, but we need a threshold telling us how low that probability must be before an example counts as an anomaly — that threshold is epsilon.
for epsilon = min(pval):stepsize:max(pval)
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the F1 score of choosing epsilon as the
% threshold and place the value in F1. The code at the
% end of the loop will compare the F1 score for this
% choice of epsilon and set it to be the best epsilon if
% it is better than the current choice of epsilon.
%
% Note: You can use predictions = (pval < epsilon) to get a binary vector
% of 0's and 1's of the outlier predictions
    predictions = (pval < epsilon);
    tp = sum((predictions == 1) & (yval == 1));
    fp = sum((predictions == 1) & (yval == 0));
    fn = sum((predictions == 0) & (yval == 1));
    prec = tp/(tp + fp);
    rec = tp/(tp + fn);
    F1 = 2*prec*rec/(prec + rec);
    % =============================================================
    if F1 > bestF1
        bestF1 = F1;
        bestEpsilon = epsilon;
    end
end
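A NumPy sketch of the same scan (not the course code; the guard replaces Octave's silent NaN behavior — there, a 0/0 makes F1 NaN and `NaN > bestF1` is false, so the candidate is skipped anyway):

```python
import numpy as np

def select_threshold(yval, pval):
    """Scan candidate epsilons over [min(pval), max(pval)); keep the best F1."""
    best_eps, best_f1 = 0.0, 0.0
    step = (pval.max() - pval.min()) / 1000
    for eps in np.arange(pval.min(), pval.max(), step):
        pred = pval < eps                     # flagged as anomaly
        tp = np.sum(pred & (yval == 1))
        fp = np.sum(pred & (yval == 0))
        fn = np.sum(~pred & (yval == 1))
        if tp + fp == 0 or tp + fn == 0:
            continue                          # F1 undefined; skip this epsilon
        prec = tp / (tp + fp)
        rec = tp / (tp + fn)
        f1 = 2 * prec * rec / (prec + rec)
        if f1 > best_f1:                      # running max, no sorting needed
            best_f1, best_eps = f1, eps
    return best_eps, best_f1

# hypothetical CV set: the one low-probability point is the labeled anomaly
pval = np.array([0.01, 0.20, 0.30, 0.40])
yval = np.array([1, 0, 0, 0])
eps, f1 = select_threshold(yval, pval)
```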
This uses the F1 score: a linear scan over the candidate epsilons (a running max, not a sort) keeps the best F1 and records its epsilon as the probability threshold.
Per the assignment:

| | yval = 1 | yval = 0 |
|---|---|---|
| predictions = 1 | tp | fp |
| predictions = 0 | fn | X |

Precision = tp/(tp + fp); recall = tp/(tp + fn).
These all check out, and the naming is in fact the standard true/false positive/negative convention: predictions = 0 with yval = 1 is a false negative (an anomaly we missed), so fn sits in the right cell, and the unnamed cell X is simply the true negatives:

| | yval = 1 | yval = 0 |
|---|---|---|
| predictions = 1 | tp | fp |
| predictions = 0 | fn | tn |

And the formulas are used correctly throughout.
On the F1 formula F1 = 2*prec*rec/(prec + rec): rearranged, it reads 2/F1 = 1/prec + 1/rec, i.e. F1 is the harmonic mean of precision and recall. That is just a variant of the parallel-resistor formula 1/R = 1/R1 + 1/R2. The name says it all, so I won't expand further.
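A quick numerical check of that reading, with made-up precision/recall values:

```python
prec, rec = 0.6, 0.9
f1 = 2 * prec * rec / (prec + rec)
# the reciprocal ("parallel resistor") form gives the same number
f1_alt = 2 / (1 / prec + 1 / rec)
assert abs(f1 - f1_alt) < 1e-12  # both are the harmonic mean of prec and rec
```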
The output:
Best epsilon found using cross-validation: 8.990853e-05
Best F1 on Cross Validation Set: 0.875000
(you should see a value epsilon of about 8.99e-05)
(you should see a Best F1 value of 0.875000)
Program paused. Press enter to continue.
And the Multidimensional Outliers (more features) output:
Program paused. Press enter to continue.
Best epsilon found using cross-validation: 1.377229e-18
Best F1 on Cross Validation Set: 0.615385
(you should see a value epsilon of about 1.38e-18)
(you should see a Best F1 value of 0.615385)
# Outliers found: 117
Part 2:
3. Implement cofiCostFunc, which computes the collaborative filtering cost function and its gradients.
ex8.pdf gives pseudocode for one (loop-based) implementation, but I derived a fully vectorized form, so I didn't use that approach:
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost function and gradient for collaborative
% filtering. Concretely, you should first implement the cost
% function (without regularization) and make sure it is
% matches our costs. After that, you should implement the
% gradient and use the checkCostFunction routine to check
% that the gradient is correct. Finally, you should implement
% regularization.
%
% Notes: X - num_movies x num_features matrix of movie features
% Theta - num_users x num_features matrix of user features
% Y - num_movies x num_users matrix of user ratings of movies
% R - num_movies x num_users matrix, where R(i, j) = 1 if the
% i-th movie was rated by the j-th user
%
% You should set the following variables correctly:
%
% X_grad - num_movies x num_features matrix, containing the
% partial derivatives w.r.t. to each element of X
% Theta_grad - num_users x num_features matrix, containing the
% partial derivatives w.r.t. to each element of Theta
%
J = sum(sum(R.*(X*Theta' - Y).^2))/2 ...
    + lambda*sum(sum(Theta.^2))/2 + lambda*sum(sum(X.^2))/2;
X_grad = (R.*(X*Theta' - Y))*Theta + lambda*X;
Theta_grad = (R.*(X*Theta' - Y))'*X + lambda*Theta;
% =============================================================
grad = [X_grad(:); Theta_grad(:)];
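The same vectorized cost and gradients as a NumPy sketch (made-up shapes and check values; variable names mirror the exercise):

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Collaborative filtering cost and gradients, fully vectorized.

    X: (num_movies, num_features), Theta: (num_users, num_features),
    Y, R: (num_movies, num_users); R(i, j) = 1 iff user j rated movie i.
    """
    E = (X @ Theta.T - Y) * R            # errors, zeroed where no rating exists
    J = 0.5 * np.sum(E ** 2) + lam / 2 * (np.sum(Theta ** 2) + np.sum(X ** 2))
    X_grad = E @ Theta + lam * X
    Theta_grad = E.T @ X + lam * Theta
    return J, X_grad, Theta_grad

# tiny hypothetical check: all-ones factors, zero ratings, every entry rated,
# no regularization -> six squared errors of 1, so J = 0.5 * 6 = 3
J, X_grad, Theta_grad = cofi_cost(np.ones((2, 1)), np.ones((3, 1)),
                                  np.zeros((2, 3)), np.ones((2, 3)), 0.0)
```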
I'd like to write the derivation up properly in Word as a separate post; I'll get to it when I find the time.
(For now I'm posting a scan of my handwritten notes — if those are clear enough, I'll skip the formal write-up.)
First, the cost function result:
>> ex8_cofi
Loading movie ratings dataset.
Average rating for movie 1 (Toy Story): 3.878319 / 5
Program paused. Press enter to continue.
Cost at loaded parameters: 22.224604
(this value should be about 22.22)
Program paused. Press enter to continue.
Now the gradient check:
Checking Gradients (without regularization) ...
5.6871 5.6871
12.6182 12.6182
-5.5630 -5.5630
-7.7717 -7.7717
-2.0086 -2.0086
-4.6137 -4.6137
10.8911 10.8911
9.1930 9.1930
0.0586 0.0586
-3.1872 -3.1872
-0.7274 -0.7274
-3.0677 -3.0677
-2.3581 -2.3581
-8.5247 -8.5247
0.4933 0.4933
-3.4043 -3.4043
7.1658 7.1658
-0.7342 -0.7342
7.5453 7.5453
-3.5194 -3.5194
-1.3339 -1.3339
-9.0309 -9.0309
-0.9906 -0.9906
-2.7904 -2.7904
-0.2378 -0.2378
-1.4452 -1.4452
2.3082 2.3082
The above two columns you get should be very similar.
(Left-Your Numerical Gradient, Right-Analytical Gradient)
If your cost function implementation is correct, then
the relative difference will be small (less than 1e-9).
Relative Difference: 1.51253e-12
Program paused. Press enter to continue.
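The check being reported here is just a centered finite difference compared against the analytical gradient. A generic sketch of the idea (not the course's checkCostFunction, and using a simple quadratic as the hypothetical cost):

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-4):
    """Centered-difference estimate of df/dtheta, one coordinate at a time."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return grad

# example cost: f(theta) = theta . theta, whose analytical gradient is 2*theta
theta = np.array([1.0, -2.0, 3.0])
num = numerical_gradient(lambda t: t @ t, theta)
ana = 2 * theta
# same relative-difference measure the exercise prints
rel_diff = np.linalg.norm(num - ana) / np.linalg.norm(num + ana)
```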
Next, the regularized cost (with lambda):
Cost at loaded parameters (lambda = 1.5): 31.344056
(this value should be about 31.34)
Program paused. Press enter to continue.
And the regularized gradients (with lambda):
Checking Gradients (with regularization) ...
0.7413 0.7413
0.8192 0.8192
0.3141 0.3141
4.7460 4.7460
-4.5036 -4.5036
2.8989 2.8989
-0.8744 -0.8744
8.7240 8.7240
-2.9560 -2.9560
2.3490 2.3490
-2.1666 -2.1666
-9.4598 -9.4598
-1.5855 -1.5855
-2.1904 -2.1904
-3.3958 -3.3958
-3.6352 -3.6352
-0.0663 -0.0663
4.7927 4.7927
-0.9200 -0.9200
-2.5735 -2.5735
-4.4383 -4.4383
2.4318 2.4318
0.9799 0.9799
1.9579 1.9579
4.0815 4.0815
1.8543 1.8543
3.0205 3.0205
The above two columns you get should be very similar.
(Left-Your Numerical Gradient, Right-Analytical Gradient)
If your cost function implementation is correct, then
the relative difference will be small (less than 1e-9).
Relative Difference: 1.87243e-12
Program paused. Press enter to continue.
That's all for this one.