[Coursera Machine Learning] Week 9 Programming Assignment: Anomaly Detection and Recommender Systems

1 Anomaly Detection

Your task is to complete the code in estimateGaussian.m. This function takes as input the data matrix X and should output an n-dimensional vector mu that holds the mean of all the n features and another n-dimensional vector sigma2 that holds the variances of all the features. You can implement this using a for-loop over every feature and every training example (though a vectorized implementation might be more efficient; feel free to use a vectorized implementation if you prefer).

We can estimate the parameters $(\mu_i, \sigma_i^2)$ of the i-th feature by using the following equations. To estimate the mean, you will use:

$$\mu_i = \frac{1}{m}\sum_{j=1}^{m} x_i^{(j)}$$

and for the variance you will use:

$$\sigma_i^2 = \frac{1}{m}\sum_{j=1}^{m} \left(x_i^{(j)} - \mu_i\right)^2$$

Here we can use MATLAB's built-in mean() and var() functions. Note that var(X, 1) normalizes by m rather than m - 1, which matches the 1/m factor in the formula above.

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the mean of the data and the variances
%               In particular, mu(i) should contain the mean of
%               the data for the i-th feature and sigma2(i)
%               should contain variance of the i-th feature.
%

% Mean of each feature (column-wise mean of X, returned as an n x 1 vector)
mu = mean(X)';
% Variance of each feature; the second argument 1 makes var normalize by m
sigma2 = var(X, 1)';
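
As a quick sanity check, the same variance can be computed directly from the formula; the snippet below is illustrative only and not part of the submitted function:

m = size(X, 1);
mu = mean(X)';
% Direct 1/m variance, which should match var(X, 1)' up to floating-point error
sigma2_check = ((1 / m) * sum((X - repmat(mu', m, 1)) .^ 2))';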

1.3 Selecting the threshold, ε

You should now complete the code in selectThreshold.m.
The function selectThreshold.m should return two values: the first is the selected threshold ε. If an example x has a low probability p(x) < ε, then it is considered to be an anomaly. The function should also return the F1 score, which tells you how well you're doing at finding the ground-truth anomalies given a certain threshold. For many different values of ε, you will compute the resulting F1 score by counting how many examples the current threshold classifies correctly and incorrectly.

The F1 score is computed using precision (prec) and recall (rec):

$$F_1 = \frac{2 \cdot \mathrm{prec} \cdot \mathrm{rec}}{\mathrm{prec} + \mathrm{rec}}$$

You compute precision and recall by:

$$\mathrm{prec} = \frac{tp}{tp + fp}$$

$$\mathrm{rec} = \frac{tp}{tp + fn}$$
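
For instance, with hypothetical counts tp = 7, fp = 3, and fn = 1, these formulas give:

$$\mathrm{prec} = \frac{7}{10} = 0.7, \qquad \mathrm{rec} = \frac{7}{8} = 0.875, \qquad F_1 = \frac{2 \cdot 0.7 \cdot 0.875}{0.7 + 0.875} \approx 0.778$$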

Note: You can find out how many values in this vector are 0 by using: sum(v == 0). You can also apply a logical and operator to such binary vectors. For instance, let cvPredictions be a binary vector whose length equals the number of cross validation examples, where the i-th element is 1 if your algorithm considers $x_{cv}^{(i)}$ an anomaly, and 0 otherwise. You can then, for example, compute the number of false positives using:
fp = sum((cvPredictions == 1) & (yval == 0)).

bestEpsilon = 0;
bestF1 = 0;
F1 = 0;

stepsize = (max(pval) - min(pval)) / 1000;
for epsilon = min(pval):stepsize:max(pval)

    % ====================== YOUR CODE HERE ======================
    % Instructions: Compute the F1 score of choosing epsilon as the
    %               threshold and place the value in F1. The code at the
    %               end of the loop will compare the F1 score for this
    %               choice of epsilon and set it to be the best epsilon if
    %               it is better than the current choice of epsilon.
    %               
    % Note: You can use predictions = (pval < epsilon) to get a binary vector
    %       of 0's and 1's of the outlier predictions



    predictions = (pval < epsilon);
    % You can compute the number of false positives using: fp = sum((cvPredictions == 1) & (yval == 0)).
    tp = sum((predictions == 1 & yval == 1));
    fp = sum((predictions == 1 & yval == 0));
    fn = sum((predictions == 0 & yval == 1));

    precision = tp / (tp + fp);
    recall = tp / (tp + fn);
    F1 = (2 * precision * recall) / (precision + recall);


    % =============================================================

    if F1 > bestF1
       bestF1 = F1;
       bestEpsilon = epsilon;
    end
end
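
In ex8.m these pieces are wired together roughly as follows; the function names (multivariateGaussian, selectThreshold) come from the provided exercise files, though the surrounding script is paraphrased here as a sketch:

% Density of the training and cross validation sets under the fitted Gaussian
p = multivariateGaussian(X, mu, sigma2);
pval = multivariateGaussian(Xval, mu, sigma2);

% Pick the threshold that maximizes F1 on the labeled validation examples
[epsilon, F1] = selectThreshold(yval, pval);

% Flag training examples whose density falls below the threshold
outliers = find(p < epsilon);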

2.2 Collaborative Filtering learning algorithm

You will complete the code in cofiCostFunc.m to compute the cost function and gradient for collaborative filtering. Note that the parameters to the function (i.e., the values that you are trying to learn) are X and Theta. In order to use an off-the-shelf minimizer such as fmincg, the cost function has been set up to unroll the parameters into a single vector params. You had previously used the same vector unrolling method in the neural networks programming exercise.
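
Inside cofiCostFunc.m, the unrolled params vector is first reshaped back into the two parameter matrices before any cost or gradient computation; a minimal sketch, mirroring the starter code's setup:

% Unfold params back into X (movie features) and Theta (user parameters)
X = reshape(params(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(params(num_movies*num_features+1:end), ...
                num_users, num_features);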

Regularized cost function: The cost function for collaborative filtering with regularization is given by

$$J\left(x^{(1)},\dots,x^{(n_m)},\theta^{(1)},\dots,\theta^{(n_u)}\right) = \frac{1}{2}\sum_{(i,j):r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\left(\theta_k^{(j)}\right)^2 + \frac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\left(x_k^{(i)}\right)^2$$

Regularized gradient: You should add to your implementation in cofiCostFunc.m to return the regularized gradient by adding the contributions from the regularization terms. Note that the gradients for the regularized cost function are given by:

$$\frac{\partial J}{\partial x_k^{(i)}} = \sum_{j:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)\theta_k^{(j)} + \lambda x_k^{(i)}$$

$$\frac{\partial J}{\partial \theta_k^{(j)}} = \sum_{i:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)x_k^{(i)} + \lambda \theta_k^{(j)}$$

% You need to return the following values correctly
J = 0;
X_grad = zeros(size(X));
Theta_grad = zeros(size(Theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost function and gradient for collaborative
%               filtering. Concretely, you should first implement the cost
%               function (without regularization) and make sure it
%               matches our costs. After that, you should implement the 
%               gradient and use the checkCostFunction routine to check
%               that the gradient is correct. Finally, you should implement
%               regularization.
%
% Notes: X - num_movies  x num_features matrix of movie features
%        Theta - num_users  x num_features matrix of user features
%        Y - num_movies x num_users matrix of user ratings of movies
%        R - num_movies x num_users matrix, where R(i, j) = 1 if the 
%            i-th movie was rated by the j-th user
%
% You should set the following variables correctly:
%
%        X_grad - num_movies x num_features matrix, containing the 
%                 partial derivatives w.r.t. to each element of X
%        Theta_grad - num_users x num_features matrix, containing the 
%                     partial derivatives w.r.t. to each element of Theta
%


% Prediction errors, kept only where a rating exists (R masks unrated entries)
errors = ((X * Theta' - Y) .* R);
squaredErrors = errors .^ 2;
% Regularized cost: squared-error term plus the two regularization terms
J = ((1 / 2) * sum(squaredErrors(:))) + ((lambda / 2) * sum(Theta(:) .^ 2)) + ((lambda / 2) * sum(X(:) .^ 2));
% Regularized gradients with respect to X and Theta
X_grad = errors * Theta + (lambda .* X);
Theta_grad = errors' * X + (lambda .* Theta);

% =============================================================

grad = [X_grad(:); Theta_grad(:)];
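
After implementing the function, ex8_cofi.m verifies the gradients numerically with checkCostFunction and then trains with fmincg; a minimal sketch of that flow, where the lambda values and options shown are the exercise's defaults (treat the exact surrounding script as an assumption):

% Gradient check including regularization (lambda = 1.5 in the exercise)
checkCostFunction(1.5);

% Train: unroll the initial parameters and minimize the regularized cost
initial_parameters = [X(:); Theta(:)];
lambda = 10;
options = optimset('GradObj', 'on', 'MaxIter', 100);
theta = fmincg(@(t) cofiCostFunc(t, Ynorm, R, num_users, num_movies, ...
                                 num_features, lambda), ...
               initial_parameters, options);
% The returned theta is unrolled; reshape it back into X and Theta as above.
% (Ynorm is the mean-normalized rating matrix produced earlier in the script.)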