[Coursera Machine Learning] Week 9 Programming Assignment: Anomaly Detection and Recommender Systems

1 Anomaly Detection

Your task is to complete the code in estimateGaussian.m. This function takes as input the data matrix X and should output an n-dimensional vector mu that holds the mean of all the n features and another n-dimensional vector sigma2 that holds the variances of all the features. You can implement this using a for-loop over every feature and every training example (though a vectorized implementation might be more efficient; feel free to use a vectorized implementation if you prefer).

We can estimate the parameters $(\mu_i, \sigma_i^2)$ of the i-th feature by using the following equations. To estimate the mean, you will use:

$$\mu_i = \frac{1}{m}\sum_{j=1}^{m} x_i^{(j)}$$

and for the variance you will use:

$$\sigma_i^2 = \frac{1}{m}\sum_{j=1}^{m} \left(x_i^{(j)} - \mu_i\right)^2$$

Here we can use MATLAB's built-in mean() and var() functions. Note that var(X, 1) normalizes by m rather than m - 1, which matches the 1/m factor in the formula above.

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the mean of the data and the variances
%               In particular, mu(i) should contain the mean of
%               the data for the i-th feature and sigma2(i)
%               should contain variance of the i-th feature.
%

% Mean of each feature (column-wise mean of X, returned as an n x 1 vector)
mu = mean(X)';
% Variance of each feature; the second argument 1 makes var normalize by m
sigma2 = var(X, 1)';
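
As a quick sanity check, the same variance can be computed directly from the formula; the snippet below is illustrative only and not part of the submitted function:

m = size(X, 1);
mu = mean(X)';
% Direct 1/m variance, which should match var(X, 1)' up to floating-point error
sigma2_check = ((1 / m) * sum((X - repmat(mu', m, 1)) .^ 2))';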

1.3 Selecting the threshold, ε

You should now complete the code in selectThreshold.m.
The function selectThreshold.m should return two values: the first is the selected threshold ε. If an example x has a low probability p(x) < ε, then it is considered to be an anomaly. The function should also return the F1 score, which tells you how well you're doing at finding the ground-truth anomalies given a certain threshold. For many different values of ε, you will compute the resulting F1 score by counting how many examples the current threshold classifies correctly and incorrectly.

The F1 score is computed using precision (prec) and recall (rec):

$$F_1 = \frac{2 \cdot \mathrm{prec} \cdot \mathrm{rec}}{\mathrm{prec} + \mathrm{rec}}$$

You compute precision and recall by:

$$\mathrm{prec} = \frac{tp}{tp + fp}$$

$$\mathrm{rec} = \frac{tp}{tp + fn}$$
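
For instance, with hypothetical counts tp = 7, fp = 3, and fn = 1, these formulas give:

$$\mathrm{prec} = \frac{7}{10} = 0.7, \qquad \mathrm{rec} = \frac{7}{8} = 0.875, \qquad F_1 = \frac{2 \cdot 0.7 \cdot 0.875}{0.7 + 0.875} \approx 0.778$$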

Note: You can find out how many values in this vector are 0 by using: sum(v == 0). You can also apply a logical and operator to such binary vectors. For instance, let cvPredictions be a binary vector whose length equals the number of cross validation examples, where the i-th element is 1 if your algorithm considers $x_{cv}^{(i)}$ an anomaly, and 0 otherwise. You can then, for example, compute the number of false positives using:
fp = sum((cvPredictions == 1) & (yval == 0)).

bestEpsilon = 0;
bestF1 = 0;
F1 = 0;

stepsize = (max(pval) - min(pval)) / 1000;
for epsilon = min(pval):stepsize:max(pval)

    % ====================== YOUR CODE HERE ======================
    % Instructions: Compute the F1 score of choosing epsilon as the
    %               threshold and place the value in F1. The code at the
    %               end of the loop will compare the F1 score for this
    %               choice of epsilon and set it to be the best epsilon if
    %               it is better than the current choice of epsilon.
    %               
    % Note: You can use predictions = (pval < epsilon) to get a binary vector
    %       of 0's and 1's of the outlier predictions



    predictions = (pval < epsilon);
    % You can compute the number of false positives using: fp = sum((cvPredictions == 1) & (yval == 0)).
    tp = sum((predictions == 1 & yval == 1));
    fp = sum((predictions == 1 & yval == 0));
    fn = sum((predictions == 0 & yval == 1));

    precision = tp / (tp + fp);
    recall = tp / (tp + fn);
    F1 = (2 * precision * recall) / (precision + recall);


    % =============================================================

    if F1 > bestF1
       bestF1 = F1;
       bestEpsilon = epsilon;
    end
end
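
In ex8.m these pieces are wired together roughly as follows; the function names (multivariateGaussian, selectThreshold) come from the provided exercise files, though the surrounding script is paraphrased here as a sketch:

% Density of the training and cross validation sets under the fitted Gaussian
p = multivariateGaussian(X, mu, sigma2);
pval = multivariateGaussian(Xval, mu, sigma2);

% Pick the threshold that maximizes F1 on the labeled validation examples
[epsilon, F1] = selectThreshold(yval, pval);

% Flag training examples whose density falls below the threshold
outliers = find(p < epsilon);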

2.2 Collaborative Filtering learning algorithm

You will complete the code in cofiCostFunc.m to compute the cost function and gradient for collaborative filtering. Note that the parameters to the function (i.e., the values that you are trying to learn) are X and Theta. In order to use an off-the-shelf minimizer such as fmincg, the cost function has been set up to unroll the parameters into a single vector params. You had previously used the same vector unrolling method in the neural networks programming exercise.
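
Inside cofiCostFunc.m, the unrolled params vector is first reshaped back into the two parameter matrices before any cost or gradient computation; a minimal sketch, mirroring the starter code's setup:

% Unfold params back into X (movie features) and Theta (user parameters)
X = reshape(params(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(params(num_movies*num_features+1:end), ...
                num_users, num_features);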

Regularized cost function: The cost function for collaborative filtering with regularization is given by

$$J\left(x^{(1)},\dots,x^{(n_m)},\theta^{(1)},\dots,\theta^{(n_u)}\right) = \frac{1}{2}\sum_{(i,j):r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\left(\theta_k^{(j)}\right)^2 + \frac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\left(x_k^{(i)}\right)^2$$

Regularized gradient: You should add to your implementation in cofiCostFunc.m to return the regularized gradient by adding the contributions from the regularization terms. Note that the gradients for the regularized cost function are given by:

$$\frac{\partial J}{\partial x_k^{(i)}} = \sum_{j:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)\theta_k^{(j)} + \lambda x_k^{(i)}$$

$$\frac{\partial J}{\partial \theta_k^{(j)}} = \sum_{i:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)x_k^{(i)} + \lambda \theta_k^{(j)}$$

% You need to return the following values correctly
J = 0;
X_grad = zeros(size(X));
Theta_grad = zeros(size(Theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost function and gradient for collaborative
%               filtering. Concretely, you should first implement the cost
%               function (without regularization) and make sure it
%               matches our costs. After that, you should implement the 
%               gradient and use the checkCostFunction routine to check
%               that the gradient is correct. Finally, you should implement
%               regularization.
%
% Notes: X - num_movies  x num_features matrix of movie features
%        Theta - num_users  x num_features matrix of user features
%        Y - num_movies x num_users matrix of user ratings of movies
%        R - num_movies x num_users matrix, where R(i, j) = 1 if the 
%            i-th movie was rated by the j-th user
%
% You should set the following variables correctly:
%
%        X_grad - num_movies x num_features matrix, containing the 
%                 partial derivatives w.r.t. to each element of X
%        Theta_grad - num_users x num_features matrix, containing the 
%                     partial derivatives w.r.t. to each element of Theta
%


% Prediction errors, kept only where a rating exists (R masks unrated entries)
errors = ((X * Theta' - Y) .* R);
squaredErrors = errors .^ 2;
% Regularized cost: squared-error term plus the two regularization terms
J = ((1 / 2) * sum(squaredErrors(:))) + ((lambda / 2) * sum(Theta(:) .^ 2)) + ((lambda / 2) * sum(X(:) .^ 2));
% Regularized gradients with respect to X and Theta
X_grad = errors * Theta + (lambda .* X);
Theta_grad = errors' * X + (lambda .* Theta);

% =============================================================

grad = [X_grad(:); Theta_grad(:)];
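
After implementing the function, ex8_cofi.m verifies the gradients numerically with checkCostFunction and then trains with fmincg; a minimal sketch of that flow, where the lambda values and options shown are the exercise's defaults (treat the exact surrounding script as an assumption):

% Gradient check including regularization (lambda = 1.5 in the exercise)
checkCostFunction(1.5);

% Train: unroll the initial parameters and minimize the regularized cost
initial_parameters = [X(:); Theta(:)];
lambda = 10;
options = optimset('GradObj', 'on', 'MaxIter', 100);
theta = fmincg(@(t) cofiCostFunc(t, Ynorm, R, num_users, num_movies, ...
                                 num_features, lambda), ...
               initial_parameters, options);
% The returned theta is unrolled; reshape it back into X and Theta as above.
% (Ynorm is the mean-normalized rating matrix produced earlier in the script.)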