1.3 Feedforward and cost function
Recall that the cost function for the neural network (without regularization) is
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[-y_k^{(i)}\log\left((h_\theta(x^{(i)}))_k\right)-\left(1-y_k^{(i)}\right)\log\left(1-(h_\theta(x^{(i)}))_k\right)\right]$$
1.4 Regularized cost function
The cost function for neural networks with regularization is given by
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[-y_k^{(i)}\log\left((h_\theta(x^{(i)}))_k\right)-\left(1-y_k^{(i)}\right)\log\left(1-(h_\theta(x^{(i)}))_k\right)\right] + \frac{\lambda}{2m}\left[\sum_{j=1}^{25}\sum_{k=1}^{400}\left(\Theta_{j,k}^{(1)}\right)^2 + \sum_{j=1}^{10}\sum_{k=1}^{25}\left(\Theta_{j,k}^{(2)}\right)^2\right]$$
2 Backpropagation
In this part of the exercise, you will implement the backpropagation algorithm to compute the gradient for the neural network cost function. You will need to complete the nnCostFunction.m so that it returns an appropriate value for grad.
% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
% following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
% variable J. After implementing Part 1, you can verify that your
% cost function computation is correct by verifying the cost
% computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
% Theta1_grad and Theta2_grad. You should return the partial derivatives of
% the cost function with respect to Theta1 and Theta2 in Theta1_grad and
% Theta2_grad, respectively. After implementing Part 2, you can check
% that your implementation is correct by running checkNNGradients
%
% Note: The vector y passed into the function is a vector of labels
% containing values from 1..K. You need to map this vector into a
% binary vector of 1's and 0's to be used with the neural network
% cost function.
%
% Hint: We recommend implementing backpropagation using a for-loop
% over the training examples if you are implementing it for the
% first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
% Hint: You can implement this around the code for
% backpropagation. That is, you can compute the gradients for
% the regularization separately and then add them to Theta1_grad
% and Theta2_grad from Part 2.
%
% Implement forward propagation
X = [ones(m, 1), X];                             % add bias column: a1 = [1, x]
Layer_Hidden1 = X * Theta1';                     % z2 = a1 * Theta1'
Layer_Hidden2 = sigmoid(Layer_Hidden1);          % a2 = g(z2)
Layer_Hidden3 = [ones(m, 1), Layer_Hidden2];     % add bias unit to a2
Layer_Output = sigmoid(Layer_Hidden3 * Theta2'); % a3 = h_theta(x)
for i = 1:m
    % Recode the label y(i) as a one-hot vector containing only 0s and 1s
    labels = zeros(num_labels, 1); % num_labels x 1 vector, initialized to 0
    labels(y(i)) = 1;              % set the entry for the correct class to 1
    % accumulate the cross-entropy cost for this example
    J = J + log(Layer_Output(i, :)) * (-labels) - log(1 - Layer_Output(i, :)) * (1 - labels);
    % output-layer error: delta3 = a3 - y
    diff_output = Layer_Output(i, :)' - labels;
    % gradient contribution for Theta2: delta3 * a2'
    delta2 = diff_output * Layer_Hidden3(i, :);
    % hidden-layer error via the chain rule: (Theta2 without bias)' * delta3 .* g'(z2)
    diff_hidden = Theta2(:, 2:end)' * diff_output .* sigmoidGradient(Layer_Hidden1(i, :)');
    % gradient contribution for Theta1: delta2 * a1'
    delta1 = diff_hidden * X(i, :);
    Theta2_grad = Theta2_grad + delta2;
    Theta1_grad = Theta1_grad + delta1;
end
J = J / m;
Theta2_grad = Theta2_grad / m;
Theta1_grad = Theta1_grad / m;
% Implement regularization with the cost function and gradients.
Theta1_regular = [zeros(hidden_layer_size,1), Theta1(:, 2:end)];
Theta2_regular = [zeros(num_labels,1), Theta2(:, 2:end)];
J = J + (sum(sum(Theta1_regular.^2)) + sum(sum(Theta2_regular.^2))) * lambda / (2 * m);
Theta1_grad = Theta1_grad + Theta1_regular * lambda / m;
Theta2_grad = Theta2_grad + Theta2_regular * lambda / m;
% -------------------------------------------------------------
% =========================================================================
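The `checkNNGradients` routine mentioned above compares the analytic gradients against a finite-difference estimate. The core idea can be sketched in NumPy on a simple function whose gradient is known in closed form; `numerical_gradient` and the quadratic `f` below are hypothetical names for this illustration, not part of the exercise code.

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-4):
    """Two-sided finite-difference estimate of the gradient of f at theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        # perturb one component at a time in both directions
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return grad

f = lambda t: np.sum(t ** 2)          # analytic gradient is 2 * t
theta = np.array([1.0, -2.0, 3.0])
print(numerical_gradient(f, theta))   # should be close to [2, -4, 6]
```

In practice the same check is run against the backpropagation gradients: if the relative difference between the two estimates is tiny (around 1e-9 in the exercise), the implementation is almost certainly correct.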
2.1 Sigmoid gradient
To help you get started with this part of the exercise, you will first implement the sigmoid gradient function. The gradient for the sigmoid function can be computed as
$$g'(z) = \frac{d}{dz}g(z) = g(z)\left(1 - g(z)\right)$$
where
$$\mathrm{sigmoid}(z) = g(z) = \frac{1}{1 + e^{-z}}$$
When you are done, try testing a few values by calling sigmoidGradient(z) at the Octave/MATLAB command line.
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the gradient of the sigmoid function evaluated at
% each value of z (z can be a matrix, vector or scalar).
g = sigmoid(z) .* (1 - sigmoid(z));
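A useful spot check: `sigmoidGradient(0)` should return exactly 0.25 (the sigmoid's slope is steepest at zero), and large positive or negative `z` should give values near zero. The same computation in a NumPy sketch (function names here are illustrative, not the course files):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(z):
    g = sigmoid(z)
    return g * (1 - g)   # g'(z) = g(z) * (1 - g(z))

print(sigmoid_gradient(np.array([-10.0, 0.0, 10.0])))
```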
2.2 Random initialization
Your job is to complete randInitializeWeights.m to initialize the weights for Θ; modify the file and fill in the following code:
% ====================== YOUR CODE HERE ======================
% Instructions: Initialize W randomly so that we break the symmetry while
% training the neural network.
%
% Note: The first row of W corresponds to the parameters for the bias units
%
% Randomly initialize the weights to small values
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
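The value 0.12 is not arbitrary: the exercise suggests choosing the range from the layer sizes via $\epsilon_{init} = \sqrt{6}/\sqrt{L_{in} + L_{out}}$, which gives roughly 0.12 for the 400-unit input and 25-unit hidden layer. A NumPy sketch of the same initialization (the function name is made up for this illustration):

```python
import numpy as np

def rand_initialize_weights(L_in, L_out):
    """Uniform init in [-eps, eps], eps chosen from the layer sizes."""
    # heuristic from the exercise text: epsilon_init = sqrt(6) / sqrt(L_in + L_out)
    epsilon_init = np.sqrt(6) / np.sqrt(L_in + L_out)
    # one extra column for the bias-unit weights
    return np.random.rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init

W = rand_initialize_weights(400, 25)
print(W.shape)  # (25, 401)
```

Initializing symmetrically around zero with small magnitudes keeps the sigmoid units in their responsive range while still breaking symmetry between hidden units.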