[Coursera Machine Learning] Neural Networks Learning: Week 5 Programming Assignment

1.3 Feedforward and cost function

Recall that the cost function for the neural network (without regularization) is
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ -y_k^{(i)} \log\left( (h_\theta(x^{(i)}))_k \right) - \left(1 - y_k^{(i)}\right) \log\left( 1 - (h_\theta(x^{(i)}))_k \right) \right]
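As a sanity check, the double sum can also be written in a vectorized form. The sketch below is one possible way to do it inside nnCostFunction.m, assuming `a3` already holds the m x K matrix of output activations and `Y` is the m x K one-hot recoding of `y` (both names are illustrative, not part of the starter code):

% One-hot recoding of the labels: Y(i, k) = 1 if and only if y(i) == k
Y = zeros(m, num_labels);
Y(sub2ind(size(Y), (1:m)', y)) = 1;

% Unregularized cost: element-wise products, then sum over examples and classes
J = (1 / m) * sum(sum(-Y .* log(a3) - (1 - Y) .* log(1 - a3)));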

1.4 Regularized cost function

The cost function for neural networks with regularization is given by
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ -y_k^{(i)} \log\left( (h_\theta(x^{(i)}))_k \right) - \left(1 - y_k^{(i)}\right) \log\left( 1 - (h_\theta(x^{(i)}))_k \right) \right] + \frac{\lambda}{2m} \left[ \sum_{j=1}^{25} \sum_{k=1}^{400} \left( \Theta_{j,k}^{(1)} \right)^2 + \sum_{j=1}^{10} \sum_{k=1}^{25} \left( \Theta_{j,k}^{(2)} \right)^2 \right]
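Note that the regularization term only involves the non-bias weights; the first column of each Theta matrix corresponds to the bias unit and is excluded. A minimal sketch of just this extra term, assuming the unregularized cost J and the reshaped Theta1 and Theta2 from nnCostFunction.m are already available:

% Sum of squared weights, skipping the first column (bias terms)
reg = sum(sum(Theta1(:, 2:end) .^ 2)) + sum(sum(Theta2(:, 2:end) .^ 2));

% Add the regularization term to the unregularized cost
J = J + (lambda / (2 * m)) * reg;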

2 Backpropagation

In this part of the exercise, you will implement the backpropagation algorithm to compute the gradient for the neural network cost function. You will need to complete nnCostFunction.m so that it returns an appropriate value for grad.

% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
%               following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
%         variable J. After implementing Part 1, you can verify that your
%         cost function computation is correct by verifying the cost
%         computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
%         Theta1_grad and Theta2_grad. You should return the partial derivatives of
%         the cost function with respect to Theta1 and Theta2 in Theta1_grad and
%         Theta2_grad, respectively. After implementing Part 2, you can check
%         that your implementation is correct by running checkNNGradients
%
%         Note: The vector y passed into the function is a vector of labels
%               containing values from 1..K. You need to map this vector into a 
%               binary vector of 1's and 0's to be used with the neural network
%               cost function.
%
%         Hint: We recommend implementing backpropagation using a for-loop
%               over the training examples if you are implementing it for the 
%               first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
%         Hint: You can implement this around the code for
%               backpropagation. That is, you can compute the gradients for
%               the regularization separately and then add them to Theta1_grad
%               and Theta2_grad from Part 2.
%

% Forward propagation (vectorized over all m examples)
X = [ones(m, 1), X];                             % add bias column: a1 = [1, x]
Layer_Hidden1 = X * Theta1';                     % z2 = a1 * Theta1'
Layer_Hidden2 = sigmoid(Layer_Hidden1);          % a2 = g(z2)
Layer_Hidden3 = [ones(m, 1), Layer_Hidden2];     % add bias column to a2
Layer_Output = sigmoid(Layer_Hidden3 * Theta2'); % a3 = h_theta(x)

for i = 1:m
    % Recode the label y(i) as a one-hot vector containing only 0's and 1's
    labels = zeros(num_labels, 1); % num_labels x 1 vector, initialized to 0
    result = y(i);                 % class label of example i (1..K)
    labels(result) = 1;            % set the entry of the correct class to 1

    % Accumulate the cross-entropy cost for this example
    J = J + log(Layer_Output(i, :)) * (-labels) - log(1 - Layer_Output(i, :)) * (1 - labels);

    % Error term for the output layer: delta3 = a3 - y
    diff_output = Layer_Output(i, :)' - labels;
    % Gradient contribution for Theta2
    delta2 = diff_output * Layer_Hidden3(i, :);
    % Back-propagate the error to the hidden layer (skip the bias column of Theta2)
    diff_hidden = Theta2(:, 2:end)' * diff_output .* sigmoidGradient(Layer_Hidden1(i, :)');
    % Gradient contribution for Theta1
    delta1 = diff_hidden * X(i, :);

    Theta2_grad = Theta2_grad + delta2;
    Theta1_grad = Theta1_grad + delta1;
end

% Average the accumulated cost and gradients over all training examples
J = J / m;
Theta2_grad = Theta2_grad / m;
Theta1_grad = Theta1_grad / m;

% Regularization for the cost function and the gradients.
% Zero out the first column so the bias weights are not regularized.
Theta1_regular = [zeros(hidden_layer_size, 1), Theta1(:, 2:end)];
Theta2_regular = [zeros(num_labels, 1), Theta2(:, 2:end)];

J = J + (sum(sum(Theta1_regular .^ 2)) + sum(sum(Theta2_regular .^ 2))) * lambda / (2 * m);
Theta1_grad = Theta1_grad + Theta1_regular * lambda / m;
Theta2_grad = Theta2_grad + Theta2_regular * lambda / m;
% -------------------------------------------------------------

% =========================================================================
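Once the gradients above are in place, the exercise's checkNNGradients script compares them against numerical gradients on a small test network; a relative difference on the order of 1e-9 indicates a correct implementation. One typical way to run it from the Octave/MATLAB prompt (the regularized check with lambda = 3 mirrors what ex4.m does):

% Check the unregularized gradients
checkNNGradients;

% Check the regularized gradients with a non-zero lambda
lambda = 3;
checkNNGradients(lambda);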

2.1 Sigmoid gradient

To help you get started with this part of the exercise, you will first implement the sigmoid gradient function. The gradient of the sigmoid function can be computed as

g'(z) = \frac{d}{dz} g(z) = g(z)\,(1 - g(z))

\text{sigmoid}(z) = g(z) = \frac{1}{1 + e^{-z}}

When you are done, try testing a few values by calling sigmoidGradient(z) at the Octave/MATLAB command line.

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the gradient of the sigmoid function evaluated at
%               each value of z (z can be a matrix, vector or scalar).

g = sigmoid(z) .* (1 - sigmoid(z));
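A quick check of the implementation: since g(0) = 0.5, the gradient at z = 0 should be exactly 0.25, and it falls off symmetrically for large positive or negative z. For example:

sigmoidGradient(0)                 % expected: 0.2500
sigmoidGradient([-1 -0.5 0 0.5 1]) % expected (approx.): 0.1966 0.2350 0.2500 0.2350 0.1966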

2.2 Random initialization

Your job is to complete randInitializeWeights.m to initialize the weights for Θ; modify the file and fill in the following code:

% ====================== YOUR CODE HERE ======================
% Instructions: Initialize W randomly so that we break the symmetry while
%               training the neural network.
%
% Note: The first row of W corresponds to the parameters for the bias units
%

% Randomly initialize the weights to small values
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
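The value epsilon_init = 0.12 follows the heuristic epsilon_init = sqrt(6) / sqrt(L_in + L_out) for this network's layer sizes (400 inputs, 25 hidden units, 10 outputs). A usage sketch in the style of ex4.m, assuming those layer sizes: initialize both weight matrices and unroll them into a single parameter vector for the optimizer.

% Randomly initialize the parameters for each layer
initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);

% Unroll the parameters into a single vector for training
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];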