Exercise 4: Neural Networks Learning
In ex4.m, the files we need to complete are sigmoidGradient.m, randInitializeWeights.m, and nnCostFunction.m.
- Part 1: Loading and Visualizing Data
This part is the same as in ex3, so I won't go into detail here; I'll just paste the relevant code from ex4.m to help understand it.
% Load Training Data
fprintf('Loading and Visualizing Data ...\n')
load('ex4data1.mat'); % X: 5000x400, y: 5000x1; 5000 examples, each a 20x20-pixel image (400 features)
m = size(X, 1); % m = 5000
% Randomly select 100 data points to display
sel = randperm(size(X, 1)); % returns a row vector containing a random permutation of the integers 1..5000
sel = sel(1:100); % together, these two lines pick 100 random indices from 1..5000
displayData(X(sel, :));
fprintf('Program paused. Press enter to continue.\n');
pause;
- Part 2: Loading Parameters
% In this part of the exercise, we load some pre-initialized
% neural network parameters.
fprintf('\nLoading Saved Neural Network Parameters ...\n')
% Load the weights into variables Theta1 and Theta2
load('ex4weights.mat');
% Unroll parameters
nn_params = [Theta1(:) ; Theta2(:)]; % Theta1: 25x401, Theta2: 10x26, unrolled into a single column vector
- Part 3: Compute Cost (Feedforward)
% We suggest implementing the feedforward cost *without* regularization
% first so that it will be easier for you to debug. Later, in part 4, you
% will get to implement the regularized cost.
fprintf('\nFeedforward Using Neural Network ...\n')
% Weight regularization parameter (we set this to 0 here).
lambda = 0;
J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
num_labels, X, y, lambda);
fprintf(['Cost at parameters (loaded from ex4weights): %f '...
'\n(this value should be about 0.287629)\n'], J);
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
First, look at the signature: function [J grad] = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lambda).
nn_params is the column vector obtained by unrolling Theta1 and Theta2; input_layer_size = 400 (20x20 pixels), hidden_layer_size = 25, and num_labels = 10 correspond to the three layers; X and y are the training data.
The parameters for the neural network are “unrolled” into the vector nn_params and need to be converted back into the weight matrices.
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
% reshape(vector, rows, cols): rebuild each weight matrix from the corresponding slice of the unrolled vector
% Setup some useful variables
m = size(X, 1); % number of training examples
% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
nnCostFunction.m Part 1: Feedforward the neural network and return the cost in the variable J. After implementing Part 1, you can verify that your cost function computation is correct by checking the value printed by ex4.m (about 0.287629, as shown above).
The comments below should make this clear enough.
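For reference, the regularized cost that this block computes is the one given in the exercise handout:

J(\Theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[-y_k^{(i)}\log\big((h_\Theta(x^{(i)}))_k\big) - (1-y_k^{(i)})\log\big(1-(h_\Theta(x^{(i)}))_k\big)\Big] + \frac{\lambda}{2m}\Big[\sum_{j,k}\big(\Theta^{(1)}_{j,k}\big)^2 + \sum_{j,k}\big(\Theta^{(2)}_{j,k}\big)^2\Big]

Here K = num_labels, h_\Theta(x^{(i)}) is the 10-dimensional network output for example i, and the regularization sums skip the first column (bias weights) of each Theta.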
X = [ones(m,1) X]; % X:5000x401 Theta_1:25x401 Theta_2:10x26
z2 = Theta1 * X'; %z2: 25x5000
a2 = sigmoid(z2); % hidden layer activations
tmp = size(a2,2);
a2 = [ones(1,tmp);a2]; % a2:26x5000
z3 = (Theta2 * a2)'; % z3: 5000x10
h = sigmoid(z3);
temp_y = zeros(m,num_labels); % 5000x10
for c = 1 : m,
temp_y(c,y(c)) = 1;
end;
J = 1/m*sum(sum(-temp_y.*log(h)-(1-temp_y).*log(1-h)));
J = J + lambda/2/m*(sum(sum(Theta1(:,2:end).^2))+sum(sum(Theta2(:,2:end).^2)));
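A side note: the for loop that builds temp_y above can also be replaced by a vectorized version; a small equivalent sketch (not required by the exercise):

I_mat = eye(num_labels); % 10x10 identity matrix
temp_y = I_mat(y, :);    % 5000x10: row i is the one-hot encoding of y(i)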
nnCostFunction.m Part 2: Implement the backpropagation algorithm to compute the gradients Theta1_grad and Theta2_grad.
delta3=h-temp_y; %5000*10
delta2=(Theta2)'*(delta3'); % Theta2:10*26 delta: 5000*10 ->delta2:26*5000
delta2=delta2(2:end,:); %25*5000
delta2=delta2.*sigmoidGradient(z2); %25*5000.*25*5000
acc_grad1=zeros(size(Theta1));
acc_grad2=zeros(size(Theta2));
acc_grad1=(acc_grad1+delta2*(X)); %25*5000*5000*401=25*401
acc_grad2=(acc_grad2+(delta3')*(a2)'); %(5000*10)'*(26*5000)'=10*26
Theta1_grad=acc_grad1/m;
Theta2_grad=acc_grad2/m;
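In equation form, the backpropagation steps implemented above are (course notation):

\delta^{(3)} = a^{(3)} - y
\delta^{(2)} = \big((\Theta^{(2)})^T\delta^{(3)}\big) \circ g'(z^{(2)}) \quad (\text{bias row removed})
\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}(a^{(l)})^T
\frac{\partial J}{\partial \Theta^{(l)}} = \frac{1}{m}\Delta^{(l)} \quad (\text{without regularization})

where \circ denotes element-wise multiplication; acc_grad1 and acc_grad2 play the role of \Delta^{(1)} and \Delta^{(2)}.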
nnCostFunction.m Part 3: Implement regularization with the cost function and gradients.
reg1=lambda/m.*Theta1(:,2:end);
reg2=lambda/m.*Theta2(:,2:end);
reg1=[zeros(size(Theta1,1),1),reg1];
reg2=[zeros(size(Theta2,1),1),reg2];
Theta1_grad=Theta1_grad+reg1;
Theta2_grad=Theta2_grad+reg2;
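In formula form, the regularized gradient is

\frac{\partial J}{\partial \Theta^{(l)}_{ij}} = \frac{1}{m}\Delta^{(l)}_{ij} \quad \text{for } j = 0, \qquad \frac{\partial J}{\partial \Theta^{(l)}_{ij}} = \frac{1}{m}\Delta^{(l)}_{ij} + \frac{\lambda}{m}\Theta^{(l)}_{ij} \quad \text{for } j \ge 1

which is why the first column of reg1 and reg2 is set to zeros before adding.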
- Part 4: Implement Regularization
fprintf('\nChecking Cost Function (w/ Regularization) ... \n')
% Weight regularization parameter (we set this to 1 here).
lambda = 1;
J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
num_labels, X, y, lambda);
fprintf(['Cost at parameters (loaded from ex4weights): %f '...
'\n(this value should be about 0.383770)\n'], J);
fprintf('Program paused. Press enter to continue.\n');
pause;
- Part 5: Sigmoid Gradient
fprintf('\nEvaluating sigmoid gradient...\n')
g = sigmoidGradient([-1 -0.5 0 0.5 1]);
fprintf('Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:\n ');
fprintf('%f ', g);
fprintf('\n\n');
fprintf('Program paused. Press enter to continue.\n');
pause;
sigmoidGradient.m
g = zeros(size(z));
g_sig = sigmoid(z);
g = g_sig .* (1 - g_sig); % derivative of the sigmoid, element-wise
This follows directly from the derivative of the sigmoid function.
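For reference, the gradient being computed is

g'(z) = \frac{d}{dz}\,\mathrm{sigmoid}(z) = \mathrm{sigmoid}(z)\big(1 - \mathrm{sigmoid}(z)\big), \qquad \mathrm{sigmoid}(z) = \frac{1}{1+e^{-z}}

applied element-wise, so z may be a scalar, vector, or matrix.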
- Part 6: Initializing Parameters
% In this part of the exercise, you will be starting to implement a two
% layer neural network that classifies digits. You will start by
% implementing a function to initialize the weights of the neural network
% (randInitializeWeights.m)
fprintf('\nInitializing Neural Network Parameters ...\n')
initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
% Unroll parameters
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];
Your job is to complete randInitializeWeights.m to initialize the weights for Θ; modify the file and fill in the following code:
% Randomly initialize the weights to small values
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
randInitializeWeights.m
W = zeros(L_out, 1 + L_in);
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
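The handout also mentions a rule of thumb for choosing epsilon_init from the sizes of the adjoining layers (the fixed value 0.12 already works well for this network):

\epsilon_{init} = \frac{\sqrt{6}}{\sqrt{L_{in} + L_{out}}}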
The remaining parts are already implemented in the provided files; I haven't fully worked through them yet, so I'll come back and complete this post once I have.
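For completeness, the remaining parts of ex4.m roughly do the following: hand nnCostFunction to fmincg to train the network, then use the provided predict.m to measure training accuracy. A rough sketch based on the provided script (the exact MaxIter and lambda values there may differ):

options = optimset('MaxIter', 50); % optimizer iteration budget (value in ex4.m may differ)
lambda = 1;
costFunction = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
num_labels, X, y, lambda);
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);
% Reshape the learned vector back into Theta1/Theta2, then evaluate on the training set
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
pred = predict(Theta1, Theta2, X);
fprintf('Training Set Accuracy: %f\n', mean(double(pred == y)) * 100);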