This exercise is about recognizing handwritten digits (0 to 9).
1. Multi-class Classification
1.1 Visualizing the data
The training set is a 5000×400 matrix: each row holds the pixel values of one 20×20 grayscale digit image, unrolled into a 400-dimensional vector.
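A quick way to load and inspect the data (this assumes the course's ex3data1.mat, which stores the examples in X and the labels in y; the digit 0 is stored as label 10):
load('ex3data1.mat');   % loads X (5000x400) and y (5000x1)
size(X)                 % each row is one 20x20 image unrolled into 400 pixels
unique(y)'              % labels are 1..10, with 10 standing for the digit 0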
Here I learned how to display the 2D data in a nice grid:
% Gray Image
colormap(gray);
% Compute rows, cols
example_width = round(sqrt(size(X, 2)));
[m, n] = size(X);
example_height = (n / example_width);
% Compute number of items to display
display_rows = floor(sqrt(m));
display_cols = ceil(m / display_rows);
% Between images padding
pad = 1;
% Setup blank display
display_array = - ones(pad + display_rows * (example_height + pad), ...
pad + display_cols * (example_width + pad));
% Copy each example into a patch on the display array
curr_ex = 1;
for j = 1:display_rows
for i = 1:display_cols
if curr_ex > m
break;
end
% Copy the patch
% Get the max value of the patch
max_val = max(abs(X(curr_ex, :)));
display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
reshape(X(curr_ex, :), example_height, example_width) / max_val;
curr_ex = curr_ex + 1;
end
if curr_ex > m
break;
end
end
% Display Image
h = imagesc(display_array, [-1 1]);
% Do not show axis
axis image off
drawnow;
1.2 Compute cost and gradient (to be completed yourself)
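For reference, the quantities to compute are the regularized logistic regression cost

J(θ) = (1/m) Σ_{i=1..m} [−y⁽ⁱ⁾ log(h_θ(x⁽ⁱ⁾)) − (1 − y⁽ⁱ⁾) log(1 − h_θ(x⁽ⁱ⁾))] + (λ/2m) Σ_{j=1..n} θ_j²

and its gradient

∂J/∂θ_j = (1/m) Σ_{i=1..m} (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) x_j⁽ⁱ⁾ + (λ/m) θ_j   (with the regularization term dropped for j = 0),

where h_θ(x) = g(θᵀx) and g is the sigmoid. The vectorized implementation: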
function [J, grad] = lrCostFunction(theta, X, y, lambda)
%LRCOSTFUNCTION Compute cost and gradient for logistic regression with
%regularization
% J = LRCOSTFUNCTION(theta, X, y, lambda) computes the cost of using
% theta as the parameter for regularized logistic regression and the
% gradient of the cost w.r.t. to the parameters.
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));
hh = sigmoid(X * theta);           % hypothesis h_theta(x) for all examples
grad = X' * (hh - y) ./ m;         % unregularized gradient
temp = theta;
temp(1) = 0;                       % do not regularize the bias term theta(1)
grad = grad + lambda / m * temp;   % add regularization for j >= 1
left = -y' * log(hh);              % contribution of the y = 1 terms
right = (1 - y)' * log(1 - hh);    % contribution of the y = 0 terms
reg = lambda / (2 * m) * (temp' * temp);
J = (left - right) / m + reg;      % regularized cost
grad = grad(:);
end
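A quick sanity check for the implementation (these test values are the ones I recall from the course's ex3 script, so treat the expected cost as approximate):
theta_t = [-2; -1; 1; 2];
X_t = [ones(5, 1) reshape(1:15, 5, 3) / 10];
y_t = ([1; 0; 1; 0; 1] >= 0.5);
lambda_t = 3;
[J, grad] = lrCostFunction(theta_t, X_t, y_t, lambda_t);
fprintf('Cost: %f (expected: about 2.534819)\n', J);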
1.3 One-vs-All Training (to be completed yourself)
for c = 1:num_labels
  initial_theta = zeros(n + 1, 1);
  options = optimset('GradObj', 'on', 'MaxIter', 50);
  % Train one binary classifier per class: (y == c) converts the labels
  % into a 0/1 vector for class c.
  [theta] = ...
      fmincg(@(t)(lrCostFunction(t, X, (y == c), lambda)), ...
             initial_theta, options);
  all_theta(c, :) = theta';
endfor
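The loop above lives inside oneVsAll.m. A minimal sketch of the surrounding skeleton it assumes (the initialization of m, n, all_theta, and the bias column, following the course's oneVsAll.m signature):
function [all_theta] = oneVsAll(X, y, num_labels, lambda)
m = size(X, 1);
n = size(X, 2);
all_theta = zeros(num_labels, n + 1);   % one row of parameters per class
X = [ones(m, 1) X];                     % prepend the bias column once, up front
% ... the training loop shown above goes here ...
end
fmincg works like fminunc but handles the large number of parameters here more efficiently, which is why the course provides it.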
1.4 Predict for One-vs-All (to be completed yourself)
% Note: X must already include the bias column here; the provided
% skeleton in predictOneVsAll.m prepends it with X = [ones(m, 1) X];
pp = sigmoid(all_theta * X');   % num_labels x m matrix of class probabilities
[max_p, inx_p] = max(pp);       % column-wise max: most probable class per example
p = inx_p';
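The main script then measures the training accuracy of the classifier; with lambda = 0.1 it should land around 95% (the exercise text quotes about 94.9%):
fprintf('Training Set Accuracy: %f\n', mean(double(p == y)) * 100);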
2. Neural Networks
The goal is to implement the feedforward propagation algorithm, using the given pre-trained weights to make predictions.
The network has three layers: 400 input units (one per pixel of the 20×20 image), 25 hidden units, and 10 output units (one per class). The corresponding weight matrices are
Θ_1 ∈ R^(25×401), Θ_2 ∈ R^(10×26),
where the extra column in each accounts for the bias unit.
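Concretely, for one input x the forward pass computes (g is the sigmoid; note the bias unit prepended to each layer's input, matching the code below):

a_1 = [1; x]
z_2 = Θ_1 a_1,        a_2 = g(z_2)
z_3 = Θ_2 [1; a_2],   a_3 = g(z_3) = h_Θ(x)

The predicted class is the index of the largest entry of a_3.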
2.1 Feedforward propagation and prediction
%% =========== Part 1: Loading and Visualizing Data =============
% Load Training Data
fprintf('Loading and Visualizing Data ...\n')
load('ex3data1.mat');
m = size(X, 1);
% Randomly select 100 data points to display
sel = randperm(size(X, 1));
sel = sel(1:100);
displayData(X(sel, :));
%% ================ Part 2: Loading Parameters ================
fprintf('\nLoading Saved Neural Network Parameters ...\n')
% Load the weights into variables Theta1 and Theta2
load('ex3weights.mat');
%% ================= Part 3: Implement Predict =================
pred = predict(Theta1, Theta2, X);
fprintf('\nTraining Set Accuracy: %f\n', mean(double(pred == y)) * 100);
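With the given pre-trained weights, the reported training set accuracy should be about 97.5%.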
We need to complete the code in predict.m:
a1 = [ones(m, 1) X];              % add bias units to the input layer
z2 = a1 * Theta1';
a2 = sigmoid(z2);                 % hidden layer activations
z3 = [ones(m, 1) a2] * Theta2';   % add bias units to the hidden layer
a3 = sigmoid(z3);                 % output layer: one probability per class
[max_h, p] = max(a3, [], 2);      % predicted class = index of the largest output
One thing to note here: a bias term must be prepended to each layer's input before that layer's computation.
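As a final check, you can also classify one image at a time, roughly as the course script does. A sketch (it assumes X, Theta1, Theta2, m, displayData, and the completed predict are already in the workspace; mod(pred, 10) maps label 10 back to the digit 0):
rp = randperm(m);                        % visit examples in random order
for i = 1:m
    displayData(X(rp(i), :));            % show one 20x20 digit
    pred = predict(Theta1, Theta2, X(rp(i), :));
    fprintf('Neural Network Prediction: %d (digit %d)\n', pred, mod(pred, 10));
    pause;                               % press a key to advance
end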