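I have recently been working through the UFLDL Tutorial, a set of tutorials on unsupervised feature learning and deep learning. Andrew Ng clearly put a great deal of care into it. I am posting my code below so that others can study and debug with it. All of the code has been run and tested in MATLAB.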
Convolution and Pooling
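This chapter uses a convolutional neural network for classification. The images fall into four classes: airplane, car, cat, and dog (see Figure 1). Each image is 64*64*3 (color). There are 2000 training images and 3200 test images. The convolutional network reaches an accuracy of about 80%, which is a quite good result.
Figure 1
Writing the code
1. cnnExercise.m: the main script. It covers feature training and evaluation of the results.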
Step 0. Initialize the parameters. Nothing to write.
Step 1. Train the Sparse Autoencoder. Code:
% Load the weights trained in the Linear Decoders with Autoencoders chapter
load STL10Features % contains optTheta, ZCAWhite, meanPatch
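Step 2a. Compute the convolved images.
See cnnConvolve.m, listed later in this post.
Step 2b. Check cnnConvolve.m. Nothing to write.
Step 2c. Run pooling. Pooling reduces the dimensionality and makes the features less sensitive to small shifts of the image. See cnnPool.m, listed later in this post.
Step 2d. Check cnnPool.m. Nothing to write.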
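Step 3. Run convolution and pooling on the training and test sets. Nothing to write. On my Core i7 machine this took roughly half an hour. The computed features are written to disk as cnnPooledFeatures.mat, so later runs can load them and go straight to classification instead of recomputing them.
Step 4. Train the Softmax classifier on the training set. Nothing to write.
Step 5. Evaluate on the test set. Nothing to write. My accuracy was 78.56%. Because the weights are randomly initialized, the result may differ slightly from run to run.
2. cnnConvolve.m: computes the convolved images. The parts you have to write yourself are scattered through the file, so I am posting the whole .m file. UFLDL already provides the scaffolding; only a small part needs to be written.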
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
% patchDim - patch (feature) dimension
% numFeatures - number of features
% images - large images to convolve with, matrix in the form
% images(r, c, channel, image number)
% W, b - W, b for features from the sparse autoencoder
% ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
% preprocessing
%
% Returns:
% convolvedFeatures - matrix of convolved features in the form
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
% Instructions:
% Convolve every feature with every large image here to produce the
% numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
% matrix convolvedFeatures, such that
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
% value of the convolved featureNum feature for the imageNum image over
% the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
% Convolving with 100 images should take less than 3 minutes
% Convolving with 5000 images should take around an hour
% (So to save time when testing, you should convolve with less images, as
% described earlier)
% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps
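% Debug-only visualization, disabled by default: it indexes images 7 and 8,
% so it errors out when fewer than 8 images are passed in.
% subplot(247)
% imagesc(images(:,:,:,7))
% subplot(248)
% imagesc(images(:,:,:,8))
% Fold the ZCA whitening into the weights (see UFLDL)
WT = W*ZCAWhite;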
% --------------------------------------------------------
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
for imageNum = 1:numImages
for featureNum = 1:numFeatures
% convolution of image with feature matrix for each channel
convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
for channel = 1:3
% Obtain the feature (patchDim x patchDim) needed during the convolution
% ---- YOUR CODE HERE ----
feature = zeros(8,8); % You should replace this
% Weights for the current featureNum and the current channel. size: 1*64
WT_curr = WT(featureNum, (channel-1)*patchDim*patchDim+1:channel*patchDim*patchDim);
feature = reshape(WT_curr, patchDim, patchDim); % size: 8*8
% ------------------------
% Flip the feature matrix because of the definition of convolution, as explained later
feature = flipud(fliplr(squeeze(feature)));
% Obtain the image
im = squeeze(images(:, :, channel, imageNum)); % image for the current imageNum and channel
% Convolve "feature" with "im", adding the result to convolvedImage
% be sure to do a 'valid' convolution
% ---- YOUR CODE HERE ----
tmp = conv2(im, feature, 'valid'); % 'valid' keeps only fully overlapping positions, so no border cropping is needed
convolvedImage = convolvedImage + tmp; % accumulate over the channels
% ------------------------
end
% Subtract the bias unit (correcting for the mean subtraction as well)
% Then, apply the sigmoid function to get the hidden activation
% ---- YOUR CODE HERE ----
convolvedImage = convolvedImage - WT(featureNum,:)*meanPatch + b(featureNum); % subtract the mean-patch term and add the bias, see UFLDL
convolvedImage = sigmoid(convolvedImage); % apply the sigmoid function
% ------------------------
% The convolved feature is the sum of the convolved values for all channels
convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
end
end
end
function sigm = sigmoid(x)
sigm = 1 ./ (1 + exp(-x));
end
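The two ideas cnnConvolve relies on can be checked on toy data: flipping the filter before conv2 turns the convolution into the patch-wise dot product the autoencoder computes, and the whitening and mean subtraction can be folded into the weights. The following is only a minimal sketch; every matrix in it is a made-up illustrative value, not data from the exercise.

```matlab
% --- Check 1: flip + 'valid' convolution equals a sliding dot product ---
patchDim = 3; imageDim = 6;
im      = magic(imageDim);                    % toy single-channel image (illustrative)
feature = reshape(1:patchDim^2, patchDim, patchDim);  % toy filter (illustrative)

flipped = flipud(fliplr(feature));            % same flip as in cnnConvolve
viaConv = conv2(im, flipped, 'valid');        % size: (imageDim - patchDim + 1)^2

outDim = imageDim - patchDim + 1;
viaDot = zeros(outDim, outDim);
for r = 1:outDim
    for c = 1:outDim
        patch = im(r:r+patchDim-1, c:c+patchDim-1);
        viaDot(r, c) = sum(sum(patch .* feature));   % dot product with the unflipped filter
    end
end
maxDiffConv = max(abs(viaConv(:) - viaDot(:)));

% --- Check 2: W*ZCAWhite*(x - meanPatch) + b == WT*x - WT*meanPatch + b ---
W = [1 2; 3 4]; ZCAWhite = [0.5 0; 0.25 2];   % toy values (illustrative)
meanPatch = [1; 2]; b = [0.1; 0.2]; x = [3; 5];
WT = W * ZCAWhite;
direct = W * (ZCAWhite * (x - meanPatch)) + b;
folded = WT * x - WT * meanPatch + b;
maxDiffFold = max(abs(direct - folded));

fprintf('conv vs dot: %g, direct vs folded: %g\n', maxDiffConv, maxDiffFold);
```

Both differences should be zero up to floating-point rounding, which is why the per-feature term WT(featureNum,:)*meanPatch can be subtracted once after the channel loop instead of preprocessing every patch.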
3. cnnPool.m: performs the pooling. The convolved image is 57*57 and the tutorial uses a 19*19 pooling region. This part is fairly easy. Code:
row = floor(convolvedDim / poolDim);
col = floor(convolvedDim / poolDim);
for imageNum = 1:numImages
for featureNum = 1:numFeatures
for i1 = 1:row
for j1 = 1:col
tmpM = convolvedFeatures(featureNum, imageNum, (i1-1)*poolDim+1:i1*poolDim, (j1-1)*poolDim+1:j1*poolDim);
pooledFeatures(featureNum, imageNum, i1, j1) = mean(mean(tmpM));
end
end
end
end
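To see what the loop above computes, here is a small sketch of the same mean pooling on a single toy feature map. The 6*6 map and the 3*3 pooling region are illustrative assumptions, chosen so the result can be checked by hand; the exercise itself uses 57*57 and 19*19.

```matlab
convolved = reshape(1:36, 6, 6);     % toy convolved feature map (illustrative)
poolDim   = 3;                       % toy pooling size (illustrative)
numRegions = floor(size(convolved, 1) / poolDim);

pooled = zeros(numRegions, numRegions);
for i1 = 1:numRegions
    for j1 = 1:numRegions
        block = convolved((i1-1)*poolDim+1:i1*poolDim, (j1-1)*poolDim+1:j1*poolDim);
        pooled(i1, j1) = mean(block(:));      % mean over the whole region
    end
end
disp(pooled);   % displays the 2x2 matrix [8 26; 11 29]

% The 4-D slice used in cnnPool has size 1 x 1 x poolDim x poolDim.
% mean(mean(.)) works on it because mean skips leading singleton dimensions.
slice4d = zeros(1, 1, poolDim, poolDim);
slice4d(1, 1, :, :) = convolved(1:poolDim, 1:poolDim);
nestedMean = mean(mean(slice4d));
flatMean   = mean(slice4d(:));
```

Writing mean(tmpM(:)) instead of mean(mean(tmpM)) gives the same number and does not depend on which dimensions happen to be singleton.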
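4. RecognizeKQQ.m: a script I added myself. Because cnnExercise.m writes out cnnPooledFeatures, the softmax training and testing can be run directly on the saved features, without the long computation. I wrote this script only for my own convenience when testing. It just loads some data and then copies Step 4 and Step 5 over unchanged. Code: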
close all
load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat % loads numTestImages, testImages, testLabels
load cnnPooledFeatures
%% STEP 4: Use pooled features for classification
% Now, you will use your pooled features to train a softmax classifier,
% using softmaxTrain from the softmax exercise.
% Training the softmax classifier for 1000 iterations should take less than
% 10 minutes.
% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/
% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
numTrainImages);
softmaxY = trainLabels;
options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
numClasses, softmaxLambda, softmaxX, softmaxY, options);
%%======================================================================
%% STEP 5: Test classifier
% Now you will test your trained classifier against the test images
softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;
[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);
% You should expect to get an accuracy of around 80% on the test images.
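The permute and reshape calls in Step 4 and Step 5 turn the 4-D pooled features into one column per image. A minimal sketch on a tiny array makes the layout visible; the sizes and the encoded values below are illustrative assumptions, chosen so that each entry reveals its own indices.

```matlab
numFeatures = 2; numImages = 3; pooledDim = 2;    % toy sizes (illustrative)
pooled = zeros(numFeatures, numImages, pooledDim, pooledDim);
for f = 1:numFeatures
    for i = 1:numImages
        for r = 1:pooledDim
            for c = 1:pooledDim
                % Encode the indices in the value: image, feature, row, col
                pooled(f, i, r, c) = 1000*i + 100*f + 10*r + c;
            end
        end
    end
end

% Move the image index to the last dimension, then flatten everything else
X = permute(pooled, [1 3 4 2]);                   % feature x row x col x image
X = reshape(X, numel(pooled) / numImages, numImages);

disp(size(X));     % displays 8 3
disp(X(:, 1)');    % every entry of column 1 starts with the digit 1 (image 1)
```

Without the permute, reshape would interleave values from different images in the same column, because MATLAB flattens arrays in column-major order.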
Summary
Let us summarize the structure of the network, as shown in the figure below.
An accuracy close to 80% is quite good!
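The sizes flowing through the pipeline can be worked out with a few lines of arithmetic. This is a rough sketch: the image, patch, and pooling sizes come from this post, while numFeatures = 400 is an assumption (the hidden size I believe the linear decoder exercise uses) and should be replaced by whatever your STL10Features were trained with.

```matlab
imageDim = 64; imageChannels = 3;     % from this post: 64*64*3 images
patchDim = 8;  poolDim = 19;          % from this post: 8*8 patches, 19*19 pooling
numFeatures = 400;                    % ASSUMPTION: hidden size of the autoencoder
numClasses  = 4;                      % airplane, car, cat, dog

visibleSize      = patchDim^2 * imageChannels;       % 192 inputs per patch
convolvedDim     = imageDim - patchDim + 1;          % 57
pooledDim        = floor(convolvedDim / poolDim);    % 3
softmaxInputSize = numFeatures * pooledDim^2;        % 3600 under the assumption above

fprintf('patch input: %d\n', visibleSize);
fprintf('convolved map: %d x %d per feature\n', convolvedDim, convolvedDim);
fprintf('pooled map: %d x %d per feature\n', pooledDim, pooledDim);
fprintf('softmax: %d inputs -> %d classes\n', softmaxInputSize, numClasses);
```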