MatConvNet provides a convolution function but no dedicated fully-connected layer, so how do we train fully-connected layers on top of the convolution function?
First, be clear about one thing: a convolution layer with a 1×1 kernel and stride 1, applied to a 1×1 spatial input, is exactly a fully-connected layer.
So when configuring the network we simply use the convolution layer and set the kernel size accordingly.
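The equivalence is easy to check numerically. A minimal sketch, assuming MatConvNet is on the path (the variable names here are illustrative, not from the code below):

```matlab
% A 1x1 convolution over a 1x1xC input is a fully-connected layer.
x = randn(1, 1, 5000, 1, 'single') ;            % one sample as a 1x1x5000 volume
w = randn(1, 1, 5000, 200, 'single') ;          % 200 filters of size 1x1x5000
b = zeros(1, 200, 'single') ;
y = vl_nnconv(x, w, b, 'stride', 1, 'pad', 0) ; % output is 1x1x200

% The same result via an ordinary matrix multiply (a plain FC layer):
y_fc = reshape(w, 5000, 200)' * squeeze(x) + b' ;
max(abs(squeeze(y) - y_fc))                     % ~0, up to floating-point error
```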
Here is my configuration file:
function net = cnn_crp_init(varargin)
% CNN_CRP_INIT Initialize a fully-connected network built from 1x1
% convolutions (adapted from the MatConvNet MNIST LeNet example)
opts.batchNormalization = true ;
opts.networkType = 'simplenn' ;
opts = vl_argparse(opts, varargin) ;

rng('default');
rng(0) ;

f = 1/100 ;
net.layers = {} ;
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{f*randn(1,1,5000,200, 'single'), zeros(1, 200, 'single')}}, ...
                           'stride', 1, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{f*randn(1,1,200,200, 'single'), zeros(1,200,'single')}}, ...
                           'stride', 1, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{f*randn(1,1,200,2, 'single'), zeros(1,2,'single')}}, ...
                           'stride', 1, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'softmaxloss') ;

% optionally switch to batch normalization; the indices 1, 4, 7 account
% for the layer inserted by each previous call, so every insertBnorm
% targets a conv layer (a relu layer would fail the 'weights' assert)
if opts.batchNormalization
  net = insertBnorm(net, 1) ;
  net = insertBnorm(net, 4) ;
  net = insertBnorm(net, 7) ;
end

% Meta parameters
net.meta.inputSize = [1 1 5000] ;
net.meta.trainOpts.learningRate = 0.001 ;
net.meta.trainOpts.numEpochs = 20 ;
net.meta.trainOpts.batchSize = 100 ;

% Fill in default values
net = vl_simplenn_tidy(net) ;

% Switch to DagNN if requested
switch lower(opts.networkType)
  case 'simplenn'
    % done
  case 'dagnn'
    net = dagnn.DagNN.fromSimpleNN(net, 'canonicalNames', true) ;
    net.addLayer('error', dagnn.Loss('loss', 'classerror'), ...
                 {'prediction','label'}, 'error') ;
  otherwise
    assert(false) ;
end

% --------------------------------------------------------------------
function net = insertBnorm(net, l)
% --------------------------------------------------------------------
assert(isfield(net.layers{l}, 'weights'));
ndim = size(net.layers{l}.weights{1}, 4);
layer = struct('type', 'bnorm', ...
               'weights', {{ones(ndim, 1, 'single'), zeros(ndim, 1, 'single')}}, ...
               'learningRate', [1 1 0.05], ...
               'weightDecay', [0 0]) ;
net.layers{l}.biases = [] ;
net.layers = horzcat(net.layers(1:l), layer, net.layers(l+1:end)) ;
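A quick sanity check of the initialized network, as a sketch (the forward pass assumes MatConvNet is compiled and on the path; swapping the loss for a plain softmax is the usual trick for running without labels):

```matlab
% Forward a few dummy samples through the network.
net = cnn_crp_init('batchNormalization', false) ;
im = randn(1, 1, 5000, 4, 'single') ;   % four dummy samples
net.layers{end}.type = 'softmax' ;      % replace the loss so no labels are needed
res = vl_simplenn(net, im) ;
size(res(end).x)                        % 1x1x2x4: two class probabilities per sample
```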
Next, the input size has to be configured separately. Since MatConvNet inherits Caffe's conventions, input data is four-dimensional: height, width, number of channels, and sample index. Our data is 5000×9963 (5000 features for each of 9963 samples), so I changed the input setup to:
data = single(reshape(F',1,1,5000,[]));
set = [ones(1,size(res.train{c},1)) 3*ones(1,size(res.test{c},1))];
Using 5000 as the channel dimension does the job: each sample becomes a 1×1×5000 volume.
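One thing worth verifying is that the reshape keeps each sample's features together. MATLAB reshapes in column-major order, so transposing first (so each sample occupies one column) is what makes this work. A toy-sized check, with a hypothetical 4-sample, 4-feature matrix in place of the real data:

```matlab
% F: one sample per row, one feature per column (as assumed above).
F = magic(4) ;                             % toy data: 4 samples x 4 features
data = single(reshape(F', 1, 1, 4, [])) ;  % 1x1x4xN, N = 4 samples

% The channel vector of sample 2 should equal row 2 of F:
isequal(squeeze(data(1,1,:,2)), single(F(2,:)'))   % true
```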
The data here is imbalanced, so a network trained the usual way is biased toward the negative class (negatives far outnumber positives). My strategy: make every minibatch contain equally many positive and negative samples.
Here is the code I wrote.
During training:
for t = 1:opts.batchSize/2:size(subset_n,4)
  % each step takes opts.batchSize/2 negatives; getBatch adds the same
  % number of positives, so the resulting minibatch is balanced
  batchSize = min(floor(opts.batchSize/2), size(subset_n,4) - t + 1) ;
  batchSize_p = batchSize ;
  fprintf('%s: epoch %02d: %3d/%3d: ', mode, epoch, ...
          fix(t/opts.batchSize)+1, ceil(size(subset_n,4)/opts.batchSize)) ;
  numDone = 0 ;
  error = [] ;
  for s = 1:opts.numSubBatches
    % get this image batch and prefetch the next
    batchStart = t + (labindex-1) + (s-1) * numlabs ;
    batchEnd = min(t+opts.batchSize/2-1, size(subset_n,4)) ;
    batch = batchStart : opts.numSubBatches * numlabs : batchEnd ;
    [im, labels] = getBatch(imdb, batch, batchSize_p) ;
    if opts.prefetch
      if s == opts.numSubBatches
        batchStart = t + (labindex-1) + opts.batchSize ;
        batchEnd = min(t+2*opts.batchSize-1, size(subset_n,4)) ;
      else
        batchStart = batchStart + numlabs ;
      end
      nextBatch = subset(batchStart : opts.numSubBatches * numlabs : batchEnd) ;
      getBatch(imdb, nextBatch) ;
    end
When extracting a batch:
function [images, labels] = getSimpleNNBatch(imdb, batch, batchSize_p)
% --------------------------------------------------------------------
% negatives: indexed by the training loop through 'batch'
allImageN = imdb.images.data(:,:,:,imdb.images.labels.train{imdb.images.class}(:,2) == -1) ;
images_n = allImageN(:,:,:,batch) ;
% positives: draw batchSize_p samples at random to match the negatives
allImageP = imdb.images.data(:,:,:,imdb.images.labels.train{imdb.images.class}(:,2) == 1) ;
idx = randperm(size(allImageP,4)) ;
allImageP = allImageP(:,:,:,idx) ;
images_p = allImageP(:,:,:,1:batchSize_p) ;
% concatenate, label (+1 positive, -1 negative), and shuffle
images = cat(4, images_p, images_n) ;
im_num = size(images,4) ;
labels = [ones(size(images_p,4),1); -1.*ones(size(images_n,4),1)] ;
im_indx = randperm(im_num) ;
labels = labels(im_indx,:)' ;
images = images(:,:,:,im_indx) ;
This keeps every minibatch balanced and avoids the class skew that previously made the training results poor.
Appendix: the complete code is available on my GitHub. Everyone is welcome to take a look.