Reposted from: http://blog.sina.com.cn/s/blog_837f83580102vwv4.html
Author: wind_静水流深_cloud
This post records some of my personal understanding of the MatConvNet code, for future reference (some interpretations may be inaccurate). Updated from time to time.
1. How are the speed, obj, top1 error, and top5 error printed in the MATLAB command window during training computed?
obj = (the obj accumulated in the stats array) / n, where n = (current batch index) × batchSize; for example, when the 2nd batch is being optimized, n = 2 × batchSize. In other words, n is simply the number of samples processed so far in the current epoch. top1 and top5 are computed the same way.
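The bookkeeping above can be sketched as follows. This is a minimal illustration of the running average, not the actual code in cnn_train.m; the per-batch sums are faked for demonstration:

```matlab
% Sketch: the printed obj/top1/top5 are running averages over all
% samples processed so far in the current epoch.
batchSize = 100 ;
numBatches = 5 ;
stats = [0 ; 0 ; 0] ;   % accumulated sums of [obj ; top1 err ; top5 err]
for t = 1:numBatches
  % In the real code these sums come from the forward/backward pass;
  % here we use fixed per-sample values (loss 2.3, 40% top1, 10% top5).
  batchStats = [2.3 ; 0.40 ; 0.10] * batchSize ;
  stats = stats + batchStats ;
  n = t * batchSize ;   % samples seen so far = batch index * batchSize
  fprintf('batch %d: obj %.3f top1 %.3f top5 %.3f\n', ...
          t, stats(1)/n, stats(2)/n, stats(3)/n) ;
end
```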
The training error is computed as the model is updated during an epoch, whereas the validation error is computed on a "frozen" model at the end of an epoch. This is implicit in how the code works. Let M_{N-1} be the model at epoch N-1 and M_N the model at epoch N. Let M_{N-1} -> M' -> M'' -> M''' -> ... -> M_N be the sequence of intermediate models computed for each mini-batch processed during training. Then:
Training error at epoch N = average of the training errors of M', M'', M''', ... on the training-set mini-batches.
Validation error at epoch N = validation error of model M_N on the validation set.
Hence the estimated training error you get is somewhere in between the training error of model M_{N-1} and that of model M_N, whereas the validation error is for model M_N. Since M_N should be better than M_{N-1}, the validation error might be smaller than this estimate (but won't be when N is large, as the models change little and overfitting dominates). Note that you could compute the "proper" training error of M_N after each epoch, but that would be expensive (it would require freezing M_N and passing all the training data again) and is not worth it in practice.
% Drop the original classifier (final fully connected layer + loss layer):
net.layers = net.layers(1:end-2) ;
% Append a new 1x1 "conv" layer acting as the fully connected classifier:
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{randn(1,1,input,output, 'single'), zeros(1,output,'single')}}, ...
'learningRate', 0.1*lr, 'stride', 1, 'pad', 0) ;
net.layers{end+1} = struct('type', 'softmaxloss') ;
Here input is the dimensionality of the preceding fully connected layer, and output is the number of classes in your own dataset.
Note: if the original model had dropout layers after the fully connected layers during training, it is best to add them back when fine-tuning as well; this helps convergence.
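In SimpleNN a dropout layer is just another entry in net.layers, so it can be reinserted between the truncated network and the new classifier. A sketch (the 0.5 rate here is an assumption; match whatever rate the original model used):

```matlab
% Reinsert dropout before appending the new classification layer.
% 'rate' is the probability of dropping a unit; 0.5 is illustrative.
net.layers{end+1} = struct('type', 'dropout', 'rate', 0.5) ;
```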
(1) On the learning rate (lr).