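When you configure a network for training yourself, the lr_policy parameter in the solver file offers several strategies to choose from.
Here is how /caffe-master/src/caffe/proto/caffe.proto
documents this parameter: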
// The learning rate decay policy. The currently implemented learning rate
// policies are as follows:
//    - fixed: always return base_lr.
//    - step: return base_lr * gamma ^ (floor(iter / step))
//    - exp: return base_lr * gamma ^ iter
//    - inv: return base_lr * (1 + gamma * iter) ^ (- power)
//    - multistep: similar to step but it allows non uniform steps defined by
//      stepvalue
//    - poly: the effective learning rate follows a polynomial decay, to be
//      zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
//    - sigmoid: the effective learning rate follows a sigmod decay
//      return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
//
// where base_lr, max_iter, gamma, step, stepvalue and power are defined
// in the solver parameter protocol buffer, and iter is the current iteration.

To see what each policy looks like, you can use DIGITS or write your own code to plot the curves. A MATLAB implementation:
<pre name="code" class="plain">iter=1:50000;
max_iter=50000;
base_lr=0.01;
gamma=0.0001;
power=0.75;
step_size=5000;

% - fixed: always return base_lr.
lr=base_lr*ones(1,50000);
subplot(2,3,1)
plot(lr)
title('fixed')

% - step: return base_lr * gamma ^ (floor(iter / step))
lr=base_lr .* gamma.^(floor(iter./10000));
subplot(2,3,2)
plot(lr)
title('step')

% - exp: return base_lr * gamma ^ iter
lr=base_lr * gamma .^ iter;
subplot(2,3,3)
plot(lr)
title('exp')

% - inv: return base_lr * (1 + gamma * iter) ^ (- power)
lr=base_lr.*(1./(1+gamma.*iter).^power);
subplot(2,3,4)
plot(lr)
title('inv')

% - multistep: similar to step but it allows non uniform steps defined by
% stepvalue

% - poly: the effective learning rate follows a polynomial decay, to be
% zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
lr=base_lr *(1 - iter./max_iter) .^ (power);
subplot(2,3,5)
plot(lr)
title('poly')

% - sigmoid: the effective learning rate follows a sigmod decay
% return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr=base_lr *( 1./(1 + exp(-gamma * (iter - step_size))));
subplot(2,3,6)
plot(lr)
title('sigmoid')
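
% --- Sketch: multistep (listed above but not plotted) ---
% A minimal sketch, not taken from the original post. It assumes multistep
% behaves like step, except that the exponent is the number of stepvalue
% entries that iter has already reached. The stepvalue numbers below are
% hypothetical and chosen only for illustration.
% Reuses iter, base_lr and gamma defined at the top of this script.
stepvalue=[10000 30000 40000];   % hypothetical, non-uniform step boundaries
k=zeros(size(iter));             % number of boundaries passed at each iteration
for s=stepvalue
    k=k+(iter>=s);               % add 1 wherever iter has reached boundary s
end
lr=base_lr .* gamma.^k;
figure                           % separate window so the 2x3 grid above is kept
plot(lr)
title('multistep')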