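When you write the solver file for training your own network, the learning rate policy parameter (lr_policy) offers quite a few strategies to choose from.
Here is how the caffe.proto file documents this parameter: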
- // The learning rate decay policy. The currently implemented learning rate
- // policies are as follows:
- // - fixed: always return base_lr.
- // - step: return base_lr * gamma ^ (floor(iter / step))
- // - exp: return base_lr * gamma ^ iter
- // - inv: return base_lr * (1 + gamma * iter) ^ (- power)
- // - multistep: similar to step but it allows non uniform steps defined by
- // stepvalue
- // - poly: the effective learning rate follows a polynomial decay, to be
- // zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
- // - sigmoid: the effective learning rate follows a sigmoid decay
- // return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
- //
- // where base_lr, max_iter, gamma, step, stepvalue and power are defined
- // in the solver parameter protocol buffer, and iter is the current iteration.
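The policies in the comment can also be written out as equations. The notation here is an assumption made for readability and is not part of caffe.proto: $t$ is iter, $\eta_0$ is base_lr, $\gamma$ is gamma, $s$ is the step size, $p$ is power, $T$ is max_iter, and $v_j$ are the stepvalue entries. The multistep line is an interpretation of "similar to step", assuming one decay is applied each time iter reaches a stepvalue.

```latex
\begin{aligned}
\text{fixed:}     \quad & \eta(t) = \eta_0 \\
\text{step:}      \quad & \eta(t) = \eta_0 \, \gamma^{\lfloor t / s \rfloor} \\
\text{exp:}       \quad & \eta(t) = \eta_0 \, \gamma^{t} \\
\text{inv:}       \quad & \eta(t) = \eta_0 \, (1 + \gamma t)^{-p} \\
\text{multistep:} \quad & \eta(t) = \eta_0 \, \gamma^{k(t)}, \qquad k(t) = \bigl|\{\, j : v_j \le t \,\}\bigr| \\
\text{poly:}      \quad & \eta(t) = \eta_0 \, \left(1 - \frac{t}{T}\right)^{p} \\
\text{sigmoid:}   \quad & \eta(t) = \frac{\eta_0}{1 + e^{-\gamma (t - s)}}
\end{aligned}
```

As a quick check of the inv policy with base_lr = 0.01, gamma = 0.0001 and power = 0.75: at iter = 10000 the rate is $0.01 \cdot (1 + 1)^{-0.75} \approx 0.00595$.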
I don't have it installed, so the policies are implemented in MATLAB instead:
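- % Settings shared by every curve
- iter=1:50000;      % iterations to evaluate
- max_iter=50000;    % total number of iterations
- base_lr=0.01;      % initial learning rate
- gamma=0.0001;      % decay factor
- power=0.75;        % exponent used by inv and poly
- step_size=5000;    % step size in iterations
- % - fixed: always return base_lr.
- lr=base_lr*ones(size(iter));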
- subplot(2,3,1)
- plot(lr)
- title('fixed')
- % - step: return base_lr * gamma ^ (floor(iter / step))
- lr=base_lr .* gamma.^(floor(iter./step_size));
- subplot(2,3,2)
- plot(lr)
- title('step')
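- % - exp: return base_lr * gamma ^ iter
- % With gamma=0.0001 this is already 1e-6 at iter=1 and effectively zero afterwards;
- % a gamma just below 1 (e.g. 0.9999) gives a gradual curve.
- lr=base_lr * gamma .^ iter;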
- subplot(2,3,3)
- plot(lr)
- title('exp')
- % - inv: return base_lr * (1 + gamma * iter) ^ (- power)
- lr=base_lr.*(1./(1+gamma.*iter).^power);
- subplot(2,3,4)
- plot(lr)
- title('inv')
- % - multistep: similar to step but it allows non uniform steps defined by
- % stepvalue (not plotted in this figure)
- % - poly: the effective learning rate follows a polynomial decay, to be
- % zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
- lr=base_lr *(1 - iter./max_iter) .^ (power);
- subplot(2,3,5)
- plot(lr)
- title('poly')
- % - sigmoid: the effective learning rate follows a sigmoid decay
- % return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
- lr=base_lr *( 1./(1 + exp(-gamma * (iter - step_size))));
- subplot(2,3,6)
- plot(lr)
- title('sigmoid')
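The script above has no curve for multistep. A minimal sketch of that policy follows. It assumes the rate is multiplied by gamma once each time iter reaches a stepvalue. The milestones in stepvalue and the name multistep_gamma are hypothetical, chosen only for illustration, and the decay factor is 0.1 rather than the 0.0001 used above so that the steps stay visible.

```matlab
% Multistep policy sketch: like step, but with non-uniform milestones.
base_lr = 0.01;                      % initial learning rate
iter = 1:50000;                      % iterations to evaluate
stepvalue = [10000 30000 45000];     % hypothetical milestones (illustration only)
multistep_gamma = 0.1;               % assumed decay factor per milestone

% For every iteration, count how many milestones have been reached so far
num_steps = zeros(size(iter));
for k = 1:numel(stepvalue)
    num_steps = num_steps + (iter >= stepvalue(k));
end

% Apply one factor of gamma per milestone reached
lr_multistep = base_lr .* multistep_gamma .^ num_steps;
```

Passing lr_multistep to plot, as done for the other policies, draws a staircase with a drop at each of the three milestones.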
Original article: http://blog.csdn.net/langb2014/article/details/51274376