[caffe] Continuing training from an existing model in Caffe

1. Caffe supports resuming training from a previously saved model. The example shipped with Caffe is shown below. http://

caffe-master0818\examples\imagenet\resume_training.sh

#!/usr/bin/env sh

./build/tools/caffe train \
    --solver=models/bvlc_reference_caffenet/solver.prototxt \
    --snapshot=models/bvlc_reference_caffenet/caffenet_train_10000.solverstate.h5
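
When several snapshots have accumulated, picking the newest one by hand is easy to get wrong. The sketch below is one possible helper, not part of Caffe: `latest_snapshot` is a hypothetical function name, and it assumes snapshot files end in `_<N>.solverstate` or `_<N>.solverstate.h5`, where `<N>` is the iteration count, as in the file names used by the scripts in this post.

```shell
#!/usr/bin/env sh
# latest_snapshot DIR  (hypothetical helper, not a Caffe tool)
# Prints the solverstate file in DIR with the highest iteration number.
# Assumes file names of the form <prefix>_<N>.solverstate[.h5].
latest_snapshot() {
    dir=$1
    best_iter=-1
    best_file=
    for f in "$dir"/*.solverstate*; do
        [ -e "$f" ] || continue          # glob matched nothing
        name=${f##*/}                    # strip the directory part
        # Pull out the digits that sit right before ".solverstate"
        iter=$(printf '%s\n' "$name" |
            sed -n 's/.*_\([0-9][0-9]*\)\.solverstate.*/\1/p')
        [ -n "$iter" ] || continue       # name did not fit the assumed pattern
        if [ "$iter" -gt "$best_iter" ]; then
            best_iter=$iter
            best_file=$f
        fi
    done
    if [ -n "$best_file" ]; then
        printf '%s\n' "$best_file"
    else
        return 1                         # no snapshot found
    fi
}
```

With the function defined, the resume command could be written as `./build/tools/caffe train --solver=models/bvlc_reference_caffenet/solver.prototxt --snapshot="$(latest_snapshot models/bvlc_reference_caffenet)"`.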

2. Caffe also supports training with several successive learning-rate reductions

For example, caffe-master0818\examples\cifar10\train_full.sh

#!/usr/bin/env sh

TOOLS=./build/tools

$TOOLS/caffe train \
    --solver=examples/cifar10/cifar10_full_solver.prototxt

# reduce learning rate by factor of 10 (this stage uses the lr1 solver config)
$TOOLS/caffe train \
    --solver=examples/cifar10/cifar10_full_solver_lr1.prototxt \
    --snapshot=examples/cifar10/cifar10_full_iter_60000.solverstate.h5

# reduce learning rate by factor of 10 again (this stage uses the lr2 solver config)
$TOOLS/caffe train \
    --solver=examples/cifar10/cifar10_full_solver_lr2.prototxt \
    --snapshot=examples/cifar10/cifar10_full_iter_65000.solverstate.h5

The training script for another model (the CIFAR-10 quick model) follows the same pattern:

#!/usr/bin/env sh

TOOLS=./build/tools

$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver.prototxt

# reduce learning rate by factor of 10 after 8 epochs
$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver_lr1.prototxt \
  --snapshot=examples/cifar10/cifar10_quick_iter_4000.solverstate.h5

This way, you do not have to lower the learning rate by hand each time.

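For large models, lowering the learning rate several times matters. In experiments, once the loss stops decreasing at the first learning rate, lowering the learning rate again can reduce the loss further.

A learning rate that is too large cannot settle into the minimum, while one that is too small cannot escape a local optimum. The learning rate should therefore start out relatively large, so that training does not get trapped in a local optimum early on.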


-------------------------------------------------------------------------------------------------------------------------------------------

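3. Configuring the schedule in the solver file

Of course, learning-rate reduction can also be configured in the solver configuration file.

http://caffe.berkeleyvision.org/tutorial/solver.html     (official Caffe example)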

To use a learning rate policy like this, you can put the following lines somewhere in your solver prototxt file:

base_lr: 0.01     # begin training at a learning rate of 0.01 = 1e-2

lr_policy: "step" # learning rate policy: drop the learning rate in "steps"
                  # by a factor of gamma every stepsize iterations

gamma: 0.1        # drop the learning rate by a factor of 10
                  # (i.e., multiply it by a factor of gamma = 0.1)

stepsize: 100000  # drop the learning rate every 100K iterations

max_iter: 350000  # train for 350K iterations total

momentum: 0.9

Under the above settings, we'll always use momentum $\mu = 0.9$. We'll begin training at a base_lr of $\alpha = 0.01 = 10^{-2}$ for the first 100,000 iterations, then multiply the learning rate by gamma ($\gamma$) and train at $\alpha' = \alpha\gamma = (0.01)(0.1) = 0.001 = 10^{-3}$ for iterations 100K-200K, then at $\alpha'' = 10^{-4}$ for iterations 200K-300K, and finally train until iteration 350K (since we have max_iter: 350000) at $\alpha''' = 10^{-5}$.
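
To check the schedule above by hand, the "step" policy can be evaluated as base_lr * gamma ^ floor(iter / stepsize). The following is a minimal sketch of that formula using awk; `step_lr` is a hypothetical helper name chosen for illustration, not a Caffe command.

```shell
#!/usr/bin/env sh
# step_lr BASE_LR GAMMA STEPSIZE ITER  (hypothetical helper)
# Prints base_lr * gamma ^ floor(iter / stepsize), i.e. the learning rate
# that the "step" policy would use at iteration ITER.
step_lr() {
    awk -v base="$1" -v gamma="$2" -v step="$3" -v iter="$4" \
        'BEGIN { printf "%g\n", base * gamma ^ int(iter / step) }'
}

# Learning rate at the start of each stage of the example schedule
for it in 0 100000 200000 300000; do
    printf 'iter %6d  lr %s\n' "$it" "$(step_lr 0.01 0.1 100000 "$it")"
done
```

Each printed line should show the rate dropping by a factor of 10 relative to the previous stage, matching the four values described in the paragraph above.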

Note that the momentum setting $\mu$ effectively multiplies the size of your updates by a factor of $\frac{1}{1 - \mu}$ after many iterations of training, so if you increase $\mu$, it may be a good idea to decrease $\alpha$ accordingly (and vice versa).

For example, with $\mu = 0.9$, we have an effective update size multiplier of $\frac{1}{1 - 0.9} = 10$. If we increased the momentum to $\mu = 0.99$, we've increased our update size multiplier to 100, so we should drop $\alpha$ (base_lr) by a factor of 10.
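
The multiplier $\frac{1}{1-\mu}$ can be seen from a short geometric-series argument. The sketch below uses the SGD-with-momentum update from the Caffe solver tutorial and adds one simplifying assumption that is not in the original text: the gradient is a constant $g$ and the initial update $V_0$ is zero.

```latex
% Caffe's SGD update with momentum:
%   V_{t+1} = \mu V_t - \alpha \nabla L(W_t),   W_{t+1} = W_t + V_{t+1}
% Assumption for this sketch: \nabla L(W_t) = g for all t, and V_0 = 0.
\begin{align*}
V_t &= -\alpha g \sum_{k=0}^{t-1} \mu^{k}
     = -\alpha g \,\frac{1-\mu^{t}}{1-\mu} \\
\lim_{t\to\infty} V_t &= -\frac{\alpha}{1-\mu}\, g
     \qquad (0 \le \mu < 1)
\end{align*}
```

Under that assumption the steady-state step is $\frac{1}{1-\mu}$ times the plain SGD step $-\alpha g$, which gives the factors 10 for $\mu = 0.9$ and 100 for $\mu = 0.99$.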

Note also that the above settings are merely guidelines, and they’re definitely not guaranteed to be optimal (or even work at all!) in every situation. If learning diverges (e.g., you start to see very large or NaN or inf loss values or outputs), try dropping the base_lr (e.g., base_lr: 0.001) and re-training, repeating this until you find a base_lr value that works.
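
One way to notice divergence without watching the console is to scan the training log for a NaN or inf loss. The sketch below assumes the log contains lines of the form `Iteration <N>, loss = <value>`; that format, the function name `has_diverged`, and the log file name in the usage note are all assumptions to adapt to your setup.

```shell
#!/usr/bin/env sh
# has_diverged LOGFILE  (hypothetical helper)
# Succeeds (exit status 0) if any "loss = ..." entry in LOGFILE is nan or inf.
# Assumes log lines that contain text such as "Iteration 100, loss = nan".
has_diverged() {
    grep -E -i -q 'loss = -?(nan|inf)' "$1"
}
```

For instance, after redirecting the training output to a file such as `train.log`, running `has_diverged train.log && echo "loss diverged: try a smaller base_lr"` would print the hint only when a NaN or inf loss was logged.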


Reference link: caffe_note
