Caffe: resuming training from a snapshot

Latest recommended article published 2023-10-12 20:42:36

wonengguwozai


How to do it:

Original article: https://yunmingzhang.wordpress.com/2015/02/04/caffe-notes-using-snap-shot-in-convolutional-neural-network-training/

This is a post summarizing how to resume training on caffe using snapshots.

First, you need to generate snapshot files. You can do this by specifying a snapshot interval in the solver.prototxt file. Of course, the name of the solver file differs from model to model; it usually looks like cifar10_quick_solver.prototxt

# snapshot intermediate results

snapshot: 500

This means that it will take a snapshot every 500 iterations, NOT that it will only take a snapshot at the 500th iteration.

Once you have the snapshots, you will see two files per snapshot, model_iter_xxx.caffemodel and model_iter_xxx.solverstate (for example, cifar10_quick_iter_3000.solverstate). The prefix of the filename can be customized with snapshot_prefix in the prototxt file.
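To make the "every 500 iterations" behaviour concrete, here is a small sketch that only lists the file names one would expect. It does not call Caffe; the function name and the max_iter value of 2000 are made up for illustration, and it assumes the default naming pattern shown above.

```python
def snapshot_files(prefix, interval, max_iter):
    """List the (caffemodel, solverstate) name pairs for periodic snapshots.

    prefix   -- the file name prefix (e.g. "cifar10_quick")
    interval -- the value of "snapshot:" in the solver file
    max_iter -- hypothetical total number of training iterations
    """
    pairs = []
    # One snapshot at every multiple of the interval, not just the first one.
    for it in range(interval, max_iter + 1, interval):
        stem = f"{prefix}_iter_{it}"
        pairs.append((stem + ".caffemodel", stem + ".solverstate"))
    return pairs


if __name__ == "__main__":
    for model, state in snapshot_files("cifar10_quick", 500, 2000):
        print(model, state)
```

With an interval of 500 and 2000 iterations this lists four pairs, one each for iterations 500, 1000, 1500 and 2000.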

Once you have the snapshot, you can tell the training script to use it. For cifar10, you can do this in train_quick.sh with the option --snapshot=cifar10_quick_iter_3000.solverstate.

This will resume the training from the 3000th iteration. A note on this for ImageNet can be found at http://caffe.berkeleyvision.org/gathered/examples/imagenet.html.

Even though you only specify the cifar10_quick_iter_3000.solverstate file, to get it actually running you ALSO NEED the cifar10_quick_iter_3000.caffemodel file in the same directory.
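A quick check before launching can catch a missing .caffemodel early. The sketch below is only an illustration: the helper names are made up, and it assumes the model file sits next to the .solverstate file under the same stem, as in the example above.

```python
import os


def paired_caffemodel(solverstate_path):
    """Return the .caffemodel path expected next to a .solverstate file."""
    root, ext = os.path.splitext(solverstate_path)
    if ext != ".solverstate":
        raise ValueError(f"not a .solverstate file: {solverstate_path}")
    return root + ".caffemodel"


def can_resume(solverstate_path):
    """True only if both the solver state and its paired model file exist."""
    return (os.path.isfile(solverstate_path)
            and os.path.isfile(paired_caffemodel(solverstate_path)))


if __name__ == "__main__":
    state = "examples/cifar10/cifar10_quick_iter_3000.solverstate"
    print(paired_caffemodel(state), can_resume(state))
```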

THERE IS ONE TRICK HERE: the snapshot and solver options have to be specified ON THE SAME LINE as far as the shell is concerned, that is, don't leave out the "\" after the solver option

$TOOLS/caffe train \
--solver=examples/cifar10/cifar10_quick_solver.prototxt \
--snapshot=examples/cifar10/cifar10_quick_iter_3000.solverstate

OTHERWISE, it WILL NOT start from the snapshot and it won’t tell you what the problem is.
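One way to sidestep the line-continuation pitfall is to build the command as an argument list instead of a multi-line shell string. The sketch below is only an illustration: the function name is made up, and the binary path build/tools/caffe is an assumption standing in for $TOOLS/caffe.

```python
import shlex


def build_resume_command(caffe_bin, solver, solverstate):
    """Build the 'caffe train' argument list with both options present."""
    return [
        caffe_bin,
        "train",
        f"--solver={solver}",
        f"--snapshot={solverstate}",
    ]


if __name__ == "__main__":
    cmd = build_resume_command(
        "build/tools/caffe",  # assumed location of the caffe binary
        "examples/cifar10/cifar10_quick_solver.prototxt",
        "examples/cifar10/cifar10_quick_iter_3000.solverstate",
    )
    # Print the single-line equivalent of the shell command.
    print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would start the training. Because every option is its own list element, there is no backslash to forget.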

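The following is my own summary (verified by training LeNet on MNIST):

Regarding the resume method described above:

1) Iteration count: it works both when (1) the initial run specified a number of iterations and was interrupted manually partway through, and you want to continue; and when (2) the initially specified iterations have finished, and you set a new iteration count in the solver to keep training.

2) Learning rate: if you do not change the learning rate policy and learning rate that the solver used for the initial training, the resumed model continues from the learning rate state it had at the snapshot. If you change the policy or the learning rate value, the resumed model's learning rate changes accordingly.

In short, when resuming, if nothing in the solver has changed except the iteration count, the model continues training with all of its original state; if you changed any other parameter, the new value of that parameter is loaded into the training.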
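The learning rate observation can be illustrated with a small calculation. This sketch assumes the "step" learning rate policy, where the rate is base_lr * gamma ^ floor(iter / stepsize); the numeric values are made up, and other policies follow different formulas. Since the rate depends on the restored iteration counter together with the values currently in the solver file, an unchanged solver yields the same rate as an uninterrupted run, while an edited base_lr takes effect at once.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Learning rate under a 'step' policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


if __name__ == "__main__":
    resume_iter = 3000  # iteration stored in the snapshot

    # Solver unchanged: same rate an uninterrupted run would have at 3000.
    print(step_lr(0.01, 0.1, 2000, resume_iter))

    # base_lr edited before resuming (hypothetical value): the rate follows.
    print(step_lr(0.001, 0.1, 2000, resume_iter))
```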
