Original url:
https://blog.csdn.net/baidu_26408419/article/details/78497201
Useful URLs:
https://machinelearningmastery.com/improve-deep-learning-performance/
http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html
http://lamda.nju.edu.cn/weixs/book/CNN_book.html
If the images fail to load, download the .doc version of this article from Baidu Cloud:
链接:https://pan.baidu.com/s/1mhCxQsg 密码:igfo
In summary:
0. data: http://machinelearningmastery.com/improve-deep-learning-performance/
(Also: feature selection is a discipline in its own right.)
① The learning rate matters a lot; a smaller value is safer, but training will take longer.
② Within your machine's limits, make the batch size as large as possible (set it to a power of 2).
③ Inspect the accuracy charts and extract information from them.
④ Sec. 8: Ensemble (how to combine multiple network architectures; see Zhou Zhihua's "Machine Learning" for details)
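A minimal sketch of the simplest kind of ensemble, averaging the class-probability outputs of several networks (the three probability vectors below are hypothetical, just to show the mechanics):

```python
import numpy as np

def ensemble_average(prob_list):
    # Average the class-probability outputs of several networks.
    return np.mean(np.stack(prob_list, axis=0), axis=0)

# Hypothetical softmax outputs of three trained networks for one sample.
p1 = np.array([0.7, 0.2, 0.1])
p2 = np.array([0.6, 0.3, 0.1])
p3 = np.array([0.5, 0.4, 0.1])
avg = ensemble_average([p1, p2, p3])
pred = int(np.argmax(avg))
```

Averaging is only one option; voting and stacking are the other common combination schemes discussed in that chapter.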
⑤ General principle of fine-tuning:
⑥ https://research.fb.com/wp-content/uploads/2017/06/imagenet1kin1h5.pdf (distributed GPU training):
Linear Scaling Rule: When the minibatch size is multiplied by k, multiply the learning rate by k. (All other hyper-parameters (weight decay, momentum, etc.) are kept unchanged.)
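The rule above is a one-line computation; a sketch (the 0.1-at-batch-256 baseline is the reference configuration used in that paper):

```python
def scaled_lr(base_lr, base_batch, new_batch):
    # Linear Scaling Rule: multiply the lr by k = new_batch / base_batch.
    return base_lr * (new_batch / base_batch)

# Going from batch 256 (lr 0.1) to batch 8192 gives lr 3.2,
# while weight decay, momentum, etc. stay unchanged.
big_lr = scaled_lr(0.1, 256, 8192)
```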
The details:
I. Caffe structure and parameter tuning tips
0. During data preparation:
① Cast to float32 and shuffle the training data (imports needed: numpy as np, sklearn.utils.shuffle):
X = X.astype(np.float32)
X, y = shuffle(X, y, random_state=42)  # shuffle train data
y = y.astype(np.float32)
② Normalization, etc.:
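A minimal normalization sketch, assuming 8-bit image data in a numpy array (the values are made up): scale to [0, 1], then zero-center with the training-set mean:

```python
import numpy as np

# Hypothetical 8-bit image data: rows = samples, columns = pixel values.
X = np.array([[0., 128., 255.],
              [255., 128., 0.]], dtype=np.float32)

X /= 255.0             # scale to [0, 1]
mean = X.mean(axis=0)  # per-feature mean, computed on the TRAINING set only
X -= mean              # zero-center (apply this same mean to test data)
```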
1. http://blog.csdn.net/u011762313/article/details/47399981
In Solver initialization (Caffe provides 3 solver types: Stochastic Gradient Descent (SGD), Adaptive Gradient (ADAGRAD), and Nesterov's Accelerated Gradient (NESTEROV)).
Good default parameter settings for SGD:
base_lr: 0.01      # initial learning rate: α = 0.01
lr_policy: "step"  # learning policy: every stepsize iterations, multiply α by gamma
gamma: 0.1         # learning-rate decay factor
stepsize: 100000   # drop the learning rate every 100K iterations
max_iter: 350000   # maximum number of training iterations
momentum: 0.9      # momentum: μ = 0.9
The other two methods also work well.
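The "step" policy above can be reproduced with a small helper (a sketch of the schedule, not Caffe's actual code):

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    # Caffe "step" policy: lr = base_lr * gamma ^ floor(iteration / stepsize)
    return base_lr * gamma ** (iteration // stepsize)

# With the solver values above, the rate drops tenfold every 100K iterations.
lr_early = step_lr(0.01, 0.1, 100000, 50000)
lr_late = step_lr(0.01, 0.1, 100000, 250000)
```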
2. https://corpocrat.com/2015/02/24/facial-keypoints-extraction-using-deep-learning-with-caffe/
① We specify ReLU (to allow values > 0, plus faster converging) and a Dropout layer to prevent overfitting.
(http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html:)
In conclusion, three types of ReLU variants all consistently outperform the original ReLU in these three data sets. And PReLU and RReLU seem better choices. Moreover, He et al. also reported similar conclusions in [4].
② In the fully connected layers, add (the xavier filler by default initializes the weight Blob x from the uniform distribution x ∼ U(−a, +a)):
weight_filler {
  type: "xavier"
}
bias_filler {
  type: "constant"
  value: 0.1
}
layer {
  name: "relu22"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
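A numpy sketch of what the xavier filler does under its fan-in default (the layer shape below is hypothetical):

```python
import numpy as np

def xavier_uniform(fan_in, shape, seed=0):
    # Caffe's default "xavier" filler: U(-a, +a) with a = sqrt(3 / fan_in).
    a = np.sqrt(3.0 / fan_in)
    return np.random.default_rng(seed).uniform(-a, a, size=shape)

W = xavier_uniform(fan_in=4096, shape=(4096, 1000))  # e.g. an fc layer
```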
③ How to initialize each layer's parameters: http://blog.csdn.net/wenlin33/article/details/53378613
3. Inspect charts and other results:
① During training, it is best to visualize some feature maps.
② Save the training/testing logs and plot them; judge the effect of each parameter setting from the accuracy curves.
① (An overfitting net can generally be made to perform better by using more training data)
Ways to add data: rotate, flip, and otherwise transform the images.
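A minimal augmentation sketch with numpy (horizontal flip plus right-angle rotations only; real pipelines also crop, jitter color, etc.):

```python
import numpy as np

def augment(img):
    # One image -> several variants: horizontal flip and 90/180/270 rotations.
    return [img,
            np.fliplr(img),
            np.rot90(img, 1),
            np.rot90(img, 2),
            np.rot90(img, 3)]

variants = augment(np.arange(9).reshape(3, 3))  # 5 variants from one "image"
```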
② Speed up network training: Remember that in our previous model, we initialized learning rate and momentum with a static 0.01 and 0.9 respectively. Let's change that such that the learning rate decreases linearly with the number of epochs, while we let the momentum increase.
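One way to sketch such linear schedules (the end values 0.0001 and 0.999 are assumptions for illustration, not from the text):

```python
import numpy as np

def linear_schedule(start, stop, num_epochs):
    # One value per epoch, moving linearly from start to stop.
    return np.linspace(start, stop, num_epochs)

lrs = linear_schedule(0.01, 0.0001, 100)       # learning rate decreases
momentums = linear_schedule(0.9, 0.999, 100)   # momentum increases
```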
③ Load the weights of a pre-trained model to speed up this training run.
④ Put the BatchNorm layer immediately after fully connected layers (or convolutional layers), and before activation.
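A numpy sketch of that ordering (FC -> BatchNorm -> ReLU); the batchnorm here omits the learned scale/shift, and the shapes are made up:

```python
import numpy as np

def batchnorm(z, eps=1e-5):
    # Normalize each feature over the batch (learned scale/shift omitted).
    return (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))        # batch of 8 samples, 4 features
W = rng.normal(size=(4, 3))        # hypothetical fc weights
z = x @ W                          # fully connected layer
a = np.maximum(batchnorm(z), 0.0)  # BatchNorm BEFORE the ReLU activation
```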
II. Fine-tuning some deep networks
①http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html
②https://zhuanlan.zhihu.com/p/22624331
This approach is especially suitable when our data is relatively scarce. It lets us exploit the strong generalization ability of deep neural networks while avoiding both the design of a complex model and a long training run. The strongest model at the moment is ResNet; for many vision tasks, fine-tuning ResNet yields very good performance!
③ General principle of fine-tuning:
Excellent blog posts:
① http://machinelearningmastery.com/improve-deep-learning-performance/
② http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html
③ https://github.com/hwdong/deep-learning/blob/master/deep%20learning%20papers.md