Caffe prototxt files

Latest recommended article published on 2022-07-07 15:47:32

女王の专属领地

Categories: Caffe, Deep Learning

This is an original article by the blogger; do not reproduce it without the blogger's permission.

Article link: https://blog.csdn.net/Julialove102123/article/details/79154945


1. Visualization tool:

http://ethereon.github.io/netscope/quickstart.html

2. .prototxt files for common network models (caffe-model):

https://github.com/soeaver/caffe-model

3. Tool for generating .prototxt files with Python:

http://blog.csdn.net/c406495762/article/details/70306550

4. Walkthrough of Caffe .prototxt files:

https://wenku.baidu.com/view/a38c6aae5901020207409cde.html

5. .prototxt files in the Caffe source tree:

https://github.com/BVLC/caffe

6. .prototxt: notes on selected parts of a network definition

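1. Data layer (the input layer).

In Caffe, data is stored and passed between layers as blobs. The data layer converts other formats into blobs: it can read from efficient databases such as LMDB or LevelDB, or from less efficient sources such as HDF5 files or image files.

Preprocessing also happens in this layer: mean subtraction, scaling, cropping, and mirroring. Take lenet_train_test.prototxt as an example: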

 name: "LeNet"  
 layer {  
   name: "mnist"  
   type: "Data"  
   top: "data"  
   top: "label"  
   include {  
     phase: TRAIN  
   }  
   transform_param {  
     scale: 0.00390625  
   }  
   data_param {  
     source: "examples/mnist/mnist_train_lmdb"  
     batch_size: 64  
     backend: LMDB  
   }  
 }  

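The top-level name is the network name; you can choose it freely.

Fields of the data layer definition:

name: the layer name; you can choose it freely.

type: the layer type.

(1) Data: reads from LevelDB or LMDB. batch_size must be set. source is the directory that contains the database, e.g. examples/mnist/mnist_train_lmdb.

(2) MemoryData: reads from memory. batch_size, channels, width, and height must be set.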

 layer {  
   top: "data"  
   top: "label"  
   name: "memory_data"  
   type: "MemoryData"  
   memory_data_param{  
     batch_size: 2  
     height: 100  
     width: 100  
     channels: 1  
   }  
   transform_param {  
     scale: 0.0078125  
     mean_file: "mean.proto"  
     mirror: false  
   }  
 }  

(3) HDF5Data: reads from HDF5 files. batch_size and source must be set; source is a text file that lists the HDF5 files to read.

 layer {  
   name: "data"  
   type: "HDF5Data"  
   top: "data"  
   top: "label"  
   hdf5_data_param {  
     source: "examples/hdf5_classification/data/train.txt"  
     batch_size: 10  
   }  
 }  

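(4) ImageData: reads from image files.

Required parameters:

source: a text file in which each line gives an image path and its label;

batch_size

Optional parameters:

rand_skip: skips up to this many inputs at the start. This is often useful for asynchronous SGD.

shuffle: shuffles the order of the data; default false

new_height, new_width: if set, each image is resized to this size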

 layer {  
   name: "data"  
   type: "ImageData"  
   top: "data"  
   top: "label"  
   transform_param {  
     mirror: false  
     crop_size: 227  
     mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"  
   }  
   image_data_param {  
     source: "examples/_temp/file_list.txt"  
     batch_size: 50  
     new_height: 256  
     new_width: 256  
   }  
 }  

(5) WindowData: reads windows (image regions) listed in a window file.

 layer {  
   name: "data"  
   type: "WindowData"  
   top: "data"  
   top: "label"  
   include {  
     phase: TRAIN  
   }  
   transform_param {  
     mirror: true  
     crop_size: 227  
     mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"  
   }  
   window_data_param {  
     source: "examples/finetune_pascal_detection/window_file_2007_trainval.txt"  
     batch_size: 128  
     fg_threshold: 0.5  
     bg_threshold: 0.5  
     fg_fraction: 0.25  
     context_pad: 16  
     crop_mode: "warp"  
   }  
 }  

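top: the outputs of this layer. The examples have two outputs, data and label, which a classification task requires.

bottom: the inputs of this layer.

include: specifies whether the layer belongs to the training or the test phase. If it is omitted, the layer is used in both. In the LeNet example the layer is a training layer that provides training data and labels.

transform_param: data preprocessing. scale: 0.00390625 (1/256) maps pixel values from 0-255 into [0, 1). mirror (true or false) turns random horizontal mirroring on or off. mean_file takes a mean file such as mean.binaryproto and subtracts the mean. crop_size crops the image: a random crop in training, a center crop in testing.

data_param: describes the data. source is the path to the data; the images are split into batches, and batch_size is the number of images per batch; backend is the database type.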
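
The preprocessing steps listed under transform_param can be sketched in plain Python. This is only an illustration on a single-channel image stored as a nested list, not Caffe's actual implementation; the function name `transform` is hypothetical, and a single scalar `mean_value` stands in for the per-pixel mean that a mean_file would provide. The sketch assumes the mean is subtracted before scaling.

```python
import random


def transform(image, scale=1.0, mean_value=0.0, crop_size=0, mirror=False, train=True):
    """Sketch of transform_param on one H x W image (list of rows)."""
    h, w = len(image), len(image[0])
    top, left, out_h, out_w = 0, 0, h, w
    if crop_size:
        out_h = out_w = crop_size
        if train:
            # training: random crop position
            top = random.randint(0, h - crop_size)
            left = random.randint(0, w - crop_size)
        else:
            # testing: center crop
            top = (h - crop_size) // 2
            left = (w - crop_size) // 2
    # mirroring, when enabled, is applied at random (assumed probability 0.5)
    flip = mirror and random.random() < 0.5
    out = []
    for r in range(out_h):
        # subtract the mean first, then scale
        row = [(image[top + r][left + c] - mean_value) * scale for c in range(out_w)]
        out.append(row[::-1] if flip else row)
    return out
```

With scale=0.00390625 the largest pixel value 255 becomes 0.99609375, which is why the scaled range is [0, 1) rather than [0, 1].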

2. Vision layers

These include the Convolution, Pooling, Local Response Normalization (LRN), and im2col layers.

(1) Layer type: Convolution, e.g. the first convolution layer of LeNet

 layer {  
   
   name: "conv1"  
   type: "Convolution"  
   bottom: "data"  
   top: "conv1"  
   param {  
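     lr_mult: 1    # learning-rate multiplier for the weights; effective rate = base_lr (set in solver.prototxt) × lr_mult
   }  
   param {  
     lr_mult: 2    # learning-rate multiplier for the bias
   }  
   convolution_param {  
     num_output: 20     # number of convolution kernels (output channels)
     kernel_size: 5     # kernel size; use kernel_h and kernel_w when height and width differ
     stride: 1          # convolution stride, default 1; stride_h and stride_w set it per axis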
     weight_filler {  
       type: "xavier"  
     }  
     bias_filler {  
       type: "constant"  
     }  
   }  
 }  

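pad: size of the border padding, default 0. With a suitable value (e.g. pad: 2 for a 5×5 kernel at stride 1) the output feature map has the same size as the input. pad_h and pad_w set it per axis.

weight_filler: weight initialization. The default type is "constant" with value 0; "xavier" and "gaussian" are also available.

bias_filler: initialization of the bias term, configured the same way as weight_filler.

bias_term: whether to enable the bias term (true or false; default true).

group: number of groups, default 1. With a value greater than 1, each filter connects only to a subset of the input: input and output channels are split into groups, and the i-th output group connects only to the i-th input group.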
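
How kernel_size, stride, and pad determine the output size can be checked with a small helper. This is a sketch of the standard convolution size formula along one axis; the function name `conv_output_size` is hypothetical.

```python
def conv_output_size(in_size, kernel_size, stride=1, pad=0):
    """Output length along one axis of a convolution (floor division)."""
    return (in_size + 2 * pad - kernel_size) // stride + 1


# LeNet conv1: 28x28 MNIST input, kernel_size 5, stride 1, no padding
print(conv_output_size(28, 5))            # 24
# pad 2 keeps a 5x5, stride-1 convolution at the input size
print(conv_output_size(28, 5, pad=2))     # 28
```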

(2) Layer type: Pooling.

 layer {  
   
   name: "pool1"  
   type: "Pooling"  
   bottom: "conv1"  
   top: "pool1"  
   pooling_param {  
     pool: MAX  
     kernel_size: 2   # required: pooling kernel size; kernel_h and kernel_w set it per axis
     stride: 2  
   }  
 }  

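pool: pooling method, default MAX. Available methods are MAX, AVE, and STOCHASTIC.
pad: border padding, same as pad in the convolution layer. Default 0.
stride: pooling stride, default 1. It is commonly set to 2 with kernel_size 2 so that the windows do not overlap. stride_h and stride_w set it per axis.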
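
The pooled output size follows a similar formula, with one difference worth noting: to my understanding Caffe's pooling layer rounds up (ceil) where convolution rounds down, and drops the last window when padding would make it start outside the image. A minimal sketch under that assumption, with a hypothetical function name:

```python
import math


def pool_output_size(in_size, kernel_size, stride=1, pad=0):
    """Output length along one axis of a pooling layer (ceil rounding)."""
    out = math.ceil((in_size + 2 * pad - kernel_size) / stride) + 1
    # with padding, the last window must start inside the image or left padding
    if pad > 0 and (out - 1) * stride >= in_size + pad:
        out -= 1
    return out


# LeNet pool1: 24x24 input from conv1, kernel_size 2, stride 2
print(pool_output_size(24, 2, stride=2))   # 12
```

The rounding matters when the window does not tile the input evenly: an input of 8 with kernel_size 3 and stride 2 gives 4 here, where floor rounding would give 3.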

(3) Layer type: LRN

 layer {  
   name: "norm1"  
   type: "LRN"  
   bottom: "pool1"  
   top: "norm1"  
   lrn_param {  
     local_size: 5  
     alpha: 0.0001  
     beta: 0.75  
   }  
 }  

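local_size: default 5. For cross-channel LRN it is the number of channels to sum over; for within-channel LRN it is the side length of the square region to sum over.

alpha: default 1. Scaling parameter in the normalization formula.

beta: default 0.75. Exponent in the normalization formula.

norm_region: default ACROSS_CHANNELS.

1. ACROSS_CHANNELS sums over neighboring channels at the same spatial position.

2. WITHIN_CHANNEL sums over a local spatial region inside a single channel. The region size is given by local_size above.

Normalization formula: each input value is divided by (1 + (alpha / n) * sum_i x_i^2) ^ beta, where n is local_size and the sum runs over the local region centered on that value.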
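
For the ACROSS_CHANNELS case, the normalization can be sketched for one spatial position as follows. The sketch assumes the formula from the Caffe documentation, in which each value is divided by (1 + (alpha / n) * sum of squares) ^ beta with n = local_size; the function name is hypothetical and the input is simply the list of channel values at that position.

```python
def lrn_across_channels(x, local_size=5, alpha=1.0, beta=0.75):
    """x: values of all channels at one spatial position."""
    half = local_size // 2
    out = []
    for c, value in enumerate(x):
        # neighboring channels centered on c, clipped at the ends
        lo, hi = max(0, c - half), min(len(x), c + half + 1)
        sum_sq = sum(v * v for v in x[lo:hi])
        out.append(value / (1.0 + (alpha / local_size) * sum_sq) ** beta)
    return out


# alpha and beta as in the norm1 example above
print(lrn_across_channels([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.0001, beta=0.75))
```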

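(4) Layer type: Im2col

Im2col cuts a large matrix into overlapping sub-matrices, flattens each sub-matrix into a vector, and stacks the vectors into a new matrix.

In Caffe, convolution first applies im2col to the data and then computes an inner product (a matrix multiplication). This is faster than computing the convolution directly with nested loops.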
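
The idea can be sketched on a single-channel image with plain lists. This is an illustration only: the function names are hypothetical, there is no padding, and each patch is stored as a row, whereas Caffe stores patches as columns.

```python
def im2col(image, kernel_size, stride=1):
    """Flatten every kernel_size x kernel_size patch of a 2D image into a row."""
    h, w = len(image), len(image[0])
    rows = []
    for top in range(0, h - kernel_size + 1, stride):
        for left in range(0, w - kernel_size + 1, stride):
            rows.append([image[top + i][left + j]
                         for i in range(kernel_size)
                         for j in range(kernel_size)])
    return rows


def conv2d_via_im2col(image, kernel, stride=1):
    """Convolution as an inner product between each patch and the flat kernel."""
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, len(kernel), stride)
    return [sum(a * b for a, b in zip(patch, flat_kernel)) for patch in patches]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(im2col(image, 2))                             # four overlapping 2x2 patches
print(conv2d_via_im2col(image, [[1, 0], [0, 1]]))   # [6, 8, 12, 14]
```

Once the patches are laid out as a matrix, the whole convolution becomes one matrix product, which is what lets an optimized BLAS routine do the work.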


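3. Activation layers

These layers apply an element-wise activation to the input. Common activation functions are Sigmoid, TanH, AbsVal (absolute value), and ReLU (rectified linear, including Leaky ReLU). ReLU usually converges fastest.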

 layer {  
   
   name: "relu1"  
   type: "ReLU"  
   bottom: "ip1"  
   top: "ip1"  
 }  

The ReLU function is max(x, 0).
Optional parameter:
  negative_slope: default 0. Changes the standard ReLU: when set, negative inputs are multiplied by negative_slope instead of being set to 0.
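
As a one-line sketch (the function name is hypothetical), the effect of negative_slope on a single value is:

```python
def relu(x, negative_slope=0.0):
    """ReLU on one value; a non-zero negative_slope gives Leaky ReLU."""
    return x if x > 0 else negative_slope * x


print(relu(3.0))                        # 3.0
print(relu(-2.0, negative_slope=0.1))   # -0.2
```
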
Power: f(x) = (shift + scale * x) ^ power

 layer {  
   name: "layer"  
   bottom: "in"  
   top: "out"  
   type: "Power"  
   power_param {  
     power: 2  
     scale: 1  
     shift: 0  
   }  
 }  

BNLL (binomial normal log likelihood): f(x) = log(1 + exp(x))

 layer {  
   name: "layer"  
   bottom: "in"  
   top: "out"  
   type: "BNLL"  
 }  
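
BNLL computes log(1 + exp(x)). Evaluating that expression directly overflows for large x, so a sketch would normally branch on the sign of x. The function name below is hypothetical.

```python
import math


def bnll(x):
    """log(1 + exp(x)), written so that exp never receives a large positive argument."""
    if x > 0:
        # log(1 + e^x) = x + log(1 + e^-x)
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


print(bnll(0.0))      # log(2)
print(bnll(1000.0))   # 1000.0, where math.exp(1000.0) alone would overflow
```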

4. Other layers

This part covers the softmax loss, InnerProduct, Accuracy, Reshape, and Dropout layers.
(1) Softmax loss layer

 layer {  
   name: "loss"  
   type: "SoftmaxWithLoss"  
   bottom: "ip2"  
   bottom: "label"  
   top: "loss"  
 }  
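
What this layer computes from its two bottoms (scores and labels) can be sketched as a softmax followed by the negative log-likelihood of the true class, averaged over the batch. The function names are hypothetical, and averaging over the batch is assumed here as the normalization.

```python
import math


def softmax(scores):
    """Softmax of one score vector; subtracting the max avoids overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_loss(batch_scores, labels):
    """Mean over the batch of -log(probability assigned to the true label)."""
    losses = [-math.log(softmax(scores)[label])
              for scores, label in zip(batch_scores, labels)]
    return sum(losses) / len(losses)


# two classes with equal scores: probability 0.5, loss log(2)
print(softmax_with_loss([[0.0, 0.0]], [0]))
```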

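(2) Fully connected (InnerProduct) layer. It treats the input as a flat vector and produces a flat vector as output (the width and height of the output blob are both 1).

Input: n*c0*h*w

Output: n*c1*1*1

A fully connected layer is effectively a convolution layer whose kernel is the same size as its input, so its parameters are largely the same as those of the convolution layer.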

 layer {  
   name: "ip2"  
   type: "InnerProduct"  
   bottom: "ip1"  
   top: "ip2"  
   param {  
     lr_mult: 1  
   }  
   param {  
     lr_mult: 2  
   }  
   inner_product_param {  
     num_output: 10  
     weight_filler {  
       type: "xavier"  
     }  
     bias_filler {  
       type: "constant"  
     }  
   }  
 }  
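
For a single sample, the computation behind InnerProduct can be sketched as a matrix-vector product plus a bias. The function name is hypothetical; `x` is the input flattened to length c0*h*w and `weights` has num_output rows.

```python
def inner_product(x, weights, bias):
    """y[k] = sum_i weights[k][i] * x[i] + bias[k] for each of num_output rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# flattened input of length 3, num_output = 2
print(inner_product([1, 2, 3], [[1, 0, 0], [1, 1, 1]], [0.5, -1]))   # [1.5, 5]
```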

(3) Accuracy layer. It is normally included only in the test phase.

 layer {  
   name: "accuracy"  
   type: "Accuracy"  
   bottom: "ip2"  
   bottom: "label"  
   top: "accuracy"  
   include {  
     phase: TEST  
   }  
 }  
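
The value reported by this layer can be sketched as the fraction of samples whose true label is among the highest-scoring classes. The function name is hypothetical; the `top_k` argument mirrors the top_k option of accuracy_param, which the example above leaves at its default of 1.

```python
def accuracy(batch_scores, labels, top_k=1):
    """Fraction of samples whose label is among the top_k highest scores."""
    hits = 0
    for scores, label in zip(batch_scores, labels):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:top_k]:
            hits += 1
    return hits / len(labels)


scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
print(accuracy(scores, [1, 0, 0]))   # two of three predictions are correct
```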

(4) Reshape layer. It changes the dimensions of a blob without changing its data.

 layer {  
     name: "reshape"  
     type: "Reshape"  
     bottom: "input"  
     top: "output"  
     reshape_param {  
       shape {  
         dim: 0   # copy this dimension from the bottom blob (unchanged)
         dim: 2   # set this dimension to 2
         dim: 3   # set this dimension to 3
         dim: -1  # infer this dimension from the others (total element count is unchanged)
       }  
     }  
   }  
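
How the special values 0 and -1 resolve into a concrete output shape can be sketched as below. The function name and the example input shape are hypothetical.

```python
def infer_reshape(input_shape, dims):
    """Resolve 0 (copy from input) and -1 (infer) into a concrete shape."""
    out = [input_shape[i] if d == 0 else d for i, d in enumerate(dims)]
    total = 1
    for s in input_shape:
        total *= s
    if -1 in out:
        known = 1
        for d in out:
            if d != -1:
                known *= d
        if total % known != 0:
            raise ValueError("element count is not divisible by the known dims")
        out[out.index(-1)] = total // known
    return out


# a 64 x 6 x 4 x 4 blob reshaped with dims 0, 2, 3, -1
print(infer_reshape([64, 6, 4, 4], [0, 2, 3, -1]))   # [64, 2, 3, 16]
```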

(5) Dropout layer. It reduces overfitting by randomly setting a fraction (dropout_ratio) of its input activations to zero during training.
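
A minimal sketch of this behavior, assuming the common "inverted dropout" scheme in which the kept values are scaled by 1 / (1 - dropout_ratio) during training so that the test phase can pass values through unchanged. The function name is hypothetical.

```python
import random


def dropout(values, dropout_ratio=0.5, train=True):
    """Zero each value with probability dropout_ratio; rescale the survivors."""
    if not train:
        # test phase: identity
        return list(values)
    scale = 1.0 / (1.0 - dropout_ratio)
    return [v * scale if random.random() >= dropout_ratio else 0.0
            for v in values]


print(dropout([1.0, 2.0, 3.0, 4.0]))                # some entries zeroed, the rest doubled
print(dropout([1.0, 2.0, 3.0, 4.0], train=False))   # [1.0, 2.0, 3.0, 4.0]
```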

 layer {  
   name: "drop7"  
   type: "Dropout"  
   bottom: "fc7-conv"  
   top: "fc7-conv"  
   dropout_param {  
     dropout_ratio: 0.5  
   }