Caffe Layer Types

This is bill
 
 
  
  Vision Layers

1.1 Convolution Layer

Type: CONVOLUTION

Example

  
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}

  
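**blobs_lr:** learning rate multipliers. In the example above, the filter weights use the learning rate that the solver supplies at run time, and the biases use twice that rate. (Setting a multiplier to 0 freezes the parameter.)

**weight_decay:** weight decay multipliers. In the example above, the multiplier is 1 for the filters and 0 for the biases.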

Important parameters of the convolution layer

Required parameters:

num_output (c_o): the number of filters

kernel_size (or kernel_h and kernel_w): the size of each filter

Optional parameters:

weight_filler [default type: 'constant' value: 0]: how the weights are initialized

bias_filler: how the biases are initialized

bias_term [default true]: whether to enable the bias term

pad (or pad_h and pad_w) [default 0]: the number of pixels to add to each side of the input

stride (or stride_h and stride_w) [default 1]: the step between filter applications

**group (g) [default 1]:** If g > 1, the connectivity of each filter is restricted to a subset of the input. Specifically, the input and output channels are separated into g groups, and the i-th output group channels are connected only to the i-th input group channels.

  
Size change through the convolution:

Input: n * c_i * h_i * w_i

Output: n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o is computed the same way.
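The output-size formula is easy to check numerically. The sketch below is only an illustration: the helper name `conv_output_size` and the 227-pixel input are assumptions, and integer (floor) division is assumed for sizes that do not divide evenly.

```python
def conv_output_size(in_size, kernel, pad=0, stride=1):
    """Output size along one spatial axis: (in + 2*pad - kernel) / stride + 1."""
    # Floor division is an assumption for inputs that do not divide evenly.
    return (in_size + 2 * pad - kernel) // stride + 1


# Hypothetical 227x227 input with the conv1 settings (kernel 11, stride 4, no pad).
h_o = conv_output_size(227, kernel=11, pad=0, stride=4)
print(h_o)  # 55
```

With the 96 filters of the example, a batch of n such inputs would give an output of n * 96 * 55 * 55.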

  
1.2 Pooling Layer

Type: POOLING

Example

layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}

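Important parameters of the pooling layer

Required parameters:

kernel_size (or kernel_h and kernel_w): the size of the pooling window

Optional parameters:

pool [default MAX]: the pooling method; currently MAX, AVE, and STOCHASTIC are available

pad (or pad_h and pad_w) [default 0]: the number of pixels to add to each side of the input

stride (or stride_h and stride_w) [default 1]: the step between pooling windows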

  
Size change through pooling:

Input: n * c_i * h_i * w_i

Output: n * c_i * h_o * w_o (pooling leaves the channel count unchanged), where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o is computed the same way.
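As a rough illustration of MAX pooling and its effect on blob size, here is a naive NumPy sketch. It is not Caffe's implementation: the function name `max_pool2d` is hypothetical, padding is omitted, and windows that do not fit are dropped by floor division, which is an assumption about rounding.

```python
import numpy as np


def max_pool2d(x, kernel, stride):
    """Naive max pooling over an (n, c, h, w) array, without padding."""
    n, c, h, w = x.shape
    h_o = (h - kernel) // stride + 1
    w_o = (w - kernel) // stride + 1
    out = np.empty((n, c, h_o, w_o), dtype=x.dtype)
    for i in range(h_o):
        for j in range(w_o):
            top, left = i * stride, j * stride
            window = x[:, :, top:top + kernel, left:left + kernel]
            # Reduce each window to its maximum, per sample and per channel.
            out[:, :, i, j] = window.max(axis=(2, 3))
    return out


x = np.arange(2 * 3 * 7 * 7, dtype=float).reshape(2, 3, 7, 7)
y = max_pool2d(x, kernel=3, stride=2)
print(y.shape)  # (2, 3, 3, 3)
```

The batch and channel dimensions pass through untouched; only the two spatial dimensions shrink.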

  
  1.3 Local Response Normalization (LRN)

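Type: LRN

Local Response Normalization normalizes over a local input region: each activation a is divided by a normalization term to produce a new activation b. It comes in two forms. In cross-channel LRN the region spans adjacent channels; in within-channel LRN it spans a spatial region inside a single channel.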



  
  
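Formula: each input value is divided by (1 + (alpha / n) * sum_i x_i^2) ^ beta, where n is the size of the local region and the sum runs over the region centered at that value.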
  
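Optional parameters:

local_size [default 5]: for cross-channel LRN, the number of neighboring channels to sum over; for within-channel LRN, the side length of the spatial region to sum over

alpha [default 1]: the scaling parameter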

beta [default 0.75]: the exponent

**norm_region [default ACROSS_CHANNELS]:** which LRN form to use, ACROSS_CHANNELS or WITHIN_CHANNEL
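To make the cross-channel form concrete, the following NumPy sketch divides each activation by (1 + (alpha / n) * sum of squares) ^ beta, where the sum runs over neighboring channels. It is an illustration under assumptions rather than Caffe's code: the function name is hypothetical, and the window is simply clipped at the first and last channel.

```python
import numpy as np


def lrn_across_channels(x, local_size, alpha, beta):
    """Cross-channel LRN over an (n, c, h, w) array."""
    c = x.shape[1]
    half = local_size // 2
    squared = x.astype(float) ** 2
    out = np.empty(x.shape, dtype=float)
    for k in range(c):
        # Neighboring channels centered at k, clipped at the borders (assumption).
        lo, hi = max(0, k - half), min(c, k + half + 1)
        scale = 1.0 + (alpha / local_size) * squared[:, lo:hi].sum(axis=1)
        out[:, k] = x[:, k] / scale ** beta
    return out


x = np.ones((1, 3, 1, 1))
b = lrn_across_channels(x, local_size=3, alpha=3.0, beta=1.0)
print(b[0, :, 0, 0])
```

In this toy input the middle channel sees three neighbors, so its scale is 1 + (3 / 3) * 3 = 4 and its output is 0.25; the two edge channels see only two neighbors each and come out as 1/3.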

  
  Loss Layers

Deep learning drives learning by minimizing a loss between the output and the target.

  
  2.1 Softmax

Type: SOFTMAX_LOSS

2.2 Sum-of-Squares / Euclidean

Type: EUCLIDEAN_LOSS

  
  2.3 Hinge / Margin

Type: HINGE_LOSS

Examples:

  
L1 norm:

layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
}

L2 norm:

layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}

  
Optional parameters:

**norm [default L1]:** the norm to use, L1 or L2

Inputs:

n * c * h * w Predictions

n * 1 * 1 * 1 Labels

Output:

1 * 1 * 1 * 1 Computed Loss
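The shapes above (per-class predictions in, one scalar out) can be illustrated with a one-vs-all hinge loss. This is a sketch under assumptions: the one-vs-all formulation, the margin of 1, and the averaging over the batch are choices made here for illustration, and the function name `hinge_loss` is hypothetical.

```python
import numpy as np


def hinge_loss(pred, label, norm="L1"):
    """pred: (n, num_classes) float scores; label: (n,) integer class indices."""
    n = pred.shape[0]
    signs = np.ones(pred.shape, dtype=float)
    # The true class should score high, so its sign is flipped.
    signs[np.arange(n), label] = -1.0
    margins = np.maximum(0.0, 1.0 + signs * pred)
    if norm == "L1":
        return margins.sum() / n
    if norm == "L2":
        return (margins ** 2).sum() / n
    raise ValueError("norm must be 'L1' or 'L2'")


pred = np.array([[0.5, 0.5]])
label = np.array([0])
print(hinge_loss(pred, label, "L1"))  # 2.0
print(hinge_loss(pred, label, "L2"))  # 2.5
```

For this single sample the margins are 0.5 for the true class and 1.5 for the other, so the L1 variant sums them and the L2 variant sums their squares.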

  
  2.4 Sigmoid Cross-Entropy

Type: SIGMOID_CROSS_ENTROPY_LOSS

2.5 Infogain

Type: INFOGAIN_LOSS

2.6 Accuracy and Top-k

Type: ACCURACY

This layer scores the output against the target as an accuracy. It is not actually a loss and has no backward step.
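Top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. A minimal NumPy sketch, assuming ties are broken by whatever order `argsort` produces (the function name and the sample scores are hypothetical):

```python
import numpy as np


def top_k_accuracy(scores, labels, k=1):
    """scores: (n, num_classes); labels: (n,) integer class indices."""
    # Indices of the k highest scores per row, best first.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return hits.mean()


scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2]])
labels = np.array([2, 0])
print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 1.0
```

Nothing here is differentiated, which matches the remark that the layer has no backward step.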

  
Activation / Neuron Layers

Activation layers are generally element-wise operations: the output has the same size as the input, and the operation is usually a nonlinear function.

  
  3.1 ReLU / Rectified-Linear and Leaky-ReLU

Type: RELU

Example:

layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}

  
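Optional parameters:

negative_slope [default 0]: the slope applied to negative inputs, so that they are multiplied by this value instead of being set to 0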

  
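ReLU is currently the most widely used activation function, mainly because it converges faster while achieving comparable results.

The standard ReLU function is max(x, 0). In the general form, the output is x when x > 0 and negative_slope * x when x <= 0. The RELU layer supports in-place computation, meaning the top and bottom blobs can be the same blob, which saves memory.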
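The general form can be written in a couple of lines of NumPy. This is a sketch for illustration only; the function name `relu` and the sample values are assumptions.

```python
import numpy as np


def relu(x, negative_slope=0.0):
    """Return x where x > 0, and negative_slope * x elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, negative_slope * x)


standard = relu([-2.0, 0.0, 3.0])                    # plain max(x, 0)
leaky = relu([-2.0, 0.0, 3.0], negative_slope=0.1)   # leaky variant
```

With negative_slope left at 0 this reduces to max(x, 0); a small positive slope gives the leaky variant, where -2.0 maps to -0.2.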

  
  3.2 Sigmoid

Type: SIGMOID

Example:

layers {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: SIGMOID
}



  
  
The SIGMOID layer computes sigmoid(x) = 1 / (1 + exp(-x)) for each input element x.
  
  3.3 TanH / Hyperbolic Tangent

Type: TANH

Example:

layers {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: TANH
}

  
The TANH layer computes tanh(x) for each input element x.
  
3.4 Absolute Value

Type: ABSVAL

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: ABSVAL
}

The ABSVAL layer computes abs(x) for each input element x.

3.5 Power

Type: POWER

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: POWER
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}

Optional parameters:

power [default 1]

scale [default 1]

shift [default 0]

The POWER layer computes (shift + scale * x) ^ power for each input element x.

3.6 BNLL

Type: BNLL

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: BNLL
}

The BNLL (binomial normal log likelihood) layer computes log(1 + exp(x)) for each input element x.
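The ABSVAL, POWER, and BNLL formulas are all one-liners in NumPy. The sketch below uses hypothetical function names; for BNLL it relies on `np.logaddexp(0, x)`, which equals log(1 + exp(x)) but does not overflow for large x.

```python
import numpy as np


def absval(x):
    """abs(x), element-wise."""
    return np.abs(x)


def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """(shift + scale * x) ^ power, element-wise."""
    return (shift + scale * np.asarray(x, dtype=float)) ** power


def bnll(x):
    """log(1 + exp(x)), computed as logaddexp(0, x) for numerical stability."""
    return np.logaddexp(0.0, x)


print(power_layer(2.0, power=2, scale=3, shift=1))  # 49.0
```

With the default power, scale, and shift values, `power_layer` is the identity, matching the defaults listed for the POWER layer.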

  
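Data Layers

Data enters Caffe through data layers, which sit at the bottom of the network. Data can come from efficient databases (LevelDB or LMDB) or directly from memory. When efficiency is not critical, data can also be read from disk in HDF5 or common image formats.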

  
  4.1 Database

  
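Type: DATA

Required parameters:

source: the name of the directory containing the database

batch_size: the number of inputs to process at one time

Optional parameters:

rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous stochastic gradient descent (SGD)

backend [default LEVELDB]: choose between LEVELDB and LMDB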

  
  4.2 In-Memory

Type: MEMORY_DATA

Required parameters:

batch_size, channels, height, width: specify the size of the input chunks to read from memory

The memory data layer reads data directly from memory, without copying it. To use it, one must call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) to specify a source of contiguous data (as a 4D row-major array), which is read one batch-sized chunk at a time.

  
  4.3 HDF5 Input

Type: HDF5_DATA

Required parameters:

source: the name of the file to read from

batch_size: the number of inputs to process at one time

  
  4.4 HDF5 Output

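Type: HDF5_OUTPUT

Required parameters:

file_name: the name of the output file

The HDF5 output layer works differently from the other layers in this section: it writes its input blobs to disk.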

  
  4.5 Images

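Type: IMAGE_DATA

Required parameters:

source: the name of a text file in which each line gives an image filename and a label

batch_size: the number of images in one batch

Optional parameters:

rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous stochastic gradient descent (SGD)

shuffle [default false]

new_height, new_width: resize all images to this size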
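The source file format (one image filename and one label per line) can be parsed and batched in plain Python. This only sketches the list handling, not the image loading; the filenames, the integer labels, and both helper names are hypothetical.

```python
def parse_image_list(text):
    """Parse lines of the form '<filename> <label>' into (filename, label) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last whitespace run so filenames may contain spaces.
        path, label = line.rsplit(maxsplit=1)
        entries.append((path, int(label)))
    return entries


def batches(entries, batch_size):
    """Yield consecutive chunks of at most batch_size entries."""
    for start in range(0, len(entries), batch_size):
        yield entries[start:start + batch_size]


listing = "cat.jpg 0\ndog.jpg 1\nbird.jpg 2\n"
entries = parse_image_list(listing)
sizes = [len(batch) for batch in batches(entries, batch_size=2)]
print(sizes)  # [2, 1]
```

A shuffle option would amount to reordering `entries` before batching, for example with `random.shuffle`.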

  
  4.6 Windows

Type: WINDOW_DATA

4.7 Dummy

Type: DUMMY_DATA

The DUMMY_DATA layer is meant for development and debugging. See DummyDataParameter for its parameters.

  
Common Layers

5.1 Inner Product (Fully Connected Layer)

Type: INNER_PRODUCT

Example:

layers {
  name: "fc8"
  type: INNER_PRODUCT
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

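Required parameters:

num_output (c_o): the number of filters (outputs)

Optional parameters:

weight_filler [default type: 'constant' value: 0]: how the weights are initialized

bias_filler: how the biases are initialized

bias_term [default true]: whether to enable the bias term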

  
Size change through the inner product layer:

Input: n * c_i * h_i * w_i

Output: n * c_o * 1 * 1
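The size change follows from treating each input sample as one flat vector and multiplying it by a weight matrix. A NumPy sketch of that idea, where the function name and the (c_o, c_i * h_i * w_i) weight layout are assumptions made for illustration:

```python
import numpy as np


def inner_product(x, weight, bias=None):
    """x: (n, c_i, h_i, w_i); weight: (c_o, c_i*h_i*w_i); bias: (c_o,) or None."""
    n = x.shape[0]
    flat = x.reshape(n, -1)      # one row vector per sample
    out = flat @ weight.T        # shape (n, c_o)
    if bias is not None:
        out = out + bias
    return out.reshape(n, -1, 1, 1)


x = np.ones((2, 3, 4, 5))
weight = np.ones((10, 3 * 4 * 5))
bias = np.zeros(10)
y = inner_product(x, weight, bias)
print(y.shape)  # (2, 10, 1, 1)
```

Every output here is 60.0, the sum of the 3 * 4 * 5 ones in each sample, which confirms that all spatial positions and channels feed every output unit.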

  
  5.2 Splitting

Type: SPLIT

The Splitting layer splits one input blob into multiple output blobs. It is used when a blob needs to be fed into multiple output layers.

5.3 Flattening

Type: FLATTEN

The Flattening layer turns an input of size n * c * h * w into a simple vector of size n * (c*h*w) * 1 * 1.

5.4 Concatenation

Type: CONCAT

Example:

layers {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: CONCAT
  concat_param {
    concat_dim: 1
  }
}

  
Optional parameters:

concat_dim [default 1]: 0 concatenates along num, 1 concatenates along channels

  
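Size change through the concatenation layer:

Input: n_i * c_i * h * w for each input blob i from 1 to K

Output:

If concat_dim = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w, and all inputs must have the same c_i.

If concat_dim = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w, and all inputs must have the same n_i.

The Concatenation layer joins multiple blobs into a single blob.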
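The two cases map directly onto `np.concatenate` with axis 0 (num) or axis 1 (channels). A short sketch, with a hypothetical wrapper name and made-up blob shapes:

```python
import numpy as np


def concat(blobs, concat_dim=1):
    """Join (n, c, h, w) arrays along num (0) or channels (1)."""
    if concat_dim not in (0, 1):
        raise ValueError("concat_dim must be 0 (num) or 1 (channels)")
    return np.concatenate(blobs, axis=concat_dim)


a = np.zeros((2, 3, 4, 4))
b = np.ones((2, 5, 4, 4))   # same n as a, different c
c = np.ones((6, 3, 4, 4))   # same c as a, different n

by_channel = concat([a, b], concat_dim=1)
by_num = concat([a, c], concat_dim=0)
print(by_channel.shape, by_num.shape)  # (2, 8, 4, 4) (8, 3, 4, 4)
```

NumPy raises an error if the non-concatenated dimensions differ, which mirrors the requirement that all inputs share the same n_i or c_i.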

  
  5.5 Slicing

The SLICE layer is a utility layer that slices an input layer into multiple output layers along a given dimension (currently num or channel only) at given slice indices.

5.6 Elementwise Operations

Type: ELTWISE

5.7 Argmax

Type: ARGMAX

5.8 Softmax

Type: SOFTMAX

5.9 Mean-Variance Normalization

Type: MVN
 
 

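Author: 陈继科
Link: http://www.jianshu.com/p/0ade01e9e48a
Source: 简书 (Jianshu)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.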