layers
To create a Caffe model, you first define the model architecture in a protocol buffer definition file (prototxt).
Caffe layers and their parameters are defined in the project file caffe.proto, which makes up the protocol buffer definition.
vision layers
Vision layers usually take images as input and produce other images as output. Most vision layers operate on some local region of the input to produce a corresponding region of the output; other layers, by contrast, flatten the image into one big vector and ignore its spatial structure.
Convolution
- Layer type: Convolution
- CPU implementation: ./src/caffe/layers/convolution_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/convolution_layer.cu
- Parameters (ConvolutionParameter convolution_param)
  - Required
    - num_output (c_o): the number of filters
    - kernel_size (or kernel_h and kernel_w): specifies height and width of each filter
  - Strongly Recommended
    - weight_filler [default type: "constant" value: 0]
  - Optional
    - bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
    - pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
    - stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
    - group (g) [default 1]: if g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the i-th output group channels will be only connected to the i-th input group channels.
- Input
  - n * c_i * h_i * w_i
- Output
  - n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1 and w_o likewise.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # learning rate and decay multipliers for the filters
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}
The Convolution layer convolves the input image with a set of learnable filters, each filter producing one feature map in the output image.
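As a quick check of the output-size formula, suppose the conv1 layer above is applied to a 227x227 input (the AlexNet input size, used here purely for illustration): h_o = (227 + 2 * 0 - 11) / 4 + 1 = 55, so each of the 96 filters produces a 55x55 feature map.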
pooling
- Layer type: Pooling
- Parameters (PoolingParameter pooling_param)
  - Optional
    - pool [default MAX]: the pooling method. Currently MAX, AVE, or STOCHASTIC
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}
Local Response Normalization (LRN)
The local response normalization layer performs a kind of "lateral inhibition" by normalizing over local input regions.
In ACROSS_CHANNELS mode, the local regions extend across nearby channels but have no spatial extent; in WITHIN_CHANNEL mode, the local regions extend spatially within each channel.
- Layer type: LRN
- CPU implementation: ./src/caffe/layers/lrn_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/lrn_layer.cu
- Parameters (LRNParameter lrn_param)
  - Optional
    - local_size [default 5]: the number of channels to sum over (for cross channel LRN) or the side length of the square region to sum over (for within channel LRN)
    - alpha [default 1]: the scaling parameter (see below)
    - beta [default 5]: the exponent (see below)

Each input value is divided by (1 + (alpha / n) * sum_i x_i^2)^beta, where n is the size of each local region, and the sum is taken over the region centered at that value (with zero padding added where necessary).
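A minimal prototxt sketch of an LRN layer; the parameter values shown match those used in the AlexNet reference model, and the blob names are placeholders:

layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5   # sum over 5 nearby channels
    alpha: 0.0001   # scaling parameter
    beta: 0.75      # exponent
  }
}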
Loss Layers
Accuracy and Top-k
Accuracy scores the output as the accuracy of output with respect to target – it is not actually a loss and has no backward step.
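A typical usage sketch; the blob names follow the LeNet example and are assumptions, and the include rule restricts scoring to the test phase:

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"    # predicted scores
  bottom: "label"  # ground-truth labels
  top: "accuracy"
  include { phase: TEST }  # only computed during testing
  # for top-5 accuracy, add: accuracy_param { top_k: 5 }
}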
Activation / Neuron Layers
power layer
The Power layer computes the output as (shift + scale * x) ^ power for each input element x.
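A minimal prototxt sketch following the form in the Caffe layer catalogue; the bottom/top names "in" and "out" are placeholders, and the values shown are the parameter defaults (leaving the output equal to the input):

layer {
  name: "power1"
  type: "Power"
  bottom: "in"
  top: "out"
  power_param {
    power: 1  # exponent
    scale: 1  # multiplier
    shift: 0  # additive offset
  }
}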
absolute layer
The AbsVal layer computes the output as abs(x) for each input element x.
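AbsVal takes no parameters; a minimal sketch with placeholder blob names:

layer {
  name: "abs1"
  type: "AbsVal"
  bottom: "in"
  top: "out"
}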
ReLU / Rectified-Linear and Leaky-ReLU
Given an input value x, the ReLU layer computes the output as x if x > 0 and negative_slope * x if x <= 0; when negative_slope is not set, this reduces to the standard ReLU max(x, 0).
It also supports in-place computation, meaning that the bottom and the top blob can be the same, to save memory.
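A minimal in-place sketch; note that bottom and top name the same blob:

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"  # in-place: overwrite conv1 rather than allocating a new blob
}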
BNLL
The BNLL (binomial normal log likelihood) layer computes the output as log(1 + exp(x)) for each input element x.
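Like AbsVal, BNLL takes no parameters; a minimal sketch with placeholder blob names:

layer {
  name: "bnll1"
  type: "BNLL"
  bottom: "in"
  top: "out"
}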
data layer
Data enters Caffe through data layers: they lie at the bottom of nets. Data can come from efficient databases (LevelDB or LMDB), directly from memory, or, when efficiency is not critical, from files on disk in HDF5 or common image formats.
Common input preprocessing (mean subtraction, scaling, random cropping, and mirroring) is available by specifying TransformationParameters.
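A sketch of an LMDB-backed Data layer with some of these transformations enabled; the source and mean_file paths are placeholders, and the parameter choices are illustrative only:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true                   # random horizontal flips
    crop_size: 227                 # random 227x227 crops at training time
    mean_file: "mean.binaryproto"  # placeholder path to the dataset mean
  }
  data_param {
    source: "train_lmdb"  # placeholder path to the LMDB database
    batch_size: 64
    backend: LMDB
  }
}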
in memory
HDF5 input
HDF5 output
images
common layer
slicing
The Slice layer is a utility layer that slices an input layer to multiple output layers along a given dimension (currently num or channel only) with given slice indices.
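The sketch below, adapted from the Caffe layer catalogue, slices a bottom blob of shape N x 3 x 1 x 1 along the channel axis into three single-channel tops:

layer {
  name: "slicer_label"
  type: "Slice"
  bottom: "label"  # e.g. a blob of shape N x 3 x 1 x 1
  top: "label1"
  top: "label2"
  top: "label3"
  slice_param {
    axis: 1         # slice along the channel dimension
    slice_point: 1  # end of label1
    slice_point: 2  # end of label2
  }
}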
inner product
The InnerProduct layer (commonly known as a fully connected layer) treats the input as a simple vector and produces an output in the form of a single vector (with the blob's height and width set to 1).
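A sketch of a fully connected layer; the blob names fc7/fc8 and the 1000-way output follow common ImageNet usage and are assumptions:

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  # learning rate and decay multipliers for the weights
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000  # dimensionality of the output vector
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}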
Splitting
The Split layer is a utility layer that splits an input blob to multiple output blobs. This is used when a blob is fed into multiple output layers.
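A minimal sketch with placeholder blob names; each top receives a copy of the bottom blob, and the gradients from all tops are accumulated in the backward pass:

layer {
  name: "split1"
  type: "Split"
  bottom: "data"
  top: "data_1"
  top: "data_2"
}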