layers
To create a Caffe model, you first define the model architecture in a protocol buffer definition file (prototxt).
Caffe layers and their parameters are defined in the project file caffe.proto, which makes up the protocol buffer definition.
vision layers
Vision layers usually take images as input and produce other images as output. Most vision layers operate on some local region of the input to produce a corresponding region of the output; other layers, by contrast, flatten the image into one big vector and ignore its spatial structure.
Convolution
- Layer type: Convolution
- CPU implementation: ./src/caffe/layers/convolution_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/convolution_layer.cu
- Parameters (ConvolutionParameter convolution_param)
  - Required
    - num_output (c_o): the number of filters
    - kernel_size (or kernel_h and kernel_w): specifies height and width of each filter
  - Strongly Recommended
    - weight_filler [default type: "constant" value: 0]
  - Optional
    - bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
    - pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
    - stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
    - group (g) [default 1]: if g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the i-th output group channels will be only connected to the i-th input group channels.
- Input
  - n * c_i * h_i * w_i
- Output
  - n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1 and w_o likewise.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # learning rate and decay multipliers for the filters
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}
The Convolution layer convolves the input image with a set of learnable filters, each filter producing one feature map in the output image.
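As a quick check of the output-size formula, suppose the conv1 layer above is applied to a 227x227 input (the AlexNet input size, used here purely for illustration): h_o = (227 + 2 * 0 - 11) / 4 + 1 = 55, so each of the 96 filters produces a 55x55 feature map.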
pooling
- Layer type: Pooling
- Parameters (PoolingParameter pooling_param)
  - Optional
    - pool [default MAX]: the pooling method. Currently MAX, AVE, or STOCHASTIC
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}
Local Response Normalization (LRN)
The local response normalization layer performs a kind of "lateral inhibition" by normalizing over local input regions.
In ACROSS_CHANNELS mode, the local regions extend across nearby channels but have no spatial extent; in WITHIN_CHANNEL mode, the local regions extend spatially within each channel.
- Layer type: LRN
- CPU implementation: ./src/caffe/layers/lrn_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/lrn_layer.cu
- Parameters (LRNParameter lrn_param)
  - Optional
    - local_size [default 5]: the number of channels to sum over (for cross channel LRN) or the side length of the square region to sum over (for within channel LRN)
    - alpha [default 1]: the scaling parameter (see below)
    - beta [default 5]: the exponent (see below)

Each input value is divided by (1 + (alpha / n) * sum_i x_i^2)^beta, where n is the size of each local region, and the sum is taken over the region centered at that value (with zero padding added where necessary).
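A minimal prototxt sketch of an LRN layer; the parameter values shown match those used in the AlexNet reference model, and the blob names are placeholders:

layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5   # sum over 5 nearby channels
    alpha: 0.0001   # scaling parameter
    beta: 0.75      # exponent
  }
}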
Loss Layers
Accuracy and Top-k
Accuracy scores the output as the accuracy of output with respect to target – it is not actually a loss and has no backward step.
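A typical usage sketch; the blob names follow the LeNet example and are assumptions, and the include rule restricts scoring to the test phase:

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"    # predicted scores
  bottom: "label"  # ground-truth labels
  top: "accuracy"
  include { phase: TEST }  # only computed during testing
  # for top-5 accuracy, add: accuracy_param { top_k: 5 }
}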
Activation / Neuron Layers
power layer
The Power layer computes the output as (shift + scale * x) ^ power for each input element x.
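A minimal prototxt sketch following the form in the Caffe layer catalogue; the bottom/top names "in" and "out" are placeholders, and the values shown are the parameter defaults (leaving the output equal to the input):

layer {
  name: "power1"
  type: "Power"
  bottom: "in"
  top: "out"
  power_param {
    power: 1  # exponent
    scale: 1  # multiplier
    shift: 0  # additive offset
  }
}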
absolute layer
The AbsVal layer computes the output as abs(x) for each input element x.
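AbsVal takes no parameters; a minimal sketch with placeholder blob names:

layer {
  name: "abs1"
  type: "AbsVal"
  bottom: "in"
  top: "out"
}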
ReLU / Rectified-Linear and Leaky-ReLU
Given an input value x, the ReLU layer computes the output as x if x > 0 and negative_slope * x if x <= 0; when negative_slope is not set, this reduces to the standard ReLU max(x, 0).
It also supports in-place computation, meaning that the bottom and the top blob can be the same, to save memory.
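A minimal in-place sketch; note that bottom and top name the same blob:

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"  # in-place: overwrite conv1 rather than allocating a new blob
}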
BNLL
The BNLL (binomial normal log likelihood) layer computes the output as log(1 + exp(x)) for each input element x.
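Like AbsVal, BNLL takes no parameters; a minimal sketch with placeholder blob names:

layer {
  name: "bnll1"
  type: "BNLL"
  bottom: "in"
  top: "out"
}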
data layer
Data enters Caffe through data layers: they lie at the bottom of nets. Data can come from efficient databases (LevelDB or LMDB), directly from memory, or, when efficiency is not critical, from files on disk in HDF5 or common image formats.
Common input preprocessing (mean subtraction, scaling, random cropping, and mirroring) is available by specifying TransformationParameters.
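A sketch of an LMDB-backed Data layer with some of these transformations enabled; the source and mean_file paths are placeholders, and the parameter choices are illustrative only:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true                   # random horizontal flips
    crop_size: 227                 # random 227x227 crops at training time
    mean_file: "mean.binaryproto"  # placeholder path to the dataset mean
  }
  data_param {
    source: "train_lmdb"  # placeholder path to the LMDB database
    batch_size: 64
    backend: LMDB
  }
}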
in memory
HDF5 input
HDF5 output
images
common layer
slicing
The Slice layer is a utility layer that slices an input layer to multiple output layers along a given dimension (currently num or channel only) with given slice indices.
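The sketch below, adapted from the Caffe layer catalogue, slices a bottom blob of shape N x 3 x 1 x 1 along the channel axis into three single-channel tops:

layer {
  name: "slicer_label"
  type: "Slice"
  bottom: "label"  # e.g. a blob of shape N x 3 x 1 x 1
  top: "label1"
  top: "label2"
  top: "label3"
  slice_param {
    axis: 1         # slice along the channel dimension
    slice_point: 1  # end of label1
    slice_point: 2  # end of label2
  }
}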
inner product
The InnerProduct layer (commonly known as a fully connected layer) treats the input as a simple vector and produces an output in the form of a single vector (with the blob's height and width set to 1).
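A sketch of a fully connected layer; the blob names fc7/fc8 and the 1000-way output follow common ImageNet usage and are assumptions:

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  # learning rate and decay multipliers for the weights
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000  # dimensionality of the output vector
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}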
Splitting
The Split layer is a utility layer that splits an input blob to multiple output blobs. This is used when a blob is fed into multiple output layers.
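A minimal sketch with placeholder blob names; each top receives a copy of the bottom blob, and the gradients from all tops are accumulated in the backward pass:

layer {
  name: "split1"
  type: "Split"
  bottom: "data"
  top: "data_1"
  top: "data_2"
}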