Caffe Layers Explained in Detail


1. Basic layer definitions and parameters

To define a network with Caffe, you first need to understand Caffe's basic interfaces. The five categories of layers are introduced one by one below.

Vision Layers

Vision layers come from the header ./include/caffe/vision_layers.hpp. They usually take images as input and produce images as output. These layers pay attention to the 2D spatial structure of the image and process the input according to that structure; in particular, most vision layers operate on local regions of the input and produce corresponding regions in the output. Other layers, by contrast, ignore the spatial structure and treat the input simply as one large one-dimensional vector.
Convolution:

Layer type: Convolution
CPU implementation: ./src/caffe/layers/convolution_layer.cpp

CUDA GPU implementation: ./src/caffe/layers/convolution_layer.cu
Parameters (ConvolutionParameter convolution_param)
Required:

num_output (c_o): the number of filters
// number of filters (output channels)
kernel_size (or kernel_h and kernel_w): specifies height and width of each filter
// the size of each filter
Strongly Recommended
weight_filler [default type: 'constant' value: 0]
Optional

bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
// whether to learn an additive bias term
pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
// pad enlarges the input by adding this many pixels to each border
stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
// the step size at which the filter is applied over the input
group (g) [default 1]: If g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the ith output group channels will be only connected to the ith input group channels.
// restricts the connectivity: the input and output channels are split into g groups, and the i-th group of output channels is connected only to the i-th group of input channels

Each filter produces one feature map.
Input size: n × c_i × h_i × w_i (batch size × channels × height × width)
Output size: n × c_o × h_o × w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o likewise.
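
For reference, here is a sketch of what a Convolution layer definition looks like in a prototxt; the blob names ("data", "conv1") and the concrete parameter and filler values are illustrative, not taken from the text above:

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # learning rate multipliers for the filters and the biases
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11 x 11
    stride: 4          # step 4 pixels between filter applications
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01
    }
    bias_filler {
      type: "constant" # initialize the biases to zero
      value: 0
    }
  }
}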

Pooling:
The pooling layer reduces the dimensionality of the features by collapsing each neighboring region into a single value. The currently supported types are max, average, and stochastic pooling.
Parameters:
kernel_size: the size of each pooling window (filter)
pool [default MAX]: the pooling method (MAX, AVE, or STOCHASTIC)
pad [default 0]: the number of pixels to add to each side of the input
stride [default 1]: the interval at which to apply the pooling windows
Input size:
n × c × h_i × w_i
Output size:
n × c × h_o × w_o, where h_o and w_o are computed in the same way as for convolution.
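
A minimal Pooling layer definition might look like the following sketch (names and values are again illustrative):

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX        # MAX, AVE, or STOCHASTIC
    kernel_size: 3   # pool over 3 x 3 regions
    stride: 2        # step 2 pixels between pooling regions
  }
}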

Local Response Normalization (LRN):
Layer type: LRN
CPU Implementation: ./src/caffe/layers/lrn_layer.cpp
CUDA GPU Implementation: ./src/caffe/layers/lrn_layer.cu
Parameters (LRNParameter lrn_param)
Optional
local_size [default 5]: the number of channels to sum over (for cross channel LRN) or the side length of the square region to sum over (for within channel LRN)
alpha [default 1]: the scaling parameter (see below)
beta [default 0.75]: the exponent (see below)
norm_region [default ACROSS_CHANNELS]: whether to sum over adjacent channels (ACROSS_CHANNELS) or nearby spatial locations (WITHIN_CHANNEL)
The local response normalization layer performs a kind of "lateral inhibition" by normalizing over local input regions. In ACROSS_CHANNELS mode, the local regions extend across nearby channels, but have no spatial extent (i.e., they have shape local_size x 1 x 1). In WITHIN_CHANNEL mode, the local regions extend spatially, but are in separate channels (i.e., they have shape 1 x local_size x local_size). Each input value is divided by (1 + (α/n) Σ_i x_i²)^β, where n is the size of each local region, and the sum is taken over the region centered at that value (zero padding is added where necessary).
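
A sketch of an LRN layer definition; the parameter values below are the ones commonly used in AlexNet-style networks and are shown only as an illustration:

layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5    # sum over 5 adjacent channels
    alpha: 0.0001
    beta: 0.75
  }
}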

im2col
Converts image patches into column vectors; Caffe uses im2col internally to turn convolutions into matrix multiplications.

Loss Layers

Loss layers are what the network learns from: training minimizes a loss function computed in the forward pass, and the gradients of that loss are propagated back through the network.
Softmax:
This layer computes the multinomial logistic loss of its input: l(θ) = -log(o_y), where o_y is the predicted probability of the true class y.
Note the difference from SoftmaxWithLoss: SoftmaxWithLoss simply expands o_y in terms of the linear predictions,

l̃(y, z) = -log( e^{z_y} / Σ_{j=1..m} e^{z_j} ) = log( Σ_{j=1..m} e^{z_j} ) - z_y,

where z_i = ω_i^T x + b_i is the linear prediction for class i.
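
As a sketch, a SoftmaxWithLoss layer is typically declared like this (the bottom blob names "fc8" and "label" are placeholders):

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"    # raw class scores, one per class
  bottom: "label"  # ground-truth labels
  top: "loss"
}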

Sum-of-Squares / Euclidean:
Type: EuclideanLoss
The Euclidean loss layer computes the sum of squared differences between its two input vectors,

(1 / 2N) Σ_{i=1..N} || x_i^1 - x_i^2 ||_2^2.
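
A minimal EuclideanLoss declaration might look like this sketch (blob names are placeholders):

layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "pred"    # predicted values
  bottom: "target"  # regression targets
  top: "loss"
}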

Hinge / Margin:
Type: HingeLoss
Options: L1 or L2 norm
Input: n × c × h × w predictions and n × 1 × 1 × 1 labels
Output: a 1 × 1 × 1 × 1 computed loss
Examples:

# L1 Norm
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "pred"
  bottom: "label"
}

# L2 Norm
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}

The hinge loss layer computes a one-vs-all hinge loss (L1 norm) or squared hinge loss (L2 norm).
Sigmoid Cross-Entropy:
Type: SigmoidCrossEntropyLoss
The CPU forward pass first computes the sigmoid of the inputs and then the cross-entropy loss in a numerically stable form:

template <typename Dtype>
void SigmoidCrossEntropyLossLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // The forward pass computes the sigmoid outputs.
  sigmoid_bottom_vec_[0] = bottom[0];
  sigmoid_layer_->Forward(sigmoid_bottom_vec_, sigmoid_top_vec_);
  // Compute the loss (negative log likelihood)
  const int count = bottom[0]->count();
  const int num = bottom[0]->num();
  // Stable version of loss computation from input data
  const Dtype* input_data = bottom[0]->cpu_data();
  const Dtype* target = bottom[1]->cpu_data();
  Dtype loss = 0;
  for (int i = 0; i < count; ++i) {
    loss -= input_data[i] * (target[i] - (input_data[i] >= 0)) -
        log(1 + exp(input_data[i] - 2 * input_data[i] * (input_data[i] >= 0)));
  }
  top[0]->mutable_cpu_data()[0] = loss / num;
}
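
For completeness, a sketch of how this layer might be declared in a prototxt (blob names are placeholders):

layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "score"   # raw (pre-sigmoid) predictions
  bottom: "target"  # binary targets, one per prediction
  top: "loss"
}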

Infogain:
Type: InfogainLoss
A generalization of the multinomial logistic loss that weights each prediction with an information-gain matrix H (specifying H as the identity recovers the plain multinomial logistic loss). The CPU forward and backward passes are implemented as follows:

template <typename Dtype>
void InfogainLossLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const Dtype* bottom_data = bottom[0]->cpu_data();
  const Dtype* bottom_label = bottom[1]->cpu_data();
  const Dtype* infogain_mat = NULL;
  if (bottom.size() < 3) {
    infogain_mat = infogain_.cpu_data();
  } else {
    infogain_mat = bottom[2]->cpu_data();
  }
  int num = bottom[0]->num();
  int dim = bottom[0]->count() / bottom[0]->num();
  Dtype loss = 0;
  for (int i = 0; i < num; ++i) {
    int label = static_cast<int>(bottom_label[i]);
    for (int j = 0; j < dim; ++j) {
      Dtype prob = std::max(bottom_data[i * dim + j], Dtype(kLOG_THRESHOLD));
      loss -= infogain_mat[label * dim + j] * log(prob);
    }
  }
  top[0]->mutable_cpu_data()[0] = loss / num;
}

template <typename Dtype>
void InfogainLossLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[1]) {
    LOG(FATAL) << this->type()
               << " Layer cannot backpropagate to label inputs.";
  }
  if (propagate_down.size() > 2 && propagate_down[2]) {
    LOG(FATAL) << this->type()
               << " Layer cannot backpropagate to infogain inputs.";
  }
  if (propagate_down[0]) {
    const Dtype* bottom_data = bottom[0]->cpu_data();
    const Dtype* bottom_label = bottom[1]->cpu_data();
    const Dtype* infogain_mat = NULL;
    if (bottom.size() < 3) {
      infogain_mat = infogain_.cpu_data();
    } else {
      infogain_mat = bottom[2]->cpu_data();
    }
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    int num = bottom[0]->num();
    int dim = bottom[0]->count() / bottom[0]->num();
    const Dtype scale = - top[0]->cpu_diff()[0] / num;
    for (int i = 0; i < num; ++i) {
      const int label = static_cast<int>(bottom_label[i]);
      for (int j = 0; j < dim; ++j) {
        Dtype prob = std::max(bottom_data[i * dim + j], Dtype(kLOG_THRESHOLD));
        bottom_diff[i * dim + j] = scale * infogain_mat[label * dim + j] / prob;
      }
    }
  }
}

INSTANTIATE_CLASS(InfogainLossLayer);
REGISTER_LAYER_CLASS(InfogainLoss);
}  // namespace caffe
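
A sketch of how an InfogainLoss layer might be declared; the infogain matrix H can be supplied either through infogain_loss_param or as an optional third bottom blob, and the file name below is hypothetical:

layer {
  name: "loss"
  type: "InfogainLoss"
  bottom: "prob"   # predicted probabilities (e.g. the output of a Softmax layer)
  bottom: "label"  # ground-truth labels
  top: "loss"
  infogain_loss_param {
    source: "infogain_matrix.binaryproto"  # hypothetical path to the H matrix
  }
}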

Accuracy and Top-k:

This layer scores the output as the accuracy of the predictions with respect to the actual target labels; it is only a reported metric and has no backward step (it does not take part in backpropagation).
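
A sketch of an Accuracy layer declaration; the top_k value and the blob names are illustrative:

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"    # predicted scores
  bottom: "label"  # ground-truth labels
  top: "accuracy"
  include { phase: TEST }  # usually evaluated only in the test phase
  accuracy_param {
    top_k: 5   # count a prediction as correct if the true label is among the top 5
  }
}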

Activation / Neuron Layers

In general, activation / neuron layers are element-wise operators: they take one bottom blob and produce one top blob of the same size. For the layers below we therefore omit the input and output sizes, since they are identical:
Input: n × c × h × w
Output: n × c × h × w

ReLU / Rectified-Linear and Leaky-ReLU:
Parameters (ReLUParameter relu_param)
Optional
negative_slope [default 0]: specifies whether to leak the negative part by multiplying it with the slope value rather than setting it to 0.

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}

The ReLU layer computes, for each input value x:

f(x) = x if x > 0, and f(x) = negative_slope * x otherwise.

When negative_slope is left at its default of 0, this reduces to the standard max(0, x); for more details see my other short post:
http://blog.csdn.net/swfa1/article/details/45601789

Sigmoid:
Type: Sigmoid
Example:

layer {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: "Sigmoid"
}

Formula:

f(x) = sigmoid(x) = 1 / (1 + e^(-x))

TanH / Hyperbolic Tangent:
Type: TanH
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "TanH"
}

f(x)=tanh(x)

Absolute Value:
Type: AbsVal

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "AbsVal"
}

Formula:

f(x)=abs(x)

Power:
Type: Power
Parameters:
power [default 1]
scale [default 1]
shift [default 0]
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "Power"
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}

Formula:

f(x) = (shift + scale * x) ^ power

BNLL:
Type: BNLL

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "BNLL"
}

Formula:
The BNLL (binomial normal log likelihood) layer computes the output as

log(1+exp(x))

Data Layers

Common Layers

InnerProduct
Type: InnerProduct
Parameters:
Required:
num_output (c_o): the number of filters
Strongly recommended:
weight_filler [default type: 'constant' value: 0]
Optional:
bias_filler [default type: 'constant' value: 0]
bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
Example:

layer {
  name: "fc8"
  type: "InnerProduct"
  # learning rate and decay multipliers for the weights
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

What it does:
The inner product layer, also called the fully connected layer, treats its input as a single vector and produces its output as a single vector as well (i.e., the output blob's height and width are set to 1).

After studying for a while, I found that some of the layers above are not described in much detail, so below I explain the following in more depth:
the Slice, ArgMax, and Eltwise (element-wise) layers.
Slice layer
Splits an input blob into multiple output blobs along a given dimension, so that each piece can be processed separately by the rest of the network.
ArgMaxLayer
Compute the index of the K max values for each datum across all dimensions (C×H×W) .

Intended for use after a classification layer to produce a prediction. If parameter out_max_val is set to true, output is a vector of pairs (max_ind, max_val) for each image. The axis parameter specifies an axis along which to maximise.
NOTE: does not implement Backwards operation.
Eltwise (element-wise) layer
Compute elementwise operations, such as product and sum, along multiple input Blobs.
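
As a sketch of how these two layers might appear in a prototxt (the blob names, the slice axis, and the slice point are illustrative):

# Slice: split "data" into two blobs along the channel axis
layer {
  name: "slice_data"
  type: "Slice"
  bottom: "data"
  top: "data_part1"
  top: "data_part2"
  slice_param {
    axis: 1          # slice along the channel dimension
    slice_point: 3   # first 3 channels go to data_part1, the rest to data_part2
  }
}

# Eltwise: element-wise sum of two blobs of the same shape
layer {
  name: "sum"
  type: "Eltwise"
  bottom: "branch1"
  bottom: "branch2"
  top: "sum"
  eltwise_param {
    operation: SUM   # other options are PROD and MAX
  }
}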

2. The AlexNet network definition

3. How to add a new layer

Add a class declaration for your layer to the appropriate one of common_layers.hpp, data_layers.hpp, loss_layers.hpp, neuron_layers.hpp, or vision_layers.hpp. Include an inline implementation of type and the *Blobs() methods to specify blob number requirements. Omit the *_gpu declarations if you'll only be implementing CPU code.

Implement your layer in layers/your_layer.cpp.

SetUp for initialization: reading parameters, allocating buffers, etc.

Forward_cpu for the function your layer computes

Backward_cpu for its gradient

(Optional) Implement the GPU versions Forward_gpu and Backward_gpu in layers/your_layer.cu.

Add your layer to proto/caffe.proto, updating the next available ID. Also declare parameters, if needed, in this file.

Make your layer creatable by adding it to layer_factory.cpp.

Write tests in test/test_your_layer.cpp. Use test/test_gradient_check_util.hpp to check that your Forward and Backward implementations are in numerical agreement.

The above is the answer from an expert on GitHub, and the steps are quite clear. To make it concrete, suppose we now want to add a new vision layer called Aaa_Layer:

1. Open the .hpp file for the category the layer belongs to; here that is vision_layers.hpp. Add the declaration of the layer yourself, or simply copy the ConvolutionLayer code and change the class name and constructor name to Aaa_Layer. If you do not need the GPU, remove all of the *_gpu declarations.

2. Implement the layer: write Aaa_Layer.cpp and add it to src/caffe/layers, implementing mainly SetUp, Forward_cpu, and Backward_cpu.

3. If a GPU implementation is needed, implement Forward_gpu and Backward_gpu in Aaa_Layer.cu.

4. Modify src/caffe/proto/caffe.proto: find LayerType, add Aaa, and update the next available ID. If the layer has parameters, add an AaaParameter message.

5. Add the corresponding code to src/caffe/layer_factory.cpp.

6. Write a test_Aaa_layer.cpp in src/caffe/test, and use include/caffe/test/test_gradient_check_util.hpp to check that the forward and backward passes are correct.
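
Once the layer is registered, it can be used in a prototxt just like the built-in layers. A purely hypothetical usage sketch for the Aaa_Layer example above (the type name "Aaa" and the parameter block are made up for illustration):

layer {
  name: "aaa1"
  type: "Aaa"        # hypothetical type name registered in the steps above
  bottom: "conv1"
  top: "aaa1"
  # aaa_param { ... }  # would hold any fields declared in the (hypothetical) AaaParameter
}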
