This post is a set of study notes based on the book 《深度学习21天实战Caffe》; if you spot any mistakes, corrections are welcome.
一、Previous Posts
caffe source code walkthrough (1): Getting to know protobuf
caffe source code walkthrough (2): Introduction to the Blob data structure
caffe source code walkthrough (3): Parsing the Blob.hpp header
caffe source code walkthrough (4): Parsing Blob.cpp
===========================================================================
二、Main Content
Layer is Caffe's basic unit of computation. Every layer has at least one input Blob (bottom) and one output Blob (top), and some layers also carry weights and biases. A layer computes in two directions: forward and backward. The forward pass transforms the input Blobs into the output Blobs (layers with weights and biases apply them during this transformation); the backward pass takes the diff of the output Blobs and computes the diff of the input Blobs (layers with weights and biases may also compute the diffs of their weight and bias Blobs).
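As a concrete anchor for the fields discussed below, here is a minimal, hypothetical layer definition in prototxt (Caffe's text-format protobuf); all names are invented for illustration. It shows a convolution layer with one bottom Blob, one top Blob, and learnable weight/bias Blobs:
layer {
  name: "conv1"          # this layer's name (invented for the example)
  type: "Convolution"    # a layer type that carries weights and a bias
  bottom: "data"         # input Blob
  top: "conv1"           # output Blob
  convolution_param {
    num_output: 32       # 32 output channels -> one weight Blob, one bias Blob
    kernel_size: 3
  }
}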
2.1 The LayerParameter message (proto)
Reading the proto definition, note that LayerParameter also references Phase, ParamSpec, and several other messages, which are covered in the subsections that follow.
// NOTE
// Update the next available ID when you add a new LayerParameter field.
// LayerParameter next available layer-specific ID: 149 (last added: clip_param)
message LayerParameter {
optional string name = 1; // the layer name
optional string type = 2; // the layer type
repeated string bottom = 3; // the name of each bottom (input) blob
repeated string top = 4; // the name of each top (output) blob
// The train / test phase for computation.
optional Phase phase = 10; // TRAIN or TEST
// The amount of weight to assign each top blob in the objective.
// Each layer assigns a default value, usually of either 0 or 1,
// to each top blob.
// A weight of 0 means the top blob does not enter the loss; a weight of 1
// means it contributes to the loss with unit weight.
repeated float loss_weight = 5;
// Specifies training parameters (multipliers on global learning constants,
// and the name and other settings used for weight sharing).
// e.g. lr_mult and decay_mult (multipliers on the global learning rate and
// weight decay), plus the name used for weight sharing; see ParamSpec below.
repeated ParamSpec param = 6;
// The blobs containing the numeric parameters of the layer.
// i.e. the layer's learnable weight and bias Blobs
repeated BlobProto blobs = 7;
// Specifies whether to backpropagate to each bottom. If unspecified,
// Caffe will automatically infer whether each input needs backpropagation
// to compute parameter gradients. If set to true for some inputs,
// backpropagation to those inputs is forced; if set false for some inputs,
// backpropagation to those inputs is skipped.
// The size must be either 0 or equal to the number of bottoms.
repeated bool propagate_down = 11;
// Rules controlling whether and when a layer is included in the network,
// based on the current NetState. You may specify a non-zero number of rules
// to include OR exclude, but not both. If no include or exclude rules are
// specified, the layer is always included. If the current NetState meets
// ANY (i.e., one or more) of the specified rules, the layer is
// included/excluded.
repeated NetStateRule include = 8;
repeated NetStateRule exclude = 9;
// Parameters for data pre-processing.
optional TransformationParameter transform_param = 100;
// Parameters shared by loss layers.
optional LossParameter loss_param = 101;
// Layer type-specific parameters.
//
// Note: certain layers may have more than one computational engine
// for their implementation. These layers include an Engine type and
// engine parameter for selecting the implementation.
// The default for the engine is set by the ENGINE switch at compile-time.
optional AccuracyParameter accuracy_param = 102;
optional ArgMaxParameter argmax_param = 103;
optional BatchNormParameter batch_norm_param = 139;
optional BiasParameter bias_param = 141;
optional ClipParameter clip_param = 148; // last added; next ID is 149
optional ConcatParameter concat_param = 104;
optional ContrastiveLossParameter contrastive_loss_param = 105;
optional ConvolutionParameter convolution_param = 106;
optional CropParameter crop_param = 144;
optional DataParameter data_param = 107;
optional DropoutParameter dropout_param = 108;
optional DummyDataParameter dummy_data_param = 109;
optional EltwiseParameter eltwise_param = 110;
optional ELUParameter elu_param = 140;
optional EmbedParameter embed_param = 137;
optional ExpParameter exp_param = 111;
optional FlattenParameter flatten_param = 135;
optional HDF5DataParameter hdf5_data_param = 112;
optional HDF5OutputParameter hdf5_output_param = 113;
optional HingeLossParameter hinge_loss_param = 114;
optional ImageDataParameter image_data_param = 115;
optional InfogainLossParameter infogain_loss_param = 116;
optional InnerProductParameter inner_product_param = 117;
optional InputParameter input_param = 143;
optional LogParameter log_param = 134;
optional LRNParameter lrn_param = 118;
optional MemoryDataParameter memory_data_param = 119;
optional MVNParameter mvn_param = 120;
optional ParameterParameter parameter_param = 145;
optional PoolingParameter pooling_param = 121;
optional PowerParameter power_param = 122;
optional PReLUParameter prelu_param = 131;
optional PythonParameter python_param = 130;
optional RecurrentParameter recurrent_param = 146;
optional ReductionParameter reduction_param = 136;
optional ReLUParameter relu_param = 123;
optional ReshapeParameter reshape_param = 133;
optional ScaleParameter scale_param = 142;
optional SigmoidParameter sigmoid_param = 124;
optional SoftmaxParameter softmax_param = 125;
optional SPPParameter spp_param = 132;
optional SliceParameter slice_param = 126;
optional SwishParameter swish_param = 147;
optional TanHParameter tanh_param = 127;
optional ThresholdParameter threshold_param = 128;
optional TileParameter tile_param = 138;
optional WindowDataParameter window_data_param = 129;
}
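To connect the message to everyday usage, here is a hedged prototxt sketch (layer and blob names invented) showing how name, type, bottom, top, param, loss_weight, and include appear in a net definition:
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param { lr_mult: 1 decay_mult: 1 }   # ParamSpec for the weights
  param { lr_mult: 2 decay_mult: 0 }   # ParamSpec for the bias
  inner_product_param { num_output: 10 }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"          # two bottom Blobs: predictions and labels
  top: "loss"
  loss_weight: 1           # this top contributes to the objective
  include { phase: TRAIN } # NetStateRule: instantiate only in TRAIN
}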
2.2 The Phase enum
Just two values: TRAIN and TEST.
enum Phase {
TRAIN = 0;
TEST = 1;
}
2.3 The ParamSpec message
// Specifies training parameters (multipliers on global learning constants,
// and the name and other settings used for weight sharing).
message ParamSpec {
// The names of the parameter blobs -- useful for sharing parameters among
// layers, but never required otherwise. To share a parameter between two
// layers, give it a (non-empty) name.
optional string name = 1;
// Whether to require shared weights to have the same shape, or just the same
// count -- defaults to STRICT if unspecified.
optional DimCheckMode share_mode = 2;
enum DimCheckMode {
// STRICT (default) requires that num, channels, height, width each match.
STRICT = 0;
// PERMISSIVE requires only the count (num*channels*height*width) to match.
PERMISSIVE = 1;
}
// The multiplier on the global learning rate for this parameter.
// effective learning rate = base_lr * lr_mult, where base_lr comes from the solver
optional float lr_mult = 3 [default = 1.0];
// The multiplier on the global weight decay for this parameter.
// effective weight decay = weight_decay * decay_mult, with weight_decay from the solver
optional float decay_mult = 4 [default = 1.0];
}
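A hedged sketch of weight sharing (all names invented): two layers share one weight Blob by giving their first ParamSpec the same non-empty name; under the default STRICT mode the shared shapes must match exactly, so both inputs must have the same flattened dimension. As a worked example of the multipliers, with base_lr: 0.01 in the solver, lr_mult: 2 gives this parameter an effective learning rate of 0.01 * 2 = 0.02.
layer {
  name: "fc_a"
  type: "InnerProduct"
  bottom: "in_a"
  top: "fc_a"
  param { name: "shared_w" lr_mult: 2 decay_mult: 1 }  # weights, shared by name
  param { lr_mult: 2 decay_mult: 0 }                   # bias, not shared
  inner_product_param { num_output: 128 }
}
layer {
  name: "fc_b"
  type: "InnerProduct"
  bottom: "in_b"   # must match the shape of "in_a" for STRICT sharing
  top: "fc_b"
  param { name: "shared_w" lr_mult: 2 decay_mult: 1 }  # same Blob as in fc_a
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 128 }
}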
2.4 The NetStateRule message
// a rule matched against the current network state (NetState)
message NetStateRule {
// Set phase to require the NetState have a particular phase (TRAIN or TEST)
// to meet this rule.
optional Phase phase = 1;
// Set the minimum and/or maximum levels in which the layer should be used.
// Leave undefined to meet the rule regardless of level.
optional int32 min_level = 2;
optional int32 max_level = 3;
// Customizable sets of stages to include or exclude.
// The net must have ALL of the specified stages and NONE of the specified
// "not_stage"s to meet the rule.
// (Use multiple NetStateRules to specify conjunctions of stages.)
repeated string stage = 4;
repeated string not_stage = 5;
}
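A hedged example of include rules (sources and stage names invented): the first layer is instantiated only in the TRAIN phase; the second only when the NetState carries the custom stage "deploy":
layer {
  name: "data_train"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }     # present only while training
  data_param { source: "train_lmdb" batch_size: 64 backend: LMDB }
}
layer {
  name: "data_deploy"
  type: "Input"
  top: "data"
  include { stage: "deploy" }  # present only if the net state has this stage
  input_param { shape { dim: 1 dim: 3 dim: 224 dim: 224 } }
}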
2.5 The TransformationParameter message
// Message that stores parameters used to apply transformation
// to the data layer's data
message TransformationParameter {
// For data pre-processing, we can do simple scaling and subtracting the
// data mean, if provided. Note that the mean subtraction is always carried
// out before scaling.
optional float scale = 1 [default = 1];
// Specify if we want to randomly mirror data.
optional bool mirror = 2 [default = false];
// Specify if we would like to randomly crop an image.
optional uint32 crop_size = 3 [default = 0];
// mean_file and mean_value cannot be specified at the same time
// path to the image mean file
optional string mean_file = 4;
// if specified can be repeated once (would subtract it from all the channels)
// or can be repeated the same number of times as channels
// (would subtract them from the corresponding channel)
// per-channel image means specified by hand, usually three values (one per channel)
repeated float mean_value = 5;
// Force the decoded image to have 3 color channels.
optional bool force_color = 6 [default = false];
// Force the decoded image to have 1 color channel (grayscale).
optional bool force_gray = 7 [default = false];
}
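A hedged data-layer sketch (values illustrative, e.g. ImageNet-style BGR means): mean subtraction happens first, then scaling; mean_value is repeated once per channel and cannot be combined with mean_file:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true           # random horizontal flips
    crop_size: 227         # random 227x227 crops at TRAIN time
    mean_value: 104        # per-channel means, subtracted before scaling
    mean_value: 117
    mean_value: 123
    scale: 0.00390625      # 1/256, applied after mean subtraction
  }
  data_param { source: "train_lmdb" batch_size: 32 backend: LMDB }
}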
2.6 The LossParameter message
// Message that stores parameters shared by loss layers
message LossParameter {
// If specified, ignore instances with the given label.
// Samples whose label equals ignore_label contribute neither to the loss
// nor to the gradient (their diffs are set to zero in the backward pass).
optional int32 ignore_label = 1;
// How to normalize the loss for loss layers that aggregate across batches,
// spatial dimensions, or other dimensions. Currently only implemented in
// SoftmaxWithLoss and SigmoidCrossEntropyLoss layers.
enum NormalizationMode {
// Divide by the number of examples in the batch times spatial dimensions.
// Outputs that receive the ignore label will NOT be ignored in computing
// the normalization factor.
FULL = 0;
// Divide by the total number of output locations that do not take the
// ignore_label. If ignore_label is not set, this behaves like FULL.
VALID = 1;
// Divide by the batch size.
BATCH_SIZE = 2;
// Do not normalize the loss.
NONE = 3;
}
// For historical reasons, the default normalization for
// SigmoidCrossEntropyLoss is BATCH_SIZE and *not* VALID.
optional NormalizationMode normalization = 3 [default = VALID];
// Deprecated. Ignored if normalization is specified. If normalization
// is not specified, then setting this to false will be equivalent to
// normalization = BATCH_SIZE to be consistent with previous behavior.
// Deprecated: if normalization is unspecified, normalize = true behaves like
// VALID, and normalize = false like BATCH_SIZE (the previous behavior).
optional bool normalize = 2;
}
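A hedged example (names invented) of a segmentation-style loss that skips "void" pixels: outputs whose label is 255 are excluded from both the loss and, under VALID, the normalization factor:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 255      # outputs labeled 255 do not contribute
    normalization: VALID   # divide by the count of non-ignored outputs
  }
}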
三、References
《深度学习21天实战Caffe》
Caffe源码解析(一)——caffe.proto