Caffe配置说明

最新推荐文章于 2022-04-28 11:30:09 发布

Rareay

最新推荐文章于 2022-04-28 11:30:09 发布

阅读量181

点赞数

分类专栏： # DL 文章标签：深度学习 caffe

本文链接：https://blog.csdn.net/qq_33236581/article/details/111658375

版权

DL 专栏收录该内容

19 篇文章 0 订阅

订阅专栏

1 slover_xx.prototxt 训练参数文件

参考

首先贴一个实例，然后一个个说明：

net: "examples/mnist/lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.01
momentum: 0.9
type: SGD
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 20000
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: CPU

1.1 指定网络模型

net: "examples/mnist/lenet_train_test.prototxt"

设置要训练的深度网络模型，训练、测试模型可以单独设置：

train_net: "examples/hdf5_classification/logreg_auto_train.prototxt"
test_net: "examples/hdf5_classification/logreg_auto_test.prototxt"

1.2 迭代次数

test_iter: 100

指完成一次测试需要向模型传输 100 批次数据。test_iter x test_batch_size 应该为测试数据集总数。

test_interval: 500

测试间隔。也就是每训练 500 次，才进行一次测试，test_interval x train_batch_size 应该为训练数据集总数或其倍数。

test_initialization: false

刚启动就进行测试。false的话不进行第一次的测试。

1.3 学习率衰减

stepsize: 300
base_lr: 0.01
lr_policy: "inv"
gamma: 0.0001
power: 0.75

每迭代 stepsize 次降低学习速率，base_lr 为基础学习率，lr_policy 为衰减方式，可以设置为以下几种（iter为当前迭代次数）：

fixed，固定学习速率： $basei\_lr$
step，步进衰减： $base\_lr * gamma ^ {floor\frac{iter}{stepsize}}$
exp，指数衰减： $base\_lr * gamma ^ {iter}$
inv，倒数衰减： $base\_lr (1 + gamma * iter) ^ {- power}$
multistep，多步衰减，与步进衰减类似，允许非均匀步进值(stepvalue)
poly，多项式衰减： $base\_lr (1 - \frac{iter}{max\_iter} )^ {power}$
sigmoid，S形衰减： $base\_lr ( \frac{1}{1 + e^{-gamma * (iter - stepsize)}})$

1.4 优化方法

type: SGD

优化算法，有几种可选：“SGD”,“AdaDelta”,“AdaGrad”,“Adam”,“Nesterov”,“RMSProp”。参考这里

1.5 权重衰减

weight_decay: 0.0005

权重衰减项，防止过拟合的一个参数。

1.6 显示

display: 100

每训练100次，在屏幕上显示一次。如果设置为0，则不显示。

1.7 最大迭代次数

max_iter: 20000

最大迭代次数。这个数设置太小，会导致没有收敛，精确度很低。设置太大，会导致震荡，浪费时间。

1.8 保存快照

snapshot: 500

指每迭代 500 次存一个快照，默认不保存。

snapshot_prefix: "examples/mnist/lenet"

保存地址，默认为模型所在的目录下。

snapshot_diff: true

是否保存梯度值，默认为false,不保存。

snapshot_format: HDF5

保存的类型。有两种选择：HDF5 和BINARYPROTO ，默认为BINARYPROTO

1.9 训练方式

solver_mode: CPU

设置运行模式。默认为GPU,如果你没有GPU,则需要改成CPU,否则会出错。

2 train_xx.prototxt 模型参数

以下两节是对caffe里常见网络层的描述，简单作个记录，若自己写的网络层，这些关键字完全可以自定义。

2.1 关键字的描述

下面的关键字是大部分网络层可能会共用的，不同的网络层可能会有不同的关键字，详见下一节。

name:网络层的名称，可以自定义。
type:网络层的类型，下一节有详细举例
top:网络层输出数据块的名称（网络是从下往上生长的）
bottom:网络层输入数据块的名称
include {}
- phase:可以设置为 TRAIN 或 TEST，默认包含这两项
param {}权重和偏置的参数
- lr_mult:权重或偏置的学习率，计算方法如：(base_lr * lr_mult)
- decay_mult:权重或偏置的衰减率

2.2 网络层的描述

数据层

:::details 关键字

transform_param {}数据转换的参数
- mirror:是否镜像翻转
- crop_size:裁减后的尺寸
- mean_file:均值文件
data_param {}数据集的参数
- source:数据集
- batch_size:单次传入模型的图像数量
- backend:数据集的数据类型
  :::

:::details Data

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true # 是否需要加入镜像翻转，默认false
    crop_size: 256 # 输入图像裁减为这个尺寸
    mean_file: "examples/mycaffe/mean.binaryproto" # 均值文件
  }
  data_param {
    source: "examples/mycaffe/img_train_lmdb" # 训练图像
    batch_size: 10 # 一次传入模型的图片数量
    backend: LMDB # 数据类型，默认为 leveldb
  }
}

:::

卷积层

:::details 关键字

convolution_param {}卷积层的参数
- num_output:卷积核的个数，也是输出数据的通道数
- kernel_size:卷积核尺寸
- pad:卷积时是否用0填充
- stride:步长
- weight_filler {}权重的初始化
  - type:初始化类型，如"xavier"、“constant”、“gaussian”、“uniform”
- bias_filler {}偏置的初始化
  - type:初始化类型，如"xavier"、“constant”、“gaussian”、“uniform”
    :::

:::details Convolution

layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1 # 权重的学习率，计算方法：(base_lr * lr_mult)
    decay_mult: 1 # 权重的衰减率
  }
  param {
    lr_mult: 1 # 偏置的学习率，计算方法：(base_lr * lr_mult)
    decay_mult: 0 # 偏置的衰减率
  }
  convolution_param {
    num_output: 64 # 卷积核的个数，也是输出数据的通道数
    kernel_size: 3 # 卷积核尺寸
    pad: 1 # 卷积时是否用0填充
    stride: 1 # 步长
    weight_filler { # 权重的初始化
      type: "xavier" 
    }
    bias_filler { # 偏置的初始化
      type: "constant"
    }
  }
}

:::

激活层
:::details ReLU

layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}

:::

池化层
:::details 关键字
pooling_param {}池化参数
- pool:池化方法：MAX,AVG,STOCHASTIC
- kernel_size:核尺寸
- stride:步长
  :::

:::details Pooling

layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

:::

全连接层
:::details 关键字
inner_product_param {}全连接参数
- num_output:输出个数
- weight_filler {}权重的初始化
  - type:初始化类型，如"xavier"、“constant”、“gaussian”、“uniform”
- bias_filler {}偏置的初始化
  - type:初始化类型，如"xavier"、“constant”、“gaussian”、“uniform”
    :::

:::details InnerProduct

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output:  5             #注意将fc8层改成自己的图像类别数目
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

:::

Dropout 层
:::details Dropout

layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}

:::

准确率
:::details 关键字
accuracy_param {}
- top_k:如果为5，表示输出结果概率最高的前5个中若包含正确结果，即视为预测正确。这里默认为1
  :::
  :::details Accuracy

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
  accuracy_param {
    top_k: 5
  }
}

:::

损失层
:::details SoftmaxWithLoss

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}

:::

-局部响应归一化层
:::details LRN

layer {
  name: "pool1/norm1"
  type: "LRN"
  bottom: "pool1/3x3_s2"
  top: "pool1/norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}

:::

拼接层
:::details Concat

layer {
  name: "inception_3a/output"
  type: "Concat"
  bottom: "inception_3a/1x1"
  bottom: "inception_3a/3x3"
  bottom: "inception_3a/5x5"
  bottom: "inception_3a/pool_proj"
  top: "inception_3a/output"
}

:::

BN 层
:::details BatchNorm

layer {
	bottom: "conv1"
	top: "conv1"
	name: "bn_conv1"
	type: "BatchNorm"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
}

:::

Scale 层，用在 BN 层后面
:::details Scale

layer {
	bottom: "conv1"
	top: "conv1"
	name: "scale_conv1"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

:::

Eltwise层
:::details 关键字
eltwise_param {}操作传入的两个bottom数据块
- operation:有三种:product(点乘),sum(相加减),max(取大值),其中sum是默认操作。
  :::
  :::details Eltwise

layer {
	bottom: "res2a_branch1"
	bottom: "res2a_branch2c"
	top: "res2a"
	name: "res2a"
	type: "Eltwise"
    eltwise_param {
        operation: SUM
    }
}

:::