A disclaimer up front: if you want to build a Caffe environment from scratch, please look up other posts for that.
A senior labmate had already set up Caffe on our Linux server, and I was too lazy to rebuild it. I don't have a deep grasp of Caffe's mechanics or environment, but I was impatient to get hands-on quickly, so I worked through the official cifar10 example under examples/ to get a feel for how Caffe fits together.
Step 1: Get the dataset
Directory layout:
The script:
#!/usr/bin/env sh
# This script downloads the CIFAR10 (binary version) data and unzips it.
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd $DIR
echo "Downloading..."
wget --no-check-certificate http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
echo "Unzipping..."
tar -xf cifar-10-binary.tar.gz && rm -f cifar-10-binary.tar.gz
mv cifar-10-batches-bin/* . && rm -rf cifar-10-batches-bin
echo "Done."
In this script,
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd $DIR
resolves the directory containing the .sh file and changes into it. The remaining lines download the archive, unpack it, move the .bin files into the current directory, and clean up the archive and the empty extraction folder. So the script needs no modification at all: wherever you run it, the binary CIFAR-10 .bin files land in that same directory.
Run it:
ubuntu@imagination:/home/workspace/zhudd/caffe/mycaffe/data/get_data$ sh get_cifar10.sh
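Each downloaded .bin file is a flat sequence of fixed-size records: 1 label byte followed by 3072 pixel bytes (3 channels × 32 × 32, stored plane by plane: all red, then green, then blue). A minimal NumPy sketch of parsing one record; it uses a synthetic record so it runs without the real files:

```python
import numpy as np

# CIFAR-10 binary layout: 1 label byte + 3072 pixel bytes per record
# (3 channels x 32 x 32, stored plane by plane: R, then G, then B).
RECORD_BYTES = 1 + 3 * 32 * 32

# Build one synthetic record so this sketch runs without the dataset.
rng = np.random.default_rng(0)
record = bytes([7]) + rng.integers(0, 256, 3072, dtype=np.uint8).tobytes()

label = record[0]                                              # class id 0-9
image = np.frombuffer(record[1:], dtype=np.uint8).reshape(3, 32, 32)

print(label)        # 7
print(image.shape)  # (3, 32, 32)
```

The conversion tool in the next step walks the .bin files record by record in exactly this way and writes each (label, image) pair into the database.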
Step 2: Create the dataset
The files fetched in the previous step are raw binaries; they must be converted into a LEVELDB or LMDB database before Caffe can read them. That is the job of create_cifar10.sh.
If you run create_cifar10.sh as-is, it will most likely spit out a pile of errors,
mainly because of path problems.
Here is the code:
#!/usr/bin/env sh
# This script converts the cifar data into lmdb/leveldb format.
EXAMPLE=/home/workspace/zhudd/caffe/mycaffe/data/make_data
DATA=/home/workspace/zhudd/caffe/mycaffe/data/get_data
DBTYPE=lmdb
| Variable | Meaning |
|---|---|
| EXAMPLE | Output path for the generated CIFAR-10 datasets (two folders, one for training and one for testing) |
| DATA | Location of the original CIFAR-10 .bin files |
| DBTYPE | Storage format of the generated dataset that Caffe will consume directly |
You must edit these paths: the ones in the original script almost certainly do not exist on your machine or server.
Next:
echo "Creating $DBTYPE..."
rm -rf $EXAMPLE/cifar10_train_$DBTYPE $EXAMPLE/cifar10_test_$DBTYPE
and then:
./.build_release/examples/cifar10/convert_cifar_data.bin $DATA $EXAMPLE $DBTYPE
echo "Computing image mean..."
./.build_release/tools/compute_image_mean -backend=$DBTYPE \
$EXAMPLE/cifar10_train_$DBTYPE $EXAMPLE/mean.binaryproto
echo "Done."
The binary ./.build_release/examples/cifar10/convert_cifar_data.bin is the key here. After Caffe is compiled, a .build_release folder is generated that contains many of the tools you will need. You can copy that folder into the directory containing create_cifar10.sh so that the relative path ./ resolves; otherwise, use an absolute path.
Also,
./.build_release/tools/compute_image_mean -backend= …… computes the image mean, which is used to preprocess the raw images; we will need it later.
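Conceptually, compute_image_mean just averages every pixel position over the whole training set and stores the resulting "mean image". A NumPy sketch of the same computation, on random stand-in data instead of the real LMDB:

```python
import numpy as np

# Hypothetical stand-in for the training set: 10 random 3x32x32 images.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 3, 32, 32)).astype(np.float32)

# Average over the dataset axis: one value per (channel, row, col) position.
# This is what mean.binaryproto stores for the real training images.
mean_image = images.mean(axis=0)

print(mean_image.shape)  # (3, 32, 32)
```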
Run it.
You will find that the converted datasets have been generated under
/home/workspace/zhudd/caffe/mycaffe/data/make_data
Step 3: Build the network
Create a prototxt file named my_net_cifar.prototxt to hold the network structure.
Here is the code (the data layer first):
name: "my_net"
layer {
name: "cifar"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "/home/workspace/zhudd/caffe/mycaffe/data/make_data/mean.binaryproto"
}
data_param {
source: "/home/workspace/zhudd/caffe/mycaffe/data/make_data/cifar10_train_lmdb"
batch_size: 100
backend: LMDB
}
}
Note that
transform_param {
mean_file: "/home/workspace/zhudd/caffe/mycaffe/data/make_data/mean.binaryproto"
}
performs the data preprocessing: the mean is subtracted from each raw image so that the resulting arrays are zero-centered. The mean.binaryproto file sits in the same directory as the dataset we just created.
Also:
data_param {
source: "/home/workspace/zhudd/caffe/mycaffe/data/make_data/cifar10_train_lmdb"
batch_size: 100
backend: LMDB
}
specifies where the dataset lives, how many samples are read per batch, and the dataset's storage format.
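The effect of mean_file can be sketched in NumPy (random stand-in data; the real Data layer applies this per image as batches are loaded):

```python
import numpy as np

# Hypothetical batch of 4 images plus a mean image computed from them.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 3, 32, 32)).astype(np.float32)
mean_image = batch.mean(axis=0)

# transform_param's mean_file does exactly this at load time:
# subtract the training-set mean image from every input image.
centered = batch - mean_image

# Over the data that produced the mean, the centered values average to ~0.
print(abs(centered.mean()) < 1e-3)  # True
```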
The remaining code (I'll annotate it when I find time; it involves some subtle algorithmic and processing details):
layer {
name: "cifar"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "/home/workspace/zhudd/caffe/mycaffe/data/make_data/mean.binaryproto"
}
data_param {
source: "/home/workspace/zhudd/caffe/mycaffe/data/make_data/cifar10_test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3"
top: "pool3"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool3"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 64
weight_filler {
type: "gaussian"
std: 0.1
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "gaussian"
std: 0.1
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
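As a sanity check on the architecture above, the spatial sizes can be traced layer by layer with Caffe's output-size formulas (convolution rounds down, pooling rounds up):

```python
import math

def conv_out(size, kernel, pad, stride):
    # Caffe convolution output size (rounds down).
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride):
    # Caffe pooling output size (rounds up).
    return math.ceil((size - kernel) / stride) + 1

s = 32                    # CIFAR-10 images are 32x32
s = conv_out(s, 5, 2, 1)  # conv1: 32
s = pool_out(s, 3, 2)     # pool1: 16
s = conv_out(s, 5, 2, 1)  # conv2: 16
s = pool_out(s, 3, 2)     # pool2: 8
s = conv_out(s, 5, 2, 1)  # conv3: 8
s = pool_out(s, 3, 2)     # pool3: 4

print(s)           # 4
print(64 * s * s)  # 1024 values feeding ip1
```

So ip1 sees 64 × 4 × 4 = 1024 inputs, which it maps down to 64, and ip2 maps those to the 10 CIFAR-10 classes scored by the loss and accuracy layers.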
Step 4: Define the training parameters and training procedure
Create a file named my_net_cifar_solver.prototxt:
# The train/test net protocol buffer definition
net: "/home/workspace/zhudd/caffe/mycaffe/model/my_net_cifar.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of CIFAR-10, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 5000
# snapshot intermediate results
snapshot: 5000
snapshot_format: HDF5
snapshot_prefix: "/home/workspace/zhudd/caffe/mycaffe/model"
# solver mode: CPU or GPU
solver_mode: GPU
The most important field is net:, which points to the my_net_cifar.prototxt file written earlier.
snapshot_prefix: where snapshots of the trained model are saved.
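The numbers above are easy to cross-check against the standard CIFAR-10 split (50,000 training and 10,000 test images) and the batch_size: 100 set in the data layers:

```python
# Bookkeeping for the solver settings, assuming the standard CIFAR-10 split.
train_images, test_images = 50_000, 10_000
batch_size = 100  # from the data layers in my_net_cifar.prototxt

# Each test pass runs test_iter batches: 100 * 100 = the full test set.
test_iter = 100
print(test_iter * batch_size == test_images)  # True

# max_iter of 5000 at batch 100 works out to 10 passes over the training set.
max_iter = 5000
print(max_iter * batch_size / train_images)   # 10.0
```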
Step 5: Write the .sh script that launches training
Straight to the code.
Filename: train_my_net.sh
#!/usr/bin/env sh
TOOLS=/home/workspace/zhudd/caffe/mycaffe/data/make_data/.build_release/tools
$TOOLS/caffe train \
--solver=/home/workspace/zhudd/caffe/mycaffe/model/my_net_cifar_solver.prototxt
Here,
TOOLS=/home/workspace/zhudd/caffe/mycaffe/data/make_data/.build_release/tools
is the location of the tools folder inside the .build_release directory generated when Caffe was compiled, and
$TOOLS/caffe train \
--solver=/home/workspace/zhudd/caffe/mycaffe/model/my_net_cifar_solver.prototxt
supplies the location of the my_net_cifar_solver.prototxt file we just wrote. Note that the path must end with the filename my_net_cifar_solver.prototxt itself, otherwise the command errors out.
Once training finishes, you can inspect the resulting model.