Getting started with CaffeNet on CPU

Author: sumerun

Category: caffe related. Tags: caffe, recognition with your own dataset, detailed procedure, code notes

Article link: https://blog.csdn.net/sumerun/article/details/90633026

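My project needs to recognize the environment a robot is operating in. After looking around I found that Caffe is relatively simple to get working, so I made a first attempt with it. I am writing the main steps down here so that I don't forget them. My setup: Ubuntu 16.04, Caffe, OpenCV 3.4, CPU only.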

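1. Building Caffe

For building Caffe on Ubuntu, see https://www.cnblogs.com/darkknightzh/p/5797526.html and https://blog.csdn.net/u013832707/article/details/53159071. Be sure to install the dependency libraries that the article lists first. Oddly, when I built with the make commands shown below I kept running into errors. In the end I built with cmake instead: create a build directory, then run cmake .., make, and make install, and to my surprise that worked. Note that if, like me, you use only the CPU, then when you create Makefile.config with the cp command below you need to uncomment CPU_ONLY := 1 (it is commented out in Makefile.config.example), i.e.: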

# CPU-only switch (uncomment to build without GPU support).
CPU_ONLY := 1

cp Makefile.config.example Makefile.config
# Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
make all
make test
make runtest

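2. Training

2.1 Preparing the data and the network

After installing Caffe, make sure the handwritten digit recognition demo works. Follow the readme.md under /home/***/CAFFE_ROOT/caffe/examples/mnist; the tutorial is very detailed, and running it confirms that Caffe works properly. Running this example also gives a rough idea of how Caffe does its work. Before training you need to prepare:

(1) the data in LMDB format; (2) train.txt and val.txt, which list the images and their labels; (3) train_val.prototxt, the network definition; (4) solver.prototxt

My project directory is laid out as follows; my_data is located at /home/×××/CAFFE_ROOT/caffe/data/my_data/: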

my_data:
    build_lmdb
    data_set
        train
        val
        train.txt
        val.txt
    solver.prototxt

The script that generates the label lists is as follows:

#!/usr/bin/env sh
DATA_TRAIN_GROUND=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/train/ground # training images, class "ground"
DATA_TRAIN_STAIR=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/train/stairs # training images, class "stairs"
DATA_VAL_STAIR=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/val/stairs # validation images, class "stairs"
DATA_VAL_GROUND=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/val/ground # validation images, class "ground"

DATASAVE=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set # directory where train.txt and val.txt are saved


echo "Create train.txt..."

# cut keeps the path from the 8th '/'-separated field onward, so
# /home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/train/stairs/332.jpg
# is stored in train.txt as "data_set/train/stairs/332.jpg 0", where 332.jpg is
# the file name and 0 is the label. The other lines work the same way.
# Note: >> appends, so delete old train.txt and val.txt before re-running.
find "$DATA_TRAIN_STAIR" -name "*.jpg" | cut -d '/' -f8-12 | sed 's/$/ 0/' >> "$DATASAVE/train.txt"
find "$DATA_VAL_STAIR" -name "*.jpg" | cut -d '/' -f8-12 | sed 's/$/ 0/' >> "$DATASAVE/val.txt"
find "$DATA_TRAIN_GROUND" -name "*.jpg" | cut -d '/' -f8-12 | sed 's/$/ 1/' >> "$DATASAVE/tmp1.txt"
find "$DATA_VAL_GROUND" -name "*.jpg" | cut -d '/' -f8-12 | sed 's/$/ 1/' >> "$DATASAVE/tmp.txt"

cat "$DATASAVE/tmp1.txt" >> "$DATASAVE/train.txt"
cat "$DATASAVE/tmp.txt" >> "$DATASAVE/val.txt"

rm -f "$DATASAVE/tmp.txt"
rm -f "$DATASAVE/tmp1.txt"

echo "create train.txt & val.txt  Done.."

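The directories above are the ones on my machine and are written as absolute paths, so you will probably need to change them for your own machine.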
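The list format that the script produces (a path relative to my_data, a space, then the class label) can also be sketched in Python. This is only an illustration under assumptions: the function name build_list and the class_labels mapping are made up for this example, and unlike find it only looks at files directly inside each class directory.

```python
import os
import tempfile


def build_list(data_root, split, class_labels):
    """Return 'relative/path label' lines for one split such as 'train'.

    data_root    -- directory the relative paths start from (my_data in this post)
    class_labels -- hypothetical mapping, e.g. {'stairs': 0, 'ground': 1}
    """
    lines = []
    for class_name, label in sorted(class_labels.items(), key=lambda kv: kv[1]):
        class_dir = os.path.join(data_root, "data_set", split, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(".jpg"):
                rel = os.path.join("data_set", split, class_name, file_name)
                lines.append("{} {}".format(rel, label))
    return lines


if __name__ == "__main__":
    # Build a throwaway tree so the sketch runs without any real images.
    with tempfile.TemporaryDirectory() as root:
        for cls in ("stairs", "ground"):
            os.makedirs(os.path.join(root, "data_set", "train", cls))
        open(os.path.join(root, "data_set", "train", "stairs", "332.jpg"), "w").close()
        open(os.path.join(root, "data_set", "train", "ground", "7.jpg"), "w").close()
        for line in build_list(root, "train", {"stairs": 0, "ground": 1}):
            print(line)
```

Because the paths are built relative to data_root, the result does not depend on how deep my_data sits in the file system, which the fixed field numbers passed to cut do.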

The script that builds the LMDB files is as follows:

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=/home/×××/CAFFE_ROOT/caffe/data/my_data/build_lmdb

DATA=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set  # directory that holds train.txt and val.txt

TOOLS=/home/×××/CAFFE_ROOT/caffe/build/tools  # conversion tools produced by building Caffe

# TRAIN_DATA_ROOT only needs to point at the directory just above the paths stored in train.txt.
# The entries in train.txt start with data_set/train/..., so root + entry gives the full image path.
TRAIN_DATA_ROOT=/home/×××/CAFFE_ROOT/caffe/data/my_data/
VAL_DATA_ROOT=/home/×××/CAFFE_ROOT/caffe/data/my_data/   # this directory + an entry in val.txt gives the path of a labeled image


# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  echo "RESIZE_HEIGHT=256 RESIZE_WIDTH=256."
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/ilsvrc12_val_lmdb

echo "Done."
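Since convert_imageset joins the data root with each entry of the list file, a wrong root or a wrong cut range shows up as images that cannot be found. Below is a small sketch for checking this before building the LMDB; the helper name missing_entries is an assumption of this example and not part of Caffe.

```python
import os
import tempfile


def missing_entries(data_root, list_lines):
    """Return the relative paths from a train.txt/val.txt style list
    whose file does not exist under data_root."""
    missing = []
    for line in list_lines:
        line = line.strip()
        if not line:
            continue
        rel_path = line.rsplit(" ", 1)[0]  # the label is the last field
        if not os.path.isfile(os.path.join(data_root, rel_path)):
            missing.append(rel_path)
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        class_dir = os.path.join(root, "data_set", "val", "ground")
        os.makedirs(class_dir)
        open(os.path.join(class_dir, "1.jpg"), "w").close()
        entries = ["data_set/val/ground/1.jpg 1", "data_set/val/ground/2.jpg 1"]
        print(missing_entries(root, entries))
```

An empty result means that every root + entry pair resolves to a real file.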

The script that computes the mean file:

#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb  # lmdb directory
DATA=/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb     # target directory
TOOLS=/home/***/CAFFE_ROOT/caffe/build/tools
$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_train_lmdb \
  $DATA/imagenet_mean.binaryproto

echo "Done."
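Conceptually, the mean file is the element-wise average of all training images, and the data layer subtracts it from every input. The following is a plain-Python sketch of that idea on tiny made-up "images" given as flat lists; it is not how compute_image_mean is implemented, and both function names are assumptions of this example.

```python
def mean_image(images):
    """Element-wise mean of equally sized images given as flat lists of pixel values."""
    if not images:
        raise ValueError("need at least one image")
    size = len(images[0])
    if any(len(img) != size for img in images):
        raise ValueError("all images must have the same number of values")
    count = len(images)
    return [sum(img[i] for img in images) / count for i in range(size)]


def subtract_mean(image, mean):
    """Center one image by subtracting the mean image value by value."""
    return [pixel - m for pixel, m in zip(image, mean)]


if __name__ == "__main__":
    training_set = [[0, 10], [4, 30]]  # two toy images with two values each
    mean = mean_image(training_set)
    print(mean)
    print(subtract_mean(training_set[1], mean))
```

The size check in the sketch mirrors a practical constraint: a per-pixel mean only makes sense when all images in the LMDB have the same dimensions.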

The solver configuration for CaffeNet, solver.prototxt, is as follows:

net: "/home/***/CAFFE_ROOT/caffe/data/my_data/train_val.prototxt"
test_iter: 20
test_interval: 500
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 400
display: 20
max_iter: 15000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
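snapshot_prefix: "/home/***/CAFFE_ROOT/caffe/data/my_data/caffenet_train"
solver_mode: CPU
# test_iter: number of iterations in one test pass, chosen so that
#   test_iter * batch_size (of the test set) = size of the test set.
#   The test batch_size is set in the network prototxt file.
# test_interval: "interval" is the gap between tests; here a test pass
#   runs after every 500 training iterations.
# Caffe tests while it trains. Every 500 iterations (that is, after
# 500 * 8 = 4000 training samples at batch_size 8) it computes the test error.
# One test pass should cover all test images (200 in my case). You can think of
# it this way: in one epoch every training sample is visited once, while the
# test set is traversed at least once per test pass. Exactly how many times,
# possibly not a whole number, depends on the code; a rough understanding is enough.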
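To see what lr_policy: "step" does with the values above, here is a minimal sketch of the step rule as described in the Caffe solver documentation, lr = base_lr * gamma ^ floor(iter / stepsize). The function name step_lr is made up for this example.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=400):
    """Learning rate under the "step" policy: multiply by gamma every stepsize iterations."""
    return base_lr * gamma ** (iteration // stepsize)


if __name__ == "__main__":
    for it in (0, 399, 400, 800, 2000):
        print(it, step_lr(it))
```

With stepsize: 400 the learning rate falls by a factor of ten every 400 iterations, so by iteration 2000 it is already about 1e-8. If the loss stops improving early, a larger stepsize may be worth trying.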

The CaffeNet network definition, train_val.prototxt, is as follows:

name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/ilsvrc12_train_lmdb"
    batch_size: 8
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: false
#  }
  data_param {
    source: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
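To get a feel for the network above, it helps to follow the spatial size of the feature maps from the 227×227 crop down to pool5. The sketch below uses the standard output-size formulas (convolution rounds down, Caffe pooling rounds up); the function names are assumptions of this example, and ReLU and LRN layers are left out because they do not change the size.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output width/height of a Caffe pooling layer (rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def caffenet_trace(size=227):
    """Return (layer name, output size) pairs for the layers that change the size."""
    layers = [
        ("conv1", conv_out, {"kernel": 11, "stride": 4}),
        ("pool1", pool_out, {"kernel": 3, "stride": 2}),
        ("conv2", conv_out, {"kernel": 5, "pad": 2}),
        ("pool2", pool_out, {"kernel": 3, "stride": 2}),
        ("conv3", conv_out, {"kernel": 3, "pad": 1}),
        ("conv4", conv_out, {"kernel": 3, "pad": 1}),
        ("conv5", conv_out, {"kernel": 3, "pad": 1}),
        ("pool5", pool_out, {"kernel": 3, "stride": 2}),
    ]
    trace = []
    for name, fn, params in layers:
        size = fn(size, **params)
        trace.append((name, size))
    return trace


if __name__ == "__main__":
    for name, size in caffenet_trace():
        print(name, size)
    # pool5 has 256 channels, so fc6 sees 256 * 6 * 6 = 9216 inputs
    print("fc6 inputs:", 256 * caffenet_trace()[-1][1] ** 2)
```

The sizes come out as 55, 27, 27, 13, 13, 13, 13, 6, so fc6 receives 256 × 6 × 6 = 9216 values. This is also why the input size at test time has to match the crop_size used in training: the fc6 weights are tied to that number.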

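A brief explanation of a few of these parameters:

test_iter: the number of iterations in one test pass, chosen so that test_iter * batch size (of the test set) = size of the test set. The test batch size is set in the prototxt file.
test_interval: "interval" is the gap between tests; here a test pass runs after every 500 training iterations.
Caffe tests while it trains. As an example with other numbers: at a training batch size of 64, 500 iterations means that 32000 training samples have been processed before each test pass, and with a test set of 10000 images one test pass should cover all 10000 of them. You can think of it this way: in one epoch every training sample is visited once, while the test set is traversed at least once per test pass. Exactly how many times, possibly not a whole number, depends on the code; a rough understanding of this process is enough.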

2.2 Training

Train with the following script:

#!/usr/bin/env sh
cd /home/***/CAFFE_ROOT/caffe
./build/tools/caffe train --solver=/home/***/CAFFE_ROOT/caffe/data/my_data/solver.prototxt -iterations=10000
# run this script to start training

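I passed -iterations=10000, intending it as the iteration count (its default is 50). Note, however, that for caffe train the number of training iterations comes from max_iter in solver.prototxt (15000 in my case); the -iterations flag is used by caffe test and caffe time, so it does not change how long training runs. After that comes a long wait, and how long depends on the machine. During training the loss kept decreasing and finally settled around 0.2 to 0.3. From what I read online, the main thing to watch is the accuracy reported during training; mine settled at around 0.8. Test passes during training follow the test_interval set in solver.prototxt: mine is test_interval: 500, so a test pass runs every 500 iterations and reports the test accuracy.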

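2.3 Testing

Testing uses the ./build/examples/cpp_classification/classification.bin program, a binary produced when Caffe was built. Its source code is in /home/***/CAFFE_ROOT/caffe/examples/cpp_classification. From the source we can see that the classifier's constructor takes the following parameters: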

Classifier(const string& model_file,
             const string& trained_file,
             const string& mean_file,
             const string& label_file);

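The first argument is the network definition file used for testing. It has to be derived from the training network by removing the parts that are not needed, such as the training data inputs. My test model is as follows: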

name: "CaffeNet"

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param{shape:{dim:10 dim:3 dim:227 dim:227}} # this must match the training model (crop_size: 227)!
  # dim:10 batch size, dim:3 number of colour channels
  # dim:227 height, dim:227 width
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1" 
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096   
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 2    
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}

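The second argument is the weights file produced by training, e.g. caffenet_train_iter_1000.caffemodel. The third argument is the mean file generated during data preparation, imagenet_mean.binaryproto. The fourth argument is the label file, and the image to recognize, 1.jpg, is passed last on the command line. It is best to preprocess the image to 256×256 first; for details see https://blog.csdn.net/zchang81/article/details/73088042. For example, my test script is as follows: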


#!/usr/bin/env sh
cd /home/***/CAFFE_ROOT/caffe
# arguments: network definition, trained weights, mean file, label file, image
./build/examples/cpp_classification/classification.bin \
    data/my_data/deploy.prototxt \
    data/my_data/caffenet_train_iter_2000.caffemodel \
    data/my_data/build_lmdb/imagenet_mean.binaryproto \
    data/my_data/lable.txt \
    data/my_data/1.jpg

# run this script to classify a single image

The output of the script is as follows:

***@***-ThinkPad-X1-Carbon-3rd:~/CAFFE_ROOT/caffe/data/my_data$ ./test_image.sh 
---------- Prediction for data/my_data/1.jpg ----------
0.5195 - "stair 0"
0.4805 - "ground 1"
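The two numbers in this output come from the final "prob" layer, which is a Softmax over the two fc8 outputs. Here is a minimal sketch of softmax in plain Python to show why the scores always add up to 1; the input values are made up and are not taken from the trained model.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    probs = softmax([0.3, -1.2])  # hypothetical fc8 outputs for [stair, ground]
    print(probs, sum(probs))
```

Two probabilities as close as 0.5195 and 0.4805 mean that the two fc8 outputs were nearly equal, so for this image the model is only slightly in favour of "stair".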

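This post covered how to use CaffeNet with your own dataset for image recognition. It is only a follow-the-steps walkthrough. In later work I will need to modify the classification source code and combine it with a camera for real-time video recognition. The next post will describe how to port an already trained model for recognition.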