Getting started with CaffeNet on CPU

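My project needs to recognize the environment a robot is operating in. After some searching, Caffe looked fairly simple to use, so I gave it a first try; I am recording the main steps here so that I don't forget them. The setup is Ubuntu 16.04, Caffe, OpenCV 3.4, CPU only.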

1. Compiling Caffe

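For compiling Caffe on Ubuntu, see https://www.cnblogs.com/darkknightzh/p/5797526.html and https://blog.csdn.net/u013832707/article/details/53159071. Be sure to install the dependency libraries listed at the beginning of the article. Oddly, building with the commands below kept failing with many errors for me. In the end I built with CMake instead: create a build folder, then run cmake .., make and make install, and to my surprise it worked. Note that for the Makefile route below, if you only use the CPU as I do, you need to uncomment CPU_ONLY := 1 in Makefile.config (the copy made from Makefile.config.example), i.e.: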

# CPU-only switch (uncomment to build without GPU support).
 CPU_ONLY := 1

cp Makefile.config.example Makefile.config
# Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
make all
make test
make runtest
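
The CMake route mentioned above can be sketched roughly as follows. This is a dry-run sketch rather than the exact commands I typed: CAFFE_DIR, DRY_RUN, run and build_caffe_cpu are names made up for this example, and -DCPU_ONLY=ON is the CMake counterpart of CPU_ONLY := 1. By default it only prints the commands.

```shell
#!/usr/bin/env sh
# Dry-run sketch of the CMake build route.
# CAFFE_DIR and DRY_RUN are hypothetical names; set DRY_RUN=0 to really execute.
CAFFE_DIR=${CAFFE_DIR:-$HOME/CAFFE_ROOT/caffe}
DRY_RUN=${DRY_RUN:-1}

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Out-of-source build in $CAFFE_DIR/build; stops at the first failing step.
build_caffe_cpu() {
  run mkdir -p "$CAFFE_DIR/build" &&
  run cd "$CAFFE_DIR/build" &&
  run cmake -DCPU_ONLY=ON .. &&
  run make &&
  run make install
}

build_caffe_cpu
```

Running it with DRY_RUN=0 performs the build; make install may need extra permissions depending on the install prefix.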

2. Training

2.1 Preparing the data and the network

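Once Caffe is installed, make sure the handwritten-digit recognition demo runs. Follow readme.md under /home/***/CAFFE_ROOT/caffe/examples/mnist; the tutorial is very detailed, and running it confirms that Caffe works properly. It also gives a rough idea of how Caffe works. For training you first need to prepare: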

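(1) the data in LMDB format; (2) train.txt and val.txt, which list the images and their labels; (3) train_val.prototxt, the network definition; (4) solver.prototxt, the solver configuration.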

My project directory is laid out as follows, where my_data is located at /home/×××/CAFFE_ROOT/caffe/data/my_data/:

my_data:
    build_lmdb
    data_set
        train
        val
        train.txt
        val.txt
    solver.prototxt
    

The script that generates the label lists is as follows:

#!/usr/bin/env sh
DATA_TRAIN_GROUND=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/train/ground  # training images, class ground
DATA_TRAIN_STAIR=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/train/stairs   # training images, class stairs
DATA_VAL_STAIR=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/val/stairs       # validation images, class stairs
DATA_VAL_GROUND=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/val/ground      # validation images, class ground

DATASAVE=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set  # directory where train.txt and val.txt are saved

# The lists are appended to with >>, so delete any old train.txt and val.txt before re-running.
echo "Create train.txt..."

# cut keeps the path from the 8th field on: the original path
# /home/×××/CAFFE_ROOT/caffe/data/my_data/data_set/train/stairs/332.jpg
# is stored in train.txt as "data_set/train/stairs/332.jpg 0", where 332.jpg
# is the file name and 0 is the label. The other lines work the same way.
find "$DATA_TRAIN_STAIR" -name '*.jpg' | cut -d '/' -f8-12 | sed "s/$/ 0/" >> "$DATASAVE/train.txt"
find "$DATA_VAL_STAIR" -name '*.jpg' | cut -d '/' -f8-12 | sed "s/$/ 0/" >> "$DATASAVE/val.txt"
find "$DATA_TRAIN_GROUND" -name '*.jpg' | cut -d '/' -f8-12 | sed "s/$/ 1/" >> "$DATASAVE/tmp1.txt"
find "$DATA_VAL_GROUND" -name '*.jpg' | cut -d '/' -f8-12 | sed "s/$/ 1/" >> "$DATASAVE/tmp.txt"

cat "$DATASAVE/tmp1.txt" >> "$DATASAVE/train.txt"
cat "$DATASAVE/tmp.txt" >> "$DATASAVE/val.txt"

rm -rf "$DATASAVE/tmp.txt"
rm -rf "$DATASAVE/tmp1.txt"

echo "Create train.txt & val.txt done."

The directories above are from my machine and are absolute paths, so adjust them for your own setup.
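
Because the field range passed to cut depends on how deep the absolute path is, it has to be recounted whenever the data moves. A possible alternative, shown here only as a sketch, is to run find from inside the data root so that the paths come out relative from the start. make_list and demo_root are hypothetical names, and the demo works on a throwaway directory with two empty image files.

```shell
#!/usr/bin/env sh
# make_list ROOT SUBDIR LABEL
# Print "SUBDIR/<...>/<name>.jpg LABEL" for every .jpg below ROOT/SUBDIR.
# make_list is a hypothetical helper, not part of Caffe.
make_list() {
  # Running find from inside ROOT yields paths relative to ROOT,
  # so no path fields have to be counted for cut.
  ( cd "$1" && find "$2" -name '*.jpg' | sort | sed "s/\$/ $3/" )
}

# Demo on a throwaway directory tree with two empty image files.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/data_set/train/stairs" "$demo_root/data_set/train/ground"
: > "$demo_root/data_set/train/stairs/332.jpg"
: > "$demo_root/data_set/train/ground/7.jpg"

# Writing with > (not >>) means re-running does not duplicate entries.
{
  make_list "$demo_root" data_set/train/stairs 0
  make_list "$demo_root" data_set/train/ground 1
} > "$demo_root/train.txt"

cat "$demo_root/train.txt"
```

With the real data, ROOT would be the my_data directory, so that the entries again start with data_set/.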

The script that builds the LMDB files is as follows:

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=/home/×××/CAFFE_ROOT/caffe/data/my_data/build_lmdb  # output directory for the LMDB files

DATA=/home/×××/CAFFE_ROOT/caffe/data/my_data/data_set  # directory containing train.txt and val.txt

TOOLS=/home/×××/CAFFE_ROOT/caffe/build/tools  # directory of the conversion tools produced by the Caffe build

# Each root only needs to go down to the directory just above the paths listed in
# train.txt / val.txt. The entries there start with data_set/train/..., so
# root + entry together form the full path to an image.
TRAIN_DATA_ROOT=/home/×××/CAFFE_ROOT/caffe/data/my_data/
VAL_DATA_ROOT=/home/×××/CAFFE_ROOT/caffe/data/my_data/


# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  echo "RESIZE_HEIGHT=256 RESIZE_WIDTH=256."
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/ilsvrc12_val_lmdb

echo "Done."
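
Since the data root and each list entry are joined to form the image path, a mismatch between the two is an easy mistake to make. Before building the LMDB it may help to check that every listed file exists. The following is a minimal sketch of such a check; count_missing and demo_root are hypothetical names, and it assumes image paths contain no spaces.

```shell
#!/usr/bin/env sh
# count_missing ROOT LIST
# Print how many entries of LIST do not exist as files below ROOT.
# Hypothetical helper; assumes image paths contain no spaces.
count_missing() {
  missing=0
  while read -r path label; do
    [ -n "$path" ] || continue                  # skip empty lines
    if [ ! -f "$1/$path" ]; then
      echo "missing: $1/$path" >&2              # report on stderr
      missing=$((missing + 1))
    fi
  done < "$2"
  echo "$missing"
}

# Demo: one listed image exists, the other does not.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/data_set/val/stairs"
: > "$demo_root/data_set/val/stairs/1.jpg"
printf '%s\n' 'data_set/val/stairs/1.jpg 0' 'data_set/val/ground/2.jpg 1' > "$demo_root/val.txt"

count_missing "$demo_root" "$demo_root/val.txt"
```

With the variables of the script above, the call would be count_missing "$TRAIN_DATA_ROOT" "$DATA/train.txt", and a result of 0 means every entry resolves to a file.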

The script that computes the mean file:

#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb  # directory containing the LMDB
DATA=/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb     # target directory for the mean file
TOOLS=/home/***/CAFFE_ROOT/caffe/build/tools
$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_train_lmdb \
  $DATA/imagenet_mean.binaryproto

echo "Done."

The solver configuration, solver.prototxt, is as follows:

net: "/home/***/CAFFE_ROOT/caffe/data/my_data/train_val.prototxt"
test_iter: 20
test_interval: 500
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 400
display: 20
max_iter: 15000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
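snapshot_prefix: "/home/***/CAFFE_ROOT/caffe/data/my_data/caffenet_train"
solver_mode: CPU
#test_iter: the number of iterations run in each test pass; test_iter * batch_size (of the
#test set) should equal the size of the test set. The test batch_size is set in the
#network prototxt file.
#test_interval: the interval between test passes; a test pass runs every 500 training iterations.
#Caffe tests while it trains. Every 500 iterations (i.e. after 500 * 8 = 4000 training
#samples have been processed, with a batch_size of 8) it computes the test error once.
#Each test pass should cover all test images (200 in my case). Note that with test_iter 20
#and a test batch_size of 50, a pass reads 20 * 50 = 1000 images, so a 200-image set is
#read five times over.
#One can think of it this way: in one epoch every training sample is visited once, while
#every test sample has to be visited at least once per test pass. The exact number of passes
#may not be an integer and depends on the code; a rough understanding of the process is enough.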
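
To see what lr_policy "step" does with these values, the learning rate at a given iteration can be computed as base_lr * gamma ^ floor(iter / stepsize). The sketch below plugs in the solver values above; lr_at is a hypothetical helper name, and awk is used only because plain sh has no floating-point arithmetic.

```shell
#!/usr/bin/env sh
# lr_at ITER: learning rate at iteration ITER for lr_policy "step",
# i.e. base_lr * gamma ^ floor(ITER / stepsize), with the solver values above.
# lr_at is a hypothetical helper, not a Caffe tool.
lr_at() {
  awk -v it="$1" -v base_lr=0.001 -v gamma=0.1 -v stepsize=400 \
    'BEGIN { printf "%g\n", base_lr * gamma ^ int(it / stepsize) }'
}

# Print the learning rate at a few iterations.
for it in 0 399 400 1200 4000; do
  echo "iter $it: lr = $(lr_at "$it")"
done
```

With a stepsize of 400 the rate drops by a factor of 10 every 400 iterations, so it becomes negligible long before max_iter is reached; a larger stepsize may be worth trying if training stalls early.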

The CaffeNet network definition, train_val.prototxt, is as follows:

name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/ilsvrc12_train_lmdb"
    batch_size: 8
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: false
#  }
  data_param {
    source: "/home/***/CAFFE_ROOT/caffe/data/my_data/build_lmdb/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
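
As a sanity check on the layer parameters, the spatial size of each feature map can be followed through the network with out = (in + 2 * pad - kernel) / stride + 1. The sketch below does this for the 227×227 crop; out_size is a hypothetical helper, and the kernel, stride and pad values are read off the definition above (stride 1 and pad 0 where the layer does not set them).

```shell
#!/usr/bin/env sh
# out_size IN KERNEL STRIDE PAD: spatial output size of a conv or pooling layer.
# The division is exact for every layer below, so the rounding mode does not matter here.
out_size() { echo $(( ($1 + 2 * $4 - $2) / $3 + 1 )); }

s=227                                            # crop_size of the data layer
s=$(out_size "$s" 11 4 0); echo "conv1: $s"
s=$(out_size "$s" 3 2 0);  echo "pool1: $s"
s=$(out_size "$s" 5 1 2);  echo "conv2: $s"
s=$(out_size "$s" 3 2 0);  echo "pool2: $s"
s=$(out_size "$s" 3 1 1);  echo "conv3 to conv5: $s"   # 3x3 kernels with pad 1 keep the size
s=$(out_size "$s" 3 2 0);  echo "pool5: $s"

# pool5 has 256 channels, so fc6 sees 256 * s * s input values per image.
echo "fc6 inputs: $((256 * s * s))"
```

If crop_size is changed, the same calculation shows whether the sizes still divide cleanly and how many inputs fc6 receives.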

A brief explanation of a few parameters:

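test_iter: the number of iterations run in each test pass, chosen so that test_iter * batch_size (of the test set) equals the size of the test set. The test batch_size is set in the network prototxt file.
test_interval: the interval between test passes; here a test pass runs every 500 training iterations.
Caffe tests while it trains. As a generic example, with a training batch_size of 64, every 500 iterations (i.e. after 32000 training samples have been processed) the test error is computed once, and each such computation has to cover all test images (10000 in that example). With the batch_size of 8 used above, 500 iterations process 4000 training samples. One can think of it this way: within an epoch every training sample is visited once, while every test sample has to be visited at least once per test pass. The exact number of passes over the test set may not be an integer and depends on the code; a rough understanding of the process is enough.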
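
The relationships between these parameters are plain integer arithmetic, sketched below. images_per_test and test_iter_for are hypothetical helper names; the numbers 20, 50, 500 and 8 are the values from solver.prototxt and train_val.prototxt above, and 200 is the test set size mentioned in the solver comments.

```shell
#!/usr/bin/env sh
# images_per_test TEST_ITER BATCH: images read in one test pass.
images_per_test() { echo $(( $1 * $2 )); }

# test_iter_for N BATCH: smallest test_iter that covers N images (ceiling division).
test_iter_for() { echo $(( ($1 + $2 - 1) / $2 )); }

echo "images per test pass: $(images_per_test 20 50)"
echo "test_iter needed for 200 images: $(test_iter_for 200 50)"
echo "training samples between test passes: $((500 * 8))"
```

Under these assumptions a test_iter of 4 would already cover a 200-image test set once with a batch_size of 50.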

2.2 Training

Training is started with the following script:

#!/usr/bin/env sh
cd /home/***/CAFFE_ROOT/caffe
./build/tools/caffe train --solver=/home/***/CAFFE_ROOT/caffe/data/my_data/solver.prototxt -iterations=10000
# Run this script to start training

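The -iterations=10000 flag is meant to set the number of iterations (its default is 50). As far as I can tell, however, caffe train takes its iteration count from max_iter in solver.prototxt (15000 here), and -iterations applies to the caffe test and caffe time commands. Then comes a long wait, whose length depends on the machine. During training the loss kept decreasing and finally settled around 0.2 to 0.3. From what I read online, the main thing to watch is the accuracy during training, which settled around 0.8 for me. Test passes run according to the test_interval set earlier in solver.prototxt; with my test_interval: 500, a test pass runs every 500 iterations and reports the test accuracy.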

2.3 Testing

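Testing uses ./build/examples/cpp_classification/classification.bin, a binary produced by the earlier Caffe build. Its source code is in /home/***/CAFFE_ROOT/caffe/examples/cpp_classification. The source shows that the Classifier constructor takes the following arguments: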

Classifier(const string& model_file,
             const string& trained_file,
             const string& mean_file,
             const string& label_file);

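The first argument is the network definition used for inference. It is derived from the training network by removing the parts that are not needed, such as the training data input layers. My test model is as follows: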

name: "CaffeNet"

layer {
  name: "data"
  type: "Input"
  top: "data"
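  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }  # must be consistent with the training network
  # dim: 10 is the batch size, dim: 3 the number of colour channels,
  # dim: 227 the height and dim: 227 the width (the crop_size used in training)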
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1" 
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096   
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 2    
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}

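The second argument is the model file produced by training, e.g. caffenet_train_iter_1000.caffemodel.
The third argument is the mean file generated during data preparation: imagenet_mean.binaryproto.
The fourth argument is the label file, lable.txt. The last command-line argument is the image to classify: 1.jpg. It is best to preprocess the image by resizing it to 256×256; for details see https://blog.csdn.net/zchang81/article/details/73088042.
For example, my test script is as follows: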


#!/usr/bin/env sh
cd /home/***/CAFFE_ROOT/caffe 
./build/examples/cpp_classification/classification.bin \
  data/my_data/deploy.prototxt \
  data/my_data/caffenet_train_iter_2000.caffemodel \
  data/my_data/build_lmdb/imagenet_mean.binaryproto \
  data/my_data/lable.txt \
  data/my_data/1.jpg

# Run this script to classify a single image

Running the script produces the following output:

***@***-ThinkPad-X1-Carbon-3rd:~/CAFFE_ROOT/caffe/data/my_data$ ./test_image.sh 
---------- Prediction for data/my_data/1.jpg ----------
0.5195 - "stair 0"
0.4805 - "ground 1"
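
The label file passed to classification.bin is not shown in this post. Judging from the output above, it holds one line per class, in label order, so that the first line belongs to label 0 and the second to label 1. A minimal sketch of creating such a file follows; the two lines are inferred from the output rather than copied from my file, and demo_dir is a throwaway location used only for this example.

```shell
#!/usr/bin/env sh
# Write a two-class label file: line N (counting from 0) names label N.
# The contents are inferred from the prediction output; demo_dir is a throwaway location.
demo_dir=$(mktemp -d)
printf '%s\n' 'stair 0' 'ground 1' > "$demo_dir/lable.txt"

# The number of lines should match num_output of fc8 (2 here).
grep -c '' "$demo_dir/lable.txt"
cat "$demo_dir/lable.txt"
```

The order matters: stairs images were given label 0 and ground images label 1 when train.txt and val.txt were generated, so the lines have to appear in that order.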

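This post showed how to use CaffeNet for image recognition with your own dataset. It is only a follow-the-steps walkthrough. Later work will require modifying the classification source code and combining it with a camera for real-time video recognition. The next post will describe how to port recognition using an already trained model.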
