Preface
There are plenty of tutorials online for classifying images with VGG16 on Linux, but porting them to Windows raises a number of problems, so this post summarizes the full workflow.
For setting up the Caffe environment, see the earlier post on configuring win10 + vs2013 + caffe + gpu + python; that setup is not repeated here. The workflow below covers five steps:
1. Download the images and generate the .txt label files
2. Convert the training images to lmdb format
3. Compute the image mean
4. Adjust the network parameters and image paths
5. Train the network
1. Preparing the dataset
1. Many datasets are available online; the re.zip dataset is a good reference. If you want to train on your own data, mimic its layout: resize all images to a uniform 384*256 (I have MATLAB code for this; message me if you want it), split them into training and test sets at a 5:1 ratio, name the sets train and test, and copy the data into D:\caffe\caffe-master\data\re (this is my Caffe path), as shown in the figure below.
2. Once the dataset is ready, create the labels. Under D:/caffe/caffe-master/examples, create a new folder named myfile.
Write the following Python script and run it directly:
# -*- coding: utf-8 -*-
import os

data_path = 'D:/caffe/caffe-master/data/re/'    # dataset path
my = 'D:/caffe/caffe-master/examples/myfile/'   # output path
classes = [3, 4, 5, 6, 7]

def gen_txt(phase):
    f = open(my + phase + '.txt', 'w')
    for c in classes:
        folder = str(c)
        images = os.listdir(data_path + phase + '/')
        for img in images:
            # In the re dataset, images 300-399 belong to class 3,
            # 400-499 to class 4, and so on, so the first digit of the
            # file name is the label. Without this filter every image
            # would be written once per class with the wrong labels.
            if img.startswith(folder):
                f.write(phase + '/' + img + ' ' + folder + '\n')
    f.close()

gen_txt('train')
gen_txt('test')
After the script runs, train.txt and test.txt appear in the output path you set. Open them; if they look like the figure below, the labels were generated successfully.
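As a quick sanity check (a sketch assuming the re dataset's naming convention, where images 300-399 are class 3, 400-499 are class 4, and so on), each line of the generated files should pair an image with the first digit of its file name:

```python
# Each label line has the form "<phase>/<image> <label>". For the re
# dataset the label should equal the first digit of the image name.
def check_label_line(line):
    path, label = line.rsplit(' ', 1)
    image_name = path.split('/')[-1]
    return image_name[0] == label

sample_lines = ["train/305.jpg 3", "train/412.jpg 4", "test/701.jpg 7"]
print(all(check_label_line(l) for l in sample_lines))  # True
```

If any line fails this check, revisit the label-generation script before moving on; convert_imageset will silently bake wrong labels into the lmdb.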
2. Converting the data to lmdb format
This is the most troublesome step: most online tutorials target Linux, and getting it to work on Windows took me a long time.
- Create a new txt file named create_lmdb.sh (the .sh is the intended extension, but keep the .txt extension for now, since that makes editing easier), and paste in the following code:
#!/usr/bin/env sh
# MY: output directory for the lmdb files
MY=D:/caffe/caffe-master/examples/myfile
# TOOLS: path to the Caffe build (convert_imageset lives here)
TOOLS=D:/caffe/caffe-master/Build/x64/Release
# DATA: root directory of the image data
DATA=D:/caffe/caffe-master/data/re/

echo "Create train lmdb.."
rm -rf $MY/img_train_lmdb
$TOOLS/convert_imageset \
--shuffle \
--resize_height=256 \
--resize_width=256 \
$DATA \
$MY/train.txt \
$MY/img_train_lmdb

echo "Create test lmdb.."
rm -rf $MY/img_test_lmdb
# note: the list file is test.txt, not val.txt
$TOOLS/convert_imageset \
--shuffle \
--resize_height=256 \
--resize_width=256 \
$DATA \
$MY/test.txt \
$MY/img_test_lmdb

echo "All Done.."
Change the three paths above to match your machine. The comments sit on their own lines rather than after a `\` line continuation, so they no longer have to be deleted before the script will run. Save the file and change the .txt extension to .sh, as shown in the figure below.
Then run the script. This is the fiddly part, because sh is a Linux shell command. Before running, install Git for Windows (for example, search for git in the software manager of Tencent PC Manager, download it, and click Next through the installer), then add C:\Program Files\Git\bin to the PATH environment variable. How to edit PATH is not covered here.
Once PATH is set, open a cmd window, cd to D:\caffe\caffe-master\examples\myfile, and run: sh create_lmdb.sh
as shown in the figure below.
If the two folders below appear under your path, the image data has been converted to lmdb successfully. You can relax now: the hardest part is over, and the rest is straightforward.
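One extra check worth doing (a hedged sketch, not part of the original workflow): convert_imageset writes one lmdb record per non-empty line of the label file, assuming every image loads successfully, so the entry count reported by any lmdb inspection tool should match this line count:

```python
def expected_lmdb_entries(label_file_lines):
    # convert_imageset stores one record per non-empty label line, so a
    # fully successful conversion yields exactly this many lmdb entries.
    return sum(1 for line in label_file_lines if line.strip())

print(expected_lmdb_entries(["train/305.jpg 3", "", "train/412.jpg 4"]))  # 2
```

If the counts disagree, check the convert_imageset console output for "could not open or find file" warnings about individual images.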
3. Computing the image mean
Subtracting the image mean before training improves both training speed and accuracy, so this step is standard practice.
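To make the effect concrete, here is a minimal sketch of per-channel mean subtraction; the values 104, 117, 123 are the BGR channel means that appear later in the train_val prototxt, not ones computed from this dataset:

```python
def subtract_mean(pixel_bgr, mean_bgr=(104, 117, 123)):
    # Zero-center a BGR pixel by removing the per-channel mean, which is
    # what Caffe's transform_param does to every input pixel.
    return tuple(p - m for p, m in zip(pixel_bgr, mean_bgr))

print(subtract_mean((104, 117, 123)))  # (0, 0, 0)
```

Zero-centered inputs keep the early-layer activations in a well-behaved range, which is why training converges faster with this step.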
Find make_imagenet_mean.sh under D:\caffe\caffe-master\examples\imagenet and copy it into D:\caffe\caffe-master\examples\myfile, as shown in the figure below.
Then open the file with a text editor and modify the code:
#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12
EXAMPLE=D:/caffe/caffe-master/examples/myfile
DATA=D:/caffe/caffe-master/examples/myfile
TOOLS=D:/caffe/caffe-master/Build/x64/Release
$TOOLS/compute_image_mean $EXAMPLE/img_train_lmdb \
$DATA/imagenet_mean.binaryproto
echo "Done."
EXAMPLE=D:/caffe/caffe-master/examples/myfile is the directory containing the lmdb files
DATA=D:/caffe/caffe-master/examples/myfile is the directory where the generated file is saved
TOOLS=D:/caffe/caffe-master/Build/x64/Release is the path to the Caffe build tools
After editing the code, open a cmd window, cd to D:\caffe\caffe-master\examples\myfile, and run: sh make_imagenet_mean.sh
as shown in the figure below.
If it runs successfully, an imagenet_mean.binaryproto file is generated, as in the figure below.
4. Defining the model
Under D:\caffe\caffe-master\examples\myfile, create a new file vggnet_train_val.prototxt, open it with VS, and paste in the following code:
name: "VGGNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
crop_size: 224
mean_value: 104
mean_value: 117
mean_value: 123
mirror: true
}
data_param {
source: "D:/caffe/caffe-master/examples/myfile/img_train_lmdb" #note: path to the training lmdb
batch_size: 20 #set the training batch size according to your GPU memory
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
crop_size: 224
mean_value: 104
mean_value: 117
mean_value: 123
mirror: false
}
data_param {
source: "D:/caffe/caffe-master/examples/myfile/img_test_lmdb" #note: path to the validation lmdb
batch_size: 10
backend: LMDB
}
}
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1_2"
type: "ReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "pool1"
top: "conv2_1"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2_1"
type: "ReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2_2"
type: "ReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2_2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "pool2"
top: "conv3_1"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3_1"
type: "ReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3_2"
type: "ReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3_3"
type: "ReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3_3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "pool3"
top: "conv4_1"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu4_1"
type: "ReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu4_2"
type: "ReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu4_3"
type: "ReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4_3"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv5_1"
type: "Convolution"
bottom: "pool4"
top: "conv5_1"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu5_1"
type: "ReLU"
bottom: "conv5_1"
top: "conv5_1"
}
layer {
name: "conv5_2"
type: "Convolution"
bottom: "conv5_1"
top: "conv5_2"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu5_2"
type: "ReLU"
bottom: "conv5_2"
top: "conv5_2"
}
layer {
name: "conv5_3"
type: "Convolution"
bottom: "conv5_2"
top: "conv5_3"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu5_3"
type: "ReLU"
bottom: "conv5_3"
top: "conv5_3"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5_3"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
inner_product_param {
num_output: 4096
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
}
param {
lr_mult: 1
}
inner_product_param {
num_output: 4096
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10 #change fc8's num_output for your own data; every label must be smaller than this value (labels here are 3-7, so 10 works)
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
That is the full VGG16 model; update the paths and the class count, then save and exit.
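A quick way to see why fc6 connects where it does (a sketch of the arithmetic, not Caffe code): the 3x3 convolutions with pad 1 and stride 1 preserve spatial size, so only the five 2x2/stride-2 poolings shrink the 224x224 crop:

```python
def vgg_spatial_size(input_size=224, num_pools=5):
    # Each 2x2 max pool with stride 2 halves the feature map; the 3x3
    # pad-1 stride-1 convolutions leave the size unchanged.
    size = input_size
    for _ in range(num_pools):
        size //= 2
    return size

print(vgg_spatial_size())  # 7 -> pool5 is 7x7x512, the input to fc6
```

This is also why crop_size matters: change it and the fc6 weight shape changes with it, so a different crop breaks compatibility with pretrained VGG16 weights.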
In the same directory, create a new file vggnet_solver.prototxt and paste in the following code; remember to adjust the corresponding paths:
net: "D:/caffe/caffe-master/examples/myfile/vggnet_train_val.prototxt"
test_iter: 10
test_interval: 500 #run a validation pass every 500 training iterations to check accuracy
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 1000
display: 20
max_iter: 2000 #this is just a practice run, so 2000 iterations is enough
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000 #save a snapshot every 1000 training iterations
snapshot_prefix: "D:/caffe/caffe-master/examples/myfile/caffenet_train" #create a caffenet_train folder under this path first
solver_mode: GPU
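With lr_policy "step", Caffe computes the learning rate at each iteration as base_lr * gamma^floor(iter/stepsize); a small sketch of how the solver settings above play out:

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=1000):
    # Caffe's "step" policy: the rate drops by a factor of gamma every
    # stepsize iterations.
    return base_lr * gamma ** (iteration // stepsize)

print(step_lr(0))     # 0.001
print(step_lr(1500))  # roughly 1e-4, after the first drop at iteration 1000
```

So over the 2000-iteration run, the rate is 0.001 for the first 1000 iterations and 0.0001 for the second 1000.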
5. Training the network
In the same directory, create a new file train_vggnet.sh, open it with a text editor, paste in the following code, and again adjust the paths:
#!/usr/bin/env sh
set -e
D:/caffe/caffe-master/Build/x64/Release/caffe train \
--solver=D:/caffe/caffe-master/examples/myfile/vggnet_solver.prototxt $@
After saving, open a cmd window, cd to the current directory, and run: sh train_vggnet.sh, as shown in the figure below.
If it runs without errors, congratulations: training has started successfully, and all that is left is to wait for the results.
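If you want to track progress while you wait, the loss lines Caffe prints can be scraped from the console log. This is a hedged sketch: the regex matches the usual "Iteration N ..., loss = X" format, which may differ slightly between Caffe builds:

```python
import re

# Matches lines like "... Iteration 20 (1.5 iter/s), loss = 2.30259"
LOSS_RE = re.compile(r'Iteration (\d+).*?loss = ([\d.eE+-]+)')

def parse_losses(log_text):
    # Return (iteration, loss) pairs found in a Caffe console log.
    return [(int(m.group(1)), float(m.group(2)))
            for m in LOSS_RE.finditer(log_text)]

sample = "I0101 12:00:00.0 1234 solver.cpp] Iteration 20 (1.5 iter/s), loss = 2.30259"
print(parse_losses(sample))  # [(20, 2.30259)]
```

Plotting these pairs is an easy way to confirm the loss is actually decreasing over the 2000 iterations.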