Caffe classification network training workflow
1. Data preparation
1.1 Place the images of each class in its own folder
1.2 Split the dataset into a training set and a test set; script as follows:
import os
import os.path
import shutil

rootpath = "/home/wz/Desktop/safety_belt/merged/"
saveDir = "/home/wz/Desktop/safety_belt/caffedata/"

def traversalDir_FirstDir(path):
    """Return the names of the first-level subdirectories of path."""
    dirlist = []
    if os.path.exists(path):
        for f in os.listdir(path):
            m = os.path.join(path, f)
            if os.path.isdir(m):
                dirlist.append(os.path.split(m)[1])
    return dirlist

def ImageListInDir(path, list_name):
    """Recursively collect .png/.jpg/.jpeg file paths under path into list_name."""
    for f in os.listdir(path):
        file_path = os.path.join(path, f)
        if os.path.isdir(file_path):
            ImageListInDir(file_path, list_name)  # fixed: recurse with the correct function name
        elif os.path.splitext(file_path)[1] in ('.png', '.jpg', '.jpeg'):
            list_name.append(file_path)

subdirlist = traversalDir_FirstDir(rootpath)
trainSetDir = saveDir + "train/"
testSetDir = saveDir + "test/"
classNameTxt = saveDir + "classname.txt"
testTxt = saveDir + "test.txt"
trainTxt = saveDir + "train.txt"
testindex = open(testTxt, 'w')
trainindex = open(trainTxt, 'w')
classNameFile = open(classNameTxt, 'w')

imglist = []
dirindex = 0
for d in subdirlist:
    ImageListInDir(rootpath + d, imglist)
    length = len(imglist)
    if length < 20:
        print('Not enough images in', d)
        imglist = []  # fixed: reset before skipping so the next class starts clean
        continue
    if not os.path.exists(trainSetDir + d):
        os.makedirs(trainSetDir + d)
    if not os.path.exists(testSetDir + d):
        os.makedirs(testSetDir + d)
    # the first 15% of each class goes to the test set, the remaining 85% to the training set
    for i in range(int(length * 0.15), length):
        imagename = imglist[i].split("/")
        shutil.copyfile(imglist[i], trainSetDir + d + "_" + imagename[-1])
        trainindex.writelines(trainSetDir + d + "_" + imagename[-1] + ' ' + str(dirindex) + '\n')
    for i in range(int(length * 0.15)):
        imagename = imglist[i].split("/")
        shutil.copyfile(imglist[i], testSetDir + d + "_" + imagename[-1])
        testindex.writelines(testSetDir + d + "_" + imagename[-1] + ' ' + str(dirindex) + '\n')
    imglist = []
    classNameFile.writelines(d + '\n')
    dirindex += 1

testindex.close()
trainindex.close()
classNameFile.close()
print('Done!')
1.3 Run the convert_imageset tool
./build/tools/convert_imageset.bin '' train.txt train_lmdb -resize_height=128 -resize_width=128 -shuffle=True
Here train.txt is the listing generated in the previous step. If it stores absolute image paths, pass an empty string '' as the first argument; otherwise pass a rootdir such that rootdir + imagename gives the full image path. The data files are then generated under the train_lmdb folder.
Repeat the same process to generate the test set.
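The listing files consumed by convert_imageset must contain one "path label" pair per line, with the numeric label after the last space. A small sketch to sanity-check a listing before conversion (the file names in the sample data are hypothetical):

```python
def check_listing(lines):
    """Validate 'path label' lines as expected by convert_imageset."""
    labels = set()
    for n, line in enumerate(lines, 1):
        parts = line.rsplit(' ', 1)  # the label follows the last space
        if len(parts) != 2 or not parts[1].isdigit():
            raise ValueError('bad line %d: %r' % (n, line))
        labels.add(int(parts[1]))
    return sorted(labels)

# hypothetical listing contents in the format written by the split script above
sample = ['train/belt_001.jpg 0', 'train/belt_002.jpg 0', 'train/nobelt_001.jpg 1']
print(check_listing(sample))  # → [0, 1]
```

Running this on the real train.txt and test.txt catches stray lines (e.g. paths containing unescaped spaces) before the converter fails half-way through.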
1.4 Computing the mean file
Not strictly required, but if you test with the examples/cpp_classification program, the mean file is a mandatory option. If the task needs robustness to illumination changes, it is best to subtract the mean; if the task is something like color classification, whether it is necessary still needs to be verified. The command is:
./build/tools/compute_image_mean.bin train_lmdb mean.binaryproto
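compute_image_mean produces a per-pixel mean image over the whole training set; the per-channel alternative (mean_value, shown in the transform_param below) is just the average over all pixels of each channel. A toy sketch of the per-channel computation, using nested lists in place of real decoded images:

```python
def channel_means(images):
    """images: list of H x W x C nested lists; returns the mean of each channel."""
    sums = None
    count = 0
    for img in images:
        for row in img:
            for pixel in row:
                if sums is None:
                    sums = [0.0] * len(pixel)
                for c, v in enumerate(pixel):
                    sums[c] += v
                count += 1
    return [s / count for s in sums]

# two tiny 1x2 "images" with 3 channels each (toy data, not real images)
imgs = [
    [[[100, 110, 120], [104, 114, 124]]],
    [[[108, 118, 128], [104, 114, 124]]],
]
print(channel_means(imgs))  # → [104.0, 114.0, 124.0]
```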
2. Model-related file preparation
2.1 Pretrained model
2.2 train_val.prototxt
Points to note:
1. Input layer changes:
type: "Data"  # Note: if the data was prepared in HDF5 or LMDB format, this must be Data; if images are read from files directly, it should be ImageData instead — see the official finetune_flickr_style tutorial for an example.
transform_param {
  mean_file: "/home/wz/Desktop/safety_belt/caffedata/mean.binaryproto"
  # or specify the per-channel mean values directly, e.g.:
  #mean_value: 104
  #mean_value: 117
  #mean_value: 123
  #crop_size: 227
  #mirror: true
  #scale: 0.00390625
}
data_param {
  source: "/home/wz/Desktop/safety_belt/caffedata/train_lmdb"
  batch_size: 128
  backend: LMDB
}
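For the read-images-directly case mentioned above, the input layer uses type ImageData with an image_data_param block instead of data_param; a sketch, with placeholder paths and illustrative sizes:

```
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  transform_param {
    mirror: true
    crop_size: 227
  }
  image_data_param {
    source: "/path/to/train.txt"   # same "path label" listing as above
    batch_size: 32
    new_height: 256
    new_width: 256
  }
}
```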
2. Output layer changes
a. Change num_output of the fully connected layer to the number of classes the task requires, and give the layer a new name; otherwise a dimension-mismatch error occurs when the pretrained parameters are copied.
b. Raise the learning-rate multipliers of this fully connected layer so it learns faster than the other layers.
c. Change the bottom of the SoftmaxWithLoss layer to the top of the new fully connected layer.
For example (note that for an architecture like GoogLeNet, each output branch must be changed in the same way):
layer {
  name: "loss3/classifier_new"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier_new"
  param {
    lr_mult: 2
    decay_mult: 1
  }
  param {
    lr_mult: 5
    decay_mult: 0
  }
  inner_product_param {
    num_output: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss3/loss3"
  type: "SoftmaxWithLoss"
  bottom: "loss3/classifier_new"
  bottom: "label"
  top: "loss3/loss3"
  loss_weight: 1
}
layer {
  name: "loss3/top-1"
  type: "Accuracy"
  bottom: "loss3/classifier_new"
  bottom: "label"
  top: "loss3/top-1"
  include {
    phase: TEST
  }
}
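The reason the renamed layer must get a new name is that `caffe train -weights` copies parameters by matching layer names: matched layers receive the pretrained weights, unmatched layers keep their fresh initialization. A minimal dictionary-based sketch of that matching rule (layer names and the weight placeholders are illustrative):

```python
def copy_by_name(pretrained, target):
    """Copy pretrained params into target wherever layer names match;
    unmatched target layers keep their (fresh) initialization."""
    copied, skipped = [], []
    for name in target:
        if name in pretrained:
            target[name] = pretrained[name]
            copied.append(name)
        else:
            skipped.append(name)
    return copied, skipped

pretrained = {'conv1': 'w_conv1', 'loss3/classifier': 'w_old_fc'}
target = {'conv1': 'init', 'loss3/classifier_new': 'init'}  # renamed output layer
copied, skipped = copy_by_name(pretrained, target)
print(copied, skipped)  # → ['conv1'] ['loss3/classifier_new']
```

Because 'loss3/classifier_new' matches nothing in the pretrained model, it is skipped during copying and trained from its weight_filler initialization, which is exactly the finetuning behavior wanted here.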
2.3 deploy.prototxt
This is basically the same as train_val.prototxt, except that the input layer must provide an input_param specifying the image dimensions. The names and shapes of all other layers must match those in train_val.prototxt exactly; otherwise the test program may well run without error but produce wrong results.
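A sketch of the deploy-style input layer, assuming 128x128 RGB inputs as produced by the resize step above (batch size 1 is typical for single-image testing):

```
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 128 dim: 128 } }
}
```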
2.4 solver.prototxt
Taking the LeNet solver as an example:
net: "examples/mnist/lenet_train_test.prototxt" (network definition file)
test_iter: 100 (should satisfy test_iter * batch_size >= test_sample_size)
test_interval: 500 (run a test pass every 500 iterations)
(learning-rate decay policy; linear, polynomial, exponential, and step schedules are supported)
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100 (print to the terminal every 100 iterations)
max_iter: 10000 (maximum number of iterations; in TF terms, total samples traversed over all epochs = max_iter * batch_size)
snapshot: 5000 (save intermediate results, like a TF checkpoint)
snapshot_prefix: "examples/mnist/lenet" (model save path prefix)
solver_mode: GPU
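The test_iter constraint above reduces to a one-line calculation; a sketch, with the test-set size and test batch size as inputs:

```python
import math

def min_test_iter(test_sample_size, test_batch_size):
    """Smallest test_iter that covers the whole test set at least once."""
    return math.ceil(test_sample_size / test_batch_size)

# MNIST: 10000 test images with a test batch size of 100
print(min_test_iter(10000, 100))  # → 100, matching the LeNet solver above
```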
3. Training and testing
./build/tools/caffe train -solver models/finetune_safty_belt/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0
classification.cpp — classify a single image
batchClassification.cpp — classify multiple images
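The single-image tester ships as examples/cpp_classification in the Caffe tree; its usual invocation takes the deploy definition, trained weights, mean file, label list, and an image, in that order (file names here are placeholders, except classname.txt, which is the label file written by the split script above):

```
./build/examples/cpp_classification/classification.bin \
    deploy.prototxt model.caffemodel mean.binaryproto classname.txt image.jpg
```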