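There are two main ways to train on multi-label image data with Caffe:
1. Modify the Caffe source: change convert_imageset.cpp so that it supports multiple labels. For the detailed steps, see https://www.jianshu.com/p/fdf7c599ab9d
2. Use HDF5 data together with a Slice layer. This article covers the second method.
- Creating the HDF5 data
First, save the image paths and labels to a TXT file. In this article each line has the form "image_path label_1 label_2", as shown below: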
114841417@N06/coarse_tilt_aligned_face.492.12059747423_6b3535aa6a_o.jpg 3 0
8187011@N06/coarse_tilt_aligned_face.992.10353206665_5173637857_o.jpg 4 1
48647239@N03/coarse_tilt_aligned_face.1515.11838493313_5b7240b1c9_o.jpg 3 1
100003415@N08/coarse_tilt_aligned_face.2176.9523981569_ea255870f1_o.jpg 4 0
7464014@N04/coarse_tilt_aligned_face.965.10107710156_9cb48097c5_o.jpg 4 0
31183835@N08/coarse_tilt_aligned_face.2096.8754898174_5d34522d9a_o.jpg 4 1
63164355@N03/coarse_tilt_aligned_face.1082.8826664078_de8f6c6a9e_o.jpg 3 1
111700049@N08/coarse_tilt_aligned_face.1548.11833465006_b9235b0c89_o.jpg 5 0
63164355@N03/coarse_tilt_aligned_face.1111.11014305184_25bc533930_o.jpg 5 0
7398884@N04/coarse_tilt_aligned_face.1641.8727032370_97ab4ee179_o.jpg 3 0
10280355@N07/coarse_tilt_aligned_face.1880.9496762548_754e1337d6_o.jpg 6 1
33627988@N04/coarse_tilt_aligned_face.1949.8809482906_7021c9794c_o.jpg 7 1
64504106@N06/coarse_tilt_aligned_face.911.11846581226_fc9f42d681_o.jpg 0 0
112599447@N03/coarse_tilt_aligned_face.1201.11576030294_cf8d7137a6_o.jpg 5 1
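As a small illustration of the list format above, the following sketch parses one line into a path and its two labels. The helper name parse_list_line is hypothetical and not part of the original script; splitting from the right is an added precaution so that a path containing spaces would stay intact.

```python
def parse_list_line(line):
    """Split 'image_path label_1 label_2' into (path, [label_1, label_2])."""
    # Split from the right so that a path containing spaces is kept whole
    path, label_1, label_2 = line.strip().rsplit(None, 2)
    return path, [int(label_1), int(label_2)]


sample = '114841417@N06/coarse_tilt_aligned_face.492.12059747423_6b3535aa6a_o.jpg 3 0'
path, labels = parse_list_line(sample)
print(labels)  # [3, 0]
```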
Convert the images and labels to arrays and write them to an HDF5 file:
import os

import cv2 as cv
import h5py
import numpy as np

# Root directory that contains the image folders
img_root = './image'
# Path to the training list (TXT)
train_path = './train.txt'
# Output HDF5 file
train_out = './hdf5_train'

# Read the list file, skipping blank lines; the with-block closes the file
with open(train_path) as f:
    lines = [line for line in f if line.strip()]

file_list = []  # image paths
# Images and labels must be arrays before they can be written to HDF5.
# Here there are 2 labels per image, and each image has
# channel = 3, height = 256, width = 256.
# Labels are stored as float32 because older Caffe releases only accept
# float or double HDF5 datasets.
labels = np.zeros((len(lines), 2), dtype=np.float32)
datas = np.zeros((len(lines), 3, 256, 256), dtype=np.float32)

# Parse each line: image path, label_1, label_2
for count, line in enumerate(lines):
    fields = line.split()
    file_list.append(fields[0])
    labels[count][0] = float(fields[1])
    labels[count][1] = float(fields[2])

# The HDF5 input layer has no transform_param, so the images have to be
# preprocessed here.
for i, rel_path in enumerate(file_list):
    path = os.path.join(img_root, rel_path)
    image = cv.imread(path)  # BGR array of shape (height, width, channel)
    if image is None:
        raise IOError('Cannot read image: ' + path)
    image = cv.resize(image, (256, 256))  # resample to 256 x 256
    # (height, width, channel) -> (channel, height, width), the layout Caffe reads
    # HDF5 data for Caffe must be float or double
    datas[i] = image.transpose(2, 0, 1).astype(np.float32)

# Per-channel mean over all images
mean = datas.mean(axis=(0, 2, 3))
# Subtract the mean from every image
datas -= mean.reshape(1, 3, 1, 1)

# Save the HDF5 file
with h5py.File(train_out, 'w') as fout:
    # 'data' and 'label' must match the names after top: in the data layer
    # of train_val.prototxt (explained further in the prototxt section)
    fout.create_dataset('data', data=datas)
    fout.create_dataset('label', data=labels)
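The preprocessing in the script (reordering the axes and subtracting the per-channel mean) can be checked on synthetic data without any image files. The sketch below is only an illustration: the tiny 4 x 4 random "images" are made up and stand in for the arrays that cv.imread would return.

```python
import numpy as np

# Made-up stand-ins for two decoded images, shape (N, height, width, channel)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 4, 4, 3)).astype(np.float32)

# (N, height, width, channel) -> (N, channel, height, width)
datas = images.transpose(0, 3, 1, 2).copy()

# One mean value per channel, computed over all images and pixels
mean = datas.mean(axis=(0, 2, 3))
datas -= mean.reshape(1, 3, 1, 1)

print(datas.shape)  # (2, 3, 4, 4)
# After subtraction every channel should average to roughly zero
print(np.allclose(datas.mean(axis=(0, 2, 3)), 0, atol=1e-3))
```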
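Notes:
1. When Caffe reads an HDF5 file it loads all of the data into memory, so a large dataset should be split across several files, each preferably no larger than 2 GB.
2. In train_val.prototxt, the source of hdf5_data_param must be a TXT file that lists the paths of the HDF5 files; it cannot point at an HDF5 file directly.
3. The HDF5Data layer has no transform_param, so any preprocessing has to be applied to the images before the HDF5 data is generated.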
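Notes 1 and 2 can be combined into one step: write the arrays as several smaller HDF5 files and list their paths in the TXT file that source points to. The following is a minimal sketch under stated assumptions, not the original author's code: the helper names chunk_ranges and write_hdf5_chunks, the train_%03d.h5 file naming and the max_bytes parameter are all hypothetical.

```python
import os


def chunk_ranges(num_samples, sample_bytes, max_bytes=2 * 1024**3):
    """Return (start, stop) index pairs so that each chunk stays under max_bytes."""
    per_chunk = max(1, max_bytes // sample_bytes)
    return [(start, min(start + per_chunk, num_samples))
            for start in range(0, num_samples, per_chunk)]


def write_hdf5_chunks(datas, labels, out_dir, list_path, max_bytes=2 * 1024**3):
    """Write datas/labels as several HDF5 files and list their paths in list_path."""
    import h5py  # imported here so chunk_ranges can be used without h5py

    os.makedirs(out_dir, exist_ok=True)
    sample_bytes = datas[0].nbytes + labels[0].nbytes
    paths = []
    ranges = chunk_ranges(len(datas), sample_bytes, max_bytes)
    for k, (start, stop) in enumerate(ranges):
        path = os.path.abspath(os.path.join(out_dir, 'train_%03d.h5' % k))
        with h5py.File(path, 'w') as fout:
            fout.create_dataset('data', data=datas[start:stop])
            fout.create_dataset('label', data=labels[start:stop])
        paths.append(path)
    # One HDF5 path per line; this TXT file is what hdf5_data_param.source names
    with open(list_path, 'w') as f:
        f.write('\n'.join(paths) + '\n')
    return paths


# One 3x256x256 float32 image plus two float32 labels takes 786440 bytes,
# so a 4 MiB limit (used here only to keep the example small) fits 5 samples.
print(chunk_ranges(10, 3 * 256 * 256 * 4 + 2 * 4, max_bytes=4 * 1024**2))
# [(0, 5), (5, 10)]
```

With the arrays from the script above, a call such as write_hdf5_chunks(datas, labels, './hdf5_train_parts', './train_list.txt') would produce the list file; both paths are placeholders.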
- Modifying train_val.prototxt
name: "CaffeNet"
layers {
name: "data"
type: HDF5_DATA
top: "data" # the dataset names used when writing the HDF5 file must match the names after top:
top: "label"
hdf5_data_param {
source: "/train_list.txt"
batch_size: 64
}
include: { phase: TRAIN }
}
layers {
name: "data"
type: HDF5_DATA
top: "data"
top: "label"
hdf5_data_param {
source: "/test_list.txt"
batch_size: 64
}
include: { phase: TEST }
}
# Add a Slice layer to split the labels
layers {
name: "slices"
type: SLICE
bottom: "label"
top: "label_1" # one top per label
top: "label_2"
slice_param{
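axis: 1 # axis selects the dimension to slice along (num or channel); 1 slices along channel, i.e. across the label columns
slice_point: 1 # the number of slice_point entries is the number of labels minus 1; with 2 labels a single split is enough
#slice_point: 2 with three labels, add one more slice_point entry like this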
}
}
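What the Slice layer does to the label blob can be mimicked in NumPy. This is only an analogy for illustration, not Caffe code; slice_labels is a hypothetical helper and the label values are taken from the sample list.

```python
import numpy as np


def slice_labels(label_blob, slice_points, axis=1):
    """Split a label array at the given indices, like slice_point along axis."""
    return np.split(label_blob, slice_points, axis=axis)


# Three samples, two labels each: shape (N, 2)
label = np.array([[3, 0], [4, 1], [5, 0]], dtype=np.float32)

# Two labels: one slice point, two outputs of shape (N, 1)
label_1, label_2 = slice_labels(label, [1])
print(label_1.shape, label_2.shape)  # (3, 1) (3, 1)
print(label_1.ravel())  # [3. 4. 5.]

# Three labels would need two slice points, giving three outputs
three = np.zeros((3, 3), dtype=np.float32)
print(len(slice_labels(three, [1, 2])))  # 3
```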
# The convolution, pooling and other layers are unchanged and omitted here
# Modify the fc8, accuracy and loss layers
layers {
name: "fc8_age"
type: INNER_PRODUCT
bottom: "fc7"
top: "fc8_age"
inner_product_param {
num_output: 8 # number of classes for the first label
}
}
layers {
name: "accuracy1"
type: ACCURACY
bottom: "fc8_age"
bottom: "label_1" # label 1, produced by the Slice layer
top: "accuracy1"
include: { phase: TEST }
}
layers {
name: "loss_age"
type: SOFTMAX_LOSS
bottom: "fc8_age"
bottom: "label_1" #label_1
top: "loss_age"
}
layers {
name: "fc8_gender"
type: INNER_PRODUCT
bottom: "fc7"
top: "fc8_gender"
inner_product_param {
num_output: 2 # number of classes for label_2
}
}
layers {
name: "accuracy2"
type: ACCURACY
bottom: "fc8_gender"
bottom: "label_2" # corresponds to label_2
top: "accuracy2"
include: { phase: TEST }
}
layers {
name: "loss_gender"
type: SOFTMAX_LOSS
bottom: "fc8_gender"
bottom: "label_2" # corresponds to label_2
top: "loss_gender"
}
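With two loss layers in the net, Caffe optimizes the weighted sum of their outputs, and each loss layer has a default weight of 1. The sketch below illustrates that sum in NumPy with made-up scores; softmax_loss is a hypothetical helper that approximates a softmax loss averaged over the batch, not Caffe's implementation.

```python
import numpy as np


def softmax_loss(scores, labels):
    """Mean cross-entropy of softmax(scores) against integer class labels."""
    labels = np.asarray(labels).astype(int).ravel()  # accepts shape (N,) or (N, 1)
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(labels)), labels].mean())


# Made-up scores: all zeros, so every class is equally likely
fc8_age = np.zeros((2, 8))
fc8_gender = np.zeros((2, 2))
label_1 = np.array([[3], [4]])  # shape (N, 1), as produced by the Slice layer
label_2 = np.array([[0], [1]])

loss_age = softmax_loss(fc8_age, label_1)        # ln(8)
loss_gender = softmax_loss(fc8_gender, label_2)  # ln(2)
total_loss = loss_age + loss_gender              # ln(16)
print(round(total_loss, 4))  # 2.7726
```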
- Modifying deploy.prototxt
# Only the fc8 and prob layers need to change
layers {
name: "fc8_age"
type: INNER_PRODUCT
bottom: "fc7"
top: "fc8_age"
inner_product_param {
num_output: 8
}
}
layers {
name: "prob1"
type: SOFTMAX
bottom: "fc8_age"
top: "prob1"
}
layers {
name: "fc8_gender"
type: INNER_PRODUCT
bottom: "fc7"
top: "fc8_gender"
inner_product_param {
num_output: 2
}
}
layers {
name: "prob2"
type: SOFTMAX
bottom: "fc8_gender"
top: "prob2"
}
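At deployment time the two Softmax layers yield one probability vector per label, and the predicted class for each label is the index of the largest probability. The following NumPy sketch shows that step with made-up fc8 scores; it does not call Caffe, and the softmax helper is written here only for illustration.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Made-up outputs of fc8_age (8 classes) and fc8_gender (2 classes) for one image
fc8_age = np.array([[0.1, 0.2, 0.1, 2.5, 0.3, 0.0, -1.0, 0.4]])
fc8_gender = np.array([[1.2, -0.3]])

prob1 = softmax(fc8_age)
prob2 = softmax(fc8_gender)

age_class = int(prob1.argmax(axis=1)[0])
gender_class = int(prob2.argmax(axis=1)[0])
print(age_class, gender_class)  # 3 0
```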