This post mainly follows the book 《深度学习——caffe之经典模型详解与实战》.
Kaggle competition link: https://www.kaggle.com/c/facial-keypoints-detection
Data format: the downloaded file is a CSV, which Caffe cannot read directly.
step1: convert the CSV to HDF5 via csv -> DataFrame -> numpy.Array -> hdf5:
1) load the CSV into a DataFrame with pandas
2) convert the DataFrame into numpy arrays
3) drop the samples with missing labels (done in the code with DataFrame.dropna())
4) write the numpy arrays to HDF5 with h5py
Below is the code of csv_to_HDF5.py:
# -*- coding:utf-8 -*-
__author__ = 'xuy'
"""
Caffe cannot read CSV data, so convert it via csv -> DataFrame -> numpy.Array -> hdf5:
1) load the CSV into a DataFrame with pandas
2) convert the DataFrame into numpy arrays
3) drop the samples with missing labels
4) write the numpy arrays to HDF5 with h5py
"""
import os
import numpy as np
from pandas.io.parsers import read_csv
from sklearn.utils import shuffle
import h5py

TRAIN_CSV = '/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/data_set/training.csv'

def csv_to_hd5():
    '''
    About the dataset:
    in training.csv, the first through second-to-last columns are the labels
    (positions of the facial keypoints); the last column holds the image pixels [0~255].
    :return:
    '''
    # os.path.expanduser(path) expands "~" and "~user" in path to the user's home directory
    dataframe = read_csv(os.path.expanduser(TRAIN_CSV))
    dataframe['Image'] = dataframe['Image'].apply(lambda img: np.fromstring(img, sep=' '))
    # apply() takes a function. Comparing the column before and after shows two changes,
    # both caused by lambda img: np.fromstring(img, sep=' '):
    # 1) the printed values are comma-separated instead of space-separated
    # 2) each entry is now an Array
    '''
    Here the argument of apply() is a lambda (anonymous function):
    input: img, output: np.fromstring(img, sep=' ')

    numpy.fromstring
    Parameters:
        string : str
            A string containing the data.
        dtype : data-type, optional
            The data type of the array; default: float. For binary input data,
            the data must be in exactly this format.
        count : int, optional
            Read this number of dtype elements from the data. If this is negative
            (the default), the count will be determined from the length of the data.
        sep : str, optional
            If not provided or, equivalently, the empty string, the data will be
            interpreted as binary data; otherwise, as ASCII text with decimal numbers.
            Also in this latter case, this argument is interpreted as the string
            separating numbers in the data; extra whitespace between elements is also ignored.
    Returns:
        arr : ndarray
            The constructed array.
    Raises:
        ValueError
            If the string is not the correct size to satisfy the requested dtype and count.
    ------------------------------------------------------------------------------
    Reference: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html
    Parameters:
        func : function
            Function to apply to each column/row
        axis : {0 or 'index', 1 or 'columns'}, default 0
            0 or 'index': apply function to each column
            1 or 'columns': apply function to each row
        broadcast : boolean, default False
            For aggregation functions, return object of same size with values propagated
        raw : boolean, default False
            If False, convert each row or column into a Series. If raw=True the passed
            function will receive ndarray objects instead. If you are just applying a
            NumPy reduction function this will achieve much better performance
        reduce : boolean or None, default None
            Try to apply reduction procedures. If the DataFrame is empty, apply will use
            reduce to determine whether the result should be a Series or a DataFrame.
            If reduce is None (the default), apply's return value will be guessed by
            calling func an empty Series (note: while guessing, exceptions raised by func
            will be ignored). If reduce is True a Series will always be returned, and if
            False a DataFrame will always be returned.
        args : tuple
            Positional arguments to pass to function in addition to the array/series
        Additional keyword arguments will be passed as keywords to the function
    Returns:
        applied : Series or DataFrame
    '''
    # print '------------------------------------------------------------------------------'
    # print dataframe['Image']
    dataframe = dataframe.dropna()  # drop every row that has a missing value
    # stack the per-image arrays into one 2-D array and scale the pixels to [0, 1]
    data = np.vstack(dataframe['Image'].values) / 255.0
    # first through second-to-last columns: the keypoint positions;
    # .values converts the DataFrame into an np.array
    label = dataframe[dataframe.columns[:-1]].values
    # print 'the label type is %s'%type(dataframe[dataframe.columns[:-1]])
    label = (label - 48) / 48.0  # normalize the coordinates to [-1, 1]
    data, label = shuffle(data, label, random_state=0)
    '''
    Both return values are of type numpy.Array
    '''
    return data, label

if __name__ == '__main__':
    # train_data/val_data, at this point of type numpy.Array
    data, label = csv_to_hd5()
    # -1 lets numpy infer this dimension from the remaining (1, 96, 96)
    data = data.reshape(-1, 1, 96, 96)
    # everything except the last 100 images is the training set; the last 100 are the validation set
    data_train = data[:-100, :, :, :]
    data_val = data[-100:, :, :, :]
    # train_label/val_label
    label = label.reshape(-1, 1, 1, 30)
    label_train = label[:-100, :, :, :]
    label_val = label[-100:, :, :, :]
    # training HDF5 database
    fhandle = h5py.File('train.hd5', 'w')
    fhandle.create_dataset('data', data=data_train, compression='gzip', compression_opts=4)
    fhandle.create_dataset('label', data=label_train, compression='gzip', compression_opts=4)
    fhandle.close()
    # validation HDF5 database
    fhandle = h5py.File('val.hd5', 'w')
    fhandle.create_dataset('data', data=data_val, compression='gzip', compression_opts=4)
    fhandle.create_dataset('label', data=label_val, compression='gzip', compression_opts=4)
    fhandle.close()
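The two normalizations and the train/validation split in the script can be checked without downloading training.csv. The following is only a minimal sketch on synthetic data: the helper names, the sample count of 250 and the random values are my own assumptions, not part of the book's code. It also shows the inverse of the label normalization, which is needed to turn network outputs back into pixel coordinates.

```python
import numpy as np

IMAGE_SIZE = 96          # images are 96x96, as in the reshape in the script
HALF = IMAGE_SIZE / 2.0  # 48, the constant used for label normalization


def normalize_labels(coords_px):
    """Map pixel coordinates in [0, 96] to [-1, 1], like (label - 48) / 48."""
    return (coords_px - HALF) / HALF


def denormalize_labels(coords_norm):
    """Inverse of normalize_labels: map values in [-1, 1] back to pixels."""
    return coords_norm * HALF + HALF


def split_train_val(data, label, n_val=100):
    """Hold out the last n_val samples for validation, as the script does."""
    return (data[:-n_val], label[:-n_val]), (data[-n_val:], label[-n_val:])


# Synthetic stand-in for the cleaned dataset (hypothetical size and values).
rng = np.random.RandomState(0)
n_samples = 250
data = rng.randint(0, 256, size=(n_samples, IMAGE_SIZE * IMAGE_SIZE)) / 255.0
label_px = rng.uniform(0, IMAGE_SIZE, size=(n_samples, 30))

# Same shapes as in csv_to_HDF5.py: N x 1 x 96 x 96 and N x 1 x 1 x 30.
data = data.reshape(-1, 1, IMAGE_SIZE, IMAGE_SIZE)
label = normalize_labels(label_px).reshape(-1, 1, 1, 30)
(train_x, train_y), (val_x, val_y) = split_train_val(data, label)

# Round trip: normalized labels map back to the original pixel coordinates.
restored = denormalize_labels(label.reshape(-1, 30))
```

With 250 synthetic samples, 150 end up in the training arrays and 100 in the validation arrays.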
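step2: with the data converted, the next step is to build the network: write_train.py generates train.prototxt. (As far as I can tell, train_val.prototxt cannot be generated directly through pycaffe, because it runs into naming conflicts on the Python side.)
So to write train_val.prototxt, start from the generated train.prototxt and edit it slightly by hand.
One note here: although the data now lives in .hdf5 files, Caffe's HDF5Data layer takes as its source a txt file that lists the HDF5 paths, so
train.txt has to be created, with the content:
/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/train.hd5
and val.txt has to be created, with the content:
/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/val.hd5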
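The two list files can of course be written by hand, but a few lines of Python do the same. This is only a sketch: `write_hdf5_list` and `read_hdf5_list` are hypothetical helper names of mine, and the demo writes into a temporary directory instead of the project directory used in this post.

```python
import os
import tempfile


def write_hdf5_list(list_path, hdf5_paths):
    """Write one absolute HDF5 path per line into the list file."""
    with open(list_path, 'w') as f:
        for path in hdf5_paths:
            f.write(os.path.abspath(path) + '\n')
    return list_path


def read_hdf5_list(list_path):
    """Read the list file back, skipping empty lines."""
    with open(list_path) as f:
        return [line.strip() for line in f if line.strip()]


# Demo in a scratch directory (the .hd5 files do not need to exist for this step).
workdir = tempfile.mkdtemp()
train_list = write_hdf5_list(os.path.join(workdir, 'train.txt'),
                             [os.path.join(workdir, 'train.hd5')])
val_list = write_hdf5_list(os.path.join(workdir, 'val.txt'),
                           [os.path.join(workdir, 'val.hd5')])
```

Writing absolute paths avoids surprises when the caffe command is started from a different working directory.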
# -*- coding:utf-8 -*-
__author__ = 'xuy'
import caffe
# from caffe import layers as L
# from caffe import params as P

# Note: despite the "frozen" names, lr_mult is non-zero, so both parameters are trained.
frozen_weight_param = dict(lr_mult=1)  # weights
frozen_bias_param = dict(lr_mult=2)  # biases
froozen_param = [frozen_weight_param, frozen_bias_param]

def fk_layer():
    net = caffe.NetSpec()
    net.train_data, net.train_label = caffe.layers.HDF5Data(
        ntop=2,  # the layer has two outputs (net.train_data, net.train_label), so ntop=2 is required
        name='fk-data',  # the name of the layer
        hdf5_data_param={
            'source': '/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/train.txt',
            'batch_size': 64
        },
        include={
            'phase': caffe.TRAIN
        })
    # net.val_data,net.val_label=caffe.layers.HDF5Data(
    #     ntop=2,
    #     name='fk-val',
    #     hdf5_data_param={
    #         'source': './val.txt',
    #         'batch_size': 100
    #     },
    #     include={
    #         'phase': caffe.TEST
    #     })
    net.ip1 = caffe.layers.InnerProduct(
        net.train_data,
        param=froozen_param,  # a list of dicts becomes the repeated param blocks, i.e. param=[...]
        num_output=100,
        weight_filler=dict(type='xavier'),
        bias_filler=dict(type='constant'))
    # Without in_place (or with in_place=False) the layer's bottom is ip1 and its top is relu1;
    # with in_place=True both bottom and top are ip1.
    net.relu1 = caffe.layers.ReLU(net.ip1, in_place=True)
    net.ip2 = caffe.layers.InnerProduct(
        net.ip1,
        param=froozen_param,
        num_output=30,
        weight_filler=dict(type='xavier'),
        bias_filler=dict(type='constant'))
    net.loss = caffe.layers.EuclideanLoss(net.ip2, net.train_label)
    return net.to_proto()

with open('fk_train.prototxt', 'w') as f:
    f.write(str(fk_layer()))
# with open('examples/mnist/lenet_auto_test.prototxt', 'w') as f:
#     f.write(str(lenet('examples/mnist/mnist_test_lmdb', 100)))
After the manual edits, the resulting train_val.prototxt looks like this:
name: "FK"
layer {
  name: "fk-data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/train.txt"
    batch_size: 64
  }
}
layer {
  name: "fk-val"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/val.txt"
    batch_size: 100
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
  inner_product_param {
    num_output: 100
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
  inner_product_param {
    num_output: 30  # 30 = the (x, y) coordinates of 15 keypoints
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
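The hand edit that turns the generated fk_train.prototxt into a train_val file can also be scripted as plain text processing. The sketch below is my own and rests on two assumptions: that the generated file names its blobs "train_data" and "train_label" (after the NetSpec attributes), and that the TEST-phase data layer may be placed before the TRAIN-phase one. `make_train_val` and the sample text are hypothetical; the result should still be compared against the file shown above before training with it.

```python
VAL_LAYER_TEMPLATE = '''layer {
  name: "fk-val"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "%(source)s"
    batch_size: %(batch_size)d
  }
}
'''


def make_train_val(train_text, val_source, val_batch_size=100, net_name='FK'):
    """Rename the training blobs and prepend a TEST-phase HDF5Data layer."""
    # Both phases must expose the same blob names to the layers that follow.
    text = train_text.replace('"train_data"', '"data"')
    text = text.replace('"train_label"', '"label"')
    val_layer = VAL_LAYER_TEMPLATE % {'source': val_source,
                                      'batch_size': val_batch_size}
    return 'name: "%s"\n' % net_name + val_layer + text


# Shortened, hypothetical stand-in for the generated fk_train.prototxt.
sample_train = '''layer {
  name: "fk-data"
  type: "HDF5Data"
  top: "train_data"
  top: "train_label"
  include {
    phase: TRAIN
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "train_data"
  top: "ip1"
}
'''
train_val = make_train_val(sample_train, './val.txt')
```

String replacement is crude but keeps the sketch free of any dependency on pycaffe; it only works as long as no other field happens to contain the quoted names "train_data" or "train_label".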
step3: write_solver.py generates solver.prototxt:
# -*- coding:utf-8 -*-
__author__ = 'xuy'
import caffe  # import the caffe package

def write_solver():
    my_project_root = "/home/xuy/桌面/code/python/caffe_code/kaggle_face_detection/"  # project directory
    solver_param = caffe.proto.caffe_pb2.SolverParameter()  # holds the solver settings
    solver_file = my_project_root + 'solver.prototxt'  # where the solver file is saved
    solver_param.net = my_project_root + 'fk_train_val.prototxt'  # path of the network definition
    solver_param.test_iter.append(1)  # number of test iterations per test pass
    solver_param.test_interval = 500  # run a test pass every test_interval training iterations
    solver_param.base_lr = 0.001  # base learning rate
    solver_param.momentum = 0.9  # momentum
    solver_param.weight_decay = 0.0005  # weight decay
    solver_param.lr_policy = 'fixed'  # learning rate policy
    # gamma and power are not used by the 'fixed' policy; they only matter for policies such as 'inv'
    solver_param.gamma = 0.0001
    solver_param.power = 0.75
    solver_param.display = 100  # print the results every display iterations
    solver_param.max_iter = 10000  # maximum number of iterations
    solver_param.snapshot = 5000  # save a snapshot every snapshot iterations
    # solver_param.snapshot_format = 0  # snapshot format: 0 means HDF5, 1 means BINARYPROTO
    solver_param.snapshot_prefix = 'caffemodel/fk'  # snapshot prefix; the caffemodel/ directory must already exist
    solver_param.solver_mode = caffe.proto.caffe_pb2.SolverParameter.GPU  # solver mode

    with open(solver_file, 'w') as f:
        f.write(str(solver_param))

if __name__ == '__main__':
    write_solver()
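The values of test_iter and max_iter are tied to the batch sizes in the network definition. As a small sketch of that arithmetic (the helper names are mine, and the training-set size of 2000 is only a placeholder, not the real number of samples left after dropna):

```python
import math


def required_test_iter(n_val_samples, test_batch_size):
    """Number of test batches needed to see every validation sample once."""
    return int(math.ceil(float(n_val_samples) / test_batch_size))


def epochs_covered(max_iter, train_batch_size, n_train_samples):
    """How many passes over the training set max_iter iterations amount to."""
    return float(max_iter) * train_batch_size / n_train_samples


# 100 validation samples with a TEST batch_size of 100: one batch is enough.
test_iter = required_test_iter(100, 100)

# Placeholder training-set size; substitute the real count from train.hd5.
epochs = epochs_covered(10000, 64, 2000)
```

With the 100 held-out images and a TEST batch size of 100, test_iter = 1 evaluates the whole validation set in each test pass.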
This generates the following solver.prototxt file:
net: "/home/xuy/\346\241\214\351\235\242/code/python/caffe_code/kaggle_face_detection/fk_train_val.prototxt"
test_iter: 1
test_interval: 500
base_lr: 0.0010000000475
display: 100
max_iter: 10000
lr_policy: "fixed"
gamma: 9.99999974738e-05
power: 0.75
momentum: 0.899999976158
weight_decay: 0.000500000023749
snapshot: 5000
snapshot_prefix: "caffemodel/fk"
solver_mode: GPU
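Values such as 0.0010000000475 in the generated file are not a mistake. The solver fields are declared as 32-bit floats in caffe.proto, so 0.001 is rounded to the nearest float32 before it is written out. The sketch below reproduces this with the standard library only; that the digits were produced by Python 2 style formatting with 12 significant digits is my assumption, based on the output above.

```python
import struct


def as_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack('f', struct.pack('f', value))[0]


def format_12_digits(value):
    """Format with 12 significant digits, like str(float) in Python 2."""
    return '%.12g' % value


settings = [('base_lr', 0.001), ('momentum', 0.9),
            ('weight_decay', 0.0005), ('gamma', 0.0001)]
formatted = dict((name, format_12_digits(as_float32(value)))
                 for name, value in settings)

for name, _ in settings:
    print('%s: %s' % (name, formatted[name]))
```

The rounding error is on the order of 1e-8 relative to the value, which has no practical effect on training.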
step4: train with the command-line tool that ships with Caffe:
/home/xuy/caffe/build/tools/caffe train --solver=solver.prototxt