下载程序,
git clone https://github.com/Orpine/py-R-FCN.git
打开py-R-FCN,下载caffe
git clone https://github.com/Microsoft/caffe.git
编译Cython模块
cd lib
make
编译caffe和pycaffe
cd caffe
cp Makefile.config.example MAkefile.config
然后配置Makefile.config文件
make -j8
make pycaffe
由于py-faster-rcnn不支持多个训练集,我们创造一个新的文件夹叫做VOC0712,把VOC2007和VOC2012里的JPEGImage和Annonation融合到一个单独的文件夹JPEGImage和Annonation里,生成新的ImageSets文件夹:
下载在ImageNet上预训练好的模型,放到./data/imagenet_models里,如下图所示:
下面开始用py-rfcn来训练自己的数据集:(我的数据集是标准pascal voc数据集,名字叫做VOC5000)
首先修改网络模型:
1.修改/py-R-FCN/models/pascal_voc/ResNet-50/rfcn_end2end/class-aware/train_ohem.prototxt
name: "ResNet-50"
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 2" #改为你的数据集的类别数+1
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 2"#改为你的数据集的类别数+1
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #2*(7^2) cls_num*(score_maps_size^2)(类别数+1)*49
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_bbox"
name: "rfcn_bbox"
type: "Convolution"
convolution_param {
num_output: 392 #8*(7^2) cls_num*(score_maps_size^2)(类别数+1) x 49 x 4
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #类别数+1
group_size: 7
}
}
layer {
bottom: "rfcn_bbox"
bottom: "rois"
top: "psroipooled_loc_rois"
name: "psroipooled_loc_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 8#(类别数+1) x 4
group_size: 7
}
}
2.修改/py-R-FCN/models/pascal_voc/ResNet-50/rfcn_end2end/class-aware/test.prototxt
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #21*(7^2) cls_num*(score_maps_size^2)(类别数+1) x 49
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_bbox"
name: "rfcn_bbox"
type: "Convolution"
convolution_param {
num_output: 392 #8*(7^2) cls_num*(score_maps_size^2)(类别数+1)*49*4
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #(类别数+1)
group_size: 7
}
}
layer {
bottom: "rfcn_bbox"
bottom: "rois"
top: "psroipooled_loc_rois"
name: "psroipooled_loc_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 8 #(类别数+1)*4
group_size: 7
}
}
layer {
name: "cls_prob_reshape"
type: "Reshape"
bottom: "cls_prob_pre"
top: "cls_prob"
reshape_param {
shape {
dim: -1
dim: 2 #(类别数+1)
}
}
}
layer {
name: "bbox_pred_reshape"
type: "Reshape"
bottom: "bbox_pred_pre"
top: "bbox_pred"
reshape_param {
shape {
dim: -1
dim: 8 #(类别数+1)*4
}
}
}
3.修改/py-R-FCN/lib/datasets/pascal_voc.py
class pascal_voc(imdb):
def __init__(self, image_set, year, devkit_path=None):
imdb.__init__(self, 'voc_' + year + '_' + image_set)
self._year = year
self._image_set = image_set
self._devkit_path = self._get_default_path() if devkit_path is None \
else devkit_path
self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
self._classes = ('__background__', # always index 0
'aeroplane')
self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
self._image_ext = '.jpg'
修改self._classes为你的类别加背景。
4./py-R-FCN/lib/datasets/factory.py修改
for year in ['2007', '2012','2001','2002','2006','5000']:
for split in ['train', 'val', 'trainval', 'test']:
name = 'voc_{}_{}'.format(year, split)
__sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
我的数据集叫:VOC5000,所以把5000加到年份当中。
5/py-R-FCN/experiments/scripts/rfcn_end2end_ohem.sh修改
case $DATASET in
pascal_voc)
TRAIN_IMDB="voc_5000_trainval"
TEST_IMDB="voc_5000_test"
PT_DIR="pascal_voc"
ITERS=4000
;;
把训练数据集和测试数据集改为你的数据集,迭代次数改为4000。
开始训练:./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-50 pascal_voc
迭代4000次,取得了81.2%的精度。