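This article explains how to train Faster-RCNN on your own data in end2end mode. For environment setup, installation, and troubleshooting, see the previous two articles.
1. Prepare the dataset
Following the README in the project's GitHub repository, download and extract VOCdevkit. After extraction, the directory structure looks like this: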
VOCdevkit
└── VOC2007
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    └── SegmentationObject
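To avoid problems caused by changed paths, we simply replace the Annotations, ImageSets, and JPEGImages directories under VOC2007 with our own dataset. The detailed requirements of the VOC format and how to write the meta files under ImageSets are not covered in this article.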
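Before training, it can help to confirm that the replaced directories are consistent with each other. The following is a minimal sketch, not part of py-faster-rcnn: the function name find_missing_files is hypothetical, and it assumes the usual VOC2007 layout in which image IDs are listed in ImageSets/Main/trainval.txt, annotations are .xml files, and images are .jpg files.

```python
import os


def find_missing_files(voc_root, image_set="trainval"):
    """Return image IDs from the split file that lack an annotation or an image.

    voc_root is the VOC2007 directory. The ImageSets/Main/<image_set>.txt
    location and the .xml/.jpg extensions are assumptions based on the
    standard VOC2007 layout.
    """
    split_file = os.path.join(voc_root, "ImageSets", "Main", image_set + ".txt")
    with open(split_file) as f:
        image_ids = [line.strip() for line in f if line.strip()]

    missing = []
    for image_id in image_ids:
        xml_path = os.path.join(voc_root, "Annotations", image_id + ".xml")
        jpg_path = os.path.join(voc_root, "JPEGImages", image_id + ".jpg")
        # An ID is reported if either of its two files is absent.
        if not (os.path.isfile(xml_path) and os.path.isfile(jpg_path)):
            missing.append(image_id)
    return missing
```

An empty list means every ID in the split file has both an annotation and an image. Any ID that is returned should be fixed or removed from the split file before training.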
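2. Modify the network definition files
This step changes the layer parameters in train.prototxt and test.prototxt that depend on the number of detection classes. Assuming you will use the Faster-RCNN-VGG16 model, the following entries need to be changed. Note that VOC has 21 classes in total: 20 foreground classes and 1 background class. Your own data must therefore count the background class as well. For example, to detect cat and dog with Faster-RCNN, the total is 2+1=3 classes. Below, n is the number of foreground classes in your dataset.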
$py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt
line 11: param_str: "'num_classes': 21", change 21 to n+1
line 530: param_str: "'num_classes': 21", change 21 to n+1
line 620: num_output: 21, change 21 to n+1
line 643: num_output: 84, change 84 to 4*(n+1)
$py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt
line 567: num_output: 21, change 21 to n+1
line 592: num_output: 84, change 84 to 4*(n+1)
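The substitutions listed above can be sketched as a small text rewrite. This is only an illustration: adapt_prototxt is a hypothetical helper, and it assumes that 21 and 84 occur as num_output values only in the layers listed above, so the result should be compared against the original file before it is used.

```python
import re


def adapt_prototxt(text, n):
    """Rewrite the VOC class counts in prototxt text for n foreground classes."""
    num_classes = n + 1  # foreground classes plus the background class
    # 21 class scores become n+1; 84 box outputs become 4*(n+1).
    replacements = {"21": str(num_classes), "84": str(4 * num_classes)}

    # 'num_classes': 21 inside param_str of the Python layers.
    text = re.sub(
        r"('num_classes':\s*)21\b",
        lambda m: m.group(1) + str(num_classes),
        text,
    )
    # num_output: 21 and num_output: 84, handled in a single pass so that
    # a freshly written value is never rewritten a second time.
    text = re.sub(
        r"(num_output:\s*)(21|84)\b",
        lambda m: m.group(1) + replacements[m.group(2)],
        text,
    )
    return text
```

For the cat and dog example, adapt_prototxt(text, 2) writes 3 wherever the class count appears and 12 for the box regression output.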
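3. Modify the dataset conversion code
Edit the class list in lib/datasets/pascal_voc.py. The original definition is: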
self._classes = ('__background__', # always index 0
                 'aeroplane', 'bicycle', 'bird', 'boat',
                 'bottle', 'bus', 'car', 'cat', 'chair',
                 'cow', 'diningtable', 'dog', 'horse',
                 'motorbike', 'person', 'pottedplant',
                 'sheep', 'sofa', 'train', 'tvmonitor')
Change it to:
self._classes = ('__background__', # always index 0
                 'yourclass1', 'yourclass2',...)
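The length of this tuple has to agree with the n+1 used in the prototxt files. A minimal sketch with the cat and dog example from step 2 follows. The helper build_class_index is hypothetical and only illustrates one way to map class names to indices, with the background class fixed at index 0.

```python
def build_class_index(foreground_classes):
    """Return the class tuple and a name-to-index mapping, background first."""
    classes = ('__background__',) + tuple(foreground_classes)
    class_to_ind = dict(zip(classes, range(len(classes))))
    return classes, class_to_ind


classes, class_to_ind = build_class_index(['cat', 'dog'])
num_classes = len(classes)        # n + 1, the value used for num_classes
num_bbox_outputs = 4 * num_classes  # the value used for the box regression layer
```

Here num_classes is 3 and num_bbox_outputs is 12, matching the values that replace 21 and 84 in the network definition.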
Problem
Traceback (most recent call last):
  File "./tools/train_net.py", line 104, in <module>
    imdb, roidb = combined_roidb(args.imdb_name)
  File "./tools/train_net.py", line 69, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "./tools/train_net.py", line 66, in get_roidb
    roidb = get_training_roidb(imdb)
  File "/home/***/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 119, in get_training_roidb
    imdb.append_flipped_images()
  File "/home/***/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 106, in append_flipped_images
    boxes = self.roidb[i]['boxes'].copy()
  File "/home/***/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 67, in roidb
    self._roidb = self.roidb_handler()
  File "/home/***/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 109, in gt_roidb
    for index in self.image_index]
  File "/home/***/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 193, in _load_pascal_annotation
    obj for obj in objs if int(obj.find('difficult').text) == 0]
AttributeError: 'NoneType' object has no attribute 'text'
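Solution: obj.find('difficult') returns None because the annotation XML files contain no <difficult> tag. If your dataset does not use difficult objects, comment out lines 190 to 197 of lib/datasets/pascal_voc.py: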
# if not self.config['use_diff']:
#     # Exclude the samples labeled as difficult
#     non_diff_objs = [
#         obj for obj in objs if int(obj.find('difficult').text) == 0]
#     # if len(non_diff_objs) != len(objs):
#     #     print 'Removed {} difficult objects'.format(
#     #         len(objs) - len(non_diff_objs))
#     objs = non_diff_objs
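As an alternative to commenting the block out, the filter could treat a missing <difficult> tag as "not difficult". The sketch below is standalone and is not the code from pascal_voc.py: the helpers is_difficult and non_difficult_objects are hypothetical names, and the XML is parsed with the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def is_difficult(obj):
    """Return True only if the object has a <difficult> tag with a non-zero value."""
    node = obj.find('difficult')
    if node is None:
        return False  # tag absent: treat the object as not difficult
    text = (node.text or '').strip()
    return text != '' and int(text) != 0


def non_difficult_objects(xml_text):
    """Return the <object> elements of an annotation that are not marked difficult."""
    root = ET.fromstring(xml_text)
    return [obj for obj in root.findall('object') if not is_difficult(obj)]
```

With a check of this kind, annotations that carry the tag are still filtered, while annotations without it no longer raise the AttributeError.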
4. Run the training script
Go to the root directory of py-faster-rcnn and run:
$ ./experiments/scripts/faster_rcnn_end2end.sh 1 VGG16 pascal_voc
Training then starts. The three arguments are the GPU id, the network name, and the dataset name.