This post records the process of getting py-faster-rcnn to run end to end.
Paper: Faster R-CNN
Code: py-faster-rcnn
Environment Setup
py-faster-rcnn depends on Caffe, so Caffe must be installed first. For the environment setup, see:
Ubuntu16.04 Caffe 安装步骤记录(超详尽)
If you work on a shared server where the dependencies are already in place and you have no sudo privileges (like me), you only need to build Caffe under your own user account.
Download and Install py-faster-rcnn
These steps basically follow the official README, but a few problems can come up along the way; I give a solution for each below.

- Download py-faster-rcnn:

```shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
```
- Build the Cython modules:

```shell
cd $FRCN_ROOT/lib
make
```
- Build Caffe and pycaffe:

```shell
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
#   http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```

This step often fails with an error like:
```
./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’
     cudnnStatus_t status = condition; \
                            ^
In file included from ./include/caffe/util/cudnn.hpp:5:0,
                 from ./include/caffe/util/device_alternate.hpp:40,
                 from ./include/caffe/common.hpp:19,
                 from ./include/caffe/data_reader.hpp:8,
                 from src/caffe/util/blocking_queue.cpp:4:
/usr/local/cuda-9.0/include/cudnn.h:991:1: note: declared here
 cudnnSetPooling2dDescriptor(cudnnPoolingDescriptor_t poolingDesc,
 ^
Makefile:563: recipe for target '.build_release/src/caffe/util/blocking_queue.o' failed
```

This is a cuDNN version mismatch: the Caffe snapshot bundled as caffe-fast-rcnn predates the cuDNN version installed alongside CUDA 9.0, so its cuDNN wrapper code no longer matches the current API. The usual fix is to replace cudnn.hpp and the cudnn_* layer files with the versions from an up-to-date BVLC Caffe checkout, then rebuild.
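A minimal sketch of that workaround, assuming a recent BVLC Caffe checkout at `~/caffe` (both paths below are assumptions; adjust them to your layout):

```python
import glob
import os
import shutil

# Assumptions: a recent BVLC Caffe checkout whose cuDNN wrappers match the
# installed cuDNN, and the py-faster-rcnn Caffe submodule next to it.
BVLC_CAFFE = os.path.expanduser('~/caffe')
FRCN_CAFFE = os.path.expanduser('~/py-faster-rcnn/caffe-fast-rcnn')


def sync_cudnn_files(src_root, dst_root):
    """Overwrite the outdated cuDNN wrapper files with up-to-date ones."""
    patterns = [
        'include/caffe/util/cudnn.hpp',
        'include/caffe/layers/cudnn_*',
        'src/caffe/layers/cudnn_*',
    ]
    copied = []
    for pattern in patterns:
        for src in glob.glob(os.path.join(src_root, pattern)):
            rel = os.path.relpath(src, src_root)
            shutil.copy(src, os.path.join(dst_root, rel))
            copied.append(rel)
    return copied


if __name__ == '__main__':
    for rel in sync_cudnn_files(BVLC_CAFFE, FRCN_CAFFE):
        print('updated', rel)
```

After the copy, run `make clean && make -j8 && make pycaffe` again in caffe-fast-rcnn.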
- Download the pre-trained Faster R-CNN models:

```shell
cd $FRCN_ROOT
./data/scripts/fetch_faster_rcnn_models.sh
```

Probably because the server has no proxy, running the download script directly fails.
A workable alternative is to open fetch_faster_rcnn_models.sh, copy the URL it contains, and download the file on a machine that does have a proxy: click here. Or see the issue: faster_rcnn_models.tgz Not found on server!
Running the Demo

```shell
cd $FRCN_ROOT
./tools/demo.py
```
The server's terminal has no GUI, while demo.py uses matplotlib for plotting. I followed the fixes in Linux终端没有GUI,如何使用matplotlib绘图: its first solution (adding code before `from matplotlib import pyplot`) did not work for me, for reasons I don't know, but the second one did.
Create the matplotlib config file ~/.config/matplotlib/matplotlibrc (where ~/.config/matplotlib/ is the directory holding matplotlibrc) and add this line:

```
backend : Agg
```
Then, before plt.show(), add a plt.savefig(...) call so the figure is written to disk instead; in demo.py this looks like:

```python
...
plt.savefig('/home/moxiao/code/python/demo_detection.png')
plt.show()
```
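For reference, the matplotlibrc line can also be expressed programmatically; this is essentially what the first solution attempts, and it shows what the backend switch amounts to (the output file name below is just an example):

```python
import matplotlib
matplotlib.use('Agg')  # select the non-GUI backend; must run before importing pyplot
import matplotlib.pyplot as plt

# Draw something trivial and write it to a file instead of opening a window
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title('headless test')
fig.savefig('headless_test.png')  # example output path
```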
Run demo.py again and it completes successfully:

```shell
cd $FRCN_ROOT
./tools/demo.py
```
Then copy the saved images over to a machine with a GUI to view them; the results look like this:
Training and Testing on VOC2007
Data Preparation
- Switch to the data directory and download the training, validation, and test data plus the VOCdevkit:

```shell
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```
- Extract all the tars into one directory, VOCdevkit:

```shell
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```

After extraction the layout should look like:

```
$VOCdevkit/                    # development kit
$VOCdevkit/VOCcode/            # VOC utility code
$VOCdevkit/VOC2007             # image sets, annotations, etc.
# ... and several other directories ...
```
- Create a symlink:

```shell
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Download the Pre-trained ImageNet Models

```shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```

As with the Faster R-CNN models above, this may fail to download directly; the same workaround described earlier applies.
Usage
Here I first tried the alternating optimization scheme proposed in the paper:

```shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
For example, to train VGG16 on GPU 6:

```shell
./experiments/scripts/faster_rcnn_alt_opt.sh 6 VGG16 pascal_voc
```
You may then hit this error:

```
Traceback (most recent call last):
  File "./tools/train_net.py", line 112, in <module>
    max_iters=args.max_iters)
  File "/home/kepricon/git/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 157, in train_net
    pretrained_model=pretrained_model)
  File "/home/kepricon/git/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 51, in __init__
    pb2.text_format.Merge(f.read(), self.solver_param)
AttributeError: 'module' object has no attribute 'text_format'
```
The solution (see the issue: protobuf module has no attribute 'text_format') is to add import google.protobuf.text_format to $FRCN_ROOT/lib/fast_rcnn/train.py.
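The underlying cause is generic Python behavior, not a protobuf bug: importing a package does not automatically bind its submodules as attributes, so pb2.text_format only exists after something imports it explicitly. A stdlib sketch of the same effect, with xml.dom.minidom standing in for google.protobuf.text_format:

```python
import sys
import xml  # importing the package alone does not necessarily
            # import its submodules; in a fresh interpreter,
            # xml.dom.minidom is absent until explicitly imported

import xml.dom.minidom  # the explicit submodule import, analogous to
                        # `import google.protobuf.text_format` in train.py

# Now the attribute chain resolves and the module is registered
print(hasattr(xml.dom, 'minidom'))       # True
print('xml.dom.minidom' in sys.modules)  # True
```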
Run it again:

```shell
./experiments/scripts/faster_rcnn_alt_opt.sh 6 VGG16 pascal_voc
```

and training starts successfully, with iteration logs containing the loss printed in the terminal.
One more problem shows up partway through the iterations: when the alternating schedule enters a new training stage and invokes new code, you get:
```
Solving...
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 236, in train_fast_rcnn
    max_iters=max_iters
  File "/home/scott/code/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 185, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 112, in train_model
    self.solver.step(1)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 155, in forward
    blobs = self._get_next_minibatch()
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 68, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 64, in get_minibatch
    num_classes)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 110, in _sample_rois
    fg_inds, size=fg_rois_per_this_image, replace=False
  File "mtrand.pyx", line 1176, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18822)
TypeError: 'numpy.float64' object cannot be interpreted as an index
```
The root cause is that recent NumPy versions no longer accept a float as an index or size. See TypeError: 'numpy.float64' object cannot be interpreted as an index for details; cast every float that is used as a slice bound, index, or size to int.
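A minimal sketch of the fix; the variable names mirror _sample_rois in lib/roi_data_layer/minibatch.py, and the int(...) cast is the only change needed:

```python
import numpy as np
import numpy.random as npr

fg_rois_per_image = np.round(0.25 * 128)  # a numpy.float64 (32.0), as in the original code
fg_inds = np.arange(50)                   # candidate foreground RoI indices

# Recent NumPy rejects a float `size`:
#   npr.choice(fg_inds, size=fg_rois_per_image, replace=False)  # TypeError

# Fix: cast to int before using the value as a size (or index)
fg_rois_per_this_image = int(min(fg_rois_per_image, fg_inds.size))
sampled = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False)
print(sampled.shape)  # (32,)
```

The same cast applies at every spot the traceback's issue thread points out.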
With that, everything runs end to end!
Test results after training on VOC2007:
To be continued…