This post records the process of getting py-faster-rcnn to run end to end.
Paper: Faster R-CNN
Code: py-faster-rcnn
Environment Setup
py-faster-rcnn depends on Caffe, so Caffe must be installed first. For the environment setup, see:
Ubuntu16.04 Caffe 安装步骤记录(超详尽)
If you work on a shared server where the dependencies are already in place and you have no sudo privileges (like me), you only need to build Caffe under your own user account.
Download and Install py-faster-rcnn
These steps basically follow the official README, but a few problems can come up along the way; I give a solution for each below.

- Download py-faster-rcnn:

```shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
```
- Build the Cython modules:

```shell
cd $FRCN_ROOT/lib
make
```
- Build Caffe and pycaffe:

```shell
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
#   http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```

This step often fails with an error like:
```
./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’
     cudnnStatus_t status = condition; \
                            ^
In file included from ./include/caffe/util/cudnn.hpp:5:0,
                 from ./include/caffe/util/device_alternate.hpp:40,
                 from ./include/caffe/common.hpp:19,
                 from ./include/caffe/data_reader.hpp:8,
                 from src/caffe/util/blocking_queue.cpp:4:
/usr/local/cuda-9.0/include/cudnn.h:991:1: note: declared here
 cudnnSetPooling2dDescriptor(cudnnPoolingDescriptor_t poolingDesc,
 ^
Makefile:563: recipe for target '.build_release/src/caffe/util/blocking_queue.o' failed
```

This is a cuDNN version mismatch: the Caffe snapshot bundled as caffe-fast-rcnn predates the cuDNN version installed alongside CUDA 9.0, so its cuDNN wrapper code no longer matches the current API. The usual fix is to replace cudnn.hpp and the cudnn_* layer files with the versions from an up-to-date BVLC Caffe checkout, then rebuild.
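A minimal sketch of that workaround, assuming a recent BVLC Caffe checkout at `~/caffe` (both paths below are assumptions; adjust them to your layout):

```python
import glob
import os
import shutil

# Assumptions: a recent BVLC Caffe checkout whose cuDNN wrappers match the
# installed cuDNN, and the py-faster-rcnn Caffe submodule next to it.
BVLC_CAFFE = os.path.expanduser('~/caffe')
FRCN_CAFFE = os.path.expanduser('~/py-faster-rcnn/caffe-fast-rcnn')


def sync_cudnn_files(src_root, dst_root):
    """Overwrite the outdated cuDNN wrapper files with up-to-date ones."""
    patterns = [
        'include/caffe/util/cudnn.hpp',
        'include/caffe/layers/cudnn_*',
        'src/caffe/layers/cudnn_*',
    ]
    copied = []
    for pattern in patterns:
        for src in glob.glob(os.path.join(src_root, pattern)):
            rel = os.path.relpath(src, src_root)
            shutil.copy(src, os.path.join(dst_root, rel))
            copied.append(rel)
    return copied


if __name__ == '__main__':
    for rel in sync_cudnn_files(BVLC_CAFFE, FRCN_CAFFE):
        print('updated', rel)
```

After the copy, run `make clean && make -j8 && make pycaffe` again in caffe-fast-rcnn.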
- Download the pre-trained Faster R-CNN models:

```shell
cd $FRCN_ROOT
./data/scripts/fetch_faster_rcnn_models.sh
```

Probably because the server has no proxy, running the download script directly fails.
A workable alternative is to open fetch_faster_rcnn_models.sh, copy the URL it contains, and download the file on a machine that does have a proxy: click here. Or see the issue: faster_rcnn_models.tgz Not found on server!
Running the Demo

```shell
cd $FRCN_ROOT
./tools/demo.py
```
The server's terminal has no GUI, while demo.py uses matplotlib for plotting. I followed the fixes in Linux终端没有GUI,如何使用matplotlib绘图: its first solution (adding code before `from matplotlib import pyplot`) did not work for me, for reasons I don't know, but the second one did.
Create the matplotlib config file ~/.config/matplotlib/matplotlibrc (where ~/.config/matplotlib/ is the directory holding matplotlibrc) and add this line:

```
backend : Agg
```
Then, before plt.show(), add a plt.savefig(...) call so the figure is written to disk instead; in demo.py this looks like:

```python
...
plt.savefig('/home/moxiao/code/python/demo_detection.png')
plt.show()
```
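For reference, the matplotlibrc line can also be expressed programmatically; this is essentially what the first solution attempts, and it shows what the backend switch amounts to (the output file name below is just an example):

```python
import matplotlib
matplotlib.use('Agg')  # select the non-GUI backend; must run before importing pyplot
import matplotlib.pyplot as plt

# Draw something trivial and write it to a file instead of opening a window
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title('headless test')
fig.savefig('headless_test.png')  # example output path
```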
Run demo.py again and it completes successfully:

```shell
cd $FRCN_ROOT
./tools/demo.py
```
Then copy the saved images over to a machine with a GUI to view them; the results look like this:
Training and Testing on VOC2007
Data Preparation
- Switch to the data directory and download the training, validation, and test data plus the VOCdevkit:

```shell
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```
- Extract all the tars into one directory, VOCdevkit:

```shell
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```

After extraction the layout should look like:

```
$VOCdevkit/                    # development kit
$VOCdevkit/VOCcode/            # VOC utility code
$VOCdevkit/VOC2007             # image sets, annotations, etc.
# ... and several other directories ...
```
- Create a symlink:

```shell
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Download the Pre-trained ImageNet Models

```shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```

As with the Faster R-CNN models above, this may fail to download directly; the same workaround described earlier applies.
Usage
Here I first tried the alternating optimization scheme proposed in the paper:

```shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
For example, to train VGG16 on GPU 6:

```shell
./experiments/scripts/faster_rcnn_alt_opt.sh 6 VGG16 pascal_voc
```
You may then hit this error:

```
Traceback (most recent call last):
  File "./tools/train_net.py", line 112, in <module>
    max_iters=args.max_iters)
  File "/home/kepricon/git/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 157, in train_net
    pretrained_model=pretrained_model)
  File "/home/kepricon/git/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 51, in __init__
    pb2.text_format.Merge(f.read(), self.solver_param)
AttributeError: 'module' object has no attribute 'text_format'
```
The solution (see the issue: protobuf module has no attribute 'text_format') is to add import google.protobuf.text_format to $FRCN_ROOT/lib/fast_rcnn/train.py.
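The underlying cause is generic Python behavior, not a protobuf bug: importing a package does not automatically bind its submodules as attributes, so pb2.text_format only exists after something imports it explicitly. A stdlib sketch of the same effect, with xml.dom.minidom standing in for google.protobuf.text_format:

```python
import sys
import xml  # importing the package alone does not necessarily
            # import its submodules; in a fresh interpreter,
            # xml.dom.minidom is absent until explicitly imported

import xml.dom.minidom  # the explicit submodule import, analogous to
                        # `import google.protobuf.text_format` in train.py

# Now the attribute chain resolves and the module is registered
print(hasattr(xml.dom, 'minidom'))       # True
print('xml.dom.minidom' in sys.modules)  # True
```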
Run it again:

```shell
./experiments/scripts/faster_rcnn_alt_opt.sh 6 VGG16 pascal_voc
```

and training starts successfully, with iteration logs containing the loss printed in the terminal.
One more problem shows up partway through the iterations: when the alternating schedule enters a new training stage and invokes new code, you get:
```
Solving...
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 236, in train_fast_rcnn
    max_iters=max_iters
  File "/home/scott/code/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 185, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 112, in train_model
    self.solver.step(1)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 155, in forward
    blobs = self._get_next_minibatch()
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 68, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 64, in get_minibatch
    num_classes)
  File "/home/scott/code/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 110, in _sample_rois
    fg_inds, size=fg_rois_per_this_image, replace=False
  File "mtrand.pyx", line 1176, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18822)
TypeError: 'numpy.float64' object cannot be interpreted as an index
```
The root cause is that recent NumPy versions no longer accept a float as an index or size. See TypeError: 'numpy.float64' object cannot be interpreted as an index for details; cast every float that is used as a slice bound, index, or size to int.
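A minimal sketch of the fix; the variable names mirror _sample_rois in lib/roi_data_layer/minibatch.py, and the int(...) cast is the only change needed:

```python
import numpy as np
import numpy.random as npr

fg_rois_per_image = np.round(0.25 * 128)  # a numpy.float64 (32.0), as in the original code
fg_inds = np.arange(50)                   # candidate foreground RoI indices

# Recent NumPy rejects a float `size`:
#   npr.choice(fg_inds, size=fg_rois_per_image, replace=False)  # TypeError

# Fix: cast to int before using the value as a size (or index)
fg_rois_per_this_image = int(min(fg_rois_per_image, fg_inds.size))
sampled = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False)
print(sampled.shape)  # (32,)
```

The same cast applies at every spot the traceback's issue thread points out.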
With that, everything runs end to end!
Test results after training on VOC2007:
To be continued…