A Simple and Fast Implementation of Faster R-CNN: Translated README (for easier reading)

yumin1997

Article link: https://blog.csdn.net/qq_42946328/article/details/107867674

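This document is a translation of the README.MD from the A Simple and Fast Implementation of Faster R-CNN project, written mainly so that I can understand the project more directly. Corrections to any mistranslations are welcome 😁 (it will be taken down on request if it infringes).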

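1. Introduction

[Update:] I've further simplified the code to PyTorch 1.5 and torchvision 0.6, and replaced the customized ops roipool and nms with the ones from torchvision. If you want the old version of the code, please switch to branch v1.0.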
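
To make the nms op mentioned above concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression. It only illustrates what such an op computes and is not the project's or torchvision's code; the function names and the (x1, y1, x2, y2) box layout are assumptions made for this example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In real code you would call the library op on tensors instead; this version is quadratic in the number of boxes and is only meant for reading.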

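This project is a simplified Faster R-CNN implementation based on chainercv and other projects. I hope it can serve as starter code for those who want to know the details of Faster R-CNN. It aims to:

Simplify the code (simple is better than complex)
Make the code more straightforward (flat is better than nested)
Match the performance reported in the original paper (speed counts and mAP matters)

It has the following features:

It can be run as pure Python code; no build step is needed.
It is a minimal implementation in around 2000 lines of effective code, with plenty of comments and instructions (thanks to chainercv's excellent documentation)
It achieves higher mAP than the original implementation (0.712 vs. 0.699)
Its speed is comparable with other implementations (6 fps for training and 14 fps for testing on a TITAN Xp)
It is memory-efficient (about 3 GB for vgg16)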

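2. Performance

2.1 mAP

VGG16 trained on trainval and tested on the test split.

Note: training shows a lot of randomness; you may need a bit of luck and more training time to reach the highest mAP. However, it should be easy to surpass the lower bound.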

Implementation	mAP
Original paper	0.699
Trained with the caffe-pretrained model	0.700-0.712
Trained with the torchvision-pretrained model	0.685-0.701
Model converted from chainercv (reported 0.706)	0.7053
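
For readers unfamiliar with the metric, the sketch below shows how an mAP figure like the ones in the table can be computed from per-class precision-recall curves. It assumes the 11-point interpolated AP of the PASCAL VOC 2007 devkit; the README text does not say which AP variant was used, so treat that choice, and the function names, as assumptions.

```python
def voc07_average_precision(recalls, precisions):
    """11-point interpolated AP in the style of the PASCAL VOC 2007 devkit.

    recalls / precisions: equal-length lists describing one class's
    precision-recall curve.
    """
    ap = 0.0
    for k in range(11):
        t = k / 10
        # Highest precision reached at any recall >= t (0 if never reached).
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += max(candidates, default=0.0) / 11
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)
```

For example, `mean_average_precision({"cat": 0.5, "dog": 1.0})` averages the two hypothetical class scores.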

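2.2 Speed

Implementation	GPU	Inference	Training
Original paper	K40	5 fps	NA
This [1]	TITAN Xp	14-15 fps	6 fps
pytorch-faster-rcnn	TITAN Xp	15-17 fps	6 fps

[1]: Make sure cupy is installed correctly and only one program is running on the GPU. Training speed is sensitive to the state of your GPU. See Troubleshooting for more information. It is also slow at the start of the program, since it needs time to warm up.

It could be made faster by removing visualization, logging, loss averaging, and so on.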
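
Throughput numbers like these depend on how they are timed. Below is a minimal sketch of a timing helper that runs a few untimed warm-up iterations first, in line with the warm-up remark above. `measure_fps`, `step`, and the iteration counts are hypothetical and not part of the project. For GPU code you would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the clock, because CUDA kernels run asynchronously.

```python
import time


def measure_fps(step, n_iters=50, n_warmup=5):
    """Call `step()` repeatedly and return timed iterations per second.

    The first `n_warmup` calls are executed but not timed, since early
    iterations are typically slower.
    """
    for _ in range(n_warmup):
        step()
    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed
```

Here `step` would be any zero-argument callable, such as a function that runs one forward pass on a fixed image.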

3. Install dependencies

Here is an example of creating an environment from scratch with anaconda:

# create conda env
conda create --name simp python=3.7
conda activate simp
# install pytorch
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

# install other dependencies
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet

# start visdom
nohup python -m visdom.server &

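If you don't use anaconda, then:

Install PyTorch with GPU support (the code is GPU-only); refer to the official website
Install the other dependencies: pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet
Start visdom for visualization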

nohup python -m visdom.server &
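
After installing, a quick check that the packages are importable can save a failed training run. The sketch below uses only the standard library. The module list is an assumption derived from the pip command above, since import names can differ from install names (scikit-image is imported as `skimage`).

```python
import importlib.util

# Import names are assumptions: a pip package's import name can differ from
# its install name (scikit-image is imported as `skimage`).
REQUIRED_MODULES = ["torch", "torchvision", "visdom", "skimage", "tqdm",
                    "fire", "ipdb", "matplotlib", "torchnet"]


def missing_modules(names):
    """Return the subset of `names` that this interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be located; it does not verify versions or that PyTorch can see a GPU.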

4. Demo

Download the pretrained model from Google Drive or Baidu Netdisk (passwd: scxn)

See demo.ipynb for more details.

5. Train

5.1 Prepare data

Pascal VOC2007

Download the training, validation, and test data, and VOCdevkit

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

Extract all of these tars into one directory named VOCdevkit

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar

It should have this basic structure

$VOCdevkit/                           # development kit
$VOCdevkit/VOCcode/                   # VOC utility code
$VOCdevkit/VOC2007                    # image sets, annotations, etc.
# ... and several other directories ...
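
Before training, it can help to confirm that the extracted data looks as expected. This is a minimal sketch, assuming the standard PASCAL VOC 2007 sub-directories (Annotations, ImageSets, JPEGImages) inside VOC2007; `check_voc_layout` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# Sub-directories of VOC2007 in the standard PASCAL VOC 2007 release.
EXPECTED_SUBDIRS = ["Annotations", "ImageSets", "JPEGImages"]


def check_voc_layout(voc_data_dir):
    """Return the expected sub-directories that are missing under `voc_data_dir`."""
    root = Path(voc_data_dir)
    return [d for d in EXPECTED_SUBDIRS if not (root / d).is_dir()]
```

An empty list means all three directories were found; the argument is the VOC2007 directory itself, not its VOCdevkit parent.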

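Modify the voc_data_dir cfg item in utils/config.py, or pass it to the program with an argument such as --voc-data-dir=/path/to/VOCdevkit/VOC2007/.

5.2 [Optional] Prepare caffe-pretrained vgg16

If you want to use the caffe-pretrained model as the initial weights, you can run the command below to get vgg16 weights converted from caffe, which is what the original paper uses.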

python misc/convert_caffe_pretrain.py

This script downloads the pretrained model and converts it to a format compatible with torchvision. If you are in China and cannot download the pretrained model, you may refer to the link provided in the original README.

Then you can specify where the caffe-pretrained model vgg16_caffe.pth is stored by setting caffe_pretrain_path in utils/config.py. The default path is fine.

If you want to use the pretrained model from torchvision, you may skip this step.
NOTE: the caffe-pretrained model has shown slightly better performance.

NOTE: the caffe model requires images in BGR with values in 0-255, while the torchvision model requires images in RGB with values in 0-1. See data/dataset.py for more details.
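
The difference between the two input conventions can be sketched per pixel, without any dependencies. The torchvision mean and std are the usual ImageNet statistics; the caffe BGR means are the values used in py-faster-rcnn and are an assumption here, so check data/dataset.py for what the project actually uses. Real code would vectorize this over whole image arrays.

```python
# ImageNet statistics used with torchvision's pretrained models (RGB, 0-1).
TORCHVISION_MEAN = (0.485, 0.456, 0.406)
TORCHVISION_STD = (0.229, 0.224, 0.225)

# Assumed per-channel BGR means on the 0-255 scale (values from
# py-faster-rcnn); check data/dataset.py for what the project really uses.
CAFFE_MEAN_BGR = (102.9801, 115.9465, 122.7717)


def torchvision_pixel(rgb):
    """Normalize one RGB pixel whose components are in [0, 1]."""
    return tuple((c - m) / s
                 for c, m, s in zip(rgb, TORCHVISION_MEAN, TORCHVISION_STD))


def caffe_pixel(rgb, mean_bgr=CAFFE_MEAN_BGR):
    """Convert one RGB pixel in [0, 1] to mean-subtracted BGR on 0-255."""
    r, g, b = rgb
    bgr_255 = (b * 255.0, g * 255.0, r * 255.0)  # reorder channels, rescale
    return tuple(c - m for c, m in zip(bgr_255, mean_bgr))
```

Feeding an image prepared one way into a model pretrained the other way does not raise an error, which is why the convention is worth checking explicitly.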

5.3 Begin training

python train.py train --env='fasterrcnn' --plot-every=100

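You may refer to utils/config.py for more arguments.

Some key arguments:

--caffe-pretrain=False: use the pretrained model from caffe or from torchvision (default: torchvision)
--plot-every=n: visualize predictions, losses, etc. every n batches
--env: the visdom env used for visualization
--voc_data_dir: where the VOC data is stored
--use-drop: use dropout in the RoI head; default is False
--use-Adam: use Adam instead of the default SGD. (You need to set a very low lr for Adam.)
--load-path: path to a pretrained model; default is None. If specified, the model is loaded.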
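
The arguments listed above override defaults kept in utils/config.py. The sketch below shows one way such overriding can work. It is not the project's code: the `DEFAULTS` dict, `build_config`, and the underscore spelling of the keys are hypothetical, and only options whose defaults are stated in the list are included.

```python
# Hypothetical defaults, limited to options whose defaults are stated above.
DEFAULTS = {
    "caffe_pretrain": False,
    "use_drop": False,
    "use_adam": False,   # False means the default SGD optimizer
    "load_path": None,
}


def build_config(**overrides):
    """Return the defaults updated with `overrides`, rejecting unknown keys."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(unknown))
    return {**DEFAULTS, **overrides}
```

Rejecting unknown keys means a mistyped flag fails immediately instead of silently training with the default value.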

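You may open a browser, visit http://<ip>:8097, and see a visualization of the training process like the one below:

[Figure: visdom visualization of the training process]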

Troubleshooting

dataloader: received 0 items of ancdata

See the discussion; this has already been fixed in train.py, so I think you will not run into this problem.
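
This error is commonly associated with hitting the per-process limit on open file descriptors while data-loading workers pass tensors between processes. A minimal sketch of one common workaround, raising the soft limit at program start, is shown below. Whether train.py does exactly this is an assumption, and the 20480 target is illustrative. The `resource` module is Unix-only.

```python
import resource


def raise_open_file_limit(target=20480):
    """Raise the soft limit on open file descriptors towards `target`.

    Returns the soft limit in effect afterwards. Unix-only.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # the soft limit may not exceed the hard one
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough; never lower it
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        return soft  # the OS refused the change; keep the current limit
    return target
```

Calling it once before the data loader is created is enough, since the limit applies to the whole process.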
Windows support

I don't have a Windows machine with a GPU to debug and test it. If anyone can make a pull request and test it, that would be welcome.

Acknowledgement

This work builds on many excellent works, which include:

Yusuke Niitani’s ChainerCV (mainly)
Ruotian Luo’s pytorch-faster-rcnn, which is based on Xinlei Chen’s tf-faster-rcnn
faster-rcnn.pytorch by Jianwei Yang and Jiasen Lu. It mainly refers to longcw’s faster_rcnn_pytorch
All of the above repositories have referred to py-faster-rcnn by Ross Girshick and Sean Bell, either directly or indirectly.
