MMDetection Documentation, Part 2 (translated from the original English documentation)

寂寞军刀

Original link: https://mmdetection.readthedocs.io/en/latest/GETTING_STARTED.html#


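II. Getting Started
This section provides basic tutorials for using MMDetection. For installation instructions, see the previous part.
(I) Inference with pretrained models
We provide a testing script to evaluate a whole dataset (COCO, PASCAL VOC, etc.), as well as some high-level APIs that make it easier to integrate with other projects.
1. Test a dataset
Supported: single-GPU testing, multi-GPU testing, and visualization of detection results.
You can use the following commands to test a dataset: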

# single-GPU testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
# multi-GPU testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
Optional arguments:
RESULT_FILE: filename of the output results in pickle format. If not specified, the results are not saved to a file.
EVAL_METRICS: the items to be evaluated on the results. Allowed values are: proposal_fast, proposal, bbox, segm, keypoints.
--show: if specified, detection results are plotted on the images and shown in a new window. This only applies to single-GPU testing. Make sure a GUI is available in your environment, otherwise you may encounter an error like: cannot connect to X server.
Examples:
Assume that you have already downloaded the checkpoints to the checkpoints/ directory.
Test Faster R-CNN and show the results:
python tools/test.py configs/faster_rcnn_r50_fpn_1x.py checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth --show
Test Mask R-CNN and evaluate the bbox and mask AP:
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth --out results.pkl --eval bbox segm
Test Mask R-CNN with 8 GPUs and evaluate the bbox and mask AP:
./tools/dist_test.sh configs/mask_rcnn_r50_fpn_1x.py checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth 8 --out results.pkl --eval bbox segm
2. Webcam demo
python demo/webcam_demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${GPU_ID}] [--camera-id ${CAMERA-ID}] [--score-thr ${SCORE_THR}]
Example:
python demo/webcam_demo.py configs/faster_rcnn_r50_fpn_1x.py checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth

(II) High-level APIs for testing images
1. Synchronous interface
Here is an example of building the model and testing given images:
from mmdet.apis import init_detector, inference_detector, show_result
import mmcv

config_file = 'configs/faster_rcnn_r50_fpn_1x.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth'

# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a single image and show the results
img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
result = inference_detector(model, img)

# visualize the results in a new window
show_result(img, result, model.CLASSES)

# or save the visualization results to image files
show_result(img, result, model.CLASSES, out_file='result.jpg')

# test a video and show the results
video = mmcv.VideoReader('video.mp4')
for frame in video:
    result = inference_detector(model, frame)
    show_result(frame, result, model.CLASSES, wait_time=1)
A notebook version of this example can be found in demo/inference_demo.ipynb.
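The value returned by inference_detector can be post-processed before it is visualized, for instance to keep only confident detections. The sketch below rests on an assumption that this section does not state: for a bbox-only detector the result is taken to be a list with one entry per class, each entry holding rows of [x1, y1, x2, y2, score]. Plain Python lists stand in for the arrays, and the helper name filter_detections is hypothetical.

```python
def filter_detections(result, class_names, score_thr=0.3):
    """Return (class_name, box, score) tuples whose score is >= score_thr.

    Assumption: `result` has one entry per class, and each entry is a
    sequence of rows [x1, y1, x2, y2, score].
    """
    kept = []
    for class_name, rows in zip(class_names, result):
        for row in rows:
            *box, score = row
            if score >= score_thr:
                kept.append((class_name, tuple(box), float(score)))
    # Highest-scoring detections first
    kept.sort(key=lambda item: item[2], reverse=True)
    return kept


# Toy stand-in for a two-class result (hypothetical values)
toy_result = [
    [[10, 20, 50, 80, 0.92], [15, 25, 40, 60, 0.10]],  # class 'person'
    [[100, 120, 200, 220, 0.55]],                       # class 'car'
]
detections = filter_detections(toy_result, ('person', 'car'), score_thr=0.3)
for name, box, score in detections:
    print(name, box, score)
```

With a real model, model.CLASSES would presumably take the place of the hand-written class tuple.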
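2. Asynchronous interface (supported for Python 3.7+)
The asynchronous interface keeps GPU-bound inference code from blocking the CPU, which gives better CPU/GPU utilization in single-threaded applications. Inference can run concurrently either across different input samples or across different models of an inference pipeline.
See tests/async_benchmark.py to compare the speed of the synchronous and asynchronous interfaces. Example code: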
import asyncio
import torch
from mmdet.apis import init_detector, async_inference_detector, show_result
from mmdet.utils.contextmanagers import concurrent

async def main():
    config_file = 'configs/faster_rcnn_r50_fpn_1x.py'
    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth'
    device = 'cuda:0'
    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)

    # queue is used for concurrent inference of multiple images
    streamqueue = asyncio.Queue()
    # queue size defines concurrency level
    streamqueue_size = 3

    for _ in range(streamqueue_size):
        streamqueue.put_nowait(torch.cuda.Stream(device=device))

    # test a single image and show the results
    img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once

    async with concurrent(streamqueue):
        result = await async_inference_detector(model, img)

    # visualize the results in a new window
    show_result(img, result, model.CLASSES)
    # or save the visualization results to image files
    show_result(img, result, model.CLASSES, out_file='result.jpg')

asyncio.run(main())
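To see why the size of the stream queue sets the concurrency level, here is a minimal sketch that needs neither a GPU nor MMDetection. It assumes that each in-flight inference must first take one item from the queue and return it afterwards. The names fake_inference and run_all are hypothetical, and asyncio.sleep stands in for the GPU-bound call.

```python
import asyncio


async def fake_inference(image_id, delay=0.01):
    """Stand-in for a GPU-bound inference call (hypothetical)."""
    await asyncio.sleep(delay)
    return 'result-for-{}'.format(image_id)


async def run_all(image_ids, concurrency=3):
    # The queue holds `concurrency` tokens that play the role of CUDA streams
    tokens = asyncio.Queue()
    for token in range(concurrency):
        tokens.put_nowait(token)

    stats = {'active': 0, 'peak': 0}

    async def worker(image_id):
        token = await tokens.get()  # wait until a slot is free
        stats['active'] += 1
        stats['peak'] = max(stats['peak'], stats['active'])
        try:
            return await fake_inference(image_id)
        finally:
            stats['active'] -= 1
            tokens.put_nowait(token)  # hand the slot back

    # gather returns the results in the same order as the inputs
    results = await asyncio.gather(*(worker(i) for i in image_ids))
    return results, stats['peak']


results, peak = asyncio.run(run_all(range(8), concurrency=3))
print(len(results), 'results, peak concurrency', peak)
```

Eight requests are submitted at once, but the number running at any moment never exceeds the number of tokens in the queue.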

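(III) Train a model
MMDetection uses MMDistributedDataParallel for distributed training and MMDataParallel for non-distributed training.
All outputs (log files and checkpoints) are saved to the working directory, which is specified by work_dir in the config file.
Important: the default learning rate in the config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16). According to the linear scaling rule, you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g. lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu.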
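The linear scaling rule can be written as a one-line function. In the sketch below, the default base_lr of 0.02 is not stated in the text; it is inferred from the two examples given (0.01 at batch size 8 and 0.08 at batch size 64), and the helper name scaled_lr is hypothetical.

```python
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch_size=16):
    """Linear scaling rule: lr grows in proportion to the total batch size.

    base_lr=0.02 is an inferred default for batch size 16 (8 GPUs * 2 img/gpu).
    """
    total_batch_size = num_gpus * imgs_per_gpu
    return base_lr * total_batch_size / base_batch_size


print(scaled_lr(4, 2))    # 4 GPUs * 2 img/gpu
print(scaled_lr(16, 4))   # 16 GPUs * 4 img/gpu
print(scaled_lr(1, 2))    # single GPU, 2 img/gpu
```

The first two calls reproduce the two figures quoted in the text; the third shows what the same rule would suggest for single-GPU training.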
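1. Train with a single GPU
python tools/train.py ${CONFIG_FILE}
If you want to specify the working directory on the command line, add the argument --work_dir ${YOUR_WORK_DIR}.
2. Train with multiple GPUs
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
Optional arguments are:
--validate (strongly recommended): perform evaluation every k epochs during training.
--work_dir ${WORK_DIR}: override the working directory specified in the config file.
--resume_from ${CHECKPOINT_FILE}: resume from a previous checkpoint file.
Difference between resume_from and load_from: resume_from loads both the model weights and the optimizer state, and the epoch is also inherited from the specified checkpoint. It is usually used to resume a training process that was interrupted unexpectedly. load_from only loads the model weights, and training starts from epoch 0. It is usually used for fine-tuning.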
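The difference between the two options can be illustrated with a toy checkpoint made of plain dictionaries. This is only a sketch of the behaviour described above, not MMDetection's loading code. The key names state_dict, optimizer and meta are assumptions chosen for illustration, and the two helper functions are hypothetical.

```python
def load_from(checkpoint, model_state):
    """Fine-tuning style: copy weights only; training restarts at epoch 0."""
    model_state.update(checkpoint['state_dict'])
    return {'model': model_state, 'optimizer': None, 'start_epoch': 0}


def resume_from(checkpoint, model_state):
    """Resume style: restore weights, optimizer state and the epoch counter."""
    model_state.update(checkpoint['state_dict'])
    return {
        'model': model_state,
        'optimizer': dict(checkpoint['optimizer']),
        'start_epoch': checkpoint['meta']['epoch'],
    }


# Toy checkpoint; the key names are assumptions made for illustration
ckpt = {
    'state_dict': {'conv1.weight': [0.1, 0.2]},
    'optimizer': {'momentum_buffer': [0.01, 0.02]},
    'meta': {'epoch': 7},
}
fine_tune = load_from(ckpt, {})
resumed = resume_from(ckpt, {})
print(fine_tune['start_epoch'], resumed['start_epoch'])
```

Both runs start from the same weights, but only the resumed run carries over the optimizer state and continues from the stored epoch.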
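3. Train with multiple machines
If you run MMDetection on a cluster managed with slurm, you can use the script slurm_train.sh:
./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} [${GPUS}]
Here is an example of using 16 GPUs to train Mask R-CNN on the dev partition:
./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x.py /nfs/xxxx/mask_rcnn_r50_fpn_1x 16
You can check slurm_train.sh for the full list of arguments and environment variables.
If you only have several machines connected by Ethernet, you can refer to the PyTorch launch utility. Training is usually slow without high-speed networking.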

(IV) Useful tools
1. Log analysis
You can plot loss/mAP curves from a training log file. Run pip install seaborn first to install the dependency.

[Figure: example loss curve plot]

Usage:
python tools/analyze_logs.py plot_curve [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]
Examples:
Plot the classification loss of a run.
python tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
Plot the classification and regression losses of a run, and save the figure as a PDF.
python tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_reg --out losses.pdf
Compare the bbox mAP of two runs in the same figure.
python tools/analyze_logs.py plot_curve log1.json log2.json --keys bbox_mAP --legend run1 run2
You can also compute the average training speed.
python tools/analyze_logs.py cal_train_time ${CONFIG_FILE} [--include-outliers]
The output is expected to look like the following:
-----Analyze train time of work_dirs/some_exp/20190611_192040.log.json-----
slowest epoch 11, average time is 1.2024
fastest epoch 1, average time is 1.1909
time std over epochs is 0.0028
average iter time: 1.1959 s/iter
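The statistics in this output can be approximated with a few lines of standard-library Python. The sketch below is not the code of analyze_logs.py. It assumes that every training record in the JSON log is one line with at least the fields mode, epoch and time, and it assumes that "outliers" means the first record of each epoch. The function name analyze_train_time is hypothetical.

```python
import json
from collections import defaultdict
from statistics import mean, pstdev


def analyze_train_time(lines, include_outliers=False):
    """Summarize per-epoch iteration times from JSON log lines.

    Assumption: a training record looks like
    {"mode": "train", "epoch": 1, "iter": 50, "time": 1.19}.
    """
    times = defaultdict(list)
    for line in lines:
        record = json.loads(line)
        if record.get('mode') == 'train' and 'time' in record:
            times[record['epoch']].append(record['time'])

    kept = {}
    for epoch, values in times.items():
        # Assumption: the first record of an epoch is the outlier to drop
        if not include_outliers and len(values) > 1:
            values = values[1:]
        kept[epoch] = values

    epoch_avg = {epoch: mean(values) for epoch, values in kept.items()}
    all_times = [t for values in kept.values() for t in values]
    return {
        'slowest_epoch': max(epoch_avg, key=epoch_avg.get),
        'fastest_epoch': min(epoch_avg, key=epoch_avg.get),
        'std_over_epochs': pstdev(list(epoch_avg.values())),
        'avg_iter_time': mean(all_times),
    }


# Toy log lines (hypothetical values)
log_lines = [
    '{"mode": "train", "epoch": 1, "iter": 50, "time": 2.0}',
    '{"mode": "train", "epoch": 1, "iter": 100, "time": 1.0}',
    '{"mode": "train", "epoch": 1, "iter": 150, "time": 1.0}',
    '{"mode": "val", "epoch": 1, "bbox_mAP": 0.3}',
    '{"mode": "train", "epoch": 2, "iter": 50, "time": 3.0}',
    '{"mode": "train", "epoch": 2, "iter": 100, "time": 1.5}',
    '{"mode": "train", "epoch": 2, "iter": 150, "time": 1.5}',
]
summary = analyze_train_time(log_lines)
print(summary)
```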
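2. Analyze class-wise performance
You can analyze the class-wise mAP to gain a more comprehensive understanding of the model:
python tools/coco_eval.py ${RESULT} --ann ${ANNOTATION_PATH} --types bbox --classwise
Currently we only support class-wise mAP for all evaluation types; class-wise mAR will be supported in the future.
3. Get the FLOPs (floating-point operations) and number of parameters (experimental)
We provide a script adapted from flops-counter.pytorch to compute the FLOPs and parameter count of a given model:
python tools/get_flops.py ${CONFIG_FILE} [--shape ${INPUT_SHAPE}]
You will get a result like this: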
Input shape: (3, 1280, 800)
Flops: 239.32 GMac
Params: 37.74 M
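Note: this tool is still experimental and we do not guarantee that the numbers are correct. You may use the result for simple comparisons, but double-check it before adopting it in technical reports or papers.
1) FLOPs depend on the input shape, while the number of parameters does not. The default input shape is (1, 3, 1280, 800).
2) Some operators, such as GN and custom operators, are not counted in the FLOPs. You can add support for new operators by modifying mmdet/utils/flops_counter.py.
3) The FLOPs of two-stage detectors depend on the number of proposals.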
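The first note can be checked by hand for a single convolution layer. The sketch below uses the standard formulas for a 2D convolution with bias and counts multiply-accumulates (bias additions are ignored). It is an illustration, not the counter used by get_flops.py; the function name conv2d_cost and the layer sizes are chosen for the example.

```python
def conv2d_cost(in_ch, out_ch, kernel, height, width, stride=1, padding=0):
    """Return (params, macs) for one 2D convolution with bias."""
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    params = out_ch * (in_ch * kernel * kernel + 1)  # weights + bias
    # One multiply-accumulate per weight per output position
    macs = out_ch * in_ch * kernel * kernel * out_h * out_w
    return params, macs


# Same layer, two input sizes (the second has twice the height and width)
small = conv2d_cost(3, 64, kernel=7, height=400, width=640, stride=2, padding=3)
large = conv2d_cost(3, 64, kernel=7, height=800, width=1280, stride=2, padding=3)
print('params:', small[0], large[0])
print('MACs:  ', small[1], large[1])
```

The parameter count is identical for both inputs, while the multiply-accumulate count grows with the output area.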
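4. Publish a model
Before you upload a model to AWS, you may want to:
1) Convert the model weights to CPU tensors.
2) Delete the optimizer state.
3) Compute the hash of the checkpoint file and append the hash id to the filename.
python tools/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
Example:
python tools/publish_model.py work_dirs/faster_rcnn/latest.pth faster_rcnn_r50_fpn_1x_20190801.pth
The final output filename will look like:
faster_rcnn_r50_fpn_1x_20190801-{hash id}.pth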
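Step 3 can be sketched with the standard library alone. The text only says "hash", so the use of SHA-256 and of the first 8 hexadecimal characters are assumptions here. Steps 1 and 2 need PyTorch and are left out, and publish_with_hash is a hypothetical helper, not the code of publish_model.py.

```python
import hashlib
import os
import shutil
import tempfile


def publish_with_hash(in_file, out_file):
    """Copy in_file to '<out_file stem>-<first 8 hex chars of SHA-256><ext>'."""
    sha = hashlib.sha256()
    with open(in_file, 'rb') as f:
        # Read in 1 MiB chunks so large checkpoints are not loaded at once
        for chunk in iter(lambda: f.read(1 << 20), b''):
            sha.update(chunk)
    stem, ext = os.path.splitext(out_file)
    final_file = '{}-{}{}'.format(stem, sha.hexdigest()[:8], ext)
    shutil.copyfile(in_file, final_file)
    return final_file


# Demo on a throwaway file instead of a real checkpoint
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'latest.pth')
with open(src, 'wb') as f:
    f.write(b'not a real checkpoint')
published = publish_with_hash(
    src, os.path.join(workdir, 'faster_rcnn_r50_fpn_1x_20190801.pth'))
print(os.path.basename(published))
```

Putting part of the hash in the filename lets anyone who downloads the file check that it matches the published checkpoint.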
5. Test the robustness of detectors
Please refer to ROBUSTNESS_BENCHMARKING.md.
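(V) How-to
1. Use your own dataset
The simplest way is to convert your dataset to one of the supported formats (COCO or PASCAL VOC).
Here we show an example of adding a custom dataset with 5 classes, assuming it is also in COCO format. Create the file mmdet/datasets/my_dataset.py with the following content: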
from .coco import CocoDataset
from .registry import DATASETS


@DATASETS.register_module
class MyDataset(CocoDataset):

    CLASSES = ('a', 'b', 'c', 'd', 'e')

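Then add the following line to mmdet/datasets/__init__.py:
from .my_dataset import MyDataset
You can then use MyDataset in config files, with the same API as CocoDataset.
It is also fine if you do not want to convert the annotation format to COCO or PASCAL format. We define a simple annotation format, and all existing datasets are processed to be compatible with it, either online or offline.
The annotation of a dataset is a list of dicts, one dict per image. Each dict has three fields used for testing, filename (relative path), width and height, plus an additional field ann used for training. ann is itself a dict with at least two fields, bboxes and labels, both of which are numpy arrays. Some datasets provide annotations such as crowd/difficult/ignored bboxes; we use bboxes_ignore and labels_ignore to cover them. Here is an example: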
[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray, float32> (n, 4),
            'labels': <np.ndarray, int64> (n, ),
            'bboxes_ignore': <np.ndarray, float32> (k, 4),
            'labels_ignore': <np.ndarray, int64> (k, ) (optional field)
        }
    },
    ...
]
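There are two ways to work with custom datasets.
1) Online conversion. You can write a new dataset class that inherits from CustomDataset and overrides two methods, load_annotations(self, ann_file) and get_ann_info(self, idx), like CocoDataset and VOCDataset do.
2) Offline conversion.
You can convert the annotations to the format described above and save them to a pickle or json file, as pascal_voc.py does. Then you can simply use CustomDataset.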
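A minimal sketch of an offline conversion follows. The layout of the raw records (file, size, objects) is hypothetical, as is the function name to_annotation_list. Plain nested lists stand in for the float32 and int64 numpy arrays that the format calls for, so a real converter would presumably wrap them in arrays before saving.

```python
import os
import pickle
import tempfile


def to_annotation_list(records):
    """Convert hypothetical raw records into the annotation list format.

    Assumption: each record has 'file', 'size' as (width, height) and
    'objects' as (box, label, ignore) tuples, where box holds 4 numbers.
    """
    annotations = []
    for rec in records:
        ann = {'bboxes': [], 'labels': [],
               'bboxes_ignore': [], 'labels_ignore': []}
        for box, label, ignore in rec['objects']:
            # Ignored objects go to the *_ignore fields
            suffix = '_ignore' if ignore else ''
            ann['bboxes' + suffix].append([float(v) for v in box])
            ann['labels' + suffix].append(int(label))
        width, height = rec['size']
        annotations.append({
            'filename': rec['file'],
            'width': width,
            'height': height,
            'ann': ann,
        })
    return annotations


raw = [
    {'file': 'a.jpg', 'size': (1280, 720),
     'objects': [((10, 20, 110, 220), 1, False), ((5, 5, 30, 30), 2, True)]},
]
annotations = to_annotation_list(raw)

# Save to a pickle file and read it back to confirm the round trip
out_path = os.path.join(tempfile.mkdtemp(), 'my_annotations.pkl')
with open(out_path, 'wb') as f:
    pickle.dump(annotations, f)
with open(out_path, 'rb') as f:
    reloaded = pickle.load(f)
print(reloaded[0]['filename'], reloaded[0]['ann']['labels'])
```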
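2. Develop new components
We basically categorize model components into four types.
Backbone: usually an FCN network that extracts feature maps, e.g. ResNet, MobileNet.
Neck: the component between backbones and heads, e.g. FPN, PAFPN.
Head: the component for specific tasks, e.g. bbox prediction and mask prediction.
RoI extractor: the part that extracts RoI features from feature maps, e.g. RoI Align.
Here we show how to develop new components, using MobileNet as an example.
1) Create a new file: mmdet/models/backbones/mobilenet.py.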
import torch.nn as nn

from ..registry import BACKBONES


@BACKBONES.register_module
class MobileNet(nn.Module):

    def __init__(self, arg1, arg2):
        pass

    def forward(self, x):  # should return a tuple
        pass

    def init_weights(self, pretrained=None):
        pass

2) Import the module in mmdet/models/backbones/__init__.py.
from .mobilenet import MobileNet
3) Use it in your config file.
model = dict(
    ...
    backbone=dict(
        type='MobileNet',
        arg1=xxx,
        arg2=xxx),
    ...
For more information on how it works, refer to TECHNICAL_DETAILS.md.
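The link between the register_module decorator and the type key in the config can be pictured with a toy registry. This is a simplified illustration under stated assumptions, not MMDetection's actual implementation. The Registry class below, its build method and the stand-in MobileNet class are written only for this example.

```python
class Registry:
    """Maps a class name to the class object (toy version)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        # Used as a decorator: store the class under its own name
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # 'type' selects the class; the remaining keys become keyword arguments
        args = dict(cfg)
        cls = self._modules[args.pop('type')]
        return cls(**args)


BACKBONES = Registry('backbone')


@BACKBONES.register_module
class MobileNet:
    """Stand-in backbone that only records its arguments."""

    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2


backbone = BACKBONES.build(dict(type='MobileNet', arg1=1, arg2=2))
print(type(backbone).__name__, backbone.arg1, backbone.arg2)
```

Because registration happens when the class definition is executed, the module has to be imported somewhere, which is why step 2 adds the import to __init__.py.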