I. Installation
1. Requirements
- Python 3.6+
- PyTorch 1.3+
- CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
- GCC 5+
- MMCV
2. Installation steps
(1) It is common practice to create a new virtual environment and install everything inside it:
conda create -n open-mmlab python=3.6 -y
source activate open-mmlab
(2) Install PyTorch and torchvision:
conda install pytorch cudatoolkit=10.1 torchvision -c pytorch
Mirror-source operations:
# Optionally add the Tsinghua mirror for PyTorch:
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
# The Aliyun mirror is a PyPI index, so configure it for pip rather than conda:
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# Restore conda's default channels:
conda config --remove-key channels
# Show channel URLs: the source of each package is displayed during installation
conda config --set show_channel_urls yes
conda config --set always_yes True
# List all configured channels
conda config --show channels
(3) Install MMCV (see the official installation page).
Use mmcv-full; in my case, installing plain mmcv, or installing mmcv-full with a bare pip install, both raised errors. Installing with an explicit version and index worked:
pip install mmcv-full==latest+torch1.4.0+cu100 -f https://download.openmmlab.com/mmcv/dist/index.html
(4) Install MMDetection from the MMDetection repository.
Download it directly or clone it with git, then:
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
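As an optional sanity check, the following commands should import the freshly installed packages and print their versions without errors:

```shell
# Verify that PyTorch, MMCV and MMDetection import correctly
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import mmcv; print(mmcv.__version__)"
python -c "import mmdet; print(mmdet.__version__)"
```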
(5) Test the installation: download a faster_rcnn_r50_fpn_1x pretrained model, save it under the mmdetection/checkpoints directory, and run the code below. If an image with predicted boxes is displayed, the installation succeeded.
Some older examples use the show_result function, but in this version (2018-2019 Open-MMLab, downloaded October 2020) it appears to have been replaced by show_result_pyplot.
import os
import cv2
import mmcv
from mmdet.apis import init_detector, inference_detector, show_result_pyplot
config_file = 'configs/myconfig/faster_rcnn_r50_fpn_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_x101_64x4d_fpn_1x_coco_20200204-833ee192.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
# To test a video instead (mp4, avi, mov and other common formats all work):
#video = mmcv.VideoReader('/home/pv/pv_video/5.MP4')
# Test an image
filespath = "1.jpg"
dst_Path = os.path.join("result/", config_file.split("/")[-1].split(".")[0])  # where inference results are saved
img = mmcv.imread(filespath)
result = inference_detector(model, img)
# result is a per-class list of boxes: for each class, an array whose rows hold
# the box coordinates and a confidence score. To draw the boxes manually:
#for label in range(len(result)):
#    bbox = result[label]
#    for i in range(bbox.shape[0]):
#        if bbox[i][4] > 0.1:
#            cv2.rectangle(img, (int(bbox[i][0]), int(bbox[i][1])), (int(bbox[i][2]), int(bbox[i][3])), (0, 0, 255))
show_result_pyplot(model, filespath, result)  # display only; does not save the tested image (frame)
#show_result_pyplot(frame, result, model.CLASSES, wait_time=1, out_file=os.path.join(dst_Path, 'result_{}.jpg'.format(num)))  # save the tested image (frame)!
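For reference, the `result` returned by inference_detector is a plain Python list with one (N, 5) array per class, each row being [x1, y1, x2, y2, score]. A minimal, self-contained sketch of filtering it by score (the result data here is fabricated purely for illustration):

```python
import numpy as np

# Fake two-class result in the same shape inference_detector returns:
# one (N, 5) array per class, rows are [x1, y1, x2, y2, score].
result = [
    np.array([[10., 20., 100., 200., 0.95]]),  # class 0: one detection
    np.empty((0, 5)),                          # class 1: no detections
]
score_thr = 0.3
for label, bboxes in enumerate(result):
    for x1, y1, x2, y2, score in bboxes:
        if score > score_thr:
            print(f'class {label}: ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}) score {score:.2f}')
```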
At this point, MMDetection is installed.
II. Usage
1. Convert your own dataset to COCO or VOC format.
2. Modify the configuration files
Step (6) below shows how to gather everything into a custom configuration file instead.
(1) Modify the base configuration
Directory structure of ./configs/_base_:
_base_
├─ datasets
├─ models
├─ schedules
└─ default_runtime.py
As shown, it contains four kinds of configuration:
datasets: dataset definitions
models: model architecture definitions
schedules: training schedule definitions
default_runtime.py: runtime settings
Open ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py; it references the base configs through the following paths:
_base_ = [
'../_base_/models/faster_rcnn_r50_fpn.py',
'../_base_/datasets/coco_detection.py',
'../_base_/schedules/schedule_1x.py',
'../_base_/default_runtime.py'
]
(2) Modify the dataset configuration
Open ./configs/_base_/datasets/coco_detection.py. Modify the dataset paths data_root, ann_file and img_prefix, the repeat count times, and add the label classes (in the example below, dataset_type is also switched to VOCDataset). Note that if classes contains only one class, a trailing comma is required:
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/MyDataset/'
classes = ('Car', 'Pedestrian', 'Cyclist')
data = dict(
train=dict(
type='RepeatDataset',
times=1,
dataset=dict(
type=dataset_type,
ann_file=data_root + 'ImageSets/Main/train.txt',
img_prefix=data_root,
pipeline=train_pipeline,
classes=classes)),
val=dict(
type=dataset_type,
ann_file=data_root + 'ImageSets/Main/val.txt',
img_prefix=data_root,
pipeline=test_pipeline,
classes=classes),
test=dict(
type=dataset_type,
ann_file=data_root + 'ImageSets/Main/test.txt',
img_prefix=data_root,
pipeline=test_pipeline,
classes=classes))
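The trailing-comma caveat is easy to get wrong: in Python, parentheses alone do not make a tuple, so a one-class `classes` written without the comma is silently just a string. A quick illustration:

```python
# Without the trailing comma, the parentheses are ignored and this is a string:
classes_wrong = ('hand')
# With the trailing comma, it is a proper 1-element tuple, as MMDetection expects:
classes_right = ('hand',)

print(type(classes_wrong).__name__)  # str
print(type(classes_right).__name__)  # tuple
```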
(3) Modify the model architecture configuration
Open ./configs/_base_/models/faster_rcnn_r50_fpn.py and set num_classes in roi_head to your number of classes:
model = dict(
type='FasterRCNN',
pretrained='torchvision://resnet50',
roi_head=dict(
type='StandardRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=3,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
Note:
Since MMDetection 2.0, the number of classes no longer needs an extra +1 for the background class.
(4) Modify the training schedule configuration
Open ./configs/_base_/schedules/schedule_1x.py and adjust the learning rate lr and the number of training epochs total_epochs:
# optimizer
optimizer = dict(
type='SGD',
lr=0.02 / 8,
momentum=0.9,
weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[7])
total_epochs = 8
Tips:
Faster R-CNN's default learning rate lr=0.02 assumes a batch size of 16,
so the learning rate should be scaled in proportion to your actual batch size:
batch_size = num_gpus × samples_per_gpu
lr = 0.02 × (batch_size / 16)
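The linear scaling rule above can be written as a small helper (the function name is just for illustration):

```python
def scale_lr(num_gpus, samples_per_gpu, base_lr=0.02, base_batch_size=16):
    """Linearly scale the learning rate with the effective batch size."""
    batch_size = num_gpus * samples_per_gpu
    return base_lr * batch_size / base_batch_size

# A single GPU with 2 images per GPU gives batch_size 2:
print(scale_lr(1, 2))   # 0.0025, i.e. the 0.02 / 8 used in the config above
# 8 GPUs with 2 images per GPU recovers the default:
print(scale_lr(8, 2))   # 0.02
```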
(5) Modify the runtime configuration
Open ./configs/_base_/default_runtime.py, adjust the logging interval in log_config, and enable the TensorBoard logger:
log_config = dict(
interval=100,
hooks=[
dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')
])
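With the TensorboardLoggerHook enabled, event files are written under the run's work directory during training. They can then be viewed with the standard TensorBoard CLI (work_dirs is MMDetection's default output directory; adjust the path if you set a custom one):

```shell
tensorboard --logdir work_dirs --port 6006
```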
(6) Create a custom configuration
Alternatively, all the changes from steps (1)-(5) can be written into a single file. This makes different configurations easier to manage and avoids errors caused by repeatedly editing the base files.
Go to the configs directory:
cd configs
Create a directory for custom configurations:
mkdir myconfig
In ./myconfig, create faster_rcnn_r50_fpn_1x_mydataset.py:
# Base configuration
_base_ = [
'../_base_/models/faster_rcnn_r50_fpn.py',
'../_base_/datasets/coco_detection.py',
'../_base_/schedules/schedule_1x.py',
'../_base_/default_runtime.py'
]
# Dataset configuration
dataset_type = 'CocoDataset'
data_root = '/home/pv/Data/GETURE/Geture_detector/img_opt_augement/'
classes = ('hand',)
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='Resize', img_scale=(448,448), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
train=dict(
type=dataset_type,
ann_file=data_root + 'train.json',
img_prefix=data_root + 'train',
pipeline=train_pipeline,
classes=classes),
val=dict(
type=dataset_type,
ann_file=data_root + 'val.json',
img_prefix=data_root + 'val',
pipeline=test_pipeline,
classes=classes))
# test=dict(
# type=dataset_type,
# ann_file=data_root + 'ImageSets/Main/test.txt',
# img_prefix=data_root,
# pipeline=test_pipeline,
# classes=classes))
# Model architecture configuration
model = dict(
type='FasterRCNN',
roi_head=dict(
type='StandardRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=1,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)
)))
# Training schedule configuration
# optimizer
optimizer = dict(lr=0.02 / 8)
# learning policy
lr_config = dict(
warmup=None,
step=[7])
total_epochs = 8
# Runtime configuration
log_config = dict(
interval=100,
hooks=[
dict(type='TextLoggerHook'),
dict(type='TensorboardLoggerHook')
])
load_from = "/home/pv/MMDetector/mmdetection-master/work_dirs/faster_rcnn_r50_fpn_coco/latest.pth"
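With the custom config in place, training and evaluation can be launched with MMDetection's standard tools scripts (single-GPU commands shown; the checkpoint path under work_dirs follows the default naming and is an assumption):

```shell
# Train with the custom config; logs and checkpoints go to work_dirs/ by default
python tools/train.py configs/myconfig/faster_rcnn_r50_fpn_1x_mydataset.py
# Evaluate a trained checkpoint (COCO-style bbox mAP)
python tools/test.py configs/myconfig/faster_rcnn_r50_fpn_1x_mydataset.py \
    work_dirs/faster_rcnn_r50_fpn_1x_mydataset/latest.pth --eval bbox
```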