Notes on Training and Inference with MMDetection on a Custom Dataset

何小义的AI进阶路

Article link: https://blog.csdn.net/hzy459176895/article/details/123690217

Main references (along with various other CSDN and Zhihu posts):

MMDetection 2.22.0 documentation: https://mmdetection.readthedocs.io/zh_CN/latest/
Getting started guide: https://github.com/open-mmlab/mmdetection/blob/master/docs/zh_cn/get_started.md

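1. Environment setup

Beyond the basics such as Python and PyTorch, the key packages are mmcv (mmcv-full) and mmdet.

Because this is a development setup, do not simply pip-install the packaged release; follow the officially recommended route instead (the official install guide covers this step as well):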

pip install openmim

mim install mmdet
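
After installing, it helps to confirm which versions actually ended up in the environment, since mmcv and mmdet versions have to be compatible with each other. A minimal sketch using only the standard library (Python 3.8+); installed_versions is a helper name made up for this post, and the package list is an assumption about what you installed:

```python
from importlib import metadata


def installed_versions(packages=("torch", "mmcv", "mmcv-full", "mmdet")):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:10s} {version or 'not installed'}")
```

Only one of mmcv and mmcv-full is normally present, so a single "not installed" line for one of them is expected.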

2. Code

I cloned the official OpenMMLab source directly (master branch):

open-mmlab/mmdetection (OpenMMLab Detection Toolbox and Benchmark): https://github.com/open-mmlab/mmdetection

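3. Dataset

For object detection, the images first need to be annotated with labelme or labelImg. Annotation itself is not covered here; labelme usage will get its own post (or look up a tutorial first).

Assume the dataset has already been annotated with labelme using polygon shapes. The data folder then contains two subfolders, images and labels (this is the labelme output and is left untouched for now):

data/
    images/
    labels/

Most MMDetection models are pretrained on COCO-format data, so for transfer learning the labelme annotations must be converted to COCO format before they can be used for training. labelme2coco.py handles this. After it runs, the data folder gains an annotations folder that MMDetection can train from:

data/
    images/
    labels/
    annotations/
        train.json
        val.json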
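
The core of the conversion is turning each labelme polygon into COCO's flat segmentation list and an [x_min, y_min, width, height] box. A small worked sketch in plain Python; polygon_to_coco and the field names bbox_area and polygon_area are made up for illustration, and the shoelace polygon area is shown only for comparison, since the script below stores the bounding-box area instead:

```python
def polygon_to_coco(points):
    """Convert labelme polygon points [[x, y], ...] into COCO-style fields."""
    xs = [float(x) for x, _ in points]
    ys = [float(y) for _, y in points]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min
    # COCO stores a polygon as one flat list: [x1, y1, x2, y2, ...].
    segmentation = [coord for x, y in zip(xs, ys) for coord in (x, y)]
    # Shoelace formula: exact area enclosed by the polygon.
    n = len(points)
    twice_area = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                     for i in range(n))
    return {
        "segmentation": [segmentation],
        "bbox": [x_min, y_min, width, height],
        "bbox_area": width * height,
        "polygon_area": abs(twice_area) / 2.0,
    }


triangle = [[10, 10], [50, 10], [10, 40]]
ann = polygon_to_coco(triangle)
print(ann["bbox"])          # [10.0, 10.0, 40.0, 30.0]
print(ann["bbox_area"])     # 1200.0
print(ann["polygon_area"])  # 600.0
```

For pure box detection only the bbox field matters, which is why using the box area is acceptable here.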

labelme2coco.py:

# -*- coding:utf8 -*-

"""
Convert plain labelme annotations to the COCO data format.
"""

import json
import os
import random

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from pycocotools.coco import COCO


def data_label_view(img_folder, ann_file, max_images=15):
    """Display the converted boxes and polygons on the first few images."""
    coco = COCO(ann_file)
    img_ids = coco.getImgIds()
    for img_id in img_ids[:max_images]:  # slicing avoids an IndexError on small sets
        img_name = coco.loadImgs(img_id)[0]['file_name']
        img_path = os.path.join(img_folder, img_name)
        img = Image.open(img_path).convert('RGB')
        ann_ids = coco.getAnnIds(imgIds=img_id)
        anns = coco.loadAnns(ann_ids)
        plt.figure(num=0, figsize=(10, 10))
        plt.imshow(img)
        coco.showAnns(anns, draw_bbox=True)
        plt.show()


def data_to_coco_(DATA_PATH, IMG_TYPE):
    # Count the objects of each class in the existing dataset ############
    all_images_path = DATA_PATH + '/images'  # images and label files produced by labelme
    all_labels_path = DATA_PATH + '/labels'
    img_suffix = '.' + IMG_TYPE

    label_count = {}
    check_status = True
    for item in os.listdir(all_images_path):
        if not item.endswith(img_suffix):
            continue
        # splitext keeps file names that contain extra dots intact
        label_path = os.path.join(all_labels_path, os.path.splitext(item)[0] + '.json')
        if not os.path.exists(label_path):
            check_status = False  # image without a label file
        else:
            with open(label_path, 'r', encoding='utf-8') as f:
                label_data = json.load(f)
            for shape in label_data['shapes']:
                label_count[shape['label']] = label_count.get(shape['label'], 0) + 1  # per-class count
    for item in os.listdir(all_labels_path):
        if not item.endswith('.json'):
            continue
        image_path = os.path.join(all_images_path, os.path.splitext(item)[0] + img_suffix)
        if not os.path.exists(image_path):
            check_status = False  # label file without an image
    check_info = 'passed' if check_status else 'failed'
    print(f'Simple check {check_info}')
    if check_status:
        print(label_count)

    # Build the class-name -> id mapping (ids follow first-seen order) ##################
    category_dict = {k: v for v, k in enumerate(label_count.keys())}
    category = [{'supercategory': k, 'id': v, 'name': k} for k, v in category_dict.items()]
    print('category_dict:   ', category_dict)

    # Keep only files with the requested suffix (e.g. '.png'); sort so the seeded shuffle is reproducible
    image_names = sorted(i for i in os.listdir(all_images_path) if i.endswith(img_suffix))

    random.seed(0)
    random.shuffle(image_names)  # shuffle before splitting
    split_ratio = 0.8  # fraction of images used for training
    split_idx = int(len(image_names) * split_ratio)

    images = dict()
    images['train'] = image_names[:split_idx]
    images['val'] = image_names[split_idx:]

    # open(..., 'w') does not create missing folders
    os.makedirs(DATA_PATH + '/annotations', exist_ok=True)

    for dataset_name in ['train', 'val']:
        annotations = {'info': '',  # skeleton of a COCO-style annotation file
                       'licenses': [],
                       'images': [],
                       'annotations': [],
                       'categories': category}
        # Start at 1: pycocotools evaluation treats a matched annotation id of 0 as "no match"
        shape_id = 1
        for order, item in enumerate(images[dataset_name]):
            label_path = os.path.join(all_labels_path, os.path.splitext(item)[0] + '.json')
            if not os.path.exists(label_path):
                continue  # skip images that have no label file
            with open(label_path, 'r', encoding='utf-8') as f:
                label_data = json.load(f)
            image_info = {'license': '',
                          'file_name': item,
                          'coco_url': '',
                          'height': label_data['imageHeight'],
                          'width': label_data['imageWidth'],
                          'date_captured': '',
                          'flickr_url': '',
                          'id': order}
            annotations['images'].append(image_info)
            for shape in label_data['shapes']:
                # Flatten [[x, y], ...] into [x1, y1, x2, y2, ...] as plain floats (JSON-serializable)
                segmentation = np.array(shape['points'], dtype=float).reshape(-1).tolist()
                x = segmentation[0::2]
                y = segmentation[1::2]
                wbox, hbox = max(x) - min(x), max(y) - min(y)
                ann = {'segmentation': [segmentation],
                       'area': wbox * hbox,  # bounding-box area, not the polygon area
                       'iscrowd': 0,
                       'image_id': order,
                       'bbox': [min(x), min(y), wbox, hbox],  # bbox: x_min, y_min, width, height
                       'category_id': category_dict[shape['label']],
                       'id': shape_id}
                annotations['annotations'].append(ann)
                shape_id += 1

        anns_file_path = DATA_PATH + f'/annotations/{dataset_name}.json'
        with open(anns_file_path, 'w', encoding='utf-8') as f:
            json.dump(annotations, f)


if __name__ == '__main__':

    # Convert the newly labeled data to COCO format
    DATA_PATH = 'your_data_root'  # dataset root folder
    IMG_TYPE = 'png'
    data_to_coco_(DATA_PATH, IMG_TYPE)

    # Inspect a few converted examples from the validation split
    data_label_view(DATA_PATH + '/images', DATA_PATH + '/annotations/val.json')
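
Before training, a quick structural check of the generated JSON can catch broken references early. The following is a minimal sketch in plain Python, not part of MMDetection or pycocotools; check_coco and the tiny sample dict are made up for illustration:

```python
def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    image_ids = [img["id"] for img in coco["images"]]
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")
    known_images = set(image_ids)
    known_categories = {cat["id"] for cat in coco["categories"]}
    seen_ann_ids = set()
    for ann in coco["annotations"]:
        if ann["id"] in seen_ann_ids:
            problems.append(f"duplicate annotation id {ann['id']}")
        seen_ann_ids.add(ann["id"])
        if ann["image_id"] not in known_images:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in known_categories:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append(f"annotation {ann['id']}: empty bbox {ann['bbox']}")
    return problems


sample = {
    "images": [{"id": 0, "file_name": "a.png"}],
    "categories": [{"id": 0, "name": "people"}],
    "annotations": [
        {"id": 1, "image_id": 0, "category_id": 0, "bbox": [10, 10, 40, 30]},
    ],
}
print(check_coco(sample))  # []
```

To check a real file, load it with json.load and pass the resulting dict to check_coco.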

4. Config files

Note: all paths below are relative to the project root.

(1) Create configs/_base_/datasets/a_coco_detection_mydataset.py

This is modeled on coco_detection.py, with a few settings changed to describe your own data.

# dataset settings
dataset_type = 'CocoDataset'  # custom dataset config adapted from the COCO one
data_root = '/your/dataset/root/'  # keep the trailing slash: the paths below are built by string concatenation
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)

resize_para = (800, 800)   # target scale; with keep_ratio=True images are scaled to fit within 800x800
# Class names of your dataset (the order must match the category id order in the annotation files)
CLASSES_LIST = ('people', 'building', '...')

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=resize_para, keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),  # flip with probability 0.5
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),  # pad the resized image so both sides are multiples of 32
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=resize_para,
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,  # batch size per GPU; e.g. with 4, 48 samples make 12 batches
    workers_per_gpu=2,  # data-loading worker processes per GPU
    train=dict(
        classes=CLASSES_LIST,
        type=dataset_type,
        ann_file=data_root + 'annotations/train.json',
        img_prefix=data_root + 'images',
        pipeline=train_pipeline),
    # used for evaluation during training
    val=dict(
        classes=CLASSES_LIST,
        type=dataset_type,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'images',
        pipeline=test_pipeline),
    test=dict(
        classes=CLASSES_LIST,
        type=dataset_type,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'images',
        pipeline=test_pipeline))
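
Since the class tuple has to follow the category order of the annotation files, one way to avoid typing it by hand is to read it from the generated JSON. A minimal sketch, with helper names made up for this post and the annotation path as a placeholder:

```python
import json


def classes_from_categories(categories):
    """Return category names in the order they are listed in the file."""
    return tuple(cat["name"] for cat in categories)


def classes_from_coco(ann_file):
    """Read a COCO annotation file and return its class-name tuple."""
    with open(ann_file, "r", encoding="utf-8") as f:
        return classes_from_categories(json.load(f)["categories"])


example = [
    {"supercategory": "people", "id": 0, "name": "people"},
    {"supercategory": "building", "id": 1, "name": "building"},
]
print(classes_from_categories(example))  # ('people', 'building')
# CLASSES_LIST = classes_from_coco('/your/dataset/root/annotations/train.json')
```

The length of the resulting tuple is also the value needed for num_classes in the model config.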

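(2) Create configs/_base_/models/a_faster_rcnn_r50_fpn_mydataset.py

This is modeled on faster_rcnn_r50_fpn.py. For a simple run the network settings can stay as they are; the only required change is num_classes under roi_head -> bbox_head, which must equal the number of classes in your detection task.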

# model settings

CLASS_NUM = 6  # number of classes in your data, e.g. 6 (if the network is left alone, this is usually the only change)

model = dict(
    type='FasterRCNN',

    backbone=dict(
        type='ResNet',
        depth=50,  # ResNet depth, one of {18, 34, 50, 101, 152}
        num_stages=4,  # number of ResNet stages to use (default: 4)
        out_indices=(0, 1, 2, 3),  # indices of the stages whose outputs are returned
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',  # 'pytorch': the stride-2 layer is the 3x3 conv; 'caffe': the stride-2 layer is the first 1x1 conv
        # Pretrained backbone weights; downloaded to /home/yons/.cache/torch/hub/checkpoints (torchvision weights by default)
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
        frozen_stages=2,  # freeze the stem and the first two stages of the pretrained backbone, which can help transfer learning
    ),

    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],  # input channels per scale, i.e. the backbone output channels
        out_channels=256,  # FPN output channels, the same for every scale
        num_outs=5),  # number of output feature levels (FPN levels P2 to P6)

    rpn_head=dict(
        type='RPNHead',  # RPN head
        in_channels=256,  # input channels of the RPN
        feat_channels=256,  # channels of the intermediate feature layer
        anchor_generator=dict(
            type='AnchorGenerator',   # used by the vast majority of configs
            scales=[8],  # anchor scale: anchor side length = scale * base size (the stride of the level)
            ratios=[0.5, 1.0, 2.0],  # anchor height/width ratios; with one scale this gives 3 anchors per location
            strides=[4, 8, 16, 32, 64]),  # anchor stride on each feature level (in input-image pixels)
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',  # encodes boxes as (dx, dy, dw, dh) deltas relative to the anchor
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),  # means and standard deviations used to normalize the deltas
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),  # classification loss (cross entropy), here foreground vs. background
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),  # regression loss (L1) on the box offsets; both RPN losses are computed on anchors

    roi_head=dict(  # wraps the second stage of the two-stage detector
        type='StandardRoIHead',
        bbox_roi_extractor=dict(  # RoI feature extractor used for bbox regression
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),  # output size of the RoI features, i.e. 7x7
            out_channels=256,  # channels of the output feature map
            featmap_strides=[4, 8, 16, 32]),  # strides of the multi-scale feature maps
        bbox_head=dict(
            type='Shared2FCBBoxHead',  # bbox head with two shared fully connected layers
            in_channels=256,  # input channels
            fc_out_channels=1024,  # output channels of the fully connected layers
            roi_feat_size=7,  # size of the RoI features

            num_classes=CLASS_NUM,  # this is what differs from the COCO config

            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            # Whether box regression is class-agnostic. If True, one box is regressed per RoI regardless of class;
            # if False (as here), a separate box is regressed for every class.
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),  # classification loss over the actual classes (cross entropy)
            loss_bbox=dict(type='L1Loss', loss_weight=1.0))),

    # model training and testing settings
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',  # assigns positive/negative labels to RPN anchors
                pos_iou_thr=0.7,  # IoU threshold for positives
                neg_iou_thr=0.3,  # IoU threshold for negatives
                # Minimum IoU for a low-quality match: with match_low_quality=True, the best-overlapping anchor
                # of each ground-truth box is also made positive, provided its IoU is at least this value
                min_pos_iou=0.3,
                match_low_quality=True,   # whether to allow low-quality matches
                ignore_iof_thr=-1),   # IoF threshold for ignoring boxes, used when the ground truth contains ignore regions; -1 disables it
            sampler=dict(
                type='RandomSampler',  # sampler for positive and negative samples
                num=256,  # total number of samples to draw
                pos_fraction=0.5,  # fraction of positives
                neg_pos_ub=-1,  # upper bound on the negative/positive ratio; -1 means no bound
                add_gt_as_proposals=False),  # whether to add ground-truth boxes to the proposals as positives
            allowed_border=-1,  # pixels an anchor may extend past the image border and still be used; -1 disables the check
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,  # number of top-scoring boxes kept before NMS
            max_per_img=1000,  # maximum number of proposals (RoIs) kept after NMS
            nms=dict(type='nms', iou_threshold=0.7),  # NMS IoU threshold of 0.7
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,  # minimum IoU for a low-quality match (unused here because match_low_quality=False)
                match_low_quality=False,
                ignore_iof_thr=-1),  # IoF threshold for ignoring boxes, used when the ground truth contains ignore regions; -1 disables it
            sampler=dict(
                type='RandomSampler',
                num=512,  # total number of samples to draw
                pos_fraction=0.25,  # fraction of positives
                neg_pos_ub=-1,  # upper bound on the negative/positive ratio; -1 means no bound
                add_gt_as_proposals=True),  # add ground-truth boxes to the proposals as positives
            pos_weight=-1,  # weight of positive samples; -1 keeps the original weight
            debug=False)),

    test_cfg=dict(
        rpn=dict(
            nms_pre=1000,  # number of top-scoring proposals kept before NMS
            max_per_img=1000,  # maximum number of proposals kept after post-processing
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='nms', iou_threshold=0.5),
            max_per_img=100)
        # soft-nms is also supported for rcnn testing
        # e.g., nms=dict(type='soft_nms', iou_threshold=0.5, min_score=0.05)
    ))

# print(model.keys())
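
The RPN anchor settings above (scales=[8], ratios=[0.5, 1.0, 2.0], strides=[4, 8, 16, 32, 64]) can be turned into concrete anchor sizes, which is handy for judging whether they suit the objects in your data. A minimal sketch under two assumptions: the base size of each level equals its stride (which, as far as I know, is what AnchorGenerator does when base_sizes is not given), and ratio means height/width. anchor_sizes is a helper written for this post, not an MMDetection function:

```python
import math


def anchor_sizes(strides, scales, ratios):
    """Return {stride: [(width, height), ...]} for each feature level."""
    sizes = {}
    for stride in strides:
        level = []
        for ratio in ratios:            # ratio = height / width
            h_ratio = math.sqrt(ratio)
            w_ratio = 1.0 / h_ratio     # keeps the anchor area independent of the ratio
            for scale in scales:
                side = stride * scale   # side length of the square (ratio 1.0) anchor
                level.append((side * w_ratio, side * h_ratio))
        sizes[stride] = level
    return sizes


sizes = anchor_sizes([4, 8, 16, 32, 64], [8], [0.5, 1.0, 2.0])
for stride, level in sizes.items():
    print(stride, [(round(w, 1), round(h, 1)) for w, h in level])
```

With these settings each location gets 3 anchors, and the square anchor grows from 32x32 on the stride-4 level to 512x512 on the stride-64 level. If your objects are much smaller or much more elongated than that, scales and ratios are the values to revisit.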

(3) Create configs/_base_/schedules/a_schedule_1x_mydataset.py
This is modeled on schedule_1x.py (or edit that file directly) and sets the optimizer, the learning rate, the number of epochs and so on.

# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)  # weight_decay: L2 weight penalty (regularization)
optimizer_config = dict(grad_clip=None)  # grad_clip: optional gradient clipping to guard against exploding gradients
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,  # number of warmup iterations (batches)
    warmup_ratio=0.0001,
    step=[30, 60, 80])  # epochs at which the learning rate is multiplied by 0.1
runner = dict(type='EpochBasedRunner', max_epochs=100)  # total number of training epochs
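
To see what this schedule does to the learning rate, the step policy and the linear warmup can be reproduced in a few lines. This is a sketch of my understanding of the behavior (decay factor 0.1 per milestone, warmup ramping linearly from warmup_ratio * lr up to lr), not code taken from mmcv; learning_rate is a made-up helper:

```python
def learning_rate(epoch, cur_iter, base_lr=0.01, steps=(30, 60, 80), gamma=0.1,
                  warmup_iters=500, warmup_ratio=0.0001):
    """Learning rate at a 0-based epoch and a global iteration index."""
    # Step decay: one factor of gamma for every milestone already reached.
    passed = sum(1 for milestone in steps if epoch >= milestone)
    lr = base_lr * gamma ** passed
    # Linear warmup: scale from warmup_ratio * lr up to lr over warmup_iters.
    if cur_iter < warmup_iters:
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        lr = lr * (1 - k)
    return lr


for epoch in (0, 29, 30, 60, 80, 99):
    # Use an iteration past the warmup so only the step decay is visible.
    print(epoch, learning_rate(epoch, cur_iter=10_000))
```

Note that with a small dataset 500 warmup iterations can span several epochs (48 images at batch size 4 is only 12 iterations per epoch), so the early epochs run at a reduced learning rate.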

(4) Create configs/_base_/t_default_runtime_mydataset.py
This is modeled on default_runtime.py (or edit it directly) and sets how often logs are printed, how often checkpoints are saved, and so on.

# Pretrained model
# Downloaded from https://github.com/open-mmlab/mmdetection/tree/master/configs/faster_rcnn (I used the faster_rcnn_r50_fpn_1x_coco pretrained model)
load_from = 'checkpoints/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
checkpoint_config = dict(interval=20)  # save a checkpoint every 20 epochs (out of the 100 epochs set above)
evaluation = dict(interval=10, metric='bbox')  # evaluate every 10 epochs, using the bbox metric

# yapf:disable
log_config = dict(
    interval=4,  # print a log line every 4 iterations (batches)
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
resume_from = None
workflow = [('train', 2)]

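(5) Create configs/faster_rcnn/a_faster_rcnn_r50_fpn_1x_mydataset.py
This follows the training config faster_rcnn_r50_fpn_1x_coco.py; define your own config file in the order shown below.

This is the final model config, which pulls together the configs above: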

_base_ = [
    '../_base_/datasets/a_coco_detection_mydataset.py',
    '../_base_/models/a_faster_rcnn_r50_fpn_mydataset.py',
    '../_base_/schedules/a_schedule_1x_mydataset.py',
    '../_base_/t_default_runtime_mydataset.py'
]
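
Conceptually, the _base_ list is merged into one big dictionary, and anything defined in the final file overrides the inherited value key by key. The sketch below is a simplified illustration of that idea in plain Python, not mmcv's actual implementation (which has extra features such as deleting inherited keys); merge_into and compose are made-up names:

```python
def merge_into(overrides, base):
    """Recursively overlay `overrides` on a copy of `base`."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_into(value, merged[key])  # merge nested dicts
        else:
            merged[key] = value                           # replace plain values
    return merged


def compose(bases, child):
    """Combine base configs, then apply the child's overrides on top."""
    combined = {}
    for base in bases:
        duplicated = combined.keys() & base.keys()
        if duplicated:
            # Assumption: base files are expected not to define the same top-level key.
            raise KeyError(f"duplicate keys among bases: {sorted(duplicated)}")
        combined.update(base)
    return merge_into(child, combined)


dataset_cfg = {"data": {"samples_per_gpu": 4, "workers_per_gpu": 2}}
schedule_cfg = {"optimizer": {"type": "SGD", "lr": 0.01}}
final = compose([dataset_cfg, schedule_cfg], {"optimizer": {"lr": 0.005}})
print(final["optimizer"])  # {'type': 'SGD', 'lr': 0.005}
```

This is why the final config file can stay so short: it only needs to list the bases, plus any individual values you want to change for one experiment.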

5. Training

In tools/train.py, set the following two arguments (the config file and the output directory for the training results), then run python tools/train.py.
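
Instead of editing the script, the same two values can be passed on the command line: tools/train.py takes the config path as a positional argument and the output directory via --work-dir. A small sketch that builds the command from Python; build_train_command is a made-up helper, and the work directory is an assumption inferred from the checkpoint path used later in this post:

```python
import subprocess
import sys


def build_train_command(config, work_dir):
    """Return the argument list for launching tools/train.py."""
    return [sys.executable, "tools/train.py", config, "--work-dir", work_dir]


if __name__ == "__main__":
    cmd = build_train_command(
        "configs/faster_rcnn/a_faster_rcnn_r50_fpn_1x_mydataset.py",
        "tr_oil_det_20220323/faster_rcnn",  # assumed output directory
    )
    print(" ".join(cmd))
    # Run from the mmdetection repository root to actually start training:
    # subprocess.run(cmd, check=True)
```

Checkpoints (epoch_20.pth, epoch_40.pth and so on with the settings above) and logs end up in the work directory.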

6. Inference and validation

(1) For a quick check on a single image, something like the following test_one.py works:

from mmcv import Config
from mmdet.apis import init_detector, inference_detector, show_result_pyplot


config_file = 'configs/faster_rcnn/a_faster_rcnn_r50_fpn_1x_mydataset.py'
checkpoint_file = 'tr_oil_det_20220323/faster_rcnn/epoch_100.pth'
cfg = Config.fromfile(config_file)


def one_pic_detect():

    # build the model from a config file and a checkpoint file
    model = init_detector(cfg, checkpoint_file, device='cpu')
    # test a single image
    img = r'F:\xxx\aaa.png'
    result = inference_detector(model, img)
    # show the results in a window
    show_result_pyplot(model, img, result)
    # also save the visualization to disk
    model.show_result(img, result, out_file='../result-220323.jpg')
    print('Inference finished.')


if __name__ == '__main__':
    one_pic_detect()
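
Beyond the visualization, the raw result is often what you want to work with. For a box-only detector like this one, my understanding is that inference_detector in MMDetection 2.x returns a list with one array per class, where each row is [x1, y1, x2, y2, score]. A minimal sketch that filters and flattens such a result; detections_above is a made-up helper, the class name 'tree' is hypothetical, and nested lists stand in for the arrays so the example runs without MMDetection:

```python
def detections_above(result, class_names, score_thr=0.3):
    """Flatten per-class detections into (class_name, score, box) tuples."""
    kept = []
    for class_name, class_dets in zip(class_names, result):
        for x1, y1, x2, y2, score in class_dets:
            if score >= score_thr:
                box = [float(x1), float(y1), float(x2), float(y2)]
                kept.append((class_name, float(score), box))
    kept.sort(key=lambda det: det[1], reverse=True)  # highest score first
    return kept


fake_result = [
    [[10, 20, 110, 220, 0.92], [15, 25, 100, 200, 0.10]],  # class 0
    [],                                                    # class 1: nothing found
    [[0, 0, 50, 50, 0.55]],                                # class 2
]
names = ("people", "building", "tree")
for name, score, box in detections_above(fake_result, names):
    print(name, round(score, 2), box)
```

In the real script, pass the result from inference_detector and the same class tuple used in the dataset config.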

(2) To run inference over the whole val set in batch, edit the config, checkpoint, work-dir and related arguments in tools/test.py accordingly, then run the script.
