Object Detection 00-04: mmdetection (FoveaBox as an example) - annotated config file - continuously updated

江南才尽，年少无知！

Last modified: 2022-07-16 17:24:17

First published: 2020-09-03 17:14:00

Article link: https://blog.csdn.net/weixin_43013761/article/details/108388011

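The link below collects all of my notes on mmdetection (using the FoveaBox object detection framework as the example). If you find a mistake, please point it out and I will correct it as soon as I can. If you would like to discuss the technical details, you can add me on WeChat: 17575010159. If the post helped you, please leave a like; it is the best encouragement I can get.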

Object Detection 00-00: mmdetection (FoveaBox as an example) - table of contents - an up-to-date, thorough walkthrough

Preface

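This chapter is fairly dry. I list the annotated cfg file as a separate post so that it is easy to look up and analyze (if anything is wrong, please point it out). This is the configuration I used for my own training and testing; the corresponding dataset was published in an earlier post.
Important: an mmdetection config file cannot contain Chinese comments, so I provide two versions, an annotated one and an unannotated one.
Copy the unannotated code into configs/foveabox/my_fovea_r50_fpn_4x4_2x_coco.py (create the file yourself).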
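
Since non-ASCII comments are the reason for keeping two versions, it can help to check a config for such characters before training. The sketch below is a minimal, standalone check; the helper name `find_non_ascii_lines` and the sample text are hypothetical and are not part of mmdetection.

```python
def find_non_ascii_lines(text):
    """Return (line_number, line) pairs for lines containing non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()  # str.isascii() requires Python 3.7+
    ]


# Hypothetical sample: one line carries a Chinese comment.
sample = "model = dict(\n    type='FOVEA',  # 设置为FOVEA\n    depth=50)\n"
hits = find_non_ascii_lines(sample)
for number, line in hits:
    print(f"line {number}: {line}")
```

To check a real file, read it with `open(path, encoding='utf-8').read()` and pass the text to the same function; an empty result means the file is pure ASCII.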

Annotated version:

dataset_type = 'MyCocoDataset'
data_root = './data/coco_meter/'

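# model settings
model = dict(
    type='FOVEA', # resolves to the class mmdet.models.detectors.fovea.FOVEA
    pretrained='torchvision://resnet50', # pretrained backbone weights
    backbone=dict( # backbone configuration
        type='ResNet', # backbone type
        depth=50, # backbone depth (ResNet-50)
        num_stages=4, # number of stages
        out_indices=(0, 1, 2, 3), # indices of the stages whose feature maps are output
        frozen_stages=1, # freeze the stem and the first stage
        norm_cfg=dict(type='BN', requires_grad=True), # use batch normalization (BN) with trainable affine parameters
        norm_eval=True, # keep BN layers in eval mode during training, so their running statistics are not updated
        style='pytorch'), # PyTorch-style bottleneck (the stride-2 layer is the 3x3 conv)
    neck=dict(
        type='FPN', # feature pyramid network
        in_channels=[256, 512, 1024, 2048], # channels of each input feature map
        out_channels=256, # channels of each output feature map
        start_level=1, # index of the first backbone level used to build the pyramid
        num_outs=5, # number of output feature levels
        add_extra_convs='on_input'), # extra levels come from convs applied to the last input feature map
    bbox_head=dict(
        type='FoveaHead', # head type
        num_classes=1, # number of classes in the dataset
        in_channels=256, # number of input channels
        stacked_convs=4, # number of stacked conv layers in each head branch (the cls branch and box branch in Fig. 4 of the paper)
        feat_channels=256, # number of channels in the head's intermediate feature maps
        strides=[8, 16, 32, 64, 128], # can be read as s_l in the paper
        base_edge_list=[16, 32, 64, 128, 256], # corresponds to r_l in the paper
        scale_ranges=((1, 64), (32, 128), (64, 256), (128, 512), (256, 2048)), # corresponds to [r_l/η, r_l·η] in the paper, where η defaults to 2; the ranges here are listed explicitly rather than computed from η
        sigma=0.4, # corresponds to σ in the paper
        with_deform=False,
        loss_cls=dict( # FocalLoss settings
            type='FocalLoss',
            use_sigmoid=True,
            gamma=1.50,
            alpha=0.4,
            loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))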
# training and testing settings
train_cfg = dict()

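# test-time (inference) parameters
test_cfg = dict(
    nms_pre=1000, # keep at most the 1000 highest-scoring boxes before NMS
    score_thr=0.05, # only boxes with a score above 0.05 go through NMS
    nms=dict(type='nms', iou_threshold=0.5), # NMS settings
    max_per_img=100) # maximum number of detections kept per image

# image normalization parameters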
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)

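train_pipeline = [
    dict(type='LoadImageFromFile'), # load the image pixels
    dict(type='LoadAnnotations', with_bbox=True), # load the annotations (bounding boxes)
    dict(type='Resize', img_scale=(640, 480), keep_ratio=True), # rescale the image to fit within img_scale, keeping the aspect ratio
    dict(type='RandomFlip', flip_ratio=0.5), # random horizontal flip with probability 0.5
    dict(type='Normalize', **img_norm_cfg), # normalize with the mean/std above
    dict(type='Pad', size_divisor=32), # pad so that height and width are multiples of 32
    dict(type='DefaultFormatBundle'), # convert the fields to tensors and wrap them in DataContainer
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']), # keys the network needs for training
]

# data transforms for testing and evaluation
test_pipeline = [
    dict(type='LoadImageFromFile'), # load the image pixels
    dict(
        type='MultiScaleFlipAug', # test-time augmentation wrapper (a single scale and no flip here)
        img_scale=(640, 480),
        flip=False, # horizontal flip disabled
        transforms=[
            dict(type='Resize', keep_ratio=True), # rescale to img_scale while keeping the aspect ratio
            dict(type='RandomFlip'), # flips only when flip=True above, so it is a no-op here
            dict(type='Normalize', **img_norm_cfg), # normalize with the mean/std above
            dict(type='Pad', size_divisor=32), # pad the image
            dict(type='ImageToTensor', keys=['img']), # convert the image to a tensor
            dict(type='Collect', keys=['img']), # testing only needs the input image
        ])
]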

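# dataset settings
data = dict(
    samples_per_gpu=2,  # batch size per GPU
    workers_per_gpu=2,  # number of data-loading worker processes per GPU
    train=dict(
        type=dataset_type,  # dataset class used to load the data
        ann_file=data_root + 'annotations/train.json', # path of the training JSON annotation file
        img_prefix=data_root + 'train/', # directory of the training images
        pipeline=train_pipeline), # data transforms such as resizing and flipping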
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/test.json',
        img_prefix=data_root + 'test/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/test.json',
        img_prefix=data_root + 'test/',
        pipeline=test_pipeline))


# optimizer settings
optimizer = dict(type='SGD',
                 lr=0.002,
                 momentum=0.9,
                 weight_decay=0.0001)

optimizer_config = dict(grad_clip=None)



# learning policy: the learning-rate decay schedule
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[100, 150])

total_epochs = 200  # training stops once this many epochs have run

# yapf:disable
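# logging settings; the TensorBoard hook can be enabled as well
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

# yapf:enable
# distributed-training parameters
dist_params = dict(backend='nccl')
# logging level
log_level = 'INFO'
# path of a checkpoint to load weights from (overrides the initial parameters)
load_from = None
# path of a checkpoint to resume training from (None starts a fresh run)
resume_from = None
# the default is fine and can be left alone; for details see the blog post:
#
workflow = [('train', 1)]
# save a checkpoint every 10 epochs
checkpoint_config = dict(interval=10)
# run evaluation every 10 epochs
evaluation = dict(interval=10, metric='bbox', classwise=True)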
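
The comment on scale_ranges ties it to the interval [r_l/η, r_l·η] with η = 2. As a quick arithmetic check, the sketch below computes that interval from base_edge_list and prints it next to the configured values. The helper name `ranges_from_eta` is hypothetical and is not an mmdetection function.

```python
# Values copied from the bbox_head section of the config above.
base_edge_list = [16, 32, 64, 128, 256]
configured_ranges = ((1, 64), (32, 128), (64, 256), (128, 512), (256, 2048))


def ranges_from_eta(base_edges, eta=2.0):
    """Return (r_l / eta, r_l * eta) for every base edge r_l."""
    return [(r / eta, r * eta) for r in base_edges]


formula_ranges = ranges_from_eta(base_edge_list, eta=2.0)
for r, computed, configured in zip(base_edge_list, formula_ranges, configured_ranges):
    print(f"r_l={r}: formula {computed}, config {configured}")
```

None of the five configured ranges equals the η = 2 interval for its own level, which suggests the values in this config were chosen by hand rather than derived from the formula.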

Unannotated version:

dataset_type = 'MyCocoDataset'
data_root = './data/coco_meter/'

# model settings
model = dict(
    type='FOVEA',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        num_outs=5,
        add_extra_convs='on_input'),
    bbox_head=dict(
        type='FoveaHead',
        num_classes=1,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        strides=[8, 16, 32, 64, 128],
        base_edge_list=[16, 32, 64, 128, 256],
        scale_ranges=((1, 64), (32, 128), (64, 256), (128, 512), (256, 2048)),
        sigma=0.4,
        with_deform=False,
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=1.50,
            alpha=0.4,
            loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))
# training and testing settings
train_cfg = dict()

test_cfg = dict(
    nms_pre=1000,
    score_thr=0.05,
    nms=dict(type='nms', iou_threshold=0.5),
    max_per_img=100)

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(640, 480),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/train.json',
        img_prefix=data_root + 'train/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/test.json',
        img_prefix=data_root + 'test/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/test.json',
        img_prefix=data_root + 'test/',
        pipeline=test_pipeline))


# optimizer
optimizer = dict(type='SGD',
                 lr=0.002,
                 momentum=0.9,
                 weight_decay=0.0001)

optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[100, 150])

total_epochs = 200
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=10, metric='bbox')
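
To make lr_config more concrete, here is a minimal sketch of the learning rate it should produce. It assumes the step policy multiplies the rate by gamma = 0.1 at each listed epoch (gamma is not set in the config, so 0.1 is an assumed default) and that the linear warmup is applied per iteration. The function names `step_lr` and `linear_warmup_lr` are hypothetical; this is not the library's code.

```python
from bisect import bisect_right


def step_lr(epoch, base_lr=0.002, steps=(100, 150), gamma=0.1):
    """Learning rate after step decay; epoch is 0-indexed."""
    # Count how many decay milestones have been reached at this epoch.
    return base_lr * gamma ** bisect_right(steps, epoch)


def linear_warmup_lr(cur_iter, regular_lr, warmup_iters=500, warmup_ratio=0.001):
    """Ramp the rate linearly from regular_lr * warmup_ratio up to regular_lr."""
    if cur_iter >= warmup_iters:
        return regular_lr
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return regular_lr * (1 - k)


for epoch in (0, 99, 100, 150, 199):
    print(f"epoch {epoch}: lr = {step_lr(epoch):.6g}")
for it in (0, 250, 500):
    print(f"iter {it}: warmup lr = {linear_warmup_lr(it, step_lr(0)):.6g}")
```

Under these assumptions the rate starts near 0.002 × 0.001 during warmup, reaches 0.002 after 500 iterations, and drops by a factor of 10 at epochs 100 and 150 before training stops at total_epochs = 200.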
