Training with MMDetection (1): a custom VOC-format dataset
Prerequisites
A VOC-format dataset and the MMDetection code base.
Part 1: The VOC dataset
The data needs to be organized into the following three folders (you do not have to follow the VOCdevkit/VOC2007 layout exactly; the steps below show how to adapt the paths):
- a folder holding the XML annotation files (Annotations in the standard layout)
- a folder holding train.txt, test.txt, and val.txt (ImageSets/Main in the standard layout)
- a folder holding the original images (JPEGImages in the standard layout)
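If you still need to create the split files, they are just plain-text lists of sample IDs, one annotation/image basename per line. A minimal sketch that generates them from the annotation folder (the folder names and split ratios here are assumptions; adjust them to your own layout):

```python
# Sketch: generate train/val/test split files for a VOC-style dataset.
# Assumes annotations live in <voc_root>/Annotations and splits go to
# <voc_root>/ImageSets/Main -- change these if your layout differs.
import os
import random


def write_splits(voc_root, train_ratio=0.8, val_ratio=0.1, seed=0):
    ann_dir = os.path.join(voc_root, 'Annotations')
    # One sample ID per XML file, without the extension.
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(ann_dir)
                 if f.endswith('.xml'))
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_ratio)
    n_val = int(len(ids) * val_ratio)
    splits = {
        'train': ids[:n_train],
        'val': ids[n_train:n_train + n_val],
        'test': ids[n_train + n_val:],
    }
    out_dir = os.path.join(voc_root, 'ImageSets', 'Main')
    os.makedirs(out_dir, exist_ok=True)
    for name, subset in splits.items():
        with open(os.path.join(out_dir, f'{name}.txt'), 'w') as f:
            f.write('\n'.join(subset) + '\n')
    return {k: len(v) for k, v in splits.items()}
```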
Part 2: Modifying the configuration code for training (pay close attention!)
Note: all of the changes here and below are made in configuration files you create yourself; as a rule the original code is not modified. Editing the original code is tedious, easy to mix up, and risks breaking the overall structure of the code base, so everything below goes into your own config files, which you should adapt as needed (this is my preferred way of configuring).
1. Dataset-related modifications
(1) In configs/_base_/datasets/voc0712.py
Change the paths to match your VOC dataset (in this file, only change the paths; leave everything else alone).
```python
# dataset settings
dataset_type = 'VOCDataset'
# data_root = 'data/VOCdevkit/'
data_root = '/home/ubuntu/data/Official-SSDD-OPEN/BBox_SSDD/'

# Example to use different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from prefix (not support LMDB and Memcache yet)
# data_root = 's3://openmmlab/datasets/detection/segmentation/VOCdevkit/'
# Method 2: Use `backend_args`, `file_client_args` in versions before 3.0.0rc6
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/segmentation/',
#         'data/': 's3://openmmlab/datasets/segmentation/'
#     }))
backend_args = None

# data augmentation pipelines
train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    # avoid bboxes being resized
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]

# data loaders
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='ConcatDataset',
            # VOCDataset will add different `dataset_type` in dataset.metainfo,
            # which will get error if using ConcatDataset. Adding
            # `ignore_keys` can avoid this error.
            ignore_keys=['dataset_type'],
            datasets=[
                dict(
                    type=dataset_type,
                    data_root=data_root,
                    # ann_file='VOC2007/ImageSets/Main/trainval.txt',
                    ann_file='voc_style/ImageSets/Main/train.txt',
                    data_prefix=dict(sub_data_root='voc_style/'),
                    filter_cfg=dict(
                        filter_empty_gt=True, min_size=32, bbox_min_size=32),
                    pipeline=train_pipeline,
                    backend_args=backend_args),
                # dict(
                #     type=dataset_type,
                #     data_root=data_root,
                #     ann_file='VOC2012/ImageSets/Main/trainval.txt',
                #     data_prefix=dict(sub_data_root='VOC2012/'),
                #     filter_cfg=dict(
                #         filter_empty_gt=True, min_size=32, bbox_min_size=32),
                #     pipeline=train_pipeline,
                #     backend_args=backend_args)
            ])))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='voc_style/ImageSets/Main/test.txt',
        data_prefix=dict(sub_data_root='voc_style/'),
        test_mode=True,
        pipeline=test_pipeline,
        backend_args=backend_args))
test_dataloader = val_dataloader

# Pascal VOC2007 uses `11points` as default evaluate mode, while PASCAL
# VOC2012 defaults to use 'area'.
val_evaluator = dict(type='VOCMetric', metric='mAP', eval_mode='11points')
test_evaluator = val_evaluator
```
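A note on the Resize step above: with keep_ratio=True, scale=(1000, 600) is an upper bound, not a fixed output size; the image is scaled by the largest factor that keeps the long edge within 1000 and the short edge within 600. A small sketch of that rule (this is my reading of mmcv's rescale behavior, so treat it as an assumption):

```python
# Sketch of keep_ratio rescaling: the scale factor is the minimum of the
# long-edge ratio and the short-edge ratio, so aspect ratio is preserved.
def rescale_size(w, h, scale=(1000, 600)):
    max_long, max_short = max(scale), min(scale)
    factor = min(max_long / max(w, h), max_short / min(w, h))
    return round(w * factor), round(h * factor)

# e.g. a 500x375 VOC image scales by min(1000/500, 600/375) = 1.6 -> 800x600
```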
(2) Modify mmdet/datasets/voc.py
Change the class names and box colors to match your own dataset, and be sure to comment out the check that requires a VOC2007/VOC2012 folder name, so that your own dataset path can be used later.
```python
# Copyright (c) OpenMMLab. All rights reserved.
from mmdet.registry import DATASETS
from .xml_style import XMLDataset


@DATASETS.register_module()
class VOCDataset(XMLDataset):
    """Dataset for PASCAL VOC."""

    # standard VOC class metadata
    # METAINFO = {
    #     'classes':
    #     ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
    #      'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    #      'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'),
    #     # palette is a list of color tuples, which is used for visualization.
    #     'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192),
    #                 (197, 226, 255), (0, 60, 100), (0, 0, 142),
    #                 (255, 77, 255), (153, 69, 1), (120, 166, 157),
    #                 (0, 182, 199), (0, 226, 252), (182, 182, 255),
    #                 (0, 0, 230), (220, 20, 60), (163, 255, 0), (0, 82, 0),
    #                 (3, 95, 161), (0, 80, 100), (183, 130, 88)]
    # }

    # modified class metadata for the custom dataset
    METAINFO = {
        'classes': ('ship', ),
        # palette is a list of color tuples, which is used for visualization.
        'palette': [(106, 0, 228)]
    }

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # if 'VOC2007' in self.sub_data_root:
        #     self._metainfo['dataset_type'] = 'VOC2007'
        # elif 'VOC2012' in self.sub_data_root:
        #     self._metainfo['dataset_type'] = 'VOC2012'
        # else:
        #     self._metainfo['dataset_type'] = None
```
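Before editing METAINFO, it is worth checking which class names actually occur in your annotations: a mismatch between the 'classes' tuple and the XML `<name>` tags silently yields empty ground truth. A quick stdlib sketch (the Annotations path is an assumption; point it at your own folder):

```python
# Sketch: scan VOC-style XML annotations and count the class names that
# actually appear, to sanity-check METAINFO['classes'].
import os
import xml.etree.ElementTree as ET
from collections import Counter


def count_classes(ann_dir):
    counts = Counter()
    for fname in os.listdir(ann_dir):
        if not fname.endswith('.xml'):
            continue
        root = ET.parse(os.path.join(ann_dir, fname)).getroot()
        # Each <object> element carries one <name> tag with the class label.
        for obj in root.findall('object'):
            counts[obj.findtext('name')] += 1
    return counts
```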
(3) Change the number of output classes in the model config (optional)
configs/_base_/models/faster-rcnn_r50_fpn.py
2. Building a custom config file
(1) Create a file named myconfig.py in the repository root.
(2) Copy the following content into it.
The new config file has three main parts:
1. importing the corresponding base config files (_base_)
2. the model settings: be sure to change num_classes to your number of classes
3. the dataset settings: copy them directly from configs/_base_/datasets/voc0712.py
```python
# The new config inherits a base config and makes the necessary changes.
# _base_ = './configs/faster_rcnn/mask-rcnn_r50-caffe_fpn_ms-poly-1x_coco.py'
_base_ = './configs/faster_rcnn/faster-rcnn_r50_fpn_1x_voc.py'

########------ model settings ------#########
# Change num_classes in the head to match the number of classes in the dataset.
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=1)))

########------ dataset settings ------#########
backend_args = None
dataset_type = 'VOCDataset'
# data_root = 'data/VOCdevkit/'
data_root = '/home/ubuntu/data/Official-SSDD-OPEN/BBox_SSDD/'

# data augmentation pipelines
train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    # avoid bboxes being resized
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]

# data loaders
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='ConcatDataset',
            # VOCDataset will add different `dataset_type` in dataset.metainfo,
            # which will get error if using ConcatDataset. Adding
            # `ignore_keys` can avoid this error.
            ignore_keys=['dataset_type'],
            datasets=[
                dict(
                    type=dataset_type,
                    data_root=data_root,
                    # ann_file='VOC2007/ImageSets/Main/trainval.txt',
                    ann_file='voc_style/ImageSets/Main/train.txt',
                    data_prefix=dict(sub_data_root='voc_style/'),
                    filter_cfg=dict(
                        filter_empty_gt=True, min_size=32, bbox_min_size=32),
                    pipeline=train_pipeline,
                    backend_args=backend_args),
                # dict(
                #     type=dataset_type,
                #     data_root=data_root,
                #     ann_file='VOC2012/ImageSets/Main/trainval.txt',
                #     data_prefix=dict(sub_data_root='VOC2012/'),
                #     filter_cfg=dict(
                #         filter_empty_gt=True, min_size=32, bbox_min_size=32),
                #     pipeline=train_pipeline,
                #     backend_args=backend_args)
            ])))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='voc_style/ImageSets/Main/test.txt',
        data_prefix=dict(sub_data_root='voc_style/'),
        test_mode=True,
        pipeline=test_pipeline,
        backend_args=backend_args))
test_dataloader = val_dataloader

# Initializing from pretrained model weights can improve performance.
# load_from = 'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
```
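How the override works: MMEngine merges the dict you declare in myconfig.py into the dict built from _base_, recursively, so a key you set (num_classes) replaces the base value while sibling keys are inherited unchanged. A toy illustration of that merge rule (a simplification for intuition, not MMEngine's actual implementation):

```python
def merge_cfg(base, override):
    """Recursively merge `override` into `base`: nested dicts are merged
    key by key, everything else is replaced by the override value."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_cfg(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical base values, for illustration only.
base = {'roi_head': {'bbox_head': {'num_classes': 20, 'in_channels': 1024}}}
override = {'roi_head': {'bbox_head': {'num_classes': 1}}}
merged = merge_cfg(base, override)
# num_classes is overridden; in_channels is inherited from the base config
```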
Part 3: Training, evaluation, and testing
To train the model:

```bash
python tools/train.py myconfig.py
```

To test the model, pass the config together with a trained checkpoint to tools/test.py (the checkpoint path below is an example; substitute the one written to your work_dirs):

```bash
python tools/test.py myconfig.py work_dirs/myconfig/epoch_12.pth
```
Summary
Problems encountered during training
A record of one problem: when training on the SSDD dataset, COCO-format training behaved normally, but VOC-format training produced a very low mAP that never climbed. Training the same VOC-format data with the 2.x branch of MMDetection worked fine. The fix was to delete bbox_min_size=32 from configs/_base_/datasets/voc0712.py. Original post: https://blog.csdn.net/Pliter/article/details/134389961
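The likely mechanism behind that fix: SSDD ship targets are often tiny, and a filter like bbox_min_size=32 drops every ground-truth box whose width or height falls below 32 pixels, so the model never sees those targets during training. A toy sketch of that filtering rule (an illustration of the behavior, not MMDetection's actual code):

```python
def filter_small_bboxes(bboxes, bbox_min_size):
    """Drop boxes whose width or height is below `bbox_min_size`.
    Boxes are (x1, y1, x2, y2) tuples in pixels."""
    return [b for b in bboxes
            if (b[2] - b[0]) >= bbox_min_size and (b[3] - b[1]) >= bbox_min_size]


# Three hypothetical ship boxes: 20x15, 50x40, and 35x25 pixels.
gt = [(0, 0, 20, 15), (10, 10, 60, 50), (5, 5, 40, 30)]
kept = filter_small_bboxes(gt, 32)
# only the 50x40 box survives; both small ship boxes are discarded
```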