For object detection, the dataloaders we build are usually in COCO or VOC format. Below, for each of the two formats, we walk through how to register your own dataset in Detectron2.
COCO-format datasets
The COCO format itself is not covered in detail here; see other write-ups such as "Dataset - COCO Dataset 数据特点". One question I had not verified — whether "area" in annotations is the area of the segmentation or of the bbox — is answered in《COCO数据集标注格式详解----object instances》: it is the segmentation area. Two more points worth noting: (1) bbox and segmentation values may be floating-point, not integers; (2) segmentation can be expressed either as RLE or as polygons, with polygons being the common case. Examples of both:
RLE format:
{"segmentation": {"counts": [8214,6,629,17,2,6,614,28,611,29,610,31,609,31,609,32,608,32,608,32,608,31,609,31,610,29,612,27,615,16,3,4,620,11,35186,6,633,9,630,11,628,14,626,14,626,15,625,15,625,16,624,16,624,16,625,14,627,13,628,11,631,8,634,4,40318,5,629,14,624,17,622,19,620,20,619,22,617,23,617,23,617,22,618,22,618,21,619,7,1,4,3,4,621,6,3,1,631,3,638,1,133135,5,633,8,631,10,630,10,630,11,629,11,629,11,629,11,629,11,629,11,629,10,631,9,632,7,634,4,99294],
                  "size": [640,549]},
 "area": 962, "iscrowd": 1, "image_id": 374545,
 "bbox": [12,524,381,33], "category_id": 1, "id": 900100374545},
Polygon format:
{"segmentation": [[510.66,423.01,511.72,420.03,510.45,416.0,510.34,413.02,510.77,410.26,510.77,407.5,510.34,405.16,511.51,402.83,511.41,400.49,510.24,398.16,509.39,397.31,504.61,399.22,502.17,399.64,500.89,401.66,500.47,402.08,499.09,401.87,495.79,401.98,490.59,401.77,488.79,401.77,485.39,398.58,483.9,397.31,481.56,396.35,478.48,395.93,476.68,396.03,475.4,396.77,473.92,398.79,473.28,399.96,473.49,401.87,474.56,403.47,473.07,405.59,473.39,407.71,476.68,409.41,479.23,409.73,481.56,410.69,480.4,411.85,481.35,414.93,479.86,418.65,477.32,420.03,476.04,422.58,479.02,422.58,480.29,423.01,483.79,419.93,486.66,416.21,490.06,415.57,492.18,416.85,491.65,420.24,492.82,422.9,493.56,424.39,496.43,424.6,498.02,423.01,498.13,421.31,497.07,420.03,497.07,415.15,496.33,414.51,501.1,411.96,502.06,411.32,503.02,415.04,503.33,418.12,501.1,420.24,498.98,421.63,500.47,424.39,505.03,423.32,506.2,421.31,507.69,419.5,506.31,423.32,510.03,423.01,510.45,423.01]],
 "area": 702.1057499999998, "iscrowd": 0, "image_id": 289343,
 "bbox": [473.07,395.93,38.65,28.67], "category_id": 18, "id": 1768},
The RLE format is a dict and corresponds to iscrowd=1; it annotates a crowd of people or objects.
The polygon format is a list and corresponds to iscrowd=0; it annotates a single object — note that one object may be split into several polygons because of occlusion.
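To make the "area" question above concrete, here is a quick sketch (pure Python, with a made-up L-shaped polygon, not a real COCO annotation) comparing a polygon's shoelace area with its bounding-box area — the two differ, and "area" stores the former:

```python
def shoelace_area(poly):
    """Area of a flat [x1, y1, x2, y2, ...] polygon via the shoelace formula."""
    xs, ys = poly[0::2], poly[1::2]
    n = len(xs)
    s = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n))
    return abs(s) / 2.0

# An L-shaped object: its bbox is 4x4, but the shape covers only 12 of 16 units.
polygon = [0, 0, 4, 0, 4, 2, 2, 2, 2, 4, 0, 4]
bbox = [0, 0, 4, 4]  # XYWH

print(shoelace_area(polygon))  # 12.0 -> what "area" stores (segmentation area)
print(bbox[2] * bbox[3])       # 16   -> bbox area, a different number
```

The same mismatch is visible in the polygon example above: area is 702.1 while the 38.65×28.67 bbox covers about 1108 pixels.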
If your dataset has only detection boxes and no segmentation, when preparing it you can simply fill segmentation with the four corner points of the bbox, written in order.
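That bbox-only conversion can be sketched as a small helper (the function name is made up for illustration):

```python
def bbox_to_segmentation(bbox):
    """Turn a COCO XYWH bbox into a 4-corner polygon segmentation.

    Corners are written in order: top-left, top-right,
    bottom-right, bottom-left.
    """
    x, y, w, h = bbox
    return [[x, y, x + w, y, x + w, y + h, x, y + h]]

# Build a detection-only annotation entry around it:
ann = {"bbox": [473.07, 395.93, 38.65, 28.67], "iscrowd": 0}
ann["segmentation"] = bbox_to_segmentation(ann["bbox"])
ann["area"] = ann["bbox"][2] * ann["bbox"][3]  # no mask, so bbox area stands in
```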
That covers the data format; next, how to add support for a COCO-format dataset in detectron2:
Only detectron2/data/datasets/builtin.py needs to be modified. Following the pattern of the register_all_coco() function, add your own dataset like this:
_PREDEFINED_SPLITS_SELF_COCO = {}
_PREDEFINED_SPLITS_SELF_COCO["data1"] = {
    "trainv1": ("JPEGImages", "seg_train.json"),
    "testv1": ("JPEGImages", "segtest.json"),
}

def register_all_self_coco(root="/home/user/data"):
    for dataset_name, splits_per_dataset in _PREDEFINED_SPLITS_SELF_COCO.items():
        for key, (image_root, json_file) in splits_per_dataset.items():
            register_coco_instances(
                key,
                {},
                os.path.join(root, json_file),
                os.path.join(root, image_root),
            )
        print("register {} success".format(dataset_name))
Two things to note:
1. For the COCO format, only the image directory and the annotation file are needed, so each entry of _PREDEFINED_SPLITS_SELF_COCO carries just three pieces of information: the dataset name, the image path, and the annotation path.
2. The second argument of register_coco_instances is left as an empty dict; this is the usage the function itself recommends:
def register_coco_instances(name, metadata, json_file, image_root):
    """
    Register a dataset in COCO's json annotation format for
    instance detection, instance segmentation and keypoint detection.
    (i.e., Type 1 and 2 in http://cocodataset.org/#format-data.
    `instances*.json` and `person_keypoints*.json` in the dataset).

    This is an example of how to register a new dataset.
    You can do something similar to this function, to register new datasets.

    Args:
        name (str): the name that identifies a dataset, e.g. "coco_2014_train".
        metadata (dict): extra metadata associated with this dataset. You can
            leave it as an empty dict.
        json_file (str): path to the json instance annotation file.
        image_root (str): directory which contains all the images.
    """
    # 1. register a function which returns dicts
    DatasetCatalog.register(name, lambda: load_coco_json(json_file, image_root, name))

    # 2. Optionally, add metadata about this dataset,
    # since they might be useful in evaluation, visualization or logging
    MetadataCatalog.get(name).set(
        json_file=json_file, image_root=image_root, evaluator_type="coco", **metadata
    )
That completes adding a custom COCO-format dataset; rebuild (reinstall) detectron2 so the modified builtin.py takes effect.
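Once registered, the split names trainv1/testv1 can be referenced by name in a training config. A minimal fragment (assumes a cfg object from detectron2's get_cfg(); the NUM_CLASSES value is an assumption — set it to your own class count):

```python
# Fragment for a training script; cfg comes from detectron2.config.get_cfg()
cfg.DATASETS.TRAIN = ("trainv1",)    # split names registered above
cfg.DATASETS.TEST = ("testv1",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3  # assumption: your dataset's class count
```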
PASCAL VOC-format datasets
A VOC-format dataset has one XML file per image; training requires the image directory, the annotation directory, and a split list. Here we do not follow VOC's year-based splits. Two steps are needed:
1. Following detectron2/data/datasets/pascal_voc.py, add a dataloader self_voc.py for your own VOC data, with the following content:
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
from fvcore.common.file_io import PathManager
import os
import numpy as np
import xml.etree.ElementTree as ET

from detectron2.structures import BoxMode
from detectron2.data import DatasetCatalog, MetadataCatalog

__all__ = ["register_self_voc"]

# fmt: off
CLASS_NAMES = [
    "aeroplane",
]
# fmt: on


def load_voc_instances(dirname: str, split: str):
    """
    Load Pascal VOC detection annotations to Detectron2 format.

    Args:
        dirname: Contain "Annotations", "ImageSets", "JPEGImages"
        split (str): one of "train", "test", "val", "trainval"
    """
    with PathManager.open(os.path.join(dirname, "ImageSets", split + ".txt")) as f:
        fileids = np.loadtxt(f, dtype=str)  # note: np.str is removed in NumPy >= 1.24

    dicts = []
    for fileid in fileids:
        anno_file = os.path.join(dirname, "Annotations", fileid + ".xml")
        jpeg_file = os.path.join(dirname, "JPEGImages", fileid + ".jpg")

        tree = ET.parse(anno_file)

        r = {
            "file_name": jpeg_file,
            "image_id": fileid,
            "height": int(tree.findall("./size/height")[0].text),
            "width": int(tree.findall("./size/width")[0].text),
        }
        instances = []

        for obj in tree.findall("object"):
            cls = obj.find("name").text
            # We include "difficult" samples in training.
            # Based on limited experiments, they don't hurt accuracy.
            # difficult = int(obj.find("difficult").text)
            # if difficult == 1:
            #     continue

            bbox = obj.find("bndbox")
            bbox = [float(bbox.find(x).text) for x in ["xmin", "ymin", "xmax", "ymax"]]
            # Original annotations are integers in the range [1, W or H]
            # Assuming they mean 1-based pixel indices (inclusive),
            # a box with annotation (xmin=1, xmax=W) covers the whole image.
            # In coordinate space this is represented by (xmin=0, xmax=W)
            bbox[0] -= 1.0
            bbox[1] -= 1.0

            instances.append(
                {"category_id": CLASS_NAMES.index(cls), "bbox": bbox, "bbox_mode": BoxMode.XYXY_ABS}
            )
        r["annotations"] = instances
        dicts.append(r)
    return dicts


def register_self_voc(name, dirname, split):
    DatasetCatalog.register(name, lambda: load_voc_instances(dirname, split))
    MetadataCatalog.get(name).set(
        thing_classes=CLASS_NAMES, dirname=dirname, split=split
    )
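The XML-parsing logic of load_voc_instances can be sanity-checked without detectron2 by writing a minimal annotation file to a temporary directory and reading it back with the same ElementTree queries (the file name and coordinate values below are made up):

```python
import os
import tempfile
import xml.etree.ElementTree as ET

XML = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>aeroplane</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

with tempfile.TemporaryDirectory() as d:
    anno_file = os.path.join(d, "000001.xml")
    with open(anno_file, "w") as f:
        f.write(XML)

    tree = ET.parse(anno_file)
    height = int(tree.findall("./size/height")[0].text)
    width = int(tree.findall("./size/width")[0].text)
    boxes = []
    for obj in tree.findall("object"):
        bb = obj.find("bndbox")
        bbox = [float(bb.find(x).text) for x in ["xmin", "ymin", "xmax", "ymax"]]
        bbox[0] -= 1.0  # same 1-based -> 0-based shift as in load_voc_instances
        bbox[1] -= 1.0
        boxes.append((obj.find("name").text, bbox))

print(height, width)  # 480 640
print(boxes)          # [('aeroplane', [47.0, 239.0, 195.0, 371.0])]
```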
2. Modify detectron2/data/datasets/builtin.py to register the new VOC datasets.