Detectron2 dataset registration flow:
1. detectron2/data/datasets/builtin.py
Registers all built-in datasets: register_all_coco, register_all_pascal_voc, register_all_cityscapes, etc. implement the registration. For each dataset name to be used (e.g. coco_2014_train, voc_2012_train, cityscapes_fine_instance_seg_train), they pass the metadata, root path, and annotation files to register_coco_instances, register_pascal_voc, etc. The metadata includes:
(
"COCO": [thing_dataset_id_to_contiguous_id, thing_colors, thing_classes, json_file, image_root, evaluator_type (coco)],
"PASCAL VOC": [thing_classes, dirname, year, split, evaluator_type (pascal_voc)],
"cityscapes": [thing_classes, stuff_classes, image_dir, gt_dir, evaluator_type (cityscapes for instance seg, sem_seg for semantic seg)]
)
2. detectron2/data/datasets/register_coco.py
Performs the registration through DatasetCatalog and MetadataCatalog: DatasetCatalog registers the dataset name together with the dataset_dicts produced by load_coco_json, while MetadataCatalog attaches the metadata to the corresponding dataset name, e.g. coco_2014_train.
detectron2/data/datasets/coco.py: load_coco_json implements the dataset parsing;
pycocotools/coco.py: the COCO class parses the json file;
3. detectron2/data/datasets/pascal_voc.py
Registers through DatasetCatalog and MetadataCatalog: DatasetCatalog registers the dataset name together with the dicts produced by load_voc_instances; MetadataCatalog attaches the metadata to the corresponding dataset name.
4. detectron2/data/datasets/cityscapes.py
Already registered in step 1: DatasetCatalog combines the dataset name (sem_seg or instance_seg) with load_cityscapes_semantic or load_cityscapes_instances (both built on cityscapes_files_to_dict) to produce the ret registered for the semantic and instance segmentation tasks respectively; MetadataCatalog attaches the metadata to the corresponding dataset name.
build_detection_train_loader(cfg) loads the data:
dataset = MapDataset(dataset, mapper) yields the per-item training structure, a list(dict):
# dict_keys(['file_name', 'image_id', 'height', 'width', 'image', 'instances'])
{
'file_name': '.../xx.png',
'image_id': 'xx.png',
'height': 1024,
'width': 2048,
'image': tensor (CHW, dtype=torch.uint8),
'instances': Instances(
    num_instances=13,
    image_height=667,
    image_width=1333,
    fields=[
        gt_boxes: Boxes(N*4 tensor),
        gt_classes: N tensor,
        gt_masks: PolygonMasks(num_instances=N)
    ])
}
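The DatasetFromList + MapDataset composition can be sketched with hypothetical stand-in classes (not detectron2's real ones); the point is that the mapper runs lazily, per __getitem__:

```python
# Sketch of the DatasetFromList + MapDataset composition (hypothetical
# stand-ins, NOT detectron2's classes): DatasetFromList serves the raw
# dicts; MapDataset applies a mapper to each item at access time.
class DatasetFromList:
    def __init__(self, dicts, copy=False):
        self._dicts = list(dicts) if copy else dicts
    def __len__(self):
        return len(self._dicts)
    def __getitem__(self, idx):
        return self._dicts[idx]

class MapDataset:
    def __init__(self, dataset, mapper):
        self._dataset, self._mapper = dataset, mapper
    def __len__(self):
        return len(self._dataset)
    def __getitem__(self, idx):
        # The real mapper would read the image, apply augmentations, and
        # build Instances; here we only tag the dict to show the flow.
        return self._mapper(self._dataset[idx])

def mapper(d):
    out = dict(d)              # copy, leaving the registered dict untouched
    out["image"] = "tensor(CHW)"  # placeholder for the decoded image tensor
    return out

ds = MapDataset(DatasetFromList([{"file_name": "a.png"}]), mapper)
print(ds[0]["image"])  # tensor(CHW)
```

Mapping at access time means image decoding and augmentation are deferred to the dataloader workers instead of happening once up front.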
dataset = DatasetFromList(dataset_dicts, copy=False) wraps the registered dicts before the mapper is applied.
Format of the data registered in DatasetCatalog:
Number of classes (background excluded): COCO 80, PASCAL VOC 20, Cityscapes things 8, Cityscapes stuff 19.
# load_coco_json sorts the image_ids, groups the annotations per image_id into a nested list [[...]],
# and remaps category_id to a contiguous, sorted range. It returns:
[
{
"file_name": str,
"height": int,
"width": int,
"image_id": int or str,
"annotations": [
{
"category_id": int after mapping to [0, 80),
"iscrowd": 0 or 1,
"segmentation": [polygon] or RLE,
"bbox": [x, y, width, height],
"bbox_mode": BoxMode.XYWH_ABS
},...
]
}, ...
]
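The contiguous category_id remapping can be sketched as follows (toy sparse ids, not the real COCO category table):

```python
# Sketch of the contiguous-id remapping load_coco_json performs: COCO
# category ids are sparse (1..90 with gaps), so they are remapped to
# [0, num_classes) in sorted order. Toy ids below, not the real table.
raw_category_ids = [1, 3, 7, 90]  # sparse ids as found in the json
id_map = {cid: i for i, cid in enumerate(sorted(raw_category_ids))}

anno = {"category_id": 7, "bbox": [10, 20, 30, 40], "bbox_mode": "XYWH_ABS"}
anno["category_id"] = id_map[anno["category_id"]]
print(anno["category_id"])  # 2, a contiguous id in [0, 4)
```

The inverse of this map is what MetadataCatalog stores as thing_dataset_id_to_contiguous_id, so predictions can be mapped back to original ids at evaluation time.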
# load_cityscapes_semantic + cityscapes_files_to_dict returns:
[
{
"file_name": str,
"height": int,
"width": int,
"sem_seg_file_name": str
}, ...
]
# load_cityscapes_instances + cityscapes_files_to_dict; category_id is remapped to the contiguous thing-class order. Returns:
[
{
"file_name": str,
"height": int,
"width": int,
"image_id": int,
"annotations": [
{
"id": int,
"image_id": int,
"category_id": int after mapping to [0, 8),
"segmentation": RLE or [polygon],
"area": float,
"bbox": [x,y,width,height],
"iscrowd": 0 for polygon or 1 for RLE,
"bbox_mode": BoxMode.XYWH_ABS
},...
]
}, ...
]
# load_voc_instances; note detectron2 ships no built-in VOC semantic-segmentation loader, so that dataset must be built yourself. Returns:
[
{
"file_name": str,
"height": int,
"width": int,
"image_id": int,
"annotations": [
{
"category_id": int after mapping to [0, 20),
"bbox": [xmin, ymin, xmax, ymax],
"bbox_mode": BoxMode.XYXY_ABS
},...
]
}, ...
]
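A load_voc_instances-style XML parse can be sketched with the stdlib. The XML below is a toy inline sample and CLASS_NAMES is a toy 2-class list; the real loader reads Annotations/&lt;image_id&gt;.xml from dirname and uses the 20 VOC classes:

```python
# Sketch of VOC annotation parsing (toy inline XML, toy class list).
import xml.etree.ElementTree as ET

VOC_XML = """
<annotation>
  <size><width>500</width><height>375</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""
CLASS_NAMES = ["dog", "cat"]  # toy list; VOC has 20 classes

root = ET.fromstring(VOC_XML)
record = {
    "height": int(root.find("./size/height").text),
    "width": int(root.find("./size/width").text),
    "annotations": [],
}
for obj in root.findall("object"):
    bb = obj.find("bndbox")
    record["annotations"].append({
        "category_id": CLASS_NAMES.index(obj.find("name").text),
        # VOC boxes are absolute corner coordinates, hence XYXY_ABS.
        "bbox": [float(bb.find(t).text) for t in ("xmin", "ymin", "xmax", "ymax")],
        "bbox_mode": "XYXY_ABS",
    })
print(record["annotations"][0]["bbox"])  # [48.0, 240.0, 195.0, 371.0]
```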
Raw dataset structures:
# COCO json (a single top-level dict)
{
"info": {
"year": int,
"version": str,
"description": str,
"contributor": str,
"url": str,
"date_created": datetime
},
"licenses": [
{
"id": int,
"name": str,
"url": str
},...
],
"images": [
{
"coco_url": str,
"date_captured": str,
"file_name": str,
"flickr_url": str,
"id": int,
"height": int,
"width": int,
"license": int
},...
],
"annotations": [
{
"id": int,
"image_id": int,
"category_id": int,
"segmentation": RLE or [polygon],
"area": float,
"bbox": [x,y,width,height],
"iscrowd": 0 for polygon or 1 for RLE
},...
],
"categories": [
{
"id": int,
"name": str,
"supercategory": str
},...
]
}
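The image_id-to-annotations index that the pycocotools COCO class builds from such a json can be sketched as (toy inline dict instead of a file):

```python
# Sketch of the imgToAnns index pycocotools.COCO builds at load time:
# group the flat "annotations" list by image_id for O(1) per-image lookup.
from collections import defaultdict

coco_json = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 2},
        {"id": 12, "image_id": 2, "category_id": 1},
    ],
}
img_to_anns = defaultdict(list)
for ann in coco_json["annotations"]:
    img_to_anns[ann["image_id"]].append(ann)

print(len(img_to_anns[1]))  # 2
```

This grouping is what lets load_coco_json emit the annotations nested per image, as noted above.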
# cityscapes polygons (one json per image)
{
"imgHeight": int,
"imgWidth": int,
"objects": [
{
"label": str,
"polygon": [[x, y], ...]
},...
]
}
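Deriving an XYXY bounding box from one of these polygon objects can be sketched as (toy polygon, not real cityscapes data):

```python
# Sketch: bounding box from a cityscapes polygon object as the min/max
# of its vertex coordinates (toy object below).
obj = {"label": "car", "polygon": [[10, 40], [60, 40], [60, 90], [10, 90]]}
xs = [p[0] for p in obj["polygon"]]
ys = [p[1] for p in obj["polygon"]]
bbox = [min(xs), min(ys), max(xs), max(ys)]  # XYXY_ABS
print(bbox)  # [10, 40, 60, 90]
```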