Parsing the COCO dataset (instances)

MXL147

Last modified 2023-08-25 10:32:16

Tags: deep learning

First published 2023-05-24 17:45:37

Article link: https://blog.csdn.net/qq_39066502/article/details/130852028

Introduction

Both instances_train2017.json and instances_val2017.json consist of five top-level sections, keyed by info, licenses, images, annotations, and categories.

{ 
    "info" : info,  
    "licenses" : [license1, license2, license3, ...],  
    "images" : [image1, image2, image3, ...],  
    "annotations" : [annotation1, annotation2, annotation3, ...],  
    "categories" : [category1, category2, category3, ...]
}
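As a quick sanity check of the layout above, the sketch below reports which of the five top-level keys are missing from a loaded annotation dict. The helper name `missing_sections` and the tiny in-memory `sample` are illustrative assumptions; they stand in for the result of `json.load` on a real instances file.

```python
EXPECTED_SECTIONS = ("info", "licenses", "images", "annotations", "categories")


def missing_sections(coco):
    """Return the expected top-level keys that are absent from `coco`."""
    return [key for key in EXPECTED_SECTIONS if key not in coco]


# Hand-made stand-in for the dict returned by json.load(...) on a real file.
sample = {"info": {}, "licenses": [], "images": [], "annotations": [], "categories": []}

print(missing_sections(sample))          # nothing missing
print(missing_sections({"images": []}))  # every section except images
```

An empty result means all five sections are present; anything else names the sections to investigate before parsing further.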

The info section holds dataset-level metadata such as the year, version, contributor, and description:

  info
  {
    "description": string
    "url": string
    "version": string
    "year": int
    "contributor": string
    "date_created": string
  }

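The licenses section lists the licenses under which the images are released. Because there are several licenses, they are stored as a list, and every entry in the list has the same form: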

      licenses
      {
        "url": string
        "id": int
        "name": string
      }

The images section holds per-image information as a list, and every entry has the same form:

  images
  {
    "license": int, the license that applies to this image; matches the id of an entry in the licenses section
    "file_name": string, the image file name, e.g. 000000000001.jpg
    "coco_url": string, URL of the image on the COCO site
    "height": int, image height in pixels
    "width": int, image width in pixels
    "date_captured": string, date the image was captured
    "flickr_url": string, URL of the image on Flickr
    "id": int, the image id; matches image_id in the annotations section
  }

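The annotations section holds the category and location of every annotated object. Since there are many images and each image may contain several objects, every object is stored as one entry in a list, and every entry has the same form: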

  annotations
  {
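    "segmentation": list of polygons outlining the object, each a flat list of floats [x1, y1, x2, y2, ...]; crowd regions (iscrowd=1) store RLE as a dict instead
    "area": float, area of the object's region in pixels
    "iscrowd": int, 0 or 1: whether the annotation marks a crowd (a group of objects labeled as one region) rather than a single object; 0 for single objects
    "image_id": int, id of the image this object belongs to; matches id in the images section
    "bbox": list of four floats describing the bounding box: top-left x, top-left y, box width, box height
    "category_id": int, id of the object's category; matches id in the categories section
    "id": int, id of this annotation within the dataset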
  }
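To make the image_id link concrete, here is a minimal sketch that groups annotation entries by the image they belong to. The function name `group_by_image` and the three sample entries are made up for illustration, and dropping entries with iscrowd set to 1 is a choice made for this sketch, not something the format requires.

```python
from collections import defaultdict


def group_by_image(annotations):
    """Map image_id -> list of annotation dicts, skipping iscrowd=1 entries."""
    grouped = defaultdict(list)
    for ann in annotations:
        if ann.get("iscrowd", 0) == 1:
            continue  # assumption of this sketch: ignore crowd regions
        grouped[ann["image_id"]].append(ann)
    return dict(grouped)


# Hypothetical entries carrying only the fields this sketch needs.
sample_annotations = [
    {"id": 1, "image_id": 10, "category_id": 18, "iscrowd": 0, "bbox": [1.0, 2.0, 3.0, 4.0]},
    {"id": 2, "image_id": 10, "category_id": 1, "iscrowd": 1, "bbox": [0.0, 0.0, 5.0, 5.0]},
    {"id": 3, "image_id": 11, "category_id": 1, "iscrowd": 0, "bbox": [2.0, 2.0, 2.0, 2.0]},
]

grouped = group_by_image(sample_annotations)
for image_id, anns in grouped.items():
    print(image_id, [a["id"] for a in anns])
```

Grouping up front makes it possible to write each label file once per image instead of appending to it annotation by annotation.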

The categories section describes the object categories. There are 80 categories in total, stored as a list in which every entry has the same form:

  categories
  {
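    "supercategory": string, the broader group the category belongs to, e.g. truck and car both belong to vehicle
    "id": int, the category id; matches category_id in the annotations section
    "name": string, the category name, e.g. person, dog, cat
  }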

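Extracting the information

We use instances_val2017.json as the example.

The initial directory layout is as follows:

annotations/instances_val2017.json holds the annotations for 5000 images

images contains the 5000 images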

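import json
import os

json_path = "annotations/instances_val2017.json"
# Use a context manager so the file handle is closed after loading
with open(json_path, "r") as f:
    json_labels = json.load(f)
annotations = json_labels['annotations']  # list of object annotations
images = json_labels['images']  # list of image records
categories = json_labels['categories']  # list of category records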

CLASSES = [
    'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
    'fire hydrant',
    'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
    'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
    'kite',
    'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork',
    'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
    'donut',
    'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
    'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase',
    'scissors',
    'teddy bear', 'hair drier', 'toothbrush']

Store the mapping from image id to file name and image size:

idtoimage = {}
for image in images:
    file_name = image['file_name']
    image_id = image['id'] 
    height = image['height'] 
    width = image['width'] 
    idtoimage[image_id] = [file_name,height,width]

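Store the mapping from category id to category name. The category ids in the file are not 0-79 (they are not contiguous), so they are converted to a 0-79 index through CLASSES later on.

idtoclss = {}
for category in categories:
    cat_id = category['id']  # original COCO category id (avoids shadowing the builtin id)
    name = category['name']  # category name
    idtoclss[cat_id] = name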
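An alternative to looking names up in the hand-written CLASSES list is to derive the contiguous index from the ids themselves. The sketch below is one way to do that, assuming the order you want is ascending category id; `build_contiguous_index` and the three sample entries are illustrative names and data, not part of the code above.

```python
def build_contiguous_index(categories):
    """Map each original category id to a 0-based index, ordered by ascending id."""
    ordered = sorted(categories, key=lambda c: c["id"])
    return {c["id"]: idx for idx, c in enumerate(ordered)}


# Three sample entries in the same shape as the categories section.
sample_categories = [
    {"supercategory": "animal", "id": 18, "name": "dog"},
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 3, "name": "car"},
]

id_to_index = build_contiguous_index(sample_categories)
print(id_to_index)  # {1: 0, 3: 1, 18: 2}
```

A dict lookup is also cheaper than calling `CLASSES.index(...)` once per annotation, which scans the list every time.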

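Parse segmentation into lines of the form <class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>, with coordinates normalized by the image width and height:

os.makedirs('labels_instances', exist_ok=True)  # open() below fails if the directory is missing
for annotation in annotations:
    segmentation = annotation['segmentation']
    # Crowd regions (iscrowd=1) store RLE as a dict instead of polygons; skip them
    if not isinstance(segmentation, list) or not segmentation:
        continue
    polygon = segmentation[0]  # first polygon only: [x1, y1, x2, y2, ...]
    image_id = annotation['image_id']
    category_id = annotation['category_id']  # original COCO category id

    classname = idtoclss[category_id]  # category name
    category_id = CLASSES.index(classname)  # convert to a 0-79 index

    filename, h, w = idtoimage[image_id]
    x = [i / w for i in polygon[0::2]]  # normalize x coordinates
    y = [i / h for i in polygon[1::2]]  # normalize y coordinates
    xy = ' '.join(f'{px} {py}' for px, py in zip(x, y))
    line = f'{category_id} {xy}\n'
    outfile = os.path.splitext(filename)[0] + '.txt'
    outfile = os.path.join('labels_instances', outfile)
    with open(outfile, 'a') as f:  # append: one line per object in the image
        f.write(line)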
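The loop above writes only the first polygon of each annotation, although an object split by occlusion can carry several. The following sketch shows one way to keep them all by emitting one line per polygon. Whether a given training framework expects that layout is an assumption to verify, and `polygons_to_lines` is a hypothetical helper rather than part of the original script.

```python
def polygons_to_lines(segmentation, class_index, width, height):
    """Return one '<class-index> x1 y1 ... xn yn' string per polygon.

    Coordinates are divided by the image width and height. Non-list input
    (e.g. an RLE dict) yields an empty list.
    """
    if not isinstance(segmentation, list):
        return []
    lines = []
    for polygon in segmentation:
        xs = [v / width for v in polygon[0::2]]   # even positions are x
        ys = [v / height for v in polygon[1::2]]  # odd positions are y
        coords = " ".join(f"{x:.6f} {y:.6f}" for x, y in zip(xs, ys))
        lines.append(f"{class_index} {coords}")
    return lines


# Two made-up polygons on a 100x200 (width x height) image.
lines = polygons_to_lines(
    [[10, 20, 30, 20, 30, 40], [0, 0, 50, 0, 50, 100]],
    class_index=16, width=100, height=200,
)
for line in lines:
    print(line)
```

Formatting to six decimals keeps the label files compact; drop the format specifier if full float precision matters.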

Parse object detection boxes into lines of the form <object-class> <cx> <cy> <width> <height>, normalized by the image size:

os.makedirs('labels_bbox', exist_ok=True)  # open() below fails if the directory is missing
for annotation in annotations:
    bbox = annotation['bbox']  # [top-left x, top-left y, width, height] in pixels
    image_id = annotation['image_id']
    category_id = annotation['category_id']  # original COCO category id

    classname = idtoclss[category_id]  # category name
    category_id = CLASSES.index(classname)  # convert to a 0-79 index

    filename, h, w = idtoimage[image_id]

    box_w, box_h = bbox[2] / w, bbox[3] / h  # normalized box size
    cx = (bbox[0] + bbox[2] / 2) / w  # normalized box center x
    cy = (bbox[1] + bbox[3] / 2) / h  # normalized box center y
    line = ' '.join(str(i) for i in [category_id, cx, cy, box_w, box_h]) + '\n'
    outfile = os.path.splitext(filename)[0] + '.txt'
    outfile = os.path.join('labels_bbox', outfile)
    with open(outfile, 'a') as f:  # append: one line per object in the image
        f.write(line)
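The box arithmetic is easy to get wrong by half a width, so here it is isolated as a small function with a worked example. `coco_bbox_to_yolo` is a name chosen for this sketch, and the sample numbers are made up.

```python
def coco_bbox_to_yolo(bbox, width, height):
    """Convert [top-left x, top-left y, w, h] in pixels to normalized (cx, cy, w, h)."""
    x, y, w, h = bbox
    cx = (x + w / 2) / width   # box center x as a fraction of image width
    cy = (y + h / 2) / height  # box center y as a fraction of image height
    return cx, cy, w / width, h / height


# A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
cx, cy, bw, bh = coco_bbox_to_yolo([100, 50, 200, 100], width=400, height=200)
print(cx, cy, bw, bh)  # 0.5 0.5 0.5 0.5
```

The box spans x from 100 to 300 and y from 50 to 150, so its center sits at the middle of the image and it covers half of each dimension.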