COCO Dataset Annotations and How to Use the Companion API

dekiang

Article link: https://blog.csdn.net/weixin_41560402/article/details/105960128

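This article walks through the Python API that accompanies the COCO dataset and, along the way, explains the annotation fields used for object detection.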

1. Installing the API

pycocotools is the Python API that accompanies COCO. It gives convenient access to the contents of an annotation file. Install it with:

pip install pycocotools

2. Using the COCO API

from pycocotools.coco import COCO
# Object detection annotation file downloaded from the COCO dataset
annotation_file = "instances_train2017.json"
# Instantiating the COCO class loads the file and builds lookup indexes
coco = COCO(annotation_file)

Running the code above prints:

loading annotations into memory...
Done (t=13.34s)
creating index...
index created!

The coco object has a dataset attribute, which is the dict obtained by reading the JSON annotation file in Python. Its layout is:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}

Listing all the keys of the dataset dict:

print(list(coco.dataset.keys()))
# Output: ['info', 'licenses', 'images', 'annotations', 'categories']

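The JSON file (that is, the dataset dict) has five top-level fields: info, licenses, images, annotations, and categories. The first three (info, licenses, images) are shared across the different COCO tasks, while the last two (annotations and categories) differ between the types of annotation file.

The rest of this article uses the object detection annotations to explain the last three fields (images, annotations, categories). The first two, info and licenses, are not needed for object detection.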
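To make the layout concrete, here is a minimal sketch of a hand-made dataset dict with the same five top-level keys. It uses only the standard json module, so it runs without pycocotools or the real annotation file. The name tiny_dataset and the single annotation record (id 1, bbox [10.0, 20.0, 30.0, 40.0]) are made up for illustration, and info and licenses are deliberately left empty.

```python
import json

# Hand-made miniature of a detection annotation file. The real file has the
# same five top-level keys but far more entries. The annotation below is a
# hypothetical record, not one taken from COCO.
tiny_dataset = {
    "info": {},        # dataset-level metadata (left empty in this sketch)
    "licenses": [],    # image licenses (left empty in this sketch)
    "images": [
        {"id": 391895, "file_name": "000000391895.jpg", "width": 640, "height": 360},
    ],
    "annotations": [
        {"id": 1, "image_id": 391895, "category_id": 1,
         "bbox": [10.0, 20.0, 30.0, 40.0], "area": 1200.0, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
    ],
}

# Round-trip through JSON text, as if the file had been written and read back.
loaded = json.loads(json.dumps(tiny_dataset))

top_level_keys = list(loaded.keys())
print(top_level_keys)

# Every annotation refers to an image and to a category by id.
image_ids = {img["id"] for img in loaded["images"]}
category_ids = {cat["id"] for cat in loaded["categories"]}
links_ok = all(
    ann["image_id"] in image_ids and ann["category_id"] in category_ids
    for ann in loaded["annotations"]
)
print(links_ok)
```

The check at the end shows how the three lists are tied together: annotations point at images through image_id and at categories through category_id.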

1. The images field

images_num = len(coco.dataset['images'])
print(images_num)
print(coco.dataset['images'][0:2])

Output:

118287
[
 {'license': 3,
  'file_name': '000000391895.jpg',
  'coco_url': 'http://images.cocodataset.org/train2017/000000391895.jpg',
  'height': 360,
  'width': 640,
  'date_captured': '2013-11-14 11:18:45',
  'flickr_url': 'http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg',
  'id': 391895},

 {'license': 4,
  'file_name': '000000522418.jpg',
  'coco_url': 'http://images.cocodataset.org/train2017/000000522418.jpg',
  'height': 480,
  'width': 640,
  'date_captured': '2013-11-14 11:38:44',
  'flickr_url': 'http://farm1.staticflickr.com/1/127244861_ab0c0381e7_z.jpg',
  'id': 522418}
]

coco.dataset['images'] is a list of dicts, one per image, so its length is the size of the dataset (the object detection training set has 118287 images).
The main fields of each list element are:

file_name: the image file name
id: the image ID; each image has a unique ID
width: the image width in pixels
height: the image height in pixels
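Since images is a plain list, finding one image by its ID means scanning the list. A minimal sketch of an id-keyed lookup follows, using the two records printed above with only the four fields just described. The names image_by_id and describe_image are illustrative helpers, not part of pycocotools.

```python
# The two image records printed above, reduced to the fields used in detection.
images = [
    {"id": 391895, "file_name": "000000391895.jpg", "width": 640, "height": 360},
    {"id": 522418, "file_name": "000000522418.jpg", "width": 640, "height": 480},
]

# Map each image id to its record so a lookup does not scan the whole list.
image_by_id = {img["id"]: img for img in images}


def describe_image(image_id):
    """Return (file_name, width, height) for one image id."""
    img = image_by_id[image_id]
    return img["file_name"], img["width"], img["height"]


print(describe_image(522418))  # ('000000522418.jpg', 640, 480)
```

With the real file loaded, pycocotools already keeps such a dict in coco.imgs, and coco.loadImgs(ids) returns the records for a list of image IDs, so the helper above is only needed when working with the raw dict.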

2. The annotations field

bounding_boxes_num = len(coco.dataset['annotations'])
print(bounding_boxes_num)
print(coco.dataset['annotations'][0:2])

Output:

860001
[
 {'segmentation': [],
  'area': 2765.1486500000005,
  'iscrowd': 0,
  'image_id': 558840,
  'bbox': [199.84, 200.46, 77.71, 70.88],
  'category_id': 58,
  'id': 156},

 {'segmentation': [],
  'area': 1545.4213000000007,
  'iscrowd': 0,
  'image_id': 200365,
  'bbox': [234.22, 317.11, 149.39, 38.55],
  'category_id': 58,
  'id': 509}
]

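coco.dataset['annotations'] is a list of dicts, one per annotated object (for object detection, one per bounding box), so its length is the total number of bounding boxes. The object detection training set contains 860001 bounding boxes, far more than the 118287 images, because a single image can contain several objects.
The main fields of each list element are: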

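bbox: the bounding box, given as [x of the top-left corner, y of the top-left corner, width w, height h]
image_id: the ID of the image that contains this bounding box
category_id: the category ID
id: the bounding box ID; each bounding box has a unique ID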
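Two things commonly done with these fields are converting a box from [x, y, w, h] to corner coordinates and collecting all boxes that belong to one image. The sketch below does both on the two annotation records printed above, keeping only the detection fields. xywh_to_xyxy and anns_by_image are illustrative names, not pycocotools functions.

```python
from collections import defaultdict

# The two annotation records printed above, reduced to the detection fields.
annotations = [
    {"id": 156, "image_id": 558840, "category_id": 58,
     "bbox": [199.84, 200.46, 77.71, 70.88]},
    {"id": 509, "image_id": 200365, "category_id": 58,
     "bbox": [234.22, 317.11, 149.39, 38.55]},
]


def xywh_to_xyxy(bbox):
    """Convert [x, y, w, h] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# Group annotations by the image they belong to (image_id -> list of records).
anns_by_image = defaultdict(list)
for ann in annotations:
    anns_by_image[ann["image_id"]].append(ann)

for image_id, anns in anns_by_image.items():
    for ann in anns:
        print(image_id, ann["id"], xywh_to_xyxy(ann["bbox"]))
```

When the file is loaded through pycocotools, the same grouping is available as coco.imgToAnns, or through coco.getAnnIds(imgIds=...) followed by coco.loadAnns(...). The corner conversion is left to the caller either way, since the file stores boxes as [x, y, w, h].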

3. The categories field

categories_num = len(coco.dataset['categories'])
print(categories_num)
print(coco.dataset['categories'][0:2])

Output:

80
[
 {'supercategory': 'person',
  'id': 1,
  'name': 'person'},

 {'supercategory': 'vehicle',
  'id': 2,
  'name': 'bicycle'}
]

coco.dataset['categories'] is a list of dicts, one per category, so its length is the total number of categories, 80.
The fields of each list element are:

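supercategory: the broader parent category
id: the category ID; each category has a unique ID, and this is the value that category_id in an annotation refers to
name: the specific category name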
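In the full file the 80 category IDs are not consecutive (they range from 1 to 90 with gaps), so training code often maps them to contiguous labels. The sketch below builds an id-to-name map and one possible id-to-label map from the two records printed above. Sorting by ID and starting labels at 0 is an assumption of this sketch, not something the annotation file prescribes.

```python
# The two category records printed above.
categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
]

# category id -> human-readable name
name_by_id = {cat["id"]: cat["name"] for cat in categories}

# category id -> contiguous zero-based label, assigned in ascending id order
label_by_id = {cat_id: label for label, cat_id in enumerate(sorted(name_by_id))}

print(name_by_id[2])   # bicycle
print(label_by_id)     # {1: 0, 2: 1}
```

With pycocotools, coco.getCatIds() returns the category IDs and coco.loadCats(ids) returns the matching records, which can be fed into the same two comprehensions.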

