Object Detection Study Notes: The MS COCO Dataset

我真好看欸

Published 2024-04-11 20:51:21

Article link: https://blog.csdn.net/weixin_44566115/article/details/137652467

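1. Introduction to the MS COCO dataset

MS COCO is a large, widely used dataset that covers object detection, segmentation, image captioning, and more. Its main features are: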

Object segmentation: per-object segmentation masks
Recognition in context: objects are recognized within their scene context
Superpixel stuff segmentation: superpixel-level segmentation of stuff regions
330K images (>200K labeled): more than 330,000 images, over 200,000 of them labeled
1.5 million object instances
80 object categories
91 stuff categories
5 captions per image: each image comes with five captions
250,000 people with keypoints: keypoint annotations for 250,000 people

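The figure below is the comparison chart from the statistics in the official MS COCO paper. It shows that COCO has more categories than PASCAL VOC and also more annotated instances per category.
[Figure: dataset comparison chart from the MS COCO paper]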

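2. Downloading the MS COCO dataset

Taking the COCO 2017 dataset as an example, three files need to be downloaded:

2017 Train images [118K/18GB]: all image files used during training
2017 Val images [5K/1GB]: all image files used during validation
2017 Train/Val annotations [241MB]: the annotation json files for the training and validation sets

After downloading, extract everything into a coco2017 directory, which gives the following structure: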

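├── coco2017: dataset root directory
     ├── train2017: folder with all training images (118,287 images)
     ├── val2017: folder with all validation images (5,000 images)
     └── annotations: folder with the annotation files
               ├── instances_train2017.json: training annotations for object detection and segmentation
               ├── instances_val2017.json: validation annotations for object detection and segmentation
               ├── captions_train2017.json: training annotations for image captioning
               ├── captions_val2017.json: validation annotations for image captioning
               ├── person_keypoints_train2017.json: training annotations for human keypoint detection
               └── person_keypoints_val2017.json: validation annotations for human keypoint detection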
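
Before moving on, it can help to confirm that the extracted files ended up where the structure above expects them. The following is a minimal sketch, not part of any COCO tooling: the helper name missing_coco_entries is hypothetical, and the folder and file names are simply copied from the listing above.

```python
from pathlib import Path

# Folder and file names copied from the directory structure above.
EXPECTED_DIRS = ["train2017", "val2017", "annotations"]
EXPECTED_ANNOTATIONS = [
    "instances_train2017.json",
    "instances_val2017.json",
    "captions_train2017.json",
    "captions_val2017.json",
    "person_keypoints_train2017.json",
    "person_keypoints_val2017.json",
]


def missing_coco_entries(root):
    """Return the expected folders/files that are absent under root (hypothetical helper)."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        "annotations/" + name
        for name in EXPECTED_ANNOTATIONS
        if not (root / "annotations" / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    # "coco2017" is assumed to be the dataset root relative to the current directory.
    print(missing_coco_entries("coco2017"))
```

An empty list means every folder and annotation file named above was found.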

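3. Using the dataset

3.1 Preparation

Download the three files listed above and place them according to the directory structure.

3.2 Inspecting the annotation format with Python's json library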

Stepping through the code in a debugger shows that the loaded file is a dict with the keys info, licenses, images, annotations, and categories. Among these:

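images is a list (one element per image). Each element is a dict holding the information for one image, including its file name, width, and height.
annotations is a list (one element per annotated object in the dataset, not per image). Each element is a dict holding the annotation of one object: its segmentation (polygons), its bounding box [x, y, width, height] (the x, y coordinates of the top-left corner plus the width and height), its area, the id of the image it belongs to, and its category id. The iscrowd field is either 0 or 1; 0 generally denotes a single object and 1 a group of objects.
categories is a list (one element per object category). Each element is a dict describing one category: its id, its name, and its supercategory.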
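
The structure described above can be explored with a few lines of Python. This is only a sketch under stated assumptions: instead of the real instances_val2017.json it writes a tiny hand-made file (all values in toy are made up for illustration) to a temporary directory, then loads it the same way the real file would be loaded.

```python
import json
import os
import tempfile

# A tiny hand-made stand-in for instances_val2017.json (values are made up).
toy = {
    "info": {},
    "licenses": [],
    "images": [
        {"id": 1, "file_name": "000000000001.jpg", "width": 640, "height": 480}
    ],
    "annotations": [
        {
            "id": 10,
            "image_id": 1,
            "category_id": 18,
            "bbox": [100.0, 120.0, 50.0, 80.0],  # [x, y, width, height]
            "area": 4000.0,
            "iscrowd": 0,
            "segmentation": [[100.0, 120.0, 150.0, 120.0, 150.0, 200.0, 100.0, 200.0]],
        }
    ],
    "categories": [{"id": 18, "name": "dog", "supercategory": "animal"}],
}

with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "toy_instances.json")
    with open(json_path, "w") as f:
        json.dump(toy, f)
    # Same loading step as for the real annotation file.
    with open(json_path, "r") as f:
        data = json.load(f)

top_level_keys = sorted(data.keys())
print(top_level_keys)
print(len(data["images"]), len(data["annotations"]), len(data["categories"]))

# Convert the first box from [x, y, width, height] to its two corners.
x, y, w, h = data["annotations"][0]["bbox"]
print("top-left:", (x, y), "bottom-right:", (x + w, y + h))
```

For the real dataset, json_path would point at the annotation file under coco2017/annotations and the toy dict would not be needed.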

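For each category:
supercategory: a collective name for a group of categories
id: the index in COCO's original list of 91 object categories. Looking closely, the ids are not 1 to 80 and are not contiguous, so to train an 80-class object detector later you need to map them into the range 1 to 80.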
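
One possible way to build that mapping is sketched below. The helper name build_label_maps is hypothetical, and the short categories list is a toy excerpt; with the real annotation file it would be the full categories list read from the json.

```python
def build_label_maps(categories):
    """Map sorted COCO category ids to contiguous labels 1..N, and back (hypothetical helper)."""
    sorted_ids = sorted(cat["id"] for cat in categories)
    coco_to_contiguous = {cid: i for i, cid in enumerate(sorted_ids, start=1)}
    contiguous_to_coco = {i: cid for cid, i in coco_to_contiguous.items()}
    return coco_to_contiguous, contiguous_to_coco


# Toy excerpt of a categories list; note the gaps between the ids.
categories = [
    {"id": 1, "name": "person"},
    {"id": 11, "name": "fire hydrant"},
    {"id": 13, "name": "stop sign"},
    {"id": 90, "name": "toothbrush"},
]

coco_to_contiguous, contiguous_to_coco = build_label_maps(categories)
print(coco_to_contiguous)   # training labels for the network
print(contiguous_to_coco)   # needed again when writing predictions for evaluation
```

The inverse map matters later: predictions submitted for evaluation have to carry the original category ids, not the contiguous training labels.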

3.3 Reading annotation files with pycocotools

# Install pycocotools on Linux
pip install pycocotools
# Install pycocotools on Windows
pip install pycocotools-windows

The code is as follows:

import os
from pycocotools.coco import COCO
from PIL import Image, ImageDraw

import matplotlib
matplotlib.use('Agg')  # non-interactive backend: figures go to files, no window is opened
import matplotlib.pyplot as plt

json_path = "/home/tlx01/tlx_stu/zhy/learn/coco2017/instances_val2017.json"
img_path = "/home/tlx01/tlx_stu/zhy/learn/coco2017/val2017"

# load coco data
coco = COCO(annotation_file=json_path)

# get all image ids
ids = list(sorted(coco.imgs.keys()))
print("number of images: {}".format(len(ids)))

# get all coco class labels as {category id: category name}
coco_classes = dict([(v["id"], v["name"]) for k, v in coco.cats.items()])

# iterate over the first three images
for img_id in ids[:3]:
    # get the ids of all annotations that belong to this image
    ann_ids = coco.getAnnIds(imgIds=img_id)

    # load the annotations for those ids
    targets = coco.loadAnns(ann_ids)

    # get image file name
    path = coco.loadImgs(img_id)[0]['file_name']

    # read image
    img = Image.open(os.path.join(img_path, path)).convert('RGB')
    draw = ImageDraw.Draw(img)
    # draw each box (COCO format: top-left x, y, width, height) and its class name
    for target in targets:
        x, y, w, h = target["bbox"]
        x1, y1, x2, y2 = x, y, int(x + w), int(y + h)
        draw.rectangle((x1, y1, x2, y2))
        draw.text((x1, y1), coco_classes[target["category_id"]])

    # plt.show() displays nothing under the Agg backend, so save the figure instead
    # (the output file name is an arbitrary choice)
    plt.imshow(img)
    plt.savefig("{}_boxes.png".format(img_id))
    plt.close()

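Running the code gives the following result:
[Figure: validation images with the bounding boxes and class names drawn on them]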

3.4 Reading the segmentation information of each image


import os
import random

import numpy as np
from pycocotools.coco import COCO
from pycocotools import mask as coco_mask
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt

random.seed(0)

json_path = "/data/coco2017/annotations/instances_val2017.json"
img_path = "/data/coco2017/val2017"

# random palette: index 0 (background) is black, the other 255 colors are random
palette = [0, 0, 0] + [random.randint(0, 255) for _ in range(255*3)]

# load coco data
coco = COCO(annotation_file=json_path)

# get all image ids
ids = list(sorted(coco.imgs.keys()))
print("number of images: {}".format(len(ids)))

# get all coco class labels as {category id: category name}
coco_classes = dict([(v["id"], v["name"]) for k, v in coco.cats.items()])

# iterate over the first three images
for img_id in ids[:3]:
    # get the ids of all annotations that belong to this image
    ann_ids = coco.getAnnIds(imgIds=img_id)
    # load the annotations for those ids
    targets = coco.loadAnns(ann_ids)

    # get image file name
    path = coco.loadImgs(img_id)[0]['file_name']
    # read image
    img = Image.open(os.path.join(img_path, path)).convert('RGB')
    img_w, img_h = img.size

    masks = []
    cats = []
    for target in targets:
        cats.append(target["category_id"])  # get object class id
        polygons = target["segmentation"]   # get object polygons
        rles = coco_mask.frPyObjects(polygons, img_h, img_w)
        mask = coco_mask.decode(rles)
        if len(mask.shape) < 3:
            mask = mask[..., None]
        # an object may consist of several polygons: merge them into one mask
        mask = mask.any(axis=2)
        masks.append(mask)

    cats = np.array(cats, dtype=np.int32)
    if masks:
        masks = np.stack(masks, axis=0)
    else:
        # no annotations for this image: use an empty stack of masks
        masks = np.zeros((0, img_h, img_w), dtype=np.uint8)

    # merge all instance masks into a single segmentation map
    # with its corresponding categories
    # (initial=0 keeps the reduction valid when there are no masks)
    target = (masks * cats[:, None, None]).max(axis=0, initial=0)
    # discard overlapping instances
    target[masks.sum(0) > 1] = 255
    target = Image.fromarray(target.astype(np.uint8))

    target.putpalette(palette)
    plt.imshow(target)
    plt.show()

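The segmentation information read through pycocotools, rendered with matplotlib, looks as follows:
[Figure: segmentation maps of the validation images rendered with matplotlib]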
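
The merge step in the code above (multiplying each mask by its category id, taking the per-pixel maximum, and marking overlaps as 255) is easier to follow on a toy input. The sketch below uses two made-up masks on a 3x4 image and needs only NumPy; the mask layout and category ids are assumptions chosen for illustration.

```python
import numpy as np

# Two toy instance masks on a 3x4 image (True = pixel belongs to the instance).
masks = np.zeros((2, 3, 4), dtype=bool)
masks[0, :, :2] = True    # instance 0 covers columns 0 and 1
masks[1, :, 1:3] = True   # instance 1 covers columns 1 and 2
cats = np.array([1, 18], dtype=np.int32)  # category id of each instance

# Scale each mask by its category id; the per-pixel maximum yields one id per pixel.
seg_map = (masks * cats[:, None, None]).max(axis=0)
# Pixels covered by more than one instance are marked with 255.
seg_map[masks.sum(axis=0) > 1] = 255
print(seg_map)
```

Every row of the printed map reads 1, 255, 18, 0: category 1, an overlap pixel, category 18, and background.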

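3.5 Evaluating mAP for object detection

First, be clear about the data format that cocoapi expects for the predictions of a trained network. On the official website, choose Results Format from the Evaluate drop-down menu to see the required format for each task.
[Figure: the Results Format page on the COCO website]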

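The focus here is the format for object detection. According to the official documentation, results are saved as a list in which each element is a dict describing one detected object. Each object records four fields:

image_id: the id of the image the object belongs to (int)
category_id: the predicted category id; note that this is the original COCO category id from the 91-category list, not a contiguous 1 to 80 label (int)
bbox: the predicted bounding box as [xmin, ymin, width, height] (list[float])
score: the predicted confidence of the object (float)

The figure below shows the predictions of a trained Faster R-CNN on the COCO 2017 validation set:

Next, save the predictions to a json file: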

import json

results = []  # all predictions are collected in this list
# write predict results into json file
json_str = json.dumps(results, indent=4)
with open('predict_results.json', 'w') as json_file:
    json_file.write(json_str)
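
How the results list gets filled depends on the detector. As a rough sketch, assuming the network outputs corner-format boxes (xmin, ymin, xmax, ymax) and contiguous labels, each detection could be converted as below. The helper to_coco_results, its arguments, and the sample values are all hypothetical.

```python
def to_coco_results(image_id, boxes_xyxy, labels, scores, label_to_coco_id):
    """Convert one image's detections into COCO result dicts (hypothetical helper)."""
    results = []
    for (x1, y1, x2, y2), label, score in zip(boxes_xyxy, labels, scores):
        results.append({
            "image_id": int(image_id),
            # map the contiguous training label back to the original category id
            "category_id": int(label_to_coco_id[label]),
            # COCO expects [xmin, ymin, width, height]
            "bbox": [float(x1), float(y1), float(x2 - x1), float(y2 - y1)],
            "score": float(score),
        })
    return results


# Made-up detection for illustration only.
results = to_coco_results(
    image_id=139,
    boxes_xyxy=[(10.0, 20.0, 110.0, 220.0)],
    labels=[2],
    scores=[0.9],
    label_to_coco_id={1: 1, 2: 11},
)
print(results)
```

Casting to plain int and float keeps the entries serializable by json.dumps even when the inputs are NumPy or tensor scalars.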

Data needed:

The COCO 2017 validation annotation file instances_val2017.json
The predictions of a self-trained Faster R-CNN (VGG16) on the validation set, predict_results.json

Example code:

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval


# accumulate predictions from all images
# load the COCO 2017 validation annotation file
coco_true = COCO(annotation_file="/data/coco2017/annotations/instances_val2017.json")
# load the network's predictions on the COCO 2017 validation set
coco_pre = coco_true.loadRes('predict_results.json')

coco_evaluator = COCOeval(cocoGt=coco_true, cocoDt=coco_pre, iouType="bbox")
coco_evaluator.evaluate()
coco_evaluator.accumulate()
coco_evaluator.summarize()

Output:

loading annotations into memory...
Done (t=0.43s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.65s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=21.15s).
Accumulating evaluation results...
DONE (t=2.88s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.233
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.415
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.233
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.104
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.262
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.323
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.216
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.319
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.327
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.145
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.361
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.463
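
The IoU values in the rows above refer to the intersection over union between a predicted box and a ground-truth box, and the IoU=0.50:0.95 rows average over ten thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates the idea for boxes in [x, y, width, height] format; it is a simplified illustration, not the pycocotools implementation, and it ignores the special handling of iscrowd annotations. The function name and sample boxes are made up.

```python
def iou_xywh(box_a, box_b):
    """IoU of two boxes given as [x, y, width, height] (illustrative helper)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # width and height of the overlapping rectangle (0 if the boxes do not overlap)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind the "IoU=0.50:0.95" rows.
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

gt = [0.0, 0.0, 10.0, 10.0]    # made-up ground-truth box
pred = [0.0, 0.0, 10.0, 5.0]   # made-up prediction covering half of it
iou = iou_xywh(gt, pred)
passed = [t for t in thresholds if iou >= t]
print(iou, passed)
```

Here the prediction overlaps exactly half of the ground-truth box, so it counts as a match only at the loosest threshold, which is why AP at IoU=0.50 is typically higher than AP at IoU=0.50:0.95.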
