COCO 数据集目标检测等相关评测指标

最新推荐文章于 2024-06-07 17:23:43 发布

AIHGF

最新推荐文章于 2024-06-07 17:23:43 发布

阅读量1.3w

点赞数 7

分类专栏：目标检测

本文链接：https://blog.csdn.net/zziahgf/article/details/86698506

版权

原文：COCO 数据集目标检测等相关评测指标 - AIUAI

COCO Detection Evaluation

1. 评测指标定义

COCO 提供了 12 种用于衡量目标检测器性能的评价指标.

[1] - 除非特别说明， $A P$ 和 $A R$ 一般是在多个 IoU(Intersection over Union) 值间取平均值. 具体地，采用了 10 个 IoU阈值 - 0.50:0.05:0.95. 对比于传统的只计算单个 IoU 阈值(0.50)的指标(对应于这里的指标 $AP^{IoU=0.50}$ )，这是一种突破. 对多个 IoU 阈值求平均，能够使得目标检测器具有更好的定位位置.

[2] - $A P$ 是对所有类别的求平均值. 这在传统上被称为平均准确度(mAP, mean average precision). 这里并未区分 $A P$ 和 $m A P$ (类似的， $A R$ 和 $m A R$ )，假定从上下文中具有清晰的差异. 即：如， $AP^{50} = mAP^{50}$ ， $AP^{75} = mAP^{75}$ ，… 但， $AP^{50}$ 一定大于 $AP^{75}$ .

[3] - $A P$ (所有 10 个 IoU 阈值和全部 80 个类别的平均值) 作为最终 COCO竞赛胜者的标准. 在考虑目标检测器再 COCO 上的性能时，这是单个最重要的评价度量指标.

[4] - COCO数据集中小目标物体数量比大目标物体更多. 具体地，标注的约有 41% 的目标物体是都很小的(small, 面积< 32x32=1024)，约有 34% 的目标物体是中等的(medium, 1024=32x32 < 面积 < 96x96=9216)，约有 24% 的目标物体是大的(large, 面积 > 96x96=9216). 面积(area) 是指 segmentation mask 中像素的数量.

[5] - $A R$ 是指每张图片中，在给定固定数量的检测结果中的最大召回(maximum recall)，在所有 IoUs 和全部类别上求平均值. $A R$ 与 proposal evaluation 中所使用的相同，但这里 $A R$ 是按类别计算的.

[6] - 所有的评测指标允许每张图片(在全部的类别中)最多 100 个 top-scoring 检测结果进行计算.

[7] - 边界框(bounding boxes)的检测和segmentation mask 的所有评测指标是一致的，除了 IoU 的计算. 边界框的 IoU 计算是关于 boxes的，而 segmentation mask 的 IoU 计算是关于 masks 的.

2. 评测指标实现 - cocoeval

PythonAPI/pycocotools/cocoeval.py

评测参数如：(括号里的默认值，一般不需要修改.)

通过调用 evaluate() 函数和 accumulate() 函数来运行，以计算得到衡量检测质量的两个数据结构(data structures).

这两个数据结构分别是 evalImages 和 eval，其分别每张图片的检测质量和整个数据集上的聚合检测质量.

数据结构 evalImages 共有 KxA 个元素，每个元素表示一个评测设置；而数据结构 eval 将这些信息组合为 precision 和 recall 数组. 具体如下：

Python 中的定义如：

__author__ = 'tsungyi'

import numpy as np
import datetime
import time
from collections import defaultdict
from . import mask as maskUtils
import copy

class COCOeval:
    # COCO 数据集的检测评估接口.
    # The usage for CocoEval is as follows:
    #  cocoGt=..., cocoDt=...       # load dataset and results
    #  E = CocoEval(cocoGt,cocoDt); # initialize CocoEval object
    #  E.params.recThrs = ...;      # set parameters as desired
    #  E.evaluate();                # run per image evaluation
    #  E.accumulate();              # accumulate per image results
    #  E.summarize();               # display summary metrics of results
    # For example usage see evalDemo.m and http://mscoco.org/.
    #
    # The evaluation parameters are as follows (defaults in brackets):
    #  imgIds     - [all] N img ids to use for evaluation
    #  catIds     - [all] K cat ids to use for evaluation
    #  iouThrs    - [.5:.05:.95] T=10 IoU thresholds for evaluation
    #  recThrs    - [0:.01:1] R=101 recall thresholds for evaluation
    #  areaRng    - [...] A=4 object area ranges for evaluation
    #  maxDets    - [1 10 100] M=3 thresholds on max detections per image
    #  iouType    - ['segm'] set iouType to 'segm', 'bbox' or 'keypoints'
    #  iouType replaced the now DEPRECATED useSegm parameter.
    #  useCats    - [1] if true use category labels for evaluation
    # Note: if useCats=0 category labels are ignored as in proposal scoring.
    # Note: multiple areaRngs [Ax2] and maxDets [Mx1] can be specified.
    #
    # evaluate(): evaluates detections on every image and every category and
    # concats the results into the "evalImgs" with fields:
    #  dtIds      - [1xD] id for each of the D detections (dt)
    #  gtIds      - [1xG] id for each of the G ground truths (gt)
    #  dtMatches  - [TxD] matching gt id at each IoU or 0
    #  gtMatches  - [TxG] matching dt id at each IoU or 0
    #  dtScores   - [1xD] confidence of each dt
    #  gtIgnore   - [1xG] ignore flag for each gt
    #  dtIgnore   - [TxD] ignore flag for each dt at each IoU
    #
    # accumulate(): accumulates the per-image, per-category evaluation
    # results in "evalImgs" into the dictionary "eval" with fields:
    #  params     - parameters used for evaluation
    #  date       - date evaluation was performed
    #  counts     - [T,R,K,A,M] parameter dimensions (see above)
    #  precision  - [TxRxKxAxM] precision for every evaluation setting
    #  recall     - [TxKxAxM] max recall for every evaluation setting
    # Note: precision and recall==-1 for settings with no gt objects.
    #
    # See also coco, mask, pycocoDemo, pycocoEvalDemo
    #
    def __init__(self, cocoGt=None, cocoDt=None, iouType='segm'):
        '''
        Initialize CocoEval using coco APIs for gt and dt
        :param cocoGt: coco object with ground truth annotations
        :param cocoDt: coco object with detection results
        :return: None
        '''
        if not iouType:
            print('iouType not specified. use default iouType segm')
        self.cocoGt   = cocoGt              # ground truth COCO API
        self.cocoDt   = cocoDt              # detections COCO API
        self.params   = {
   }                  # evaluation parameters
        self.evalImgs = defaultdict(list)   # per-image per-category evaluation results [KxAxI] elements
        self.eval     = {
   }                  # accumulated evaluation results
        self._gts = defaultdict(list)       # gt for evaluation
        self._dts = defaultdict(list)       # dt for evaluation
        self.params = Params(iouType=iouType) # parameters
        self._paramsEval = {
   }               # parameters for evaluation
        self.stats = []                     # result summarization
        self.ious = {
   }                      # ious between all gts and dts
        if not cocoGt is None:
            self.params.imgIds = sorted(cocoGt.getImgIds())
            self.params.catIds = sorted(cocoGt.getCatIds())


    def _prepare(self):
        '''
        Prepare ._gts and ._dts for evaluation based on params
        :return: None
        '''
        def _toMask(anns, coco):
            # modify ann['segmentation'] by reference
            for ann in anns:
                rle = coco.annToRLE(ann)
                ann['segmentation'] = rle
        p = self.params
        if p.useCats:
            gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))
            dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))
        else:
            gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds))
            dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds))

        # convert ground truth to mask if iouType == 'segm'
        if p.iouType == 'segm':
            _toMask(gts, self.cocoGt)
            _toMask(dts, self.cocoDt)
        # set ignore flag
        for gt in gts:
            gt['ignore'] = gt['ignore'] if 'ignore' in gt else 0
            gt['ignore'] = 'iscrowd' in gt and gt['iscrowd']
            if p.iouType == 'keypoints':
                gt['ignore'] = (gt['num_keypoints'] == 0) or gt['ignore']
        self._gts = defaultdict(list)       # gt for evaluation
        self._dts = defaultdict(list)       # dt for evaluation
        for gt in gts:
            self._gts[gt['image_id'], gt['category_id']].append(gt)
        for dt in dts:
            self._dts[dt['image_id'], dt['category_id']].append(dt)
        self.evalImgs = defaultdict(list)   # per-image per-category evaluation results
        self.eval     = {
   }                  # accumulated evaluation results

    def evaluate(self):
        '''
        Run per image evaluation on given images and store results (a list of dict) in self.evalImgs
        :return: None
        '''
        tic = time.time()
        print('Running per image evaluation...')
        p = self.params
        # add backward compatibility if useSegm is specified in params
        if not p.useSegm is None:
            p.iouType = 'segm' if p.useSegm == 1 else 'bbox'
            print('useSegm (deprecated) is not None. Running {} evaluation'.format(p.iouType))
        print('Evaluate annotation type *{}*'.format(p.iouType))
        p.imgIds = list(np.unique(p.imgIds))
        if p.useCats:
            p.catIds = list(np.unique(p.catIds))
        p.maxDets = sorted(p.maxDets)
        self.params=p

        self._prepare()
        # loop through images, area range, max detection number
        catIds = p.catIds if p.useCats else [-1]

        if p.iouType == 'segm' or p.iouType == 'bbox':
            computeIoU = self.computeIoU
        elif p.iouType == 'keypoints':
            computeIoU = self.computeOks
        self.ious = {
   (imgId, catId): computeIoU(imgId, catId) \
                        for imgId in p.imgIds
                        for catId in catIds}

        evaluateImg = self.evaluateImg
        maxDet = p.maxDets[-1]
        self.evalImgs = [evaluateImg(imgId, catId, areaRng, maxDet)
                 for catId in catIds
                 for areaRng in p.areaRng
                 for imgId in p.imgIds
             ]
        self._paramsEval = copy.deepcopy(self.params)
        toc = time.time()
        print('DONE (t={:0.2f}s).'.format(toc-tic))

    def computeIoU(self, imgId, catId):
        p = self.params
        if p.useCats:
            gt = self._gts[imgId,catId]
            dt = self._dts[imgId,catId]
        else:
            gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]
            dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]
        if len(gt) == 0 and len(dt) ==0:
            return []
        inds = np.argsort([-d['score'] for d in dt], kind='mergesort')
        dt = [dt[i] for i in inds]
        if len(dt) > p.maxDets[-1]:
            dt=dt[0:p.maxDets[-1]]

        if p.iouType == 'segm':
            g = [g['segmentation'] for g in gt]
            d = [d['segmentation'] for d in dt]
        elif p.iouType == 'bbox':
            g = [g['bbox'] for g in gt]
            d = [d['bbox'] for d in dt]
        else:
            raise Exception('unknown iouType for iou computation')

        # compute iou between each dt and gt region
        iscrowd = [int(o['iscrowd']) for o in gt]
        ious = maskUtils.iou(d,g,iscrowd)
        return ious

    def computeOks(self, imgId, catId):
        p = self.params
        # dimention here should be Nxm
        gts = self._gts[imgId, catId]
        dts = self._dts[imgId, catId]
        inds = np.argsort([

最低0.47元/天解锁文章

AIHGF

关注

7
点赞
踩
42

收藏

觉得还不错? 一键收藏
11
评论
COCO 数据集目标检测等相关评测指标

原文：COCO 数据集目标检测等相关评测指标 - AIUAICOCO Detection Evaluation1. 评测指标定义COCO 提供了 12 种用于衡量目标检测器性能的评价指标.[1] - 除非特别说明，APAPAP 和 ARARAR 一般是在多个 IoU(Intersection over Union) 值间取平均值. 具体地，采用了 10 个 IoU阈值 - 0.5...
复制链接

扫一扫

专栏目录