[Darknet] validate_detector_map, the mAP calculation function, explained

A while back I had a collaboration with Baidu and needed to measure the mAP of their models, so I looked into how Darknet actually computes mAP.

The prototype of validate_detector_map is:

float validate_detector_map(char *datacfg, char *cfgfile, char *weightfile, float thresh_calc_avg_iou, const float iou_thresh, const int map_points, int letter_box, network *existing_net)

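datacfg - the .data file
cfgfile - the .cfg file
weightfile - the weights file
thresh_calc_avg_iou - confidence threshold used when computing precision and recall (note: mAP does not depend on this value)
iou_thresh - IoU threshold, i.e. how large the IoU between a detection and a gt box must be for the detection to count as correct
map_points - number of recall points used to compute mAP. More points give a more accurate result; with few points the computed mAP comes out slightly low. The default is 0, which uses all points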

// MS COCO - uses 101-Recall-points on PR-chart.
// PascalVOC2007 - uses 11-Recall-points on PR-chart.
// PascalVOC2010-2012 - uses Area-Under-Curve on PR-chart.
// ImageNet - uses Area-Under-Curve on PR-chart.

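letter_box - whether to keep the original aspect ratio when resizing (letterbox padding)
existing_net - an already-built network, if there is one. When map is called during training the network already exists; when map is run directly, a new network is built from the config files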
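
Since iou_thresh is compared against the IoU between a detection and a gt box, it may help to recall what that number is. The following is a minimal sketch, assuming boxes are stored as a center point plus width and height, which is how Darknet's box struct lays them out. The names cbox, overlap_1d and iou_center_boxes are my own and are not Darknet functions.

```c
#include <assert.h>

/* Hypothetical stand-in for Darknet's box: center (x, y), width w, height h. */
typedef struct { float x, y, w, h; } cbox;

/* Overlap length of two 1-D segments, each given by its center and length.
   The result is negative when the segments do not touch. */
float overlap_1d(float c1, float len1, float c2, float len2)
{
    float l1 = c1 - len1 / 2, l2 = c2 - len2 / 2;
    float r1 = c1 + len1 / 2, r2 = c2 + len2 / 2;
    float left = l1 > l2 ? l1 : l2;     /* rightmost left edge */
    float right = r1 < r2 ? r1 : r2;    /* leftmost right edge */
    return right - left;
}

/* Intersection over union of two boxes; 0 when they do not overlap. */
float iou_center_boxes(cbox a, cbox b)
{
    float w = overlap_1d(a.x, a.w, b.x, b.w);
    float h = overlap_1d(a.y, a.h, b.y, b.h);
    if (w <= 0 || h <= 0) return 0;
    float inter = w * h;
    float uni = a.w * a.h + b.w * b.h - inter;
    assert(uni > 0);                    /* holds whenever the boxes overlap */
    return inter / uni;
}
```

With iou_thresh = 0.5, two 2 × 2 boxes shifted by half their width (IoU of 1/3) would not count as a match.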
 
 
 
Now let's go through the core code.

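1. Images are processed in groups of 4. For each image, inference is run first to get the detections, and detections below the threshold are filtered out (note: the threshold passed here is 0.005, because we want practically every detection). hier_thresh was used by YOLOv2 in the past and no longer matters.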

for (t = 0; t < nthreads && i + t - nthreads < m; ++t)
{
   const int image_index = i + t - nthreads;
   char *path = paths[image_index];
   char *id = basecfg(path);
   float *X = val_resized[t].data;
   network_predict(net, X);

   int nboxes = 0;
   float hier_thresh = 0;
   detection *dets;
   if (args.type == LETTERBOX_DATA) 
   {
       dets = get_network_boxes(&net, val[t].w, val[t].h, thresh, hier_thresh, 0, 1, &nboxes, letter_box);
   }
   else 
   {
       dets = get_network_boxes(&net, 1, 1, thresh, hier_thresh, 0, 0, &nboxes, letter_box);
   }
   if (nms) 
   {
       if (l.nms_kind == DEFAULT_NMS) do_nms_sort(dets, nboxes, l.classes, nms);
       else diounms_sort(dets, nboxes, l.classes, nms, l.nms_kind, l.beta_nms);
   }
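
The index i + t - nthreads is easier to follow with the outer loop in view. The excerpt above does not show that loop, so the following rests on an assumption: that i starts at nthreads and advances by nthreads up to m + nthreads, so that each pass handles the group loaded during the previous pass. The helper name group_image_indices is hypothetical, and the sketch reproduces only the index arithmetic.

```c
#include <assert.h>

/* Hypothetical helper, not part of Darknet.
   Writes into out[] the image indices handled in one pass of the outer loop.
   i is the outer loop counter (assumed to be nthreads, 2*nthreads, ...),
   m is the total number of images. Returns how many indices were written. */
int group_image_indices(int i, int nthreads, int m, int *out)
{
    assert(nthreads > 0 && i >= nthreads);
    int t, count = 0;
    /* Same condition as the excerpt: stop at the group size or at the last image. */
    for (t = 0; t < nthreads && i + t - nthreads < m; ++t)
        out[count++] = i + t - nthreads;
    return count;
}
```

With 6 images and nthreads = 4, the pass with i = 4 covers images 0 to 3 and the pass with i = 8 covers the remaining images 4 and 5.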

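2. The detections returned by the network are all stored in detections, an array of box_prob structs whose fields are the bbox, the prob, the image index, the class, whether the detection matched a gt, and the index of the matched gt. This array is what the mAP calculation uses later. Then, for every detection with prob greater than 0 (in practice greater than 0.005, since anything lower has already been zeroed by the thresholding and NMS above), the code looks for the gt of the same class whose IoU with it exceeds the threshold and is the largest. If such a gt exists, truth_flag and unique_truth_index are updated.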

for (i = 0; i < nboxes; ++i)
{
    int class_id;
    for (class_id = 0; class_id < classes; ++class_id) 
    {
        float prob = dets[i].prob[class_id];
        if (prob > 0)
        {
            detections_count++;
            detections = (box_prob*)xrealloc(detections, detections_count * sizeof(box_prob));
            detections[detections_count - 1].b = dets[i].bbox;
            detections[detections_count - 1].p = prob;
            detections[detections_count - 1].image_index = image_index;
            detections[detections_count - 1].class_id = class_id;
            detections[detections_count - 1].truth_flag = 0;
            detections[detections_count - 1].unique_truth_index = -1;
            int truth_index = -1;
            float max_iou = 0;
            for (j = 0; j < num_labels; ++j)
            {
                box t = { truth[j].x, truth[j].y, truth[j].w, truth[j].h };
                float current_iou = box_iou(dets[i].bbox, t);
                if (current_iou > iou_thresh && class_id == truth[j].id) 
                {
                    if (current_iou > max_iou) 
                    {
                        max_iou = current_iou;
                        truth_index = unique_truth_count + j;
                    }
                }
            }
            // best IoU
            if (truth_index > -1) 
            {
                detections[detections_count - 1].truth_flag = 1;
                detections[detections_count - 1].unique_truth_index = truth_index;
            }
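
The matching rule in this step can be isolated into a small function. This is only a sketch: it assumes the IoU between the detection and each gt has already been computed into an array, and best_truth_match is a name of my own. Darknet additionally offsets the returned index by unique_truth_count so that it is unique across images, which is left out here.

```c
#include <assert.h>

/* Hypothetical helper, not part of Darknet.
   Returns the index of the gt that best matches one detection, or -1.
   ious[j] is the IoU between the detection and gt j,
   truth_class[j] is the class of gt j. */
int best_truth_match(const float *ious, const int *truth_class, int num_truth,
                     int det_class, float iou_thresh)
{
    assert(num_truth >= 0);
    int j, best = -1;
    float max_iou = 0;
    for (j = 0; j < num_truth; ++j) {
        if (truth_class[j] != det_class) continue;  /* class must agree */
        if (ious[j] <= iou_thresh) continue;        /* IoU must exceed the threshold */
        if (ious[j] > max_iou) {                    /* keep the largest IoU so far */
            max_iou = ious[j];
            best = j;
        }
    }
    return best;
}
```

Note that a gt of another class is never chosen, however high its IoU is.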

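3. Once a detection is stored, TP, FP and average IoU are computed. The threshold here is thresh_calc_avg_iou, passed in from outside, and it gives TP, FP and average IoU at that one specific threshold. mAP, by contrast, summarizes precision and recall over many thresholds and does not depend on any particular one. found says whether the gt matched by the current detection has already been matched. Suppose the current bbox predicts gt number truth_index, but an earlier bbox has already predicted that gt (z runs from checkpoint_detections_count up to, but not including, detections_count - 1, i.e. over the bboxes already processed on the current image). Because the predictions are sorted by descending prob after NMS, the earlier one is the TP and this one is an FP.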

// calc avg IoU, true-positives, false-positives for required Threshold
if (prob > thresh_calc_avg_iou)
{
    int z, found = 0;
    for (z = checkpoint_detections_count; z < detections_count - 1; ++z)
    {
        if (detections[z].unique_truth_index == truth_index)
        {
            found = 1; break;
        }
    }
    if (truth_index > -1 && found == 0)
    {
        avg_iou += max_iou;
        ++tp_for_thresh;
        avg_iou_per_class[class_id] += max_iou;
        tp_for_thresh_per_class[class_id]++;
    }
    else
    {
        fp_for_thresh++;
        fp_for_thresh_per_class[class_id]++;
    }
}
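
To see the duplicate check in isolation, here is a sketch for a single image. It assumes each detection already carries the index of its matched gt (or -1), in the order the detections were stored. The function name count_tp_fp and its argument layout are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical helper, not part of Darknet.
   Counts TP and FP among the detections of one image at a confidence
   threshold. truth_index[k] is the matched gt of detection k, or -1.
   A gt already claimed by an earlier detection cannot yield a second TP. */
void count_tp_fp(const int *truth_index, const float *prob, int n,
                 float thresh, int *tp, int *fp)
{
    assert(n >= 0);
    int k, z;
    *tp = 0;
    *fp = 0;
    for (k = 0; k < n; ++k) {
        if (prob[k] <= thresh) continue;        /* below the threshold: not counted */
        int found = 0;
        for (z = 0; z < k; ++z) {               /* look only at earlier detections */
            if (truth_index[z] == truth_index[k]) { found = 1; break; }
        }
        if (truth_index[k] > -1 && !found) (*tp)++;
        else (*fp)++;                           /* unmatched, or a duplicate match */
    }
}
```

For matched indices {0, 0, -1, 1} with probs {0.9, 0.8, 0.7, 0.2} and a threshold of 0.25, this gives 1 TP and 2 FP: the second detection duplicates gt 0, the third matches nothing, and the fourth is below the threshold.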

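4. After all images have been processed, the overall average IoU and the per-class average IoU are computed. The IoU of every TP has already been added to avg_iou; an FP counts as IoU 0.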

if ((tp_for_thresh + fp_for_thresh) > 0)
    avg_iou = avg_iou / (tp_for_thresh + fp_for_thresh);
int class_id;
for(class_id = 0; class_id < classes; class_id++)
{
    if ((tp_for_thresh_per_class[class_id] + fp_for_thresh_per_class[class_id]) > 0)
        avg_iou_per_class[class_id] = avg_iou_per_class[class_id] / (tp_for_thresh_per_class[class_id] + fp_for_thresh_per_class[class_id]);
}
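
A small numeric sketch shows how FPs pull the average down. The function average_iou is a hypothetical helper of mine that mirrors the division above for a single accumulator.

```c
#include <assert.h>

/* Hypothetical helper, not part of Darknet.
   Average IoU over all detections above the threshold: each TP contributes
   its IoU and each FP contributes 0, so only the TP IoUs are summed. */
float average_iou(const float *tp_ious, int tp, int fp)
{
    assert(tp >= 0 && fp >= 0);
    float sum = 0;
    int k;
    for (k = 0; k < tp; ++k) sum += tp_ious[k];
    if (tp + fp == 0) return 0;     /* no detections: avoid dividing by zero */
    return sum / (float)(tp + fp);
}
```

Three TPs with IoU 0.8, 0.7 and 0.9 average 0.8 on their own, but with one FP added the result drops to 2.4 / 4 = 0.6. A low average IoU can therefore come from many FPs as well as from loosely fitting boxes.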

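5. Next come the per-class AP and the mAP. First detections is sorted in descending order of prob, so detections[0] has the highest prob across all classes. rank is the confidence level: rank = 0 corresponds to the highest prob and rank = detections_count - 1 to the lowest.
Now for pr. pr is a classes × detections_count array, and pr[class_id][rank] holds the PR statistics of class class_id when only detections with prob >= detections[rank].p are considered. That is why pr[class_id][rank].tp is initialized from pr[class_id][rank - 1].tp (and likewise fp), and why the tp and fp counts at rank are never smaller than at rank - 1: as rank goes up the required prob goes down, so no detection drops out and neither TP nor FP can decrease. At the last rank, rank == detections_count - 1, every detection (all of them above 0.005) is included. truth_flags plays the same role as found did earlier: it records whether a gt has already been matched by a detection. At each detection's prob level, TP or FP is incremented depending on whether the detection hit a not-yet-matched gt, and then precision and recall are computed.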

qsort(detections, detections_count, sizeof(box_prob), detections_comparator);

// for PR-curve
pr_t** pr = (pr_t**)calloc(classes, sizeof(pr_t*));    // pr[classes][detections_count]
for (i = 0; i < classes; ++i)
    pr[i] = (pr_t*)calloc(detections_count, sizeof(pr_t));

for (rank = 0; rank < detections_count; ++rank) 
{
    if (rank > 0) 
    {
        int class_id;
        for (class_id = 0; class_id < classes; ++class_id) 
        {
            pr[class_id][rank].tp = pr[class_id][rank - 1].tp;
            pr[class_id][rank].fp = pr[class_id][rank - 1].fp;
        }
    }
    box_prob d = detections[rank];
    // if (detected && isn't detected before)
    if (d.truth_flag == 1) 
    {
        if (truth_flags[d.unique_truth_index] == 0)
        {
            truth_flags[d.unique_truth_index] = 1;
            pr[d.class_id][rank].tp++;    // true-positive
        } 
        else
            pr[d.class_id][rank].fp++;
    }
    else 
    {
        pr[d.class_id][rank].fp++;    // false-positive
    }
    for (i = 0; i < classes; ++i)
    {
        const int tp = pr[i][rank].tp;
        const int fp = pr[i][rank].fp;
        const int fn = truth_classes_count[i] - tp;    // false-negative = objects - true-positive
        pr[i][rank].fn = fn;
        if ((tp + fp) > 0) pr[i][rank].precision = (double)tp / (double)(tp + fp);
        else pr[i][rank].precision = 0;
        if ((tp + fn) > 0) pr[i][rank].recall = (double)tp / (double)(tp + fn);
        else pr[i][rank].recall = 0;
    }
}
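
The cumulative nature of pr is perhaps clearer for a single class. The sketch below is a simplification, not Darknet's code: it assumes the detections of one class are already sorted by descending prob and already labeled TP or FP. The name build_pr_curve is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not part of Darknet.
   Fills precision[] and recall[] for one class. is_tp[rank] is 1 when the
   detection at that rank is a true positive and 0 when it is a false
   positive; num_truth is the number of gt boxes of the class. */
void build_pr_curve(const int *is_tp, int n, int num_truth,
                    double *precision, double *recall)
{
    assert(n >= 0 && num_truth >= 0);
    int rank, tp = 0, fp = 0;
    for (rank = 0; rank < n; ++rank) {
        if (is_tp[rank]) ++tp; else ++fp;   /* counts carry over from rank - 1 */
        precision[rank] = (double)tp / (double)(tp + fp);
        /* tp + fn equals num_truth, because fn = num_truth - tp */
        recall[rank] = num_truth > 0 ? (double)tp / (double)num_truth : 0;
    }
}
```

For the labels {TP, FP, TP, TP} and 4 gt boxes, recall goes 0.25, 0.25, 0.5, 0.75 while precision goes 1, 0.5, 0.667, 0.75. Recall never drops as rank grows, whereas precision moves in both directions.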

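6. With the PR values for every point available, mAP can be computed. There are two cases. When map_points is 0, the precision at every recall point is used and the products are accumulated, which amounts to the area under the PR curve (note: interpolated precision is used, i.e. the precision for a given recall is the maximum precision over all points whose recall is not smaller). Since prob is sorted from high to low, going from large rank to small rank walks recall from high to low, with precision generally going from low to high. Recall never increases along the way, but precision can fluctuate; if precision does not rise as recall falls, that point's own precision is ignored and the running maximum keeps being used until a higher precision shows up. The case where map_points is not 0 is more direct: for each recall point, search for the maximum precision among the points with recall at or above it. On the same dataset, -points 0 gives a slightly higher mAP than -points 101.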

for (i = 0; i < classes; ++i) 
{
	double avg_precision = 0;
	if (map_points == 0)
	{
		double last_recall = pr[i][detections_count - 1].recall;
		double last_precision = pr[i][detections_count - 1].precision;
		for (rank = detections_count - 2; rank >= 0; --rank)
		{
			double delta_recall = last_recall - pr[i][rank].recall;
			last_recall = pr[i][rank].recall;
			if (pr[i][rank].precision > last_precision) 
				last_precision = pr[i][rank].precision;
			avg_precision += delta_recall * last_precision;
		}
		//add remaining area of PR curve when recall isn't 0 at rank-1
		double delta_recall = last_recall - 0;
		avg_precision += delta_recall * last_precision;
	}
	// MSCOCO - 101 Recall-points, PascalVOC - 11 Recall-points
	else
	{
		int point;
		for (point = 0; point < map_points; ++point) 
		{
			double cur_recall = point * 1.0 / (map_points - 1);
			double cur_precision = 0;
			for (rank = 0; rank < detections_count; ++rank)
			{
				if (pr[i][rank].recall >= cur_recall)    // > or >=
					if (pr[i][rank].precision > cur_precision)
						cur_precision = pr[i][rank].precision;
			}
			avg_precision += cur_precision;
		}
		avg_precision = avg_precision / map_points;
	}
	mean_average_precision += avg_precision;
}
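
To compare the two branches on the same numbers, here is a sketch that pulls each one into a function for a single class. It assumes plain precision and recall arrays indexed by rank instead of Darknet's pr_t structs. The names ap_area_under_curve and ap_n_point are my own.

```c
#include <assert.h>

/* Hypothetical helper, not part of Darknet.
   Area under the interpolated PR curve (the map_points == 0 branch). */
double ap_area_under_curve(const double *precision, const double *recall, int n)
{
    assert(n > 0);
    double ap = 0;
    double last_recall = recall[n - 1];
    double last_precision = precision[n - 1];
    int rank;
    for (rank = n - 2; rank >= 0; --rank) {
        double delta_recall = last_recall - recall[rank];
        last_recall = recall[rank];
        if (precision[rank] > last_precision)   /* running maximum of precision */
            last_precision = precision[rank];
        ap += delta_recall * last_precision;
    }
    /* remaining strip between recall 0 and the first point */
    return ap + last_recall * last_precision;
}

/* Hypothetical helper, not part of Darknet.
   N-point interpolated AP (the map_points != 0 branch), e.g. 11 or 101. */
double ap_n_point(const double *precision, const double *recall, int n,
                  int map_points)
{
    assert(n > 0 && map_points > 1);
    double ap = 0;
    int point, rank;
    for (point = 0; point < map_points; ++point) {
        double cur_recall = point * 1.0 / (map_points - 1);
        double cur_precision = 0;   /* stays 0 if no point reaches cur_recall */
        for (rank = 0; rank < n; ++rank)
            if (recall[rank] >= cur_recall && precision[rank] > cur_precision)
                cur_precision = precision[rank];
        ap += cur_precision;
    }
    return ap / map_points;
}
```

For a curve with recall {0.25, 0.25, 0.5, 0.75} and precision {1, 0.5, 2/3, 0.75}, the area version gives 0.625, while the 11-point version gives 6.75 / 11, roughly 0.614. On this toy curve the sampled version is the lower of the two.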

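Reading the code alone can feel abstract, so it helps to look at it together with the figure further down in this article. Each element of the pr array is one point on the PR plot. Throughout the calculation the variable last_precision holds the largest precision seen so far. mAP is computed by traversing the plot from right to left; imagine a point sliding leftwards along the whole green line:
(1) When it moves left, delta_recall is the horizontal distance covered and last_precision stays the same; the AP gained is the area of the rectangle formed by that distance and the maximum precision (interpolation);
(2) When it moves up, recall does not change and delta_recall = 0, so AP does not grow, but last_precision keeps rising until it reaches the next peak.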
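
The two cases can also be folded into one expression. The notation here is mine rather than Darknet's: P_k and R_k are the precision and recall of one class at rank k (k = 0, ..., n-1), and \hat{P}_k is the running maximum that last_precision holds when the traversal reaches rank k. Under that notation, the map_points = 0 branch should be computing:

```latex
\hat{P}_k = \max_{j \ge k} P_j, \qquad
AP = R_0 \, \hat{P}_0 + \sum_{k=0}^{n-2} \left( R_{k+1} - R_k \right) \hat{P}_k
```

Each term of the sum is one rectangle from case (1). Whenever R_{k+1} = R_k the term vanishes, which is case (2). The leading term R_0 \hat{P}_0 is the strip between recall 0 and the first point that the code adds after the loop.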
 
 
 
That is my own understanding of the mAP function. Comments and discussion are welcome.
