[Darknet] A Detailed Look at validate_detector_map, the Function That Computes mAP

冷月枫晚

Article link: https://blog.csdn.net/u013798145/article/details/109185580

I previously worked with Baidu on a project that required measuring the mAP of their model, so I dug into how Darknet actually computes mAP.

The prototype of validate_detector_map is:

float validate_detector_map(char *datacfg, char *cfgfile, char *weightfile, float thresh_calc_avg_iou, const float iou_thresh, const int map_points, int letter_box, network *existing_net)

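datacfg - the .data file
cfgfile - the .cfg file
weightfile - the weights file
thresh_calc_avg_iou - confidence threshold used when reporting precision and recall (note: mAP does not depend on this value)
iou_thresh - IoU threshold, i.e. how large the IoU between a detection and a ground-truth (gt) box must be for the detection to count as correct
map_points - number of recall points used to compute mAP. More points give a more accurate result, and fewer points give a slightly lower mAP. The default is 0, which uses all points (area under the curve)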

// MS COCO - uses 101-Recall-points on PR-chart.
// PascalVOC2007 - uses 11-Recall-points on PR-chart.
// PascalVOC2010-2012 - uses Area-Under-Curve on PR-chart.
// ImageNet - uses Area-Under-Curve on PR-chart.

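letter_box - whether to resize with letterboxing, i.e. keep the original aspect ratio
existing_net - an already-built network, if there is one. When map is invoked during training the network exists and is reused; when map is invoked on its own, a new network is built from the config file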

Let's walk through the core code.

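1. Images are processed in groups of 4. For each image, run inference to get the detections, then discard detections whose score is below the threshold (note: the threshold passed here is 0.005, because we want to keep essentially all detections). hier_thresh was used by the older YOLOv2 and no longer has any effect.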

for (t = 0; t < nthreads && i + t - nthreads < m; ++t)
{
   const int image_index = i + t - nthreads;
   char *path = paths[image_index];
   char *id = basecfg(path);
   float *X = val_resized[t].data;
   network_predict(net, X);

   int nboxes = 0;
   float hier_thresh = 0;
   detection *dets;
   if (args.type == LETTERBOX_DATA) 
   {
       dets = get_network_boxes(&net, val[t].w, val[t].h, thresh, hier_thresh, 0, 1, &nboxes, letter_box);
   }
   else 
   {
       dets = get_network_boxes(&net, 1, 1, thresh, hier_thresh, 0, 0, &nboxes, letter_box);
   }
   if (nms) 
   {
       if (l.nms_kind == DEFAULT_NMS) do_nms_sort(dets, nboxes, l.classes, nms);
       else diounms_sort(dets, nboxes, l.classes, nms, l.nms_kind, l.beta_nms);
   }

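2. The network's detections are all stored in detections, an array of box_prob structs whose fields are the bbox, the prob, the image index, the class, whether the detection matched a gt (truth_flag), and the index of that gt (unique_truth_index). This detections array is what the mAP computation uses later. Then, for every detection with prob greater than 0 (in practice greater than 0.005, since scores below that were already zeroed out by the thresholding and NMS in step 1), look for the gt of the same class whose IoU with it exceeds iou_thresh and is the largest. If such a gt is found, update truth_flag and unique_truth_index.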

for (i = 0; i < nboxes; ++i)
{
    int class_id;
    for (class_id = 0; class_id < classes; ++class_id) 
    {
        float prob = dets[i].prob[class_id];
        if (prob > 0)
        {
            detections_count++;
            detections = (box_prob*)xrealloc(detections, detections_count * sizeof(box_prob));
            detections[detections_count - 1].b = dets[i].bbox;
            detections[detections_count - 1].p = prob;
            detections[detections_count - 1].image_index = image_index;
            detections[detections_count - 1].class_id = class_id;
            detections[detections_count - 1].truth_flag = 0;
            detections[detections_count - 1].unique_truth_index = -1;
            int truth_index = -1;
            float max_iou = 0;
            for (j = 0; j < num_labels; ++j)
            {
                box t = { truth[j].x, truth[j].y, truth[j].w, truth[j].h };
                float current_iou = box_iou(dets[i].bbox, t);
                if (current_iou > iou_thresh && class_id == truth[j].id) 
                {
                    if (current_iou > max_iou) 
                    {
                        max_iou = current_iou;
                        truth_index = unique_truth_count + j;
                    }
                }
            }
            // best IoU
            if (truth_index > -1) 
            {
                detections[detections_count - 1].truth_flag = 1;
                detections[detections_count - 1].unique_truth_index = truth_index;
            }
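
The matching rule in this step can be sketched in isolation. The following is a minimal illustration, not Darknet's own code: sketch_box, sketch_overlap, sketch_iou and sketch_best_truth are hypothetical names, and boxes are assumed to be stored as center coordinates plus width and height, as the truth[j].x, truth[j].y, truth[j].w, truth[j].h fields above suggest.

```c
#include <assert.h>

/* Hypothetical stand-in for a Darknet box: center (x, y) plus width and height. */
typedef struct { float x, y, w, h; } sketch_box;

/* Length of the overlap of two 1-D segments, each given by center and width.
   The result is negative when the segments do not overlap. */
static float sketch_overlap(float c1, float w1, float c2, float w2)
{
    float left  = (c1 - w1 / 2 > c2 - w2 / 2) ? c1 - w1 / 2 : c2 - w2 / 2;
    float right = (c1 + w1 / 2 < c2 + w2 / 2) ? c1 + w1 / 2 : c2 + w2 / 2;
    return right - left;
}

/* Intersection over union of two center-format boxes; 0 if they do not overlap. */
static float sketch_iou(sketch_box a, sketch_box b)
{
    float w = sketch_overlap(a.x, a.w, b.x, b.w);
    float h = sketch_overlap(a.y, a.h, b.y, b.h);
    if (w <= 0 || h <= 0) return 0;
    float inter = w * h;
    float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0 ? inter / uni : 0;
}

/* Index of the same-class ground truth whose IoU with det is above iou_thresh
   and largest, or -1 if there is none. */
static int sketch_best_truth(sketch_box det, int det_class,
                             const sketch_box *truth, const int *truth_class,
                             int num_labels, float iou_thresh)
{
    assert(num_labels >= 0);
    int best = -1;
    float max_iou = 0;
    for (int j = 0; j < num_labels; ++j) {
        float iou = sketch_iou(det, truth[j]);
        if (iou > iou_thresh && det_class == truth_class[j] && iou > max_iou) {
            max_iou = iou;
            best = j;
        }
    }
    return best;
}
```

For example, two 0.4 × 0.4 boxes whose centers are 0.1 apart horizontally have an intersection of 0.12 and a union of 0.20, so their IoU is 0.6. A detection of another class never matches, no matter how large the IoU is.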

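3. After the detection is stored, TP, FP and the average IoU are computed. The threshold used here is thresh_calc_avg_iou, passed in by the caller, and the TP, FP and average IoU are reported for that one threshold. mAP, by contrast, summarizes precision and recall across all thresholds and does not depend on any single one. Here found indicates whether the gt matched by the current detection has already been matched. Suppose the current bbox matches the gt at truth_index, but an earlier bbox has already matched that gt (z runs from checkpoint_detections_count up to detections_count - 2, i.e. over the bboxes already processed on the current image, excluding the current one). Since the detections are in descending prob order after NMS, the earlier detection is the TP and the current one is an FP.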

// calc avg IoU, true-positives, false-positives for required Threshold
if (prob > thresh_calc_avg_iou)
{
    int z, found = 0;
    for (z = checkpoint_detections_count; z < detections_count - 1; ++z)
    {
        if (detections[z].unique_truth_index == truth_index)
        {
            found = 1; break;
        }
    }
    if (truth_index > -1 && found == 0)
    {
        avg_iou += max_iou;
        ++tp_for_thresh;
        avg_iou_per_class[class_id] += max_iou;
        tp_for_thresh_per_class[class_id]++;
    }
    else
    {
        fp_for_thresh++;
        fp_for_thresh_per_class[class_id]++;
    }
}
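
The duplicate check above can be tried on a toy input. This is a simplified sketch under stated assumptions: sketch_det and sketch_count_tp_fp are hypothetical names, the array holds the detections of a single image in the order they were stored, and the matched gt index has already been computed (-1 means no match). As in the excerpt, the scan over earlier detections does not look at their scores.

```c
#include <assert.h>

/* Hypothetical record: a detection's score and the gt it matched (-1 if none). */
typedef struct { float p; int truth_index; } sketch_det;

/* Count TP and FP among the detections with p > thresh, in stored order.
   A gt yields a TP only the first time it is matched; any later detection
   that matches the same gt is counted as an FP. */
static void sketch_count_tp_fp(const sketch_det *dets, int n, float thresh,
                               int *tp, int *fp)
{
    assert(n >= 0 && tp && fp);
    *tp = 0;
    *fp = 0;
    for (int k = 0; k < n; ++k) {
        if (dets[k].p <= thresh) continue;
        int found = 0;
        for (int z = 0; z < k; ++z) {   /* earlier detections on this image */
            if (dets[z].truth_index == dets[k].truth_index) { found = 1; break; }
        }
        if (dets[k].truth_index > -1 && !found) ++*tp;
        else ++*fp;
    }
}
```

With scores 0.9, 0.8, 0.7, 0.6, 0.1 matched to gts 0, 0, -1, 1, 2 and a threshold of 0.25, the first and fourth detections are TPs, the second (a duplicate of gt 0) and the third (unmatched) are FPs, and the fifth is below the threshold, giving TP = 2 and FP = 2.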

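4. After all images have been processed, compute the overall average IoU and the per-class average IoU. The IoU of each TP has already been added to avg_iou, and each FP contributes an IoU of 0.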

if ((tp_for_thresh + fp_for_thresh) > 0)
    avg_iou = avg_iou / (tp_for_thresh + fp_for_thresh);
int class_id;
for(class_id = 0; class_id < classes; class_id++)
{
    if ((tp_for_thresh_per_class[class_id] + fp_for_thresh_per_class[class_id]) > 0)
        avg_iou_per_class[class_id] = avg_iou_per_class[class_id] / (tp_for_thresh_per_class[class_id] + fp_for_thresh_per_class[class_id]);
}
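
A small consequence of this formula is that FPs pull the average IoU down even though they add nothing to the sum. The helper below is only an illustration with a hypothetical name (sketch_avg_iou); it restates the division above for a single counter.

```c
#include <assert.h>

/* Average IoU over all counted detections: each TP contributes its IoU,
   each FP contributes 0, and the sum is divided by TP + FP. */
static float sketch_avg_iou(float tp_iou_sum, int tp, int fp)
{
    assert(tp >= 0 && fp >= 0);
    return (tp + fp) > 0 ? tp_iou_sum / (tp + fp) : 0;
}
```

For two TPs with IoUs 0.8 and 0.6 plus one FP, the result is 1.4 / 3, roughly 0.467, rather than the 0.7 one would get by averaging over the TPs alone.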

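5. Next come the per-class AP and the mAP. First sort detections in descending order of prob, so that detections[0] has the highest prob across all classes. rank is the confidence rank: rank = 0 corresponds to the highest prob and rank = detections_count - 1 to the lowest.
Now for the meaning of pr. pr is a classes × detections_count array, and pr[class_id][rank] holds the precision/recall statistics of class class_id when only detections with prob >= detections[rank].p are counted. That is why pr[class_id][rank].tp is initialized from pr[class_id][rank - 1].tp (and likewise for fp), and why the tp and fp counts at rank are never smaller than those at rank - 1: as rank increases the required prob drops, so no detection is lost and neither TP nor FP can decrease. At the last rank, rank == detections_count - 1, every detection is included (all of them above 0.005). As before, truth_flags marks whether each gt has already been matched by some detection. At each detection's prob level, the TP or FP count is incremented depending on whether the detection hit a new gt, and precision and recall are then computed.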

qsort(detections, detections_count, sizeof(box_prob), detections_comparator);

// for PR-curve
pr_t** pr = (pr_t**)calloc(classes, sizeof(pr_t*));    // pr[classes][detections_count]
for (i = 0; i < classes; ++i)
    pr[i] = (pr_t*)calloc(detections_count, sizeof(pr_t));

for (rank = 0; rank < detections_count; ++rank) 
{
    if (rank > 0) 
    {
        int class_id;
        for (class_id = 0; class_id < classes; ++class_id) 
        {
            pr[class_id][rank].tp = pr[class_id][rank - 1].tp;
            pr[class_id][rank].fp = pr[class_id][rank - 1].fp;
        }
    }
    box_prob d = detections[rank];
    // if (detected && isn't detected before)
    if (d.truth_flag == 1) 
    {
        if (truth_flags[d.unique_truth_index] == 0)
        {
            truth_flags[d.unique_truth_index] = 1;
            pr[d.class_id][rank].tp++;    // true-positive
        } 
        else
            pr[d.class_id][rank].fp++;
    }
    else 
    {
        pr[d.class_id][rank].fp++;    // false-positive
    }
    for (i = 0; i < classes; ++i)
    {
        const int tp = pr[i][rank].tp;
        const int fp = pr[i][rank].fp;
        const int fn = truth_classes_count[i] - tp;    // false-negative = objects - true-positive
        pr[i][rank].fn = fn;
        if ((tp + fp) > 0) pr[i][rank].precision = (double)tp / (double)(tp + fp);
        else pr[i][rank].precision = 0;
        if ((tp + fn) > 0) pr[i][rank].recall = (double)tp / (double)(tp + fn);
        else pr[i][rank].recall = 0;
    }
}
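
To see the cumulative counts on concrete numbers, here is a single-class sketch of the loop above. It is a simplification, not the original code: sketch_ranked, sketch_pr, sketch_by_score_desc and sketch_pr_curve are hypothetical names, there is only one class (so the copy from rank - 1 becomes two running counters), and truth_flags is assumed to be zero-initialized by the caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical detection record: score and matched gt index (-1 if unmatched). */
typedef struct { float p; int truth_index; } sketch_ranked;
/* Hypothetical per-rank statistics, mirroring the fields used in the excerpt. */
typedef struct { int tp, fp, fn; double precision, recall; } sketch_pr;

/* qsort comparator: descending score. */
static int sketch_by_score_desc(const void *pa, const void *pb)
{
    const sketch_ranked *a = (const sketch_ranked *)pa;
    const sketch_ranked *b = (const sketch_ranked *)pb;
    return (a->p < b->p) - (a->p > b->p);
}

/* Fill pr[0..n-1] with cumulative statistics for one class.
   truth_count is the number of gt objects of that class;
   truth_flags must hold truth_count zeros on entry. */
static void sketch_pr_curve(sketch_ranked *dets, int n, int truth_count,
                            int *truth_flags, sketch_pr *pr)
{
    assert(n >= 0 && truth_count >= 0);
    qsort(dets, (size_t)n, sizeof(sketch_ranked), sketch_by_score_desc);
    int tp = 0, fp = 0;
    for (int rank = 0; rank < n; ++rank) {
        int t = dets[rank].truth_index;
        if (t > -1 && truth_flags[t] == 0) { truth_flags[t] = 1; ++tp; }
        else ++fp;                      /* unmatched, or gt already claimed */
        int fn = truth_count - tp;      /* false negatives = objects - TP */
        pr[rank].tp = tp;
        pr[rank].fp = fp;
        pr[rank].fn = fn;
        pr[rank].precision = (tp + fp) > 0 ? (double)tp / (tp + fp) : 0;
        pr[rank].recall = (tp + fn) > 0 ? (double)tp / (tp + fn) : 0;
    }
}
```

Take 3 gt objects and four detections with scores 0.9, 0.8, 0.7, 0.6 matched to gts 0, 0, 1 and none. The (precision, recall) pairs by rank are (1, 1/3), (1/2, 1/3), (2/3, 2/3), (1/2, 2/3): the counts never decrease, while precision moves up and down.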

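6. With the precision and recall at every rank available, the mAP can be computed. There are two cases. When map_points is 0, the precision at every recall point is considered and accumulated, which amounts to the area under the PR curve (note: interpolated precision is used, i.e. the precision at a given recall is the maximum precision over all points whose recall is not smaller). Because detections are sorted by descending prob, walking rank from high to low visits recall from high to low, with precision generally going from low to high. Recall decreases monotonically along this walk, but precision can fluctuate. If precision does not rise as recall drops, that point's own precision is ignored and the running maximum is used for the area, until a higher precision is encountered. When map_points is not 0 the computation is more direct: for each sampled recall level, search for the maximum precision among points whose recall is at least that level. On the same dataset, -points 0 gives a slightly higher mAP than -points 101.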

for (i = 0; i < classes; ++i) 
{
	double avg_precision = 0;
	if (map_points == 0)
	{
		double last_recall = pr[i][detections_count - 1].recall;
		double last_precision = pr[i][detections_count - 1].precision;
		for (rank = detections_count - 2; rank >= 0; --rank)
		{
			double delta_recall = last_recall - pr[i][rank].recall;
			last_recall = pr[i][rank].recall;
			if (pr[i][rank].precision > last_precision) 
				last_precision = pr[i][rank].precision;
			avg_precision += delta_recall * last_precision;
		}
		//add remaining area of PR curve when recall isn't 0 at rank-1
		double delta_recall = last_recall - 0;
		avg_precision += delta_recall * last_precision;
	}
	// MSCOCO - 101 Recall-points, PascalVOC - 11 Recall-points
	else
	{
		int point;
		for (point = 0; point < map_points; ++point) 
		{
			double cur_recall = point * 1.0 / (map_points - 1);
			double cur_precision = 0;
			for (rank = 0; rank < detections_count; ++rank)
			{
				if (pr[i][rank].recall >= cur_recall)    // > or >=
					if (pr[i][rank].precision > cur_precision)
						cur_precision = pr[i][rank].precision;
			}
			avg_precision += cur_precision;
		}
		avg_precision = avg_precision / map_points;
	}
	mean_average_precision += avg_precision;
}
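
The two branches can be compared on a small curve. The sketch below restates them for one class on plain arrays. The function name sketch_average_precision and its array-based signature are assumptions made for illustration; the arithmetic follows the excerpt above.

```c
#include <assert.h>

/* Average precision for one class from a PR curve ordered by descending score,
   so recall[] is non-decreasing in rank. map_points == 0 takes the area under
   the interpolated curve; otherwise precision is sampled at map_points evenly
   spaced recall levels from 0 to 1. */
static double sketch_average_precision(const double *precision, const double *recall,
                                       int n, int map_points)
{
    assert(n > 0 && map_points >= 0 && map_points != 1);
    double ap = 0;
    if (map_points == 0) {
        double last_recall = recall[n - 1];
        double last_precision = precision[n - 1];
        for (int rank = n - 2; rank >= 0; --rank) {
            double delta_recall = last_recall - recall[rank];
            last_recall = recall[rank];
            if (precision[rank] > last_precision)
                last_precision = precision[rank];   /* running maximum */
            ap += delta_recall * last_precision;
        }
        ap += last_recall * last_precision;   /* strip between recall 0 and rank 0 */
    } else {
        for (int point = 0; point < map_points; ++point) {
            double cur_recall = point * 1.0 / (map_points - 1);
            double cur_precision = 0;
            for (int rank = 0; rank < n; ++rank)
                if (recall[rank] >= cur_recall && precision[rank] > cur_precision)
                    cur_precision = precision[rank];
            ap += cur_precision;
        }
        ap /= map_points;
    }
    return ap;
}
```

For the four-point curve (1, 1/3), (1/2, 1/3), (2/3, 2/3), (1/2, 2/3), written as (precision, recall), the area is 1/3 × 1 + 1/3 × 2/3 = 5/9, about 0.556. With 11 points, the recall levels 0 to 0.3 see precision 1, the levels 0.4 to 0.6 see 2/3, and the rest see 0, giving 6/11, about 0.545. That matches the observation that the sampled variant comes out slightly lower.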

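The code can be a little abstract on its own, so it helps to read it alongside the figure later in this article. Each element of the pr array corresponds to one point on the PR plot. Throughout the computation the variable last_precision holds the largest precision seen so far. The mAP computation walks the plot from right to left; picture a point sliding along the whole green line from right to left:
(1) When it moves left, delta_recall is the horizontal distance covered and last_precision stays the same; the AP gained is the area of the rectangle spanned by that horizontal distance and the maximum precision (interpolation);
(2) When it moves up, recall is unchanged and delta_recall = 0, so AP does not grow, but last_precision keeps rising until it reaches the next peak.

That is my understanding of the mAP function. Discussion is welcome.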