老卫带你学 --- Performance Metrics for Object Detection


In object detection research, we use the following metrics to measure the different aspects of a model's performance.

1. Precision-Recall

While training YOLOv2, the framework prints several values that evaluate how well training is going, such as Recall and IoU. To keep from forgetting them later, I am writing down my understanding of these metrics here.

This article first sets up a small test set, then uses it to walk through how each metric is computed.

Geese and Airplanes

Suppose we have a test set whose images contain only two kinds of objects, geese and airplanes, as shown below:
[Figure: the test set of goose and airplane images]

Suppose the ultimate goal of your classifier is to retrieve all the airplane images in the test set, and none of the goose images.

Now make the following definitions:
True positives: airplane images that are correctly identified as airplanes.
True negatives: goose images that are not retrieved; the system correctly treats them as geese.
False positives: goose images that are incorrectly identified as airplanes.
False negatives: airplane images that are not retrieved; the system incorrectly treats them as geese.

Suppose that under these assumptions the classifier retrieves four images, as shown below:
[Figure: the four retrieved images, airplanes framed in green and a goose framed in red]

Among the four retrieved images:
True positives: three, the airplanes framed in green.
False positives: one, the goose framed in red.

Among the six images that were not retrieved:
True negatives: four; these goose images were correctly left unretrieved.
False negatives: two; two airplanes were missed, which the system incorrectly treated as geese.
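
To make the bookkeeping concrete, here is a minimal sketch (not from the original post; the label order is hypothetical) that reproduces the counts above in Python:

# 1 = airplane, 0 = goose; `retrieved` marks the four images the classifier returned.
labels    = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]   # 5 airplanes and 5 geese
retrieved = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # only the first four were retrieved

tp = sum(1 for y, r in zip(labels, retrieved) if y == 1 and r == 1)  # airplanes retrieved
fp = sum(1 for y, r in zip(labels, retrieved) if y == 0 and r == 1)  # geese retrieved
tn = sum(1 for y, r in zip(labels, retrieved) if y == 0 and r == 0)  # geese left out
fn = sum(1 for y, r in zip(labels, retrieved) if y == 1 and r == 0)  # airplanes missed
print(tp, fp, tn, fn)  # -> 3 1 4 2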

Precision and Recall

Precision is the fraction of True positives among the retrieved images:

Precision = True positives / (True positives + False positives)

The denominator, (True positives + False positives), is the total number of images the system retrieved.
In this example, True positives is 3 and False positives is 1, so Precision is 3/(3+1) = 0.75.
In other words, 75% of the retrieved images are airplanes.

Recall is the number of airplanes correctly retrieved divided by the total number of airplanes in the test set:

Recall = True positives / (True positives + False negatives)

The denominator, (True positives + False negatives), is the total number of airplane images.
In this example, True positives is 3 and False negatives is 2, so Recall is 3/(3+2) = 0.6.
In other words, 60% of all the airplane images were correctly identified as airplanes.
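
Continuing the sketch above (again ours, not from the original post), the two formulas reproduce these numbers directly:

precision = tp / float(tp + fp)  # 3 / 4 = 0.75
recall    = tp / float(tp + fn)  # 3 / 5 = 0.60
print(precision, recall)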

Adjusting the Threshold

You can also adjust a threshold to control how many images the system retrieves, which in turn changes the Precision and Recall values.
Under one particular threshold (the blue dashed line), the system retrieves four images, as shown below:
[Figure: the test images ranked by score, with a blue dashed line marking the threshold]
The classifier considers the four images above the threshold (above the blue dashed line) more likely to be airplanes.

By moving the threshold (sliding the blue dashed line up or down), we choose how many images the system retrieves, and of course Precision and Recall change along with it. For example, if the blue dashed line is placed just below the first image, so that the system retrieves only the top airplane image, Precision is 100% while Recall is only 20%. If the line is placed below the second image, so that only the top two images are retrieved, Precision is still 100% and Recall grows to 40%.
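
As a minimal sketch (the confidence scores here are hypothetical, chosen to match the example), sweeping the threshold down the score-ranked list traces out exactly these precision-recall pairs:

# Images sorted by descending confidence; 1 = airplane, 0 = goose.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    1,    0,    1,    1,    0,    0,    0,    0]

total_airplanes = sum(labels)
for k in range(1, len(scores) + 1):   # threshold placed just below the k-th image
    tp = sum(labels[:k])
    precision = tp / float(k)
    recall = tp / float(total_airplanes)
    print('top %d: precision=%.2f recall=%.2f' % (k, precision, recall))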

So how do we automate this?

Step 1: modify the pascal_voc.py file

Add at the top of the file:

import matplotlib.pyplot as plt
import pylab as pl
from sklearn.metrics import precision_recall_curve
from itertools import cycle

Modify the _do_python_eval function:

    def _do_python_eval(self, output_dir = 'output'):
        annopath = os.path.join(
            self._devkit_path,
            'VOC' + self._year,
            'Annotations',
            '{:s}.xml')
        imagesetfile = os.path.join(
            self._devkit_path,
            'VOC' + self._year,
            'ImageSets',
            'Main',
            self._image_set + '.txt')
        cachedir = os.path.join(self._devkit_path, 'annotations_cache')
        aps = []
        # The PASCAL VOC metric changed in 2010
        use_07_metric = True if int(self._year) < 2010 else False
        print 'VOC07 metric? ' + ('Yes' if use_07_metric else 'No')
        if not os.path.isdir(output_dir):
            os.mkdir(output_dir)
        for i, cls in enumerate(self._classes):
            if cls == '__background__':
                continue
            filename = self._get_voc_results_file_template().format(cls)
            rec, prec, ap = voc_eval(
                filename, annopath, imagesetfile, cls, cachedir, ovthresh=0.5,
                use_07_metric=use_07_metric)
            aps += [ap]
            # Added: accumulate this class's precision-recall curve on the shared plot.
            pl.plot(rec, prec, lw=2,
                    label='Precision-recall curve of class {} (area = {:.4f})'
                          ''.format(cls, ap))
            print('AP for {} = {:.4f}'.format(cls, ap))
            with open(os.path.join(output_dir, cls + '_pr.pkl'), 'w') as f:
                cPickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)
 
        # Added: label and display the combined precision-recall plot.
        pl.xlabel('Recall')
        pl.ylabel('Precision')
        plt.grid(True)
        pl.ylim([0.0, 1.05])
        pl.xlim([0.0, 1.0])
        pl.title('Precision-Recall')
        pl.legend(loc="upper right")
        plt.show()
 
        print('Mean AP = {:.4f}'.format(np.mean(aps)))
        print('~~~~~~~~')
        print('Results:')
        for ap in aps:
            print('{:.3f}'.format(ap))
        print('{:.3f}'.format(np.mean(aps)))
        print('~~~~~~~~')
        print('')
        print('--------------------------------------------------------------')
        print('Results computed with the **unofficial** Python eval code.')
        print('Results should be very close to the official MATLAB eval code.')
        print('Recompute with `./tools/reval.py --matlab ...` for your paper.')
        print('-- Thanks, The Management')
        print('--------------------------------------------------------------')

Step 2: run the test_net.py file

Note: the curves are not produced from the training log; they are generated during the test run itself.

If you have already trained a model, re-run the test command to produce the curves; the command is as follows:

./tools/test_net.py --gpu 0 --def models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt --net output/faster_rcnn_end2end/voc_2007_trainval/vgg16_faster_rcnn_iter_70000.caffemodel --cfg experiments/cfgs/faster_rcnn_end2end.yml

2. IoU

IoU measures how much the box predicted by the system overlaps the box annotated in the original image.

It is computed as the intersection of the Detection Result and the Ground Truth divided by their union, and it serves as the measure of localization accuracy:

IoU = (DetectionResult ⋂ GroundTruth) / (DetectionResult ⋃ GroundTruth)

As shown below:
The blue box is the GroundTruth.
The yellow box is the DetectionResult.
The green region is DetectionResult ⋂ GroundTruth.
The red region is DetectionResult ⋃ GroundTruth.

[Figure: IoU illustrated as the intersection of the two boxes over their union]
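
As a minimal sketch (ours, not from the original post), IoU for two axis-aligned boxes in (x1, y1, x2, y2) form can be computed like this:

def iou(box_a, box_b):
    # Intersection rectangle of the two boxes.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = area A + area B - intersection.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 = 0.142857...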

3. ROC

The code is given below.
Adjust some of its parameters to fit your project:
n_classes is the number of target classes;
classes holds the class names;
array1 holds the true class of each sample (one-hot encoded);
array2 holds the predicted probability (confidence) for each sample.


import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle

from sklearn.metrics import roc_curve, auc
from numpy import interp  # scipy.interp was an alias of numpy.interp and is removed in recent SciPy

n_classes=4
classes=['suozhai','xunsan','kuiyang','suizhi']
array1=(
    (1,0,0,0),#suozhai
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (0, 1, 0, 0),#xunsan
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0,0,1,0),#kuiyang
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,1,0),
    (0, 0, 0, 1),  # suizhi
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1),
    #(0, 0, 0, 1)
)
array2=(
    (0.999,0,0,0),#suozhai
    (0,0,0,0.999),
    (1,0,0,0),
    (0.999,0,0,0),
    (0,0.987,0,0),
    (0.999,0,0,0),
    (1,0,0,0),
    (1,0,0,0),
    (0,0,0,0.997),
    (0.999,0,0,0),
    (0.999,0,0,0),
    (0,0,0.998,0),
    (1,0,0,0),
    (0,0,0,0.999),
    (0.997,0,0,0),
    (1,0,0,0),
    (0.999,0,0,0),
    (0.999,0,0,0),
    (0.999,0,0,0),
    (0,0.999,0,0),
    (0.998,0,0,0),
    (1,0,0,0),
    (0.998,0,0,0),
    (0,0.998,0,0),
    (0,0,0,0.999),
    (0.999,0,0,0),
    (0,0.6,0,0),
    (0,0,0.4,0),
    (0,0,0,0.6),
    (0,0.4,0,0),
    (0,0.992,0,0),#xunsan
    (0,1,0,0),
    (0,1,0,0),
    (0,0.999,0,0),
    (0,0.999,0,0),
    (0,0.999,0,0),
    (0.54,0,0,0),
    (0,1,0,0),
    (0,1,0,0),
    (0,0,0.4,0),
    (0,0.995,0,0),
    (0,0.999,0,0),
    (0,0.999,0,0),
    (0,0.998,0,0),
    (0,1,0,0),
    (0,1,0,0),
    (0,1,0,0),
    (0,0.998,0,0),
    (0,1,0,0),
    (0,1,0,0),
    (0,0.999,0,0),
    (0,0.999,0,0),
    (0,0.999,0,0),
    (0,0.999,0,0),
    (0,0.997,0,0),
    (0,0.999,0,0),
    (0.4,0,0,0),
    (0,0,0.6,0),
    (0,0,0,0.3),
    (0.4,0,0,0),
    (0,0,0.999,0),#kuiyang
    (0,0,0.999,0),
    (0,0,0.999,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0.914,0,0),
    (0,0,1,0),
    (0,0,0.999,0),
    (0,0,1,0),
    (0,0,1,0),
    (0,0,0.999,0),
    (0.4,0,0,0),
    (0,0,1,0),
    (0,0,0.999,0),
    (0,0,1,0),
    (0,0,1,0),
    (0, 0, 1, 0),
    (0, 0, 1, 0),
    (0, 0, 1, 0),
    (0, 0, 1, 0),
    (0,0,0.999,0),
    (0,0,1,0),
    (0,0,0.999,0),
    (0.4,0,0,0),
    (0, 0.6, 0, 0),
    (0, 0, 0, 0.35),
    (0.4,0,0,0),
    (0, 0.6, 0, 0),
    (0, 0, 0, 0.35),
    (0, 0.45, 0, 0),

    (0.374,0,0,0),#suizhi
    (0,0,0,0.999),
    (0,0,0,0.999),
    (0,0,0,0.975),
    (0, 0, 0, 0.945),
    (0, 0, 0, 0.916),
    (0, 0, 0, 0.935),
    (0, 0, 0, 0.967),
    (0, 0, 0, 0.949),
    (0.436,0,0,0),
    (0,0.358,0,0),
    (0,0,0.587,0),
    (0.123,0,0,0),
    (0.236,0,0,0),
    (0,0.356,0,0),
    (0,0,0.368,0),
    (0,0,0.489,0),
    (0.675,0,0,0),
    (0,0.489,0,0),
    (0,0,0.674,0),
    #(0,0.587,0,0),
    #(0,0,468,0,0),
    #(0.289,0,0,0),
    #(0,0,0.385,0),
    #(0.358,0,0,0),
    #(0,0.648,0,0),
    #(0,0,0.486,0),
    #(0.368,0,0,0),
    #(0,0.489,0,0),
    #(0,0.289,0,0)
)

y_test=np.array(array1)
y_score=np.array(array2)


# Compute the per-class ROC curve and AUC.
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])


# Compute micro-average ROC curve and ROC area
fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])


# Plot the ROC curve of a single class (index 2) as an example.
plt.figure()
lw = 2
plt.plot(fpr[2], tpr[2], color='darkorange',
         lw=lw, label='ROC curve (area = %0.2f)' % roc_auc[2])
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()


# Compute macro-average ROC curve and ROC area

# First aggregate all false positive rates
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))

# Then interpolate all ROC curves at these points
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += interp(all_fpr, fpr[i], tpr[i])

# Finally average it and compute AUC
mean_tpr /= n_classes

fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])

# Plot all ROC curves
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["micro"]),
         color='deeppink', linestyle=':', linewidth=4)

plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)

colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label='ROC curve of class {0} (area = {1:0.2f})'
             ''.format(classes[i], roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curves')
plt.legend(loc="lower right")
plt.show()

AUC values are commonly graded as follows:
0.90-1.00 = very good (A)
0.80-0.90 = good (B)
0.70-0.80 = not so good (C)
0.60-0.70 = poor (D)
0.50-0.60 = fail (F)
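
As a tiny sketch (the helper name is ours, not from the original post), this rubric maps directly to code:

def auc_grade(auc_value):
    # Map an AUC value to the letter grade in the rubric above.
    for threshold, grade in [(0.90, 'A (very good)'), (0.80, 'B (good)'),
                             (0.70, 'C (not so good)'), (0.60, 'D (poor)'),
                             (0.50, 'F (fail)')]:
        if auc_value >= threshold:
            return grade
    return 'worse than random guessing'

print(auc_grade(0.85))  # -> B (good)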
