[VarifocalNet] VarifocalNet: An IoU-aware Dense Object Detector (CVPR 2021 oral)

Ah丶Weii

Article link: https://blog.csdn.net/weixin_43823854/article/details/116542880

Contents

1. Motivation
2. Contribution
3. Method
4. Experiments

1. Motivation

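Earlier work ranks candidate boxes either by the classification score alone or by a score that combines classification and localization.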

Prior work uses the classification score or a combination of classification and predicted localization scores to rank candidates.

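Detection post-processing normally applies NMS, which ranks candidate boxes by their classification scores. This hurts detection performance. The authors argue that the reason is that the classification score is not always a good estimate of how accurately a bbox is localized, so a well-localized bbox with a low classification score may be wrongly removed in NMS.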

Generally, the classification score is used to rank the bounding box in NMS.

This harms the detection performance, because the classification score is not always a good estimate of the bounding box localization accuracy [10] and accurately localized detections with low classification scores may be mistakenly removed in NMS.
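The failure mode described above can be reproduced with a toy example. The following is a minimal sketch of greedy NMS in plain Python, not the implementation used in the paper; the boxes, scores, and the 0.5 threshold are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float], thr: float = 0.5) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a kept box by more than thr.
        if all(iou(boxes[i], boxes[k]) <= thr for k in keep):
            keep.append(i)
    return keep


# Hypothetical data: one ground-truth box and two candidates for it.
gt: Box = (10.0, 10.0, 50.0, 50.0)
boxes: List[Box] = [
    (14.0, 14.0, 58.0, 58.0),  # loosely localized, high classification score
    (10.0, 10.0, 50.0, 51.0),  # almost exact, lower classification score
]
cls_scores = [0.90, 0.60]

kept = greedy_nms(boxes, cls_scores, thr=0.5)
ious_with_gt = [iou(b, gt) for b in boxes]
```

Here `kept` contains only index 0: the second box overlaps the first by more than the threshold and is suppressed, even though its IoU with the ground truth is much higher.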

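Earlier methods add an extra IoU score or centerness score as an estimate of localization accuracy. At test time the classification score is multiplied by that score, and the product is used as the "classification score" in NMS. The authors consider this sub-optimal: an extra network branch for predicting the localization score is inelegant and incurs additional computation.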

They are sub-optimal because multiplying the two imperfect predictions may lead to a worse rank basis and we show in Section 3 that the upper bound of the performance achieved by such methods is limited.
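To make the "worse rank basis" concern concrete, here is a small sketch that compares three ways of ranking the same candidates. All numbers are hypothetical, and `loc_score` stands in for either a predicted centerness or a predicted IoU; it is an illustration, not data from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    cls_score: float  # predicted classification score
    loc_score: float  # predicted centerness or IoU (imperfect)
    true_iou: float   # actual IoU with the ground truth (unknown at test time)


def rank(values: List[float]) -> List[int]:
    """Indices sorted by descending value."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical candidates for a single object.
candidates = [
    Candidate(cls_score=0.80, loc_score=0.50, true_iou=0.90),
    Candidate(cls_score=0.70, loc_score=0.90, true_iou=0.60),
    Candidate(cls_score=0.60, loc_score=0.60, true_iou=0.75),
]

rank_by_cls = rank([c.cls_score for c in candidates])
rank_by_product = rank([c.cls_score * c.loc_score for c in candidates])
rank_by_iou = rank([c.true_iou for c in candidates])
```

With these made-up numbers, ranking by classification score alone puts the best-localized box (index 0) first, while the product of the two imperfect predictions pushes it down to second place behind a poorly localized box.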

IoU-Net and the centerness branch in FCOS both add a separate localization accuracy score. This leads the authors to ask:

Instead of predicting an additional localization accuracy score, can we merge it into the classification score?

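Since neither of the two earlier approaches yields a reliable ranking, the authors propose the IACS (IoU-aware Classification Score) as a joint representation of classification and localization, formulate the Varifocal Loss, and propose a star-shaped bounding box feature representation for predicting the IACS and refining the bounding box. Combining these two new components with a bbox refinement branch improves the performance of the FCOS+ATSS detector.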

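As Table 1 shows, the authors use the FCOS+ATSS model on val2017 to analyze how injecting different ground-truth information affects accuracy. Replacing the predicted centerness in FCOS with gt_ctrness or gt_iou raises the AP to 41.1 and 43.5 respectively. Ground-truth information is of course unavailable in real inference (test2017), so the authors read these modest gains as evidence that ranking by ctr × score or IoU × score cannot deliver high performance.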

This indicates that using the product of either the predicted centerness score or the IoU score and the classification score to rank detections is certainly unable to bring significant performance gain.

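With gt_bbox, the AP reaches 56.1 even without centerness. With gt_cls (that is, setting the score at the ground-truth label position to 1), whether centerness is used makes a large difference (43.1 AP without it vs 58.1 AP with it), which shows that centerness helps to pick out accurately localized boxes.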

Because the centerness score can differentiate accurate and inaccurate boxes to some extent

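The authors then replace the classification score with the gt IoU. They name this setting gt_cls_iou; the earlier setting, which replaced centerness with the gt IoU, was named gt_ctr_iou. The gt IoU is defined as follows: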

The IoU between the predicted bounding box and the ground-truth one (termed as gt IoU).

The most surprising result is the one obtained by replacing the classification score of the ground-truth class with the gt IoU (gt cls iou).

In the gt_cls_iou setting, the AP reaches 74.7 without centerness.

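This reveals that very accurate bboxes already exist in the large pool of candidate boxes. The authors conclude that replacing the original classification score with an IoU-aware classification score (IACS) is an effective selection measure.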

This in fact reveals that there already exist accurately localized bounding boxes in the large candidate pool for most objects.

replacing the classification score of the ground-truth class with the gt IoU is the most promising selection measure. We refer to the element of such a score vector as the IoU-aware Classification Score (IACS)

2. Contribution

The authors show that ranking candidate boxes correctly is the key to high performance in dense detectors, and that the IACS achieves a good ranking.

We show that accurately ranking candidate detections is critical for high performing dense object detectors, and IACS achieves a better ranking than other methods (Section 3).

The authors propose the Varifocal Loss.

We propose a new Varifocal Loss for training dense object detectors to regress the IACS.

The authors propose a star-shaped bbox representation for computing the IACS and refining the bbox.

We design a new star-shaped bounding box feature representation for computing the IACS and refining the bounding box.

The authors name the network VarifocalNet, or VFNet for short.

We develop a new dense object detector based on the FCOS [9]+ATSS [12] and the proposed components, named VarifocalNet or VFNet for short, to exploit the advantage of the IACS. An illustration of our method is shown in Figure 1.

3. Method

Compared with the FCOS+ATSS, it has three new components: the varifocal loss, the star-shaped bounding box feature representation and the bounding box refinement.

3.1 IACS – IoU-Aware Classification Score

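The authors replace the value at the ground-truth class label position (originally 1) with the IoU between the predicted bbox and the gt bbox. The labels at all other positions remain 0.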

We define the IACS as a scalar element of a classification score vector, in which the value at the ground-truth class label position is the IoU between the predicted bounding box and its ground truth, and 0 at other positions.
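The definition above can be sketched as a target vector for a single foreground location. This is a minimal plain-Python illustration, not the authors' code; the helper name `iacs_target`, the box format `(x1, y1, x2, y2)`, and the example values are assumptions made for this sketch.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iacs_target(pred_box: Box, gt_box: Box, gt_class: int, num_classes: int) -> List[float]:
    """Hypothetical helper: IACS target vector for one foreground location.

    A one-hot label would hold 1.0 at gt_class; here that entry is the IoU
    between the predicted box and its ground truth, and the rest stay 0.
    """
    target = [0.0] * num_classes
    target[gt_class] = iou(pred_box, gt_box)
    return target


# Hypothetical example: 4 classes, ground-truth class index 2.
target = iacs_target(
    pred_box=(10.0, 10.0, 50.0, 51.0),
    gt_box=(10.0, 10.0, 50.0, 50.0),
    gt_class=2,
    num_classes=4,
)
```

The only difference from an ordinary one-hot classification target is the value stored at the ground-truth class index, so a single score per class carries both the class confidence and the localization quality.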