[VarifocalNet] VarifocalNet: An IoU-aware Dense Object Detector (CVPR 2021 Oral)


1. Motivation

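Earlier detectors pick candidate boxes in one of two ways: by the classification score alone, or by a score that combines classification and localization.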

Prior work uses the classification score or a combination of classification and predicted localization scores to rank candidates.

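NMS is the usual post-processing step in detection, and it ranks candidate boxes by their classification scores. This hurts detection performance. The authors attribute the problem to the classification score not always being a reliable estimate of bbox localization accuracy: a correctly localized bbox with a low classification score can be wrongly removed during NMS.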

Generally, the classification score is used to rank the bounding box in NMS.

This harms the detection performance, because the classification score is not always a good estimate of the bounding box localization accuracy [10] and accurately localized detections with low classification scores may be mistakenly removed in NMS.
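The failure mode described above can be reproduced with a minimal greedy NMS in plain Python. This is only a sketch: the boxes, scores, and ground-truth box are made-up values chosen for illustration, and `iou` and `nms` are hypothetical helper names, not code from the paper.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep the top-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thr]
    return keep


# Illustrative values only (assumptions, not data from the paper).
gt_box = (10.0, 10.0, 110.0, 110.0)
boxes = [
    (25.0, 25.0, 125.0, 125.0),  # poorly localized, high classification score
    (12.0, 12.0, 110.0, 110.0),  # well localized, low classification score
]
cls_scores = [0.90, 0.60]

kept = nms(boxes, cls_scores)
gt_ious = [iou(b, gt_box) for b in boxes]
```

In this toy case NMS keeps only the first box because it has the higher classification score, even though the second box overlaps the ground truth far better.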

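Earlier methods add an extra IoU score or centerness score as a localization accuracy estimate. At test time, the classification score is multiplied by that score, and the product serves as the 'classification score' in NMS. The authors consider this sub-optimal: an extra network branch for predicting the localization score is inelegant and incurs additional computation.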

They are sub-optimal because multiplying the two imperfect predictions may lead to a worse rank basis and we show in Section 3 that the upper bound of the performance achieved by such methods is limited.
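To make the "two imperfect predictions" argument concrete, here is a small sketch that ranks two candidates by the product of classification score and localization score. All numbers are hypothetical, and `rank_by_product` is an illustrative helper, not part of any detector's API.

```python
def rank_by_product(cls_scores, loc_scores):
    """Return candidate indices sorted by cls_score * loc_score, best first."""
    products = [c * l for c, l in zip(cls_scores, loc_scores)]
    return sorted(range(len(products)), key=lambda i: products[i], reverse=True)


# Hypothetical candidates: index 0 is poorly localized, index 1 is well localized.
cls_scores = [0.90, 0.60]
true_ious = [0.57, 0.96]   # actual overlap with the ground truth (unknown at test time)
pred_ious = [0.80, 0.85]   # imperfect output of an extra IoU branch

order_with_pred = rank_by_product(cls_scores, pred_ious)
order_with_true = rank_by_product(cls_scores, true_ious)
```

With the imperfect predicted IoU the poorly localized box still ranks first; only the true IoU, which is not available at inference, would flip the order.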

Both IoU-Net and the centerness branch in FCOS add a separate localization accuracy score. This leads the authors to ask:

Instead of predicting an additional localization accuracy score, can we merge it into the classification score?

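Since neither of the two earlier approaches yields a reliable ranking, the authors propose the IACS (IoU-aware Classification Score) as a joint representation of classification and localization. They design the Varifocal Loss to train it and propose a star-shaped bounding box feature representation, used both for predicting the IACS and for refining the bounding box. Combining these two new components with a bbox refinement branch improves the performance of the FCOS+ATSS detector.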

[Table 1: accuracy of FCOS+ATSS on val2017 when predicted values are replaced with ground-truth information]

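As Table 1 shows, the authors analyze on val2017 how feeding different prior (ground-truth) information into the FCOS+ATSS model affects accuracy. Replacing the predicted centerness in FCOS with gt_ctrness or gt_iou raises AP only to 41.1 and 43.5, respectively. No gt information is available at actual inference time (test2017), so the authors take these small gains to mean that ranking by ctr x scores or iou x scores cannot deliver high performance.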

This indicates that using the product of either the predicted centerness score or the IoU score and the classification score to rank detections is certainly unable to bring significant performance gain.

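With gt_bbox, AP reaches 56.1 even without centerness. With gt_cls (the score at the ground-truth label position set to 1), however, the results without and with centerness differ sharply (43.1 vs 58.1), which shows that centerness helps pick out accurate boxes.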

Because the centerness score can differentiate accurate and inaccurate boxes to some extent

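The authors then replace the classification score with the gt IoU. They name this setting gt_cls_iou; the earlier setting, where the gt IoU replaces centerness, is named gt_ctr_iou. The gt IoU is defined as follows: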

The IoU between the predicted bounding box and the ground-truth one (termed as gt IoU).

The most surprising result is the one obtained by replacing the classification score of the ground-truth class with the gt IoU (gt cls iou).

Under gt_cls_iou, AP reaches 74.7 without centerness.

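This reveals that very accurate bboxes already exist among the large pool of candidates. The authors conclude that replacing the original classification score with an IoU-aware classification score (IACS) is an effective selection measure.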

This in fact reveals that there already exist accurately localized bounding boxes in the large candidate pool for most objects.

replacing the classification score of the ground-truth class with the gt IoU is the most promising selection measure. We refer to the element of such a score vector as the IoU-aware Classification Score (IACS)

2. Contribution

  • The authors show that ranking candidate boxes correctly is the key to high performance in dense detectors, and that IACS achieves a good ranking.

We show that accurately ranking candidate detections is critical for high performing dense object detectors, and IACS achieves a better ranking than other methods (Section 3).

  • The authors propose the Varifocal Loss.

We propose a new Varifocal Loss for training dense object detectors to regress the IACS.

  • The authors propose a star-shaped bbox representation for computing the IACS and refining the bbox.

We design a new star-shaped bounding box feature representation for computing the IACS and refining the bounding box.

  • The authors name the network VarifocalNet, or VFNet.

We develop a new dense object detector based on the FCOS [9]+ATSS [12] and the proposed components, named VarifocalNet or VFNet for short, to exploit the advantage of the IACS. An illustration of our method is shown in Figure 1.

3. Method

Compared with the FCOS+ATSS, it has three new components: the Varifocal Loss, the star-shaped bounding box feature representation and the bounding box refinement.

3.1 IACS – IoU-Aware Classification Score

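The authors replace the value at the gt class label position (originally 1) with the IoU between the predicted bbox and the gt bbox. The labels at all other positions remain 0.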

We define the IACS as a scalar element of a classification score vector, in which the value at the ground-truth class label position is the IoU between the predicted bounding box and its ground truth, and 0 at other positions.
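The target vector this definition describes can be sketched as follows. The function name `iacs_target`, the `(x1, y1, x2, y2)` box format, and the use of `None` to mark a background location are assumptions made for this illustration.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def iacs_target(num_classes, gt_class, pred_box=None, gt_box=None):
    """Build the IACS target vector for one location.

    gt_class is the ground-truth class index, or None for a background
    location (an assumption of this sketch). The target is 0 everywhere
    except at gt_class, where it is IoU(pred_box, gt_box) instead of 1.
    """
    target = [0.0] * num_classes
    if gt_class is not None:
        target[gt_class] = iou(pred_box, gt_box)
    return target


# Foreground location: class 2 of 4, predicted box close to the ground truth.
fg = iacs_target(4, 2, (12.0, 12.0, 110.0, 110.0), (10.0, 10.0, 110.0, 110.0))
# Background location: every entry stays 0.
bg = iacs_target(4, None)
```

Compared with a one-hot label, the only difference is the value at the ground-truth class position, so the classification branch is trained to output a score that already reflects localization quality.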

3.2 Varifocal Loss


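The original Focal Loss:

$$
\mathrm{FL}(p, y) =
\begin{cases}
-\alpha (1 - p)^{\gamma} \log(p) & \text{if } y = 1 \\
-(1 - \alpha)\, p^{\gamma} \log(1 - p) & \text{otherwise}
\end{cases}
$$

where p is the predicted probability of the foreground class and y is the ground-truth label.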
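As a reference point, the standard Focal Loss for a single prediction can be sketched in plain Python. It applies `-alpha * (1 - p)**gamma * log(p)` to positives and `-(1 - alpha) * p**gamma * log(1 - p)` to negatives. The defaults `alpha=0.25` and `gamma=2.0` are common choices assumed here, and the clamping constant `eps` is an implementation detail of this sketch.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal Loss for one predicted foreground probability p and label y (1 or 0)."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    if y == 1:
        # Positive: easy examples (p close to 1) are down-weighted by (1 - p)^gamma.
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    # Negative: easy examples (p close to 0) are down-weighted by p^gamma.
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


easy_pos = focal_loss(0.9, 1)
hard_pos = focal_loss(0.1, 1)
easy_neg = focal_loss(0.1, 0)
hard_neg = focal_loss(0.9, 0)
```

Both branches use the same kind of modulating factor, which is the symmetry that the Varifocal Loss gives up.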

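The authors borrow the Focal Loss treatment of class imbalance, but unlike Focal Loss, which treats positive and negative samples equally, they treat them asymmetrically (I take asymmetric to mean that the two branches of the formula are not symmetric functions). The proposed Varifocal Loss:

$$
\mathrm{VFL}(p, q) =
\begin{cases}
-q \left( q \log(p) + (1 - q) \log(1 - p) \right) & q > 0 \\
-\alpha\, p^{\gamma} \log(1 - p) & q = 0
\end{cases}
$$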

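Here p is the predicted IACS and q is the target score. For a foreground point, q is the IoU between the predicted bbox and the gt bbox; for a background point, q is 0 for the targets of all classes, as shown in Figure 1.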
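A per-element sketch of the Varifocal Loss in plain Python may help here. For q > 0 it is binary cross-entropy against the soft target q, weighted by q; for q = 0 it is `-alpha * p**gamma * log(1 - p)`. The defaults `alpha=0.75` and `gamma=2.0` and the clamping constant `eps` are assumptions of this sketch, and the function is not taken from any particular codebase.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-12):
    """Varifocal Loss for one predicted IACS p and one target score q."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    if q > 0:
        # Foreground: BCE with soft target q, weighted by q itself,
        # so high-IoU positives contribute more to the loss.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Background: only this branch is scaled down, by alpha * p^gamma.
    return -alpha * p ** gamma * math.log(1.0 - p)


at_target = varifocal_loss(0.8, 0.8)    # prediction matches the target IoU
off_target = varifocal_loss(0.5, 0.8)   # prediction too low
high_iou = varifocal_loss(0.5, 0.9)     # high-quality positive
low_iou = varifocal_loss(0.5, 0.3)      # low-quality positive
easy_neg = varifocal_loss(0.1, 0.0)
hard_neg = varifocal_loss(0.9, 0.0)
```

For a fixed positive target the loss is smallest when p equals q, and at the same prediction a positive with a larger q receives a larger loss, which is the asymmetry the notes describe.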

The authors note that the Varifocal Loss reduces the loss contribution of negative samples only.
