Paper Reading: ATSS

1. Paper overview

Full title: Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

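While following recent progress in object detection, I kept running into the term ATSS without really understanding it, so it looked like a paper worth reading, and I read it.
In short, the paper redefines positive and negative samples in object detection, with an algorithm that combines the strengths of the sample definitions used in anchor-based and anchor-free networks. I do find the title a bit misleading, though: the main work is not really "Bridging the Gap Between Anchor-based and Anchor-free Detection" but the definition of positive and negative samples, and most of the later experiments are run on RetinaNet. In my view it therefore amounts to a further optimization of anchor-based detectors, without much benefit for anchor-free ones.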

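Back around March, before reading this paper, I discussed the definition of positive and negative samples in object detection with a senior labmate, whose graduation project was built on exactly that topic. At the time I thought the direction was worth pursuing further. The rough idea was similar to ATSS, since both rely on statistics of the mean IoU. My plan was to compute a mean IoU for each class, so that a per-class comparison during mAP evaluation would be more convincing.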

The paper's main content covers three aspects:

[Image: the paper's list of main contributions]

2. Three main differences between RetinaNet and FCOS

(1) The number of anchors tiled per location. RetinaNet tiles several anchor boxes per location, while FCOS tiles one anchor point per location.
(2) The definition of positive and negative samples. RetinaNet resorts to the Intersection over Union (IoU) for positives and negatives, while FCOS utilizes spatial and scale constraints to select samples.
(3) The regression starting status. RetinaNet regresses the object bounding box from the preset anchor box, while FCOS locates the object from the anchor point.
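
To make difference (1) concrete, here is a minimal sketch of what "several anchor boxes per location" versus "one anchor point per location" can look like. The function name `anchor_boxes_at` is my own, and the concrete numbers (3 scales x 3 aspect ratios with a base size of 4 x stride for the RetinaNet-style setting, one square anchor of 8 x stride for the single-anchor setting) are assumptions taken from commonly used configurations, not something stated in the quoted text.

```python
import math

def anchor_boxes_at(cx, cy, base_size, scales=(1.0,), ratios=(1.0,)):
    """Return boxes (x1, y1, x2, y2) centered on one feature-map location.

    Each ratio is height / width; the box area stays (base_size * scale) ** 2.
    """
    boxes = []
    for s in scales:
        for r in ratios:
            w = base_size * s / math.sqrt(r)
            h = base_size * s * math.sqrt(r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

stride = 8
cx = cy = stride * 0.5  # center of the first cell on this pyramid level

# Assumed RetinaNet-style setting: 3 scales x 3 aspect ratios = 9 boxes.
retina_style = anchor_boxes_at(
    cx, cy, 4 * stride,
    scales=(2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3)),
    ratios=(0.5, 1.0, 2.0),
)

# Assumed single-anchor setting: one square box per location.
single_anchor = anchor_boxes_at(cx, cy, 8 * stride)

# FCOS-style: just the point itself, no box.
anchor_point = (cx, cy)

print(len(retina_style), len(single_anchor), anchor_point)
```

With one square box per location, an anchor box and an anchor point carry almost the same information, which is what makes the paper's one-anchor RetinaNet comparable to FCOS.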

3. Positive/negative sample assignment strategies of RetinaNet and FCOS

As shown in Figure 1(a), RetinaNet utilizes IoU to divide the anchor boxes from different pyramid levels into positives and negatives. It first labels the best anchor box of each object and the anchor boxes with IoU > θp as positives, then regards the anchor boxes with IoU < θn as negatives, finally other anchor boxes are ignored during training.
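
The IoU-based rule above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function names, the label encoding (GT index for positives, -1 for negatives, -2 for ignored) and the default thresholds θp = 0.5 and θn = 0.4 are choices made for the example.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

NEGATIVE, IGNORED = -1, -2

def assign_by_iou(anchors, gts, pos_thr=0.5, neg_thr=0.4):
    """Label every anchor with a GT index, NEGATIVE, or IGNORED."""
    labels = []
    for anchor in anchors:
        ious = [iou(anchor, gt) for gt in gts]
        best_gt = max(range(len(gts)), key=lambda j: ious[j])
        if ious[best_gt] > pos_thr:
            labels.append(best_gt)
        elif ious[best_gt] < neg_thr:
            labels.append(NEGATIVE)
        else:
            labels.append(IGNORED)
    # The best anchor of each object is positive even below pos_thr.
    for j, gt in enumerate(gts):
        best_anchor = max(range(len(anchors)), key=lambda i: iou(anchors[i], gt))
        if iou(anchors[best_anchor], gt) > 0:
            labels[best_anchor] = j
    return labels

gts = [(0, 0, 10, 10), (40, 40, 60, 60)]
anchors = [
    (0, 0, 10, 10),     # IoU 1.0 with the first GT
    (0, 0, 10, 4.5),    # IoU 0.45, between the two thresholds
    (20, 20, 30, 30),   # overlaps nothing
    (0, 0, 10, 3),      # IoU 0.3
    (40, 40, 50, 50),   # IoU 0.25, but the best anchor of the second GT
]
print(assign_by_iou(anchors, gts))  # [0, -2, -1, -1, 1]
```

The last anchor shows the "best anchor box of each object" rule: its IoU of 0.25 is below θn, yet it becomes a positive because nothing matches the second object better.
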
As shown in Figure 1(b), FCOS uses spatial and scale constraints to divide the anchor points from different pyramid levels. It first considers the anchor points within the ground-truth box as candidate positive samples, then selects the final positive samples from candidates based on the scale range defined for each pyramid level, finally those unselected anchor points are negative samples.
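
A similar sketch for the FCOS rule, again under stated assumptions: the function name is mine, the per-level scale ranges (0-64, 64-128, 128-256, 256-512, 512 and up, measured on the largest of the four point-to-border distances) are the defaults I recall from FCOS, and the handling of a point that falls into several ground-truth boxes is left out.

```python
INF = float("inf")

# Assumed per-level ranges for the largest regression distance (levels 0..4).
SCALE_RANGES = [(0, 64), (64, 128), (128, 256), (256, 512), (512, INF)]

def fcos_is_positive(point, level, gt, scale_ranges=SCALE_RANGES):
    """Check one anchor point on one pyramid level against one GT box."""
    x, y = point
    x1, y1, x2, y2 = gt
    left, top, right, bottom = x - x1, y - y1, x2 - x, y2 - y
    # Spatial constraint: the point must lie inside the ground-truth box.
    if min(left, top, right, bottom) <= 0:
        return False
    # Scale constraint: the largest distance must fit this level's range.
    low, high = scale_ranges[level]
    return low <= max(left, top, right, bottom) <= high

gt = (0, 0, 100, 100)
print(fcos_is_positive((50, 50), 0, gt))   # True: inside, max distance 50
print(fcos_is_positive((50, 50), 1, gt))   # False: 50 is below this level's range
print(fcos_is_positive((10, 50), 1, gt))   # True: max distance 90
print(fcos_is_positive((150, 50), 0, gt))  # False: outside the box
```

The two checks run one after the other, which matches the "spatial dimension first, scale dimension second" description in the next paragraph.
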
As shown in Figure 1, FCOS first uses the spatial constraint to find candidate positives in the spatial dimension, then uses the scale constraint to select final positives in the scale dimension. In contrast, RetinaNet utilizes IoU to directly select the final positives in the spatial and scale dimension simultaneously.


4. What happens after swapping the sample assignment strategies of RetinaNet and FCOS?


Conclusion. According to these experiments conducted in a fair way, we can point out that the essential difference between one-stage anchor-based detectors and center-based anchor-free detectors is actually how to define positive and negative training samples, which is important for current object detection and deserves further study.

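The gist: after swapping the two sample selection strategies, RetinaNet's performance goes up while FCOS's goes down. The paper draws its conclusion from this and notes that the underlying principle still deserves further study.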

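One caveat, though: the mAP gap between RetinaNet and FCOS is small, less than 1 point, and this RetinaNet is configured with only one anchor per cell. Personally, I think that makes the comparison unfair to the anchor-based RetinaNet.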

5. The ATSS algorithm

[Images: the ATSS algorithm listing and Figure 3 from the paper]
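
The procedure can be sketched in plain Python roughly as follows, as I understand it from the paper. For each ground-truth box, take the k anchors per pyramid level whose centers are closest to the box center, compute their IoUs with the box, use mean + standard deviation as the IoU threshold, and keep the candidates at or above it whose centers lie inside the box. Several things here are my own assumptions: the names `atss_assign` and `make_level`, the use of the population standard deviation (an implementation may use the sample version instead), and the toy two-level pyramid with k=4. The step that resolves an anchor claimed by several ground-truth boxes is omitted.

```python
import math
import statistics

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def center(box):
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)

def atss_assign(gt, anchors_per_level, k=9):
    """Return ([(level, anchor_index), ...], iou_threshold) for one GT box."""
    gt_center = center(gt)
    candidates = []
    for level, anchors in enumerate(anchors_per_level):
        # k anchors on this level whose centers are closest to the GT center.
        by_distance = sorted(
            range(len(anchors)),
            key=lambda i: math.dist(center(anchors[i]), gt_center),
        )
        candidates += [(level, i) for i in by_distance[:k]]
    ious = [iou(anchors_per_level[l][i], gt) for l, i in candidates]
    # Per-object threshold: mean + standard deviation of the candidate IoUs.
    threshold = statistics.mean(ious) + statistics.pstdev(ious)
    positives = []
    for (l, i), value in zip(candidates, ious):
        cx, cy = center(anchors_per_level[l][i])
        inside = gt[0] < cx < gt[2] and gt[1] < cy < gt[3]
        if value >= threshold and inside:
            positives.append((l, i))
    return positives, threshold

def make_level(stride, size, n):
    """Hypothetical helper: one square anchor per cell of an n x n grid."""
    half = size / 2
    return [
        ((i + 0.5) * stride - half, (j + 0.5) * stride - half,
         (i + 0.5) * stride + half, (j + 0.5) * stride + half)
        for j in range(n) for i in range(n)
    ]

levels = [make_level(8, 32, 8), make_level(16, 64, 4)]
positives, threshold = atss_assign((14, 14, 46, 46), levels, k=4)
print(positives, round(threshold, 3))
```

In this toy case the 32 x 32 object matches the finer level much better than the coarser one, so the threshold lands high and only the single best anchor on the finer level survives.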

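An explanation of Figure 3:
1) If the mean and standard deviation of the IoU statistics are both large, the positives for this GT come mainly from a single FPN level, which corresponds to Level 3 in (a).
2) If the mean and standard deviation of the IoU statistics are both small, the positives for this GT come from several FPN levels, which corresponds to Level 1 and Level 2 in (b).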
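
A small numeric illustration of these two cases. The IoU values below are made up by me for the example (one candidate per level) and are not the numbers from Figure 3; the population standard deviation is also my own choice.

```python
import statistics

def atss_threshold(ious):
    """Mean plus (population) standard deviation of the candidate IoUs."""
    return statistics.mean(ious) + statistics.pstdev(ious)

def levels_kept(ious):
    """1-based pyramid levels whose candidate passes the threshold."""
    thr = atss_threshold(ious)
    return [level for level, value in enumerate(ious, start=1) if value >= thr]

# Case (a): one level fits the object well -> high mean, high deviation.
case_a = [0.06, 0.23, 0.88, 0.32, 0.07]
# Case (b): no level stands out -> low mean, low deviation.
case_b = [0.25, 0.24, 0.10, 0.05, 0.04]

print(round(atss_threshold(case_a), 3), levels_kept(case_a))  # 0.612 [3]
print(round(atss_threshold(case_b), 3), levels_kept(case_b))  # 0.227 [1, 2]
```

In case (b) the threshold drops to about 0.23, so candidates that a fixed 0.5 threshold would reject are still accepted as positives.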

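Note: this effectively lowers the bar for becoming a positive sample for hard-to-train objects that match anchors poorly, so hard objects end up with more positive samples than under the earlier sampling strategies.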

This is also what the following passage of the paper means:

Maintaining fairness between different objects. According to the statistical theory, about 16% of samples are in the confidence interval [m_g + v_g, 1] in theory. Although the IoU of candidates is not a standard normal distribution, the statistical results show that each object has about 0.2 * kL positive samples, which is invariant to its scale, aspect ratio and location. In contrast, strategies of RetinaNet and FCOS tend to have much more positive samples for larger objects, leading to unfairness between different objects.
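
The 16% figure is the upper tail of a normal distribution beyond one standard deviation above the mean, which can be checked quickly. The values k = 9 and L = 5 below are assumptions on my part (the paper's default k as I remember it, and a five-level pyramid); they are only there to turn 0.2 * kL into a concrete number.

```python
from statistics import NormalDist

# Fraction of a normal distribution lying above mean + 1 standard deviation.
upper_tail = 1.0 - NormalDist().cdf(1.0)

k = 9          # assumed number of candidates per pyramid level
num_levels = 5  # assumed number of pyramid levels (L)
candidates = k * num_levels

print(f"theoretical tail: {upper_tail:.4f}")
print(f"candidates per object: {candidates}")
print(f"about 0.2 * kL positives: {0.2 * candidates:.1f}")
```

So under these assumptions each object gets roughly 9 positives out of 45 candidates, regardless of how large it is.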

6. The necessity of tiling multiple anchors per location


These results indicate that under the traditional IoU-based sample selection strategy, tiling more anchor boxes per location is necessary.

First, under the traditional IoU-based sampling strategy, placing more anchors on each cell does improve performance.

Besides, when we change the number of anchor scales or aspect ratios from 3 to 1, the results are almost unchanged as listed in the fourth and fifth rows of Table 7. In other words, as long as the positive samples are selected appropriately, no matter how many anchors are tiled at each location, the results are the same. Thus, we conclude that tiling multiple anchors per location is a thankless operation under our proposed method and further study is needed to discover the right role of multiple anchors per location.

Under the ATSS sampling strategy, however, placing more anchors on each cell brings no performance gain.

7. Detection results on MS COCO test-dev set

[Image: results table on the MS COCO test-dev set from the paper]
