1. Motivation
- While the second stage has a probabilistic interpretation, the combination of the two stages does not.
- A probabilistic two-stage detector is faster and more accurate than both its one- and two-stage precursors.
2. Contribution
- We build a probabilistic two-stage detector on top of state-of-the-art one-stage detectors.
- The resulting detectors are faster and more accurate than both their one- and two-stage precursors.
3. Method
Two-stage detectors
The paper argues that the RPN in two-stage detectors defines negative (background) samples very conservatively, so it favors recall over precision.
- However, an RPN defines background regions very conservatively.
- This label definition favors recall over precision and accurate likelihood estimation.
This work changes only the classification branch of the detector; the regression branch is left unchanged.
- In this work, we keep the bounding-box regression unchanged and only focus on the class distribution.
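In a two-stage detector, the first stage produces a class-agnostic object likelihood $P(O_k)$, and the second stage produces a conditional categorical classification $P(C_k \mid O_k)$ for each proposal. The joint class distribution of the two stages can be modeled as:
$$P(C_k) = \sum_{o} P(C_k \mid O_k = o)\, P(O_k = o) \tag{1}$$
For an annotated object (a positive sample), the first stage must produce a positive detection, $O_k = 1$, and the second stage must predict the correct class with $P(C_k \mid O_k = 1)$:
$$\log P(C_k) = \log P(C_k \mid O_k = 1) + \log P(O_k = 1) \tag{2}$$
For the background class there are two possibilities: the location is rejected as background in the first stage, $P(O_k = 0)$, or it passes the first stage as a positive and is then classified as background in the second stage, $P(bg \mid O_k = 1)$:
$$\log P(bg) = \log\big(P(bg \mid O_k = 1)\, P(O_k = 1) + P(O_k = 0)\big)$$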
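As a minimal sketch of how the two stages combine, the snippet below computes the joint distribution for one proposal from a first-stage objectness and a second-stage class distribution. The function name `joint_class_distribution` and the dict layout are illustrative assumptions, not taken from the paper's code; it also assumes $P(C_k \mid O_k = 0) = 0$ for foreground classes and $P(bg \mid O_k = 0) = 1$.

```python
def joint_class_distribution(p_object, p_class_given_object):
    """Combine first-stage objectness with second-stage class probabilities.

    p_object: P(O_k = 1) from the first stage.
    p_class_given_object: dict mapping each label (including "bg") to
        P(C_k = label | O_k = 1); the values are assumed to sum to 1.
    Returns a dict with the joint P(C_k = label) for every label.
    """
    joint = {}
    for label, p_cond in p_class_given_object.items():
        if label == "bg":
            # Background: rejected by the first stage, or accepted and then
            # classified as background by the second stage.
            joint[label] = (1.0 - p_object) + p_object * p_cond
        else:
            # Foreground: only the O_k = 1 term contributes.
            joint[label] = p_object * p_cond
    return joint


# Hypothetical numbers for a single proposal.
example = joint_class_distribution(0.8, {"cat": 0.7, "dog": 0.2, "bg": 0.1})
```

The joint probabilities still sum to one, which is what gives the combined detector a probabilistic interpretation.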

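Evaluating the background objective exactly requires a dense evaluation of the second stage on every first-stage output, which would slow training down considerably. To avoid the slowdown, the paper derives two lower bounds and optimizes them jointly. The first lower bound is:
$$\log P(bg) \ge P(O_k = 1)\, \log P(bg \mid O_k = 1) \tag{3}$$
Eq. (3) maximizes the second-stage log-likelihood of the background class for first-stage outputs with high confidence scores.
- This lower bound maximizes the log-likelihood of background of the second stage for any high-scoring object in the first stage.
The second lower bound is:
$$\log P(bg) \ge \log P(O_k = 0) \tag{4}$$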

- With lower bound Eq. (4) and the positive objective Eq. (2), first-stage training reduces to a maximum-likelihood estimate with positive labels at annotated objects and negative labels for all other locations.
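The two bounds can be checked numerically. The sketch below assumes they take the forms $P(O_k = 1) \log P(bg \mid O_k = 1)$ (from Jensen's inequality) and $\log P(O_k = 0)$ (from dropping the non-negative second-stage term); the function names are illustrative, and the probability grid is arbitrary.

```python
import math


def background_log_likelihood(p_object, p_bg_given_object):
    """Exact log P(bg) = log(P(bg | O=1) * P(O=1) + P(O=0))."""
    return math.log(p_bg_given_object * p_object + (1.0 - p_object))


def lower_bound_eq3(p_object, p_bg_given_object):
    """Assumed form of Eq. (3): P(O=1) * log P(bg | O=1)."""
    return p_object * math.log(p_bg_given_object)


def lower_bound_eq4(p_object):
    """Assumed form of Eq. (4): log P(O=0)."""
    return math.log(1.0 - p_object)


# Check that both bounds stay below the exact value on a small grid.
checks = []
for p_object in (0.05, 0.5, 0.95):
    for p_bg in (0.05, 0.5, 0.95):
        exact = background_log_likelihood(p_object, p_bg)
        bound3 = lower_bound_eq3(p_object, p_bg)
        bound4 = lower_bound_eq4(p_object)
        checks.append(exact >= bound3 - 1e-12 and exact >= bound4 - 1e-12)
```

Only the first bound involves the second stage, and it is weighted by $P(O_k = 1)$, so it matters mostly for high-scoring first-stage outputs; the second bound depends on the first stage alone.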
Detector design
The detector formulated here differs from a conventional two-stage detector in that its classification score is multiplied by the class-agnostic detection score $P(O_k)$, i.e. the first-stage objectness score.
This calls for a stronger first-stage detector, one that not only maximizes proposal recall but also predicts a reliable object likelihood for each proposal.
- In our experiments, we use strong one-stage detectors to estimate this log-likelihood, as described in the next section.
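To illustrate why the objectness score matters, here is a toy sketch comparing a ranking by the second-stage score alone with a ranking by the product of both scores. The proposal names and numbers are made up for illustration, and `final_scores` is a hypothetical helper.

```python
def final_scores(proposals):
    """Multiply first-stage objectness by the second-stage class score.

    proposals: list of (name, p_object, p_class_given_object) tuples.
    """
    return {name: p_obj * p_cls for name, p_obj, p_cls in proposals}


# Hypothetical proposals: "b" has a confident second stage but weak objectness.
proposals = [
    ("a", 0.9, 0.6),
    ("b", 0.2, 0.9),
]

scores = final_scores(proposals)
ranking_second_stage_only = [
    name for name, _, _ in sorted(proposals, key=lambda p: p[2], reverse=True)
]
ranking_probabilistic = sorted(scores, key=scores.get, reverse=True)
```

In this toy case the second-stage score alone ranks "b" first, while the product ranks "a" first, so an unreliable objectness estimate would directly distort the final ranking.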
4. Building a probabilistic two-stage detector
4.1 RetinaNet
- In our first-stage design, we found it sufficient to have a single shared head for both tasks, as object-or-not classification is easier and requires less network capacity.
4.2 CenterNet
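The original CenterNet is extended with an FPN, following the RetinaNet-style ResNet-FPN. Levels P3-P7 are used, and on each FPN output the four-layer classification and regression branches from FCOS produce the heatmap and the bounding-box regression feature maps.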
- We upgrade CenterNet to multiple scales using an FPN.
- Specifically, we use the RetinaNet-style ResNet-FPN as the backbone with output feature maps from stride 8 to 128.
With an FPN in place, ground-truth objects of different scales must be assigned to different FPN levels. The paper follows the FCOS approach and assigns by ground-truth center annotation and ground-truth size.
- During training, we assign ground-truth center annotations to specific FPN levels based on the object size, within a fixed assignment range.
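A size-based level assignment could look like the sketch below. The ranges in `ASSIGNMENT_RANGES` and the use of the square root of the box area as "object size" are assumptions made for illustration; the notes do not give the actual values.

```python
import math

# Hypothetical size ranges in pixels, one per FPN level (strides 8 to 128).
ASSIGNMENT_RANGES = {
    "P3": (0, 64),
    "P4": (64, 128),
    "P5": (128, 256),
    "P6": (256, 512),
    "P7": (512, math.inf),
}


def assign_fpn_level(box):
    """Pick the FPN level whose size range contains the box size.

    box: (x1, y1, x2, y2) in image pixels.
    Size is taken as sqrt(width * height), which is an assumption.
    """
    x1, y1, x2, y2 = box
    size = math.sqrt(max(x2 - x1, 0) * max(y2 - y1, 0))
    for level, (low, high) in ASSIGNMENT_RANGES.items():
        if low <= size < high:
            return level
    raise ValueError(f"no FPN level covers size {size}")
```

For example, `assign_fpn_level((0, 0, 100, 100))` falls in the 64 to 128 range and is assigned to P4 under these assumed ranges.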
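As in FCOS, the distances from the center to the object boundaries serve as the regression targets, and the 3x3 region around the center is treated as positives.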
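The regression target and the 3x3 positive region can be sketched as follows. Both helpers are hypothetical, coordinates are in image pixels, and the neighborhood is not clipped to the feature-map bounds, which a real implementation would need to handle.

```python
def regression_target(location, box):
    """Distances (left, top, right, bottom) from a location to the box sides."""
    x, y = location
    x1, y1, x2, y2 = box
    return (x - x1, y - y1, x2 - x, y2 - y)


def center_positives(box, stride):
    """Feature-map cells in the 3x3 neighborhood of the box center.

    Returns (column, row) indices on the feature map with the given stride.
    No clipping to the feature-map size is applied in this sketch.
    """
    x1, y1, x2, y2 = box
    center_col = int((x1 + x2) / 2 // stride)
    center_row = int((y1 + y2) / 2 // stride)
    return [
        (center_col + dx, center_row + dy)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]


# Hypothetical box on the stride-8 level.
box = (16, 16, 48, 80)
cells = center_positives(box, stride=8)
target = regression_target((32, 48), box)
```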
4.3 Hyperparameters
A few points to note:
- To make it compatible, we use levels P3-P7 for both one- and two-stage detectors.
- We increase the positive IoU threshold.
- We use a maximum of 256 proposal boxes in the second stage, and use the default 1K boxes for RPN-based models unless stated otherwise.
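Capping the number of proposals amounts to a top-k selection on the first-stage score. The sketch below shows only that step under a hypothetical `select_proposals` helper; any non-maximum suppression a real pipeline applies beforehand is omitted.

```python
def select_proposals(scored_boxes, max_proposals=256):
    """Keep the highest-scoring first-stage boxes for the second stage.

    scored_boxes: list of (box_id, score) pairs.
    """
    ranked = sorted(scored_boxes, key=lambda item: item[1], reverse=True)
    return ranked[:max_proposals]


# Hypothetical first-stage output: 1000 boxes with increasing scores.
candidates = [(i, i / 1000) for i in range(1000)]
kept = select_proposals(candidates)
```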
5. Experiments
CenterNet* and CenterNet2 differ as follows: CenterNet2 is CascadeRCNN-CenterNet, while CenterNet* is the CenterNet improved with the FPN and the related changes described above.
- The CascadeRCNN-CenterNet design performs best among these probabilistic two-stage models. We thus adopt this basic structure in the following experiments and refer to it as CenterNet2 for brevity.
5.1 Real-time models

5.2 State-of-the-art comparison





