1. Motivation
Improvements in sampling strategies fall into two trends.
(1) From Static to Dynamic.
(2) From Sample-wise to Instance-wise.
These sampling strategies might have a few limitations.
(1) Static rules (e.g., center-region and anchor-based sampling) are not learnable or prediction-aware, so they may not be the best choice for eccentric objects.
(2) Some dynamic rules, such as PAA, may suffer from noisy samples and per-sample quality rules, without jointly formulating a quality distribution over the spatial dimensions, as shown in Fig. 1(b).
(3) They sample uniformly over regular grids of the image owing to the dense prediction paradigm, which makes it difficult to assemble enough high-quality and diverse samples.
2. Contribution
- Our main contribution is an instance-wise quality distribution, extracted from the regional feature of the ground truth to approximate each prediction's quality. It guides sampling in a noise-robust, prediction-aware way.
- Besides, we formulate an assignment and resampling strategy according to this distribution. It adapts to the semantic pattern and scale of each instance and simultaneously trains with sufficient, high-quality samples.
- We achieve state-of-the-art results on the COCO dataset without bells and whistles. Our method brings a 2.8 AP improvement, from 38.7 AP to 41.1 AP, over the single-stage detector FCOS. A ResNeXt-101-DCN based IQDet yields 51.6 AP, achieving state-of-the-art performance without introducing any additional overhead.
3. Method
3.1 Formulation of Quality Distribution Encoder
This paper proposes a new distribution-learning subnet, named the Quality Distribution Encoder (QDE).
Concretely, the GT information is first extracted according to the GT location. This is implemented with a RoIAlign layer whose input RoI is the GT box; the authors argue that the regional feature extracted from the GT is spatially aligned with the distribution being formulated.
To effectively encode the instance-wise feature, we first extract the feature of each object according to its GT location, which is realized by applying the RoIAlign layer to each pyramid feature, with the ground-truth box as the input RoI.
Specifically, the motivation for using the GT feature is that the regional feature extracted from the GT properly aligns with the distribution assignment in the spatial dimensions.
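The GT-feature extraction step above can be sketched as follows. This is a minimal NumPy re-implementation of RoIAlign-style pooling (one bilinear sampling point per output bin), not the authors' code; the function names, the 7×7 output size, and the single-sample-per-bin choice are illustrative assumptions.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    # feat: (C, H, W); bilinearly interpolate at fractional location (y, x).
    C, H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    y0, x0 = max(y0, 0), max(x0, 0)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feat[:, y0, x0]
            + (1 - wy) * wx * feat[:, y0, x1]
            + wy * (1 - wx) * feat[:, y1, x0]
            + wy * wx * feat[:, y1, x1])

def roi_align(feat, box, out_size=7, spatial_scale=1.0):
    # feat: (C, H, W) feature map; box: GT box (x1, y1, x2, y2) in image
    # coordinates, mapped to feature coordinates by spatial_scale.
    x1, y1, x2, y2 = [c * spatial_scale for c in box]
    C = feat.shape[0]
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = np.empty((C, out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            # One sampling point at each bin center (sampling_ratio = 1).
            out[:, i, j] = bilinear_sample(feat,
                                           y1 + (i + 0.5) * bin_h,
                                           x1 + (j + 0.5) * bin_w)
    return out
```

In IQDet this operation would be applied to each pyramid level, with the GT box as the input RoI, so the pooled feature stays spatially aligned with the object.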
Since the unknown distribution is hard to learn directly, the basic idea is to use the encoder to map it onto a known distribution family, e.g., a Gaussian mixture model (GMM).
- It can form smooth approximations to arbitrarily shaped distributions.
- The individual component may model some underlying set of hidden classes.
For each GT, the probability density of its quality distribution is given by Eq. 1 and can be understood as the weighted combination of the GT's K GMM components:

$$Q(\vec d\,) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\vec d \mid \vec\mu_k, \vec\sigma_k),$$

where $\pi$, $\mu$, $\sigma$ are all predicted by the network, and $\vec d$ can be computed directly as the offset from each predicted center to the GT center in the image.
- $K$ and $\theta$ denote the component number and the encoder parameters, respectively.
- $\vec\pi$ denotes the mixing coefficients along the x and y spatial dimensions of image $I$.
- $\vec d$ denotes the offsets along x and y from locations sampled inside the object to the GT center.
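As a concrete illustration of the mixture density in Eq. 1, the sketch below evaluates a GMM with axis-aligned (diagonal-covariance) 2-D Gaussian components. This is a generic GMM evaluation under that diagonal assumption, not the paper's implementation; in IQDet the parameters $\pi$, $\mu$, $\sigma$ would be predicted by the encoder rather than hand-set.

```python
import numpy as np

def quality_density(d, pi, mu, sigma):
    # d: (..., 2) offsets (x, y) from a sampled location to the GT center.
    # pi: (K,) mixing coefficients summing to 1.
    # mu, sigma: (K, 2) per-component mean and std along x and y.
    d = np.asarray(d, dtype=float)[..., None, :]   # (..., 1, 2)
    z = (d - mu) / sigma                           # (..., K, 2)
    # Diagonal 2-D Gaussian density of each component.
    comp = np.exp(-0.5 * (z ** 2).sum(-1)) / (2 * np.pi * sigma.prod(-1))
    return (pi * comp).sum(-1)                     # weighted mixture
```

For a single standard component, the density at the GT center ($\vec d = 0$) is $1/(2\pi)$; the function also accepts a batch of offsets, so the quality of all candidate locations inside an object can be scored in one call.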