[IQDet] (CVPR 2021)

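This work proposes IQDet, a detector built on an instance-wise quality distribution that is extracted from the regional feature of the ground-truth box to approximate the quality of each prediction. A quality distribution encoder learns the instance-wise quality and guides a noise-robust sampling strategy. Experiments show that IQDet achieves state-of-the-art results on the COCO dataset, with a clear gain in detection performance.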
image-20210715220934716

1. Motivation

  • The improvements in sampling strategies can be divided into two tendencies.

    (1) From Static to Dynamic.

    (2) From Sample-wise to Instance-wise.

  • These sampling strategies might have a few limitations.

    (1) Static rules (e.g. center region and anchor-based) are neither learnable nor prediction-aware, so they may not always be the best choice for eccentric objects.

    (2) Some dynamic rules such as PAA may suffer from noisy samples and per-sample quality rules, because they do not jointly formulate a quality distribution over the spatial dimensions, as shown in Fig. 1(b).

    (3) They sample uniformly over regular grids of the image owing to the dense prediction paradigm, which makes it difficult to assemble enough high-quality and diverse samples.

image-20210714223646593

2. Contribution

  • Our main contribution is to propose an instance-wise quality distribution, which is extracted from the regional feature of the ground-truth to approximate each prediction's quality. It guides noise-robust sampling and is a prediction-aware strategy.
  • Besides, we formulate an assignment and resampling strategy according to the distribution. It adapts to the semantic pattern and scale of each instance while training with sufficient, high-quality samples.
  • We achieve state-of-the-art results on the COCO dataset without bells and whistles. Our method leads to a 2.4 AP improvement, from 38.7 AP to 41.1 AP, on the single-stage method FCOS. ResNeXt-101-DCN based IQDet yields 51.6 AP, achieving state-of-the-art performance without introducing any additional overhead.

3. Method

image-20210715155154850

3.1 Formulation of Quality Distribution Encoder

The paper proposes a new subnet that learns the distribution, named the Quality Distribution Encoder (QDE).

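Concretely, the GT information is first extracted according to the GT location. This step uses a RoIAlign layer whose input RoI is the GT box. The authors argue that the regional feature extracted from the GT is aligned, in the spatial dimensions, with the formulation of the distribution.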

  • To effectively encode the instance-wise feature, we first extract the feature of an object according to the GT location. This is realized by applying the RoIAlign layer to each pyramid feature, where the input RoI is the ground-truth box.

  • Specifically, the motivation for using the GT feature is that the regional feature extracted from the GT aligns properly with the distribution assignment in the spatial dimensions.
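The GT-box pooling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified, pure-Python stand-in for RoIAlign that takes one bilinear sample at the center of each output bin on a single-channel feature map. The names `bilinear`, `gt_roi_align`, `stride`, and `out_size` are hypothetical, and the coordinate convention (image pixels divided by the level stride, no half-pixel shift) is an assumption made for brevity.

```python
import math

def bilinear(feat, y, x):
    """Bilinearly interpolate a 2D feature map (list of rows) at (y, x)."""
    h, w = len(feat), len(feat[0])
    # Clamp the sampling point to the valid range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy

def gt_roi_align(feat, gt_box, stride, out_size=2):
    """Pool an out_size x out_size grid from the region covered by gt_box.

    gt_box is (x1, y1, x2, y2) in image pixels; stride is the downsampling
    factor of this pyramid level (assumed convention, see the text above).
    One sample is taken per bin, at the bin center.
    """
    x1, y1, x2, y2 = (c / stride for c in gt_box)
    bin_w, bin_h = (x2 - x1) / out_size, (y2 - y1) / out_size
    return [
        [bilinear(feat, y1 + (i + 0.5) * bin_h, x1 + (j + 0.5) * bin_w)
         for j in range(out_size)]
        for i in range(out_size)
    ]

# Toy feature map for one pyramid level: value = x + 10 * y.
feat = [[x + 10 * y for x in range(4)] for y in range(4)]
pooled = gt_roi_align(feat, gt_box=(0, 0, 16, 16), stride=8, out_size=2)
```

To follow the "each pyramid feature" wording, the same call would be repeated per level with that level's stride. In practice a library operator such as `torchvision.ops.roi_align` would likely replace this sketch.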

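Because an unknown distribution is hard to learn, the basic idea is to use an encoder to map the unknown distribution onto a known one, for example a Gaussian mixture model (GMM).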

  • It can form smooth approximations to arbitrarily shaped distributions.
  • Individual components may model some underlying set of hidden classes.

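The probability density function of each GT's quality distribution is given by Eq. 1. It can be read as the weighted combination of all $K$ GMM components of that GT, where $\pi$, $\mu$, and $\sigma$ are all predicted by the network, and $\vec d$ can presumably be computed directly from the predicted center and the GT center in the image.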

image-20210715162759490
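  • $K$ and $\theta$ denote the number of components and the encoder parameters, respectively.
  • $\vec \pi$ denotes the mixing coefficients along the x and y spatial dimensions in image $I$.
  • $\vec d$ denotes the offset, along the x and y directions, from a point sampled inside the object to the GT center.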
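To make the roles of $K$, $\pi$, $\mu$, $\sigma$, and $\vec d$ concrete, here is a minimal sketch of evaluating a GMM density at one offset. It is not a reproduction of Eq. 1: it assumes one scalar mixing coefficient per component and diagonal Gaussians that factorize over x and y, and the function names `gaussian_1d` and `quality_density` are hypothetical. In the method, the parameters would come from the encoder; here they are plain numbers.

```python
import math

def gaussian_1d(d, mu, sigma):
    """Density of a 1D normal distribution N(mu, sigma^2) at d."""
    z = (d - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def quality_density(d, pis, mus, sigmas):
    """Evaluate a K-component 2D GMM at offset d = (dx, dy).

    pis:    K mixing coefficients (assumed non-negative, summing to 1)
    mus:    K (mu_x, mu_y) means
    sigmas: K (sigma_x, sigma_y) standard deviations (diagonal covariance,
            an assumption of this sketch)
    """
    dx, dy = d
    total = 0.0
    for pi_k, (mx, my), (sx, sy) in zip(pis, mus, sigmas):
        # Each component factorizes into independent x and y Gaussians.
        total += pi_k * gaussian_1d(dx, mx, sx) * gaussian_1d(dy, my, sy)
    return total

# Two components: one centered on the GT center, one shifted to the right.
pis = [0.7, 0.3]
mus = [(0.0, 0.0), (4.0, 0.0)]
sigmas = [(2.0, 2.0), (1.0, 1.0)]
near = quality_density((0.0, 0.0), pis, mus, sigmas)
far = quality_density((10.0, 10.0), pis, mus, sigmas)
```

With these toy parameters, an offset at the GT center scores higher than a distant one, which is the behavior a sampling strategy guided by the distribution would rely on.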