Table of Contents
- 1. Motivation
- 2. Contribution
- 3. Method
- 4. Experiment
1. Motivation
Query based object detection.
Query based object detection frameworks achieve comparable performance with previous state-of-the-art object detectors.
How to fully leverage such frameworks to perform instance segmentation remains an open problem.
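The success of query based methods in object detection suggests that applying a query based detection framework to instance segmentation is also feasible.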
The results show that query based instance-level perception is a very promising research direction. Thus, enabling query based detection framework to perform instance segmentation is highly desirable.
However, directly applying the query based approach to state-of-the-art non-query paradigms such as Cascade Mask R-CNN and HTC is inefficient.
Therefore, an instance segmentation method tailored for the query based end-to-end framework is urgently needed.
In this paper the authors propose QueryInst, a query based instance segmentation method driven by parallel supervision on dynamic mask heads.
In this paper, we present QueryInst, a query based instance segmentation method driven by parallel supervision on dynamic mask heads.
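The core idea is to exploit the intrinsic one-to-one correspondence among object queries across different stages, together with the one-to-one correspondence between mask RoI features and object queries within the same stage.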
The key insight of QueryInst is to leverage the intrinsic one-to-one correspondence in object queries across different stages, as well as one-to-one correspondence between mask RoI features and object queries in the same stage.
This approach eliminates the explicit multi-stage mask head connection and the proposal distribution inconsistency issues inherent in non-query based multi-stage instance segmentation methods.
2. Contribution
- A new perspective: using dynamic mask heads in a query based end-to-end detection framework.
We attempt to solve instance segmentation from a new perspective that uses parallel dynamic mask heads in the query based end-to-end detection framework.
- A task-joint paradigm that brings the object detection and instance segmentation tasks into synergy.
We set up a task-joint paradigm for query based object detection and instance segmentation by leveraging the shared query and multi-head self-attention design.
- The method also achieves state-of-the-art results on the video instance segmentation task.
We extend QueryInst to the video instance segmentation (VIS) [56] task by simply adding a vanilla track head.
3. Method
3.1 Query based Object Detector
In this work, we present a query based instance segmentation method on top of the query based Sparse R-CNN detector.
Figure 2(a) shows the pipeline of Sparse R-CNN. The formulas are as follows:
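For reference, the per-stage update of the query based detector can be written as below. This is a reconstruction in the notation of the QueryInst paper, so treat it as a sketch: $x^{FPN}$ (the FPN features), $b_t$ (the box predictions at stage $t$), $\mathrm{DynConv}^{box}_t$ (the box dynamic convolution) and $B_t$ (the box prediction branch) are symbols assumed here and not defined in these notes, and the notes drop the stage index on $q$ and $q^*$.

$$
\begin{aligned}
x^{box}_t &\leftarrow P^{box}\left(x^{FPN},\, b_{t-1}\right), \\
q^{*}_{t-1} &\leftarrow \mathrm{MSA}_t\left(q_{t-1}\right), \\
x^{box*}_t,\; q_t &\leftarrow \mathrm{DynConv}^{box}_t\left(x^{box}_t,\, q^{*}_{t-1}\right), \\
b_t &\leftarrow B_t\left(x^{box*}_t\right).
\end{aligned}
$$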
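$q \in R^{N \times d}$ denotes the object queries, where $N$ is the number of queries and $d$ is their dimension (in Sparse R-CNN these correspond to the proposal features, e.g. 100 × 256).
The pooling operator $P^{box}$ extracts the bounding box features $x^{box}_t$ (e.g. 100 × 49 × 256).
$q$ is passed through multi-head self-attention (MSA) to obtain the transformed query $q^*$.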
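To make the shapes concrete, here is a minimal, dependency-free Python sketch of the two steps just described: pooling one 7 × 7 grid of features per box, and running self-attention among the queries. It is not the Sparse R-CNN or QueryInst implementation. The function names `pool_box` and `multi_head_self_attention` are hypothetical, nearest-neighbour sampling stands in for the real RoI pooling operator, the learned attention projections are replaced by identity maps, and the sizes are shrunk from the 100 × 256 used in the notes.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(q, num_heads):
    """Self-attention among the N queries.

    Assumption: the learned query/key/value/output projections are
    omitted (identity), so only the attention pattern is illustrated.
    """
    n, d = len(q), len(q[0])
    assert d % num_heads == 0, "d must be divisible by num_heads"
    hd = d // num_heads
    out = [[0.0] * d for _ in range(n)]
    for h in range(num_heads):
        lo, hi = h * hd, (h + 1) * hd
        head = [row[lo:hi] for row in q]  # N x hd slice for this head
        for i in range(n):
            scores = [
                sum(a * b for a, b in zip(head[i], head[j])) / math.sqrt(hd)
                for j in range(n)
            ]
            weights = softmax(scores)
            for k in range(hd):
                out[i][lo + k] = sum(weights[j] * head[j][k] for j in range(n))
    return out


def pool_box(feature_map, box, size=7):
    """Sample a size x size grid inside the box (stand-in for RoI pooling)."""
    x1, y1, x2, y2 = box
    h, w = len(feature_map), len(feature_map[0])
    cells = []
    for i in range(size):
        for j in range(size):
            y = y1 + (i + 0.5) * (y2 - y1) / size  # centre of grid cell (i, j)
            x = x1 + (j + 0.5) * (x2 - x1) / size
            yi = min(max(int(y), 0), h - 1)
            xi = min(max(int(x), 0), w - 1)
            cells.append(feature_map[yi][xi])
    return cells  # (size * size) x d


random.seed(0)
N, d, H, W = 4, 8, 16, 16  # toy sizes; the notes use N = 100, d = 256

q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
feature_map = [
    [[random.gauss(0, 1) for _ in range(d)] for _ in range(W)] for _ in range(H)
]
boxes = [(0, 0, 8, 8), (4, 4, 12, 12), (8, 0, 16, 8), (2, 6, 14, 15)]

x_box = [pool_box(feature_map, b) for b in boxes]   # N x 49 x d
q_star = multi_head_self_attention(q, num_heads=2)  # N x d

print(len(x_box), len(x_box[0]), len(x_box[0][0]))  # 4 49 8
print(len(q_star), len(q_star[0]))                  # 4 8
```

Each query is paired with exactly one box, so the pooled tensor has one 49 × d block per query. This is the same one-to-one pairing between RoI features and object queries that the notes describe in the Motivation section.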