Region Proposal by Guided Anchoring


Region anchors are the cornerstone of modern object detection techniques. State-of-the-art detectors mostly rely on a dense anchoring scheme, where anchors are sampled uniformly over the spatial domain with a predefined set of scales and aspect ratios. In this paper, we revisit this foundational stage. Our study shows that it can be done much more effectively and efficiently. Specifically, we present an alternative scheme, named Guided Anchoring, which leverages semantic features to guide the anchoring. The proposed method jointly predicts the locations where the centers of objects of interest are likely to exist as well as the scales and aspect ratios at different locations. On top of the predicted anchor shapes, we mitigate the feature inconsistency with a feature adaptation module. We also study the use of high-quality proposals to improve detection performance. The anchoring scheme can be seamlessly integrated into proposal methods and detectors. With Guided Anchoring, we achieve 9.1% higher recall on MS COCO with 90% fewer anchors than the RPN baseline. We also adopt Guided Anchoring in Fast R-CNN, Faster R-CNN and RetinaNet, improving the detection mAP by 2.2%, 2.7% and 1.2%, respectively.
Anchors are regression references and classification candidates used to predict proposals (for two-stage detectors) or final bounding boxes (for single-stage detectors). Modern object detection pipelines usually begin with a large set of densely distributed anchors. Faster R-CNN [30], a popular object detection framework, for instance, first generates region proposals from a dense set of anchors and then classifies them into specific classes and refines their locations via bounding box regression.
There are two general rules for a reasonable anchor design: alignment and consistency. Firstly, to use convolutional features as anchor representations, anchor centers need to be well aligned with feature map pixels. Secondly, the receptive field and semantic scope are consistent in different regions of a feature map, so the scale and shape of anchors across different locations should be consistent. Sliding window is a simple and widely adopted anchoring scheme that follows these rules. For most detection methods, the anchors are defined by such a uniform scheme, where every location in a feature map is associated with k anchors with predefined scales and aspect ratios.
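The uniform scheme described above can be sketched in a few lines. This is a minimal illustration, not the code of any particular detector: the function name `dense_anchors`, the stride, and the scale and ratio values are assumptions chosen for the example, and here k is the number of scales times the number of aspect ratios.

```python
from math import sqrt

def dense_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return (cx, cy, w, h) anchors for every feature-map location.

    Hypothetical helper for illustration; ratios are height / width.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Alignment: the anchor center is the center of the
            # feature-map pixel, mapped back to image coordinates.
            cx = (x + 0.5) * stride
            cy = (y + 0.5) * stride
            # Consistency: the same k shapes are reused at every location.
            for s in scales:
                for r in ratios:
                    w = s / sqrt(r)
                    h = s * sqrt(r)
                    anchors.append((cx, cy, w, h))
    return anchors

anchors = dense_anchors(feat_h=4, feat_w=6, stride=16,
                        scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0))
print(len(anchors))  # 4 * 6 locations * 9 shapes = 216
```

Even on this tiny 4 x 6 feature map the scheme yields 216 anchors, and the count grows linearly with the number of feature-map locations.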
Anchor-based detection pipelines have been shown to be effective on both benchmarks [7, 22, 8, 5] and real-world systems. However, the uniform anchoring scheme described above is not necessarily the optimal way to prepare the anchors. This scheme can lead to two difficulties: (1) A neat set of anchors of fixed aspect ratios has to be predefined for different problems. A wrong design may hamper the speed and accuracy of the detector. (2) To maintain a sufficiently high recall for proposals, a large number of anchors are needed, while most of them correspond to false candidates that are irrelevant to the objects of interest. Meanwhile, a large number of anchors can lead to significant computational cost, especially when the pipeline involves a heavy classifier in the proposal stage.
In this work, we present a more effective method to prepare anchors, with the aim of mitigating the issues of handpicked priors. Our method is motivated by the observation that objects are not distributed evenly over the image plane. The scale of an object is also closely related to the imagery content, its location and the geometry of the scene. Following this intuition, our method generates sparse anchors in two steps: first identifying sub-regions that may contain objects and then determining the scales and aspect ratios at different locations.
Learnable anchor shapes are promising, but they break the aforementioned rule of consistency and thus present a new challenge for learning anchor representations for accurate classification and regression. Scales and aspect ratios of anchors are now variable instead of fixed, so different feature map pixels have to learn adaptive representations that fit the corresponding anchors. To solve this problem, we introduce an effective module to adapt the features based on anchor geometry.
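The two steps can be pictured with a small sketch. It assumes the two predictions already exist as plain nested lists: `loc_prob` (a per-location probability that an object center lies there) and `shape_wh` (a per-location width and height). In the actual method these would come from network branches; the names, the 0.5 threshold, and the toy values below are assumptions for illustration only.

```python
def guided_anchors(loc_prob, shape_wh, stride, threshold=0.5):
    """Keep one anchor per location whose center probability passes the threshold.

    Hypothetical helper: loc_prob[y][x] is a probability, shape_wh[y][x] is (w, h).
    """
    anchors = []
    for y, row in enumerate(loc_prob):
        for x, p in enumerate(row):
            # Step 1: skip locations unlikely to contain an object center.
            if p < threshold:
                continue
            # Step 2: use the shape predicted at this location.
            w, h = shape_wh[y][x]
            anchors.append(((x + 0.5) * stride, (y + 0.5) * stride, w, h))
    return anchors

loc_prob = [
    [0.02, 0.10, 0.05],
    [0.08, 0.91, 0.60],
]
shape_wh = [
    [(32, 32), (32, 32), (32, 32)],
    [(32, 32), (120, 60), (90, 200)],
]
anchors = guided_anchors(loc_prob, shape_wh, stride=16)
print(anchors)  # [(24.0, 24.0, 120, 60), (40.0, 24.0, 90, 200)]
```

Only two of the six locations produce an anchor, and each anchor has its own shape rather than one drawn from a fixed list.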
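To make the idea of adapting features to anchor geometry concrete, the toy below lets the sampling footprint at a location grow with the anchor predicted there. This is only an illustrative assumption, not the module used in the paper: the function `adapt_feature`, the 3 x 3 averaging, and the spacing rule are all invented for this sketch.

```python
def adapt_feature(feat, x, y, w, h, stride):
    """Average a 3x3 grid of samples around (x, y); grid spacing follows the anchor shape.

    Hypothetical toy: feat is a 2D list of floats, (w, h) is the anchor size in pixels.
    """
    rows, cols = len(feat), len(feat[0])
    # Spacing in feature-map cells: a third of the anchor extent, at least one cell.
    dx = max(1, round(w / stride / 3))
    dy = max(1, round(h / stride / 3))
    total = 0.0
    for j in (-1, 0, 1):
        for i in (-1, 0, 1):
            # Clamp sample positions to the feature-map border.
            yy = min(max(y + j * dy, 0), rows - 1)
            xx = min(max(x + i * dx, 0), cols - 1)
            total += feat[yy][xx]
    return total / 9.0

# Toy 7x7 feature map whose value depends only on the column index.
feat = [[float(col * col) for col in range(7)] for _ in range(7)]
narrow = adapt_feature(feat, x=3, y=3, w=48, h=48, stride=16)
wide = adapt_feature(feat, x=3, y=3, w=96, h=48, stride=16)
print(narrow, wide)
```

The same location yields a different representation for the wide anchor than for the narrow one, which is the property a shape-aware adaptation step needs.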

We formulate a Guided Anchoring Region Proposal Network (GA-RPN) with the aforementioned guided anchoring and feature adaptation scheme. Thanks to the dynamically predicted anchors, our approach achieves 9.1% higher recall with 90% fewer anchors than the RPN baseline that adopts a dense anchoring scheme. By predicting the scales and aspect ratios instead of fixing them based on a predefined list, our scheme handles tall or wide objects more effectively. Besides region proposals, the guided anchoring scheme can be easily integrated into any detector that depends on anchors. Consistent performance gains can be achieved with our scheme. For instance, GA-Fast-RCNN, GA-Faster-RCNN and GA-RetinaNet improve overall mAP by 2.2%, 2.7% and 1.2% respectively on the COCO dataset over their baselines with sliding window anchoring. Furthermore, we explore the use of high-quality proposals and propose a fine-tuning schedule using GA-RPN proposals, which can improve the performance of any trained model. For example, it improves a fully converged Faster R-CNN model from 37.4% to 39.6% in only 3 epochs.
The main contributions of this work lie in several aspects. (1) We propose a new
