[Paper Reading] [2D Object Detection] RepPoints: Point Set Representation for Object Detection

麒麒哈尔

Article link: https://blog.csdn.net/wqwqqwqw1231/article/details/102852804

Keywords: ICCV 2019, anchor-free, two-stage

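At the time of writing, this paper has the best single-model results among anchor-free detectors; the other anchor-free papers are covered in another of my blog posts. Compared with those papers, the explicit differences are: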

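Most of the other papers use single-stage models.
Apart from GA-RPN, most of the other papers do not use Deformable Convolution.
The other papers that represent a box with keypoints turn object detection into a key point estimation problem, whereas this paper still follows the object detection approach.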

RepPoints: Point Set Representation for Object Detection

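The authors argue that although bounding boxes are convenient to use and to evaluate in object detection, they localize objects rather coarsely: a bounding box still contains a large amount of background. RepPoints is proposed to represent object localization at a finer granularity.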

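One thing worth mentioning is that the paper names the boxes at the successive stages of a two-stage detector very clearly: “from anchors and proposals to final predictions”. When I read Cascade R-CNN earlier, the term *“hypothesis”* made my head spin.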

One point I did not understand is *“In contrast, RepPoints are learned in a top-down fashion from the input image / object features, allowing for end-to-end training and producing fine-grained localization without additional supervision.”* Why does this make RepPoints top-down?

RepPoints

RepPoints uses 9 representative points to represent the location of an object. Why 9? My guess is that a 3×3 deformable convolution produces 9 offsets, which give 9 points.
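To make this guess concrete, here is a minimal sketch of how the 18 offset channels that a 3×3 deformable convolution consumes at one location can be read as 9 points. The function name `offsets_to_points` and the (dx, dy) pair ordering are assumptions made for illustration; this is not the paper's code.

```python
def offsets_to_points(center, offset_field):
    """Turn 18 offset values at one location into 9 (x, y) points.

    center:       (x, y) of the feature-map location.
    offset_field: flat list of 18 numbers, read as 9 (dx, dy) pairs
                  (the pair ordering is an assumption of this sketch).
    """
    if len(offset_field) != 18:
        raise ValueError("a 3x3 kernel needs 9 * 2 = 18 offset values")
    cx, cy = center
    points = []
    for k in range(9):
        dx, dy = offset_field[2 * k], offset_field[2 * k + 1]
        points.append((cx + dx, cy + dy))
    return points


# With all-zero offsets every point collapses onto the center.
zero_points = offsets_to_points((10.0, 20.0), [0.0] * 18)

# Offsets equal to the regular 3x3 grid give back an ordinary kernel's taps.
grid = [v for dy in (-1, 0, 1) for dx in (-1, 0, 1) for v in (dx, dy)]
grid_points = offsets_to_points((10.0, 20.0), grid)
```

The point set is free to take any shape in between these two extremes, which is what lets it adapt to the object.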

The center point based initial object representation is a special case of RepPoints in which a single point represents the object's location. Compared with anchors, the advantage is that representing an object with one point is a problem in a 2-d space, so it is easy to cover every possible object location in the image (each pixel is checked for being a center), whereas anchors pose a problem in a 4-d space that is hard to cover completely.
“An important benefit of the center point representation lies in its much tighter hypothesis space compared to the anchor based counterparts. While anchor based approaches usually rely on a large number of multi-ratio and multi-scale anchors to ensure dense coverage of the large 4-d bounding box hypothesis space, a center point based approach can more easily cover its 2-d space. In fact, all objects will have center points located within the image.”

Refining RepPoints is also easier than refining anchors, because refining points only means adding positional offsets, and all of these offsets are at the same scale.
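The contrast can be sketched as follows. `refine_points` adds one offset per point, while `refine_box` uses the usual R-CNN style box parameterization (center shifts scaled by box size, log-space size changes), whose four parameters live at different scales. Both function names are hypothetical and the code is only an illustration, not the paper's implementation.

```python
import math


def refine_points(points, deltas):
    """Point refinement: every point just moves by its own (dx, dy)."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(points, deltas)]


def refine_box(box, deltas):
    """R-CNN style box refinement on (cx, cy, w, h).

    tx, ty are shifts relative to the box size; tw, th are log-scale
    factors, so the four parameters are not at the same scale.
    """
    cx, cy, w, h = box
    tx, ty, tw, th = deltas
    return (cx + tx * w, cy + ty * h, w * math.exp(tw), h * math.exp(th))


moved = refine_points([(1.0, 2.0), (3.0, 4.0)], [(0.5, -0.5), (0.0, 1.0)])
shifted_box = refine_box((10.0, 10.0, 4.0, 2.0), (0.5, 0.0, 0.0, 0.0))
```

Here a `tx` of 0.5 moves the box center by 2 pixels only because the box is 4 pixels wide; the point offsets need no such normalization.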

RepPoints can be converted into a bounding box, so they can take part in comparisons that use IoU as the criterion.
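As an illustration, the sketch below uses a min-max conversion (the tightest axis-aligned box around the points, one of the conversion options described in the paper) and then computes IoU against a ground-truth box. The helper names `points_to_box` and `iou` are mine, and the sample coordinates are made up.

```python
def points_to_box(points):
    """Min-max conversion: tightest axis-aligned box (x1, y1, x2, y2)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pseudo_box = points_to_box([(1.0, 1.0), (3.0, 2.0), (2.0, 4.0)])
overlap = iou(pseudo_box, (2.0, 1.0, 4.0, 4.0))
```

The resulting pseudo box is what makes RepPoints comparable with box-based detectors under the standard IoU metric.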

RPDet

[Figure omitted]

The two figures above show the structure of RPDet well. Classification is settled in the first stage, and the second stage only refines the RepPoints. The target assignment is as follows:
[Figure omitted]
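The target-assignment figure is not available here, so the following is only a generic sketch of IoU-threshold label assignment applied to the pseudo box induced by the RepPoints. The function name and the default thresholds 0.5 and 0.4 are my assumptions for illustration and should be checked against the paper.

```python
def assign_label(iou_with_gt, pos_thr=0.5, neg_thr=0.4):
    """Label one sample from the IoU between its pseudo box and a GT box.

    pos_thr and neg_thr are assumed values for this sketch, not
    numbers taken from the post.
    """
    if iou_with_gt >= pos_thr:
        return "positive"
    if iou_with_gt < neg_thr:
        return "background"
    return "ignore"  # in-between samples contribute no loss


labels = [assign_label(v) for v in (0.7, 0.45, 0.1)]
```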

Experiments

RepPoints vs. bounding box: RepPoints and bounding boxes are each plugged into the same baseline, which demonstrates the effectiveness of RepPoints.
Supervision source for RepPoints learning: interestingly, supervising RepPoints learning with the recognition loss also improves performance, whereas the bounding box counterpart shows no improvement. “The use of the object recognition loss can drive the RepPoints to locate themselves at semantically meaningful positions on an object, which leads to fine-grained localization and improves object feature extraction for the following recognition stage.”
Anchor-free vs. anchor-based: “For both detectors using bounding boxes and RepPoints, the center point based method surpass the anchor based method by +1.1 mAP and +1.4 mAP, respectively, likely because of its better coverage of ground-truth objects.”
RepPoints act complementary to deformable RoI pooling: the gain here is only 0.1, which does not demonstrate much. The Appendix argument that deformable RoI pooling cannot effectively provide a geometric representation of objects is interesting, but the explanation of why RepPoints can provide a geometric representation of objects is equally rough.
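For the anchor-free vs. anchor-based item, the coverage argument can be made concrete with a back-of-the-envelope count of hypotheses on one feature map. The 100×100 map and the 3 scales × 3 aspect ratios are illustrative numbers I picked, not settings from the paper.

```python
def count_hypotheses(height, width, num_scales=3, num_ratios=3):
    """Count initial hypotheses on one feature map.

    Center points: one hypothesis per location (a 2-d space).
    Anchors: one per location, scale and aspect ratio, and even that
    only samples the 4-d box space at a few discrete sizes.
    """
    centers = height * width
    anchors = height * width * num_scales * num_ratios
    return centers, anchors


centers, anchors = count_hypotheses(100, 100)
```

Every ground-truth center falls on some location, while a box is matched only if one of the discrete anchors happens to overlap it well enough, even with nine times as many hypotheses.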

Thoughts

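1. Comparison with AlignDet: Revisiting Feature Alignment for One-stage Object Detection
On Zhihu, in the comments on AlignDet, the authors of RepPoints wrote:
“Hello, we are the authors of RepPoints. Thank you for sharing your work on arXiv. We read the paper carefully, but we feel that AlignDet is the baseline method in our RepPoints ablation (the first row, Bounding Box, of Tables 1 and 2), except that AlignDet uses 7x7 RoIAlign/RoIConv while we use 3x3. Perhaps we did not describe the method clearly, and the AlignDet authors may not have caught these fine details of our approach. Our apologies.”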

After reading the paper closely, I found that it does contain the following statement:
“The two sets of RepPoints are replaced by bounding box representation, where the geometric refinement is achieved by the standard bounding box regression method, and the feature extraction is replaced by the RoIAlign [13] method using 3 × 3 grid points”.
and the footnote below the main text:
“$^2$ It can be also implemented by a deformable convolution operator with an unlearnable input offset field induced by the 3×3 grid points.”

That is really slick. Thinking it through carefully, the two are indeed the same.
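A minimal sketch of that equivalence: take the 3×3 grid points of a box and express them as fixed offsets of a regular 3×3 kernel centered at some location. The offsets are computed from the box rather than learned, which is how I read “unlearnable”. Placing the grid points at bin centers is an assumption of this sketch, and both function names are hypothetical.

```python
def box_grid_points(box, n=3):
    """Centers of an n x n grid of bins over box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    bw, bh = (x2 - x1) / n, (y2 - y1) / n
    return [(x1 + (i + 0.5) * bw, y1 + (j + 0.5) * bh)
            for j in range(n) for i in range(n)]


def grid_to_dcn_offsets(center, box):
    """Offsets that move a regular 3x3 kernel's taps onto the box grid."""
    cx, cy = center
    taps = [(cx + dx, cy + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    targets = box_grid_points(box, 3)
    return [(tx - px, ty - py) for (px, py), (tx, ty) in zip(taps, targets)]


offsets = grid_to_dcn_offsets((3.0, 3.0), (0.0, 0.0, 6.0, 6.0))
```

Feeding such offsets to a deformable convolution makes it sample the same nine locations a 3×3 RoIAlign would, so the bounding box baseline and the RepPoints model differ only in where the offsets come from.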

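2. Understanding Table 3:
A possible explanation of the second row is an operation similar to CenterNet.
However, for the row where init is a box and the proposal is RepPoints, the operation is not explained in detail. How a box is turned into RepPoints is not described, and I cannot make sense of this part.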

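Open questions: why are the two sets of RepPoints not tested separately; the comparison with the previous paper, which is said to be the same as the init center → box setting in the ablation study; how this relates to YOLO; how init box → RepPoints is actually carried out; and the RepPoints discussion in the Appendix is not explained clearly. ...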