Feature stride in Faster R-CNN

HackerTom

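Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.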

Article link: https://blog.csdn.net/HackerTom/article/details/115579525

The Implementation Details section of the Faster R-CNN paper [1] notes:

We also note that for ZF and VGG nets, the total stride on the last conv layer is 16 pixels on the re-scaled image, and thus is ∼10 pixels on a typical PASCAL image (∼500×375).

This stride defines the coordinate correspondence between the feature map and the original image. For a [500, 375, 3] image (the so-called typical PASCAL image size) fed to a VGG16 backbone at that size, the resulting feature map has size [31, 23, 512], where $\frac{500}{31}\approx\frac{375}{23}\approx 16$.
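The [31, 23] figure can be checked with a little arithmetic. The sketch below assumes the usual VGG16 layout up to the last conv layer: 3x3 convolutions with padding 1 (which keep the spatial size) and four 2x2 max-pools with stride 2 that floor odd sizes. The function name `vgg16_feature_size` is made up for this illustration.

```python
def vgg16_feature_size(height, width, num_pools=4):
    """Spatial size of the last conv feature map.

    Assumption: convolutions preserve the spatial size, and each of the
    `num_pools` 2x2/stride-2 max-pools halves it, rounding down.
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return height, width


feat_h, feat_w = vgg16_feature_size(500, 375)
print(feat_h, feat_w)              # 31 23
print(500 / feat_h, 375 / feat_w)  # both slightly above 16
```

Because of the rounding in each pooling step, the ratio is only approximately 16, which is why the stride is treated as a fixed constant (2^4 = 16) rather than computed per image.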

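The stride comes into play when generating anchors. In the reimplementation [2], for example, this parameter is `feat_stride`. Faster R-CNN generates anchors based on (the shape of) the feature map, then produces predicted bboxes from the anchors (which act as base bboxes) and the refinement parameters output by the box-regression layer; see [3].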
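As a rough illustration of how an anchor and the refinement parameters combine into a predicted bbox, here is a minimal sketch that follows the box parameterization given in the Faster R-CNN paper (center offsets scaled by the anchor size, log-space width and height). The function name `decode_box` and the `(x1, y1, x2, y2)` box layout are assumptions of this example, not necessarily what [2] or [3] use.

```python
import math


def decode_box(anchor, deltas):
    """Apply regression outputs (tx, ty, tw, th) to one anchor.

    anchor: (x1, y1, x2, y2) in image coordinates (layout is an assumption).
    deltas: (tx, ty, tw, th) as defined in the Faster R-CNN paper.
    """
    x1, y1, x2, y2 = anchor
    tx, ty, tw, th = deltas

    # Anchor width, height and center.
    wa, ha = x2 - x1, y2 - y1
    xa, ya = x1 + 0.5 * wa, y1 + 0.5 * ha

    # Invert t_x = (x - x_a) / w_a and t_w = log(w / w_a), likewise for y, h.
    x = xa + tx * wa
    y = ya + ty * ha
    w = wa * math.exp(tw)
    h = ha * math.exp(th)

    return (x - 0.5 * w, y - 0.5 * h, x + 0.5 * w, y + 0.5 * h)


# Zero deltas leave the anchor unchanged.
print(decode_box((0, 0, 16, 16), (0, 0, 0, 0)))
```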

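However, ground-truth bboxes are defined on the original image while anchors are defined on the feature map, so the anchors have to be mapped back to the original image. This is done by treating each feature-map location as an anchor centroid and multiplying its coordinates by the feature stride, which gives the coordinates of these anchor centroids on the original image.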
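The mapping can be sketched in a few lines of plain Python. This is a simplified illustration, not the code from [2]: the function names are hypothetical, the anchor sizes (128, 256, 512) and aspect ratios (1:2, 1:1, 2:1) are the defaults described in the paper, and real implementations may place the centroid with an extra offset inside each stride-sized cell.

```python
def anchor_centers(feat_h, feat_w, feat_stride=16):
    """Map every feature-map location (x, y) to image coordinates."""
    return [(x * feat_stride, y * feat_stride)
            for y in range(feat_h)
            for x in range(feat_w)]


def make_anchors(feat_h, feat_w, feat_stride=16,
                 sizes=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return all anchors as (x1, y1, x2, y2) in image coordinates.

    Assumption: ratio = height / width, and each anchor keeps area size**2.
    """
    anchors = []
    for cx, cy in anchor_centers(feat_h, feat_w, feat_stride):
        for size in sizes:
            for ratio in ratios:
                w = size / ratio ** 0.5
                h = size * ratio ** 0.5
                anchors.append((cx - w / 2, cy - h / 2,
                                cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(31, 23)  # feature map of a 500x375 image
print(len(anchors))             # 31 * 23 * 9 = 6417
```

Since these anchors already live in image coordinates, they can be compared directly with ground-truth bboxes, for example when computing IoU to assign training labels.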

References
