Paper: FPN

xxiaozr

Article link: https://blog.csdn.net/xxiaozr/article/details/79982979

Introduction:

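As shown in figure (a), feature pyramids built on image pyramids compensate for variation in object scale by shifting the pyramid level at which an object is matched. Later, convolutional networks turned out to be more robust to variance in scale, but they compute features from a single-scale input, as shown in figure (b).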

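Pyramids still matter for accuracy, but computing features on every level of an image pyramid is expensive. As shown in figure (c), SSD is one of the first attempts to use a ConvNet's pyramidal feature hierarchy.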

This paper aims to build a feature pyramid that has strong semantics at all scales. As shown in figure (d), we use a top-down pathway to combine low-resolution, semantically strong features with high-resolution, semantically weak features.

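Other architectures with a similar top-down structure and skip connections are shown in the top panel of the corresponding figure.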

Their goal is to produce a single high-level feature map of a fine resolution and make predictions on it, whereas our method makes predictions independently at each level.

Related Work

Hand-engineered features and early neural networks

SIFT and HOG features: ...

Feature Pyramid Networks

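Bottom-up pathway: the feed-forward computation of the convolutional backbone, which produces a pyramid-shaped hierarchy of feature maps with a scaling step of 2. Many layers produce feature maps of the same size, and we call such a group of layers a stage. For our feature pyramid, we treat each stage as one pyramid level, and we take the output of the last layer of each stage as our reference set of feature maps, because the last layer of a stage has the strongest features.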
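To make the "scaling step of 2" concrete, here is a minimal sketch of the spatial size of each stage output. It assumes a ResNet backbone whose stage outputs C2 to C5 have strides 4, 8, 16 and 32 relative to the input, and it assumes sizes round up (the exact rounding depends on the padding the backbone uses). The name `stage_shapes` is made up for this illustration.

```python
import math

# Assumed strides of the stage outputs C2..C5 relative to the input image.
STAGE_STRIDES = {"C2": 4, "C3": 8, "C4": 16, "C5": 32}


def stage_shapes(height, width):
    """Return the (height, width) of each stage's last feature map."""
    return {
        name: (math.ceil(height / stride), math.ceil(width / stride))
        for name, stride in STAGE_STRIDES.items()
    }


if __name__ == "__main__":
    for name, shape in stage_shapes(224, 224).items():
        print(name, shape)
```

Each stage halves the resolution of the previous one, which is what lets the stage outputs be stacked as pyramid levels.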

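Top-down pathway and lateral connections: the top-down pathway produces higher-resolution features by upsampling the spatially coarser but semantically stronger feature maps from higher pyramid levels. These features are then combined with features from the bottom-up pathway via lateral connections. Each lateral connection merges feature maps of the same spatial size from the bottom-up and top-down pathways. The bottom-up feature map has lower-level semantics, but its activations are more accurately localized because it has been subsampled fewer times.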

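The figure above shows the top-down process: the coarser-resolution map is first upsampled, then merged by element-wise addition with the bottom-up feature map after that map has passed through a 1×1 convolution (to reduce the channel dimensions). Finally, a 3×3 convolution is applied to each merged map to reduce the aliasing effect of upsampling.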
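The merge step can be sketched in plain Python on nested lists. This is only an illustration under stated assumptions: feature maps are stored as `[channel][row][column]` lists, upsampling is nearest-neighbor by a factor of 2, the 1×1 convolution has no bias, and the final 3×3 smoothing convolution is left out. All function names are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Nearest-neighbor 2x upsampling of a [C][H][W] nested list."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [value for value in row for _ in range(2)]  # repeat columns
            rows.append(wide)
            rows.append(list(wide))  # repeat the row
        out.append(rows)
    return out


def conv1x1(fmap, weights):
    """Bias-free 1x1 convolution; weights is [C_out][C_in]."""
    c_in, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(weights[o][i] * fmap[i][y][x] for i in range(c_in))
          for x in range(w)] for y in range(h)]
        for o in range(len(weights))
    ]


def merge(top_down, bottom_up, lateral_weights):
    """Upsample the coarser map and add the laterally projected finer map."""
    up = upsample_nearest_2x(top_down)
    lat = conv1x1(bottom_up, lateral_weights)
    shape_up = (len(up), len(up[0]), len(up[0][0]))
    shape_lat = (len(lat), len(lat[0]), len(lat[0][0]))
    if shape_up != shape_lat:
        raise ValueError(f"shape mismatch: {shape_up} vs {shape_lat}")
    return [
        [[up[c][y][x] + lat[c][y][x] for x in range(shape_up[2])]
         for y in range(shape_up[1])]
        for c in range(shape_up[0])
    ]


if __name__ == "__main__":
    coarse = [[[1.0]]]                        # 1 channel, 1x1
    fine = [[[1.0, 2.0], [3.0, 4.0]],         # 2 channels, 2x2
            [[10.0, 20.0], [30.0, 40.0]]]
    print(merge(coarse, fine, [[1.0, 0.5]]))  # project 2 channels down to 1
```

The 1×1 projection is what makes the element-wise addition possible: it brings the bottom-up map to the same channel count as the top-down map.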

Because all levels of the pyramid share the same classifiers/regressors, we fix the feature dimension (number of channels) of every feature map to the same value d, which is set to 256 in this paper.

Applications

Feature Pyramid Networks for RPN

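Anchors are generated by sliding over every location of the spatial map. RPN applies a 3×3 convolution to the feature map, followed by two sibling 1×1 convolutions for classification and regression. We attach the same head to every level of the feature pyramid. Because the head slides over every level, the anchors on a given level only need a single scale: we define the anchors to have areas of {32×32, 64×64, 128×128, 256×256, 512×512} pixels on the successive levels, with aspect ratios {1:2, 1:1, 2:1}. The rest of the implementation is the same as Faster R-CNN.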
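A minimal sketch of the anchor shapes this produces, one area per level and three aspect ratios per area. The level names P2 to P6 follow the paper's naming; reading a ratio as height divided by width is an assumption made here, and `anchor_shapes` is a hypothetical helper.

```python
import math

# One anchor side length per pyramid level (area = side * side).
ANCHOR_SIDES = {"P2": 32, "P3": 64, "P4": 128, "P5": 256, "P6": 512}
# Aspect ratios 1:2, 1:1, 2:1, interpreted here as height / width (assumption).
ASPECT_RATIOS = (0.5, 1.0, 2.0)


def anchor_shapes():
    """Return {level: [(width, height), ...]} with the area fixed per level."""
    shapes = {}
    for level, side in ANCHOR_SIDES.items():
        area = side * side
        per_level = []
        for ratio in ASPECT_RATIOS:
            width = math.sqrt(area / ratio)
            height = width * ratio
            per_level.append((width, height))
        shapes[level] = per_level
    return shapes


if __name__ == "__main__":
    for level, shapes in anchor_shapes().items():
        print(level, [(round(w, 1), round(h, 1)) for w, h in shapes])
```

This gives 15 anchor shapes over the whole pyramid, but only 3 per location on any single level.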

Feature Pyramid Networks for Fast R-CNN

Fast R-CNN is normally applied to a single-scale feature map, so to use it with FPN we need to assign RoIs of different scales to the pyramid levels.
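The notes above do not spell out the assignment rule. The FPN paper assigns an RoI of width w and height h to level k = ⌊k0 + log2(√(wh) / 224)⌋ with k0 = 4, so that a 224×224 RoI lands on P4 and smaller RoIs go to finer levels. The sketch below implements that rule; clamping the result to the range P2 to P5 is an assumption of this sketch, and `roi_to_level` is a hypothetical name.

```python
import math


def roi_to_level(width, height, k0=4, canonical_size=224, k_min=2, k_max=5):
    """Map an RoI of the given size (in input pixels) to a pyramid level index."""
    k = math.floor(k0 + math.log2(math.sqrt(width * height) / canonical_size))
    # Clamp so that very small or very large RoIs still get a valid level.
    return max(k_min, min(k_max, k))


if __name__ == "__main__":
    for size in (32, 112, 224, 448, 1000):
        print(size, "->", "P%d" % roi_to_level(size, size))
```

Halving the RoI side length moves it one level down the pyramid, which mirrors the factor-of-2 scaling between levels.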

We attach the predictor heads (class-specific classifiers and bounding box regressors) to all RoIs of all levels.

The heads share the same parameters across all levels.

Experiments on Object Detection

Region Proposal with RPN

Ablation Experiments:

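Comparison with baselines: the baselines use only a single feature map (C4 or C5), still with multi-scale anchors. The results on C4 and C5 differ little, which suggests that a single-scale feature map is not sufficient: there is still a trade-off between coarser resolution and stronger semantics. Using RPN on top of FPN improves the results considerably.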

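How important is top-down enrichment: remove the top-down pathway and attach the 1×1 lateral connections followed by 3×3 convolutions directly to the bottom-up pyramid. This variant performs about the same as the baseline and far below our method. We believe this is because there are large semantic gaps between the different levels of the bottom-up pyramid.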

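How important are pyramid representations: instead of resorting to a pyramid representation, attach the head only to P2, the highest-resolution map with strong semantic features, and, as in the single-scale baseline, assign anchors to the P2 feature map only. This variant is better than the baseline but worse than our method. RPN is a sliding window detector with a fixed window size, so sliding it over the pyramid levels increases its robustness to scale variance. In addition, because P2 has a large resolution it produces more anchors, and many of these extra anchors do not help accuracy.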
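A rough count shows why the P2-only variant ends up with more anchors. The numbers below are illustrative only: the 1024×1024 input size is hypothetical, the P2-only variant is assumed to place all 15 anchor shapes (5 scales × 3 ratios) at every P2 location, and the pyramid is assumed to use strides 4 to 64 with 3 anchors per location.

```python
import math


def count_anchors(height, width, strides, anchors_per_location):
    """Total anchors over feature maps with the given strides."""
    total = 0
    for stride in strides:
        locations = math.ceil(height / stride) * math.ceil(width / stride)
        total += locations * anchors_per_location
    return total


if __name__ == "__main__":
    p2_only = count_anchors(1024, 1024, [4], 15)
    pyramid = count_anchors(1024, 1024, [4, 8, 16, 32, 64], 3)
    print("P2 only:", p2_only)
    print("pyramid:", pyramid)
```

Under these assumptions the P2-only variant has several times more anchors than the pyramid, yet the notes above report that it is still less accurate.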

Object Detection with Fast/Faster R-CNN
