PANet (2018)

Latest recommended article published on 2023-11-14 21:50:44

dadaHaHa1234


Article link: https://blog.csdn.net/qq_32425195/article/details/104482110


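Key idea: for each proposed RoI, crop the corresponding region from the feature map of every level, pool each crop to a fixed size, and fuse the pooled features with an element-wise max. Predictions are made from the fused feature.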

Abstract:

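The paths along which information flows through a neural network matter. We propose PANet, which strengthens the feature pyramid by adding a bottom-up path that carries information from the lowest level to the topmost level. We also propose adaptive feature pooling: features pooled from every level are fused together, and prediction is made on the fused feature.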

1. Introduction

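(1) Some principles for designing networks

Shorten the information path with residual and dense connections; widen the network with parallel paths to increase the flexibility and diversity of information flow.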

There have been several principles proposed for designing networks in image classification that are also effective for object recognition. For example, shortening information path and easing information propagation by clean residual connection [23, 24] and dense connection [26] are useful. Increasing the flexibility and diversity of information paths by creating parallel paths following the split-transform-merge strategy [61, 6] is also beneficial.

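(2) Our observations: low-level features are useful for localizing objects, but the path from the low-level feature maps to the high-level ones is long, so fine detail has difficulty reaching the top. In addition, features taken from a single feature map do not give a complete picture of a proposal.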

Specifically, features in low levels are helpful for large instance identification. But there is a long path from low-level structure to topmost features, increasing difficulty to access accurate localization information. Further, each proposal is predicted based on feature grids pooled from one feature level, which is assigned heuristically. This process can be updated since information discarded in other levels may be helpful for final prediction.

(3) Our contributions: a bottom-up path augmentation is created; to recover the broken information path between each proposal and all feature levels, we develop adaptive feature pooling. Finally, to capture different views of each proposal, we augment mask prediction with tiny fully-connected (fc) layers.

2. Related Work

(1) Instance segmentation has two main families of methods: proposal-based and segmentation-based.

(2) Fusing features from different layers:

(3) Larger context region: Features pooled from a larger region provide surrounding context. Global pooling was used in PSPNet [67] and ParseNet [43] to greatly improve quality of semantic segmentation. Similar trend was observed by Peng et al. [47] where global convolutionals were utilized.

3. Framework

Network architecture:

Three key components: Bottom-up Path Augmentation, Adaptive Feature Pooling, and Fully-connected Fusion.

3.1. Bottom-up Path Augmentation

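High-level features carry semantic information, while low-level features describe fine detail more concretely. A high-level neuron responds to a large region, which amounts to responding to the semantics of that region; a low-level neuron responds to local detail. As we move up the feature hierarchy, the receptive field grows, so each neuron becomes responsible for the features of a larger area.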
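To make the growth of the receptive field concrete, here is a minimal sketch using the standard recurrence for a stack of convolutions: each layer adds `(kernel - 1) * jump` input pixels, and the jump is multiplied by the stride. The function name and the three-layer stack are illustrative assumptions, not a configuration taken from the paper.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, ordered from input to output.

    Returns (receptive_field, jump): how many input pixels one output neuron
    sees, and how many input pixels separate two adjacent output neurons.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each layer widens the view by (k - 1) jumps
        jump *= stride              # striding spreads neighbouring outputs apart
    return rf, jump


# Hypothetical stack: one 3x3 conv with stride 1, then two 3x3 convs with stride 2.
print(receptive_field([(3, 1), (3, 2), (3, 2)]))  # (9, 4)
```

A neuron after the third layer already covers a 9x9 input patch, whereas a neuron after the first layer covers only 3x3, which is why deeper levels respond to region-level semantics rather than local detail.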

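We add the green dashed connection in the figure above, so that detail information reaches the top levels after passing through fewer than 10 conv layers. By comparison, in the original FPN, detail information has to pass through many convolutional layers (the whole backbone) before it reaches the topmost level.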

Bottom-up connections (Augmented Bottom-up Structure): as shown in the two figures below.
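The building block of the bottom-up path, as described in the PANet paper, is: N2 is P2 itself; each N_i goes through a 3x3 convolution with stride 2, is added element-wise to P_{i+1}, and then passes through another 3x3 convolution to give N_{i+1}, with a ReLU after every convolution. The sketch below follows that data flow under simplifying assumptions: a single channel instead of 256, plain Python lists instead of tensors, and hand-supplied kernels instead of learned weights. The function names are hypothetical.

```python
def conv3x3(x, kernel, stride=1):
    """Single-channel 3x3 convolution with zero padding 1, followed by ReLU."""
    h, w = len(x), len(x[0])
    out = []
    for i in range(0, h, stride):
        row = []
        for j in range(0, w, stride):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, xx = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= xx < w:  # outside the map counts as zero
                        acc += kernel[di][dj] * x[y][xx]
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def bottom_up_path(pyramid, down_kernels, fuse_kernels):
    """pyramid: [P2, P3, ...], fine to coarse, each level half the size of the last.

    Returns [N2, N3, ...] with the same spatial sizes as the inputs.
    """
    outputs = [pyramid[0]]  # N2 is simply P2
    for p_next, k_down, k_fuse in zip(pyramid[1:], down_kernels, fuse_kernels):
        down = conv3x3(outputs[-1], k_down, stride=2)          # halve N_i
        fused = [[a + b for a, b in zip(r1, r2)]                # lateral addition
                 for r1, r2 in zip(down, p_next)]
        outputs.append(conv3x3(fused, k_fuse))                  # produce N_{i+1}
    return outputs


# Identity kernels make the data flow easy to follow by hand.
ident = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
P2 = [[1.0] * 4 for _ in range(4)]
P3 = [[10.0] * 2 for _ in range(2)]
P4 = [[100.0]]
N2, N3, N4 = bottom_up_path([P2, P3, P4], [ident, ident], [ident, ident])
print(N3)  # [[11.0, 11.0], [11.0, 11.0]]
print(N4)  # [[111.0]]
```

With identity kernels, the value 1.0 from P2 reaches N4 after only four convolutions, which illustrates the short path the notes describe.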

3.2. Adaptive Feature Pooling

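(1) In FPN, each proposal is assigned to a single feature level according to its size, and features are extracted from that level only. This is not an optimal choice: every level contributes something to recognition, and both large and small proposals benefit from features at several levels. We therefore propose adaptive feature pooling, which extracts features from all levels and fuses them together.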
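For reference, the size-based assignment that FPN uses is k = floor(k0 + log2(sqrt(w*h) / 224)), clamped to the available levels, with k0 = 4 in the FPN paper. The sketch below is a minimal version of that rule; the function name and the default level range P2 to P5 are assumptions made for illustration.

```python
import math


def fpn_level(w, h, k0=4, k_min=2, k_max=5, canonical=224):
    """Heuristic FPN level for a w x h proposal (in image pixels)."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))  # clamp to the levels that exist


print(fpn_level(224, 224))  # 4
print(fpn_level(223, 223))  # 3
```

Two proposals that differ by a single pixel land on different levels, which shows how arbitrary the hard assignment can be and why pooling from every level is attractive.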

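The features extracted from the different levels are pooled to the same size and then fused with a max operation. The figure below shows that every level plays a role in recognizing objects of different sizes: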
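The pool-then-max step can be sketched as follows. This is a simplified illustration, not the paper's implementation: it uses single-channel maps stored as nested lists and plain max-pooling bins, whereas PANet pools with RoIAlign (bilinear sampling). The names `roi_pool` and `adaptive_feature_pooling`, and the 2x2 output size, are hypothetical.

```python
import math


def roi_pool(feature, roi, stride, out_size=2):
    """Max-pool the RoI (x0, y0, x1, y1 in image pixels) to out_size x out_size.

    `feature` is a 2D list whose cells each cover `stride` image pixels.
    """
    x0, y0, x1, y1 = (c / stride for c in roi)
    h, w = len(feature), len(feature[0])
    bin_w = (x1 - x0) / out_size
    bin_h = (y1 - y0) / out_size
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Integer cell range of this bin, clamped so it covers >= 1 cell.
            ys = min(int(y0 + i * bin_h), h - 1)
            xs = min(int(x0 + j * bin_w), w - 1)
            ye = max(ys + 1, min(h, math.ceil(y0 + (i + 1) * bin_h)))
            xe = max(xs + 1, min(w, math.ceil(x0 + (j + 1) * bin_w)))
            row.append(max(feature[y][x]
                           for y in range(ys, ye) for x in range(xs, xe)))
        pooled.append(row)
    return pooled


def adaptive_feature_pooling(levels, roi, out_size=2):
    """levels: list of (feature_map, stride). Pool from every level, fuse by max."""
    grids = [roi_pool(f, roi, s, out_size) for f, s in levels]
    return [[max(g[i][j] for g in grids) for j in range(out_size)]
            for i in range(out_size)]


# Two toy levels: a fine 8x8 map (stride 4) and a coarse 4x4 map (stride 8).
fine = [[1.0] * 8 for _ in range(8)]
fine[0][0] = 5.0                      # a strong local response on the fine level
coarse = [[2.0] * 4 for _ in range(4)]
print(adaptive_feature_pooling([(fine, 4), (coarse, 8)], roi=(0, 0, 16, 16)))
# [[5.0, 2.0], [2.0, 2.0]]
```

In the output, the top-left bin comes from the fine level and the other bins from the coarse level: the max lets each position take its response from whichever level is strongest, instead of committing the whole proposal to one level.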

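When fusing, the features pooled from each level first pass independently through one parameter layer (for example a fully-connected layer or a convolutional layer, which can perform feature selection), and only then are they fused. For example, the box branch in FPN has two fc layers, so the order is: pooled features -> first fc layer -> feature fusion -> second fc layer. The mask branch in Mask R-CNN has four consecutive convolutional layers, so the order is: pooled features -> first conv layer -> feature fusion -> second conv layer, followed by the remaining conv layers.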

In following sub-networks, pooled feature grids go through one parameter layer independently, which is followed by the fusion operation, to enable network to adapt features. For example, there are two fc layers in the box branch in FPN. We apply the fusion operation after the first layer. Since four consecutive convolutional layers are used in mask prediction branch in Mask R-CNN, we place fusion operation between the first and second convolutional layers.
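The placement of the fusion in the box branch can be sketched as below: each level's pooled feature goes through the first fc layer on its own, the results are fused with an element-wise max, and the second fc layer sees only the fused vector. This is a toy illustration with plain lists; the function names are hypothetical, and whether the first layer shares weights across levels is an assumption left to the caller (pass the same parameters for every level to share them).

```python
def linear_relu(x, weight, bias):
    """Fully-connected layer followed by ReLU. weight: one row per output unit."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weight, bias)]


def fused_box_head(level_feats, fc1_params, fc2_params):
    """level_feats: one flattened pooled feature per pyramid level.

    fc1_params: one (weight, bias) pair per level; fc2_params: (weight, bias).
    """
    # First parameter layer, applied to each level independently.
    hidden = [linear_relu(x, w, b) for x, (w, b) in zip(level_feats, fc1_params)]
    # Fusion: element-wise max across levels.
    fused = [max(vals) for vals in zip(*hidden)]
    # Second parameter layer operates on the fused feature only.
    return linear_relu(fused, *fc2_params)


identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
summing = ([[1.0, 1.0]], [0.0])
feats = [[1.0, 2.0], [3.0, 0.0]]      # pooled features from two levels
print(fused_box_head(feats, [identity, identity], summing))  # [5.0]
```

Here the fused vector is [3.0, 2.0], taking the first component from the second level and the second component from the first level, before the final layer sums them.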