RefineNet


RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation


Abstract

RefineNet, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections. In this way, the deeper layers that capture high-level semantic features can be directly refined using fine-grained features from earlier convolutions.
(RefineNet is a generic multi-path refinement network that exploits all the information available along the down-sampling process, using long-range residual connections to enable high-resolution prediction. In this way, the deeper layers that capture high-level semantic features can be refined with fine-grained features from earlier convolutions.)


1. Introduction

Loss of fine detail (due to pooling and convolutions) → deconvolution is applied to recover it → but deconvolution operates on the already-downsampled features and cannot produce an accurate high-resolution output.
DeepLab (dilated convolution) → widely used, but with two drawbacks: heavy computation and memory cost, and still some loss of detail.
Exploiting intermediate layers to produce high-resolution information → still lacks spatial information.

(Authors' view) Features from all levels are helpful for semantic segmentation → exploit multi-level features to generate high-resolution predictions.
(Contributions)
1.1 Related Work
How can the features of intermediate layers be exploited effectively?


2. Background

Introduces convolutional networks and dilated (atrous) convolution.
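As a quick illustration, here is a minimal PyTorch sketch (the tensor sizes are arbitrary): dilated convolution enlarges the receptive field without reducing resolution, which is the DeepLab-style alternative mentioned in the introduction.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)  # a feature map in NCHW layout

# Standard 3x3 convolution: 3x3 receptive field per layer.
conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)

# Dilated 3x3 convolution with rate 2: the kernel taps are spread out,
# enlarging the receptive field to 5x5, while the output stays 56x56
# because the padding matches the dilation rate.
atrous = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)

print(conv(x).shape)    # torch.Size([1, 64, 56, 56])
print(atrous(x).shape)  # torch.Size([1, 64, 56, 56])
```

This keeps resolution, but, as the introduction notes, storing many high-dimensional feature maps at high resolution is expensive in computation and memory.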


3. Proposed Method

3.1. Multi-Path Refinement

We aim to exploit multi-level features for high-resolution prediction with long-range residual connections.
(Exploit multi-level features for high-resolution prediction, linked by long-range residual connections.)

[Figure 2: the multi-path refinement architecture. RefineNet-m denotes the RefineNet block that takes the output of ResNet block m.]
Each ResNet output is passed through one convolutional layer to adapt the dimensionality.
In Fig. 2(c), we start from the last block in ResNet and connect the output of ResNet block-4 to RefineNet-4. In the next stage, the output of RefineNet-4 and ResNet block-3 are fed to RefineNet-3 as 2-path inputs. The goal of RefineNet-3 is to use the high-resolution features from ResNet block-3 to refine the low-resolution feature map output by RefineNet-4 in the previous stage. RefineNet-2 and RefineNet-1 repeat this stage-wise refinement. As the last step, the final high-resolution feature maps are fed to a dense soft-max layer to make the final prediction in the form of a dense score map. This score map is then upsampled to match the original image using bilinear interpolation.
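To make the wiring concrete, here is a PyTorch sketch of the cascade (an illustration by this post, not code from the paper). `RefineNetBlock` is only a stand-in that fuses its inputs; its real internals are sketched after the component list in Sec. 3.2. The backbone interface, the ResNet-101 channel counts, and the 256-d working dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefineNetBlock(nn.Module):
    """Stand-in for the block of Sec. 3.2 (RCU + fusion + pooling).
    Here it only fuses: upsample every path to the largest input
    resolution, sum, and apply one convolution."""
    def __init__(self, dim):
        super().__init__()
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1)

    def forward(self, *paths):
        target = max(p.shape[-2:] for p in paths)  # finest resolution
        fused = sum(p if p.shape[-2:] == target else
                    F.interpolate(p, size=target, mode='bilinear',
                                  align_corners=False)
                    for p in paths)
        return self.conv(fused)

class CascadedRefineNet(nn.Module):
    """The 4-cascaded wiring of Fig. 2(c). `backbone(image)` is assumed
    to return the four ResNet block outputs at 1/4, 1/8, 1/16 and 1/32
    of the input resolution, with ResNet-101 channel counts."""
    def __init__(self, backbone, num_classes, dim=256):
        super().__init__()
        self.backbone = backbone
        # One convolution per ResNet output to adapt the dimensionality.
        self.adapt = nn.ModuleList(
            [nn.Conv2d(c, dim, kernel_size=3, padding=1)
             for c in (256, 512, 1024, 2048)])
        self.refine = nn.ModuleList([RefineNetBlock(dim) for _ in range(4)])
        self.classifier = nn.Conv2d(dim, num_classes, kernel_size=1)

    def forward(self, image):
        b1, b2, b3, b4 = (a(f) for a, f in zip(self.adapt, self.backbone(image)))
        r4 = self.refine[3](b4)      # RefineNet-4: single-path input
        r3 = self.refine[2](b3, r4)  # block-3 features refine RefineNet-4's output
        r2 = self.refine[1](b2, r3)
        r1 = self.refine[0](b1, r2)  # final high-resolution (1/4) features
        score = self.classifier(r1)  # dense score map (softmax left to the loss)
        # Bilinear upsampling back to the original image size.
        return F.interpolate(score, size=image.shape[-2:],
                             mode='bilinear', align_corners=False)
```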
3.2. RefineNet
[Figure 3: the components of a RefineNet block: (a) the full block, (b) residual convolution unit, (c) multi-resolution fusion, (d) chained residual pooling.]
(Annotated figures from https://blog.csdn.net/kevin_zhao_zl/article/details/84750779)

  • Residual convolution unit. (Fig. 3b)
    The first part of each RefineNet block consists of an adaptive convolution set that mainly fine-tunes the pretrained ResNet weights for our task. To that end, each input path is passed sequentially through two residual convolution units (RCUs), a simplified version of the ResNet convolution unit without batch normalization. (A PyTorch sketch of all four components follows this list.)

  • Multi-resolution fusion. (Fig. 3c)
    All path inputs are then fused into a high-resolution feature map by the multi-resolution fusion block. This block first applies convolutions for input adaptation, which generate feature maps of the same feature dimension (the smallest one among the inputs), and then upsamples all (smaller) feature maps to the largest resolution of the inputs. Finally, all feature maps are fused by summation. (In short: 1. convolutions adapt each input to the same feature dimension, the smallest among the inputs; 2. the smaller feature maps are upsampled to the largest input resolution; 3. the feature maps are fused by summation.)

  • Chained residual pooling. (Fig. 3d)
    The proposed chained residual pooling aims to capture background context from a large image region. One pooling block takes the output of the previous pooling block as input. Therefore, the current pooling block is able to re-use the result from the previous pooling operation and thus access the features from a large region without using a large pooling window.
    The output feature maps of all pooling blocks are fused together with the input feature map through summation of residual connections. In one pooling block, each pooling operation is followed by convolutions which serve as a weighting layer for the summation fusion.
    (Captures background context from a large image region: each pooling block takes the previous block's output as input, so it can reuse the earlier pooling result and access features from a large region without a large pooling window. All pooling-block outputs are summed with the input feature map; the convolution after each pooling operation acts as a weighting layer for the summation fusion.)
    Note that we include one non-linear activation layer (ReLU) in the chained residual pooling block. We observed that this ReLU is important for the effectiveness of subsequent pooling operations and it also makes the model less sensitive to changes in the learning rate. We observed that one single ReLU in each RefineNet block does not noticeably reduce the effectiveness of gradient flow. (The ReLU matters for the effectiveness of the subsequent pooling and makes the model less sensitive to the learning rate, without noticeably hurting gradient propagation.)

  • Output convolutions.
    The final step of each RefineNet block is another residual convolution unit (RCU). This results in a sequence of three RCUs between each block. To reflect this behavior in the last RefineNet-1 block, we place two additional RCUs before the final softmax prediction step. The goal here is to employ non-linearity operations on the multi-path fused feature maps to generate features for further processing or for final prediction.
    (Each block ends with one RCU, so three RCUs sit between consecutive fusion stages; the two extra RCUs before the softmax give RefineNet-1 the same behavior. The point is to apply non-linear operations to the multi-path fused feature maps.)
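Putting the four components together, here is a minimal PyTorch sketch of one RefineNet block, filling in the `RefineNetBlock` stand-in used in the Sec. 3.1 sketch. The 5x5 stride-1 pooling and the two-block chain follow the paper's figure; the 3x3 kernel sizes and the single shared feature dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RCU(nn.Module):
    """Residual convolution unit (Fig. 3b): a simplified ResNet unit
    without batch normalization: ReLU-conv-ReLU-conv plus a shortcut."""
    def __init__(self, dim):
        super().__init__()
        self.conv1 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)

    def forward(self, x):
        out = self.conv1(F.relu(x))
        out = self.conv2(F.relu(out))
        return x + out  # short-range residual connection

class MultiResolutionFusion(nn.Module):
    """Fig. 3c: one adaptation conv per path, upsample the smaller maps
    to the largest input resolution, then fuse by summation."""
    def __init__(self, dim, num_paths=2):
        super().__init__()
        self.adapt = nn.ModuleList(
            [nn.Conv2d(dim, dim, kernel_size=3, padding=1)
             for _ in range(num_paths)])

    def forward(self, *paths):
        target = max(p.shape[-2:] for p in paths)  # largest input resolution
        fused = 0
        for conv, p in zip(self.adapt, paths):
            p = conv(p)
            if p.shape[-2:] != target:
                p = F.interpolate(p, size=target, mode='bilinear',
                                  align_corners=False)
            fused = fused + p  # summation fusion
        return fused

class ChainedResidualPooling(nn.Module):
    """Fig. 3d: a chain of {5x5 pool, stride 1 -> conv} blocks. Each
    block pools the previous block's output, so later blocks see an
    increasingly large region without a large pooling window; every
    block's output is summed back onto the input, with the conv acting
    as a weighting layer for the fusion."""
    def __init__(self, dim, num_blocks=2):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(
                nn.MaxPool2d(kernel_size=5, stride=1, padding=2),
                nn.Conv2d(dim, dim, kernel_size=3, padding=1))
             for _ in range(num_blocks)])

    def forward(self, x):
        x = F.relu(x)           # the single ReLU the text calls out
        out = x
        path = x
        for block in self.blocks:
            path = block(path)  # re-uses the previous pooling result
            out = out + path    # residual summation fusion
        return out

class RefineNetBlock(nn.Module):
    """One RefineNet block (Fig. 3a): two RCUs per input path,
    multi-resolution fusion, chained residual pooling, one output RCU.
    Accepts 1..num_paths inputs (zip simply ignores unused paths)."""
    def __init__(self, dim, num_paths=2):
        super().__init__()
        self.input_rcus = nn.ModuleList(
            [nn.Sequential(RCU(dim), RCU(dim)) for _ in range(num_paths)])
        self.fusion = MultiResolutionFusion(dim, num_paths)
        self.pooling = ChainedResidualPooling(dim)
        self.output_rcu = RCU(dim)

    def forward(self, *paths):
        paths = [rcu(p) for rcu, p in zip(self.input_rcus, paths)]
        fused = paths[0] if len(paths) == 1 else self.fusion(*paths)
        return self.output_rcu(self.pooling(fused))
```

Note that the output RCU of one block plus the two input RCUs of the next give the "three RCUs between each block" mentioned above; the two extra RCUs placed before the final softmax in RefineNet-1 are not shown.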

3.3. Identity Mappings in RefineNet
RefineNet contains both short-range and long-range residual connections.
Short-range residual connections refer to local shortcut connections in one RCU or the residual pooling component, while long-range residual connections refer to the connections between RefineNet modules and the ResNet blocks.

4. Experiments

4.1. Object Parsing
4.2. Semantic Segmentation
4.3. Variants of cascaded RefineNet

5. Conclusion

The cascaded architecture is able to effectively combine high-level semantics and low-level features to produce high-resolution segmentation maps. Our design choices are inspired by the idea of identity mapping which facilitates gradient propagation across long-range connections and thus enables effective end-to-end learning.
(The cascaded architecture effectively combines high-level semantics with low-level features; identity mappings facilitate gradient propagation across long-range connections, enabling effective end-to-end learning.)
