ICNet

Abstract

The paper proposes an image cascade network (ICNet) that incorporates multi-resolution branches.
It introduces the cascade feature fusion unit to quickly achieve high-quality segmentation.

1. Introduction

  • State of research on fast semantic segmentation
    Most existing work targets high-quality segmentation.

  • Focus and contributions
    We focus on building a practically fast semantic segmentation system with decent prediction accuracy.
    It achieves a decent trade-off between efficiency and accuracy.

    It exploits the efficiency of processing low-resolution images and the high inference quality of high-resolution ones. The idea is to let low-resolution images go through the full semantic perception network first for a coarse prediction map. Then the cascade feature fusion unit and the cascade label guidance strategy are proposed to integrate medium- and high-resolution features, which refine the coarse semantic map gradually.

Contributions
– The network efficiently uses semantic information from low-resolution images along with details from high-resolution images.
– The developed cascade feature fusion unit together with cascade label guidance can recover and refine the segmentation prediction progressively at a low computation cost.
– ICNet achieves a 5× speedup of inference time and reduces memory consumption by 5×. It can run at a high resolution of 1024×2048 at 30 fps while accomplishing high-quality results.

2. Related Work

  • High-quality semantic segmentation
  • High-efficiency semantic segmentation
  • Video semantic segmentation

3.Image Cascade Network

PSPNet—ICNet—cascade feature fusion unit and cascade label guidance

3.1 Speed Analysis
The computation complexity is associated with feature map resolution, number of kernels and network width.
Computation grows with the square of the image side length, so halving both height and width cuts the cost to roughly a quarter.
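
To make the resolution dependence concrete, here is a minimal sketch that counts multiply-accumulate operations for a single convolution layer at three input sizes. The helper `conv_macs`, the 64-channel width and the 3×3 kernel are illustrative assumptions, not values taken from the paper.

```python
def conv_macs(height, width, in_channels, out_channels, kernel=3, stride=1):
    """Multiply-accumulate count of one convolution layer ('same' padding assumed)."""
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    return out_h * out_w * in_channels * out_channels * kernel * kernel


# Hypothetical layer: 64 -> 64 channels, 3x3 kernel, at three input resolutions.
full = conv_macs(1024, 2048, 64, 64)
half = conv_macs(512, 1024, 64, 64)
quarter = conv_macs(256, 512, 64, 64)

print(full // half)     # 4
print(full // quarter)  # 16
```

Halving each side divides the cost of this layer by 4, and quartering each side divides it by 16, which is why running the heavy part of a network on a downsampled input pays off so much.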

3.2 Network Architecture
It takes cascade image inputs (i.e., low-, medium- and high-resolution images), adopts the cascade feature fusion unit and is trained with cascade label guidance.
The full-resolution input image is downsampled by factors of 2 and 4, forming the cascade inputs to the medium- and low-resolution branches; the full-resolution image itself feeds the high-resolution branch.
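
The cascade input can be sketched as follows, assuming the full-resolution image feeds the high-resolution branch and the 1/2 and 1/4 copies feed the medium- and low-resolution branches. The function names are hypothetical, and plain striding stands in for whatever interpolation a real implementation would use.

```python
def cascade_input_sizes(height, width):
    """Return the (height, width) fed to each branch for a full-resolution input."""
    return {
        "high": (height, width),              # full resolution
        "medium": (height // 2, width // 2),  # downsampled by 2
        "low": (height // 4, width // 4),     # downsampled by 4
    }


def downsample(image, factor):
    """Keep every `factor`-th pixel of an H x W nested list (a crude stand-in)."""
    return [row[::factor] for row in image[::factor]]


sizes = cascade_input_sizes(1024, 2048)
print(sizes["medium"])  # (512, 1024)
print(sizes["low"])     # (256, 512)
```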

Semantic information is extracted from the low-resolution input.
To get high-quality segmentation, the medium- and high-resolution branches help recover and refine the coarse prediction.
The output feature maps of the different branches are fused by the cascade feature fusion unit and trained with cascade label guidance.

3.3 Cascade Feature Fusion
The input to this unit contains three components: two feature maps F1 and F2 with sizes C1×H1×W1 and C2×H2×W2 respectively, and a ground-truth label with resolution 1×H2×W2. F2 has twice the spatial size of F1.
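
These notes list only the inputs of the unit, so the following is a toy sketch under stated assumptions rather than the paper's layer definition. It assumes F1 is upsampled by 2 to match F2, that both maps already share a channel count, and that they are summed and passed through a ReLU. Any learned convolutions or normalization are left out, and nearest-neighbour upsampling is used purely for simplicity. All function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a C x H x W nested list."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def shape(fmap):
    """Return (C, H, W) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


def fuse(f1, f2):
    """Toy fusion: upsample F1, add F2 element-wise, apply ReLU.

    Assumes F1 and F2 already have the same channel count; the learned
    layers of a real unit are deliberately left out.
    """
    c1, h1, w1 = shape(f1)
    c2, h2, w2 = shape(f2)
    if (h2, w2) != (2 * h1, 2 * w1):
        raise ValueError("F2 must have twice the spatial size of F1")
    if c1 != c2:
        raise ValueError("this sketch assumes matching channel counts")
    up = upsample2x(f1)
    return [
        [[max(0.0, a + b) for a, b in zip(row_up, row_f2)]
         for row_up, row_f2 in zip(ch_up, ch_f2)]
        for ch_up, ch_f2 in zip(up, f2)
    ]


f1 = [[[1.0, -2.0]]]                 # 1 x 1 x 2
f2 = [[[0.5, 0.5, 0.5, 0.5],
       [-3.0, 0.0, 1.0, 5.0]]]       # 1 x 2 x 4
fused = fuse(f1, f2)
print(shape(fused))  # (1, 2, 4)
```

The fused map has the spatial size of F2, which is also the resolution of the ground-truth label the unit receives during training.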

3.4 Cascade Label Guidance
It uses ground-truth labels at different scales (e.g., 1/16, 1/8 and 1/4) to guide the learning of the low-, medium- and high-resolution inputs.
In the testing phase, the low- and medium-resolution guidance operations are simply abandoned and only the high-resolution branch is retained.
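
A minimal sketch of multi-scale label guidance, assuming each branch is supervised by a softmax cross-entropy against a downsampled copy of the label map and that the branch losses are combined as a weighted sum. The weights `(0.4, 0.4, 1.0)`, the strided label downsampling and all function names are illustrative assumptions, not values confirmed by these notes.

```python
import math


def downsample_labels(labels, factor):
    """Keep every `factor`-th label of an H x W nested list."""
    return [row[::factor] for row in labels[::factor]]


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy; logits[y][x] is a list of class scores."""
    total, count = 0.0, 0
    for logit_row, label_row in zip(logits, labels):
        for scores, target in zip(logit_row, label_row):
            m = max(scores)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
            total += log_sum - scores[target]
            count += 1
    return total / count


def cascade_loss(branch_logits, full_labels,
                 factors=(16, 8, 4), weights=(0.4, 0.4, 1.0)):
    """Weighted sum of per-branch losses (weights are hypothetical)."""
    loss = 0.0
    for logits, factor, weight in zip(branch_logits, factors, weights):
        target = downsample_labels(full_labels, factor)
        loss += weight * cross_entropy(logits, target)
    return loss


# Tiny example: a 16 x 16 label map, two classes, uninformative logits.
labels = [[0] * 16 for _ in range(16)]
branch_logits = [
    [[[0.0, 0.0] for _ in range(16 // f)] for _ in range(16 // f)]
    for f in (16, 8, 4)
]
loss = cascade_loss(branch_logits, labels)
```

At test time none of this is needed: the auxiliary terms exist only to shape training, which matches the statement that only the high-resolution branch is kept for inference.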

4 Structure Comparison and Analysis

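Previous frameworks all perform relatively dense computation on high-resolution input. In our cascade structure, only the lowest-resolution input is fed into the heavy CNN, which greatly reduces computation while yielding a coarse semantic prediction. The higher-resolution inputs are meant to progressively recover and refine the prediction at blurred boundaries and missing details, so they are processed by lightweight CNNs. The newly introduced cascade feature fusion unit and cascade label guidance strategy integrate the medium- and high-resolution features and gradually refine the coarse semantic map.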

5 Experimental Evaluation

5.1 Implementation Details
Details…

5.2 Cityscapes

Intuitive Speedup

  1. Downsampling Input
    Image resolution is the most critical factor that affects running speed.
    This gives poor results.
  2. Downsampling Feature
    Scale down the feature map by a large ratio in the inference process.
    Smaller feature maps make inference faster, but even at a 1:32 ratio the running time is still too long to meet real-time requirements.
  3. Model Compression
    Trim kernels in each layer.
    This also gives poor results.

Cascade Branches

Cascade Structure

We also do an ablation study on the cascade feature fusion unit and cascade label guidance.

Methods Comparison

Visual Improvement

Quantitative Analysis

5.3 CamVid
5.4 COCO-Stuff

6 Conclusion

We have proposed a real-time semantic segmentation system, ICNet. It incorporates effective strategies to accelerate network inference speed without sacrificing much performance.
The major contributions include the new framework for saving operations in multiple resolutions and the powerful fusion unit.
