[STDC] "Rethinking BiSeNet For Real-time Semantic Segmentation"

bryant_meng

Last modified 2023-02-09 10:08:46

Category: CNN / Transformer. Tags: BiseNet, STDC, Dice Loss, light-weight

First published 2021-06-16 21:22:22

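Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include the link to the original and this notice.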

Article link: https://blog.csdn.net/bryant_meng/article/details/116771475

CVPR-2021

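It has been a long time since I wrote a blog post, so I am taking a moment to tidy up my reading notes; going bald makes you forgetful 🕔, haha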

Contents

1 Background and Motivation
2 Advantages / Contributions
3 Method
- 3.1 Encoding Network
- 3.2 Design of Decoder
4 Experiments
6 Conclusion（own）

1 Background and Motivation

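The title says "rethinking", so, got it: this is an improvement built on BiSeNet.

BiSeNet uses a two-path structure, a context path and a spatial path, together with SE-style attention (see [SENet] "Squeeze-and-Excitation Networks"), to strengthen the feature extraction ability of the semantic segmentation network!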

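[Image]

Building on BiSeNet, the paper points out that:

- the backbone in BiSeNet's context path is borrowed from existing classification networks rather than designed for the segmentation task, which hurts segmentation accuracy;
- the spatial path improves the segmentation of details, but it also adds extra computation, increases the inference burden, and slows segmentation down.

The paper designs a lightweight, segmentation-friendly context path backbone, the Short-Term Dense Concatenate network (STDC), and proposes a substitute for the spatial path that does not affect inference speed (the Detail Guidance module), achieving a state-of-the-art speed-accuracy trade-off.

[Image]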

2 Advantages / Contributions

The authors design the STDC network and use the Detail Guidance module in the segmentation decoder, achieving a state-of-the-art speed-accuracy trade-off.

3 Method

[Image]

3.1 Encoding Network

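[Image]

This is the context path part of BiSeNet. Figure 3 (a) is the STDC network, which is built from the STDC modules shown in (b) and (c).
[Image]
Each module in (b) and (c) has n blocks. The only difference is that, when crossing into a new stage, block2 uses stride 2. The module looks a lot like DenseNet: the number of channels is halved block after block (an exponential decrease), and the last block has the same number of channels as the second-to-last block. The parameter count of an STDC module is computed as follows: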

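[Image: parameter-count formula]
M and N are the input and output channels, and n is the number of blocks.

The DenseNet-like design can extract a scalable receptive field and multi-scale information, and the parameter count actually decreases as n grows (n >= 2).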
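To sanity-check the parameter count, here is a minimal sketch that sums the conv weights block by block. It assumes the layout in the paper's module figure (block 1 is a 1x1 conv with N/2 output channels, every later block is a 3x3 conv that halves the width again, and biases and BatchNorm are ignored). The function names are my own, not taken from the official repo. Under those assumptions the sum collapses to NM/2 + (3N^2/2)(1 + 2^-(2n-3)), which the code evaluates for comparison.

```python
def stdc_block_shapes(m_in, n_out, n_blocks):
    """Return (in_channels, out_channels, kernel_size) for each block.

    Assumed layout: block 1 is a 1x1 conv producing n_out/2 channels, each
    later block is a 3x3 conv that halves the width again, and the last
    block keeps the width of the block before it.
    """
    if n_blocks < 2 or n_out % (2 ** (n_blocks - 1)) != 0:
        raise ValueError("need n_blocks >= 2 and n_out divisible by 2**(n_blocks - 1)")
    shapes = []
    c_in = m_in
    for i in range(1, n_blocks + 1):
        c_out = n_out // 2 ** min(i, n_blocks - 1)
        shapes.append((c_in, c_out, 1 if i == 1 else 3))
        c_in = c_out
    return shapes


def stdc_params(m_in, n_out, n_blocks):
    """Sum the conv weights of all blocks (biases and BatchNorm ignored)."""
    return sum(c_in * c_out * k * k
               for c_in, c_out, k in stdc_block_shapes(m_in, n_out, n_blocks))


def stdc_params_closed_form(m_in, n_out, n_blocks):
    """Closed form of the same sum: NM/2 + (3N^2/2) * (1 + 2^-(2n-3))."""
    return m_in * n_out / 2 + 1.5 * n_out ** 2 * (1 + 2.0 ** -(2 * n_blocks - 3))


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        widths = [c_out for _, c_out, _ in stdc_block_shapes(256, 256, n)]
        print(n, widths, stdc_params(256, 256, n))
```

With M = N = 256 this gives 180224, 143360 and 134144 parameters for n = 2, 3, 4, so the count shrinks as n grows and levels off quickly. The block widths always add up to N (for n = 4: 128 + 64 + 32 + 32), which is what makes the final concatenation produce N channels.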

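[Image]
The authors set n = 4 in their experiments.

The detailed network configuration is as follows:

[Image: network configuration table]
For each stage, the R column in the two rows gives how many times STDC module (c) (stride 2) and STDC module (b) (stride 1) are stacked.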

we only use one convolutional block in each of Stage 1&2, which is proved to be sufficient according to our experiences.

Hahaha, in stage1 and stage2 the channel count has not grown yet, so it cannot survive the exponential channel reduction.

3.2 Design of Decoder

This part involves the Detail Aggregation Module designed by the authors (structures b and c in the figure below), as well as the ARM (Attention Refinement Module) and FFM (Feature Fusion Module) from BiSeNet.

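[Image]
Note what happens after stage5: its output goes through global average pooling -> up-sampling and is then combined with the refined stage4 and stage5 features (refined by the ARM module, which is essentially SE attention) to form one of the inputs of the FFM module.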
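Since the ARM is described here as SE-style attention, a tiny sketch of that gating idea may help. This is only an illustration on nested Python lists: the `weight` matrix is a hypothetical stand-in for the module's 1x1 conv, batch norm is left out, and none of the names come from the official code.

```python
import math


def global_avg_pool(feat):
    """feat is a list of C channels, each an HxW list of lists; returns C means."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]


def channel_attention(feat, weight):
    """SE-style gate: pool each channel, mix the pooled vector with `weight`
    (stand-in for a 1x1 conv), squash with a sigmoid, then rescale each channel."""
    pooled = global_avg_pool(feat)
    gates = [1.0 / (1.0 + math.exp(-sum(w * p for w, p in zip(row, pooled))))
             for row in weight]
    return [[[g * v for v in r] for r in ch] for g, ch in zip(gates, feat)]
```

The pooled vector is also what the global-average-pooling branch after stage5 starts from: one number per channel that summarizes the whole feature map.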

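The other input of the FFM comes from the stage3 features, which are supervised by the Detail Aggregation Module. The details are as follows:

The GT is passed through a Laplacian pyramid; the outputs are up-sampled, concatenated, and fused with a learnable 1x1 conv to produce a binary mask, which supervises the Detail head generated from stage3.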
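The detail-GT step above can be sketched roughly as follows. This is a pure-Python illustration under several assumptions of mine: a 3x3 Laplacian kernel with zero padding, strides 1, 2 and 4, nearest-neighbour up-sampling, a 0.1 threshold, and fixed fusion weights standing in for the learnable 1x1 conv. The weights and function names are hypothetical, not the authors' code.

```python
LAPLACIAN = ((-1, -1, -1), (-1, 8, -1), (-1, -1, -1))


def laplacian_conv(label, stride):
    """3x3 Laplacian convolution with zero padding 1 and the given stride."""
    h, w = len(label), len(label[0])
    out = []
    for y in range(0, h, stride):
        row = []
        for x in range(0, w, stride):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += LAPLACIAN[dy + 1][dx + 1] * label[yy][xx]
            row.append(acc)
        out.append(row)
    return out


def upsample_nearest(feat, h, w):
    """Nearest-neighbour resize of a 2D map to h x w."""
    fh, fw = len(feat), len(feat[0])
    return [[feat[y * fh // h][x * fw // w] for x in range(w)] for y in range(h)]


def detail_ground_truth(label, strides=(1, 2, 4), weights=(0.6, 0.3, 0.1), threshold=0.1):
    """Binary detail mask: multi-stride Laplacian edges, up-sampled and fused.

    `weights` is a fixed, hypothetical stand-in for the learnable 1x1 conv.
    """
    h, w = len(label), len(label[0])
    maps = []
    for s in strides:
        edges = upsample_nearest(laplacian_conv(label, s), h, w)
        maps.append([[1.0 if v > threshold else 0.0 for v in row] for row in edges])
    fused = [[sum(wt * m[y][x] for wt, m in zip(weights, maps)) for x in range(w)]
             for y in range(h)]
    return [[1 if v > threshold else 0 for v in row] for row in fused]
```

On an 8x8 label map with a 4x4 foreground square, the resulting mask fires on the border of the square (thickened by the coarser scale) and stays 0 on the flat background, which is the few-foreground, many-background situation the loss has to cope with.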

The purpose of the Detail Aggregation Module introduced by the authors (b and c in the figure above) is:

leading to more precise preservation of spatial details in low-level layers without extra computation cost in the inference time.

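Because the generated detail GT has few foreground pixels and many background pixels, supervising it directly with binary cross-entropy easily runs into positive/negative class imbalance, so the authors add Dice Loss on top of binary cross-entropy.

[Image: detail loss formula]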
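To see why Dice helps with the imbalance, here is a small sketch of the combined loss on flat lists of probabilities and 0/1 targets. I am assuming the squared-sum form of Dice with a smoothing term of 1 and an unweighted sum of the two losses; treat the exact form and the function names as my assumptions rather than the authors' implementation.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped away from 0 and 1."""
    total = 0.0
    for p, g in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= g * math.log(p) + (1 - g) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """1 - (2*sum(p*g) + s) / (sum(p^2) + sum(g^2) + s)."""
    inter = sum(p * g for p, g in zip(probs, targets))
    denom = sum(p * p for p in probs) + sum(g * g for g in targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def detail_loss(probs, targets):
    """Assumed combination: plain sum of BCE and Dice."""
    return bce_loss(probs, targets) + dice_loss(probs, targets)


if __name__ == "__main__":
    targets = [1] + [0] * 99   # one detail pixel among 100
    lazy = [0.01] * 100        # predicts "background" everywhere
    print(bce_loss(lazy, targets), dice_loss(lazy, targets))
```

In the toy case at the bottom, a prediction that ignores the single foreground pixel still gets a small BCE because the 99 background pixels dominate the mean, while the Dice term stays large because it measures overlap with the foreground.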

For an introduction to Dice Loss, see:

Medical image segmentation: Dice Loss

Note that this branch(Detail Aggregation Module) is discarded in the inference phase.

4 Experiments

4.1 Datasets

ImageNet
Cityscapes
CamVid

4.2 Ablation Study

1) Effectiveness of STDC Module
[Image]
In this ablation, the more blocks, the faster the network and the higher the accuracy.

2) Effectiveness of Our backbone
[Image]

3) Effectiveness of Detail Guidance
[Image]
This compares the stage3 features with and without Detail Guidance (panels b and c).

The features of Stage 3 with detail guidance encode more spatial information comparing to that of Stage 3 without detail guidance.

4.3 Compare with State-of-the-arts

1) Results on ImageNet
[Image]

2) Results on Cityscapes

[Image]
3) Results on CamVid
6 Conclusion（own）

global pooling -> context information
https://github.com/MichaelFan01/STDC-Seg
