[Semantic Segmentation] SwiftNet: In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation


1273545169


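Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/baidu_27643275/article/details/97023406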


GitHub: swiftnet

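1. SwiftNet

(1) It uses an encoder-decoder architecture, with ResNet-18 and MobileNetV2 as the backbone networks;

(2) It proposes a way to enlarge the receptive field: "fuse shared features at multiple resolutions in a novel fashion";

(3) It offers a few practical tricks for semantic segmentation.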

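2. Basic building blocks

2.1 Recognition encoder

SwiftNet uses ResNet-18 and MobileNetV2 as backbones.

ResNet-18 requires 6 times as much computation as MobileNetV2.

However, MobileNetV2 relies on depthwise separable convolutions, which GPU firmware (the cuDNN library) does not directly support, so in many practical setups MobileNetV2 actually runs slower than ResNet-18.

DenseNet has the same problem: it requires "efficient convolution over a non-contiguous tensor", which cuDNN does not support either.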
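The gap between theoretical cost and measured speed is easier to see with a quick count of multiply-accumulate operations (MACs). The sketch below compares a standard convolution with a depthwise separable one; the helper names and the layer size are hypothetical, chosen only for illustration and not taken from the paper.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs of a standard k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # the 1 x 1 conv mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 64 x 128 feature map, 128 -> 128 channels, 3 x 3 kernel.
standard = conv_macs(64, 128, 128, 128, 3)
separable = depthwise_separable_macs(64, 128, 128, 128, 3)

print(f"standard conv:  {standard / 1e6:.1f} MMACs")
print(f"separable conv: {separable / 1e6:.1f} MMACs")
print(f"ratio:          {standard / separable:.2f}x")
```

A lower MAC count only turns into lower latency when the underlying kernels are well optimized, which is the point made above about cuDNN.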

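2.2 Upsampling decoder

To keep processing real-time, the upsampling path must be as simple as possible.

The low-resolution feature map is first upsampled with bilinear interpolation and then summed element-wise with the lateral features coming from the lateral connection. The result is finally fed through a 3x3 conv.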

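Two points to note:
(1) the lateral connection should branch off after the element-wise sum; branching off after the ReLU lowers accuracy;
(2) replacing the final 3x3 conv with a depthwise separable convolution or a 1x1 conv lowers accuracy.
[Figure]
All blue UP blocks (decoder units) have 128 channels, so the lateral features must pass through a 1x1 convolution to bring them to the same channel count.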
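The data flow of one UP block can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: feature maps are nested lists shaped [channels][height][width], batch normalization and ReLU are omitted, the toy example uses 2 channels instead of 128, and all function names are hypothetical.

```python
def bilinear_upsample(x, out_h, out_w):
    """Bilinearly resize one channel (a list of rows) to out_h x out_w."""
    in_h, in_w = len(x), len(x[0])
    out = []
    for i in range(out_h):
        # Map the output pixel centre into input coordinates, clamped to the map.
        sy = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for j in range(out_w):
            sx = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            top = x[y0][x0] * (1 - fx) + x[y0][x1] * fx
            bottom = x[y1][x0] * (1 - fx) + x[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def conv1x1(feat, weight):
    """1x1 convolution; weight is shaped [c_out][c_in]."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[sum(wk * feat[k][i][j] for k, wk in enumerate(w_out))
              for j in range(w)] for i in range(h)]
            for w_out in weight]


def conv3x3(feat, weight):
    """3x3 convolution with zero padding; weight is shaped [c_out][c_in][3][3]."""
    h, w = len(feat[0]), len(feat[0][0])
    out = []
    for w_out in weight:
        plane = [[0.0] * w for _ in range(h)]
        for k, kernel in enumerate(w_out):
            for i in range(h):
                for j in range(w):
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            y, x = i + di, j + dj
                            if 0 <= y < h and 0 <= x < w:
                                plane[i][j] += kernel[di + 1][dj + 1] * feat[k][y][x]
        out.append(plane)
    return out


def upsample_block(low, lateral, w_lateral, w_blend):
    """Upsample `low`, add the channel-matched lateral features, blend with 3x3."""
    h, w = len(lateral[0]), len(lateral[0][0])
    up = [bilinear_upsample(ch, h, w) for ch in low]
    lat = conv1x1(lateral, w_lateral)  # bring lateral features to the decoder width
    summed = [[[up[c][i][j] + lat[c][i][j] for j in range(w)]
               for i in range(h)] for c in range(len(up))]
    return conv3x3(summed, w_blend)


# Toy example: 2-channel 2x2 decoder input, 3-channel 4x4 lateral features.
low = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
lateral = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
w_lateral = [[1.0] * 3 for _ in range(2)]          # 3 -> 2 channels
w_blend = [[[[1.0 if (o == k and di == 1 and dj == 1) else 0.0
              for dj in range(3)] for di in range(3)]
            for k in range(2)] for o in range(2)]  # identity 3x3 kernel

out = upsample_block(low, lateral, w_lateral, w_blend)
print(len(out), len(out[0]), len(out[0][0]))
```

The output has the spatial size of the lateral features and the channel count of the decoder, which is why the 1x1 convolution on the lateral branch is needed before the sum.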

2.3 Module for increasing the receptive field

Spatial pyramid pooling and pyramid fusion are used to enlarge the receptive field while keeping inference real-time.

The two encoders share weights.
[Figure]
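As a rough illustration of how spatial pyramid pooling widens the receptive field, the sketch below average-pools a single-channel feature map onto several coarse grids and resizes each result back to the input size. It is a simplification under stated assumptions: the grid sizes (1, 2, 4) are chosen for the toy example, nearest-neighbour upsampling stands in for bilinear interpolation, the convolutions around the pooling are left out, and all function names are hypothetical.

```python
def adaptive_avg_pool(x, g):
    """Average-pool one channel (a list of rows) onto a g x g grid."""
    h, w = len(x), len(x[0])
    out = []
    for gi in range(g):
        y0 = (gi * h) // g                    # floor of the bin start
        y1 = ((gi + 1) * h + g - 1) // g      # ceil of the bin end
        row = []
        for gj in range(g):
            x0 = (gj * w) // g
            x1 = ((gj + 1) * w + g - 1) // g
            cells = [x[i][j] for i in range(y0, y1) for j in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def nearest_upsample(x, out_h, out_w):
    """Resize one channel to out_h x out_w by nearest-neighbour lookup."""
    in_h, in_w = len(x), len(x[0])
    return [[x[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def spatial_pyramid_pooling(x, grids=(1, 2, 4)):
    """Return the input plus one pooled-and-resized copy per grid size."""
    h, w = len(x), len(x[0])
    levels = [x]
    for g in grids:
        levels.append(nearest_upsample(adaptive_avg_pool(x, g), h, w))
    return levels  # in a network these would be concatenated along channels


# Toy 4x4 feature map holding the values 0..15.
feature = [[i * 4 + j for j in range(4)] for i in range(4)]
levels = spatial_pyramid_pooling(feature)
for level in levels:
    print(level[0])
```

Each pixel of the coarsest level summarizes the whole map, so any layer that consumes the concatenated levels sees image-wide context at the cost of a few cheap pooling operations.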

3. Experiment

[Figure]

3.1 Validation of the upsampling capacity

[Figure]

3.2 Size of the receptive field

[Figure]
