Abstract
DABNet proposes a new Depth-wise Asymmetric Bottleneck (DAB) module to trade off accuracy against inference speed.
The DAB module combines depth-wise separable convolution, dilated convolution, and factorized convolution. This structure extracts local and contextual information jointly while greatly reducing the number of parameters.
DABNet uses no context module, pretrained model, or post-processing scheme.
DABNet runs on high-resolution images (512×1024) at 104 FPS on a single GTX 1080Ti card and achieves 70.1% mean IoU on the Cityscapes test set with merely 0.76 M parameters.
1、Related Work
Real-time semantic segmentation:
ENet, ICNet, ERFNet, ESPNet, BiSeNet
Dilated convolution:
Dilated convolution can create a sizeable receptive field while keeping the number of parameters unchanged.
When the dilation rate grows large (in most cases it increases up to 16 or more), a lot of padding is needed to maintain the size of the feature map, which brings a heavy computational cost.
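The relationship between dilation rate, receptive field, and padding can be checked with a little arithmetic. The following is a minimal sketch, assuming stride 1 and an odd kernel size; the helper names are illustrative and do not come from the DABNet code.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Zero padding per side that keeps the feature-map size (stride 1, odd kernel)."""
    return dilation * (kernel_size - 1) // 2


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights of a 2-D convolution (bias ignored). Dilation does not appear here."""
    return kernel_size * kernel_size * in_channels * out_channels


if __name__ == "__main__":
    # 64 channels is an arbitrary choice for illustration.
    for d in (1, 2, 4, 8, 16):
        print(
            f"dilation={d:2d}  span={effective_kernel_size(3, d):2d}  "
            f"padding={same_padding(3, d):2d}  weights={conv_weight_count(3, 64, 64)}"
        )
```

The weight count stays constant while the span and the required padding both grow linearly with the dilation rate, which is the trade-off described above.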
Convolution Factorization
Depth-wise separable convolution, group convolution, and kernel factorization
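To make the savings of these three factorizations concrete, the sketch below counts weights (bias ignored) for each variant. The function names and the 64-channel example are assumptions chosen for illustration, and the asymmetric variant assumes a k×1 convolution followed by a 1×k convolution that keeps the output channel count.

```python
def standard_conv(k: int, c_in: int, c_out: int) -> int:
    """Weights of a standard k x k convolution."""
    return k * k * c_in * c_out


def grouped_conv(k: int, c_in: int, c_out: int, groups: int) -> int:
    """Each output channel only sees c_in / groups input channels."""
    return k * k * (c_in // groups) * c_out


def depthwise_separable_conv(k: int, c_in: int, c_out: int) -> int:
    """A k x k depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    return k * k * c_in + c_in * c_out


def asymmetric_conv(k: int, c_in: int, c_out: int) -> int:
    """A k x 1 convolution (c_in -> c_out) followed by a 1 x k convolution (c_out -> c_out)."""
    return k * c_in * c_out + k * c_out * c_out


if __name__ == "__main__":
    k, c = 3, 64  # illustrative sizes
    print("standard            ", standard_conv(k, c, c))
    print("grouped (groups=4)  ", grouped_conv(k, c, c, groups=4))
    print("depth-wise separable", depthwise_separable_conv(k, c, c))
    print("asymmetric 3x1 + 1x3", asymmetric_conv(k, c, c))
```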
2、Network
2.1、Depth-wise Asymmetric Bottleneck
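Following MobileNetV2, no ReLU is applied after the final 1×1 convolution.
The first 3×3 convolution is a standard convolution. Although a 1×1 convolution would reduce computation, its receptive field is small; using a 3×3 convolution strikes a balance between model depth and receptive field.
Applying dilation to standard convolutions would increase computation, so dilation is applied only to the depth-wise separable convolutions in the branch.
For an implementation, see: DAB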
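The points above can be put together in a rough PyTorch sketch. This is not the authors' code: the halved bottleneck width, the two parallel branches (one undilated, one dilated), the residual addition, and the BatchNorm + ReLU placement are assumptions made for illustration. Only the standard 3×3 entry convolution, the dilation being confined to the depth-wise convolutions, and the activation-free final 1×1 convolution are taken from the notes.

```python
import torch
import torch.nn as nn


class DABModuleSketch(nn.Module):
    """Illustrative bottleneck with asymmetric depth-wise branches (hypothetical layout)."""

    def __init__(self, channels: int, dilation: int = 1):
        super().__init__()
        mid = channels // 2  # assumption: the bottleneck halves the channel count

        # Standard 3x3 convolution at the entry of the bottleneck.
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        # Assumption: one branch without dilation, one branch with dilation.
        self.local_branch = self._asymmetric_depthwise(mid, 1)
        self.context_branch = self._asymmetric_depthwise(mid, dilation)
        # Final 1x1 convolution restores the width; no activation follows it.
        self.expand = nn.Sequential(
            nn.Conv2d(mid, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    @staticmethod
    def _asymmetric_depthwise(ch: int, d: int) -> nn.Sequential:
        # groups=ch makes each convolution depth-wise; 3x1 then 1x3 factorizes a 3x3 kernel.
        return nn.Sequential(
            nn.Conv2d(ch, ch, (3, 1), padding=(d, 0), dilation=(d, 1), groups=ch, bias=False),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (1, 3), padding=(0, d), dilation=(1, d), groups=ch, bias=False),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.reduce(x)
        y = self.local_branch(y) + self.context_branch(y)
        return x + self.expand(y)  # assumption: residual connection around the module


if __name__ == "__main__":
    module = DABModuleSketch(channels=64, dilation=4).eval()
    with torch.no_grad():
        out = module(torch.randn(1, 64, 32, 64))
    print(out.shape)
```

Because the padding equals the dilation rate along the axis being convolved, the spatial size is preserved, so the module can be stacked with different dilation rates.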
2.2、Network Architecture Design
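The first downsample block has the same structure as in ENet: a concatenation of a 3 × 3 convolution with stride 2 and a 2 × 2 max-pooling.
The second downsample block is a 3 × 3 convolution with stride 2.
DABNet downsamples the input only three times (to 1/8 of the original resolution).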
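A minimal PyTorch sketch of the two downsample variants follows. The class name and channel counts are hypothetical, and the input height and width are assumed to be even so that the convolution and pooling outputs have matching sizes.

```python
import torch
import torch.nn as nn


class ConcatDownsampleSketch(nn.Module):
    """ENet-style block: stride-2 3x3 convolution concatenated with 2x2 max-pooling."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # The pooled path keeps in_channels, so the convolution supplies the remainder.
        self.conv = nn.Conv2d(
            in_channels, out_channels - in_channels,
            kernel_size=3, stride=2, padding=1, bias=False,
        )
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.conv(x), self.pool(x)], dim=1)


if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 128)  # illustrative feature map

    first = ConcatDownsampleSketch(32, 64)
    y = first(x)
    print(y.shape)  # half the spatial size, 64 channels

    # The second variant is a plain stride-2 3x3 convolution.
    second = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False)
    print(second(y).shape)
```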
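Borrowing from ESPNetv2, DABNet builds a long-range shortcut connection between the input image and each downsample block.
The input image is first downsampled by average pooling, then concatenated with the output of the convolution that precedes the downsample block, and the result is fed into the downsample block.
For an implementation, see: shortcut connection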
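The shortcut can be sketched as a small helper that pools the image down to the feature-map resolution and concatenates it along the channel axis. The function name is hypothetical, and a single average-pooling call with a kernel equal to the resolution ratio is an assumption; the actual implementation may pool in several steps.

```python
import torch
import torch.nn.functional as F


def concat_downsampled_image(features: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
    """Average-pool `image` to the spatial size of `features`, then concatenate on channels.

    Assumes the image size is an integer multiple of the feature-map size.
    """
    factor = image.shape[-1] // features.shape[-1]
    pooled = F.avg_pool2d(image, kernel_size=factor)  # stride defaults to kernel_size
    return torch.cat([features, pooled], dim=1)


if __name__ == "__main__":
    image = torch.randn(1, 3, 64, 128)      # illustrative RGB input
    features = torch.randn(1, 32, 32, 64)   # output of the preceding convolution
    fused = concat_downsampled_image(features, image)
    print(fused.shape)  # 32 feature channels + 3 image channels
```

The downsample block that consumes this tensor then needs its input channel count increased by the number of image channels.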
3、Experiments