【GSConv】《Slim-neck by GSConv: A better design paradigm of detector architectures for XXX》

Latest recommended article published on 2024-03-12 21:08:48

bryant_meng

Category: CNN / Transformer. Tags: artificial intelligence, deep learning, GSConv, Slim-neck

Article link: https://blog.csdn.net/bryant_meng/article/details/135641682

《Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles》

GitHub: https://github.com/alanli1997/slim-neck-by-gsconv

arXiv-2022

1 Background and Motivation

For object detection on on-board edge computing platforms, a giant model struggles to meet the real-time detection requirement. Depth-wise convolution helps a great deal with lightweight design, but models built on it cannot achieve sufficient accuracy.

[Figure: standard convolution vs. depth-wise convolution]

In depth-wise separable convolution (DSC) the channels are completely independent of one another. In the paper's words, "channel-dense convolutional computation maximally preserves the hidden connections between each channel, but the channel-sparse convolution severs these connections completely."

As a result, DSC has a much lower feature extraction and fusion capability than standard convolution (SC).

Note that what the paper calls DSC is more accurately depth-wise convolution, because depth-wise separable convolution = depth-wise convolution + point-wise convolution.
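
To make the distinction concrete, here is a minimal sketch that counts the weights of a standard convolution, a depth-wise convolution, and a depth-wise separable convolution. The helper `conv_params` and the channel sizes are illustrative assumptions, not taken from the paper; biases are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a 2-D convolution without bias (hypothetical helper)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


# Illustrative sizes only, not values from the paper.
c_in, c_out, k = 64, 128, 3

sc = conv_params(c_in, c_out, k)               # standard (channel-dense) convolution
dw = conv_params(c_in, c_in, k, groups=c_in)   # depth-wise: one k x k filter per channel
pw = conv_params(c_in, c_out, 1)               # point-wise 1x1 convolution
dsc = dw + pw                                  # depth-wise separable = depth-wise + point-wise

print(f"SC: {sc}, depth-wise: {dw}, point-wise: {pw}, separable: {dsc}")
```

The depth-wise layer alone never mixes channels, which is why it is so cheap; the point-wise layer is what restores the cross-channel connections.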

Ways to mitigate this:

Xception and the MobileNet family follow the depth-wise convolution with 1×1 dense convolutions (point-wise convolution)
【Xception】《Xception: Deep Learning with Depthwise Separable Convolutions》
【MobileNet】《MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications》
【MobileNet V2】《MobileNetV2: Inverted Residuals and Linear Bottlenecks》
【MobileNet V3】《Searching for MobileNetV3》
The ShuffleNet family uses channel shuffle
【ShuffleNet】《ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices》
【ShuffleNet V2】《ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design》
GhostNet uses "halved" SC operations
【GhostNet】《GhostNet: More Features from Cheap Operations》

The authors blend the three kinds of approaches above and propose GSConv, which gives an excellent trade-off between the model's accuracy and speed.

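The paper never spells out the full name of GSConv; some read it as Ghost-Shuffle Conv, which seems reasonable.

[Figure]
As the figure shows, GSConv stays closer to SC than DSC does, a good trade-off between speed and accuracy.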

2 Related Work

The three parts of the YOLOv4 / YOLOv5 framework:

backbone
neck
head

3 Advantages / Contributions

Proposes a new form of convolution, GSConv
Builds a new object detection design paradigm on GSConv: GSConv-Slim-Neck detectors
Validates its effectiveness on public datasets

4 Method

A look at the code:

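import torch
import torch.nn as nn

# Conv is assumed to be the repo's YOLOv5-style Conv2d + BatchNorm2d + activation
# wrapper, called as Conv(c1, c2, k, s, p, g, act); define or import it first.

class GSConv(nn.Module):
    def __init__(self, c1, c2, k=1, s=1, g=1, act=True):
        super().__init__()
        c_ = c2 // 2  # each branch produces half of the output channels
        self.cv1 = Conv(c1, c_, k, s, None, g, act)   # dense convolution
        self.cv2 = Conv(c_, c_, 5, 1, None, c_, act)  # 5x5 depth-wise convolution (groups = c_)

    def forward(self, x):
        x1 = self.cv1(x)
        x2 = torch.cat((x1, self.cv2(x1)), 1)  # dense half followed by depth-wise half
        # shuffle: even-indexed channels go to the front, odd-indexed to the back
        b, n, h, w = x2.size()
        b_n = b * n // 2
        y = x2.reshape(b_n, 2, h * w)       # group adjacent channels into pairs
        y = y.permute(1, 0, 2)              # y[0]: first of each pair, y[1]: second
        y = y.reshape(2, -1, n // 2, h, w)
        return torch.cat((y[0], y[1]), 1)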

The shuffle operates on the channel dimension.

e.g. c1~c4 becomes c1 c3 c2 c4
e.g. c1~c6 becomes c1 c3 c5 c2 c4 c6
e.g. c1~c8 becomes c1 c3 c5 c7 c2 c4 c6 c8
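
The reordering in these examples can be reproduced without tensors. The sketch below is a pure-Python stand-in for the reshape/permute/cat sequence in `GSConv.forward`; the function name `gs_shuffle` is hypothetical and only illustrates the channel order, not the actual tensor operation.

```python
def gs_shuffle(channels):
    """Mimic GSConv's channel shuffle on a list of channel labels:
    even-positioned channels first, then odd-positioned ones."""
    assert len(channels) % 2 == 0, "channel count must be even"
    return channels[0::2] + channels[1::2]


for n in (4, 6, 8):
    labels = [f"c{i}" for i in range(1, n + 1)]
    print(" ".join(gs_shuffle(labels)))

# In GSConv the input to the shuffle is [dense half, depth-wise half],
# so the shuffle mixes channels from the two branches.
mixed = gs_shuffle(["sc1", "sc2", "dw1", "dw2"])
print(mixed)
```

With four channels, the dense (`sc*`) and depth-wise (`dw*`) outputs end up interleaved, which is how information from the dense branch gets spread among the depth-wise channels.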

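The channel count should be even; an odd count easily leads to mismatched dimensions after the shuffle. In the code above, an odd c2 gives c_ = c2 // 2, so the output has c2 - 1 channels instead of c2. The figure below shows a simple channel-shuffle example.

[Figure]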

Time complexity of SC, DSC, and GSConv:
[Figure]
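
Since the figure is not reproduced here, the sketch below derives rough multiply-accumulate counts directly from the GSConv code above rather than quoting the paper's formulas. It assumes stride 1, an output feature map of size W×H, square kernels, and ignores BatchNorm and activations; all function names and sizes are hypothetical.

```python
def sc_cost(w, h, k, c1, c2):
    """Standard convolution: every output channel sees every input channel."""
    return w * h * k * k * c1 * c2


def dw_cost(w, h, k, c2):
    """Depth-wise convolution: one k x k filter per channel."""
    return w * h * k * k * c2


def gsconv_cost(w, h, k, c1, c2, k_dw=5):
    """GSConv as coded above: a dense conv to c2 // 2 channels,
    then a 5x5 depth-wise conv on those c2 // 2 channels."""
    half = c2 // 2
    dense = w * h * k * k * c1 * half
    depthwise = w * h * k_dw * k_dw * half
    return dense + depthwise


# Illustrative sizes only, not taken from the paper.
w, h, k, c1, c2 = 40, 40, 3, 256, 256
sc = sc_cost(w, h, k, c1, c2)
dw = dw_cost(w, h, k, c2)
gs = gsconv_cost(w, h, k, c1, c2)
print(f"SC: {sc}, depth-wise: {dw}, GSConv: {gs}, GSConv/SC: {gs / sc:.3f}")
```

Under these assumptions GSConv costs a little more than half of the standard convolution, because the dense branch produces only half of the output channels and the depth-wise branch is comparatively cheap.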

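Several design variants of the slim-neck:
[Figure]

Variant (b) works best.

The assembled network compared with the original structure (the changes apply only to the neck):

[Figure]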

The loss function is EIoU.
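
As a reminder of what EIoU penalizes, here is a minimal pure-Python sketch of the commonly cited formulation: 1 - IoU plus a center-distance term, a width term, and a height term, each normalized by the smallest enclosing box. It assumes boxes in (x1, y1, x2, y2) format; the function name `eiou_loss` is hypothetical, and the implementation in the authors' repository may differ in details.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths, heights, intersection, and union.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between the two box centers.
    dx = (x1 + x2) / 2 - (gx1 + gx2) / 2
    dy = (y1 + y2) / 2 - (gy1 + gy2) / 2

    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)
    return 1.0 - iou + center_term + width_term + height_term


same = eiou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = eiou_loss((0, 0, 2, 2), (1, 1, 3, 3))
print(f"identical boxes: {same:.6f}, shifted box: {shifted:.6f}")
```

Compared with CIoU, which uses a single aspect-ratio term, this formulation penalizes width and height differences separately.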

For the activation function, Swish is faster and Mish is more accurate.
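
For reference, the two activations can be written in a few lines of plain Python using their standard definitions, Swish(x) = x · sigmoid(x) and Mish(x) = x · tanh(softplus(x)). This is a scalar sketch for illustration only, not how the repository implements them.

```python
import math


def swish(x: float) -> float:
    """Swish (SiLU): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def mish(x: float) -> float:
    """Mish: x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x)."""
    return x * math.tanh(math.log1p(math.exp(x)))


for v in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={v:+.1f}  swish={swish(v):+.4f}  mish={mish(v):+.4f}")
```

Mish evaluates an exponential, a logarithm, and a tanh per element, while Swish needs a single exponential, which is consistent with Swish being the faster of the two.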

The authors also add the following tricks:

1) SPP
There are two forms of SPP; the latter is faster.
[Figure]
2) Attention

CA (coordinate attention) works best.
[Figure]
One suggestion is that attention modules are usually placed at the end of the backbone to achieve better results.
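
On the SPP trick: the figure is missing, so the following rests on an assumption, namely that the two forms are the parallel SPP (max pools of size 5, 9, and 13) and the serial SPPF used in YOLOv5 (three chained 5×5 max pools). Under that assumption, the 1-D sketch below shows why the serial form gives the same pooled values while reusing work: chaining stride-1 max pools of size 5 reproduces pools of size 9 and 13. The helper `maxpool1d` is hypothetical.

```python
def maxpool1d(xs, k):
    """Stride-1 max pooling with 'same' padding: windows are clipped at the
    borders, which matches padding with -inf."""
    r = k // 2
    n = len(xs)
    return [max(xs[max(0, i - r): min(n, i + r + 1)]) for i in range(n)]


# Arbitrary test signal. Assumption: SPP = parallel 5/9/13 pools, SPPF = chained 5 pools.
xs = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]

p5 = maxpool1d(xs, 5)        # first pool, shared by both forms
p5x2 = maxpool1d(p5, 5)      # serial: pool the pooled result again
p5x3 = maxpool1d(p5x2, 5)    # and once more

print(p5x2 == maxpool1d(xs, 9))    # two chained 5-pools vs. one 9-pool
print(p5x3 == maxpool1d(xs, 13))   # three chained 5-pools vs. one 13-pool
```

The serial form only ever slides a size-5 window, so each stage is cheaper than running the larger pools from scratch on the original input.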

5 Experiments

5.1 Datasets and Metrics

WiderPerson
PASCAL VOC
SODA10M
DOTA1.0

Evaluation metric: mAP

5.2 Ablation Studies

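(1) Ablation of the convolution methods
[Figure]
(2) Ablation of how the shuffle operation is implemented

The transposition-based implementation is the most efficient, but edge devices do not necessarily support it well,

while the linear fusion approach is a substitute when transposition is not supported on some devices.

(3) Ablation of where the GS bottleneck is inserted into the network
[Figure]
Inserting it in the neck works better.

(4) Comparison of activation functions and loss functions
[Figure]
Mish is more accurate, Swish is faster, and EIoU > CIoU.

Overall results:

[Figure]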

5.3 Comparisons between the Slim-neck detectors and the Originals

[Figure]

It is worth noting, however, that the advantages of GSConv become less obvious as the computing power of the platform grows.

6 Conclusion (own)

Anchor-free detectors are more flexible and leave no extra parameters to be tuned by hand, but this also makes the model less stable.
Why can't GSConv be used throughout the network (at all stages)? The authors' explanation: if GSConv is used at all stages of the model, the network's layers become deeper, and these deep layers aggravate the resistance to data flow and increase the inference time significantly.
The authors use it in the neck, where there is less redundant, repetitive information, no compression is needed, and attention modules work better.