YOLOv5 Improvements (4): Lightweight Model ShuffleNetv2

wisdom_zhe

Last modified 2024-06-01 18:01:33

Category: Object Detection. Tags: YOLO, object detection, artificial intelligence, deep learning

First published 2024-05-30 15:54:20

Article link: https://blog.csdn.net/qq_44231797/article/details/139325501

Contents

1. Preface
2. Abstract
3. ShuffleNetv2 Network Structure
4. ShuffleNetV2
5. ShuffleNetV2_Focus (Tensor version)
6. Ideas for Innovation
7. Object Detection Article Series

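1. Preface

Mobile devices also need small models that are both accurate and fast. To meet this need, lightweight CNNs such as MobileNet and ShuffleNet were proposed, and they strike a good balance between speed and accuracy. This article makes YOLOv5 lighter by introducing ShuffleNetv2. ShuffleNetv2 is the upgraded version of ShuffleNet proposed by Megvii in 2018 and accepted at ECCV 2018. At the same complexity, ShuffleNetv2 is more accurate than ShuffleNet and MobileNetv2.

ShuffleNetv2 paper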

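2. Abstract

A common metric for model complexity today is FLOPs, here meaning the number of multiply-adds. It is only an indirect metric, however, because it is not equivalent to speed. As panels (c) and (d) of Figure 1 show, two models with the same FLOPs can run at different speeds. There are two main reasons for this discrepancy. First, FLOPs is not the only factor that affects speed: memory access cost (MAC) cannot be ignored and may become the bottleneck on GPUs, and the degree of parallelism also matters, since a model with higher parallelism runs faster. Second, the same model runs at different speeds on different platforms, such as GPU and ARM, and the choice of library also has an effect.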
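
To make the gap between FLOPs and memory access concrete, the following sketch compares 1x1 convolutions that have identical FLOPs but different channel ratios. It uses the cost model from the ShuffleNetv2 paper for a 1x1 convolution on an h×w feature map, MAC = hw(c1 + c2) + c1·c2. The function name `conv1x1_cost` and the 56×56 feature map size are chosen here purely for illustration.

```python
def conv1x1_cost(h, w, c_in, c_out):
    """Return (flops, mac) of a 1x1 convolution on an h x w feature map."""
    flops = h * w * c_in * c_out                   # multiply-adds
    mac = h * w * (c_in + c_out) + c_in * c_out    # read input + write output + read weights
    return flops, mac


# Three layers with the same FLOPs but different input:output channel ratios
for c_in, c_out in [(128, 128), (64, 256), (32, 512)]:
    flops, mac = conv1x1_cost(56, 56, c_in, c_out)
    print(f"c_in={c_in:4d} c_out={c_out:4d} flops={flops} mac={mac}")
```

All three layers report the same FLOPs, yet MAC grows as the channel ratio moves away from 1:1, which is one way two models with equal FLOPs can differ in speed.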

[Figure 1: image not available]

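3. ShuffleNetv2 Network Structure

The backbone is ShuffleNetV2. Its main building block, Inverted_residual_unit, is made up of the units shown in panels (c) and (d) of the figure below:
[Figure: ShuffleNet building blocks, panels (a) to (d); image not available]

(a) the basic ShuffleNet-V1 unit; (b) the ShuffleNet-V1 unit for spatial down sampling (2×); (c) the basic ShuffleNet-V2 unit; (d) the ShuffleNet-V2 unit for spatial down sampling (2×).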

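Problems with the ShuffleNetV1 structure:

The ShuffleNetv2 paper proposes four practical guidelines, G1 to G4, for efficient network design, and the v1 unit breaks all of them. As panels (a) and (b) above show, the ShuffleNetv1 unit makes heavy use of 1x1 group convolutions, which violates G2 (excessive group convolution increases MAC). v1 also uses a ResNet-like bottleneck layer whose input and output channel counts differ, which violates G1 (equal input and output channel widths minimize MAC). Using too many groups violates G3 (network fragmentation reduces the degree of parallelism). Finally, the shortcut connections contain many element-wise Add operations, which violates G4 (element-wise operations are not negligible).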

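ShuffleNetV2 improvements:

To fix the shortcomings of v1, v2 introduces a new operation: channel split. At the start of each unit, the input feature map is split along the channel dimension into two branches with $c'$ and $c - c'$ channels; in practice $c' = c/2$. The left branch is an identity mapping. The right branch contains three consecutive convolutions with equal input and output channel counts, which satisfies G1. The two 1x1 convolutions are no longer group convolutions, which satisfies G2; the split itself already produces two groups. The outputs of the two branches are concatenated rather than added element-wise, and a channel shuffle is then applied to the concatenated result so that information flows between the two branches. The concat and channel shuffle can in fact be merged with the channel split of the next unit into a single element-wise operation, which satisfies G4.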
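
The split, concat, and shuffle sequence can be traced on plain channel labels without any tensors. The sketch below is illustrative only: the helper names and the labels `a0` to `b3` are made up here, and the three convolutions of the right branch are replaced by a stand-in that just marks each channel with a prime.

```python
def channel_split(channels):
    """Split a list of channel labels into two equal halves (c' = c/2)."""
    half = len(channels) // 2
    return channels[:half], channels[half:]


def channel_shuffle(channels, groups=2):
    """Interleave the groups: reshape to (groups, c/groups), transpose, flatten."""
    per_group = len(channels) // groups
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


x = ["a0", "a1", "a2", "a3", "b0", "b1", "b2", "b3"]  # 8 input channels
left, right = channel_split(x)                  # left: identity branch, right: conv branch
right = [name + "'" for name in right]          # stand-in for the three convolutions
out = channel_shuffle(left + right, groups=2)   # concat, then shuffle
print(out)  # ['a0', "b0'", 'a1', "b1'", 'a2', "b2'", 'a3', "b3'"]
```

After the shuffle, the first half of the output mixes channels from both branches, so the half that the next unit passes through unchanged already carries information from the convolution branch.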

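The downsampling unit has no channel split. Instead, each branch takes a copy of the full input and downsamples with stride=2. After the two branches are concatenated, the spatial size of the feature map is halved and the number of channels is doubled.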
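
The shape arithmetic for both kinds of unit can be checked with a few lines of plain Python. This is a sketch under the assumption that every spatial convolution is 3x3 with padding 1, as in the implementation in section 4; `conv_out_size` and `unit_output_shape` are hypothetical helper names introduced only for this example.

```python
def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def unit_output_shape(c, h, w, oup, stride):
    """Output (channels, height, width) of a ShuffleNetv2 unit."""
    branch = oup // 2
    if stride == 1:
        # basic unit: the input is split in half, so the channel count is preserved
        assert c == 2 * branch, "a stride-1 unit requires inp == oup"
    h_out = conv_out_size(h, stride=stride)
    w_out = conv_out_size(w, stride=stride)
    # in both cases two branches of `branch` channels each are concatenated
    return 2 * branch, h_out, w_out


print(unit_output_shape(64, 80, 80, oup=128, stride=2))   # downsampling unit -> (128, 40, 40)
print(unit_output_shape(128, 40, 40, oup=128, stride=1))  # basic unit        -> (128, 40, 40)
```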

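Under the same conditions, ShuffleNetv2 is slightly faster than other models and also slightly more accurate. The authors also designed larger ShuffleNetv2 networks, which remain competitive with ResNet architectures.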

4. ShuffleNetV2

(1) Add the ShuffleNetV2 class to models/common.py

class Conv_maxpool(nn.Module):
    """Stem: 3x3 stride-2 conv + BN + ReLU, then a 3x3 stride-2 max pool (4x downsampling overall)."""

    def __init__(self, c1, c2):  # ch_in, ch_out
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c1, c2, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(c2),
            nn.ReLU(inplace=True),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)

    def forward(self, x):
        return self.maxpool(self.conv(x))

class ShuffleNetV2(nn.Module):
    def __init__(self, inp, oup, stride):  # ch_in, ch_out, stride
        super().__init__()

        self.stride = stride

        branch_features = oup // 2
        # A stride-1 unit splits the input channels in half, so inp must equal 2 * branch_features
        assert (self.stride != 1) or (inp == branch_features << 1)

        if self.stride == 2:
            # copy input
            self.branch1 = nn.Sequential(
                nn.Conv2d(inp, inp, kernel_size=3, stride=self.stride, padding=1, groups=inp),
                nn.BatchNorm2d(inp),
                nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(branch_features),
                nn.ReLU(inplace=True))
        else:
            self.branch1 = nn.Sequential()

        self.branch2 = nn.Sequential(
            nn.Conv2d(inp if (self.stride == 2) else branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),

            nn.Conv2d(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1, groups=branch_features),
            nn.BatchNorm2d(branch_features),

            nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
        )

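    def forward(self, x):
        if self.stride == 1:
            # channel split: one half passes through unchanged, the other goes through branch2
            x1, x2 = x.chunk(2, dim=1)
            out = torch.cat((x1, self.branch2(x2)), dim=1)
        else:
            # downsampling unit: both branches take the full input
            out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)

        # mix the channels of the two branches
        out = self.channel_shuffle(out, 2)

        return out

    def channel_shuffle(self, x, groups):
        N, C, H, W = x.size()
        # reshape to (N, groups, C // groups, H, W), swap the two channel axes, then flatten back
        out = x.view(N, groups, C // groups, H, W).permute(0, 2, 1, 3, 4).contiguous().view(N, C, H, W)

        return out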

(2) In the parse_model function of models/yolo.py, add the two modules Conv_maxpool and ShuffleNetV2

if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP,C3, Conv_maxpool,ShuffleNetV2]:

(3) Build the yolov5-shufflenetv2.yaml network model

# Parameters
nc: 6  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.5 # layer channel multiple
# anchors
anchors:
  - [ 10,13, 16,30, 33,23 ]  # P3/8
  - [ 30,61, 62,45, 59,119 ]  # P4/16
  - [ 116,90, 156,198, 373,326 ]  # P5/32

  # YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  # ShuffleNetV2: [out, stride]
  [ [ -1, 1, Conv_maxpool, [ 32 ] ], # 0-P2/4        32*320*320
    [ -1, 1, ShuffleNetV2, [ 128, 2 ] ],  # 1-P3/8   128*160*160
    [ -1, 3, ShuffleNetV2, [ 128, 1 ] ],  # 2        128*160*160
    [ -1, 1, ShuffleNetV2, [ 256, 2 ] ],  # 3-P4/16  256*80*80
    [ -1, 7, ShuffleNetV2, [ 256, 1 ] ],  # 4        256*80*80
    [ -1, 1, ShuffleNetV2, [ 512, 2 ] ],  # 5-P5/32  512*40*40
    [ -1, 3, ShuffleNetV2, [ 512, 1 ] ],  # 6        512*40*40
  ]

# YOLOv5 v6.0 head
head:
  [ [ -1, 1, Conv, [ 256, 1, 1 ] ],  # 256*40*40
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ], # 256*80*80
    [ [ -1, 4 ], 1, Concat, [ 1 ] ],  # cat backbone P4 # 512*80*80
    [ -1, 1, C3, [ 256, False ] ],  # 10                  256*80*80

    [ -1, 1, Conv, [ 128, 1, 1 ] ], # 128*80*80
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],  #    128*160*160
    [ [ -1, 2 ], 1, Concat, [ 1 ] ],  # cat backbone P3   256*160*160
    [ -1, 1, C3, [ 128, False ] ],  # 14 (P3/8-small)     128*160*160

    [ -1, 1, Conv, [ 128, 3, 2 ] ], #                     128*80*80
    [ [ -1, 11 ], 1, Concat, [ 1 ] ],  # cat head P4      256*80*80
    [ -1, 1, C3, [ 256, False ] ],  # 17 (P4/16-medium)   256*80*80

    [ -1, 1, Conv, [ 256, 3, 2 ] ],#                      256*40*40
    [ [ -1, 7 ], 1, Concat, [ 1 ] ],  # cat head P5       512*40*40
    [ -1, 1, C3, [ 512, False ] ],  # 20 (P5/32-large)    512*40*40

    [ [ 14, 17, 20 ], 1, Detect, [ nc, anchors ] ],  # Detect(P3, P4, P5)
  ]
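
The channel counts in the YAML comments are nominal values, before `width_multiple` and `depth_multiple` are applied. The sketch below estimates the effective repeat count and output channels of each backbone layer. It assumes the stock YOLOv5 `parse_model` behaviour (repeat counts greater than 1 are scaled by `depth_multiple` and rounded, and output channels are scaled by `width_multiple` and rounded up to a multiple of 8); `scale_layer` is a hypothetical helper written for this example, not part of YOLOv5.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_layer(number, out_channels, depth_multiple, width_multiple):
    """Return the (repeats, channels) that would actually be built for one YAML row."""
    repeats = max(round(number * depth_multiple), 1) if number > 1 else number
    channels = make_divisible(out_channels * width_multiple, 8)
    return repeats, channels


# (number, nominal out channels) for backbone layers 0 to 6 of the YAML above
backbone = [(1, 32), (1, 128), (3, 128), (1, 256), (7, 256), (1, 512), (3, 512)]
for number, out_channels in backbone:
    print(scale_layer(number, out_channels, depth_multiple=0.33, width_multiple=0.5))
```

With `depth_multiple: 0.33` and `width_multiple: 0.5`, the rows with 3 repeats shrink to 1, the row with 7 repeats shrinks to 2, and every channel count is halved.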

5. ShuffleNetV2_Focus (Tensor version)

(1) Add the ShuffleNetV2_Focus class to models/common.py

# Import at the top of common.py
from torch import Tensor
from typing import Callable, Any, List

# Add ShuffleNetV2_Focus
def channel_shuffle(x: Tensor, groups: int) -> Tensor:
    batchsize, num_channels, height, width = x.size()
    channels_per_group = num_channels // groups

    # reshape
    x = x.view(batchsize, groups,
               channels_per_group, height, width)

    x = torch.transpose(x, 1, 2).contiguous()

    # flatten
    x = x.view(batchsize, -1, height, width)

    return x


class ShuffleNetV2_Focus(nn.Module):
    def __init__(
        self,
        inp: int,
        oup: int,
        stride: int
    ) -> None:
        super(ShuffleNetV2_Focus, self).__init__()

        if not (1 <= stride <= 3):
            raise ValueError('illegal stride value')
        self.stride = stride

        branch_features = oup // 2
        assert (self.stride != 1) or (inp == branch_features << 1)

        if self.stride > 1:
            self.branch1 = nn.Sequential(
                self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1),
                nn.BatchNorm2d(inp),
                nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(branch_features),
                nn.ReLU(inplace=True),
            )
        else:
            self.branch1 = nn.Sequential()

        self.branch2 = nn.Sequential(
            nn.Conv2d(inp if (self.stride > 1) else branch_features,
                      branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
            self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1),
            nn.BatchNorm2d(branch_features),
            nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(branch_features),
            nn.ReLU(inplace=True),
        )

    @staticmethod
    def depthwise_conv(
        i: int,
        o: int,
        kernel_size: int,
        stride: int = 1,
        padding: int = 0,
        bias: bool = False
    ) -> nn.Conv2d:
        return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i)

    def forward(self, x: Tensor) -> Tensor:
        if self.stride == 1:
            x1, x2 = x.chunk(2, dim=1)
            out = torch.cat((x1, self.branch2(x2)), dim=1)
        else:
            out = torch.cat((self.branch1(x), self.branch2(x)), dim=1)

        out = channel_shuffle(out, 2)

        return out
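
Section 4 implements channel shuffle with `view` and `permute`, while this section uses `view` and `torch.transpose`. The following sketch checks on a small tensor that the two formulations produce the same result. The function names `shuffle_permute` and `shuffle_transpose` are introduced here only to tell the two variants apart.

```python
import torch


def shuffle_permute(x, groups):
    """Channel shuffle as written in section 4 (view + permute)."""
    n, c, h, w = x.size()
    return x.view(n, groups, c // groups, h, w).permute(0, 2, 1, 3, 4).contiguous().view(n, c, h, w)


def shuffle_transpose(x, groups):
    """Channel shuffle as written in section 5 (view + transpose)."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = torch.transpose(x, 1, 2).contiguous()
    return x.view(n, -1, h, w)


# Integer tensor of shape (2, 8, 3, 3); in the first sample, channel k starts at value 9 * k
x = torch.arange(2 * 8 * 3 * 3).view(2, 8, 3, 3)
a = shuffle_permute(x, 2)
b = shuffle_transpose(x, 2)
same = torch.equal(a, b)
order = (a[0, :, 0, 0] // 9).tolist()  # original channel index at each output position
print(same, order)  # True [0, 4, 1, 5, 2, 6, 3, 7]
```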

(2) In the parse_model function of models/yolo.py, add the ShuffleNetV2_Focus module

if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP,C3, ShuffleNetV2_Focus]:

(3) Build the yolov5-shufflenetv2-focus.yaml network model

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 0.5  # layer channel multiple

# anchors
anchors:
  - [4,5,  8,10,  13,16]  # P3/8
  - [23,29,  43,55,  73,105]  # P4/16
  - [146,217,  231,300,  335,433]  # P5/32

# Custom backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],    # 0-P2/4
   [-1, 1, ShuffleNetV2_Focus, [128, 2]], # 1-P3/8
   [-1, 3, ShuffleNetV2_Focus, [128, 1]], # 2
   [-1, 1, ShuffleNetV2_Focus, [256, 2]], # 3-P4/16
   [-1, 7, ShuffleNetV2_Focus, [256, 1]], # 4
   [-1, 1, ShuffleNetV2_Focus, [512, 2]], # 5-P5/32
   [-1, 3, ShuffleNetV2_Focus, [512, 1]], # 6
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, C3, [128, False]],  # 10

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, C3, [128, False]],  # 14 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P4
   [-1, 1, C3, [128, False]],  # 17 (P4/16-medium)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 7], 1, Concat, [1]],  # cat head P5
   [-1, 1, C3, [128, False]],  # 20 (P5/32-large)

   [[14, 17, 20], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]