Understanding the YOLOv7 Network Structure

Below is the YAML configuration of the YOLOv7 network, with the output shape of each layer added as a comment (N is the batch size; the input is [N, 3, 640, 640]).

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]           [N,3,640,640]
  [[-1, 1, Conv, [32, 3, 1]],  # 0         [N,32,640,640]
  
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2     [N,64,320,320]
   [-1, 1, Conv, [64, 3, 1]],              #[N,64,320,320]
   
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4    [N,128,160,160]
   [-1, 1, Conv, [64, 1, 1]],  # -6         [N,64,160,160]
   [-2, 1, Conv, [64, 1, 1]],  # -5         [N,64,160,160]
   [-1, 1, Conv, [64, 3, 1]],              #[N,64,160,160]
   [-1, 1, Conv, [64, 3, 1]],  # -3        #[N,64,160,160]
   [-1, 1, Conv, [64, 3, 1]],              #[N,64,160,160]
   [-1, 1, Conv, [64, 3, 1]],  # -1        #[N,64,160,160]
   [[-1, -3, -5, -6], 1, Concat, [1]],     #[N,256,160,160]
   [-1, 1, Conv, [256, 1, 1]],  # 11       #[N,256,160,160]
         
   [-1, 1, MP, []],                        #[N,256,80,80]
   [-1, 1, Conv, [128, 1, 1]],             #[N,128,80,80]
   [-3, 1, Conv, [128, 1, 1]],             #[N,128,160,160]
   [-1, 1, Conv, [128, 3, 2]],             #[N,128,80,80]
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8  #[N,256,80,80]
   [-1, 1, Conv, [128, 1, 1]],             #[N,128,80,80]
   [-2, 1, Conv, [128, 1, 1]],             #[N,128,80,80]
   [-1, 1, Conv, [128, 3, 1]],             #[N,128,80,80]
   [-1, 1, Conv, [128, 3, 1]],             #[N,128,80,80]
   [-1, 1, Conv, [128, 3, 1]],             #[N,128,80,80]
   [-1, 1, Conv, [128, 3, 1]],             #[N,128,80,80]
   [[-1, -3, -5, -6], 1, Concat, [1]],     #[N,512,80,80]
   [-1, 1, Conv, [512, 1, 1]],  # 24       #[N,512,80,80]
         
   [-1, 1, MP, []],                        #[N,512,40,40]
   [-1, 1, Conv, [256, 1, 1]],             #[N,256,40,40]
   [-3, 1, Conv, [256, 1, 1]],             #[N,256,80,80]
   [-1, 1, Conv, [256, 3, 2]],             #[N,256,40,40]
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16 #[N,512,40,40]
   [-1, 1, Conv, [256, 1, 1]],             #[N,256,40,40]
   [-2, 1, Conv, [256, 1, 1]],             #[N,256,40,40]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,40,40]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,40,40]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,40,40]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,40,40]
   [[-1, -3, -5, -6], 1, Concat, [1]],     #[N,1024,40,40]
   [-1, 1, Conv, [1024, 1, 1]],  # 37      #[N,1024,40,40]
         
   [-1, 1, MP, []],                        #[N,1024,20,20]
   [-1, 1, Conv, [512, 1, 1]],             #[N,512,20,20]
   [-3, 1, Conv, [512, 1, 1]],             #[N,512,40,40]
   [-1, 1, Conv, [512, 3, 2]],             #[N,512,20,20]
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32 #[N,1024,20,20]
   [-1, 1, Conv, [256, 1, 1]],             #[N,256,20,20]
   [-2, 1, Conv, [256, 1, 1]],             #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],             #[N,256,20,20]
   [[-1, -3, -5, -6], 1, Concat, [1]],     #[N,1024,20,20]
   [-1, 1, Conv, [1024, 1, 1]],  # 50      #[N,1024,20,20]
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]], # 51                   #[N,512,20,20]
  
   [-1, 1, Conv, [256, 1, 1]],                     #[N,256,20,20]
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],     #[N,256,40,40]
   [37, 1, Conv, [256, 1, 1]], # route backbone P4 #[N,256,40,40]
   [[-1, -2], 1, Concat, [1]],                     #[N,512,40,40]
   
   [-1, 1, Conv, [256, 1, 1]],                     #[N,256,40,40]
   [-2, 1, Conv, [256, 1, 1]],                     #[N,256,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],     #[N,1024,40,40]
   [-1, 1, Conv, [256, 1, 1]], # 63                #[N,256,40,40]
   
   [-1, 1, Conv, [128, 1, 1]],                     #[N,128,40,40]
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],     #[N,128,80,80]
   [24, 1, Conv, [128, 1, 1]], # route backbone P3 #[N,128,80,80]
   [[-1, -2], 1, Concat, [1]],                     #[N,256,80,80]
   
   [-1, 1, Conv, [128, 1, 1]],                     #[N,128,80,80]
   [-2, 1, Conv, [128, 1, 1]],                     #[N,128,80,80]
   [-1, 1, Conv, [64, 3, 1]],                      #[N,64,80,80]
   [-1, 1, Conv, [64, 3, 1]],                      #[N,64,80,80]
   [-1, 1, Conv, [64, 3, 1]],                      #[N,64,80,80]
   [-1, 1, Conv, [64, 3, 1]],                      #[N,64,80,80]
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],     #[N,512,80,80]
   [-1, 1, Conv, [128, 1, 1]], # 75                #[N,128,80,80]
      
   [-1, 1, MP, []],                                #[N,128,40,40]
   [-1, 1, Conv, [128, 1, 1]],                     #[N,128,40,40]
   [-3, 1, Conv, [128, 1, 1]],                     #[N,128,80,80]
   [-1, 1, Conv, [128, 3, 2]],                     #[N,128,40,40]
   [[-1, -3, 63], 1, Concat, [1]],                 #[N,512,40,40]
   
   [-1, 1, Conv, [256, 1, 1]],                     #[N,256,40,40]
   [-2, 1, Conv, [256, 1, 1]],                     #[N,256,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [-1, 1, Conv, [128, 3, 1]],                     #[N,128,40,40]
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],     #[N,1024,40,40]
   [-1, 1, Conv, [256, 1, 1]], # 88                #[N,256,40,40]
      
   [-1, 1, MP, []],                                #[N,256,20,20]
   [-1, 1, Conv, [256, 1, 1]],                     #[N,256,20,20]
   [-3, 1, Conv, [256, 1, 1]],                     #[N,256,40,40]
   [-1, 1, Conv, [256, 3, 2]],                     #[N,256,20,20]
   [[-1, -3, 51], 1, Concat, [1]],                 #[N,1024,20,20]
   
   [-1, 1, Conv, [512, 1, 1]],                     #[N,512,20,20]
   [-2, 1, Conv, [512, 1, 1]],                     #[N,512,20,20]
   [-1, 1, Conv, [256, 3, 1]],                     #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],                     #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],                     #[N,256,20,20]
   [-1, 1, Conv, [256, 3, 1]],                     #[N,256,20,20]
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],     #[N,2048,20,20]
   [-1, 1, Conv, [512, 1, 1]], # 101               #[N,512,20,20]
   
   [75, 1, RepConv, [256, 3, 1]],                  #[N,256,80,80]
   [88, 1, RepConv, [512, 3, 1]],                  #[N,512,40,40]
   [101, 1, RepConv, [1024, 3, 1]],                #[N,1024,20,20]

   [[102,103,104], 1, IDetect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]
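As a quick sanity check, the configuration can also be read programmatically. The following is a minimal sketch, assuming the YAML above has been saved locally as yolov7.yaml (a hypothetical file name); in the official repository, models/yolo.py parses the same [from, number, module, args] rows to build the model.

import yaml

# assumed local copy of the configuration listed above
with open("yolov7.yaml", "r") as fh:
    cfg = yaml.safe_load(fh)

print(cfg["nc"], cfg["depth_multiple"], cfg["width_multiple"])
print(cfg["anchors"])  # three anchor sets, one per detection scale (P3/8, P4/16, P5/32)

# Each row is [from, number, module, args]; 'from' is -1 for the previous layer,
# a non-negative index for an absolute layer, or a list of indices for Concat / IDetect.
for i, (frm, n, module, args) in enumerate(cfg["backbone"] + cfg["head"]):
    print(i, frm, n, module, args)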

[Figure: overall YOLOv7 network structure diagram]

Walkthrough of the network structure, module by module

1. Basic module

The basic building blocks are 1x1 and 3x3 convolutions. The 3x3 convolutions differ mainly in their stride (1 for feature extraction, 2 for downsampling), and every convolution ends with a SiLU activation.
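For reference, a minimal sketch of such a basic block (convolution, then BatchNorm, then SiLU, with 'same' padding) is given below. It mirrors the Conv module in the repository's models/common.py, but treat the repository code as authoritative.

import torch.nn as nn

def autopad(k, p=None):
    # 'same' padding for an odd kernel size when p is not given
    return k // 2 if p is None else p

class Conv(nn.Module):
    # 1x1 or 3x3 convolution -> BatchNorm -> SiLU
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))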

2. Multi-branch convolution module

[-1, 1, Conv, [256, 1, 1]],                     #[4,256,40,40]
[-2, 1, Conv, [256, 1, 1]],                     #[4,256,40,40]
[-1, 1, Conv, [128, 3, 1]],                     #[4,128,40,40]
[-1, 1, Conv, [128, 3, 1]],                     #[4,128,40,40]
[-1, 1, Conv, [128, 3, 1]],                     #[4,128,40,40]
[-1, 1, Conv, [128, 3, 1]],                     #[4,128,40,40]
[[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],     #[4,1024,40,40]
[-1, 1, Conv, [256, 1, 1]], # 63                #[4,256,40,40]

or

[-1, 1, Conv, [64, 1, 1]],  # -6         [4,64,160,160]
[-2, 1, Conv, [64, 1, 1]],  # -5         [4,64,160,160]
[-1, 1, Conv, [64, 3, 1]],              #[4,64,160,160]
[-1, 1, Conv, [64, 3, 1]],  # -3        #[4,64,160,160]
[-1, 1, Conv, [64, 3, 1]],              #[4,64,160,160]
[-1, 1, Conv, [64, 3, 1]],  # -1        #[4,64,160,160]
[[-1, -3, -5, -6], 1, Concat, [1]],     #[4,256,160,160]
[-1, 1, Conv, [256, 1, 1]],  # 11       #[4,256,160,160]

[Figure: module diagrams corresponding to the two configurations above]

This module stacks a number of 1x1 and 3x3 convolutions. The output of each convolution feeds the next one, and selected intermediate outputs are also concatenated at the end of the block. This structure improves the network's accuracy, but it also adds some inference time.
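Written out as a standalone module, the second configuration above corresponds roughly to the sketch below. The class name MultiBranchBlock is made up for illustration; it reuses the Conv sketch from section 1, and the concatenation order follows the [[-1, -3, -5, -6], 1, Concat, [1]] row in the YAML.

import torch
import torch.nn as nn

class MultiBranchBlock(nn.Module):
    # hypothetical name; channels match backbone layers 4-11 ([N,128,160,160] -> [N,256,160,160])
    def __init__(self, c1=128, c_=64, c2=256):
        super().__init__()
        self.cv1 = Conv(c1, c_, 1, 1)   # branch taken directly from the block input (layer 4)
        self.cv2 = Conv(c1, c_, 1, 1)   # start of the serial 3x3 chain (layer 5)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 3, 1)   # tapped for the concat (layer 7)
        self.cv5 = Conv(c_, c_, 3, 1)
        self.cv6 = Conv(c_, c_, 3, 1)   # tapped for the concat (layer 9)
        self.out = Conv(4 * c_, c2, 1, 1)

    def forward(self, x):
        y1 = self.cv1(x)
        y2 = self.cv2(x)
        y3 = self.cv4(self.cv3(y2))
        y4 = self.cv6(self.cv5(y3))
        # concat order follows [-1, -3, -5, -6] in the YAML
        return self.out(torch.cat([y4, y3, y2, y1], dim=1))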

3. MP module

[-1, 1, MP, []],                        #[4,256,80,80]
[-1, 1, Conv, [128, 1, 1]],             #[4,128,80,80]
[-3, 1, Conv, [128, 1, 1]],             #[4,128,160,160]
[-1, 1, Conv, [128, 3, 2]],             #[4,128,80,80]

The code for MP is as follows:

import torch.nn as nn

class MP(nn.Module):
    # max pooling with kernel_size == stride (default 2): halves H and W, keeps channels unchanged
    def __init__(self, k=2):
        super(MP, self).__init__()
        self.m = nn.MaxPool2d(kernel_size=k, stride=k)

    def forward(self, x):
        return self.m(x)

[Figure: MP module structure]

The module is built around a MaxPool2d with kernel_size and stride both set to 2. One branch applies this MaxPool2d followed by a 1x1 convolution, while the other applies a 1x1 convolution followed by a 3x3 convolution with stride 2; the outputs of the two branches are then concatenated to form the module.
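Put together, this downsampling transition can be sketched as the module below. MPDown is a hypothetical name; MP is the class above, Conv is the sketch from section 1, and the channel numbers match the snippet above ([N,256,160,160] -> [N,256,80,80]).

import torch
import torch.nn as nn

class MPDown(nn.Module):
    def __init__(self, c1=256, c_=128):
        super().__init__()
        self.mp = MP()                 # MaxPool2d(2, 2): halves H and W
        self.cv1 = Conv(c1, c_, 1, 1)  # branch 1: pool, then 1x1 conv
        self.cv2 = Conv(c1, c_, 1, 1)  # branch 2: 1x1 conv ...
        self.cv3 = Conv(c_, c_, 3, 2)  # ... followed by a stride-2 3x3 conv

    def forward(self, x):
        y1 = self.cv1(self.mp(x))
        y2 = self.cv3(self.cv2(x))
        # [[-1, -3], Concat]: the conv branch comes first, the pooled branch second
        return torch.cat([y2, y1], dim=1)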

4. SPPCSPC module

This module combines spatial pyramid pooling (SPP) with the CSP structure, and it too uses several parallel branches. The code is as follows:

import torch
import torch.nn as nn

class SPPCSPC(nn.Module):
    # CSP-style SPP block, see https://github.com/WongKinYiu/CrossStagePartialNetworks
    # (Conv is the basic conv + BN + SiLU module from section 1 / models/common.py)
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1)
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)

    def forward(self, x):
        # main branch: 1x1 -> 3x3 -> 1x1, then max pooling at three scales (5, 9, 13), then 1x1 -> 3x3
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        # CSP shortcut branch: a single 1x1 convolution on the input
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))

[Figure: SPPCSPC module structure]

In the diagram, the MP blocks are the pyramid pooling (MaxPool2d layers with kernel sizes 5, 9 and 13), and at the end the outputs y1 and y2 of the two branches are concatenated.
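A quick shape check (assuming the Conv module and the SPPCSPC class above are importable) confirms the [N,512,20,20] output annotated at layer 51 of the head:

import torch

m = SPPCSPC(1024, 512).eval()     # e=0.5 gives hidden channels c_ = 512
x = torch.randn(1, 1024, 20, 20)  # output of backbone layer 50
with torch.no_grad():
    print(m(x).shape)             # torch.Size([1, 512, 20, 20])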

5. RepConv module

The code is as follows:

import torch.nn as nn

class RepConv(nn.Module):
    # Re-parameterized convolution (RepVGG style)
    # https://arxiv.org/abs/2101.03697
    # autopad is the 'same'-padding helper also used by Conv (see section 1)

    def __init__(self, c1, c2, k=3, s=1, p=None, g=1, act=True, deploy=False):
        super(RepConv, self).__init__()

        self.deploy = deploy
        self.groups = g
        self.in_channels = c1
        self.out_channels = c2

        assert k == 3
        assert autopad(k, p) == 1

        padding_11 = autopad(k, p) - k // 2

        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

        if deploy:
            # deployment mode: a single fused 3x3 convolution
            self.rbr_reparam = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=True)

        else:
            # training mode: identity (BN only), 3x3 conv + BN, and 1x1 conv + BN branches
            self.rbr_identity = (nn.BatchNorm2d(num_features=c1) if c2 == c1 and s == 1 else None)

            self.rbr_dense = nn.Sequential(
                nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False),
                nn.BatchNorm2d(num_features=c2),
            )

            self.rbr_1x1 = nn.Sequential(
                nn.Conv2d(c1, c2, 1, s, padding_11, groups=g, bias=False),
                nn.BatchNorm2d(num_features=c2),
            )

    def forward(self, inputs):
        if hasattr(self, "rbr_reparam"):
            return self.act(self.rbr_reparam(inputs))

        if self.rbr_identity is None:
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)

        return self.act(self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out)

[Figure: RepConv module structure]

A more detailed explanation of this module can be found in the BACKBONE section.
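The point of RepConv is that, after training, the three branches (3x3 conv + BN, 1x1 conv + BN, and the identity BN) can be folded into a single 3x3 convolution for deployment; the SiLU activation sits after the sum, so it is unaffected. The repository ships its own fusion routine; the following is only a self-contained sketch of the underlying arithmetic, assuming stride 1, groups=1 and equal input/output channels (all names here are illustrative).

import torch
import torch.nn as nn
import torch.nn.functional as F

c = 8  # toy channel count

# Training-time branches, mirroring RepConv: 3x3 conv + BN, 1x1 conv + BN, identity BN
dense = nn.Sequential(nn.Conv2d(c, c, 3, 1, 1, bias=False), nn.BatchNorm2d(c))
one_by_one = nn.Sequential(nn.Conv2d(c, c, 1, 1, 0, bias=False), nn.BatchNorm2d(c))
identity = nn.BatchNorm2d(c)

def fuse_conv_bn(w, bn):
    # Fold BN into the preceding conv: w' = w * gamma / sqrt(var + eps),
    # b' = beta - mean * gamma / sqrt(var + eps)
    scale = bn.weight / (bn.running_var + bn.eps).sqrt()
    return w * scale.reshape(-1, 1, 1, 1), bn.bias - bn.running_mean * scale

for m in (dense, one_by_one, identity):
    m.eval()  # BN must use its running statistics for the fusion to be exact

fused = nn.Conv2d(c, c, 3, 1, 1, bias=True)
with torch.no_grad():
    w3, b3 = fuse_conv_bn(dense[0].weight, dense[1])
    w1, b1 = fuse_conv_bn(one_by_one[0].weight, one_by_one[1])
    w1 = F.pad(w1, [1, 1, 1, 1])                       # put the 1x1 kernel at the centre of a 3x3 kernel
    wid = torch.zeros(c, c, 3, 3)
    wid[torch.arange(c), torch.arange(c), 1, 1] = 1.0  # identity mapping written as a 3x3 kernel
    wid, bid = fuse_conv_bn(wid, identity)
    fused.weight.copy_(w3 + w1 + wid)
    fused.bias.copy_(b3 + b1 + bid)

x = torch.randn(1, c, 20, 20)
with torch.no_grad():
    y_branches = dense(x) + one_by_one(x) + identity(x)
    y_fused = fused(x)
print(torch.allclose(y_branches, y_fused, atol=1e-5))  # True: the three branches collapse into one 3x3 conv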

Finally, if anything here is misunderstood, corrections are very welcome.
