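A Detailed Walkthrough of YOLOv7's yolov7.yaml

This post maps the YOLOv7 architecture diagram onto yolov7.yaml row by row, so that each block in the diagram can be matched to its rows in the file and the whole network is easier to follow.

The annotated file follows.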

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]    input: 640*640*3
  [[-1, 1, Conv, [32, 3, 1]],  # 0    layer 0, 640*640*32
  
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2   320*320*64
   [-1, 1, Conv, [64, 3, 1]],
   
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4    160*160*128

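#  This whole block is an ELAN. H and W are unchanged. The Concat on the second-to-last row takes the channel count from C to 2C, and the final output channel count is set by the Conv on the last row, which is 2C here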
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],    # 160*160*256   Concat joins its inputs along the channel dimension
   [-1, 1, Conv, [256, 1, 1]],  # 11    160*160*256

#  This block is MP1: H and W are halved and C is unchanged
   [-1, 1, MP, []],   # 80*80*256   MP is max pooling (MaxPooling): H and W are halved, channels unchanged
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8   80*80*256

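#   ELAN: H and W are unchanged. The Concat on the second-to-last row takes the channel count from C to 2C, and the output channel count is set by the Conv on the last row, 2C here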
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],    # 80*80*512
   [-1, 1, Conv, [512, 1, 1]],  # 24    80*80*512

#   MP1 + ELAN
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16    40*40*512
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],    # 40*40*1024
   [-1, 1, Conv, [1024, 1, 1]],  # 37   40*40*1024

#   MP1 + ELAN  Note that this ELAN differs from the earlier ones: its output channel count stays the same as its input (1024) instead of doubling
   [-1, 1, MP, []],
   [-1, 1, Conv, [512, 1, 1]],
   [-3, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32    20*20*1024
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],    # 20*20*1024
   [-1, 1, Conv, [1024, 1, 1]],  # 50   20*20*1024
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]], # 51    20*20*512   only the channel count changes in the final output

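#  UP+CONCAT: the first two rows form the UP module, the third row applies a convolution to the output of the second MP1+ELAN group (# 37), and the last row concatenates the two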
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],    # 40*40*256   upsampling: H and W are doubled, channels unchanged
   [37, 1, Conv, [256, 1, 1]], # route backbone P4
   [[-1, -2], 1, Concat, [1]],    # 40*40*512

#   ELAN-H
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],    # 40*40*1024
   [-1, 1, Conv, [256, 1, 1]], # 63   40*40*256

#  UP+CONCAT
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],    # 80*80*128
   [24, 1, Conv, [128, 1, 1]], # route backbone P3
   [[-1, -2], 1, Concat, [1]],    # 80*80*256

#   ELAN-H
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],    # 80*80*512
   [-1, 1, Conv, [128, 1, 1]], # 75   80*80*128

#   MP2
   [-1, 1, MP, []],   # 40*40*128
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],    # 40*40*128
   [[-1, -3, 63], 1, Concat, [1]],    # 40*40*512

#   ELAN-H
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],    # 40*40*1024
   [-1, 1, Conv, [256, 1, 1]], # 88   40*40*256

#  MP2
   [-1, 1, MP, []],   # 20*20*256
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],    # 20*20*256
   [[-1, -3, 51], 1, Concat, [1]],    # 20*20*1024

#   ELAN-H
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],    # 20*20*2048
   [-1, 1, Conv, [512, 1, 1]], # 101    20*20*512

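#   RepConv is a re-parameterized convolution: its branches are re-parameterized into a single 3*3 convolution to speed up inference. H and W are unchanged; see the paper for details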
   [75, 1, RepConv, [256, 3, 1]],
   [88, 1, RepConv, [512, 3, 1]],
   [101, 1, RepConv, [1024, 3, 1]],

   [[102,103,104], 1, IDetect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]
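To make the `from` column concrete, here is a minimal sketch of a shape tracer applied to rows 0 to 16 of the backbone above. It is not the parser from the YOLOv7 repository: `trace` and the string module names are hypothetical, and it assumes "same" padding and even feature-map sizes, so a stride-s Conv divides H and W by s. It only does the bookkeeping: negative `from` values are relative to the current row, non-negative values are absolute layer indices, and Concat sums channels.

```python
# Minimal shape tracer for rows in the [from, number, module, args] format.
# Only the channel/stride bookkeeping is modelled; no real layers are built.

def trace(rows, in_shape=(640, 640, 3)):
    """Return the (H, W, C) output shape of every row, in order."""
    shapes = []
    for i, (frm, _number, module, args) in enumerate(rows):
        sources = frm if isinstance(frm, list) else [frm]
        inputs = []
        for f in sources:
            # Negative indices are relative to the current row;
            # -1 on row 0 refers to the input image.
            idx = i + f if f < 0 else f
            inputs.append(in_shape if idx < 0 else shapes[idx])
        h, w, c = inputs[0]
        if module == "Conv":          # args = [out_channels, kernel, stride]
            c_out, _kernel, stride = args
            shapes.append((h // stride, w // stride, c_out))
        elif module == "MP":          # max pooling that halves H and W
            shapes.append((h // 2, w // 2, c))
        elif module == "Upsample":    # nearest-neighbour, scale factor 2
            shapes.append((h * 2, w * 2, c))
        elif module == "Concat":      # concatenate along the channel axis
            shapes.append((h, w, sum(shape[2] for shape in inputs)))
        else:
            raise ValueError(f"unsupported module: {module}")
    return shapes


rows = [
    [-1, 1, "Conv", [32, 3, 1]],             # 0
    [-1, 1, "Conv", [64, 3, 2]],             # 1-P1/2
    [-1, 1, "Conv", [64, 3, 1]],             # 2
    [-1, 1, "Conv", [128, 3, 2]],            # 3-P2/4
    [-1, 1, "Conv", [64, 1, 1]],             # 4  ELAN starts
    [-2, 1, "Conv", [64, 1, 1]],             # 5  also reads row 3
    [-1, 1, "Conv", [64, 3, 1]],             # 6
    [-1, 1, "Conv", [64, 3, 1]],             # 7
    [-1, 1, "Conv", [64, 3, 1]],             # 8
    [-1, 1, "Conv", [64, 3, 1]],             # 9
    [[-1, -3, -5, -6], 1, "Concat", [1]],    # 10 rows 9, 7, 5, 4
    [-1, 1, "Conv", [256, 1, 1]],            # 11
    [-1, 1, "MP", []],                       # 12 MP1 starts
    [-1, 1, "Conv", [128, 1, 1]],            # 13
    [-3, 1, "Conv", [128, 1, 1]],            # 14 reads row 11
    [-1, 1, "Conv", [128, 3, 2]],            # 15
    [[-1, -3], 1, "Concat", [1]],            # 16-P3/8 rows 15, 13
]

shapes = trace(rows)
for i in (3, 10, 11, 16):
    print(i, shapes[i])
```

Under these assumptions the printed shapes are (160, 160, 128), (160, 160, 256), (160, 160, 256) and (80, 80, 256), which agree with the annotations on layers 3, 10, 11 and 16.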

Reference: 深入浅出 Yolo 系列之 Yolov7 基础网络结构详解 (a plain-language introduction to the YOLOv7 base network structure), from 计算机视觉linke's blog on CSDN

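The figure above is the YOLOv7 structure diagram I used as a reference. It labels the output of the MP2 at the lower right as 80*80*256, whereas my own calculation gives 40*40*256: the 80*80*128 input is halved in H and W, and the two 128-channel branches are concatenated.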
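The MP2 arithmetic can be checked with a few lines of Python. This is only a sketch of the shape bookkeeping, not code from the repository; `mp_block` is a hypothetical helper, and the channel counts are read off the first MP2 block in the yaml (input from layer 75, route from layer 63).

```python
def mp_block(h, w, c_pool, c_conv, extra=0):
    """Shape after a downsampling MP block.

    c_pool: channels leaving the MaxPool -> 1x1 Conv branch
    c_conv: channels leaving the 1x1 Conv -> 3x3 stride-2 Conv branch
    extra:  channels of an additional routed feature map (MP2 only)
    """
    return (h // 2, w // 2, c_pool + c_conv + extra)


# First MP2 in the head: the input is layer 75 (80*80*128).
two_branches = mp_block(80, 80, 128, 128)             # the two downsampling branches
with_route = mp_block(80, 80, 128, 128, extra=256)    # plus layer 63 (40*40*256)

print(two_branches)  # (40, 40, 256)
print(with_route)    # (40, 40, 512)
```

The first result is the 40*40*256 figure from my calculation; the second is the 40*40*512 annotated on the Concat row, which also includes layer 63.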

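The MP1, MP2 and UP modules are hard to understand, and easy to misread, from the structure diagram alone. I recommend tracing through the yaml file yourself; the logic is very clear there.

The ELAN module in the diagram is much easier to understand than the one in the paper.

SPPCSPC is not explained in the yaml file; its code lives in common.py. It is reproduced below and is easy to follow alongside the diagram.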

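import torch
import torch.nn as nn

# Conv (Conv2d + BatchNorm2d + SiLU) is defined in the same common.py.


class SPPCSPC(nn.Module):
    # CSP https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)  # shortcut branch
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        # stride-1 max pools with padding k // 2 keep H and W unchanged
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1)  # input: x1 plus three pooled copies
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)  # fuses both branches down to c2 channels

    def forward(self, x):
        # main branch: three convolutions, then spatial pyramid pooling
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        # shortcut branch: a single 1x1 convolution on the input
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))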
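The channel counts inside SPPCSPC can be worked out without building the module. The sketch below is plain Python with hypothetical helper names (`pool_out`, `sppcspc_channels`); it uses the standard output-size formula for pooling and the expressions from `__init__` above, with c2 = 512 taken from the `[512]` argument in the yaml.

```python
def pool_out(size, k, stride=1):
    """Output size of a max pool with padding k // 2 (standard size formula)."""
    pad = k // 2
    return (size + 2 * pad - k) // stride + 1


def sppcspc_channels(c2, e=0.5, k=(5, 9, 13)):
    """Channel counts at the main points inside SPPCSPC."""
    c_ = int(2 * c2 * e)                   # hidden channels, as in __init__
    return {
        "hidden": c_,
        "spp_concat": (len(k) + 1) * c_,   # x1 plus one pooled copy per kernel (input of cv5)
        "final_concat": 2 * c_,            # y1 and y2 (input of cv7)
        "out": c2,
    }


sizes = [pool_out(20, k) for k in (5, 9, 13)]
channels = sppcspc_channels(512)
print(sizes)     # [20, 20, 20]
print(channels)  # {'hidden': 512, 'spp_concat': 2048, 'final_concat': 1024, 'out': 512}
```

With odd kernel sizes and stride 1, the three pools leave the 20*20 map unchanged, which is why the pooled tensors can be concatenated with `x1` and why layer 51 changes only the channel count.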

The rightmost part of the diagram is the detection head; from bottom to top its outputs are P3, P4 and P5.
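As a rough check on the three detection scales, the sketch below pairs up the anchors from the yaml and derives the grid size and prediction channels per scale. It assumes the usual YOLO head convention of `nc + 5` values per anchor (4 box coordinates, 1 objectness score, `nc` class scores) and strides of 8, 16 and 32 as given by the P3/8, P4/16 and P5/32 labels; `head_summary` is a hypothetical helper, not part of the repository.

```python
anchors = [
    [12, 16, 19, 36, 40, 28],         # P3/8
    [36, 75, 76, 55, 72, 146],        # P4/16
    [142, 110, 192, 243, 459, 401],   # P5/32
]
nc = 80
strides = (8, 16, 32)
img = 640


def head_summary(anchors, nc, strides, img):
    """Per-scale grid size, anchor pairs and prediction channels."""
    rows = []
    for flat, stride in zip(anchors, strides):
        pairs = list(zip(flat[0::2], flat[1::2]))   # (width, height) pairs
        rows.append({
            "stride": stride,
            "grid": img // stride,
            "anchors": pairs,
            "channels": len(pairs) * (nc + 5),      # assumed nc + 5 values per anchor
        })
    return rows


summary = head_summary(anchors, nc, strides, img)
for row in summary:
    print(row["stride"], row["grid"], row["channels"], row["anchors"])
```

For a 640*640 input this gives grids of 80, 40 and 20 cells per side, matching the 80*80, 40*40 and 20*20 feature maps that layers 75, 88 and 101 feed into the head.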

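RepConv is a re-parameterized convolution. Put simply, the multi-branch structure used during training is converted into an equivalent single-path structure for inference; accuracy improves slightly and speed improves a lot. After reading the RepVGG paper I found the idea very worthwhile, and the mathematical derivation is both interesting and sound. The schematic from the paper and the referenced blogger's flow chart are attached.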
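The equivalence between the multi-branch and single-path forms can be seen in a toy example. The sketch below is a simplification, not the RepConv code from the repository: it uses one channel, integer weights and no batch normalization (RepVGG additionally folds each branch's BN into the kernel and bias before merging), and `conv2d_same` and `fuse` are hypothetical helpers. The idea is that a 1*1 kernel and an identity mapping are both 3*3 kernels that are zero everywhere except the centre, so the three branches add up to a single 3*3 kernel.

```python
def conv2d_same(img, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out


def fuse(k3, k1, identity=True):
    """Fold a 1x1 kernel (and optionally an identity branch) into a 3x3 kernel."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1[0][0] + (1 if identity else 0)
    return fused


img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
k3 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
k1 = [[3]]

# Training-time form: three parallel branches whose outputs are summed.
b3 = conv2d_same(img, k3)
b1 = conv2d_same(img, k1)
multi = [[b3[y][x] + b1[y][x] + img[y][x] for x in range(4)] for y in range(4)]

# Inference-time form: one 3x3 convolution with the fused kernel.
single = conv2d_same(img, fuse(k3, k1))
print(multi == single)  # True
```

Because convolution is linear in its kernel, the two forms agree exactly here. The inference-time network then runs one convolution per block instead of three, which is where the speed-up comes from.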
