YOLOv8 for Beginners | Reading the YAML File, Printing the YOLOv8 Network Structure, and Drawing the Network Diagram [Must-Read for Newcomers]

kay_545

Published 2024-07-31 16:15:00


Tags: YOLO, object detection, artificial intelligence, deep learning, interviews, YOLOv8 source-code walkthrough, Python

Article link: https://blog.csdn.net/m0_67647321/article/details/140815093


💡💡💡 All programs in this column have been tested and run successfully 💡💡💡


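This article looks at the network model YAML file in YOLOv8. Most of us know how to train a model and how to set up the YAML file, but not how the YAML file is fed into the model, what each line in it means, or how the model parses it. This article explains what the YAML file contains, then prints the detailed network structure and parameters, and finally shows how to draw the network structure diagram.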


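1. Parsing the YAML file

File location: ultralytics/ultralytics/cfg/models/v8/yolov8.yaml

The file, annotated:

A few calculations are worth knowing beforehand:

For how GFLOPs are computed, see: 8.13 Computing the memory, params, and GFLOPs of a convolution
For how the height and width after a convolution are computed, see: 8.3 Computing the output height and width of a convolution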

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80  # Number of classes (nc), i.e. how many object categories the model detects. The default is 80 because the default dataset is COCO. If your dataset has a different number of classes you can leave this value as it is; only the dataset YAML file needs to be changed.
scales: # Compound scaling constants that define the model's sizes and complexity. Pick one of n, s, m, l, x below: writing 'model=yolov8n.yaml' in train.py selects the YOLOv8 n model, and 'model=yolov8x.yaml' selects the x model, the largest one.
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs


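# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args] # from: which layer the input comes from (-1 means the previous layer); repeats: how many times the module is repeated; module: the module name; args: the module's arguments. The feature-map sizes quoted below assume a 640*640*3 input and use the channel counts as written in this file, before the width and max_channels scaling of the chosen scale is applied.
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2 Layer 0. -1 means the previous layer's output is this layer's input; for layer 0 the input is the 640*640*3 image. 1 means the module is not repeated. Conv is a convolution layer with args: 64 output channels, kernel size k=3, stride 2. The output feature map is 320*320*64, half the height and width of the input image.
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4 Layer 1. Same operation as layer 0 (128 output channels, kernel size 3, stride 2). The output is 160*160*128, 1/4 of the input height and width.
  - [-1, 3, C2f, [128, True]] # Layer 2, a C2f module. 3 is the repeat count, 128 is the output channels, and True means the Bottleneck has a shortcut. The output is 160*160*128.
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8 Layer 3, convolution (256 output channels, kernel size 3, stride 2). The output is 80*80*256. The convolution settings are unchanged, so height and width halve again, to 1/8 of the input image.
  - [-1, 6, C2f, [256, True]] # Layer 4, a C2f module (see layer 2). 6 is the repeat count, 256 is the output channels, and True means the Bottleneck has a shortcut. The output stays 80*80*256.
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16 Layer 5, convolution (512 output channels, kernel size 3, stride 2). The output is 40*40*512. Height and width halve again, to 1/16 of the input image.
  - [-1, 6, C2f, [512, True]] # Layer 6, a C2f module (see layer 2). 6 is the repeat count, 512 is the output channels, and True means the Bottleneck has a shortcut. The output stays 40*40*512.
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32 Layer 7, convolution (1024 output channels, kernel size 3, stride 2). The output is 20*20*1024. Height and width halve again, to 1/32 of the input image.
  - [-1, 3, C2f, [1024, True]] # Layer 8, a C2f module (see layer 2). 3 is the repeat count, 1024 is the output channels, and True means the Bottleneck has a shortcut. The output stays 20*20*1024.
  - [-1, 1, SPPF, [1024, 5]]  # 9 Layer 9, the fast spatial pyramid pooling layer (SPPF). 1024 is the output channels and 5 is the pooling kernel size k. From the module diagram and code, the final concat yields 20*20*(512*4), and one more Conv brings it to 20*20*1024.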

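# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']] # Layer 10, upsampling. -1 means the previous layer's output is this layer's input. None means the output size (size=None) is not specified. 2 is scale_factor=2, so the output is twice the size of the input. mode=nearest selects nearest-neighbor interpolation. Height and width double and the channel count is unchanged, so the output is 40*40*1024.
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4 Layer 11, a Concat layer. [-1, 6] means the inputs are the outputs of the previous layer and of layer 6. [1] means concatenation along dimension 1. The previous layer outputs 40*40*1024 and layer 6 outputs 40*40*512, so this layer outputs 40*40*1536.
  - [-1, 3, C2f, [512]]  # 12 Layer 12, a C2f module (see layer 2). 3 is the repeat count and 512 is the output channels, so the output is 40*40*512. Unlike the C2f modules in the backbone, the Bottleneck here has shortcut=False.

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']] # Layer 13, another upsampling layer (see layer 10). Height and width double and the channel count is unchanged, so the output is 80*80*512.
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3 Layer 14, a Concat layer. [-1, 4] means the inputs are the outputs of the previous layer and of layer 4. [1] means concatenation along dimension 1. The previous layer outputs 80*80*512 and layer 4 outputs 80*80*256, so this layer outputs 80*80*768.
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small) Layer 15, a C2f module (see layer 2). 3 is the repeat count and 256 is the output channels. The output is 80*80*256, 1/8 of the input image's height and width.

  - [-1, 1, Conv, [256, 3, 2]] # Layer 16, convolution (256 output channels, kernel size 3, stride 2). The output is 40*40*256; the convolution settings are the same as before, so height and width halve.
  - [[-1, 12], 1, Concat, [1]]  # cat head P4 Layer 17, a Concat layer. [-1, 12] means the inputs are the outputs of the previous layer and of layer 12. [1] means concatenation along dimension 1. The previous layer outputs 40*40*256 and layer 12 outputs 40*40*512, so this layer outputs 40*40*768.
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium) Layer 18, a C2f module (see layer 2). 3 is the repeat count and 512 is the output channels. The output is 40*40*512, 1/16 of the input image's height and width.

  - [-1, 1, Conv, [512, 3, 2]] # Layer 19, convolution (512 output channels, kernel size 3, stride 2). The output is 20*20*512; the convolution settings are the same as before, so height and width halve.
  - [[-1, 9], 1, Concat, [1]]  # cat head P5 Layer 20, a Concat layer. [-1, 9] means the inputs are the outputs of the previous layer and of layer 9. [1] means concatenation along dimension 1. The previous layer outputs 20*20*512 and layer 9 outputs 20*20*1024, so this layer outputs 20*20*1536.
  - [-1, 3, C2f, [1024]]  # 21 (P5/32-large) Layer 21, a C2f module (see layer 2). 3 is the repeat count and 1024 is the output channels. The output is 20*20*1024, 1/32 of the input image's height and width.

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5) Layer 22, the Detect layer. [15, 18, 21] means the outputs of layers 15, 18, and 21 (80*80*256, 40*40*512, and 20*20*1024 respectively) are the inputs of this layer. nc is the number of classes in the dataset.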
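
The sizes quoted in the comments can be re-derived mechanically. The sketch below is not Ultralytics code: trace_shapes is a hypothetical helper that walks the same [from, repeats, module, args] rows and tracks only the spatial side and the channel count. It assumes a 640*640*3 input, the channel counts as written in the file (no width scaling), and that a stride-2 Conv halves an even side length. The Detect row is left out because it produces predictions rather than a single feature map.

```python
# Rows copied from the backbone and head sections of yolov8.yaml (Detect omitted).
LAYERS = [
    [-1, 1, "Conv", [64, 3, 2]],                 # 0
    [-1, 1, "Conv", [128, 3, 2]],                # 1
    [-1, 3, "C2f", [128, True]],                 # 2
    [-1, 1, "Conv", [256, 3, 2]],                # 3
    [-1, 6, "C2f", [256, True]],                 # 4
    [-1, 1, "Conv", [512, 3, 2]],                # 5
    [-1, 6, "C2f", [512, True]],                 # 6
    [-1, 1, "Conv", [1024, 3, 2]],               # 7
    [-1, 3, "C2f", [1024, True]],                # 8
    [-1, 1, "SPPF", [1024, 5]],                  # 9
    [-1, 1, "Upsample", [None, 2, "nearest"]],   # 10
    [[-1, 6], 1, "Concat", [1]],                 # 11
    [-1, 3, "C2f", [512]],                       # 12
    [-1, 1, "Upsample", [None, 2, "nearest"]],   # 13
    [[-1, 4], 1, "Concat", [1]],                 # 14
    [-1, 3, "C2f", [256]],                       # 15
    [-1, 1, "Conv", [256, 3, 2]],                # 16
    [[-1, 12], 1, "Concat", [1]],                # 17
    [-1, 3, "C2f", [512]],                       # 18
    [-1, 1, "Conv", [512, 3, 2]],                # 19
    [[-1, 9], 1, "Concat", [1]],                 # 20
    [-1, 3, "C2f", [1024]],                      # 21
]


def trace_shapes(layers, side=640, channels=3):
    """Return a (side, channels) tuple for every layer (hypothetical helper)."""
    shapes = []
    for i, (src, _repeats, module, args) in enumerate(layers):
        sources = src if isinstance(src, list) else [src]
        inputs = []
        for s in sources:
            idx = i + s if s < 0 else s  # negative = relative, otherwise absolute
            inputs.append(shapes[idx] if idx >= 0 else (side, channels))
        in_side, in_ch = inputs[0]
        if module == "Conv":
            out = (in_side // args[2], args[0])          # the stride shrinks the side
        elif module in ("C2f", "SPPF"):
            out = (in_side, args[0])                     # spatial size unchanged
        elif module == "Upsample":
            out = (in_side * args[1], in_ch)             # scale_factor grows the side
        elif module == "Concat":
            out = (in_side, sum(c for _, c in inputs))   # channels add up
        else:
            raise ValueError(f"unsupported module: {module}")
        shapes.append(out)
    return shapes


for index, (side, ch) in enumerate(trace_shapes(LAYERS)):
    print(f"layer {index:2d}: {side}*{side}*{ch}")
```

Swapping in another input size, or another set of rows, is a quick way to check a modified YAML before drawing or training it.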

1.1 Parameter configuration

# Parameters
nc: 80 # Number of classes (nc), i.e. how many object categories the model detects. The default is 80 because the default dataset is COCO. If your dataset has a different number of classes you can leave this value as it is; only the dataset YAML file needs to be changed.
scales: # Compound scaling constants that define the model's sizes and complexity. Pick one of n, s, m, l, x below: writing 'model=yolov8n.yaml' in train.py selects the YOLOv8 n model, and 'model=yolov8x.yaml' selects the x model, the largest one.
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

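[depth, width, max_channels]
depth: the depth of the network; it scales the repeats value and so controls the number of sub-modules
width: the width of the network; it scales the number of output channels (convolution filters) of each layer
max_channels: the maximum number of channels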
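
To make the three constants concrete, here is a minimal sketch of how a scale could be applied to one row of the YAML. The helper names scale_repeats and scale_channels are made up for this illustration, and the rounding rules (round the repeats but keep at least 1, cap the channels at max_channels, then round the width-scaled value up to a multiple of 8) are my reading of parse_model, so check them against your installed version.

```python
import math


def scale_repeats(repeats, depth):
    """Depth scaling: only rows with repeats > 1 are affected (assumed rule)."""
    return max(round(repeats * depth), 1) if repeats > 1 else repeats


def scale_channels(channels, width, max_channels, divisor=8):
    """Width scaling: cap, scale, then round up to a multiple of divisor (assumed rule)."""
    return math.ceil(min(channels, max_channels) * width / divisor) * divisor


depth, width, max_channels = 0.33, 0.25, 1024  # the 'n' row of scales
print(scale_channels(64, width, max_channels))    # layer 0 Conv
print(scale_channels(1024, width, max_channels))  # layer 7 Conv
print(scale_repeats(3, depth))                    # layer 2 C2f
print(scale_repeats(6, depth))                    # layer 4 C2f
```

For the n scale this gives 16 and 256 channels and 1 and 2 Bottlenecks, which agrees with the detailed printout in section 2: model.0.conv.weight has shape [16, 3, 3, 3], model.7.conv.weight has 256 output channels, model.2 contains only m.0, and model.4 contains m.0 and m.1.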

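scales: # Compound scaling constants that define the model's sizes and complexity. Pick one of n, s, m, l, x below: writing 'model=yolov8n.yaml' in train.py selects the YOLOv8 n model, and 'model=yolov8x.yaml' selects the x model, the largest one.

Now let us look at what this comment means.

It says that you can write the following directly in train.py: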

from ultralytics import YOLO

# Load a model
# model = YOLO('yolov8n.yaml')  # build a new model from YAML
# model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

model = YOLO(r'/projects/ultralytics/ultralytics/cfg/models/v8/yolov8n.yaml')  # build a new model from YAML; the n in the file name selects the scale

# Train the model
model.train()

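Some readers may be puzzled: there is no yolov8n.yaml file in that folder, so isn't this path wrong? It is not.

In tasks.py (ultralytics/nn/tasks.py) there is a function that extracts the scale letter n, separating n from yolov8.yaml: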

import contextlib
from pathlib import Path


def guess_model_scale(model_path):
    """
    Takes a path to a YOLO model's YAML file as input and extracts the size character of the model's scale. The function
    uses regular expression matching to find the pattern of the model scale in the YAML file name, which is denoted by
    n, s, m, l, or x. The function returns the size character of the model scale as a string.

    Args:
        model_path (str | Path): The path to the YOLO model's YAML file.

    Returns:
        (str): The size character of the model's scale, which can be n, s, m, l, or x.
    """
    with contextlib.suppress(AttributeError):
        import re

        return re.search(r"yolov\d+([nslmx])", Path(model_path).stem).group(1)  # n, s, m, l, or x
    return ""

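This code extracts the scale letter of a YOLO model from a path string. It uses a regular expression to match part of the file name and returns the captured character.

Step by step:

re.search(r"yolov\d+([nslmx])", Path(model_path).stem):
- re.search is a function in Python's re module that searches a string for content matching a regular expression pattern.
- r"yolov\d+([nslmx])" is a raw string holding the pattern:
  - yolov matches the literal string "yolov".
  - \d+ matches one or more digits (\d is a digit, + means one or more).
  - ([nslmx]) is a capture group that matches a single character, any one of n, s, l, m, or x.
- Path(model_path).stem treats model_path as a path object and takes its file name without the extension.

.group(1):
- group(1) returns the content of the first capture group, i.e. the ([nslmx]) part of the pattern.

If the file name contains no scale letter, re.search returns None, calling .group(1) on it raises AttributeError, contextlib.suppress swallows that error, and the function returns "".

Example

Suppose model_path is "path/to/yolov8n.pt":

- Path(model_path).stem gives "yolov8n".
- re.search(r"yolov\d+([nslmx])", "yolov8n") matches the string "yolov8n", where:
  - yolov8 matches the yolov\d+ part.
  - n matches the ([nslmx]) part.
- .group(1) returns the content of the capture group, "n".

So this code extracts from model_path the scale letter that follows the version number in the file name (n, s, l, m, or x).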
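
The same pattern can be tried outside of Ultralytics. The sketch below reimplements the lookup with an explicit None check instead of contextlib.suppress, and adds base_yaml_name, a hypothetical helper (not a library function) that illustrates how a name such as yolov8n.yaml can be mapped back to the yolov8.yaml file that actually exists on disk.

```python
import re
from pathlib import Path


def scale_of(model_path):
    """Return the scale letter after the version number, or '' if there is none."""
    match = re.search(r"yolov\d+([nslmx])", Path(model_path).stem)
    return match.group(1) if match else ""


def base_yaml_name(model_path):
    """Drop the scale letter from the file name (hypothetical helper)."""
    path = Path(model_path)
    stem = re.sub(r"(yolov\d+)[nslmx]", r"\1", path.stem, count=1)
    return str(path.with_name(stem + path.suffix))


for name in ["path/to/yolov8n.pt", "yolov8x.yaml", "yolov8.yaml"]:
    print(name, "->", repr(scale_of(name)), base_yaml_name(name))
```

The last entry has no scale letter, so scale_of returns an empty string and the file name is left unchanged.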

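1.2 backbone

[from, repeats, module, args]

from: takes one of three forms: -1, a specific index, or a list of indices.

- -1 means the input of this layer is the output of the previous layer.
- A specific number such as 6 means the input of this layer comes from layer 6 of the model.
- A list holding two or more values means the outputs of all the listed layers are the inputs of this layer, as in Concat.

repeats: how many times the module is repeated. In this file only C2f uses a value above 1, where it sets how many Bottleneck blocks are chained in series inside the C2f. The value is scaled by depth first: repeats=3 gives 3 Bottlenecks in series at depth 1.00, while the n model (depth 0.33) ends up with a single Bottleneck in those layers, as the model.2.m.0 entries in the printout in section 2 show.

module: the module class name. The class is looked up by this name in conv.py, block.py, and head.py, and the network is assembled from these modules.

args: the arguments passed to the module. Taking Conv [64, 3, 2] as an example, they correspond to [channel, kernel, stride]: channel is the number of channels of the output feature map, kernel is the kernel size, and stride is the step by which the kernel moves. How they reach the module depends on the parse_model function: for C2f, the first argument the module receives is the output channel count of the previous layer, the second is the first entry of args, and the rest follow.

Taking C2f [128, True] as an example, 128 is the number of channels of the output feature map and True means shortcut=True in the Bottleneck module; when it is omitted, the value is False.

Taking SPPF [1024, 5] as an example, 1024 is the number of channels of the output feature map and 5 is the pooling kernel size inside the SPPF module.

Taking nn.Upsample [None, 2, 'nearest'] as an example, None means the output size is not specified, 2 means the output is twice the size of the input, and 'nearest' selects nearest-neighbor interpolation for the upsampling.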
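
As a rough picture of what happens to a row's args, the sketch below builds the positional arguments a module would receive. constructor_args is a hypothetical helper, not the real parse_model; the rules it encodes (prepend the input channels, and for C2f insert the repeat count right after the output channels, matching a C2f(c1, c2, n, shortcut) style signature) are assumptions, and channel and depth scaling are left out.

```python
def constructor_args(module, in_channels, args, repeats):
    """Return the positional arguments for one YAML row (hypothetical helper)."""
    if module in ("Conv", "SPPF"):
        # [c_out, k, ...] becomes (c_in, c_out, k, ...)
        return [in_channels, *args]
    if module == "C2f":
        # [c_out, shortcut] becomes (c_in, c_out, n, shortcut)
        return [in_channels, args[0], repeats, *args[1:]]
    # Modules such as nn.Upsample or Concat receive args unchanged.
    return list(args)


print(constructor_args("Conv", 3, [64, 3, 2], 1))        # [3, 64, 3, 2]
print(constructor_args("C2f", 128, [128, True], 3))      # [128, 128, 3, True]
print(constructor_args("C2f", 1536, [512], 3))           # [1536, 512, 3]
print(constructor_args("nn.Upsample", 1024, [None, 2, "nearest"], 1))  # [None, 2, 'nearest']
```

The third call shows why the head rows can omit the shortcut flag: nothing is passed for it, so the module falls back to its default.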

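1.3 head

The head turns the feature maps from the backbone into detection outputs. The head section of the YAML defines the model's detection head, the part of the network that produces the final detections. Its main job is classification and localization: from the fused feature maps provided by the neck, it predicts a class (object category) and a bounding box for every feature-map location. Note that in this YAML the head section also contains the neck itself (the Upsample, Concat, C2f, and Conv rows of layers 10 to 21); only the final Detect row makes the predictions.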

2. Printing the network structure

Create info.py in the yolov8 folder and paste in the following code:

from ultralytics import YOLO
# Load a trained model or a network structure configuration file
# model = YOLO('ultralytics/run/best.pt')
model = YOLO('ultralytics/cfg/models/v8/yolov8n.yaml')
# Print the model summary
print(model.info())

# Print the detailed per-parameter information
print(model.info(detailed=True))

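print(model.info()) produces the following output:

Only the number of layers, parameters, gradients, and the GFLOPs are printed.

YOLOv8 summary: 225 layers, 3,157,200 parameters, 3,157,184 gradients, 8.9GFLOPs
(225, 3157200, 3157184, 8.9)

print(model.info(detailed=True)) produces the following output:

It lists the name, parameter count, and shape of every parameter tensor in every layer of the model.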

layer                                     name  gradient   parameters                shape         mu      sigma
    0                      model.0.conv.weight      True          432        [16, 3, 3, 3]     0.0087      0.113 torch.float32
    1                        model.0.bn.weight      True           16                 [16]          1          0 torch.float32
    2                          model.0.bn.bias      True           16                 [16]          0          0 torch.float32
    3                      model.1.conv.weight      True         4608       [32, 16, 3, 3]   7.56e-05     0.0485 torch.float32
    4                        model.1.bn.weight      True           32                 [32]          1          0 torch.float32
    5                          model.1.bn.bias      True           32                 [32]          0          0 torch.float32
    6                  model.2.cv1.conv.weight      True         1024       [32, 32, 1, 1]    0.00353      0.102 torch.float32
    7                    model.2.cv1.bn.weight      True           32                 [32]          1          0 torch.float32
    8                      model.2.cv1.bn.bias      True           32                 [32]          0          0 torch.float32
    9                  model.2.cv2.conv.weight      True         1536       [32, 48, 1, 1]    0.00545     0.0834 torch.float32
   10                    model.2.cv2.bn.weight      True           32                 [32]          1          0 torch.float32
   11                      model.2.cv2.bn.bias      True           32                 [32]          0          0 torch.float32
   12              model.2.m.0.cv1.conv.weight      True         2304       [16, 16, 3, 3]   0.000192     0.0481 torch.float32
   13                model.2.m.0.cv1.bn.weight      True           16                 [16]          1          0 torch.float32
   14                  model.2.m.0.cv1.bn.bias      True           16                 [16]          0          0 torch.float32
   15              model.2.m.0.cv2.conv.weight      True         2304       [16, 16, 3, 3]   7.72e-06     0.0489 torch.float32
   16                model.2.m.0.cv2.bn.weight      True           16                 [16]          1          0 torch.float32
   17                  model.2.m.0.cv2.bn.bias      True           16                 [16]          0          0 torch.float32
   18                      model.3.conv.weight      True        18432       [64, 32, 3, 3]   -0.00032     0.0341 torch.float32
   19                        model.3.bn.weight      True           64                 [64]          1          0 torch.float32
   20                          model.3.bn.bias      True           64                 [64]          0          0 torch.float32
   21                  model.4.cv1.conv.weight      True         4096       [64, 64, 1, 1]   0.000252      0.072 torch.float32
   22                    model.4.cv1.bn.weight      True           64                 [64]          1          0 torch.float32
   23                      model.4.cv1.bn.bias      True           64                 [64]          0          0 torch.float32
   24                  model.4.cv2.conv.weight      True         8192      [64, 128, 1, 1]   0.000322     0.0511 torch.float32
   25                    model.4.cv2.bn.weight      True           64                 [64]          1          0 torch.float32
   26                      model.4.cv2.bn.bias      True           64                 [64]          0          0 torch.float32
   27              model.4.m.0.cv1.conv.weight      True         9216       [32, 32, 3, 3]   0.000172     0.0344 torch.float32
   28                model.4.m.0.cv1.bn.weight      True           32                 [32]          1          0 torch.float32
   29                  model.4.m.0.cv1.bn.bias      True           32                 [32]          0          0 torch.float32
   30              model.4.m.0.cv2.conv.weight      True         9216       [32, 32, 3, 3]  -0.000275      0.034 torch.float32
   31                model.4.m.0.cv2.bn.weight      True           32                 [32]          1          0 torch.float32
   32                  model.4.m.0.cv2.bn.bias      True           32                 [32]          0          0 torch.float32
   33              model.4.m.1.cv1.conv.weight      True         9216       [32, 32, 3, 3]  -0.000292     0.0342 torch.float32
   34                model.4.m.1.cv1.bn.weight      True           32                 [32]          1          0 torch.float32
   35                  model.4.m.1.cv1.bn.bias      True           32                 [32]          0          0 torch.float32
   36              model.4.m.1.cv2.conv.weight      True         9216       [32, 32, 3, 3]  -7.28e-05     0.0343 torch.float32
   37                model.4.m.1.cv2.bn.weight      True           32                 [32]          1          0 torch.float32
   38                  model.4.m.1.cv2.bn.bias      True           32                 [32]          0          0 torch.float32
   39                      model.5.conv.weight      True        73728      [128, 64, 3, 3]   9.28e-05      0.024 torch.float32
   40                        model.5.bn.weight      True          128                [128]          1          0 torch.float32
   41                          model.5.bn.bias      True          128                [128]          0          0 torch.float32
   42                  model.6.cv1.conv.weight      True        16384     [128, 128, 1, 1]   0.000167      0.051 torch.float32
   43                    model.6.cv1.bn.weight      True          128                [128]          1          0 torch.float32
   44                      model.6.cv1.bn.bias      True          128                [128]          0          0 torch.float32
   45                  model.6.cv2.conv.weight      True        32768     [128, 256, 1, 1]   8.12e-05      0.036 torch.float32
   46                    model.6.cv2.bn.weight      True          128                [128]          1          0 torch.float32
   47                      model.6.cv2.bn.bias      True          128                [128]          0          0 torch.float32
   48              model.6.m.0.cv1.conv.weight      True        36864       [64, 64, 3, 3]  -0.000144      0.024 torch.float32
   49                model.6.m.0.cv1.bn.weight      True           64                 [64]          1          0 torch.float32
   50                  model.6.m.0.cv1.bn.bias      True           64                 [64]          0          0 torch.float32
   51              model.6.m.0.cv2.conv.weight      True        36864       [64, 64, 3, 3]  -0.000178     0.0241 torch.float32
   52                model.6.m.0.cv2.bn.weight      True           64                 [64]          1          0 torch.float32
   53                  model.6.m.0.cv2.bn.bias      True           64                 [64]          0          0 torch.float32
   54              model.6.m.1.cv1.conv.weight      True        36864       [64, 64, 3, 3]  -9.93e-05      0.024 torch.float32
   55                model.6.m.1.cv1.bn.weight      True           64                 [64]          1          0 torch.float32
   56                  model.6.m.1.cv1.bn.bias      True           64                 [64]          0          0 torch.float32
   57              model.6.m.1.cv2.conv.weight      True        36864       [64, 64, 3, 3]   7.46e-05     0.0241 torch.float32
   58                model.6.m.1.cv2.bn.weight      True           64                 [64]          1          0 torch.float32
   59                  model.6.m.1.cv2.bn.bias      True           64                 [64]          0          0 torch.float32
   60                      model.7.conv.weight      True       294912     [256, 128, 3, 3]   1.19e-05      0.017 torch.float32
   61                        model.7.bn.weight      True          256                [256]          1          0 torch.float32
   62                          model.7.bn.bias      True          256                [256]          0          0 torch.float32
   63                  model.8.cv1.conv.weight      True        65536     [256, 256, 1, 1]  -1.78e-05      0.036 torch.float32
   64                    model.8.cv1.bn.weight      True          256                [256]          1          0 torch.float32
   65                      model.8.cv1.bn.bias      True          256                [256]          0          0 torch.float32
   66                  model.8.cv2.conv.weight      True        98304     [256, 384, 1, 1]  -5.52e-05     0.0294 torch.float32
   67                    model.8.cv2.bn.weight      True          256                [256]          1          0 torch.float32
   68                      model.8.cv2.bn.bias      True          256                [256]          0          0 torch.float32
   69              model.8.m.0.cv1.conv.weight      True       147456     [128, 128, 3, 3]   7.66e-05      0.017 torch.float32
   70                model.8.m.0.cv1.bn.weight      True          128                [128]          1          0 torch.float32
   71                  model.8.m.0.cv1.bn.bias      True          128                [128]          0          0 torch.float32
   72              model.8.m.0.cv2.conv.weight      True       147456     [128, 128, 3, 3]    8.7e-07      0.017 torch.float32
   73                model.8.m.0.cv2.bn.weight      True          128                [128]          1          0 torch.float32
   74                  model.8.m.0.cv2.bn.bias      True          128                [128]          0          0 torch.float32
   75                  model.9.cv1.conv.weight      True        32768     [128, 256, 1, 1]  -0.000411     0.0361 torch.float32
   76                    model.9.cv1.bn.weight      True          128                [128]          1          0 torch.float32
   77                      model.9.cv1.bn.bias      True          128                [128]          0          0 torch.float32
   78                  model.9.cv2.conv.weight      True       131072     [256, 512, 1, 1]   0.000127     0.0255 torch.float32
   79                    model.9.cv2.bn.weight      True          256                [256]          1          0 torch.float32
   80                      model.9.cv2.bn.bias      True          256                [256]          0          0 torch.float32
   81                 model.12.cv1.conv.weight      True        49152     [128, 384, 1, 1]  -0.000128     0.0295 torch.float32
   82                   model.12.cv1.bn.weight      True          128                [128]          1          0 torch.float32
   83                     model.12.cv1.bn.bias      True          128                [128]          0          0 torch.float32
   84                 model.12.cv2.conv.weight      True        24576     [128, 192, 1, 1]  -0.000642     0.0418 torch.float32
   85                   model.12.cv2.bn.weight      True          128                [128]          1          0 torch.float32
   86                     model.12.cv2.bn.bias      True          128                [128]          0          0 torch.float32
   87             model.12.m.0.cv1.conv.weight      True        36864       [64, 64, 3, 3]   1.28e-05     0.0241 torch.float32
   88               model.12.m.0.cv1.bn.weight      True           64                 [64]          1          0 torch.float32
   89                 model.12.m.0.cv1.bn.bias      True           64                 [64]          0          0 torch.float32
   90             model.12.m.0.cv2.conv.weight      True        36864       [64, 64, 3, 3]  -0.000192     0.0241 torch.float32
   91               model.12.m.0.cv2.bn.weight      True           64                 [64]          1          0 torch.float32
   92                 model.12.m.0.cv2.bn.bias      True           64                 [64]          0          0 torch.float32
   93                 model.15.cv1.conv.weight      True        12288      [64, 192, 1, 1]  -0.000297     0.0419 torch.float32
   94                   model.15.cv1.bn.weight      True           64                 [64]          1          0 torch.float32
   95                     model.15.cv1.bn.bias      True           64                 [64]          0          0 torch.float32
   96                 model.15.cv2.conv.weight      True         6144       [64, 96, 1, 1]   0.000114      0.059 torch.float32
   97                   model.15.cv2.bn.weight      True           64                 [64]          1          0 torch.float32
   98                     model.15.cv2.bn.bias      True           64                 [64]          0          0 torch.float32
   99             model.15.m.0.cv1.conv.weight      True         9216       [32, 32, 3, 3]  -0.000316      0.034 torch.float32
  100               model.15.m.0.cv1.bn.weight      True           32                 [32]          1          0 torch.float32
  101                 model.15.m.0.cv1.bn.bias      True           32                 [32]          0          0 torch.float32
  102             model.15.m.0.cv2.conv.weight      True         9216       [32, 32, 3, 3]   0.000419     0.0341 torch.float32
  103               model.15.m.0.cv2.bn.weight      True           32                 [32]          1          0 torch.float32
  104                 model.15.m.0.cv2.bn.bias      True           32                 [32]          0          0 torch.float32
  105                     model.16.conv.weight      True        36864       [64, 64, 3, 3]   9.78e-05     0.0242 torch.float32
  106                       model.16.bn.weight      True           64                 [64]          1          0 torch.float32
  107                         model.16.bn.bias      True           64                 [64]          0          0 torch.float32
  108                 model.18.cv1.conv.weight      True        24576     [128, 192, 1, 1]    0.00016     0.0417 torch.float32
  109                   model.18.cv1.bn.weight      True          128                [128]          1          0 torch.float32
  110                     model.18.cv1.bn.bias      True          128                [128]          0          0 torch.float32
  111                 model.18.cv2.conv.weight      True        24576     [128, 192, 1, 1]    0.00015     0.0416 torch.float32
  112                   model.18.cv2.bn.weight      True          128                [128]          1          0 torch.float32
  113                     model.18.cv2.bn.bias      True          128                [128]          0          0 torch.float32
  114             model.18.m.0.cv1.conv.weight      True        36864       [64, 64, 3, 3]   8.61e-05      0.024 torch.float32
  115               model.18.m.0.cv1.bn.weight      True           64                 [64]          1          0 torch.float32
  116                 model.18.m.0.cv1.bn.bias      True           64                 [64]          0          0 torch.float32
  117             model.18.m.0.cv2.conv.weight      True        36864       [64, 64, 3, 3]   0.000194      0.024 torch.float32
  118               model.18.m.0.cv2.bn.weight      True           64                 [64]          1          0 torch.float32
  119                 model.18.m.0.cv2.bn.bias      True           64                 [64]          0          0 torch.float32
  120                     model.19.conv.weight      True       147456     [128, 128, 3, 3]  -3.91e-05      0.017 torch.float32
  121                       model.19.bn.weight      True          128                [128]          1          0 torch.float32
  122                         model.19.bn.bias      True          128                [128]          0          0 torch.float32
  123                 model.21.cv1.conv.weight      True        98304     [256, 384, 1, 1]  -0.000318     0.0295 torch.float32
  124                   model.21.cv1.bn.weight      True          256                [256]          1          0 torch.float32
  125                     model.21.cv1.bn.bias      True          256                [256]          0          0 torch.float32
  126                 model.21.cv2.conv.weight      True        98304     [256, 384, 1, 1]   1.61e-05     0.0295 torch.float32
  127                   model.21.cv2.bn.weight      True          256                [256]          1          0 torch.float32
  128                     model.21.cv2.bn.bias      True          256                [256]          0          0 torch.float32
  129             model.21.m.0.cv1.conv.weight      True       147456     [128, 128, 3, 3]   7.26e-05      0.017 torch.float32
  130               model.21.m.0.cv1.bn.weight      True          128                [128]          1          0 torch.float32
  131                 model.21.m.0.cv1.bn.bias      True          128                [128]          0          0 torch.float32
  132             model.21.m.0.cv2.conv.weight      True       147456     [128, 128, 3, 3]   4.85e-05      0.017 torch.float32
  133               model.21.m.0.cv2.bn.weight      True          128                [128]          1          0 torch.float32
  134                 model.21.m.0.cv2.bn.bias      True          128                [128]          0          0 torch.float32
  135             model.22.cv2.0.0.conv.weight      True        36864       [64, 64, 3, 3]  -0.000168     0.0241 torch.float32
  136               model.22.cv2.0.0.bn.weight      True           64                 [64]          1          0 torch.float32
  137                 model.22.cv2.0.0.bn.bias      True           64                 [64]          0          0 torch.float32
  138             model.22.cv2.0.1.conv.weight      True        36864       [64, 64, 3, 3]    0.00011     0.0241 torch.float32
  139               model.22.cv2.0.1.bn.weight      True           64                 [64]          1          0 torch.float32
  140                 model.22.cv2.0.1.bn.bias      True           64                 [64]          0          0 torch.float32
  141                  model.22.cv2.0.2.weight      True         4096       [64, 64, 1, 1]   -0.00119     0.0722 torch.float32
  142                    model.22.cv2.0.2.bias      True           64                 [64]          1          0 torch.float32
  143             model.22.cv2.1.0.conv.weight      True        73728      [64, 128, 3, 3]   6.97e-05      0.017 torch.float32
  144               model.22.cv2.1.0.bn.weight      True           64                 [64]          1          0 torch.float32
  145                 model.22.cv2.1.0.bn.bias      True           64                 [64]          0          0 torch.float32
  146             model.22.cv2.1.1.conv.weight      True        36864       [64, 64, 3, 3]  -0.000177      0.024 torch.float32
  147               model.22.cv2.1.1.bn.weight      True           64                 [64]          1          0 torch.float32
  148                 model.22.cv2.1.1.bn.bias      True           64                 [64]          0          0 torch.float32
  149                  model.22.cv2.1.2.weight      True         4096       [64, 64, 1, 1]  -0.000289     0.0721 torch.float32
  150                    model.22.cv2.1.2.bias      True           64                 [64]          1          0 torch.float32
  151             model.22.cv2.2.0.conv.weight      True       147456      [64, 256, 3, 3]    1.4e-05      0.012 torch.float32
  152               model.22.cv2.2.0.bn.weight      True           64                 [64]          1          0 torch.float32
  153                 model.22.cv2.2.0.bn.bias      True           64                 [64]          0          0 torch.float32
  154             model.22.cv2.2.1.conv.weight      True        36864       [64, 64, 3, 3]  -0.000207      0.024 torch.float32
  155               model.22.cv2.2.1.bn.weight      True           64                 [64]          1          0 torch.float32
  156                 model.22.cv2.2.1.bn.bias      True           64                 [64]          0          0 torch.float32
  157                  model.22.cv2.2.2.weight      True         4096       [64, 64, 1, 1]   0.000311     0.0727 torch.float32
  158                    model.22.cv2.2.2.bias      True           64                 [64]          1          0 torch.float32
  159             model.22.cv3.0.0.conv.weight      True        46080       [80, 64, 3, 3]   5.27e-05     0.0241 torch.float32
  160               model.22.cv3.0.0.bn.weight      True           80                 [80]          1          0 torch.float32
  161                 model.22.cv3.0.0.bn.bias      True           80                 [80]          0          0 torch.float32
  162             model.22.cv3.0.1.conv.weight      True        57600       [80, 80, 3, 3]  -5.35e-05     0.0216 torch.float32
  163               model.22.cv3.0.1.bn.weight      True           80                 [80]          1          0 torch.float32
  164                 model.22.cv3.0.1.bn.bias      True           80                 [80]          0          0 torch.float32
  165                  model.22.cv3.0.2.weight      True         6400       [80, 80, 1, 1]  -0.000259     0.0642 torch.float32
  166                    model.22.cv3.0.2.bias      True           80                 [80]      -11.5   1.92e-06 torch.float32
  167             model.22.cv3.1.0.conv.weight      True        92160      [80, 128, 3, 3]   -6.1e-05      0.017 torch.float32
  168               model.22.cv3.1.0.bn.weight      True           80                 [80]          1          0 torch.float32
  169                 model.22.cv3.1.0.bn.bias      True           80                 [80]          0          0 torch.float32
  170             model.22.cv3.1.1.conv.weight      True        57600       [80, 80, 3, 3]   0.000153     0.0215 torch.float32
  171               model.22.cv3.1.1.bn.weight      True           80                 [80]          1          0 torch.float32
  172                 model.22.cv3.1.1.bn.bias      True           80                 [80]          0          0 torch.float32
  173                  model.22.cv3.1.2.weight      True         6400       [80, 80, 1, 1]  -0.000754      0.064 torch.float32
  174                    model.22.cv3.1.2.bias      True           80                 [80]      -10.2          0 torch.float32
  175             model.22.cv3.2.0.conv.weight      True       184320      [80, 256, 3, 3]   3.96e-05      0.012 torch.float32
  176               model.22.cv3.2.0.bn.weight      True           80                 [80]          1          0 torch.float32
  177                 model.22.cv3.2.0.bn.bias      True           80                 [80]          0          0 torch.float32
  178             model.22.cv3.2.1.conv.weight      True        57600       [80, 80, 3, 3]  -3.82e-05     0.0215 torch.float32
  179               model.22.cv3.2.1.bn.weight      True           80                 [80]          1          0 torch.float32
  180                 model.22.cv3.2.1.bn.bias      True           80                 [80]          0          0 torch.float32
  181                  model.22.cv3.2.2.weight      True         6400       [80, 80, 1, 1]   5.84e-05     0.0644 torch.float32
  182                    model.22.cv3.2.2.bias      True           80                 [80]      -8.76          0 torch.float32
  183                 model.22.dfl.conv.weight     False           16        [1, 16, 1, 1]        7.5       4.76 torch.float32
YOLOv8 summary: 225 layers, 3,157,200 parameters, 3,157,184 gradients, 8.9GFLOPs
(225, 3157200, 3157184, 8.9)
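
The parameters column in the detailed listing is the number of elements in each tensor, i.e. the product of the entries in its shape. A quick check on a few rows copied from the table (count_elements is only an illustrative helper):

```python
import math


def count_elements(shape):
    """Number of scalars in a tensor of the given shape."""
    return math.prod(shape)


# name: (shape, parameter count listed in the table)
rows = {
    "model.0.conv.weight": ([16, 3, 3, 3], 432),
    "model.1.conv.weight": ([32, 16, 3, 3], 4608),
    "model.9.cv2.conv.weight": ([256, 512, 1, 1], 131072),
    "model.22.dfl.conv.weight": ([1, 16, 1, 1], 16),
}
for name, (shape, listed) in rows.items():
    print(name, count_elements(shape), count_elements(shape) == listed)
```

The only row with gradient False, model.22.dfl.conv.weight with 16 elements, accounts for the gap between 3,157,200 parameters and 3,157,184 gradients in the summary line.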

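3. Drawing the network structure diagram

The overall layout of the diagram mainly follows the YAML file, while the detailed parts require reading the code of each module.

A site I recommend for drawing network diagrams: diagrams.net

When drawing, first look at the input. The annotated YAML shows that the input is a 640*640*3 image, so start at the top with an input image, followed by two Conv blocks,

then the C2f module,

and continue in the same way down to the SPPF block, then draw the head. [My diagram has four detection heads; the principle is the same as with three.]

Next, connect the modules with lines, following the YAML file.

I will make a video on drawing the detailed diagram when I have time, since text alone may not be very intuitive.

My YAML file: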

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P2-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]
  s: [0.33, 0.50, 1024]
  m: [0.67, 0.75, 768]
  l: [1.00, 1.00, 512]
  x: [1.00, 1.25, 512]

# YOLOv8.0 backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9

# YOLOv8.0-p2 head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 12

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 15 (P3/8-small)

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 2], 1, Concat, [1]] # cat backbone P2
  - [-1, 3, C2f, [128]] # 18 (P2/4-xsmall)

  - [-1, 1, Conv, [128, 3, 2]]
  - [[-1, 15], 1, Concat, [1]] # cat head P3
  - [-1, 3, C2f, [256]] # 21 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 24 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2f, [1024]] # 27 (P5/32-large)

  - [[18, 21, 24, 27], 1, Detect, [nc]] # Detect(P2, P3, P4, P5)
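
The lines to draw can be read off the from column mechanically. The sketch below lists every connection for the four-head YAML above. FROM is the from column copied by hand, edges_from is a hypothetical helper, and "input" stands for the image fed to layer 0.

```python
# The `from` column of the YAML above, one entry per layer (0 to 28).
FROM = (
    [-1] * 10                  # backbone, layers 0-9
    + [-1, [-1, 6], -1]        # 10-12
    + [-1, [-1, 4], -1]        # 13-15
    + [-1, [-1, 2], -1]        # 16-18
    + [-1, [-1, 15], -1]       # 19-21
    + [-1, [-1, 12], -1]       # 22-24
    + [-1, [-1, 9], -1]        # 25-27
    + [[18, 21, 24, 27]]       # 28, Detect
)


def edges_from(from_column):
    """Return (source, target) pairs; negative entries are relative to the layer."""
    edges = []
    for i, src in enumerate(from_column):
        sources = src if isinstance(src, list) else [src]
        for s in sources:
            idx = i + s if s < 0 else s
            edges.append(("input" if idx < 0 else idx, i))
    return edges


for source, target in edges_from(FROM):
    print(f"{source} -> {target}")
```

Each printed pair is one arrow in the diagram; rows with a list in the from column (the Concat and Detect layers) are the ones that receive more than one arrow.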