Preface
The YOLOv12 model configuration relies heavily on the A2C2f module, which requires the flash-attn package. This article summarizes the problems you may run into during installation and their solutions, and explains how to install flash-attn correctly for your environment so that YOLOv12 trains successfully.
1. Errors you may see when flash-attn is missing or failed to install, and how to fix them
1.1 RuntimeError: FlashAttention only supports Ampere GPUs or newer.
This error appears after flash-attn has been installed.
Cause: the current GPU is not supported. I was using a V100 (Volta, which is older than Ampere), so this error was raised.
Solution: switch to an Ampere-or-newer GPU, e.g. an RTX x090-series card (3090/4090), an H100, or an A100.
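To check in advance whether your GPU meets this requirement, you can query its compute capability; the snippet below is only a sketch and assumes torch with CUDA support is installed (FlashAttention-2 requires compute capability 8.0, i.e. Ampere, or newer).
# Minimal check of the Ampere-or-newer requirement (compute capability >= 8.0).
import torch

major, minor = torch.cuda.get_device_capability(0)
print(torch.cuda.get_device_name(0), f"-> compute capability {major}.{minor}")
if major < 8:
    print("This GPU (e.g. V100 = 7.0) cannot run FlashAttention-2.")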
1.2 False, "import FlashAttention error! Please install FlashAttention first."
This error appears when flash-attn has not been installed.
Cause: flash-attn is not installed.
Solution: the error disappears once flash-attn is installed successfully; follow the installation steps in Section 2.
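A quick way to confirm whether the package is present at all (a minimal sketch; it assumes the installed package is importable as flash_attn):
# Prints the installed version, or reports that the package is missing.
try:
    import flash_attn
    print("flash-attn is installed, version", flash_attn.__version__)
except ImportError as err:
    print("flash-attn is not installed:", err)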
1.3 TypeError: argument of type 'PosixPath' is not iterable
This error appears after installation, at runtime.
Cause: the error is raised at runtime; see https://github.com/sunsmarterjie/yolov12/issues/2
Solution: in ultralytics/utils/downloads.py, inside the attempt_download_asset function, find if 'v12' in file: and change it as follows:
if 'v12' in str(file):  # changed here: wrap file in str() so Path objects can be searched
    repo = "sunsmarterjie/yolov12"
    release = "v1.0"
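For reference, a small illustration of why this works: the in operator cannot search inside a PosixPath object, but it can once the path is converted to a string (the file name below is just an example).
# TypeError reproduction and fix, mirroring the one-line change above.
from pathlib import Path

file = Path("yolov12m.pt")
# 'v12' in file           # raises TypeError: argument of type 'PosixPath' is not iterable
print('v12' in str(file))  # True -- the str() conversion makes the membership test valid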
2. flash-attn installation steps
2.1 Check your local environment
The flash-attn wheel you install must match your Python, CUDA, and torch versions, so check these first.
Check the Python version:
python --version
Check the CUDA version:
nvcc -V
Check the torch version:
pip show torch
My Python version is 3.12.3, my CUDA version is 12.1, and my torch version is 2.2.0.
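Alternatively, all three versions can be printed from a single Python session (a minimal sketch; it only assumes torch is already installed):
# One-shot environment report used to pick the matching flash-attn wheel.
import sys
import torch

print("Python:", sys.version.split()[0])          # e.g. 3.12.3
print("torch:", torch.__version__)                 # e.g. 2.2.0
print("CUDA used by torch:", torch.version.cuda)   # e.g. 12.1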
2.2 Download and install
Download flash-attn:
Linux download link: https://github.com/Dao-AILab/flash-attention/releases
Windows download link: https://github.com/bdashore3/flash-attention/releases
Choose the flash-attn wheel that matches your Python, CUDA, and torch versions, and make sure to pick an abiFALSE build.
For my environment (Python 3.12.3, CUDA 12.1, torch 2.2.0), the matching wheel is flash_attn-2.7.4.post1+cu12torch2.2cxx11abiFALSE-cp312-cp312-linux_x86_64.whl.
After downloading, place the wheel in the root directory of the YOLOv12 project and install it from the terminal (replace the file name with the one you downloaded):
pip install flash_attn-2.7.4.post1+cu12torch2.2cxx11abiFALSE-cp312-cp312-linux_x86_64.whl
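After installing, an optional smoke test can confirm that the wheel actually loads and runs on your GPU. This is only a sketch: it assumes an Ampere-or-newer GPU and uses small random fp16 tensors in the (batch, seqlen, heads, headdim) layout that flash_attn_func expects.
# Smoke test: run one FlashAttention forward pass on random data.
import torch
from flash_attn import flash_attn_func

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
out = flash_attn_func(q, k, v)
print(out.shape)  # torch.Size([1, 128, 8, 64]) if the wheel matches the environment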
Once the wheel is installed, the setup is complete and you can start training. A YOLOv11 project can also be set up this way and then switched to YOLOv12.
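For reference, a minimal training call might look like the sketch below; data.yaml is a placeholder for your own dataset file, and it assumes the ultralytics package bundled with the YOLOv12 repository is on the path.
# Build YOLOv12 (scale 'm') from its yaml and start training.
from ultralytics import YOLO

model = YOLO("yolov12m.yaml")  # builds the 'm' scale from the config in the next subsection
model.train(data="data.yaml", epochs=100, imgsz=640)  # the layer summary in Section 3 is printed when training starts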
2.3 YOLOv12 model structure
# YOLOv12 🚀, AGPL-3.0 license
# YOLOv12 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov12n.yaml' will call yolov12.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 465 layers, 2,603,056 parameters, 2,603,040 gradients, 6.7 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 465 layers, 9,285,632 parameters, 9,285,616 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 501 layers, 20,201,216 parameters, 20,201,200 gradients, 68.1 GFLOPs
  l: [1.00, 1.00, 512] # summary: 831 layers, 26,454,880 parameters, 26,454,864 gradients, 89.7 GFLOPs
  x: [1.00, 1.50, 512] # summary: 831 layers, 59,216,928 parameters, 59,216,912 gradients, 200.3 GFLOPs

# YOLO12n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 4, A2C2f, [512, True, 4]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 4, A2C2f, [1024, True, 1]] # 8

# YOLO12n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, A2C2f, [512, False, -1]] # 11
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, A2C2f, [256, False, -1]] # 14
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 11], 1, Concat, [1]] # cat head P4
  - [-1, 2, A2C2f, [512, False, -1]] # 17
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 8], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 20 (P5/32-large)
  - [[14, 17, 20], 1, Detect, [nc]] # Detect(P3, P4, P5)
3. Output of a successful run
from n params module arguments
0 -1 1 1856 ultralytics.nn.modules.conv.Conv [3, 64, 3, 2]
1 -1 1 73984 ultralytics.nn.modules.conv.Conv [64, 128, 3, 2]
2 -1 1 111872 ultralytics.nn.modules.block.C3k2 [128, 256, 1, True, 0.25]
3 -1 1 590336 ultralytics.nn.modules.conv.Conv [256, 256, 3, 2]
4 -1 1 444928 ultralytics.nn.modules.block.C3k2 [256, 512, 1, True, 0.25]
5 -1 1 2360320 ultralytics.nn.modules.conv.Conv [512, 512, 3, 2]
6 -1 2 2690560 ultralytics.nn.modules.block.A2C2f [512, 512, 2, True, 4]
7 -1 1 2360320 ultralytics.nn.modules.conv.Conv [512, 512, 3, 2]
8 -1 2 2690560 ultralytics.nn.modules.block.A2C2f [512, 512, 2, True, 1]
9 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
10 [-1, 6] 1 0 ultralytics.nn.modules.conv.Concat [1]
11 -1 1 1248768 ultralytics.nn.modules.block.A2C2f [1024, 512, 1, False, -1]
12 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
13 [-1, 4] 1 0 ultralytics.nn.modules.conv.Concat [1]
14 -1 1 378624 ultralytics.nn.modules.block.A2C2f [1024, 256, 1, False, -1]
15 -1 1 590336 ultralytics.nn.modules.conv.Conv [256, 256, 3, 2]
16 [-1, 11] 1 0 ultralytics.nn.modules.conv.Concat [1]
17 -1 1 1183232 ultralytics.nn.modules.block.A2C2f [768, 512, 1, False, -1]
18 -1 1 2360320 ultralytics.nn.modules.conv.Conv [512, 512, 3, 2]
19 [-1, 8] 1 0 ultralytics.nn.modules.conv.Concat [1]
20 -1 1 1642496 ultralytics.nn.modules.block.C3k2 [1024, 512, 1, True]
21 [14, 17, 20] 1 1411795 ultralytics.nn.modules.head.Detect [1, [256, 512, 512]]
YOLOv12m summary: 501 layers, 20,140,307 parameters, 20,140,291 gradients, 67.7 GFLOPs
Column index: YOLOv12 Improvements Overview | covering convolution layers, lightweight designs, attention, loss functions, Backbone, SPPF, Neck, detection heads, and other improvements.