Installing Flash Attention (flash-attn) for YOLOv12: a step-by-step setup guide to turn a YOLOv11 environment into a YOLOv12 one

Preface

YOLOv12's model files make heavy use of the A2C2f module, which requires the flash-attn package. This article summarizes the problems you may run into during installation and how to solve them, and explains how to install the flash-attn build that matches your environment so that YOLOv12 trains successfully.

1. Errors that may occur when flash-attn is missing or incorrectly installed, and their fixes

1.1 RuntimeError: FlashAttention only supports Ampere GPUs or newer.

Error raised after installation:

Cause: the current GPU is not supported. I was using a V100 when I hit this error.

Fix: use a different GPU, e.g. an RTX 3090/4090, H100, or A100.
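
If you are unsure whether your card is Ampere or newer, a quick check with PyTorch's standard API is to print its compute capability; Ampere corresponds to compute capability 8.0, while older cards such as the V100 (7.0) are unsupported. A minimal sketch:

import torch

# flash-attn requires compute capability >= 8.0 (Ampere or newer).
major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: {major}.{minor}")
print("OK for flash-attn" if major >= 8 else "Unsupported pre-Ampere GPU")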

1.2 False, "import FlashAttention error! Please install FlashAttention first."

Error raised when flash-attn is not installed:

Cause: flash-attn has not been installed.

Fix: the error goes away once flash-attn is installed successfully; see the installation steps in Section 2.

1.3 TypeError: argument of type 'PosixPath' is not iterable

Error raised after installation:

Cause: raised at runtime because the code performs the substring check 'v12' in file on a PosixPath object, which is not iterable; see https://github.com/sunsmarterjie/yolov12/issues/2

Fix: in the attempt_download_asset function in ultralytics/utils/downloads.py, find the line if 'v12' in file: and change it as follows:

if 'v12' in str(file):  # wrap file in str() so the substring check works on a PosixPath
    repo = "sunsmarterjie/yolov12"
    release = "v1.0"

2. flash-attn installation steps

2.1 Check your local environment

The flash-attn wheel must match your Python version, CUDA version, and torch version, so start by checking all three.

Check the Python version:

python --version

Check the CUDA version:

nvcc -V

Check the torch version:

pip show torch


My Python version is 3.12.3, my CUDA version is 12.1, and my torch version is 2.2.0.
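
You can also read all three versions from a single Python session. Note that the CUDA version the wheel must match is the one torch was built with (torch.version.cuda), which may differ from the toolkit version that nvcc reports:

import sys
import torch

# The three version strings that determine which flash-attn wheel to pick.
print("Python:", sys.version.split()[0])          # e.g. 3.12.3 -> cp312 wheels
print("torch:", torch.__version__)                # e.g. 2.2.0  -> torch2.2 wheels
print("CUDA (torch build):", torch.version.cuda)  # e.g. 12.1   -> cu12 wheels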

2.2 Download and install

Download flash-attn:
Linux download link: https://github.com/Dao-AILab/flash-attention/releases
Windows download link: https://github.com/bdashore3/flash-attention/releases

Choose the flash-attn wheel that matches your Python, CUDA, and torch versions, and be sure to pick an abiFALSE build. Reading the fields in the wheel filename: cu12 means built against CUDA 12, torch2.2 means built against torch 2.2, cxx11abiFALSE is the ABI variant, and cp312 means CPython 3.12. For my environment (Python 3.12.3, CUDA 12.1, torch 2.2.0), the matching wheel is the one in the install command below.


After downloading, place the wheel in the root directory of the YOLOv12 project and install it from a terminal with the following command (substitute your own wheel's filename):

pip install flash_attn-2.7.4.post1+cu12torch2.2cxx11abiFALSE-cp312-cp312-linux_x86_64.whl
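
To confirm the install succeeded, you can import the package and print its version (recent flash-attn releases expose a __version__ attribute; if yours does not, a bare import succeeding is check enough):

python -c "import flash_attn; print(flash_attn.__version__)"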


Once the installation finishes, the configuration is complete and you can start training. A YOLOv11 environment can also be converted to YOLOv12 with the same setup.
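
As a minimal training sketch, assuming the ultralytics-style Python API that the YOLOv12 repo follows (coco8.yaml is just the small ultralytics sample dataset; substitute your own dataset config):

from ultralytics import YOLO

# Build YOLOv12n from its YAML definition; the A2C2f modules will use
# flash-attn now that it is installed.
model = YOLO("yolov12n.yaml")

# Train on the sample dataset as a smoke test.
model.train(data="coco8.yaml", epochs=10, imgsz=640)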

2.3 YOLOv12 model structure

# YOLOv12 🚀, AGPL-3.0 license
# YOLOv12 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov12n.yaml' will call yolov12.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 465 layers, 2,603,056 parameters, 2,603,040 gradients, 6.7 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 465 layers, 9,285,632 parameters, 9,285,616 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 501 layers, 20,201,216 parameters, 20,201,200 gradients, 68.1 GFLOPs
  l: [1.00, 1.00, 512] # summary: 831 layers, 26,454,880 parameters, 26,454,864 gradients, 89.7 GFLOPs
  x: [1.00, 1.50, 512] # summary: 831 layers, 59,216,928 parameters, 59,216,912 gradients, 200.3 GFLOPs

# YOLO12n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv,  [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv,  [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2,  [256, False, 0.25]]
  - [-1, 1, Conv,  [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2,  [512, False, 0.25]]
  - [-1, 1, Conv,  [512, 3, 2]] # 5-P4/16
  - [-1, 4, A2C2f, [512, True, 4]]
  - [-1, 1, Conv,  [1024, 3, 2]] # 7-P5/32
  - [-1, 4, A2C2f, [1024, True, 1]] # 8

# YOLO12n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, A2C2f, [512, False, -1]] # 11

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, A2C2f, [256, False, -1]] # 14

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 11], 1, Concat, [1]] # cat head P4
  - [-1, 2, A2C2f, [512, False, -1]] # 17

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 8], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 20 (P5/32-large)

  - [[14, 17, 20], 1, Detect, [nc]] # Detect(P3, P4, P5)
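
The layer-by-layer summary in the next section can be reproduced by instantiating the model from this YAML (a sketch assuming the ultralytics-style API; the scale suffix in the filename selects the variant, here m). Note that the output below was produced with a 1-class dataset, so exact parameter counts will differ with the default nc of 80:

from ultralytics import YOLO

# Building the model from the YAML prints the per-layer summary;
# model.info() reports the layer/parameter/GFLOPs totals again.
model = YOLO("yolov12m.yaml")
model.info()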

3. Output of a successful run

                   from  n    params  module                                       arguments                     
  0                  -1  1      1856  ultralytics.nn.modules.conv.Conv             [3, 64, 3, 2]                 
  1                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  2                  -1  1    111872  ultralytics.nn.modules.block.C3k2            [128, 256, 1, True, 0.25]     
  3                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
  4                  -1  1    444928  ultralytics.nn.modules.block.C3k2            [256, 512, 1, True, 0.25]     
  5                  -1  1   2360320  ultralytics.nn.modules.conv.Conv             [512, 512, 3, 2]              
  6                  -1  2   2690560  ultralytics.nn.modules.block.A2C2f           [512, 512, 2, True, 4]        
  7                  -1  1   2360320  ultralytics.nn.modules.conv.Conv             [512, 512, 3, 2]              
  8                  -1  2   2690560  ultralytics.nn.modules.block.A2C2f           [512, 512, 2, True, 1]        
  9                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 10             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 11                  -1  1   1248768  ultralytics.nn.modules.block.A2C2f           [1024, 512, 1, False, -1]     
 12                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 13             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 14                  -1  1    378624  ultralytics.nn.modules.block.A2C2f           [1024, 256, 1, False, -1]     
 15                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 16            [-1, 11]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 17                  -1  1   1183232  ultralytics.nn.modules.block.A2C2f           [768, 512, 1, False, -1]      
 18                  -1  1   2360320  ultralytics.nn.modules.conv.Conv             [512, 512, 3, 2]              
 19             [-1, 8]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 20                  -1  1   1642496  ultralytics.nn.modules.block.C3k2            [1024, 512, 1, True]          
 21        [14, 17, 20]  1   1411795  ultralytics.nn.modules.head.Detect           [1, [256, 512, 512]]          
YOLOv12m summary: 501 layers, 20,140,307 parameters, 20,140,291 gradients, 67.7 GFLOPs

Column index: YOLOv12 Improvements Directory at a Glance | Covering convolution layers, lightweight designs, attention, loss functions, Backbone, SPPF, Neck, detection heads, and other all-around improvements

Column link: YOLOv12 Improvements Column — from the perspective of publishing papers, quickly and accurately find innovations that deliver real gains!
