Fixing the flash-attn installation error when setting up LLaVA

林芝米林

Last modified 2025-04-05 17:57:30

First published 2025-04-05 17:56:57

Article link: https://blog.csdn.net/weixin_49314598/article/details/147013321

When deploying and fine-tuning the LLaVA multimodal large model, following the official steps one by one eventually has you run:

pip install flash-attn --no-build-isolation

which fails with:

Collecting flash-attn
  Downloading flash_attn-2.5.6.tar.gz (2.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 11.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-9u5e9dng/flash-attn_e362cbbd46404df8a4978593d8bb899c/setup.py", line 114, in <module>
          raise RuntimeError(
      RuntimeError: FlashAttention is only supported on CUDA 11.6 and above.  Note: make sure nvcc has a supported version by running nvcc -V.
      
      
      torch.__version__  = 2.1.2+cu121

      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
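The RuntimeError in this log comes from a version gate on nvcc. As a rough diagnostic, and not flash-attn's actual setup.py logic, the sketch below extracts the release number from `nvcc -V` output and compares it with the 11.6 minimum quoted in the message. The helper names and the two sample strings are illustrative assumptions, not captured logs.

```python
import re

MIN_CUDA = (11, 6)  # minimum quoted in the RuntimeError message


def parse_nvcc_release(nvcc_output: str):
    """Return (major, minor) from `nvcc -V` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(nvcc_output: str, minimum=MIN_CUDA) -> bool:
    """True if the nvcc release in the text is at least `minimum`."""
    release = parse_nvcc_release(nvcc_output)
    return release is not None and release >= minimum


# Illustrative lines in the format nvcc prints (assumed, not real logs).
old = "Cuda compilation tools, release 11.5, V11.5.119"
new = "Cuda compilation tools, release 12.1, V12.1.105"
print(meets_minimum(old), meets_minimum(new))
```

In practice you would feed it the real text, for example the output of `subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout`.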

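After a lot of searching, I found that this is fundamentally a CUDA and PyTorch version problem. Here is the solution that worked for me:

1. First, follow the official steps as usual: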

#Create and activate a conda environment:

   conda create -n llava python=3.10 -y
   conda activate llava

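2. To make sure the CUDA and PyTorch versions match, copy the install command for the version you need straight from the official Previous PyTorch Versions | PyTorch page, i.e. torch==2.1.2, torchvision==0.16.2, cuda=12.1 (other versions should also work, but they must correspond to one another).
After installing, check that everything is in order: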

python

import torch

This produces:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/liuzhenyu/miniconda3/envs/llava/lib/python3.10/site-packages/torch/__init__.py", line 1382, in <module>
    from .functional import *  # noqa: F403
  File "/home/liuzhenyu/miniconda3/envs/llava/lib/python3.10/site-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/home/liuzhenyu/miniconda3/envs/llava/lib/python3.10/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/home/liuzhenyu/miniconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/home/liuzhenyu/miniconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/home/liuzhenyu/miniconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /opt/conda/conda-bld/pytorch_1702400410390/work/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),

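According to the error message, PyTorch 2.1.2 was compiled against NumPy 1.x, while the NumPy in the current environment is 2.0.1, hence the incompatibility. The fix is simply:

# Uninstall the current NumPy 2.x
pip uninstall numpy -y

# Install a compatible NumPy 1.x release (e.g. 1.24.4)
pip install numpy==1.24.4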

After this, PyTorch imports without problems.
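One way to confirm the downgrade took effect is to check the major version of the installed NumPy. The sketch below is an illustrative check rather than part of the LLaVA instructions. The function names are made up for this example, and the "NumPy 1.x only" rule is taken from the explanation above.

```python
from importlib import metadata


def numpy_major(version: str) -> int:
    """Major component of a version string such as '1.24.4'."""
    return int(version.split(".")[0])


def is_compatible_with_torch_2_1(version: str) -> bool:
    """Assumption from the text: PyTorch 2.1.2 needs a NumPy 1.x release."""
    return numpy_major(version) < 2


try:
    installed = metadata.version("numpy")
    ok = is_compatible_with_torch_2_1(installed)
    print(f"numpy {installed}: compatible = {ok}")
except metadata.PackageNotFoundError:
    print("numpy is not installed in this environment")
```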

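One problem remains, though: checking the CUDA version with nvcc -V reports 11.5.

That 11.5 is the CUDA version that ships with my system, not the 12.1 we wanted to install. Following the Zhihu post "记录贴：用conda管理cuda" (notes on managing CUDA with conda), the fix is to install cuda-nvcc: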

conda install cuda-nvcc

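But conda reports a dependency conflict.

Cause of the conflict:
- pytorch-cuda-12.1 requires cuda-cudart >=12.1,<12.2 (i.e. CUDA 12.1).
- The cuda-nvcc build that conda picks by default requires cuda-cudart >=12.4 (i.e. CUDA 12.4).
- The two requirements on cuda-cudart cannot both be met, so the packages cannot coexist.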
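To see why no single cuda-cudart version can satisfy both packages, the two constraints can be checked as version ranges. This is a minimal sketch of the range logic only, not how conda's solver works. The helper names are hypothetical, and the candidate version numbers are picked just to probe both ranges.

```python
def parse(version: str):
    """Turn '12.1.105' into (12, 1, 105) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, lower: str, upper=None) -> bool:
    """True if lower <= version < upper (the upper bound is optional)."""
    v = parse(version)
    if v < parse(lower):
        return False
    return upper is None or v < parse(upper)


def both_satisfied(version: str) -> bool:
    pytorch_cuda_ok = satisfies(version, "12.1", "12.2")  # >=12.1,<12.2
    cuda_nvcc_ok = satisfies(version, "12.4")  # >=12.4
    return pytorch_cuda_ok and cuda_nvcc_ok


# Hypothetical cuda-cudart versions, chosen only to probe the two ranges.
candidates = ["12.0.146", "12.1.105", "12.2.140", "12.4.127"]
print([v for v in candidates if both_satisfied(v)])  # prints []
```

The ranges do not overlap, so the list is empty whichever candidates are tried. One of the two packages has to move to a different version.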

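So if CUDA 12.1 has to stay, look for a cuda-nvcc version that is compatible with it:

# 1. List the available cuda-nvcc versions and look for one that matches CUDA 12.1
conda search cuda-nvcc --channel nvidia

# 2. Install that specific version (assuming a 12.1 build is found)
conda install cuda-nvcc=12.1 -c nvidia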

Running nvcc -V again now reports the 12.1 we wanted.

That settles the CUDA installation, which is the most complicated part.
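Since the system and the conda environment can each carry their own nvcc, it can help to check which one is actually first on PATH. The sketch below is an assumption-laden helper, not an official tool. It treats `sys.prefix` as the active environment's directory (true when it is run with the environment's own python), and `is_inside` is a name invented for this example.

```python
import shutil
import sys
from pathlib import Path


def is_inside(path: str, prefix: str) -> bool:
    """True if `path` lives under the directory `prefix`."""
    try:
        Path(path).absolute().relative_to(Path(prefix).absolute())
        return True
    except ValueError:
        return False


nvcc = shutil.which("nvcc")  # first nvcc found on PATH, or None
if nvcc is None:
    print("no nvcc on PATH")
elif is_inside(nvcc, sys.prefix):
    print(f"{nvcc} belongs to the active environment")
else:
    print(f"{nvcc} is outside the active environment (system CUDA?)")
```

If the last branch fires after installing cuda-nvcc, reactivating the environment so that its bin directory comes first on PATH is a reasonable next thing to try.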

From here, continue with the original installation steps:

3. Upgrade pip and install the package:
   
   pip install --upgrade pip  # enable PEP 660 support
   pip install -e .

4. Install additional packages for training cases:
   
   pip install -e ".[train]"
   pip install flash-attn --no-build-isolation

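But the flash-attn install fails again.

So instead, download a prebuilt wheel directly from the Releases · Dao-AILab/flash-attention page, making sure it matches the CUDA and PyTorch versions installed earlier (a mismatched wheel will not work). Once you have found the right one, download and install it: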

wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.3/flash_attn-2.7.3+cu12torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl

pip install flash_attn-2.7.3+cu12torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl --no-build-isolation
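The wheel filename encodes the versions it was built for, which makes the "must match" requirement checkable. The sketch below parses those tags and compares them with an environment description. The regular expression is inferred from this one filename and may not cover every release, and `parse_wheel_name` and `matches_env` are hypothetical helpers.

```python
import re

# Pattern inferred from the filename used above (an assumption).
WHEEL_PATTERN = re.compile(
    r"flash_attn-(?P<version>[\d.]+)"
    r"\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<abi>TRUE|FALSE)"
    r"-(?P<python>cp\d+)-cp\d+-(?P<platform>.+)\.whl"
)


def parse_wheel_name(filename: str) -> dict:
    """Split a flash-attn wheel filename into its version tags."""
    match = WHEEL_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognised wheel name: {filename}")
    return match.groupdict()


def matches_env(tags: dict, torch_version: str, cuda_version: str,
                python_tag: str) -> bool:
    """Compare torch major.minor, CUDA major and the CPython tag."""
    return (
        torch_version.startswith(tags["torch"] + ".")
        and cuda_version.split(".")[0] == tags["cuda"]
        and python_tag == tags["python"]
    )


name = "flash_attn-2.7.3+cu12torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
tags = parse_wheel_name(name)
print(tags)
print(matches_env(tags, "2.1.2+cu121", "12.1", "cp310"))
```

The C++ ABI flag (`cxx11abiFALSE` here) is not checked by `matches_env`. In an environment with PyTorch installed, `torch.compiled_with_cxx11_abi()` reports the value to compare it against.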

pip then reports that the installation succeeded.

All done at last!