07-04 (Thursday): Problems and Solutions When Installing vLLM (LLMs_inference) from Source

Last modified: 2024-07-10 09:59:42

First published: 2024-07-10 09:57:43

Article link: https://blog.csdn.net/lk142500/article/details/140315365

Time	Version	Modified by	Description
2024-07-04 09:48:09	V0.1	宋全恒	Document created

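Introduction

I recently needed to integrate new features into vLLM, so I had to be able to debug my own repository, LLMs_Inference. This document records the full process of building it from source.

Reference link:

Build from source

Normally, running the following commands is enough to build and install from source: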

git clone https://github.com/vllm-project/vllm.git
cd vllm
# export VLLM_INSTALL_PUNICA_KERNELS=1 # optionally build for multi-LoRA capability
pip install -e .  # This may take 5-10 minutes.

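In practice it turned out to be more troublesome. The LLMs_Inference repository is a fork of the vllm repository, so in theory the procedure should be the same.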

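Repository overview

The repository contains several requirements files.

These files record the project's dependencies so that they can be installed and configured for a specific environment.

requirements.txt: lists the dependencies the project needs, together with their version constraints. Pinning libraries and versions in this file makes it possible to install everything in one step.
requirements-cpu.txt, requirements-cuda.txt, requirements-rocm.txt, requirements-neuron.txt: dependency lists for specific hardware or compute backends. requirements-cuda.txt covers dependencies for CUDA (Compute Unified Device Architecture, NVIDIA's parallel computing platform and programming model); requirements-rocm.txt covers ROCm (Radeon Open Compute platform, AMD's open-source compute platform); requirements-neuron.txt covers AWS Neuron, the SDK for AWS Inferentia and Trainium accelerators.

requirements-dev.txt lists extra dependencies for the development environment. They are not needed at runtime, but are needed for development, testing, and building.

Whether a source build of vLLM needs all of these files depends on your use case.

If you plan to run or develop vLLM on a particular hardware backend (CUDA, ROCm, and so on), install the requirements file that matches that backend.

For a vLLM install, the usual sequence is: create and activate a conda environment, check the PyTorch version and other dependencies pinned in requirements.txt, and then install.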
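Checking which PyTorch version a requirements file pins can be scripted. The following is a minimal sketch, not part of the original workflow: the helper name `find_pinned_version` and the sample file content are hypothetical, and it only recognises exact `==` pins.

```python
import re


def find_pinned_version(requirements_text, package):
    """Return the version pinned with '==' for `package`, or None if not pinned."""
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*==\s*([^\s;#]+)",
        re.IGNORECASE,
    )
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


# Hypothetical file content, for illustration only; read the real file instead.
sample = """\
# Dependencies for NVIDIA GPUs
ray >= 2.9
torch == 2.3.0
torchvision
"""

print(find_pinned_version(sample, "torch"))
```

To run it against the repository, pass the text of whichever requirements file pins torch, for example `open("requirements-cuda.txt").read()`.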

Preparing the vLLM development environment

Installing directly on the host

Fixing the torch dependency problem

(llms_inference) yuzailiang@ubuntu:/mnt/self-define/sunning/lmdeploy/LLMs_Inference$ python -c "import torch; print('device count:',torch.cuda.device_count(), 'available: ', torch.cuda.is_available())"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/site-packages/torch/__init__.py", line 237, in <module>
    from torch._C import *  # noqa: F403
ImportError: /home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/site-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12

The output above comes from the direct install: torch==2.3.0 was installed, but importing it fails, so the GPU cannot be used.
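An undefined symbol in libnvJitLink.so.12 suggests that libcusparse is being loaded against a different libnvJitLink than the one it was built for. This diagnosis is an assumption and is not confirmed in the original. The sketch below uses only the standard library to list the installed torch and nvidia-* pip packages and the directories in LD_LIBRARY_PATH, so that mismatched versions or a stray system CUDA directory are easier to spot. The helper names are hypothetical.

```python
import os
from importlib import metadata


def cuda_related_packages(prefixes=("torch", "nvidia-")):
    """Map installed distribution names that start with `prefixes` to versions."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and name.lower().startswith(prefixes):
            found[name] = dist.version
    return found


def library_path_entries():
    """Return the directories listed in LD_LIBRARY_PATH, in search order."""
    raw = os.environ.get("LD_LIBRARY_PATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


for name, version in sorted(cuda_related_packages().items()):
    print(f"{name}=={version}")
for entry in library_path_entries():
    print("LD_LIBRARY_PATH entry:", entry)
```

The script does not import torch, so it still runs in an environment where `import torch` fails.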

Solution:

Install the torch dependency separately

pip install torch==2.3.0 torchvision torchaudio --index-url  https://download.pytorch.org/whl/cu118

After installation, verify that torch can drive CUDA and see the GPUs

(llms_inference) yuzailiang@ubuntu:/mnt/self-define/sunning/lmdeploy/LLMs_Inference$ python -c "import torch; print('device count:',torch.cuda.device_count(), 'available: ', torch.cuda.is_available())"
device count: 8 available:  True
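The one-liner above can be extended to also report which CUDA version the installed torch wheel was built for, which helps to tell a cu118 build from a cu121 build. This is a minimal sketch: the function name `torch_cuda_report` is hypothetical, while `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()` are standard PyTorch attributes.

```python
def torch_cuda_report():
    """Collect basic facts about the installed torch build and its GPU access."""
    try:
        import torch
    except ImportError as exc:
        # Also covers shared-library failures such as the undefined-symbol error.
        return {"import_error": str(exc)}
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


for key, value in torch_cuda_report().items():
    print(f"{key}: {value}")
```

Running it again after `pip install -e .` shows whether the editable install replaced the torch build that was installed by hand.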

Continue with pip install -e .

The problem is still there

packages/torch/__init__.py", line 237, in <module>
    from torch._C import *  # noqa: F403
ImportError: /home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/site-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12

Trying cu121: Couldn't find CUDA library root.

pip install torch==2.3.0 torchvision torchaudio --index-url  https://download.pytorch.org/whl/cu121

Building wheels for collected packages: vllm
  Building editable for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building editable for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [139 lines of output]
      running editable_wheel
      creating /tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info
      writing /tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/PKG-INFO
      writing dependency_links to /tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/dependency_links.txt
      writing requirements to /tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/requires.txt
      writing top-level names to /tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/top_level.txt
      writing manifest file '/tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/SOURCES.txt'
      reading manifest file '/tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/SOURCES.txt'
      reading manifest template 'MANIFEST.in'
      adding license file 'LICENSE'
      writing manifest file '/tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm.egg-info/SOURCES.txt'
      creating '/tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm-0.4.2+cu120.dist-info'
      creating /tmp/pip-wheel-c7m73v0l/.tmp-t6j0dz53/vllm-0.4.2+cu120.dist-info/WHEEL
      running build_py
      running build_ext
      -- The CXX compiler identification is GNU 9.4.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Build type: RelWithDebInfo
      -- Target device: cuda
      -- Found Python: /home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9 (found version "3.9.19") found components: Interpreter Development.Module
      -- Found python matching: /home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9.
      -- Found CUDA: /usr/local/cuda-12.0 (found version "12.0")
      CMake Error at /tmp/pip-build-env-xxhrqd7n/overlay/lib/python3.9/site-packages/cmake/data/share/cmake-3.30/Modules/Internal/CMakeCUDAFindToolkit.cmake:148 (message):
        Couldn't find CUDA library root.
      Call Stack (most recent call first):
      
            subprocess.CalledProcessError: Command '['cmake', '/mnt/self-define/sunning/lmdeploy/LLMs_Inference', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/tmpwkyqp9r6.build-lib/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=/tmp/tmpuvdjn65m.build-temp', '-DVLLM_TARGET_DEVICE=cuda', '-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=ccache', '-DVLLM_PYTHON_EXECUTABLE=/home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=256']' returned non-zero exit status 1.

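The environment here is confusing. Although torch was installed from the cu121 index, the build log shows cu120, that is, CUDA 12.0, in use: the wheel is named vllm-0.4.2+cu120 and CMake reports "Found CUDA: /usr/local/cuda-12.0".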
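The `+cuXYZ` suffix in a version string can be decoded mechanically. The sketch below assumes the common convention that the last digit is the minor version and the preceding digits are the major version. The helper name is hypothetical.

```python
import re


def cuda_version_from_tag(version_string):
    """Turn a '+cuXYZ' local version suffix into 'XY.Z', or None if absent."""
    match = re.search(r"\+cu(\d+)(\d)$", version_string)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


print(cuda_version_from_tag("0.4.2+cu120"))
print(cuda_version_from_tag("2.3.0+cu118"))
```

Applied to the wheel name in the log, this gives 12.0, which matches the toolkit path reported by CMake.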

Building wheels for collected packages: vllm
  Building editable for vllm (pyproject.toml) ... error
  
subprocess.CalledProcessError: Command '['cmake', '/mnt/self-define/sunning/lmdeploy/LLMs_Inference', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/tmplnbtei_9.build-lib/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=/tmp/tmpdu5xwhxm.build-temp', '-DVLLM_TARGET_DEVICE=cuda', '-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=ccache', '-DVLLM_PYTHON_EXECUTABLE=/home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=256']' returned non-zero exit status 1.

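However, nvidia-smi reports CUDA version 12.4, which looks odd at first. The version nvidia-smi prints is the highest CUDA version the installed driver supports, not the version of the installed toolkit, so the two can legitimately differ.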

(base) yuzailiang@ubuntu:/mnt/self-define/sunning/lmdeploy/LLMs_Inference$ nvidia-smi 
Tue Jul  9 09:02:46 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+---------------------
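To compare the driver-side and toolkit-side versions without reading them by eye, the two numbers can be extracted from the command output. This is a minimal sketch with hypothetical helper names. The nvidia-smi line is copied from the output above. The nvcc line is an illustrative sample of the usual `nvcc --version` format, not output captured on this machine.

```python
import re


def driver_cuda_version(nvidia_smi_output):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_output)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_output):
    """Extract the 'release X.Y' field from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


smi_header = (
    "| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14"
    "      CUDA Version: 12.4     |"
)
# Illustrative sample only; run `nvcc --version` to obtain the real text.
nvcc_sample = "Cuda compilation tools, release 12.0, V12.0.76"

print("driver supports up to CUDA", driver_cuda_version(smi_header))
print("toolkit (nvcc) is CUDA", toolkit_cuda_version(nvcc_sample))
```

On a live system the input strings could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` and the equivalent call for `nvcc --version`.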

And under /usr/local there is no CUDA 12.4 directory.

(base) yuzailiang@ubuntu:/usr/local$ ll | grep cuda
lrwxrwxrwx  1 root root   22 Jul  8 05:22 cuda -> /etc/alternatives/cuda/
lrwxrwxrwx  1 root root   25 Jul  8 05:22 cuda-12 -> /etc/alternatives/cuda-12/
drwxr-xr-x 17 root root 4096 Jun 2