System: the install failed on Ubuntu 24.04 LTS, so I switched to ubuntu-22.04.5-live-server-amd64 to run the script.
Reference blog
vllm-rocm install script
## AMD ROCm is built against Ubuntu 22.04, so 24.04 is missing some dependency packages; add the jammy repos
sudo add-apt-repository -y -s "deb http://security.ubuntu.com/ubuntu jammy main universe"
## One-click deployment script
curl -L https://vllm.9700001.xyz/install.sh -o install.sh && chmod +x install.sh && bash install.sh
Current status
- Installation was in progress…; gave up in the end
Issue log
- Missing numpy package; the build then falls through to the CUDA path and dies on "CUDA_HOME is not set", because no ROCm runtime was detected
error: subprocess-exited-with-error
× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
/tmp/pip-build-env-s6y2y8ao/overlay/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
No ROCm runtime is found, using ROCM_HOME='/opt/rocm'
Traceback (most recent call last):
File "/data/vllmenv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
main()
File "/data/vllmenv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/data/vllmenv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 144, in get_requires_for_build_editable
return hook(config_settings)
File "/tmp/pip-build-env-s6y2y8ao/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 473, in get_requires_for_build_editable
return self.get_requires_for_build_wheel(config_settings)
File "/tmp/pip-build-env-s6y2y8ao/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 331, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=[])
File "/tmp/pip-build-env-s6y2y8ao/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 301, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-s6y2y8ao/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 317, in run_setup
exec(code, locals())
File "<string>", line 606, in <module>
File "<string>", line 475, in get_vllm_version
File "<string>", line 428, in get_nvcc_cuda_version
AssertionError: CUDA_HOME is not set
[end of output]
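The log shows the build falling through to the CUDA code path because no ROCm runtime was detected. A pre-flight sketch before retrying (VLLM_TARGET_DEVICE is read by recent vLLM setup.py versions, verify against your checkout; paths assume ROCm lives in /opt/rocm):
## Pre-flight sketch, assumptions as noted above
pip install numpy                  ## silence the NumPy init warning in the build env
/opt/rocm/bin/rocminfo | head      ## confirm a ROCm runtime is actually present
export ROCM_HOME=/opt/rocm
export VLLM_TARGET_DEVICE=rocm     ## steer setup.py away from the CUDA/nvcc path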
- GPU passthrough under PVE hangs the machine; this card's passthrough doesn't work under PVE or even other KVM-based VMs, so just run it on a physical machine.
Prebuilt Docker image
## This image is huge, 16.18 GB, so pulling it takes quite a while
docker pull btbtyler09/vllm-rocm-gcn5:0.8.5
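To actually run it, the standard ROCm container flags apply. A run sketch (the port mapping and lack of extra serving args are assumptions; check the image's Docker Hub page):
## Run sketch; /dev/kfd and /dev/dri are the standard ROCm device nodes
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host \
  -p 8000:8000 \
  btbtyler09/vllm-rocm-gcn5:0.8.5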
Ollama
After getting nowhere building vllm-rocm, I switched to Ollama; the one-click install script from the official site is all it takes (shown below).
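For reference, the official one-click install:
curl -fsSL https://ollama.com/install.sh | sh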
Environment variables
- OLLAMA_HOST=http://0.0.0.0:11434
- OLLAMA_MODELS=/data/ollama/.ollama
- OLLAMA_KEEP_ALIVE=10m
- OLLAMA_NUM_PARALLEL=1
- OLLAMA_MAX_LOADED_MODELS=3
- OLLAMA_FLASH_ATTENTION=1
- OLLAMA_CONTEXT_LENGTH=8192
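On Linux these end up in the systemd unit (the full unit used here is in the next section); a drop-in override is a cleaner way to apply them than editing the unit in place:
## Creates /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl edit ollama
## then add, for example:
[Service]
Environment="OLLAMA_HOST=http://0.0.0.0:11434"
Environment="OLLAMA_CONTEXT_LENGTH=8192"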
systemd
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"
Environment="OLLAMA_HOST=http://0.0.0.0:11434"
Environment="OLLAMA_MODELS=/data/ollama/.ollama"
Environment="OLLAMA_KEEP_ALIVE=10m"
Environment="OLLAMA_NUM_PARALLEL=1"
Environment="OLLAMA_MAX_LOADED_MODELS=3"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_DEBUG=1"
Environment="OLLAMA_CONTEXT_LENGTH=8192"
[Install]
WantedBy=default.target
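After changing the unit, reload and restart, then verify:
sudo systemctl daemon-reload
sudo systemctl restart ollama
journalctl -u ollama -f                    ## OLLAMA_DEBUG=1 makes startup logging verbose
curl http://127.0.0.1:11434/api/version    ## quick liveness check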
Windows
Just set the same variables as system environment variables.
Issue log
- FastGPT integration: after running for a while it seems to lock up, and Ollama starts misbehaving
- FastGPT knowledge-base integration: with a long system prompt, the system message had no effect. Setting OLLAMA_CONTEXT_LENGTH=8192 fixes it; the default context window is probably too short, so the knowledge-base prompt was being truncated and ignored.
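As a cross-check, the context window can also be raised per request via the API's num_ctx option (standard Ollama API; the model name is a placeholder):
## num_ctx per request; replace the model name with one you have pulled
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "hello",
  "options": { "num_ctx": 8192 }
}'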