1. Preparation
Request a Meta AI download link
At the time of writing, Meta AI only lets approved applicants use Llama 2, so you need to request a download URL in advance on the Meta AI website:
https://ai.meta.com/resources/models-and-libraries/llama-downloads/
After you submit the form, the URL for the Llama models and code is sent to the email address you entered.
Note that the link is only valid for 24 hours.
Create a user
groupadd llm
useradd -m -g llm <username>
passwd <username>
Grant the user sudo privileges
usermod -aG wheel <username>
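To confirm that the group change took effect, a small check like the one below may help. It is only a sketch: the helper name in_group and the user name alice are made up for illustration, and it assumes (as on a default CentOS install) that members of wheel are allowed to use sudo. The new membership normally applies from the user's next login.

```shell
# Return success if user $1 belongs to group $2.
# `id -nG` prints the user's group names separated by spaces.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Example with a hypothetical user name "alice":
#   in_group alice wheel && echo "alice is in wheel"
```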
2. Install CUDA
Check the GPU model
nvidia-smi (NVIDIA System Management Interface)
[root@worker85 ~]# nvidia-smi
Sat Nov 25 16:14:32 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A800 80GB PCIe Off | 00000000:31:00.0 Off | 0 |
| N/A 67C P0 88W / 300W | 4MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA A800 80GB PCIe Off | 00000000:98:00.0 Off | 0 |
| N/A 73C P0 106W / 300W | 4MiB / 81920MiB | 24% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
The output shows that this machine has two A800 GPUs.
Check the CUDA version
[root@worker85 ~]# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
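The output shows that the installed CUDA version is 12.2.
The Stable Diffusion setup we also need requires CUDA 11.8, which is incompatible with the current 12.2, so CUDA has to be reinstalled.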
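The version comparison described above can be scripted. The following is a minimal sketch, not part of the original procedure: nvcc_release is a hypothetical helper name, and the required version 11.8 is taken from this guide.

```shell
# Extract the "release X.Y" number from `nvcc --version` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Compare against the version this guide installs (11.8).
required="11.8"
if command -v nvcc >/dev/null 2>&1; then
    current=$(nvcc --version | nvcc_release)
    if [ "$current" = "$required" ]; then
        echo "CUDA toolkit $current already matches"
    else
        echo "CUDA toolkit is $current, expected $required"
    fi
else
    echo "nvcc not found on PATH"
fi
```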
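Disable nouveau
(nouveau is an open-source graphics driver for NVIDIA GPUs. It can cause problems in some cases, such as freezes or crashes on certain NVIDIA cards, possibly because of hardware compatibility issues or limitations of the driver. In that case, the official closed-source NVIDIA driver usually gives better stability and performance.)
Follow the tutorial below to disable it.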
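For reference, a common way to disable nouveau on CentOS is to blacklist the module and rebuild the initramfs. The sketch below is an assumption about what such a tutorial covers, not a copy of it. The helper name write_nouveau_blacklist is hypothetical, and it takes the target directory as an argument so it can be tried on a scratch directory first.

```shell
# Write a modprobe blacklist file for nouveau into directory $1
# (normally /etc/modprobe.d, which requires root).
write_nouveau_blacklist() {
    conf_dir="$1"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "$conf_dir/blacklist-nouveau.conf"
}

# Typical usage on the target machine (run as root, then reboot):
#   write_nouveau_blacklist /etc/modprobe.d
#   dracut --force          # rebuild the initramfs so the blacklist applies at boot
#   reboot
#   lsmod | grep nouveau    # no output means nouveau is no longer loaded
```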
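Download and install CUDA 11.8
CUDA Toolkit 11.8 Downloads | NVIDIA Developer
https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=CentOS&target_version=7&target_type=runfile_local
Here we use the runfile (local) installer for CentOS.
Check disk space
du -sh ./* # show the size of each file/folder in the current directory
df -h # show used and available space on each mounted filesystem
Move files and create a symlink
If the files on one disk take up too much space, move them to another disk and create a symlink at the original location: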
mv file /dest_folder/
ln -s /dest_folder/file file
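The move-then-link step can be wrapped in one function so the two commands cannot get out of sync. This is a sketch only; relocate is a hypothetical helper name, and it uses an absolute link target so the symlink keeps working regardless of the directory it is created from.

```shell
# Move $1 into directory $2 and leave a symlink at the original location.
relocate() {
    src="$1"
    dest_dir="$2"
    name=$(basename "$src")
    mkdir -p "$dest_dir" || return 1
    mv "$src" "$dest_dir/" || return 1
    # use an absolute target so the link does not depend on the current directory
    ln -s "$(cd "$dest_dir" && pwd)/$name" "$src"
}

# Example: relocate ./file /dest_folder
```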
Install CUDA 11.8
(/home/sd/conda_envs/llama2) [sd@worker85 cuda]$ sudo sh cuda_11.8.0_520.61.05_linux.run
[sudo] password for sd:
┌──────────────────────────────────────────────────────────────────────────────┐
│ End User License Agreement │
│ -------------------------- │
│ │
│ NVIDIA Software License Agreement and CUDA Supplement to │
│ Software License Agreement. Last updated: October 8, 2021 │
│ │
│ The CUDA Toolkit End User License Agreement applies to the │
│ NVIDIA CUDA Toolkit, the NVIDIA CUDA Samples, the NVIDIA │
│ Display Driver, NVIDIA Nsight tools (Visual Studio Edition), │
│ and the associated documentation on CUDA APIs, programming │
│ model and development tools. If you do not agree with the │
│ terms and conditions of the license agreement, then do not │
│ download or use the software. │
│ │
│ Last updated: October 8, 2021. │
│ │
│ │
│ Preface │
│ ------- │
│ │
│──────────────────────────────────────────────────────────────────────────────│
│ Do you accept the above EULA? (accept/decline/quit): │
│ accept │
└──────────────────────────────────────────────────────────────────────────────┘
Type accept.
│ CUDA Installer │
│ - [X] Driver │
│ [X] 520.61.05 │
│ + [X] CUDA Toolkit 11.8 │
│ [X] CUDA Demo Suite 11.8 │
│ [X] CUDA Documentation 11.8 │
│ - [ ] Kernel Objects │
│ [ ] nvidia-fs │
│ Options │
│ Install │
│ │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘
Select [Install].
- Note: if the driver is already installed, you can leave Driver unselected.
- Do not select nvidia-fs either, or the installation fails at the end.
┌──────────────────────────────────────────────────────────────────────────────┐
│ Existing installation of CUDA Toolkit 11.8 found: │
│ Upgrade all │
│ Choose components to upgrade │
│ No, abort installation │
│ │
│ │
│ Up/Down: Move | 'Enter': Select │
└──────────────────────────────────────────────────────────────────────────────┘
Select [Upgrade all] here.
After installation, CUDA is usually located at the following path:
/usr/local/cuda-11.8/bin
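For the shell to pick up this toolkit, its directories usually need to be on the search paths. A minimal sketch follows. It assumes the default install prefix shown above and the usual lib64 subdirectory, and CUDA_HOME is just a conventional variable name, not something the installer sets. Adding these lines to ~/.bashrc makes them persistent.

```shell
# Put the CUDA 11.8 tools and libraries first on the search paths.
CUDA_HOME=/usr/local/cuda-11.8
export CUDA_HOME
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show which nvcc the shell now resolves (prints nothing if CUDA is not installed here).
command -v nvcc || true
```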
Install Anaconda
bash <Anaconda installer file>.sh
Share Anaconda among multiple users (assuming Anaconda is installed at /usr/local/minianaconda)
/usr/local/minianaconda/bin/conda init
source /home/<username>/.bashrc
Install screen
yum install screen
# create a new screen session
screen -S sd1
# show the name of the current screen session
echo $STY
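Create a conda environment
To change where conda environments are installed (useful when there are several disks):
conda config --append envs_dirs <new environments path>
My environment cannot perform SSL verification, so the following must be configured beforehand:
- Switch the mirror channel: conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/cloud/pytorch/linux-64/
- Disable SSL verification: conda config --set ssl_verify false
With miniconda, specify the Python version when creating the environment: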
conda create -n sd1 python=3.10
conda activate sd1
3. Set up a local Llama 2 environment
Reference: "Deployment tutorial for a test environment of Meta's open-source LLM Llama 2" (Zhihu)
Network unreachable
(/home/sd/conda_envs/sd1) [sd@worker85 stable-diffusion-webui]$ pip install datasets -i https://pypi.tuna.tsinghua.edu.cn/simple/
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f432e0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f435e0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f43790>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f43940>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
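At first this error kept appearing. I initially assumed it was a DNS configuration problem, but it turned out that someone had turned off my network proxy.
Re-exporting the proxy settings in each SSH session fixes it: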
export http_proxy=http://192.168.184.103:3128
export https_proxy=$http_proxy
export no_proxy=127.0.0.1,localhost,local,.local
Then test the connection:
(/home/sd/conda_envs/sd1) [sd@worker85 ~]$ wget www.baidu.com
--2023-11-25 22:07:54-- http://www.baidu.com/
Connecting to 192.168.184.103:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 2381 (2.3K) [text/html]
Saving to: ‘index.html’
index.html 100%[=============================================================================================>] 2.33K --.-KB/s in 0s
2023-11-25 22:07:55 (21.6 MB/s) - ‘index.html’ saved [2381/2381]
If the output looks like this, the network is working again.
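Since the proxy has to be re-exported in every new SSH session, it can be convenient to keep the exports in a pair of small functions, for example in ~/.bashrc. This is a sketch: the names set_proxy and unset_proxy are hypothetical, and the no_proxy list is the one used above.

```shell
# Export the proxy variables for the current shell.
# $1 is the proxy URL, e.g. http://192.168.184.103:3128 as used in this guide.
set_proxy() {
    export http_proxy="$1"
    export https_proxy="$1"
    export no_proxy="127.0.0.1,localhost,local,.local"
}

# Remove the proxy variables again.
unset_proxy() {
    unset http_proxy https_proxy no_proxy
}

# Example: set_proxy http://192.168.184.103:3128
```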
Download Llama 2
git clone https://github.com/facebookresearch/llama.git
cd llama/
Install the Python dependencies
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install transformers -i https://pypi.tuna.tsinghua.edu.cn/simple/
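Download the model weights
chmod +x download.sh # requires wget; if it is missing, install it with: yum install -y wget
./download.sh # when prompted, enter (1) the custom URL from the email and (2) 7B-chat
Running download.sh requires the URL requested in the first step. For example, mine looks like this: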
https://download.llamameta.net/*?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjoidm40bmpqZm1oMHp3YnN4YmJwczlldWt1IiwiUmVzb3VyY2UiOiJodHRwczpcL1wvZG93bmxvYWQubGxhbWFtZXRhLm5ldFwvKiIsIkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcwMTAxNTg5MH19fV19&Signature=uPezLGgClEYl0vBZfS9smN2oLDTsi3jvv1SumtMl0Utj8V2LwE0KhAq4i%7ET97pf8nXrO46TClOFMdVfMy4dFyEJAop7-FvwpGWDM4ZWY4PYlf3t0nJ4YSU4Hrbe3GzApRfADwI23nfKTUibG1CzG6x9EhhrXavY2BcmB8WHCGWmgPSJ4EykvEd61ibLLt80HwAenb2tAmLr169Vcg0y0yszV476CO95o9y3dh0k1MKin1D9tY%7E-IUAslV9grl-Auiv%7EoxxmUdu8j2Ipb82BB7T9f5Yq7CsW-HPmBE1BOsC7RPzWqL%7EyQvi3sqyOvoNiGUkK0t4Ro2zWTjlbEZXVmNg__&Key-Pair-Id=K15QRJLYKIFSLZ&Download-Request-ID=817512153460239
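Because the link expires after 24 hours, it can be useful to check how long it remains valid before starting a large download. The sketch below rests on an assumption: the Policy parameter appears to be base64-encoded JSON with URL-safe character substitutions ('-' for '+', '_' for '=', '~' for '/') that contains an "AWS:EpochTime" expiry. The helper name url_expiry is hypothetical, and base64 -d is the GNU coreutils form.

```shell
# Print the expiry (Unix epoch seconds) embedded in a presigned URL's Policy parameter.
url_expiry() {
    policy=$(printf '%s' "$1" | sed -n 's/.*[?&]Policy=\([^&]*\).*/\1/p')
    # undo the URL-safe substitutions, then restore any stripped base64 padding
    policy=$(printf '%s' "$policy" | tr '_~-' '=/+')
    while [ $(( ${#policy} % 4 )) -ne 0 ]; do policy="$policy="; done
    printf '%s' "$policy" | base64 -d | sed -n 's/.*"AWS:EpochTime":\([0-9]*\).*/\1/p'
}

# Example (GNU date converts the epoch value to a readable time):
#   date -d "@$(url_expiry "$PRESIGNED_URL")"
```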
You may hit an SSL certificate verification failure along the way. My workaround was to add the following flag to every wget command in download.sh:
--no-check-certificate
download.sh then becomes:
#!/usr/bin/env bash
# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed according to the terms of the Llama 2 Community License Agreement.
set -e
read -p "Enter the URL from email: " PRESIGNED_URL
echo ""
read -p "Enter the list of models to download without spaces (7B,13B,70B,7B-chat,13B-chat,70B-chat), or press Enter for all: " MODEL_SIZE
TARGET_FOLDER="." # where all files should end up
mkdir -p ${TARGET_FOLDER}
if [[ $MODEL_SIZE == "" ]]; then
    MODEL_SIZE="7B,13B,70B,7B-chat,13B-chat,70B-chat"
fi
echo "Downloading LICENSE and Acceptable Usage Policy"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"LICENSE"} -O ${TARGET_FOLDER}"/LICENSE"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"USE_POLICY.md"} -O ${TARGET_FOLDER}"/USE_POLICY.md"
echo "Downloading tokenizer"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"tokenizer.model"} -O ${TARGET_FOLDER}"/tokenizer.model"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"tokenizer_checklist.chk"} -O ${TARGET_FOLDER}"/tokenizer_checklist.chk"
CPU_ARCH=$(uname -m)
if [ "$CPU_ARCH" = "arm64" ]; then
    (cd ${TARGET_FOLDER} && md5 tokenizer_checklist.chk)
else
    (cd ${TARGET_FOLDER} && md5sum -c tokenizer_checklist.chk)
fi
for m in ${MODEL_SIZE//,/ }
do
    if [[ $m == "7B" ]]; then
        SHARD=0
        MODEL_PATH="llama-2-7b"
    elif [[ $m == "7B-chat" ]]; then
        SHARD=0
        MODEL_PATH="llama-2-7b-chat"
    elif [[ $m == "13B" ]]; then
        SHARD=1
        MODEL_PATH="llama-2-13b"
    elif [[ $m == "13B-chat" ]]; then
        SHARD=1
        MODEL_PATH="llama-2-13b-chat"
    elif [[ $m == "70B" ]]; then
        SHARD=7
        MODEL_PATH="llama-2-70b"
    elif [[ $m == "70B-chat" ]]; then
        SHARD=7
        MODEL_PATH="llama-2-70b-chat"
    fi
    echo "Downloading ${MODEL_PATH}"
    mkdir -p ${TARGET_FOLDER}"/${MODEL_PATH}"
    for s in $(seq -f "0%g" 0 ${SHARD})
    do
        wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/consolidated.${s}.pth"} -O ${TARGET_FOLDER}"/${MODEL_PATH}/consolidated.${s}.pth"
    done
    wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/params.json"} -O ${TARGET_FOLDER}"/${MODEL_PATH}/params.json"
    wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/checklist.chk"} -O ${TARGET_FOLDER}"/${MODEL_PATH}/checklist.chk"
    echo "Checking checksums"
    if [ "$CPU_ARCH" = "arm64" ]; then
        (cd ${TARGET_FOLDER}"/${MODEL_PATH}" && md5 checklist.chk)
    else
        (cd ${TARGET_FOLDER}"/${MODEL_PATH}" && md5sum -c checklist.chk)
    fi
done
Then run ./download.sh again.
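The manual edit of download.sh described above can also be done with a single sed command. The sketch below assumes GNU sed (for -i without a suffix) and assumes that every wget call starts its own line. patch_wget is a hypothetical helper name. Keep in mind that --no-check-certificate turns off TLS certificate verification, so it is best limited to environments where verification genuinely cannot work.

```shell
# Add --no-check-certificate to every wget call in file $1 (GNU sed assumed).
# Lines that already carry the flag are skipped, so running it twice is harmless.
patch_wget() {
    sed -i '/--no-check-certificate/! s/^\([[:space:]]*\)wget /\1wget --no-check-certificate /' "$1"
}

# Example: patch_wget download.sh
```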
Convert the model
cd ~
# clone the conversion toolkit
git clone https://github.com/huggingface/transformers
# enter the directory
cd transformers
# install transformers
python setup.py install
# install accelerate
pip install accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple/
# install protobuf
pip install protobuf
# run the model conversion
cd ..
python ./transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir ./llama/ --model_size 7B --output_dir ./huggingface
Troubleshooting
Missing protobuf
ImportError:
LlamaConverter requires the protobuf library but it was not found in your environment. Checkout the instructions on the
# Fix
pip install protobuf -i https://pypi.tuna.tsinghua.edu.cn/simple/
Install and deploy the web UI
git clone https://github.com/liltom-eth/llama2-webui.git
cd llama2-webui/
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/ # if the Tsinghua mirror reports an error, switch to the official PyPI index
# set the model path
vi .env # edit MODEL_PATH
##########################################################################################
(/home/sd/conda_envs/llama2) [sd@worker85 llama2-webui]$ vi .env
MODEL_PATH = "/home/sd/LLaMa2/huggingface/"
BACKEND_TYPE = "transformers"
LOAD_IN_8BIT = True
LOAD_IN_4BIT = False
LLAMA_CPP = False
MAX_MAX_NEW_TOKENS = 2048
DEFAULT_MAX_NEW_TOKENS = 1024
MAX_INPUT_TOKEN_LENGTH = 4000
DEFAULT_SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."
###########################################################################################
# edit app.py
vi app.py
# change the last line to
demo.queue(max_size=20).launch(server_name="0.0.0.0")
# run
python app.py
Test
Open http://IP:7860/ in a browser.
(/home/sd/conda_envs/sd1) [sd@worker85 stable-diffusion-webui]$ nvidia-smi
Sun Nov 26 17:49:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A800 80G... Off | 00000000:31:00.0 Off | 0 |
| N/A 83C P0 114W / 300W | 4377MiB / 81920MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA A800 80G... Off | 00000000:98:00.0 Off | 0 |
| N/A 87C P0 84W / 300W | 4377MiB / 81920MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1290455 C python 4374MiB |
| 1 N/A N/A 1290455 C python 4374MiB |
+-----------------------------------------------------------------------------+
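Before opening the browser, a quick check from the server itself can show whether the web UI is answering. This is a sketch: webui_up is a hypothetical helper name, port 7860 is the one used in this guide, and --noproxy '*' is there so that the proxy variables exported earlier do not intercept the local request.

```shell
# Return success if an HTTP server answers on host $1, port $2 within 5 seconds.
webui_up() {
    curl -s -o /dev/null --noproxy '*' --max-time 5 "http://$1:$2/"
}

# Replace 127.0.0.1 with the server IP when checking from another machine.
if webui_up 127.0.0.1 7860; then
    echo "web UI is reachable"
else
    echo "web UI is not reachable yet"
fi
```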
4. Set up a local Llama2-Chinese-Alpaca environment
1. First, download the relevant models and repositories
# download a Chinese Llama 2 LLM framework
git clone https://github.com/ymcui/Chinese-LLaMA-Alpaca-2.git
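# then download a pretrained Chinese Llama 2 model
# https://huggingface.co/hfl/chinese-alpaca-2-7b
# downloading via Google Drive is recommended here because it is faster
# then download Chinese legal-domain data in the Alpaca instruction fine-tuning format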
git clone https://github.com/AndrewZhe/lawyer-llama.git
2. Install the dependencies
pip install protobuf -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install llama2-wrapper -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple/
3. Install deepspeed
pip install deepspeed
ds_report
You may see the following warnings:
(/home/sd/conda_envs/llama2) [sd@worker85 chinese-alpaca-2-7b]$ ds_report
[2023-11-27 04:01:02,438] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
runtime if needed. Op compatibility means that your system
meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-devel package with yum
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
The fix is:
sudo yum install libaio-devel
The libaio warning then disappears:
(/home/sd/conda_envs/llama2) [sd@worker85 chinese-alpaca-2-7b]$ ds_report
[2023-11-27 04:02:25,275] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
runtime if needed. Op compatibility means that your system
meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
async_io ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_lion ............... [NO] ....... [OKAY]
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
evoformer_attn ......... [NO] ....... [NO]
fused_lamb ............. [NO] ....... [OKAY]
fused_lion ............. [NO] ....... [OKAY]
inference_core_ops ..... [NO] ....... [OKAY]
cutlass_ops ............ [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
ragged_device_ops ...... [NO] ....... [OKAY]
ragged_ops ............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/home/sd/conda_envs/llama2/lib/python3.9/site-packages/torch']
torch version .................... 2.0.1+cu117
deepspeed install path ........... ['/home/sd/conda_envs/llama2/lib/python3.9/site-packages/deepspeed']
deepspeed info ................... 0.12.3, unknown, unknown
torch cuda version ............... 11.7
torch hip version ................ None
nvcc version ..................... 11.8
deepspeed wheel compiled w. ...... torch 2.0, cuda 11.7
shared memory (/dev/shm) size .... 503.39 GB
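Note that the installed torch version does not support sparse attention (the sparse_attn op requires torch >= 1.5 and < 2.0, but 2.0 was detected), so adjusting the torch version is a possible optimization later on.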
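To see at a glance which ops the report marks as incompatible, the table can be filtered on its last column. This is a sketch based only on the output layout shown above. incompatible_ops is a hypothetical helper name, and the filter would need adjusting if a future DeepSpeed release changes the report format.

```shell
# Read `ds_report` output on stdin and print the ops whose "compatible" column is [NO].
incompatible_ops() {
    awk 'NF >= 4 && $NF == "[NO]" { print $1 }'
}

# Usage on a machine with DeepSpeed installed:
#   ds_report 2>/dev/null | incompatible_ops
```

Applied to the report above, this would list evoformer_attn and sparse_attn.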