Llama 2 Development with NVIDIA A100 GPUs on Huawei openEuler OS (Part 1)

上海拓朗思科技

Last modified 2023-11-28 10:26:14

Tags: Artificial Intelligence

First published 2023-11-27 11:49:52

Article link: https://blog.csdn.net/xuelangqingkong/article/details/134622230

1. Preparation

Request the Meta AI access link

As of this writing, Meta AI makes Llama 2 available only to approved applicants, so a download URL has to be requested in advance on the Meta AI site below.

https://ai.meta.com/resources/models-and-libraries/llama-downloads/

After the form is submitted, the email address you entered receives the signed URL for the Llama models and code.

Note that this link is valid for only 24 hours.
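Because the link expires, it can help to check how much time is left before starting a large download. The sketch below is not part of Meta's tooling. It assumes the emailed link follows the CloudFront-style signed URL layout (a base64 `Policy` query parameter whose `DateLessThan` condition holds an `AWS:EpochTime`), which is how the example link later in this article appears to be built. The function name `signed_url_expiry` is made up for illustration.

```python
import base64
import json
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def signed_url_expiry(url: str) -> datetime:
    """Return the expiry time embedded in a CloudFront-style signed URL."""
    query = parse_qs(urlparse(url).query)
    policy = query["Policy"][0]
    # Assumption: '+', '=' and '/' were replaced by '-', '_' and '~'
    # to keep the base64 value URL-safe, so undo that first.
    policy = policy.replace("-", "+").replace("_", "=").replace("~", "/")
    # Restore padding in case it was stripped.
    policy += "=" * (-len(policy) % 4)
    statement = json.loads(base64.b64decode(policy))["Statement"][0]
    epoch = statement["Condition"]["DateLessThan"]["AWS:EpochTime"]
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


def seconds_left(url: str) -> float:
    """Seconds until the link expires (negative once it has expired)."""
    return (signed_url_expiry(url) - datetime.now(timezone.utc)).total_seconds()
```

Calling `seconds_left(url)` with the emailed link gives a quick answer on whether a new link should be requested first.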

Create a user

groupadd llm
useradd -m -g llm <username>
passwd <username>

Grant the user sudo privileges

usermod -aG wheel <username>

2. Install CUDA

Check the GPU model

nvidia-smi (NVIDIA System Management Interface)

[root@worker85 ~]# nvidia-smi
Sat Nov 25 16:14:32 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A800 80GB PCIe          Off | 00000000:31:00.0 Off |                    0 |
| N/A   67C    P0              88W / 300W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A800 80GB PCIe          Off | 00000000:98:00.0 Off |                    0 |
| N/A   73C    P0             106W / 300W |      4MiB / 81920MiB |     24%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

The output shows that this machine has two NVIDIA A800 80GB PCIe GPUs.

Check the CUDA version

[root@worker85 ~]# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

The output shows that the installed CUDA toolkit is version 12.2.
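When the same check has to run in a setup script, the release number can be pulled out of the nvcc output programmatically. This is a minimal sketch, assuming `nvcc` is on the PATH and prints a `release X.Y` field as in the output above. The helper names are hypothetical.

```python
import re
import subprocess


def parse_nvcc_release(text: str) -> tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no 'release X.Y' field found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release() -> tuple[int, int]:
    """Run nvcc and return its toolkit release; raises if nvcc is missing."""
    out = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    )
    return parse_nvcc_release(out.stdout)
```

A script can then compare the result against the release it needs, for example `installed_cuda_release() == (11, 8)`.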

For CUDA version compatibility, see NVIDIA's CUDA Compatibility guide (NVIDIA Data Center GPU Driver Documentation): https://docs.nvidia.com/deploy/cuda-compatibility/

The Stable Diffusion installation we need requires CUDA 11.8, which is not compatible with the current 12.2, so CUDA has to be reinstalled.

Disable nouveau

(nouveau is an open-source graphics driver for NVIDIA cards. It can cause problems in some situations, such as freezes or crashes on certain NVIDIA cards, which may stem from hardware compatibility issues or limitations of the driver. In that case, the official closed-source NVIDIA driver can be used for better stability and performance.)

Follow the guide below.

Disable Nouveau - NVIDIA Docs

Download and install CUDA 11.8

CUDA Toolkit 11.8 Downloads | NVIDIA Developer: https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=CentOS&target_version=7&target_type=runfile_local

Here we use the CentOS runfile (local) installation method.

Check disk space

du -sh ./*       # size of each file and directory in the current directory

df -h            # used and available space on every mounted filesystem
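The same free-space check can be scripted before a large installer or model download is started. A minimal sketch using only the Python standard library follows. The function names and any threshold you pass in are illustrative, not values from this setup.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def has_room(path: str, needed_gib: float) -> bool:
    """True when the filesystem holding `path` has at least `needed_gib` free."""
    return free_gib(path) >= needed_gib
```

For example, `has_room("/usr/local", 10)` answers whether roughly 10 GiB is available where the toolkit will be installed.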

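Move files and create a symbolic link

If a file takes up too much space on one disk, move it to another disk and leave a symbolic link at the original location.

mv file /dest_folder/
ln -s /dest_folder/file file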
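The move-then-link step can also be done in one call from Python, which avoids mistyping either path. This is a sketch under the assumption of a Linux filesystem where the user may create symlinks. `relocate_with_symlink` is a hypothetical helper name.

```python
import os
import shutil


def relocate_with_symlink(src: str, dest_dir: str) -> str:
    """Move `src` into `dest_dir` and leave a symlink at the old location."""
    src = os.path.abspath(src)
    target = os.path.join(os.path.abspath(dest_dir), os.path.basename(src))
    moved = shutil.move(src, target)  # returns the final destination path
    os.symlink(moved, src)            # old path now points at the new copy
    return moved
```

Programs that open the old path keep working, while the data itself lives on the other disk.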

Install CUDA 11.8

(/home/sd/conda_envs/llama2) [sd@worker85 cuda]$ sudo sh cuda_11.8.0_520.61.05_linux.run
[sudo] password for sd:
┌──────────────────────────────────────────────────────────────────────────────┐
│  End User License Agreement                                                  │
│  --------------------------                                                  │
│                                                                              │
│  NVIDIA Software License Agreement and CUDA Supplement to                    │
│  Software License Agreement. Last updated: October 8, 2021                   │
│                                                                              │
│  The CUDA Toolkit End User License Agreement applies to the                  │
│  NVIDIA CUDA Toolkit, the NVIDIA CUDA Samples, the NVIDIA                    │
│  Display Driver, NVIDIA Nsight tools (Visual Studio Edition),                │
│  and the associated documentation on CUDA APIs, programming                  │
│  model and development tools. If you do not agree with the                   │
│  terms and conditions of the license agreement, then do not                  │
│  download or use the software.                                               │
│                                                                              │
│  Last updated: October 8, 2021.                                              │
│                                                                              │
│                                                                              │
│  Preface                                                                     │
│  -------                                                                     │
│                                                                              │
│──────────────────────────────────────────────────────────────────────────────│
│ Do you accept the above EULA? (accept/decline/quit):                         │
│ accept                                                                       │
└──────────────────────────────────────────────────────────────────────────────┘

Type accept.

│ CUDA Installer                                                               │
│ - [X] Driver                                                                 │
│      [X] 520.61.05                                                           │
│ + [X] CUDA Toolkit 11.8                                                      │
│   [X] CUDA Demo Suite 11.8                                                   │
│   [X] CUDA Documentation 11.8                                                │
│ - [ ] Kernel Objects                                                         │
│      [ ] nvidia-fs                                                           │
│   Options                                                                    │
│   Install                                                                    │
│                                                                              │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘

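Select [Install].

Note: if the driver is already installed, Driver can be deselected here.
Do not select nvidia-fs either; otherwise the installation ends with an error.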

┌──────────────────────────────────────────────────────────────────────────────┐
│ Existing installation of CUDA Toolkit 11.8 found:                            │
│ Upgrade all                                                                  │
│ Choose components to upgrade                                                 │
│ No, abort installation                                                       │
│                                                                              │
│                                                                              │
│ Up/Down: Move | 'Enter': Select                                              │
└──────────────────────────────────────────────────────────────────────────────┘

Select [Upgrade all] here.

After installation, CUDA is usually found under the following path:

/usr/local/cuda-11.8/bin

Install Anaconda

bash <Anaconda installer file>.sh

Share Anaconda among multiple users (assuming Anaconda is installed at /usr/local/minianaconda)

/usr/local/minianaconda/bin/conda init
source /home/<username>/.bashrc

Install screen

yum install screen

# create a new screen session
screen -S sd1

# show the current screen session
echo $STY

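Create a conda environment

To change where conda environments are stored (useful when the machine has several disks):

conda config --append envs_dirs <new environment path>

My environment cannot perform SSL verification, so two settings are needed first:

# switch to a mirror channel
conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/cloud/pytorch/linux-64/
# disable SSL verification
conda config --set ssl_verify false

If you use Miniconda, specify the Python version when creating the environment:

conda create -n sd1 python=3.10
conda activate sd1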

3. Set up a local Llama 2 environment

Reference: Meta open-source LLM Llama 2 test environment deployment tutorial (Zhihu)

No network connectivity

(/home/sd/conda_envs/sd1) [sd@worker85 stable-diffusion-webui]$ pip install datasets -i https://pypi.tuna.tsinghua.edu.cn/simple/
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f432e0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f435e0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f43790>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fe6d1f43940>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/datasets/

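Errors like these kept appearing at first. I assumed the DNS configuration was at fault, but it turned out that someone had disabled my network proxy.

Re-exporting the proxy settings in each SSH session fixes it: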

export http_proxy=http://192.168.184.103:3128
export https_proxy=$http_proxy
export no_proxy=127.0.0.1,localhost,local,.local

Then test the connection:

(/home/sd/conda_envs/sd1) [sd@worker85 ~]$ wget www.baidu.com
--2023-11-25 22:07:54--  http://www.baidu.com/
Connecting to 192.168.184.103:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 2381 (2.3K) [text/html]
Saving to: ‘index.html’

index.html                                   100%[=============================================================================================>]   2.33K  --.-KB/s    in 0s

2023-11-25 22:07:55 (21.6 MB/s) - ‘index.html’ saved [2381/2381]

If the request succeeds like this, the network is working again.
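Since the failure here came from proxy variables missing in the shell, a quick way to see what a Python process (and therefore pip) will pick up is to ask the standard library. This is a minimal sketch. On Linux `urllib.request.getproxies()` reads the `http_proxy` / `https_proxy` environment variables, and the helper names below are illustrative.

```python
import urllib.request


def active_proxies() -> dict:
    """Return the proxy settings visible to this Python process."""
    return urllib.request.getproxies()


def proxy_configured() -> bool:
    """True when both an HTTP and an HTTPS proxy are set."""
    proxies = active_proxies()
    return "http" in proxies and "https" in proxies
```

If `proxy_configured()` returns False in a fresh SSH session, the export lines above have not been applied in that session yet.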

Download Llama 2

git clone https://github.com/facebookresearch/llama.git
cd llama/

Install the Python dependencies

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install transformers -i https://pypi.tuna.tsinghua.edu.cn/simple/

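Download the model weights

chmod +x download.sh   # requires wget; if it is missing, install it with: yum install -y wget
./download.sh          # when prompted, enter 1. the custom URL from the email  2. 7B-chat

Running download.sh requires the URL requested in step 1. Mine, for example, looked like this: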

https://download.llamameta.net/*?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjoidm40bmpqZm1oMHp3YnN4YmJwczlldWt1IiwiUmVzb3VyY2UiOiJodHRwczpcL1wvZG93bmxvYWQubGxhbWFtZXRhLm5ldFwvKiIsIkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcwMTAxNTg5MH19fV19&Signature=uPezLGgClEYl0vBZfS9smN2oLDTsi3jvv1SumtMl0Utj8V2LwE0KhAq4i%7ET97pf8nXrO46TClOFMdVfMy4dFyEJAop7-FvwpGWDM4ZWY4PYlf3t0nJ4YSU4Hrbe3GzApRfADwI23nfKTUibG1CzG6x9EhhrXavY2BcmB8WHCGWmgPSJ4EykvEd61ibLLt80HwAenb2tAmLr169Vcg0y0yszV476CO95o9y3dh0k1MKin1D9tY%7E-IUAslV9grl-Auiv%7EoxxmUdu8j2Ipb82BB7T9f5Yq7CsW-HPmBE1BOsC7RPzWqL%7EyQvi3sqyOvoNiGUkK0t4Ro2zWTjlbEZXVmNg__&Key-Pair-Id=K15QRJLYKIFSLZ&Download-Request-ID=817512153460239
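The `*` in this link is a wildcard: download.sh (listed in full later in this section) replaces the first `*` with each file's relative path, and numbers the weight shards `00`, `01`, and so on. The sketch below mirrors those two steps in Python to make the resulting URLs easier to see. The function names are made up, and the shortened URL in the usage note is a placeholder rather than a working link.

```python
def shard_names(last_shard: int) -> list[str]:
    """Mirror `seq -f "0%g" 0 N`: consolidated.00.pth up to consolidated.0N.pth."""
    return [f"consolidated.0{i}.pth" for i in range(last_shard + 1)]


def file_url(presigned_url: str, relative_path: str) -> str:
    """Replace only the first '*' in the signed URL with a concrete file path."""
    return presigned_url.replace("*", relative_path, 1)


def model_urls(presigned_url: str, model_path: str, last_shard: int) -> list[str]:
    """All per-model URLs the script requests: shards, params and checklist."""
    names = shard_names(last_shard) + ["params.json", "checklist.chk"]
    return [file_url(presigned_url, f"{model_path}/{name}") for name in names]
```

For 7B-chat the script uses `MODEL_PATH="llama-2-7b-chat"` and `SHARD=0`, so `model_urls("https://download.llamameta.net/*?Policy=...", "llama-2-7b-chat", 0)` lists three URLs.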

You may run into SSL certificate verification failures along the way. My workaround was to add the following option to every wget command in download.sh:

--no-check-certificate

download.sh then becomes:

#!/usr/bin/env bash

# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed according to the terms of the Llama 2 Community License Agreement.

set -e

read -p "Enter the URL from email: " PRESIGNED_URL
echo ""
read -p "Enter the list of models to download without spaces (7B,13B,70B,7B-chat,13B-chat,70B-chat), or press Enter for all: " MODEL_SIZE
TARGET_FOLDER="."             # where all files should end up
mkdir -p ${TARGET_FOLDER}

if [[ $MODEL_SIZE == "" ]]; then
    MODEL_SIZE="7B,13B,70B,7B-chat,13B-chat,70B-chat"
fi

echo "Downloading LICENSE and Acceptable Usage Policy"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"LICENSE"} -O ${TARGET_FOLDER}"/LICENSE"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"USE_POLICY.md"} -O ${TARGET_FOLDER}"/USE_POLICY.md"

echo "Downloading tokenizer"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"tokenizer.model"} -O ${TARGET_FOLDER}"/tokenizer.model"
wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"tokenizer_checklist.chk"} -O ${TARGET_FOLDER}"/tokenizer_checklist.chk"
CPU_ARCH=$(uname -m)
  if [ "$CPU_ARCH" = "arm64" ]; then
    (cd ${TARGET_FOLDER} && md5 tokenizer_checklist.chk)
  else
    (cd ${TARGET_FOLDER} && md5sum -c tokenizer_checklist.chk)
  fi

for m in ${MODEL_SIZE//,/ }
do
    if [[ $m == "7B" ]]; then
        SHARD=0
        MODEL_PATH="llama-2-7b"
    elif [[ $m == "7B-chat" ]]; then
        SHARD=0
        MODEL_PATH="llama-2-7b-chat"
    elif [[ $m == "13B" ]]; then
        SHARD=1
        MODEL_PATH="llama-2-13b"
    elif [[ $m == "13B-chat" ]]; then
        SHARD=1
        MODEL_PATH="llama-2-13b-chat"
    elif [[ $m == "70B" ]]; then
        SHARD=7
        MODEL_PATH="llama-2-70b"
    elif [[ $m == "70B-chat" ]]; then
        SHARD=7
        MODEL_PATH="llama-2-70b-chat"
    fi

    echo "Downloading ${MODEL_PATH}"
    mkdir -p ${TARGET_FOLDER}"/${MODEL_PATH}"

    for s in $(seq -f "0%g" 0 ${SHARD})
    do
        wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/consolidated.${s}.pth"} -O ${TARGET_FOLDER}"/${MODEL_PATH}/consolidated.${s}.pth"
    done

    wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/params.json"} -O ${TARGET_FOLDER}"/${MODEL_PATH}/params.json"
    wget --no-check-certificate --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/checklist.chk"} -O ${TARGET_FOLDER}"/${MODEL_PATH}/checklist.chk"
    echo "Checking checksums"
    if [ "$CPU_ARCH" = "arm64" ]; then
      (cd ${TARGET_FOLDER}"/${MODEL_PATH}" && md5 checklist.chk)
    else
      (cd ${TARGET_FOLDER}"/${MODEL_PATH}" && md5sum -c checklist.chk)
    fi
done

Then run ./download.sh again.
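With certificate checking turned off, the `md5sum -c` step at the end of the script is the remaining integrity check, so it is worth re-running by hand if a download was interrupted. The following is a rough Python equivalent, assuming each line of `checklist.chk` has the usual `md5sum` layout of a hash followed by a file name. The helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """MD5 of a file, read in chunks so multi-gigabyte shards fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(checklist: Path) -> dict[str, bool]:
    """Check every '<md5>  <name>' line against the file next to the checklist."""
    results = {}
    for line in checklist.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
        results[name] = md5_of(checklist.parent / name) == expected.lower()
    return results
```

Any entry mapped to False points at a file that should be downloaded again.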

Convert the model

cd ~

# clone the repository that contains the conversion script
git clone https://github.com/huggingface/transformers

# enter the directory
cd transformers

# install transformers
python setup.py install

# install accelerate
pip install accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple/

# install protobuf
pip install protobuf

# run the model conversion
cd ..
python ./transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir ./llama/ --model_size 7B --output_dir ./huggingface

Troubleshooting

Missing protobuf

ImportError:
LlamaConverter requires the protobuf library but it was not found in your environment. Checkout the instructions on the

# fix
pip install protobuf -i https://pypi.tuna.tsinghua.edu.cn/simple/

Install and deploy the web UI

git clone https://github.com/liltom-eth/llama2-webui.git
cd llama2-webui/

pip install -r requirements.txt  -i https://pypi.tuna.tsinghua.edu.cn/simple/ # if the Tsinghua mirror reports errors, switch to the official PyPI index


# set the model path

vi .env # edit MODEL_PATH

##########################################################################################
(/home/sd/conda_envs/llama2) [sd@worker85 llama2-webui]$ vi .env 
MODEL_PATH = "/home/sd/LLaMa2/huggingface/"
BACKEND_TYPE = "transformers"
LOAD_IN_8BIT = True
LOAD_IN_4BIT = False
LLAMA_CPP = False

MAX_MAX_NEW_TOKENS = 2048
DEFAULT_MAX_NEW_TOKENS = 1024
MAX_INPUT_TOKEN_LENGTH = 4000

DEFAULT_SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."


###########################################################################################

# edit app.py
vi app.py
# change the last line to
demo.queue(max_size=20).launch(server_name="0.0.0.0")

# run
python app.py
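Before launching, it can save a restart to confirm that the edited .env says what you think it says. The sketch below is a tiny stand-alone reader for `KEY = value` lines like the ones shown above. It is not how llama2-webui itself loads the file, and `read_env_file` is a name invented for this example.

```python
def read_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY = value lines; values are returned as plain strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings
```

For example, `read_env_file(open(".env").read())["MODEL_PATH"]` should print the directory that the conversion step wrote to.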

Test

Open http://IP:7860/ in a browser.

(/home/sd/conda_envs/sd1) [sd@worker85 stable-diffusion-webui]$ nvidia-smi
Sun Nov 26 17:49:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A800 80G...  Off  | 00000000:31:00.0 Off |                    0 |
| N/A   83C    P0   114W / 300W |   4377MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A800 80G...  Off  | 00000000:98:00.0 Off |                    0 |
| N/A   87C    P0    84W / 300W |   4377MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A   1290455      C   python                           4374MiB |
|    1   N/A  N/A   1290455      C   python                           4374MiB |
+-----------------------------------------------------------------------------+
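To watch memory use while the web UI is running without reading the full table, nvidia-smi can emit the same numbers as CSV through its `--query-gpu` and `--format` options. This is a small sketch that parses that CSV. The function names are illustrative, and `gpu_memory()` assumes `nvidia-smi` is on the PATH.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> list[tuple[int, int, int]]:
    """Turn 'index, used, total' lines (MiB) into a list of integer tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append((index, used, total))
    return rows


def gpu_memory() -> list[tuple[int, int, int]]:
    """Query the driver; raises if nvidia-smi is missing or fails."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(out.stdout)
```

With the table above, each GPU would report 4377 MiB used out of 81920 MiB.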

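4. Set up Chinese-LLaMA-Alpaca-2 locally

1. Download the required models and repositories

# clone a Chinese Llama 2 model framework
git clone https://github.com/ymcui/Chinese-LLaMA-Alpaca-2.git

# then download a pretrained Chinese Llama 2 model
# https://huggingface.co/hfl/chinese-alpaca-2-7b
# downloading via Google Drive is recommended here because it is faster

# then download Chinese legal data in Alpaca instruction-tuning format
git clone https://github.com/AndrewZhe/lawyer-llama.git

2. Install the dependencies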

pip install protobuf -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install llama2-wrapper -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple/

3. Install DeepSpeed

pip install deepspeed
ds_report

You may see the following warnings:

(/home/sd/conda_envs/llama2) [sd@worker85 chinese-alpaca-2-7b]$ ds_report
[2023-11-27 04:01:02,438] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-devel package with yum
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.

The fix is:

sudo yum install libaio-devel

After that, the libaio warnings are gone:

(/home/sd/conda_envs/llama2) [sd@worker85 chinese-alpaca-2-7b]$ ds_report
[2023-11-27 04:02:25,275] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
async_io ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_lion ............... [NO] ....... [OKAY]
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
evoformer_attn ......... [NO] ....... [NO]
fused_lamb ............. [NO] ....... [OKAY]
fused_lion ............. [NO] ....... [OKAY]
inference_core_ops ..... [NO] ....... [OKAY]
cutlass_ops ............ [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
ragged_device_ops ...... [NO] ....... [OKAY]
ragged_ops ............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
 [WARNING]  using untested triton version (2.0.0), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/home/sd/conda_envs/llama2/lib/python3.9/site-packages/torch']
torch version .................... 2.0.1+cu117
deepspeed install path ........... ['/home/sd/conda_envs/llama2/lib/python3.9/site-packages/deepspeed']
deepspeed info ................... 0.12.3, unknown, unknown
torch cuda version ............... 11.7
torch hip version ................ None
nvcc version ..................... 11.8
deepspeed wheel compiled w. ...... torch 2.0, cuda 11.7
shared memory (/dev/shm) size .... 503.39 GB

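Note that this torch version (2.0.1) does not support sparse attention: the report shows that sparse_attn requires torch >= 1.5 and < 2.0. Adjusting the torch version is therefore a possible later optimization.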
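To pick out the ops that cannot be built without scanning the whole report, the op table can be filtered in a few lines. This sketch assumes the `name ... [installed] ... [compatible]` line layout shown above and strips terminal color codes in case the report is captured from a pipe. `incompatible_ops` is a hypothetical helper name.

```python
import re

ANSI = re.compile(r"\x1b\[[0-9;]*m")
OP_LINE = re.compile(r"^\s*(\w+)\s+\.+\s+\[(\w+)\]\s+\.+\s+\[(\w+)\]\s*$")


def incompatible_ops(report: str) -> list[str]:
    """Names of ops whose 'compatible' column is not [OKAY]."""
    ops = []
    for line in report.splitlines():
        match = OP_LINE.match(ANSI.sub("", line))
        if match and match.group(3) != "OKAY":
            ops.append(match.group(1))
    return ops
```

Applied to the report above, it would list `evoformer_attn` and `sparse_attn`, the two rows marked `[NO]` in the last column.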
