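Environment setup steps
-
Download the image and create the container
The images come from NVIDIA's official image site (the Kaldi image is included).
For example, release 21.02 contains the following: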
Ubuntu 20.04 including Python 3.8
NVIDIA CUDA 11.2.0 including cuBLAS 11.3.1
NVIDIA cuDNN 8.1.0
NVIDIA NCCL 2.8.4 (optimized for NVLink™)
MLNX_OFED 5.1
OpenMPI 4.0.5
Nsight Compute 2020.3.0.18
Nsight Systems 2020.4.3.7
TensorRT 7.2.2
-
Pull command:
docker pull nvcr.io/nvidia/kaldi:22.01-py3
After the pull finishes, docker images lists the image.
-
Create the container with one of the following commands. They differ only in the GPU list, the container name, and the image tag. The first two use the same container name, so run only one of them.
21.02 image, GPUs 0 and 1:
NV_GPU=0,1 nvidia-docker run -itd -P \
    --name wyr_wenet_kaldi_cuda11.2 \
    --mount type=bind,source=/home/work/wangyaru05,target=/home/work/wangyaru05 \
    -v /opt/wfs1/aivoice:/opt/wfs1/aivoice \
    --net host \
    --shm-size 64G \
    nvcr.io/nvidia/kaldi:21.02-py3 bash
21.02 image, GPUs 0 to 7:
NV_GPU=0,1,2,3,4,5,6,7 nvidia-docker run -itd -P \
    --name wyr_wenet_kaldi_cuda11.2 \
    --mount type=bind,source=/home/work/wangyaru05,target=/home/work/wangyaru05 \
    -v /opt/wfs1/aivoice:/opt/wfs1/aivoice \
    --net host \
    --shm-size 64G \
    nvcr.io/nvidia/kaldi:21.02-py3 bash
22.01 image, GPUs 0 to 7:
NV_GPU=0,1,2,3,4,5,6,7 nvidia-docker run -itd -P \
    --name wyr_wenet_kaldi_cuda11.6 \
    --mount type=bind,source=/home/work/wangyaru05,target=/home/work/wangyaru05 \
    -v /opt/wfs1/aivoice:/opt/wfs1/aivoice \
    --net host \
    --shm-size 64G \
    nvcr.io/nvidia/kaldi:22.01-py3 bash
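Since the container commands above vary only in GPU list, container name, and image tag, they can be generated from a single template. The following is a minimal Python sketch and not part of the original notes: build_run_command is a hypothetical helper, and it only assembles the environment and argument list without starting Docker.

```python
import shlex

# Host paths copied from the commands above; adjust them for your own machine.
BIND_DIR = "/home/work/wangyaru05"
DATA_DIR = "/opt/wfs1/aivoice"


def build_run_command(gpus, name, image_tag):
    """Return (env, argv) for one nvidia-docker run variant.

    gpus is an iterable of GPU indices. It becomes the NV_GPU variable
    that the commands above set in front of nvidia-docker.
    """
    env = {"NV_GPU": ",".join(str(g) for g in gpus)}
    argv = [
        "nvidia-docker", "run", "-itd", "-P",
        "--name", name,
        "--mount", f"type=bind,source={BIND_DIR},target={BIND_DIR}",
        "-v", f"{DATA_DIR}:{DATA_DIR}",
        "--net", "host",
        "--shm-size", "64G",
        f"nvcr.io/nvidia/kaldi:{image_tag}",
        "bash",
    ]
    return env, argv


if __name__ == "__main__":
    # Print the 8-GPU, 22.01 variant as a shell line for inspection.
    env, argv = build_run_command(range(8), "wyr_wenet_kaldi_cuda11.6", "22.01-py3")
    print(f"NV_GPU={env['NV_GPU']}", shlex.join(argv))
```

To actually create the container from Python, argv could be passed to subprocess.run with env merged into a copy of os.environ. Printing the line first makes it easy to compare against the hand-written command.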
-
Start the container:
docker container start wyr_wenet_kaldi_cuda11.6
-
Enter the container:
nvidia-docker exec -it wyr_wenet_kaldi_cuda11.6 bash
Or add an alias to ~/.bashrc (vim ~/.bashrc):
alias docker_connect='nvidia-docker exec -it wyr_wenet_kaldi_cuda11.6 bash'
-
Shortcut for entering the container:
vim ~/.bashrc
alias wyr_docker_connect='nvidia-docker exec -it wyr_wenet_kaldi_cuda11.6 bash'
Run source ~/.bashrc (or open a new shell) for the alias to take effect.
-
Check the Ubuntu version
cat /etc/issue
-
Configure the pip mirror
vim ~/.pip/pip.conf
Add the following:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
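When switching between mirrors it is easy to leave a trusted-host entry that names a different host than index-url. The following is a minimal Python sketch that checks for this. It assumes the file uses the [global] and [install] sections shown above, and check_pip_conf is a hypothetical helper, not part of pip.

```python
import configparser
from pathlib import Path
from urllib.parse import urlparse


def check_pip_conf(text):
    """Compare the index-url host with the trusted-host entries in pip.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    index_url = parser.get("global", "index-url", fallback="")
    index_host = urlparse(index_url).hostname
    # trusted-host may list several hosts separated by whitespace or newlines.
    trusted_hosts = parser.get("install", "trusted-host", fallback="").split()
    return {
        "index_host": index_host,
        "trusted_hosts": trusted_hosts,
        # Consistent when nothing is trusted explicitly or the mirror host is listed.
        "consistent": not trusted_hosts or index_host in trusted_hosts,
    }


if __name__ == "__main__":
    conf = Path("~/.pip/pip.conf").expanduser()
    if conf.exists():
        print(check_pip_conf(conf.read_text()))
```

A result with consistent set to False means the trusted-host line does not cover the mirror configured in index-url.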
-
Configure the conda mirror
vim ~/.condarc
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
-
Download the WeNet code
To check out the v1.0.0 release:
git clone --branch v1.0.0 https://github.com/wenet-e2e/wenet.git
Or, to get the latest code:
git clone https://github.com/wenet-e2e/wenet.git
-
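Create the conda virtual environment
Download the Anaconda installer:
wget https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-x86_64.sh
Install it:
bash Anaconda3-2021.05-Linux-x86_64.sh
Add anaconda3 to PATH by adding this line to ~/.bashrc (vim ~/.bashrc):
export PATH=$PATH:/root/anaconda3/bin
Create and activate the wenet environment:
conda create -n wenet python=3.8
conda activate wenet
If conda activate fails with a conda init error, run source activate first.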
-
Install the dependencies and pytorch, torchvision, torchaudio, cudatoolkit
Install the Python dependencies (run inside the cloned wenet directory):
pip install -r requirements.txt
Then install PyTorch with one of the following commands. The variants differ in the PyTorch and cudatoolkit versions; pick the one that suits the CUDA version in the container:
conda install pytorch torchvision torchaudio cudatoolkit=11.2 -c pytorch -c conda-forge
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
conda install pytorch=1.8.1 torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge
conda install pytorch=1.8.1 torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
conda install pytorch torchvision torchaudio cudatoolkit=11.5 -c pytorch -c conda-forge
conda install pytorch=1.12.1 torchvision torchaudio cudatoolkit=11.3 -c pytorch -c conda-forge
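After the install, it is worth confirming that the environment ended up with a CUDA build of PyTorch rather than a CPU-only one. The following is a minimal Python sketch of such a check, to be run inside the wenet environment. describe_torch_install is a hypothetical helper name, and the sketch reports a missing torch package instead of failing.

```python
import importlib.util


def describe_torch_install():
    """Summarize the installed PyTorch build, or report that torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # torch.version.cuda is None for a CPU-only build.
        "cuda_build": torch.version.cuda,
        # False if no usable GPU or driver is visible from inside the container.
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If cuda_build is None, the resolver picked a CPU-only package. If cuda_build is set but cuda_available is False, the build is fine and the problem is more likely GPU visibility or the driver.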
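Other notes
Common errors
- If the install keeps selecting the CPU build of PyTorch, try a lower cudatoolkit version.
A given CUDA version cannot be paired with a PyTorch release that is too old, but it can be paired with a newer one.
- Try a manual install: download the package from https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch and run
conda install --use-local **.tar.bz2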
Download torch for a specific CUDA version and install it with pip
# Change the trailing cu117 to the CUDA version you need
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
# Or use the following
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch==1.12.0+cu116 torchvision -f https://download.pytorch.org/whl/torch_stable.html
Drawback: the download is slow.
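The cuXYZ suffix in the index URL is the CUDA major and minor version with the dot removed (11.7 becomes cu117). The following is a small Python sketch of that mapping. cuda_wheel_tag and wheel_index_url are hypothetical helpers, and the sketch does not check whether the PyTorch site actually publishes wheels for a given tag.

```python
def cuda_wheel_tag(cuda_version):
    """Turn a CUDA version such as "11.7" or "11.7.1" into a tag such as "cu117"."""
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected MAJOR.MINOR, got {cuda_version!r}")
    # Only major and minor are used; any patch component is ignored.
    return f"cu{parts[0]}{parts[1]}"


def wheel_index_url(cuda_version):
    """Build the --index-url value in the form used by the pip3 command above."""
    return f"https://download.pytorch.org/whl/{cuda_wheel_tag(cuda_version)}"


if __name__ == "__main__":
    print(wheel_index_url("11.7"))
```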
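Alternatively, download the wheel manually in advance:
- Package download page: https://download.pytorch.org/whl/torch_stable.html
Make sure the PyTorch version, CUDA version, and Python version all match.
- Install the downloaded wheel with pip:
pip install some_package.whl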
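Wheel file names encode the versions that need to match, in the form name-version-pythontag-abitag-platformtag.whl, with the CUDA build after a + in the version (the + shows up as %2B in links). The following is a minimal Python sketch that pulls those fields out so they can be compared before downloading. parse_wheel_name and wheel_matches are hypothetical helpers, and the file name in the example is only an illustration of the naming pattern.

```python
from urllib.parse import unquote


def parse_wheel_name(filename):
    """Split a wheel file name into its version and compatibility tags."""
    name = unquote(filename)  # turns %2B back into +
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = name[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    version, _, local = parts[1].partition("+")
    return {
        "name": parts[0],
        "version": version,
        "local": local,  # e.g. "cu116"; empty when there is no + suffix
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def wheel_matches(info, python_version, cuda_version):
    """Check the CPython tag and CUDA suffix against the wanted versions."""
    major, minor = python_version
    cuda_major, cuda_minor = cuda_version.split(".")[:2]
    return (
        info["python_tag"] == f"cp{major}{minor}"
        and info["local"] == f"cu{cuda_major}{cuda_minor}"
    )


if __name__ == "__main__":
    info = parse_wheel_name("torch-1.12.0%2Bcu116-cp38-cp38-linux_x86_64.whl")
    print(info, wheel_matches(info, (3, 8), "11.6"))
```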
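Find the conda cache path
- Run conda info to find the directory where downloaded packages are stored (the package cache)
- Open urls.txt in that directory to find the download URL, for example: https://conda.anaconda.org/pytorch/linux-64/torchaudio-0.8.1-py38.tar.bz2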
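Looking up one package in urls.txt can be scripted. The following is a minimal Python sketch, assuming urls.txt holds one URL per line. filter_package_urls and find_package_urls are hypothetical helpers, and /root/anaconda3/pkgs is an assumed cache location based on the install path used in these notes; take the real path from the conda info output.

```python
from pathlib import Path


def filter_package_urls(lines, package):
    """Return the URLs whose file name starts with "<package>-"."""
    prefix = package + "-"
    matches = []
    for line in lines:
        url = line.strip()
        # The file name is the last path component of the URL.
        if url.rsplit("/", 1)[-1].startswith(prefix):
            matches.append(url)
    return matches


def find_package_urls(pkgs_dir, package):
    """Read urls.txt from a conda package cache and filter it by package name."""
    urls_file = Path(pkgs_dir) / "urls.txt"
    return filter_package_urls(urls_file.read_text().splitlines(), package)


if __name__ == "__main__":
    # Assumed cache path; replace it with the one reported by conda info.
    cache = Path("/root/anaconda3/pkgs")
    if (cache / "urls.txt").exists():
        for url in find_package_urls(cache, "torchaudio"):
            print(url)
```

Matching on the "<package>-" prefix keeps a search for torch from also returning torchaudio or torchvision.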