1 System environment
1.1 PyTorch 1.7 and 1.8 were already installed on the system
a)
pip install torch==1.8.2+cu111 torchvision==0.9.2+cu111 torchaudio===0.8.2 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
or download the wheels from
https://download.pytorch.org/whl/torch_stable.html
cu111/torch-1.8.1%2Bcu111-cp38-cp38-linux_x86_64.whl
cu111/torchvision-0.9.1%2Bcu111-cp38-cp38-linux_x86_64.whl
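The wheel names above pack several facts into one string. As a rough sketch (the helper name describe_wheel is my own, not part of pip), the following decodes the %2B escape and splits a name into the standard wheel fields. It assumes the name carries no optional build tag, which holds for the two wheels listed here.

```python
from urllib.parse import unquote

def describe_wheel(path):
    """Split a wheel path like 'cu111/torch-...whl' into its name fields."""
    filename = unquote(path).rsplit("/", 1)[-1]  # '%2B' decodes to '+'
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    stem = filename[: -len(".whl")]
    # Wheel names follow: distribution-version-python_tag-abi_tag-platform_tag
    # (assumption: no optional build tag between version and python_tag)
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }

info = describe_wheel("cu111/torch-1.8.1%2Bcu111-cp38-cp38-linux_x86_64.whl")
print(info)
```

The cp38 tag means these particular wheels target CPython 3.8, so they would not install into the Python 3.10 environment created later in this post.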
$nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
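To record the toolkit version in a script instead of reading it by eye, the release line of the nvcc output above can be parsed. This is a minimal sketch: the sample text is copied from the output above, and nvcc_release is a name I made up for illustration.

```python
import re

# Sample copied from the `nvcc -V` output above. On a real machine the text
# could come from subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout
NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
"""

def nvcc_release(text):
    """Return ((major, minor), full_version) from nvcc -V output."""
    match = re.search(r"release (\d+)\.(\d+), V([\d.]+)", text)
    if match is None:
        raise ValueError("no release line found in nvcc output")
    return (int(match.group(1)), int(match.group(2))), match.group(3)

release, full_version = nvcc_release(NVCC_OUTPUT)
print(release, full_version)
```

Here the system toolkit is 11.2 while the pip wheels are built for cu111. As far as I know the pip wheels bundle their own CUDA runtime libraries, which would explain why the mismatch caused no trouble.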
b) It has been so long that I no longer remember which of these I installed; you can download them from NVIDIA yourself:
cudnn-11.3-linux-x64-v8.2.1.32.tgz or NVIDIA-Linux-x86_64-460.91.03.run
$cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 2
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#endif /* CUDNN_VERSION_H */
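The CUDNN_VERSION macro above combines the three fields into one integer. As a small worked example (the function name cudnn_version is my own, and the header text is the excerpt printed above), the same arithmetic in Python looks like this:

```python
import re

# Excerpt of cudnn_version.h as printed by the grep command above.
HEADER_EXCERPT = """\
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 2
#define CUDNN_PATCHLEVEL 1
"""

def cudnn_version(header_text):
    """Apply CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL."""
    fields = dict(
        re.findall(r"#define CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", header_text)
    )
    major = int(fields["MAJOR"])
    minor = int(fields["MINOR"])
    patch = int(fields["PATCHLEVEL"])
    return major * 1000 + minor * 100 + patch

print(cudnn_version(HEADER_EXCERPT))  # 8201
```

torch.backends.cudnn.version() reports the cuDNN that PyTorch itself loads in this same integer form, so the two numbers can be compared directly.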
2 Upgrade
2.1 Upgrade nvidia-driver
$nvidia-smi
Tue May 17 10:22:01 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| 40% 68C P2 200W / 280W | 7891MiB / 8192MiB | 97% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
$ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00002488sv00001462sd00003901bc03sc00i00
vendor : NVIDIA Corporation
driver : nvidia-driver-470 - distro non-free recommended
driver : nvidia-driver-510-server - distro non-free
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
Then you can either run apt install nvidia-driver-510 directly, or
open Software & Updates, go to Additional Drivers, select the recommended driver, and click Apply Changes.
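The driver list printed above is easy to parse if you want to script the choice. This is only a sketch under my own assumptions: the sample lines are copied from the output above, and list_drivers is a hypothetical helper, not part of ubuntu-drivers.

```python
# Sample lines copied from the driver listing above.
SAMPLE = """\
vendor : NVIDIA Corporation
driver : nvidia-driver-470 - distro non-free recommended
driver : nvidia-driver-510-server - distro non-free
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
"""

def list_drivers(text):
    """Return (package, flags) pairs for every 'driver :' line."""
    drivers = []
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() != "driver":
            continue
        # ' - ' separates the package name from its flags; hyphens inside
        # the package name have no surrounding spaces, so they are kept.
        package, _, flags = value.strip().partition(" - ")
        drivers.append((package, flags.split()))
    return drivers

drivers = list_drivers(SAMPLE)
recommended = [package for package, flags in drivers if "recommended" in flags]
print(recommended)
```

Note that the tool recommends nvidia-driver-470 here, while the nvidia-smi output earlier shows 510.60.02, so the 510 package was the one actually installed.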
2.2 Create the torch 1.11 environment
$conda create -n aigret python=3.10 ipykernel psutil jupyter jupyterlab nodejs numpy matplotlib
The environment is named aigret and uses Python 3.10; the remaining arguments are packages I expect to use, installed in the same step.
$conda activate aigret
$conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
The following packages will be downloaded:
package | build
---------------------------|-----------------
blas-1.0 | mkl 6 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
cudatoolkit-11.3.1 | h2bc3f7f_2 549.3 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
ffmpeg-4.3 | hf484d3e_0 9.9 MB pytorch
gmp-6.2.1 | h58526e2_0 806 KB conda-forge
gnutls-3.6.13 | h85f3911_1 2.0 MB conda-forge
intel-openmp-2022.0.1 | h06a4308_3633 4.2 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
lame-3.100 | h7f98852_1001 496 KB conda-forge
libblas-3.9.0 | 14_linux64_mkl 13 KB conda-forge
libcblas-3.9.0 | 14_linux64_mkl 12 KB conda-forge
liblapack-3.9.0 | 14_linux64_mkl 12 KB conda-forge
mkl-2022.0.1 | h06a4308_117 127.7 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
nettle-3.6 | he412f7d_0 6.5 MB conda-forge
openh264-2.1.1 | h780b84a_0 1.5 MB conda-forge
pytorch-1.11.0 |py3.10_cuda11.3_cudnn8.2.0_0 1.02 GB pytorch
pytorch-mutex-1.0 | cuda 3 KB pytorch
torchaudio-0.11.0 | py310_cu113 5.3 MB pytorch
torchvision-0.12.0 | py310_cu113 27.5 MB pytorch
typing_extensions-4.2.0 | pyha770c72_1 27 KB conda-forge
------------------------------------------------------------
Total: 1.74 GB
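Why the driver upgrade comes first: as I understand it, the "CUDA Version" in the nvidia-smi header is the highest CUDA version the installed driver supports, so the cudatoolkit pulled in by conda (11.3.1 here) should not exceed it (11.6 here). A minimal sketch of that check, where the helper names are my own and the header line is copied from the nvidia-smi output earlier in this post:

```python
import re

# Header line copied from the nvidia-smi output earlier in this post.
SMI_HEADER = "| NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |"

def as_tuple(version):
    """Turn '11.3.1' into (11, 3, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def driver_supports(smi_header, toolkit_version):
    """True if the toolkit's major.minor is not newer than the driver's CUDA Version."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", smi_header)
    if match is None:
        raise ValueError("no 'CUDA Version' field in header")
    return as_tuple(toolkit_version)[:2] <= as_tuple(match.group(1))[:2]

print(driver_supports(SMI_HEADER, "11.3.1"))  # True
```

The comparison uses integer tuples rather than strings, because as strings "11.10" would sort before "11.6".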
Then select the environment in JupyterLab and verify that it works. Interestingly, the version is not displayed as 1.11.0+cu113, but as follows:
import torch
print(torch.__version__)
print(torch.cuda.is_available())
==== new aigret environment
1.11.0
True
==== old environment, for comparison
1.8.2+cu111
True
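The two version strings above differ only in the part after the plus sign, which is a local version label: the pip wheel carries "+cu111", while the conda build observed here carries none. A small sketch for handling both forms, where split_version is a hypothetical helper of mine:

```python
def split_version(version):
    """Split '1.8.2+cu111' into ((1, 8, 2), 'cu111'); the label may be absent."""
    public, _, local = version.partition("+")
    release = tuple(int(part) for part in public.split("."))
    return release, local or None

for version in ("1.11.0", "1.8.2+cu111"):
    print(version, split_version(version))

# Compare releases as integer tuples, not strings: as plain strings,
# "1.11.0" would sort before "1.8.2".
new_release, _ = split_version("1.11.0")
old_release, _ = split_version("1.8.2+cu111")
print(new_release > old_release)  # True
```

When the label is missing, torch.version.cuda still reports which CUDA version the installed build targets, so it is the more reliable thing to print when verifying an environment.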
My code runs normally without any changes. Later I will look into which new features could improve speed.