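Running nvidia-smi fails with:
Failed to initialize NVML: Driver/library version mismatch
This means the loaded NVIDIA kernel driver and the user-space library are at different versions, usually because one of them was upgraded in the background (for example by an automatic package update).
Solution:
(1) Try rebooting first. If rebooting does not help, try the steps below.
Check the driver version: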
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 470.129.06 Thu May 12 22:52:02 UTC 2022
GCC version: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
This shows that the loaded driver (kernel module) is version 470.129.06.
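If you want to read the version in a script instead of by eye, the following is a minimal sketch. The helper name kernel_module_version is made up for illustration, and it assumes the version is the first dotted number on the "NVRM version:" line, as in the output above.

```shell
# kernel_module_version (hypothetical helper): reads text in the format of
# /proc/driver/nvidia/version on stdin and prints the first dotted version
# number found on the "NVRM version:" line.
kernel_module_version() {
    awk '/^NVRM version:/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+(\.[0-9]+)?$/) { print $i; exit }
        }
    }'
}

# Usage on a real machine:
#   kernel_module_version < /proc/driver/nvidia/version
```

The function reads stdin rather than opening the file itself, so it can also be tried on saved output from another machine.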
Check the library version:
$ grep nvidia /var/log/dpkg.log
2022-08-04 06:14:23 upgrade nvidia-utils-470:amd64 470.129.06-0ubuntu0.20.04.1 470.141.03-0ubuntu0.20.04.1
2022-08-04 06:14:23 status half-configured nvidia-utils-470:amd64 470.129.06-0ubuntu0.20.04.1
2022-08-04 06:14:23 status unpacked nvidia-utils-470:amd64 470.129.06-0ubuntu0.20.04.1
2022-08-04 06:14:23 status half-installed nvidia-utils-470:amd64 470.129.06-0ubuntu0.20.04.1
2022-08-04 06:14:23 status unpacked nvidia-utils-470:amd64 470.141.03-0ubuntu0.20.04.1
2022-08-04 06:14:23 configure nvidia-utils-470:amd64 470.141.03-0ubuntu0.20.04.1 <none>
2022-08-04 06:14:23 status unpacked nvidia-utils-470:amd64 470.141.03-0ubuntu0.20.04.1
2022-08-04 06:14:23 status half-configured nvidia-utils-470:amd64 470.141.03-0ubuntu0.20.04.1
2022-08-04 06:14:23 status installed nvidia-utils-470:amd64 470.141.03-0ubuntu0.20.04.1
2022-08-04 06:14:34 upgrade libnvidia-compute-470:amd64 470.129.06-0ubuntu0.20.04.1 470.141.03-0ubuntu0.20.04.1
2022-08-04 06:14:34 status half-configured libnvidia-compute-470:amd64 470.129.06-0ubuntu0.20.04.1
...
The library version is 470.141.03.
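The log is long, so here is a rough sketch for pulling out just the final installed version of one package. The helper name installed_version_from_log is made up for illustration. It assumes the dpkg.log layout shown above (date, time, "status installed", package:arch, version) and strips the trailing Debian revision such as -0ubuntu0.20.04.1.

```shell
# installed_version_from_log PACKAGE (hypothetical helper): reads dpkg.log
# text on stdin and prints the upstream version from the last
# "status installed" line for PACKAGE. Prints nothing if there is none.
installed_version_from_log() {
    awk -v pkg="$1" '
        $3 == "status" && $4 == "installed" && index($5, pkg ":") == 1 { v = $6 }
        END {
            if (v != "") {
                sub(/-[^-]*$/, "", v)   # drop the Debian revision suffix
                print v
            }
        }'
}

# Usage on a real machine:
#   installed_version_from_log nvidia-utils-470 < /var/log/dpkg.log
```

Matching on "package:" as a prefix keeps nvidia-utils-470 from also matching a longer package name that merely starts with the same text.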
The driver needs to be upgraded to match:
$ sudo apt install nvidia-driver-470
Then reboot (sudo reboot); nvidia-smi should run normally afterwards.
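To confirm the fix, compare the two version strings again once the machine is back up. A small sketch follows; report_version_match is a made-up helper name, and it only compares the two strings you pass in.

```shell
# report_version_match KERNEL_VER LIB_VER (hypothetical helper):
# prints "match" and returns 0 when the two strings are equal,
# otherwise prints both versions and returns 1.
report_version_match() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: kernel module $1, user-space library $2"
        return 1
    fi
}

# With the values from this post before the fix:
#   report_version_match 470.129.06 470.141.03
```

The non-zero return on a mismatch lets the check be used in a larger script, for example to warn before launching a GPU job.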