Today I wanted to run a program on the GPU and suddenly found that it would not run. Running nvidia-smi to check gave this:
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 535.161
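I quickly searched online for a fix. Supposedly a reboot is enough, but rebooting did not solve the problem for me.
References: "Failed to initialize NVML: Driver/library version mismatch解决_api mismatch: the client has the version 535.154.0" (CSDN博客) and "apt - NVML driver/library mismatch after libnvidia update" (Ask Ubuntu).
Running the command below shows that the libnvidia packages had been updated: 535.161.07 was installed, which caused the incompatibility.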
(base) server:~$ grep libnvidia-common /var/log/dpkg.log
The log shows the version change:
2024-04-10 06:09:43 upgrade libnvidia-common-525:all 525.147.05-0ubuntu0.22.04.1 525.147.05-0ubuntu2.22.04.1
2024-04-10 06:09:43 status half-configured libnvidia-common-525:all 525.147.05-0ubuntu0.22.04.1
2024-04-10 06:09:43 status unpacked libnvidia-common-525:all 525.147.05-0ubuntu0.22.04.1
2024-04-10 06:09:43 status half-installed libnvidia-common-525:all 525.147.05-0ubuntu0.22.04.1
2024-04-10 06:09:43 status unpacked libnvidia-common-525:all 525.147.05-0ubuntu2.22.04.1
2024-04-10 06:09:43 install libnvidia-common-535:all <none> 535.161.07-0ubuntu0.22.04.1
2024-04-10 06:09:43 status half-installed libnvidia-common-535:all 535.161.07-0ubuntu0.22.04.1
2024-04-10 06:09:43 status unpacked libnvidia-common-535:all 535.161.07-0ubuntu0.22.04.1
2024-04-10 06:09:43 configure libnvidia-common-535:all 535.161.07-0ubuntu0.22.04.1 <none>
2024-04-10 06:09:43 status unpacked libnvidia-common-535:all 535.161.07-0ubuntu0.22.04.1
2024-04-10 06:09:43 status half-configured libnvidia-common-535:all 535.161.07-0ubuntu0.22.04.1
2024-04-10 06:09:43 status installed libnvidia-common-535:all 535.161.07-0ubuntu0.22.04.1
2024-04-10 06:09:43 configure libnvidia-common-525:all 525.147.05-0ubuntu2.22.04.1 <none>
2024-04-10 06:09:43 status unpacked libnvidia-common-525:all 525.147.05-0ubuntu2.22.04.1
2024-04-10 06:09:43 status half-configured libnvidia-common-525:all 525.147.05-0ubuntu2.22.04.1
2024-04-10 06:09:43 status installed libnvidia-common-525:all 525.147.05-0ubuntu2.22.04.1
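The log above is easier to read once it is reduced to the last "status installed" entry per package. Below is a minimal sketch, assuming the dpkg.log layout shown above (date, time, "status installed", package, version). The function name last_installed is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Print "package version" for the most recent "status installed" entry of
# every package whose name starts with the given prefix.
# Reads dpkg.log-formatted lines on stdin.
last_installed() {
    prefix="$1"
    awk -v p="$prefix" '
        $3 == "status" && $4 == "installed" && index($5, p) == 1 {
            latest[$5] = $6          # later lines overwrite earlier ones
        }
        END {
            for (pkg in latest) print pkg, latest[pkg]
        }
    ' | sort
}

# Usage (log path as in the post):
#   last_installed libnvidia-common < /var/log/dpkg.log
```

If two different branches show up in the result (here 525 and 535), that is a hint that the libraries may no longer match the driver that is actually loaded.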
Solution: run this command
(base) server:~$ sudo apt install nvidia-driver-535.161
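Output:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package nvidia-driver-535.161
E: Couldn't find any package by glob 'nvidia-driver-535.161'
That failed as well! It turned out there is no need to be that specific: give only the major version number, 535, and then reboot.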
sudo apt install nvidia-driver-535
During installation apt asks for confirmation; just enter Y.
The following additional packages will be installed:
nvidia-compute-utils-535 nvidia-dkms-535 nvidia-firmware-535-535.161.07 nvidia-kernel-common-535 nvidia-kernel-source-535 nvidia-prime
nvidia-settings nvidia-utils-535
The following NEW packages will be installed:
nvidia-compute-utils-535 nvidia-dkms-535 nvidia-driver-535 nvidia-firmware-535-535.161.07 nvidia-kernel-common-535 nvidia-kernel-source-535
nvidia-prime nvidia-settings nvidia-utils-535
0 upgraded, 9 newly installed, 0 to remove and 190 not upgraded.
Need to get 87.1 MB of archives.
After this operation, 143 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
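The package name that worked uses only the major branch number, not the full version. As a small sketch of that rule, the helper below (driver_package is a hypothetical name) derives the package name from a full version string. It assumes the nvidia-driver-<major> naming seen in this post. Checking with apt-cache policy before installing is a suggestion, not part of the original steps.

```shell
#!/bin/sh
# Build the apt package name from a full NVIDIA version string,
# e.g. "535.161.07" -> "nvidia-driver-535".
driver_package() {
    full_version="$1"
    major="${full_version%%.*}"   # strip everything from the first dot on
    printf 'nvidia-driver-%s\n' "$major"
}

# Usage:
#   pkg=$(driver_package 535.161.07)
#   apt-cache policy "$pkg"       # shows the candidate version if the package exists
#   sudo apt install "$pkg"
```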
Some warnings may appear while the packages are installed; they can be ignored. Then reboot, and everything works normally again!
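After the reboot, one way to confirm that the mismatch is gone is to compare the version of the loaded kernel module with the version of the installed library package. This is only a sketch under assumptions: the helper name same_driver_version is hypothetical, and reading the module version from /sys/module/nvidia/version and the package version from dpkg-query for libnvidia-common-535 is my assumption about a typical Ubuntu setup, not something the original steps used.

```shell
#!/bin/sh
# Succeed when the loaded kernel module and the installed user-space
# package carry the same upstream NVIDIA version.
# $1: kernel module version, e.g. "535.161.07"
# $2: dpkg package version,  e.g. "535.161.07-0ubuntu0.22.04.1"
same_driver_version() {
    kernel_version="$1"
    upstream="${2%%-*}"           # drop the Ubuntu revision suffix after the first "-"
    [ "$kernel_version" = "$upstream" ]
}

# Usage on a live system (path and package name are assumptions):
#   kernel=$(cat /sys/module/nvidia/version)
#   libs=$(dpkg-query -W -f='${Version}' libnvidia-common-535)
#   if same_driver_version "$kernel" "$libs"; then
#       echo "versions match"
#   else
#       echo "mismatch: kernel $kernel, libraries $libs"
#   fi
```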