nvidia-smi failure caused by an Ubuntu kernel update

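The driver setup I spent so long on last time has stopped working again, and the machine has fallen back to eddy_openmp. Running nvidia-smi gives the following error: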

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

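I recently saw a post from an expert saying that Ubuntu can update the kernel automatically, with the new kernel taking effect after a reboot. Checking with journalctl | grep "Linux version" confirmed that my kernel had indeed been updated. The puzzle is that my driver is already at 560, so a driver too old for the new kernel should not normally be the problem.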
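To see which kernels the machine has actually booted, the "Linux version" lines in the journal can be reduced to a list of distinct releases. This is only a minimal sketch: kernel_versions is a helper name made up for this post, and it assumes the journal still holds logs from earlier boots.

```shell
# Reads journal lines on stdin and prints each distinct kernel release
# once, in the order it first appears.
# (kernel_versions is a hypothetical helper, not a standard tool.)
kernel_versions() {
  sed -n 's/.*Linux version \([^ ]*\).*/\1/p' | awk '!seen[$0]++'
}

# Intended use on the affected machine (not executed here):
#   journalctl | kernel_versions   # kernels seen in the journal, oldest first
#   uname -r                       # kernel that is running right now
```

If the last release printed is newer than the one the driver was originally built for, the kernel module has to be rebuilt for it, which is exactly the step that fails below.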

apt-get purge 'nvidia*' # after uninstalling the driver, reinstalling it failed with the error below

Error! Bad return status for module build on kernel: 5.15.0-119-generic (x86_64)
Consult /var/lib/dkms/nvidia/550.90.07/build/make.log for more information.
dpkg: error processing package nvidia-dkms-550-server-open (--configure):
 installed nvidia-dkms-550-server-open package post-installation script subprocess returned error exit status 10
Setting up libnvidia-encode-550-server:amd64 (550.90.07-0ubuntu0.20.04.2) ...
Setting up libnvidia-encode-550-server:i386 (550.90.07-0ubuntu0.20.04.2) ...
dpkg: dependency problems prevent configuration of nvidia-driver-550-server-open:
 nvidia-driver-550-server-open depends on nvidia-dkms-550-server-open (<= 550.90.07-1); however:
  Package nvidia-dkms-550-server-open is not configured yet.
 nvidia-driver-550-server-open depends on nvidia-dkms-550-server-open (>= 550.90.07); however:
  Package nvidia-dkms-550-server-open is not configured yet.

dpkg: error processing package nvidia-driver-550-server-open (--configure):
 dependency problems - leaving unconfigured
Processing triggers for libc-bin (2.31-0ubuntu9.16) ...
No apport report written because the error message indicates its a followup error from a previous failure.
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for desktop-file-utils (0.24-1ubuntu3) ...
Processing triggers for mime-support (3.64ubuntu1) ...
Processing triggers for gnome-menus (3.36.0-1ubuntu1) ...
Processing triggers for initramfs-tools (0.136ubuntu6.7) ...
update-initramfs: Generating /boot/initrd.img-5.15.0-119-generic
I: The initramfs will attempt to resume from /dev/nvme0n1p3
I: (UUID=ef6d7681-b7ff-4ae4-bb31-92f4be832ba1)
I: Set the RESUME variable to override this.
Errors were encountered while processing:
 nvidia-dkms-550-server-open
 nvidia-driver-550-server-open
E: Sub-process /usr/bin/dpkg returned an error code (1)
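The dpkg output itself says where to look: the DKMS make.log. A small sketch for pulling the error lines out of that log follows. The function name dkms_build_errors is made up for this post, and the default path is simply the one printed in the output above.

```shell
# Print the first error lines of a DKMS build log, with line numbers.
# $1: path to make.log (default: the path named in the dpkg output above)
# (dkms_build_errors is a hypothetical helper, not part of dkms.)
dkms_build_errors() {
  log=${1:-/var/lib/dkms/nvidia/550.90.07/build/make.log}
  if [ ! -r "$log" ]; then
    echo "cannot read $log" >&2
    return 1
  fi
  # Case-insensitive match; cap the output so a long log stays readable.
  grep -n -i 'error' "$log" | head -n 20
}

# Intended use on the affected machine (not executed here):
#   dkms_build_errors
```

The log may need root to read, in which case the same grep can be run with sudo. Whatever compiler complaint shows up there is the real cause; the dpkg messages above are only follow-up errors.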

I then tried installing through the graphical interface, and it still failed.

 ############################################################

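Some users mentioned that this can be a compiler version problem. That seemed very plausible, because I had installed a very old CUDA and compiler to accommodate eddy_cuda. The strange part, though, is this: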

Checking the gcc version with gcc --version shows:

gcc (conda-forge gcc 12.3.0-10) 12.3.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

whereas checking the system default with sudo update-alternatives --config gcc shows something else:

There are 7 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path            Priority   Status
------------------------------------------------------------
  0            /usr/bin/gcc-5   90        auto mode
  1            /usr/bin/g++-6   9         manual mode
  2            /usr/bin/g++-7   1         manual mode
  3            /usr/bin/g++-9   1         manual mode
  4            /usr/bin/gcc-5   90        manual mode
* 5            /usr/bin/gcc-6   1         manual mode
  6            /usr/bin/gcc-7   9         manual mode
  7            /usr/bin/gcc-9   50        manual mode
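The two answers need not contradict each other. The version string says the first gcc is a conda-forge build, so it most likely sits in a conda directory that comes earlier on PATH, while update-alternatives only manages /usr/bin/gcc. A sketch to list every gcc in lookup order (list_gcc_on_path is a name made up for this post):

```shell
# Print every executable named gcc found on a PATH-style string,
# in the order the shell would search it.
# $1: colon-separated directory list (default: the current $PATH)
# (list_gcc_on_path is a hypothetical helper.)
list_gcc_on_path() {
  old_ifs=$IFS
  IFS=:
  for dir in ${1:-$PATH}; do
    [ -x "$dir/gcc" ] && echo "$dir/gcc"
  done
  IFS=$old_ifs
}

# Intended use on the affected machine (not executed here):
#   list_gcc_on_path                 # first line is what plain "gcc" runs
#   readlink -f /usr/bin/gcc         # what the alternatives link resolves to
#   update-alternatives --query gcc  # same information, no prompt, no sudo
```

The first line printed is the compiler an interactive shell uses. A build started by dpkg or under sudo may see a different PATH and end up with /usr/bin/gcc instead.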

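I don't really understand why, but after running sudo update-alternatives --config gcc and selecting gcc-9 as the system default, the driver suddenly installed and ran normally. Quite magical.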
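A plausible explanation, which I have not verified against make.log: DKMS compiles the kernel module with /usr/bin/gcc rather than the conda gcc, and the gcc-6 selected there could not build the module for the 5.15 kernel. The same fix can be sketched without the interactive prompt. The run and rebuild_nvidia_with_gcc helpers are made up for this post, and by default they only print the commands instead of running them.

```shell
# Echo each command by default; set DRY_RUN=0 to actually execute it.
# (run and rebuild_nvidia_with_gcc are hypothetical helpers.)
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: compiler to make the system default (gcc-9 is what worked in this post)
rebuild_nvidia_with_gcc() {
  run cat /proc/version          # shows which gcc built the running kernel
  run sudo update-alternatives --set gcc "${1:-/usr/bin/gcc-9}"
  run sudo dpkg --configure -a   # retry the packages left unconfigured
  run dkms status                # check whether the nvidia module is built
  run nvidia-smi
}

# Preview the steps:   rebuild_nvidia_with_gcc
# Really run them:     DRY_RUN=0 rebuild_nvidia_with_gcc
```

Retrying with dpkg --configure -a is my assumption about the shortest way back; in this post the driver was simply installed again after switching the compiler.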
