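A junior colleague updated something on our server (we are not sure what), and afterwards running nvidia-smi printed the following message: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I spent a long time looking for the cause. Reinstalling the NVIDIA driver did not help, no matter how I did it. What finally worked was updating the kernel.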
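As a rough sketch, not part of the original troubleshooting, a read-only check like the one below can show whether an nvidia kernel module is loaded and whether a module file exists for the kernel currently running. The function name `check_nvidia_module` is made up for illustration, and the `nvidia*.ko*` file pattern is an assumption, since module file names vary between driver packages.

```shell
#!/bin/sh
# Read-only check: is an nvidia kernel module loaded, and does a module
# file exist for the kernel that is currently running?
# check_nvidia_module is a hypothetical helper name, not from the original post.
check_nvidia_module() {
    kernel="$(uname -r)"
    echo "running kernel: $kernel"

    # /proc/modules lists loaded modules, one per line, module name first.
    if grep -q '^nvidia' /proc/modules 2>/dev/null; then
        echo "nvidia module: loaded"
    else
        echo "nvidia module: not loaded"
    fi

    # Look for a module file under this kernel's module tree.
    # The file name pattern is an assumption; it differs between driver packages.
    if find "/lib/modules/$kernel" -name 'nvidia*.ko*' 2>/dev/null | grep -q .; then
        echo "module file for $kernel: found"
    else
        echo "module file for $kernel: missing"
    fi
}

check_nvidia_module
```

If the module is reported as not loaded or missing for the running kernel, that would be consistent with the driver and kernel being out of step, which is the situation the kernel update below addresses.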
Main reference: https://devtalk.nvidia.com/default/topic/1000340/cuda-setup-and-installation/-quot-nvidia-smi-has-failed-because-it-couldn-t-communicate-with-the-nvidia-driver-quot-ubuntu-16-04/post/5233711/#5233711
Solution: update the Ubuntu kernel (our server went from Linux 3.13.0-24-generic to Linux 4.12.9-041209-generic), then install the latest driver, nvidia-390, following the normal procedure.
The detailed steps are as follows:
# Update the system kernel
# Download the three kernel .deb packages
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12.9/linux-headers-4.12.9-041209_4.12.9-041209.201708242344_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12.9/linux-headers-4.12.9-041209-generic_4.12.9-041209.201708242344_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12.9/linux-image-4.12.9-041209-generic_4.12.9-041209.201708242

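Summary: when nvidia-smi reports "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver", the fix is to update the Ubuntu kernel to 4.12.9 and reinstall the nvidia-390 driver. The detailed steps are: download the kernel .deb files, install them with dpkg, verify the kernel version, then use apt-get to purge the old NVIDIA driver and install the new one.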
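The steps after the download are only named in the summary above (dpkg, kernel version check, apt-get purge and install), so the exact commands below are an assumed reconstruction, not the author's verbatim procedure. The sketch assumes the three downloaded .deb files are in the current directory and that the nvidia-390 package is available from the configured apt sources. The helper `run`, the functions `install_kernel` and `install_driver`, and the `DRY_RUN` switch are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Assumed reconstruction of the remaining steps; not the author's exact commands.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: install the downloaded kernel packages, then reboot into the new kernel.
install_kernel() {
    run sudo dpkg -i linux-headers-4.12.9-*.deb linux-image-4.12.9-*.deb
    run sudo reboot
}

# Step 2 (after the reboot): verify the kernel, then replace the driver.
install_driver() {
    run uname -r                          # expected to report the 4.12.9 kernel
    run sudo apt-get purge 'nvidia*'      # remove the old driver packages
    run sudo apt-get update
    run sudo apt-get install nvidia-390   # assumes the package is in the apt sources
    run nvidia-smi                        # may need another reboot before this works
}

# Usage: sh this_script.sh kernel   (before the reboot)
#        sh this_script.sh driver   (after the reboot)
case "${1:-}" in
    kernel) install_kernel ;;
    driver) install_driver ;;
esac
```

Splitting the work into two functions reflects the reboot in the middle: the kernel version check and the driver installation only make sense once the machine is running the new kernel.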



