Method 1
Reference: reference material (this may not work)
Method 2
For details, see the solved thread "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running".
The approach used there is, in outline: first remove everything related to CUDA, then install the CUDA Toolkit:
Even with those commands, the issue wasn’t solved.
Eventually, the fastest way to fix the 2 machines that use a package manager was to purge everything related to Nvidia and CUDA, which I did with:
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
Then, once everything was clean, I ran:
sudo apt-get install linux-headers-$(uname -r)
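Before purging for real, it can help to preview what the three patterns above would remove. The following is only a minimal sketch, assuming a Debian/Ubuntu system with apt-get; the helper names headers_pkg and preview_purge are hypothetical and do not come from the original thread. It relies on apt-get's -s (simulate) option, which prints the planned actions without changing the system.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the original thread.

# Print the headers package name for a kernel release
# (defaults to the currently running kernel).
headers_pkg() {
    printf 'linux-headers-%s\n' "${1:-$(uname -r)}"
}

# Simulate the purge of the three package patterns used above.
# -s makes apt-get print what it would do without doing it.
preview_purge() {
    for pattern in '^nvidia-.*' '^libnvidia-.*' '^cuda-.*'; do
        apt-get -s remove --purge "$pattern"
    done
}

# Safe to run: only prints the package name, installs nothing.
echo "Headers package for this kernel: $(headers_pkg)"
```

Calling preview_purge by hand shows the packages that would be removed; the real purge commands above can then be run once the list looks right.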
From here on, it's the same for all VMs:
Download the latest run file from the Nvidia site and run it; accept the upgrade of the current driver if prompted, or install from scratch.
The driver is back to work. (Sometimes the computer needs to be rebooted before nvidia-smi works.)
The issue started after I ran some updates that changed the Linux kernel.
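Since the cause was a kernel change, a quick check of whether the running kernel has matching headers and whether the nvidia module is loaded can confirm the diagnosis. This is a rough sketch, not the thread's method: the helper names has_headers and nvidia_loaded are hypothetical, and the /usr/src/linux-headers-<release> path is the usual Debian/Ubuntu location, assumed here.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers; paths assume a Debian/Ubuntu layout.

# Succeeds if headers for the given kernel release exist under /usr/src.
has_headers() {
    [ -d "/usr/src/linux-headers-$1" ]
}

# Succeeds if a module list in /proc/modules format (read from stdin)
# contains the core nvidia module. The trailing space in the pattern
# keeps nvidia_uvm, nvidia_drm, etc. from matching.
nvidia_loaded() {
    grep -q '^nvidia '
}

kernel="$(uname -r)"
if has_headers "$kernel"; then
    echo "headers present for $kernel"
else
    echo "headers missing for $kernel"
fi

if [ -r /proc/modules ] && nvidia_loaded < /proc/modules; then
    echo "nvidia module loaded"
else
    echo "nvidia module not loaded"
fi
```

If the headers are missing or the module is not loaded after a kernel update, that matches the situation described above, and reinstalling the headers and the driver is the likely fix.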