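I actually solved my own problem and want to share the solution that worked for me.
The magic Google search was:
"modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/"
The author of this answer, Sneetsher, explains it very well; if the link hasn't 404'd, I would start there.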
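Cliff notes
Diagnosis: I suspect Ubuntu may have installed a kernel update when I rebooted.
Solution: Reinstalling the NVIDIA driver fixed the error.
Problem: The NVIDIA driver cannot be installed while the X server is running.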
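One way to check the diagnosis above is to look for NVIDIA kernel modules under the running kernel's module directory. This is only a minimal sketch, assuming GNU find (as shipped with Ubuntu) and module files named nvidia*.ko; has_nvidia_module is a hypothetical helper name, not part of any tool.

```shell
#!/usr/bin/env bash
# has_nvidia_module (hypothetical helper): succeeds if any nvidia*.ko file
# (optionally compressed) exists anywhere below the given modules directory.
has_nvidia_module() {
    local moddir="$1"
    # -print -quit stops at the first match; errors (e.g. missing directory) are hidden.
    [ -n "$(find "$moddir" -name 'nvidia*.ko*' -print -quit 2>/dev/null)" ]
}

# Check the kernel that is currently running.
if has_nvidia_module "/lib/modules/$(uname -r)"; then
    echo "NVIDIA module found for kernel $(uname -r)"
else
    echo "No NVIDIA module for kernel $(uname -r); the driver probably needs reinstalling"
fi
```

If the second message appears right after a kernel update, that would be consistent with the modprobe error quoted earlier.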
Two different ways to fix the NVIDIA driver
1) Keyboard and monitor:
Paraphrasing the askubuntu answer:
1) Switch to a text-only console (Ctrl+Alt+F1, or any of F2 through F6).
2) Build the driver modules for the current kernel (the one that was just installed):
$ sudo ./.run -K
I don't have a keyboard or monitor connected to this computer, so below is the "headless" method I actually used:
2) Via SSH:
Follow this guide to reboot into the console:
$ sudo cp -n /etc/default/grub /etc/default/grub.orig
$ sudo nano /etc/default/grub
$ sudo update-grub
The edits to make in the grub file, per the link above (3 changes):
Comment out the line GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" by adding # at the beginning; this disables the Ubuntu purple splash screen.
Change GRUB_CMDLINE_LINUX="" to GRUB_CMDLINE_LINUX="text"; this makes Ubuntu boot directly into text mode.
Uncomment the line #GRUB_TERMINAL=console by removing the # at the beginning; this puts the GRUB menu into plain black-and-white text mode (no background image).
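For anyone who would rather not make the three edits by hand in nano, they could be scripted roughly as follows. This is a sketch rather than what I ran: it assumes GNU sed and a grub file whose lines look like the stock ones quoted above, and apply_text_mode is a hypothetical helper name. Note that it overwrites whatever GRUB_CMDLINE_LINUX held before.

```shell
#!/usr/bin/env bash
# apply_text_mode (hypothetical helper): apply the three text-mode edits to the
# grub defaults file given as the first argument. Try it on a copy first.
apply_text_mode() {
    local file="$1"
    # 1) comment out GRUB_CMDLINE_LINUX_DEFAULT (& re-inserts the matched text)
    # 2) set GRUB_CMDLINE_LINUX to "text" (replaces any previous value)
    # 3) uncomment GRUB_TERMINAL=console
    sed -i \
        -e 's/^GRUB_CMDLINE_LINUX_DEFAULT=/#&/' \
        -e 's/^GRUB_CMDLINE_LINUX=.*/GRUB_CMDLINE_LINUX="text"/' \
        -e 's/^#[[:space:]]*GRUB_TERMINAL=console/GRUB_TERMINAL=console/' \
        "$file"
}

# Example on a scratch copy (paths are illustrative):
#   cp /etc/default/grub /tmp/grub.test
#   apply_text_mode /tmp/grub.test
#   diff /etc/default/grub /tmp/grub.test
```

Checking the diff on a scratch copy before touching /etc/default/grub seems safer, since a bad grub file can make the next boot harder to recover over SSH.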
UPDATE: (If running Ubuntu 16.04 If
$ sudo systemctl set-default multi-user.target
Reboot into the console:
$ sudo shutdown -r now
$ sudo service lightdm stop
$ sudo ./.run
Follow the prompts of the NVIDIA driver installer, then restore the original grub configuration:
$ sudo mv /etc/default/grub /etc/default/grub.textonly
$ sudo mv /etc/default/grub.orig /etc/default/grub
$ sudo update-grub
$ sudo shutdown -r now
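The two mv commands above swap the text-mode file out and the backup back in. A slightly more defensive version might look like the sketch below; restore_grub_defaults is a hypothetical helper name, and it takes the directory as an argument so it can be tried on a scratch copy before being pointed at /etc/default.

```shell
#!/usr/bin/env bash
# restore_grub_defaults (hypothetical helper): mirror the two mv commands,
# but refuse to touch anything when the grub.orig backup is missing.
restore_grub_defaults() {
    local dir="$1"
    if [ ! -f "$dir/grub.orig" ]; then
        echo "no backup at $dir/grub.orig; nothing changed" >&2
        return 1
    fi
    # Keep the text-mode variant for next time, then put the original back.
    mv "$dir/grub" "$dir/grub.textonly" &&
        mv "$dir/grub.orig" "$dir/grub"
}

# Real use would need root and still has to be followed by update-grub:
#   sudo bash -c "$(declare -f restore_grub_defaults); restore_grub_defaults /etc/default"
#   sudo update-grub
```

If the systemctl set-default multi-user.target route was used instead of the grub edits, the counterpart would presumably be sudo systemctl set-default graphical.target; I did not need that step myself.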
Results (the GPU is now detected successfully)...
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz')
I tensorflow/core/common_runtime/gpu/gpu_init.cc:118] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.342
pciBusID 0000:01:00.0
Total memory: 3.94GiB
Free memory: 3.88GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:138] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:148] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:868] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
(0, 113040.92)
(1, 94895.867)
...
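To confirm the same thing from a script instead of reading the log by eye, the device name can be pulled out of the output. This is a sketch that assumes the log format shown above (other TensorFlow versions may word it differently); gpu_names_from_log is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# gpu_names_from_log (hypothetical helper): read TensorFlow log text on stdin and
# print the "name:" value that follows each "Found device N with properties:" line.
gpu_names_from_log() {
    awk '/Found device [0-9]+ with properties:/ { want = 1; next }
         want && /^name: / { sub(/^name: /, ""); print; want = 0 }'
}

# Example (the script name is illustrative; the log lines may arrive on stderr,
# hence 2>&1):
#   python my_mnist_script.py 2>&1 | gpu_names_from_log
```

An empty result would suggest TensorFlow fell back to the CPU, which is what I saw before reinstalling the driver.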