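1. First, remove the existing driver and CUDA packages
# List the installed NVIDIA packages, then remove everything driver-related; the *nvidia* wildcard usually covers them all (quote it so the shell does not expand it against local files)
rpm -qa | grep -i nvidia | sort
yum remove "*nvidia*"
# List the installed CUDA packages, then remove everything CUDA-related; the *cuda* wildcard usually covers them all
rpm -qa | grep -i cuda | sort
yum remove "*cuda*"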
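Before removing anything, it can help to preview exactly which packages the wildcards will match. The sketch below is one way to do that; the function name `filter_pkgs` is made up for this example, and it only reads a package list from standard input, so it does not uninstall anything.

```shell
# Filter a package list (one name per line on stdin) down to the entries
# that contain the given keyword, case-insensitively, in sorted order.
# "filter_pkgs" is a hypothetical helper name used only in this sketch.
filter_pkgs() {
    grep -i -- "$1" | sort
}
```

On a real machine this would be fed from rpm, for example `rpm -qa | filter_pkgs nvidia` and `rpm -qa | filter_pkgs cuda`. If both come back empty after the `yum remove` commands, the old packages are gone.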
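2. Find the NVIDIA driver you want to install
[Official NVIDIA driver download page] https://www.nvidia.cn/Download/index.aspx?lang=cn
For example, I downloaded NVIDIA-Linux-x86_64-410.104.run. Run it directly (as root) and follow the on-screen prompts to install.
bash NVIDIA-Linux-x86_64-410.104.run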
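If you script this step, a thin wrapper around the installer can catch the two most common slips: a mistyped file name and forgetting to run as root. The following is only a sketch. Both function names are hypothetical, and the version parsing assumes the runfile keeps NVIDIA's usual `NVIDIA-Linux-x86_64-<version>.run` naming.

```shell
# Print the driver version embedded in a runfile name such as
# NVIDIA-Linux-x86_64-410.104.run (hypothetical helper for this sketch).
driver_version_from_runfile() {
    name=$(basename "$1" .run)   # drop the directory part and the .run suffix
    echo "${name##*-}"           # keep whatever follows the last '-'
}

# Check the preconditions, then hand over to the interactive installer.
install_driver() {
    runfile=$1
    [ -f "$runfile" ] || { echo "not found: $runfile" >&2; return 1; }
    [ "$(id -u)" -eq 0 ] || { echo "please run as root" >&2; return 1; }
    echo "installing driver $(driver_version_from_runfile "$runfile")"
    sh "$runfile"
}
```

Calling `install_driver NVIDIA-Linux-x86_64-410.104.run` then behaves like the plain command above, except that it refuses to start when the file is missing or the user is not root.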
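3. Find the download link for the CUDA version you want (each download page includes an installation guide; the runfile method is recommended, and you can simply follow the guide step by step)
[CUDA 9.0] https://developer.nvidia.com/cuda-90-download-archive
[CUDA 10.0] https://developer.nvidia.com/cuda-10.0-download-archive
[Latest release, CUDA 10.1 at the time of writing] https://developer.nvidia.com/cuda-downloads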
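After a runfile install, the toolkit binaries and libraries are usually not on the search paths yet. The snippet below sketches the usual environment setup. It assumes CUDA 10.0 installed to the runfile default location /usr/local/cuda-10.0; the `CUDA_HOME` variable is just a convenient name here, so adjust both to match your installation.

```shell
# Assumed install prefix: the runfile installer defaults to
# /usr/local/cuda-<version>. Override CUDA_HOME if yours differs.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda-10.0}

# Put the toolkit first on the search paths; the ${VAR:+...} form avoids
# leaving a stray ':' when the variable was previously empty.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Adding these lines to ~/.bashrc makes the setting persistent for new shells.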
4. Run nvidia-smi to verify the installation
nvidia-smi
Fri Jul 12 14:05:01 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:07.0 Off |                    0 |
| N/A   27C    P0    26W / 250W |      0MiB / 16280MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
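If you see a line like [NVIDIA-SMI 410.104 Driver Version: 410.104 CUDA Version: 10.0], the installation succeeded. Strictly speaking, the CUDA Version field shows the highest CUDA version the driver supports; running nvcc --version reports the installed toolkit itself.
Note: if only [NVIDIA-SMI 410.104 Driver Version: 410.104] appears, the driver and CUDA may not be linked up properly. Rerunning the driver installer (for version 410.104: sh NVIDIA-Linux-x86_64-410.104.run) should bring things back to normal.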
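The check described above can also be automated by pulling the two version fields out of the nvidia-smi header. This is a minimal sketch: `smi_field` and `check_install` are hypothetical names, and the parsing assumes the header layout shown in the sample output, which may differ between driver releases.

```shell
# Print the number that follows a label such as "Driver Version" in
# nvidia-smi output read from stdin; prints nothing if the label is absent.
smi_field() {
    sed -n "s/.*$1: *\([0-9.]*\).*/\1/p" | head -n 1
}

# Run nvidia-smi and report both versions; the return status is non-zero
# when nvidia-smi fails or the CUDA Version field is missing.
check_install() {
    header=$(nvidia-smi) || { echo "nvidia-smi failed" >&2; return 1; }
    driver=$(printf '%s\n' "$header" | smi_field "Driver Version")
    cuda=$(printf '%s\n' "$header" | smi_field "CUDA Version")
    echo "driver=${driver:-missing} cuda=${cuda:-missing}"
    [ -n "$cuda" ]
}
```

Running `check_install` after step 4 gives a one-line summary, and a failing status signals the case from the note, where rerunning the driver installer is worth trying.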