1 Pre-deployment preparation
1.1 Download the driver
Choosing a driver version
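NVIDIA drivers are backward compatible with older CUDA versions, while each newer CUDA version requires a minimum driver version:
- CUDA 10.0.x => Driver 410.48+
- CUDA 11.0.x => Driver 450.36.06+
We therefore recommend always choosing the latest GPU driver that the hardware supports.
You can look up and download a suitable driver from the official site: Official Drivers | NVIDIA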
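The minimum-version rule above can be checked in a script. The following is a minimal sketch, assuming GNU `sort` with the `-V` (version sort) option is available; `version_ge` is a hypothetical helper name, not part of any NVIDIA tool.

```shell
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Sorts the two versions with GNU sort -V and checks that B comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: is driver 510.85.02 new enough for CUDA 11.0.x (needs 450.36.06+)?
if version_ge "510.85.02" "450.36.06"; then
    echo "driver OK"
else
    echo "driver too old"
fi
```

Comparing with `sort -V` avoids the pitfalls of plain string or floating-point comparison on versions such as 450.36.06.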
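Download the NVIDIA driver from the company FTP server and upload it to the target environment. Download address:
If an NVIDIA driver was previously installed in the target environment, uninstall the old one first with the following commands:
Driver uninstall commands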
    [root@tos059 ~]# nvidia-uninstall
    [root@tos059 ~]# cuda-uninstall
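If an NVIDIA driver is already installed and currently in use, the installer will fail at startup with a "driver in use" error. In that case, reboot the system after running nvidia-uninstall, and then deploy the NVIDIA driver.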

1.2 Install dependencies
    yum -y install gcc kernel-devel-$(uname -r) kernel-headers-$(uname -r)
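The driver installer builds a kernel module, so the kernel-devel and kernel-headers packages need to match the running kernel. The sketch below re-checks the packages from the yum command above with `rpm -q`; `required_build_deps` is a hypothetical helper name, and on a system without `rpm` every package is simply reported as missing.

```shell
# required_build_deps [KVER]: print the packages from the yum command above,
# one per line, for kernel release KVER (defaults to the running kernel).
required_build_deps() {
    kver="${1:-$(uname -r)}"
    printf '%s\n' gcc "kernel-devel-${kver}" "kernel-headers-${kver}"
}

# Report any package that rpm does not list as installed.
required_build_deps | while read -r pkg; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
        echo "ok:      $pkg"
    else
        echo "missing: $pkg"
    fi
done
```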
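1.3 Conflict check
The operating system may ship with the nouveau driver, which conflicts with the NVIDIA driver. Disable the nouveau module before installing.

Disable nouveau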
    # Prevent the nouveau module from loading at boot
    echo -e "blacklist nouveau\noptions nouveau modeset=0" > /etc/modprobe.d/blacklist.conf
    # Regenerate the initial ramdisk image according to the new configuration
    # (dracut: "Creates initial ramdisk images for preloading modules").
    # At boot, Linux loads this image from disk into memory.
    mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
    dracut /boot/initramfs-$(uname -r).img $(uname -r)
    # Option 1 to apply the change (unverified; requires initramfs-tools)
    # update-initramfs
    # Option 2 to apply the change: reboot
    # reboot
    # Option 3 to apply the change
    # rmmod nouveau
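After the reboot or `rmmod`, one way to confirm that nouveau is no longer loaded is to look for it in /proc/modules, which is the same data `lsmod` prints. This is a minimal sketch; `module_loaded` is a hypothetical helper name, and its optional second argument exists only so the check can be exercised against a sample file.

```shell
# module_loaded NAME [FILE]: succeed when NAME is listed as a loaded module.
# FILE defaults to /proc/modules, where the first field of each line is the module name.
module_loaded() {
    [ -r "${2:-/proc/modules}" ] || return 1
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded nouveau; then
    echo "nouveau is still loaded; reboot or run 'rmmod nouveau'"
else
    echo "nouveau is not loaded"
fi
```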
2 Deploy the NVIDIA driver
Install the NVIDIA kernel module (the GPU driver) by running the following commands:
Driver installation commands
    [root@tdc-56 gpu-driver]# ./NVIDIA-Linux-x86_64-510.85.02.run --info
    Identification: NVIDIA Accelerated Graphics Driver for Linux-x86_64 510.85.02
    Target directory: NVIDIA-Linux-x86_64-510.85.02
    Uncompressed size: 885808 KB
    Compression: xz
    Date of packaging: Tue Jul 12 17:58:31 UTC 2022
    Application run after extraction: ./nvidia-installer
    The directory NVIDIA-Linux-x86_64-510.85.02 will be removed after extraction.

    # -q arg: "the default (normally 'yes') is assumed for all yes/no questions"
    [root@tdc-56 gpu-driver]# ./NVIDIA-Linux-x86_64-510.85.02.run -q
    Verifying archive integrity... OK
    Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 510.85.02.......
Installation process (follow the prompts and choose the next step)


After the installer finishes, run nvidia-smi. If it prints GPU information, the NVIDIA driver was deployed successfully:
    [root@tdc-56 gpu-driver]# nvidia-smi
    Thu Sep  8 04:32:21 2022
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:3B:00.0 Off |                    0 |
    | N/A   52C    P0    27W /  70W |      0MiB / 15360MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   1  Tesla T4            Off  | 00000000:AF:00.0 Off |                    0 |
    | N/A   60C    P0    29W /  70W |      0MiB / 15360MiB |      6%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
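For scripted verification, the driver version can be pulled out of the banner line of the nvidia-smi output shown above. The sketch below assumes the banner keeps the `Driver Version: X.Y.Z` layout seen in this output; `driver_version_from_smi` is a hypothetical helper name.

```shell
# driver_version_from_smi: read plain `nvidia-smi` output on stdin and print
# the value of the "Driver Version:" field from the banner line.
driver_version_from_smi() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi | driver_version_from_smi)
    echo "driver version: ${version:-unknown}"
else
    echo "nvidia-smi not found; the driver does not appear to be installed" >&2
fi
```

The parser is kept separate from the `nvidia-smi` call so it can be tried on saved output from another host.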
Appendix: CUDA version to driver version mapping
CUDA 12.1 Release Notes
CUDA Toolkit | Toolkit Driver Version (Linux x86_64) | Toolkit Driver Version (Windows x86_64) |
---|---|---|
CUDA 11.8 GA | >=520.61.05 | >=522.06 |
CUDA 11.7 Update 1 | >=515.48.07 | >=516.31 |
CUDA 11.7 GA | >=515.43.04 | >=516.01 |
CUDA 11.6 Update 2 | >=510.47.03 | >=511.65 |
CUDA 11.6 Update 1 | >=510.47.03 | >=511.65 |
CUDA 11.6 GA | >=510.39.01 | >=511.23 |
CUDA 11.5 Update 2 | >=495.29.05 | >=496.13 |
CUDA 11.5 Update 1 | >=495.29.05 | >=496.13 |
CUDA 11.5 GA | >=495.29.05 | >=496.04 |
CUDA 11.4 Update 4 | >=470.82.01 | >=472.50 |
CUDA 11.4 Update 3 | >=470.82.01 | >=472.50 |
CUDA 11.4 Update 2 | >=470.57.02 | >=471.41 |
CUDA 11.4 Update 1 | >=470.57.02 | >=471.41 |
CUDA 11.4.0 GA | >=470.42.01 | >=471.11 |
CUDA 11.3.1 Update 1 | >=465.19.01 | >=465.89 |
CUDA 11.3.0 GA | >=465.19.01 | >=465.89 |
CUDA 11.2.2 Update 2 | >=460.32.03 | >=461.33 |
CUDA 11.2.1 Update 1 | >=460.32.03 | >=461.09 |
CUDA 11.2.0 GA | >=460.27.03 | >=460.82 |
CUDA 11.1.1 Update 1 | >=455.32 | >=456.81 |
CUDA 11.1 GA | >=455.23 | >=456.38 |
CUDA 11.0.3 Update 1 | >=450.51.06 | >=451.82 |
CUDA 11.0.2 GA | >=450.51.05 | >=451.48 |
CUDA 11.0.1 RC | >=450.36.06 | >=451.22 |
CUDA 10.2.89 | >=440.33 | >=441.22 |
CUDA 10.1 (10.1.105 general release, and updates) | >=418.39 | >=418.96 |
CUDA 10.0.130 | >=410.48 | >=411.31 |
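
The table above can be turned into a small lookup for use in deployment scripts. This is only a sketch covering a few GA rows copied from the Linux x86_64 column; `min_linux_driver` is a hypothetical helper name, and the remaining rows would need to be added in the same way.

```shell
# min_linux_driver CUDA_RELEASE: print the minimum Linux x86_64 driver version
# for a CUDA GA release, using a subset of rows from the table above.
min_linux_driver() {
    case "$1" in
        11.8) echo "520.61.05" ;;
        11.7) echo "515.43.04" ;;
        11.6) echo "510.39.01" ;;
        11.5) echo "495.29.05" ;;
        11.4) echo "470.42.01" ;;
        10.2) echo "440.33" ;;
        *)    return 1 ;;   # release not covered by this subset
    esac
}

if required=$(min_linux_driver 11.6); then
    echo "CUDA 11.6 GA needs Linux driver >= ${required}"
else
    echo "CUDA release not in the lookup table" >&2
fi
```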