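NVIDIA GPU Driver for Linux / Ubuntu 22.04
I have recently been setting up our lab's shared server. I installed Ubuntu 22.04 and then the NVIDIA driver, CUDA, and cuDNN, so here are some brief notes. This post is part one; the CUDA and cuDNN configuration is too long, so it will go in the next post.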
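Main contents (summarized by GPT):
- Prerequisite packages: lists the dependencies needed before installing the driver.
- Disabling nouveau: disables the system's default nouveau driver by editing blacklist.conf, to avoid conflicts.
- Downloading and installing the driver: points to the official NVIDIA download page and gives the concrete installation steps, including how to answer the installer's prompts.
- Fixing the build error: describes how to fix the NVIDIA kernel module build error caused by an incompatible GCC version, how to switch to GCC 12, and how to verify the configuration.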
Driver installation
- Prerequisite packages:
sudo apt install gcc
sudo apt install g++
sudo apt install make
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
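Because the .run installer compiles a kernel module, it can be worth confirming that the build tools are really on the PATH and that headers for the running kernel are present before going further. The sketch below is one minimal way to check. The function names `missing_tools` and `have_kernel_headers` are made up for this example, and `/lib/modules/<release>/build` is the usual header location on Ubuntu rather than something stated in this post.

```shell
# Hypothetical helper: print each named command that cannot be found
# on PATH, one per line. Prints nothing when everything is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: succeed if a kernel build directory exists for
# the given release (defaults to the running kernel).
# Assumption: headers are linked at /lib/modules/<release>/build.
have_kernel_headers() {
    release=${1:-$(uname -r)}
    [ -d "/lib/modules/$release/build" ]
}
```

A typical call would be `missing_tools gcc g++ make`, followed by `have_kernel_headers || echo "kernel headers missing"`. On Ubuntu the headers normally come from the `linux-headers-$(uname -r)` package.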
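- Disable the default nouveau driver:
# Edit blacklist.conf
$ sudo gedit /etc/modprobe.d/blacklist.conf
# Append the following two lines at the end of the file
blacklist nouveau
options nouveau modeset=0
# Regenerate the initramfs so the change takes effect at boot
$ sudo update-initramfs -u
# After rebooting, verify; no output means nouveau is disabled
$ lsmod | grep nouveau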
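On a headless server where gedit is not available, the same two lines can be appended from a script. This is only a sketch: `ensure_line` and `blacklist_nouveau` are hypothetical helper names, and the target file is passed in as an argument so the function can be tried on a scratch file first. Each line is added only if it is not already there, so running it twice does not duplicate entries.

```shell
# Hypothetical helper: append a line to a file only if that exact
# line is not already present.
ensure_line() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: add the two nouveau entries from this post to
# the given modprobe configuration file.
blacklist_nouveau() {
    conf=$1
    ensure_line 'blacklist nouveau' "$conf"
    ensure_line 'options nouveau modeset=0' "$conf"
}
```

From a root shell, `blacklist_nouveau /etc/modprobe.d/blacklist.conf` would apply it to the file used in this post; `update-initramfs -u` and a reboot are still needed afterwards.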
- Download the driver version that matches your GPU from the official NVIDIA driver download page.
- Reboot the system and switch to the command-line interface (optional).
- Note: if an NVIDIA driver was installed previously, remove all of it first:
sudo apt-get remove 'nvidia-*'
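Before removing anything, it may be useful to see which NVIDIA packages are actually installed. The filter below is a small sketch that reads `dpkg -l` output on standard input; `installed_nvidia_packages` is a hypothetical name, and it only matches package names beginning with `nvidia-`, mirroring the pattern used in the removal command.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the
# names of fully installed packages (status "ii") that start with
# "nvidia-".
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia-/ { print $2 }'
}
```

Usage would be `dpkg -l | installed_nvidia_packages`; an empty result suggests there is nothing for apt to remove.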
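- Installation
- Make the driver .run file executable:
sudo chmod a+x NVIDIA-Linux-x86_64-550.67.run
- Install:
sudo ./NVIDIA-Linux-x86_64-550.67.run -no-x-check -no-nouveau-check -no-opengl-files
- no-x-check: do not abort the installation if an X server is detected running
- no-nouveau-check: skip the installer's check for a loaded nouveau driver
- no-opengl-files: install only the driver files and skip the OpenGL files, which avoids the login-loop problem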
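The commands above hard-code the driver file name. When the same steps are scripted for several machines, it can help to pull the version out of whatever .run file was downloaded. A minimal sketch, assuming the file follows the `NVIDIA-Linux-<arch>-<version>.run` naming seen in this post; `driver_version_from_run` is a hypothetical name.

```shell
# Hypothetical helper: extract the version from a file name such as
# NVIDIA-Linux-x86_64-550.67.run. Prints nothing and returns non-zero
# if the name does not follow that pattern.
driver_version_from_run() {
    name=$(basename "$1")
    case $name in
        NVIDIA-Linux-*-*.run) ;;
        *) return 1 ;;
    esac
    version=${name%.run}      # strip the .run suffix
    version=${version##*-}    # keep what follows the last hyphen
    printf '%s\n' "$version"
}
```

For the file used here, `driver_version_from_run NVIDIA-Linux-x86_64-550.67.run` prints `550.67`.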
- Installer prompts
- The distribution-provided pre-install script failed! Are you sure you want to continue?
- Choose Yes to continue
- Install NVIDIA's 32-bit compatibility libraries?
- Choose No to continue
- Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later.
- Choose No to continue
- Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.
- Choose Yes to continue
- After installation
- Load the NVIDIA kernel module:
sudo modprobe nvidia
- Check that the driver was installed successfully:
nvidia-smi
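On a multi-GPU server, reading the full `nvidia-smi` table by eye gets tedious. `nvidia-smi --query-gpu=driver_version --format=csv,noheader` prints one driver version per GPU, and the sketch below checks that every line matches the version that was just installed. `all_gpus_report` is a hypothetical name; the function reads the versions on standard input so it can be tried without a GPU.

```shell
# Hypothetical helper: read one driver version per line on stdin and
# succeed only if there is at least one line and every line equals
# the expected version given as $1.
all_gpus_report() {
    expected=$1
    awk -v want="$expected" '
        # Concatenating "" forces a string comparison, not a numeric one.
        { seen = 1; if (($0 "") != (want "")) bad = 1 }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}
```

For this post's driver the call would be `nvidia-smi --query-gpu=driver_version --format=csv,noheader | all_gpus_report 550.67 && echo "driver OK"`.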
Reference: "ubuntu20.04系统用.run文件安装nvidia显卡驱动" (Installing the NVIDIA GPU driver from a .run file on Ubuntu 20.04)
NVIDIA kernel module build error
- Error message:
ERROR: An error occurred while performing the step: "Building kernel modules". See /var/log/nvidia-installer.log for details.
ERROR: The nvidia kernel module was not created.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
- Inspect the installer log:
gedit /var/log/nvidia-installer.log
The log shows that the failure comes from gcc: the wrong version, gcc 11, was used.
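One way to see which compiler the module build expects is to look at the gcc that built the running kernel, which is recorded in `/proc/version`, and compare it with the default `gcc`. This is a sketch under an assumption: the post only says that gcc 11 was the wrong version, and a kernel built with a newer gcc is a common cause, not a confirmed one. `kernel_gcc_major` is a hypothetical name.

```shell
# Hypothetical helper: read a /proc/version-style line on stdin and
# print the major version of the gcc that built the kernel.
# Prints nothing if the line does not mention gcc.
kernel_gcc_major() {
    awk '{
        i = index($0, "gcc")
        if (i == 0) next
        rest = substr($0, i)
        # First dotted number after "gcc", e.g. 12.3 in "12.3.0".
        if (match(rest, /[0-9]+\.[0-9]+/)) {
            v = substr(rest, RSTART, RLENGTH)
            sub(/\..*/, "", v)   # keep only the major part
            print v
            exit
        }
    }'
}
```

Running `kernel_gcc_major < /proc/version` next to `gcc -dumpversion | cut -d. -f1` shows both major versions; if they differ, switching the default gcc as described in the fix is a reasonable next step.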
- Fix: install version 12 of gcc and g++
# Install
$ sudo apt install gcc-12 g++-12
# Raise the priority of gcc-12 and g++-12
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 100
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 100
# Lower the priority of gcc-11 and g++-11
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 90
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-11 90
# Confirm or switch the gcc and g++ selection; to be safe, pick version 12 in manual mode
$ sudo update-alternatives --config gcc
There are 4 choices for the alternative gcc (providing /usr/bin/gcc).
  Selection    Path              Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-12    100       auto mode
  1            /usr/bin/g++-11    90        manual mode
  2            /usr/bin/g++-12    100       manual mode
  3            /usr/bin/gcc-11    90        manual mode
  4            /usr/bin/gcc-12    100       manual mode
Press <enter> to keep the current choice[*], or type selection number:
# Verify that gcc-12 is now active; check the last line
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
... (omitted)
gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04)
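The listing above shows g++ paths inside the gcc group, which is easy to end up with when gcc and g++ are registered by hand. An alternative layout, not the one used in this post, is to register g++ as a slave link of gcc so that both always switch together. The sketch below only prints the command so it can be reviewed first; `alternatives_cmd` is a hypothetical name, and the `/usr/bin/gcc-<version>` paths are assumed to match the Ubuntu packages installed above.

```shell
# Hypothetical helper: print (do not run) an update-alternatives
# command that registers gcc-<version> with the given priority and
# ties g++-<version> to it as a slave link.
# Assumption: compilers live at /usr/bin/gcc-<version> and
# /usr/bin/g++-<version>.
alternatives_cmd() {
    version=$1
    priority=$2
    printf 'update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-%s %s --slave /usr/bin/g++ g++ /usr/bin/g++-%s\n' \
        "$version" "$priority" "$version"
}
```

`alternatives_cmd 12 100` and `alternatives_cmd 11 90` print the two commands, which would then be run with sudo. If g++ is already registered as its own group, as it is after the steps in this post, that group likely has to be cleared first with `sudo update-alternatives --remove-all g++`. Afterwards, `sudo update-alternatives --set gcc /usr/bin/gcc-12` selects version 12 without the interactive menu.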