NVIDIA A100-SXM4-40GB: The NVIDIA driver on your system is too old (found version 11040)

Plain nvidia-smi shows only part of the GPU version information; run nvidia-smi -q to see the full details.
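The two fields that matter for this problem can be pulled out of the nvidia-smi -q output with a small helper. This is a minimal sketch: smi_field is a hypothetical name, and it assumes the fields appear as "Driver Version : ..." and "CUDA Version : ..." lines, which is how nvidia-smi -q prints them on the drivers I have seen.

```shell
#!/usr/bin/env bash
# smi_field (hypothetical helper): read `nvidia-smi -q` output on stdin and
# print the value of the first line whose field name equals $1.
smi_field() {
    awk -v key="$1" '
        {
            i = index($0, ":")            # split at the first colon only
            if (i == 0) next
            name = substr($0, 1, i - 1)
            val  = substr($0, i + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (name == key) { print val; exit }
        }
    '
}

# Usage on a machine with the driver installed:
#   nvidia-smi -q | smi_field "Driver Version"
#   nvidia-smi -q | smi_field "CUDA Version"
```

Splitting at the first colon only keeps values that themselves contain colons intact.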

Error reported when the driver is too old:

RuntimeError: [address=0.0.0.0:33981, pid=1754380] The NVIDIA driver on your system is too old (found version 11040). 
Please update your GPU driver by downloading and installing a new version 
from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, 
go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
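The "found version 11040" in this message is not the driver package version. It is the CUDA version the driver supports, encoded as major*1000 + minor*10, so 11040 means CUDA 11.4. A small sketch to decode it (decode_cuda_version is a name chosen here for illustration):

```shell
#!/usr/bin/env bash
# Decode a CUDA version integer (major*1000 + minor*10) into "major.minor".
decode_cuda_version() {
    local v="$1"
    echo "$(( v / 1000 )).$(( (v % 1000) / 10 ))"
}

decode_cuda_version 11040   # value taken from the error message; prints 11.4
```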

Download the latest matching driver from the official NVIDIA website.

I checked that the highest CUDA version PyTorch supports is 12.4, so I chose the driver that corresponds to CUDA 12.4.
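As a rule of thumb, the CUDA version supported by the driver should be at least the CUDA version PyTorch was built with. The check can be sketched as below. The function names are hypothetical, sort -V assumes GNU coreutils (the default on Ubuntu), and the comparison is a simplification that ignores NVIDIA's minor-version compatibility rules.

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) if dotted version $1 >= dotted version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# $1: CUDA version supported by the driver
# $2: CUDA version PyTorch was built with
check_driver_for_torch() {
    if version_ge "$1" "$2"; then
        echo "ok: driver supports CUDA $1, PyTorch needs $2"
    else
        echo "too old: driver supports CUDA $1, PyTorch needs $2"
    fi
}

# Usage, assuming python3 with torch installed is on PATH:
#   torch_cuda=$(python3 -c 'import torch; print(torch.version.cuda)')
#   check_driver_for_torch 11.4 "$torch_cuda"
```

With the values from this post (driver at CUDA 11.4, PyTorch at 12.4) the check reports "too old", which matches the RuntimeError.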

Select Driver Version 550.127.05 and download it.

Downloaded file: NVIDIA-Linux-x86_64-550.127.05.run

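If downloading from NVIDIA is inconvenient for you, you can use the link I shared below. If it helps, please like and leave a comment. Thanks.

File shared via Baidu Netdisk: NVIDIA-Linux-x86_64-550.127.05.run
Link: https://pan.baidu.com/s/1EjpEZCV8K8i2hztbBPUM4Q?pwd=xcew Extraction code: xcew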

Then upload the driver to any directory on the target machine and make it executable:

chmod +x NVIDIA-Linux-x86_64-550.127.05.run 
 

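Uninstall the old version before installing. Do not skip this step, or the installer will report the warnings shown further below.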

sudo apt-get purge 'nvidia-*'

ls /usr/bin | grep nvidia
ls /lib/modules/$(uname -r)/kernel/drivers/video/ | grep nvidia

sudo rm -rf /usr/local/cuda*
sudo rm -rf /usr/bin/nvidia*
sudo rm -rf /lib/modules/$(uname -r)/kernel/drivers/video/nvidia*
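Before running the installer it is worth confirming that nothing was left behind. The following is a minimal sketch that repeats the two ls | grep checks above as one reusable function. list_nvidia_leftovers is a hypothetical helper, and the -mindepth/-maxdepth options assume GNU find.

```shell
#!/usr/bin/env bash
# Print every entry whose name starts with "nvidia" directly inside the
# given directories. Directories that do not exist are skipped silently.
list_nvidia_leftovers() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -mindepth 1 -maxdepth 1 -name 'nvidia*' -print
    done
}

# Usage with the same locations that were cleaned above:
#   list_nvidia_leftovers /usr/bin "/lib/modules/$(uname -r)/kernel/drivers/video"
# No output means those locations are clean.
```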

sudo ./NVIDIA-Linux-x86_64-550.127.05.run

Press Enter to start the installation.

WARNING: Continuing installation despite the presence of a loaded NVIDIA kernel module.  Some sanity checks will not be performed.  It is
         strongly recommended that you reboot your computer after installation is complete.  If the installation is not successful after
         rebooting the computer, you can run `nvidia-uninstall` to attempt to remove the NVIDIA driver.

                                                                       OK

WARNING: Your driver installation has been altered since it was initially installed; this may happen, for example, if you have since
         installed the NVIDIA driver through a mechanism other than nvidia-installer (such as your distribution's native package management
         system).  nvidia-installer will attempt to uninstall as best it can.  Please see the file '/var/log/nvidia-installer.log' for
         details.

                                                                       OK

Press Enter to confirm each prompt.

When the installation finishes, the installer reminds you to reboot. Running nvidia-smi before rebooting fails with:

Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.127

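At this point the latest kernel module has been installed, and the driver libraries need to be updated as well. Look up the matching version on the official site; my operating system is Ubuntu 20.04.

Find the matching file at the following address, then download and install it: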

https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64/cuda-compat-12-4_550.127.05-0ubuntu1_amd64.deb

If you cannot download this package, you can get it here:

File shared via Baidu Netdisk: cuda-compat-12-4_550.127.05-0ubuntu1_amd64.deb
Link: https://pan.baidu.com/s/1h8ycdIpGJsPf9oC7cFT9YQ?pwd=p2i4 Extraction code: p2i4

Install it:

sudo apt install ./cuda-compat-12-4_550.127.05-0ubuntu1_amd64.deb

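Note: the installation may fail at this step. If it does, uninstall the old version as the error message suggests.

Once the package is installed, apt reports: cuda-compat-12-4 is already the newest version (550.127.05-0ubuntu1).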

nvcc --version

Then reboot the server:

sudo reboot

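Result after the upgrade

If nvidia-smi still reports the error below after the reboot, the cleanup above was incomplete and the old version is still taking effect. Go back and repeat the cleanup steps until nothing is left.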

Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.127

/proc/driver/nvidia/version shows the version of the NVIDIA kernel driver that is currently loaded.
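The "Driver/library version mismatch" error means the loaded kernel module and the user-space NVML library report different versions, so reading the kernel side directly helps tell which half is stale. Here is a minimal sketch. kernel_module_version is a hypothetical helper, and it assumes the file starts with an "NVRM version:" line that contains the dotted driver version.

```shell
#!/usr/bin/env bash
# Print the driver version reported by the loaded NVIDIA kernel module.
# $1: file to read (defaults to /proc/driver/nvidia/version)
kernel_module_version() {
    local file="${1:-/proc/driver/nvidia/version}"
    grep -m1 '^NVRM version' "$file" \
        | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' \
        | head -n 1
}

# Usage:
#   kernel_module_version            # version of the module currently loaded
#   modinfo -F version nvidia        # version of the module installed on disk
# If the loaded version does not start with the NVML library version from the
# error (550.127 here), the old module is still in use.
```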
