What to do when using a GPU in Singularity fails with "NVIDIA-SMI couldn't find libnvidia-ml.so library in your system"

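This article covers the case where nvidia-smi cannot find libnvidia-ml.so inside a Singularity container. It shows how to locate the library and set its path correctly: use the libnvidia-ml.so that matches the installed NVIDIA driver, and adjust the LD_LIBRARY_PATH environment variable accordingly.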

Problem: running "/opt/app/singularity/bin/singularity exec --nv xxx.sif nvidia-smi" prints:

NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.

Please also try adding directory that contains libnvidia-ml.so to your system PATH.

Solution:
1. Search for the file with "sudo find / -name libnvidia-ml.so". The system returns:

/opt/app/cuda/12.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/cuda/11.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/cuda/11.7/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/cuda/11.8/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/nvidia/525.60.13/lib/libnvidia-ml.so

/opt/app/nvidia/525.60.13/lib32/libnvidia-ml.so

/opt/app/nvidia/535.154.05/lib/libnvidia-ml.so

/opt/app/nvidia/535.154.05/lib32/libnvidia-ml.so

/opt/app/nvidia/460.91.03/lib/libnvidia-ml.so

/opt/app/nvidia/460.91.03/lib32/libnvidia-ml.so

/opt/app/nvidia/450.216.04/lib/libnvidia-ml.so
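
The search can also be narrowed so that the copies under stubs/ directories (which the following steps rule out) never show up. The snippet below is a minimal sketch of that idea, not part of the original article; the function name list_real_nvml is hypothetical, and the search root defaults to / as in the command above.

```shell
# Hypothetical helper: list libnvidia-ml.so files under a search root,
# skipping any directory named "stubs" (the CUDA toolkit stub copies).
list_real_nvml() {
    local search_root="${1:-/}"
    # -prune stops find from descending into */stubs;
    # everything else is matched by name and printed.
    # Permission errors are discarded so the output stays readable.
    find "$search_root" -path '*/stubs' -prune -o \
        -name 'libnvidia-ml.so' -print 2>/dev/null
}
```

Calling "list_real_nvml /opt/app" (or "sudo bash -c" around it for a full scan of /) should then list only the driver copies, assuming the same directory layout as shown above.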

2. Choose one of the following, matching the driver version that nvidia-smi reports on the host:

/opt/app/nvidia/525.60.13/lib/libnvidia-ml.so

/opt/app/nvidia/525.60.13/lib32/libnvidia-ml.so

/opt/app/nvidia/535.154.05/lib/libnvidia-ml.so

/opt/app/nvidia/535.154.05/lib32/libnvidia-ml.so

/opt/app/nvidia/460.91.03/lib/libnvidia-ml.so

/opt/app/nvidia/460.91.03/lib32/libnvidia-ml.so

/opt/app/nvidia/450.216.04/lib/libnvidia-ml.so

None of the following may be used:

/opt/app/cuda/12.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/cuda/11.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/cuda/11.7/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

/opt/app/cuda/11.8/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

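Reason: the copies under stubs/ are stub libraries meant only for building. Running nvidia-smi against one of them prints: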

WARNING:
You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
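
Step 2 amounts to mapping the host driver version to a directory. The following is a rough sketch of that lookup, assuming the /opt/app/nvidia/<version>/lib layout seen in the find output; the function name nvml_dir_for_version and its second parameter are hypothetical and only serve the illustration.

```shell
# Hypothetical helper: print the driver library directory for a given
# driver version, or fail if that version has no libnvidia-ml.so.
# driver_root defaults to /opt/app/nvidia (the layout from the find output).
nvml_dir_for_version() {
    local version="$1"
    local driver_root="${2:-/opt/app/nvidia}"
    local candidate="$driver_root/$version/lib"
    if [ -e "$candidate/libnvidia-ml.so" ]; then
        printf '%s\n' "$candidate"
    else
        echo "no libnvidia-ml.so for driver $version under $driver_root" >&2
        return 1
    fi
}

# Example use on the host (left commented out because it needs a GPU):
# host_version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
# nvml_dir_for_version "$host_version"
```

Failing loudly when no matching directory exists is deliberate: picking a library from a different driver version is the mistake this step is meant to avoid.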

3. Then, on the host, run: /opt/app/singularity/bin/singularity shell --nv -B /opt/app/nvidia/535.154.05/lib/ xxx.sif

4. Inside the container, run: export LD_LIBRARY_PATH=/opt/app/nvidia/535.154.05/lib/:$LD_LIBRARY_PATH

5. Running nvidia-smi now returns its normal output.
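
Steps 3 to 5 can probably be folded into a single non-interactive command. The sketch below is one way to do that using only the options already shown (exec, --nv, -B); it is not from the original article. The function name run_nvidia_smi_in_container is hypothetical, and it assumes the image provides a POSIX sh.

```shell
# Hypothetical wrapper: bind the driver library directory into the
# container, prepend it to LD_LIBRARY_PATH there, and run nvidia-smi.
#   $1: path to the singularity binary
#   $2: driver library directory on the host
#   $3: container image (.sif)
run_nvidia_smi_in_container() {
    local singularity_bin="$1"
    local nvml_dir="$2"
    local image="$3"
    # The inner sh receives the directory as its first positional
    # argument, so no host-side quoting of the path is needed.
    "$singularity_bin" exec --nv -B "$nvml_dir" "$image" \
        sh -c 'export LD_LIBRARY_PATH="$1:$LD_LIBRARY_PATH"; exec nvidia-smi' \
        sh "$nvml_dir"
}
```

With the paths from this article, the call would be "run_nvidia_smi_in_container /opt/app/singularity/bin/singularity /opt/app/nvidia/535.154.05/lib xxx.sif".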
