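Problem description:
After building an Ubuntu image with Singularity and installing MindSpore in it, running the sample program fails with "cudaSetDevice failed, ret[999], unknown error".
Host operating system: CentOS Linux release 7.4.1708 (Core)
Singularity version: 3.5.2
Image operating system: Ubuntu 18.04.5 LTS (Bionic Beaver)
Image source: docker://nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
MindSpore 1.1.1 was installed in the image by following the installation steps, and the installation succeeded. Verifying it with the sample program produces the following error: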
[ERROR] DEVICE(11474,python):2021-04-01-01:26:20.266.013 [mindspore/ccsrc/runtime/device/gpu/cuda_driver.cc:244] set_current_device] cudaSetDevice failed, ret[999], unknown error
[ERROR] SESSION(11474,python):2021-04-01-01:26:20.266.099 [mindspore/ccsrc/backend/session/gpu_session.cc:97] Init] GPUSession failed to set current device id.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/envs/python3.7/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 186, in __call__
    return _run_op(self, self.name, args)
  File "/opt/anaconda3/envs/python3.7/lib/python3.7/site-packages/mindspore/common/api.py", line 75, in wrapper
    results = fn(*arg, **kwargs)
  File "/opt/anaconda3/envs/python3.7/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 525, in _run_op
    output = real_run_op(obj, op_name, args)
RuntimeError: mindspore/ccsrc/backend/session/gpu_session.cc:97 Init] GPUSession failed to set current device id.
Does anyone know what causes this error and how to fix it?
Solution:
The error comes from the machine environment.
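Because the failure is attributed to the machine environment, one way to narrow it down is to check whether the CUDA driver is usable inside the container at all, independently of MindSpore. The script below is a minimal sketch and not part of the original answer. It assumes the driver library is exposed inside the container as libcuda.so.1, and the helper name probe_cuda_driver is hypothetical. It calls cuInit and cuDeviceGetCount from the CUDA driver API through ctypes and reports the raw status code, where 0 means success and 999 is the same "unknown error" code seen in the log above.

```python
import ctypes

CUDA_SUCCESS = 0  # CUresult value for a successful driver API call


def probe_cuda_driver(lib_name="libcuda.so.1"):
    """Return (status, device_count) as seen by the CUDA driver API.

    status is None when the driver library cannot be loaded, otherwise it
    is the integer CUresult of the first failing call (0 on success).
    lib_name is an assumption about how the driver is exposed.
    """
    try:
        libcuda = ctypes.CDLL(lib_name)
    except OSError:
        # The driver library is not visible in this environment.
        return None, 0

    # cuInit(unsigned int flags) must be called before any other driver call.
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    status = libcuda.cuInit(0)
    if status != CUDA_SUCCESS:
        return status, 0

    # cuDeviceGetCount(int *count) reports how many GPUs the driver sees.
    libcuda.cuDeviceGetCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
    libcuda.cuDeviceGetCount.restype = ctypes.c_int
    count = ctypes.c_int(0)
    status = libcuda.cuDeviceGetCount(ctypes.byref(count))
    if status != CUDA_SUCCESS:
        return status, 0
    return CUDA_SUCCESS, count.value


if __name__ == "__main__":
    status, count = probe_cuda_driver()
    if status is None:
        print("CUDA driver library could not be loaded")
    elif status != CUDA_SUCCESS:
        print("CUDA driver call failed with status", status)
    else:
        print("CUDA driver OK, visible devices:", count)
```

If the library cannot be loaded, or the probe also returns a nonzero status, the fault is likely below MindSpore (host driver, device files, or how the GPU is passed into the container). With Singularity, the host driver libraries are normally made available to the container through the --nv option; the thread does not say whether it was used here.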