Problem description
When installing OpenPCDet, running
python setup.py develop
fails with the following error:
UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /opt/conda/conda-bld/pytorch_1616554790289/work/c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-10.2'
running develop
running egg_info
writing pcdet.egg-info/PKG-INFO
writing dependency_links to pcdet.egg-info/dependency_links.txt
writing requirements to pcdet.egg-info/requires.txt
writing top-level names to pcdet.egg-info/top_level.txt
reading manifest file 'pcdet.egg-info/SOURCES.txt'
writing manifest file 'pcdet.egg-info/SOURCES.txt'
running build_ext
building 'pcdet.ops.iou3d_nms.iou3d_nms_cuda' extension
Traceback (most recent call last):
File "setup.py", line 125, in <module>
'src/sampling_gpu.cu',
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/develop.py", line 34, in run
self.install_for_development()
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/develop.py", line 136, in install_for_development
self.run_command('build_ext')
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 708, in build_extensions
build_ext.build_extensions(self)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
self._build_extensions_serial()
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
self.build_extension(ext)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 524, in unix_wrap_ninja_compile
cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 423, in unix_cuda_flags
cflags + _get_cuda_arch_flags(cflags))
File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1561, in _get_cuda_arch_flags
arch_list[-1] += '+PTX'
IndexError: list index out of range
Solution
As the log shows, the first error reported is:
CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
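The IndexError at the end of the traceback appears to be a consequence of this warning rather than a separate problem: with zero visible devices, the list of GPU architectures that the build collects stays empty, so `arch_list[-1]` has nothing to index. Below is a minimal sketch of that logic. It is a simplified paraphrase, not the actual PyTorch source; the function name `arch_list_sketch` and its argument are hypothetical, and the real code queries the devices through `torch.cuda` instead of taking a list.

```python
def arch_list_sketch(device_capabilities):
    """Simplified paraphrase of how the arch list is built when
    TORCH_CUDA_ARCH_LIST is not set (hypothetical helper, not PyTorch API).

    device_capabilities: one (major, minor) tuple per visible GPU.
    """
    arch_list = []
    for major, minor in device_capabilities:
        arch = f"{major}.{minor}"
        if arch not in arch_list:
            arch_list.append(arch)
    arch_list = sorted(arch_list)
    # Same statement as the last frame of the traceback; it raises
    # IndexError when no device was visible and the list is empty.
    arch_list[-1] += "+PTX"
    return arch_list


if __name__ == "__main__":
    print(arch_list_sketch([(7, 5)]))  # one visible GPU
    try:
        arch_list_sketch([])           # no visible GPU, as in the log above
    except IndexError as exc:
        print("IndexError:", exc)
```

So the thing to fix is the CUDA initialization failure; once the GPU is visible again, the build should get past this point.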
The fix comes from this comment on the PyTorch issue tracker:
https://github.com/pytorch/pytorch/issues/49081#issuecomment-766793705
yurunsheng1 commented on 25 Jan
apt-get install nvidia-modprobe
This works for me.
This also worked for me.
The nvidia-modprobe utility is used by user-space NVIDIA driver components to make sure the NVIDIA kernel module is loaded and that the NVIDIA character device files are present. These facilities are normally provided by Linux distribution configuration systems such as udev.
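To check whether the device files mentioned above are present, and whether PyTorch can see the GPU after the fix, something like the following sketch may help. The file names `nvidiactl`, `nvidia-uvm`, and `nvidia0` are the ones typically found under `/dev` on a Linux machine with a single NVIDIA GPU; that list is an assumption and may differ on other setups. The helper names are hypothetical.

```python
import os

# Assumed typical device file names for a single-GPU Linux machine.
EXPECTED_DEVICE_FILES = ["nvidiactl", "nvidia-uvm", "nvidia0"]


def missing_nvidia_device_files(dev_dir="/dev"):
    """Return the expected device file names that do not exist in dev_dir."""
    return [
        name
        for name in EXPECTED_DEVICE_FILES
        if not os.path.exists(os.path.join(dev_dir, name))
    ]


def report_torch_cuda():
    """Print what PyTorch sees, if PyTorch is installed."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch.cuda.is_available():", torch.cuda.is_available())
    print("torch.cuda.device_count():", torch.cuda.device_count())


if __name__ == "__main__":
    print("missing device files:", missing_nvidia_device_files())
    report_torch_cuda()
```

If no device files are missing and the device count is at least 1, rerunning python setup.py develop is the next thing to try.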