Fixing CUDA error (3): initialization error (multiprocessing)

While running multi-process training with PyTorch's torch.multiprocessing, I hit the following error:

CUDA error (3): initialization error (multiprocessing)

After some searching, I learned that adding the following line before calling any torch function

torch.multiprocessing.set_start_method('spawn')

resolves the problem.
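Here is a minimal sketch of where that call might go in a script. The names train_worker and launch are hypothetical and not from my real code, and the fallback to the standard library's multiprocessing (which exposes the same set_start_method function) is only there so the sketch runs on a machine without PyTorch.

```python
try:
    import torch.multiprocessing as mp  # thin wrapper around the stdlib module
except ImportError:
    import multiprocessing as mp  # fallback so the sketch runs without PyTorch


def train_worker(rank):
    # Hypothetical worker body: a real one would build the model and
    # touch CUDA here, inside the child process.
    print(f"worker {rank} started")


def launch(num_workers=2):
    # Start the workers, wait for them, and return their exit codes.
    procs = [
        mp.Process(target=train_worker, args=(rank,))
        for rank in range(num_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == "__main__":
    # Select 'spawn' once, before any process is created or CUDA is used.
    mp.set_start_method("spawn")
    print(launch())
```

With 'spawn', each child re-imports the main module, so the call and the process launch sit under the `if __name__ == "__main__":` guard and the worker is defined at module level.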

But then a new error appeared:

RuntimeError: context has already been set

I found the answer in a GitHub issue: the culprit was the tqdm library, and upgrading it to 4.29.0 or later fixed it.
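For context, "context has already been set" is raised when set_start_method is called after the start method has already been fixed, whether by an earlier call or by some imported library. If upgrading is not an option, one possible workaround (my own assumption, not something from the GitHub issue) is to check the current method first and only force a change when needed. The helper name ensure_spawn is hypothetical; the standard library module is used here because torch.multiprocessing re-exports the same functions.

```python
import multiprocessing as mp  # torch.multiprocessing re-exports these functions


def ensure_spawn():
    """Select 'spawn' without raising if a start method was already fixed."""
    # allow_none=True reports None instead of silently fixing the default.
    current = mp.get_start_method(allow_none=True)
    if current is None:
        # Nothing has been chosen yet, so the plain call is safe.
        mp.set_start_method("spawn")
    elif current != "spawn":
        # Another method was fixed earlier; force=True overrides it.
        mp.set_start_method("spawn", force=True)
    return mp.get_start_method()
```

Calling ensure_spawn() a second time is harmless, because the plain set_start_method call is skipped once 'spawn' is already in place. Forcing a change after processes or locks have been created under another start method may still cause trouble, so this is best called as early as possible.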
