return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None [A comprehensive list of fixes]

The error:
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

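Fix 1: Set the environment variable CUDA_LAUNCH_BLOCKING=1, as the error message suggests. This makes CUDA calls synchronous, so the stack trace points at the call that actually failed.

The simplest way is to set the variable on the command line when you launch the script:

CUDA_LAUNCH_BLOCKING=1 python train.py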
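
If launching from the command line is inconvenient (for example in a notebook or an IDE run configuration), the same variable can also be set from Python. A minimal sketch, assuming it runs before the first CUDA call, most safely before `import torch`:

```python
import os

# Assumption: this runs before CUDA is initialized (safest: before `import torch`),
# otherwise the setting may have no effect for the current process.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Synchronous launches slow training down, so it is worth removing the setting again once the failing call has been found.
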
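Fix 2: The author of the code you copied probably trained on a machine with several GPUs, and the device indices in the code do not match your hardware. Check every place where a device is named.
Quick search: press Ctrl+F and search for cuda or device to see whether a specific index is hard-coded.

For example, the code you copied may use cuda:2.

On a laptop or a single-GPU desktop, the only valid device is cuda:0.

Replacing cuda:2 with cuda:0 resolves the error.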
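
Instead of editing every hard-coded index by hand, the requested device can be checked against the number of GPUs PyTorch can actually see. The sketch below is only an illustration: `pick_device` is a hypothetical helper, not part of PyTorch, and falling back to cuda:0 (or to the CPU when no GPU is visible) is an assumption about what you want.

```python
try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def pick_device(requested: str, gpu_count: int) -> str:
    """Return `requested` if it is usable with `gpu_count` GPUs, else a fallback."""
    if not requested.startswith("cuda"):
        return requested              # e.g. "cpu" is always valid
    if gpu_count == 0:
        return "cpu"                  # no visible GPU at all
    _, _, index = requested.partition(":")
    if index == "" or int(index) < gpu_count:
        return requested              # bare "cuda" or an index that exists
    return "cuda:0"                   # index out of range: use the first GPU


gpu_count = torch.cuda.device_count() if torch is not None else 0
device = pick_device("cuda:2", gpu_count)
print(f"visible GPUs: {gpu_count}, using: {device}")
```

The returned string can be passed to `torch.device(...)` or directly to `.to(...)`.
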
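Fix 3: Your library versions may be incompatible with your GPU driver. This is relatively rare. Do not swap drivers lightly, since that can cost a lot of time; rule out the other causes first.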
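
Before touching the driver, it helps to see what the installed PyTorch build reports about itself. A small sketch, where `cuda_report` is a hypothetical helper name and the fields are simply the ones that seem most useful for this check:

```python
try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def cuda_report() -> dict:
    """Collect the version facts that matter for driver/library compatibility."""
    if torch is None:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,       # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` prints None, the installed PyTorch is a CPU-only build, and no GPU index will work until a CUDA-enabled build is installed.
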

Fix 4: Reboot (I know this sounds absurd, but some blog posts do report that it works).

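Fix 5: Memory problems
1. Switch to a more powerful GPU.

2. Raise the memory limit to its maximum.

3. Free memory you no longer need:
import torch
# Release cached GPU memory that PyTorch is no longer using
if hasattr(torch.cuda, 'empty_cache'):
    torch.cuda.empty_cache()

4. You may need to kill some processes that are holding GPU memory.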
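
To see whether freeing memory made a difference, the cache release in step 3 can be combined with PyTorch's memory counters. This is a sketch under assumptions: `free_cuda_cache` is a hypothetical helper, and it reports zeros when PyTorch or a GPU is not available.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def free_cuda_cache() -> dict:
    """Run garbage collection, release cached GPU blocks, and report usage in bytes."""
    gc.collect()                           # drop unreachable Python objects first
    if torch is None or not torch.cuda.is_available():
        return {"allocated": 0, "reserved": 0}
    torch.cuda.empty_cache()               # hand unused cached blocks back to the driver
    return {
        "allocated": torch.cuda.memory_allocated(),  # bytes held by live tensors
        "reserved": torch.cuda.memory_reserved(),    # bytes held by PyTorch's allocator
    }


print(free_cuda_cache())
```

Note that `torch.cuda.empty_cache()` cannot free memory that live tensors still occupy; drop references to large tensors (for example with `del`) before calling it.
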
Related problems you may hit while debugging:
ImportError: cannot import name 'COMMON_SAFE_ASCII_CHARACTERS' from 'charset_normalizer.constant'

Fix:
pip install chardet
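RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_mm)
or
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_mm)

This happens because the tensors taking part in one operation live on different devices. In the messages above, one operand is on cuda:0 and the other is still on the CPU; the same kind of error appears when tensors sit on different GPUs.

Fix:

device = input.device
Then append .to(device) to the tensor (for example an intermediate output) that is on the other device.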
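
The same idea can be wrapped so that the input always follows the model. A minimal sketch: `forward_on_model_device` is a hypothetical helper, and the `Linear` layer and tensor shapes are placeholders chosen only for the demonstration.

```python
try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def forward_on_model_device(model, x):
    """Move `x` to the device holding the model's parameters, then run the model."""
    device = next(model.parameters()).device
    return model(x.to(device))


if torch is not None:
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model = torch.nn.Linear(4, 2).to(device)
    x = torch.randn(3, 4)                  # new tensors are created on the CPU by default
    y = forward_on_model_device(model, x)
    print(y.device, tuple(y.shape))
```

Reading the device from the model's parameters, rather than hard-coding cuda:0, keeps the code working unchanged on CPU-only machines and on machines with several GPUs.
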

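In most cases the first three fixes are enough. The most likely cause is that the code you copied or adapted was written for hardware that does not match yours, so check the device settings carefully, or use .to(device) to place the computation on your own GPU or GPUs.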

I read a lot of blog posts while putting this together, and I hope this article helps you solve the problem in one place. The posts I consulted are listed below.

If anything here infringes your rights, please contact me and I will remove it.

https://blog.51cto.com/u_15717393/5471457#_22

Debug Pytorch: RuntimeError: CUDA error: device-side assert triggered_return t.to(device, dtype if t.is_floating_point()-CSDN blog

https://wenku.csdn.net/answer/5b75c85ae73511edbcb5fa163eeb3507

Problem solved | RuntimeError: CUDA error: invalid device ordinalCUDA kernel errors_runtimeerror: cuda error: invalid device ordinal c-CSDN blog

[PyTorch] CUDA error: device-side assert triggered_return t.to(device, dtype if t.is_floating_point()-CSDN blog

https://www.cnblogs.com/urahyou/p/17832397.html

Solving RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:-CSDN blog
