Fixing RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:

Original post · last modified 2025-05-13 09:51:21 · 21k views

License: CC 4.0 BY-SA

Tags: #deep learning #python #artificial intelligence

First published 2023-05-24 09:25:10

The full error message is:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_mm)

or

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_mm)

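This error occurs because a tensor somewhere in the code is on a different device from the tensors it is used with. Under normal circumstances it does not appear, but in my experience the following situations tend to cause it: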

1. When you train on multiple GPUs in parallel:

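In this case the model receives tensors not only on cuda:0 but also on cuda:1, cuda:2, and so on. The forward function of your network is then extremely prone to this error, because fixed-shape tensors inside it, such as weights and biases, can easily end up on the wrong device.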

2. When you use a single GPU (cuda:0):

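Here, as long as the device is specified correctly in your code, device problems should not occur. If the error still appears, a tensor was never moved to CUDA and stayed on the CPU. I happened to hit this one day during training. The reasoning is the same as above: the most likely culprits are fixed-shape tensors in the network, such as weights and biases, or the output of the forward function.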
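One way such a tensor can stay behind on the CPU is sketched below. This is an illustration under my own assumptions, not code from the original training script: the `Scaler` module and its attribute names are hypothetical. It shows that `model.to(device)` moves parameters and registered buffers, while a tensor stored as a plain attribute is not registered and therefore stays on the CPU.

```python
import torch
import torch.nn as nn


class Scaler(nn.Module):
    """Hypothetical module used only to illustrate the failure mode."""

    def __init__(self, dim: int = 4):
        super().__init__()
        # Parameters of submodules are moved by model.to(device)
        self.linear = nn.Linear(dim, dim)
        # Registered buffers are also moved by model.to(device)
        self.register_buffer("scale", torch.ones(dim))
        # A plain tensor attribute is NOT registered, so it stays on the CPU
        self.offset = torch.zeros(dim)

    def forward(self, x):
        # With the model on cuda:0, `+ self.offset` mixes cuda:0 and cpu tensors
        return self.linear(x) * self.scale + self.offset


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = Scaler().to(device)

# Names that model.to(device) knows about and will move
registered = {name for name, _ in model.named_parameters()}
registered |= {name for name, _ in model.named_buffers()}

print(sorted(registered))   # "offset" is not among them
print(model.offset.device)  # still cpu, even if the model is on cuda:0
```

If a fixed tensor like `offset` is needed, registering it with `register_buffer` (or wrapping it in `nn.Parameter` when it should be trained) lets it follow the model to whichever device the model is moved to.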

Here is the fix:

1. In your network's forward function, first read the device of the input, then append .to(device) to the output.
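A minimal sketch of this fix follows. The `Net` class, its layer sizes, and the `mask` tensor are hypothetical and only serve as an example. Creating `mask` directly on the input's device is my own addition to the step described above.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    """Hypothetical network; the layer sizes are arbitrary."""

    def __init__(self, in_dim: int = 8, out_dim: int = 4):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # Read the device from the input instead of hard-coding "cuda:0"
        device = x.device
        # Tensors created inside forward default to the CPU,
        # so create them on the input's device explicitly
        mask = torch.ones(self.fc.out_features, device=device)
        out = self.fc(x) * mask
        # .to(device) is a no-op when `out` is already on that device
        return out.to(device)


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = Net().to(device)
x = torch.randn(2, 8, device=device)
y = model(x)
print(y.device == x.device)
```

Because the device is taken from `x`, the same forward function runs unchanged on the CPU, on a single GPU, or on whichever GPU a replica of the model receives its input on.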