[Solved] RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cu

鳗小鱼

Last modified 2023-12-12 20:47:57


Category: Bugs (program errors). Tags: deep learning, artificial intelligence, machine learning, linux, ubuntu, python, server

First published 2023-11-16 00:34:46

Article link: https://blog.csdn.net/BetrayFree/article/details/134432328


Problem description

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!

Solution

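This error is fairly common. During training, every tensor involved in an operation has to be on the same device, and here some of them ended up on a different one. The fix is to adjust the device setting. Change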

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

to

device = torch.device('cuda:1' if torch.cuda.is_available() else 'cpu')

Then press Ctrl+F in each .py file and search for the keyword "cuda". Wherever you find the line below, change

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

to

device = torch.device('cuda:1' if torch.cuda.is_available() else 'cpu')

Done!
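The Ctrl+F search across files can also be scripted. The following is only a rough sketch using the Python standard library: `find_cuda_mentions` is a hypothetical helper name, and it assumes the project's source files end in `.py` and are UTF-8 encoded.

```python
from pathlib import Path


def find_cuda_mentions(root, keyword="cuda"):
    """Return (path, line number, stripped line) for each line containing `keyword`."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        # errors="ignore" keeps the scan going if a file is not valid UTF-8
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if keyword in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_cuda_mentions(".")` from the project root lists every line that mentions "cuda", so hard-coded settings such as 'cuda:0' can be reviewed in one pass.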

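Summary

You may have noticed that the two code changes above are identical. That does not matter: the point is simply that every place that sets the device must name the same device.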
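One way to keep the settings from drifting apart is to choose the device in a single place and reuse it. The following is a minimal sketch, not the original project's code: `pick_device` is a hypothetical helper, and the small `nn.Linear` model and random input are stand-ins for a real model and batch. Unlike the one-line setting above, this helper falls back to the CPU when the requested GPU index does not exist.

```python
import torch
import torch.nn as nn


def pick_device(index: int = 0) -> torch.device:
    """Return cuda:<index> if that GPU exists, otherwise the CPU."""
    if torch.cuda.is_available() and index < torch.cuda.device_count():
        return torch.device(f"cuda:{index}")
    return torch.device("cpu")


# Choose the device once; every other module should reuse this object.
device = pick_device(1)

model = nn.Linear(4, 2).to(device)     # parameters are moved to `device`
inputs = torch.randn(8, 4).to(device)  # the batch is moved to the same device
outputs = model(inputs)                # both operands share one device

print(outputs.device)
```

In a multi-file project, the assumption is that `device` would live in one module and be imported elsewhere rather than redefined in each file.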

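Further reading

First, note that there are two possible causes:

Two or more variables in the same operation are split between the CPU and the GPU

Find the line that raises the error and see which variables or data the computation uses. Then, in debug mode, check the .is_cuda attribute of each one to see which are on the GPU and which are on the CPU, and move them all to the CPU or all to the GPU.
If you have added new variables, move those to the GPU as well.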

Reference: "Solving RuntimeError: Expected all tensors to be on the same device, but found at least two devices," (CSDN blog), https://blog.csdn.net/yimenren/article/details/124106706
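The check for the CPU/GPU case can be sketched as follows. This is an illustration under assumptions: `report_devices` and `to_common_device` are hypothetical helpers, and the tensors `a` and `b` stand in for whatever variables appear on the failing line. On a machine without CUDA everything stays on the CPU, so the mismatch itself will not reproduce there.

```python
import torch


def report_devices(**tensors: torch.Tensor) -> dict:
    """Map each tensor's name to the device it currently lives on."""
    return {name: str(t.device) for name, t in tensors.items()}


def to_common_device(target: torch.device, *tensors: torch.Tensor):
    """Return copies of the tensors on `target` (no copy if already there)."""
    return tuple(t.to(target) for t in tensors)


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.ones(3, device=device)  # created directly on the chosen device
b = torch.ones(3)                 # a newly added variable: lands on the CPU by default

print(report_devices(a=a, b=b))   # shows which device each tensor is on
print(a.is_cuda, b.is_cuda)       # the attribute mentioned above

# Move everything to one device before computing.
a, b = to_common_device(device, a, b)
total = a + b
```

`.device` is often more informative than `.is_cuda` because it also shows the GPU index, which matters for the second cause below.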

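One tensor is on GPU 0 and another is on GPU 1

This error has many possible triggers, but they all come down to the model, the inputs, and the model's internal parameters not being on the same GPU. I ran into it while debugging the RandLA-Net PyTorch source code with the goal of training on two GPUs. After some trial and error I solved it, and I am recording it here as a pointer for others.
Debugging showed that the error was raised mainly during data concatenation, where one tensor was on GPU 0 and the other on GPU 1. The solution above fixes it. You can also refer to this blogger's post: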

"PyTorch multi-GPU training with torch.nn.DataParallel: parameters on different devices" (CSDN blog), https://blog.csdn.net/weixin_41496173/article/details/119789280
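For the concatenation case described above, a minimal sketch of the idea is to move every piece to one device before calling `torch.cat`. This is not the RandLA-Net code: `cat_on_device` is a hypothetical helper, and `x` and `y` are placeholder tensors. The second device is cuda:1 only when two GPUs are present; otherwise both tensors share one device and the example still runs.

```python
import torch


def cat_on_device(tensors, device, dim=0):
    """Move every tensor to `device`, then concatenate along `dim`."""
    return torch.cat([t.to(device) for t in tensors], dim=dim)


n_gpus = torch.cuda.device_count()
dev_a = torch.device("cuda:0") if n_gpus >= 1 else torch.device("cpu")
dev_b = torch.device("cuda:1") if n_gpus >= 2 else dev_a

x = torch.zeros(2, 3, device=dev_a)
y = torch.ones(4, 3, device=dev_b)

# With two GPUs, torch.cat([x, y]) would raise the error from the title;
# moving both tensors to dev_a first avoids it.
merged = cat_on_device([x, y], dev_a)
print(merged.shape, merged.device)
```

Copying between GPUs has a cost, so when this happens inside a training loop it is usually better to find out why the tensors were created on different devices in the first place.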