Project scenario:
A problem that came up while running a neural network model in PyTorch.
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
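Problem description:
The error RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! is raised because the tensors used in a single operation are stored on different devices, some on the CPU and some on the GPU.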
The full error message is:
File "G:/ADNI_TRAIN/ADNI_train3/MRI_data/AD_MCI_MRI_daima/MRI_dataload.py", line 133, in forward
out = self.postion_embedding(x)
File "C:\Users\Administrator\.conda\envs\lyf\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "G:/ADNI_TRAIN/ADNI_train3/MRI_data/AD_MCI_MRI_daima/MRI_dataload.py", line 175, in forward
out = x + nn.Parameter(self.pe, requires_grad=False).to(self.device)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
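Cause analysis:
The operands of the addition are not on the same device: x is on the CPU, while the nn.Parameter has been moved to the GPU by .to(self.device).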
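One quick way to confirm this kind of mismatch is to print the `.device` of each operand just before the failing line. The following is a minimal sketch, not the original code: the helpers `describe_devices` and `on_same_device` are hypothetical names introduced here for illustration, and the tensor shapes are arbitrary. It falls back to the CPU when no GPU is available, so it also runs on a CPU-only machine.

```python
import torch


def describe_devices(**tensors):
    """Map each tensor's name to its device string (hypothetical helper)."""
    return {name: str(t.device) for name, t in tensors.items()}


def on_same_device(**tensors):
    """Return True if all given tensors live on one device (hypothetical helper)."""
    return len({t.device for t in tensors.values()}) <= 1


# Use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.zeros(2, 4, 8)           # tensors are created on the CPU by default
pe = torch.zeros(4, 8).to(device)  # stand-in for the positional encoding

print(describe_devices(x=x, pe=pe))
if not on_same_device(x=x, pe=pe):
    x = x.to(device)               # move x to the device that pe is on
out = x + pe                       # both operands now share a device
```

On a machine with a GPU, the printed dictionary would show x on the CPU and pe on the GPU, which is exactly the situation the error message describes.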
Solution:
Move x onto the GPU as well.
Find the line that raises the error and, just before it, add a line that moves x to the same device:
x = x.to(self.device)
With this change, the relevant part of forward becomes:
x = x.to(self.device)  # move the input to the same device as the positional encoding
pe = nn.Parameter(self.pe, requires_grad=False).to(self.device)
# print('pe.size:', pe.size())
out = x + pe
out = self.dropout(out)
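As an alternative to wrapping self.pe in nn.Parameter on every forward pass, a fixed tensor can be registered as a buffer, so that it follows the module whenever `model.to(device)` is called. The sketch below assumes a sinusoidal positional encoding and uses a hypothetical class name `PositionalEncoding` with illustrative parameters `d_model`, `max_len`, and `dropout`; the original post does not show how self.pe is built, so treat those details as assumptions rather than the author's implementation.

```python
import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Positional encoding stored as a buffer (illustrative sketch)."""

    def __init__(self, d_model=8, max_len=16, dropout=0.1):
        super().__init__()
        # Assumed sinusoidal table; d_model must be even for this layout.
        position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float32)
            * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # A buffer is not trained, but it moves with the module in .to()/.cuda().
        self.register_buffer("pe", pe)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Move the input to wherever the buffer lives, then add the encoding.
        x = x.to(self.pe.device)
        out = x + self.pe[: x.size(1)]
        return self.dropout(out)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = PositionalEncoding().to(device)  # moves self.pe together with the module
model.eval()                             # disable dropout for a deterministic result
x = torch.zeros(2, 4, 8)                 # input deliberately left on the CPU
out = model(x)
```

With this layout the module no longer needs its own self.device attribute: the buffer's device is the single source of truth, and the input is moved to match it.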