RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
Cause of this error: the model and the data are not on the same device.
Solution:
Put them on the same device, i.e. move both the model and the data to the GPU: net.to("cuda") and data = data.to("cuda"). Note that for a plain tensor, .to() returns a new tensor rather than moving it in place, so the result must be assigned back.
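The fix above can be sketched as follows; `net` and `data` here are placeholder names for a minimal model and input. The sketch falls back to CPU when no GPU is available:

```python
import torch
import torch.nn as nn

# pick the GPU if one is available, otherwise stay on CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = nn.Linear(4, 2)
net.to(device)  # Module.to() moves parameters in place

data = torch.randn(3, 4)
data = data.to(device)  # Tensor.to() returns a NEW tensor; reassignment is required

out = net(data)  # both sides now live on the same device, so addmm succeeds
```

Forgetting the reassignment on the tensor side (`data.to(device)` without `data = ...`) is the most common way this error survives an apparent fix.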
If the same error still appears after that, check the model definition: layers such as nn.Linear() and nn.LayerNorm() must be created inside the __init__ method, not inside forward(), so that they are registered as submodules and moved by .to("cuda").
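A minimal illustration of the point above (the class names `Bad`/`Good` are mine, for contrast). A layer constructed inside forward() is never registered on the module, so model.to("cuda") cannot move its weights and they stay on CPU:

```python
import torch.nn as nn

class Bad(nn.Module):
    def forward(self, x):
        # Created on every call; not a registered submodule, so
        # Bad().to("cuda") leaves these weights on the CPU.
        return nn.Linear(4, 2)(x)

class Good(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)  # registered in __init__; .to() moves it

    def forward(self, x):
        return self.fc(x)
```

You can verify the difference without a GPU: `list(Bad().parameters())` is empty, while `Good` exposes its weight and bias.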
When composing several sub-blocks into one module, do not collect them in a plain Python list []; always use nn.ModuleList(), otherwise the sub-blocks are not registered and .to("cuda") will not move them. For example:
self.mlp_blocks = nn.ModuleList()
for _ in range(num_blocks):
    self.mlp_blocks.append(MixerBlock(tokens_mlp_dim, channels_mlp_dim, tokens_hidden_dim, channels_hidden_dim))
https://blog.csdn.net/weixin_43760844/article/details/116047427