Error message:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [32, 2]], which is output 0 of TBackward, is at version 5; expected version 4 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
Background: a multi-objective optimization task. Two losses, loss1 and loss2, are computed and then summed to obtain the loss that is backpropagated.
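Solution:
The error says that gradient computation failed because of a version mismatch caused by an in-place operation. This usually means a tensor that autograd saved for the backward pass was modified in place afterwards. PyTorch keeps a version counter for each tensor and raises this error when the saved version no longer matches the current one.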
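To make the version mismatch concrete, here is a minimal sketch that is not taken from my training code. The tensor names w and y are made up for illustration, and _version is an internal attribute used here only to peek at the counter.

```python
import torch

# Hypothetical tensors, for illustration only.
w = torch.ones(3, requires_grad=True)
y = w.exp()                # exp() saves its output y for the backward pass
version_before = y._version
y.add_(1)                  # in-place update bumps y's version counter
version_after = y._version

message = None
try:
    y.sum().backward()     # backward needs the saved y, which has changed
except RuntimeError as e:
    message = str(e)

# Out-of-place variant: y2 + 1 creates a new tensor, the saved one is untouched.
y2 = w.exp()
z = y2 + 1
z.sum().backward()
```

In this sketch the in-place add_ is what invalidates the saved tensor, and the out-of-place addition backpropagates without complaint.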
Tutorials I found online suggest fixes such as:
- loss.backward(retain_graph=True)
- Use torch.autograd.set_detect_anomaly(True) to locate the failing operation, then add .clone().detach() to the variable involved
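As a rough sketch of the second suggestion, the snippet below turns anomaly detection on around a deliberately broken toy computation, then shows a clone-based fix. The tensors are hypothetical. I use only .clone() here, because adding .detach() also cuts the gradient flow through that tensor, which may or may not be what you want.

```python
import torch

w = torch.ones(3, requires_grad=True)   # hypothetical parameter

# With anomaly detection on, PyTorch also prints the forward-pass traceback
# of the operation whose backward failed (as a warning on stderr).
torch.autograd.set_detect_anomaly(True)
caught = None
try:
    y = w.exp()
    y.add_(1)              # in-place edit of a tensor saved for backward
    y.sum().backward()
except RuntimeError as e:
    caught = str(e)
finally:
    torch.autograd.set_detect_anomaly(False)   # it slows training, so turn it off

# Possible fix: modify a copy, so the tensor saved by exp() stays intact.
y = w.exp()
z = y.clone()
z.add_(1)
z.sum().backward()
```

Anomaly detection is a debugging aid only, so it should not stay enabled in a normal training run.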
In the end, the problem turned out to be the expression used to compute the loss. I had written
loss = loss + loss1
loss = loss + loss2
The correct form is
loss = loss1 + loss2
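One plausible explanation, assuming loss was carried over from earlier iterations, is that the accumulated loss still pointed at the old computation graph, whose saved weights had since been updated in place by optimizer.step(). The sketch below reproduces that situation with a made-up two-layer model and random data. None of these names come from my real code.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Hypothetical model and data, for illustration only.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, target = torch.randn(4, 8), torch.randn(4, 2)

def two_losses():
    out = model(x)
    return nn.functional.mse_loss(out, target), out.abs().mean()

# Accumulating pattern: loss keeps the graphs of all previous iterations.
loss = torch.zeros(())
error = None
try:
    for _ in range(2):
        loss1, loss2 = two_losses()
        loss = loss + loss1
        loss = loss + loss2
        opt.zero_grad()
        loss.backward(retain_graph=True)
        opt.step()          # updates the weights in place
except RuntimeError as e:
    error = str(e)

# Fresh loss every iteration: only the current graph is traversed.
for _ in range(2):
    loss1, loss2 = two_losses()
    loss = loss1 + loss2
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this sketch the second backward of the accumulating loop fails because it walks back through the first iteration's graph, while the fresh-loss loop needs neither retain_graph=True nor any clone.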