While running UNet code, the following error was raised:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 64, 256, 256]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Roughly, this means that while computing gradients, autograd detected that a tensor needed for the backward pass had been modified by an in-place operation.
Following the hint, enable anomaly detection by calling torch.autograd.set_detect_anomaly(True) (note that this is a function call, not an assignment), then run again to get a more detailed traceback pointing at the forward-pass operation that produced the failing tensor.
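As a minimal sketch (assuming a standard PyTorch install), the call looks like this; the context-manager form torch.autograd.detect_anomaly() scopes it to one region:

```python
import torch

# Correct: a function call. Assigning True to the attribute
# (torch.autograd.set_detect_anomaly = True) would just overwrite the
# function object and never enable anything.
torch.autograd.set_detect_anomaly(True)

# Scoped alternative: enable it only around the suspect forward/backward.
with torch.autograd.detect_anomaly():
    x = torch.randn(2, requires_grad=True)
    x.sum().backward()

torch.autograd.set_detect_anomaly(False)  # disable after debugging; it is slow
```

With this enabled, the RuntimeError above is accompanied by a second traceback showing the forward-pass line that created the offending tensor.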
So what is an inplace operation?
An in-place operation is an operation that changes the content of a given tensor directly, without making a copy.
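The traceback above can be reproduced with a toy example (not from the original UNet code): ReluBackward0 saves the ReLU output for the backward pass, so mutating that output in place bumps its version counter from 0 to 1, and backward refuses to run:

```python
import torch

x = torch.ones(3, requires_grad=True)
y = torch.relu(x)   # ReluBackward0 saves y (the output) for the backward pass
y.add_(1)           # in-place op: y's version counter goes from 0 to 1

try:
    y.sum().backward()
    failed = False
except RuntimeError as e:
    # "... output 0 of ReluBackward0, is at version 1; expected version 0"
    failed = True
    print(type(e).__name__)
```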
Solution:
Locate the position reported in the traceback:
- In PyTorch, an in-place operation can come from methods such as .add_() or .scatter_(). .add_() modifies the tensor directly, so rewrite x.add_(y) as x = x + y. If you need a separate copy, you can use the .clone() method, following the approach in the second referenced post.
- In plain Python syntax, an in-place operation can come from operators such as += or *=. For example, x += y should be rewritten as x = x + y.
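Both rewrites can be checked with a small sketch (toy tensors, not the original code): the out-of-place form allocates a new tensor, so the activation saved for backward stays at version 0, and .clone() gives a safely mutable copy:

```python
import torch

x = torch.ones(3, requires_grad=True)
y = torch.relu(x)

# Out-of-place rewrite: y itself is left untouched, so backward succeeds.
z = y + 1           # instead of y += 1 or y.add_(1)
z.sum().backward()
print(x.grad)       # tensor([1., 1., 1.])

# If an in-place update is convenient, mutate a copy instead:
y2 = torch.relu(x).clone()
y2.add_(1)          # safe: only the clone changes, not the saved activation
```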
Final fix:
class double_conv(nn.Module):
    '''(conv => BN => ReLU) * 2'''
    def __init__(self, in_ch, out_ch):
        super(double_conv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU()
        )
        self.channel_conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_ch)
        )

    def forward(self, x):
        residual = x
        x = self.conv(x)
        if residual.shape[1] != x.shape[1]:
            residual = self.channel_conv(residual)
        x = x + residual  # original code: x += residual
        return x
Changing x += residual to x = x + residual fixed the problem: here x is the output of an nn.ReLU layer, and ReLU's backward pass needs that output unmodified, so mutating it in place bumps its version counter and triggers the error.
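As a quick sanity check (the input shape here is chosen arbitrarily), the fixed module now runs forward and backward without the version-counter error:

```python
import torch
import torch.nn as nn

class double_conv(nn.Module):
    '''(conv => BN => ReLU) * 2'''
    def __init__(self, in_ch, out_ch):
        super(double_conv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU()
        )
        self.channel_conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_ch)
        )

    def forward(self, x):
        residual = x
        x = self.conv(x)
        if residual.shape[1] != x.shape[1]:
            residual = self.channel_conv(residual)
        x = x + residual  # out-of-place add; x += residual raised the error
        return x

net = double_conv(3, 64)
inp = torch.randn(1, 3, 32, 32, requires_grad=True)
out = net(inp)
out.mean().backward()   # completes with no RuntimeError
print(out.shape)        # torch.Size([1, 64, 32, 32])
```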