Background
My teacher said dropout is usually placed after the activation, so I tried it in PyTorch. Adding a dropout layer right after ReLU raised this error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Process
- Searching on GitHub turned up "we cannot have two subsequent inplace operations". The answerer also gave an example: "If you use ABN and you use an inplace activation function (e.g. RELU with inplace set to True) then the subsequent layer cannot be inplace". (A minimal reproduction follows this list.)
- Not knowing what ABN was, I looked it up: In-Place Activated BatchNorm. This clever trick fuses BN and the activation into a single layer, performs them in place, and thereby saves memory.
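
To see the "two subsequent inplace operations" failure in isolation, here is a minimal sketch (the layer sizes are made up): an in-place ReLU makes autograd save its output tensor for the backward pass, and an in-place dropout then mutates that very tensor, so the version-counter check fails when backward() runs.

```python
import torch
import torch.nn as nn

lin = nn.Linear(8, 8)
act = nn.ReLU(inplace=True)
drop = nn.Dropout(0.5, inplace=True)  # second in-place op on the same tensor

x = torch.randn(4, 8)
h = lin(x)    # non-leaf tensor, so in-place ops are legal in the forward pass
h = act(h)    # in-place ReLU: autograd saves this very tensor for its backward
h = drop(h)   # mutates the saved tensor and bumps its version counter
h.sum().backward()
# RuntimeError: one of the variables needed for gradient computation
# has been modified by an inplace operation
```

Setting inplace=False on either of the two layers breaks the chain and the backward pass goes through.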
Solution
Just delete inplace=True from the nn.Dropout3d(dropout_rate, inplace=True) dropout-layer definitions in the EncoderBlock below. The fixed block:
```python
import torch
import torch.nn as nn


class Sine(nn.Module):
    '''Sine activation; minimal stand-in for the custom module the original project defines elsewhere.'''
    def __init__(self, w0=1.0):
        super(Sine, self).__init__()
        self.w0 = w0

    def forward(self, x):
        return torch.sin(self.w0 * x)


class EncoderBlock(nn.Module):
    '''
    Encoder block; Green
    '''
    def __init__(self, inChans, outChans, stride=1, padding=1, num_groups=8, activation="relu",
                 normalization="group_normalization", dropout_rate=None):
        super(EncoderBlock, self).__init__()
        self.dropout_flag = False
        if normalization == "group_normalization":
            self.norm1 = nn.GroupNorm(num_groups=num_groups, num_channels=inChans)
            self.norm2 = nn.GroupNorm(num_groups=num_groups, num_channels=inChans)
        if activation == "relu":
            self.actv1 = nn.ReLU(inplace=True)
            self.actv2 = nn.ReLU(inplace=True)
        # elif activation == "elu":
        #     self.actv1 = nn.ELU(inplace=True)
        #     self.actv2 = nn.ELU(inplace=True)
        elif activation == "sin":
            self.actv1 = Sine(1.0)
            self.actv2 = Sine(1.0)
        if dropout_rate is not None:
            self.dropout_flag = True
            # inplace=True removed: the in-place ReLU before each dropout already
            # reuses the tensor autograd saved for ReLU's backward, and a second
            # in-place op on that tensor triggers the RuntimeError.
            self.dropout1 = nn.Dropout3d(dropout_rate)
            self.dropout2 = nn.Dropout3d(dropout_rate)
        self.conv1 = nn.Conv3d(in_channels=inChans, out_channels=outChans, kernel_size=3, stride=stride, padding=padding)
        self.conv2 = nn.Conv3d(in_channels=inChans, out_channels=outChans, kernel_size=3, stride=stride, padding=padding)

    def forward(self, x):
        residual = x  # residual connection; assumes inChans == outChans
        out = self.norm1(x)
        out = self.actv1(out)
        if self.dropout_flag:
            out = self.dropout1(out)
        out = self.conv1(out)
        out = self.norm2(out)
        out = self.actv2(out)
        if self.dropout_flag:
            out = self.dropout2(out)
        out = self.conv2(out)
        out += residual
        return out
```
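
As a quick sanity check (the shapes here are hypothetical), the fixed block now survives a backward pass:

```python
block = EncoderBlock(inChans=32, outChans=32, dropout_rate=0.5)
x = torch.randn(1, 32, 16, 16, 16)  # (N, C, D, H, W) volume
block(x).sum().backward()           # no RuntimeError once inplace=True is gone
```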