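Author: 雷杰
Link: https://www.zhihu.com/question/67209417/answer/302434279
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for permission; for non-commercial reprints, please credit the source.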
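I just fell into this trap and nearly cried (T_T): I had clearly added a hundred dropouts, so why didn't the results change at all?
When you use F.dropout (nn.functional.dropout), you have to set its training argument so that it matches the training state of the model as a whole.
For example: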
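import torch.nn as nn
import torch.nn.functional as F

class DropoutFC(nn.Module):
    def __init__(self):
        super(DropoutFC, self).__init__()
        self.fc = nn.Linear(100, 20)

    def forward(self, input):
        out = self.fc(input)
        # training is not passed, so this call ignores Net.train() / Net.eval()
        out = F.dropout(out, p=0.5)
        return out

Net = DropoutFC()
Net.train()
# train the Net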
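The F.dropout call in this code does nothing at all, because its training argument always stays at its default value, which was False in the PyTorch version I was using. (In newer PyTorch releases the default is training=True, so the failure flips: dropout stays active even after Net.eval().) F.dropout is just an external function called from forward, so changes to the model's overall training state never change the training argument of that call. As a result, out = F.dropout(out) here is simply out = out.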
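To make the role of the training argument concrete, here is a minimal sketch that calls F.dropout on a tensor of ones with the flag set explicitly both ways. The tensor size, the seed, and the variable names are illustrative assumptions, not part of the original answer.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)          # illustrative seed, only for repeatability
x = torch.ones(1000)          # illustrative input

# training=False: dropout is a no-op and returns the input values unchanged
off = F.dropout(x, p=0.5, training=False)

# training=True: roughly half the entries are zeroed, the rest scaled by 1 / (1 - p)
on = F.dropout(x, p=0.5, training=True)

unchanged = torch.equal(off, x)
dropped_fraction = (on == 0).float().mean().item()
kept_values = on[on != 0]

print("training=False leaves input unchanged:", unchanged)
print("fraction dropped with training=True:", dropped_fraction)
print("surviving values are scaled to:", kept_values.unique().tolist())
```

Passing the flag explicitly, as done here, gives the same behavior regardless of which default the installed PyTorch version uses.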
The correct usage is shown below: pass the model's overall training state into the dropout function.
class DropoutFC(nn.Module):
    def __init__(self):
        super(DropoutFC, self).__init__()
        self.fc = nn.Linear(100, 20)

    def forward(self, input):
        out = self.fc(input)
        # self.training is toggled by Net.train() / Net.eval()
        out = F.dropout(out, p=0.5, training=self.training)
        return out

Net = DropoutFC()
Net.train()
# train the Net
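One way to convince yourself that the corrected module follows train() and eval() is to count zeros in its output in each mode. This is only a rough sketch: the class is redefined here so the snippet is self-contained, and the batch size, seed, and variable names are my own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropoutFC(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(100, 20)

    def forward(self, input):
        out = self.fc(input)
        # forward the module's own mode to the functional dropout
        return F.dropout(out, p=0.5, training=self.training)

torch.manual_seed(0)              # illustrative seed
net = DropoutFC()
x = torch.randn(64, 100)          # illustrative batch

net.train()
zero_fraction_train = (net(x) == 0).float().mean().item()

net.eval()
with torch.no_grad():
    first, second = net(x), net(x)
zero_fraction_eval = (first == 0).float().mean().item()
same_in_eval = torch.equal(first, second)

print("zeros in train mode:", zero_fraction_train)
print("zeros in eval mode:", zero_fraction_eval)
print("eval output is deterministic:", same_in_eval)
```

In train mode about half of the activations should come out as zero; in eval mode none should, and repeated forward passes should match exactly.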
Alternatively, use nn.Dropout() directly (nn.Dropout() is really just a wrapper around F.dropout that also passes self.training in):
class DropoutFC(nn.Module):
    def __init__(self):
        super(DropoutFC, self).__init__()
        self.fc = nn.Linear(100, 20)
        # registered as a submodule, so it follows Net.train() / Net.eval()
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, input):
        out = self.fc(input)
        out = self.dropout(out)
        return out

Net = DropoutFC()
Net.train()
# train the Net
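The reason the nn.Dropout variant needs no extra argument is that train() and eval() propagate to registered submodules. The short sketch below just inspects the training flag on the dropout layer after each call; the variable names are illustrative assumptions.

```python
import torch.nn as nn

class DropoutFC(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(100, 20)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, input):
        return self.dropout(self.fc(input))

net = DropoutFC()

net.train()
flag_in_train = net.dropout.training   # set by net.train()

net.eval()
flag_in_eval = net.dropout.training    # set by net.eval()

print("dropout.training after train():", flag_in_train)
print("dropout.training after eval():", flag_in_eval)
```

Because the flag lives on the submodule itself, there is nothing to forget in forward, which is why this form is harder to get wrong than the functional call.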