A few points need to be made clear here:
- During training, a dropout layer zeroes each neuron independently with probability p; it does not pick a fixed fraction p of the layer's neurons to hide (see the sketch after this list).
- During training, the output $o$ of a dropout layer is not simply the input $i$: the units that survive output $\dfrac{i}{1-p}$ instead. The reason is explained after the sketch below.
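Both points are easy to verify in PyTorch (a minimal sketch; the seed, p=0.5, and the all-ones input are arbitrary choices): each of the 10 units is zeroed independently, so the count of zeros fluctuates around 5 rather than being exactly 5, and every surviving unit is scaled from 1.0 to 1/(1-0.5) = 2.0.

import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # a freshly built module is in training mode
x = torch.ones(10)
for _ in range(3):
    y = drop(x)
    # the zero count varies per run; all survivors equal 1/(1-p) = 2.0
    print(int((y == 0).sum()), y)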
As for why the output is scaled: according to the paper, the two phases of the original dropout are (note that $p$ in the paper denotes the keep probability, while $p$ in PyTorch and Caffe denotes the drop probability, so the two $p$'s are not the same): at training time $o = m \cdot i$ with $m \sim \mathrm{Bernoulli}(1-p)$ (writing it with the framework's drop probability $p$), and at test time the weights are multiplied by the keep probability, i.e. $o = (1-p) \cdot i$. The frameworks instead use inverted dropout: the training phase outputs $\dfrac{i}{1-p}$ for surviving units, so the test phase can output $i$ directly, which is equivalent to the paper's scheme.
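This equivalence can be checked empirically (a sketch; the sample count of 100000 is arbitrary): averaging many training-mode forward passes should approximately recover the input, because each unit survives with probability 1-p and is scaled by 1/(1-p), so the expected output equals the input and the test phase is free to return $i$ unchanged.

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(5)
drop = nn.Dropout(p=0.5)  # training mode: keep with prob 1-p, scale by 1/(1-p)
avg = torch.stack([drop(x) for _ in range(100000)]).mean(dim=0)
print(x)    # original input
print(avg)  # close to x, since E[output] = (1-p) * x / (1-p) = x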
Test code:
import torch
import torch.nn as nn

class Model1(nn.Module):
    # Model 1: functional dropout
    def __init__(self, p=0.0):
        super().__init__()
        self.p = p

    def forward(self, inputs):
        # training=True is hardcoded on purpose; see the note after the output
        return nn.functional.dropout(inputs, p=self.p, training=True)

class Model2(nn.Module):
    # Model 2: dropout module
    def __init__(self, p=0.0):
        super().__init__()
        self.drop_layer = nn.Dropout(p=p)

    def forward(self, inputs):
        return self.drop_layer(inputs)

model1 = Model1(p=0.5)  # functional dropout
model2 = Model2(p=0.5)  # dropout module

# create inputs
inputs = torch.rand(10)
print("inputs:", inputs)
print()

# forward the inputs in train mode
print('Normal (train) model:')
print('Model 1', model1(inputs))
print('Model 2', model2(inputs))
print()

# switch to eval mode
model1.eval()
model2.eval()

# forward the inputs in evaluation mode
print('Evaluation mode:')
print('Model 1', model1(inputs))
print('Model 2', model2(inputs))
print()

# show model summaries
print('Print summary:')
print(model1)
print(model2)
Output:
inputs: tensor([0.1426, 0.6055, 0.0692, 0.3617, 0.7946, 0.4689, 0.1311, 0.9336, 0.8236, 0.7306])
Normal (train) model:
Model 1 tensor([0.2851, 0.0000, 0.1384, 0.7234, 0.0000, 0.9378, 0.2622, 1.8672, 0.0000, 0.0000])
Model 2 tensor([0.2851, 0.0000, 0.0000, 0.0000, 0.0000, 0.9378, 0.2622, 1.8672, 1.6471, 0.0000])
Evaluation mode:
Model 1 tensor([0.2851, 1.2109, 0.0000, 0.7234, 1.5891, 0.9378, 0.0000, 1.8672, 0.0000, 0.0000])
Model 2 tensor([0.1426, 0.6055, 0.0692, 0.3617, 0.7946, 0.4689, 0.1311, 0.9336, 0.8236, 0.7306])
Print summary:
Model1()
Model2(
(drop_layer): Dropout(p=0.5)
)
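Two things are worth noting in the output above. Model 1 still zeroes units after model1.eval(): nn.functional.dropout knows nothing about the module's train/eval state because training=True is hardcoded in its forward. It also leaves no trace in the printed summary, since a functional call is not a registered submodule. nn.Dropout handles both automatically, which is why Model 2 becomes an identity map in eval mode and shows up in the summary. If the functional form is preferred, a minimal fix (a sketch; the class name Model1Fixed is mine) is to pass the module's own flag:

import torch.nn as nn

class Model1Fixed(nn.Module):
    # functional dropout that respects train()/eval() mode
    def __init__(self, p=0.0):
        super().__init__()
        self.p = p

    def forward(self, inputs):
        # forward the module's own training flag instead of hardcoding True
        return nn.functional.dropout(inputs, p=self.p, training=self.training)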
References:
- https://stackoverflow.com/questions/53419474/nn-dropout-vs-f-dropout-pytorch
- https://stackoverflow.com/questions/34597316/why-input-is-scaled-in-tf-nn-dropout-in-tensorflow