【Deep Learning】Implementing Dropout (PyTorch)

3.13 Using Dropout to Address Overfitting

3.13.1 Method

The dropout described in this section refers specifically to inverted dropout.

Let the dropout probability be p. Then with probability p the hidden unit h_i is zeroed out, and with probability 1 − p it is divided by 1 − p (stretched), so that the expected value of the input fed to the next layer is unchanged.
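Written out as a formula (this restates the rule above; ξ_i is introduced here as the keep/drop indicator for hidden unit h_i):

$$h_i' = \frac{\xi_i}{1-p}\, h_i, \qquad \xi_i \in \{0, 1\}, \quad P(\xi_i = 1) = 1 - p$$

so that $E[h_i'] = \frac{E[\xi_i]}{1-p} h_i = h_i$, i.e. dropout does not change the expectation of the hidden unit.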

When testing the model, we generally do not apply dropout, so that the results are more deterministic.

3.13.2 Implementation from Scratch

%matplotlib inline
import torch
import torch.nn as nn
import numpy as np
import sys
sys.path.append('..')
import d2lzh_pytorch as d2l

def dropout(X, drop_prob):
    X = X.float()
    assert 0 <= drop_prob <= 1
    keep_prob = 1 - drop_prob
    # if keep_prob is 0, every element is dropped
    if keep_prob == 0:
        return torch.zeros_like(X)
    # torch.rand draws from U[0, 1), so each element is kept with probability keep_prob
    mask = (torch.rand(X.shape) < keep_prob).float()
    return mask * X / keep_prob
X = torch.arange(16).view(2, 8)
dropout(X, 0)
tensor([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11., 12., 13., 14., 15.]])
dropout(X, 0.5)
tensor([[ 0.,  2.,  4.,  0.,  0., 10., 12.,  0.],
        [ 0.,  0.,  0., 22., 24., 26., 28., 30.]])
dropout(X, 1)
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.]])
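As a quick sanity check (new code, not part of the original), we can confirm numerically that inverted dropout roughly preserves the mean of its input; Z here is just an illustrative test tensor:

Z = torch.ones(1000, 1000)
# each element is kept with probability 0.5 and the survivors are scaled by 1/0.5 = 2,
# so the mean stays close to 1.0
print(dropout(Z, 0.5).mean())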
# Define model parameters

num_inputs, num_outputs, num_hiddens1, num_hiddens2 = 784, 10, 256, 256

W1 = torch.tensor(np.random.normal(0, 0.01, size=(num_inputs, num_hiddens1)), dtype=torch.float, requires_grad=True)
b1 = torch.zeros(num_hiddens1, requires_grad=True)
W2 = torch.tensor(np.random.normal(0, 0.01, size=(num_hiddens1, num_hiddens2)), dtype=torch.float, requires_grad=True)
b2 = torch.zeros(num_hiddens2, requires_grad=True)
W3 = torch.tensor(np.random.normal(0, 0.01, size=(num_hiddens2, num_outputs)), dtype=torch.float, requires_grad=True)
b3 = torch.zeros(num_outputs, requires_grad=True)

params = [W1, b1, W2, b2, W3, b3]
# Define the model
drop_prob1, drop_prob2 = 0.2, 0.5
def net(X, is_training=True):
    X = X.view(-1, num_inputs)
    H1 = (torch.matmul(X, W1) + b1).relu()
    # only apply dropout when training the model
    if is_training:
        H1 = dropout(H1, drop_prob1)
    H2 = (torch.matmul(H1, W2) + b2).relu()
    if is_training:
        H2 = dropout(H2, drop_prob2)
    return torch.matmul(H2, W3) + b3
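The is_training flag only matters if the evaluation code actually passes is_training=False. A minimal sketch of such an accuracy evaluation (an assumption on my part: this mirrors how the book's evaluate_accuracy helper in d2lzh_pytorch treats a model function that accepts an is_training argument):

def evaluate_accuracy(data_iter, net):
    acc_sum, n = 0.0, 0
    with torch.no_grad():
        for X, y in data_iter:
            # disable dropout while evaluating
            y_hat = net(X, is_training=False)
            acc_sum += (y_hat.argmax(dim=1) == y).float().sum().item()
            n += y.shape[0]
    return acc_sum / n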
num_epochs, lr, batch_size = 5, 100.0, 256
loss = torch.nn.CrossEntropyLoss()
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              params=params, lr=lr, optimizer=None)
epoch 1, loss 0.0042, train acc 0.583, test acc 0.780
epoch 2, loss 0.0022, train acc 0.796, test acc 0.790
epoch 3, loss 0.0018, train acc 0.826, test acc 0.821
epoch 4, loss 0.0017, train acc 0.844, test acc 0.811
epoch 5, loss 0.0016, train acc 0.853, test acc 0.826
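The learning rate lr = 100.0 looks unusually large; in the book's from-scratch training loop the gradient step is divided by the batch size, so the effective step is about 100/256. A sketch of that update (assumption: this matches d2l.sgd, which train_ch3 falls back to when optimizer=None):

def sgd(params, lr, batch_size):
    for param in params:
        # effective step size is lr / batch_size (here 100.0 / 256)
        param.data -= lr * param.grad / batch_size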
# Concise implementation
net = nn.Sequential(
    d2l.FlattenLayer(),
    nn.Linear(num_inputs, num_hiddens1),
    nn.ReLU(),
    nn.Dropout(drop_prob1),
    nn.Linear(num_hiddens1, num_hiddens2),
    nn.ReLU(),
    nn.Dropout(drop_prob2), 
    nn.Linear(num_hiddens2, num_outputs)
)
for param in net.parameters():
    nn.init.normal_(param, mean=0, std=0.01)
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              None, None, optimizer=optimizer)
epoch 1, loss 0.0047, train acc 0.538, test acc 0.726
epoch 2, loss 0.0023, train acc 0.782, test acc 0.806
epoch 3, loss 0.0019, train acc 0.820, test acc 0.813
epoch 4, loss 0.0018, train acc 0.838, test acc 0.827
epoch 5, loss 0.0017, train acc 0.845, test acc 0.835
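Unlike the from-scratch version, nn.Dropout switches itself off according to the module's training mode (net.train() / net.eval()), so no is_training flag is needed. A small illustration (new code, not from the original):

drop_layer = nn.Dropout(drop_prob2)
x = torch.ones(8)
drop_layer.train()   # training mode: elements zeroed with probability 0.5, survivors scaled by 2
print(drop_layer(x))
drop_layer.eval()    # evaluation mode: dropout is a no-op, the output equals the input
print(drop_layer(x))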

Hints:

We can use dropout to cope with overfitting. Remember that dropout is only applied when training the model.
