How to set random seeds to make PyTorch code reproducible

_胡不归

Article link: https://blog.csdn.net/B_DATA_NUIST/article/details/105857043

Background
Example code
- Setting the PyTorch random seed
- Setting the PyTorch seed and cuDNN

Background

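Reproducibility of published results is a major problem in deep learning research. Never mind the code released with papers: even your own code can be hard to reproduce. With the same network architecture, the same dataset, and the same machine, two training runs can still give different results. This is largely caused by randomness in the training process, which mainly comes from:

Random initialization of the network parameters
Regularization methods, for example dropout, which randomly drops units of the network during training
The optimization process: training with SGD, RMSProp, or Adam also involves randomness, for example in the random initial parameters it starts from and in the order in which the data is shuffled into mini-batches

Tip: reproducibility in PyTorch is also affected by the PyTorch version and the operating system platform, so identical results are not guaranteed across different versions or platforms.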
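
The first two sources of randomness listed above can be observed directly. The following is a minimal sketch, not part of the original post; the helper name `init_and_dropout` and the layer sizes are made up for illustration. It creates a linear layer and a dropout mask twice without a seed and twice with the same seed, all on the CPU.

```python
import torch


def init_and_dropout(seed=None):
    """Return freshly initialized weights and a dropout mask (hypothetical helper)."""
    if seed is not None:
        torch.manual_seed(seed)
    layer = torch.nn.Linear(4, 3)        # weights are drawn from the global RNG
    dropout = torch.nn.Dropout(p=0.5)    # modules start in training mode
    mask = dropout(torch.ones(1, 8))     # dropped entries become 0, kept ones 2
    return layer.weight.detach().clone(), mask


# Without a seed, two calls draw different numbers from the generator.
w1, m1 = init_and_dropout()
w2, m2 = init_and_dropout()
print("unseeded weights equal:", torch.equal(w1, w2))

# With the same seed, both the initialization and the mask repeat.
w3, m3 = init_and_dropout(seed=3)
w4, m4 = init_and_dropout(seed=3)
print("seeded weights equal:", torch.equal(w3, w4))
print("seeded masks equal:", torch.equal(m3, m4))
```

Because `torch.manual_seed` resets the global generator, everything drawn after it (weights first, then the dropout mask) follows the same sequence on each seeded call.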

Example code

The following example shows how to set the PyTorch random seed.

# Train a model to fit a line y=mx using given data points

import torch

## Uncomment the two lines below to make the training reproducible.
#seed = 3
#torch.manual_seed(seed)

# set device to CUDA if available, else to CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Device:', device)

# N - number of data points
# n_inputs - number of input variables
# n_hidden - number of units in the hidden layer
# n_outputs - number of outputs
N, n_inputs, n_hidden, n_outputs = 8, 1, 100, 1

# Input 8 pairs of (x, y) input values
x = torch.tensor([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0]], device=device)
y = torch.tensor([[0.0], [10.0], [20.0], [30.0], [40.0], [50.0], [60.0], [70.0]], device=device)

# Make a 3 layer neural network with an input layer, hidden layer and output layer
model = torch.nn.Sequential(
    torch.nn.Linear(n_inputs, n_hidden),
    torch.nn.ReLU(),
    torch.nn.Linear(n_hidden, n_outputs)
)
# Move the model to the device
model.to(device)

# Define the loss function to be the mean squared error loss
loss_fn = torch.nn.MSELoss(reduction='sum')

# Do forward pass through the data points, compute loss, compute gradients using backward propagation and update the weights using the gradients.
learning_rate = 1e-4
for t in range(1000):
    y_out = model(x)
    loss = loss_fn(y_out, y)
    if t % 100 == 99:
        print(t, loss.item())
    #  print(y_out)

    # Gradients are made to zero prior to backward pass.
    model.zero_grad()
    loss.backward()

    # Update weights using gradient descent
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

Run the code above twice.
Output of the first run:

Device: cuda
99 13.865872383117676
199 5.772928714752197
299 3.566026210784912
399 2.5292069911956787
499 1.8655864000320435
599 1.3915504217147827
699 1.0447190999984741
799 0.7871285676956177
899 0.5957959890365601
999 0.45342087745666504

Output of the second run:

Device: cuda
99 6.1840715408325195
199 3.0933115482330322
299 1.9355353116989136
399 1.3561317920684814
499 0.998731791973114
599 0.7554249167442322
699 0.5831341743469238
799 0.45905551314353943
899 0.3688798248767853
999 0.30284053087234497

Setting the PyTorch random seed

Uncomment lines 6 and 7 of the script so that the seed is set:

seed = 3
torch.manual_seed(seed)

Run the script twice more.
Output of the first run:

Device: cuda
99 10.655608177185059
199 3.6195263862609863
299 1.653144359588623
399 0.9989959001541138
499 0.712784469127655
599 0.5509689450263977
699 0.44407185912132263
799 0.368024617433548
899 0.3116675019264221
999 0.2681158781051636

Output of the second run:

Device: cuda
99 10.655608177185059
199 3.6195263862609863
299 1.653144359588623
399 0.9989959001541138
499 0.712784469127655
599 0.5509689450263977
699 0.44407185912132263
799 0.368024617433548
899 0.3116675019264221
999 0.2681158781051636

The two runs now produce exactly the same results.
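
The same check can be done inside a single script instead of comparing logs by eye. The sketch below is an illustration rather than the original code: it wraps a shortened version of the training loop in a hypothetical function `train`, runs on the CPU, and compares the final loss of two runs that use the same seed.

```python
import torch


def train(seed, steps=200):
    """Fit y = 10x with a small MLP and return the final loss (hypothetical helper)."""
    torch.manual_seed(seed)                      # fixes the parameter initialization
    x = torch.arange(8.0).unsqueeze(1)           # inputs 0..7, shape (8, 1)
    y = 10.0 * x
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 100),
        torch.nn.ReLU(),
        torch.nn.Linear(100, 1),
    )
    loss_fn = torch.nn.MSELoss(reduction='sum')
    learning_rate = 1e-4
    for _ in range(steps):
        loss = loss_fn(model(x), y)
        model.zero_grad()
        loss.backward()
        # Plain gradient descent, as in the script above
        with torch.no_grad():
            for param in model.parameters():
                param -= learning_rate * param.grad
    return loss.item()


first = train(seed=3)
second = train(seed=3)
print("same seed, same loss:", first == second)
print("different seed, same loss:", first == train(seed=4))
```

Comparing with `==` is intentional here: on the same machine and PyTorch build, a seeded run is expected to match bit for bit, not just approximately.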

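Setting the PyTorch seed and cuDNN

The simple example above only sets the PyTorch random seed. Once convolutions are involved this is not enough, because GPU operations are then accelerated by cuDNN, which may select different (and sometimes nondeterministic) algorithms from run to run. Adding the following lines is all that is needed: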

seed = 3
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

Add the lines above to the PyTorch image classification code.
Output of the first run:

Device: cuda
[1, 2000] loss: 2.192
[1, 4000] loss: 1.824
[1, 6000] loss: 1.613
[1, 8000] loss: 1.532
[1, 10000] loss: 1.470
[1, 12000] loss: 1.429
[2, 2000] loss: 1.378
[2, 4000] loss: 1.317
[2, 6000] loss: 1.291
[2, 8000] loss: 1.298
[2, 10000] loss: 1.264
[2, 12000] loss: 1.255
Finished Training

Output of the second run:

Device: cuda
[1, 2000] loss: 2.192
[1, 4000] loss: 1.824
[1, 6000] loss: 1.613
[1, 8000] loss: 1.532
[1, 10000] loss: 1.470
[1, 12000] loss: 1.429
[2, 2000] loss: 1.378
[2, 4000] loss: 1.317
[2, 6000] loss: 1.291
[2, 8000] loss: 1.298
[2, 10000] loss: 1.264
[2, 12000] loss: 1.255
Finished Training
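
The image classification script itself is not shown in this post, so here is a small stand-in that applies the same four lines to a convolutional model. It is a sketch under assumptions: the network, the random "images", and the function name `conv_run` are invented for illustration and are not the classifier that produced the logs above.

```python
import torch


def conv_run(seed=3, steps=5):
    """Train a tiny conv net on random data and return the loss history (illustrative)."""
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True   # ask cuDNN for deterministic algorithms
    torch.backends.cudnn.benchmark = False      # disable cuDNN's algorithm auto-tuning

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
        torch.nn.ReLU(),
        torch.nn.Flatten(),
        torch.nn.Linear(8 * 16 * 16, 10),
    ).to(device)

    # Made-up batch of 4 RGB "images" of size 16x16 with random labels
    images = torch.randn(4, 3, 16, 16, device=device)
    labels = torch.randint(0, 10, (4,), device=device)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


run_a = conv_run()
run_b = conv_run()
print("identical loss history:", run_a == run_b)
```

On a machine without a GPU the two cuDNN flags have no effect and the script simply runs on the CPU. Recent PyTorch releases also provide `torch.use_deterministic_algorithms(True)`, which raises an error when an operation has no deterministic implementation; whether it is needed depends on the operations the model uses.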

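If the code also uses NumPy or Python's built-in random module, their seeds need to be set as well:

import random

import numpy as np
import torch

seed = 3
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False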
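
In practice these lines are often collected into one function that is called at the start of a script. The sketch below shows one possible way to do that; the names `seed_everything` and `batch_order` are made up for this example. It also passes a seeded `torch.Generator` to a `DataLoader`, on the assumption that the order of shuffled batches should repeat too.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def seed_everything(seed):
    """Seed Python, NumPy, and PyTorch in one call (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)            # silently ignored without a GPU
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


def batch_order(seed):
    """Return the order in which a shuffled DataLoader yields 10 samples."""
    seed_everything(seed)
    dataset = TensorDataset(torch.arange(10))
    generator = torch.Generator()
    generator.manual_seed(seed)                 # controls the shuffle order
    loader = DataLoader(dataset, batch_size=5, shuffle=True, generator=generator)
    return [batch[0].tolist() for batch in loader]


order_a = batch_order(3)
order_b = batch_order(3)
print(order_a)
print("same shuffle order:", order_a == order_b)
```

When the `DataLoader` uses several worker processes, each worker may need its own seeding through the `worker_init_fn` argument; that case is not covered by this sketch.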
