Random Seeds to Set for Reproducible PyTorch Experiments

lh_lyh

Last modified: 2023-11-30 09:59:06

First published: 2021-11-15 19:57:39

Article link: https://blog.csdn.net/qq_39735236/article/details/121341958

Reference: the Reproducibility page of the official PyTorch documentation.

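# torch
import torch
torch.manual_seed(0)
# python
import random
random.seed(0)
# numpy
import numpy as np
np.random.seed(0)
# cudnn
torch.backends.cudnn.benchmark = False  # Statement 1: disable cuDNN convolution auto-tuning (benchmark mode)
torch.backends.cudnn.deterministic = True  # Statement 2: use deterministic cuDNN convolution algorithms; in newer PyTorch (1.9 and later) statement 3 covers this
torch.use_deterministic_algorithms(True)  # Statement 3: available in newer PyTorch (1.9 and later); broader in scope, recommended over statement 2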
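The seeding calls above can be wrapped in one helper so that every entry point of an experiment seeds the same way. The following is a minimal sketch, not part of the official documentation: the name `seed_everything` and its `deterministic` flag are assumptions chosen for illustration.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 0, deterministic: bool = True) -> None:
    """Seed the Python, NumPy and PyTorch RNGs (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds the generators on all devices
    torch.backends.cudnn.benchmark = False  # no cuDNN auto-tuning
    if deterministic:
        # Ask PyTorch to use deterministic algorithms where available
        torch.use_deterministic_algorithms(True)


# Draw one value from each RNG, reseed, and draw again
seed_everything(0)
first = (random.random(), np.random.rand(), torch.rand(3))
seed_everything(0)
second = (random.random(), np.random.rand(), torch.rand(3))

print(first[0] == second[0])
print(first[1] == second[1])
print(torch.equal(first[2], second[2]))
```

Calling the helper once at the top of the training script, before any model or data loader is created, keeps the seeding in one place.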

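Newer versions provide the torch.use_deterministic_algorithms() function to control whether nondeterministic algorithms may be used.
When the argument is True, torch.use_deterministic_algorithms(True) replaces nondeterministic algorithms in torch with deterministic ones where they exist, and raises an error when an operation has no deterministic alternative.
How this differs from torch.backends.cudnn.deterministic = True:
torch.backends.cudnn.deterministic = True controls only the determinism of CUDA convolutions
torch.use_deterministic_algorithms(True) checks the determinism of all torch operations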

torch.use_deterministic_algorithms() lets you configure PyTorch to use deterministic algorithms instead of nondeterministic ones where available, and to throw an error if an operation is known to be nondeterministic (and without a deterministic alternative).

While disabling CUDA convolution benchmarking (discussed above) ensures that CUDA selects the same algorithm each time an application is run, that algorithm itself may be nondeterministic, unless either torch.use_deterministic_algorithms(True) or torch.backends.cudnn.deterministic = True is set. The latter setting controls only this behavior, unlike torch.use_deterministic_algorithms() which will make other PyTorch operations behave deterministically, too.
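The global switch can also be queried and relaxed at runtime. The sketch below assumes a recent PyTorch release: `torch.are_deterministic_algorithms_enabled()` reports the current state, and the `warn_only` argument (an assumption about the installed version, since older releases do not accept it) downgrades the error to a warning for operations without a deterministic implementation.

```python
import torch

# Turn the global deterministic mode on and read the flag back
torch.use_deterministic_algorithms(True)
enabled = torch.are_deterministic_algorithms_enabled()
print(enabled)

# Assumption: the installed PyTorch accepts warn_only.
# Nondeterministic ops then emit a warning instead of raising an error.
torch.use_deterministic_algorithms(True, warn_only=True)

# Turn the mode off again, e.g. around a block that cannot be made deterministic
torch.use_deterministic_algorithms(False)
disabled = torch.are_deterministic_algorithms_enabled()
print(disabled)
```

The warning mode is useful for finding out which operations in a model are nondeterministic before deciding whether to enforce determinism strictly.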

TIP: If you use CUDA tensors and your CUDA version is 10.2 or later, you also need to set the CUBLAS_WORKSPACE_CONFIG environment variable to ensure reproducibility.

Furthermore, if you are using CUDA tensors, and your CUDA version is 10.2 or greater, you should set the environment variable CUBLAS_WORKSPACE_CONFIG according to CUDA documentation: https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility
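One way to set the variable is from Python itself, as sketched below. The value `:4096:8` is one of the two settings the PyTorch documentation mentions (the other is `:16:8`); setting it inside the script rather than in the shell is an assumption of this example, and it only takes effect if it runs before the first CUDA operation.

```python
import os

# Must be set before cuBLAS is initialized, so do it at the very top
# of the script, before any CUDA work is done.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

print(os.environ["CUBLAS_WORKSPACE_CONFIG"])
```

The same can be done outside Python by exporting the variable in the shell that launches the training script.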

Determinism of other operations:
CUDA RNN and LSTM

In some versions of CUDA, RNNs and LSTM networks may have non-deterministic behavior. See torch.nn.RNN() and torch.nn.LSTM() for details and workarounds.

DataLoader

DataLoader will reseed workers following Randomness in multi-process data loading algorithm. Use worker_init_fn() and generator to preserve reproducibility:

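import random

import numpy as np
import torch
from torch.utils.data import DataLoader

def seed_worker(worker_id):
    # Derive the NumPy and Python seeds from the seed PyTorch gave this worker
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

# Dedicated generator so the loader's randomness does not depend on the global RNG
g = torch.Generator()
g.manual_seed(0)

# train_dataset, batch_size and num_workers are defined elsewhere by the user
DataLoader(
    train_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    worker_init_fn=seed_worker,
    generator=g,
)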
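To see the effect of the `generator` argument, the sketch below builds the same shuffled loader twice and compares the batch order. The toy `TensorDataset`, the helper name `shuffled_order`, and the use of the default `num_workers=0` (so the example runs without worker processes) are all assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def shuffled_order(seed):
    """Return the batches of one epoch for a loader seeded with `seed`."""
    g = torch.Generator()
    g.manual_seed(seed)
    dataset = TensorDataset(torch.arange(10))  # toy dataset: 0..9
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=g)
    return [batch[0].tolist() for batch in loader]


run_a = shuffled_order(0)
run_b = shuffled_order(0)

# Both runs start from an identically seeded generator,
# so the shuffle order is the same.
print(run_a == run_b)
```

With `num_workers` greater than zero, the `worker_init_fn` shown above additionally keeps NumPy and Python random calls inside each worker reproducible.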