**Why are the results still different even though the random seed is set? A few small details that may help**
1. Run on the same machine and environment, and make sure CUDA >= 10.2.
2. Make sure the torch and numpy versions are reasonably up to date (a quick check is sketched below).
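A minimal way to confirm both points is to print the versions the current environment actually uses (just a sketch; which releases count as "new enough" is up to you):

```python
import numpy as np
import torch

print(torch.__version__)               # PyTorch version
print(torch.version.cuda)              # CUDA version PyTorch was built against; should be >= 10.2
print(torch.backends.cudnn.version())  # cuDNN version
print(np.__version__)                  # NumPy version
```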
3. After all the imports, set the random seed by calling `set_seed(42)` (any fixed integer works):
```python
import os
import random

import numpy as np
import torch


def set_seed(seed):
    random.seed(seed)                          # seed Python's built-in random module
    np.random.seed(seed)                       # seed NumPy
    torch.manual_seed(seed)                    # seed the CPU RNG
    torch.cuda.manual_seed(seed)               # seed the current GPU
    os.environ['PYTHONHASHSEED'] = str(seed)   # hash seed (read by subprocesses / at interpreter startup)
    torch.cuda.manual_seed_all(seed)           # seed all GPUs
    torch.backends.cudnn.benchmark = False     # disable cuDNN autotuning
    torch.backends.cudnn.deterministic = True  # force deterministic cuDNN algorithms
    torch.backends.cudnn.enabled = True
    os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':16:8'  # needed for deterministic CuBLAS on CUDA >= 10.2
    torch.use_deterministic_algorithms(True)   # error out on non-deterministic ops
```
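A usage sketch (the surrounding script structure is my own illustration, not from the original): call it once, right after the imports and before the model, optimizer, or DataLoader are created, so nothing consumes randomness before the seeds are fixed.

```python
set_seed(42)

model = build_model().cuda()        # hypothetical helpers, shown only to indicate
train_loader = build_dataloader()   # where the set_seed(42) call should sit
```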
- `torch.use_deterministic_algorithms(True)` makes PyTorch raise an error whenever an operation without a deterministic implementation is hit, so it points you to exactly where reproducibility breaks.
  If this raises
  RuntimeError: Deterministic behavior was enabled with either torch.set_deterministic(True) or at::Context::setDeterministic(true), but this operation is not deterministic because it uses CuBLAS and you have CUDA >= 10.2. To enable deterministic behavior in this case, you must set an environment variable before running your PyTorch application: CUBLAS_WORKSPACE_CONFIG=:4096:8 or CUBLAS_WORKSPACE_CONFIG=:16:8. For more information, go to https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility
  the fix is exactly what the message asks for: add os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':16:8' (see the sketch below).
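The error message asks for the variable to be set before the PyTorch application runs, so the safest place is the shell or the very top of the entry script, before importing torch (a minimal sketch):

```python
# Either export it in the shell:  export CUBLAS_WORKSPACE_CONFIG=:16:8
# or set it at the very top of the script, before importing torch:
import os

os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':16:8'  # ':4096:8' is the other accepted value

import torch
```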
- If os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':16:8' is already set and you still get
  RuntimeError: Deterministic behavior was enabled with either torch.use_deterministic_algorithms(True) or at::Context::setDeterministicAlgorithms(true), ...
  see the reference link: link
- Another error you may run into:
max_pool3d_with_indices_backward_cuda does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True)'. You can turn off determinism just for this operation if that's acceptable for your application. You can also file an issue at https://github.com/pytorch/pytorch/issues to help us prioritize adding deterministic support for this operation.
  I went through two PyTorch issues that are quite close to this problem, but for some operations the randomness still cannot be removed at the moment:
  Reference link: link
  Reference link: link
  There are two workarounds here:
  (1) Turn the determinism check off just around the offending operation and switch it back on afterwards (a reusable helper for this pattern is sketched right after):

```python
torch.use_deterministic_algorithms(False)
loss = lossCE()  # the operation that triggered the error
torch.use_deterministic_algorithms(True)
```
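If this toggle is needed in several places, a small helper (the name allow_nondeterministic is my own, not from the original post) can wrap the same pattern in a context manager so the flag is restored even when the wrapped operation raises:

```python
from contextlib import contextmanager

import torch


@contextmanager
def allow_nondeterministic():
    # Temporarily relax the determinism check, then always restore it.
    torch.use_deterministic_algorithms(False)
    try:
        yield
    finally:
        torch.use_deterministic_algorithms(True)


# Usage:
# with allow_nondeterministic():
#     loss = lossCE()  # the op that has no deterministic implementation
```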
  (2) In my case the first line of the error already says max_pool3d_with_indices_backward_cuda does not have a deterministic implementation, i.e. the problem comes from max pooling. Going back through the code, the maxpool3d call is the one that triggers it; I removed that call so the model no longer max-pools there, which also works around the issue. I have not found other fixes yet, but more of these cases will likely be resolved inside PyTorch over time; see the second link above. One possible deterministic substitute for the pooling layer is sketched below.
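If dropping the pooling layer is not acceptable, one possible substitute (my own suggestion, not the author's actual change) is to downsample with a stride-2 convolution instead, which stays deterministic once torch.backends.cudnn.deterministic = True:

```python
import torch.nn as nn

# Before: the CUDA backward of 3-D max pooling has no deterministic kernel.
downsample = nn.MaxPool3d(kernel_size=2, stride=2)

# One deterministic alternative: a strided convolution does the downsampling
# (the channel count 32 is only an example and must match your model).
downsample = nn.Conv3d(in_channels=32, out_channels=32, kernel_size=2, stride=2)
```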
4. When the DataLoader uses worker processes (num_workers > 0), also set worker_init_fn:
```python
# worker_init_fn must be a callable that receives the worker id;
# here each worker gets a deterministic seed derived from the base seed.
train_loader = DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True,
                          num_workers=6,
                          worker_init_fn=lambda worker_id: np.random.seed(seed + worker_id),
                          pin_memory=True)
```
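For completeness, the pattern from the PyTorch reproducibility notes can be sketched as below (seed_worker and the value 42 are placeholders of mine): each worker re-seeds NumPy and random from the loader's base seed, and a seeded torch.Generator makes the shuffle order repeatable as well.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader


def seed_worker(worker_id):
    # Derive each worker's seed from the DataLoader's base seed so that
    # NumPy/random used inside the workers are reproducible too.
    worker_seed = torch.initial_seed() % 2 ** 32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


g = torch.Generator()
g.manual_seed(42)  # fixes the shuffling order across runs

train_loader = DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True,
                          num_workers=6, worker_init_fn=seed_worker,
                          generator=g, pin_memory=True)
```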