When training with PyTorch DDP, you may sometimes hit the error: "This error indicates that your module has parameters that were not used".
Solution:
First run one iteration on a single GPU, and insert the following code between loss.backward() and optimizer.step():
for name, param in model.named_parameters():
    if param.grad is None:
        print(name)
Parameters that never participate in the forward pass receive no gradient, so their .grad is still None after backward(); this loop therefore prints every unused parameter. You can then remove (or comment out) those parameters in the model definition to eliminate the error. The complete code is as follows:
model = Model().cuda()  # Model is your nn.Module subclass
x = torch.rand((...)).cuda()  # input of the appropriate shape
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

y_pred = model(x)
y = torch.rand_like(y_pred)  # dummy target; any target of matching shape works
loss = criterion(y_pred, y)

# Zero gradients, perform a backward pass, and update the weights.
optimizer.zero_grad()
loss.backward()

# Any parameter whose grad is still None was not used in the forward pass.
for name, param in model.named_parameters():
    if param.grad is None:
        print(name)

optimizer.step()
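To make the diagnostic concrete, here is a minimal self-contained sketch (run on CPU for simplicity; ToyModel and its layer names are hypothetical). It defines a module with a layer that is deliberately never called in forward(), which is exactly the situation that triggers the DDP error, and shows the loop printing that layer's parameters:

```python
import torch

class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.used = torch.nn.Linear(4, 4)
        self.unused = torch.nn.Linear(4, 4)  # never called in forward()

    def forward(self, x):
        return self.used(x)

model = ToyModel()
x = torch.rand(2, 4)
y_pred = model(x)
loss = torch.nn.functional.mse_loss(y_pred, torch.zeros_like(y_pred))
loss.backward()

# Parameters of the uncalled layer still have grad == None after backward().
unused = [name for name, p in model.named_parameters() if p.grad is None]
print(unused)  # ['unused.weight', 'unused.bias']
```

Note that DDP also accepts find_unused_parameters=True in the torch.nn.parallel.DistributedDataParallel constructor, which tolerates unused parameters at some performance cost; removing them from the model, as described above, avoids that overhead.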