Author: 带霸气的骑士

Article link: https://blog.csdn.net/cough777/article/details/114989916

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Problem analysis

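This error occurs because a tensor that the computation graph needs for backpropagation was modified after the forward pass saved it and before the backward pass used it.
It is like stacking bricks and cement in one spot ahead of time to build a house: when I come back to use them, the amounts are no longer what I left there, so I panic and raise an error.
See the figure below for an example.
See reference 1 for details.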
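To make the analogy concrete, here is a minimal sketch that should reproduce the error. The function name and the tensors w, h and out are made up for illustration; they are not from the original post.

```python
import torch

def run_backward_after_inplace_edit():
    w = torch.ones(3, requires_grad=True)
    h = w * 2      # intermediate result ("the bricks")
    out = h * h    # autograd saves h, because d(h*h)/dh depends on h
    h += 1         # in-place edit: the saved h no longer matches what was stored
    try:
        out.sum().backward()
    except RuntimeError as err:
        return str(err)
    return None

message = run_backward_after_inplace_edit()
print(message)
```

Replacing h += 1 with a new name, for example h2 = h + 1, leaves the saved h untouched and the backward pass goes through.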

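Solutions

Solution 1

Reference 2 says that changing the loss formula can fix the problem.
(PS: I did not fully understand this one. It appears to be related to GAN training; corrections from experts are welcome.)
There, the G_loss was modified.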
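I cannot confirm exactly what reference 2 changed. As an assumption, one commonly reported GAN pattern that raises this error on recent PyTorch versions is: step the discriminator's optimizer (which updates its weights in place), then backpropagate a generator loss through a discriminator output computed before that step. The sketch below uses toy nn.Linear models and a hypothetical helper name; it is only an illustration of that pattern, not the fix from reference 2.

```python
import torch
import torch.nn as nn

def generator_backward_succeeds(recompute_d_out):
    torch.manual_seed(0)
    G = nn.Linear(2, 2)   # toy "generator"
    D = nn.Linear(2, 1)   # toy "discriminator"
    opt_d = torch.optim.SGD(D.parameters(), lr=0.1)

    fake = G(torch.randn(4, 2))
    d_out = D(fake)                     # this graph saves D's weight
    d_loss = d_out.mean()
    opt_d.zero_grad()
    d_loss.backward(retain_graph=True)
    opt_d.step()                        # modifies D's weight in place

    if recompute_d_out:
        d_out = D(fake)                 # fresh forward pass with the updated weight
    g_loss = -d_out.mean()
    try:
        g_loss.backward()
    except RuntimeError:
        return False
    return True

print(generator_backward_succeeds(False), generator_backward_succeeds(True))
```

Computing the generator loss from a fresh forward pass through the discriminator is one way to change the loss computation so that no stale weights are needed.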

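Solution 2

1) Downgrade torch to 0.3.0. (I tried downgrading from pytorch 1.8.0 to 1.4.0 and it did not work, so downgrading is not guaranteed to help; of course, that may also be because 1.4.0 is still a fairly recent version.)
2) Wherever inplace is set to True, change it to False, for example in Dropout and ReLU.
3) Remove all in-place operations.
4) Replace operators such as "-=" and "+=" with their out-of-place forms, assigning the result to a new name b instead of back to a:
a -= c ==> b = a - c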
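Item 2 can be checked with a small experiment. This is a sketch under my own assumptions: the two-layer network and the helper name backward_succeeds are invented for illustration. Sigmoid keeps its output for the backward pass, so an in-place ReLU applied directly to that output overwrites a tensor autograd still needs.

```python
import torch
import torch.nn as nn

def backward_succeeds(inplace):
    # Sigmoid saves its output; ReLU(inplace=True) then writes into that same tensor.
    net = nn.Sequential(nn.Sigmoid(), nn.ReLU(inplace=inplace))
    x = torch.randn(2, 4, requires_grad=True)
    try:
        net(x).sum().backward()
    except RuntimeError:
        return False
    return True

with_inplace = backward_succeeds(True)
without_inplace = backward_succeeds(False)
print(with_inplace, without_inplace)
```

On a recent PyTorch this is expected to print False True: only the inplace=True variant fails.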

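Which operations are in-place and which are not

Any operation that modifies a tensor and whose name ends with an underscore ("_") is an in-place operation. For example, x.squeeze_() and x.unsqueeze_() modify x, while x.squeeze() and x.unsqueeze() do not.

In PyTorch, activation functions such as torch.relu() and torch.sigmoid() are not in-place operations; ReLU can be made in-place by setting inplace=True (for example nn.ReLU(inplace=True)).

x += res is an in-place operation, while x = x + res is not (see Adam Paszke's answer on the official forum for details).

See references 3 and 4 for details.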
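The difference can be observed directly. The sketch below relies on the tensor attribute _version, the internal counter that the error message refers to ("is at version 2; expected version 0"); since it is an internal attribute, treat this as a debugging aid rather than a stable API. All variable names are illustrative.

```python
import torch

x = torch.zeros(3)
res = torch.ones(3)
original = x
version_before = original._version

x += res                                 # in-place: same tensor object, counter bumped
iadd_kept_object = x is original
iadd_version_bump = original._version - version_before

x = x + res                              # out-of-place: x now names a new tensor
add_kept_object = x is original
add_version_bump = original._version - version_before   # unchanged by x + res

y = torch.zeros(3)
y.unsqueeze(0)                           # returns a new view, y itself keeps its shape
shape_plain = tuple(y.shape)
y.unsqueeze_(0)                          # trailing underscore: y itself is reshaped
shape_underscore = tuple(y.shape)

print(iadd_kept_object, iadd_version_bump, add_kept_object, add_version_bump)
print(shape_plain, shape_underscore)
```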

Solution 3

Remove whatever is causing the problem.
This is the approach that solved the problem for me.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [12, 1024, 13, 13]], which is output 0 of SigmoidBackward, is at version 2; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

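The error is shown above. It points at SigmoidBackward, so I commented out the sigmoid activation at each place where it might be involved, one at a time, until I found the source of the problem.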
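If deleting the sigmoid is not an option, an alternative is to keep it and stop modifying its output in place. The sketch below is an assumption about what such code might look like (the two forward functions are invented, not my actual model): two in-place edits after the sigmoid put its output at version 2, as in the message above, and rewriting them out-of-place avoids the error.

```python
import torch

def forward_inplace(x):
    y = torch.sigmoid(x)
    y *= 2          # first in-place edit of the sigmoid output
    y += 1          # second in-place edit
    return y

def forward_out_of_place(x):
    y = torch.sigmoid(x)
    y = y * 2       # new tensor, the sigmoid output stays intact
    y = y + 1
    return y

def try_backward(forward):
    x = torch.zeros(4, requires_grad=True)
    try:
        forward(x).sum().backward()
    except RuntimeError as err:
        return None, str(err)
    return x.grad, None

grad_bad, err_bad = try_backward(forward_inplace)
grad_ok, err_ok = try_backward(forward_out_of_place)
print(err_bad)
print(grad_ok)
```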

Other notes

When you look at where the error occurred, the message includes this hint:

 Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

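This is quite handy for locating the problem, although I have not studied carefully where it should be placed in a complex computation graph. See the example below for how to use it.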

import torch

with torch.autograd.set_detect_anomaly(True):  # this is the key line
    a = torch.rand(1, requires_grad=True)
    c = torch.rand(1, requires_grad=True)
    
    b = a ** 2 * c ** 2
    b += 1
    b *= c + a

    d = b.exp_()
    d *= 5

    b.backward()

The output is as follows:

sys:1: RuntimeWarning: Traceback of forward call that caused the error:
  File "tst.py", line 13, in <module>
    d = b.exp_()

Traceback (most recent call last):
  File "tst.py", line 16, in <module>
    b.backward()
  File "/Users/fmassa/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/Users/fmassa/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

From this we can see that d = b.exp_() is the culprit; changing it to d = b.exp() fixes the problem.

See reference 5 for details.
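For completeness, here is the same example with only that one line changed. The seed and the final check against a hand-derived gradient are my additions for reproducibility; they are not part of the original example.

```python
import torch

torch.manual_seed(0)  # added only to make the run reproducible

with torch.autograd.set_detect_anomaly(True):
    a = torch.rand(1, requires_grad=True)
    c = torch.rand(1, requires_grad=True)

    b = a ** 2 * c ** 2
    b += 1
    b *= c + a

    d = b.exp()   # out-of-place: b is left untouched
    d *= 5        # modifies d only

    b.backward()  # now succeeds

# b = (a^2 c^2 + 1)(a + c), so db/da = 2 a c^2 (a + c) + a^2 c^2 + 1
expected_a_grad = (2 * a * c ** 2 * (a + c) + a ** 2 * c ** 2 + 1).detach()
print(a.grad, expected_a_grad)
```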

References

Reference 1
Reference 2
Reference 3
Reference 4
Reference 5
