Debugging Diary 1: RuntimeError: one of the variables needed for gradient computation has been modified

笼子里的薛定谔

Last modified: 2022-10-20 19:15:16


Category: Image Inpainting. Tags: deep learning, pytorch, image processing

First published: 2022-10-20 18:55:57

Article link: https://blog.csdn.net/liuzi_hang/article/details/127432515


Problem Description

While reproducing DeepFill V1 in PyTorch, I ran into the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 256, 5, 5]] is at version 6; expected version 5 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

The traceback points to the line that runs the generator's backward pass.

Environment

python 3.6.13
pytorch 1.10.2
cuda 11.3
cudnn 8.0

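Debugging Process

The usual advice found online is:

Find the in-place operations in the network model and change inplace=True to inplace=False.

However, my model contained no in-place operations or += operations at all, so I had no choice but to track the problem down myself.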
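Checking a model for in-place modules can be automated. The following is a minimal sketch, assuming the in-place behaviour comes from modules that expose an `inplace` attribute (such as `nn.ReLU`). The helper name `find_inplace_modules` and the toy network are hypothetical and not part of the DeepFill code. It will not catch functional calls like `F.relu(x, inplace=True)` or a `+=` written inside `forward`.

```python
import torch.nn as nn


def find_inplace_modules(model: nn.Module):
    """Return the names of submodules whose `inplace` attribute is True."""
    return [
        name
        for name, module in model.named_modules()
        if getattr(module, "inplace", False)
    ]


# Hypothetical toy network, used only to show the helper in action.
toy = nn.Sequential(
    nn.Conv2d(3, 8, 3),
    nn.ReLU(inplace=True),   # this one should be reported
    nn.Conv2d(8, 8, 3),
    nn.ELU(),                # inplace defaults to False
)
found = find_inplace_modules(toy)
print(found)
```

An empty list from a check like this is what sends you looking elsewhere, as happened here.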

First, following the hint in the error message, I added the following line above the generator's backward pass:

torch.autograd.set_detect_anomaly(True)

The program still failed, now with this message:

RuntimeError: one of the variables needed for gradient computation has
been modified by an inplace operation: [torch.cuda.FloatTensor [512,
256, 5, 5]] is at version 6; expected version 5 instead. Hint: the
backtrace further above shows the operation that failed to compute its
gradient. The variable in question was changed in there or anywhere
later. Good luck!
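For readers who want to see this class of error in isolation, here is a small sketch that has nothing to do with DeepFill. It assumes a toy tensor `x` and uses the context-manager form `torch.autograd.detect_anomaly()`, wrapped around the forward pass as well so that autograd can record where the failing operation was created.

```python
import torch

x = torch.ones(3, requires_grad=True)

with torch.autograd.detect_anomaly():
    y = x.exp()      # exp keeps its output around for the backward pass
    y.add_(1)        # in-place edit of that saved output bumps its version
    try:
        y.sum().backward()
        message = ""
    except RuntimeError as err:
        message = str(err)

print(message)
```

Anomaly detection reports which forward operation failed to compute its gradient, but as the message itself says, the tensor may have been changed "in there or anywhere later", so the offending in-place write still has to be found by hand.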

After several fruitless debugging attempts, I went to look at how g_loss is defined:

self.loss['g_loss'] = self.gan_loss_alpha * self.loss['g_loss']
self.loss['g_loss'] = self.loss['g_loss'] + self.l1_loss_alpha * self.loss['recon'] + self.ae_loss_alpha * self.loss['ae_loss']

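From the code above, on top of the weighted GAN term, g_loss adds two components: the reconstruction loss (recon) and the autoencoder loss (ae_loss).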

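Removing the reconstruction loss still produced the same error, but removing the autoencoder loss let the program run, which suggested that the problem was tied to the autoencoder loss.

(Screenshot: definition of the autoencoder loss)

Removing that loss is clearly not acceptable, though, and during backpropagation the value of ae_loss is never modified, only computed. So I went back to the training code. There, the network parameters were updated before the generator's backward pass. I moved that update below the generator's backward pass.

(Screenshot: the reordered training code)

The problem that had troubled me for so long was solved, and the program ran through.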
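The term-by-term elimination described above can also be done without editing the loss formula, by backpropagating each term separately. This is only a sketch under assumptions: `failing_terms` is a hypothetical helper, and the tensors `g_w` and `d_w` are toy stand-ins, not the real DeepFill losses. The in-place `sub_` under `torch.no_grad()` imitates what an optimizer step does to a weight.

```python
import torch


def failing_terms(terms):
    """Backpropagate each loss term on its own; return the names that raise."""
    failed = []
    for name, term in terms.items():
        try:
            term.backward(retain_graph=True)  # keep the graph for later terms
        except RuntimeError:
            failed.append(name)
    return failed


g_w = torch.ones(3, requires_grad=True)  # stands in for generator weights
d_w = torch.ones(3, requires_grad=True)  # stands in for weights updated too early

recon = (g_w * 2.0).sum()                # does not depend on d_w
ae_loss = (g_w * d_w).sum()              # multiplication saves d_w for backward

with torch.no_grad():
    d_w.sub_(0.1)                        # in-place update, like optimizer.step()

result = failing_terms({"recon": recon, "ae_loss": ae_loss})
print(result)
```

Gradients accumulated during such a probe are not meant to be used for a real update, so it is best run as a one-off diagnostic.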

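Solution

Cause: The discriminator's parameters were updated before the generator's backward pass had computed its gradients. The optimizer step modifies the weights in place, so the values the generator's backward pass still needed had changed and the gradient computation failed.

Fix: Move the discriminator optimizer's parameter update (the step call) to after the generator's backward pass.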
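The cause and the fix can be reproduced on a toy scale. The sketch below is not the DeepFill training loop: `G`, `D`, the SGD optimizers, the loss formulas, and the `train_step` function with its `step_d_first` flag are all hypothetical stand-ins chosen to keep the example short. It also passes `inputs=` to `backward` so that the generator's backward pass does not add gradients to the discriminator's parameters before the delayed discriminator step, which is an extra precaution the original post does not discuss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Linear(4, 4)   # toy generator (hypothetical stand-in)
D = nn.Linear(4, 1)   # toy discriminator (hypothetical stand-in)
opt_g = torch.optim.SGD(G.parameters(), lr=0.1)
opt_d = torch.optim.SGD(D.parameters(), lr=0.1)
z, real = torch.randn(8, 4), torch.randn(8, 4)


def train_step(step_d_first: bool) -> None:
    fake = G(z)
    d_loss = D(fake.detach()).mean() - D(real).mean()
    g_loss = -D(fake).mean()      # this graph runs through D's weights

    opt_d.zero_grad()
    d_loss.backward()
    if step_d_first:
        opt_d.step()              # too early: rewrites D's weights in place

    opt_g.zero_grad()
    # Needs D's weights exactly as they were during the forward pass.
    g_loss.backward(inputs=list(G.parameters()))
    opt_g.step()

    if not step_d_first:
        opt_d.step()              # safe: no pending backward needs the old weights


try:
    train_step(step_d_first=True)
except RuntimeError as err:
    print("discriminator step first:", str(err)[:70], "...")

train_step(step_d_first=False)
print("discriminator step after generator backward: ok")
```

Another common way to avoid the error is to run the discriminator through a fresh forward pass after its update and build g_loss from that output. The reordering shown here matches the fix described in this post.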
