Debugging feature for "modified by an inplace operation" errors

Feature

Better error logging for inplace operations that throw errors in automatic differentiation.

Motivation

In complex computational graphs, locating the operation causing the error is nontrivial. A feature for isolating the offending operation would save a lot of developer time.

Pitch

See here for a more concrete suggestion. To quote:

I wonder whether it might be worth adding a "debug" mode that records the stack of the op in the forward pass and spits it out on error in the backward. That way, it would point to the right line of code directly.
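
As a rough manual approximation of this idea (a hedged sketch, not an existing PyTorch feature and not the mechanism proposed above), one can record the Python stack right after the forward op and print it when backward fails; the toy tensors and the use of traceback.format_stack are illustration choices:

import traceback
import torch

a = torch.rand(1, requires_grad=True)
b = a.exp()                               # ExpBackward saves its output (b)
forward_stack = traceback.format_stack()  # roughly where b was produced
b.mul_(5)                                 # in-place op invalidates the saved output

try:
    b.backward()
except RuntimeError:
    print("backward failed; forward-pass stack of the op that produced b:")
    print("".join(forward_stack))
    raise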

Alternatives

Just being able to log all inplace operations would be useful.
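
One way to approximate such logging in user code (a hedged sketch, assuming a PyTorch version that provides torch.overrides.TorchFunctionMode; the trailing-underscore check is only a naming-convention heuristic) is to intercept torch calls with a function mode:

import torch
from torch.overrides import TorchFunctionMode

class LogInplaceOps(TorchFunctionMode):
    """Print every torch call whose name follows the trailing-underscore
    in-place convention (add_, exp_, ...). Augmented assignments such as
    b += 1 dispatch as dunders (__iadd__) and are not caught by this check."""
    def __torch_function__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        name = getattr(func, "__name__", str(func))
        if name.endswith("_") and not name.endswith("__"):
            print(f"in-place op: {name}")
        return func(*args, **kwargs)

with LogInplaceOps():
    a = torch.rand(3, requires_grad=True)
    b = a * 2
    b.add_(1)  # logged
    b.exp_()   # logged

This only records which in-place calls ran; it does not by itself tell you which one broke a given backward pass.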

Question: can't you use the anomaly_detection for that?

Here is an example where it shows the exact line on which the error occurs:

import torch

with torch.autograd.set_detect_anomaly(True):
    a = torch.rand(1, requires_grad=True)
    c = torch.rand(1, requires_grad=True)

    b = a ** 2 * c ** 2
    b += 1
    b *= c + a

    d = b.exp_()
    d *= 5

    b.backward()

And the stack trace:

sys:1: RuntimeWarning: Traceback of forward call that caused the error:
  File "tst.py", line 13, in <module>
    d = b.exp_()

Traceback (most recent call last):
  File "tst.py", line 16, in <module>
    b.backward()
  File "/Users/fmassa/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/Users/fmassa/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
This points directly to b.exp_(), and indeed, if you change that line to b.exp(), it all works fine.

Closing as this can be obtained with anomaly_detection.

Great, thanks. This might be a good thing to add to the section on in-place operations in the autograd mechanics tutorial. I'm happy to add a sentence or two, but the contributing doc doesn't mention how to modify tutorials or notes. 

To clarify for other readers, the anomaly detection will not necessarily point you at the inplace operation that caused the failure. Instead, it will point you at the operation that could not compute its gradient in the backward pass. The inplace operation to blame may occur anywhere after that, modifying one of the tensors that participated in the line found by the anomaly detection.

Example:

import torch
import numpy as np

x = torch.rand(10, 20, requires_grad=True)
y = torch.rand(10)
z = (x / y[:, np.newaxis])  # anomaly detection will point here
c = y.abs_()                # but the problem is here
z.sum().backward()


The last line will raise a RuntimeError. With anomaly detection enabled, the traceback will point at the line performing the division, but the offending in-place operation happens later.
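
For completeness, a minimal sketch of how the example above can be fixed (an assumption in the spirit of the earlier b.exp_() to b.exp() suggestion, not code from the thread): replace the in-place abs with an out-of-place one, or apply it to a copy.

import torch
import numpy as np

x = torch.rand(10, 20, requires_grad=True)
y = torch.rand(10)

z = x / y[:, np.newaxis]  # same division as above
c = y.abs()               # out-of-place: the y saved for backward stays intact
# or, if the in-place update is really needed, apply it to a copy:
# c = y.clone().abs_()
z.sum().backward()        # now succeeds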
