You have already answered your own question: the underscore indicates an in-place operation in PyTorch. However, I want to briefly point out why in-place operations can be problematic:
First of all, in most cases the PyTorch documentation recommends not using in-place operations. Unless you are working under heavy memory pressure, it is in most cases more efficient not to use them.
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd
Secondly, there can be problems computing gradients when using in-place operations:
> Every tensor keeps a version counter, that is incremented every time it is marked dirty in any operation. When a Function saves any tensors for backward, a version counter of their containing Tensor is saved as well. Once you access self.saved_tensors it is checked, and if it is greater than the saved value an error is raised. This ensures that if you’re using in-place functions and not seeing any errors, you can be sure that the computed gradients are correct.
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd
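To illustrate that version counter check, here is a small sketch of my own (not from the answer you posted): torch.exp saves its output for the backward pass, so modifying that output in place invalidates it and the error only shows up when backward() is called:
import torch
a = torch.tensor([2., 4., 6.], requires_grad=True)
b = torch.exp(a)   # exp saves its output for the backward pass
b.add_(1)          # allowed, since b is not a leaf, but it bumps b's version counter
c = torch.sum(b)
c.backward()       # RuntimeError: one of the variables needed for gradient
                   # computation has been modified by an inplace operation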
Here is an example taken from the answer you posted, slightly modified:
First, the in-place version:
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
b = a.add_(adding_tensor)
c = torch.sum(b)
c.backward()
print(c.grad_fn)
which results in this error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
in
2 a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
3 adding_tensor = torch.rand(3)
----> 4 b = a.add_(adding_tensor)
5 c = torch.sum(b)
6 c.backward()
RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.
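As a side note, in-place updates of leaf tensors are still possible when autograd is not recording, which is roughly how optimizers update parameters; a minimal sketch (my own addition, not from the answer you posted):
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
with torch.no_grad():      # autograd is not recording inside this block
    a.add_(adding_tensor)  # no error, but this addition is invisible to autograd
print(a.requires_grad)     # still True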
Second, the non-in-place version:
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
b = a.add(adding_tensor)
c = torch.sum(b)
c.backward()
print(c.grad_fn)
which works just fine. The output is the grad_fn of c, a SumBackward0 object, showing that autograd tracked the computation.
So, to summarize, I just wanted to point out that in-place operations should be used with some care in PyTorch.