What about .data?
.data was the primary way to get the underlying Tensor from a Variable. After this merge, calling y = x.data still has similar semantics: y will be a Tensor that shares the same data with x, is unrelated to the computation history of x, and has requires_grad=False.
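As a quick sanity check, the following snippet (a minimal sketch; the tensor values are arbitrary) verifies those three properties directly:
>>> import torch
>>> x = torch.ones(3, requires_grad=True) * 2    # non-leaf tensor with a grad_fn
>>> y = x.data
>>> y.requires_grad                               # autograd will not track y
False
>>> y.data_ptr() == x.data_ptr()                  # y shares x's underlying storage
True
>>> x.grad_fn is not None, y.grad_fn is None      # y carries none of x's history
(True, True)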
However, .data can be unsafe in some cases. Any changes on x.data wouldn't be tracked by autograd, and the computed gradients would be incorrect if x is needed in a backward pass. A safer alternative is to use x.detach(), which also returns a Tensor that shares data with x and has requires_grad=False, but will have its in-place changes reported by autograd if x is needed in backward.
Here is an example of the difference between .data and x.detach() (and why we recommend using detach in general).

If you use Tensor.detach(), the gradient computation is guaranteed to be correct.
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.detach()
>>> c.zero_()
tensor([ 0., 0., 0.])
>>> out # modified by c.zero_() !!
tensor([ 0., 0., 0.])
>>> out.sum().backward() # Requires the original value of out, but that was overwritten by c.zero_()
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
However, using Tensor.data can be unsafe: it can easily produce incorrect gradients when a tensor is required for gradient computation but is modified in-place.
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.data
>>> c.zero_()
tensor([ 0., 0., 0.])
>>> out # out was modified by c.zero_()
tensor([ 0., 0., 0.])
>>> out.sum().backward()
>>> a.grad # The result is very, very wrong because `out` changed!
tensor([ 0., 0., 0.])
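For comparison, here is what the same backward pass gives when out is left unmodified. The gradient of sigmoid is sigmoid(a) * (1 - sigmoid(a)), so the printed values below are approximate:
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> out.sum().backward()
>>> a.grad  # sigmoid(a) * (1 - sigmoid(a)), as expected
tensor([0.1966, 0.1050, 0.0452])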
Source: PyTorch 0.4.0 Migration Guide
Reference: https://zhuanlan.zhihu.com/p/38475183