What about .data?
.data was the primary way to get the underlying Tensor from a Variable. After this merge, calling y = x.data still has similar semantics: y will be a Tensor that shares the same data with x, is unrelated to the computation history of x, and has requires_grad=False.
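As a quick sanity check, the following snippet (a minimal sketch; the tensor values are arbitrary) verifies those three properties directly:
>>> import torch
>>> x = torch.ones(3, requires_grad=True) * 2    # non-leaf tensor with a grad_fn
>>> y = x.data
>>> y.requires_grad                               # autograd will not track y
False
>>> y.data_ptr() == x.data_ptr()                  # y shares x's underlying storage
True
>>> x.grad_fn is not None, y.grad_fn is None      # y carries none of x's history
(True, True)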
However, .data can be unsafe in some cases. Any changes on x.data wouldn't be tracked by autograd, and the computed gradients would be incorrect if x is needed in a backward pass. A safer alternative is to use x.detach(), which also returns a Tensor that shares data with x and has requires_grad=False, but will have its in-place changes reported by autograd if x is needed in backward.
Here is an example of the difference between .data and x.detach() (and why we recommend using detach in general).

If you use Tensor.detach(), the gradient computation is guaranteed to be correct.
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.detach()
>>> c.zero_()
tensor([ 0., 0., 0.])
>>> out # modified by c.zero_() !!
tensor([ 0., 0., 0.])
>>> out.sum().backward() # Requires the original value of out, but that was overwritten by c.zero_()
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
However, using Tensor.data can be unsafe: it can easily produce incorrect gradients when a tensor is required for gradient computation but is modified in-place.
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.data
>>> c.zero_()
tensor([ 0., 0., 0.])
>>> out # out was modified by c.zero_()
tensor([ 0., 0., 0.])
>>> out.sum().backward()
>>> a.grad # The result is very, very wrong because `out` changed!
tensor([ 0., 0., 0.])
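For comparison, here is what the same backward pass gives when out is left unmodified. The gradient of sigmoid is sigmoid(a) * (1 - sigmoid(a)), so the printed values below are approximate:
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> out.sum().backward()
>>> a.grad  # sigmoid(a) * (1 - sigmoid(a)), as expected
tensor([0.1966, 0.1050, 0.0452])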
Source: PyTorch 0.4.0 Migration Guide
Reference: https://zhuanlan.zhihu.com/p/38475183