The difference between tensor.detach() and tensor.data in PyTorch

小瓶盖的猪猪侠

Article link: https://blog.csdn.net/qq_29983883/article/details/130000680

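Take a.data and a.detach() as examples:
Both return a tensor with the same values as a that shares its underlying data with the original tensor a, so a change made through one is visible through the other.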

Both serve to separate the tensor from the computation graph it belongs to, and the resulting tensor has requires_grad = False.
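The two shared properties described above can be checked directly. The following is a minimal sketch; the variable names via_data and via_detach are chosen here for illustration and do not come from the original post.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

via_data = a.data        # attribute access
via_detach = a.detach()  # method call

# Neither result tracks gradients.
print(via_data.requires_grad, via_detach.requires_grad)

# All three tensors point at the same underlying memory.
same_storage = a.data_ptr() == via_data.data_ptr() == via_detach.data_ptr()
print(same_storage)

# An in-place write through one tensor is visible through the others.
via_detach[0] = 10.0
print(a[0].item(), via_data[0].item())
```

Comparing data_ptr() is just one convenient way to confirm that no copy was made.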

Differences:

.data is an attribute, whereas .detach() is a method;
.data is unsafe, whereas .detach() is safe;

>>> import torch
>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.data
>>> c.zero_()
tensor([ 0., 0., 0.])

>>> out                   #  the values of out were overwritten by c.zero_()
tensor([ 0., 0., 0.])

>>> out.sum().backward()  #  backward pass
>>> a.grad                #  this result is badly wrong, because out has been changed
tensor([ 0., 0., 0.])

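Why is .data unsafe?

Because the two tensors share data, modifying the tensor obtained from .data also changes the original tensor (out in the example above). PyTorch's automatic differentiation engine, Autograd, cannot detect this change. It still differentiates according to its usual rules and therefore computes an incorrect gradient.

The risk is that if a variable is modified somewhere, the backward pass has no way of knowing about it, so an incorrect gradient can be computed without anyone noticing.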
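One way to see why Autograd misses the change is to look at the tensor's version counter. The sketch below reads the internal attribute _version, which is an implementation detail rather than a stable public API, so treat it as a diagnostic aid only. The names out_data, out_detach and expected are chosen here for illustration.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Case 1: modify through .data. The version counter of the output is not bumped.
out_data = a.sigmoid()
version_before = out_data._version
out_data.data.zero_()
version_after_data = out_data._version

# Case 2: modify through .detach(). The counter is shared, so it is bumped.
out_detach = a.sigmoid()
out_detach.detach().zero_()
version_after_detach = out_detach._version

print(version_before, version_after_data, version_after_detach)

# Backward through the .data case runs without error but uses the zeroed output.
out_data.sum().backward()
s = torch.sigmoid(a.detach())
expected = s * (1 - s)   # true derivative of sigmoid at a
print(a.grad)            # silently wrong: all zeros
print(expected)
```

The sigmoid backward formula reuses the forward output, which is why overwriting that output with zeros yields a zero gradient.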

>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.detach()
>>> c.zero_()
tensor([ 0., 0., 0.])

>>> out                   #  the values of out were overwritten by c.zero_() !!
tensor([ 0., 0., 0.])

>>> out.sum().backward()  #  needs the original values of out, but c.zero_() overwrote them, so an error is raised
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

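So why is .detach() safe?

The advantage of .detach() is that, in the situation above, Autograd can detect that a variable has been changed and raises the error below, which prevents an incorrect gradient from being computed.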

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
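The error can be caught like any other RuntimeError. If a scratch copy really does need to be modified, one common approach, which the original post does not cover, is to clone after detaching so that the copy has its own memory. The sketch below is written under that assumption; the names scratch, raised and expected are illustrative.

```python
import torch

# Part 1: the in-place change through .detach() is detected at backward time.
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = a.sigmoid()
out.detach().zero_()
try:
    out.sum().backward()
    raised = False
except RuntimeError:
    raised = True
print(raised)

# Part 2: detach and clone, then modify the independent copy.
b = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out_b = b.sigmoid()
scratch = out_b.detach().clone()
scratch.zero_()                      # does not touch out_b

out_b.sum().backward()               # runs without error
expected = out_b.detach() * (1 - out_b.detach())
print(torch.allclose(b.grad, expected))
```

The clone costs an extra copy of the data, which is the price of leaving the tensor saved for the backward pass untouched.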

As the examples above show, the problem comes from applying an in-place operation to a tensor produced in the forward pass. So what is an in-place operation?