PyTorch: requires_grad vs. detach, gradient propagation details, cpu/gpu/Variable and numpy conversion

In the old (pre-0.4) PyTorch API, variables fall into three broad categories: cpu tensors, gpu tensors, and Variables. They correspond to data taking part in computation on the CPU, data taking part in computation on the GPU, and data that has been added to the gradient computation graph. Converting between the three is straightforward:

cpu to gpu: t.cuda()
gpu to cpu: t.cpu()
cpu/gpu tensor to Variable: Variable(t)
Variable to cpu/gpu tensor: v.data
tensor to numpy: t.numpy()
numpy to tensor: torch.from_numpy()
Note that y = Variable(t.cuda()) creates a single graph node y, while y = Variable(t).cuda() creates two graph nodes, t and y.
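Putting these conversions together, here is a minimal sketch; it assumes the old pre-0.4 Variable API used throughout this post and a CUDA device being available:

import torch
from torch.autograd import Variable

t = torch.Tensor([[1, 2, 3], [4, 5, 6]])   # CPU tensor

t_gpu = t.cuda()            # CPU -> GPU (needs a CUDA device)
t_cpu = t_gpu.cpu()         # GPU -> CPU

v = Variable(t)             # wrap a cpu/gpu tensor into a Variable
t_back = v.data             # Variable -> underlying tensor

a = t.numpy()               # CPU tensor -> numpy array (shares memory)
t2 = torch.from_numpy(a)    # numpy array -> tensor (shares memory)

y1 = Variable(t.cuda())     # one graph node: y1
y2 = Variable(t).cuda()     # two graph nodes: the leaf Variable and its .cuda() copy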
Converting [1] to 1, i.e. a single-element tensor to a scalar with the type unchanged (single element tensor to scalar): index element 0.
a = torch.Tensor([1])
a[0]   # 1.0
# detach_() turns a node in the computation graph into a leaf node, i.e. sets its .grad_fn to None, so the node before the detach_() is no longer connected to the current variable
>>> import torch
>>> from torch.autograd import Variable
>>> 
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = 1*x
>>> m.detach_()
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(x.grad)  # no later variable is connected to x; x and m are disconnected
None
>>> print(y.grad)
Variable containing:
 1.8000e+01  5.7600e+02  4.3740e+03
 1.8432e+04  5.6250e+04  1.3997e+05
[torch.FloatTensor of size 2x3]

Because PyTorch builds the graph dynamically, where you call detach matters: the same call has different effects depending on its position.
import torch
from torch.autograd import Variable
a = Variable(torch.randn(2, 2), requires_grad=True)
b = a * 2
c = b * 2
b.detach_()        # detach after c was built: the a -> b -> c path is already in the graph
c.sum().backward()
print(a.grad, b.grad, c.grad)   # a.grad is all 4s; b and c are non-leaf, so their grad is None

Variable containing:
 4  4
 4  4
[torch.FloatTensor of size 2x2]
 None None


import torch
from torch.autograd import Variable
a = Variable(torch.randn(2, 2), requires_grad=True)
b = a * 2
b.detach_()        # detach before c is built: c no longer requires grad
c = b * 2
c.sum().backward()
print(a.grad, b.grad, c.grad)
# Raises: element 0 of variables does not require grad and does not have a grad_fn


import torch
from torch.autograd import Variable
a = Variable(torch.randn(2, 2), requires_grad=True)
b = a * 2
d = a * 3
temp = b.detach()      # temp shares data with b but is cut off from the graph
c = temp * 2 + d       # gradients reach a only through d
c.sum().backward()
print(a.grad, b.grad, c.grad, d.grad)

Variable containing:
 3  3
 3  3
[torch.FloatTensor of size 2x2]
 None None None
Note that a Variable separated with detach_() (or detach()) still points to the same underlying tensor, so modifying that tensor in place also affects the original:
import torch
from torch.autograd import Variable
t1 = torch.FloatTensor([1., 2.])
v1 = Variable(t1)
t2 = torch.FloatTensor([2., 3.])
v2 = Variable(t2)
v3 = v1 + v2
v3_detached = v3.detach()
v3_detached.data.add_(t1) # modifies the values of the tensor inside v3_detached
print(v3, v3_detached)    # the values of the tensor inside v3 change as well
# If you assign to tensor elements directly through indexing, those elements no longer take part in the gradient computation
>>> import torch
>>> from torch.autograd import Variable
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = 1*x
>>> m[(m>4).detach()] = 0
>>> print(m)   # the values 5 and 6 in m have been overwritten with 0
Variable containing:
 1  2  3
 4  0  0
[torch.FloatTensor of size 2x3]

>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(x.grad)  # x.grad no longer contains gradients at the positions that held 5 and 6
Variable containing:
 2  4  6
 8  0  0
[torch.FloatTensor of size 2x3]
# requires_grad=False controls whether gradients are computed for a leaf variable
>>> import torch
>>> from torch.autograd import Variable
>>> 
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=False)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = x.pow(2)
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(m.requires_grad)
False
>>> print(z.requires_grad)
True
>>> print(x.grad)
None
>>> print(y.grad)
Variable containing:
 1.8000e+01  5.7600e+02  4.3740e+03
 1.8432e+04  5.6250e+04  1.3997e+05
[torch.FloatTensor of size 2x3]

# requires_grad can only be changed on leaf variables
>>> import torch
>>> from torch.autograd import Variable
>>> 
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = x.pow(2)
>>> m.requires_grad = False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(m.requires_grad)
True
>>> print(x.grad)
Variable containing:
   4   32  108
 256  500  864
[torch.FloatTensor of size 2x3]

# Result of dividing a 2D tensor (2x3) by a 1D tensor (size 3): the divisor is broadcast across the rows
Variable containing:
 0.0000  0.2447  0.0000
 0.0000  0.2447  0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]

Variable containing:
 0.0010
 2.0010
 0.0010
[torch.cuda.FloatTensor of size 3 (GPU 0)]

Variable containing:
 0.0000  0.1223  0.0000
 0.0000  0.1223  0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]

# The same division, with the divisor as a 1x3 tensor, gives the same result
Variable containing:
 0.0000  0.2447  0.0000
 0.0000  0.2447  0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]

Variable containing:
 0.0010  2.0010  0.0010
[torch.cuda.FloatTensor of size 1x3 (GPU 0)]

Variable containing:
 0.0000  0.1223  0.0000
 0.0000  0.1223  0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
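As a minimal sketch of the broadcasting behaviour shown above (plain tensors on the CPU, example values copied from the output; broadcasting the divisor requires PyTorch >= 0.2):

import torch

a = torch.Tensor([[0.0, 0.2447, 0.0],
                  [0.0, 0.2447, 0.0]])              # 2x3
b1 = torch.Tensor([0.0010, 2.0010, 0.0010])         # size 3
b2 = torch.Tensor([[0.0010, 2.0010, 0.0010]])       # size 1x3

print(a / b1)   # the divisor is broadcast across the rows: the middle column becomes 0.1223
print(a / b2)   # the leading dimension of size 1 is broadcast the same way, same result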

The element ordering produced by d = c.view(-1,1).sum(1): view flattens the tensor in row-major order, last dimension varying fastest.
Variable containing:
(0 ,0 ,.,.) = 
  2.5000  6.5000
  4.5000  4.0000

(0 ,1 ,.,.) = 
  2.5000  6.5000
  4.5000  4.0000

(1 ,0 ,.,.) = 
  4.5000  6.5000
  6.5000  6.5000

(1 ,1 ,.,.) = 
  4.5000  6.5000
  6.5000  6.5000
[torch.cuda.FloatTensor of size 2x2x2x2 (GPU 0)]

Variable containing:
 2.5000
 6.5000
 4.5000
 4.0000
 2.5000
 6.5000
 4.5000
 4.0000
 4.5000
 6.5000
 6.5000
 6.5000
 4.5000
 6.5000
 6.5000
 6.5000
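A minimal sketch (with hypothetical values 0 to 15 instead of the ones above) showing that view flattens in row-major order, which is exactly the ordering printed above:

import torch

c = torch.arange(16).view(2, 2, 2, 2)   # values 0..15 laid out in row-major (C) order
d = c.view(-1, 1).sum(1)                # reshape to 16x1, then sum over the size-1 dim
print(d)                                # 0, 1, 2, ..., 15: same order as reading c row by row
print(torch.equal(d, c.view(-1)))       # True: summing a length-1 dimension changes nothing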

Non-leaf Variables do not store grad:
xx = Variable(torch.randn(1,1), requires_grad = True)
yy = 3*xx
zz = yy**2
zz.backward()
xx.grad # e.g. 0.5137: leaf, so the gradient is stored
yy.grad # None: non-leaf
zz.grad # None: non-leaf

Note:
a = Variable(torch.randn(2,10), requires_grad=True).cuda()
Here a is not a leaf: the Variable itself is a leaf, but calling .cuda() produces another, non-leaf Variable. Only the following form gives a leaf:
a = Variable(torch.randn(2,10).cuda(), requires_grad=True)
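A quick way to check the difference is to run a backward pass and see whether .grad is retained; a minimal sketch, assuming a CUDA device is available:

import torch
from torch.autograd import Variable

# non-leaf: .cuda() adds a copy node on top of the leaf Variable
a = Variable(torch.randn(2, 10), requires_grad=True).cuda()
a.sum().backward()
print(a.grad)   # None, because a is not a leaf and its gradient is not stored

# leaf: move the tensor to the GPU first, then wrap it
b = Variable(torch.randn(2, 10).cuda(), requires_grad=True)
b.sum().backward()
print(b.grad)   # a 2x10 tensor of ones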

If you want the grad of a non-leaf Variable, you need to register a hook:
yGrad = torch.zeros(1,1)
def extract(xVar):
	global yGrad
	yGrad = xVar	# the hook receives the gradient w.r.t. yy during backward

xx = Variable(torch.randn(1,1), requires_grad = True)
yy = 3*xx
zz = yy**2

yy.register_hook(extract)

#### Run the backprop:
print (yGrad) # shows the initial zeros
zz.backward()
print (yGrad) # now shows the correct dz/dy (= 2*yy)