From: https://github.com/L1aoXingyu/code-of-learn-deep-learning-with-pytorch
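Automatic differentiation (autograd) is one of PyTorch's most important features. It spares us from working out complicated derivatives by hand, which greatly shortens the time needed to build a model, and it is a feature that PyTorch's predecessor, Torch, did not have. The examples below show what autograd can do and explore some of its further uses.
Simple automatic differentiation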
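import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x + 2
z = y ** 2 + 3
z.backward()  # z = (x + 2)^2 + 3, so dz/dx = 2 * (x + 2) = 8 at x = 2
print(x.grad)
print(y.grad)  # None: y is an intermediate (non-leaf) result, so its gradient is not kept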
tensor([8.])
None
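The `None` printed for `y.grad` comes from the fact that `y` is an intermediate result rather than a leaf of the graph, and autograd keeps gradients only for leaf tensors by default. The following is a minimal sketch, assuming PyTorch 0.4 or later (where a plain tensor accepts `requires_grad` and `retain_grad()` is available), of how that intermediate gradient could be kept and compared with the hand-computed values dz/dy = 2y and dz/dx = 2(x + 2):

```python
import torch

# Same computation as above, written with plain tensors instead of Variable.
x = torch.tensor([2.0], requires_grad=True)
y = x + 2
y.retain_grad()   # ask autograd to keep the gradient of this non-leaf tensor
z = y ** 2 + 3
z.backward()

print(x.grad)     # dz/dx = 2 * (x + 2)
print(y.grad)     # dz/dy = 2 * y
```

Both gradients equal 8 at x = 2, because y depends on x with slope 1.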
A slightly more complex example
x = Variable(torch.randn(10, 20), requires_grad=True)
w = Variable(torch.randn(20, 5), requires_grad=True)
y = Variable(torch.randn(10, 5), requires_grad=True)
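# Mean over all 10 * 5 = 50 elements of the residual y - xw; the result is a scalar
out = torch.mean(y - torch.matmul(x, w))
out.backward()  # scalar output, so backward() needs no argument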
print(x.grad)
tensor([[ 9.2806e-02, 2.1742e-03, -8.3944e-02, 6.3885e-02, -1.3879e-02,
-8.7919e-02, 5.4606e-03, -3.3287e-02, -2.4952e-02, -2.5106e-02,
-2.2252e-03, -8.5285e-03, -3.1087e-02, -1.8958e-05, 5.8073e-03,
-3.0519e-02, -4.8497e-02, 1.5940e-03, -2.7383e-02, -1.1168e-02],
[ 9.2806e-02, 2.1742e-03, -8.3944e-02, 6.3885e-02, -1.3879e-02,
-8.7919e-02, 5.4606e-03, -3.3287e-02, -2.4952e-02, -2.5106e-02,
-2.2252e-03, -8.5285e-03, -3.1087e-02, -1.8958e-05, 5.8073e-03,
-3.0519e-02, -4.8497e-02, 1.5940e-03, -2.7383e-02, -1.1168e-02],
......
[ 9.2806e-02, 2.1742e-03, -8.3944e-02, 6.3885e-02, -1.3879e-02,
-8.7919e-02, 5.4606e-03, -3.3287e-02, -2.4952e-02, -2.5106e-02,
-2.2252e-03, -8.5285e-03, -3.1087e-02, -1.8958e-05, 5.8073e-03,
-3.0519e-02, -4.8497e-02, 1.5940e-03, -2.7383e-02, -1.1168e-02]])
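Every row of `x.grad` in the output above is identical. That follows from the formula: `out` is the mean of 50 elements, so the derivative of `out` with respect to `x[i, j]` is `-(1/50) * sum_k w[j, k]`, which does not depend on the row index `i`. The sketch below checks this hand derivation against autograd; the seed and the `expected_*` names are illustrative choices, not part of the original example.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable
x = torch.randn(10, 20, requires_grad=True)
w = torch.randn(20, 5, requires_grad=True)
y = torch.randn(10, 5, requires_grad=True)

out = torch.mean(y - torch.matmul(x, w))
out.backward()

# Hand-derived gradients; the mean runs over 10 * 5 = 50 elements.
expected_x_grad = (-w.detach().sum(dim=1) / 50).expand(10, 20)
expected_w_grad = (-x.detach().sum(dim=0) / 50).unsqueeze(1).expand(20, 5)
expected_y_grad = torch.full((10, 5), 1 / 50)

print(torch.allclose(x.grad, expected_x_grad, atol=1e-6))
print(torch.allclose(w.grad, expected_w_grad, atol=1e-6))
print(torch.allclose(y.grad, expected_y_grad, atol=1e-6))
```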
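Automatic differentiation of multi-dimensional arrays
In PyTorch, calling backward() on an output that is not a scalar requires passing an argument with the same shape as that output. In the example below the output is n, so the argument has the same shape as n.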
m = Variable(torch.FloatTensor([[2, 3]]), requires_grad=True)
n = Variable(torch.zeros([1, 2]))
print(m)
print(n)
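# Fill n element by element from m, so that n becomes a function of m
n[0, 0] = m[0, 0] ** 2
n[0, 1] = m[0, 1] ** 3
print(n)
# n is not a scalar, so backward() needs a tensor with the same shape as n
n.backward(torch.ones_like(n))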
print(m.grad)
tensor([[2., 3.]], requires_grad=True)
tensor([[0., 0.]])
tensor([[ 4., 27.]], grad_fn=<CopySlices>)
tensor([[ 4., 27.]])
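The tensor passed to `backward()` acts as a set of weights on the elements of the output: calling `n.backward(v)` gives the same gradient as reducing `n` to the scalar `(n * v).sum()` and calling `backward()` on that. A minimal sketch of this equivalence follows; the helper `f` and the weight tensor `v` are hypothetical names introduced only for illustration.

```python
import torch

def f(m):
    # Same mapping as above: n[0, 0] = m[0, 0]^2, n[0, 1] = m[0, 1]^3
    return torch.stack([m[0, 0] ** 2, m[0, 1] ** 3]).reshape(1, 2)

m = torch.tensor([[2.0, 3.0]], requires_grad=True)
v = torch.tensor([[1.0, 0.5]])  # illustrative weights, one per element of n

# Route 1: pass v directly to backward() on the non-scalar output.
f(m).backward(v)
grad_from_vector = m.grad.clone()

# Route 2: reduce to a scalar first, then call backward() without arguments.
m.grad.zero_()
(f(m) * v).sum().backward()
grad_from_scalar = m.grad.clone()

print(grad_from_vector)
print(grad_from_scalar)
```

With `v` set to all ones, both routes reduce to the `[[4., 27.]]` result shown above.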
Calling backward multiple times
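x = Variable(torch.FloatTensor([3]), requires_grad=True)
y = x * 2 + x ** 2 + 3
print(y)
# First pass: dy/dx = 2 + 2x = 8 at x = 3
y.backward(retain_graph=True)  # set retain_graph to True to keep the computation graph
print(x.grad)
# Second pass: gradients accumulate in x.grad, so the result is 8 + 8 = 16
y.backward()
print(x.grad)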
tensor([18.], grad_fn=<AddBackward0>)
tensor([8.])
tensor([16.])
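The second result is 16 rather than 8 because each call to `backward()` adds to `x.grad` instead of overwriting it. The sketch below, written with plain tensors and assuming PyTorch 0.4 or later, shows the accumulation and one way to clear it between passes; the variable names `first`, `accumulated`, and `after_reset` are illustrative.

```python
import torch

x = torch.tensor([3.0], requires_grad=True)
y = x * 2 + x ** 2 + 3          # dy/dx = 2 + 2x = 8 at x = 3

y.backward(retain_graph=True)
first = x.grad.clone()          # gradient after one pass

y.backward(retain_graph=True)
accumulated = x.grad.clone()    # the second pass is added on top of the first

x.grad.zero_()                  # clear the accumulated gradient in place
y.backward()
after_reset = x.grad.clone()    # back to the gradient of a single pass

print(first, accumulated, after_reset)
```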
The backward argument
x = Variable(torch.Tensor([2, 3]), requires_grad=True)
k = Variable(torch.zeros(2))
k[0] = x[0]**2
k[1] = x[1]**2
print(k)
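# The weight [1, 0] selects the gradient of k[0] only
k.backward(torch.FloatTensor([1, 0]), retain_graph=True)
print(x.grad)
x.grad.data.zero_()  # reset the accumulated gradient to zero
# The weight [0, 1] selects the gradient of k[1] only
k.backward(torch.FloatTensor([0, 1]))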
print(x.grad)
tensor([4., 9.], grad_fn=<CopySlices>)
tensor([4., 0.])
tensor([0., 6.])
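The two gradients printed above are the two rows of the Jacobian of k with respect to x: each one-hot argument picks out one row. As a sketch of that idea, the loop below stacks the rows into the full matrix. It uses `torch.autograd.grad`, which returns the gradient instead of accumulating it in `x.grad`, so no manual reset is needed; writing `k = x ** 2` in place of the element-wise assignments is a simplification made here.

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
k = x ** 2  # same values as k[0] = x[0]**2, k[1] = x[1]**2

rows = []
for i in range(k.numel()):
    one_hot = torch.zeros_like(k)
    one_hot[i] = 1.0  # select the i-th output
    # Returns a tuple with one gradient per input; x.grad is left untouched.
    (row,) = torch.autograd.grad(k, x, grad_outputs=one_hot, retain_graph=True)
    rows.append(row)

jacobian = torch.stack(rows)
print(jacobian)
```

The rows of `jacobian` match the two `x.grad` values printed above, `[4., 0.]` and `[0., 6.]`.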