import torch

x = torch.FloatTensor([1, 1, 5])      # after backward: x.grad: tensor([0., 0., 1.])
x.requires_grad = True
y = torch.FloatTensor([2, 2, 20])     # after backward: y.grad: tensor([0., 0., 2.])
y.requires_grad = True
z = x + 2 * y
z.backward(gradient=torch.FloatTensor([0, 0, 1]))
When gradient is passed to backward(), the gradients computed during backpropagation are weighted by this argument, as the code above shows:
If gradient were all ones, i.e. gradient=torch.ones_like(z) (calling backward() with no gradient argument on a non-scalar z actually raises a RuntimeError), the result would be:
x.grad:tensor([1., 1., 1.])
y.grad:tensor([2., 2., 2.])
With gradient set as above ([0, 0, 1]), the result is:
x.grad:tensor([0., 0., 1.])
y.grad:tensor([0., 0., 2.])
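In other words, gradient is the vector in a vector-Jacobian product: z.backward(gradient=g) accumulates g times dz/dinput into each input's .grad, which gives the same result as (z * g).sum().backward(). A minimal sketch checking this equivalence (the names g, x1, x2, y1, y2 are mine, not from the example above):

import torch

g = torch.FloatTensor([0, 0, 1])

# Path 1: pass the weighting vector directly to backward().
x1 = torch.FloatTensor([1, 1, 5]); x1.requires_grad = True
y1 = torch.FloatTensor([2, 2, 20]); y1.requires_grad = True
(x1 + 2 * y1).backward(gradient=g)

# Path 2: weight the output manually and reduce it to a scalar.
x2 = torch.FloatTensor([1, 1, 5]); x2.requires_grad = True
y2 = torch.FloatTensor([2, 2, 20]); y2.requires_grad = True
((x2 + 2 * y2) * g).sum().backward()

print(torch.equal(x1.grad, x2.grad))  # True
print(torch.equal(y1.grad, y2.grad))  # True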
A slightly more complex example:
x = torch.FloatTensor([1, 1, 5])
x.requires_grad = True
y = torch.FloatTensor([2, 2, 20])
y.requires_grad = True
z = x + 2 * y
u = torch.FloatTensor([[1, 2], [3, 4], [5, 6]])
u.requires_grad = True
v = torch.matmul(z.reshape(1, 3), u).reshape(-1)
v.backward(gradient=torch.FloatTensor([0, 1]))  # equivalent to backpropagating only through v[1], i.e. the column u[:, 1]

output:
x.grad: tensor([2., 4., 6.])    # the gradient of v[1] w.r.t. x, i.e. u[:, 1]
y.grad: tensor([4., 8., 12.])   # the gradient of v[1] w.r.t. y, i.e. 2 * u[:, 1]
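To confirm this reading, one can compare against the full Jacobian of v: here dv[j]/dx[i] = u[i, j], so x.grad should equal g @ J_x = u[:, 1] and y.grad should equal 2 * u[:, 1]. A small check, assuming torch.autograd.functional.jacobian is available (PyTorch 1.5+); the helper f and the names Jx, Jy are mine:

import torch
from torch.autograd.functional import jacobian

u = torch.FloatTensor([[1, 2], [3, 4], [5, 6]])
g = torch.FloatTensor([0, 1])

def f(x, y):
    z = x + 2 * y
    return torch.matmul(z.reshape(1, 3), u).reshape(-1)

x = torch.FloatTensor([1, 1, 5])
y = torch.FloatTensor([2, 2, 20])

# jacobian returns (dv/dx, dv/dy), each of shape (2, 3).
Jx, Jy = jacobian(f, (x, y))

# backward(gradient=g) computes the vector-Jacobian product g @ J.
print(g @ Jx)  # tensor([2., 4., 6.])   == u[:, 1]
print(g @ Jy)  # tensor([4., 8., 12.])  == 2 * u[:, 1]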