Autograd

Autograd is now a core torch package for automatic differentiation, implemented with a tape-based system.

In the forward phase, the autograd tape will remember all the operations it executed, and in the backward phase, it will replay them to compute the gradients.

Tensors that track history

In autograd, if any input Tensor of an operation has requires_grad=True, the computation will be tracked. After computing the backward pass, a gradient w.r.t. this tensor is accumulated into its .grad attribute.
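
For instance, here is a minimal sketch (assuming a standard PyTorch install) showing that repeated backward passes accumulate into .grad rather than overwrite it:

import torch

w = torch.ones(3, requires_grad=True)

loss1 = (w * 2).sum()
loss1.backward()
print(w.grad)      # tensor([2., 2., 2.])

loss2 = (w * 3).sum()
loss2.backward()
print(w.grad)      # accumulated: tensor([5., 5., 5.])

w.grad.zero_()     # clear the accumulated gradients before the next pass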

There’s one more class which is very important for the autograd implementation - a Function. Tensor and Function are interconnected and build up an acyclic graph that encodes a complete history of computation. Each Tensor has a .grad_fn attribute that references the Function that created it (except for Tensors created by the user - these have None as their .grad_fn).
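
As a small sketch (assuming a standard PyTorch install; the exact grad_fn class names vary between versions), you can peek at this graph through grad_fn and its next_functions attribute:

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = (y * y).mean()

print(x.grad_fn)                 # None - x was created by the user
print(z.grad_fn)                 # the Mean function that created z
print(z.grad_fn.next_functions)  # the functions feeding into it (here, the Mul)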

If you want to compute the derivatives, you can call .backward() on a Tensor. If the Tensor is a scalar (i.e. it holds a one-element tensor), you don’t need to specify any arguments to backward(); however, if it has more elements, you need to specify a gradient argument that is a tensor of matching shape.
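
For example, here is a minimal sketch of the non-scalar case, where backward() needs a gradient tensor of matching shape:

import torch

x = torch.ones(3, requires_grad=True)
y = x * 2                              # y is not a scalar

# calling y.backward() with no arguments would raise an error here;
# pass a gradient of the same shape as y instead
y.backward(torch.tensor([1.0, 0.1, 0.01]))
print(x.grad)                          # tensor([2.0000, 0.2000, 0.0200])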

import torch

Create a tensor and set requires_grad=True to track computation with it

x = torch.ones(2, 2, requires_grad=True)
print(x)

Out:

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
print(x.data)

Out:

tensor([[1., 1.],
        [1., 1.]])
print(x.grad)

Out:

None
print(x.grad_fn)  # we've created x ourselves

Out:

None

Do an operation on x:

y = x + 2
print(y)

Out:

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward>)

y was created as a result of an operation, so it has a grad_fn

print(y.grad_fn)

Out:

<AddBackward object at 0x7f23f4054eb8>

More operations on y:

z = y * y * 3
out = z.mean()

print(z, out)

Out:

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward>) tensor(27., grad_fn=<MeanBackward1>)

.requires_grad_( ... ) changes an existing Tensor’s requires_grad flag in-place. The input flag defaults to True if not given.

a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

Out:

False
True
<SumBackward0 object at 0x7f23f76eb9e8>

Gradients

Let’s backprop now and print the gradients d(out)/dx:

out.backward()
print(x.grad)

Out:

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
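
As a quick check of this value: call the out tensor o. Then o = (1/4) * sum_i z_i with z_i = 3(x_i + 2)^2, so d(o)/dx_i = (3/2)(x_i + 2), which at x_i = 1 gives (3/2) * 3 = 4.5.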

By default, gradient computation frees all the internal buffers contained in the graph, so if you want to do the backward on some part of the graph twice, you need to pass in retain_graph=True during the first pass.

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
y.backward(torch.ones(2, 2), retain_graph=True)
# the retain_graph flag will prevent the internal buffers from being freed
print(x.grad)

Out:

tensor([[1., 1.],
        [1., 1.]])
z = y * y
print(z)

Out:

tensor([[9., 9.],
        [9., 9.]], grad_fn=<ThMulBackward>)

Just backprop random gradients:

gradient = torch.randn(2, 2)

# this would fail if we didn't specify
# that we want to retain the graph
y.backward(gradient)

print(x.grad)

Out:

tensor([[ 0.8181,  1.6773],
        [ 1.6309, -0.5167]])

You can also stop autograd from tracking history on Tensors with requires_grad=True by wrapping the code block in with torch.no_grad():

print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

Out:

True
True
False
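
A common use of this pattern, sketched below with a made-up learning rate lr, is to wrap a manual parameter update so the update itself is not recorded in the graph:

import torch

w = torch.randn(3, requires_grad=True)
loss = (w * w).sum()
loss.backward()

lr = 0.1                     # hypothetical learning rate, for illustration only
with torch.no_grad():
    w -= lr * w.grad         # in-place update, not tracked by autograd
w.grad.zero_()               # reset gradients for the next iteration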

 
