2. A GENTLE INTRODUCTION TO TORCH.AUTOGRAD

torch.autograd is PyTorch's automatic differentiation engine that powers neural network training. In this section, you will get a conceptual understanding of how autograd helps train a neural network.

1. Background

Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.

Training a NN happens in two steps (a minimal code sketch follows this list):

Forward propagation: in the forward pass, the NN makes its best guess about the correct output. It runs the input data through each of its functions to make this guess.

Backward propagation: in the backward pass, the NN adjusts its parameters proportionate to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (the gradients), and optimizing the parameters using gradient descent.
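Before the full example in the next section, here is a minimal sketch of these two steps on a made-up one-layer model with random data; the model, input, and squared-error loss below are illustrative assumptions, not part of the original tutorial.

import torch

model = torch.nn.Linear(4, 2)              # parameters: a weight tensor and a bias tensor
x = torch.rand(1, 4)                       # random input
target = torch.rand(1, 2)                  # random "correct" output

prediction = model(x)                      # forward propagation: make a guess
loss = (prediction - target).pow(2).sum()  # error of the guess
loss.backward()                            # backward propagation: collect gradients

print(model.weight.grad.shape)             # gradient stored in .grad, same shape as the parameter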

2. Usage in PyTorch

Let’s take a look at a single training step. For this example, we load a pretrained resnet18 model from torchvision. We create a random data tensor to represent a single image with 3 channels, and height & width of 64, and its corresponding label initialized to some random values.

import torch, torchvision
model = torchvision.models.resnet18(pretrained=True)
data = torch.rand(1, 3, 64, 64)  # a batch of one 3-channel, 64x64 image
labels = torch.rand(1, 1000)     # a random label tensor of shape (1, 1000)

Next, we run the input data through the model through each of its layers to make a prediction. This is the forward pass.

prediction = model(data) # forward pass

We use the model’s prediction and the corresponding label to calculate the error (loss). The next step is to backpropagate this error through the network. Backward propagation is kicked off when we call .backward() on the error tensor. Autograd then calculates and stores the gradients for each model parameter in the parameter’s .grad attribute.

loss = (prediction - labels).sum()
loss.backward() # backward pass

Next, we load an optimizer, in this case SGD with a learning rate of 0.01 and momentum of 0.9. We register all the parameters of the model in the optimizer.

optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)  # lr is the learning rate

Finally, we call .step() to initiate gradient descent(梯度下降). The optimizer adjusts each parameter by its gradient stored in .grad.

optim.step() #gradient descent

At this point, you have everything you need to train your neural network. The sections below describe how autograd works in more detail.
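Putting the snippets above together, one full training iteration looks roughly like the sketch below. The only addition is optim.zero_grad(), which clears gradients left over from a previous iteration; it is not shown in the tutorial snippets above but is needed when the step is repeated in a loop.

import torch, torchvision

model = torchvision.models.resnet18(pretrained=True)
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)

optim.zero_grad()                   # clear any previously accumulated gradients
prediction = model(data)            # forward pass
loss = (prediction - labels).sum()  # error
loss.backward()                     # backward pass: gradients land in each parameter's .grad
optim.step()                        # gradient descent: update the parameters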


3. Differentiation in Autograd

Let’s take a look at how autograd collects gradients. We create two tensors a and b with requires_grad=True. This signals to autograd that every operation on them should be tracked.

import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

In PyTorch, every tensor has a requires_grad attribute; if it is set to True, operations on that tensor are tracked and its gradient is computed automatically during backpropagation. requires_grad defaults to False.
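A small illustrative check of this attribute (the tensors x and w below are made up for this example):

import torch

x = torch.tensor([1., 2.])                      # requires_grad defaults to False
w = torch.tensor([3., 4.], requires_grad=True)  # operations on w are tracked

print(x.requires_grad)   # False
print(w.requires_grad)   # True

y = (w * x).sum()
y.backward()
print(w.grad)            # tensor([1., 2.]), i.e. dy/dw = x
print(x.grad)            # None, because x is not tracked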

We create another tensor Q from a and b.
$$Q = 3a^3 - b^2$$

Q = 3*a**3 - b**2

Let’s assume a and b to be parameters of an NN, and Q to be the error. In NN training, we want gradients of the error w.r.t. parameters, i.e.


$$\frac{\partial Q}{\partial a} = 9a^2$$

$$\frac{\partial Q}{\partial b} = -2b$$
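For the concrete values above, a = [2., 3.] and b = [6., 4.], these gradients work out to ∂Q/∂a = [36., 81.] and ∂Q/∂b = [-12., -8.], which is what the check at the end of this section confirms.
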
When we call .backward() on Q, autograd calculates these gradients and stores them in the respective tensors’ .grad attribute.

We need to explicitly pass a gradient argument in Q.backward() because it is a vector. gradient is a tensor of the same shape as Q, and it represents the gradient of Q w.r.t. itself, i.e.
$$\frac{dQ}{dQ} = 1$$
Equivalently, we can also aggregate Q into a scalar and call backward implicitly, like Q.sum().backward().

external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad)

Gradients are now deposited in a.grad and b.grad.

# check if collected gradients are correct
print(9*a**2 == a.grad)
print(-2*b == b.grad)

Both checks print tensor([True, True]).
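As noted above, the same gradients can also be obtained by aggregating Q into a scalar first, in which case no explicit gradient argument is needed; a minimal sketch with fresh tensors:

import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

Q = 3*a**3 - b**2
Q.sum().backward()       # scalar output, so no gradient argument is required

print(a.grad)            # tensor([36., 81.]), i.e. 9*a**2
print(b.grad)            # tensor([-12., -8.]), i.e. -2*b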
