2. A GENTLE INTRODUCTION TO TORCH.AUTOGRAD

torch.autograd is PyTorch's automatic differentiation engine that powers neural network training. In this section, you will get a conceptual understanding of how autograd helps train a neural network.

1. Background

Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.

Training a NN happens in two steps (a minimal code sketch follows this list):

Forward propagation: in the forward pass, the NN makes its best guess about the correct output. It runs the input data through each of its functions to make this guess.

Backward propagation: in the backward pass, the NN adjusts its parameters proportionate to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (the gradients), and optimizing the parameters using gradient descent.
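Before the full example in the next section, here is a minimal sketch of these two steps on a made-up one-layer model with random data; the model, input, and squared-error loss below are illustrative assumptions, not part of the original tutorial.

import torch

model = torch.nn.Linear(4, 2)              # parameters: a weight tensor and a bias tensor
x = torch.rand(1, 4)                       # random input
target = torch.rand(1, 2)                  # random "correct" output

prediction = model(x)                      # forward propagation: make a guess
loss = (prediction - target).pow(2).sum()  # error of the guess
loss.backward()                            # backward propagation: collect gradients

print(model.weight.grad.shape)             # gradient stored in .grad, same shape as the parameter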

2. Usage in PyTorch

Let’s take a look at a single training step. For this example, we load a pretrained resnet18 model from torchvision. We create a random data tensor to represent a single image with 3 channels, and height & width of 64, and its corresponding label initialized to some random values.

import torch, torchvision
model = torchvision.models.resnet18(pretrained=True)
data = torch.rand(1, 3, 64, 64)  # a batch of one 3-channel, 64x64 image
labels = torch.rand(1, 1000)     # a random label tensor of shape (1, 1000)

Next, we run the input data through the model through each of its layers to make a prediction. This is the forward pass.

prediction = model(data) # forward pass

We use the model’s prediction and the corresponding label to calculate the error (loss). The next step is to backpropagate this error through the network. Backward propagation is kicked off when we call .backward() on the error tensor. Autograd then calculates and stores the gradients for each model parameter in the parameter’s .grad attribute.

loss = (prediction - labels).sum()
loss.backward() # backward pass

Next, we load an optimizer, in this case SGD with a learning rate of 0.01 and momentum of 0.9. We register all the parameters of the model in the optimizer.

optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)  # lr is the learning rate

Finally, we call .step() to initiate gradient descent(梯度下降). The optimizer adjusts each parameter by its gradient stored in .grad.

optim.step() #gradient descent

At this point, you have everything you need to train your neural network. The sections below describe how autograd works in more detail.
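Putting the snippets above together, one full training iteration looks roughly like the sketch below. The only addition is optim.zero_grad(), which clears gradients left over from a previous iteration; it is not shown in the tutorial snippets above but is needed when the step is repeated in a loop.

import torch, torchvision

model = torchvision.models.resnet18(pretrained=True)
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)

optim.zero_grad()                   # clear any previously accumulated gradients
prediction = model(data)            # forward pass
loss = (prediction - labels).sum()  # error
loss.backward()                     # backward pass: gradients land in each parameter's .grad
optim.step()                        # gradient descent: update the parameters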


3. Differentiation in Autograd

Let’s take a look at how autograd collects gradients. We create two tensors a and b with requires_grad=True. This signals to autograd that every operation on them should be tracked.

import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

In PyTorch, every tensor has a requires_grad attribute; if it is set to True, operations on that tensor are tracked and its gradient is computed automatically during backpropagation. requires_grad defaults to False.
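A small illustrative check of this attribute (the tensors x and w below are made up for this example):

import torch

x = torch.tensor([1., 2.])                      # requires_grad defaults to False
w = torch.tensor([3., 4.], requires_grad=True)  # operations on w are tracked

print(x.requires_grad)   # False
print(w.requires_grad)   # True

y = (w * x).sum()
y.backward()
print(w.grad)            # tensor([1., 2.]), i.e. dy/dw = x
print(x.grad)            # None, because x is not tracked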

We create another tensor Q from a and b.
$$Q = 3a^3 - b^2$$

Q = 3*a**3 - b**2

Let’s assume a and b to be parameters of an NN, and Q to be the error. In NN training, we want gradients of the error w.r.t. parameters, i.e.


$$\frac{\partial Q}{\partial a} = 9a^2$$

$$\frac{\partial Q}{\partial b} = -2b$$
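For the concrete values above, a = [2., 3.] and b = [6., 4.], these gradients work out to ∂Q/∂a = [36., 81.] and ∂Q/∂b = [-12., -8.], which is what the check at the end of this section confirms.
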
When we call .backward() on Q, autograd calculates these gradients and stores them in the respective tensors’ .grad attribute.

We need to explicitly pass a gradient argument in Q.backward() because it is a vector. gradient is a tensor of the same shape as Q, and it represents the gradient of Q w.r.t. itself, i.e.
$$\frac{dQ}{dQ} = 1$$
Equivalently, we can also aggregate Q into a scalar and call backward implicitly, like Q.sum().backward().

external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad)

Gradients are now deposited in a.grad and b.grad.

# check if collected gradients are correct
print(9*a**2 == a.grad)
print(-2*b == b.grad)

Both checks print tensor([True, True]).
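As noted above, the same gradients can also be obtained by aggregating Q into a scalar first, in which case no explicit gradient argument is needed; a minimal sketch with fresh tensors:

import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

Q = 3*a**3 - b**2
Q.sum().backward()       # scalar output, so no gradient argument is required

print(a.grad)            # tensor([36., 81.]), i.e. 9*a**2
print(b.grad)            # tensor([-12., -8.]), i.e. -2*b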
