Notes on the PyTorch 60-Minute Blitz for Deep Learning, Part 2

II. Automatic Differentiation: torch.autograd

Concept:

autograd is PyTorch's automatic differentiation engine, used for training neural networks.

Training a neural network involves two main steps:

1. Forward propagation:

Build a suitable model and run the input data through it; the goal is for the trained parameters to fit the results on the training dataset.

2. Backward propagation:

Adjust the model's parameters according to the gap between the predictions and the correct results (measured by a loss function), usually with a gradient descent method. This is where automatic differentiation (autograd) is needed.

 

1. Example

 

# Usage in PyTorch

# create a random data tensor
import torch, torchvision
model = torchvision.models.resnet18(pretrained=True)
data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)

# forward pass
prediction = model(data)
 
# Autograd then calculates and stores the gradients for each model parameter 
# in the parameter’s .grad attribute.
loss = (prediction - labels).sum()
loss.backward() # backward pass

# Next, we load an optimizer
optim = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Finally, we call .step() to initiate gradient descent. 
# The optimizer adjusts each parameter by its gradient stored in .grad.
optim.step() # gradient descent

Notes:

 

1. torch.rand()

# Returns a tensor of the given shape, filled with random numbers drawn from a uniform distribution on the interval [0, 1)

2. torchvision.models.resnet18(pretrained=False, **kwargs)

Constructs a ResNet-18 model. pretrained (bool) – if True, the returned model is pre-trained on ImageNet.

Reference: https://pytorch-cn.readthedocs.io/zh/latest/torchvision/torchvision-models/

3. torch.optim and .step()

Reference: https://pytorch.org/docs/stable/optim.html
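
Roughly speaking, for plain SGD without momentum, .step() updates each parameter p as p = p - lr * p.grad, using the gradient that backward() stored in p.grad. Below is a minimal hand-written sketch of that update rule (an illustration only, not the optimizer's actual implementation):

import torch

w = torch.randn(3, requires_grad=True)   # a toy "parameter"
x = torch.randn(3)
loss = (w * x).sum()
loss.backward()                          # gradient of loss w.r.t. w lands in w.grad

lr = 0.01
with torch.no_grad():                    # update the parameter without tracking the update itself
    w -= lr * w.grad                     # the same rule SGD's .step() applies (momentum aside)
w.grad.zero_()                           # clear the gradient before the next iteration

In a real training loop the same three calls repeat every iteration: optimizer.zero_grad(), loss.backward(), optimizer.step().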

 

2. Differentiation in Autograd

 

## 2. Differentiation in Autograd
import torch

a = torch.tensor([2., 3.], requires_grad=True)
# requires_grad=True signals to autograd that every operation on these tensors should be tracked
b = torch.tensor([6., 4.], requires_grad=True)

Q = 3*a**3 - b**2

# Q is a vector, so we need to explicitly pass a gradient argument to Q.backward();
# gradient has the same shape as Q and represents dQ/dQ = 1 for each element.
external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad, retain_graph=True)  # retain_graph keeps the graph for the second backward call below

# Gradients are now deposited in a.grad and b.grad
print(9*a**2 == a.grad)   # dQ/da = 9a^2
print(-2*b == b.grad)     # dQ/db = -2b
print(a.grad)
print(b.grad)

# Equivalently, we can also aggregate Q into a scalar and call backward implicitly,
# like Q.sum().backward(). Note that this second call accumulates into .grad,
# which is why the last two results below are doubled.
Q.sum().backward()
print(a.grad)
print(b.grad)

# Output
tensor([True, True])
tensor([True, True])
tensor([36., 81.])
tensor([-12.,  -8.])
tensor([ 72., 162.])
tensor([-24., -16.])
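
Because backward() accumulates gradients into .grad rather than overwriting them, the last two results are exactly twice the first. A short follow-up sketch showing how to reset the gradients with .grad.zero_() before another backward pass:

# Continuing the example above
a.grad.zero_()
b.grad.zero_()
Q = 3*a**3 - b**2        # rebuild the graph (it was freed by the last backward call)
Q.sum().backward()
print(a.grad)            # tensor([36., 81.]) again, no accumulation this time
print(b.grad)            # tensor([-12., -8.])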

3. The chain rule
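
When operations are composed, autograd multiplies the local derivatives together: for z = g(y) with y = f(x), it computes dz/dx = dz/dy * dy/dx. A minimal sketch of this:

import torch

x = torch.tensor(2., requires_grad=True)
y = x ** 2          # dy/dx = 2x
z = 3 * y + 1       # dz/dy = 3
z.backward()
print(x.grad)       # tensor(12.) = dz/dy * dy/dx = 3 * 2x, evaluated at x = 2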

4. PyTorch dynamic computation graphs
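
The computation graph (DAG) is rebuilt from scratch on every forward pass, which is what makes it dynamic: ordinary Python control flow can change the graph's structure from one iteration to the next. A minimal sketch:

import torch

x = torch.tensor(3., requires_grad=True)
for step in range(2):
    y = x * 2 if step == 0 else x ** 2   # a different graph is built on each pass
    y.backward()
    print(x.grad)                        # tensor(2.), then tensor(8.) because dy/dx = 2x = 6 is accumulated on top
    # .grad accumulates across passes; call x.grad.zero_() here to reset it if needed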

 

5. Excluding operations from autograd:

 

(1) autograd tracks operations on every tensor whose requires_grad flag is set to True; setting requires_grad to False excludes a tensor from the gradient-computation DAG. When many tensors do not need gradients, this reduces memory consumption.

## 5. Exclusion from the DAG

x = torch.rand(5, 5)                     # requires_grad defaults to False
y = torch.rand(5, 5)
z = torch.rand((5, 5), requires_grad=True)

# The output of an operation requires gradients if at least one of its inputs does
a = x + y
print(f"Does 'a' require gradients?: {a.requires_grad}")
b = x + z
print(f"Does 'b' require gradients?: {b.requires_grad}")

# Output
Does 'a' require gradients?: False
Does 'b' require gradients?: True

(2) If you use an off-the-shelf pre-trained model and only want to retrain the output layer, you can set requires_grad = False on all other parameters; during backpropagation, the parameters of those frozen layers are not updated.

# Frozen parameters
# (used for finetuning a pretrained network)

# Compare torch.no_grad():
## a context manager that disables gradient calculation;
## it is thread-local and does not affect computation in other threads
import torchvision
from torch import nn, optim

model = torchvision.models.resnet18(pretrained=True)

# Freeze all the parameters in the network
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier; the new layer's parameters require gradients by default
model.fc = nn.Linear(512, 10)

# Optimize only the classifier: every parameter is registered with the optimizer,
# but only model.fc gets gradients, so only it is updated by .step()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
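
Since the comments above compare this approach with torch.no_grad(), here is a minimal sketch of how that context manager is typically used at inference time, where tracking gradients would only waste memory:

# Operations inside the block are not tracked by autograd,
# so no graph is built and no gradients are stored.
model.eval()
with torch.no_grad():
    test_input = torch.rand(1, 3, 64, 64)
    output = model(test_input)
print(output.requires_grad)   # False: the output is detached from any graph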

 

 
