This series aims to get familiar with how CNNs are implemented, and with the overall training workflow, by reading the official PyTorch example code.
【Learning the PyTorch Official Docs, Part 6】torch.optim
- This post adds detailed annotations and personal notes to the official tutorial PyTorch: optim; feedback and discussion are welcome.
- The drawback of manually updating learnable parameters
The previous posts in this series updated the model's weights by manually mutating the tensors that hold the learnable parameters, either inside torch.no_grad() or through .data (a minimal sketch of that pattern follows this paragraph). This is workable for a simple optimization algorithm such as stochastic gradient descent (SGD), but it becomes unwieldy for the more sophisticated optimizers common in practice, such as Adam, AdaGrad, and RMSProp, which keep extra per-parameter state.
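As a reminder, here is a minimal sketch of that manual pattern, assuming a hand-managed two-layer network; the tensor names w1 and w2 and the dimensions are illustrative, not from this tutorial:

```python
import torch

x = torch.randn(64, 1000)                        # inputs
y = torch.randn(64, 10)                          # targets
w1 = torch.randn(1000, 100, requires_grad=True)  # hand-managed weights
w2 = torch.randn(100, 10, requires_grad=True)
learning_rate = 1e-6

y_pred = x.mm(w1).clamp(min=0).mm(w2)            # forward pass
loss = (y_pred - y).pow(2).sum()                 # sum-of-squares loss
loss.backward()                                  # fills w1.grad and w2.grad

# One hand-written SGD step, hidden from autograd:
with torch.no_grad():
    w1 -= learning_rate * w1.grad
    w2 -= learning_rate * w2.grad
    w1.grad.zero_()                              # reset for the next iteration
    w2.grad.zero_()
```

Replicating this by hand for Adam would also mean maintaining running first- and second-moment estimates for every parameter; that bookkeeping is exactly what torch.optim takes off our hands.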
The torch.optim package was created to solve this problem.
- optim
The torch.optim package implements the commonly used optimization algorithms behind a uniform interface; a short sketch of constructing several of them follows.
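Each optimizer is constructed from an iterable of parameters followed by algorithm-specific hyperparameters. In this sketch the module and the learning rates are illustrative stand-ins, not tuned values:

```python
import torch

model = torch.nn.Linear(10, 2)  # any nn.Module works; Linear is just a stand-in

# All constructors follow the same pattern: parameters first, then hyperparameters.
sgd     = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
adagrad = torch.optim.Adagrad(model.parameters(), lr=0.01)
rmsprop = torch.optim.RMSprop(model.parameters(), lr=0.01)
adam    = torch.optim.Adam(model.parameters(), lr=1e-4)
```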
- Example
The following example defines a neural network model with the torch.nn package and optimizes it with the Adam algorithm provided by torch.optim.
```python
# -*- coding: utf-8 -*-
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs.
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers, and the
# loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many
# other optimization algorithms. The first argument to the Adam constructor
# tells the optimizer which Tensors it should update; model.parameters()
# conveniently yields all of the model's learnable parameters (see the sketch
# after this code block).
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called; see the docs of torch.autograd.backward, and the sketch
    # after this code block. Note that this call replaces the
    # model.zero_grad() used in the previous post of this series.
    optimizer.zero_grad()

    # Backward pass: compute the gradient of the loss with respect to the
    # model parameters.
    loss.backward()

    # Calling step() on an Optimizer performs one update of the parameters
    # it manages (here, the model's weights).
    optimizer.step()
```
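To answer the question left open in the comment above: model.parameters() returns an iterator over every learnable tensor registered in the model, so passing it to the optimizer hands over exactly the weights and biases of both Linear layers. A quick sketch, with shapes matching the Sequential model defined above:

```python
# Each entry is one learnable tensor of the Sequential model above.
for p in model.parameters():
    print(p.shape, p.requires_grad)
# torch.Size([100, 1000]) True   <- weight of Linear(D_in, H)
# torch.Size([100]) True         <- bias of Linear(D_in, H)
# torch.Size([10, 100]) True     <- weight of Linear(H, D_out)
# torch.Size([10]) True          <- bias of Linear(H, D_out)
```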
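And to see why optimizer.zero_grad() is needed on every iteration, this minimal, self-contained sketch shows that .backward() accumulates into .grad rather than overwriting it:

```python
import torch

w = torch.ones(1, requires_grad=True)
(2 * w).backward()
print(w.grad)  # tensor([2.])
(2 * w).backward()
print(w.grad)  # tensor([4.]) -- accumulated, not overwritten
```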