This series aims to get familiar with how CNNs are implemented, and with the overall training workflow, by reading the official PyTorch example code.
【Learning the PyTorch Official Docs, Part 6】torch.optim
- This post adds detailed annotations and personal notes to the official tutorial PyTorch: optim; feedback and discussion are welcome.
- The drawback of manually updating learnable parameters
The previous posts in this series updated the model's weights by manually mutating the tensors that hold the learnable parameters, either inside torch.no_grad() or through .data (a minimal sketch of that pattern follows this paragraph). This is workable for a simple optimization algorithm such as stochastic gradient descent (SGD), but it becomes unwieldy for the more sophisticated optimizers common in practice, such as Adam, AdaGrad, and RMSProp, which keep extra per-parameter state.
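As a reminder, here is a minimal sketch of that manual pattern, assuming a hand-managed two-layer network; the tensor names w1 and w2 and the dimensions are illustrative, not from this tutorial:

```python
import torch

x = torch.randn(64, 1000)                        # inputs
y = torch.randn(64, 10)                          # targets
w1 = torch.randn(1000, 100, requires_grad=True)  # hand-managed weights
w2 = torch.randn(100, 10, requires_grad=True)
learning_rate = 1e-6

y_pred = x.mm(w1).clamp(min=0).mm(w2)            # forward pass
loss = (y_pred - y).pow(2).sum()                 # sum-of-squares loss
loss.backward()                                  # fills w1.grad and w2.grad

# One hand-written SGD step, hidden from autograd:
with torch.no_grad():
    w1 -= learning_rate * w1.grad
    w2 -= learning_rate * w2.grad
    w1.grad.zero_()                              # reset for the next iteration
    w2.grad.zero_()
```

Replicating this by hand for Adam would also mean maintaining running first- and second-moment estimates for every parameter; that bookkeeping is exactly what torch.optim takes off our hands.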
The torch.optim package was created to solve this problem.
- optim
The torch.optim package implements the commonly used optimization algorithms behind a uniform interface; a short sketch of constructing several of them follows.
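Each optimizer is constructed from an iterable of parameters followed by algorithm-specific hyperparameters. In this sketch the module and the learning rates are illustrative stand-ins, not tuned values:

```python
import torch

model = torch.nn.Linear(10, 2)  # any nn.Module works; Linear is just a stand-in

# All constructors follow the same pattern: parameters first, then hyperparameters.
sgd     = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
adagrad = torch.optim.Adagrad(model.parameters(), lr=0.01)
rmsprop = torch.optim.RMSprop(model.parameters(), lr=0.01)
adam    = torch.optim.Adam(model.parameters(), lr=1e-4)
```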
- Example
The following example defines a neural network model with the torch.nn package and optimizes it with the Adam algorithm provided by torch.optim.
```python
# -*- coding: utf-8 -*-
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs.
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers, and the
# loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many
# other optimization algorithms. The first argument to the Adam constructor
# tells the optimizer which Tensors it should update; model.parameters()
# conveniently yields all of the model's learnable parameters (see the sketch
# after this code block).
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called; see the docs of torch.autograd.backward, and the sketch
    # after this code block. Note that this call replaces the
    # model.zero_grad() used in the previous post of this series.
    optimizer.zero_grad()

    # Backward pass: compute the gradient of the loss with respect to the
    # model parameters.
    loss.backward()

    # Calling step() on an Optimizer performs one update of the parameters
    # it manages (here, the model's weights).
    optimizer.step()
```
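To answer the question left open in the comment above: model.parameters() returns an iterator over every learnable tensor registered in the model, so passing it to the optimizer hands over exactly the weights and biases of both Linear layers. A quick sketch, with shapes matching the Sequential model defined above:

```python
# Each entry is one learnable tensor of the Sequential model above.
for p in model.parameters():
    print(p.shape, p.requires_grad)
# torch.Size([100, 1000]) True   <- weight of Linear(D_in, H)
# torch.Size([100]) True         <- bias of Linear(D_in, H)
# torch.Size([10, 100]) True     <- weight of Linear(H, D_out)
# torch.Size([10]) True          <- bias of Linear(H, D_out)
```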
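And to see why optimizer.zero_grad() is needed on every iteration, this minimal, self-contained sketch shows that .backward() accumulates into .grad rather than overwriting it:

```python
import torch

w = torch.ones(1, requires_grad=True)
(2 * w).backward()
print(w.grad)  # tensor([2.])
(2 * w).backward()
print(w.grad)  # tensor([4.]) -- accumulated, not overwritten
```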