11 - Using Optimizers

torch.optim

How to use an optimizer

To use torch.optim you have to construct an optimizer object that will hold the current state and will update the parameters based on the computed gradients.

Constructing it

To construct an Optimizer you have to give it an iterable containing the parameters (all should be Parameter s) to optimize. Then, you can specify optimizer-specific options such as the learning rate, weight decay, etc.

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.Adam([var1, var2], lr=0.0001)

Taking an optimization step

All optimizers implement a step() method that updates the parameters. It can be used in two ways:

optimizer.step()

This is a simplified version supported by most optimizers. The function can be called once the gradients are computed using e.g. backward().

for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

optimizer.step(closure)

Some optimization algorithms such as Conjugate Gradient and LBFGS need to reevaluate the function multiple times, so you have to pass in a closure that allows them to recompute your model. The closure should clear the gradients, compute the loss, and return it.

for input, target in dataset:
    def closure():
        optimizer.zero_grad()  # reset the gradients from the previous step
        output = model(input)
        loss = loss_fn(output, target)
        loss.backward()
        return loss
    optimizer.step(closure)
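
As a concrete pairing, here is a minimal sketch of the closure pattern with LBFGS. The tiny Linear model, the random data, and the lr/max_iter values are illustrative assumptions, not part of the original example.

import torch
import torch.optim as optim

model = torch.nn.Linear(10, 1)          # placeholder model for illustration
inputs = torch.randn(32, 10)            # random stand-in data
targets = torch.randn(32, 1)
loss_fn = torch.nn.MSELoss()

optimizer = optim.LBFGS(model.parameters(), lr=0.1, max_iter=20)

def closure():
    optimizer.zero_grad()               # clear gradients before re-evaluating
    loss = loss_fn(model(inputs), targets)
    loss.backward()                     # compute fresh gradients
    return loss

loss = optimizer.step(closure)          # LBFGS calls the closure internally, possibly several times
print(loss.item())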

Base class

CLASS torch.optim.Optimizer(params, defaults)

Base class for all optimizers.

Optimizer.add_param_group: Add a param group to the Optimizer's param_groups.
Optimizer.load_state_dict: Loads the optimizer state.
Optimizer.state_dict: Returns the state of the optimizer as a dict.
Optimizer.step: Performs a single optimization step (parameter update).
Optimizer.zero_grad: Resets the gradients of all optimized torch.Tensor s.
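
As a brief sketch of how these methods are typically combined, the snippet below saves and restores optimizer state and attaches an extra parameter group; the placeholder model, the extra layer, and the file name checkpoint.pth are illustrative assumptions.

import torch
import torch.optim as optim

model = torch.nn.Linear(10, 2)                      # placeholder model for illustration
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Save model and optimizer state together (checkpoint.pth is an arbitrary name).
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict()}, "checkpoint.pth")

# Later: restore both before resuming training.
checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])

# add_param_group attaches newly created parameters (e.g. a new layer) to an existing optimizer.
extra_layer = torch.nn.Linear(2, 2)
optimizer.add_param_group({"params": extra_layer.parameters(), "lr": 0.001})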

Algorithms

Adadelta: Implements Adadelta algorithm.
Adagrad: Implements Adagrad algorithm.
Adam: Implements Adam algorithm.
AdamW: Implements AdamW algorithm.
SparseAdam: Implements a masked version of the Adam algorithm suitable for sparse gradients.
Adamax: Implements Adamax algorithm (a variant of Adam based on infinity norm).
ASGD: Implements Averaged Stochastic Gradient Descent.
LBFGS: Implements L-BFGS algorithm, heavily inspired by minFunc.
NAdam: Implements NAdam algorithm.
RAdam: Implements RAdam algorithm.
RMSprop: Implements RMSprop algorithm.
Rprop: Implements the resilient backpropagation algorithm.
SGD: Implements stochastic gradient descent (optionally with momentum).
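
All of the algorithms above share the construction interface shown earlier, so switching between them usually means changing only the constructor call. A small sketch; the placeholder model and the hyperparameter values are illustrative, not recommendations.

import torch
import torch.optim as optim

model = torch.nn.Linear(10, 2)  # placeholder model for illustration

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.Adam(model.parameters(), lr=0.001)
optimizer = optim.RMSprop(model.parameters(), lr=0.01)
optimizer = optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.01)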

ADADELTA

CLASS torch.optim.Adadelta(params, lr=1.0, rho=0.9, eps=1e-06, weight_decay=0, foreach=None, *, maximize=False, differentiable=False)

For a beginner, it is enough to set params and lr; the other arguments can be left at their defaults and studied later when needed.
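
A minimal construction sketch following that advice; the placeholder model is an illustrative assumption, and lr=1.0 matches the default shown in the signature above.

import torch

model = torch.nn.Linear(10, 2)  # placeholder model for illustration

# Only params and lr are set; rho, eps, weight_decay, etc. keep their defaults.
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0)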

Example

import torch
import torchvision
from torch import nn
from torch.nn import Sequential
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10(root="./dataset", train=True, download=False,
                                       transform=torchvision.transforms.ToTensor())  # set download=True on the first run
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

class XiaoMo(nn.Module):
    def __init__(self):
        super(XiaoMo, self).__init__()

        self.model1 = Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)

        return x

loss = nn.CrossEntropyLoss()

xiaomo = XiaoMo()

optim = torch.optim.SGD(xiaomo.parameters(), lr=0.01)  # create the optimizer

for epoch in range(20):  # repeat for 20 epochs
    loss_running = 0.0
    for imgs, target in dataloader:  # one pass over the data
        outputs = xiaomo(imgs)
        loss_res = loss(outputs, target)

        optim.zero_grad()    # clear gradients from the previous step
        loss_res.backward()  # compute gradients

        optim.step()         # update parameters
        loss_running += loss_res.item()  # .item() extracts the scalar so no autograd graph is retained

    print(loss_running)