1. Application
import torch
import torch.nn as nn

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

optimizer.zero_grad()                     # clear gradients accumulated from the previous step
loss_fn(model(input), target).backward()  # forward pass + gradient computation
optimizer.step()                          # apply the SGD update to the parameters
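The snippet assumes model, input, target, and loss_fn are already defined elsewhere. For a runnable sketch, hypothetical definitions (my assumption, not part of the original notes) could be:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)        # toy stand-in model (assumption)
input = torch.randn(8, 4)      # random batch of 8 samples
target = torch.randn(8, 2)     # random regression targets
loss_fn = nn.MSELoss()         # mean-squared-error loss

With these in place, the last three lines above perform one complete SGD step.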
Concept
The simplest update rule is Stochastic Gradient Descent (SGD):
weight = weight - learning_rate * gradient
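As a quick sanity check (a sketch added here, not from the original notes), the rule can be verified by hand on a single parameter:

import torch

w = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()   # d(loss)/dw = 2w = 2.0
loss.backward()
opt.step()              # w <- 1.0 - 0.1 * 2.0 = 0.8
print(w.item())         # 0.8 (up to float rounding)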
Manual implementation
learning_rate = 0.01
for f in net.parameters():                    # iterate over every learnable parameter in the network
    f.data.sub_(f.grad.data * learning_rate)  # parameter -= learning_rate * gradient; the trailing underscore marks an in-place op
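In recent PyTorch versions, mutating .data is discouraged because it sidesteps autograd's safety checks; a sketch of the modern idiom, wrapping the same update in torch.no_grad() (assuming the same net and learning_rate as above):

with torch.no_grad():                 # suspend gradient tracking during the update
    for f in net.parameters():
        f -= f.grad * learning_rate   # same in-place subtraction, without touching .data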
PyTorch already implements SGD and a range of other update rules in the torch.optim package:
import torch.optim as optim
# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)
# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step() # Does the update
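zero_grad() matters because backward() accumulates into each parameter's .grad rather than overwriting it; a small demonstration with a standalone tensor (an illustration I'm adding, not part of the original notes):

import torch

w = torch.tensor([1.0], requires_grad=True)
(w * 3).sum().backward()
print(w.grad)            # tensor([3.])
(w * 3).sum().backward() # without zero_grad(), gradients accumulate
print(w.grad)            # tensor([6.])
w.grad.zero_()           # this is what optimizer.zero_grad() does for every parameter
print(w.grad)            # tensor([0.])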
API
1. Class
Implements stochastic gradient descent (optionally with momentum).
CLASS torch.optim.SGD(params, lr=<required parameter>, momentum=0, dampening=0, weight_decay=0, nesterov=False)
Parameter | Description |
---|---|
params (iterable) | iterable of parameters to optimize or dicts defining parameter groups |
lr (float) | learning rate |
momentum (float, optional) | momentum factor (default: 0) |
weight_decay (float, optional) | weight decay (L2 penalty) (default: 0) |
dampening (float, optional) | dampening for momentum (default: 0) |
nesterov (bool, optional) | enables Nesterov momentum (default: False) |
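The params argument also accepts dicts defining parameter groups, so different parts of a model can use different hyperparameters; a sketch using a hypothetical two-layer model (my assumption for illustration):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))

optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters()},             # uses the default lr given below
        {"params": model[1].parameters(), "lr": 0.01}, # per-group override
    ],
    lr=0.1, momentum=0.9,
)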
Methods
Method | Description |
---|---|
step(closure=None) | Performs a single optimization step. |
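step() also accepts an optional closure that re-evaluates the model and returns the loss; SGD does not need it, but the same pattern is required by optimizers such as LBFGS. A sketch reusing the hypothetical model, input, target, loss_fn, and optimizer names from the application section:

def closure():
    optimizer.zero_grad()                 # clear stale gradients
    loss = loss_fn(model(input), target)  # forward pass
    loss.backward()                       # compute gradients
    return loss

loss = optimizer.step(closure)            # step() calls the closure and returns its loss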
References:
https://pytorch.org/docs/stable/optim.html?highlight=sgd#torch.optim.SGD