CLASS torch.optim.SGD(params, lr=, momentum=0, dampening=0, weight_decay=0, nesterov=False)
实现随机梯度下降(可选地具有动量)。
Nesterov势头是基于关于初始化和动力学在深度学习中的重要性的公式 。
Parameters:
params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
lr (float) – learning rate
momentum (float, optional) – momentum factor (default: 0)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
dampening (float, optional) – dampening for momentum (default: 0)
nesterov (bool, optional) – enables Nesterov momentum (default: False)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
optimizer.zero_grad()
loss_fn(model(input), target).backward()
optimizer.step()