Today I came across a paper that requires the learning rate during training to change as follows:
$$lr = \begin{cases} 1e-2, & epoch \le 100 \\ 1e-2 \times (1e-3)^{\alpha}, & 100 < epoch \le 400 \\ 1e-5, & epoch > 400 \end{cases}$$
Here, $\alpha$ is an exponent that increases linearly from 0 to 1. This way, the learning rate decays exponentially between the 100th and the 400th epoch, down to $1e-5$.
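Before wiring this into PyTorch, it may help to spell the schedule out as a plain Python function. The helper lr_at below is purely hypothetical, just a sketch for sanity-checking values against the formula:

def lr_at(epoch: int) -> float:
    # alpha grows linearly from 0 (at epoch 100) to 1 (at epoch 400),
    # clamped to [0, 1] outside that range
    alpha = max(min((epoch - 100) / 300, 1.0), 0.0)
    return 1e-2 * (1e-3) ** alpha

print(lr_at(100))  # 0.01
print(lr_at(250))  # 1e-2 * (1e-3)**0.5 ≈ 0.000316
print(lr_at(400))  # 1e-05 (up to floating-point rounding)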
This rule can be implemented with torch.optim.lr_scheduler.LambdaLR, which sets the learning rate to the optimizer's initial lr multiplied by whatever the given lambda returns for the current epoch. In the following example, we define an arbitrary model to test it.
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Sequential(torch.nn.Linear(3, 1))
# Base lr is 1, so the value returned by lr_lambda becomes the actual learning rate.
optimizer = torch.optim.Adam(model.parameters(), lr=1)
# The exponent (epoch - 100) / 300 is clamped to [0, 1], reproducing the piecewise schedule.
scheduler = LambdaLR(optimizer, lr_lambda=lambda epoch: 1e-2 * 1e-3 ** max(min((epoch - 100) / 300, 1), 0))

loss_fun = torch.nn.MSELoss()
for epoch in range(500):
    # Dummy forward/backward pass on random data, just to drive the loop.
    loss = loss_fun(model(torch.rand((32, 3))), torch.rand((32, 1)))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the schedule once per epoch, after optimizer.step()
    if (epoch + 1) % 50 == 0:
        print(f"epoch: {epoch + 1}, lr: {optimizer.param_groups[0]['lr']:.6f}")
Running it gives the output below. As a sanity check, at epoch 150 we have $\alpha = 50/300$, so $lr = 1e-2 \times (1e-3)^{1/6} = 1e-2 \times 10^{-0.5} \approx 0.003162$, which matches the printed value:
epoch: 50, lr: 0.010000
epoch: 100, lr: 0.010000
epoch: 150, lr: 0.003162
epoch: 200, lr: 0.001000
epoch: 250, lr: 0.000316
epoch: 300, lr: 0.000100
epoch: 350, lr: 0.000032
epoch: 400, lr: 0.000010
epoch: 450, lr: 0.000010
epoch: 500, lr: 0.000010
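As a side note, the more conventional way to use LambdaLR is to give the optimizer the real initial learning rate and have the lambda return a multiplicative factor, since LambdaLR multiplies the initial lr by the lambda's result. A minimal sketch of that equivalent setup for the same schedule:

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)  # real base lr
# The lambda now returns a factor that decays from 1 to 1e-3; lr = 1e-2 * factor.
scheduler = LambdaLR(optimizer, lr_lambda=lambda epoch: 1e-3 ** max(min((epoch - 100) / 300, 1), 0))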