This code implements a cosine-annealing learning-rate scheduler with a warmup phase. Its purpose is to adjust the learning rate dynamically during training: the rate first increases linearly during warmup and then decays along a cosine curve. This design helps the model learn stably in the early stages of training and gradually lowers the learning rate as training progresses, reducing the risk of overfitting in the later stages.
# Imported elsewhere in the codebase as:
#   from lrgb.cosine_scheduler import cosine_with_warmup_scheduler
import math

from torch.optim import Optimizer
from torch.optim.lr_scheduler import LambdaLR
def cosine_with_warmup_scheduler(optimizer: Optimizer,
                                 num_warmup_epochs: int, max_epoch: int):
    """Build a cosine schedule with linear warmup, parameterized in epochs.

    Warmup length and total length are given in epochs, so the returned
    scheduler is meant to be stepped once per epoch.
    """
    scheduler = get_cosine_schedule_with_warmup(
        optimizer=optimizer,
        num_warmup_steps=num_warmup_epochs,
        num_training_steps=max_epoch
    )
    return scheduler
def get_cosine_schedule_with_warmup(
        optimizer: Optimizer, num_warmup_steps: int, num_training_steps: int,
        num_cycles: float = 0.5, last_epoch: int = -1):
    """
    Implementation by Huggingface:
    https://github.com/huggingface/transformers/blob/v4.16.2/src/transformers/optimization.py

    Create a schedule with a learning rate that decreases following the values
    of the cosine function between the initial lr set in the optimizer to 0,
    after a warmup period during which it increases linearly between 0 and the
    initial lr set in the optimizer.

    Args:
        optimizer ([`~torch.optim.Optimizer`]):
            The optimizer for which to schedule the learning rate.
        num_warmup_steps (`int`):
            The number of steps for the warmup phase.
        num_training_steps (`int`):
            The total number of training steps.
        num_cycles (`float`, *optional*, defaults to 0.5):
            The number of waves in the cosine schedule (the default is to just
            decrease from the max value to 0 following a half-cosine).
        last_epoch (`int`, *optional*, defaults to -1):
            The index of the last epoch when resuming training.

    Return:
        `torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule.
    """
    # Body follows the Huggingface implementation referenced in the docstring.
    def lr_lambda(current_step):
        # Linear warmup: scale the lr from 0 up to the base lr.
        if current_step < num_warmup_steps:
            return float(current_step) / float(max(1, num_warmup_steps))
        # Cosine decay: scale the lr from the base lr down to 0.
        progress = float(current_step - num_warmup_steps) / float(
            max(1, num_training_steps - num_warmup_steps))
        return max(0.0, 0.5 * (1.0 + math.cos(math.pi * float(num_cycles) * 2.0 * progress)))

    return LambdaLR(optimizer, lr_lambda, last_epoch)