lr_updater.py (learning rate updating)
https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py
1. LrUpdaterHook
LrUpdaterHook also provides a warmup feature, supporting three warmup modes: constant, linear, and exp.
What warmup does
- Reduces training instability: early in training, an overly high learning rate can cause drastic parameter updates and destabilize the process. Warmup ramps the learning rate up gradually, letting the model ease into training and reducing this instability.
- Improves convergence speed: at the start of training, the parameters have usually not yet settled into a good region. Gradually increasing the learning rate lets the model explore the parameter space more effectively and reach convergence faster.
- Improves final performance: a suitable warmup schedule can steer the model toward a better local optimum during training, improving the final model's performance.
How to implement warmup
- Set up the warmup phase:
  - Duration: choose a warmup window, typically the first few epochs or the first few training steps.
  - Growth policy: decide how the learning rate grows during warmup, e.g. linearly or exponentially.
- Implement warmup (see the sketch after this list):
  - Linear growth: start from a small learning rate and increase it linearly up to the preset learning rate over the warmup phase.
  - Exponential growth: start from a small learning rate and increase it exponentially up to the preset learning rate over the warmup phase.
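As a concrete sketch, the function below reproduces the three warmup curves. The formulas mirror what LrUpdaterHook.get_warmup_lr computes in MMCV (scaling the regular LR by a factor that ramps from warmup_ratio up to 1), but the standalone function warmup_lr itself is illustrative, not part of the library.

def warmup_lr(regular_lr: float, cur_iter: int, warmup_iters: int,
              warmup_ratio: float, mode: str) -> float:
    """Scale the regular LR during warmup; the scale factor ramps from
    warmup_ratio (at iteration 0) to 1.0 (at warmup_iters)."""
    if mode == 'constant':
        # A fixed fraction of the regular LR for the whole warmup phase.
        return regular_lr * warmup_ratio
    progress = cur_iter / warmup_iters
    if mode == 'linear':
        # Linearly interpolate the scale factor from warmup_ratio to 1.
        k = (1 - progress) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    if mode == 'exp':
        # Exponential ramp: warmup_ratio**1 -> warmup_ratio**0 == 1.
        return regular_lr * warmup_ratio**(1 - progress)
    raise ValueError(f'unknown warmup mode: {mode}')

# With regular_lr=0.1, warmup_ratio=0.1, warmup_iters=100, at iter 50:
print(warmup_lr(0.1, 50, 100, 0.1, 'linear'))  # 0.055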
from typing import Optional, Union

from .hook import Hook  # relative import, as in mmcv/runner/hooks/


class LrUpdaterHook(Hook):
    """LR Scheduler in MMCV.

    Args:
        by_epoch (bool): LR changes epoch by epoch.
        warmup (string): Type of warmup used. It can be None (use no
            warmup), 'constant', 'linear' or 'exp'.
        warmup_iters (int): The number of iterations or epochs that warmup
            lasts.
        warmup_ratio (float): LR used at the beginning of warmup equals to
            warmup_ratio * initial_lr.
        warmup_by_epoch (bool): When warmup_by_epoch == True, warmup_iters
            means the number of epochs that warmup lasts, otherwise means
            the number of iterations that warmup lasts.
    """

    def __init__(self,
                 by_epoch: bool = True,
                 warmup: Optional[str] = None,
                 warmup_iters: int = 0,
                 warmup_ratio: float = 0.1,
                 warmup_by_epoch: bool = False) -> None:
        # validate the "warmup" argument
        if warmup is not None:
            if warmup not in ['constant', 'linear', 'exp']:
                raise ValueError(
                    f'"{warmup}" is not a supported type for warming up, '
                    'valid types are "constant", "linear" and "exp"')
        if warmup is not None:
            assert warmup_iters > 0, \
                '"warmup_iters" must be a positive integer'
            assert 0 < warmup_ratio <= 1.0, \
                '"warmup_ratio" must be in range (0,1]'

        self.by_epoch = by_epoch
        self.warmup = warmup
        self.warmup_iters: Optional[int] = warmup_iters
        self.warmup_ratio = warmup_ratio
        self.warmup_by_epoch = warmup_by_epoch

        if self.warmup_by_epoch:
            # warmup_iters was given in epochs; it will be converted to
            # iterations once the epoch length is known.
            self.warmup_epochs: Optional[int] = self.warmup_iters
            self.warmup_iters = None
        else:
            self.warmup_epochs = None

        self.base_lr: Union[list, dict] = []  # initial lr for all param groups
        self.regular_lr: list = []  # expected lr if no warming up is performed
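In practice these hooks are rarely instantiated by hand; they are built from the lr_config entry of a config file. Below is a representative lr_config in the MMDetection style (a sketch; the values are illustrative):

# Step decay with 500 iterations of linear warmup starting
# at 0.001x the initial LR.
lr_config = dict(
    policy='step',       # resolved to StepLrUpdaterHook
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])        # decay the LR at epoch 8 and epoch 11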
2. StepLrUpdaterHook (single-step and multi-step LR scheduling)
class StepLrUpdaterHook(LrUpdaterHook):
    """Step LR scheduler with min_lr clipping.

    Args:
        step (int | list[int]): Step to decay the LR. If an int value is
            given, regard it as the decay interval. If a list is given,
            decay LR at these steps.
        gamma (float): Decay LR ratio. Defaults to 0.1.
        min_lr (float, optional): Minimum LR value to keep. If LR after
            decay is lower than `min_lr`, it will be clipped to this value.
            If None is given, we don't perform lr clipping. Default: None.
    """
3. CosineAnnealingLrUpdaterHook (cosine annealing LR)
class CosineAnnealingLrUpdaterHook(LrUpdaterHook):
    """CosineAnnealing LR scheduler.

    Args:
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """
4. FlatCosineAnnealingLrUpdaterHook
class FlatCosineAnnealingLrUpdaterHook(LrUpdaterHook):
    """Flat + Cosine lr schedule.

    Modified from https://github.com/fastai/fastai/blob/master/fastai/callback/schedule.py#L128 # noqa: E501

    Args:
        start_percent (float): When to start annealing the learning rate
            after the percentage of the total training steps.
            The value should be in range [0, 1).
            Default: 0.75
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """
5. CosineRestartLrUpdaterHook (cosine annealing with restarts)
class CosineRestartLrUpdaterHook(LrUpdaterHook):
    """Cosine annealing with restarts learning rate scheme.

    Args:
        periods (list[int]): Periods for each cosine annealing cycle.
        restart_weights (list[float]): Restart weights at each
            restart iteration. Defaults to [1].
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """
6. CyclicLrUpdaterHook (cyclic LR)
class CyclicLrUpdaterHook(LrUpdaterHook):
    """Cyclic LR Scheduler.

    Implement the cyclical learning rate policy (CLR) described in
    https://arxiv.org/pdf/1506.01186.pdf

    Different from the original paper, we use cosine annealing rather than
    triangular policy inside a cycle. This improves the performance in the
    3D detection area.

    Args:
        by_epoch (bool, optional): Whether to update LR by epoch.
        target_ratio (tuple[float], optional): Relative ratio of the highest
            LR and the lowest LR to the initial LR.
        cyclic_times (int, optional): Number of cycles during training.
        step_ratio_up (float, optional): The ratio of the increasing process
            of LR in the total cycle.
        anneal_strategy (str, optional): {'cos', 'linear'}
            Specifies the annealing strategy: 'cos' for cosine annealing,
            'linear' for linear annealing. Default: 'cos'.
        gamma (float, optional): Cycle decay ratio. Default: 1.
            It takes values in the range (0, 1]. The difference between the
            maximum learning rate and the minimum learning rate decreases
            periodically when it is less than 1. `New in version 1.4.4.`
    """
7. OneCycleLrUpdaterHook
class OneCycleLrUpdaterHook(LrUpdaterHook):
    """One Cycle LR Scheduler.

    The 1cycle learning rate policy changes the learning rate after every
    batch. The one cycle learning rate policy is described in
    https://arxiv.org/pdf/1708.07120.pdf

    Args:
        max_lr (float or list): Upper learning rate boundaries in the cycle
            for each parameter group.
        total_steps (int, optional): The total number of steps in the cycle.
            Note that if a value is not provided here, it will be the
            max_iter of runner. Default: None.
        pct_start (float): The percentage of the cycle (in number of steps)
            spent increasing the learning rate.
            Default: 0.3
        anneal_strategy (str): {'cos', 'linear'}
            Specifies the annealing strategy: 'cos' for cosine annealing,
            'linear' for linear annealing.
            Default: 'cos'
        div_factor (float): Determines the initial learning rate via
            initial_lr = max_lr/div_factor
            Default: 25
        final_div_factor (float): Determines the minimum learning rate via
            min_lr = initial_lr/final_div_factor
            Default: 1e4
        three_phase (bool): If three_phase is True, use a third phase of the
            schedule to annihilate the learning rate according to
            final_div_factor instead of modifying the second phase (the
            first two phases will be symmetrical about the step indicated
            by pct_start).
            Default: False
    """
8. LinearAnnealingLrUpdaterHook
class LinearAnnealingLrUpdaterHook(LrUpdaterHook):
    """Linear annealing LR Scheduler decays the learning rate of each
    parameter group linearly.

    Args:
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """