lr_updater.py (learning rate updating)
https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py
1. LrUpdaterHook
LrUpdaterHook also provides a warmup feature, supporting three warmup modes: constant, linear, and exp.
What warmup does
- Reduces training instability: early in training, an overly high learning rate can cause drastic parameter updates and destabilize the process. Warmup ramps the learning rate up gradually, letting the model ease into training and reducing this instability.
- Improves convergence speed: at the start of training, the parameters have usually not yet settled into a good region. Gradually increasing the learning rate lets the model explore the parameter space more effectively and reach convergence faster.
- Improves final performance: a suitable warmup schedule can steer the model toward a better local optimum during training, improving the final model's performance.
How to implement warmup
- Set up the warmup phase:
  - Duration: choose a warmup window, typically the first few epochs or the first few training steps.
  - Growth policy: decide how the learning rate grows during warmup, e.g. linearly or exponentially.
- Implement warmup (see the sketch after this list):
  - Linear growth: start from a small learning rate and increase it linearly up to the preset learning rate over the warmup phase.
  - Exponential growth: start from a small learning rate and increase it exponentially up to the preset learning rate over the warmup phase.
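As a concrete sketch, the function below reproduces the three warmup curves. The formulas mirror what LrUpdaterHook.get_warmup_lr computes in MMCV (scaling the regular LR by a factor that ramps from warmup_ratio up to 1), but the standalone function warmup_lr itself is illustrative, not part of the library.

def warmup_lr(regular_lr: float, cur_iter: int, warmup_iters: int,
              warmup_ratio: float, mode: str) -> float:
    """Scale the regular LR during warmup; the scale factor ramps from
    warmup_ratio (at iteration 0) to 1.0 (at warmup_iters)."""
    if mode == 'constant':
        # A fixed fraction of the regular LR for the whole warmup phase.
        return regular_lr * warmup_ratio
    progress = cur_iter / warmup_iters
    if mode == 'linear':
        # Linearly interpolate the scale factor from warmup_ratio to 1.
        k = (1 - progress) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    if mode == 'exp':
        # Exponential ramp: warmup_ratio**1 -> warmup_ratio**0 == 1.
        return regular_lr * warmup_ratio**(1 - progress)
    raise ValueError(f'unknown warmup mode: {mode}')

# With regular_lr=0.1, warmup_ratio=0.1, warmup_iters=100, at iter 50:
print(warmup_lr(0.1, 50, 100, 0.1, 'linear'))  # 0.055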
from typing import Optional, Union

from .hook import Hook  # relative import, as in mmcv/runner/hooks/


class LrUpdaterHook(Hook):
    """LR Scheduler in MMCV.

    Args:
        by_epoch (bool): LR changes epoch by epoch.
        warmup (string): Type of warmup used. It can be None (use no
            warmup), 'constant', 'linear' or 'exp'.
        warmup_iters (int): The number of iterations or epochs that warmup
            lasts.
        warmup_ratio (float): LR used at the beginning of warmup equals to
            warmup_ratio * initial_lr.
        warmup_by_epoch (bool): When warmup_by_epoch == True, warmup_iters
            means the number of epochs that warmup lasts, otherwise means
            the number of iterations that warmup lasts.
    """

    def __init__(self,
                 by_epoch: bool = True,
                 warmup: Optional[str] = None,
                 warmup_iters: int = 0,
                 warmup_ratio: float = 0.1,
                 warmup_by_epoch: bool = False) -> None:
        # validate the "warmup" argument
        if warmup is not None:
            if warmup not in ['constant', 'linear', 'exp']:
                raise ValueError(
                    f'"{warmup}" is not a supported type for warming up, '
                    'valid types are "constant", "linear" and "exp"')
        if warmup is not None:
            assert warmup_iters > 0, \
                '"warmup_iters" must be a positive integer'
            assert 0 < warmup_ratio <= 1.0, \
                '"warmup_ratio" must be in range (0,1]'

        self.by_epoch = by_epoch
        self.warmup = warmup
        self.warmup_iters: Optional[int] = warmup_iters
        self.warmup_ratio = warmup_ratio
        self.warmup_by_epoch = warmup_by_epoch

        if self.warmup_by_epoch:
            # warmup_iters was given in epochs; it will be converted to
            # iterations once the epoch length is known.
            self.warmup_epochs: Optional[int] = self.warmup_iters
            self.warmup_iters = None
        else:
            self.warmup_epochs = None

        self.base_lr: Union[list, dict] = []  # initial lr for all param groups
        self.regular_lr: list = []  # expected lr if no warming up is performed
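In practice these hooks are rarely instantiated by hand; they are built from the lr_config entry of a config file. Below is a representative lr_config in the MMDetection style (a sketch; the values are illustrative):

# Step decay with 500 iterations of linear warmup starting
# at 0.001x the initial LR.
lr_config = dict(
    policy='step',       # resolved to StepLrUpdaterHook
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])        # decay the LR at epoch 8 and epoch 11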
2. StepLrUpdaterHook (single-step and multi-step LR scheduling)
class StepLrUpdaterHook(LrUpdaterHook):
    """Step LR scheduler with min_lr clipping.

    Args:
        step (int | list[int]): Step to decay the LR. If an int value is
            given, regard it as the decay interval. If a list is given,
            decay LR at these steps.
        gamma (float): Decay LR ratio. Defaults to 0.1.
        min_lr (float, optional): Minimum LR value to keep. If LR after
            decay is lower than `min_lr`, it will be clipped to this value.
            If None is given, we don't perform lr clipping. Default: None.
    """
3. CosineAnnealingLrUpdaterHook (cosine annealing LR)
class CosineAnnealingLrUpdaterHook(LrUpdaterHook):
    """CosineAnnealing LR scheduler.

    Args:
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """
4. FlatCosineAnnealingLrUpdaterHook
class FlatCosineAnnealingLrUpdaterHook(LrUpdaterHook):
    """Flat + Cosine lr schedule.

    Modified from https://github.com/fastai/fastai/blob/master/fastai/callback/schedule.py#L128 # noqa: E501

    Args:
        start_percent (float): When to start annealing the learning rate
            after the percentage of the total training steps.
            The value should be in range [0, 1).
            Default: 0.75
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """
5. CosineRestartLrUpdaterHook (cosine annealing with restarts)
class CosineRestartLrUpdaterHook(LrUpdaterHook):
    """Cosine annealing with restarts learning rate scheme.

    Args:
        periods (list[int]): Periods for each cosine annealing cycle.
        restart_weights (list[float]): Restart weights at each
            restart iteration. Defaults to [1].
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """
6. CyclicLrUpdaterHook (cyclic LR)
class CyclicLrUpdaterHook(LrUpdaterHook):
    """Cyclic LR Scheduler.

    Implement the cyclical learning rate policy (CLR) described in
    https://arxiv.org/pdf/1506.01186.pdf

    Different from the original paper, we use cosine annealing rather than
    triangular policy inside a cycle. This improves the performance in the
    3D detection area.

    Args:
        by_epoch (bool, optional): Whether to update LR by epoch.
        target_ratio (tuple[float], optional): Relative ratio of the highest
            LR and the lowest LR to the initial LR.
        cyclic_times (int, optional): Number of cycles during training.
        step_ratio_up (float, optional): The ratio of the increasing process
            of LR in the total cycle.
        anneal_strategy (str, optional): {'cos', 'linear'}
            Specifies the annealing strategy: 'cos' for cosine annealing,
            'linear' for linear annealing. Default: 'cos'.
        gamma (float, optional): Cycle decay ratio. Default: 1.
            It takes values in the range (0, 1]. The difference between the
            maximum learning rate and the minimum learning rate decreases
            periodically when it is less than 1. `New in version 1.4.4.`
    """
7. OneCycleLrUpdaterHook
class OneCycleLrUpdaterHook(LrUpdaterHook):
    """One Cycle LR Scheduler.

    The 1cycle learning rate policy changes the learning rate after every
    batch. The one cycle learning rate policy is described in
    https://arxiv.org/pdf/1708.07120.pdf

    Args:
        max_lr (float or list): Upper learning rate boundaries in the cycle
            for each parameter group.
        total_steps (int, optional): The total number of steps in the cycle.
            Note that if a value is not provided here, it will be the
            max_iter of runner. Default: None.
        pct_start (float): The percentage of the cycle (in number of steps)
            spent increasing the learning rate.
            Default: 0.3
        anneal_strategy (str): {'cos', 'linear'}
            Specifies the annealing strategy: 'cos' for cosine annealing,
            'linear' for linear annealing.
            Default: 'cos'
        div_factor (float): Determines the initial learning rate via
            initial_lr = max_lr/div_factor
            Default: 25
        final_div_factor (float): Determines the minimum learning rate via
            min_lr = initial_lr/final_div_factor
            Default: 1e4
        three_phase (bool): If three_phase is True, use a third phase of the
            schedule to annihilate the learning rate according to
            final_div_factor instead of modifying the second phase (the
            first two phases will be symmetrical about the step indicated
            by pct_start).
            Default: False
    """
8. LinearAnnealingLrUpdaterHook
class LinearAnnealingLrUpdaterHook(LrUpdaterHook):
    """Linear annealing LR Scheduler decays the learning rate of each
    parameter group linearly.

    Args:
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base
            lr. Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """