MMCV 1.6.0 Runner/Hook/LrUpdaterHook (Learning Rate Configuration Parameters and Functions)


lr_updater.py (learning rate updates): https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py

1、LrUpdaterHook

LrUpdaterHook also provides a warmup capability, supporting three warmup modes: constant, linear, and exp.

The role of warmup

  1. Reduces early training instability: at the start of training, a learning rate that is too high can make parameter updates overly drastic and destabilize training. Warmup raises the learning rate gradually, letting the model ease into training and reducing this instability.

  2. Improves convergence speed: early in training, the parameters usually have not yet settled into a good region. Gradually increasing the learning rate lets the model explore the parameter space more effectively and reach convergence faster.

  3. Improves final performance: an appropriate warmup strategy can guide the model toward a better local optimum during training, improving the final model's performance.

How to implement warmup

  1. Set up the warmup phase

    • Duration: pick the window that warmup covers, typically the first few epochs or the first few training steps.
    • Growth strategy: decide how the learning rate grows during warmup, e.g. linear or exponential growth.
  2. Implement warmup (a concrete sketch follows this list)

    • Linear growth: during warmup, start from a small learning rate and increase it linearly up to the preset learning rate.
    • Exponential growth: during warmup, start from a small learning rate and increase it exponentially up to the preset learning rate.
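As a concrete reference, here is a minimal standalone sketch of the three warmup strategies, following the formulas MMCV applies during the warmup phase (the function name warmup_lr and the sample values are illustrative):

def warmup_lr(regular_lr: float, cur_iter: int, warmup_iters: int,
              warmup_ratio: float, mode: str) -> float:
    """Compute the warmed-up LR for one parameter group (sketch)."""
    if mode == 'constant':
        # Hold the LR at warmup_ratio * regular_lr for the whole phase.
        return regular_lr * warmup_ratio
    if mode == 'linear':
        # Interpolate linearly from warmup_ratio * regular_lr to regular_lr.
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    if mode == 'exp':
        # Start at warmup_ratio * regular_lr and grow exponentially
        # to regular_lr by the end of warmup.
        k = warmup_ratio**(1 - cur_iter / warmup_iters)
        return regular_lr * k
    raise ValueError(f'unsupported warmup mode: {mode}')


# 500 linear warmup iterations starting at 0.1% of the target LR:
print(warmup_lr(0.01, 250, 500, 0.001, 'linear'))  # ~0.005, halfway up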
from typing import Optional, Union

from mmcv.runner.hooks import Hook


class LrUpdaterHook(Hook):
    """LR Scheduler in MMCV.

    Args:
        by_epoch (bool): LR changes epoch by epoch.
        warmup (string): Type of warmup used. It can be None (no warmup),
            'constant', 'linear' or 'exp'.
        warmup_iters (int): The number of iterations or epochs that warmup
            lasts.
        warmup_ratio (float): LR used at the beginning of warmup equals to
            warmup_ratio * initial_lr.
        warmup_by_epoch (bool): When warmup_by_epoch == True, warmup_iters
            means the number of epochs that warmup lasts, otherwise means
            the number of iterations that warmup lasts.
    """

    def __init__(self,
                 by_epoch: bool = True,
                 warmup: Optional[str] = None,
                 warmup_iters: int = 0,
                 warmup_ratio: float = 0.1,
                 warmup_by_epoch: bool = False) -> None:
        # validate the "warmup" argument
        if warmup is not None:
            if warmup not in ['constant', 'linear', 'exp']:
                raise ValueError(
                    f'"{warmup}" is not a supported type for warming up, valid'
                    ' types are "constant", "linear" and "exp"')
        if warmup is not None:
            assert warmup_iters > 0, \
                '"warmup_iters" must be a positive integer'
            assert 0 < warmup_ratio <= 1.0, \
                '"warmup_ratio" must be in range (0,1]'

        self.by_epoch = by_epoch
        self.warmup = warmup
        self.warmup_iters: Optional[int] = warmup_iters
        self.warmup_ratio = warmup_ratio
        self.warmup_by_epoch = warmup_by_epoch

        if self.warmup_by_epoch:
            self.warmup_epochs: Optional[int] = self.warmup_iters
            self.warmup_iters = None
        else:
            self.warmup_epochs = None

        self.base_lr: Union[list, dict] = []  # initial lr for all param groups
        self.regular_lr: list = []  # expected lr if no warming up is performed
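In practice these hooks are usually not instantiated by hand: the runner builds them from the lr_config dict of a config file, where policy selects the LrUpdaterHook subclass and the warmup_* keys map to the constructor arguments above. A typical schedule, in the style of MMDetection's 1x config (values are illustrative), looks like this:

lr_config = dict(
    policy='step',       # 'step' -> StepLrUpdaterHook
    warmup='linear',     # warmup type: 'constant', 'linear' or 'exp'
    warmup_iters=500,    # warmup lasts 500 iterations
    warmup_ratio=0.001,  # warmup starts at 0.1% of the initial LR
    step=[8, 11])        # decay the LR at epochs 8 and 11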

2、StepLrUpdaterHook (step and multi-step LR schedules)

class StepLrUpdaterHook(LrUpdaterHook):
    """Step LR scheduler with min_lr clipping.

    Args:
        step (int | list[int]): Step to decay the LR. If an int value is
            given, regard it as the decay interval. If a list is given,
            decay LR at these steps.
        gamma (float): Decay LR ratio. Defaults to 0.1.
        min_lr (float, optional): Minimum LR value to keep. If LR after
            decay is lower than `min_lr`, it will be clipped to this value.
            If None is given, we don't perform lr clipping. Default: None.
    """

3、CosineAnnealingLrUpdaterHook (cosine annealing LR)

class CosineAnnealingLrUpdaterHook(LrUpdaterHook):
    """CosineAnnealing LR scheduler.

    Args:
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base lr.
            Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """

4、FlatCosineAnnealingLrUpdaterHook

class FlatCosineAnnealingLrUpdaterHook(LrUpdaterHook):
    """Flat + Cosine lr schedule.

    Modified from https://github.com/fastai/fastai/blob/master/fastai/callback/schedule.py#L128 # noqa: E501

    Args:
        start_percent (float): When to start annealing the learning rate,
            as a percentage of the total training steps.
            The value should be in range [0, 1).
            Default: 0.75
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base lr.
            Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """

5、CosineRestartLrUpdaterHook (cosine annealing with restarts)

class CosineRestartLrUpdaterHook(LrUpdaterHook):
    """Cosine annealing with restarts learning rate scheme.

    Args:
        periods (list[int]): Periods for each cosine annealing cycle.
        restart_weights (list[float]): Restart weights at each
            restart iteration. Defaults to [1].
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base lr.
            Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """

6、CyclicLrUpdaterHook (cyclic LR)

class CyclicLrUpdaterHook(LrUpdaterHook):
    """Cyclic LR Scheduler.

    Implement the cyclical learning rate policy (CLR) described in
    https://arxiv.org/pdf/1506.01186.pdf

    Different from the original paper, we use cosine annealing rather than
    triangular policy inside a cycle. This improves the performance in the
    3D detection area.

    Args:
        by_epoch (bool, optional): Whether to update LR by epoch.
        target_ratio (tuple[float], optional): Relative ratio of the highest LR
            and the lowest LR to the initial LR.
        cyclic_times (int, optional): Number of cycles during training.
        step_ratio_up (float, optional): The ratio of the increasing process of
            LR in the total cycle.
        anneal_strategy (str, optional): {'cos', 'linear'}
            Specifies the annealing strategy: 'cos' for cosine annealing,
            'linear' for linear annealing. Default: 'cos'.
        gamma (float, optional): Cycle decay ratio. Default: 1.
            It takes values in the range (0, 1]. The difference between the
            maximum learning rate and the minimum learning rate decreases
            periodically when it is less than 1. `New in version 1.4.4.`
    """

7、OneCycleLrUpdaterHook

class OneCycleLrUpdaterHook(LrUpdaterHook):
    """One Cycle LR Scheduler.

    The 1cycle learning rate policy changes the learning rate after every
    batch. The one cycle learning rate policy is described in
    https://arxiv.org/pdf/1708.07120.pdf

    Args:
        max_lr (float or list): Upper learning rate boundaries in the cycle
            for each parameter group.
        total_steps (int, optional): The total number of steps in the cycle.
            Note that if a value is not provided here, it will be the max_iter
            of runner. Default: None.
        pct_start (float): The percentage of the cycle (in number of steps)
            spent increasing the learning rate.
            Default: 0.3
        anneal_strategy (str): {'cos', 'linear'}
            Specifies the annealing strategy: 'cos' for cosine annealing,
            'linear' for linear annealing.
            Default: 'cos'
        div_factor (float): Determines the initial learning rate via
            initial_lr = max_lr/div_factor
            Default: 25
        final_div_factor (float): Determines the minimum learning rate via
            min_lr = initial_lr/final_div_factor
            Default: 1e4
        three_phase (bool): If three_phase is True, use a third phase of the
            schedule to annihilate the learning rate according to
            final_div_factor instead of modifying the second phase (the first
            two phases will be symmetrical about the step indicated by
            pct_start).
            Default: False
    """

8、LinearAnnealingLrUpdaterHook

class LinearAnnealingLrUpdaterHook(LrUpdaterHook):
    """Linear annealing LR Scheduler decays the learning rate of each
    parameter group linearly.

    Args:
        min_lr (float, optional): The minimum lr. Default: None.
        min_lr_ratio (float, optional): The ratio of minimum lr to the base lr.
            Either `min_lr` or `min_lr_ratio` should be specified.
            Default: None.
    """

9、FixedLrUpdaterHook (fixed LR)

10、ExpLrUpdaterHook (exponential LR)

11、PolyLrUpdaterHook (polynomial LR)

12、InvLrUpdaterHook (inverse decay, a schedule similar in shape to exponential decay)
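For reference, the decay rules behind these four simpler policies can be sketched as standalone functions (the gamma/power defaults below are illustrative, not necessarily the hooks' actual defaults):

def fixed_lr(base_lr: float, progress: int) -> float:
    return base_lr  # constant LR throughout training

def exp_lr(base_lr: float, progress: int, gamma: float = 0.99) -> float:
    return base_lr * gamma**progress  # exponential decay

def poly_lr(base_lr: float, progress: int, max_progress: int,
            power: float = 1.0, min_lr: float = 0.0) -> float:
    # polynomial decay from base_lr down to min_lr
    coeff = (1 - progress / max_progress)**power
    return (base_lr - min_lr) * coeff + min_lr

def inv_lr(base_lr: float, progress: int, gamma: float = 0.1,
           power: float = 1.0) -> float:
    # inverse decay, similar in shape to exponential decay
    return base_lr * (1 + gamma * progress)**(-power)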
