Setting a dynamic learning rate in MXNet


weixin_30342209


Original link: http://www.cnblogs.com/jukan/p/10235091.html


https://blog.csdn.net/xiaotao_1/article/details/78874336

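If the learning rate is too large, the algorithm bounces back and forth around a local optimum and does not converge.
If the learning rate is too small, each step covers only a short distance, so convergence is slow.
A common remedy is to start with a fairly large learning rate and lower it gradually as the number of iterations grows. MXNet ships ready-made classes for this that can be used directly.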
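To make the trade-off concrete, here is a minimal sketch (not from the original post) that runs plain gradient descent on the toy function f(x) = x² under three settings. The helper `descend`, the toy objective, and every numeric value are assumptions chosen for illustration only.

```python
def descend(lr_at, steps=100, x=1.0):
    """Run gradient descent on f(x) = x**2 (gradient 2*x) and return the final x.

    lr_at maps the update index to the learning rate used for that update.
    """
    for t in range(steps):
        x = x - lr_at(t) * 2.0 * x
    return x


# Too large: every step overshoots the minimum at 0 and lands equally far away.
too_large = descend(lambda t: 1.0)
# Too small: each step barely moves.
too_small = descend(lambda t: 0.001)
# Decayed: start at 1.0 and multiply the rate by 0.9 after every 10 updates.
decayed = descend(lambda t: 1.0 * 0.9 ** (t // 10))

print(abs(too_large), abs(too_small), abs(decayed))
```

With these values the large fixed rate ends exactly as far from the minimum as it started, the small fixed rate has covered only part of the distance after 100 updates, and the decayed rate ends very close to 0.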
Three classes from mxnet.lr_scheduler are covered here.
The first is:

mxnet.lr_scheduler.FactorScheduler(step, factor=1, stop_factor_lr=1e-08)
# Reduce the learning rate by a factor for every n steps.
# It returns a new learning rate by:
base_lr * pow(factor, floor(num_update/step))

# Parameters:
step (int) – Changes the learning rate for every n updates.
factor (float, optional) – The factor to change the learning rate.
stop_factor_lr (float, optional) – Stop updating the learning rate if it is less than this value.
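The formula above can be reproduced in a few lines of plain Python. This is a sketch of the documented formula only, not the library's implementation: the function name `factor_lr` is hypothetical, the clamping behaviour for `stop_factor_lr` is an assumption based on the parameter description, and the library's own handling of the exact step boundary may differ.

```python
def factor_lr(base_lr, num_update, step, factor=1.0, stop_factor_lr=1e-8):
    """Learning rate after num_update updates, following the documented formula."""
    lr = base_lr * factor ** (num_update // step)
    # Assumption: once the decayed value drops below stop_factor_lr,
    # hold the rate at stop_factor_lr instead of decaying further.
    return max(lr, stop_factor_lr)


# Illustrative values: halve the rate after every 100 updates.
for n in (0, 99, 100, 250):
    print(n, factor_lr(1.0, n, step=100, factor=0.5))
```

The rate stays at 1.0 through update 99, drops to 0.5 at update 100, and is 0.25 at update 250.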
For example:

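import mxnet

# Multiply the learning rate by 0.9 after every 500 parameter updates.
lr_sch = mxnet.lr_scheduler.FactorScheduler(step=500, factor=0.9)
# model, train_iter, val_iter, metric and num_epoch are defined elsewhere.
model.fit(
    train_iter,
    eval_data=val_iter,
    optimizer='sgd',
    optimizer_params={'learning_rate': 0.1, 'lr_scheduler': lr_sch},
    eval_metric=metric,
    num_epoch=num_epoch,
)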
Here the initial learning rate is 0.1. After 500 parameter updates it becomes 0.1 × 0.9 = 0.09, and after 1000 parameter updates it becomes 0.1 × 0.9 × 0.9 = 0.081.
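When choosing step and factor it can help to know how long the rate takes to reach a given floor. The sketch below counts the decays needed for the example's values; the helper `decays_until_below` and the target of 0.01 are assumptions introduced for illustration, not part of MXNet.

```python
def decays_until_below(base_lr, factor, target):
    """Count how many multiplications by factor bring base_lr below target."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be between 0 and 1 for the rate to shrink")
    if target <= 0.0:
        raise ValueError("target must be positive")
    lr, decays = base_lr, 0
    while lr >= target:
        lr *= factor
        decays += 1
    return decays


decays = decays_until_below(base_lr=0.1, factor=0.9, target=0.01)
updates = decays * 500  # one decay per 500 parameter updates, as in the example
print(decays, updates)
```

For a starting rate of 0.1 and a factor of 0.9, the rate first falls below 0.01 after 22 decays, which at one decay per 500 updates is 11000 parameter updates.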
The second is the base class for learning rate schedulers:

class mxnet.lr_scheduler.LRScheduler(base_lr=0.01)
# Base class of a learning rate scheduler.
# A scheduler returns a new learning rate based on the number of updates that have been performed.
Parameters: base_lr (float, optional) – The initial learning rate.

__call__(num_update)
# Return a new learning rate.
# The num_update is the upper bound of the number of updates applied to every weight.
# Assume the optimizer has updated i-th weight by k_i times, namely optimizer.update(i, weight_i) is called by k_i times. Then:
num_update = max([k_i for all i])
Parameters: num_update (int) – the maximal number of updates applied to a weight.
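The interface described above is small: a scheduler stores `base_lr` and is called with `num_update`. The sketch below mirrors that interface in plain Python so it runs without MXNet installed. The class `InverseDecay`, its `decay` parameter, and the per-weight counts are all hypothetical; in real code one would presumably subclass `mxnet.lr_scheduler.LRScheduler` and override `__call__` instead.

```python
class InverseDecay:
    """Hypothetical stand-in with the same shape as the scheduler interface:
    it stores base_lr and returns a learning rate from __call__(num_update)."""

    def __init__(self, base_lr=0.01, decay=0.001):
        self.base_lr = base_lr
        self.decay = decay

    def __call__(self, num_update):
        # The rate shrinks smoothly as the number of updates grows.
        return self.base_lr / (1.0 + self.decay * num_update)


# k_i: how many times each weight index i has been updated (made-up numbers).
update_counts = {0: 1000, 1: 1000, 2: 998}
# num_update is the largest of the per-weight update counts.
num_update = max(update_counts.values())

scheduler = InverseDecay(base_lr=0.1, decay=0.001)
lr = scheduler(num_update)
print(num_update, lr)
```

Taking the maximum means the schedule advances as soon as any weight reaches a new update count, even if other weights lag behind.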
The third is:

class mxnet.lr_scheduler.MultiFactorScheduler(step, factor=1)
# Reduce the learning rate by given a list of steps.
# Assume there exists k such that:
step[k] <= num_update and num_update < step[k+1]

# Then calculate the new learning rate by:
base_lr * pow(factor, k+1)
# Parameters:
step (list of int) – The list of steps to schedule a change
factor (float) – The factor to change the learning rate.
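The rule above amounts to counting how many entries of the step list have been reached. Here is a plain-Python sketch of that formula as quoted, not the library's implementation; the function name `multi_factor_lr` and the step values are assumptions for illustration, and the library's handling of the exact boundary update may differ.

```python
from bisect import bisect_right


def multi_factor_lr(base_lr, num_update, steps, factor=1.0):
    """Learning rate after num_update updates for an increasing list of steps."""
    # Number of boundaries with step[k] <= num_update. This equals k + 1 in
    # the documented formula, and 0 before the first boundary is reached.
    passed = bisect_right(steps, num_update)
    return base_lr * factor ** passed


steps = [1000, 2000, 3000]
for n in (500, 1500, 2000, 5000):
    print(n, multi_factor_lr(0.1, n, steps, factor=0.5))
```

Unlike a fixed interval, the list lets the boundaries be spaced unevenly, for instance to hold a high rate for a long time early in training and then decay quickly.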
Reference: https://mxnet.incubator.apache.org/api/python/optimization/optimization.html#mxnet.lr_scheduler.LRScheduler
---------------------
Author: xiaotao_1
Source: CSDN
Original: https://blog.csdn.net/xiaotao_1/article/details/78874336
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.

Reposted from: https://www.cnblogs.com/jukan/p/10235091.html
