1. Exponentially decaying the learning rate
When training a deep neural network, you can start with a relatively large learning rate so that the loss drops quickly. As training progresses, the learning rate is gradually reduced, so that the loss keeps decreasing at a slower pace and eventually stabilizes.
2. Exponential decay in TensorFlow: tf.train.exponential_decay
In TensorFlow, you can use this function to control the learning rate and improve the training process.
```python
def exponential_decay(learning_rate,
                      global_step,
                      decay_steps,
                      decay_rate,
                      staircase=False,
                      name=None):
```
Parameter | Meaning |
---|---|
learning_rate | The initial learning rate |
global_step | The current global training step |
decay_steps | The length of one decay period: the learning rate decays once every decay_steps steps |
decay_rate | The decay rate (multiplicative factor applied per decay period) |
staircase | Whether the decay is applied as discrete staircase drops instead of continuously |
name | An optional name for the operation |
The function computes the decayed learning rate as follows.
When staircase is False:
$$\text{learning\_rate\_decayed} = \text{learning\_rate} \times \text{decay\_rate}^{\,\text{global\_step}/\text{decay\_steps}}$$
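As a quick sanity check with the values used in the code below (learning_rate = 1.0, decay_rate = 0.9, decay_steps = 4), the continuous schedule at global_step = 2 gives

$$1.0 \times 0.9^{2/4} = \sqrt{0.9} \approx 0.94868,$$

which matches the third value in the staircase=False column of the output.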
When staircase is True:
$$\text{learning\_rate\_decayed} = \text{learning\_rate} \times \text{decay\_rate}^{\lfloor \text{global\_step}/\text{decay\_steps} \rfloor}$$
Here floor (written $\lfloor\cdot\rfloor$ above) rounds its argument down to the nearest integer.
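With the same values, the staircase schedule at global_step = 2 gives $\lfloor 2/4 \rfloor = 0$, so the learning rate is still $1.0 \times 0.9^{0} = 1.0$; it first drops to $0.9$ at global_step = 4. This is why the staircase=True column of the output below holds each value for four consecutive steps.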
3. Code
```python
import tensorflow as tf
import matplotlib.pyplot as plt

# Initial learning rate
lr = 1.0
# Placeholder for the current global step
steps = tf.placeholder(tf.int32)
# Decay by a factor of 0.9 every 4 steps, with staircase=True (discrete drops)
lr_decayed_1 = tf.train.exponential_decay(learning_rate=lr, global_step=steps,
                                          decay_steps=4, decay_rate=0.9, staircase=True)
# The same schedule with staircase=False (smooth, continuous decay)
lr_decayed_2 = tf.train.exponential_decay(learning_rate=lr, global_step=steps,
                                          decay_steps=4, decay_rate=0.9, staircase=False)

lr_1 = []
lr_2 = []
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(40):
        l1 = sess.run(lr_decayed_1, feed_dict={steps: i})
        lr_1.append(l1)
        l2 = sess.run(lr_decayed_2, feed_dict={steps: i})
        lr_2.append(l2)
        print(l1, ' ', l2)

# Plot both schedules: red = staircase decay, blue = continuous decay
plt.plot(range(40), lr_1, 'r')
plt.plot(range(40), lr_2, 'b')
plt.show()
```
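In real training, global_step is usually not fed by hand as in the demo above; instead it is a non-trainable variable that the optimizer increments on every update. A minimal sketch of that pattern (the toy variable w and its squared loss are hypothetical, only there to make the snippet self-contained):

```python
import tensorflow as tf

# Hypothetical toy model: one trainable scalar with a squared loss.
w = tf.Variable(5.0)
loss = tf.square(w)

# Non-trainable step counter; minimize() increments it once per training step.
global_step = tf.Variable(0, trainable=False)

# The same schedule as above, now driven by the counter instead of a placeholder.
lr_decayed = tf.train.exponential_decay(learning_rate=1.0, global_step=global_step,
                                        decay_steps=4, decay_rate=0.9, staircase=True)

# Passing global_step to minimize() makes every training step advance the schedule.
train_op = tf.train.GradientDescentOptimizer(lr_decayed).minimize(
    loss, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(8):
        sess.run(train_op)
        print(sess.run(lr_decayed))
```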
The output is as follows. Note that the staircase=True column (left) holds the learning rate constant for 4 steps before each drop, while the staircase=False column (right) decays smoothly at every step:
1.0 1.0
1.0 0.97400373
1.0 0.94868326
1.0 0.92402107
0.9 0.9
0.9 0.87660336
0.9 0.85381496
0.9 0.83161896
0.80999994 0.80999994
0.80999994 0.788943
0.80999994 0.7684334
0.80999994 0.748457
0.7289999 0.7289999
0.7289999 0.7100487
0.7289999 0.6915901
0.7289999 0.6736113
0.6560999 0.6560999
0.6560999 0.6390438
0.6560999 0.62243104
0.6560999 0.60625017
0.5904899 0.5904899
0.5904899 0.5751394
0.5904899 0.56018794
0.5904899 0.54562515
0.5314409 0.5314409
0.5314409 0.51762545
0.5314409 0.5041691
0.5314409 0.4910626
0.47829682 0.47829682
0.47829682 0.46586287
0.47829682 0.4537522
0.47829682 0.44195634
0.43046713 0.43046713
0.43046713 0.4192766
0.43046713 0.40837696
0.43046713 0.3977607
0.3874204 0.3874204
0.3874204 0.37734893
0.3874204 0.36753926
0.3874204 0.3579846