The tf.train.exponential_decay function (exponential decay)

Sophia$

Category: Tensorflow. Tags: tensorflow, learning-rate exponential decay function

Original link: https://blog.csdn.net/wuguangbin1230/article/details/77658229


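When training a neural network you usually have to set a learning rate, learning_rate. You can simply set it to a constant (typically around 0.01), but a fixed value rarely suits the whole run. Early in training it makes the parameter updates rigid and poorly matched to what training needs. Late in training, when the parameters need only small adjustments, the rate is still at its initial value, which can prevent convergence or even make the result progressively worse. In short, a rate that is too large fails to converge, and one that is too small trains too slowly.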
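To make the trade-off concrete, here is a minimal sketch in plain Python (no TensorFlow). It assumes a toy objective f(x) = x**2, and the helper name run_gradient_descent and the three learning rates are chosen purely for illustration; they are not from the original article.

```python
def run_gradient_descent(lr, steps=50, x0=1.0):
    """Minimize f(x) = x**2 with a constant learning rate and return |x| at the end."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x      # derivative of x**2
        x = x - lr * grad   # plain gradient-descent update
    return abs(x)


too_large = run_gradient_descent(lr=1.1)    # each step multiplies x by -1.2: diverges
too_small = run_gradient_descent(lr=0.001)  # each step multiplies x by 0.998: barely moves
moderate = run_gradient_descent(lr=0.1)     # each step multiplies x by 0.8: converges

print(too_large, too_small, moderate)
```

On this toy problem the large rate moves away from the minimum at x = 0, the small rate is still close to its starting point after 50 steps, and only the moderate rate gets near the minimum.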

For this reason an exponentially decaying learning rate is commonly used, and it can be set up directly with the tf.train.exponential_decay function:

import tensorflow as tf  # TensorFlow 1.x API

# Step counter, excluded from the trainable variables
global_step = tf.Variable(0, trainable=False)
lr = tf.train.exponential_decay(learning_rate=0.02, global_step=global_step, decay_steps=100,
                                 decay_rate=0.9, staircase=False)

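learning_rate is the initial learning rate.
global_step works, as I see it, like a counter: every time you perform an update it increases by one.
decay_steps is the decay interval (speed): the number of steps over which the learning rate is multiplied by decay_rate once. It takes effect whether staircase is True or False, because it is the divisor in the exponent global_step / decay_steps.
decay_rate is the decay rate.
staircase: if True, the learning rate changes only once every decay_steps steps; if False, it changes at every step.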

The tf.train.exponential_decay function implements an exponentially decaying learning rate.

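Steps: 1. Start with a relatively large learning rate (purpose: to reach a reasonably good solution quickly);

2. Then gradually reduce the learning rate as the iterations proceed (purpose: to make the model more stable late in training).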
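The two steps can be illustrated with a small plain-Python sketch. It assumes the toy objective f(x) = x**2, and the names final_distance, constant_rate and decayed_rate, as well as the schedule constants, are hypothetical choices for this illustration only.

```python
def final_distance(lr_schedule, steps=200, x0=1.0):
    """Run gradient descent on f(x) = x**2 and return |x| after the last step."""
    x = x0
    for step in range(steps):
        lr = lr_schedule(step)
        x = x - lr * 2.0 * x   # gradient of x**2 is 2x
    return abs(x)


def constant_rate(step):
    return 1.0   # large fixed rate: x only flips sign at every step


def decayed_rate(step):
    # Hypothetical schedule: start at 1.0 and shrink by a factor of 0.9 every 10 steps.
    return 1.0 * 0.9 ** (step / 10)


fixed_result = final_distance(constant_rate)
decayed_result = final_distance(decayed_rate)
print(fixed_result, decayed_result)
```

With the fixed rate the iterate keeps bouncing between 1 and -1 and never approaches the minimum. With the same initial rate plus decay, the updates settle down as the rate shrinks.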

The computation it performs:

decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)

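Here decayed_learning_rate is the learning rate used in each optimization step.

The staircase argument of tf.train.exponential_decay (default False) selects the decay mode: when it is True, global_step / decay_steps is truncated to an integer, so the learning rate falls in discrete steps instead of continuously.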
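The formula and the effect of staircase can be checked by hand with a plain-Python sketch. The function below only mirrors the formula and parameter names given in the text; it is an illustration under that assumption, not TensorFlow's actual implementation.

```python
import math


def exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False):
    """Pure-Python sketch of the decay formula; not the TensorFlow implementation."""
    exponent = global_step / decay_steps
    if staircase:
        exponent = math.floor(exponent)   # number of completed decay intervals
    return learning_rate * decay_rate ** exponent


for step in (0, 50, 100, 150, 200):
    smooth = exponential_decay(0.1, step, 100, 0.96)
    stepped = exponential_decay(0.1, step, 100, 0.96, staircase=True)
    print(step, round(smooth, 5), round(stepped, 5))
```

At multiples of decay_steps the two modes agree. Between them the staircase value stays flat while the continuous value keeps shrinking.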

global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(0.1, global_step, 100, 0.96, staircase=True)     # build the decayed learning rate
learning_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(....., global_step=global_step)  # train with the decayed rate; minimize() increments global_step

import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.train)
import matplotlib.pyplot as plt

learning_rate = 0.1
decay_rate = 0.96
global_steps = 1000
decay_steps = 100

# Step tensor; its value is supplied through feed_dict below
global_ = tf.Variable(tf.constant(0))
c = tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=True)
d = tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=False)

T_C = []  # learning rates with staircase=True
F_D = []  # learning rates with staircase=False

with tf.Session() as sess:
    for i in range(global_steps):
        T_c = sess.run(c, feed_dict={global_: i})
        T_C.append(T_c)
        F_d = sess.run(d, feed_dict={global_: i})
        F_D.append(F_d)

# Red: continuous decay; blue: staircase decay
plt.figure(1)
plt.plot(range(global_steps), F_D, 'r-')
plt.plot(range(global_steps), T_C, 'b-')
plt.show()
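For readers without a TensorFlow 1.x environment, the shape of the two curves can be approximated in plain Python. This sketch assumes the formula quoted earlier in the article and reuses the script's constants; the names staircase_curve, smooth_curve and levels are illustrative. Because it computes in double precision, the values may differ slightly from what the TensorFlow session returns.

```python
learning_rate = 0.1
decay_rate = 0.96
global_steps = 1000
decay_steps = 100

# Same formula as the script above, evaluated without a TensorFlow session.
staircase_curve = [learning_rate * decay_rate ** (i // decay_steps) for i in range(global_steps)]
smooth_curve = [learning_rate * decay_rate ** (i / decay_steps) for i in range(global_steps)]

levels = sorted(set(staircase_curve), reverse=True)
print(len(levels))                            # number of flat segments in the staircase curve
print(staircase_curve[-1], smooth_curve[-1])  # final value of each curve
```

The staircase curve is piecewise constant with one level per block of decay_steps steps. The continuous curve touches it at the start of each block and lies below it everywhere else.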
