TensorFlow Learning Series, Part 1: Understanding How TensorFlow Trains Models

肥肥佑

Category: Tensorflow

Article link: https://blog.csdn.net/qq_40503771/article/details/105234288

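[TensorFlow Learning Series] Part 1: Understanding How TensorFlow Trains Models

This is the first article in the TensorFlow learning series.

Building and training models with TensorFlow for deep learning

1. Setting an exponentially decaying learning rate
2. Regularization to avoid overfitting
3. Moving averages to make the final model more robust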

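Setting the learning rate

When training a neural network, the learning rate controls the size of each parameter update. If the learning rate is too small, convergence is slow and training takes too long. If it is too large, the parameters oscillate back and forth around the optimum.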
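To make the effect of the learning rate concrete, here is a minimal sketch that is not part of the original post. It runs plain gradient descent on the toy function f(x) = x², which is an assumption chosen only because its gradient 2x is easy to follow. The helper name gradient_descent and the three learning rates are likewise illustrative choices.

```python
def gradient_descent(lr, steps=50, x0=1.0):
    """Minimize the toy function f(x) = x**2 and return every visited point."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # one parameter update scaled by the learning rate
        xs.append(x)
    return xs


# Hypothetical learning rates: too small, reasonable, too large
for lr in (0.001, 0.1, 1.0):
    path = gradient_descent(lr)
    print(f"lr={lr}: x after {len(path) - 1} steps = {path[-1]:.6f}")
```

With the smallest rate, x has barely moved away from its starting point after 50 steps. With the largest rate, x jumps between 1 and -1 on every step and never settles at the minimum at 0.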

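Learning rate decay

Learning rate formula:

$a=\frac{1}{1+\text{decay\_rate}\times\text{epoch\_num}}\,a_0$

Check: assume the initial learning rate is $a_0=0.2$ and the decay rate is decay_rate=1, with epoch_num being the epoch number. The formula then gives: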

epoch_num	$a$
1	$a_1=\frac{0.2}{1+1\times 1}=0.1$
2	$a_2=\frac{0.2}{1+1\times 2}\approx 0.067$
3	$a_3=\frac{0.2}{1+1\times 3}=0.05$
4	$a_4=\frac{0.2}{1+1\times 4}=0.04$
…	…

According to this formula, the learning rate decreases as the epoch number grows.
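The values in the table can be checked with a few lines of Python. This is only a sketch of the formula above; the function name inverse_time_decay is a label chosen here and does not come from the original post.

```python
def inverse_time_decay(a0, decay_rate, epoch_num):
    """Return a = a0 / (1 + decay_rate * epoch_num)."""
    return a0 / (1.0 + decay_rate * epoch_num)


# Same settings as the table: a0 = 0.2, decay_rate = 1
for epoch_num in range(1, 5):
    a = inverse_time_decay(0.2, 1.0, epoch_num)
    print(epoch_num, round(a, 3))
```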

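If you want to use learning rate decay, you need to try different values for the hyperparameter a_0 and for the decay rate hyperparameter until you find ones that work well. If the number of epochs is too small, the network may underfit; if it is too large, the network may overfit.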

If epoch, batch_size and related terms are unfamiliar, see this article: "TensorFlow study notes: epochs, batch_size and iterations in deep learning explained".

A simple example:
The training set has 1000 samples and batchsize=10, so:
One pass over the whole training set takes 100 iterations, which is 1 epoch.
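The same arithmetic as a small sketch. The helper iterations_per_epoch is a hypothetical name, and rounding up so that a final partial batch still counts as one iteration is an assumption made here; the original example divides evenly and says nothing about that case.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once.

    Assumption: a final partial batch counts as one more iteration.
    """
    return math.ceil(num_samples / batch_size)


print(iterations_per_epoch(1000, 10))
```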

Exponential decay

tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)

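The tf.train.exponential_decay function implements an exponentially decaying learning rate. With it you can start with a larger learning rate to reach a reasonably good solution quickly, then gradually reduce the learning rate as iterations continue so that the model is more stable late in training.
If staircase=True, global_step / decay_steps is truncated to an integer, so the learning rate changes only once every decay_steps steps and its curve looks like a staircase. If staircase=False, the learning rate is updated at every step.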

Formula:
decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
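The formula can also be evaluated without TensorFlow. The function below is a plain-Python sketch that mirrors the parameter names of tf.train.exponential_decay; it is not the library implementation, and the sample values (0.1, 0.96, 100) are illustrative.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """learning_rate * decay_rate ** (global_step / decay_steps)."""
    if staircase:
        # Integer division: the exponent only changes every decay_steps steps
        exponent = global_step // decay_steps
    else:
        exponent = global_step / decay_steps
    return learning_rate * decay_rate ** exponent


# Compare both modes at a few steps
for step in (0, 50, 100, 150):
    smooth = exponential_decay(0.1, step, 100, 0.96)
    stepped = exponential_decay(0.1, step, 100, 0.96, staircase=True)
    print(step, smooth, stepped)
```

Between multiples of decay_steps the staircase value stays constant, while the continuous value keeps shrinking a little on every step. The two agree whenever global_step is an exact multiple of decay_steps.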

# TF 1.x API: tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)
# Under TF 2.x this script needs the tf.compat.v1 equivalents with eager execution disabled.
# Red curve: staircase=False; blue curve: staircase=True
import tensorflow as tf
import matplotlib.pyplot as plt

learning_rate = 0.1  # initial learning rate
decay_rate = 0.96  # decay rate
global_steps = 1000  # total number of training steps
decay_steps = 100  # number of steps in one decay period
global_ = tf.Variable(tf.constant(0))  # current step, supplied through feed_dict below
c = tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=True)
d = tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=False)
T_C = []  # learning rates with staircase=True
F_D = []  # learning rates with staircase=False
with tf.Session() as sess:
    for i in range(global_steps):
        # Evaluate both schedules at step i
        T_c = sess.run(c, feed_dict={global_: i})
        T_C.append(T_c)
        F_d = sess.run(d, feed_dict={global_: i})
        F_D.append(F_d)
plt.figure(1)
plt.plot(range(global_steps), F_D, 'r-')
plt.plot(range(global_steps), T_C, 'b-')
# Example value: at step 3 with staircase=False the decay factor is 0.96^(3/100) ≈ 0.9988
plt.show()

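[Figure: learning rate versus training step; red is staircase=False, blue is staircase=True]

If any part of the code is unclear, see my other post on how TensorFlow runs (tensorflow运行机制).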
