TensorFlow Study Notes

纫秋兰以为佩

Category: Python - Applications

Article link: https://blog.csdn.net/sinat_27421407/article/details/79834992

Artificial Intelligence in Practice: TensorFlow Notes (人工智能实践：Tensorflow笔记)
 Cao Jian (曹健)

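init = tf.global_variables_initializer() creates the op that assigns initial values to all variables (run it with sess.run(init))
X = tf.placeholder(tf.float32, [None, 5], name='X')
- None lets any number of samples be fed at once; each sample has 5 features
- sess.run(..., feed_dict={X: ...})
W = tf.Variable(tf.random_normal([2, 3], stddev=2, mean=0, seed=1))
- generates a 2×3 matrix drawn from a normal distribution with standard deviation 2 and mean 0, using random seed 1
- tf.truncated_normal() draws from a normal distribution with the outliers removed (values more than two standard deviations from the mean are re-drawn)
- tf.random_uniform(shape=[7], minval=0, maxval=1, dtype=tf.float32, seed=1) draws from a uniform distribution over the half-open interval [minval, maxval)
tf.where() behaves like the ternary ? : expression, applied element-wise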
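
The truncation and element-wise selection described above can be sketched in plain Python. This is only an illustration of the behaviour, not TensorFlow's implementation: the helper names truncated_normal and where are hypothetical stand-ins, and the random stream will not match what tf.truncated_normal produces for the same seed.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n samples, re-drawing any that fall more than 2 stddev from the mean."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2 * stddev:
            samples.append(x)
    return samples


def where(condition, x, y):
    """Element-wise select: take x[i] where condition[i] is true, otherwise y[i]."""
    return [a if c else b for c, a, b in zip(condition, x, y)]


# Every sample stays within mean ± 2 * stddev, here [-4, 4].
samples = truncated_normal(1000, mean=0.0, stddev=2.0, seed=1)

# The first and third entries come from x, the second from y.
picked = where([True, False, True], [1, 2, 3], [10, 20, 30])
```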

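Mean squared error (MSE)
cost = tf.reduce_mean(tf.square(yhat - y)), where tf.reduce_mean() computes the mean
Cross entropy
-tf.reduce_mean(y * tf.log(tf.clip_by_value(yhat, 1e-12, 1.0)))
The prediction yhat is clipped first: values below 1e-12 become 1e-12 and values above 1.0 become 1.0; the mean is taken afterwards
tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)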
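
As a rough check of the two loss formulas, here is a plain-Python sketch that assumes y holds the labels and yhat the predicted probabilities. The function names are illustrative only, and the mean is taken over all elements, as in the reduce_mean expressions above.

```python
import math


def mse(y, yhat):
    """Mean of the squared differences between predictions and labels."""
    return sum((p - t) ** 2 for t, p in zip(y, yhat)) / len(y)


def clip(value, low, high):
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


def cross_entropy(y, yhat, eps=1e-12):
    """-mean(y * log(clip(yhat, eps, 1.0))); clipping keeps log() away from zero."""
    terms = [t * math.log(clip(p, eps, 1.0)) for t, p in zip(y, yhat)]
    return -sum(terms) / len(terms)


y = [1.0, 0.0, 0.0]        # one-hot label
yhat = [0.7, 0.2, 0.1]     # predicted probabilities

mse_value = mse(y, yhat)
ce_value = cross_entropy(y, yhat)

# A prediction of exactly 0 for the true class still yields a finite loss.
ce_clipped = cross_entropy([1.0], [0.0])
```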

tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
tf.train.MomentumOptimizer(learning_rate, momentum).minimize(cost)
tf.train.AdamOptimizer(learning_rate).minimize(cost)
Adam computes first- and second-moment estimates of the gradient to give each parameter its own adaptive learning rate
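
To make the difference between the three optimizers concrete, the sketch below applies the commonly stated update rules to a toy one-dimensional loss (w - 3)^2. The loss, the hyperparameter values, and the function names are assumptions chosen for illustration; the real TensorFlow optimizers operate on tensors and carry extra options not shown here.

```python
import math


def grad(w):
    """Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


def gradient_descent(w, lr=0.1, steps=500):
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def momentum(w, lr=0.1, beta=0.9, steps=500):
    accum = 0.0
    for _ in range(steps):
        # Accumulate a decaying sum of past gradients, then step along it.
        accum = beta * accum + grad(w)
        w -= lr * accum
    return w


def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    m = 0.0  # first-moment estimate (running mean of gradients)
    v = 0.0  # second-moment estimate (running mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_gd = gradient_descent(0.0)
w_momentum = momentum(0.0)
w_adam = adam(0.0)
```

On this toy problem all three runs should end close to w = 3; they differ in the path taken, which matters more on real, poorly conditioned losses.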

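Learning rate: the size of each parameter update
Too large and training fails to converge (it oscillates); too small and convergence is slow
Exponentially decaying learning rate
- global_step = tf.Variable(0, trainable=False) marks the step counter as not trainable
- learning_rate = tf.train.exponential_decay(learning_rate_base, global_step, learning_rate_step, learning_rate_decay, staircase=True)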
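
The decay schedule follows the formula base_rate * decay_rate ^ (global_step / decay_steps), with integer division when staircase is on. A minimal plain-Python sketch of that formula, where the function name mirrors the TensorFlow call but the numeric values are made up for illustration:

```python
def exponential_decay(base_rate, global_step, decay_steps, decay_rate, staircase=False):
    """Learning rate after global_step steps of exponential decay."""
    if staircase:
        # Integer division: the rate drops in discrete jumps every decay_steps steps.
        exponent = global_step // decay_steps
    else:
        exponent = global_step / decay_steps
    return base_rate * decay_rate ** exponent


# With staircase=True the rate is flat within each block of 100 steps.
rates = [
    exponential_decay(0.1, step, 100, 0.99, staircase=True)
    for step in (0, 99, 100, 250)
]

# Without staircase the rate shrinks a little on every step.
smooth = exponential_decay(0.1, 50, 100, 0.99, staircase=False)
```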

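Moving average: tracks a running average of every parameter w and b in the model over a recent span of training
Using the moving-average values can improve the model's generalization
Computation:
- shadow = decay × shadow + (1 - decay) × parameter
- decay = min{moving_average_decay, (1 + step) / (10 + step)}
ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
- MOVING_AVERAGE_DECAY is the moving-average decay rate, usually set close to 1
- global_step is the number of training steps run so far
ema_op = ema.apply(tf.trainable_variables())
computes the moving average of every trainable parameter
ema.average(parameter) returns that parameter's moving-average value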
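
The shadow-value update can be traced by hand with a small plain-Python sketch. The class name MovingAverage and the sample values are hypothetical; the sketch assumes the shadow value starts at the parameter's initial value and only mirrors the two formulas from the notes.

```python
class MovingAverage:
    """Tracks one parameter's shadow value using the step-dependent decay."""

    def __init__(self, decay, initial_value):
        self.decay = decay
        self.shadow = initial_value  # assumed to start at the parameter's initial value

    def update(self, value, step):
        # Early in training the effective decay is small, so the shadow
        # follows the parameter closely; later it approaches self.decay.
        decay = min(self.decay, (1 + step) / (10 + step))
        self.shadow = decay * self.shadow + (1 - decay) * value
        return self.shadow


ema = MovingAverage(decay=0.99, initial_value=0.0)

# Step 0: decay = min(0.99, 1/10) = 0.1, so shadow = 0.1 * 0 + 0.9 * 1
first = ema.update(value=1.0, step=0)

# Step 100: decay = min(0.99, 101/110), so the shadow moves only partway to 10
second = ema.update(value=10.0, step=100)
```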

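Regularization: guards against overfitting
Usually applied only to the weights W (not to the biases b)
tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(REGULARIZER)(W))
The regularization penalty is added to the total loss
loss = cem + tf.add_n(tf.get_collection('losses'))
REGULARIZER is a hyperparameter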
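
As a sketch of what the L2 penalty contributes, the plain-Python code below assumes the regularizer returns the scale times half the sum of squared weights, which is how tf.contrib.layers.l2_regularizer is documented to behave. The helper names, the weight matrix, and the data-loss value are made up for illustration.

```python
def l2_penalty(scale, weights):
    """scale * sum(w^2) / 2 over every entry of a 2-D weight matrix."""
    return scale * sum(w * w for row in weights for w in row) / 2.0


def total_loss(data_loss, penalties):
    """Data loss plus the sum of all collected regularization penalties."""
    return data_loss + sum(penalties)


W = [[1.0, 2.0], [3.0, 4.0]]

# Sum of squares is 1 + 4 + 9 + 16 = 30, so the penalty is 0.01 * 30 / 2.
penalty = l2_penalty(0.01, W)

# 0.5 stands in for the cross-entropy term of the total loss.
loss = total_loss(0.5, [penalty])
```

A larger scale pushes the weights toward zero more strongly, which is why it is tuned as a hyperparameter.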