tf.train.ExponentialMovingAverage(decay, steps)
tf.train.ExponentialMovingAverage maintains moving averages of variables, i.e. it updates parameters using an exponential moving average. Its constructor takes a decay rate (decay) that controls how quickly the model updates. For each variable it is applied to, the function maintains a shadow variable (the averaged value of the parameter); the shadow variable is initialized to the variable's initial value and is then updated as follows:
shadow_variable = decay * shadow_variable + (1-decay) * variable
Here shadow_variable is the shadow variable, variable is the variable being averaged (the value it has most recently been assigned), and decay is the decay rate. decay is usually set close to 1 (e.g. 0.99 or 0.999). The larger decay is, the more stable the model, because a larger decay makes the average change more slowly.
tf.train.ExponentialMovingAverage can also adjust decay automatically, using the formula:
decay = min(decay, (1 + steps) / (10 + steps))
where steps is the number of update steps taken so far; you supply it yourself (the num_updates argument).
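To see how this schedule behaves, here is a small plain-Python sketch (no TensorFlow needed; `ema_decay` is a hypothetical helper name) of the dynamic decay formula:

```python
def ema_decay(base_decay, steps):
    # Dynamic decay: the effective decay is capped early in training so
    # the average tracks the variable quickly, then settles at base_decay.
    return min(base_decay, (1 + steps) / (10 + steps))

print(ema_decay(0.99, 0))      # 0.1  -> early training, fast tracking
print(ema_decay(0.99, 100))    # ~0.918
print(ema_decay(0.99, 10000))  # 0.99 -> late training, stable average
```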
For example:

```python
import tensorflow as tf

v1 = tf.Variable(0, dtype=tf.float32)
step = tf.Variable(0, trainable=False)
ema = tf.train.ExponentialMovingAverage(0.99, step)
maintain_average = ema.apply([v1])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([v1, ema.average(v1)]))  # both start at 0

    sess.run(tf.assign(v1, 5))              # set v1 to 5
    sess.run(maintain_average)
    # decay = min(0.99, (1+0)/(10+0)) = 0.1
    # shadow = 0.1*0 + 0.9*5 = 4.5
    print(sess.run([v1, ema.average(v1)]))

    sess.run(tf.assign(step, 10000))        # steps = 10000
    sess.run(tf.assign(v1, 10))             # v1 = 10
    sess.run(maintain_average)
    # decay = min(0.99, (1+10000)/(10+10000)) = 0.99
    # shadow = 0.99*4.5 + 0.01*10 = 4.555
    print(sess.run([v1, ema.average(v1)]))

    sess.run(maintain_average)
    # shadow = 0.99*4.555 + 0.01*10 = 4.60945
    print(sess.run([v1, ema.average(v1)]))
```
Output:

```
[0.0, 0.0]
[5.0, 4.5]
[10.0, 4.5549998]
[10.0, 4.6094499]
```
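The printed values can be reproduced in plain Python (no TensorFlow needed) by applying the shadow-variable update formula by hand:

```python
# Replay the session above using
# shadow = decay * shadow + (1 - decay) * variable.
shadow = 0.0

v1 = 5.0
decay = min(0.99, (1 + 0) / (10 + 0))          # steps = 0 -> decay = 0.1
shadow = decay * shadow + (1 - decay) * v1     # 0.1*0 + 0.9*5 = 4.5

v1 = 10.0
decay = min(0.99, (1 + 10000) / (10 + 10000))  # steps = 10000 -> decay = 0.99
shadow = decay * shadow + (1 - decay) * v1     # 0.99*4.5 + 0.01*10 = 4.555

shadow = decay * shadow + (1 - decay) * v1     # 0.99*4.555 + 0.01*10 = 4.60945
print(round(shadow, 5))                        # 4.60945
```

The session prints 4.5549998 and 4.6094499 rather than the exact values because the TensorFlow variables are float32.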
tf.trainable_variables: returns the list of variables that need to be trained.
tf.all_variables: returns the list of all variables (deprecated in later TF 1.x releases in favor of tf.global_variables).
For example:

```python
import tensorflow as tf

v = tf.Variable(tf.constant(0.0, shape=[1], dtype=tf.float32), name='v')
v1 = tf.Variable(tf.constant(5, shape=[1], dtype=tf.float32), name='v1')
global_step = tf.Variable(tf.constant(5, shape=[1], dtype=tf.float32),
                          name='global_step', trainable=False)
ema = tf.train.ExponentialMovingAverage(0.99, global_step)

for ele1 in tf.trainable_variables():
    print(ele1.name)
for ele2 in tf.all_variables():
    print(ele2.name)
```
Output (global_step is excluded from the trainable list because it was created with trainable=False):

```
v:0
v1:0
v:0
v1:0
global_step:0
```
Usage of tf.control_dependencies():
In some machine learning programs we want certain operations to execute only after others; tf.control_dependencies() does this. control_dependencies(control_inputs) returns a context manager: operations constructed inside the with block will only run after every operation in control_inputs has executed.

```python
with g.control_dependencies([a, b, c]):
    # `d` and `e` will only run after `a`, `b`, and `c` have executed.
    d = ...
    e = ...
```
control_dependencies can be nested:

```python
with g.control_dependencies([a, b]):
    # Ops constructed here run after `a` and `b`.
    with g.control_dependencies([c, d]):
        # Ops constructed here run after `a`, `b`, `c`, and `d`.
```
Passing None clears the dependencies:

```python
with g.control_dependencies([a, b]):
    # Ops constructed here run after `a` and `b`.
    with g.control_dependencies(None):
        # Ops constructed here run normally, not waiting for either `a` or `b`.
        with g.control_dependencies([c, d]):
            # Ops constructed here run after `c` and `d`, also not waiting
            # for either `a` or `b`.
```
Note:
The control dependency applies only to ops created inside the context; merely using an existing op or tensor inside the context has no effect.

```python
# WRONG
def my_func(pred, tensor):
    t = tf.matmul(tensor, tensor)
    with tf.control_dependencies([pred]):
        # The matmul op is created outside the context, so no control
        # dependency will be added.
        return t

# RIGHT
def my_func(pred, tensor):
    with tf.control_dependencies([pred]):
        # The matmul op is created in the context, so a control dependency
        # will be added.
        return tf.matmul(tensor, tensor)
```
Example:
When training a model, each training step may need to execute two operations, a and b. We can write:

```python
with tf.control_dependencies([a, b]):
    c = tf.no_op(name='train')  # tf.no_op does nothing by itself;
                                # running c forces a and b to run first
sess.run(c)
```
For a simple requirement like this, the code above can be replaced with:

```python
c = tf.group(a, b)
sess.run(c)
```