TensorFlow implements Batch Normalization (normalizing the outputs of each network layer) mainly through the following APIs:
- tf.nn.moments(x, axes, name=None, keep_dims=False) ⇒ mean, variance:
  - statistical moments: mean is the first moment, variance is the second central moment
- tf.nn.batch_normalization(x, mean, variance, offset, scale, variance_epsilon, name=None)
  - https://www.tensorflow.org/api_docs/python/tf/nn/batch_normalization
  - computes $\gamma\cdot\frac{x-\mu}{\sigma}+\beta$, where $\gamma$ is the scale factor (scale), $\beta$ is the offset (offset), $\mu$ is mean, and $\sigma$ is the square root of variance + variance_epsilon;
- tf.nn.batch_norm_with_global_normalization(t, m, v, beta, gamma, variance_epsilon, scale_after_normalization, name=None)
  - as the function signatures show, the mean and variance returned by tf.nn.moments are passed on as arguments to tf.nn.batch_normalization;
  - https://www.tensorflow.org/api_docs/python/tf/nn/batch_normalization
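The normalization formula above can be traced by hand on a few numbers. The following is a minimal pure-Python sketch, not TensorFlow's implementation: the helper name `batch_normalize` and the sample values are made up for illustration, and it assumes the variance is the population variance (divide by n) with `epsilon` added under the square root.

```python
import math

def batch_normalize(values, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Hypothetical helper: normalize a list of floats, then scale and shift."""
    n = len(values)
    mean = sum(values) / n                               # first moment
    variance = sum((v - mean) ** 2 for v in values) / n  # second central moment
    denom = math.sqrt(variance + epsilon)                # sigma, with epsilon for stability
    return [gamma * (v - mean) / denom + beta for v in values]

activations = [1.0, 2.0, 3.0, 4.0]  # illustrative values only
normalized = batch_normalize(activations, gamma=2.0, beta=0.5)
print(normalized)
```

After normalization the values are centered on `beta`, and their spread is controlled by `gamma`.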
1. tf.nn.moments: moments
The mean returned by tf.nn.moments is the first moment, and variance is the second central moment.
Suppose the tensor we want to process has a four-element shape [batch_size, height, width, kernels]. A sample program:
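import tensorflow as tf  # TensorFlow 1.x API (tf.random_normal, tf.Session)
shape = [128, 32, 32, 64]  # [batch_size, height, width, kernels]
a = tf.Variable(tf.random_normal(shape))  # a: activations
axis = list(range(len(shape)-1))  # [0, 1, 2]: every axis except the last (kernels)
a_mean, a_var = tf.nn.moments(a, axis)  # reduce over batch, height and width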
Here we only show the shapes of a_mean and a_var:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(a_mean).shape # (64, )
sess.run(a_var).shape # (64, ) ⇒ i.e. one mean and one variance per kernel, computed over every sample in the batch (and every spatial position)
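The shape claim can also be checked without a TensorFlow session. The sketch below uses NumPy as a stand-in: `mean` and `var` over the same axes play the role of tf.nn.moments here (an assumption for illustration, not the TensorFlow code path), and the random input is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Same layout as above: [batch_size, height, width, kernels]
a = rng.standard_normal((128, 32, 32, 64))

axes = tuple(range(a.ndim - 1))  # (0, 1, 2): every axis except kernels
a_mean = a.mean(axis=axes)       # one mean per kernel
a_var = a.var(axis=axes)         # one population variance per kernel

print(a_mean.shape, a_var.shape)
```

Reducing over the first three axes leaves a single statistic per kernel, which is why both results have as many entries as the last dimension.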
2. demo
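def batch_norm(x):
    epsilon = 1e-3
    # axes=[0]: statistics are taken across the batch dimension only
    batch_mean, batch_var = tf.nn.moments(x, [0])
    # offset=None and scale=None: normalize only, no learned shift or scale
    return tf.nn.batch_normalization(x, batch_mean, batch_var,
                                     offset=None, scale=None,
                                     variance_epsilon=epsilon)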
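To see what the demo does to its input, here is a rough NumPy analogue. It is a sketch under assumptions: `batch_norm_np` is a hypothetical name, the input is taken to be a 2-D array of shape [batch_size, features], and the arithmetic follows the formula given earlier rather than calling TensorFlow.

```python
import numpy as np

def batch_norm_np(x, epsilon=1e-3):
    """Hypothetical NumPy analogue of the demo: no scale, no offset."""
    batch_mean = x.mean(axis=0)  # per-feature mean across the batch
    batch_var = x.var(axis=0)    # per-feature population variance
    return (x - batch_mean) / np.sqrt(batch_var + epsilon)

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 10))  # arbitrary illustrative input
y = batch_norm_np(x)

print(y.mean(axis=0))  # close to 0 for every feature
print(y.var(axis=0))   # slightly below 1 because of epsilon
```

Each feature column ends up with roughly zero mean and a variance just under one, since epsilon enlarges the denominator.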
references
- 谈谈Tensorflow的Batch Normalization (On TensorFlow's Batch Normalization): http://www.jianshu.com/p/0312e04e4e83