The original paper introduced batch normalization to reduce Internal Covariate Shift, mitigate vanishing gradients, and speed up model training. This post walks through the TensorFlow implementation and explains how its usage differs between training and inference.
*Figure 1: Algorithm description*
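As a reference for the algorithm summarized in Figure 1, here is a minimal NumPy sketch of the forward pass: compute the mini-batch mean and variance per feature, normalize, then apply the learned scale `gamma` and shift `beta`. The function name and shapes are illustrative, not part of the TensorFlow API.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-3):
    """Sketch of the batch-norm forward pass (per-feature over the batch)."""
    mu = x.mean(axis=0)                    # mini-batch mean
    var = x.var(axis=0)                    # mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta            # scale and shift (learned parameters)

x = np.random.randn(32, 4)                 # batch of 32 samples, 4 features
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-6))  # normalized output has ~0 mean
```

With `gamma=1` and `beta=0` the output of each feature has approximately zero mean and unit variance; during training these two parameters let the network recover the original activation scale if that is optimal.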
The body of the TensorFlow implementation is as follows:
```python
with ops.name_scope(name, "batchnorm", [x, mean, variance, scale, offset]):
  inv = math_ops.rsqrt(variance + variance_epsilon)
  if scale is not None:
    inv *= scale
  # Note: tensorflow/contrib/quantize/python/fold_batch_norms.py depends on
  # the precise order of ops that are generated by the expression
```