BN (Batch Normalization) Explained

Latest recommended article published on 2022-08-08 20:36:24

酥酥要做程序媛

Tags: deep learning

Original link: https://blog.csdn.net/jiang_ming_/article/details/82314287?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-3.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-3.channel_

Reference paper: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
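Batch Normalization
Batch: the group of samples (for example, images) fed to the network in one training step; its size is the batch_size configured for training.
Normalization: standardizing the data.
A BN layer is a network layer in the same sense as an activation layer, a convolutional layer, a fully connected layer, or a pooling layer. Broadly speaking, BN is a normalization layer that has replaced the LRN (Local Response Normalization) layer.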
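To make the two terms above concrete, here is a minimal NumPy sketch of standardizing one batch feature by feature. The function name `standardize_batch`, the `eps` value, and the toy numbers are assumptions chosen for illustration; they are not taken from the original post.

```python
import numpy as np


def standardize_batch(x, eps=1e-5):
    """Standardize each feature using statistics computed over the batch axis (axis 0)."""
    mean = x.mean(axis=0)  # per-feature mean over the batch
    var = x.var(axis=0)    # per-feature (biased) variance over the batch
    return (x - mean) / np.sqrt(var + eps)


# A toy batch: batch_size = 4 samples with 3 features each (made-up values).
batch = np.array([[1.0, 10.0, 100.0],
                  [2.0, 20.0, 200.0],
                  [3.0, 30.0, 300.0],
                  [4.0, 40.0, 400.0]])
normalized = standardize_batch(batch)
print(normalized.mean(axis=0))  # close to 0 for every feature
print(normalized.std(axis=0))   # close to 1 for every feature
```

Although the three features live on very different scales, each one ends up with roughly zero mean and unit standard deviation after this step.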

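Formula

[Figure: the Batch Normalization formula; image not available]

BN introduces two learnable parameters, γ (scale) and β (shift). Their values are stored during the forward pass so that they can be updated during backpropagation.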
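For reference, the transform defined in the paper cited above can be written as follows. The symbols $m$ (the number of samples in the mini-batch $\mathcal{B}$) and $\epsilon$ (a small constant for numerical stability) follow the paper's notation and do not appear in the text of this post.

```latex
\begin{aligned}
\mu_{\mathcal{B}} &= \frac{1}{m} \sum_{i=1}^{m} x_i \\
\sigma_{\mathcal{B}}^{2} &= \frac{1}{m} \sum_{i=1}^{m} \left( x_i - \mu_{\mathcal{B}} \right)^{2} \\
\hat{x}_i &= \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}} \\
y_i &= \gamma \hat{x}_i + \beta
\end{aligned}
```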

TensorFlow source code

The code comes from Zhihu; comments have been added here to aid reading.

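import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, tf.train)


# NOTE: the original snippet omits the def line. The name batch_norm is assumed;
# the arguments x, train_phase and scope_bn are the names used in the body.
def batch_norm(x, train_phase, scope_bn):
    with tf.variable_scope(scope_bn):
        # Create two trainable variables: the shift (beta) and the scale (gamma)
        beta = tf.Variable(tf.constant(0.0, shape=[x.shape[-1]]), name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[x.shape[-1]]), name='gamma', trainable=True)

        # Compute the mean and variance of this batch over every axis except the last one
        axes = list(range(len(x.shape) - 1))
        batch_mean, batch_var = tf.nn.moments(x, axes, name='moments')

        # Track the batch statistics with an exponential moving average
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # train_phase is a boolean tensor flag: True for training, False for testing.
        # Training: mean_var_with_update() updates the running mean and running
        # variance and returns the statistics of the current batch.
        # Testing: reuse the stored averages, ema.average(batch_mean) and ema.average(batch_var).
        mean, var = tf.cond(train_phase, mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed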
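The train/test switch in the snippet above can be sketched without TensorFlow. This is a simplified NumPy illustration under stated assumptions, not TensorFlow's implementation: the helper `RunningStats`, the function `batch_norm_forward`, the initial running values (mean 0, variance 1), and the random toy data are all hypothetical.

```python
import numpy as np


class RunningStats:
    """Hypothetical helper: exponential moving average of the batch mean and variance."""

    def __init__(self, num_features, decay=0.5):
        self.decay = decay
        self.mean = np.zeros(num_features)  # assumed initial value
        self.var = np.ones(num_features)    # assumed initial value

    def update(self, batch_mean, batch_var):
        # new_average = decay * old_average + (1 - decay) * current_value
        self.mean = self.decay * self.mean + (1 - self.decay) * batch_mean
        self.var = self.decay * self.var + (1 - self.decay) * batch_var


def batch_norm_forward(x, stats, gamma, beta, training, eps=1e-3):
    if training:
        # Training: normalize with the statistics of the current batch and update the averages.
        mean, var = x.mean(axis=0), x.var(axis=0)
        stats.update(mean, var)
    else:
        # Testing: reuse the stored running averages; nothing is updated.
        mean, var = stats.mean, stats.var
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


rng = np.random.default_rng(0)
stats = RunningStats(num_features=3)
gamma, beta = np.ones(3), np.zeros(3)

# Simulated training: 20 batches of 64 samples drawn with mean 5 and standard deviation 2.
for _ in range(20):
    x = rng.normal(loc=5.0, scale=2.0, size=(64, 3))
    batch_norm_forward(x, stats, gamma, beta, training=True)

# Simulated testing: a single sample is normalized with the running statistics.
mean_before = stats.mean.copy()
single = rng.normal(loc=5.0, scale=2.0, size=(1, 3))
out = batch_norm_forward(single, stats, gamma, beta, training=False)
```

The running averages are what make it possible to normalize a single test sample, for which a batch mean and variance would be meaningless.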

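# Excerpt of tf.nn.batch_normalization from the TensorFlow source
from tensorflow.python.framework import ops
from tensorflow.python.ops import math_ops


def batch_normalization(x,
                        mean,
                        variance,
                        offset,
                        scale,
                        variance_epsilon,
                        name=None):
    with ops.name_scope(name, "batchnorm", [x, mean, variance, scale, offset]):
        # inv = 1 / sqrt(variance + epsilon), multiplied by scale (gamma) when it is given
        inv = math_ops.rsqrt(variance + variance_epsilon)
        if scale is not None:
            inv *= scale
        # Same result as scale * (x - mean) / sqrt(variance + epsilon) + offset
        return x * inv + (offset - mean * inv
                          if offset is not None else -mean * inv)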
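The return expression above is a rearranged form of the usual formula. The following NumPy sketch checks that numerically; `bn_fused` and `bn_textbook` are hypothetical names, and the random inputs are made up for the check.

```python
import numpy as np


def bn_fused(x, mean, variance, offset, scale, eps):
    # Mirrors the arithmetic of the function above, with NumPy in place of TensorFlow ops.
    inv = 1.0 / np.sqrt(variance + eps)
    if scale is not None:
        inv = inv * scale
    return x * inv + (offset - mean * inv if offset is not None else -mean * inv)


def bn_textbook(x, mean, variance, offset, scale, eps):
    # The usual form: scale * (x - mean) / sqrt(variance + eps) + offset
    return scale * (x - mean) / np.sqrt(variance + eps) + offset


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
mean, variance = x.mean(axis=0), x.var(axis=0)
scale = rng.normal(size=4)
offset = rng.normal(size=4)

fused = bn_fused(x, mean, variance, offset, scale, 1e-3)
textbook = bn_textbook(x, mean, variance, offset, scale, 1e-3)
print(np.allclose(fused, textbook))  # True
```

Folding the scale into `inv` first means the per-element work is a single multiply and a single add, which is presumably why the source is written this way.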

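Summary of BatchNorm's advantages:

Without BN, the learning rate and the weight initialization have to be tuned carefully. With BN, a much larger learning rate can be used safely and training is less sensitive to these settings; the larger learning rate in turn greatly speeds up training.
BatchNorm is itself a form of regularization, and in some cases it can reduce or replace the need for other regularizers such as dropout.
In addition, in my personal view, BatchNorm reduces the absolute differences between data, has something of a decorrelating property, and puts more weight on relative differences, which is why it works better on classification tasks.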
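One part of the last remark can be checked directly: the normalization step discards the absolute scale and offset of its input. The sketch below shows only that invariance, using a hypothetical `batch_norm` helper and made-up data. It says nothing about decorrelation or about classification accuracy.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    # Normalization step only (no gamma or beta), with statistics over the batch axis.
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)


rng = np.random.default_rng(0)
x = rng.normal(size=(32, 5))

out_small = batch_norm(x)
out_large = batch_norm(1000.0 * x + 7.0)  # same data, rescaled and shifted

# The two outputs agree up to the small effect of eps.
max_diff = np.abs(out_small - out_large).max()
print(max_diff)
```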
