Batch normalization and layer normalization

JasonKQLin

Article link: https://blog.csdn.net/linkequa/article/details/130514531

1. Batch normalization normalizes per feature: the values of one feature across all samples in a batch form a group of numbers, and that group is standardized.
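
A minimal sketch of this per-feature view in plain Python, assuming the batch is a list of samples and each sample is a list of feature values. The names `standardize` and `batch_norm` and the small constant `eps` are illustrative choices, not taken from any library, and the learnable scale and shift that real layers usually add are left out.

```python
from statistics import fmean, pvariance


def standardize(values, eps=1e-5):
    """Subtract the mean, then divide by the standard deviation.

    eps is an assumed small constant that avoids division by zero.
    """
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def batch_norm(batch):
    """Standardize each feature (column) using all samples in the batch."""
    columns = [list(col) for col in zip(*batch)]        # one list per feature
    normalized = [standardize(col) for col in columns]
    return [list(row) for row in zip(*normalized)]      # back to one list per sample


# 3 samples, 2 features on very different scales
batch = [[1.0, 100.0],
         [2.0, 200.0],
         [3.0, 300.0]]
out = batch_norm(batch)
for row in out:
    print(row)
```

After normalization both features have a mean close to zero across the batch and end up on a comparable scale, even though the raw values differ by a factor of 100.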

2. Layer normalization normalizes per sample: all the features of one sample form a group of numbers, and that group is standardized.
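
A minimal sketch of the per-sample view in plain Python, assuming each sample is a list of feature values. The function name `layer_norm` and the constant `eps` are illustrative, and the learnable scale and shift found in real layers are omitted.

```python
from statistics import fmean, pvariance


def layer_norm(sample, eps=1e-5):
    """Standardize one sample using the mean and variance of its own features.

    eps is an assumed small constant that avoids division by zero.
    """
    mean = fmean(sample)
    var = pvariance(sample, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in sample]


batch = [[1.0, 2.0, 3.0],
         [10.0, 20.0, 30.0]]

# Each sample is normalized on its own; no other sample is involved.
out = [layer_norm(sample) for sample in batch]
for row in out:
    print(row)
```

Because the statistics come from a single sample, the result for a sample is the same whether it is processed alone or inside a large batch.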

3. The most common way to standardize is to subtract the mean and then divide by the standard deviation.
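
Written out for a group of n numbers x_1, ..., x_n (one feature across the batch for batch normalization, or all features of one sample for layer normalization), the rule looks roughly as follows. The small constant \epsilon is an assumption added here for numerical stability, as is common in practice; with \epsilon = 0 this is exactly "subtract the mean, divide by the standard deviation".

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2, \qquad
\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}
```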

4. Purposes of normalization: 1) speed up training; 2) help prevent exploding gradients.

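Batch normalization is commonly used in CNNs, whereas layer normalization is a better fit for RNNs and Transformers: sequence lengths vary, so some features are absent in some samples, which complicates feature-wise normalization.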
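
A small sketch of why variable lengths are awkward for feature-wise statistics, under the assumption that each time step of a sequence is treated as one "feature". The toy `sequences` below are made up for illustration.

```python
# Three sequences of different lengths (hypothetical values)
sequences = [[0.5, 1.2, 0.3, 0.9],
             [0.7, 1.1],
             [0.2]]

max_len = max(len(s) for s in sequences)

# For each time step, count how many samples actually have a value there.
real_counts = [sum(1 for s in sequences if len(s) > t) for t in range(max_len)]
print(real_counts)  # [3, 2, 1, 1]
```

At the later time steps only one sample contributes, so a per-feature mean and standard deviation computed across the batch would rest on a single value. Per-sample statistics do not have this problem, since each sample uses only its own values.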

5. Drawbacks of batch normalization:

1) In batch normalization, we use the batch statistics: the mean and standard deviation corresponding to the current mini-batch. However, when the batch size is small, the sample mean and sample standard deviation are not representative enough of the actual distribution and the network cannot learn anything meaningful.

2) As batch normalization depends on batch statistics for normalization, it is less suited for sequence models. This is because, in sequence models, we may have sequences of potentially different lengths and smaller batch sizes corresponding to longer sequences.
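
The small-batch issue in point 1) can be illustrated with a rough simulation. This is only a sketch: the `population` of activations is synthetic (drawn from a standard normal distribution with a fixed seed), and `spread_of_batch_means` is a hypothetical helper that measures how much the batch mean fluctuates from one mini-batch to the next.

```python
import random
from statistics import fmean, pstdev

random.seed(0)

# Hypothetical activations of one feature over a whole dataset (mean 0, std 1)
population = [random.gauss(0.0, 1.0) for _ in range(10_000)]


def spread_of_batch_means(batch_size, trials=200):
    """Standard deviation of the batch mean over many random mini-batches."""
    means = [fmean(random.sample(population, batch_size)) for _ in range(trials)]
    return pstdev(means)


small = spread_of_batch_means(batch_size=2)
large = spread_of_batch_means(batch_size=256)
print(f"batch size 2:   spread of batch means = {small:.3f}")
print(f"batch size 256: spread of batch means = {large:.3f}")
```

The batch mean fluctuates far more with a batch size of 2 than with 256, which is the sense in which small-batch statistics are "not representative enough" of the actual distribution.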

Reference

https://www.pinecone.io/learn/batch-layer-normalization/
