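1. Batch normalization standardizes per feature: the values of one feature across all samples in a batch form a group, and that group is standardized.
2. Layer normalization standardizes per sample: all the features of one sample form a group, and that group is standardized.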
3. The most common way to standardize is to subtract the mean and then divide by the standard deviation.
4. Purpose of standardization: 1) it speeds up training; 2) it helps prevent exploding gradients.
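Batch normalization is commonly used in CNNs, while layer normalization is a better fit for RNNs and transformers. Sequence data varies in length, so some features (positions) are missing in some samples, which makes per-feature standardization awkward.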
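The definitions in items 1 to 3 can be sketched in a few lines of plain Python. This is a minimal illustration, not how a deep learning framework implements these layers: the function names `standardize`, `batch_norm`, and `layer_norm` are made up for this note, the small `eps` is an assumed constant added to avoid division by zero, and the learnable scale and shift parameters that real layers usually have are left out.

```python
from statistics import fmean, pstdev

def standardize(values, eps=1e-5):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    # eps is an assumed small constant that guards against sigma == 0.
    return [(v - mu) / (sigma + eps) for v in values]

def batch_norm(batch):
    """Standardize each feature (column) across all samples in the batch."""
    columns = [standardize(col) for col in zip(*batch)]
    # Transpose back to the (sample, feature) layout.
    return [list(row) for row in zip(*columns)]

def layer_norm(batch):
    """Standardize each sample (row) across its own features."""
    return [standardize(row) for row in batch]

# Toy batch: 3 samples (rows) x 2 features (columns).
batch = [[1.0, 10.0],
         [2.0, 20.0],
         [3.0, 30.0]]

bn = batch_norm(batch)  # every column now has mean ~0
ln = layer_norm(batch)  # every row now has mean ~0
print(bn)
print(ln)
```

Note that `layer_norm` only ever looks at one row at a time, so its result for a sample does not depend on which other samples happen to be in the batch.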
5. Drawbacks of batch normalization:
1) Batch normalization uses batch statistics: the mean and standard deviation of the current mini-batch. When the batch size is small, the sample mean and sample standard deviation are noisy estimates of the actual distribution, which can destabilize training and hurt what the network learns.
2) Because batch normalization depends on batch statistics, it is less suited to sequence models. Sequences may have different lengths, and longer sequences often force smaller batch sizes.
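The small-batch problem in the first drawback can be checked with a quick simulation. The sketch below is only an illustration under assumed conditions: the "activations" are synthetic draws from a standard normal distribution, and the helper `spread_of_batch_means` is a name invented for this note. It measures how much the per-batch mean fluctuates from one random mini-batch to the next.

```python
import random
from statistics import fmean, pstdev

def spread_of_batch_means(population, batch_size, num_batches, rng):
    """Standard deviation of the per-batch mean over repeated random mini-batches."""
    means = [fmean(rng.sample(population, batch_size)) for _ in range(num_batches)]
    return pstdev(means)

rng = random.Random(0)
# Hypothetical activations of a single feature: 10,000 draws from N(0, 1).
population = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

small = spread_of_batch_means(population, batch_size=2, num_batches=500, rng=rng)
large = spread_of_batch_means(population, batch_size=64, num_batches=500, rng=rng)
print(f"batch size 2:  spread of batch means = {small:.3f}")
print(f"batch size 64: spread of batch means = {large:.3f}")
```

The spread of the batch mean shrinks roughly in proportion to one over the square root of the batch size, so with very small batches each step normalizes with noticeably different statistics.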
Reference
https://www.pinecone.io/learn/batch-layer-normalization/