- Encoder structure
2. layer normalization:
-
什么是covariate shift?
Covariate shift is the change in the distribution of the covariates specifically, that is, the independent variables.
在机器学习实践中,我们一定要注意训练数据集和实际情况产生的数据分布不同而带来的影响。 -
batch norm vs layer norm
BN:
LN:
- Layer normalized recurrent neural networks
its normalization terms dependonly on the summed inputs to a layer at the current time-step. It also has only one set of gain (g) and bias (b) parameters shared over all time-steps.