Reference: https://stackoverflow.com/questions/39691902/ordering-of-batch-normalization-and-dropout
In general, the usage order is:
-> CONV/FC -> ReLu(or other activation) -> Dropout -> BatchNorm -> CONV/FC ->
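A minimal PyTorch sketch of this ordering (the layer sizes and dropout rate here are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

# FC -> ReLU -> Dropout -> BatchNorm -> FC, matching the ordering above
block = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.BatchNorm1d(128),
    nn.Linear(128, 10),
)

x = torch.randn(32, 64)  # a batch of 32 samples
out = block(x)
print(out.shape)  # torch.Size([32, 10])
```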
However, the usual advice is to just drop the `Dropout` when you already have `BN`:

- "BN eliminates the need for Dropout in some cases cause BN provides similar regularization benefits as Dropout intuitively"
- "Architectures like ResNet, DenseNet, etc. not using Dropout"

For more details, refer to the paper [Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift], as already mentioned by @Haramoz in the comments.
So the more recommended ordering is:
-> CONV/FC -> ReLu(or other activation) -> BatchNorm -> CONV/FC ->
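The same idea for the conv case, sketched in PyTorch (channel counts and kernel sizes are illustrative assumptions): Dropout is removed entirely, leaving BN as the only regularizer.

```python
import torch
import torch.nn as nn

# CONV -> ReLU -> BatchNorm -> CONV, with Dropout dropped entirely
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(16),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
)

x = torch.randn(8, 3, 28, 28)  # batch of 8 RGB images, 28x28
out = block(x)
print(out.shape)  # torch.Size([8, 32, 28, 28])
```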
Summary:
- Dropout comes before BN/LN (especially in residual nets: dropout must be applied first so that the norm is taken over the residual sum)
- The activation generally comes before both dropout and norm
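One reading of the residual-net point above, as a PyTorch sketch (the block structure, dimension, and dropout rate are my assumptions, not from the source): dropout is applied to the branch output first, and LayerNorm is then taken over the residual sum.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Post-norm residual block: FC -> ReLU -> Dropout, then
    LayerNorm over (input + branch), i.e. drop before norm."""

    def __init__(self, dim: int, p: float = 0.1):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.act = nn.ReLU()       # activation before drop and norm
        self.drop = nn.Dropout(p)  # dropout before the norm
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Norm is applied to the residual sum, after dropout
        return self.norm(x + self.drop(self.act(self.fc(x))))

x = torch.randn(4, 64)
y = ResidualBlock(64)(x)
print(y.shape)  # torch.Size([4, 64])
```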