1、Group Normalization: https://arxiv.org/abs/1803.08494
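Group Normalization (GN) addresses Batch Normalization's (BN) dependence on batch size. In tasks such as object detection, image segmentation, and video classification, the batch size is often small, so the batch statistics BN relies on become noisy and BN performs poorly. As shown in the figure below, Group Normalization is a compromise between Layer Normalization and Instance Normalization: with G = 1 it reduces to Layer Normalization, and with G = C it reduces to Instance Normalization.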
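import tensorflow as tf

def GroupNorm(x, gamma, beta, G, eps=1e-5):
    # x: input features with shape [N, C, H, W]
    # gamma, beta: scale and offset, with shape [1, C, 1, 1]
    # G: number of groups for GN; C must be divisible by G
    N, C, H, W = x.shape
    # Split the channel axis into G groups of C // G channels each
    x = tf.reshape(x, [N, G, C // G, H, W])
    # Per-sample, per-group statistics (keepdims was keep_dims in older TF 1.x)
    mean, var = tf.nn.moments(x, [2, 3, 4], keepdims=True)
    x = (x - mean) / tf.sqrt(var + eps)
    x = tf.reshape(x, [N, C, H, W])
    return x * gamma + beta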
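The relationship to Layer Normalization and Instance Normalization can be checked numerically. The following is a minimal NumPy sketch, not the paper's TensorFlow code: the helper names group_norm_np, layer_norm_np and instance_norm_np are illustrative assumptions, and the learnable gamma and beta are left out so that only the normalization statistics are compared.

```python
import numpy as np

def group_norm_np(x, G, eps=1e-5):
    # x: [N, C, H, W]; statistics are taken per sample and per group,
    # over the C // G channels of the group and all spatial positions.
    N, C, H, W = x.shape
    assert C % G == 0, "C must be divisible by G"
    xg = x.reshape(N, G, C // G, H, W)
    mean = xg.mean(axis=(2, 3, 4), keepdims=True)
    var = xg.var(axis=(2, 3, 4), keepdims=True)
    xg = (xg - mean) / np.sqrt(var + eps)
    return xg.reshape(N, C, H, W)

def layer_norm_np(x, eps=1e-5):
    # Layer Normalization: one mean/variance per sample over (C, H, W).
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm_np(x, eps=1e-5):
    # Instance Normalization: one mean/variance per sample and channel over (H, W).
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8, 4, 4))

# G = 1 puts all channels in one group; G = C gives every channel its own group.
ln_match = np.allclose(group_norm_np(x, G=1), layer_norm_np(x))
in_match = np.allclose(group_norm_np(x, G=8), instance_norm_np(x))
```

Intermediate values of G (for example G = 4 here) pool statistics over a few channels at a time, which is the middle ground between the two extremes.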
According to the experimental results, GN is less sensitive to batch size than BN, i.e., it is more robust.
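One way to see where this robustness comes from: GN computes its statistics within each sample, so a sample's normalized output does not change when the rest of the batch changes, whereas training-mode BN statistics do. The sketch below illustrates only this mechanism under simplifying assumptions (NumPy, random data, no gamma/beta, hypothetical helper names group_norm_np and batch_norm_np); it does not reproduce the accuracy experiments.

```python
import numpy as np

def group_norm_np(x, G, eps=1e-5):
    # Per-sample, per-group statistics over (C // G, H, W).
    N, C, H, W = x.shape
    xg = x.reshape(N, G, C // G, H, W)
    mean = xg.mean(axis=(2, 3, 4), keepdims=True)
    var = xg.var(axis=(2, 3, 4), keepdims=True)
    return ((xg - mean) / np.sqrt(var + eps)).reshape(N, C, H, W)

def batch_norm_np(x, eps=1e-5):
    # Training-mode BN: per-channel statistics over the batch and spatial axes.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
big = rng.normal(size=(32, 8, 4, 4))   # batch of 32 samples
small = big[:2]                        # the same first two samples, batch of 2

# GN output for the first two samples is identical in both batches.
gn_same = np.allclose(group_norm_np(big, G=4)[:2], group_norm_np(small, G=4))
# BN output for the same samples shifts, because the batch statistics changed.
bn_same = np.allclose(batch_norm_np(big)[:2], batch_norm_np(small))
```

With a batch of 2 the BN mean and variance are estimated from far fewer values than with a batch of 32, which is the source of the noise that small-batch training suffers from.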
https://github.com/shaohua0116/Group-Normalization-Tensorflow
2、Focal Loss for Dense Object Detection
3、