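While studying the SSD code recently, I noticed that L2 Norm is applied after the Conv4_3 feature layer, so I looked into it.
First, a brief introduction to L2 Norm. It is simple: square every value, sum the squares, and take the square root; that root is the denominator, and each value in turn is the numerator. The article "L2 Regularization and Batch Norm" covers the differences in detail.
In short, though, L2 Norm normalizes along the channel axis (one norm per pixel), whereas batch_norm computes its mean and variance over [batch, height, width] (one set of statistics per channel).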
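To make the difference in axes concrete, here is a minimal NumPy sketch. The helper name `l2_normalize`, the `eps` guard, and the toy NHWC shape are assumptions chosen for illustration; they are not taken from any SSD codebase, and the batch-norm part omits the learned scale and shift.

```python
import numpy as np

def l2_normalize(x, axis, eps=1e-12):
    """Divide x by its L2 norm along `axis` (eps only guards against division by zero)."""
    norm = np.sqrt(np.sum(np.square(x), axis=axis, keepdims=True))
    return x / np.maximum(norm, eps)

# The textbook case: [3, 4] has norm 5, so the result is [0.6, 0.8].
simple = l2_normalize(np.array([3.0, 4.0]), axis=0)

# A toy NHWC feature map: batch=2, height=2, width=2, channels=3.
rng = np.random.default_rng(0)
feats = rng.normal(size=(2, 2, 2, 3))

# L2 Norm: one norm per pixel, computed across the channel axis.
l2_out = l2_normalize(feats, axis=3)
channel_norms = np.linalg.norm(l2_out, axis=3)  # shape (2, 2, 2), every entry close to 1

# Batch-norm-style statistics: one mean and variance per channel,
# computed across batch, height and width.
mean = feats.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 3)
var = feats.var(axis=(0, 1, 2), keepdims=True)
bn_out = (feats - mean) / np.sqrt(var + 1e-5)
```

After L2 normalization every pixel's channel vector has unit length, while the batch-norm-style output has roughly zero mean and unit variance per channel.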
As for why L2 normalization is used, the original author says that conv4_3 differs from the other feature layers in that it has a different scale:
That was discovered in my other paper (ParseNet) that conv4_3 has different scale from other layers. That is why I add L2 normalization for conv4_3 only.
I then looked at several versions of the SSD code. The main implementations are the following:
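# 1. TensorFlow/Keras: define a new layer
# A custom layer defines the forward pass; Keras derives the backward pass automatically
import numpy as np
from keras import backend as K
from keras.layers import Layer, InputSpec  # assumes the Keras 2.x API this code was written against

class L2Normalization(Layer):
    """
    In deep neural networks, vectors with different scales are occasionally concatenated.
    L2 normalization can then bring the concatenated vector to a common scale,
    so that the network converges quickly.
    """
    def __init__(self, gamma_init=20, **kwargs):
        self.axis = 3  # channel axis for NHWC inputs
        self.gamma_init = gamma_init
        super(L2Normalization, self).__init__(**kwargs)

    def build(self, input_shape):
        self.input_spec = [InputSpec(shape=input_shape)]
        # One learnable scale per channel, initialized to gamma_init
        gamma = self.gamma_init * np.ones((input_shape[self.axis],))
        self.gamma = K.variable(gamma, name='{}_gamma'.format(self.name))
        # Older Keras 2.x style; newer tf.keras versions may require self.add_weight instead
        self.trainable_weights = [self.gamma]
        super(L2Normalization, self).build(input_shape)

    def call(self, x, mask=None):
        outputs = K.l2_normalize(x, self.axis)
        return outputs * self.gamma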
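A quick arithmetic check suggests why a rescale factor such as `gamma_init=20` is useful. The sketch below assumes, purely for illustration, that all 512 channels at a pixel have the same magnitude; real features will not be that uniform.

```python
import math

n_channels = 512   # conv4_3 has 512 channels
gamma_init = 20    # initial rescale factor used in the implementations here

# Illustrative assumption: every channel at a pixel has the same magnitude.
# After L2 normalization the channel vector has unit norm,
# so each entry equals 1 / sqrt(n_channels).
per_element = 1.0 / math.sqrt(n_channels)
rescaled = gamma_init * per_element

print(f"after L2 norm: {per_element:.4f}, after rescaling: {rescaled:.4f}")
# prints: after L2 norm: 0.0442, after rescaling: 0.8839
```

Without the rescale, the normalized activations would be very small; multiplying by 20 brings them back to roughly order one under this assumption.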
# 2. TensorFlow: build an l2norm function
import tensorflow as tf  # TF 1.x graph-mode API (tf.variable_scope / tf.get_variable)

def l2norm(x, scale, trainable, scope):
    n_channels = x.get_shape().as_list()[-1]  # number of channels
    # Normalize each pixel across the channel axis only
    l2_norm = tf.math.l2_normalize(x, axis=3, epsilon=1e-12)
    with tf.variable_scope(scope):
        gamma = tf.get_variable("gamma", shape=[n_channels, ], dtype=tf.float32,
                                initializer=tf.constant_initializer(scale),
                                trainable=trainable)
    return l2_norm * gamma
# 3. TensorFlow: manual (brute-force) implementation
import tensorflow as tf

def l2norm_v2(conv4_3_feats):
    # Since lower level features (conv4_3_feats) have considerably larger scales, we take the L2 norm and rescale
    # Rescale factor is initially set at 20, but is learned for each channel during back-prop
    # PyTorch original: self.rescale_factors = nn.Parameter(torch.FloatTensor(1, 512, 1, 1)), initialized to 20
    # PyTorch original: norm = conv4_3_feats.pow(2).sum(dim=1, keepdim=True).sqrt()  # (N, 1, 38, 38)
    squared = tf.math.pow(conv4_3_feats, 2)  # (N, 38, 38, 512), channels last
    norm = tf.math.sqrt(tf.math.reduce_sum(squared, axis=3, keepdims=True))  # (N, 38, 38, 1)
    conv4_3_feats = conv4_3_feats / norm  # (N, 38, 38, 512)
    gamma_init = 20
    n_channels = conv4_3_feats.get_shape().as_list()[3]
    # Note: this creates a new variable on every call; for training, create gamma once and reuse it
    gamma = tf.Variable(gamma_init * tf.ones((n_channels,), dtype=conv4_3_feats.dtype), name='l2norm_gamma')
    return conv4_3_feats * gamma
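# 4. PyTorch: direct computation
# self.rescale_factors is a learnable per-channel scale of shape (1, 512, 1, 1), initialized to 20
# Rescale conv4_3 after L2 norm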
norm = conv4_3_feats.pow(2).sum(dim=1, keepdim=True).sqrt() # (N, 1, 38, 38)
conv4_3_feats = conv4_3_feats / norm # (N, 512, 38, 38)
conv4_3_feats = conv4_3_feats * self.rescale_factors # (N, 512, 38, 38)
# (PyTorch autobroadcasts singleton dimensions during arithmetic)
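The channels-first computation (`dim=1`) and the channels-last computation (`axis=3`) should agree up to a transpose. The following NumPy sketch checks this and shows the broadcasting shapes; it mirrors the snippets above rather than calling PyTorch or TensorFlow, and the random input is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, H, W = 2, 512, 38, 38

# Channels-first features, as in the PyTorch snippet: (N, 512, 38, 38).
feats_nchw = rng.normal(size=(N, C, H, W)).astype(np.float32)
rescale_factors = np.full((1, C, 1, 1), 20.0, dtype=np.float32)

norm = np.sqrt((feats_nchw ** 2).sum(axis=1, keepdims=True))  # (N, 1, 38, 38)
# (1, 512, 1, 1) broadcasts against (N, 512, 38, 38): one factor per channel.
out_nchw = feats_nchw / norm * rescale_factors                # (N, 512, 38, 38)

# Channels-last layout, as in the TensorFlow versions: (N, 38, 38, 512).
feats_nhwc = feats_nchw.transpose(0, 2, 3, 1)
gamma = np.full((C,), 20.0, dtype=np.float32)
norm_last = np.sqrt((feats_nhwc ** 2).sum(axis=3, keepdims=True))  # (N, 38, 38, 1)
# A (512,) vector broadcasts along the last axis: again one factor per channel.
out_nhwc = feats_nhwc / norm_last * gamma                          # (N, 38, 38, 512)

# The two layouts give the same result once the axes are aligned.
same = np.allclose(out_nchw.transpose(0, 2, 3, 1), out_nhwc, atol=1e-4)
```

With every factor still at its initial value of 20, each pixel's channel vector has length 20 after rescaling, whichever layout is used.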