On the L2 Norm Applied to Conv4_3 in SSD

Latest recommended article published on 2024-03-26 14:04:21

loovelj

Article link: https://blog.csdn.net/loovelj/article/details/106556851


While studying SSD code recently, I noticed that an L2 Norm is applied after the Conv4_3 feature layer, so I looked into it.

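First, a quick introduction to L2 Norm. It is simple: square every value, sum the squares, and take the square root; that root is the denominator, and each value is divided by it. The article "L2 Regularization and Batch Norm" covers the differences in detail.
In short, L2 Norm normalizes each pixel's feature vector across the channel axis, whereas batch_norm computes its statistics over [batch, height, width] for each channel.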
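To make the axis difference concrete, here is a minimal pure-Python sketch. It is not taken from any SSD codebase: the function names `l2_normalize_channels` and `standardize_per_channel` and the pixel values are made up for illustration, and the second function only mimics the statistics batch norm uses (no learned scale or shift).

```python
import math


def l2_normalize_channels(pixels, eps=1e-12):
    """L2-normalize each pixel's channel vector (one norm per pixel)."""
    out = []
    for channels in pixels:
        norm = math.sqrt(max(sum(v * v for v in channels), eps))
        out.append([v / norm for v in channels])
    return out


def standardize_per_channel(pixels, eps=1e-5):
    """Batch-norm-style statistics: one mean/variance per channel,
    computed over every pixel (batch, height and width flattened)."""
    n = len(pixels)
    n_channels = len(pixels[0])
    out = [[0.0] * n_channels for _ in pixels]
    for c in range(n_channels):
        column = [p[c] for p in pixels]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i, v in enumerate(column):
            out[i][c] = (v - mean) / math.sqrt(var + eps)
    return out


# Two pixels with two channels each (hypothetical values)
pixels = [[3.0, 4.0], [6.0, 8.0]]
l2 = l2_normalize_channels(pixels)    # every pixel ends up with unit length
bn = standardize_per_channel(pixels)  # every channel ends up with zero mean
```

With L2 normalization both pixels map to the same unit vector, because they differ only in magnitude. The per-channel standardization instead keeps the two pixels apart and centers each channel around zero.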

As for why L2 normalization is used, the original author says that conv4_3 differs from the other feature layers in that it has a different scale:

That was discovered in my other paper (ParseNet) that conv4_3 has different scale from other layers. That is why I add L2 normalization for conv4_3 only.

I then looked at several versions of the SSD code. The main implementations are the following:



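# 1. TensorFlow (Keras): define a new layer
# A custom layer takes part in both the forward and the backward pass
import tensorflow as tf
from tensorflow.keras.layers import Layer


class L2Normalization(Layer):
    """
    In deep neural networks, vectors with different scales are sometimes
    concatenated. L2 normalization brings them to a common scale so that
    the network converges quickly.
    """

    def __init__(self, gamma_init=20, **kwargs):
        self.axis = 3  # channel axis for NHWC input
        self.gamma_init = gamma_init
        super(L2Normalization, self).__init__(**kwargs)

    def build(self, input_shape):
        # One learnable scale per channel, initialized to gamma_init.
        # add_weight registers the variable as trainable; older Keras code
        # assigned self.trainable_weights directly, which tf.keras rejects.
        self.gamma = self.add_weight(
            name='{}_gamma'.format(self.name),
            shape=(input_shape[self.axis],),
            initializer=tf.keras.initializers.Constant(self.gamma_init),
            trainable=True)
        super(L2Normalization, self).build(input_shape)

    def call(self, x, mask=None):
        outputs = tf.math.l2_normalize(x, self.axis)
        return outputs * self.gamma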


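# 2. TensorFlow: build L2 norm as a function (assumes `import tensorflow as tf`)
# scale, trainable and scope were undefined in the original snippet; the
# defaults below are assumptions (20 matches gamma_init in method 1).
def l2norm(x, scale=20, trainable=True, scope="L2Normalization"):
    n_channels = x.get_shape().as_list()[-1]  # number of channels
    # Normalize each pixel across its channels only
    l2_norm = tf.math.l2_normalize(x, axis=3, epsilon=1e-12)
    # Learnable per-channel scale (TF1-style variable, via tf.compat.v1)
    with tf.compat.v1.variable_scope(scope):
        gamma = tf.compat.v1.get_variable("gamma", shape=[n_channels, ], dtype=tf.float32,
                                          initializer=tf.constant_initializer(scale),
                                          trainable=trainable)
    return l2_norm * gamma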


# 3. TensorFlow: brute-force implementation (NHWC layout, assumes `import tensorflow as tf`)
def l2norm_v2(conv4_3_feats, gamma_init=20):
    # Since lower level features (conv4_3_feats) have considerably larger scales, we take the L2 norm and rescale
    # Rescale factor is initially set at 20, but is learned for each channel during back-prop
    squared = tf.math.pow(conv4_3_feats, 2)  # (N, 38, 38, 512)
    norm = tf.math.sqrt(tf.math.reduce_sum(squared, axis=3, keepdims=True))  # (N, 38, 38, 1)
    conv4_3_feats = conv4_3_feats / norm  # (N, 38, 38, 512)
    # print(conv4_3_feats)
    n_channels = conv4_3_feats.shape[3]
    # Note: this creates a new variable on every call; in real training code,
    # create gamma once and reuse it so that it can actually be learned.
    gamma = tf.Variable(gamma_init * tf.ones([n_channels], dtype=conv4_3_feats.dtype),
                        name='conv4_3_gamma')
    return conv4_3_feats * gamma

# 4. PyTorch: direct computation (NCHW layout)
# In the module's __init__: one learnable rescale factor per channel, initialized to 20
self.rescale_factors = nn.Parameter(torch.FloatTensor(1, 512, 1, 1))  # there are 512 channels in conv4_3_feats
nn.init.constant_(self.rescale_factors, 20)

# In forward: rescale conv4_3 after L2 norm
norm = conv4_3_feats.pow(2).sum(dim=1, keepdim=True).sqrt()  # (N, 1, 38, 38)
conv4_3_feats = conv4_3_feats / norm  # (N, 512, 38, 38)
conv4_3_feats = conv4_3_feats * self.rescale_factors  # (N, 512, 38, 38)
# (PyTorch autobroadcasts singleton dimensions during arithmetic)
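
All four variants compute the same thing for each pixel: divide the channel vector by its L2 norm, then multiply by a per-channel factor that starts at 20. The following is a small pure-Python sketch of that arithmetic for a single pixel. The helper name `l2norm_rescale` and the two-channel pixel are hypothetical and only serve as a worked example.

```python
import math


def l2norm_rescale(channels, gamma, eps=1e-12):
    """Normalize one pixel's channel vector to unit L2 norm, then scale per channel."""
    norm = math.sqrt(max(sum(v * v for v in channels), eps))
    return [g * v / norm for v, g in zip(channels, gamma)]


pixel = [3.0, 4.0]    # hypothetical 2-channel pixel, L2 norm = 5
gamma = [20.0, 20.0]  # initial rescale factor of 20 for every channel
scaled = l2norm_rescale(pixel, gamma)
scaled_norm = math.sqrt(sum(v * v for v in scaled))
```

While every channel still shares the initial factor of 20, each pixel's feature vector leaves this step with an L2 norm of 20, whatever its original magnitude was. Once training moves the per-channel factors apart, this no longer holds exactly.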