(Repost) The Difference Between concatenate and add Layers in Neural Networks

qq_41978139

Category: Deep Learning

Original link: https://blog.csdn.net/u012193416/article/details/79479935

2018-03-08 10:01:58, Kun Li

Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission. https://blog.csdn.net/u012193416/article/details/79479935
In network architecture design, it is often said that DenseNet and Inception mostly use the concatenate operation, while ResNet mostly uses the add operation. So what are the similarities and differences between the two?

Concatenate is an important operation in network architecture design. It is often used to combine features: fusing the features extracted by several convolutional feature extractors, or fusing the information from output layers. The add layer, by contrast, is more like a superposition of information.

"This reveals that both DenseNets and ResNets densely aggregate features from prior layers and their essential difference is how features are aggregated: ResNets aggregate features by summation and DenseNets aggregate them by concatenation."

ResNet superimposes values, so the number of channels stays the same; DenseNet merges channels. One way to think about it: with add, the amount of information carried by each feature describing the image increases, but the number of dimensions describing the image does not. Only the information within each dimension grows, which generally helps the final image classification. With concatenate, the channels are merged, so the number of features describing the image increases while the information within each feature does not.

At the code level, ResNet uses the add operation throughout, while DenseNet uses concatenate.

This is quite instructive when designing network architectures.

Looking at the Keras source, the add operation is:

    def _merge_function(self, inputs):
        # Start from the first input and accumulate the rest element-wise.
        output = inputs[0]
        for i in range(1, len(inputs)):
            output += inputs[i]
        return output

This simply performs an element-wise sum. For example:

    import keras

    # Two branches with different input sizes, both projected to 8 units.
    input1 = keras.layers.Input(shape=(16,))
    x1 = keras.layers.Dense(8, activation='relu')(input1)
    input2 = keras.layers.Input(shape=(32,))
    x2 = keras.layers.Dense(8, activation='relu')(input2)
    # add requires x1 and x2 to have the same shape.
    added = keras.layers.add([x1, x2])

    out = keras.layers.Dense(4)(added)
    model = keras.models.Model(inputs=[input1, input2], outputs=out)
    model.summary()

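The printed model structure is:

    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_1 (InputLayer)            (None, 16)           0
    __________________________________________________________________________________________________
    input_2 (InputLayer)            (None, 32)           0
    __________________________________________________________________________________________________
    dense_1 (Dense)                 (None, 8)            136         input_1[0][0]
    __________________________________________________________________________________________________
    dense_2 (Dense)                 (None, 8)            264         input_2[0][0]
    __________________________________________________________________________________________________
    add_1 (Add)                     (None, 8)            0           dense_1[0][0]
                                                                     dense_2[0][0]
    __________________________________________________________________________________________________
    dense_3 (Dense)                 (None, 4)            36          add_1[0][0]
    ==================================================================================================
    Total params: 436
    Trainable params: 436
    Non-trainable params: 0
    __________________________________________________________________________________________________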

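This is easy to follow: the add layer is a merge operation attached after dense_1 and dense_2, and it has no trainable parameters. Both inputs have the shape (None, 8), and the output keeps that shape.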
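The parameter counts in the summary can be checked by hand. The following is a minimal sketch, assuming each Dense layer is a standard fully connected layer with a bias term (in_dim × units weights plus units biases). `dense_params` is a hypothetical helper written for this illustration, not a Keras function.

```python
def dense_params(in_dim, units):
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return in_dim * units + units

dense_1 = dense_params(16, 8)   # input_1 (16 features) -> 8 units
dense_2 = dense_params(32, 8)   # input_2 (32 features) -> 8 units
add_1 = 0                       # element-wise sum: nothing to learn
dense_3 = dense_params(8, 4)    # the sum is still 8-dimensional

total = dense_1 + dense_2 + add_1 + dense_3
print(dense_1, dense_2, dense_3, total)  # 136 264 36 436
```

If the two branches were concatenated instead of added, dense_3 would receive 16 features rather than 8, so under the same assumption it would need dense_params(16, 4) = 68 parameters.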

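By comparison, the concatenate operation is a little harder to understand.

    # Excerpt from the Keras TensorFlow backend's concatenate function
    if py_all([is_sparse(x) for x in tensors]):
        return tf.sparse_concat(axis, tensors)
    else:
        return tf.concat([to_dense(x) for x in tensors], axis)

The Keras source shows that one branch returns tf.sparse_concat and the other returns tf.concat, which makes things clearer.

An example of the concat operation: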

    t1 = [[1, 2, 3], [4, 5, 6]]
    t2 = [[7, 8, 9], [10, 11, 12]]
    tf.concat([t1, t2], 0) ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
    tf.concat([t1, t2], 1) ==> [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]

    # tensor t3 with shape [2, 3]
    # tensor t4 with shape [2, 3]
    tf.shape(tf.concat([t3, t4], 0)) ==> [4, 3]
    tf.shape(tf.concat([t3, t4], 1)) ==> [2, 6]

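In effect, concat joins tensors along one dimension. With axis=0 the tensors are joined along the first dimension, so rows are stacked and [2, 3] plus [2, 3] becomes [4, 3]. With axis=1 they are joined along the second dimension, so each row gets longer and the result is [2, 6]. In a network, the two tensors are joined along the channel dimension. The other function, sparse_concat, is the same concatenation for sparse tensors, which is also easy to understand.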
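To make the contrast with add concrete, here is a small pure-Python sketch that mimics both operations on 2-D nested lists. `shape_2d`, `concat_2d`, and `add_2d` are hypothetical helpers for illustration only, not TensorFlow or Keras functions; real code would call tf.concat or a Keras merge layer.

```python
def shape_2d(t):
    """Return [rows, cols] of a rectangular nested list."""
    return [len(t), len(t[0])]

def concat_2d(a, b, axis):
    """Join two 2-D nested lists along `axis` (illustrative stand-in for tf.concat)."""
    if axis == 0:
        # Stack rows: the column counts must match.
        if len(a[0]) != len(b[0]):
            raise ValueError("column counts differ")
        return [row[:] for row in a] + [row[:] for row in b]
    if axis == 1:
        # Extend each row: the row counts must match.
        if len(a) != len(b):
            raise ValueError("row counts differ")
        return [ra + rb for ra, rb in zip(a, b)]
    raise ValueError("axis must be 0 or 1 for 2-D inputs")

def add_2d(a, b):
    """Element-wise sum; both inputs must have identical shapes."""
    if shape_2d(a) != shape_2d(b):
        raise ValueError("shapes differ")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]

print(shape_2d(concat_2d(t1, t2, 0)))  # [4, 3]
print(shape_2d(concat_2d(t1, t2, 1)))  # [2, 6]
print(shape_2d(add_2d(t1, t2)))        # [2, 3]
print(add_2d(t1, t2))                  # [[8, 10, 12], [14, 16, 18]]
```

Concatenation grows one dimension and keeps every original value, while addition keeps the shape and merges the values position by position.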