DepthwiseConv2D and Conv2D Explained

人工智能和FPGA AI技术

Categories: AI, tensorflow, Python

Article link: https://blog.csdn.net/u010879745/article/details/108043183

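The difference between depthwise_conv2d and conv2d: conv2d convolves every input channel with its own slice of the filter and then sums the results across channels, whereas depthwise_conv2d convolves each channel separately and does not sum.
[https://www.cnblogs.com/itmorn/p/11250371.html]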

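When a 1x1 convolution is used for dimensionality reduction, the height and width of each channel stay unchanged and the number of channels becomes the number of kernels. A 1x1 convolution applied depthwise should implement multiplication along the depth direction, also called channel-wise multiplication.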
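As a minimal sketch of the channel-wise multiplication mentioned above (plain Python rather than TensorFlow; the helper name `depthwise_1x1` and the sample values are made up for illustration): a depthwise 1x1 convolution with one weight per channel scales each channel independently and leaves height and width unchanged.

```python
def depthwise_1x1(image, scales):
    """Scale every channel of an [H][W][C] image by its own 1x1 kernel weight."""
    return [[[value * scales[c] for c, value in enumerate(pixel)]
             for pixel in row]
            for row in image]


# Hypothetical 2x2 image with 3 channels, laid out as [height][width][channel].
image = [[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [10, 11, 12]]]
scales = [2, 0, -1]  # one 1x1 kernel weight per channel

scaled = depthwise_1x1(image, scales)
print(scaled)  # [[[2, 0, -3], [8, 0, -6]], [[14, 0, -9], [20, 0, -12]]]
```

Channels are never mixed here, which is what separates this from an ordinary 1x1 convolution.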

depthwise_conv2d

The tensor x and the kernel K below are convolved with depthwise_conv2d:
[Figure: input tensor x and kernel K]

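The code and its result:

import tensorflow as tf

# TensorFlow 1.x graph-mode API (tf.Session).

# Input layout: [batch, in_height, in_width, in_channels]
x = tf.reshape(
    tf.constant([2, 5, 3, 3, 8, 2, 6, 1, 1, 2, 5, 4, 7, 9, 2, 3, -1, 3], tf.float32),
    [1, 3, 3, 2])

# Filter layout: [filter_height, filter_width, in_channels, channel_multiplier]
kernel = tf.reshape(tf.constant([3, 1, -2, 2, -1, -3, 4, 5], tf.float32), [2, 2, 2, 1])

# Stride 1 in every dimension, no padding.
with tf.Session() as sess:
    print(sess.run(tf.nn.depthwise_conv2d(x, kernel, [1, 1, 1, 1], "VALID")))

Output:

[[[[ -2.  18.]    <- each row holds the values of the different channels at one spatial position
   [ 12.  21.]]

  [[ 17.  -7.]
   [-13.  16.]]]]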
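The TensorFlow result above can be checked by hand. The following is a minimal pure-Python sketch, not TensorFlow's implementation; the helper name `depthwise_conv2d_valid` is made up, and it assumes stride 1, VALID padding and a channel multiplier of 1. It also shows that summing the depthwise output over channels gives the value conv2d would produce for a single filter with the same weights.

```python
def depthwise_conv2d_valid(image, kernel):
    """image: [H][W][C], kernel: [kh][kw][C]; stride 1, VALID padding."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(height - kh + 1):
        out_row = []
        for j in range(width - kw + 1):
            # One output value per channel; channels are never mixed.
            out_row.append([
                sum(image[i + di][j + dj][c] * kernel[di][dj][c]
                    for di in range(kh) for dj in range(kw))
                for c in range(channels)
            ])
        out.append(out_row)
    return out


# Same numbers as the TensorFlow example, written as [row][column][channel].
image = [[[2, 5], [3, 3], [8, 2]],
         [[6, 1], [1, 2], [5, 4]],
         [[7, 9], [2, 3], [-1, 3]]]
kernel = [[[3, 1], [-2, 2]],
          [[-1, -3], [4, 5]]]

depthwise = depthwise_conv2d_valid(image, kernel)
print(depthwise)  # [[[-2, 18], [12, 21]], [[17, -7], [-13, 16]]]

# conv2d with one filter holding the same weights adds the channels together.
conv2d_like = [[sum(pixel) for pixel in row] for row in depthwise]
print(conv2d_like)  # [[16, 33], [10, 3]]
```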

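A single tensor can also be convolved depthwise with several kernels, each applied separately:
[Figure: one tensor convolved depthwise with several kernels]
The output shape is [1, output rows, output columns, number of kernels × number of input channels].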
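The shape rule above can be sketched as a small helper. This is an illustration under stated assumptions, not a TensorFlow API: the function name `depthwise_output_shape` is made up, the layout is NHWC, padding is VALID, and the "number of kernels" is the last filter dimension (the channel multiplier).

```python
def depthwise_output_shape(input_shape, filter_shape, strides=(1, 1)):
    """Output shape of a VALID-padded depthwise convolution in NHWC layout."""
    batch, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in_c, multiplier = filter_shape
    if f_in_c != in_c:
        raise ValueError("filter in_channels must match the input channels")
    out_h = (in_h - f_h) // strides[0] + 1
    out_w = (in_w - f_w) // strides[1] + 1
    # Every input channel yields `multiplier` output channels.
    return (batch, out_h, out_w, in_c * multiplier)


print(depthwise_output_shape((1, 3, 3, 2), (2, 2, 2, 1)))  # (1, 2, 2, 2)
print(depthwise_output_shape((1, 3, 3, 2), (2, 2, 2, 3)))  # (1, 2, 2, 6)
```

The first call matches the 3×3×2 input and 2×2 kernel used earlier; the second uses a hypothetical multiplier of 3 to show the channel count growing to 6.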

Depthwise Convolution
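Using the same example as before, a 64×64-pixel, three-channel color image first goes through one convolution. The difference is that this convolution is carried out entirely within the 2D plane, and the number of filters equals the depth of the previous layer. A three-channel image therefore produces 3 feature maps, as shown in the figure below.
[Figure: depthwise convolution turning a three-channel image into 3 feature maps]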

conv2d

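How to understand convolution in CNNs? - Zhihu https://zhuanlan.zhihu.com/p/35083956
[Figure: zero-padded 7×7×3 input convolved with Filter w0 and Filter w1]
Explanation: the input in the figure is 7×7×3, where 7×7 is the image size in pixels (height × width) and 3 stands for the three color channels R, G and B; the border is padded with zeros. There are two kernels, Filter w0 and Filter w1, and each filter has one set of weights w per channel. When a filter slides to a position, it computes the convolution on each of the three channels, sums the three results and adds the bias, which gives that filter's final value at that position. Each filter's output aggregates all channels, and the number of outputs equals the number of filters, so the right-hand side shows two different outputs.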

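How the output value 1 is computed:

First channel with its weights: 0×1 + 0×1 + 0×(-1) + 0×(-1) + 0×0 + 1×1 + 0×(-1) + 0×(-1) + 1×0 = 1
Second channel with its weights: 0×(-1) + 0×0 + 0×(-1) + 0×0 + 1×0 + 1×(-1) + 0×1 + 0×(-1) + 2×0 = -1
Third channel with its weights: 0×0 + 0×1 + 0×0 + 0×1 + 2×0 + 0×1 + 0×0 + 0×(-1) + 0×0 = 0

Bias: 1

1 + (-1) + 0 + 1 = 1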
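The same arithmetic can be written as a short pure-Python sketch. The patch and weight values are read off the three per-channel sums above (each 3×3 window flattened row by row, input value first and weight second in every product); the variable names are illustrative only.

```python
# Input patches for the three channels at the first filter position.
patches = [
    [0, 0, 0, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1, 1, 0, 0, 2],
    [0, 0, 0, 0, 2, 0, 0, 0, 0],
]
# Weights of the filter, one 3x3 slice per channel.
weights = [
    [1, 1, -1, -1, 0, 1, -1, -1, 0],
    [-1, 0, -1, 0, 0, -1, 1, -1, 0],
    [0, 1, 0, 1, 0, 1, 0, -1, 0],
]
bias = 1

# Convolve each channel on its own, then sum over channels and add the bias.
per_channel = [sum(x * w for x, w in zip(patch, weight))
               for patch, weight in zip(patches, weights)]
output = sum(per_channel) + bias

print(per_channel)  # [1, -1, 0]
print(output)       # 1
```

Stopping at `per_channel` is the depthwise behavior; the final sum across channels is what conv2d adds.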

For the detailed convolution computation, see:
[The TensorFlow convolution function tf.nn.conv2d - Jianshu https://www.jianshu.com/p/c72af2ff5393]

Pointwise Convolution
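A pointwise convolution works much like a regular convolution, except that the kernel size is 1×1×M, where M is the depth of the previous layer. The convolution therefore forms a weighted combination of the previous step's maps along the depth direction to produce new feature maps. There are as many feature maps as there are filters, as shown in the figure below.

[Figure: pointwise convolution with 1×1×M kernels]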
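The weighted combination along the depth direction can be sketched in a few lines of pure Python. This is an illustration only: the helper name `pointwise_conv` and all sample values are made up, with M = 3 input maps and 4 filters chosen arbitrarily.

```python
def pointwise_conv(feature_maps, filters):
    """feature_maps: [H][W][M]; filters: N weight vectors of length M (1x1xM kernels)."""
    return [[[sum(x * w for x, w in zip(pixel, kernel)) for kernel in filters]
             for pixel in row]
            for row in feature_maps]


# Hypothetical 2x2 feature maps with depth M = 3.
feature_maps = [[[1, 2, 3], [0, 1, 0]],
                [[2, 2, 2], [-1, 0, 1]]]
# Four hypothetical 1x1x3 filters, so the output depth is 4.
filters = [[1, 0, -1], [1, 1, 1], [0, 2, 0], [2, -1, 0]]

combined = pointwise_conv(feature_maps, filters)
print(combined[0][0])  # [-2, 6, 4, 0]
print(len(combined), len(combined[0]), len(combined[0][0]))  # 2 2 4
```

Height and width are unchanged, while the depth goes from M to the number of filters.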

The tf.layers.separable_conv2d function
tf.layers.separable_conv2d(
    inputs,
    filters,
    kernel_size,
    strides=(1, 1),
    padding='valid',
    data_format='channels_last',
    dilation_rate=(1, 1),
    depth_multiplier=1,
    activation=None,
    use_bias=True,
    depthwise_initializer=None,
    pointwise_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    depthwise_regularizer=None,
    pointwise_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    depthwise_constraint=None,
    pointwise_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    reuse=None
)

filters: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
kernel_size: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
strides: An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
padding: One of "valid" or "same" (case-insensitive).
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch_size, height, width, channels) while channels_first corresponds to inputs with shape (batch_size, channels, height, width). In keras.layers.SeparableConv2D, the default is instead taken from the image_data_format value in the Keras config file at ~/.keras/keras.json, which is "channels_last" if never set.
dilation_rate: An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
depth_multiplier: The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels will be equal to filters_in * depth_multiplier.
activation: Activation function to use. If you don't specify anything, no activation is applied (see keras.activations).
use_bias: Boolean, whether the layer uses a bias vector.
depthwise_initializer: Initializer for the depthwise kernel matrix (see keras.initializers).
pointwise_initializer: Initializer for the pointwise kernel matrix (see keras.initializers).
bias_initializer: Initializer for the bias vector (see keras.initializers).
depthwise_regularizer: Regularizer function applied to the depthwise kernel matrix (see keras.regularizers).
pointwise_regularizer: Regularizer function applied to the pointwise kernel matrix (see keras.regularizers).
bias_regularizer: Regularizer function applied to the bias vector (see keras.regularizers).
activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see keras.regularizers).
depthwise_constraint: Constraint function applied to the depthwise kernel matrix (see keras.constraints).
pointwise_constraint: Constraint function applied to the pointwise kernel matrix (see keras.constraints).
bias_constraint: Constraint function applied to the bias vector (see keras.constraints).
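To see how filters, kernel_size, depth_multiplier and use_bias interact, the sketch below counts trainable weights for a separable convolution and for a regular convolution with the same output depth. It is a back-of-the-envelope illustration, not part of the TensorFlow API: both function names are made up, and the 3×3 kernel, 3 input channels and 4 filters are assumed values.

```python
def separable_conv2d_param_count(in_channels, filters, kernel_size,
                                 depth_multiplier=1, use_bias=True):
    """Weights of a depthwise stage followed by a 1x1 pointwise stage."""
    k_h, k_w = kernel_size
    depthwise = k_h * k_w * in_channels * depth_multiplier
    # The pointwise stage sees in_channels * depth_multiplier channels.
    pointwise = in_channels * depth_multiplier * filters
    bias = filters if use_bias else 0
    return depthwise + pointwise + bias


def conv2d_param_count(in_channels, filters, kernel_size, use_bias=True):
    """Weights of a regular convolution with the same output depth."""
    k_h, k_w = kernel_size
    bias = filters if use_bias else 0
    return k_h * k_w * in_channels * filters + bias


# Assumed setting: 3 input channels, 4 output filters, 3x3 kernel.
separable_params = separable_conv2d_param_count(3, 4, (3, 3))
regular_params = conv2d_param_count(3, 4, (3, 3))
print(separable_params)  # 43  (27 depthwise + 12 pointwise + 4 bias)
print(regular_params)    # 112 (108 weights + 4 bias)
```

The gap widens as the number of filters grows, because the spatial kernel is paid for only once per input channel.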