[TensorFlow Study Notes-04] The convolution function tf.nn.conv2d

caicaiatnbu

Article link: https://blog.csdn.net/caicaiatnbu/article/details/72792684

[Copyright notice]

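The TensorFlow Study Notes series draws on the following references:

李嘉璇, TensorFlow技术解析与实战
黄文坚 and 唐源, TensorFlow实战
郑泽宇 and 顾思宇, TensorFlow实战Google深度学习框架
乐毅 and 王斌, 深度学习-Caffe之经典模型详解与实战
TensorFlow Chinese community (TensorFlow中文社区), http://www.tensorfly.cn/
极客学院, Chinese edition of the official TensorFlow documentation
The official TensorFlow documentation (English)
As well as many CSDN blogs, GitHub repositories, and so on. I hope this series does not infringe any copyright! (If it does, please contact me by email: 1511082629@nbu.edu.cn)

Feel free to repost and share; the series is updated from time to time. My own knowledge is limited, so if you find any problems, corrections and criticism are welcome!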

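1. The concept of convolution

The convolution process: as shown in the figure below, a 3*3 convolution kernel is convolved over a 5*5 image.

The kernel is shown below. Its size is 3*3, and it slides over the original image with a stride of 1.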
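To make the sliding-window process concrete, here is a minimal pure-Python sketch. The image and kernel values are illustrative assumptions of mine (they are not necessarily the numbers in the figure), and `conv2d_single` is a hypothetical helper name, not a TensorFlow function. It assumes a square image, a square kernel, and no padding.

```python
def conv2d_single(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the output map."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0
            # Multiply the window with the kernel element by element, then sum.
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Illustrative 5*5 image and 3*3 kernel (assumed values).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

result = conv2d_single(image, kernel)
for row in result:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

A 3*3 kernel fits into a 5*5 image at 3 positions along each axis, which is why the output is 3*3.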

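Next, let's look at convolution over a three-channel image:

The computation works as follows. The original image is 7*7 with 3 channels, and the kernel is 3*3. The elements in the blue box of the Input Volume are multiplied with the elements at the corresponding positions in the red box of Filter W0 and then summed, across all three channels, to give res (step 1 in the figure below, the computation of res). Then res is added to Bias b0 (step 2 in the figure below) to give the corresponding entry of the final Output Volume.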
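The two steps (multiply and sum over all channels, then add the bias) can be sketched as follows. This is only an illustration: the 3*3 three-channel input, the filter values, and the bias are numbers I made up to keep the arithmetic easy to check by hand, and `conv_volume` is a hypothetical helper. Channels are stored first here (`volume[c][row][col]`) to mirror how the figure draws one grid per channel.

```python
def conv_volume(volume, filt, bias, stride=1):
    """Convolve a multi-channel volume with one filter and add a bias."""
    channels = len(volume)
    k = len(filt[0])
    out_size = (len(volume[0]) - k) // stride + 1
    output = [[0] * out_size for _ in range(out_size)]
    for i in range(out_size):
        for j in range(out_size):
            res = 0
            # Step 1: multiply element by element and sum over every channel.
            for c in range(channels):
                for di in range(k):
                    for dj in range(k):
                        res += volume[c][i * stride + di][j * stride + dj] * filt[c][di][dj]
            # Step 2: add the bias.
            output[i][j] = res + bias
    return output


# Assumed example values: 3 channels of 3*3, one 3*3*3 filter, bias 1.
volume = [
    [[1, 0, 2], [0, 1, 0], [2, 0, 1]],
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    [[2, 2, 2], [0, 0, 0], [1, 1, 1]],
]
w0 = [
    [[1, 0, -1], [0, 1, 0], [-1, 0, 1]],
    [[0, 1, 0], [1, -1, 1], [0, 1, 0]],
    [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],
]
b0 = 1

out = conv_volume(volume, w0, b0)
print(out)  # [[6]]: the channel sums are -1, 3 and 3, so res = 5, plus bias 1
```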

2. The convolution function tf.nn.conv2d

The function's source is located at the following path (for how TensorFlow was installed, see TensorFlow Study Notes-01):

C:\Anaconda3\envs\tensorflow\Lib\site-packages\tensorflow\python\ops\gen_nn_ops.py

def conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None,
           data_format=None, name=None):
  r"""Computes a 2-D convolution given 4-D `input` and `filter` tensors.

  Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
  and a filter / kernel tensor of shape
  `[filter_height, filter_width, in_channels, out_channels]`, this op
  performs the following:

  1. Flattens the filter to a 2-D matrix with shape
     `[filter_height * filter_width * in_channels, output_channels]`.
  2. Extracts image patches from the input tensor to form a *virtual*
     tensor of shape `[batch, out_height, out_width,
     filter_height * filter_width * in_channels]`.
  3. For each patch, right-multiplies the filter matrix and the image patch
     vector.

  In detail, with the default NHWC format,

      output[b, i, j, k] =
          sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                          filter[di, dj, q, k]

  Must have `strides[0] = strides[3] = 1`.  For the most common case of the same
  horizontal and vertices strides, `strides = [1, stride, stride, 1]`.

  Args:
    input: A `Tensor`. Must be one of the following types: `half`, `float32`.
      A 4-D tensor. The dimension order is interpreted according to the value
      of `data_format`, see below for details.
    filter: A `Tensor`. Must have the same type as `input`.
      A 4-D tensor of shape
      `[filter_height, filter_width, in_channels, out_channels]`
    strides: A list of `ints`.
      1-D tensor of length 4.  The stride of the sliding window for each
      dimension of `input`. The dimension order is determined by the value of
        `data_format`, see below for details.
    padding: A `string` from: `"SAME", "VALID"`.
      The type of padding algorithm to use.
    use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
    data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to `"NHWC"`.
      Specify the data format of the input and output data. With the
      default format "NHWC", the data is stored in the order of:
          [batch, height, width, channels].
      Alternatively, the format could be "NCHW", the data storage order of:
          [batch, channels, height, width].
    name: A name for the operation (optional).

  Returns:
    A `Tensor`. Has the same type as `input`.
    A 4-D tensor. The dimension order is determined by the value of
    `data_format`, see below for details.
  """
  result = _op_def_lib.apply_op("Conv2D", input=input, filter=filter,
                                strides=strides, padding=padding,
                                use_cudnn_on_gpu=use_cudnn_on_gpu,
                                data_format=data_format, name=name)
  return result


_conv2d_backprop_filter_outputs = ["output"]
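The three steps in the docstring (flatten the filter, extract patches, multiply) can be sketched in pure Python with nested lists. This is my own illustration of the description, not TensorFlow's implementation: the helper names are hypothetical, only the NHWC layout is handled, and no padding is applied (i.e. it behaves like "VALID").

```python
def flatten_filter(filt):
    """Step 1: [fh, fw, in_c, out_c] -> matrix [fh * fw * in_c, out_c]."""
    return [filt[di][dj][q]
            for di in range(len(filt))
            for dj in range(len(filt[0]))
            for q in range(len(filt[0][0]))]


def extract_patches(inp, fh, fw, sh, sw):
    """Step 2: [batch, h, w, in_c] -> [batch, out_h, out_w, fh * fw * in_c]."""
    h, w, in_c = len(inp[0]), len(inp[0][0]), len(inp[0][0][0])
    out_h = (h - fh) // sh + 1
    out_w = (w - fw) // sw + 1
    return [[[[inp[b][sh * i + di][sw * j + dj][q]
               for di in range(fh) for dj in range(fw) for q in range(in_c)]
              for j in range(out_w)]
             for i in range(out_h)]
            for b in range(len(inp))]


def conv2d_valid(inp, filt, strides):
    """Step 3: multiply every patch vector by the flattened filter matrix."""
    assert strides[0] == 1 and strides[3] == 1
    matrix = flatten_filter(filt)
    patches = extract_patches(inp, len(filt), len(filt[0]), strides[1], strides[2])
    out_c = len(matrix[0])
    return [[[[sum(p * matrix[r][k] for r, p in enumerate(patch))
               for k in range(out_c)]
              for patch in row]
             for row in image]
            for image in patches]


# Input of shape [1, 3, 3, 1] holding the values 1..9.
inp = [[[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]]
# Filter of shape [2, 2, 1, 2]: output channel 0 sums the whole patch,
# output channel 1 picks only its top-left element.
filt = [[[[1, 1]], [[1, 0]]],
        [[[1, 0]], [[1, 0]]]]

out = conv2d_valid(inp, filt, [1, 1, 1, 1])
print(out)  # [[[[12, 1], [16, 2]], [[24, 4], [28, 5]]]]
```

The patch vector and the flattened filter use the same (di, dj, q) ordering, which is what makes the matrix product equal to the sum in the docstring's formula.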

The conv2d source shows that the function takes 7 parameters in total. Each is analyzed in detail below:

Parameter 1: input

input: A `Tensor`. Must be one of the following types: `half`, `float32`.
      A 4-D tensor. The dimension order is interpreted according to the value
      of `data_format`, see below for details.
input tensor of shape `[batch, in_height, in_width, in_channels]`

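From the description in the source (above), input is the image to be convolved. The input image must be given as a 4-D Tensor of type half (the 16-bit floating-point type, i.e. tf.float16) or float32, with shape [batch, in_height, in_width, in_channels], meaning [number of images in one training batch, image height, image width, number of image channels].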
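For a feel of what the half type means in practice, Python's standard `struct` module can pack a number as a 16-bit IEEE 754 float (format code `e`). This is only a small sketch of the precision loss, using 0.1 as an arbitrary example value; it does not involve TensorFlow at all.

```python
import struct

# struct's "e" format code is the IEEE 754 16-bit float, i.e. half precision.
half_bytes = struct.pack("<e", 0.1)
roundtrip = struct.unpack("<e", half_bytes)[0]

print(len(half_bytes))  # number of bytes used by one half value
print(roundtrip)        # 0.1 after rounding to half precision
```

A half value takes 2 bytes instead of the 4 used by float32, at the cost of noticeably less precision.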

Parameter 2: filter

filter: A `Tensor`. Must have the same type as `input`.
      A 4-D tensor of shape
      `[filter_height, filter_width, in_channels, out_channels]`

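From the description in the source (above), filter is the convolution kernel. It must also be a 4-D Tensor, of the same type as input, with shape [filter_height, filter_width, in_channels, out_channels], meaning [kernel height, kernel width, number of image channels, number of kernels]. The number of image channels here is the in_channels of input; the two must be equal.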
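The constraint that the two in_channels values must match can be written down as a small check. `check_conv2d_shapes` is a hypothetical helper of mine, and the shapes in the example (a batch of 32 RGB images of 28*28, 64 kernels of 5*5) are assumed values for illustration; the NHWC layout is assumed as well.

```python
def check_conv2d_shapes(input_shape, filter_shape):
    """Return out_channels if an NHWC input and a filter are compatible."""
    if len(input_shape) != 4 or len(filter_shape) != 4:
        raise ValueError("input and filter must both be 4-D")
    in_channels = input_shape[3]
    filter_in_channels, out_channels = filter_shape[2], filter_shape[3]
    if in_channels != filter_in_channels:
        raise ValueError(
            "input has %d channels but filter expects %d"
            % (in_channels, filter_in_channels))
    return out_channels


# [batch, in_height, in_width, in_channels] and
# [filter_height, filter_width, in_channels, out_channels]
out_channels = check_conv2d_shapes([32, 28, 28, 3], [5, 5, 3, 64])
print(out_channels)  # 64: one output channel per kernel
```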

Parameter 3: strides

strides: A list of `ints`.
      1-D tensor of length 4.  The stride of the sliding window for each
      dimension of `input`. The dimension order is determined by the value of
        `data_format`, see below for details.

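From the description in the source (above), strides is the step size of the convolution along each dimension of the image. It is a 1-D vector of length 4, and as the docstring notes, strides[0] and strides[3] must both be 1; the common form is [1, stride, stride, 1].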
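To see what the stride does, the sketch below lists where the window starts along one axis when no padding is used. Both helpers are hypothetical names of mine, and the sizes are example values.

```python
def make_strides(stride):
    """Build the common NHWC strides list [1, stride, stride, 1]."""
    return [1, stride, stride, 1]


def window_starts(in_size, filter_size, stride):
    """Top-left offsets of the sliding window along one axis (no padding)."""
    count = (in_size - filter_size) // stride + 1
    return [stride * i for i in range(count)]


print(make_strides(2))         # [1, 2, 2, 1]
print(window_starts(5, 3, 1))  # [0, 1, 2]
print(window_starts(5, 3, 2))  # [0, 2]
print(window_starts(7, 3, 2))  # [0, 2, 4]
```

These offsets correspond to `strides[1] * i` and `strides[2] * j` in the docstring's formula, so a larger stride means fewer window positions and a smaller output.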

Parameter 4: padding

padding: A `string` from: `"SAME", "VALID"`.
      The type of padding algorithm to use.

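From the description in the source (above), padding is a string that can only be "SAME" or "VALID", and it selects between two different ways of convolving. Both are introduced below, using a single-channel 5*5 image and a 3*3 kernel.

"VALID" convolution

The convolution proceeds as in the figure below (the same figure used at the start of this article). Consider the positions that the center of the kernel passes over (the kernel here is 3*3).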

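As shown below, # marks the positions that the kernel center slides over on the image, and . marks the border pixels it never reaches. The result is a 3*3 image.

.....
.###.
.###.
.###.
.....

"SAME" convolution

In the figure above, every pixel of the image serves as the kernel center. With a stride of 1 the result is 5*5, as shown in the figure below:

Put simply: first pad one ring of zeros around the original image so that its first pixel can serve as the kernel center; if one ring of zeros is not enough, pad another ring.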
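The output sizes of the two modes, and the amount of zero padding "SAME" needs, can be computed with a few lines of Python. This is a sketch of the size rules as I understand them from the TensorFlow documentation, applied to one axis at a time; the helper names are hypothetical and `zero_pad` assumes the same padding on rows and columns.

```python
import math


def valid_output_size(in_size, filter_size, stride):
    """Output length along one axis for padding="VALID"."""
    return math.ceil((in_size - filter_size + 1) / stride)


def same_padding(in_size, filter_size, stride):
    """Return (output length, zeros before, zeros after) for padding="SAME"."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + filter_size - in_size, 0)
    before = total // 2
    # When the total is odd, the extra zero goes at the end.
    return out_size, before, total - before


def zero_pad(image, before, after):
    """Surround a 2-D list with zeros (same amounts for rows and columns)."""
    width = len(image[0]) + before + after
    padded = [[0] * width for _ in range(before)]
    for row in image:
        padded.append([0] * before + list(row) + [0] * after)
    padded.extend([0] * width for _ in range(after))
    return padded


print(valid_output_size(5, 3, 1))  # 3
print(same_padding(5, 3, 1))       # (5, 1, 1): one ring of zeros
print(same_padding(5, 5, 1))       # (5, 2, 2): two rings of zeros
print(same_padding(6, 3, 2))       # (3, 0, 1): padding only at the end

ones = [[1] * 5 for _ in range(5)]
padded = zero_pad(ones, 1, 1)
print(len(padded), len(padded[0]))  # 7 7
```

For the 5*5 image and 3*3 kernel used in this section, one ring of zeros gives a 7*7 padded image, and a 3*3 kernel fits into it at 5 positions per axis, hence the 5*5 result.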

Parameter 5: use_cudnn_on_gpu

use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.

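From the description in the source (above), use_cudnn_on_gpu controls whether the cuDNN library is used to accelerate the computation when the op runs on a GPU. It defaults to True.

Parameter 6: data_format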

data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to `"NHWC"`.
      Specify the data format of the input and output data. With the
      default format "NHWC", the data is stored in the order of:
          [batch, height, width, channels].
      Alternatively, the format could be "NCHW", the data storage order of:
          [batch, channels, height, width].

From the description in the source (above), data_format is the layout of the input Tensor. The default is usually fine, so NHWC is used throughout.
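The difference between the two layouts is only the order of the axes. The sketch below converts a nested list from NHWC to NCHW; `nhwc_to_nchw` is a hypothetical helper and the tiny [1, 2, 2, 3] tensor is an assumed example.

```python
def nhwc_to_nchw(t):
    """Reorder [batch, height, width, channels] to [batch, channels, height, width]."""
    batch, height, width, channels = len(t), len(t[0]), len(t[0][0]), len(t[0][0][0])
    return [[[[t[b][h][w][c] for w in range(width)]
              for h in range(height)]
             for c in range(channels)]
            for b in range(batch)]


# Shape [1, 2, 2, 3]: one 2*2 image whose pixels each hold 3 channel values.
nhwc = [[[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [10, 11, 12]]]]

nchw = nhwc_to_nchw(nhwc)
print(nchw)
# [[[[1, 4], [7, 10]], [[2, 5], [8, 11]], [[3, 6], [9, 12]]]]
```

In NHWC the channel values of one pixel sit next to each other; in NCHW each channel forms its own height*width plane.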

Parameter 7: name

name: A name for the operation (optional).

It simply specifies the name of the operation, nothing more.

Return value:

Returns:
    A `Tensor`. Has the same type as `input`.
    A 4-D tensor. The dimension order is determined by the value of
    `data_format`, see below for details.

Returns the feature map produced by the convolution operation.
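Putting the parameters together, the shape of the returned feature map can be predicted without running any convolution. The sketch below is my own summary, assuming the NHWC layout and the size rules documented for "SAME" and "VALID"; `conv2d_output_shape` is a hypothetical helper and the shapes are example values.

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Predict the NHWC output shape of a 2-D convolution."""
    batch, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in_c, out_c = filter_shape
    if in_c != f_in_c:
        raise ValueError("in_channels of input and filter must match")
    if strides[0] != 1 or strides[3] != 1:
        raise ValueError("strides[0] and strides[3] must be 1")
    if padding == "SAME":
        out_h = math.ceil(in_h / strides[1])
        out_w = math.ceil(in_w / strides[2])
    elif padding == "VALID":
        out_h = math.ceil((in_h - f_h + 1) / strides[1])
        out_w = math.ceil((in_w - f_w + 1) / strides[2])
    else:
        raise ValueError('padding must be "SAME" or "VALID"')
    return [batch, out_h, out_w, out_c]


# The 5*5 single-channel image and 3*3 kernel from the padding section.
print(conv2d_output_shape([1, 5, 5, 1], [3, 3, 1, 1], [1, 1, 1, 1], "VALID"))
# [1, 3, 3, 1]
print(conv2d_output_shape([1, 5, 5, 1], [3, 3, 1, 1], [1, 1, 1, 1], "SAME"))
# [1, 5, 5, 1]
# A batch of 32 RGB images of 28*28 with 64 kernels of 5*5 and stride 2.
print(conv2d_output_shape([32, 28, 28, 3], [5, 5, 3, 64], [1, 2, 2, 1], "SAME"))
# [32, 14, 14, 64]
```

The batch size passes through unchanged, the spatial size depends on padding and strides, and the last dimension equals the number of kernels.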
