An Analysis of the tf.nn.conv2d() Function

邵家小鱼

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

The input argument is a 4-D tensor of shape:

[batch, in_height, in_width, in_channels]

The filter argument is a 4-D tensor of shape:

[filter_height, filter_width, in_channels, out_channels]

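The strides argument is a 1-D list of four integers. The first and last elements must be 1; only the middle two can be changed. With the default NHWC layout, strides[1] is the step along the height (vertical) axis and strides[2] is the step along the width (horizontal) axis: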

strides=[1, 1, 1, 1] or strides=[1, 2, 2, 1]

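My own reading is that the leading and trailing 1s correspond to the batch and in_channels dimensions of input. The convolution cannot skip samples or channels, so the stride along these two dimensions can only be 1.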
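To make the effect of the two middle values concrete, here is a minimal pure-Python sketch (it does not call TensorFlow). The helpers split_strides and window_starts are hypothetical names introduced only for this illustration. They assume the NHWC layout and no padding, and list where the top-left corner of the kernel lands along one axis.

```python
def split_strides(strides):
    """Validate an NHWC strides list and return (stride_h, stride_w)."""
    if len(strides) != 4 or strides[0] != 1 or strides[3] != 1:
        raise ValueError("expected strides of the form [1, stride_h, stride_w, 1]")
    return strides[1], strides[2]


def window_starts(size, kernel, stride):
    """Offsets of every full kernel position along one axis (no padding)."""
    return list(range(0, size - kernel + 1, stride))


# A 5x5 image scanned by a 2x2 kernel with strides=[1, 2, 2, 1].
stride_h, stride_w = split_strides([1, 2, 2, 1])
rows = window_starts(5, 2, stride_h)  # positions along the height axis
cols = window_starts(5, 2, stride_w)  # positions along the width axis
print(rows, cols)  # [0, 2] [0, 2]
```

With strides=[1, 1, 1, 1] the same helper returns [0, 1, 2, 3] for each axis, so the kernel visits every position instead of every second one.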

The padding argument is either SAME or VALID:

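Suppose the image is 5x5, the kernel is 2x2 and the stride is 2. After the kernel has taken two positions along an axis, one element is left over, which is not enough for another kernel position. With SAME, zeros are appended after that element so that the kernel can cover it, which gives three positions per axis.

If the leftover element that cannot be covered is discarded instead, that is VALID, which gives two positions per axis.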
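The arithmetic behind the two modes can be sketched as follows. This is a pure-Python illustration, not TensorFlow code. The helper name conv_output_size is hypothetical, and the formulas are the SAME/VALID rules as I understand them from the TensorFlow documentation (any odd padding element goes at the end).

```python
import math


def conv_output_size(in_size, kernel, stride, padding):
    """Return (out_size, pad_before, pad_after) along one spatial axis."""
    if padding == "VALID":
        # Only full kernel positions count; leftover elements are dropped.
        return (in_size - kernel) // stride + 1, 0, 0
    if padding == "SAME":
        # Output size depends only on the input size and the stride.
        out_size = math.ceil(in_size / stride)
        total_pad = max((out_size - 1) * stride + kernel - in_size, 0)
        pad_before = total_pad // 2
        return out_size, pad_before, total_pad - pad_before
    raise ValueError("padding must be 'SAME' or 'VALID'")


# The 5x5 image, 2x2 kernel, stride 2 example from the text.
print(conv_output_size(5, 2, 2, "SAME"))   # (3, 0, 1): one zero appended at the end
print(conv_output_size(5, 2, 2, "VALID"))  # (2, 0, 0): the last element is dropped
```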

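In the multi-channel case, suppose input has shape [1, 3, 3, 2] (one 3x3 image with 2 channels), filter has shape [2, 2, 2, 1], every stride is 1 and padding=VALID. The output then has shape [1, 2, 2, 1], as shown in the figure: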
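The shapes in this example can be checked with a small reference implementation. The sketch below is plain Python on nested lists rather than a call to tf.nn.conv2d. The function name conv2d_valid and the sample values are assumptions made for illustration only. It sums over the kernel window and over both input channels, which is why two input channels collapse into one output channel.

```python
def conv2d_valid(inp, filt, stride_h=1, stride_w=1):
    """Naive NHWC convolution with VALID padding on nested lists."""
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    f_h, f_w = len(filt), len(filt[0])
    in_c, out_c = len(filt[0][0]), len(filt[0][0][0])
    out_h = (in_h - f_h) // stride_h + 1
    out_w = (in_w - f_w) // stride_w + 1
    out = [[[[0.0] * out_c for _ in range(out_w)]
            for _ in range(out_h)] for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(out_c):
                    total = 0.0
                    # Sum over the window (di, dj) and every input channel q.
                    for di in range(f_h):
                        for dj in range(f_w):
                            for q in range(in_c):
                                total += (inp[b][stride_h * i + di][stride_w * j + dj][q]
                                          * filt[di][dj][q][k])
                    out[b][i][j][k] = total
    return out


# input: [1, 3, 3, 2]; channel 0 holds 1..9 row by row, channel 1 is all ones.
inp = [[[[float(3 * r + c + 1), 1.0] for c in range(3)] for r in range(3)]]
# filter: [2, 2, 2, 1], all weights equal to 1.
filt = [[[[1.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]

out = conv2d_valid(inp, filt)
print(len(out), len(out[0]), len(out[0][0]), len(out[0][0][0]))  # 1 2 2 1
print(out[0])  # [[[16.0], [20.0]], [[28.0], [32.0]]]
```

For instance, the top-left output is 1 + 2 + 4 + 5 from channel 0 plus four ones from channel 1, which gives 16.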

Addendum: here is the docstring from the source code.

def conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None,
           data_format=None, name=None):
  r"""Computes a 2-D convolution given 4-D `input` and `filter` tensors.

  Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
  and a filter / kernel tensor of shape
  `[filter_height, filter_width, in_channels, out_channels]`, this op
  performs the following:

  1. Flattens the filter to a 2-D matrix with shape
     `[filter_height * filter_width * in_channels, output_channels]`.
  2. Extracts image patches from the input tensor to form a *virtual*
     tensor of shape `[batch, out_height, out_width,
     filter_height * filter_width * in_channels]`.
  3. For each patch, right-multiplies the filter matrix and the image patch
     vector.

  In detail, with the default NHWC format,

      output[b, i, j, k] =
          sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                          filter[di, dj, q, k]

  Must have `strides[0] = strides[3] = 1`.  For the most common case of the same
  horizontal and vertical strides, `strides = [1, stride, stride, 1]`.

  Args:
    input: A `Tensor`. Must be one of the following types: `half`, `float32`, `float64`.
      A 4-D tensor. The dimension order is interpreted according to the value
      of `data_format`, see below for details.
    filter: A `Tensor`. Must have the same type as `input`.
      A 4-D tensor of shape
      `[filter_height, filter_width, in_channels, out_channels]`
    strides: A list of `ints`.
      1-D tensor of length 4.  The stride of the sliding window for each
      dimension of `input`. The dimension order is determined by the value of
        `data_format`, see below for details.
    padding: A `string` from: `"SAME", "VALID"`.
      The type of padding algorithm to use.
    use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
    data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to `"NHWC"`.
      Specify the data format of the input and output data. With the
      default format "NHWC", the data is stored in the order of:
          [batch, height, width, channels].
      Alternatively, the format could be "NCHW", the data storage order of:
          [batch, channels, height, width].
    name: A name for the operation (optional).

  Returns:
    A `Tensor`. Has the same type as `input`.
    A 4-D tensor. The dimension order is determined by the value of
    `data_format`, see below for details.
  """