Learning TensorFlow: tf.nn.conv2d

一只放纵的死魂灵

Article link: https://blog.csdn.net/xiyaozhe/article/details/83548968

This post draws on a CSDN blog post by xf__mao. Original: https://blog.csdn.net/mao_xiao_feng/article/details/78004522

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', dilations=[1, 1, 1, 1], name=None)

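Parameters:
Apart from name, which only sets the name of the operation, seven parameters determine how the convolution is computed:

input:
The input images to convolve. It must be a 4-D Tensor of shape [batch, in_height, in_width, in_channels], that is, [number of images in a training batch, image height, image width, number of image channels]. Its dtype must be a floating-point type such as float32 or float64.
filter:
The convolution kernel of the CNN. It must be a Tensor of shape [filter_height, filter_width, in_channels, out_channels], that is, [kernel height, kernel width, number of input channels, number of kernels], with the same dtype as input. Note that its third dimension, in_channels, must equal the fourth dimension of input.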
strides:
The TensorFlow documentation describes strides as a list of four integers, one per dimension of input, giving the step of the sliding window along that dimension.
The usual requirement is strides[0] = strides[3] = 1.
What does that mean?
An input tensor generally has four dimensions, [batch, height, width, channels]: the batch size (number of samples), the number of rows and columns of each sample, and the number of channels (3 for an RGB image, 1 for a grayscale image). A 2-D convolution acts mainly along height and width.
strides[0] = 1: the step along the batch dimension is 1, so no sample is skipped (otherwise there would be no point in passing it as input).
strides[3] = 1: the step along the channels dimension is 1, so no color channel is skipped.
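padding:
A string, either "SAME" or "VALID". It selects the padding scheme and therefore the output size (covered in the experiments below).
use_cudnn_on_gpu:
A bool; whether to use cuDNN acceleration on the GPU. Defaults to True.
data_format:
A string, 'NHWC' (the default) or 'NCHW'. The two differ in the order of the dimensions:
NHWC: [batch, in_height, in_width, in_channels]
NCHW: [batch, in_channels, in_height, in_width]
dilations:
The dilation of the kernel, also known as the hole algorithm (atrous convolution). Instead of reading consecutive input values, the kernel skips (dilation - 1) elements between the values it reads, so dilation = 1 means no dilation. For example, a 3×3 kernel with dilation = 2 covers a 5×5 region of the input but uses only 9 of those 25 values in the convolution.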
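
The dilation arithmetic above can be checked with a small plain-Python sketch. The helper names `dilated_extent` and `tap_offsets` are made up for illustration and are not part of TensorFlow; the sketch only reproduces the counting along one axis.

```python
def dilated_extent(kernel_size, dilation):
    """Width of the input span covered by a dilated kernel along one axis."""
    return (kernel_size - 1) * dilation + 1


def tap_offsets(kernel_size, dilation):
    """Offsets (from the window start) of the input values the kernel reads."""
    return [i * dilation for i in range(kernel_size)]


extent = dilated_extent(3, 2)   # span covered along one axis
offsets = tap_offsets(3, 2)     # positions actually sampled along one axis
taps = len(offsets) ** 2        # values used by a square 2-D kernel
print(extent, offsets, taps)
```

With a kernel size of 3 and a dilation of 2, the span is 5 wide per axis while only 3 positions per axis (9 in total) are read.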

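The function returns a Tensor with the same dtype as input and, for NHWC, the layout [batch, out_height, out_width, out_channels]. This output is what we usually call the feature map.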
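
As a rough aid for predicting the shape of the returned feature map, the sketch below computes it in plain Python. `conv2d_output_shape` is a hypothetical helper, not a TensorFlow function, and it assumes NHWC input, an HWIO filter, and no dilation.

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Output shape of a 2-D convolution (NHWC input, no dilation assumed)."""
    batch, in_h, in_w, in_c = input_shape
    k_h, k_w, filter_in_c, out_c = filter_shape
    if in_c != filter_in_c:
        raise ValueError("filter in_channels must match input in_channels")
    _, s_h, s_w, _ = strides
    if padding == "SAME":
        # The output size depends only on the input size and the stride.
        out_h = math.ceil(in_h / s_h)
        out_w = math.ceil(in_w / s_w)
    elif padding == "VALID":
        # The window must fit entirely inside the input.
        out_h = math.ceil((in_h - k_h + 1) / s_h)
        out_w = math.ceil((in_w - k_w + 1) / s_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")
    return [batch, out_h, out_w, out_c]


print(conv2d_output_shape([10, 5, 5, 5], [3, 3, 5, 7], [1, 2, 2, 1], "SAME"))
```

The number of output channels always comes from the last dimension of the filter, and the batch dimension passes through unchanged.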

Experiments
So how does TensorFlow's convolution actually behave? A few examples illustrate it:

1. Start with the simplest case: a 3×3 single-channel image (shape [1, 3, 3, 1]) convolved with a 1×1 kernel (shape [1, 1, 1, 1]). The result is a 3×3 feature map.
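
To make case 1 concrete without TensorFlow, here is a minimal plain-Python sketch. The pixel values and the kernel weight are made up; with one channel and a 1×1 kernel, the convolution reduces to multiplying every pixel by the single weight.

```python
# 3x3 single-channel image, stored as rows of pixel values (made-up data).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
weight = 2  # the only weight of the 1x1 kernel (made-up value)

# The 1x1 kernel visits every pixel, so the output keeps the 3x3 size.
feature_map = [[pixel * weight for pixel in row] for row in image]
print(feature_map)
```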

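2. Increase the number of channels: a 3×3 five-channel image (shape [1, 3, 3, 5]) convolved with a 1×1 kernel (shape [1, 1, 5, 1]). The result is still a 3×3 feature map; at each pixel, the kernel takes a dot product with that pixel's five channel values.

import tensorflow as tf  # TensorFlow 1.x API

input = tf.Variable(tf.random_normal([1,3,3,5]))
filter = tf.Variable(tf.random_normal([1,1,5,1]))

op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')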
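
The per-pixel dot product in case 2 can be spelled out in plain Python. This is only a sketch of the arithmetic with made-up integer data, not how TensorFlow implements it.

```python
# A 3x3 image with 5 channels, indexed as image[row][col][channel] (made-up data).
image = [[[r + c + ch for ch in range(5)] for c in range(3)] for r in range(3)]
# The 5 weights of a [1, 1, 5, 1] kernel, one per input channel (made-up values).
weights = [1, 0, -1, 2, 1]

# At every pixel, take the dot product of its 5 channel values with the weights.
feature_map = [[sum(x * w for x, w in zip(pixel, weights)) for pixel in row]
               for row in image]
print(feature_map)
```

Five channels go in and one value per pixel comes out, so the feature map stays 3×3.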

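3. Enlarge the kernel: with a 3×3 kernel (shape [3, 3, 5, 1]) the output is a single value (shape [1, 1, 1, 1]), the sum of the products of all 3×3×5 input values with their kernel weights. This resembles summing the case 2 feature map over all pixels, except that each pixel position now has its own set of weights.

input = tf.Variable(tf.random_normal([1,3,3,5]))
filter = tf.Variable(tf.random_normal([3,3,5,1]))

op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')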
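
The relation between case 3 and case 2 can be checked in plain Python. The sketch assumes a special 3×3×5 kernel that repeats the same five weights at each of its nine positions; only under that assumption does the single output value equal the sum of the case 2 feature map. All data is made up.

```python
image = [[[r + c + ch for ch in range(5)] for c in range(3)] for r in range(3)]
weights = [1, 0, -1, 2, 1]

# Special 3x3x5 kernel that repeats the same 5 weights at each of its 9 positions.
kernel = [[list(weights) for _ in range(3)] for _ in range(3)]

# 'VALID' 3x3 convolution of a 3x3 image: the window fits in exactly one place,
# so the output is one number, the sum of all 45 input-times-weight products.
single_value = sum(image[r][c][ch] * kernel[r][c][ch]
                   for r in range(3) for c in range(3) for ch in range(5))

# Sum of the 1x1-convolution feature map computed with the same 5 weights.
map_sum = sum(sum(x * w for x, w in zip(image[r][c], weights))
              for r in range(3) for c in range(3))
print(single_value, map_sum)
```

With a general kernel, every position carries its own weights, so the output is a weighted sum rather than a plain sum of the case 2 map.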

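4. Use a larger image: enlarge the image from case 2 to 5×5 (shape [1, 5, 5, 5]) and keep the 3×3 kernel with a stride of 1. The output is a 3×3 feature map; each x below marks a position where the kernel center can sit: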

.....
.xxx.
.xxx.
.xxx.
.....
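
The diagram can be reproduced with a short plain-Python sketch that lists where the kernel center lands under 'VALID' padding. `valid_centers` is a hypothetical helper that assumes an odd kernel size and no dilation.

```python
def valid_centers(in_size, kernel_size, stride=1):
    """Kernel-center positions along one axis under 'VALID' padding."""
    half = kernel_size // 2
    # The window start may only take values that keep the window inside the input.
    starts = range(0, in_size - kernel_size + 1, stride)
    return [start + half for start in starts]


centers = valid_centers(5, 3)
rows = ["".join("x" if r in centers and c in centers else "." for c in range(5))
        for r in range(5)]
print("\n".join(rows))
```

Three center positions per axis give the 3×3 feature map; the border pixels are never under the kernel center.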

5. So far padding has always been 'VALID'. With 'SAME', the input is zero-padded so that the kernel center can also sit on the border pixels. The output is then a 5×5 feature map:

input = tf.Variable(tf.random_normal([1,5,5,5]))
filter = tf.Variable(tf.random_normal([3,3,5,1]))

op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')

xxxxx
xxxxx
xxxxx
xxxxx
xxxxx
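
How much zero padding 'SAME' adds can be estimated with the sketch below. `same_padding` is a hypothetical helper that follows the padding rule described in the TensorFlow documentation (any odd leftover goes to the end of the axis) and assumes no dilation.

```python
import math


def same_padding(in_size, kernel_size, stride):
    """Zero padding (before, after) along one axis under 'SAME' padding."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    before = total // 2
    after = total - before  # an odd leftover goes to the end of the axis
    return before, after


print(same_padding(5, 3, 1))
```

For the 5×5 image and 3×3 kernel of this case, one row or column of zeros is added on each side, which is what lets the kernel center reach the border.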

6. With several kernels:

input = tf.Variable(tf.random_normal([1,5,5,5]))
filter = tf.Variable(tf.random_normal([3,3,5,7]))

op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')

The output now consists of 7 feature maps of size 5×5 (shape [1, 5, 5, 7]).

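7. A stride other than 1: as the documentation notes, an image has only two spatial dimensions, so strides is usually [1, stride, stride, 1].

input = tf.Variable(tf.random_normal([1,5,5,5]))
filter = tf.Variable(tf.random_normal([3,3,5,7]))

op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')

The output now consists of 7 feature maps of size 3×3 (shape [1, 3, 3, 7]).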

x.x.x
.....
x.x.x
.....
x.x.x
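
The stride-2 pattern above can also be derived in plain Python. `same_centers` is a hypothetical helper that combines the 'SAME' padding rule with the stride; it assumes an odd kernel size and no dilation.

```python
import math


def same_centers(in_size, kernel_size, stride):
    """Kernel-center positions along one axis under 'SAME' padding."""
    out_size = math.ceil(in_size / stride)
    total_pad = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_before = total_pad // 2
    half = kernel_size // 2
    # Window i starts at i * stride in the padded input; shift back to input coordinates.
    return [i * stride - pad_before + half for i in range(out_size)]


centers = same_centers(5, 3, 2)
rows = ["".join("x" if r in centers and c in centers else "." for c in range(5))
        for r in range(5)]
print("\n".join(rows))
```

The kernel center lands on every second pixel, starting at the border, which gives three positions per axis and hence the 3×3 feature maps.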

8. A batch size other than 1: feed in 10 images at once.

input = tf.Variable(tf.random_normal([10,5,5,5]))
filter = tf.Variable(tf.random_normal([3,3,5,7]))

op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')

Each image yields 7 feature maps of size 3×3, so the output shape is [10, 3, 3, 7].
