tf.nn.conv2d() convolution function: converting between NHWC and NCHW formats

挡不住三千问的BlueCat

Article link: https://blog.csdn.net/qq_23944915/article/details/88759222

1. First, the tf.nn.conv2d() function. Its prototype is:

tf.nn.conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=None,
    data_format=None,
    name=None
)

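(1) input: a tensor of type tf.float32 or tf.float64, usually the batch of images to convolve. It must be a 4-D Tensor of shape [batch, in_height, in_width, in_channels] (NHWC for short) under the default data_format. The four dimensions are [number of images in the batch, image height, image width, number of image channels].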

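(2) filter (the convolution kernel): a tensor of the same type as input, tf.float32 or tf.float64. It is also 4-D, with shape [filter_height, filter_width, in_channels, out_channels] (HWCN for short), meaning [kernel height, kernel width, number of input channels, number of kernels]. Note that the third dimension of filter, in_channels, must equal the channel dimension of input (the fourth dimension in NHWC).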

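(3) strides: the step size of the sliding window along each dimension of input. It is a 1-D vector of length 4, and the order of its entries follows the data_format parameter described below. For NHWC the order is [batch, height, width, channels], and the batch and channel strides are normally 1.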

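(4) padding: a string, either "SAME" or "VALID". With "SAME" the input is zero-padded so that the output size is ceil(input size / stride), which equals the input size only when the stride is 1. With "VALID" no padding is added, so the output is smaller than the input, and any rows or columns at the edge that do not fill a complete window are dropped.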

(5) use_cudnn_on_gpu: optional bool. Whether to use cuDNN acceleration on the GPU; defaults to True.

(6) data_format: optional string, either "NHWC" or "NCHW"; the default is "NHWC". "NHWC" means [batch, height, width, channels], and "NCHW" means [batch, channels, height, width].

(7) name: optional. A name for the operation.
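The effect of strides and padding on the output size can be sketched in plain Python. The helper names conv_output_size and strides_for below are hypothetical and are not part of TensorFlow; the formulas are the ones TensorFlow documents for "SAME" and "VALID" padding, applied to one spatial dimension at a time.

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Output size of a convolution along one spatial dimension."""
    if padding == "SAME":
        # Zero padding is added, so only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the window must fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

def strides_for(data_format, stride_h, stride_w):
    """Arrange the spatial strides in the order that data_format expects."""
    if data_format == "NHWC":
        return [1, stride_h, stride_w, 1]
    if data_format == "NCHW":
        return [1, 1, stride_h, stride_w]
    raise ValueError("data_format must be 'NHWC' or 'NCHW'")

# A 5x5 input with a 3x3 kernel.
print(conv_output_size(5, 3, 1, "VALID"))  # 3
print(conv_output_size(5, 3, 1, "SAME"))   # 5
print(conv_output_size(5, 3, 2, "SAME"))   # 3
print(conv_output_size(5, 3, 2, "VALID"))  # 2
print(strides_for("NCHW", 2, 2))           # [1, 1, 2, 2]
```

With stride 1 and "VALID" padding, a 5x5 input and a 3x3 kernel give a 3x3 output.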

2. Converting between the NHWC and NCHW formats

NHWC: [batch, in_height, in_width, in_channels]

NCHW: [batch, in_channels, in_height, in_width]

Use tf.transpose() or np.transpose() for the conversion. For example:

temp2_NCHW has shape (1, 2, 5, 5) and is in NCHW format.
temp2_NHWC = tf.transpose(temp2_NCHW, [0, 2, 3, 1]) gives a tensor of shape (1, 5, 5, 2) in NHWC format.

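The second argument [0, 2, 3, 1] lists axis indices of temp2_NCHW: 0 is the first axis N, 1 is the second axis C, 2 is the third axis H, and 3 is the fourth axis W. Position i in the list names the source axis that becomes axis i of the result. So axis 0 stays in place, source axis 2 (H) moves to position 1, source axis 3 (W) moves to position 2, and source axis 1 (C) moves to position 3.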
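The permutation rule can be checked with a small NumPy sketch. The array x_nchw and its shape are made up for illustration; np.transpose follows the same axis convention as tf.transpose.

```python
import numpy as np

# A tiny NCHW array: batch=1, channels=2, height=3, width=4.
x_nchw = np.arange(1 * 2 * 3 * 4).reshape(1, 2, 3, 4)

# NCHW -> NHWC: output axis i is taken from input axis perm[i].
x_nhwc = np.transpose(x_nchw, [0, 2, 3, 1])
print(x_nhwc.shape)  # (1, 3, 4, 2)

# The same element, addressed in both layouts.
n, c, h, w = 0, 1, 2, 3
assert x_nchw[n, c, h, w] == x_nhwc[n, h, w, c]

# NHWC -> NCHW uses the inverse permutation [0, 3, 1, 2].
x_back = np.transpose(x_nhwc, [0, 3, 1, 2])
assert np.array_equal(x_back, x_nchw)
```

Going back from NHWC to NCHW needs the inverse permutation [0, 3, 1, 2], not the same list again.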

# Written for TensorFlow 1.x (graph mode with tf.Session).
import tensorflow as tf
import numpy as np
image = np.array([[[[1, 2, 3,  4,  5],
                    [6, 7, 8, 9, 10],
                    [11, 12, 13, 14, 15],
                    [16, 17, 18, 19, 20],
                    [21, 22, 23, 24, 25]],

                   [[26, 27, 28, 29, 30],
                    [31, 32, 33, 34, 35],
                    [36, 37, 38, 39, 40],
                    [41, 42, 43, 44, 45],
                    [46, 47, 48, 49, 50]]]])
print(image.shape)

temp2_NCHW = tf.constant(image, dtype=tf.float32)  # shape (1, 2, 5, 5): batch=1, in_channels=2, in_height=5, in_width=5
temp2_NHWC = tf.transpose(temp2_NCHW, [0, 2, 3, 1])  # shape (1, 5, 5, 2): batch=1, in_height=5, in_width=5, in_channels=2
print(temp2_NHWC)
filter_NCHW = np.array([[[[1, 0, 0],
                          [0, 1, 0],
                          [1, 0, 0]],
                         [[0, 1, 0],
                          [0, 0, 1],
                          [0, 0, 0]]]])  # kernel shape (1, 2, 3, 3): 1 kernel, 2 input channels, height=3, width=3

filter_HWCN = filter_NCHW.transpose([2, 3, 1, 0])  # shape (3, 3, 2, 1): height=3, width=3, in_channels=2 (input image channels), out_channels=1 (number of kernels)
print(filter_NCHW.shape)
print("Kernel shape after conversion to HWCN:")
print(filter_HWCN.shape)
print("Kernel data after conversion to HWCN:")
print(filter_HWCN)
print("*"*10)

filter2 = tf.constant(filter_HWCN, dtype=tf.float32)
convolution = tf.nn.conv2d(temp2_NHWC, filter2, strides=[1, 1, 1, 1], padding="VALID")  # input must be a 32-bit or 64-bit float tensor

with tf.Session() as sess:
    print("Image data after conversion to NHWC:")
    print(sess.run(temp2_NHWC))
    print("*"*10)
    result = sess.run(convolution)
    print("Result:")
    print(result)
    print(result.shape)

The output is:

(1, 2, 5, 5)
Tensor("transpose:0", shape=(1, 5, 5, 2), dtype=float32)
(1, 2, 3, 3)
Kernel shape after conversion to HWCN:
(3, 3, 2, 1)
Kernel data after conversion to HWCN:
[[[[1]
   [0]]

  [[0]
   [1]]

  [[0]
   [0]]]


 [[[0]
   [0]]

  [[1]
   [0]]

  [[0]
   [1]]]


 [[[1]
   [0]]

  [[0]
   [0]]

  [[0]
   [0]]]]
**********
Image data after conversion to NHWC:
[[[[ 1. 26.]
   [ 2. 27.]
   [ 3. 28.]
   [ 4. 29.]
   [ 5. 30.]]

  [[ 6. 31.]
   [ 7. 32.]
   [ 8. 33.]
   [ 9. 34.]
   [10. 35.]]

  [[11. 36.]
   [12. 37.]
   [13. 38.]
   [14. 39.]
   [15. 40.]]

  [[16. 41.]
   [17. 42.]
   [18. 43.]
   [19. 44.]
   [20. 45.]]

  [[21. 46.]
   [22. 47.]
   [23. 48.]
   [24. 49.]
   [25. 50.]]]]
**********
Result:
[[[[ 79.]
   [ 84.]
   [ 89.]]

  [[104.]
   [109.]
   [114.]]

  [[129.]
   [134.]
   [139.]]]]
(1, 3, 3, 1)

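The multiplication between the NHWC image data and the HWCN kernel is worth working through carefully: each output value is the sum, over kernel height, kernel width and input channels, of the element-wise products of the kernel with the image patch under it.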
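To make that multiplication concrete, here is a minimal NumPy sketch that reproduces the result above without TensorFlow. The function conv2d_valid is a hypothetical, naive helper written for this illustration (stride 1, "VALID" padding only). It is not how TensorFlow implements the operation, and the input arrays are rebuilt from the same values as the example.

```python
import numpy as np

# Same data as the example: channel 0 holds 1..25, channel 1 holds 26..50.
image_nchw = np.arange(1, 51, dtype=np.float32).reshape(1, 2, 5, 5)
image_nhwc = image_nchw.transpose(0, 2, 3, 1)            # (1, 5, 5, 2)

filter_nchw = np.array([[[[1, 0, 0], [0, 1, 0], [1, 0, 0]],
                         [[0, 1, 0], [0, 0, 1], [0, 0, 0]]]],
                       dtype=np.float32)                 # (1, 2, 3, 3)
filter_hwcn = filter_nchw.transpose(2, 3, 1, 0)          # (3, 3, 2, 1)

def conv2d_valid(x, k):
    """Naive stride-1 VALID convolution: x is NHWC, k is HWCN."""
    n, h, w, _ = x.shape
    kh, kw, _, out_c = k.shape
    out = np.zeros((n, h - kh + 1, w - kw + 1, out_c), dtype=x.dtype)
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + kh, j:j + kw, :]          # (n, kh, kw, in_c)
            # Multiply element-wise, then sum over height, width, in_channels.
            out[:, i, j, :] = np.tensordot(patch, k, axes=([1, 2, 3], [0, 1, 2]))
    return out

result = conv2d_valid(image_nhwc, filter_hwcn)
print(result.shape)         # (1, 3, 3, 1)
print(result[0, :, :, 0])   # rows: 79 84 89 / 104 109 114 / 129 134 139
```

For the top-left output, channel 0 contributes 1 + 7 + 11 = 19 and channel 1 contributes 27 + 33 = 60, which gives 79.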