tf.nn.conv1d(), tf.nn.conv2d(), tf.nn.conv3d()

蓝之刃

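In natural language processing (NLP), and sometimes in image processing, we may need one-dimensional convolution (conv1d). A 1-D convolution can be seen as a simplified version of 2-D convolution (conv2d): a 2-D convolution slides a window over a feature map along both the width and height directions, multiplying corresponding elements and summing the products, whereas a 1-D convolution slides the window along a single direction only (width, or equivalently height) and performs the same multiply-and-sum.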
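
To make the sliding-window idea concrete, here is a minimal pure-Python sketch for a single channel. The function name `conv1d_valid` and the sample values are illustrative assumptions, not part of TensorFlow.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` along `signal`, multiplying and summing at each position.

    No padding is used, so the output has len(signal) - len(kernel) + 1 items.
    Like tf.nn.conv1d, the kernel is not flipped (this is cross-correlation).
    """
    width = len(kernel)
    return [
        sum(s * k for s, k in zip(signal[start:start + width], kernel))
        for start in range(len(signal) - width + 1)
    ]


signal = [1, 2, 3, 4, 5]   # hypothetical input values
kernel = [1, 0, -1]        # hypothetical kernel values
print(conv1d_valid(signal, kernel))  # [-2, -2, -2]
```

A 2-D convolution would do the same thing with a second loop over the other spatial direction.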

1. tf.nn.conv1d()

tf.nn.conv1d(inputs, filters, stride, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

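    inputs = [batch, in_width, in_channels]: a 3-D tensor
    filters = [filter_width, in_channels, out_channels]: a 3-D tensor; out_channels is the number of output channels, which can be thought of as the number of convolution kernels
    stride = 1: an integer, the step by which the filter window moves
    padding: 'SAME' or 'VALID', i.e. whether to zero-pad; 'SAME' pads the input with zeros, 'VALID' uses no padding
    use_cudnn_on_gpu: whether to use cuDNN acceleration on the GPU, True or False
    data_format: 'NWC' (the default) for [batch, in_width, in_channels], or 'NCW' for [batch, in_channels, in_width]
    name: an optional name for the operation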

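Return: a tensor with the same number of dimensions (rank) as the input. With padding='SAME' and a stride of 1, only the channel count changes (in_channels ==> out_channels) and everything else stays the same. With padding='VALID' and a stride of 1, the batch dimension is unchanged, the width shrinks to in_width - filter_width + 1, and the channel count again becomes out_channels.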
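
The output-size rule can be sketched as a small helper. This assumes no dilation and follows the usual TensorFlow convention for 'SAME' and 'VALID'; the helper name `conv_output_size` is hypothetical.

```python
import math


def conv_output_size(in_size, filter_size, stride, padding):
    """Length of one spatial dimension after convolution (no dilation)."""
    if padding == "SAME":
        # Zero padding is added so that only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the window must fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(conv_output_size(10, 2, 1, "SAME"))   # 10
print(conv_output_size(10, 2, 1, "VALID"))  # 9
print(conv_output_size(10, 2, 2, "VALID"))  # 5
```

The same rule applies independently to each spatial dimension in conv2d and conv3d.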

    import tensorflow as tf
    import numpy as np

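    # Define a matrix a as the input, i.e. the matrix to be convolved
    a = np.array(np.arange(1, 21).reshape([1, 10, 2]), dtype=np.float32)
    print(a.shape)

    # Define the convolution kernel; there is 1 kernel (out_channels = 1)
    kernel = np.array(np.arange(1, 5), dtype=np.float32).reshape([2, 2, 1])
    print(kernel.shape)

    # Define the stride
    strides = 1

    # Apply the conv1d convolution
    conv1d = tf.nn.conv1d(a, kernel, strides, 'VALID')

    # tf.Session is the TensorFlow 1.x graph-mode API
    with tf.Session() as sess:
        # Initialize variables
        tf.global_variables_initializer().run()
        # Print the convolution result
        print(sess.run(conv1d))
        print(conv1d.shape)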


    Output:
    (1, 10, 2)
    (2, 2, 1)
    [[[ 30.]
      [ 50.]
      [ 70.]
      [ 90.]
      [110.]
      [130.]
      [150.]
      [170.]
      [190.]]]
    (1, 9, 1)
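
To see where those numbers come from, the following NumPy sketch recomputes the same 'VALID' convolution by hand, summing over both the window width and the input channels. It is a reference illustration only, not how TensorFlow implements the op; the function name `conv1d_valid_nwc` is an assumption.

```python
import numpy as np


def conv1d_valid_nwc(inputs, filters, stride=1):
    """NumPy reference for a 'VALID' 1-D convolution in NWC layout.

    inputs:  [batch, in_width, in_channels]
    filters: [filter_width, in_channels, out_channels]
    """
    batch, in_width, in_channels = inputs.shape
    filter_width, filter_in_channels, out_channels = filters.shape
    assert in_channels == filter_in_channels, "channel counts must match"

    out_width = (in_width - filter_width) // stride + 1
    output = np.zeros((batch, out_width, out_channels), dtype=inputs.dtype)
    for i in range(out_width):
        start = i * stride
        window = inputs[:, start:start + filter_width, :]  # [batch, fw, in_c]
        # Contract the window width and the input channels with the filter.
        output[:, i, :] = np.tensordot(window, filters, axes=([1, 2], [0, 1]))
    return output


a = np.arange(1, 21, dtype=np.float32).reshape(1, 10, 2)
kernel = np.arange(1, 5, dtype=np.float32).reshape(2, 2, 1)
result = conv1d_valid_nwc(a, kernel)
print(result.shape)    # (1, 9, 1)
print(result.ravel())  # values 30, 50, ..., 190
```

For example, the first output is 1*1 + 2*2 + 3*3 + 4*4 = 30: the first two rows of the input, [1, 2] and [3, 4], multiplied element-wise by the kernel rows [1, 2] and [3, 4].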

2. tf.nn.conv2d()

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

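input: shape=[batch_size, in_height, in_width, in_channels], a 4-D tensor

filter: shape=[filter_height, filter_width, in_channels, out_channels]

strides: shape=[strides_batch, strides_height, strides_width, strides_channels]

In most cases strides_batch = strides_channels = 1

    padding: 'SAME' or 'VALID', i.e. whether to zero-pad; 'SAME' pads the input with zeros, 'VALID' uses no padding
    use_cudnn_on_gpu: whether to use cuDNN acceleration on the GPU, True or False
    data_format: 'NHWC' (the default) for [batch, in_height, in_width, in_channels], or 'NCHW' for [batch, in_channels, in_height, in_width]
    name: an optional name for the operation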
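
The two layouts hold the same data in a different axis order. As a small sketch, converting between them in NumPy is a single transpose; the array sizes here are arbitrary illustrative values.

```python
import numpy as np

# A hypothetical batch: 2 images, 4 rows, 5 columns, 3 channels (NHWC).
nhwc = np.arange(2 * 4 * 5 * 3, dtype=np.float32).reshape(2, 4, 5, 3)

# Move the channel axis in front of the spatial axes: NHWC -> NCHW.
nchw = np.transpose(nhwc, (0, 3, 1, 2))
print(nchw.shape)  # (2, 3, 4, 5)

# The same element is addressed as [b, h, w, c] in NHWC and [b, c, h, w] in NCHW.
assert nchw[1, 2, 3, 4] == nhwc[1, 3, 4, 2]

# And back again: NCHW -> NHWC.
back = np.transpose(nchw, (0, 2, 3, 1))
print(np.array_equal(back, nhwc))  # True
```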

    import tensorflow as tf
    import numpy as np

    # Define a matrix a as the input, i.e. the matrix to be convolved
    a = np.array(np.arange(1, 26).reshape([1, 5, 5, 1]), dtype=np.float32)
    #print(a)
    print(a.shape)

    # Define the convolution kernel; there is 1 kernel (out_channels = 1)
    kernel = np.array(np.arange(1, 5), dtype=np.float32).reshape([2, 2, 1, 1])
    #print(kernel)
    print(kernel.shape)

    # Define the strides
    strides = [1, 1, 1, 1]

    # Apply the conv2d convolution
    conv2d = tf.nn.conv2d(a, kernel, strides, padding='VALID')


    with tf.Session() as sess:
        # Initialize variables
        tf.global_variables_initializer().run()
        # Print the convolution result
        #print(sess.run(conv2d))
        print(conv2d.shape)

    Output:
    (1, 5, 5, 1)
    (2, 2, 1, 1)
    (1, 4, 4, 1)
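
The example above prints only the shapes. As a hedged cross-check, this NumPy sketch recomputes the same 'VALID' 2-D convolution for the single image and single kernel, with the batch and channel axes dropped for brevity. It illustrates the arithmetic and is not TensorFlow's implementation.

```python
import numpy as np

image = np.arange(1, 26, dtype=np.float32).reshape(5, 5)   # [height, width]
kernel = np.arange(1, 5, dtype=np.float32).reshape(2, 2)   # [filter_h, filter_w]

out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
output = np.zeros((out_h, out_w), dtype=np.float32)

# Slide the window over height and width, multiply element-wise and sum.
for i in range(out_h):
    for j in range(out_w):
        window = image[i:i + kernel.shape[0], j:j + kernel.shape[1]]
        output[i, j] = np.sum(window * kernel)

print(output.shape)  # (4, 4)
# Top-left value: 1*1 + 2*2 + 6*3 + 7*4 = 51; bottom-right value: 231
print(output[0, 0], output[3, 3])
```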

3. tf.nn.conv3d()

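3-D convolution is typically used on video: on top of convolving over the height and width of each image, it also convolves over the frames of the video, i.e. the time dimension. For details, see the paper "3D Convolutional Neural Networks for Human Action Recognition".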

tf.nn.conv3d(input, filter, strides, padding, data_format=None, name=None)

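input: the input data; must be of type float32 or float64. shape=[batch_size, in_depth, in_height, in_width, in_channels]

batch_size is the number of video samples fed in at a time; in_depth is the number of frames in each video sample; in_height and in_width are the height and width of each frame, analogous to image resolution. The input is a 5-D tensor.

filter: shape=[filter_depth, filter_height, filter_width, in_channels, out_channels], a Tensor that must have the same data type as input.

strides: shape=[strides_batch, strides_depth, strides_height, strides_width, strides_channels], a 1-D tensor of length 5 giving the sliding-window stride along each dimension of the input; in most cases strides_batch = strides_channels = 1.

padding: 'SAME' or 'VALID', which select different padding schemes; 'SAME' pads with zeros, 'VALID' uses no padding.

data_format: specifies what each dimension of the input and output data represents. The default is "NDHWC", in which the data is stored in the order [batch, in_depth, in_height, in_width, in_channels];

"NCDHW" instead stores it as [batch, in_channels, in_depth, in_height, in_width].

name: an optional name for the operation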

    import tensorflow as tf
    import numpy as np

    # Define a matrix a as the input, i.e. the matrix to be convolved
    a = np.array(np.arange(1, 126).reshape([1, 5, 5, 5, 1]), dtype=np.float32)
    #print(a)
    print(a.shape)

    # Define the convolution kernel; there is 1 kernel (out_channels = 1)
    kernel = np.array(np.arange(1, 9), dtype=np.float32).reshape([2, 2, 2, 1, 1])
    #print(kernel)
    print(kernel.shape)

    # Define the strides
    strides = [1, 1, 1, 1, 1]
    # Apply the conv3d convolution
    conv3d = tf.nn.conv3d(a, kernel, strides, 'VALID')

    with tf.Session() as sess:
        # Initialize variables
        tf.global_variables_initializer().run()
        # Print the convolution result
        #print(sess.run(conv3d))
        print(conv3d.shape)


    Output:
    (1, 5, 5, 5, 1)
    (2, 2, 2, 1, 1)
    (1, 4, 4, 4, 1)
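
The 3-D case adds one more sliding direction (depth, i.e. frames). As a sketch under the same simplifications (one sample, one channel, 'VALID' padding, stride 1), the result can be built in NumPy by accumulating one shifted copy of the volume per kernel element. This is an alternative formulation chosen for brevity, not TensorFlow's implementation.

```python
import numpy as np

volume = np.arange(1, 126, dtype=np.float32).reshape(5, 5, 5)  # [depth, height, width]
kernel = np.arange(1, 9, dtype=np.float32).reshape(2, 2, 2)    # [fd, fh, fw]

# Each output dimension shrinks to in_size - filter_size + 1.
out_shape = tuple(v - k + 1 for v, k in zip(volume.shape, kernel.shape))
od, oh, ow = out_shape
output = np.zeros(out_shape, dtype=np.float32)

# For every kernel element, add the correspondingly shifted sub-volume,
# scaled by that element. Summing all shifts gives the convolution.
for kd in range(kernel.shape[0]):
    for kh in range(kernel.shape[1]):
        for kw in range(kernel.shape[2]):
            shifted = volume[kd:kd + od, kh:kh + oh, kw:kw + ow]
            output += kernel[kd, kh, kw] * shifted

print(output.shape)  # (4, 4, 4)
# First value: 816; last value: 4164
print(output[0, 0, 0], output[3, 3, 3])
```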

---------------------
Original: https://blog.csdn.net/qq_40196164/article/details/83176564
Copyright notice: This is an original article by the blogger; please include a link to the post when reposting.
