Plain Talk: The tf.nn.conv2d Method

Python图像识别

Category: Artificial Intelligence. Tags: convolutional neural networks, tensorflow, computer vision, deep learning, convolution

Article link: https://blog.csdn.net/qq_28949847/article/details/106178482

Method definition

tf.nn.conv2d (input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

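Parameters:

input: the images to convolve, given as a tensor. It must be 4-dimensional, with shape [batch, in_height, in_width, in_channels], where batch is the number of images, in_height is the image height, in_width is the image width, and in_channels is the number of image channels (1 for a grayscale image, 3 for an RGB color image).

The input must be a floating-point type such as float32 or float64. Note that integer image data will not work.
Use tf.cast to convert the type if needed.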
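
As a rough illustration of the shape and dtype requirements above, the sketch below turns a single grayscale image into a batch of one in NHWC layout. It uses plain NumPy so it runs without TensorFlow; on an existing tensor, tf.cast(x, tf.float32) would perform the dtype conversion instead. The `image` array is a made-up stand-in for real pixel data.

```python
import numpy as np

# Hypothetical 32x32 grayscale image with integer pixel values in 0..255.
image = (np.arange(32 * 32) % 256).astype(np.uint8).reshape(32, 32)

# Convert to float32, then add the batch axis (front) and the channel axis (back)
# to get the [batch, in_height, in_width, in_channels] layout.
batch = image.astype(np.float32)[np.newaxis, :, :, np.newaxis]

print(batch.shape)  # (1, 32, 32, 1)
print(batch.dtype)  # float32
```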

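filter: the convolution kernel, also a tensor, with shape [filter_height, filter_width, in_channels, out_channels], where filter_height is the kernel height, filter_width is the kernel width, in_channels is the number of input channels and must match the in_channels of input, and out_channels is the number of kernels, which becomes the number of output channels.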
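
To make the role of each filter dimension concrete, here is a minimal NumPy sketch of an unpadded NHWC convolution. It is not how TensorFlow implements tf.nn.conv2d; it is a naive loop, written under the assumption of no padding and equal strides in both directions, and the name `conv2d_valid` is hypothetical. Like tf.nn.conv2d, it does not flip the kernel (it computes a cross-correlation).

```python
import numpy as np

def conv2d_valid(x, w, stride=1):
    """Naive NHWC convolution without padding (illustrative helper)."""
    batch, in_h, in_w, in_c = x.shape
    f_h, f_w, w_in_c, out_c = w.shape
    assert in_c == w_in_c, "filter in_channels must match input in_channels"
    out_h = (in_h - f_h) // stride + 1
    out_w = (in_w - f_w) // stride + 1
    out = np.zeros((batch, out_h, out_w, out_c), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            # patch has shape (batch, f_h, f_w, in_c)
            patch = x[:, top:top + f_h, left:left + f_w, :]
            # Contract the patch's (f_h, f_w, in_c) axes with the filter's
            # first three axes, leaving (batch, out_c).
            out[:, i, j, :] = np.tensordot(patch, w, axes=([1, 2, 3], [0, 1, 2]))
    return out

x = np.random.randn(1, 32, 32, 1)   # [batch, in_height, in_width, in_channels]
w = np.random.randn(5, 5, 1, 8)     # [filter_height, filter_width, in_channels, out_channels]
y = conv2d_valid(x, w)
print(y.shape)  # (1, 28, 28, 8)
```

Each of the 8 kernels produces one output channel, which is why the last dimension of the result equals out_channels.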

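strides:
The kernel slides across the image and convolves at each position to extract features. Two questions follow: along which dimensions does it slide, and how far per step? The strides argument answers both.
In short, strides describes how the kernel moves.
It gives the step size along each dimension of the input, in the order [batch, height, width, channels]. It is a 1-D vector of length 4, conventionally written as strides = [1, x, y, 1]: strides[0] = 1 is the step across samples, strides[3] = 1 is the step across channels, and strides[1] and strides[2] are the vertical and horizontal step sizes you choose.

This ordering applies when data_format is the default NHWC. data_format also accepts other values such as NCHW, but it is rarely changed and there is usually little reason to do so.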
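
The effect of the height and width strides can be sketched by listing where each kernel window starts along one axis. This is only an illustration that assumes no padding; `window_starts` is a hypothetical helper, not a TensorFlow function.

```python
def window_starts(in_size, filter_size, stride):
    """Start indices of each kernel window along one axis, with no padding."""
    return list(range(0, in_size - filter_size + 1, stride))

strides = [1, 2, 2, 1]  # [batch, height, width, channels] under NHWC
rows = window_starts(32, 5, strides[1])  # vertical positions
cols = window_starts(32, 5, strides[2])  # horizontal positions

print(rows[:4])              # [0, 2, 4, 6]
print(len(rows), len(cols))  # 14 14
```

With a stride of 2 the kernel visits every second position, so a 32 x 32 input and a 5 x 5 kernel give a 14 x 14 output instead of 28 x 28.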

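data_format: defaults to NHWC.
padding: a string, either "SAME" or "VALID", that determines how the convolution treats the image border. "SAME" pads the border with zeros wherever the kernel would otherwise run past the edge; "VALID" adds no padding and uses only positions where the kernel fits entirely inside the image.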

use_cudnn_on_gpu: a bool that controls whether cuDNN acceleration is used on the GPU; defaults to True.
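
The two padding modes described above lead to different output sizes. The sketch below computes the size along one axis and, for "SAME", how many zeros are added before and after the data. The function names are hypothetical; the formulas follow TensorFlow's documented convention, in which any odd leftover zero goes at the end (bottom or right).

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Output size along one spatial axis for the given padding mode."""
    if padding == "VALID":
        return (in_size - filter_size) // stride + 1
    if padding == "SAME":
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

def same_padding(in_size, filter_size, stride):
    """Zeros added (before, after) along one axis under 'SAME' padding."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + filter_size - in_size, 0)
    before = total // 2
    return before, total - before

print(conv_output_size(32, 5, 1, "VALID"))  # 28
print(conv_output_size(32, 5, 1, "SAME"))   # 32
print(same_padding(32, 5, 1))               # (2, 2)
print(same_padding(32, 4, 1))               # (1, 2)
```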

How each dimension changes, explained in code

import tensorflow as tf
import numpy as np


# input: [batch, in_height, in_width, in_channels]
input_data = np.random.randn(32, 32).reshape(1, 32, 32, 1)
# filter: [filter_height, filter_width, in_channels, out_channels]
# 8: number of output channels   1: matches the in_channels (1) of input_data
filter_ = np.random.randn(5, 5, 8).reshape(5, 5, 1, 8)

# First convolution layer
conv = tf.nn.conv2d(input_data, filter_, strides=[1, 1, 1, 1], padding='VALID')
# VALID: no zero padding. 28 = 32 - 5 + 1; with stride 1, height and width both follow this formula
# SAME: zero padding, so the output would be 32 x 32, the same size as the input.
# (1, 28, 28, 8)
print(conv.shape)
# Pooling layer; its strides and padding arguments work as in tf.nn.conv2d
# ksize [1, 2, 2, 1]: batch: 1  height: 2  width: 2  channels: 1
max_pool = tf.nn.max_pool(conv, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
# (1, 14, 14, 8)
print(max_pool.shape)

# Activation layer; does not change the shape
relu = tf.nn.relu(max_pool)
# (1, 14, 14, 8)
print(relu.shape)

# dropout
# rate is the fraction dropped, equivalent to the older keep_prob=0.6;
# older TensorFlow 1.x releases may only accept keep_prob
dropout = tf.nn.dropout(relu, rate=0.4)
# (1, 14, 14, 8)
print(dropout.shape)

# Second convolution layer
# height: 5  width: 5  in_channels: 8 (matches the 8 channels of dropout)  out_channels: 20
filter2_ = np.random.randn(5, 5, 8, 20)
conv2 = tf.nn.conv2d(dropout, filter2_, strides=[1, 1, 1, 1], padding='VALID')
# (1, 10, 10, 20)
print(conv2.shape)

# Pooling layer
# ksize [1, 2, 2, 1]: batch: 1  height: 2  width: 2  channels: 1
max_pool = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
# (1, 5, 5, 20)
print(max_pool.shape)
#
# # Activation layer
# sigmoid = tf.nn.sigmoid(max_pool)
# # (1, 5, 5, 20)
# print(sigmoid.shape)
#
# # dropout
# dropout2 = tf.nn.dropout(sigmoid, rate=0.5)
# # (1, 5, 5, 20)
# print(dropout2.shape)
#
# # Fully connected layer
# # 500: from the shape above, 5 * 5 * 20
# dense = np.random.randn(500, 120)
# fc = tf.reshape(dropout2, shape=[1, 5*5*20])
# conn = tf.matmul(fc, dense)
# # (1, 120)
# print(conn.shape)
#
# # Output layer
# w = np.random.randn(120, 9)
#
# b = np.random.randn(9)
#
# out = tf.matmul(conn, w) + b
# # (1, 9)
# print(out.shape)
#
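
The shapes printed in the walkthrough above can be checked without running TensorFlow. The sketch below traces only the spatial size and channel count through the same sequence of layers; the `layers` table and `out_size` helper are illustrative constructs, not part of any library.

```python
import math

def out_size(in_size, window, stride, padding):
    """Spatial output size of a conv or pooling layer along one axis."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    return (in_size - window) // stride + 1  # "VALID"

# (name, window, stride, padding, out_channels); None keeps the channel count.
layers = [
    ("conv1", 5, 1, "VALID", 8),
    ("pool1", 2, 2, "SAME", None),
    ("conv2", 5, 1, "VALID", 20),
    ("pool2", 2, 2, "SAME", None),
]

size, channels = 32, 1  # the 32 x 32 single-channel input
shapes = {}
for name, window, stride, padding, out_channels in layers:
    size = out_size(size, window, stride, padding)
    channels = out_channels or channels
    shapes[name] = (1, size, size, channels)
    print(name, shapes[name])

flat = size * size * channels
print("flatten", (1, flat))  # flatten (1, 500)
```

The flattened size of 500 is what the fully connected weight matrix of shape (500, 120) expects; ReLU, sigmoid and dropout are omitted because they leave the shape unchanged.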
