[Machine Learning Notes] [Convolutional Neural Network Model] deeplearning.ai Course 4, Week 1 programming assignment (TensorFlow 2.0 implementation)

Latest recommended article published on 2021-05-20 14:47:03

LittleSeedling


Category: Beginner Deep Learning

Article link: https://blog.csdn.net/LittleSeedling/article/details/113407437


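Goals:
	Build a simple convolutional neural network
	Port the code from the reference article to TensorFlow 2

Reference: [Chinese] [Andrew Ng programming assignment] Course 4 - Convolutional Neural Networks - Week 1 assignment - Building and applying a convolutional neural network model (1 & 2)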

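1.3.1 Boundary padding

What padding does:

It lets the output of a convolution keep the height and width of the input.
It preserves more of the information at the image border (without padding, a corner pixel is covered by only one window).

np.pad

np.pad(
	array,
	(
		(n_before, n_after),  # padding along the sample axis
		(h_before, h_after),  # padding along the image height
		(w_before, w_after),  # padding along the image width
		(c_before, c_after),  # padding along the channel axis
	),
	mode="constant",      # "constant" pads with a constant value
	constant_values=0     # the constant used for padding
)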
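As a minimal sketch of the call above, the following pads only the height and width of a batch of images with zeros. It assumes the (samples, height, width, channels) layout used later in this post; the helper name zero_pad and the sample shapes are chosen here purely for illustration.

```python
import numpy as np

def zero_pad(X, pad):
    """Zero-pad the height and width of a batch X of shape (m, n_H, n_W, n_C)."""
    return np.pad(
        X,
        ((0, 0), (pad, pad), (pad, pad), (0, 0)),  # no padding on samples or channels
        mode="constant",
        constant_values=0,
    )

x = np.ones((4, 3, 3, 2))
x_pad = zero_pad(x, 2)
print(x.shape, x_pad.shape)  # (4, 3, 3, 2) (4, 7, 7, 2)
```

Only the two spatial axes grow, by 2 * pad each; the sample and channel axes are left alone.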

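1.3.2 Single-step convolution

A single convolution step multiplies one window (slice) of the input element-wise by the filter W and sums all the products.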
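The single step described above might be sketched in NumPy as follows. The function name conv_single_step and the toy window are illustrative assumptions, and no bias is included; if a bias were used, it would simply be added to the returned sum.

```python
import numpy as np

def conv_single_step(a_slice, W):
    """Multiply one window by one filter element-wise and sum all products."""
    return float(np.sum(a_slice * W))

a_slice = np.arange(8, dtype=float).reshape(2, 2, 2)  # a 2x2 window with 2 channels
W = np.ones((2, 2, 2))                                  # a filter of the same shape
print(conv_single_step(a_slice, W))  # 28.0
```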

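1.3.3 Convolution

A convolution operates on all input channels at once.

The output size after the convolution is:
$n_H^{[l]} = \lfloor {n_{H}^{[l-1]}- f^{[l]} + 2 \cdot p^{[l]} \over s^{[l]}} \rfloor +1$

$n_W^{[l]} = \lfloor {n_{W}^{[l-1]}- f^{[l]} + 2 \cdot p^{[l]} \over s^{[l]}} \rfloor +1$

$n_C^{[l]} = \text{number of filters}$
where $n_H, n_W$ are the height and width of the image, $^{[l]}$ is the layer index, $f$ is the filter size, $p$ is the amount of padding, and $s$ is the convolution stride.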
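The output-size formula can be checked with a few lines of Python. The function name conv_output_size and the example values are assumptions made for illustration; integer floor division plays the role of the floor brackets.

```python
def conv_output_size(n_prev, f, p, s):
    """Spatial output size: floor((n_prev - f + 2p) / s) + 1."""
    return (n_prev - f + 2 * p) // s + 1

print(conv_output_size(64, 4, 0, 1))  # 61
print(conv_output_size(64, 4, 2, 2))  # 33
print(conv_output_size(7, 3, 1, 2))   # 4
```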

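1.4 Pooling layer

What pooling does:

It reduces the width and height of the input, which reduces the computation in later layers.
It makes feature detectors more robust to where a feature appears in the input.

There are two kinds of pooling:

Max pooling: outputs the maximum value in the window
Average pooling: outputs the mean of all values in the window

The output size after pooling follows the same formula as for convolution, except that pooling usually uses no padding.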
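Both pooling modes can be sketched with a naive NumPy loop. This is an illustrative version without padding, assuming an input of shape (samples, height, width, channels); the name pool_forward and the mode strings are choices made here, not part of any library.

```python
import numpy as np

def pool_forward(A, f, stride, mode="max"):
    """Naive pooling without padding over A of shape (m, n_H, n_W, n_C)."""
    m, n_H_prev, n_W_prev, n_C = A.shape
    n_H = (n_H_prev - f) // stride + 1
    n_W = (n_W_prev - f) // stride + 1
    out = np.zeros((m, n_H, n_W, n_C))
    reduce_fn = np.max if mode == "max" else np.mean
    for h in range(n_H):
        for w in range(n_W):
            window = A[:, h * stride:h * stride + f, w * stride:w * stride + f, :]
            out[:, h, w, :] = reduce_fn(window, axis=(1, 2))  # reduce over the window
    return out

A = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
max_out = pool_forward(A, 2, 2, "max")[0, :, :, 0]      # [[5, 7], [13, 15]]
avg_out = pool_forward(A, 2, 2, "average")[0, :, :, 0]  # [[2.5, 4.5], [10.5, 12.5]]
print(max_out)
print(avg_out)
```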

1.5 Backpropagation

Backpropagation is handled by the framework, so it is not implemented here.

2.1.0 TensorFlow model

The model is roughly:
CONV2D→RELU→MAXPOOL→CONV2D→RELU→MAXPOOL→FULLYCONNECTED

import time
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import cnn_utils

import os

# Do not use the GPU
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

np.random.seed(1)

# Load the data
def get_data():
    X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = cnn_utils.load_dataset()

    # index = 6
    # plt.imshow(X_train_orig[index])
    # print("y = " + str(np.squeeze(Y_train_orig[:, index])))
    # plt.show()
    """
        X_train_orig is an array of shape (1080, 64, 64, 3)
    """
    # Normalize the pixel values to [0, 1]
    X_train = X_train_orig / 255
    X_test = X_test_orig / 255

    # Convert Y to one-hot encoding, transposed to (number of samples, 6)
    Y_train = cnn_utils.convert_to_one_hot(Y_train_orig, 6).T
    Y_test = cnn_utils.convert_to_one_hot(Y_test_orig, 6).T

    return X_train, Y_train, X_test, Y_test, classes

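No placeholders need to be created, because TensorFlow 2 executes eagerly by default.

2.1.1 Initializing parameters

The original post says that "we don't need to worry about the bias b; TensorFlow takes care of it". I did not look into whether that is really the case; at least tf.nn.conv2d, as used below, only computes the convolution and adds no bias itself.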

def initialize_parameters():
    """
    Initialize the weights. The shapes are hard-coded:
    W1 : [4, 4, 3, 8]
    W2 : [2, 2, 8, 16]
    W3 : [64, 6]
    b3 : [1]
    Returns:
        a dictionary holding the tf.Variable objects W1, W2, W3 and b3
    """
    tf.random.set_seed(1)

    # Initializer (He uniform, seed 0)
    initializer = tf.initializers.he_uniform(0)

    # f=4x4, input channels=3, n_c=8
    W1 = tf.Variable(initializer([4, 4, 3, 8]), name="W1")
    # f=2x2, input channels=8, n_c=16
    W2 = tf.Variable(initializer([2, 2, 8, 16]), name="W2")

    # Parameters of the fully connected layer
    W3 = tf.Variable(initializer([64, 6]),name="W3")
    # b3 has shape [1], so the same scalar is added to every class score;
    # a per-class bias would need shape [6]
    b3 = tf.Variable(initializer([1]),name="b3")

    parameters = {
        "W1": W1,
        "W2": W2,
        "W3": W3,
        "b3": b3
    }
    return parameters

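Unlike the original post, the parameters of the fully connected layer have to be added by hand here, so that they can be used later for computing gradients and updating the parameters.

PS: Of course, a fully connected layer can also be created with full_connection_layer = tf.keras.layers.Dense(6, activation=None), and Z3 = full_connection_layer(P) then performs its forward pass; its parameters are available afterwards through full_connection_layer.variables. I found that less convenient than writing the layer by hand. (I had assumed that if the forward pass inside the automatic differentiation block used this layer, TensorFlow would automatically include its parameters in the update. It does not: the layer's parameters still have to be added explicitly to the variables that are differentiated and updated. ~~I was stuck on this bug for a whole evening.~~)

Test: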

parameters = initialize_parameters()
print("W1 = " + str(parameters["W1"].numpy()[1,1,1]))
print("W2 = " + str(parameters["W2"].numpy()[1,1,1]))
print("W3 = " + str(parameters["W3"].numpy()[1,1]))
print("b3 = " + str(parameters["b3"].numpy()))

Output:

W1 = [-0.10238287  0.35137287 -0.023274    0.00264338  0.00247917 -0.05561143
 -0.21562235 -0.27392948]
W2 = [-0.29680836  0.1650249  -0.12899727 -0.03889439  0.42350248 -0.11916256
  0.3731927  -0.15247756 -0.28602475 -0.3382344  -0.39762798  0.27272776
  0.22672692 -0.21311466 -0.09157833  0.05948022]
W3 = -0.18558505
b3 = [-1.4157294]

2.1.2 Forward propagation

def forward_propagation(X, parameters):
    """
    Forward propagation:
    CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED
    Arguments:
        X - input of shape (number of samples, height, width, channels)
        parameters - Python dictionary containing "W1", "W2", "W3" and "b3"

    Returns:
        Z3 - output of the last LINEAR unit
    """
    # Retrieve the parameters
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    W3 = parameters["W3"]
    b3 = parameters["b3"]
    """
        Shapes for a mini-batch of 64 samples:
        Z1.shape (64, 64, 64, 8)
        P1.shape (64, 8, 8, 8)
        Z2.shape (64, 8, 8, 16)
        P2.shape (64, 2, 2, 16)
    """
    # conv2d: s=1, pad=same
    Z1 = tf.nn.conv2d(X, W1, strides=[1, 1, 1, 1], padding="SAME")
    # relu
    A1 = tf.nn.relu(Z1)
    # max pool: f=8x8, s=8
    P1 = tf.nn.max_pool(A1, ksize=[1, 8, 8, 1], strides=[1, 8, 8, 1], padding="SAME")

    # conv2d: s=1, pad=same
    Z2 = tf.nn.conv2d(P1, W2, strides=[1, 1, 1, 1], padding="SAME")
    # relu
    A2 = tf.nn.relu(Z2)
    # max pool: f=4x4, s=4
    P2 = tf.nn.max_pool(A2, ksize=[1, 4, 4, 1], strides=[1, 4, 4, 1], padding="SAME")

    # Flatten
    P = tf.reshape(P2, (P2.shape[0], -1))
    # out: 64x64 (number of samples x features)

    # Fully connected layer without a non-linear activation
    # full_connect_model = tf.keras.layers.Dense(6, activation=None)
    # Z3 = full_connect_model(P)
    Z3 = tf.matmul(P,W3) + b3
    # out: 64x6

    return Z3

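Note that the input X has the shape (number of samples, height, width, channels).
The output size after pooling also differs slightly from what Andrew Ng defines, because of the padding (see the padding section later on).
The fully connected layer is implemented by hand here.
Z3 has the shape (number of samples, number of classes), here (number of samples, 6).

Test: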

X = np.random.randn(2,64,64,3)
parameters = initialize_parameters()
Z3 = forward_propagation(X,parameters)
print("Z3 = ",Z3)

Output:

Z3 =  tf.Tensor(
[[-1.9189914  5.236899  -2.1311748 -4.4184475 -3.7767754 -3.9814925]
 [-1.3195286  4.619976  -2.6811643 -5.2443013 -3.3180122 -3.3682647]], shape=(2, 6), dtype=float32)

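2.1.3 Computing the cost

def compute_cost(Z3, Y):
    """
    Compute the cost
    Arguments:
        Z3 - output of the last LINEAR unit of forward propagation, of shape (number of samples, 6)
        Y - labels, same shape as Z3
    Returns:
        cost - the computed cost
    """

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Z3, labels=Y))

    return cost

Test: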

X = np.random.randn(4,64,64,3)
Y = np.random.randn(4,6)
parameters = initialize_parameters()
Z3 = forward_propagation(X,parameters)
print("Z3 =",Z3)
print("Y =",Y,Y.shape)
cost = compute_cost(Z3,Y)
print("cost =",cost)

Output:

Z3 = tf.Tensor(
[[-1.9189913   5.2368994  -2.1311746  -4.418448   -3.7767758  -3.981492  ]
 [-1.3195288   4.619976   -2.6811643  -5.244302   -3.3180122  -3.3682647 ]
 [-1.1359475   4.538566   -2.0724049  -5.7774     -3.6145096  -3.2256317 ]
 [-2.8010159   5.0077     -0.25371218 -4.8593636  -3.7319794  -2.9729443 ]], shape=(4, 6), dtype=float32)
Y = [[-0.3153977   1.24542971 -1.30459226 -1.25915944  0.48752929 -0.24841797]
 [-1.28537295  2.04446858  0.14148981  1.33236812  1.26680127  0.90926308]
 [-0.44260544  0.73702333 -0.93976884  0.90032918 -0.29123109 -1.58389475]
 [ 0.05890907  0.37748493  1.84972109 -0.42243608 -0.32072321 -0.89134773]] (4, 6)
cost = tf.Tensor(-4.0131383, shape=(), dtype=float32)

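Notes:

The dimensions of Z3 are transposed relative to the original post.
In TensorFlow, computation is normally done row by row (row vectors, one sample per row).
The cost printed above is negative only because Y is drawn from np.random.randn and is not a valid label distribution; with one-hot labels the cross-entropy is never negative.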
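To make the cost computation concrete, here is a NumPy sketch of the mean softmax cross-entropy with one sample per row, mirroring what compute_cost asks TensorFlow to do. The function name softmax_cross_entropy and the toy logits are assumptions for illustration; this is not TensorFlow's implementation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy; logits and labels have shape (samples, classes)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(np.mean(-(labels * log_probs).sum(axis=1)))

logits = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # one-hot rows
cost = softmax_cross_entropy(logits, labels)
shifted_cost = softmax_cross_entropy(logits + 3.0, labels)
print(cost, shifted_cost)
```

Adding the same constant to every logit of a row leaves the cost unchanged, which is why a bias of shape [1] shared by all classes, such as b3 above, has no effect on this loss.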

2.1.4 Building the model

Prediction function

def predict(X_train, Y_train, X_test, Y_test, parameters):
    ##################################### Training set
    # Predicted scores
    Z = forward_propagation(X_train, parameters) # out: samples x classes
    # Index of the largest score in each row
    Z = tf.argmax(Z, axis=1)
    # Convert to one-hot form
    # axis=1 places the one-hot dimension along each row
    Z = tf.one_hot(Z, depth=Y_train.shape[1], axis=1)
    # Cast to the dtype of the labels it will be compared with
    Z = tf.cast(Z, dtype=Y_train.dtype)

    # Compare predictions with the true labels element-wise;
    # this gives an array of True/False values
    correct = tf.equal(Z, Y_train)

    # Cast the boolean array to int64
    correct = tf.cast(correct, dtype=tf.int64)
    # Integer mean over each row: 1 only if every entry in the row matches, otherwise 0
    correct = tf.reduce_mean(correct, axis=1)
    # Cast again so that no fractional values remain
    correct = tf.cast(correct, dtype=tf.int64)

    # Number of correct predictions: sum over all samples
    total_correct = tf.reduce_sum(correct)
    # Number of samples, m
    total_number = X_train.shape[0]

    # Accuracy = correct predictions / total samples
    train_acc = total_correct / total_number
    print("Training set accuracy:", train_acc.numpy())

    ################################## Test set
    # Predicted scores
    Z = forward_propagation(X_test, parameters)
    # Index of the largest score in each row
    Z = tf.argmax(Z, axis=1)
    # Convert to one-hot form
    # axis=1 places the one-hot dimension along each row
    Z = tf.one_hot(Z, depth=Y_test.shape[1], axis=1)
    # Cast to the dtype of the labels it will be compared with
    Z = tf.cast(Z, dtype=Y_test.dtype)

    # Compare predictions with the true labels element-wise;
    # this gives an array of True/False values
    correct = tf.equal(Z, Y_test)

    # Cast the boolean array to int64
    correct = tf.cast(correct, dtype=tf.int64)
    # Integer mean over each row: 1 only if every entry in the row matches, otherwise 0
    correct = tf.reduce_mean(correct, axis=1)
    # Cast again so that no fractional values remain
    correct = tf.cast(correct, dtype=tf.int64)

    # Number of correct predictions: sum over all samples
    total_correct = tf.reduce_sum(correct)
    # Number of samples, m
    total_number = X_test.shape[0]

    # Accuracy = correct predictions / total samples
    test_acc = total_correct / total_number
    print("Test set accuracy:", test_acc.numpy())

    return train_acc, test_acc

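Because the input matrices are transposed here (one sample per row), this function differs slightly from the one in the TensorFlow 2.0 introductory assignment.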
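The accuracy computed in predict can also be expressed by comparing class indices directly instead of whole one-hot rows. The following NumPy sketch shows that alternative; the function name accuracy and the toy data are assumptions for illustration, and this is a different formulation from the code above, not a copy of it.

```python
import numpy as np

def accuracy(logits, one_hot_labels):
    """Fraction of rows whose highest-scoring class matches the one-hot label."""
    predicted = np.argmax(logits, axis=1)
    expected = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == expected))

logits = np.array([[0.1, 2.0, 0.3],
                   [1.5, 0.2, 0.1],
                   [0.0, 0.1, 3.0],
                   [2.0, 1.0, 0.0]])
labels = np.eye(3)[[1, 0, 2, 1]]  # true classes 1, 0, 2, 1
print(accuracy(logits, labels))  # 0.75
```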

model

def model(X_train, Y_train, X_test, Y_test, learning_rate=0.009,
          num_epochs=100, mini_batch_size=64, print_cost=True, is_plot=True):
    """
    A three-layer convolutional neural network in TensorFlow
    CONV2D -> RELU -> MAXPOOL -> CONV2D -> RELU -> MAXPOOL -> FLATTEN -> FULLYCONNECTED
    Arguments:
        X_train - training data, of shape (number of samples, 64, 64, 3)
        Y_train - training labels, of shape (number of samples, n_y = 6)
        X_test - test data, of shape (number of samples, 64, 64, 3)
        Y_test - test labels, of shape (number of samples, n_y = 6)
        learning_rate - learning rate
        num_epochs - number of passes over the whole data set
        mini_batch_size - size of each mini-batch
        print_cost - whether to record the cost and print it every 5 epochs
        is_plot - whether to plot the cost curve
    Returns:
        parameters - the learned parameters
        train_accuracy - real number, accuracy on the training set
        test_accuracy - real number, accuracy on the test set
    """
    tf.random.set_seed(1)
    # Seed for the NumPy shuffling of the mini-batches
    seed = 3
    m, n_H0, n_W0, n_C0 = X_train.shape
    n_y = Y_train.shape[1]

    costs = []

    # Initialize the parameters
    parameters = initialize_parameters()

    # Optimizer
    optimizer = tf.optimizers.Adam(learning_rate=learning_rate)

    # Training loop
    for epoch in range(num_epochs):
        # Cost of this epoch
        epoch_cost = 0
        # Number of full mini-batches (random_mini_batches may also return
        # one smaller final batch, which this count ignores)
        num_mini_batches = m // mini_batch_size

        # Reshuffle the mini-batches every epoch
        seed = seed + 1
        mini_batches = cnn_utils.random_mini_batches(X_train, Y_train, mini_batch_size, seed)

        # For each mini-batch
        for step, (X, Y) in enumerate(mini_batches):
            with tf.GradientTape() as tape:
                # Forward propagation
                Z3 = forward_propagation(X, parameters)

                # Compute the cost
                mini_batch_cost = compute_cost(Z3, Y)

                epoch_cost = epoch_cost + mini_batch_cost / num_mini_batches

            # Compute the gradients
            grads = tape.gradient(mini_batch_cost, list(parameters.values()))

            # Update the parameters
            optimizer.apply_gradients(grads_and_vars=zip(grads, list(parameters.values())))

        # Record and print the cost
        if print_cost:
            costs.append(epoch_cost)
            if epoch % 5 == 0:
                print("Epoch " + str(epoch) + ", cost: " + str(epoch_cost.numpy()))

    # Training finished, plot the cost curve (one point per epoch)
    if is_plot:
        plt.plot(np.squeeze(costs))
        plt.ylabel("cost")
        plt.xlabel("epochs")
        plt.title("learning rate =" + str(learning_rate))
        plt.show()

    train_accuracy, test_accuracy = predict(X_train, Y_train, X_test, Y_test,parameters)

    return parameters, train_accuracy, test_accuracy

Test

def main():
    X_train, Y_train, X_test, Y_test, classes = get_data()
    # Start time
    start_time = time.perf_counter()
    # Train; model returns the parameters and both accuracies
    parameters, train_accuracy, test_accuracy = model(X_train, Y_train, X_test, Y_test, num_epochs=1000,learning_rate=0.009)
    # End time
    end_time = time.perf_counter()
    # Elapsed time
    print("Execution time on CPU = " + str(end_time - start_time) + " seconds")
if __name__ == '__main__':
    main()

epoch=150,learning_rate=0.009,mini_batch_size=64

[Figure: cost curve for this run (image not preserved)]

Epoch 0, cost: 1.9711732
Epoch 5, cost: 1.6441493
Epoch 10, cost: 0.9809544
Epoch 15, cost: 0.74599206
Epoch 20, cost: 0.52382654
Epoch 25, cost: 0.41984344
Epoch 30, cost: 0.3410723
Epoch 35, cost: 0.31803787
Epoch 40, cost: 0.23219788
Epoch 45, cost: 0.21172494
Epoch 50, cost: 0.1985099
Epoch 55, cost: 0.16930607
Epoch 60, cost: 0.114748925
Epoch 65, cost: 0.09465394
Epoch 70, cost: 0.1167781
Epoch 75, cost: 0.10553896
Epoch 80, cost: 0.080426455
Epoch 85, cost: 0.087554045
Epoch 90, cost: 0.07595678
Epoch 95, cost: 0.06289906
Epoch 100, cost: 0.04345636
Epoch 105, cost: 0.050669502
Epoch 110, cost: 0.07043151
Epoch 115, cost: 0.023179952
Epoch 120, cost: 0.03279618
Epoch 125, cost: 0.019446563
Epoch 130, cost: 0.014923153
Epoch 135, cost: 0.012299964
Epoch 140, cost: 0.015927585
Epoch 145, cost: 0.008962453
Training set accuracy: 1.0
Test set accuracy: 0.9
Execution time on CPU = 263.3198113 seconds

The result is slightly better than in the original post.

epoch=1000,learning_rate=0.009,mini_batch_size=64

[Figure: cost curve for this run (image not preserved)]

Epoch 0, cost: 1.9711732
Epoch 5, cost: 1.6441493
Epoch 10, cost: 0.9809544
Epoch 15, cost: 0.74599206
...
Epoch 195, cost: 0.005246058
Epoch 200, cost: 0.0045862542
Epoch 205, cost: 0.0041113026
...
Epoch 550, cost: 0.000109061075
Epoch 555, cost: 0.000102569014
Epoch 560, cost: 9.616972e-05
Epoch 565, cost: 9.2503054e-05
...
Epoch 990, cost: 1.489012e-06
Epoch 995, cost: 1.3689555e-06
Training set accuracy: 1.0
Test set accuracy: 0.875
Execution time on CPU = 1360.6341113 seconds

The figure shows that the cost barely changes after epoch 200.

After 1000 epochs the test accuracy actually dropped (0.875, compared with 0.9 after 150 epochs), which suggests overfitting.

3.0 Supporting library code (modified for TF2)

import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
def load_dataset():
    train_dataset = h5py.File('datasets/train_signs.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
    test_dataset = h5py.File('datasets/test_signs.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels
    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0):
    """
    Creates a list of random minibatches from (X, Y)
    Arguments:
    X -- input data, of shape (m, Hi, Wi, Ci)
    Y -- true "label" matrix in one-hot form, of shape (m, n_y)
    mini_batch_size -- size of the mini-batches, integer
    seed -- this is only for the purpose of grading, so that your "random" minibatches are the same as ours.
    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    m = X.shape[0]                  # number of training examples
    mini_batches = []
    np.random.seed(seed)
    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation,:,:,:]
    shuffled_Y = Y[permutation,:]
    # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case.
    num_complete_minibatches = math.floor(m/mini_batch_size) # number of mini batches of size mini_batch_size in your partitionning
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[k * mini_batch_size : k * mini_batch_size + mini_batch_size,:,:,:]
        mini_batch_Y = shuffled_Y[k * mini_batch_size : k * mini_batch_size + mini_batch_size,:]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    # Handling the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[num_complete_minibatches * mini_batch_size : m,:,:,:]
        mini_batch_Y = shuffled_Y[num_complete_minibatches * mini_batch_size : m,:]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    return mini_batches
def convert_to_one_hot(Y, C):
    Y = np.eye(C)[Y.reshape(-1)].T
    return Y
def forward_propagation_for_predict(X, parameters):
    """
    Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX
    Arguments:
    X -- input dataset placeholder, of shape (input size, number of examples)
    parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2", "W3", "b3"
                  the shapes are given in initialize_parameters
    Returns:
    Z3 -- the output of the last LINEAR unit
    """
    # Retrieve the parameters from the dictionary "parameters" 
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    W3 = parameters['W3']
    b3 = parameters['b3'] 
                                                           # Numpy Equivalents:
    Z1 = tf.add(tf.matmul(W1, X), b1)                      # Z1 = np.dot(W1, X) + b1
    A1 = tf.nn.relu(Z1)                                    # A1 = relu(Z1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)                     # Z2 = np.dot(W2, a1) + b2
    A2 = tf.nn.relu(Z2)                                    # A2 = relu(Z2)
    Z3 = tf.add(tf.matmul(W3, A2), b3)                     # Z3 = np.dot(W3,Z2) + b3
    return Z3
def predict(X, parameters):
    W1 = tf.convert_to_tensor(parameters["W1"])
    b1 = tf.convert_to_tensor(parameters["b1"])
    W2 = tf.convert_to_tensor(parameters["W2"])
    b2 = tf.convert_to_tensor(parameters["b2"])
    W3 = tf.convert_to_tensor(parameters["W3"])
    b3 = tf.convert_to_tensor(parameters["b3"])
    params = {"W1": W1,
              "b1": b1,
              "W2": W2,
              "b2": b2,
              "W3": W3,
              "b3": b3}

    z3 = forward_propagation_for_predict(X, params)
    prediction = tf.argmax(z3)

    return prediction
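To illustrate what the convert_to_one_hot and random_mini_batches helpers above produce, here is a small NumPy sketch. The toy labels are made up for illustration; the sizes 1080 and 64 are the training-set size and mini-batch size used in this post.

```python
import numpy as np

# One-hot encoding: np.eye(C)[indices] picks one identity row per label.
labels = np.array([[0, 2, 1]])            # shape (1, m), as returned by load_dataset
one_hot = np.eye(3)[labels.reshape(-1)]   # shape (m, C), one row per sample
print(one_hot)

# Mini-batch partition: full batches plus one smaller final batch if m is not divisible.
m, batch_size = 1080, 64
full, remainder = divmod(m, batch_size)
sizes = [batch_size] * full + ([remainder] if remainder else [])
print(len(sizes), sizes[-1])  # 17 56
```

convert_to_one_hot returns the transpose of this matrix, of shape (C, m), which is why get_data transposes it back to one sample per row.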

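Some other issues

data_format

In TensorFlow, unless stated otherwise,
the input has the data format (number of samples N, height H, width W, image channels C), that is, NHWC.

In a convolution, the filter passed to tf.nn.conv2d has the format (filter height f, filter width f, input channels ic (the number of channels of the previous layer), output channels oc (the number of filters in this layer)).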
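How the two layouts fit together can be seen in a naive NumPy convolution without padding. This is only an illustrative sketch of the shape bookkeeping; the function name conv2d_valid is an assumption and the loop is far slower than tf.nn.conv2d.

```python
import numpy as np

def conv2d_valid(X, W, stride=1):
    """Naive convolution without padding. X: (N, H, W, C_in), W: (f, f, C_in, C_out)."""
    n, h_in, w_in, _ = X.shape
    f, _, _, c_out = W.shape
    h_out = (h_in - f) // stride + 1
    w_out = (w_in - f) // stride + 1
    Z = np.zeros((n, h_out, w_out, c_out))
    for i in range(h_out):
        for j in range(w_out):
            window = X[:, i * stride:i * stride + f, j * stride:j * stride + f, :]
            # Contract the (f, f, C_in) axes of the window with those of the filters
            Z[:, i, j, :] = np.tensordot(window, W, axes=([1, 2, 3], [0, 1, 2]))
    return Z

X = np.random.randn(2, 8, 8, 3)   # NHWC input
W = np.random.randn(4, 4, 3, 8)   # (f, f, ic, oc) filters
print(conv2d_valid(X, W).shape)  # (2, 5, 5, 8)
```

The channel axis of the input is consumed by the filters, and the number of filters becomes the channel axis of the output.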

padding

In TensorFlow,
padding="VALID" means no padding is used.
padding="SAME" means padding is added so that, with a stride of 1, the output has the same size as the input. The TensorFlow documentation gives the output size as follows, where output_spatial_shape depends on the value of padding:

If padding == "SAME": output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])

If padding == "VALID": output_spatial_shape[i] = ceil((input_spatial_shape[i] - (spatial_filter_shape[i]-1) * dilation_rate[i]) / strides[i]).

Source: the official TensorFlow documentation for tf.nn.convolution (see the Returns section for details).
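The two rules quoted above can be turned into a small Python helper and used to trace the spatial sizes listed in forward_propagation. The function name output_size is an assumption for illustration, and a dilation rate of 1 is assumed throughout.

```python
import math

def output_size(n_in, f, stride, padding):
    """Spatial output size for "SAME" or "VALID" padding, assuming dilation rate 1."""
    if padding == "SAME":
        return math.ceil(n_in / stride)
    if padding == "VALID":
        return math.ceil((n_in - (f - 1)) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

n = 64
n = output_size(n, 4, 1, "SAME")  # conv1: 64
n = output_size(n, 8, 8, "SAME")  # pool1: 8
n = output_size(n, 2, 1, "SAME")  # conv2: 8
n = output_size(n, 4, 4, "SAME")  # pool2: 2
print(n, n * n * 16)  # 2 64
```

The final 2 x 2 x 16 = 64 features per sample match the [64, 6] shape of W3.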
