CV-CNN-2014: the VGG model [stacks of repeated 3x3 convolutions to increase network depth] [Design idea: deeper networks help performance; deeper networks are harder to train and prone to overfitting, so small convolution kernels are used] [11, 13, 16, and 19 layers]

VGGNet is a deep convolutional neural network developed jointly by the University of Oxford and Google DeepMind, known for its depth of up to 19 layers. By stacking multiple layers of small 3x3 convolution kernels in place of large kernels, it reduces the number of parameters and improves performance. VGG demonstrated that increasing network depth can effectively improve image-recognition performance, and its fully connected layers can be converted to convolutional layers at test time so the network can handle inputs of different sizes. VGG16 and VGG19 are its classic models and are still used today for image feature extraction.

Original paper: "Very Deep Convolutional Networks for Large-Scale Image Recognition"

In 2014, researchers from the Visual Geometry Group at the University of Oxford and Google DeepMind jointly developed a new deep convolutional neural network, VGGNet. It took second place in the classification task of the ILSVRC 2014 competition, reducing the top-5 error rate to 7.3% (first place went to GoogLeNet, proposed the same year), and first place in the localization task.

Its main contribution is showing that network depth is a key factor in achieving good performance.

VGGNet explored the relationship between the depth of a convolutional neural network and its performance. By successfully building networks 16 to 19 layers deep, it showed that increasing depth can, to a certain extent, improve the final performance and substantially reduce the error rate. The architecture is also highly extensible and transfers well to other image datasets.

The network architectures most commonly used today include ResNet (152 to 1000 layers), GoogLeNet (22 layers), and VGGNet (19 layers). Most models are improvements built on these, adding new optimization algorithms, multi-model ensembling, and so on.

VGGNet can be seen as a deeper version of AlexNet; both consist of two main parts, convolutional layers and fully connected layers.

The VGG Net architecture itself is dated; it is enough to learn its ideas.

That said, VGG Net is still frequently used to extract image features.
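A minimal sketch of that use case, assuming TensorFlow 2 with its bundled Keras applications (the random array below merely stands in for a real image and is not part of the original text): load VGG16 pretrained on ImageNet without its fully connected head and read off the last convolutional feature map.

import numpy as np
import tensorflow as tf

# VGG16 pretrained on ImageNet, without the fully connected head,
# so the output is the final 7x7x512 convolutional feature map.
backbone = tf.keras.applications.VGG16(weights='imagenet', include_top=False,
                                       input_shape=(224, 224, 3))

image = np.random.rand(1, 224, 224, 3).astype('float32') * 255.  # placeholder for a real 224x224 RGB image
image = tf.keras.applications.vgg16.preprocess_input(image)

features = backbone(image)
print(features.shape)  # (1, 7, 7, 512)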


  • VGG leads to the following conclusions:
    1. Increasing depth effectively improves performance;
    2. The best model, VGG16, uses nothing but 3x3 convolutions and 2x2 pooling from start to finish, which is simple and elegant;
    3. Convolutions can replace fully connected layers, so the network can adapt to images of various sizes.

1. Characteristics of VGG

1.1 Simple structure

  • VGG consists of 5 convolutional stages (groups of conv layers), 3 fully connected layers, and a softmax output layer. The stages are separated by max-pooling, and all hidden layers use the ReLU activation function.

1.2 Small kernels and multiple conv sub-layers

  • VGG replaces a convolutional layer with a large kernel by several convolutional layers with small (3x3) kernels. This reduces the number of parameters and, at the same time, amounts to more non-linear mappings, which increases the network's fitting/expressive capacity.
  • Small kernels are an important feature of VGG. Although VGG follows the overall structure of AlexNet, it does not use AlexNet's larger kernel sizes (such as 7x7); instead it shrinks the kernels to 3x3 and increases the number of conv sub-layers to reach the same effect (VGG: 1 to 4 conv sub-layers per stage, AlexNet: 1 sub-layer).
  • The VGG authors observe that stacking two 3x3 convolutions gives the same receptive field as one 5x5 convolution, and stacking three 3x3 convolutions gives the same receptive field as one 7x7 convolution. This adds non-linear mappings while also reducing parameters (per input/output channel pair, one 7x7 kernel has 49 weights, whereas three 3x3 kernels have 27); see the sketch below.
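A minimal sketch of that parameter comparison, as plain Python arithmetic (C denotes the channel count, assumed equal for input and output, which the text does not state explicitly):

# Weights of one 7x7 conv vs. a stack of three 3x3 convs, both mapping
# C input channels to C output channels (biases ignored).
def conv_weights(kernel, c_in, c_out):
    return kernel * kernel * c_in * c_out

C = 256
single_7x7 = conv_weights(7, C, C)        # 49 * C^2
stacked_3x3 = 3 * conv_weights(3, C, C)   # 27 * C^2
print(single_7x7, stacked_3x3, stacked_3x3 / single_7x7)  # the stack needs ~55% of the weights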

1.3 Small pooling kernels

  • Compared with the 3x3 pooling kernels of AlexNet, VGG uses 2x2 pooling kernels throughout.

1.4 More channels

  • The first stage of VGG has 64 channels, and every later stage doubles the count up to a maximum of 512. With more channels, more information can be extracted.

1.5 Deeper layers, wider feature maps

  • Because the convolutions focus on increasing the number of channels while the pooling focuses on shrinking the width and height, the architecture grows deeper and wider while the growth in computation stays under control.

1.6 Fully connected layers converted to convolutions (test phase)

  • Another feature of VGG: at test time, the three fully connected layers used during training are replaced by three convolutional layers. The resulting fully convolutional network is free of the fixed-size constraint imposed by fully connected layers and can therefore accept inputs of arbitrary width and height, which matters at test time.
  • The standard input image is 224x224x3. If the last three layers were fully connected, then at test time every image would first have to be resized to 224x224x3 to match the input size expected by those layers, which is inconvenient.
  • With the "fully connected to convolution" conversion, the replacement works as follows:
  • For example, when a 7x7x512 feature map feeds a fully connected layer of 4096 neurons, that layer is replaced by a convolution over the 7x7x512 map with 4096 output channels and a 7x7 kernel; the two following fully connected layers become 1x1 convolutions.
  • This "FC to conv" idea follows the approach of OverFeat: once the fully connected layers are replaced by convolutions, the network can be run convolutionally over the whole image at any resolution, which removes the need to rescale the original image; a minimal Keras sketch of the conversion follows below.
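A minimal sketch of this conversion in Keras, assuming TensorFlow 2 (the two small Sequential models below are illustrative, not the paper's code):

import tensorflow as tf
from tensorflow.keras import layers

# Head used during training: flatten the 7x7x512 feature map, then three FC layers.
fc_head = tf.keras.Sequential([
    layers.Flatten(input_shape=(7, 7, 512)),
    layers.Dense(4096, activation='relu'),
    layers.Dense(4096, activation='relu'),
    layers.Dense(1000, activation='softmax'),
])

# Equivalent head for test time: the first FC becomes a 7x7 convolution,
# the last two become 1x1 convolutions, so larger inputs are also accepted.
conv_head = tf.keras.Sequential([
    layers.Conv2D(4096, kernel_size=7, activation='relu', input_shape=(None, None, 512)),
    layers.Conv2D(4096, kernel_size=1, activation='relu'),
    layers.Conv2D(1000, kernel_size=1, activation='softmax'),
])
# The trained Dense weights can be reshaped into the conv kernels
# (e.g. (7*7*512, 4096) -> (7, 7, 512, 4096)) so both heads give identical outputs on 7x7 inputs.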

2. The VGG network structure

  • The VGG configurations described below come from the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition", the paper in which VGG was proposed.
  • The paper evaluates six configurations named A, A-LRN, B, C, D, and E. They share the same overall structure, each consisting of 5 convolutional stages and 3 fully connected layers; they differ in the number of conv sub-layers per stage, which increases from A to E (from 1 up to 4), so the total depth grows from 11 to 19 layers (the added layers are shown in bold in the paper's table). In that table a convolutional layer is written as "conv⟨receptive field size⟩-⟨number of channels⟩"; for example, conv3-128 means a 3x3 kernel with 128 output channels. For brevity, the ReLU activations are not shown.
  • Configuration D is the well-known VGG16, and configuration E is the well-known VGG19.
  • Taking configuration D (VGG16) as an example, its processing steps are listed below; following how the feature-map sizes change helps in understanding VGG16:
    1. The input is a 224x224x3 image; two convolutions with 64 3x3 kernels + ReLU give 224x224x64.
    2. Max pooling with a 2x2 window (halving the spatial size) gives 112x112x64.
    3. Two convolutions with 128 3x3 kernels + ReLU give 112x112x128.
    4. 2x2 max pooling gives 56x56x128.
    5. Three convolutions with 256 3x3 kernels + ReLU give 56x56x256.
    6. 2x2 max pooling gives 28x28x256.
    7. Three convolutions with 512 3x3 kernels + ReLU give 28x28x512.
    8. 2x2 max pooling gives 14x14x512.
    9. Three convolutions with 512 3x3 kernels + ReLU give 14x14x512.
    10. 2x2 max pooling gives 7x7x512.
    11. Fully connected layers + ReLU: two layers of 1x1x4096 and one layer of 1x1x1000 (three FC layers in total).
    12. Softmax outputs the 1000 class predictions.
  • That is the layer-by-layer processing of VGG16 (configuration D); the other configurations A, A-LRN, B, C, and E are processed in a similar way.
  • Although the depth of the six configurations A, A-LRN, B, C, D, and E grows from 11 to 19 layers, their parameter counts (in millions) do not change much. This is mainly because the networks use small 3x3 kernels (only 9 weights each) and because most of the parameters are concentrated in the fully connected layers (a quick parameter-count sketch appears at the end of this section).
  • The authors evaluated the six configurations A, A-LRN, B, C, D, and E at a single test scale, with the following error-rate results:
  • From these results:
    1. LRN brings no performance gain (A-LRN)
      Using configuration A-LRN, the VGG authors found that the LRN layer (local response normalization) used in AlexNet does not improve performance, so LRN does not appear in any of the other configurations.
    2. Classification performance improves as depth increases (A, B, C, D, E)
      From the 11-layer A to the 19-layer E, the increase in depth clearly lowers both the top-1 and top-5 error rates.
    3. Multiple small kernels beat a single large kernel (B)
      The VGG authors compared B against a shallower network, not part of the listed configurations, in which each pair of conv3x3 layers from B was replaced by a single conv5x5 layer; the results showed that multiple small kernels perform better than a single large kernel.
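As a rough check of where VGG16's parameters sit, here is a minimal parameter-count sketch for configuration D in plain Python; the layer list is transcribed from the configuration table, and including biases in the totals is an assumption about how the counts are reported:

# (in_channels, out_channels) for the thirteen 3x3 conv layers of VGG16
conv_cfg = [(3, 64), (64, 64),
            (64, 128), (128, 128),
            (128, 256), (256, 256), (256, 256),
            (256, 512), (512, 512), (512, 512),
            (512, 512), (512, 512), (512, 512)]
# (in_features, out_features) for the three fully connected layers
fc_cfg = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

conv_params = sum(3 * 3 * cin * cout + cout for cin, cout in conv_cfg)
fc_params = sum(fin * fout + fout for fin, fout in fc_cfg)
print(conv_params, fc_params, conv_params + fc_params)
# roughly 14.7M conv parameters vs 123.6M fully connected parameters (~138M total):
# the fully connected layers hold the vast majority of the parameters.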

3. VGG example: CIFAR-100 classification dataset [TensorFlow 2]

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # only takes effect if set before "import tensorflow as tf"

import tensorflow as tf
from tensorflow.keras import layers, optimizers, datasets, Sequential

# 1. Load the dataset
(X_train, Y_train), (X_val, Y_val) = datasets.cifar100.load_data()
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
Y_train = tf.squeeze(Y_train)
Y_val = tf.squeeze(Y_val)
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))


# 2. Data processing
# Preprocessing function: convert the numpy data to tensors and rescale the images
def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y


# 2.1 Training set
# print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
db_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))  # this automatically converts the numpy arrays to tensors
db_train = db_train.map(preprocess)  # map() applies preprocess() to every element of the dataset
# shuffle() fills a buffer with buffer_size samples taken in order from the dataset and shuffles it; as samples are
# consumed, the buffer is refilled from the dataset in order and shuffled again.
db_train = db_train.shuffle(buffer_size=1000)  # shuffle the training samples so their original order does not bias training
print('db_train = {0},type(db_train) = {1}'.format(db_train, type(db_train)))
batch_size_train = 2000  # number of samples per batch; 100-200 is usually a reasonable range
db_batch_train = db_train.batch(batch_size_train)  # group every batch_size_train images into one batch; one batch is read as a single step
print('db_batch_train = {0},type(db_batch_train) = {1}'.format(db_batch_train, type(db_batch_train)))
# 2.2 Test set: no shuffling needed
db_val = tf.data.Dataset.from_tensor_slices((X_val, Y_val))  # this automatically converts the numpy arrays to tensors
db_val = db_val.map(preprocess)  # map() applies preprocess() to every element of the dataset
batch_size_val = 2000  # number of samples per batch; 100-200 is usually a reasonable range
db_batch_val = db_val.batch(batch_size_val)  # group every batch_size_val images into one batch

# 3. Build the networks
# 3.1 Convolutional part: Conv2D layers with ReLU activations
conv_layers = [  # 5 units of conv + max pooling
    # unit 1
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),  # 64 kernels -> the output has 64 channels; padding="same" keeps the spatial size unchanged
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 2
    layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 3
    layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 4
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 5
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same')
]
# 3.2 Fully connected part: Dense layers with ReLU activations
fullcon_layers = [
    layers.Dense(300, activation=tf.nn.relu),  # 512 --> 300
    layers.Dense(200, activation=tf.nn.relu),  # 300 --> 200
    layers.Dense(100)  # 200 --> 100; no activation on the last layer here, it is applied when computing the loss (from_logits=True)
]
# 3.3 Assemble the convolutional network and the fully connected network
conv_network = Sequential(conv_layers)  # [b, 32, 32, 3] => [b, 1, 1, 512]
fullcon_network = Sequential(fullcon_layers)  # [b, 512] => [b, 100]
conv_network.build(input_shape=[None, 32, 32, 3])  # the input images are [32, 32, 3]; None is the (unknown) number of samples
fullcon_network.build(input_shape=[None, 512])  # the data coming from the conv network is [b, 512]; None is the (unknown) number of samples
# 3.4 Print network summaries
conv_network.summary()  # summary of the convolutional network
fullcon_network.summary()  # summary of the fully connected network

# 4. Optimizer for gradient descent
optimizer = optimizers.Adam(learning_rate=1e-4)


# 5. One pass of gradient descent over the whole dataset is one epoch. Each epoch contains batch_step_no steps,
#    and each step processes one batch of the configured size.
def train_epoch(epoch_no):
    print('++++++++++++++++++++++++++++++++++++++++++++Epoch {0} --> Training phase: start++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))
    for batch_step_no, (X_batch, Y_batch) in enumerate(db_batch_train):  # each iteration processes one batch; when the loop finishes, the whole dataset has taken one round of gradient descent. The batch index is usually called the step (batch_step_no).
        print('epoch_no = {0}, batch_step_no = {1},X_batch.shape = {2},Y_batch.shape = {3}------------type(X_batch) = {4},type(Y_batch) = {5}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape, type(X_batch), type(Y_batch)))
        Y_batch_one_hot = tf.one_hot(Y_batch, depth=100)  # one-hot encoding, 100 classes  [b] => [b, 100]
        print('\tY_train_one_hot.shape = {0}'.format(Y_batch_one_hot.shape))
        # tf.GradientTape is a context manager that links the "function" (the loss definition) to the "variables"
        # (all network parameters) so that gradients can be tracked.
        with tf.GradientTape() as tape:
            # Step 1. Forward pass --> compute the model prediction with the current parameters
            out_logits_conv = conv_network(X_batch)  # [b, 32, 32, 3] => [b, 1, 1, 512]
            print('\tout_logits_conv.shape = {0}'.format(out_logits_conv.shape))
            out_logits_conv = tf.reshape(out_logits_conv, [-1, 512])    # [b, 1, 1, 512] => [b, 512]
            print('\tAfter reshape: out_logits_conv.shape = {0}'.format(out_logits_conv.shape))
            out_logits_fullcon = fullcon_network(out_logits_conv)  # [b, 512] => [b, 100]
            print('\tout_logits_fullcon.shape = {0}'.format(out_logits_fullcon.shape))
            # Step 2. Loss between predictions and labels: cross-entropy
            MSE_Loss = tf.losses.categorical_crossentropy(Y_batch_one_hot, out_logits_fullcon, from_logits=True)    # categorical_crossentropy() takes the labels first and the predictions second; the order must not be swapped
            print('\tMSE_Loss.shape = {0}'.format(MSE_Loss.shape))
            MSE_Loss = tf.reduce_mean(MSE_Loss)
            print('\tAfter reduce_mean: MSE_Loss.shape = {0}'.format(MSE_Loss.shape))
            print('\tEpoch {0} --> batch step {1}, loss before the update: MSE_Loss = {2}'.format(epoch_no, batch_step_no + 1, MSE_Loss))
        # Step 3. Backward pass --> gradients used to update every layer's parameters: W1, W2, W3, B1, B2, B3...
        variables = conv_network.trainable_variables + fullcon_network.trainable_variables  # list concatenation: [1, 2] + [3, 4] => [1, 2, 3, 4]
        # grads holds the gradients of the loss MSE_Loss, evaluated at X_batch, with respect to every trainable
        # parameter [W1, W2, W3, B1, B2, B3...] of the conv and fully connected networks.
        grads = tape.gradient(MSE_Loss, variables)  # MSE_Loss is the objective, variables are all trainable parameters of both networks
        # grads, _ = tf.clip_by_global_norm(grads, 15)  # gradient clipping: mitigates exploding/vanishing gradients
        # print('\tEpoch {0} --> batch step {1}, initial parameters:'.format(epoch_no, batch_step_no + 1))
        if batch_step_no == 0:
            index_variable = 1
            for grad in grads:
                print('\t\tgrad{0}:grad.shape = {1},grad.ndim = {2}'.format(index_variable, grad.shape, grad.ndim))
                index_variable = index_variable + 1
        # Apply one gradient-descent update
        print('\tGradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): start')
        optimizer.apply_gradients(zip(grads, variables))  # every trainable parameter [W1, W2, W3, B1, B2, B3...] takes one step: w' = w - lr * grad; zip pairs each gradient with its parameter
        print('\tGradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): end\n')
    print('++++++++++++++++++++++++++++++++++++++++++++Epoch {0} --> Training phase: end++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))


# 6. Model evaluation (test/evaluation)
def evluation(epoch_no):
    print('++++++++++++++++++++++++++++++++++++++++++++Epoch {0} --> Evaluation phase: start++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))
    total_correct, total_num = 0, 0
    for batch_step_no, (X_batch, Y_batch) in enumerate(db_batch_val):
        print('epoch_no = {0}, batch_step_no = {1},X_batch.shape = {2},Y_batch.shape = {3}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape))
        # Run the trained model on the test data to get the outputs
        out_logits_conv = conv_network(X_batch)  # [b, 32, 32, 3] => [b, 1, 1, 512]
        print('\tout_logits_conv.shape = {0}'.format(out_logits_conv.shape))
        out_logits_conv = tf.reshape(out_logits_conv, [-1, 512])  # [b, 1, 1, 512] => [b, 512]
        print('\tAfter reshape: out_logits_conv.shape = {0}'.format(out_logits_conv.shape))
        out_logits_fullcon = fullcon_network(out_logits_conv)  # [b, 512] => [b, 100]
        print('\tout_logits_fullcon.shape = {0}'.format(out_logits_fullcon.shape))
        # print('\tout_logits_fullcon[:1,:] = {0}'.format(out_logits_fullcon[:1, :]))
        # softmax() maps the network outputs to the 0~1 range so that the 100 class probabilities sum to 1
        out_logits_prob = tf.nn.softmax(out_logits_fullcon, axis=1)  # out_logits_prob: [b, 100] ~ [0, 1]
        # print('\tout_logits_prob[:1,:] = {0}'.format(out_logits_prob[:1, :]))
        out_logits_prob_max_index = tf.cast(tf.argmax(out_logits_prob, axis=1), dtype=tf.int32)  # [b, 100] => [b]; index of the maximum probability, cast from int64 to int32
        # print('\tprediction: out_logits_prob_max_index = {0},\tground truth: Y_batch = {1}'.format(out_logits_prob_max_index, Y_batch))
        is_correct_boolean = tf.equal(out_logits_prob_max_index, Y_batch.numpy())
        # print('\tis_correct_boolean = {0}'.format(is_correct_boolean))
        is_correct_int = tf.cast(is_correct_boolean, dtype=tf.float32)
        # print('\tis_correct_int = {0}'.format(is_correct_int))
        is_correct_count = tf.reduce_sum(is_correct_int)
        print('\tis_correct_count = {0}\n'.format(is_correct_count))
        total_correct += int(is_correct_count)
        total_num += X_batch.shape[0]
    print('total_correct = {0}---total_num = {1}'.format(total_correct, total_num))
    acc = total_correct / total_num
    print('Accuracy after epoch {0}: acc = {1}'.format(epoch_no, acc))
    print('++++++++++++++++++++++++++++++++++++++++++++Epoch {0} --> Evaluation phase: end++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))


# 7. Run multiple epochs of gradient descent over the whole dataset
def train():
    epoch_count = 1  # epoch_count is the number of passes over the whole dataset
    for epoch_no in range(1, epoch_count + 1):
        print('\n\nEpoch {0} over the whole dataset: start **********************************************************************************************************************************'.format(epoch_no))
        train_epoch(epoch_no)
        evluation(epoch_no)
        print('Epoch {0} over the whole dataset: end **********************************************************************************************************************************'.format(epoch_no))


if __name__ == '__main__':
    train()

Printed output:

X_train.shape = (50000, 32, 32, 3),Y_train.shape = (50000, 1)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'numpy.ndarray'>
X_train.shape = (50000, 32, 32, 3),Y_train.shape = (50000,)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'tensorflow.python.framework.ops.EagerTensor'>
db_train = <ShuffleDataset shapes: ((32, 32, 3), ()), types: (tf.float32, tf.int32)>,type(db_train) = <class 'tensorflow.python.data.ops.dataset_ops.ShuffleDataset'>
db_batch_train = <BatchDataset shapes: ((None, 32, 32, 3), (None,)), types: (tf.float32, tf.int32)>,type(db_batch_train) = <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 32, 32, 64)        1792      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 64)        36928     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 128)       73856     
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 16, 128)       147584    
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 128)         0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 8, 8, 256)         295168    
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 256)         590080    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 256)         0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 4, 4, 512)         1180160   
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 512)         2359808   
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 2, 512)         0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 2, 2, 512)         2359808   
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 2, 2, 512)         2359808   
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 1, 1, 512)         0         
=================================================================
Total params: 9,404,992
Trainable params: 9,404,992
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 300)               153900    
_________________________________________________________________
dense_1 (Dense)              (None, 200)               60200     
_________________________________________________________________
dense_2 (Dense)              (None, 100)               20100     
=================================================================
Total params: 234,200
Trainable params: 234,200
Non-trainable params: 0
_________________________________________________________________


Epoch 1 over the whole dataset: start **********************************************************************************************************************************
++++++++++++++++++++++++++++++++++++++++++++Epoch 1 --> Training phase: start++++++++++++++++++++++++++++++++++++++++++++
epoch_no = 1, batch_step_no = 1,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
	Y_train_one_hot.shape = (2000, 100)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	MSE_Loss.shape = (2000,)
	After reduce_mean: MSE_Loss.shape = ()
	Epoch 1 --> batch step 1, loss before the update: MSE_Loss = 4.605105400085449
		grad1:grad.shape = (3, 3, 3, 64),grad.ndim = 4
		grad2:grad.shape = (64,),grad.ndim = 1
		grad3:grad.shape = (3, 3, 64, 64),grad.ndim = 4
		grad4:grad.shape = (64,),grad.ndim = 1
		grad5:grad.shape = (3, 3, 64, 128),grad.ndim = 4
		grad6:grad.shape = (128,),grad.ndim = 1
		grad7:grad.shape = (3, 3, 128, 128),grad.ndim = 4
		grad8:grad.shape = (128,),grad.ndim = 1
		grad9:grad.shape = (3, 3, 128, 256),grad.ndim = 4
		grad10:grad.shape = (256,),grad.ndim = 1
		grad11:grad.shape = (3, 3, 256, 256),grad.ndim = 4
		grad12:grad.shape = (256,),grad.ndim = 1
		grad13:grad.shape = (3, 3, 256, 512),grad.ndim = 4
		grad14:grad.shape = (512,),grad.ndim = 1
		grad15:grad.shape = (3, 3, 512, 512),grad.ndim = 4
		grad16:grad.shape = (512,),grad.ndim = 1
		grad17:grad.shape = (3, 3, 512, 512),grad.ndim = 4
		grad18:grad.shape = (512,),grad.ndim = 1
		grad19:grad.shape = (3, 3, 512, 512),grad.ndim = 4
		grad20:grad.shape = (512,),grad.ndim = 1
		grad21:grad.shape = (512, 300),grad.ndim = 2
		grad22:grad.shape = (300,),grad.ndim = 1
		grad23:grad.shape = (300, 200),grad.ndim = 2
		grad24:grad.shape = (200,),grad.ndim = 1
		grad25:grad.shape = (200, 100),grad.ndim = 2
		grad26:grad.shape = (100,),grad.ndim = 1
	Gradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): end

epoch_no = 1, batch_step_no = 2,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
	Y_train_one_hot.shape = (2000, 100)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	MSE_Loss.shape = (2000,)
	After reduce_mean: MSE_Loss.shape = ()
	Epoch 1 --> batch step 2, loss before the update: MSE_Loss = 4.605042934417725
	Gradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
...
...
...
epoch_no = 1, batch_step_no = 25,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
	Y_train_one_hot.shape = (2000, 100)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	MSE_Loss.shape = (2000,)
	After reduce_mean: MSE_Loss.shape = ()
	Epoch 1 --> batch step 25, loss before the update: MSE_Loss = 4.540274143218994
	Gradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient descent step-->optimizer.apply_gradients(zip(grads, network.trainable_variables)): end

++++++++++++++++++++++++++++++++++++++++++++Epoch 1 --> Training phase: end++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++Epoch 1 --> Evaluation phase: start++++++++++++++++++++++++++++++++++++++++++++
epoch_no = 1, batch_step_no = 1,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	is_correct_count = 51.0

epoch_no = 1, batch_step_no = 2,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	is_correct_count = 39.0

epoch_no = 1, batch_step_no = 3,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	is_correct_count = 41.0

epoch_no = 1, batch_step_no = 4,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	is_correct_count = 43.0

epoch_no = 1, batch_step_no = 5,X_batch.shape = (2000, 32, 32, 3),Y_batch.shape = (2000,)
	out_logits_conv.shape = (2000, 1, 1, 512)
	After reshape: out_logits_conv.shape = (2000, 512)
	out_logits_fullcon.shape = (2000, 100)
	is_correct_count = 47.0

total_correct = 221---total_num = 10000
Accuracy after epoch 1: acc = 0.0221
++++++++++++++++++++++++++++++++++++++++++++Epoch 1 --> Evaluation phase: end++++++++++++++++++++++++++++++++++++++++++++
Epoch 1 over the whole dataset: end **********************************************************************************************************************************

Process finished with exit code 0



