Convolutional Neural Networks

Author: 西伯利亚大草原的狼

Last modified 2022-08-18 14:57:57

First published 2022-06-22 00:51:15

Article link: https://blog.csdn.net/internetv/article/details/125386533

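The convolution computation

In practice the inputs are colour images, so a network built only from fully connected layers has a very large number of parameters to train, which makes the model prone to overfitting.

For this reason, features are usually extracted from the raw image first, and the extracted features are then fed to a fully connected network.

The convolution kernel slides over the image and visits every pixel. If the input is a greyscale image, a single-channel kernel of depth 1 can be used.

If the input is a three-channel colour image, a 3*3*3 or 5*5*3 kernel can be used.

The depth (number of channels) of the input feature map determines the depth of the kernels in the current layer;
the number of kernels in the current layer determines the depth of the layer's output feature map.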
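
The two rules above can be checked with a small sketch. This is a plain-Python illustration, not how TensorFlow implements convolution: the function name `conv2d_valid` and the toy inputs are made up for this example, there is no padding and no bias, and (as in deep learning frameworks) the kernel is not flipped.

```python
def conv2d_valid(image, kernels, stride=1):
    """Naive convolution without padding or bias.

    image:   H x W x C nested lists
    kernels: list of k x k x C nested lists (one entry per kernel)
    returns: out_h x out_w x len(kernels) nested lists
    """
    h, w, c = len(image), len(image[0]), len(image[0][0])
    k = len(kernels[0])
    # Rule 1: every kernel must be as deep as the input feature map.
    assert all(len(kern[0][0]) == c for kern in kernels)
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            pixel = []
            # Rule 2: one output channel per kernel.
            for kern in kernels:
                total = 0.0
                for di in range(k):
                    for dj in range(k):
                        for ch in range(c):
                            total += image[i * stride + di][j * stride + dj][ch] * kern[di][dj][ch]
                pixel.append(total)
            row.append(pixel)
        out.append(row)
    return out


# A 5x5 three-channel image of ones and two 3*3*3 kernels (toy values).
image = [[[1.0, 1.0, 1.0] for _ in range(5)] for _ in range(5)]
ones_kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
half_kernel = [[[0.5] * 3 for _ in range(3)] for _ in range(3)]
feature_map = conv2d_valid(image, [ones_kernel, half_kernel])
print(len(feature_map), len(feature_map[0]), len(feature_map[0][0]))
```

Two kernels give an output of depth 2, and each kernel has depth 3 because the input has three channels.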

Single channel

Three channels

Receptive field

Zero padding (Padding)
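
As a rough illustration of what zero padding changes, the sketch below computes the output side length of a convolution or pooling layer for the two padding modes that Keras calls `'same'` and `'valid'`. The helper name `conv_output_size` is hypothetical; the formulas are the usual ones for these two modes.

```python
import math


def conv_output_size(in_size, kernel_size, stride=1, padding="valid"):
    """Output side length for 'same' (zero padding) or 'valid' (no padding)."""
    if padding == "same":
        # Zero padding: the output size depends only on the stride.
        return math.ceil(in_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((in_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")


print(conv_output_size(32, 5, stride=1, padding="same"))   # 5*5 kernel with zero padding
print(conv_output_size(32, 5, stride=1, padding="valid"))  # 5*5 kernel without padding
```

With zero padding and stride 1 the feature map keeps its size; without padding a 5*5 kernel shrinks each side by 4.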

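Describing a convolutional layer in TF

Notes:

1 The kernel may be rectangular.

2 The horizontal and vertical strides may differ.

3 activation specifies which activation function this convolutional layer uses; if batch normalization follows the layer, leave it unset here.

4 Recommended style: the third one, with keyword arguments such as filters=6, kernel_size=(5, 5), ..., because the code is more readable.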

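Batch normalization (Batch Normalization, BN)

Neural networks are more sensitive to data near 0, but as the number of layers grows, feature values drift away from zero mean.

Standardization: make the data follow a distribution with mean 0 and standard deviation 1, pulling shifted feature data back towards 0.
Batch normalization: standardize one mini-batch of data at a time so that it returns to zero mean and unit variance; it is commonly placed between the convolution and the activation.

BN pulls the shifted feature data back to zero mean, so the data entering the activation function falls in its linear region. Small changes in the input are then expressed more clearly by the activation function, which improves its ability to discriminate between inputs. However, this simple standardization leaves all feature data concentrated in the central, roughly linear region of the activation function, so the activation function loses its non-linearity. For this reason, BN introduces two trainable parameters for each convolution kernel (analogous to w and b): a scale factor and an offset factor, usually written γ and β. During backpropagation these two parameters are trained together with w and b; they adjust the width and offset of the standardized feature distribution and preserve the network's capacity for non-linear expression.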
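
The standardize-then-rescale step can be sketched in a few lines. This is a simplified illustration for the values of a single channel across one batch, using only training-mode batch statistics; the function name `batch_norm`, the `eps` value and the sample numbers are assumptions for this example, and the moving averages that a real BN layer keeps for inference are left out.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardize one channel's values over a batch, then scale and shift.

    gamma is the scale factor and beta the offset factor; in a real layer
    both are trained together with the weights.
    """
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


features = [2.0, 4.0, 6.0, 8.0]                         # shifted away from 0
normalized = batch_norm(features)                       # mean ~0, variance ~1
rescaled = batch_norm(features, gamma=2.0, beta=1.0)    # learned width and offset
print(normalized)
print(rescaled)
```

With gamma=1 and beta=0 the output is plain standardization; other values widen or shift the distribution, which is how the layer can move data out of the activation's linear region again.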

The BN layer sits after the convolutional layer and before the activation layer.

BatchNormalization() inserts the BN layer between the two.

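Pooling (Pooling)

Max pooling extracts image texture; average pooling preserves background features.

Describing a pooling layer in TF

tf.keras.layers.MaxPool2D performs max pooling

tf.keras.layers.AveragePooling2D performs average pooling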
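
To make the difference between the two pooling modes concrete, here is a minimal plain-Python sketch on a single-channel feature map. The function `pool2d` and the sample values are invented for this example; it uses no padding and is not how the Keras layers are implemented.

```python
def pool2d(feature_map, pool_size=2, stride=2, mode="max"):
    """Max or average pooling over a single-channel H x W nested list."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, h - pool_size + 1, stride):
        row = []
        for j in range(0, w - pool_size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(pool_size)
                      for dj in range(pool_size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


fm = [[1, 2, 5, 6],
      [3, 4, 7, 8],
      [9, 10, 13, 14],
      [11, 12, 15, 16]]
max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="avg")
print(max_pooled)  # [[4, 8], [12, 16]]
print(avg_pooled)  # [[2.5, 6.5], [10.5, 14.5]]
```

A 2*2 window with stride 2 halves each side of the feature map: max pooling keeps the strongest response in each window, average pooling keeps its mean.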

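Dropout (Dropout)

During training, when the network has too many parameters, a fraction of the neurons is temporarily dropped from the network with a given probability.

When the network is used for inference, the dropped neurons are connected again.

tf.keras.layers.Dropout(drop probability)

Dropout(0.2)  # randomly drops 20% of the neurons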
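
The train/inference difference can be sketched as follows. This is an illustrative plain-Python version of "inverted" dropout, where kept values are scaled by 1/(1 - rate) during training (the Keras Dropout layer scales in the same way); the function name `dropout`, the fixed seed and the sample activations are assumptions for this example.

```python
import random


def dropout(values, rate, training, rng):
    """Drop each value with probability `rate` while training; identity otherwise."""
    if not training:
        # At inference time every neuron is connected again.
        return list(values)
    keep = 1.0 - rate
    # Kept values are scaled up so the expected sum stays the same.
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10
train_out = dropout(activations, rate=0.2, training=True, rng=rng)
infer_out = dropout(activations, rate=0.2, training=False, rng=rng)
print(train_out)
print(infer_out)
```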

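Convolutional neural network

A convolutional network is a feature extractor built from CBAPD, the standard recipe for CNNs: Convolution, Batch normalization, Activation, Pooling, Dropout.

The CIFAR-10 dataset

Example: building a convolutional neural network

Build a network with one convolutional layer and two fully connected layers: 6 kernels of 5*5, then a 2*2 pooling window with stride 2, then a fully connected layer of 128 neurons. Because CIFAR-10 is a 10-class problem, the network ends with a fully connected layer of 10 neurons.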
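
As a back-of-the-envelope check of how large this small network is, the sketch below counts its trainable weights. It assumes 32x32x3 CIFAR-10 inputs, zero ('same') padding on the convolution, and no batch normalization layer; the padding mode is not stated above, so treat that part as an assumption. The helper names are made up for this example.

```python
def conv_params(kernel_h, kernel_w, in_depth, filters):
    """Weights plus one bias per kernel."""
    return kernel_h * kernel_w * in_depth * filters + filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output neuron."""
    return in_units * out_units + out_units


conv = conv_params(5, 5, 3, 6)        # 6 kernels of 5*5 over 3 input channels
flat = 16 * 16 * 6                    # 32x32 halved by 2*2 pooling, depth 6
fc1 = dense_params(flat, 128)
fc2 = dense_params(128, 10)
total = conv + fc1 + fc2
print(conv, fc1, fc2, total)
```

Almost all of the parameters sit in the first fully connected layer, which is the overfitting concern raised at the start of the post.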

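If the accuracy is low, add another convolutional layer and another hidden fully connected layer, increase the batch size to 128 or 256, and train for more epochs; this usually improves accuracy.

Implementing five classic convolutional networks: LeNet, AlexNet, VGGNet, InceptionNet and ResNet (note that "five 3*3 kernels" in these networks often means five stacked 3*3 layers, which together reach the same receptive field as one larger kernel)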
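
The remark about stacked kernels can be made concrete with a short calculation. This sketch assumes stride 1 in every layer; the function name is hypothetical.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    field = 1
    for _ in range(num_layers):
        # Each extra layer widens the field by (kernel_size - 1).
        field += kernel_size - 1
    return field


two_layers = stacked_receptive_field(3, 2)    # same field as one 5*5 kernel
five_layers = stacked_receptive_field(3, 5)   # same field as one 11*11 kernel
weights_stacked = 2 * 3 * 3                   # per input/output channel pair
weights_single = 5 * 5
print(two_layers, five_layers, weights_stacked, weights_single)
```

Two stacked 3*3 layers cover a 5*5 area with 18 weights per channel pair instead of 25, which is the reason small kernels are preferred.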

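Everything outside the white box is the baseline. In the walkthroughs of the classic networks below that code stays the same, and only the code inside the white box changes.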

LeNet

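Layer 1: 6 kernels of 5*5

Layer 2: pooling, 2*2, stride 2

Layer 3: 16 kernels of 5*5

Layer 4: pooling, 2*2, stride 2

Layer 5: fully connected, 120 neurons

Layer 6: fully connected, 84 neurons

Layer 7: fully connected, 10 neurons

Note: in the actual code, each convolutional layer is written together with the pooling layer that follows it.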
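
Before reading the code, it can help to trace the feature map sizes through these layers. The sketch below assumes 32x32x3 CIFAR-10 inputs and convolutions without padding (the Keras default, which the listing below relies on); the helper names are made up for this example.

```python
def valid_conv(size, kernel):
    """Output side length of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool(size, window=2, stride=2):
    """Output side length of a pooling layer without padding."""
    return (size - window) // stride + 1


size = 32                       # CIFAR-10 images are 32x32x3
size = valid_conv(size, 5)      # layer 1: 6 kernels of 5*5
c1_shape = (size, size, 6)
size = pool(size)               # layer 2: 2*2 pooling, stride 2
size = valid_conv(size, 5)      # layer 3: 16 kernels of 5*5
c2_shape = (size, size, 16)
size = pool(size)               # layer 4: 2*2 pooling, stride 2
flattened = size * size * 16    # input width of the 120-neuron layer
print(c1_shape, c2_shape, flattened)
```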

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Definition of the LeNet5 structure
class LeNet5(Model):
    def __init__(self):
        super(LeNet5, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5),
                         activation='sigmoid')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)

        self.c2 = Conv2D(filters=16, kernel_size=(5, 5),
                         activation='sigmoid')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)

        self.flatten = Flatten()
        self.f1 = Dense(120, activation='sigmoid')
        self.f2 = Dense(84, activation='sigmoid')
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.p1(x)

        x = self.c2(x)
        x = self.p2(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y


model = LeNet5()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/LeNet5.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
# Dump the name, shape and values of every trainable variable to a text file
with open('./weights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

###############################################    show   ###############################################

# Plot the accuracy and loss curves for the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

AlexNet

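VGGNet (uses small kernels, which reduces the number of parameters while improving recognition accuracy)

Structure

CBA, CBAPD

CBA, CBAPD

CBA, CBA, CBAPD

CBA, CBA, CBAPD

followed by three fully connected layers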

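InceptionNet (introduces the Inception block, which uses kernels of different sizes within the same layer to improve the model's perceptive power, and uses batch normalization to mitigate vanishing gradients)

An Inception block uses several kernels of different sizes within one layer, so it can extract features at different scales. Its 1*1 kernels act on every pixel of the input feature map.

Using fewer 1*1 kernels than the depth of the input feature map reduces the depth of the output feature map. This acts as dimensionality reduction and cuts both the number of parameters and the amount of computation.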
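
The saving from a 1*1 reduction can be estimated with simple arithmetic. The depths used below (256 in, 32 after the 1*1 layer, 64 out) are made-up numbers chosen only for illustration, and biases are ignored.

```python
def conv_weights(kernel_size, in_depth, filters):
    """Number of weights in a square convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_depth * filters


in_depth, reduced_depth, out_depth = 256, 32, 64   # hypothetical depths

# A 5*5 convolution applied directly to the deep input.
direct = conv_weights(5, in_depth, out_depth)

# A 1*1 convolution first shrinks the depth, then the 5*5 convolution runs.
reduced = conv_weights(1, in_depth, reduced_depth) + conv_weights(5, reduced_depth, out_depth)

print(direct, reduced)
```

With these numbers the reduced path needs roughly one seventh of the weights of the direct path.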

The filter concatenation layer joins the feature data from the four branches along the depth axis.
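
Because the four branches are concatenated along depth, each Inception block outputs four times as many channels as each branch produces. The sketch below traces feature map size and depth for the configuration used in this post's listing (32x32 input, init_ch=16, two blocks of two Inception blocks, 'same' padding); the function name `inception10_shapes` is made up for this example.

```python
import math


def inception10_shapes(size=32, init_ch=16, num_blocks=2):
    """Return (name, side length, depth) after each stage of the network."""
    shapes = [("c1", size, init_ch)]
    ch = init_ch
    for block_id in range(num_blocks):
        for layer_id in range(2):
            # The first Inception block of each block uses stride 2.
            stride = 2 if layer_id == 0 else 1
            size = math.ceil(size / stride)
            # Four branches with ch channels each, concatenated along depth.
            shapes.append((f"block{block_id}_{layer_id}", size, 4 * ch))
        ch *= 2
    return shapes


shapes = inception10_shapes()
for name, side, depth in shapes:
    print(name, side, depth)
```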

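Expressed with CBAPD, the Inception block has the structure shown in the figure (the colours correspond one to one)

Implementation of the Inception block (wrapping ConvBNRelu)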

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense, \
    GlobalAveragePooling2D
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

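# Every convolution in the Inception block is a CBA sequence, so it is wrapped in a new class, ConvBNRelu, to shorten the code and improve readability
# Building the network with classes makes skip connections possible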
class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            # Number of kernels, kernel size, stride
            # ch is short for channel: the number of kernels in each Conv2D
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

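    def call(self, x, training=None):
        # Pass the training flag through instead of hard-coding False: with training=True BN normalizes with the current batch's mean and variance and updates its moving averages; with training=False it uses the moving averages accumulated during training, which is what inference needs
        x = self.model(x, training=training)
        return x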

# Implementation of the Inception block
# Building the network with classes makes skip connections possible
class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        # Branch 1
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        # Branch 2
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        # Branch 3
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        # Branch 4
        # Max pooling
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        # x1, x2_2, x3_2 and x4_2 are the outputs of the four branches
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # concat along axis=channel
        # Stack the branch outputs with tf.concat; axis=3 stacks them along the depth dimension
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x


class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch
        self.c1 = ConvBNRelu(init_ch)
        self.blocks = tf.keras.models.Sequential()
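        # Every two adjacent Inception blocks form one block
        for block_id in range(num_blocks):
            for layer_id in range(2):
                if layer_id == 0:
                    # The first Inception block in each block uses stride 2, which halves the output feature map size
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    # The second Inception block in each block uses stride 1
                    block = InceptionBlk(self.out_channels, strides=1)
                self.blocks.add(block)
            # enlarge out_channels per block
            # The feature map size has been halved, so the depth is doubled here to keep the information capacity of feature extraction roughly constant
            # block_1 therefore has twice as many channels as block_0
            self.out_channels *= 2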
        self.p1 = GlobalAveragePooling2D()
        self.f1 = Dense(num_classes, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y

# Instantiate Inception10 with 2 blocks (block_0 and block_1) for 10-class classification
model = Inception10(num_blocks=2, num_classes=10)

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/Inception10.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
# Dump the name, shape and values of every trainable variable to a text file
with open('./weights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

###############################################    show   ###############################################

# Plot the accuracy and loss curves for the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

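ResNet (proposes residual skip connections between layers, which bring in information from earlier layers, mitigate vanishing gradients, and make much deeper networks feasible)

A skip connection carries earlier features directly to a later point, so the output H(x) contains both the non-linear output F(x) of the stacked convolutions and the identity mapping x, which bypasses those two stacked convolutions. The two are added element by element. This effectively mitigates the degradation caused by stacking more layers and lets networks grow deeper.

The addition is an element-wise sum of two matrices.

A ResNet block comes in two forms.

Case 1 is drawn with a solid line in the figure. Here the two stacked convolutions do not change the dimensions of the feature map (the number of feature maps, their height, width and depth all stay the same), so F(x) and x can be added directly.

Case 2 is drawn with a dashed line in the figure. Here the two stacked convolutions change the dimensions of the feature map, so a 1*1 convolution is needed to adjust the dimensions of x until W(x) and F(x) match.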
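
Which of the two cases applies at each position can be worked out from the stride and the channel counts alone. The sketch below does this for the configuration used in this post's listing (block list [2, 2, 2, 2], 64 initial filters, 32x32 input, 'same' padding); the function name `resnet18_blocks` is made up for this example.

```python
import math


def resnet18_blocks(block_list=(2, 2, 2, 2), initial_filters=64, size=32):
    """Return (filters, side length, needs_projection) for every ResNet block."""
    rows = []
    in_ch = initial_filters      # depth produced by the first plain convolution
    filters = initial_filters
    for block_id, num_layers in enumerate(block_list):
        for layer_id in range(num_layers):
            stride = 2 if (block_id != 0 and layer_id == 0) else 1
            # Case 2 (dashed line): size or depth changes, so x needs a 1*1 convolution.
            needs_projection = stride != 1 or in_ch != filters
            size = math.ceil(size / stride)
            rows.append((filters, size, needs_projection))
            in_ch = filters
        filters *= 2
    return rows


rows = resnet18_blocks()
for filters, side, needs_projection in rows:
    print(filters, side, "dashed (1*1 conv)" if needs_projection else "solid (identity)")
```

Only the first ResNet block of the second, third and fourth groups needs the 1*1 convolution; every other block adds x directly.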

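The two forms of the ResNet block described above are wrapped together in a single ResnetBlock class.

The details are written as comments in the code.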

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class ResnetBlock(Model):

    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path

        self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')

        self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = BatchNormalization()

        # When residual_path is True, downsample the input with a 1x1 convolution that adjusts the size or depth of inputs, so that x has the same dimensions as F(x) and the two can be added
        if residual_path:
            self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = BatchNormalization()
        
        self.a2 = Activation('relu')

    def call(self, inputs):
        residual = inputs  # residual is the input itself, i.e. residual = x
        # Pass the input through the convolution, BN and activation layers to compute F(x)
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)

        x = self.c2(x)
        y = self.b2(x)
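        # If the stacked convolutions change the dimensions (residual_path is True), project the input with the 1x1 convolution branch below
        # If the dimensions are unchanged, skip the projection and add directly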
        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)

        # The output is the sum of the two parts, F(x)+x or F(x)+Wx, passed through the activation function
        out = self.a2(y + residual)
        return out

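# 18 layers in total: a plain convolutional layer first, then four orange blocks; the first orange block holds two ResNet blocks with solid-line skip connections
# The second, third and fourth orange blocks each hold a dashed-line ResNet block followed by a solid-line one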
class ResNet18(Model):

    def __init__(self, block_list, initial_filters=64):  # block_list gives the number of ResnetBlocks in each block
        super(ResNet18, self).__init__()
        self.num_blocks = len(block_list)  # number of blocks
        self.block_list = block_list
        self.out_filters = initial_filters
        # First layer: a plain convolutional layer
        self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.blocks = tf.keras.models.Sequential()
        # Build the ResNet structure
        # The first orange block holds two solid-line ResNet blocks
        # The second, third and fourth orange blocks each hold a dashed-line ResNet block followed by a solid-line one
        # The structure is built by the for loops; the outer loop runs once per element of block_list, so model = ResNet18([2, 2, 2, 2]) runs it 4 times
        for block_id in range(len(block_list)):  # index of the resnet block
            for layer_id in range(block_list[block_id]):  # index of the ResnetBlock within it

                if block_id != 0 and layer_id == 0:  # downsample the input of every block except the first
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)  # add the finished block to the network
            self.out_filters *= 2  # the next block has twice as many kernels as this one
        # Global average pooling
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()
        # Fully connected layer with 10 outputs
        self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y


model = ResNet18([2, 2, 2, 2])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/ResNet18.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
# Dump the name, shape and values of every trainable variable to a text file
with open('./weights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

###############################################    show   ###############################################

# Plot the accuracy and loss curves for the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()