[Neural Networks] (8) Convolutional Neural Network (MobileNet_v1), Case Study: 10-Class CIFAR Image Classification

立Sir

Last modified 2022-04-04 08:49:04

Category: TensorFlow Neural Networks

First published 2021-12-16 20:04:47

Article link: https://blog.csdn.net/dgvv4/article/details/121981239

Hello everyone. Today I'd like to share how to build the MobileNet_v1 network in TensorFlow 2.0.

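1. Model overview

MobileNet is a family of lightweight networks, designed as efficient models for mobile and embedded vision applications. For details, see: "MobileNet_v1 Explained" on 灰信网 (freesion.com)

MobileNet is built on depthwise separable convolutions, which factorize a standard convolution into a depthwise convolution followed by a 1x1 (pointwise) convolution. The depthwise convolution applies a single filter to each input channel; the pointwise convolution then applies a 1x1 convolution to combine the depthwise outputs. A standard convolution filters and combines all inputs into a new set of outputs in a single step, whereas a depthwise separable convolution splits this into two steps: filter each channel separately, then combine. This factorization substantially reduces both the computation and the model size.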

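The difference between the two is that a standard convolution applies every kernel across all input channels, while a depthwise convolution applies a separate kernel to each input channel. Suppose the input has M channels. As the figure below shows, a standard convolutional layer has N kernels, each of size Dk * Dk with M channels; the depthwise convolution has M kernels, each of size Dk * Dk with a single channel.

However, the depthwise convolution outputs exactly as many channels as it receives, so how do we change the channel count? This is where the pointwise convolution comes in: it is simply a 1*1 standard convolution, and its main job is to increase or reduce the number of channels.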
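
To make the savings concrete, here is a small back-of-the-envelope sketch in plain Python. It counts multiply-adds for one layer using the M, N, Dk notation above; `d_f` (the side length of the output feature map) and the layer sizes are my own illustrative assumptions, and stride 1 is assumed.

```python
def standard_conv_cost(d_k, m, n, d_f):
    """Multiply-adds of a standard convolution: Dk * Dk * M * N * Df * Df."""
    return d_k * d_k * m * n * d_f * d_f


def separable_conv_cost(d_k, m, n, d_f):
    """Multiply-adds of a depthwise step plus a 1x1 pointwise step."""
    depthwise = d_k * d_k * m * d_f * d_f   # one Dk x Dk filter per input channel
    pointwise = m * n * d_f * d_f           # 1x1 convolution mixing M channels into N
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map
d_k, m, n, d_f = 3, 512, 512, 14
ratio = separable_conv_cost(d_k, m, n, d_f) / standard_conv_cost(d_k, m, n, d_f)
print(f"separable / standard = {ratio:.4f}")  # algebraically 1/N + 1/Dk^2
```

For a 3x3 kernel the ratio is dominated by the 1/Dk^2 term, so the separable version needs roughly one ninth of the multiply-adds.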

[Figure: kernels of a standard convolution compared with those of a depthwise separable convolution]

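2. Data loading

Load the CIFAR-10 image data. The loaded labels y are 2-D with shape (50000, 1), so the axis of size 1 has to be squeezed out with tf.squeeze(), specifying axis=1. Use .map() to apply the preprocessing function to every sample in the dataset: each pixel of the original images lies in [0, 255], and normalization maps it to [0, 1]. Specify .batch(128) so that each iteration draws 128 samples from the dataset for training.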

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, Model, datasets

# (1) Load the data
(x,y), (x_test,y_test) = datasets.cifar10.load_data()

# Inspect the data
print('x.shape:',x.shape,'y.shape:',y.shape)
# x.shape: (50000, 32, 32, 3) y.shape: (50000, 1)
print('y[:5]:', y[:5])

# (2) Preprocessing
# Preprocessing function: type conversion and normalization
def processing(x,y):
    x = tf.cast(x, dtype=tf.float32) / 255.0  # scale pixels from [0,255] to [0,1]
    y = tf.cast(y, dtype=tf.int32)
    return x,y

# Build the training dataset
y = tf.squeeze(y, axis=1) # squeeze out the size-1 axis of the labels y
train_db = tf.data.Dataset.from_tensor_slices((x,y))
train_db = train_db.map(processing).shuffle(10000).batch(128)

# Build the test dataset
y_test = tf.squeeze(y_test, axis=1)
test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test))
test_db = test_db.map(processing).batch(128)

# Create an iterator and check one batch
sample = next(iter(train_db))
print('x_batch:', sample[0].shape,'y_batch', sample[1].shape)

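To get an intuitive feel for the images we want to classify, visualize a few of them.

# Display sample images
import matplotlib.pyplot as plt
for i in range(15):
    plt.subplot(3,5,i+1)
    plt.imshow(sample[0][i]) # sample[0] holds all the images of the batch we took out
    plt.xticks([]) # hide the x and y axis ticks
    plt.yticks([])
plt.show()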

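3. Model construction

The network architecture is as follows:

[Figure: MobileNet_v1 network architecture]

Hyperparameters: depth_multiplier and alpha.

alpha is the width multiplier. The channel count of every layer is multiplied by alpha (the code below truncates the result with int()), which shrinks both the model size and the computation to roughly alpha^2 of the original. It is used to reduce the width of the model.

depth_multiplier is the argument of Keras' DepthwiseConv2D layer: the number of depthwise output channels produced per input channel, so the default of 1 leaves the channel count of the depthwise step unchanged. It should not be confused with the resolution multiplier of the MobileNet paper, which scales the input resolution (and therefore the resolution of every layer), leaves the model size unchanged, and reduces the computation to roughly the square of that multiplier. In this code the input resolution is set through input_shape.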
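
As a rough check of these scaling claims, the sketch below counts multiply-adds for a single depthwise separable layer while shrinking either the channel count or the feature-map resolution. The function name, the `rho` parameter (a resolution scaling factor) and the layer sizes are illustrative assumptions of mine, not part of the model code in this post.

```python
def separable_cost(d_k, m, n, d_f, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer after scaling.

    alpha scales the input/output channel counts (width),
    rho scales the feature-map side length (resolution).
    """
    m_s, n_s = int(alpha * m), int(alpha * n)
    d_s = int(rho * d_f)
    depthwise = d_k * d_k * m_s * d_s * d_s
    pointwise = m_s * n_s * d_s * d_s
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map
base = separable_cost(3, 512, 512, 14)
half_width = separable_cost(3, 512, 512, 14, alpha=0.5)
half_res = separable_cost(3, 512, 512, 14, rho=0.5)

print("width 0.5      ->", round(half_width / base, 4))  # close to 0.5 ** 2
print("resolution 0.5 ->", round(half_res / base, 4))    # 0.5 ** 2
```

The width ratio is only approximately alpha^2 because the depthwise term scales linearly with alpha, while the resolution ratio is exactly the square of the factor.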

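3.1 Standard convolution

# Convolution + BN + activation
def conv_block(x, filter1, alpha, kernel, stride):
    # Width multiplier
    filter1 = int(filter1 * alpha) # number of kernels, i.e. output channels
    # Convolution
    x = layers.Conv2D(filter1, kernel, stride, padding='same', use_bias=False)(x)
    # Batch normalization
    x = layers.BatchNormalization()(x)
    # ReLU6 activation, chosen to preserve precision on mobile devices.
    # layers.ReLU(max_value=6.0) is ReLU6 and does not depend on the 'relu6' string being registered
    x = layers.ReLU(max_value=6.0)(x)
    
    return x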

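If a convolutional layer is followed by a BatchNormalization layer, the bias can be dropped with use_bias=False: BN subtracts the batch mean, so the bias has no effect on the model and only takes up memory. For details, see the article: "When is a bias needed, and when can it be left out?"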
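
A tiny numerical illustration of why the bias is redundant, written in plain Python. The `batch_norm` helper below is a simplified stand-in of my own (it normalizes one channel over a batch and leaves out BN's learnable scale and shift), and the numbers are made up.

```python
from statistics import fmean, pvariance


def batch_norm(values, eps=1e-3):
    """Normalize one channel over a batch: (v - mean) / sqrt(var + eps)."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


conv_out = [0.5, -1.2, 3.0, 0.7]   # made-up convolution outputs for one channel
bias = 2.5                         # a constant bias added to every output

without_bias = batch_norm(conv_out)
with_bias = batch_norm([v + bias for v in conv_out])

# The mean subtraction cancels the bias, so both results agree
print(all(abs(a - b) < 1e-9 for a, b in zip(without_bias, with_bias)))
```

Adding a constant shifts the mean by the same constant and leaves the variance untouched, which is why the two normalized outputs coincide.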

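ReLU6 is ReLU with its output capped at 6. The cap is mainly there so that low-precision float16 arithmetic on mobile devices still has good numerical resolution: if the ReLU output is left unbounded, it ranges from 0 to positive infinity, and low-precision float16 cannot represent such values accurately, which costs precision.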
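
The relationship between the two activations is easy to write down. This is a plain-Python sketch for scalars, only meant to show the shape of the functions, not how Keras implements them.

```python
def relu(x):
    """Standard ReLU: negative inputs become 0, positive inputs pass through."""
    return max(0.0, x)


def relu6(x):
    """ReLU6: same as ReLU, but the output is additionally capped at 6."""
    return min(max(0.0, x), 6.0)


for value in (-2.0, 3.0, 10.0):
    print(value, relu(value), relu6(value))
```

The two functions agree on every input up to 6 and differ only above it, where ReLU6 stays flat.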

3.2 Depthwise separable convolution

# A depthwise separable block replaces a regular 3*3 convolution and reduces the parameter count
def depth_conv_block(x, filter1, alpha, stride, depth_multiplier):
    # Width multiplier
    filter1 = int(filter1 * alpha)
    
    # Depthwise convolution; depth_multiplier is the number of output channels per input channel
    x = layers.DepthwiseConv2D(kernel_size=(3,3), 
                                strides=stride, 
                                padding='same', 
                                depth_multiplier=depth_multiplier,
                                use_bias=False)(x)
    # Batch normalization
    x = layers.BatchNormalization()(x)
    # ReLU6 activation
    x = layers.ReLU(max_value=6.0)(x)
    
    # 1*1 standard convolution + BN + activation to adjust the channel count.
    # Note: with a BN layer after the convolution a bias is unnecessary and only takes up memory
    x = layers.Conv2D(filter1, kernel_size=(1,1), strides=(1,1), padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(max_value=6.0)(x)
    
    return x

3.3 Network body

Following the architecture diagram above, stack the layers one by one.

# Network body
def mobilenet(classes=1000, input_shape=(224,224,3), alpha=1.0, depth_multiplier=1):
    # Input layer
    input_tensor = keras.Input(shape=input_shape)
    
    # Standard convolution  [224,224,3]==>[112,112,32]
    x = conv_block(input_tensor, 32, alpha, kernel=(3,3), stride=(2,2))
    
    # Depthwise separable convolution blocks
    # [112,112,32] ==> [112,112,64]
    x = depth_conv_block(x, 64, alpha, (1,1), depth_multiplier) 
    
    # [112,112,64] ==> [56,56,128]
    x = depth_conv_block(x, 128, alpha, (2,2), depth_multiplier) # [56,56,128]
    x = depth_conv_block(x, 128, alpha, (1,1), depth_multiplier) # [56,56,128]
   
    # [56,56,128] ==> [28,28,256]
    x = depth_conv_block(x, 256, alpha, (2,2), depth_multiplier) # [28,28,256]
    x = depth_conv_block(x, 256, alpha, (1,1), depth_multiplier) # [28,28,256]
    
    # [28,28,256] ==> [14,14,512]
    x = depth_conv_block(x, 512, alpha, (2,2), depth_multiplier) # [14,14,512]
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [14,14,512]   
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [14,14,512]   
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [14,14,512]   
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [14,14,512]   
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [14,14,512]   
     
    # [14,14,512] ==> [7,7,1024]
    x = depth_conv_block(x, 1024, alpha, (2,2), depth_multiplier) # [7,7,1024]
    x = depth_conv_block(x, 1024, alpha, (1,1), depth_multiplier) # [7,7,1024]   
   
    # Global average pooling
    # [7, 7, 1024] -> [None, 1024]
    x = layers.GlobalAveragePooling2D()(x)
     
    shape = (1, 1, int(1024 * alpha)) #[1,1,1024]

    x = layers.Reshape(shape)(x) #[1,1,1024]
    x = layers.Dropout(0.5)(x)

    x = layers.Conv2D(classes, (1, 1), padding='same')(x) #[1,1,classes]
    x = layers.Activation('softmax')(x) # [1,1,classes]
    x = layers.Reshape((classes,))(x) # [None, classes]
    
    # Build the model
    model = Model(input_tensor, x)
    return model

# Instantiate the model
model = mobilenet()
# Print the network structure
model.summary()
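
The shape comments above assume a 224x224 input. To see what the same stack of strides does to other input sizes, such as the 32x32 CIFAR images, here is a small plain-Python sketch. It relies on the fact that with padding='same' the output side length is ceil(input / stride); the `strides` list is copied from the layer calls above, and the helper names are my own.

```python
import math

# Strides of the initial conv_block followed by the 13 depth_conv_block calls
strides = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]


def same_padding_out(size, stride):
    """Output side length of a convolution with padding='same'."""
    return math.ceil(size / stride)


def trace_sizes(input_size):
    """Side length of the feature map after each layer in the stack."""
    sizes = [input_size]
    for stride in strides:
        sizes.append(same_padding_out(sizes[-1], stride))
    return sizes


print(trace_sizes(224))  # ends at 7, matching the [7,7,1024] comment
print(trace_sizes(32))   # ends at 1 for CIFAR-sized inputs
```

With a 32x32 input the feature map is already 1x1 before global average pooling, so the later layers see very little spatial information.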

4. Complete code

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, Model, datasets


# (1) Load the data
(x,y), (x_test,y_test) = datasets.cifar10.load_data()

# Inspect the data
print('x.shape:',x.shape,'y.shape:',y.shape)
print('x_test.shape:', x_test.shape, 'y_test.shape:', y_test.shape)
print('y[:5]:', y[:5])


# (2) Preprocessing
# Preprocessing function: type conversion and normalization
def processing(x,y):
    x = tf.cast(x, dtype=tf.float32) / 255.0  # scale pixels from [0,255] to [0,1]
    y = tf.cast(y, dtype=tf.int32)
    return x,y

# Build the training dataset
y = tf.squeeze(y, axis=1) # squeeze out the size-1 axis of the labels y
y = tf.one_hot(y, depth=10) # one-hot encode the 10 classes
train_db = tf.data.Dataset.from_tensor_slices((x,y))
train_db = train_db.map(processing).shuffle(10000).batch(128)

# Build the test dataset
y_test = tf.squeeze(y_test, axis=1)
y_test = tf.one_hot(y_test, depth=10)
test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test))
test_db = test_db.map(processing).batch(128)

# Create an iterator and check one batch
sample = next(iter(train_db))
print('x_batch:', sample[0].shape,'y_batch', sample[1].shape)


# Display sample images
import matplotlib.pyplot as plt
for i in range(15):
    plt.subplot(3,5,i+1)
    plt.imshow(sample[0][i]) # sample[0] holds all the images of the batch, already scaled to [0,1]
    plt.xticks([]) # hide the x and y axis ticks
    plt.yticks([])
plt.show()


# (3) Build the model
# Convolution + BN + activation
def conv_block(x, filter1, alpha, kernel, stride):
    # Width multiplier
    filter1 = int(filter1 * alpha)
    # Convolution
    x = layers.Conv2D(filter1, kernel, stride, padding='same', use_bias=False)(x)
    # Batch normalization
    x = layers.BatchNormalization()(x)
    # ReLU6 activation, chosen to preserve precision on mobile devices
    x = layers.ReLU(max_value=6.0)(x)
    
    return x


# A depthwise separable block replaces a regular 3*3 convolution and reduces the parameter count
def depth_conv_block(x, filter1, alpha, stride, depth_multiplier):
    # Width multiplier
    filter1 = int(filter1 * alpha)
    
    # Depthwise convolution
    x = layers.DepthwiseConv2D(kernel_size=(3,3), 
                                strides=stride, 
                                padding='same', 
                                depth_multiplier=depth_multiplier,
                                use_bias=False)(x)
    # Batch normalization
    x = layers.BatchNormalization()(x)
    # ReLU6 activation
    x = layers.ReLU(max_value=6.0)(x)
    
    # 1*1 standard convolution + BN + activation to adjust the channel count.
    # Note: with a BN layer after the convolution a bias is unnecessary and only takes up memory
    x = layers.Conv2D(filter1, kernel_size=(1,1), strides=(1,1), padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(max_value=6.0)(x)
    
    return x


# Network body
# 10 classes; the input size matches the CIFAR-10 images loaded above
def mobilenet(classes=10, input_shape=(32, 32, 3), alpha=1.0, depth_multiplier=1):
    # Input layer
    input_tensor = keras.Input(shape=input_shape)
    
    # Standard convolution  [32,32,3]==>[16,16,32]
    x = conv_block(input_tensor, 32, alpha, kernel=(3,3), stride=(2,2))
    
    # Depthwise separable convolution blocks
    # [16,16,32] ==> [16,16,64]
    x = depth_conv_block(x, 64, alpha, (1,1), depth_multiplier) 
    
    # [16,16,64] ==> [8,8,128]
    x = depth_conv_block(x, 128, alpha, (2,2), depth_multiplier) # [8,8,128]
    x = depth_conv_block(x, 128, alpha, (1,1), depth_multiplier) # [8,8,128]
   
    # [8,8,128] ==> [4,4,256]
    x = depth_conv_block(x, 256, alpha, (2,2), depth_multiplier) # [4,4,256]
    x = depth_conv_block(x, 256, alpha, (1,1), depth_multiplier) # [4,4,256]
    
    # [4,4,256] ==> [2,2,512]
    x = depth_conv_block(x, 512, alpha, (2,2), depth_multiplier) # [2,2,512]
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [2,2,512]
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [2,2,512]
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [2,2,512]
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [2,2,512]
    x = depth_conv_block(x, 512, alpha, (1,1), depth_multiplier) # [2,2,512]
     
    # [2,2,512] ==> [1,1,1024]
    x = depth_conv_block(x, 1024, alpha, (2,2), depth_multiplier) # [1,1,1024]
    x = depth_conv_block(x, 1024, alpha, (1,1), depth_multiplier) # [1,1,1024]
   
    # Global average pooling [1,1,1024] ==> [b, 1024]
    x = layers.GlobalAveragePooling2D()(x)
    
    shape = (1, 1, int(1024 * alpha))

    x = layers.Reshape(shape)(x) # [1,1,1024]
    x = layers.Dropout(0.5)(x)

    x = layers.Conv2D(classes, (1, 1), padding='same')(x) # [1,1,classes]
    x = layers.Activation('softmax')(x)
    x = layers.Reshape((classes,))(x) # [b, classes]
    
    # Build the model
    model = Model(input_tensor, x)
    return model

# Instantiate the model
model = mobilenet()
# Print the network structure
model.summary()


# (4) Network configuration
# Set the learning rate
opt = optimizers.Adam(learning_rate=1e-5)

# Compile
model.compile(optimizer=opt,  # optimizer
              loss=tf.losses.CategoricalCrossentropy(),  # loss function (labels are one-hot)
              metrics=['accuracy'])  # evaluation metric

# Train
history = model.fit(train_db,  # training set
                    validation_data=test_db,  # validation set
                    epochs=30)  # number of epochs

# (5) Model evaluation
# Accuracy (with metrics=['accuracy'] the history keys are 'accuracy' and 'val_accuracy')
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
# Loss
train_loss = history.history['loss']
val_loss = history.history['val_loss']
# Plot
plt.figure(figsize=(10,5))
# Accuracy curves
plt.subplot(1,2,1)
plt.plot(train_acc, label='train_acc')
plt.plot(val_acc, label='val_acc')
plt.legend()
# Loss curves
plt.subplot(1,2,2)
plt.plot(train_loss, label='train_loss')
plt.plot(val_loss, label='val_loss')
plt.legend()
plt.show()