MobileNetV1&MobileNetV2

Author: 一位美女

Category: Deep Learning

Article link: https://blog.csdn.net/weixin_45019830/article/details/107939899


MobileNetV1

1. Depthwise separable convolution

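A depthwise separable convolution splits a standard convolution into two steps: a depthwise convolution followed by a pointwise convolution.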
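
To see why the two-step split pays off, the sketch below counts weights for a standard convolution and for its depthwise separable counterpart. The function names and the example sizes (3x3 kernel, 32 input channels, 64 output channels) are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    # One k x k x c_in filter per output channel
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(std, sep, round(ratio, 4))
```

With a 3x3 kernel the ratio is dominated by the 1/k² term, so the separable form needs roughly one eighth to one ninth of the weights.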

1.1 Depthwise convolution

[Figure: depthwise convolution]
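
As a rough illustration of the depthwise step, here is a naive pure-Python version that filters each channel with its own kernel and never mixes channels. It assumes stride 1 and no padding, and the function name and toy data are hypothetical; real code would use tf.keras.layers.DepthwiseConv2D.

```python
def depthwise_conv2d(image, kernels):
    """image: H x W x C nested lists; kernels: k x k x C nested lists."""
    h, w, c = len(image), len(image[0]), len(image[0][0])
    k = len(kernels)
    out_h, out_w = h - k + 1, w - k + 1
    out = [[[0] * c for _ in range(out_w)] for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            for ch in range(c):
                # Channel ch only ever sees kernel slice ch
                out[y][x][ch] = sum(
                    image[y + i][x + j][ch] * kernels[i][j][ch]
                    for i in range(k)
                    for j in range(k)
                )
    return out


# 3x3 image with 2 channels: channel 0 is all ones, channel 1 is all twos
image = [[[1, 2] for _ in range(3)] for _ in range(3)]
# One 3x3 all-ones kernel per channel
kernels = [[[1, 1] for _ in range(3)] for _ in range(3)]

result = depthwise_conv2d(image, kernels)
print(result)
```

The output keeps the same number of channels as the input, which is why a second step is needed to combine them.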

1.2 Pointwise convolution

[Figure: pointwise convolution]
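
The pointwise step is a 1x1 convolution, which amounts to applying the same small matrix to the channel vector at every pixel. The sketch below shows this in pure Python; the function name, the weight layout (C_in x C_out), and the toy values are assumptions made for illustration.

```python
def pointwise_conv2d(feature_map, weights):
    """feature_map: H x W x C_in nested lists; weights: C_in x C_out nested lists."""
    c_out = len(weights[0])
    return [
        [
            # Each output channel is a weighted sum over all input channels
            [sum(pixel[i] * weights[i][o] for i in range(len(pixel)))
             for o in range(c_out)]
            for pixel in row
        ]
        for row in feature_map
    ]


feature_map = [[[9, 18]]]           # 1 x 1 spatial size, 2 channels
weights = [[1, 0, 1],               # contribution of input channel 0
           [0, 1, 1]]               # contribution of input channel 1

mixed = pointwise_conv2d(feature_map, weights)
print(mixed)
```

Unlike the depthwise step, this one changes the channel count (here from 2 to 3) and leaves the spatial size untouched.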

1.3 Implementation

1) Combine tf.keras.layers.DepthwiseConv2D and tf.keras.layers.Conv2D:

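# Depthwise step: one 3x3 filter per input channel
tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), strides=1, padding='same', depth_multiplier=1),
# Pointwise step: a 1x1 convolution that mixes the channels into 16 outputs
tf.keras.layers.Conv2D(filters=16, kernel_size=(1, 1), strides=1, padding='same')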

2) Use tf.keras.layers.SeparableConv2D directly:

tf.keras.layers.SeparableConv2D(filters=num_filters_2nd,kernel_size=(3, 3),strides=1,padding="same")
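
The two variants compute the same kind of operation, but with Keras defaults they do not hold exactly the same weights: DepthwiseConv2D and Conv2D each add their own bias (use_bias=True by default), while SeparableConv2D has a single bias on its output. The sketch below counts parameters under that assumption; the helper names and the choice of 32 input channels are hypothetical.

```python
def stacked_params(k, c_in, filters, depth_multiplier=1):
    """DepthwiseConv2D followed by a 1x1 Conv2D, both with bias."""
    dw_out = c_in * depth_multiplier
    depthwise = k * k * dw_out + dw_out      # kernel + bias
    pointwise = dw_out * filters + filters   # 1x1 kernel + bias
    return depthwise + pointwise


def separable_layer_params(k, c_in, filters, depth_multiplier=1):
    """Single SeparableConv2D: depthwise kernel, pointwise kernel, one bias."""
    dw_out = c_in * depth_multiplier
    return k * k * dw_out + dw_out * filters + filters


stacked = stacked_params(3, 32, 16)
fused = separable_layer_params(3, 32, 16)
print(stacked, fused, stacked - fused)
```

The difference is exactly the depthwise bias, one value per depthwise output channel. Passing use_bias=False to DepthwiseConv2D in the stacked variant removes it.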

2. MobileNetV1

2.1 Architecture

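MobileNetV1 is built from depthwise separable convolutions, except for the first layer, which is a standard (full) convolution:
[Figure: MobileNetV1 network structure]
Every layer is followed by batch normalization and a ReLU nonlinearity, except the final fully connected layer, which has no nonlinearity and feeds directly into a softmax layer for classification.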

2.2 Code implementation

Convolution block: a convolution layer, a batch normalization layer, and an activation layer.

import tensorflow as tf

def conv_block(inputs, filters, kernel_size=(3, 3), strides=(1, 1)):
    # Standard convolution; no bias because BatchNormalization adds its own shift
    x = tf.keras.layers.Conv2D(filters=filters, kernel_size=kernel_size, strides=strides,
                               padding='same', use_bias=False)(inputs)
    x = tf.keras.layers.BatchNormalization()(x)
    out = tf.keras.layers.Activation('relu')(x)

    return out

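Depthwise separable block: a depthwise convolution layer and a pointwise convolution layer, each followed by batch normalization and ReLU: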

def depthwise_conv_block(inputs,
                         pointwise_conv_filters,
                         strides=(1, 1)):
    # Depthwise step: 3x3 filter per channel; any downsampling happens here
    x = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), strides=strides, padding='same',
                                        use_bias=False)(inputs)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Activation('relu')(x)

    # Pointwise step: 1x1 convolution that sets the number of output channels
    x = tf.keras.layers.Conv2D(filters=pointwise_conv_filters, kernel_size=(1, 1),
                               padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    out = tf.keras.layers.Activation('relu')(x)
    return out

Full model: one convolution block, 13 depthwise separable blocks, a global average pooling layer, and a final fully connected classification layer.

def mobilenet_v1(inputs,
                 classes):
    # [32, 32, 3] => [16, 16, 32]
    x = conv_block(inputs, 32, strides=(2,2))
    # [16, 16, 32] => [16, 16, 64]
    x = depthwise_conv_block(x, 64)
    # [16, 16, 64] => [8, 8, 128]
    x = depthwise_conv_block(x, 128, strides=(2,2))
    # [8, 8, 128] => [8, 8, 128]
    x = depthwise_conv_block(x, 128)
    # [8, 8, 128] => [4, 4, 256]
    x = depthwise_conv_block(x, 256, strides=(2, 2))
    # [4, 4, 256] => [4, 4, 256]
    x = depthwise_conv_block(x, 256)
    # [4, 4, 256] => [2, 2, 512]
    x = depthwise_conv_block(x, 512, strides=(2, 2))
    # [2, 2, 512] => [2, 2, 512]
    x = depthwise_conv_block(x, 512)
    # [2, 2, 512] => [2, 2, 512]
    x = depthwise_conv_block(x, 512)
    # [2, 2, 512] => [2, 2, 512]
    x = depthwise_conv_block(x, 512)
    # [2, 2, 512] => [2, 2, 512]
    x = depthwise_conv_block(x, 512)
    # [2, 2, 512] => [2, 2, 512]
    x = depthwise_conv_block(x, 512)
    # [2, 2, 512] => [1, 1, 1024]
    x = depthwise_conv_block(x, 1024, strides=(2,2))
    # [1, 1, 1024] => [1, 1, 1024]
    x = depthwise_conv_block(x, 1024)
    
    # [1, 1, 1024] => (1024,)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # (1024,) => (classes,)
    pred = tf.keras.layers.Dense(classes, activation='softmax')(x)
 
    return pred
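
The shape comments in the model can be checked without TensorFlow. The sketch below replays the same layer sequence in plain Python, assuming a 32x32x3 input and that 'same' padding yields ceil(size / stride) as the output size. The BLOCKS list and the function name are illustrative, and batch normalization parameters are left out of the weight count.

```python
import math

# (pointwise filters, stride) for the 13 depthwise separable blocks
BLOCKS = ([(64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2)]
          + [(512, 1)] * 5
          + [(1024, 2), (1024, 1)])


def trace_mobilenet_v1(size=32, in_channels=3):
    """Return the feature-map shape after every block and the conv weight count."""
    shapes = []
    # First layer: standard 3x3 convolution, 32 filters, stride 2
    size = math.ceil(size / 2)
    conv_weights = 3 * 3 * in_channels * 32
    channels = 32
    shapes.append((size, size, channels))
    for filters, stride in BLOCKS:
        size = math.ceil(size / stride)
        # 3x3 depthwise kernel plus 1x1 pointwise kernel
        conv_weights += 3 * 3 * channels + channels * filters
        channels = filters
        shapes.append((size, size, channels))
    return shapes, conv_weights


shapes, conv_weights = trace_mobilenet_v1()
for shape in shapes:
    print(shape)
print("convolution weights:", conv_weights)
```

Calling trace_mobilenet_v1(size=224) gives the spatial sizes for an ImageNet-sized input, where the last feature map is 7x7 rather than 1x1.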

2.3 Optimization

Width multiplier: thinner models
[Figure: width multiplier]

Resolution multiplier: reduced representation
[Figure: resolution multiplier]
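
In the MobileNets paper, the cost of one depthwise separable layer with width multiplier α and resolution multiplier ρ is D_K·D_K·αM·ρD_F·ρD_F + αM·αN·ρD_F·ρD_F, where D_K is the kernel size, M and N are the input and output channel counts, and D_F is the feature-map size. The sketch below evaluates that expression; the function name and the example layer (3x3 kernel, 512 to 512 channels, 14x14 feature map) are assumptions chosen for illustration.

```python
def separable_mult_adds(k, m, n, d_f, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer under both multipliers."""
    m_a, n_a = int(alpha * m), int(alpha * n)   # thinner layer
    d = int(rho * d_f)                          # smaller feature map
    depthwise = k * k * m_a * d * d
    pointwise = m_a * n_a * d * d
    return depthwise + pointwise


base = separable_mult_adds(3, 512, 512, 14)
thin = separable_mult_adds(3, 512, 512, 14, alpha=0.5)
small = separable_mult_adds(3, 512, 512, 14, rho=0.5)

print(base, thin, small)
print(round(thin / base, 3), round(small / base, 3))
```

Both multipliers cut the cost roughly quadratically: halving the width scales the dominant pointwise term by α², and halving the resolution scales every term by ρ².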

References:

https://blog.csdn.net/mzpmzk/article/details/82976871

https://blog.csdn.net/qq_37116150/article/details/105161871#2.%20%E5%9F%BA%E4%BA%8ECIFAR-10%E6%95%B0%E6%8D%AE%E9%9B%86%E6%9E%84%E5%BB%BA%E5%88%86%E7%B1%BB%E6%A8%A1%E5%9E%8B


MobileNetV2

1. Differences from MobileNetV1

[Figure: comparison of the MobileNetV1 and MobileNetV2 blocks]

2. Differences from ResNet

[Figure: comparison of the ResNet and MobileNetV2 blocks]
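
One way to picture the contrast: a ResNet bottleneck block goes wide, narrow, wide (1x1 reduce, 3x3 standard convolution, 1x1 expand), while a MobileNetV2 inverted residual block goes narrow, wide, narrow (1x1 expand, 3x3 depthwise convolution, 1x1 project), with the shortcut joining the narrow ends. The sketch below only lists the channel widths along each block. The function names are hypothetical; the reduction factor of 4 is the usual ResNet bottleneck setting, and the expansion factor of 6 is the value the MobileNetV2 paper uses.

```python
def resnet_bottleneck_widths(channels, reduction=4):
    """Channel widths: input, after 1x1 reduce, after 3x3 conv, after 1x1 expand."""
    mid = channels // reduction
    return [channels, mid, mid, channels]


def inverted_residual_widths(channels, expansion=6):
    """Channel widths: input, after 1x1 expand, after 3x3 depthwise, after 1x1 project."""
    mid = channels * expansion
    return [channels, mid, mid, channels]


resnet = resnet_bottleneck_widths(256)
mobilenet_v2 = inverted_residual_widths(24)
print("ResNet bottleneck:", resnet)
print("MobileNetV2 block:", mobilenet_v2)
```

Because the 3x3 convolution in MobileNetV2 is depthwise, running it on the wide middle section stays cheap even though the channel count there is several times larger.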

3. Network architecture

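[Figure: MobileNetV2 network structure table]
t is the expansion factor applied to the input channels (the number of channels in the middle of the block is t times the number of input channels)
n is the number of times the block is repeated
c is the number of output channels
s is the stride of the first repetition of the block (all later repetitions use stride 1)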
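
To make the meaning of t, n, c and s concrete, the sketch below expands each table row into its individual blocks. The rows are the bottleneck settings listed in the MobileNetV2 paper, and the starting width of 32 channels is the output of that paper's first convolution; the function name and the dictionary layout are assumptions for illustration.

```python
# (t, c, n, s) rows of the bottleneck table in the MobileNetV2 paper
BOTTLENECKS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def expand_blocks(rows, in_channels=32):
    """Turn (t, c, n, s) rows into one entry per bottleneck block."""
    blocks = []
    for t, c, n, s in rows:
        for i in range(n):
            blocks.append({
                "in": in_channels,
                "expanded": in_channels * t,   # width of the middle section
                "out": c,
                "stride": s if i == 0 else 1,  # only the first repeat downsamples
            })
            in_channels = c
    return blocks


blocks = expand_blocks(BOTTLENECKS)
for block in blocks:
    print(block)
```

Note that the expansion is relative to each block's own input, so the second block of a row expands from c channels, not from the row's original input width.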

4. Summary

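Inverted residuals: a residual structure that first expands and then reduces the channel dimension, which improves gradient propagation and significantly reduces the memory needed during inference.
Linear bottlenecks: the ReLU after the narrow layer (low dimension or depth) is removed, which preserves feature diversity and strengthens the network's representational power.
The network is fully convolutional, so the model can handle images of different sizes; it uses the ReLU6 activation (output capped at 6), which makes the model more robust under low-precision computation.
In the MobileNetV2 building block, downsampling, when needed, is done by using stride 2 in the depthwise convolution. Smaller networks do better with a smaller expansion factor and larger networks with a slightly larger one; the recommended range is 5 to 10, and the paper uses t = 6.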
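
The points above can be put together in a minimal tf.keras sketch of one inverted residual block. This is not the author's code or the reference implementation: the function name, the argument names, and the choices to skip the expansion when the factor is 1 and to add the shortcut only when the stride is 1 and the channel counts match are assumptions made for illustration.

```python
import tensorflow as tf


def inverted_residual_block(inputs, out_channels, expansion=6, stride=1):
    in_channels = inputs.shape[-1]
    x = inputs

    # 1) Expand: 1x1 convolution widens the narrow input, followed by ReLU6
    if expansion != 1:
        x = tf.keras.layers.Conv2D(in_channels * expansion, 1, padding='same',
                                   use_bias=False)(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.ReLU(max_value=6.0)(x)

    # 2) Filter: 3x3 depthwise convolution; stride 2 here performs downsampling
    x = tf.keras.layers.DepthwiseConv2D(3, strides=stride, padding='same',
                                        use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU(max_value=6.0)(x)

    # 3) Project: 1x1 convolution back to a narrow layer, with no activation
    #    afterwards (the linear bottleneck)
    x = tf.keras.layers.Conv2D(out_channels, 1, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)

    # Shortcut between the narrow ends, only when the shapes agree
    if stride == 1 and in_channels == out_channels:
        x = tf.keras.layers.Add()([inputs, x])
    return x


inputs = tf.keras.Input(shape=(32, 32, 16))
same_shape = inverted_residual_block(inputs, 16)            # shortcut is used
downsampled = inverted_residual_block(inputs, 24, stride=2)  # no shortcut
print(same_shape.shape, downsampled.shape)
```

Stacking this block according to the t, n, c, s settings of the architecture table, between an initial convolution and a final 1x1 convolution with pooling, would give the overall network layout.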

References:
https://blog.csdn.net/mzpmzk/article/details/82976871
https://blog.csdn.net/kangdi7547/article/details/81431572