Python Selected 200 Tips: 181-182

AnFany

Published 2024-09-28 22:46:26

Article link: https://blog.csdn.net/qq_32882309/article/details/142622142

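The evolution and visualization of classic convolutional network architectures for images

The evolution and visualization of classic convolutional network architectures for images (continued)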

Operating system: macOS Sequoia 15.0
Python IDE: PyCharm 2024.1.4 (Community Edition)
Python version: 3.12
TensorFlow version: 2.17.0
PyTorch version: 2.4.1

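Previous installments:

1-5	6-10	11-20	21-30	31-40	41-50

51-60: Functions	61-70: Classes	71-80: Programming paradigms and design patterns

81-90: Python coding conventions	91-100: Common built-in Python modules, part 1

101-105: Built-in Python modules, part 2	106-110: Built-in Python modules, part 3

111-115: Common third-party packages: frequently used	116-120: Common third-party packages: deep learning

121-125: Common third-party packages: data scraping	126-130: Common third-party packages: for fun

131-135: Common third-party packages: extension tools 1	136-140: Common third-party packages: extension tools 2

Python projects in practice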

141-145	146-150	151-155	156-160	161-165	166-170	171-175	176-180

The evolution and visualization of classic convolutional network architectures for images (continued)

P181–MobileNet [2017]

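Model structure and innovations

MobileNet is a family of lightweight convolutional neural networks designed for mobile and embedded vision applications. The main characteristics of each MobileNet version are as follows: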

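(1) MobileNetV1

Key features

Introduces depthwise separable convolution
Uses a width multiplier and a resolution multiplier to adjust model size and complexity

Innovations

Depthwise separable convolution factorizes a standard convolution into a depthwise convolution followed by a pointwise (1×1) convolution; with 3×3 kernels this needs roughly 8 to 9 times less computation
Uses ReLU6 as the activation function, which is well suited to low-precision computation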
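
To make the savings concrete, here is a minimal sketch that counts multiply-accumulate operations for a single layer, using the cost formulas from the MobileNetV1 paper. The function names and the example layer size are illustrative assumptions and are not part of the original code.

```python
def standard_conv_macs(kernel, in_ch, out_ch, feature_size):
    """Multiply-accumulates of a standard convolution: K*K*M*N*F*F."""
    return kernel * kernel * in_ch * out_ch * feature_size * feature_size


def separable_conv_macs(kernel, in_ch, out_ch, feature_size):
    """Depthwise cost (K*K*M*F*F) plus 1x1 pointwise cost (M*N*F*F)."""
    depthwise = kernel * kernel * in_ch * feature_size * feature_size
    pointwise = in_ch * out_ch * feature_size * feature_size
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map
std = standard_conv_macs(3, 512, 512, 14)
sep = separable_conv_macs(3, 512, 512, 14)
ratio = sep / std  # algebraically equal to 1/N + 1/K^2
print(f"standard: {std:,}  separable: {sep:,}  ratio: {ratio:.3f}")
```

The ratio depends only on the number of output channels and the kernel size, so for 3×3 kernels and wide layers it approaches 1/9.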

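(2) MobileNetV2

Key features

Introduces the inverted residual structure
Designs the linear bottleneck

Innovations

The inverted residual block first expands the number of channels with a 1×1 convolution, then applies a depthwise convolution, and finally projects back down to a narrow bottleneck; a shortcut is added only when the input and output shapes match
Removes the ReLU after the final 1×1 projection and uses a linear activation instead, which helps preserve information in the low-dimensional features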
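
The expand, depthwise, project sequence is easier to follow when the tensor shapes are written out. The sketch below only traces shapes in plain Python and builds no layers; the helper name `trace_inverted_residual` and the example sizes are assumptions made for illustration.

```python
def trace_inverted_residual(in_ch, out_ch, stride, expand_ratio, size):
    """Return the (height, width, channels) after each stage of one block."""
    out_size = -(-size // stride)  # 'same' padding gives ceil(size / stride)
    expanded = in_ch * expand_ratio
    stages = [
        ("input", (size, size, in_ch)),
        ("1x1 expand + ReLU6", (size, size, expanded)),
        ("3x3 depthwise + ReLU6", (out_size, out_size, expanded)),
        ("1x1 project (linear)", (out_size, out_size, out_ch)),
    ]
    # The shortcut is only valid when input and output shapes are identical
    use_residual = stride == 1 and in_ch == out_ch
    return stages, use_residual


stages, use_residual = trace_inverted_residual(
    in_ch=24, out_ch=24, stride=1, expand_ratio=6, size=56)
for name, shape in stages:
    print(f"{name:<24}{shape}")
print("residual connection:", use_residual)
```

The block is wide in the middle and narrow at both ends, which is the opposite of a classic ResNet bottleneck and the reason for the name "inverted".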

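(3) MobileNetV3

Key features

Network architecture optimized with neural architecture search (NAS)
Introduces a new activation function: h-swish
Integrates Squeeze-and-Excitation (SE) modules
Available in Small and Large versions

Innovations

Uses NAS to search for a well-performing network structure automatically
The h-swish activation improves accuracy while remaining cheap to compute
SE modules strengthen the representational power of the features
Redesigns the first and last layers of the network to further improve efficiency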
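
To see why h-swish is cheap, it helps to compare it with the swish function it approximates. The following is a small pure-Python sketch (scalar inputs only, helper names chosen here for illustration) that evaluates both on a few sample points.

```python
import math


def relu6(x):
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def swish(x):
    """x * sigmoid(x); requires an exponential."""
    return x / (1.0 + math.exp(-x))


def h_swish(x):
    """Piecewise approximation without exponentials: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={value:+.1f}  swish={swish(value):+.4f}  h_swish={h_swish(value):+.4f}")
```

h-swish needs only an addition, a clamp, and a multiplication, which is why it maps well to mobile hardware and quantized inference.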

Model structure code

MobileNet V1

import tensorflow as tf
from tensorflow.keras import layers, models


def depthwise_conv_block(inputs, pointwise_conv_filters, alpha,
                         depth_multiplier=1, strides=(1, 1), block_id=1):
    """Adds a depthwise convolution block.

    A depthwise convolution block consists of a depthwise conv,
    batch normalization, ReLU6, pointwise convolution,
    batch normalization and ReLU6 activation.
    """
    channel_axis = -1
    pointwise_conv_filters = int(pointwise_conv_filters * alpha)

    x = layers.DepthwiseConv2D((3, 3),
                               padding='same',
                               depth_multiplier=depth_multiplier,
                               strides=strides,
                               use_bias=False,
                               name='conv_dw_%d' % block_id)(inputs)
    x = layers.BatchNormalization(axis=channel_axis, name='conv_dw_%d_bn' % block_id)(x)
    x = layers.ReLU(6., name='conv_dw_%d_relu' % block_id)(x)

    x = layers.Conv2D(pointwise_conv_filters, (1, 1),
                      padding='same',
                      use_bias=False,
                      strides=(1, 1),
                      name='conv_pw_%d' % block_id)(x)
    x = layers.BatchNormalization(axis=channel_axis, name='conv_pw_%d_bn' % block_id)(x)
    return layers.ReLU(6., name='conv_pw_%d_relu' % block_id)(x)


def MobileNetV1(input_shape=(224, 224, 3),
                alpha=1.0,
                depth_multiplier=1,
                dropout=1e-3,
                classes=1000):
    """Instantiates the MobileNet architecture.

    Arguments:
        input_shape: Optional shape tuple, to be specified if you would
            like to use a model with an input img resolution that is not
            (224, 224, 3).
        alpha: Controls the width of the network. This is known as the
            width multiplier in the MobileNet paper.
            - If `alpha` < 1.0, proportionally decreases the number
                of filters in each layer.
            - If `alpha` > 1.0, proportionally increases the number
                of filters in each layer.
            - If `alpha` = 1, default number of filters from the paper
                are used at each layer.
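        depth_multiplier: Depth multiplier for depthwise convolution, i.e. the
            number of depthwise output channels per input channel. Note that
            this is not the resolution multiplier of the MobileNet paper,
            which scales the input resolution (set here via `input_shape`).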
        dropout: Dropout rate.
        classes: Optional number of classes to classify images into.
    Returns:
        A Keras model instance.
    """

    img_input = layers.Input(shape=input_shape)

    x = layers.Conv2D(int(32 * alpha), (3, 3),
                      strides=(2, 2),
                      padding='same',
                      use_bias=False,
                      name='conv1')(img_input)
    x = layers.BatchNormalization(axis=-1, name='conv1_bn')(x)
    x = layers.ReLU(6., name='conv1_relu')(x)

    x = depthwise_conv_block(x, 64, alpha, depth_multiplier, block_id=1)

    x = depthwise_conv_block(x, 128, alpha, depth_multiplier, strides=(2, 2), block_id=2)
    x = depthwise_conv_block(x, 128, alpha, depth_multiplier, block_id=3)

    x = depthwise_conv_block(x, 256, alpha, depth_multiplier, strides=(2, 2), block_id=4)
    x = depthwise_conv_block(x, 256, alpha, depth_multiplier, block_id=5)

    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, strides=(2, 2), block_id=6)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=7)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=8)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=9)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=10)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, block_id=11)

    x = depthwise_conv_block(x, 1024, alpha, depth_multiplier, strides=(2, 2), block_id=12)
    x = depthwise_conv_block(x, 1024, alpha, depth_multiplier, block_id=13)

    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Reshape((1, 1, int(1024 * alpha)))(x)
    x = layers.Dropout(dropout, name='dropout')(x)

    x = layers.Conv2D(classes, (1, 1),
                      padding='same',
                      name='conv_preds')(x)
    x = layers.Reshape((classes,), name='reshape_2')(x)
    x = layers.Activation('softmax', name='act_softmax')(x)

    model = models.Model(img_input, x, name='mobilenet_v1')

    return model


# Create the MobileNet V1 model
model = MobileNetV1(input_shape=(224, 224, 3), classes=1000)

# Print the model summary
model.summary()

MobileNetV1 models of different sizes can be created by adjusting the alpha parameter:

custom_model = MobileNetV1(input_shape=(224, 224, 3), classes=10, alpha=0.75)
custom_model.summary()

This creates a slightly narrower MobileNet model (alpha=0.75) for a 10-class classification task.
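
As a quick illustration of what alpha does, the sketch below applies the same `int(filters * alpha)` rule used by the MobileNetV1 code above to the distinct layer widths of the network. The names `BASE_FILTERS`, `scaled_filters` and `pointwise_weights` are introduced here for illustration only.

```python
# Distinct channel widths used by the MobileNetV1 definition above
BASE_FILTERS = [32, 64, 128, 256, 512, 1024]


def scaled_filters(alpha):
    """Apply the int(filters * alpha) rule to every base width."""
    return [int(f * alpha) for f in BASE_FILTERS]


def pointwise_weights(in_ch, out_ch, alpha):
    """Weight count of a bias-free 1x1 convolution after width scaling."""
    return int(in_ch * alpha) * int(out_ch * alpha)


for alpha in (1.0, 0.75, 0.5, 0.25):
    print(alpha, scaled_filters(alpha))

# Relative size of a 512 -> 512 pointwise layer at alpha=0.75
shrink = pointwise_weights(512, 512, 0.75) / pointwise_weights(512, 512, 1.0)
print(f"pointwise weights at alpha=0.75: {shrink:.4f} of the original")
```

Because both the input and the output channels of a pointwise convolution are scaled, its weight count shrinks roughly with alpha squared rather than linearly.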

MobileNet V2

import tensorflow as tf
from tensorflow.keras import layers, models


def inverted_residual_block(inputs, filters, stride, expand_ratio, alpha):
    input_channels = inputs.shape[-1]
    pointwise_filters = int(filters * alpha)

    # Expansion phase (skipped when expand_ratio == 1, as in the first block of the paper)
    x = inputs
    if expand_ratio != 1:
        x = layers.Conv2D(int(input_channels * expand_ratio), kernel_size=1, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU(6.)(x)

    # Depthwise Convolution
    x = layers.DepthwiseConv2D(kernel_size=3, strides=stride, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(6.)(x)

    # Projection
    x = layers.Conv2D(pointwise_filters, kernel_size=1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)

    # Residual connection if possible
    if stride == 1 and input_channels == pointwise_filters:
        return layers.Add()([inputs, x])
    return x


def MobileNetV2(input_shape=(224, 224, 3), num_classes=1000, alpha=1.0, include_top=True):
    inputs = layers.Input(shape=input_shape)

    # First Convolution Layer
    x = layers.Conv2D(int(32 * alpha), kernel_size=3, strides=(2, 2), padding='same', use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(6.)(x)

    # Inverted Residual Blocks
    x = inverted_residual_block(x, filters=16, stride=1, expand_ratio=1, alpha=alpha)

    x = inverted_residual_block(x, filters=24, stride=2, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=24, stride=1, expand_ratio=6, alpha=alpha)

    x = inverted_residual_block(x, filters=32, stride=2, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=32, stride=1, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=32, stride=1, expand_ratio=6, alpha=alpha)

    x = inverted_residual_block(x, filters=64, stride=2, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=64, stride=1, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=64, stride=1, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=64, stride=1, expand_ratio=6, alpha=alpha)

    x = inverted_residual_block(x, filters=96, stride=1, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=96, stride=1, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=96, stride=1, expand_ratio=6, alpha=alpha)

    x = inverted_residual_block(x, filters=160, stride=2, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=160, stride=1, expand_ratio=6, alpha=alpha)
    x = inverted_residual_block(x, filters=160, stride=1, expand_ratio=6, alpha=alpha)

    x = inverted_residual_block(x, filters=320, stride=1, expand_ratio=6, alpha=alpha)

    # Last Convolution Layer
    x = layers.Conv2D(int(1280 * alpha), kernel_size=1, use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(6.)(x)

    if include_top:
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.Dense(num_classes, activation='softmax')(x)

    model = models.Model(inputs, x, name='MobileNetV2')
    return model


# Create the MobileNet V2 model
model = MobileNetV2(input_shape=(224, 224, 3), num_classes=1000)

# Print the model summary
model.summary()

MobileNet V3

Small version

import tensorflow as tf
from tensorflow.keras import layers, models

class HSwish(layers.Layer):
    def call(self, x):
        return x * tf.nn.relu6(x + 3) / 6

class HSigmoid(layers.Layer):
    def call(self, x):
        return tf.nn.relu6(x + 3) / 6

def squeeze_excite_block(inputs, se_ratio=0.25):
    x = layers.GlobalAveragePooling2D()(inputs)
    filters = inputs.shape[-1]
    x = layers.Dense(max(1, int(filters * se_ratio)), activation='relu')(x)
    x = layers.Dense(filters)(x)
    x = HSigmoid()(x)  # apply hard-sigmoid as its own layer instead of as a Dense activation
    x = layers.Reshape((1, 1, filters))(x)
    return layers.multiply([inputs, x])

def bneck(inputs, out_channels, exp_channels, kernel_size, stride, se_ratio, activation, alpha=1.0):
    x = layers.Conv2D(int(exp_channels * alpha), 1, padding='same', use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = activation(x)

    x = layers.DepthwiseConv2D(kernel_size, stride, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = activation(x)

    if se_ratio:
        x = squeeze_excite_block(x, se_ratio)

    x = layers.Conv2D(int(out_channels * alpha), 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)

    if stride == 1 and inputs.shape[-1] == int(out_channels * alpha):
        return layers.Add()([inputs, x])
    return x

def MobileNetV3Small(input_shape=(224, 224, 3), num_classes=1000, alpha=1.0, include_top=True):
    inputs = layers.Input(shape=input_shape)

    x = layers.Conv2D(16, 3, strides=2, padding='same', use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = HSwish()(x)

    x = bneck(x, 16, 16, 3, 2, 0.25, layers.ReLU(), alpha)
    x = bneck(x, 24, 72, 3, 2, None, layers.ReLU(), alpha)
    x = bneck(x, 24, 88, 3, 1, None, layers.ReLU(), alpha)
    x = bneck(x, 40, 96, 5, 2, 0.25, HSwish(), alpha)
    x = bneck(x, 40, 240, 5, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 40, 240, 5, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 48, 120, 5, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 48, 144, 5, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 96, 288, 5, 2, 0.25, HSwish(), alpha)
    x = bneck(x, 96, 576, 5, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 96, 576, 5, 1, 0.25, HSwish(), alpha)

    x = layers.Conv2D(int(576 * alpha), 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = HSwish()(x)

    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Reshape((1, 1, int(576 * alpha)))(x)

    x = layers.Conv2D(int(1024 * alpha), 1, padding='same')(x)
    x = HSwish()(x)

    if include_top:
        x = layers.Conv2D(num_classes, 1, padding='same', activation='softmax')(x)
        x = layers.Reshape((num_classes,))(x)

    model = models.Model(inputs, x, name='MobileNetV3Small')
    return model

# Create the MobileNet V3 Small model
model = MobileNetV3Small(input_shape=(224, 224, 3), num_classes=1000)

# Print the model summary
model.summary()

Large version

import tensorflow as tf
from tensorflow.keras import layers, models

class HSwish(layers.Layer):
    def call(self, x):
        return x * tf.nn.relu6(x + 3) / 6

class HSigmoid(layers.Layer):
    def call(self, x):
        return tf.nn.relu6(x + 3) / 6

def squeeze_excite_block(inputs, se_ratio=0.25):
    x = layers.GlobalAveragePooling2D()(inputs)
    filters = inputs.shape[-1]
    x = layers.Dense(max(1, int(filters * se_ratio)), activation='relu')(x)
    x = layers.Dense(filters)(x)
    x = HSigmoid()(x)  # apply hard-sigmoid as its own layer instead of as a Dense activation
    x = layers.Reshape((1, 1, filters))(x)
    return layers.multiply([inputs, x])

def bneck(inputs, out_channels, exp_channels, kernel_size, stride, se_ratio, activation, alpha=1.0):
    x = layers.Conv2D(int(exp_channels * alpha), 1, padding='same', use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = activation(x)

    x = layers.DepthwiseConv2D(kernel_size, stride, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = activation(x)

    if se_ratio:
        x = squeeze_excite_block(x, se_ratio)

    x = layers.Conv2D(int(out_channels * alpha), 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)

    if stride == 1 and inputs.shape[-1] == int(out_channels * alpha):
        return layers.Add()([inputs, x])
    return x

def MobileNetV3Large(input_shape=(224, 224, 3), num_classes=1000, alpha=1.0, include_top=True):
    inputs = layers.Input(shape=input_shape)

    x = layers.Conv2D(16, 3, strides=2, padding='same', use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = HSwish()(x)

    x = bneck(x, 16, 16, 3, 1, None, layers.ReLU(), alpha)
    x = bneck(x, 24, 64, 3, 2, None, layers.ReLU(), alpha)
    x = bneck(x, 24, 72, 3, 1, None, layers.ReLU(), alpha)
    x = bneck(x, 40, 72, 5, 2, 0.25, layers.ReLU(), alpha)
    x = bneck(x, 40, 120, 5, 1, 0.25, layers.ReLU(), alpha)
    x = bneck(x, 40, 120, 5, 1, 0.25, layers.ReLU(), alpha)
    x = bneck(x, 80, 240, 3, 2, None, HSwish(), alpha)
    x = bneck(x, 80, 200, 3, 1, None, HSwish(), alpha)
    x = bneck(x, 80, 184, 3, 1, None, HSwish(), alpha)
    x = bneck(x, 80, 184, 3, 1, None, HSwish(), alpha)
    x = bneck(x, 112, 480, 3, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 112, 672, 3, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 160, 672, 5, 2, 0.25, HSwish(), alpha)
    x = bneck(x, 160, 960, 5, 1, 0.25, HSwish(), alpha)
    x = bneck(x, 160, 960, 5, 1, 0.25, HSwish(), alpha)

    x = layers.Conv2D(int(960 * alpha), 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = HSwish()(x)

    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Reshape((1, 1, int(960 * alpha)))(x)

    x = layers.Conv2D(int(1280 * alpha), 1, padding='same')(x)
    x = HSwish()(x)

    if include_top:
        x = layers.Conv2D(num_classes, 1, padding='same', activation='softmax')(x)
        x = layers.Reshape((num_classes,))(x)

    model = models.Model(inputs, x, name='MobileNetV3Large')
    return model

# Create the MobileNet V3 Large model
model = MobileNetV3Large(input_shape=(224, 224, 3), num_classes=1000)

# Print the model summary
model.summary()

P182–EfficientNet [2019]

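Model structure and innovations

EfficientNet is a family of convolutional neural networks proposed by Google researchers in 2019, aimed at improving both model efficiency and accuracy. Its main characteristics are as follows:

Model structure

Built on the inverted residual structure of MobileNetV2 (the MBConv block)
Uses Squeeze-and-Excitation (SE) blocks
Adopts a compound scaling method

Innovations:

Proposes compound scaling, which scales the network's width, depth, and input resolution together through a single compound coefficient
Optimizes the baseline network structure (EfficientNet-B0) through neural architecture search (NAS)
Achieves higher accuracy under the same computational budget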
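
The compound scaling rule can be sketched in a few lines. The example below uses the base constants reported in the EfficientNet paper (1.2 for depth, 1.1 for width, 1.15 for resolution); the function names are chosen here for illustration. The coefficients listed for B1–B7 later in this post are the published, hand-rounded values and are not computed from these powers.

```python
# Base constants from the EfficientNet paper: depth, width, resolution
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """FLOPs grow roughly with depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{flops_factor(phi):.2f}")
```

The constants were chosen so that depth × width² × resolution² is close to 2, meaning that each increment of phi roughly doubles the computational cost.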

Model structure code

B0 version

import math

import matplotlib.pyplot as plt
import tensorflow as tf
from keras.utils import plot_model
from tensorflow.keras import layers, models

# Display Chinese characters on macOS
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']


def swish(x):
    return x * tf.nn.sigmoid(x)

def round_filters(filters, width_coefficient, divisor=8):
    """Scale a channel count and round it to a multiple of `divisor`."""
    filters *= width_coefficient
    new_filters = max(divisor, int(filters + divisor / 2) // divisor * divisor)
    # Make sure rounding does not reduce the width by more than 10%
    if new_filters < 0.9 * filters:
        new_filters += divisor
    return int(new_filters)

def round_repeats(repeats, depth_coefficient):
    """Scale the number of block repeats, rounding up so stages actually grow."""
    return int(math.ceil(depth_coefficient * repeats))

def se_block(inputs, reduced_channels):
    channels = inputs.shape[-1]
    x = layers.GlobalAveragePooling2D()(inputs)
    x = layers.Dense(reduced_channels, activation=swish)(x)
    x = layers.Dense(channels, activation='sigmoid')(x)
    # Reshape to (1, 1, C) so the channel weights broadcast over height and width
    x = layers.Reshape((1, 1, channels))(x)
    return layers.Multiply()([inputs, x])

def mbconv_block(inputs, out_channels, expand_ratio, stride, kernel_size, se_ratio):
    channels = inputs.shape[-1]
    x = inputs

    # Expansion phase
    if expand_ratio != 1:
        expand_channels = channels * expand_ratio
        x = layers.Conv2D(expand_channels, 1, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation(swish)(x)

    # Depthwise Conv
    x = layers.DepthwiseConv2D(kernel_size, stride, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation(swish)(x)

    # Squeeze and Excitation (bottleneck sized from the block's input channels)
    if se_ratio:
        x = se_block(x, max(1, int(channels * se_ratio)))

    # Output phase
    x = layers.Conv2D(out_channels, 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)

    # Residual connection only when input and output shapes match
    if stride == 1 and channels == out_channels:
        x = layers.Add()([inputs, x])

    return x

def efficientnet(width_coefficient, depth_coefficient, resolution, dropout_rate):
    base_architecture = [
        # expansion, channels, repeats, stride, kernel_size
        [1, 16, 1, 1, 3],
        [6, 24, 2, 2, 3],
        [6, 40, 2, 2, 5],
        [6, 80, 3, 2, 3],
        [6, 112, 3, 1, 5],
        [6, 192, 4, 2, 5],
        [6, 320, 1, 1, 3]
    ]

    inputs = layers.Input(shape=(resolution, resolution, 3))
    x = layers.Conv2D(round_filters(32, width_coefficient), 3, strides=2, padding='same', use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Activation(swish)(x)

    for expansion, channels, repeats, stride, kernel_size in base_architecture:
        channels = round_filters(channels, width_coefficient)
        repeats = round_repeats(repeats, depth_coefficient)

        # Only the first block of each stage uses the stage's stride
        for j in range(repeats):
            x = mbconv_block(x, channels, expansion, stride if j == 0 else 1, kernel_size, se_ratio=0.25)

    x = layers.Conv2D(round_filters(1280, width_coefficient), 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation(swish)(x)

    x = layers.GlobalAveragePooling2D()(x)
    if dropout_rate > 0:
        x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(1000, activation='softmax')(x)

    model = tf.keras.Model(inputs, outputs)
    return model

# EfficientNet-B0 configuration
def efficientnet_b0():
    return efficientnet(
        width_coefficient=1.0,
        depth_coefficient=1.0,
        resolution=224,
        dropout_rate=0.2
    )

# Create the model
model_b0 = efficientnet_b0()

# Print model summary
model_b0.summary()

# Write the model structure to a PDF (requires pydot and Graphviz)
plot_model(model_b0, to_file='model_b0.pdf', show_shapes=True,
           show_layer_names=True)

B1–B7 versions

def efficientnet_b1():
    return efficientnet(width_coefficient=1.0, depth_coefficient=1.1, resolution=240, dropout_rate=0.2)

def efficientnet_b2():
    return efficientnet(width_coefficient=1.1, depth_coefficient=1.2, resolution=260, dropout_rate=0.3)

def efficientnet_b3():
    return efficientnet(width_coefficient=1.2, depth_coefficient=1.4, resolution=300, dropout_rate=0.3)

def efficientnet_b4():
    return efficientnet(width_coefficient=1.4, depth_coefficient=1.8, resolution=380, dropout_rate=0.4)

def efficientnet_b5():
    return efficientnet(width_coefficient=1.6, depth_coefficient=2.2, resolution=456, dropout_rate=0.4)

def efficientnet_b6():
    return efficientnet(width_coefficient=1.8, depth_coefficient=2.6, resolution=528, dropout_rate=0.5)

def efficientnet_b7():
    return efficientnet(width_coefficient=2.0, depth_coefficient=3.1, resolution=600, dropout_rate=0.5)