"PyTorch Image Classification" p6: MobileNet Network Architecture Explained
1 MobileNet v1 Network
Paper link: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
1.1 Depthwise Convolution (DW convolution)
DW convolution greatly reduces both the computation and the number of parameters:
- each convolution kernel has a depth (channel) of 1;
- in_channels = number of kernels = out_channels.
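In PyTorch a DW convolution is just nn.Conv2d with groups set equal to the number of input channels, so each 3×3 kernel sees a single channel. A minimal sketch (the channel count and input size below are made up for illustration):

import torch
from torch import nn

in_channels = 32  # illustrative value

# groups == in_channels: every kernel has depth 1, and
# the number of kernels equals the number of output channels
dw = nn.Conv2d(in_channels, in_channels, kernel_size=3,
               stride=1, padding=1, groups=in_channels, bias=False)

x = torch.randn(1, in_channels, 56, 56)
print(dw(x).shape)                               # torch.Size([1, 32, 56, 56])
print(sum(p.numel() for p in dw.parameters()))   # 32 * 1 * 3 * 3 = 288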
1.1.1 Depthwise Separable Convolution
A depthwise separable convolution consists of a DW (depthwise) convolution followed by a PW (pointwise, 1×1) convolution.
Standard convolution:
DW + PW convolution:
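To see where the savings come from, compare the parameter count of a single standard 3×3 convolution with that of the equivalent DW + PW pair (the channel counts below are illustrative):

from torch import nn

cin, cout = 32, 64  # illustrative channel counts

standard = nn.Conv2d(cin, cout, kernel_size=3, padding=1, bias=False)
dw = nn.Conv2d(cin, cin, kernel_size=3, padding=1, groups=cin, bias=False)
pw = nn.Conv2d(cin, cout, kernel_size=1, bias=False)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard))         # 3*3*32*64 = 18432
print(count(dw) + count(pw))   # 3*3*32 + 32*64 = 2336, roughly 8x fewer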
MobileNet Comparison to Popular Models
1.2 Additional hyperparameters α and β
α is the width multiplier, which scales the number of channels in every layer; β is the resolution multiplier, which scales the input image resolution. Both trade a small amount of accuracy for less computation.
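A minimal sketch of how the two multipliers act (the values below are illustrative, not settings from the paper):

alpha, beta = 0.75, 0.5  # illustrative width / resolution multipliers

base_channels = [32, 64, 128, 256]
print([int(c * alpha) for c in base_channels])   # [24, 48, 96, 192]

base_resolution = 224
print(int(base_resolution * beta))               # 112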
2 MobileNet v2 Network
Compared with MobileNet v1, the network achieves higher accuracy with fewer parameters.
Paper link: MobileNetV2: Inverted Residuals and Linear Bottlenecks
2.1 Inverted Residuals (inverted residual block)
An inverted residual block goes narrow → wide → narrow: a 1×1 convolution first expands the channels, a 3×3 DW convolution then filters them, and a final 1×1 convolution projects back down, which is the opposite of an ordinary residual block.
2.2 Linear Bottlenecks
For the last 1×1 convolution layer of the inverted residual block, a linear activation function is used instead of ReLU, because ReLU causes a large loss of information for low-dimensional features.
The shortcut (residual) branch exists only when stride = 1 and in_channels = out_channels (see the InvertedResidual class in section 3.1 below).
ReLU6 activation function:
f(x) = min(max(x, 0), 6)
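In PyTorch ReLU6 is available as nn.ReLU6 (or torch.nn.functional.relu6) and is equivalent to clamping the input to the range [0, 6]; the bounded output behaves better under low-precision inference than an unbounded ReLU. A quick check:

import torch
from torch import nn

x = torch.tensor([-3.0, 0.5, 4.0, 9.0])
print(nn.ReLU6()(x))             # 0.0, 0.5, 4.0, 6.0
print(torch.clamp(x, 0.0, 6.0))  # identical result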
MobileNet v2 network architecture parameters:
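In code, this table is usually encoded as a list of [t, c, n, s] entries (expansion factor, output channels, number of repeats, stride of the first repeat); the values below follow the MobileNetV2 paper and the torchvision implementation:

inverted_residual_setting = [
    # t, c, n, s
    [1, 16, 1, 1],
    [6, 24, 2, 2],
    [6, 32, 3, 2],
    [6, 64, 4, 2],
    [6, 96, 3, 1],
    [6, 160, 3, 2],
    [6, 320, 1, 1],
]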
2.3 MobileNet v3
Paper link: Searching for MobileNetV3
3 Course Code
3.1 model_v2.py
from torch import nn
import torch


def _make_divisible(ch, divisor=8, min_ch=None):
    """
    This function is taken from the original tf repo.
    It ensures that all layers have a channel number that is divisible by 8
    It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    """
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch
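# Example values (not part of the original file), computed from the rounding rule above:
#   _make_divisible(32 * 1.4)  -> 48   (44.8 rounded to the nearest multiple of 8)
#   _make_divisible(24 * 0.75) -> 24   (16 would be more than 10% below 18, so round up)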
# Conv2d + BatchNorm2d + ReLU6; when groups == in_channel this is a DW convolution
class ConvBNReLU(nn.Sequential):
    def __init__(self, in_channel, out_channel, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_channel, out_channel, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_channel),
            nn.ReLU6(inplace=True)
        )
# Inverted residual block: 1x1 expand -> 3x3 depthwise -> 1x1 linear projection
class InvertedResidual(nn.Module):
    def __init__(self, in_channel, out_channel, stride, expand_ratio):
        super(InvertedResidual, self).__init__()
        hidden_channel = in_channel * expand_ratio
        # shortcut only when the block keeps the spatial size and the channel count
        self.use_shortcut = stride == 1 and in_channel == out_channel

        layers = []
        if expand_ratio != 1:
            # 1x1 pointwise conv (expand)
            layers.append(ConvBNReLU(in_channel, hidden_channel, kernel_size=1))
        layers.extend([
            # 3x3 depthwise conv
            ConvBNReLU(hidden_channel, hidden_channel, stride=stride, groups=hidden_channel),
            # 1x1 pointwise conv (linear, no ReLU6)
            nn.Conv2d(hidden_channel, out_channel, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channel),
        ])
        self.conv = nn.Sequential(