deeplabv3+源码之慢慢解析16 第四章network文件夹(1)backbone文件夹(b2)mobilenetv2.py--MobileNetV2类和mobilenet_v2函数

老王小可

已于 2024-05-06 16:46:18 修改

阅读量173

点赞数

分类专栏：技术文章标签：人工智能 deeplab v3+ 语义分割深度学习

于 2023-07-24 12:28:19 首次发布

本文链接：https://blog.csdn.net/xiaokeyoulile/article/details/131888623

版权

技术专栏收录该内容

46 篇文章 10 订阅

订阅专栏

系列文章目录（共五章33节已完结）

第一章deeplabv3+源码之慢慢解析根目录(1)main.py–get_argparser函数
第一章deeplabv3+源码之慢慢解析根目录(2)main.py–get_dataset函数
第一章deeplabv3+源码之慢慢解析根目录(3)main.py–validate函数
第一章deeplabv3+源码之慢慢解析根目录(4)main.py–main函数
第一章deeplabv3+源码之慢慢解析根目录(5)predict.py–get_argparser函数和main函数

第二章deeplabv3+源码之慢慢解析 datasets文件夹(1)voc.py–voc_cmap函数和download_extract函数
第二章deeplabv3+源码之慢慢解析 datasets文件夹(2)voc.py–VOCSegmentation类
第二章deeplabv3+源码之慢慢解析 datasets文件夹(3)cityscapes.py–Cityscapes类
第二章deeplabv3+源码之慢慢解析 datasets文件夹(4)utils.py–6个小函数

第三章deeplabv3+源码之慢慢解析 metrics文件夹stream_metrics.py–StreamSegMetrics类和AverageMeter类

第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(a1)hrnetv2.py–4个函数和可执行代码
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(a2)hrnetv2.py–Bottleneck类和BasicBlock类
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(a3)hrnetv2.py–StageModule类
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(a4)hrnetv2.py–HRNet类
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(b1)mobilenetv2.py–2个类和2个函数
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(b2)mobilenetv2.py–MobileNetV2类和mobilenet_v2函数
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(c1)resnet.py–2个基础函数，BasicBlock类和Bottleneck类
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(c2)resnet.py–ResNet类和10个不同结构的调用函数
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(d1)xception.py–SeparableConv2d类和Block类
第四章deeplabv3+源码之慢慢解析 network文件夹(1)backbone文件夹(d2)xception.py–Xception类和xception函数
第四章deeplabv3+源码之慢慢解析 network文件夹(2)_deeplab.py–ASPP相关的4个类和1个函数
第四章deeplabv3+源码之慢慢解析 network文件夹(3)_deeplab.py–DeepLabV3类，DeepLabHeadV3Plus类和DeepLabHead类
第四章deeplabv3+源码之慢慢解析 network文件夹(4)modeling.py–5个私有函数（4个骨干网，1个模型载入）
第四章deeplabv3+源码之慢慢解析 network文件夹(5)modeling.py–12个调用函数
第四章deeplabv3+源码之慢慢解析 network文件夹(6)utils.py–_SimpleSegmentationModel类和IntermediateLayerGetter类

第五章deeplabv3+源码之慢慢解析 utils文件夹(1)ext_transforms.py.py–2个翻转类和ExtCompose类
第五章deeplabv3+源码之慢慢解析 utils文件夹(2)ext_transforms.py.py–2个裁剪类和2个缩放类
第五章deeplabv3+源码之慢慢解析 utils文件夹(3)ext_transforms.py.py–旋转类，填充类，张量转化类和标准化类
第五章deeplabv3+源码之慢慢解析 utils文件夹(4)ext_transforms.py.py–ExtResize类，ExtColorJitter类，Lambda类和Compose类
第五章deeplabv3+源码之慢慢解析 utils文件夹(5)loss.py–FocalLoss类
第五章deeplabv3+源码之慢慢解析 utils文件夹(6)scheduler.py–PolyLR类
第五章deeplabv3+源码之慢慢解析 utils文件夹(7)utils.py–去标准化，momentum设定，标准化层锁定和路径创建
第五章deeplabv3+源码之慢慢解析 utils文件夹(8)visualizer.py–Visualizer类（完结）

MobileNetV2类

提示：MobileNetV2类是mobilenetv2.py中的核心，整合各个功能。

class MobileNetV2(nn.Module):
    def __init__(self, num_classes=1000, output_stride=8, width_mult=1.0, inverted_residual_setting=None, round_nearest=8):
        """
        MobileNet V2 main class

        Args:
            num_classes (int): Number of classes
            width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount
            inverted_residual_setting: Network structure
            round_nearest (int): Round the number of channels in each layer to be a multiple of this number
            Set to 1 to turn off rounding
        """
        super(MobileNetV2, self).__init__()
        block = InvertedResidual  #新建InvertedResidual块，即做bottleneck块。
        input_channel = 32
        last_channel = 1280
        self.output_stride = output_stride
        current_stride = 1
        if inverted_residual_setting is None:
            inverted_residual_setting = [
                # t, c, n, s #t就是扩张倍数expand_ratio，c是通道数，n是block块的重复个数，s就是stride。参数值来自于原论文。
                [1, 16, 1, 1],
                [6, 24, 2, 2],
                [6, 32, 3, 2],
                [6, 64, 4, 2],
                [6, 96, 3, 1],
                [6, 160, 3, 2],
                [6, 320, 1, 1],
            ]

        # only check the first element, assuming user knows t,c,n,s are required
        if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4:#检测输入参数个数。
            raise ValueError("inverted_residual_setting should be non-empty "
                             "or a 4-element list, got {}".format(inverted_residual_setting))

        # building first layer
        input_channel = _make_divisible(input_channel * width_mult, round_nearest)#调整输入通道个数为，接近input_channel的width_mult倍的round_neares的整数倍，本代码默认为8的倍数。
        self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest)#调整输出通道个数为，接近last_channel的max(1.0, width_mult)倍的round_neares的整数倍。
        features = [ConvBNReLU(3, input_channel, stride=2)] #输入RGB图像3通道，stride=2分辨率减半，主流的下采样已经很少用pooling了。
        current_stride *= 2#此处使用current_stride*2的方式实现分辨率逐步减半。
        dilation=1
        previous_dilation = 1

        # building inverted residual blocks
        for t, c, n, s in inverted_residual_setting:
            output_channel = _make_divisible(c * width_mult, round_nearest) #调整过程中的输出通道数。
            previous_dilation = dilation
            if current_stride == output_stride:   #如current_stride==8（因output_stride==8），则将stride重新调为1，此时dilation扩大s倍。
                stride = 1
                dilation *= s
            else:
                stride = s
                current_stride *= s  #如current_stride没有达到output_stride，则按不断扩大s方法继续降低分辨率。
            output_channel = int(c * width_mult)

            for i in range(n):
                if i==0:  #如果是该模块重复的第一个，则按参数stride, previous_dilation进行构建。
                    features.append(block(input_channel, output_channel, stride, previous_dilation, expand_ratio=t))
                else:  #如果是该模块的后续重复块，则按stride==1和dilation参数进行构建。
                    features.append(block(input_channel, output_channel, 1, dilation, expand_ratio=t))
                input_channel = output_channel
        # building last several layers
        features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1)) #原文是conv2d+avgpool+conv2d。本段代码直接用了ConvBNReLU块+没有pool（而是在forward用了一个mean函数）+线性分类层。网络上大多数实现都有一个avgpool(x)层。
        # make it nn.Sequential
        self.features = nn.Sequential(*features)

        # building classifier #此处把线性层单独命名为classifier，一是这个作用是分类，而是为了在forward过程中，在前面所有层和分类层中间插入一个mean函数。我个人觉得用avgpool可能更简单，体现网络整体性。
        self.classifier = nn.Sequential(
            nn.Dropout(0.2),
            nn.Linear(self.last_channel, num_classes),  #全连接层输出。
        )

        # weight initialization
        for m in self.modules(): #初始化权重设置，如果没有进行初始化设置，则各权重是随机的。
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.features(x)
        x = x.mean([2, 3])  #[batch_size,channels,h,w]是对输入数据长和宽即维度[2,3]求均值。
        x = self.classifier(x)  #即前面各层+mean均值+classifier线性分类层。
        return x

mobilenet_v2函数

1，文件开头的函数

def mobilenet_v2(pretrained=False, progress=True, **kwargs):
    """
    Constructs a MobileNetV2 architecture from
    `"MobileNetV2: Inverted Residuals and Linear Bottlenecks" <https://arxiv.org/abs/1801.04381>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    model = MobileNetV2(**kwargs)  #新建模型对象。
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls['mobilenet_v2'],
                                              progress=progress)
        model.load_state_dict(state_dict)  #如果选择预训练，则直接载入模型。
    return model