deeplabv3+ Source Code, Slowly Explained 14: Chapter 4, network folder (1) backbone folder (a4) hrnetv2.py: the HRNet class

老王小可

Last modified 2023-09-11 13:24:38

Category: technical articles. Tags: artificial intelligence, deeplabV3+, semantic segmentation, deep learning

First published 2023-07-22 16:16:29

Article link: https://blog.csdn.net/xiaokeyoulile/article/details/131803676

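This article walks through the HRNet class in deeplabv3+: its initialization, its structure, its main functions, and the network levels it implements. The class uses basic image-processing operations such as convolution, batch normalization and activation functions, and it creates and fuses streams at different resolutions. Building on the basic blocks and modules defined earlier, it assembles a multi-resolution parallel architecture for image segmentation.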

Series contents: deeplabv3+ Source Code, Slowly Explained (five chapters, 33 sections, complete)

Chapter 1, root directory (1): main.py, the get_argparser function
Chapter 1, root directory (2): main.py, the get_dataset function
Chapter 1, root directory (3): main.py, the validate function
Chapter 1, root directory (4): main.py, the main function
Chapter 1, root directory (5): predict.py, the get_argparser and main functions

Chapter 2, datasets folder (1): voc.py, the voc_cmap and download_extract functions
Chapter 2, datasets folder (2): voc.py, the VOCSegmentation class
Chapter 2, datasets folder (3): cityscapes.py, the Cityscapes class
Chapter 2, datasets folder (4): utils.py, six small functions

Chapter 3, metrics folder: stream_metrics.py, the StreamSegMetrics and AverageMeter classes

Chapter 4, network folder (1) backbone folder (a1): hrnetv2.py, four functions and the executable code
Chapter 4, network folder (1) backbone folder (a2): hrnetv2.py, the Bottleneck and BasicBlock classes
Chapter 4, network folder (1) backbone folder (a3): hrnetv2.py, the StageModule class
Chapter 4, network folder (1) backbone folder (a4): hrnetv2.py, the HRNet class
Chapter 4, network folder (1) backbone folder (b1): mobilenetv2.py, two classes and two functions
Chapter 4, network folder (1) backbone folder (b2): mobilenetv2.py, the MobileNetV2 class and the mobilenet_v2 function
Chapter 4, network folder (1) backbone folder (c1): resnet.py, two basic functions, the BasicBlock and Bottleneck classes
Chapter 4, network folder (1) backbone folder (c2): resnet.py, the ResNet class and ten constructor functions for different architectures
Chapter 4, network folder (1) backbone folder (d1): xception.py, the SeparableConv2d and Block classes
Chapter 4, network folder (1) backbone folder (d2): xception.py, the Xception class and the xception function
Chapter 4, network folder (2): _deeplab.py, four ASPP-related classes and one function
Chapter 4, network folder (3): _deeplab.py, the DeepLabV3, DeepLabHeadV3Plus and DeepLabHead classes
Chapter 4, network folder (4): modeling.py, five private functions (four backbones, one model loader)
Chapter 4, network folder (5): modeling.py, twelve constructor functions
Chapter 4, network folder (6): utils.py, the _SimpleSegmentationModel and IntermediateLayerGetter classes

Chapter 5, utils folder (1): ext_transforms.py, two flip classes and the ExtCompose class
Chapter 5, utils folder (2): ext_transforms.py, two crop classes and two scale classes
Chapter 5, utils folder (3): ext_transforms.py, the rotation, padding, to-tensor and normalization classes
Chapter 5, utils folder (4): ext_transforms.py, the ExtResize, ExtColorJitter, Lambda and Compose classes
Chapter 5, utils folder (5): loss.py, the FocalLoss class
Chapter 5, utils folder (6): scheduler.py, the PolyLR class
Chapter 5, utils folder (7): utils.py, denormalization, momentum setting, freezing normalization layers and directory creation
Chapter 5, utils folder (8): visualizer.py, the Visualizer class (end of series)

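The HRNet class

With the theory and the building-block classes from the previous two sections in hand, the HRNet class itself is comparatively simple: it mostly assembles those pieces into the full network.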
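
Before reading the class, it may help to see the branch layout it is meant to build. The following is a minimal sketch in plain Python (no PyTorch required). The helper name `hrnet_branch_layout` and its `stem_stride` parameter are hypothetical and exist only for this illustration. It assumes, as in the class below, that the stem downsamples the input by 4, that stage s runs s parallel branches, and that each additional branch doubles the channels and halves the resolution.

```python
def hrnet_branch_layout(c=48, num_stages=4, stem_stride=4):
    """Return {stage: [(channels, stride relative to the input image), ...]}.

    Hypothetical helper for illustration only; it is not part of hrnetv2.py.
    Stage 1 is omitted because it is a single 256-channel Bottleneck stream.
    """
    layout = {}
    for stage in range(2, num_stages + 1):
        # Branch i has c * 2**i channels and is 2**i times smaller than branch 0.
        layout[stage] = [(c * 2 ** i, stem_stride * 2 ** i) for i in range(stage)]
    return layout


if __name__ == "__main__":
    for stage, branches in hrnet_branch_layout(c=48).items():
        print(f"stage {stage}: {branches}")
```

With `c=48` the last stage has four branches of 48, 96, 192 and 384 channels; with `c=32` the same pattern gives 32, 64, 128 and 256.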

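import torch
import torch.nn as nn
import torch.nn.functional as F

# Bottleneck and StageModule are the classes from hrnetv2.py covered in the two previous sections (a2 and a3).


class HRNet(nn.Module):
    # num_blocks is the number of StageModule repetitions in stages 2, 3 and 4. It is not the number of
    # branches: the branch count is set separately through output_branches=2/3/4 below. The default
    # [1, 4, 3] is the value in the original code and matches the 1/4/3 module counts of HRNet; changing
    # it changes the network depth and therefore the parameter names in any saved checkpoint.
    # c is the channel width of the highest-resolution branch: 48 here, 32 is also possible.
    def __init__(self, c=48, num_blocks=[1, 4, 3], num_classes=1000):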
        super(HRNet, self).__init__()

        # Stem: reduces the input image to a feature map that the main network can consume.
        # RGB (3 channels) to 64 channels; stride=2 halves height and width.
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(64, eps=1e-05, affine=True, track_running_stats=True)
        # stride=2 halves height and width again, so the stem output is 1/4 of the input size.
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(64, eps=1e-05, affine=True, track_running_stats=True)
        self.relu = nn.ReLU(inplace=True)

        # Stage 1:
        downsample = nn.Sequential(
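            # Stage 1 uses Bottleneck blocks, which expand the channels by 4, so the output is 64 * 4 = 256.
            # The downsample branch maps the 64-channel input to 256 channels so that the residual shortcut
            # matches the Bottleneck output.
            nn.Conv2d(64, 256, kernel_size=1, stride=1, bias=False),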
            nn.BatchNorm2d(256, eps=1e-05, affine=True, track_running_stats=True),
        )
        # Note that bottleneck module will expand the output channels according to the output channels*block.expansion
        bn_expansion = Bottleneck.expansion  # The channel expansion is set in the bottleneck class.
        self.layer1 = nn.Sequential(
            Bottleneck(64, 64, downsample=downsample),  # Input is 64 for first module connection
            Bottleneck(bn_expansion * 64, 64),
            Bottleneck(bn_expansion * 64, 64),
            Bottleneck(bn_expansion * 64, 64),  # Four Bottlenecks in total; the final output has 64 * 4 = 256 channels.
        )

        # Transition 1 - Creation of the first two branches (one full and one half resolution)
        # Need to transition into high resolution stream and mid resolution stream
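        # This closes stage 1 and opens stage 2. It is not a fusion layer: the stage 1 output is simply
        # split into two branches without any fusion.
        self.transition1 = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(256, c, kernel_size=3, stride=1, padding=1, bias=False),  # Keeps the resolution unchanged.
                nn.BatchNorm2d(c, eps=1e-05, affine=True, track_running_stats=True),
                nn.ReLU(inplace=True),
            ),
            # The nested (double) Sequential keeps the parameter names compatible with the official pretrained
            # weights. If you only train and save your own weights, a single Sequential is enough, but the
            # official pretrained weights can then no longer be loaded.
            nn.Sequential(nn.Sequential(
                # Doubles the channels and halves height and width (half resolution).
                nn.Conv2d(256, c * 2, kernel_size=3, stride=2, padding=1, bias=False),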
                nn.BatchNorm2d(c * 2, eps=1e-05, affine=True, track_running_stats=True),
                nn.ReLU(inplace=True),
            )),
        ])

        # Stage 2:
        number_blocks_stage2 = num_blocks[0]
        self.stage2 = nn.Sequential(
            *[StageModule(stage=2, output_branches=2, c=c) for _ in range(number_blocks_stage2)])

        # Transition 2  - Creation of the third branch (1/4 resolution)
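        # _make_transition_layers is the static method near the end of the class. It adds one
        # Conv2d + BatchNorm2d + ReLU block that doubles the channels again and halves height and width
        # again (1/4 of the highest-resolution branch).
        self.transition2 = self._make_transition_layers(c, transition_number=2)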

        # Stage 3:
        number_blocks_stage3 = num_blocks[1]  # number blocks you want to create before fusion
        self.stage3 = nn.Sequential(
            *[StageModule(stage=3, output_branches=3, c=c) for _ in range(number_blocks_stage3)])

        # Transition 3 - Creation of the fourth branch (1/8 resolution)
        # Same idea as transition 2; this one adds the 1/8-resolution branch.
        self.transition3 = self._make_transition_layers(c, transition_number=3)

        # Stage 4:
        number_blocks_stage4 = num_blocks[2]  # number blocks you want to create before fusion
        self.stage4 = nn.Sequential(
            # Stage 4 already has all four branches, so no further transition is needed after it.
            *[StageModule(stage=4, output_branches=4, c=c) for _ in range(number_blocks_stage4)])

        # Classifier (extra module if want to use for classification):
        # pool, reduce dimensionality, flatten, connect to linear layer for classification:
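        # This final classification head differs between tasks and between versions of the code. It is usually
        # a Conv2d, or Conv2d + normalization + ReLU, optionally followed by pooling and a linear layer.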
        out_channels = sum([c * 2 ** i for i in range(len(num_blocks)+1)])  # total output channels of HRNetV2
        pool_feature_map = 8
        self.bn_classifier = nn.Sequential(
            nn.Conv2d(out_channels, out_channels // 4, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels // 4, eps=1e-05, affine=True, track_running_stats=True),
            nn.ReLU(inplace=True),
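            # AdaptiveAvgPool2d(output_size) produces a pool_feature_map x pool_feature_map map for any input size.
            nn.AdaptiveAvgPool2d(pool_feature_map),
            nn.Flatten(),  # Flattens each sample into a 1-D feature vector.
            # Fully connected output layer: the input size must equal the flattened feature count.
            nn.Linear(pool_feature_map * pool_feature_map * (out_channels // 4), num_classes),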
        )

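    # After the fusion at the end of a stage, this creates a new transition layer (channels doubled, height
    # and width halved) whose output becomes the extra branch of the next stage. Stage 1 has no fusion layer
    # and its split into full and half resolution is written out separately as transition1, so it does not
    # use this method. After stage 2, it adds the 1/4-resolution branch for stage 3; after stage 3, it adds
    # the 1/8-resolution branch for stage 4. Stage 4 does not need it.
    @staticmethod
    def _make_transition_layers(c, transition_number):
        return nn.Sequential(
            # Doubles the channels and halves height and width.
            nn.Conv2d(c * (2 ** (transition_number - 1)), c * (2 ** transition_number), kernel_size=3, stride=2,
                      padding=1, bias=False),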
            nn.BatchNorm2d(c * (2 ** transition_number), eps=1e-05, affine=True,
                           track_running_stats=True),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):  # x is the input data.
        # Stem:
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)  # End of the stem defined above. The original comments in this part are already thorough.

        # Stage 1
        x = self.layer1(x)
        x = [trans(x) for trans in self.transition1]  # split to 2 branches, form a list.

        # Stage 2
        x = self.stage2(x)
        x.append(self.transition2(x[-1]))

        # Stage 3
        x = self.stage3(x)
        x.append(self.transition3(x[-1]))

        # Stage 4
        x = self.stage4(x)  # x is now the list of stage 4 outputs, one tensor per branch.

        # HRNetV2 Example: (follow paper, upsample via bilinear interpolation and to highest resolution size)
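        # Upsample to the size of the highest-resolution stream. Of the four branches x[0..3], x[0] keeps the
        # highest resolution and each following branch is half the size of the previous one.
        output_h, output_w = x[0].size(2), x[0].size(3)
        # Upsample each lower-resolution branch to that size.
        x1 = F.interpolate(x[1], size=(output_h, output_w), mode='bilinear', align_corners=False)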
        x2 = F.interpolate(x[2], size=(output_h, output_w), mode='bilinear', align_corners=False)
        x3 = F.interpolate(x[3], size=(output_h, output_w), mode='bilinear', align_corners=False)

        # Upsampling all the other resolution streams and then concatenate all (rather than adding/fusing like HRNetV1)
        x = torch.cat([x[0], x1, x2, x3], dim=1)
        x = self.bn_classifier(x)  # The concatenated features go straight into the final classification head.
        return x
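
The shape bookkeeping in `forward` can be checked by hand. The sketch below, again in plain Python, applies the standard convolution output-size formula, floor((n + 2p - k) / s) + 1, to the stem and the stride-2 transitions, and then reproduces the `out_channels` and `nn.Linear` input-size arithmetic from `__init__`. The function names `conv_out_size` and `hrnet_shapes` are hypothetical helpers written for this illustration. The sketch assumes that StageModule preserves each branch's channel count and spatial size, which is what the concatenation in `forward` relies on.

```python
def conv_out_size(n, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def hrnet_shapes(height, width, c=48, num_branches=4, pool=8):
    """Hypothetical helper: per-branch (channels, H, W) plus classifier sizes."""
    # Stem: two 3x3 convolutions with stride 2 and padding 1.
    h = conv_out_size(conv_out_size(height, stride=2), stride=2)
    w = conv_out_size(conv_out_size(width, stride=2), stride=2)

    branches = []
    for i in range(num_branches):
        branches.append((c * 2 ** i, h, w))
        # Each transition to a new branch is a 3x3 convolution with stride 2.
        h, w = conv_out_size(h, stride=2), conv_out_size(w, stride=2)

    # After upsampling, all branches are concatenated along the channel axis.
    out_channels = sum(ch for ch, _, _ in branches)
    # The 1x1 convolution reduces the channels by 4; pooling fixes the map at pool x pool.
    linear_in_features = pool * pool * (out_channels // 4)
    return branches, out_channels, linear_in_features


if __name__ == "__main__":
    branches, out_channels, linear_in = hrnet_shapes(512, 512, c=48)
    print(branches)      # [(48, 128, 128), (96, 64, 64), (192, 32, 32), (384, 16, 16)]
    print(out_channels)  # 720
    print(linear_in)     # 11520
```

For a 512 x 512 input and `c=48`, the concatenated tensor has 720 channels at 128 x 128, and the linear layer receives 8 * 8 * 180 = 11520 features regardless of the input size, because the adaptive pooling fixes the spatial size.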