【DeepLab Series】Building the ASPP (Atrous Spatial Pyramid Pooling) Module of DeepLabV3

1 Introduction

From the DeepLabV3 paper: "ASPP is inspired by the success of spatial pyramid pooling, which showed that it is effective to resample features at different scales for accurately and efficiently classifying regions of an arbitrary scale."

In short, ASPP is inspired by spatial pyramid pooling (SPP): resampling features at several scales (via convolutions with different atrous rates) lets the network classify regions of arbitrary scale both accurately and efficiently.
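To make "different scales" concrete: a 3x3 atrous convolution with rate d covers an effective window of 3 + 2(d - 1) pixels, so the three 3x3 branches of ASPP (rates 6, 12, 18) look at 13x13, 25x25 and 37x37 neighborhoods while each keeps only 9 weights. A tiny illustrative calculation (for intuition only, not part of the paper's code):

# Effective kernel size of a 3x3 atrous convolution: k_eff = k + (k - 1) * (d - 1)
k = 3
for d in [6, 12, 18]:
    print('rate', d, '-> effective window', k + (k - 1) * (d - 1))  # 13, 25, 37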

2 Implementation

2.1 ASPP module diagram

Note: the figure below is taken from the DeepLabV3 paper, "Rethinking Atrous Convolution for Semantic Image Segmentation".
(Figure: ASPP module structure)

2.2 Code


  • Note: this code follows DeepLabV3+, where the DCNN's Exit Flow outputs 2048 channels, so the module is instantiated with in_channels=2048 below; adjust the channel counts when using a different backbone.
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP_module(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(ASPP_module, self).__init__()
        # Dilation rates from the figure: 1, 6, 12, 18 (a rate of 1 is an ordinary convolution)
        dilations = [1, 6, 12, 18]

        self.Aspp1 = nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=(1,1),padding=0,
                      dilation=dilations[0], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU())

        self.Aspp2 = nn.Sequential(
            nn.Conv2d(in_channels = in_channels, out_channels=out_channels, kernel_size=(3,3),
                      # padding equals dilation so that the 3x3 kernel keeps the input's spatial size (see Section 3)
                      padding=dilations[1], dilation=dilations[1], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
        self.Aspp3 = nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=(3, 3),
                      padding=dilations[2], dilation=dilations[2], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
        self.Aspp4 = nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=(3, 3),
                      padding=dilations[3], dilation=dilations[3], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
        self.global_avg_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d((1, 1)),
            # Image-level pooling branch; in DeepLabV3+ the backbone feeds 2048 channels into ASPP
            nn.Conv2d(in_channels, out_channels, kernel_size=(1, 1), stride=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
        # Concatenating the five branches gives 5 * out_channels channels (1280 here);
        # a 1x1 convolution reduces them back to out_channels
        self.conv1 = nn.Conv2d(5 * out_channels, out_channels, kernel_size=(1, 1), bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        # Initialize convolution weights
        self._init_weight()

    def forward(self, x):
        x1 = self.Aspp1(x); print("X1.shape", x1.size())
        x2 = self.Aspp2(x); print("X2.shape", x2.size())
        x3 = self.Aspp3(x); print("X3.shape", x3.size())
        x4 = self.Aspp4(x); print("X4.shape", x4.size())
        x5 = self.global_avg_pool(x); print('x5.shape', x5.size())
        # Bilinearly upsample x5 back to the spatial size of the other branches before concatenation
        x5 = F.interpolate(x5, size=x4.size()[2:], mode='bilinear', align_corners=True)
        cat = torch.cat((x1, x2, x3, x4, x5), dim=1);print('cat.shape', cat.size())
        
        # The 1x1 convolution reduces the concatenated channels back to out_channels
        output = self.conv1(cat)
        output = self.bn1(output);print('output.shape', output.size())
        return output

    def _init_weight(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))  # He (Kaiming) initialization
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

if __name__ == '__main__':
    # Quick check of the output shapes
    aspp = ASPP_module(2048, 256)
    input = torch.randn(2, 2048, 176, 240);print('input_size:', input.size())
    out = aspp(input)
  • Output:
input_size: torch.Size([2, 2048, 176, 240])
X1.shape torch.Size([2, 256, 176, 240])
X2.shape torch.Size([2, 256, 176, 240])
X3.shape torch.Size([2, 256, 176, 240])
X4.shape torch.Size([2, 256, 176, 240])
x5.shape torch.Size([2, 256, 1, 1])
cat.shape torch.Size([2, 1280, 176, 240])
output.shape torch.Size([2, 256, 176, 240])
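
For context, here is a minimal, hypothetical sketch of how this ASPP output is typically consumed in a DeepLabV3-style head: a 1x1 classifier convolution produces per-class logits, which are then bilinearly upsampled to the input resolution. The class and variable names are illustrative only, not the reference implementation:

class SimpleSegHead(nn.Module):
    """Illustrative head: ASPP + 1x1 classifier + bilinear upsampling (names are hypothetical)."""
    def __init__(self, num_classes):
        super(SimpleSegHead, self).__init__()
        self.aspp = ASPP_module(2048, 256)                 # backbone assumed to output 2048 channels
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=(1, 1))

    def forward(self, features, image_size):
        x = self.aspp(features)                            # [N, 256, H/16, W/16] at output stride 16
        x = self.classifier(x)                             # [N, num_classes, H/16, W/16]
        # Upsample the logits back to the original image resolution
        return F.interpolate(x, size=image_size, mode='bilinear', align_corners=True)

# Example: a 512x512 image at output stride 16 gives 32x32 backbone features
head = SimpleSegHead(num_classes=21)
logits = head(torch.randn(2, 2048, 32, 32), image_size=(512, 512))
print(logits.size())                                       # torch.Size([2, 21, 512, 512])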

3 FAQ

3.1 Why must padding equal dilation in the ASPP sub-modules?

Prerequisite: basic familiarity with atrous (dilated) convolution.

self.Aspp2 = nn.Sequential(
    nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=(3, 3),
              # padding equals dilation so that the 3x3 kernel keeps the input's spatial size
              padding=dilations[1], dilation=dilations[1], bias=False),
    nn.BatchNorm2d(out_channels),
    nn.ReLU()
)

Explanation: Aspp2, Aspp3 and Aspp4 all use atrous (dilated) convolution with a 3x3 base kernel. For a convolution with kernel size k, stride 1, padding p and dilation d, the output spatial size is

    out = in + 2p - d(k - 1)

With k = 3 this becomes out = in + 2p - 2d, so choosing p = d gives out = in and every branch preserves the feature-map resolution. For other kernel sizes the padding must be recomputed (e.g. k = 5 would require p = 2d), as the quick check below confirms for the 3x3 case.
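A sanity check of this rule (an illustrative snippet, assuming the imports from Section 2): with a 3x3 kernel and stride 1, setting padding equal to dilation leaves the spatial size unchanged for every rate used in ASPP.

x = torch.randn(1, 8, 64, 64)
for d in [1, 6, 12, 18]:
    conv = nn.Conv2d(8, 8, kernel_size=(3, 3), padding=d, dilation=d, bias=False)
    print('dilation', d, '->', conv(x).size())   # always torch.Size([1, 8, 64, 64])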

