Aspp就是以不同的采样率采样特征图片,对于给定的输入以不同采样率的空洞卷积并行采样,将得到的结果在通道层面concat到一起,扩大通道数,然后再通过1*1的卷积将通道数降低到预期的数值。相当于以多个比例捕捉图像的上下文。
后附一个没有BN层的ASPP代码(PyTorch)
#without bn version
class ASPP(nn.Module):
def __init__(self, in_channel=512, depth=256):
super(ASPP,self).__init__()
self.mean = nn.AdaptiveAvgPool2d((1, 1)) #(1,1)means ouput_dim
self.conv = nn.Conv2d(in_channel, depth, 1, 1)
self.atrous_block1 = nn.Conv2d(in_channel, depth, 1, 1)
self.atrous_block6 = nn.Conv2d(in_channel, depth, 3, 1, padding=6, dilation=6)
self.atrous_block12 = nn.Conv2d(in_channel, depth, 3, 1, padding=12, dilation=12)
self.atrous_block18 = nn.Conv2d(in_channel, depth, 3, 1, padding=18, dilation=18)
self.conv_1x1_output = nn.Conv2d(depth * 5, depth, 1, 1)
def forward(self, x):
size = x.shape[2:]
image_features = self.mean(x)
image_features = self.conv(image_features)
image_features = F.upsample(image_features, size=size, mode='bilinear')
atrous_block1 = self.atrous_block1(x)
atrous_block6 = self.atrous_block6(x)
atrous_block12 = self.atrous_block12(x)
atrous_block18 = self.atrous_block18(x)
net = self.conv_1x1_output(torch.cat([image_features, atrous_block1, atrous_block6,
atrous_block12, atrous_block18], dim=1))
return net
引入了一个新的超参数 d,(d - 1) 的值则为塞入的空格数,假定原来的卷积核大小为 k,那么塞入了 (d - 1) 个空格后的卷积核大小 n 为:
进而,假定输入空洞卷积的大小为 i,步长 为 s ,空洞卷积后特征图大小 o 的计算公式为:
计算一下可以发现,不管是全局平均池化后上采样的输出,还是每个空洞卷积的输出,大小都是一样的,所以在通道维度可以concat