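Method Overview
Introduction
Pixel-level road crack detection has long been a challenging topic in intelligent transportation systems. Because of external factors such as weather and lighting, pavement cracks tend to show low contrast, poor continuity, and widely varying lengths and widths. Existing studies, however, pay little attention to crack data captured under different conditions. Meanwhile, new algorithms based on deep convolutional neural networks (DCNNs) have advanced state-of-the-art crack detection models, but to reach good performance they usually focus on complex models and neglect detection efficiency in practical applications. The author therefore proposes CarNet, a detection network with an olive-shaped structure.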
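Key Innovations
Olive-shaped encoder network
Two factors usually need to be considered in the encoder:
1. With the total number of convolutional layers in the encoder network held fixed, deeper network stages should contain fewer convolutional layers, which reduces the overall space complexity.
2. To reduce the overall time complexity of the encoder network, the initial and final stages should contain fewer convolutional layers, while the middle stages can hold more.
With both in mind, the author uses a small number of strided convolutions in the initial stage to compress the resolution of the input image, and then reduces the number of convolutional layers as the stages get deeper. As a result, the number of convolutional layers across the encoder's stages forms an olive shape. The encoder is built from two kinds of modules: the downsampling block (DB) and the residual block (RB).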
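To make the first consideration concrete, the sketch below counts the weights in the residual blocks for two ways of distributing the same 15 blocks across stages 2 to 4. The channel widths (64, 128, 256) and the 7/6/2 split are read off the encoder code later in this article. The figure of two 3 × 3 convolutions per residual block is an assumption borrowed from the ResNet-34 basic block and is not confirmed by the source; biases and batch-norm parameters are ignored, and the reversed 2/6/7 split is purely hypothetical.

```python
# Rough weight count for residual blocks, assuming two 3x3 convolutions per
# block with equal input and output channels (ResNet basic-block style).
# Biases and batch-norm parameters are ignored.

def residual_block_weights(channels, convs_per_block=2, kernel=3):
    """Weights in one residual block with `channels` in and out."""
    return convs_per_block * kernel * kernel * channels * channels


def encoder_weights(blocks_per_stage, stage_channels):
    """Total residual-block weights for a given blocks-per-stage allocation."""
    return sum(n * residual_block_weights(c)
               for n, c in zip(blocks_per_stage, stage_channels))


stage_channels = [64, 128, 256]  # stages 2, 3, 4
front_loaded = [7, 6, 2]         # allocation used by the encoder in this article
back_loaded = [2, 6, 7]          # hypothetical: same 15 blocks, deepest stage heaviest

w_front = encoder_weights(front_loaded, stage_channels)
w_back = encoder_weights(back_loaded, stage_channels)
print(w_front, w_back)  # 4644864 10174464
```

In this simplified model, keeping the widest stage shallow cuts the residual-block weights by more than half for the same number of blocks.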
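Upsampling method
Multi-scale features and the feature upsampling method are two important factors affecting the performance and efficiency of the decoder network, so a new upsampling module is introduced: the upsampling feature pyramid block (UFPB).
Lightweight multi-scale module
In the multi-scale fusion module, upsampling features directly with small-kernel transposed convolutions (deconvolutions) tends to produce a grid effect on test images. To address this, the author combines a feature refinement module with the small-kernel transposed convolution. Because cracks are mostly linear structures, features are refined in the decoder by a decomposed convolution block (DCB). This block contains two cascaded pairs of decomposed convolutions (each pair being one 3 × 1 and one 1 × 3 convolution).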
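The grid effect can be seen without training anything. The pure-Python sketch below counts, for a 1-D transposed convolution, how many input positions contribute to each output position. The helper name `transposed_conv_coverage` is made up for this illustration, and the output-length formula is the one documented for PyTorch's `ConvTranspose2d` with dilation 1. The two settings tried are the ones used by the decoder code later in this article (kernel 3 with stride 2 and with stride 4).

```python
def transposed_conv_coverage(n_in, kernel, stride, padding=0, output_padding=0):
    """Count how many input positions write to each output position (1-D, dilation 1)."""
    n_out = (n_in - 1) * stride - 2 * padding + (kernel - 1) + output_padding + 1
    coverage = [0] * n_out
    for i in range(n_in):
        for k in range(kernel):
            o = i * stride - padding + k  # output index touched by input i, tap k
            if 0 <= o < n_out:
                coverage[o] += 1
    return coverage


# Kernel 3, stride 2: contributions alternate between one and two inputs.
cov_stride2 = transposed_conv_coverage(4, kernel=3, stride=2, padding=1, output_padding=1)
print(cov_stride2)  # [1, 2, 1, 2, 1, 2, 1, 1]

# Kernel 3, stride 4: some output positions receive no input at all.
cov_stride4 = transposed_conv_coverage(2, kernel=3, stride=4, padding=1, output_padding=3)
print(cov_stride4)  # [1, 1, 0, 1, 1, 1, 0, 0]
```

When the kernel is smaller than the stride, some output positions are produced from the bias alone, which gives a periodic pattern. This is one plausible reading of why a refinement block is placed after the small-kernel transposed convolution.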
Decoder
Model evaluation
Module breakdown
Downsampling block
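import torch
import torch.nn as nn
import torch.nn.functional as F


class DownsamplerBlock(nn.Module):
    def __init__(self, ninput, noutput):
        super().__init__()
        # The conv supplies noutput - ninput channels; pooling passes ninput through.
        self.conv = nn.Conv2d(ninput, noutput - ninput, (3, 3), stride=2, padding=1, bias=True)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.bn = nn.BatchNorm2d(noutput, eps=1e-3)

    def forward(self, input):
        # Both branches halve H and W (for even sizes), so they concatenate on channels.
        output = torch.cat([self.conv(input), self.pool(input)], 1)
        output = self.bn(output)
        return F.relu(output)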
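The concatenation in `forward` only works when both branches produce the same spatial size. The sketch below applies the standard output-size formulas for convolution and max pooling (floor mode, dilation 1) to the two branches. The helper names are illustrative and not part of the original code.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a convolution along one dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output length of max pooling along one dimension (no padding, floor mode)."""
    return (size - kernel) // stride + 1


def downsampler_shapes(ninput, noutput, size):
    """Return (conv_channels, pool_channels, conv_size, pool_size)."""
    return noutput - ninput, ninput, conv_out(size), pool_out(size)


even = downsampler_shapes(3, 16, 512)
odd = downsampler_shapes(3, 16, 513)
print(even)  # (13, 3, 256, 256)
print(odd)   # (13, 3, 257, 256)
```

For an even input the two branches agree and the channels add up to `noutput`. For an odd input the sizes differ by one and `torch.cat` would fail, so an input passing through four such blocks should have a height and width divisible by 16.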
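Encoder
The original author uses ResNet-34 for the residual part. The `BasicBlock_encoder` class referenced in the code is not shown in this article.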
class Encoder_v0_762(nn.Module):
    def __init__(self, num_classes=1,
                 dp=DownsamplerBlock,
                 block=BasicBlock_encoder,
                 channels=[3, 16, 64, 128, 256],
                 dropprob=0.1, rates=[1, 1, 1, 1, 1, 1],
                 predict=False):
        super().__init__()
        self.predict = predict
        # Stage 1: a single downsampling block.
        self.stage_1_0 = dp(channels[0], channels[1])
        # Stage 2: downsampling followed by seven residual blocks.
        self.stage_2_0 = dp(channels[1], channels[2])
        self.stage_2_1 = block(channels[2], dropprob, rates[0])
        self.stage_2_2 = block(channels[2], dropprob, rates[1])
        self.stage_2_3 = block(channels[2], dropprob, rates[2])
        self.stage_2_4 = block(channels[2], dropprob, rates[3])
        self.stage_2_5 = block(channels[2], dropprob, rates[1])
        self.stage_2_6 = block(channels[2], dropprob, rates[2])
        self.stage_2_7 = block(channels[2], dropprob, rates[3])
        # Stage 3: downsampling followed by six residual blocks.
        self.stage_3_0 = dp(channels[2], channels[3])
        self.stage_3_1 = block(channels[3], dropprob, rates[1])
        self.stage_3_2 = block(channels[3], dropprob, rates[2])
        self.stage_3_3 = block(channels[3], dropprob, rates[3])
        self.stage_3_4 = block(channels[3], dropprob, rates[1])
        self.stage_3_5 = block(channels[3], dropprob, rates[2])
        self.stage_3_6 = block(channels[3], dropprob, rates[3])
        # Stage 4: downsampling followed by two residual blocks.
        self.stage_4_0 = dp(channels[3], channels[4])
        self.stage_4_1 = block(channels[4], dropprob, rates[4])
        self.stage_4_2 = block(channels[4], dropprob, rates[5])

    def forward(self, input):
        stage_1_0 = self.stage_1_0(input)

        stage_2_0 = self.stage_2_0(stage_1_0)
        stage_2_1 = self.stage_2_1(stage_2_0)
        stage_2_2 = self.stage_2_2(stage_2_1)
        stage_2_3 = self.stage_2_3(stage_2_2)
        stage_2_4 = self.stage_2_4(stage_2_3)
        stage_2_5 = self.stage_2_5(stage_2_4)
        stage_2_6 = self.stage_2_6(stage_2_5)
        stage_2_last = self.stage_2_7(stage_2_6)

        stage_3_0 = self.stage_3_0(stage_2_last)
        stage_3_1 = self.stage_3_1(stage_3_0)
        stage_3_2 = self.stage_3_2(stage_3_1)
        stage_3_3 = self.stage_3_3(stage_3_2)
        stage_3_4 = self.stage_3_4(stage_3_3)
        stage_3_5 = self.stage_3_5(stage_3_4)
        stage_3_last = self.stage_3_6(stage_3_5)

        stage_4_0 = self.stage_4_0(stage_3_last)
        stage_4_1 = self.stage_4_1(stage_4_0)
        stage_4_last = self.stage_4_2(stage_4_1)

        # Return the last feature map of every stage for the decoder.
        output = [stage_1_0, stage_2_last, stage_3_last, stage_4_last]
        return output
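As a quick check of the olive shape, the sketch below tallies the blocks that `Encoder_v0_762` instantiates in each stage. `BasicBlock_encoder` is not shown in this article, so the figure of two convolutions per residual block is an assumption borrowed from the ResNet-34 basic block rather than something the source confirms.

```python
# (downsampling blocks, residual blocks) per stage, read off Encoder_v0_762.
stages = {
    "stage_1": (1, 0),
    "stage_2": (1, 7),
    "stage_3": (1, 6),
    "stage_4": (1, 2),
}

CONVS_PER_DB = 1  # DownsamplerBlock has a single 3x3 convolution
CONVS_PER_RB = 2  # assumption: ResNet-style basic block, not confirmed by the source

convs_per_stage = {
    name: n_db * CONVS_PER_DB + n_rb * CONVS_PER_RB
    for name, (n_db, n_rb) in stages.items()
}
total_convs = sum(convs_per_stage.values())

print(convs_per_stage)  # {'stage_1': 1, 'stage_2': 15, 'stage_3': 13, 'stage_4': 5}
print(total_convs)      # 34
```

Under that assumption most layers sit in stages 2 and 3, with a thin first stage and a short last stage, and the total of 34 would be consistent with the name `CarNet34`.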
DCB
For this part the author provides code for three cases, with kernel equal to 3, 5, and 7.
class non_bottleneck_1d_2(nn.Module):
    def __init__(self, chann, dropprob, kernel, dilated, encoder_stage=False, last=True):
        super().__init__()
        self.encoder_stage = encoder_stage
        # Each branch builds two (k x 1, 1 x k) pairs; only the second pair is dilated.
        if kernel == 3:
            self.conv1_1 = nn.Conv2d(chann, chann, (3, 1), stride=1, padding=(1, 0), bias=True)
            self.conv1_2 = nn.Conv2d(chann, chann, (1, 3), stride=1, padding=(0, 1), bias=True)
            self.conv2_1 = nn.Conv2d(chann, chann, (3, 1), stride=1, padding=(1 * dilated, 0), bias=True,
                                     dilation=(dilated, 1))
            self.conv2_2 = nn.Conv2d(chann, chann, (1, 3), stride=1, padding=(0, 1 * dilated), bias=True,
                                     dilation=(1, dilated))
        elif kernel == 5:
            self.conv1_1 = nn.Conv2d(chann, chann, (5, 1), stride=1, padding=(2, 0), bias=True)
            self.conv1_2 = nn.Conv2d(chann, chann, (1, 5), stride=1, padding=(0, 2), bias=True)
            self.conv2_1 = nn.Conv2d(chann, chann, (5, 1), stride=1, padding=(2 * dilated, 0), bias=True,
                                     dilation=(dilated, 1))
            self.conv2_2 = nn.Conv2d(chann, chann, (1, 5), stride=1, padding=(0, 2 * dilated), bias=True,
                                     dilation=(1, dilated))
        elif kernel == 7:
            self.conv1_1 = nn.Conv2d(chann, chann, (7, 1), stride=1, padding=(3, 0), bias=True)
            self.conv1_2 = nn.Conv2d(chann, chann, (1, 7), stride=1, padding=(0, 3), bias=True)
            self.conv2_1 = nn.Conv2d(chann, chann, (7, 1), stride=1, padding=(3 * dilated, 0), bias=True,
                                     dilation=(dilated, 1))
            self.conv2_2 = nn.Conv2d(chann, chann, (1, 7), stride=1, padding=(0, 3 * dilated), bias=True,
                                     dilation=(1, dilated))
        self.bn1 = nn.BatchNorm2d(chann, eps=1e-03)
        self.bn2 = nn.BatchNorm2d(chann, eps=1e-03)
        self.dropout = nn.Dropout2d(dropprob)
        self.last = last

    def forward(self, input):
        # ReLU and the residual connection are applied only when encoder_stage is True.
        output = self.conv1_1(input)
        if self.encoder_stage:
            output = F.relu(output)
        output = self.conv1_2(output)
        output = self.bn1(output)
        if self.encoder_stage:
            output = F.relu(output)

        output = self.conv2_1(output)
        if self.encoder_stage:
            output = F.relu(output)
        output = self.conv2_2(output)
        # The second batch norm is skipped when this is the last block.
        if not self.last:
            output = self.bn2(output)

        if self.dropout.p != 0:
            output = self.dropout(output)

        if self.encoder_stage:
            return F.relu(output + input)  # +input = identity (residual connection)
        else:
            return output
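A k × 1 convolution followed by a 1 × k convolution covers the same k × k neighbourhood as a single k × k convolution but with fewer weights. The sketch below compares the two for the three kernel sizes supported above. The comparison against a full k × k convolution is added here for illustration and is not a result reported by the source; biases are ignored, and 32 channels matches `compress_channels` in the model code later in this article.

```python
def full_conv_weights(channels, k):
    """Weights of one k x k convolution with `channels` in and out (no bias)."""
    return k * k * channels * channels


def factorized_pair_weights(channels, k):
    """Weights of a k x 1 convolution followed by a 1 x k convolution (no bias)."""
    return 2 * k * channels * channels


channels = 32  # compress_channels in the decoder of this article
savings = {}
for k in (3, 5, 7):
    full = full_conv_weights(channels, k)
    pair = factorized_pair_weights(channels, k)
    savings[k] = (full, pair, pair / full)
    print(k, full, pair, round(pair / full, 3))
# 3 9216 6144 0.667
# 5 25600 10240 0.4
# 7 50176 14336 0.286
```

The ratio is 2/k, so the saving grows with the kernel size, which makes the larger 5 and 7 variants cheaper than they would be as full square kernels.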
Overall model structure
class CarNet34(nn.Module):
    def __init__(self,
                 Encoder=Encoder_v0_762,
                 dp=DownsamplerBlock,
                 block=BasicBlock_encoder,
                 num_classes=1,
                 channels=[3, 16, 64, 128, 256],
                 dropprob=[0, 0],
                 rates=[1, 1, 1, 1, 1, 1],
                 kernels=[3, 3, 3],
                 predict=False,
                 decoder_block=non_bottleneck_1d_2):
        super().__init__()
        self.encoder = Encoder(num_classes=num_classes, dp=dp, block=block,
                               channels=channels, dropprob=dropprob[0], rates=rates, predict=predict)

        compress_channels = 32  ### channels[2] // 2

        # Stage 4 features: compress channels, then upsample 4x to the stage 2 resolution.
        self.conv1x1_1 = nn.Conv2d(channels[4], compress_channels, (1, 1))
        self.deconv_1 = nn.ConvTranspose2d(compress_channels, compress_channels, kernel_size=3,
                                           stride=4, padding=1, output_padding=3, bias=True)
        # Stage 3 features: compress channels, then upsample 2x to the stage 2 resolution.
        self.conv1x1_2 = nn.Conv2d(channels[3], compress_channels, (1, 1))
        self.deconv_2 = nn.ConvTranspose2d(compress_channels, compress_channels, kernel_size=3,
                                           stride=2, padding=1, output_padding=1, bias=True)
        # Stage 2 features: compress channels only.
        self.conv1x1_3 = nn.Conv2d(channels[2], compress_channels, (1, 1))

        # Refine the fused features, project to class scores, upsample 4x, refine again.
        self.conv_1 = decoder_block(compress_channels, dropprob=dropprob[1], kernel=kernels[1],
                                    dilated=rates[0], encoder_stage=False, last=True)
        self.conv1x1_4 = nn.Conv2d(compress_channels, num_classes, (1, 1))
        self.deconv_3 = nn.ConvTranspose2d(num_classes, num_classes, kernel_size=3,
                                           stride=4, padding=1, output_padding=3, bias=True)
        self.conv_2 = decoder_block(num_classes, dropprob=dropprob[1], kernel=kernels[2],
                                    dilated=rates[0], encoder_stage=False, last=True)

    def forward(self, input):
        encoder = self.encoder(input)
        ### Outputs of the different encoder stages (stage_1 is not used by the decoder)
        stage_1, stage_2, stage_3, stage_4 = encoder

        stage_4 = self.conv1x1_1(stage_4)
        s_4to2 = self.deconv_1(stage_4)

        stage_3 = self.conv1x1_2(stage_3)
        s_3to2 = self.deconv_2(stage_3)

        s2 = self.conv1x1_3(stage_2)

        # Fuse the three scales by element-wise addition at the stage 2 resolution.
        s_4_3_2 = self.conv_1(s_4to2 + s_3to2 + s2)
        s_4_3_2 = self.conv1x1_4(s_4_3_2)

        s = self.deconv_3(s_4_3_2)
        s = self.conv_2(s)
        return s
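The element-wise sum in `forward` requires the three feature maps to share one resolution. The sketch below traces the spatial size through the model with plain arithmetic, using the output-length formula documented for PyTorch's `ConvTranspose2d` (dilation 1). The helper names are illustrative, and the input side is assumed to be divisible by 16 so that each of the four downsampling blocks halves it exactly.

```python
def deconv_out(size, kernel, stride, padding, output_padding):
    """Output length of ConvTranspose2d along one dimension (dilation 1)."""
    return (size - 1) * stride - 2 * padding + (kernel - 1) + output_padding + 1


def carnet_sizes(size):
    """Trace the spatial size through the encoder stages and the decoder."""
    s1, s2, s3, s4 = (size // 2 ** i for i in range(1, 5))  # four stride-2 stages
    s_4to2 = deconv_out(s4, kernel=3, stride=4, padding=1, output_padding=3)
    s_3to2 = deconv_out(s3, kernel=3, stride=2, padding=1, output_padding=1)
    fused = s2  # s_4to2 + s_3to2 + s2 requires all three to match
    out = deconv_out(fused, kernel=3, stride=4, padding=1, output_padding=3)
    return {"stages": (s1, s2, s3, s4), "s_4to2": s_4to2, "s_3to2": s_3to2, "out": out}


sizes = carnet_sizes(512)
print(sizes)
# {'stages': (256, 128, 64, 32), 's_4to2': 128, 's_3to2': 128, 'out': 512}
```

For a 512 × 512 input, stages 3 and 4 are both brought to the 128 × 128 resolution of stage 2 before fusion, and the final 4× transposed convolution restores the input resolution.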