一、fpn网络结构
resnet结构组成。
fpn(Feature Pyramid Networks)架构如上图所示,以resnet50作为backbone。通过四部分的卷积,得到C2-C5特征图。然后统一使用1x1卷积层来保持特征图的维度一致。从最上层开始,每层进行上采用过程,采用的方法是‘nearnest’,把上采样的结果与当前层的特征图进行拼接得到当前层的特征图。
fpn示意图,要在不同特征层上进行预测,论文中采用的是上述公式。k0初始值设为4,比如当前图像的高宽为112,则计算得到的k=3,在C3特征层上进行预测。
二、代码
class Decoder(nn.Module):
def __init__(self, C3_size, C4_size, C5_size, feature_size=256):
super(Decoder, self).__init__()
# upsample C5 to get P5 from the FPN paper
self.P5_1 = nn.Conv2d(C5_size, feature_size, kernel_size=1, stride=1, padding=0)
self.P5_upsampled = nn.Upsample(scale_factor=2, mode='nearest')
self.P5_2 = nn.Conv2d(feature_size, feature_size, kernel_size=3, stride=1, padding=1)
# add P5 elementwise to C4
self.P4_1 = nn.Conv2d(C4_size, feature_size, kernel_size=1, stride=1, padding=0)
self.P4_upsampled = nn.Upsample(scale_factor=2, mode='nearest')
self.P4_2 = nn.Conv2d(feature_size, feature_size, kernel_size=3, stride=1, padding=1)
# add P4 elementwise to C3
self.P3_1 = nn.Conv2d(C3_size, feature_size, kernel_size=1, stride=1, padding=0)
self.P3_upsampled = nn.Upsample(scale_factor=2, mode='nearest')
self.P3_2 = nn.Conv2d(feature_size, feature_size, kernel_size=3, stride=1, padding=1)
def forward(self, inputs):
C3, C4, C5 = inputs
P5_x = self.P5_1(C5)
P5_upsampled_x = self.P5_upsampled(P5_x)
P5_x = self.P5_2(P5_x)
P4_x = self.P4_1(C4)
P4_x = P5_upsampled_x + P4_x
P4_upsampled_x = self.P4_upsampled(P4_x)
P4_x = self.P4_2(P4_x)
P3_x = self.P3_1(C3)
P3_x = P3_x + P4_upsampled_x
P3_x = self.P3_2(P3_x)
return [P3_x, P4_x, P5_x]