Focal Loss for Dense Object Detection
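The previous article briefly introduced Focal Loss, which is now widely used. Beyond the loss itself, I think the RetinaNet detector proposed in the same paper deserves attention. After reading the code, I focus here on the details of FPN and the classification/regression subnetworks.
Figure 1
The RetinaNet architecture is shown in the figure above. It can be viewed simply as ResNet + FPN + classification/regression subnetworks. There is plenty of material on ResNet online, so I will not repeat it here.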
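To make the ResNet + FPN composition a little more concrete, here is a minimal sketch of the spatial sizes of the backbone stages that feed the FPN. It assumes the usual ResNet strides of 8, 16 and 32 for C3, C4 and C5, and it approximates each output size as ceil(input / stride). The helper `feature_map_sizes` and the 512 x 640 input are hypothetical, chosen only for illustration.

```python
import math

# Assumed strides of the ResNet stages C3, C4, C5 relative to the input image.
BACKBONE_STRIDES = {"C3": 8, "C4": 16, "C5": 32}


def feature_map_sizes(height, width, strides=BACKBONE_STRIDES):
    """Return {stage: (h, w)} for each backbone stage (hypothetical helper)."""
    return {
        name: (math.ceil(height / s), math.ceil(width / s))
        for name, s in strides.items()
    }


# Illustrative input size; both sides are multiples of 32.
sizes = feature_map_sizes(512, 640)
for name, (h, w) in sizes.items():
    print(f"{name}: {h} x {w}")
```

When the input sides are multiples of 32, each stage is exactly twice the resolution of the next one, which is why the `scale_factor=2` upsampling in the FPN code lines up with the finer level. For other input sizes the shapes can differ by one pixel, so an implementation may need to pad or resize the input.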
1. FPN
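For background on FPN, see my earlier article: 冲鸭嘎嘎, "CVPR 2017 - FPN理解 - 简单高效的特征金字塔" (Understanding FPN: a simple and efficient feature pyramid), zhuanlan.zhihu.com.
The code block corresponding to FPN is as follows: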
import torch.nn as nn


class PyramidFeatures(nn.Module):
    def __init__(self, C3_size, C4_size, C5_size, feature_size=256):
        super(PyramidFeatures, self).__init__()
        # upsample C5 to get P5 from the FPN paper
        self.P5_1 = nn.Conv2d(C5_size, feature_size, kernel_size=1, stride=1, padding=0)
        self.P5_upsampled = nn.Upsample(scale_factor=2, mode='nearest')
        self.P5_2 = nn.Conv2d(feature_size, feature_size, kernel_size=3, stride=1, padding=1)
        # add P5 elementwise to C4
        self.P4_1 = nn.Conv2d(C4_size, feature_size, kernel_size=1, stride=1, padding=0)
        self.P4_upsampled = nn.Upsample(scale_factor=2, mode='nearest')
        self.P4_2 = nn.Conv2d(feature_size, feature_size, kernel_size=3, stride=1, padding=1)
        # ad