Two-stage Anchor-based Object Detectors

Two-stage anchor-based object detectors generally achieve relatively high detection accuracy. Their development has followed three main directions:

1. How to achieve end-to-end training

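  • R-CNN is far from end-to-end: selective search (SS) proposes candidate regions, then the backbone is pretrained, the backbone is fine-tuned, multiple binary SVM classifiers are trained, and multiple bounding-box regressors are trained, each as a separate stage;
  • Fast R-CNN trains end to end except for SS: the classifier and regressor are embedded in the network and replaced by several fully connected layers;
  • Faster R-CNN achieves truly end-to-end training: an RPN replaces SS for generating candidate regions.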
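
As a rough illustration of what an RPN scores at each feature-map location, the sketch below enumerates anchor boxes around one center point. The function name `generate_anchors` and the default of 3 scales by 3 aspect ratios are assumptions made for illustration; they are not specified in these notes.

```python
import math

def generate_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centered at (cx, cy).

    Each anchor has area scale**2 and aspect ratio height / width = ratio.
    Scales and ratios are illustrative defaults, not values from the notes.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors
```

Calling `generate_anchors(0, 0)` yields 9 boxes for that location; the RPN would then classify each one as object or background and regress an offset for it.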

2. How to speed up detection by sharing computation

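  • SPPnet introduces the SPP layer to remove the requirement that input images have a fixed size, and maps candidate regions onto the feature map so that features are extracted only once for the whole image, which greatly reduces feature-extraction time and speeds up detection;
  • R-FCN introduces position-sensitive score maps so that the head computation is shared, which greatly reduces classification and regression time and speeds up detection;
  • Light-Head R-CNN speeds up detection by reducing the number of feature-map channels.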
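
The idea of mapping a candidate region onto a shared feature map can be sketched as below. This is a minimal sketch, assuming a single total backbone stride (16 here) and a simple floor/ceil rounding rule; the function name `project_box` is hypothetical, and the exact rounding used by SPPnet may differ.

```python
import math

def project_box(box, stride=16):
    """Map an image-space box (x1, y1, x2, y2) to feature-map cell indices.

    The top-left corner is rounded down and the bottom-right corner is
    rounded up so the projected window fully covers the original region.
    The stride value and rounding rule are assumptions for illustration.
    """
    x1, y1, x2, y2 = box
    return (
        math.floor(x1 / stride),
        math.floor(y1 / stride),
        math.ceil(x2 / stride),
        math.ceil(y2 / stride),
    )
```

With this mapping, the backbone runs once per image, and every proposal only needs a cheap crop-and-pool on the shared feature map instead of its own forward pass.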

3. How to further improve detection accuracy

3.1 architecture diagram

  • R-FCN: object detection via region-based fully convolutional networks. In NIPS, 2016
  • ME R-CNN: multi-expert region-based CNN for object detection. In ICCV, 2017
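  • CoupleNet: Coupling global structure with local parts for object detection. In ICCV, 2017
  • Cascade R-CNN: delving into high quality object detection. In CVPR, 2018 【Cascade R-CNN trains a sequence of cascaded detectors with increasing IoU thresholds.】
  • Scale-aware trident networks for object detection. In ICCV, 2019 【TridentNet studies the relationship between the network's receptive field and detection performance; a feature layer of a given size is trained only on objects of a matching size and only detects objects of that size.】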
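
The "increasing IoU thresholds" idea in Cascade R-CNN can be illustrated with a small label-assignment sketch. The thresholds 0.5, 0.6, 0.7 and the helper names are assumptions for illustration. In the actual cascade each stage also receives the boxes refined by the previous stage, which this sketch leaves out.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def stage_labels(proposal, gt_box, thresholds=(0.5, 0.6, 0.7)):
    """For each stage, report whether the proposal counts as a positive.

    Later stages use stricter thresholds, so fewer but better-localized
    proposals are treated as positives there.
    """
    overlap = iou(proposal, gt_box)
    return [overlap >= t for t in thresholds]
```

A proposal with IoU 0.65 against its ground-truth box would be a positive for the first two stages and a negative for the third.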

3.2 multi-scale training and testing

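  • An analysis of scale invariance in object detection - SNIP. In CVPR, 2018 【SNIP studies the relationship between object size and network performance and, with the help of an image pyramid, makes the network detect only objects of a suitable size, which improves detection accuracy.】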
  • Autofocus: Efficient multi-scale inference. In ICCV, 2019
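
The SNIP idea of only using objects of a suitable size at each pyramid level can be sketched as a simple filter. This is a minimal sketch under stated assumptions: object size is taken as the square root of the box area after rescaling, and the valid range `(32, 128)` in the usage note is a made-up value, not one from the paper or these notes.

```python
import math

def valid_objects(boxes, image_scale, size_range):
    """Keep boxes whose rescaled size falls inside size_range.

    boxes: list of (x1, y1, x2, y2) in original image coordinates.
    image_scale: resize factor of the current pyramid level.
    size_range: (low, high) bounds on sqrt(area) after rescaling
                (hypothetical values chosen by the caller).
    """
    low, high = size_range
    kept = []
    for x1, y1, x2, y2 in boxes:
        size = math.sqrt((x2 - x1) * (y2 - y1)) * image_scale
        if low <= size <= high:
            kept.append((x1, y1, x2, y2))
    return kept
```

For example, with a 20-pixel and a 200-pixel object and a range of `(32, 128)`, the 2x level keeps only the small object and the 0.5x level keeps only the large one, so each level sees objects near the size the network handles best.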

3.3 feature fusion and enhancement or multiple layers exploiting

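  • A unified multi-scale deep convolutional neural network for fast object detection. In ECCV, 2016 【MS-CNN uses deconvolution to increase the resolution of the output feature maps and predicts directly on multi-scale feature maps.】
  • Hypernet: Towards accurate region proposal generation and joint object detection. In CVPR, 2016 【HyperNet fuses feature maps of different sizes and outputs a single-scale feature map with strong semantic information and rich localization information.】
  • Beyond skip connections: Top-down modulation for object detection. In CoRR, 2016
  • Feature pyramid networks for object detection. In CVPR, 2017 【FPN introduces multi-scale feature fusion and predicts on multi-scale features, which reduces the impact of object size variation and greatly improves detection accuracy.】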
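
The multi-scale fusion in FPN can be sketched as a top-down pass that upsamples a coarse map and adds it to the next finer one. This is a deliberately simplified sketch on single-channel nested lists: it assumes each level is exactly 2x the resolution of the next, uses nearest-neighbor upsampling, and omits the 1x1 lateral and 3x3 smoothing convolutions. All function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def top_down_merge(coarse, lateral):
    """Upsample the coarse map and add it element-wise to the lateral map."""
    up = upsample2x(coarse)
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]

def build_pyramid(laterals):
    """laterals: feature maps ordered from finest to coarsest.

    Returns fused maps in the same order; the coarsest map is passed
    through unchanged and each finer level receives top-down context.
    """
    pyramid = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        pyramid.insert(0, top_down_merge(pyramid[0], lateral))
    return pyramid
```

Each level of the returned pyramid would then feed its own prediction head, which is what lets small objects be detected on fine maps and large objects on coarse ones.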

3.4 training strategy and loss function

  • G-CNN: an iterative grid based object detector. In CVPR, 2016
  • Training region-based object detectors with online hard example mining. In CVPR, 2016
  • A-fast-rcnn: Hard positive generation via adversary for object detection. In CVPR, 2017
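  • Bounding box regression with uncertainty for accurate object detection. In CVPR, 2019 【KL loss models each bounding box as a Gaussian distribution and uses the standard deviation of that Gaussian to measure the uncertainty of the box localization.】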
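
To make the KL loss note concrete, the sketch below computes a per-coordinate loss in which the network predicts both a location and a log-variance. It assumes the commonly cited form of the loss with alpha = log(sigma^2) and a smooth-L1-style branch for large errors, with constant terms dropped; the function and parameter names are hypothetical and the formula should be checked against the paper before use.

```python
import math

def kl_loss(pred, target, log_var):
    """Uncertainty-weighted regression loss for one box coordinate.

    pred: predicted coordinate (mean of the Gaussian).
    target: ground-truth coordinate.
    log_var: predicted alpha = log(sigma**2); larger means less certain.
    """
    diff = abs(target - pred)
    if diff <= 1.0:
        # Quadratic branch: squared error scaled by 1 / sigma**2.
        return 0.5 * math.exp(-log_var) * diff ** 2 + 0.5 * log_var
    # Linear branch for large errors, analogous to smooth L1.
    return math.exp(-log_var) * (diff - 0.5) + 0.5 * log_var
```

The `0.5 * log_var` term penalizes claiming high uncertainty everywhere, while the `exp(-log_var)` factor lets the network down-weight errors on coordinates it considers ambiguous.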

3.5 better proposal and balance

  • Learning to rank proposals for object detection. In ICCV, 2019
  • Libra R-CNN: towards balanced learning for object detection. In CVPR, 2019

3.6 contextual reasoning

  • Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In CVPR, 2016
  • Object detection via a multiregion and semantic segmentation-aware CNN model. In ICCV, 2015
  • Contextual priming and feedback for faster R-CNN. In ECCV, 2016
  • Gated bi-directional CNN for object detection. In ECCV, 2016
  • Structure inference net: Object detection using scene-level context and instance-level relationships. In CVPR, 2018
  • Context refinement for object detection. In ECCV, 2018
  • Thundernet: Towards realtime generic object detection. In ICCV, 2019




