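Two-stage anchor-based object detectors usually achieve relatively high detection accuracy. Their development has followed three main directions:
1. How to achieve end-to-end training
- R-CNN is far from end-to-end: region proposals from Selective Search (SS) + backbone pre-training + backbone fine-tuning + training multiple binary SVM classifiers + training multiple bounding-box regressors;
- Fast R-CNN trains end to end except for SS: the classifier and the regressor are embedded in the network and implemented with fully connected layers;
- Faster R-CNN is trained truly end to end: an RPN replaces SS for generating region proposals.
2. How to speed up detection by sharing computation
- SPPnet introduces the SPP layer, which removes the requirement for a fixed input image size, and maps region proposals onto the feature map so that features are extracted only once for the whole image, greatly reducing feature-extraction time and speeding up detection;
- R-FCN introduces position-sensitive score maps so that the head computation is shared, greatly reducing classification and regression time and speeding up detection;
- Light-Head R-CNN speeds up detection by reducing the number of feature-map channels.
3. How to further improve detection accuracy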
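The SPPnet idea in the list above can be sketched as follows: a proposal is projected from image coordinates onto the shared feature map, then max-pooled into a fixed number of bins regardless of its size. This is a minimal single-channel sketch in plain Python. The names `project_roi` and `spp_pool`, the stride of 16, and the pyramid levels `(1, 2)` are assumptions chosen for illustration, not details taken from the SPPnet implementation.

```python
import math

def project_roi(box, stride=16):
    """Map an (x1, y1, x2, y2) box in image pixels to feature-map cell indices.

    `stride` is the assumed total downsampling factor of the backbone.
    """
    x1, y1, x2, y2 = box
    fx1, fy1 = math.floor(x1 / stride), math.floor(y1 / stride)
    # Keep at least one cell in each direction.
    fx2 = max(fx1 + 1, math.ceil(x2 / stride))
    fy2 = max(fy1 + 1, math.ceil(y2 / stride))
    return fx1, fy1, fx2, fy2

def spp_pool(feature, roi, levels=(1, 2)):
    """Max-pool one channel inside `roi` into n x n bins for each pyramid level.

    The output length is sum(n * n for n in levels), independent of the RoI size.
    """
    x1, y1, x2, y2 = roi
    w, h = x2 - x1, y2 - y1
    pooled = []
    for n in levels:
        for by in range(n):
            ys = y1 + math.floor(by * h / n)
            ye = y1 + math.ceil((by + 1) * h / n)
            for bx in range(n):
                xs = x1 + math.floor(bx * w / n)
                xe = x1 + math.ceil((bx + 1) * w / n)
                pooled.append(max(feature[y][x]
                                  for y in range(ys, ye)
                                  for x in range(xs, xe)))
    return pooled

# A toy 4x4 feature map, computed once and shared by every proposal.
feature = [[4 * r + c for c in range(4)] for r in range(4)]
large = spp_pool(feature, project_roi((0, 0, 64, 64)))
small = spp_pool(feature, project_roi((16, 16, 48, 48)))
```

Both proposals yield a vector of length 5 (1 + 4 bins) even though they cover different areas, which is what lets the fully connected layers accept proposals of any size.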
3.1 architecture diagram
- R-FCN: object detection via region-based fully convolutional networks. In NIPS, 2016
- ME R-CNN: multi-expert region-based CNN for object detection. In ICCV, 2017
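- CoupleNet: coupling global structure with local parts for object detection. In ICCV, 2017
- Cascade R-CNN: delving into high quality object detection. In CVPR, 2018 [Cascade R-CNN trains a sequence of cascaded detectors with increasing IoU thresholds.]
- Scale-aware trident networks for object detection. In ICCV, 2019 [TridentNet studies the relationship between the network's receptive field and detection performance; each branch, with its own receptive field, is trained only on objects in a specific size range and detects only objects in that range.]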
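To illustrate the increasing-IoU-threshold idea noted for Cascade R-CNN, the sketch below labels the same proposals as positives under successively stricter thresholds. It is a simplification under stated assumptions: the thresholds `(0.5, 0.6, 0.7)` and the helper `positives_per_stage` are illustrative, and the boxes are held fixed, whereas the real cascade refines the boxes at each stage before passing them on.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def positives_per_stage(proposals, gt_box, thresholds=(0.5, 0.6, 0.7)):
    """For each stage threshold, keep the proposals that count as positives."""
    return [[p for p in proposals if iou(p, gt_box) >= t] for t in thresholds]

gt = (0, 0, 10, 10)
# IoU with gt: 0.8, 0.65 and 0.55 respectively.
proposals = [(0, 0, 10, 8), (0, 0, 10, 6.5), (0, 0, 10, 5.5)]
stages = positives_per_stage(proposals, gt)
counts = [len(s) for s in stages]
```

With fixed boxes the positive set shrinks from stage to stage (`counts` is `[3, 2, 1]`); refining the boxes between stages is what keeps enough positives available at the stricter thresholds.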
3.2 multi-scale training and testing
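- An analysis of scale invariance in object detection - SNIP. In CVPR, 2018 [SNIP studies the relationship between object size and network performance and, using an image pyramid, has the network detect only objects of a suitable size, which improves detection accuracy.]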
- Autofocus: Efficient multi-scale inference. In ICCV, 2019
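The scale-selection idea noted for SNIP can be sketched as a filter: at each level of the image pyramid, only objects whose rescaled size falls inside a valid range are used. The range `(32, 128)`, the scales, and the function names below are hypothetical values chosen for illustration; the paper defines its own ranges per resolution.

```python
def valid_objects(object_sizes, scale, valid_range=(32, 128)):
    """Return the original sizes whose rescaled size lies inside valid_range."""
    lo, hi = valid_range
    return [s for s in object_sizes if lo <= s * scale <= hi]

def assign_to_pyramid(object_sizes, scales=(0.5, 1.0, 2.0)):
    """Map each pyramid scale to the objects that are usable at that scale."""
    return {scale: valid_objects(object_sizes, scale) for scale in scales}

# Object sizes in pixels in the original image (made-up numbers).
assignment = assign_to_pyramid([20, 40, 150])
```

Here the 150-pixel object is only used on the downsampled image, and the 20-pixel object only on the upsampled one, so the detector never has to handle the extremes of the size distribution at any single resolution.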
3.3 feature fusion and enhancement or multiple layers exploiting
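- A unified multi-scale deep convolutional neural network for fast object detection. In ECCV, 2016 [MS-CNN uses deconvolution to increase the resolution of the output feature maps and predicts directly on multi-scale feature maps.]
- HyperNet: towards accurate region proposal generation and joint object detection. In CVPR, 2016 [HyperNet fuses feature maps of different sizes into a single-scale feature map with strong semantic information and rich localization information.]
- Beyond skip connections: top-down modulation for object detection. In CoRR, 2016
- Feature pyramid networks for object detection. In CVPR, 2017 [FPN introduces multi-scale feature fusion and predicts on multiple feature scales, which reduces the impact of object-size variation and greatly improves detection accuracy.]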
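The multi-scale fusion noted for FPN can be sketched as a top-down pass: each coarser map is upsampled by 2 and added to the next finer map. This plain-Python sketch works on single-channel 2-D lists and leaves out the 1x1 lateral and 3x3 output convolutions used in the actual network; the function names are assumptions for illustration.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def top_down_merge(laterals):
    """Fuse maps ordered fine -> coarse, each half the size of the previous one.

    Returns the fused maps in the same order; the coarsest map is unchanged.
    """
    merged = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        up = upsample2x(merged[0])
        fused = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lateral, up)]
        merged.insert(0, fused)
    return merged

c3 = [[1] * 4 for _ in range(4)]   # fine, 4x4
c4 = [[10] * 2 for _ in range(2)]  # 2x2
c5 = [[100]]                       # coarse, 1x1
p3, p4, p5 = top_down_merge([c3, c4, c5])
```

Every cell of `p3` ends up as 111: the fine map keeps its resolution while also receiving information from both coarser levels, and predictions can then be made on each of `p3`, `p4`, and `p5`.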
3.4 training strategy and loss function
- G-CNN: an iterative grid based object detector. In CVPR, 2016
- Training region-based object detectors with online hard example mining. In CVPR, 2016
- A-fast-rcnn: Hard positive generation via adversary for object detection. In CVPR, 2017
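- Bounding box regression with uncertainty for accurate object detection. In CVPR, 2019 [KL loss models each bounding box as a Gaussian distribution and uses the standard deviation of that distribution to measure localization uncertainty.]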
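As a sketch of how the KL loss ties the regression error to a predicted uncertainty, the function below follows the form in which this loss is usually stated, with constant terms dropped and a smooth-L1-style branch for large errors. The network is assumed to predict `alpha = log(sigma^2)` for each coordinate. Treat the exact expression as something to check against the paper; the function name is an assumption.

```python
import math

def kl_loss(x_pred, x_gt, alpha):
    """Per-coordinate regression loss with predicted log-variance `alpha`.

    A large alpha (high uncertainty) down-weights the error term but pays
    a penalty of alpha / 2, so the network cannot inflate sigma for free.
    """
    diff = abs(x_gt - x_pred)
    if diff <= 1.0:
        return 0.5 * math.exp(-alpha) * diff ** 2 + 0.5 * alpha
    return math.exp(-alpha) * (diff - 0.5) + 0.5 * alpha

confident_far = kl_loss(0.0, 2.0, alpha=0.0)   # large error, low uncertainty
uncertain_far = kl_loss(0.0, 2.0, alpha=1.0)   # large error, high uncertainty
confident_near = kl_loss(0.0, 0.1, alpha=0.0)  # small error, low uncertainty
uncertain_near = kl_loss(0.0, 0.1, alpha=1.0)  # small error, high uncertainty
```

For the badly localized box, admitting uncertainty lowers the loss (`uncertain_far < confident_far`), while for the well localized box it raises it (`uncertain_near > confident_near`), which pushes the predicted variance to track the actual localization error.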
3.5 better proposal and balance
- Learning to rank proposals for object detection. In ICCV, 2019
- Libra R-CNN: towards balanced learning for object detection. In CVPR, 2019
3.6 contextual reasoning
- Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In CVPR, 2016
- Object detection via a multiregion and semantic segmentation-aware CNN model. In ICCV, 2015
- Contextual priming and feedback for faster R-CNN. In ECCV, 2016
- Gated bi-directional CNN for object detection. In ECCV, 2016
- Structure inference net: Object detection using scene-level context and instance-level relationships. In CVPR, 2018
- Context refinement for object detection. In ECCV, 2018
- Thundernet: Towards realtime generic object detection. In ICCV, 2019