Reading Paper for Pedestrian detection 3.22-3.27

Sling8219

Category: Pedestrian Detection. Tags: pedestrian detection, paper reading notes

Article link: https://blog.csdn.net/Sling8219/article/details/50994949


Pedestrian Detection: An Evaluation of the State of the Art

The Fastest Pedestrian Detector in the West

Core idea: For a broad family of features, including gradient histograms, the feature responses computed at a single scale can be used to approximate feature responses at nearby scales.
Techniques used: detection at multiple scales, a sparsely sampled image pyramid with a step size of an entire octave, and, within each octave, a classifier pyramid.
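
To make the core idea concrete, here is a minimal sketch of approximating a feature channel at a nearby scale from a channel computed once. It assumes a power-law relation between scales with an exponent `lam`; the function names, the nearest-neighbour resampling, and the value of `lam` are my own illustrative assumptions, not code from the paper.

```python
def resample_nearest(channel, scale):
    """Resample a 2-D channel (list of rows) by `scale` using nearest-neighbour lookup."""
    h, w = len(channel), len(channel[0])
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    rows = [min(int(i / scale), h - 1) for i in range(new_h)]
    cols = [min(int(j / scale), w - 1) for j in range(new_w)]
    return [[channel[r][c] for c in cols] for r in rows]


def approximate_channel(channel, scale, lam):
    """Approximate the channel at a nearby scale without recomputing features.

    Assumed power law: C_scale ~= resample(C, scale) * scale ** (-lam),
    where `lam` is a feature-dependent exponent (hypothetical value here).
    """
    factor = scale ** (-lam)
    resampled = resample_nearest(channel, scale)
    return [[value * factor for value in row] for row in resampled]


# Usage: one computed 8x8 channel, approximated at half scale.
base = [[1.0] * 8 for _ in range(8)]
half = approximate_channel(base, 0.5, lam=1.0)
```

With this kind of approximation, real features only need to be computed once per octave, and the scales in between are filled in cheaply.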

Taking a Deeper Look at Pedestrians

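ConvNet: focuses on coping with limited training data; uses features from both the last and the second-to-last layer

DBN-Isol: extends DPM with RBMs to reason about parts and occlusion

DBN-Mut: accounts for person-to-person relations

JointDeep: jointly optimizes the conventional features, the DPM, and the person-to-person relations of DBN-Mut

MultiSDP: feeds each layer with contextual information from the pedestrian detection candidates at different scales

SDN: uses "switchable layers" to automatically learn low-level features and high-level parts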

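Aspects of the network structure and parameters that were varied: detection proposals, thresholds for positive and negative samples, model window size, training batch, number and size of convolutional filters, number and type of layers, etc.
A small convnet (10^5 parameters) is trained based on CifarNet, and a big convnet (10^7 parameters) based on AlexNet.
Comparison results: On Caltech10x, we find the CifarNet performance improved to 28.4%, while the AlexNet improves to 27.1% MR. Runtime is about 3 ms per proposal window; with fewer than 100 detection windows per image, this comes to roughly 300 ms per image, compared with 2 s per image for SquaresChnFtrs.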
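
The MR values quoted above are, as far as I understand, Caltech-style log-average miss rates. A minimal sketch of that metric follows, assuming it is the geometric mean of the miss rate sampled at nine FPPI reference points evenly spaced in log-space between 10^-2 and 10^0. The function name and the input format (a curve sorted by ascending FPPI) are my own assumptions.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate at log-spaced FPPI reference points.

    `fppi` and `miss_rate` describe one detector curve, sorted by ascending FPPI.
    If the curve has no point at or below a reference FPPI, a miss rate of 1.0
    is assumed for that reference point.
    """
    refs = [10 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)]
    logs = []
    for ref in refs:
        sampled = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                sampled = m  # keep the last curve point not exceeding the reference
            else:
                break
        logs.append(math.log(max(sampled, 1e-10)))
    return math.exp(sum(logs) / len(logs))


# Usage: a toy curve with a constant miss rate of 0.3.
flat = log_average_miss_rate([0.001, 0.01, 0.1, 1.0], [0.3, 0.3, 0.3, 0.3])
```

Averaging in log-space keeps a single operating point from dominating the score, which is why one number can summarize the whole curve.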

How Far are We from Solving Pedestrian Detection

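Main contributions: provides a human baseline for the Caltech benchmark, showing that pedestrian detection still has room for a tenfold improvement, while pointing out that "the last 20%" is the harder challenge; compares current state-of-the-art methods against the "perfect" single-frame monocular detector and introduces RotatedFilters, which achieves top performance; discusses the main factors affecting performance: background-versus-foreground and localization.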
Revises the rule for drawing pedestrian bounding boxes in the Caltech dataset: A bounding box is then automatically generated such that its centre coincides with the centre point of the manually-drawn axis.
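
A small sketch of the annotation rule quoted above: given the two end points of the manually drawn axis, generate a box centred on the axis midpoint. The fixed width-to-height ratio of 0.41 and the use of the vertical extent of the axis as the box height are assumptions on my part; the function name is hypothetical.

```python
def box_from_axis(top, bottom, aspect_ratio=0.41):
    """Return (x, y, width, height) for a box centred on the axis midpoint.

    `top` and `bottom` are (x, y) end points of the manually drawn axis.
    The box height is taken as the vertical extent of the axis, and the
    width is `aspect_ratio * height` (both are assumptions).
    """
    (x_top, y_top), (x_bottom, y_bottom) = top, bottom
    centre_x = (x_top + x_bottom) / 2.0
    centre_y = (y_top + y_bottom) / 2.0
    height = abs(y_bottom - y_top)
    width = aspect_ratio * height
    return (centre_x - width / 2.0, centre_y - height / 2.0, width, height)


# Usage: an upright axis from head (10, 0) to feet (10, 100).
box = box_from_axis((10.0, 0.0), (10.0, 100.0))
```

Because the box is derived from the axis, every annotation ends up centred on the subject in the same way, which is the point of the alignment benefit.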
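Analyzes two ways in which the training annotations affect performance: pruning benefits and alignment benefits.
In some cases the convnets have difficulty assigning low scores to the windows around true positives: despite their fine-tuning, the convnet score maps are "blurrier" than the proposal ones. The authors attribute this to structural limitations of networks such as AlexNet and VGG, for example their internal feature pooling.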
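Methods based on semantic labelling and boundary estimation require pixel-accurate output; bounding box regression can compensate for the insufficient spatial resolution of convolutional networks.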
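
To illustrate what bounding box regression does, here is a minimal sketch that applies predicted offsets to a proposal box. It assumes the R-CNN style parameterization (centre shifts relative to box size, log-scale changes for width and height), which may differ from the one used in the paper; the function name is hypothetical.

```python
import math


def apply_box_deltas(box, deltas):
    """Refine a proposal box (x, y, width, height) with regression deltas.

    `deltas` is (dx, dy, dw, dh): dx and dy shift the centre as a fraction of
    the box width and height, dw and dh rescale width and height in log-space.
    """
    x, y, w, h = box
    dx, dy, dw, dh = deltas
    centre_x = x + 0.5 * w + dx * w
    centre_y = y + 0.5 * h + dy * h
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)
    return (centre_x - 0.5 * new_w, centre_y - 0.5 * new_h, new_w, new_h)


# Usage: shift a 10x20 proposal right by 10% of its width.
refined = apply_box_deltas((0.0, 0.0, 10.0, 20.0), (0.1, 0.0, 0.0, 0.0))
```

The network only has to predict small corrections per proposal, so the final box can be more precise than the coarse grid of the feature map would otherwise allow.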
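Personally, I find it far-fetched that, when analyzing the sources of error, the paper suggests improving performance by removing false positives from the training and test sets. Anyone who disagrees can check the paper for details.
Some methods that are (or at some point were) state-of-the-art, together with their core ideas, are recorded in the figure.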