NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing (Paper Notes)

导导96

Tags: machine learning

Article link: https://blog.csdn.net/weixin_44830789/article/details/115485277


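These are my notes from reading the paper. Discussion is welcome; I'm still a beginner, so pointers from more experienced readers are appreciated.

Learning usually goes like this: find a problem, solve it, and verify that the solution is right.

1. Finding the problem:

Pedestrian detection in crowded scenes is still challenging.
The authors observe that current NMS methods do not work well in this setting, and they propose a novel Representative Region NMS (R^2NMS).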

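The drawbacks of current NMS are:
A relatively low threshold of intersection over union (IoU) leads to missing highly overlapped pedestrians, while a higher one brings in plenty of false positives.
In other words, if the IoU threshold is too low, heavily overlapped pedestrians are not detected; if it is too high, many false positives are kept.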
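To make the trade-off concrete, here is a minimal sketch of standard greedy NMS in plain Python. The boxes and scores are made up for illustration: boxes 0 and 1 stand for two overlapping pedestrians, and box 2 is a duplicate detection of the first one.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thresh):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


# Toy data (hypothetical): two overlapping people and one duplicate box.
boxes = [(0, 0, 10, 20), (4, 0, 14, 20), (1, 0, 11, 20)]
scores = [0.9, 0.8, 0.7]

low = greedy_nms(boxes, scores, 0.3)   # second pedestrian is suppressed
high = greedy_nms(boxes, scores, 0.9)  # duplicate box survives
```

With the low threshold only box 0 is kept, so the second pedestrian is missed. With the high threshold all three boxes are kept, so the duplicate becomes a false positive.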
Analysis of NMS and Adaptive NMS.
*Analysis of NMS
[Figure]
The red boxes are full body predictions.
The green boxes are visible body predictions.
The original NMS removes the red dashed box, while the authors' method keeps it.

*Analysis of Adaptive NMS
[Figure]
The green boxes are ground truth.
The red boxes are predictions.
If the prediction is too high, too many predicted boxes are removed.
However, density estimation itself remains a difficult task.
Besides, the matching from the density to the optimal IoU threshold is still handcrafted in Adaptive NMS, and thus the exact matching is difficult to acquire.
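A small sketch of what "handcrafted matching" means here. It assumes the rule from the Adaptive NMS paper, where the suppression threshold for the current top-scoring box is the larger of a base threshold and that box's predicted density. The function name and the sample densities are hypothetical.

```python
def adaptive_nms_threshold(base_thresh, predicted_density):
    """Suppression threshold for the current top box.

    Assumed rule: threshold = max(base threshold, predicted density).
    """
    return max(base_thresh, predicted_density)


base = 0.5
# Hypothetical predicted densities for three different boxes.
thresholds = [adaptive_nms_threshold(base, d) for d in (0.2, 0.6, 0.8)]
```

The mapping from density to threshold is a fixed function, so once the density exceeds the base threshold, any error in the density estimate carries straight over into the threshold.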

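2. Solving the problem:

"Solving the problem" here means the model that the authors propose.
[Figure]
The authors' basic framework is shown in the figure. For the CNN Feature Extractor, on the CrowdHuman dataset the authors use Feature Pyramid Network (FPN) with ResNet-50 as the baseline.
For the CityPersons dataset, they adopt the Faster R-CNN framework.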

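For this part of the design, the authors give a concrete algorithm.
[Figure]
The algorithm looks at the visible part: the IoU is computed between visible parts rather than full bodies, and a detection is suppressed only when its visible-part IoU with an already kept detection exceeds the threshold. Each kept detection retains both its visible part and its full part.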
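The idea can be sketched in a few lines of Python. This is only my reading of the algorithm, not the authors' code: the function name `r2nms_sketch`, the single fixed threshold, and the toy boxes are assumptions for illustration.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def r2nms_sketch(visible_boxes, scores, iou_thresh):
    """Greedy NMS where overlap is measured on the visible boxes.

    Returns indices of kept detections; the caller can read both the
    full box and the visible box of each kept index.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(visible_boxes[i], visible_boxes[k]) <= iou_thresh
               for k in keep):
            keep.append(i)
    return keep


# Toy data (hypothetical): person B is partly hidden behind person A,
# and detection 2 is a duplicate of person A.
full = [(0, 0, 10, 20), (2, 0, 12, 20), (0, 1, 10, 20)]
visible = [(0, 0, 10, 20), (9, 0, 12, 20), (0, 1, 10, 20)]
scores = [0.9, 0.8, 0.6]

kept = r2nms_sketch(visible, scores, 0.5)
kept_pairs = [(full[i], visible[i]) for i in kept]
full_iou_ab = box_iou(full[0], full[1])
```

In this toy case the full bodies of A and B have an IoU of about 0.67, so standard NMS at 0.5 would drop B. Their visible parts barely overlap, so B is kept, while the duplicate of A is still removed.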
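An anchor A is viewed as positive, matched to the ground-truth pair Q = (F, V) of full body box F and visible body box V, when the two matching conditions hold at the same time.

The thresholds are α1 = 0.7 and β1 = 0.7.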

This part concatenates the features, but before the concatenation a binary (0/1) pixel map is multiplied element-wise with the full part.
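A rough sketch of that step in plain Python, using one small single-channel feature map per part. The function name, the shapes, and the way the two parts are stacked are assumptions for illustration; the notes above do not give those details.

```python
def mask_and_concat(full_feat, visible_feat, mask):
    """Multiply the full-part feature by a 0/1 mask, then stack both parts.

    All inputs are H x W nested lists; the result is a list of two
    H x W channels: [masked full part, visible part].
    """
    masked_full = [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(full_feat, mask)
    ]
    return [masked_full, visible_feat]


# Toy 2 x 2 feature maps (hypothetical values).
full_feat = [[1, 2], [3, 4]]
visible_feat = [[5, 6], [7, 8]]
mask = [[1, 0], [0, 1]]

fused = mask_and_concat(full_feat, visible_feat, mask)
```

Positions where the mask is 0 are zeroed out in the full-part channel, and the visible-part channel is passed through unchanged.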
For a pair of annotations Q = (F, V), a pair of proposals X = (Pf, Pv) is positive when it satisfies the condition:
[Equation]
α2 and β2 are both 0.5.
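Since the equation image is missing from these notes, the following is only a guess at the shape of the check. It assumes that the full-body proposal must reach IoU α2 with F and the visible-body proposal must reach IoU β2 with V, both at once. The function name and toy boxes are hypothetical, and the paper may use a different overlap measure.

```python
def pair_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def is_positive_pair(proposal, annotation, alpha=0.5, beta=0.5):
    """Assumed rule: both parts of the proposal pair must match.

    proposal = (Pf, Pv), annotation = (F, V); each box is (x1, y1, x2, y2).
    """
    pf, pv = proposal
    f, v = annotation
    return pair_iou(pf, f) >= alpha and pair_iou(pv, v) >= beta


# Toy annotation pair and two proposal pairs (hypothetical boxes).
annotation = ((1, 0, 11, 20), (0, 0, 10, 12))
good = ((0, 0, 10, 20), (0, 0, 10, 10))  # both parts overlap well
bad = ((0, 0, 10, 20), (0, 0, 4, 10))    # visible part overlaps poorly

good_is_positive = is_positive_pair(good, annotation)
bad_is_positive = is_positive_pair(bad, annotation)
```

The second proposal pair fails even though its full-body box matches well, because the visible-body box does not reach the threshold.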