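Note: I just finished reading this paper and wrote up its line of reasoning while it was fresh. Hats off to the great Wei-Shi (伟诗), 666.
Paper Analysis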
- Matching people across non-overlapping camera views at different locations and different times.
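- A classic simplified setting for RE-ID consists of a probe set (say three images p1, p2, p3, each showing a different person) and a gallery set (g1, g2, g3). The p and g images come from different camera views. Each p in the probe set is a person we want to find, and the gallery set acts as a small database in which we search for the people in the probe set. Assume (p1, g1), (p2, g2), (p3, g3) are the correct matches. To find person p1, we compute the distance from p1 to each of g1, g2, g3 and rank the gallery images by that distance. Ideally, g1 lands at rank 1. Rank-n accuracy is the fraction of probes whose correct match appears at rank n or earlier.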
- Challenge
- In a busy, uncontrolled environment monitored by cameras from a distance, person verification relying on biometrics such as face and gait is infeasible and unreliable. (In a complex scene, a pedestrian's biometric features are not clearly visible.)
- As the transition time between disjoint cameras varies greatly from individual to individual with uncertainty, it is hard to impose accurate temporal and spatial constraints. (Transit times differ from person to person, so precise spatio-temporal constraints cannot be relied on.)
- The visual appearance features, extracted mainly from the clothing and shapes of people, are intrinsically weak for matching people. (That is, these features are not very distinctive.) In addition, a person's appearance often undergoes large variations across non-overlapping camera views due to significant changes in view angle, lighting, background clutter, and occlusion. (As a result, different people can look more alike across camera views than the same person does.)
- Two steps to deal with RE-ID
- A feature representation is computed from both the query image and each of the gallery images.
- The distance between each pair of potential matches is measured.
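
To make the probe/gallery ranking and rank-n accuracy described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the feature vectors are made-up toy numbers standing in for step one (feature extraction), plain Euclidean distance stands in for whatever distance a real method would define or learn in step two, and the function names `pairwise_distances` and `rank_n_accuracy` are hypothetical, not taken from the paper.

```python
import numpy as np

def pairwise_distances(probe_feats, gallery_feats):
    """Euclidean distance between every probe and every gallery feature.

    Assumption: Euclidean distance is a placeholder for the real metric.
    Returns an array of shape (num_probes, num_gallery).
    """
    diff = probe_feats[:, None, :] - gallery_feats[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def rank_n_accuracy(dist, probe_ids, gallery_ids, n):
    """Fraction of probes whose correct match is within the top n ranks."""
    order = np.argsort(dist, axis=1)      # gallery indices, nearest first
    ranked_ids = gallery_ids[order]       # identities in ranked order
    hits = (ranked_ids[:, :n] == probe_ids[:, None]).any(axis=1)
    return hits.mean()

# Toy 2-D features (made up for illustration): p_i should match g_i.
probe_feats = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 1.0]])    # p1, p2, p3
gallery_feats = np.array([[0.1, 0.0], [0.4, 0.0], [0.0, 0.9]])  # g1, g2, g3
probe_ids = np.array([1, 2, 3])
gallery_ids = np.array([1, 2, 3])

dist = pairwise_distances(probe_feats, gallery_feats)
rank1 = rank_n_accuracy(dist, probe_ids, gallery_ids, 1)
rank2 = rank_n_accuracy(dist, probe_ids, gallery_ids, 2)
print(f"rank-1 accuracy: {rank1:.3f}")
print(f"rank-2 accuracy: {rank2:.3f}")
```

In this toy data, p2 is slightly closer to g1 than to its true match g2, so its correct match only shows up at rank 2. That is the kind of cross-view confusion the challenges above describe, and it is why results are usually reported at several ranks rather than at rank 1 alone.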