(CVPR 2020: Combining Detection and Tracking for Human Pose Estimation in Videos)
1、Background
Multi-person human pose estimation and tracking in videos.
2、Problems and difficulties
·Top-down methods perform worse on videos than on still images and were recently outperformed by a bottom-up approach.
·Detecting person bounding boxes in videos is much harder than in still images.
·Videos inherently contain atypical occlusions, viewpoints, motion blur, and poses that cause object detectors to fail occasionally.
3、Existing solutions
Top-down approaches are limited by the performance of their person detector, which can produce erroneous detections in individual frames.
4、Main contents of the article
The paper presents a novel top-down approach to multi-person human pose estimation and tracking in videos.
Main idea: the method propagates known person locations forward and backward in time and searches for poses in those regions. Person bounding boxes are detected on each frame and then propagated to the neighbouring frames. If a person is present at a specific location in a frame, they should still be at approximately that location in the neighbouring frames, even when the detector fails to find them.
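The propagation idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `iou` and `propagate_boxes`, the `window` and `iou_thresh` parameters, and the IoU-based duplicate check are all assumptions made for illustration. The sketch only copies box coordinates unchanged into neighbouring frames; the subsequent search for poses inside those regions is not shown.

```python
from typing import Dict, List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def propagate_boxes(
    detections: Dict[int, List[Box]],
    num_frames: int,
    window: int = 1,        # hypothetical: how many frames to propagate each way
    iou_thresh: float = 0.5,  # hypothetical: overlap above which a box is a duplicate
) -> Dict[int, List[Box]]:
    """Return candidate person regions per frame.

    Each frame keeps its own detections and additionally receives boxes
    detected in frames within +/- `window`, unless a sufficiently
    overlapping box is already present there.
    """
    candidates = {t: list(detections.get(t, [])) for t in range(num_frames)}
    for t in range(num_frames):
        # Only original detections are propagated, so boxes do not chain
        # further than `window` frames from where they were detected.
        for box in detections.get(t, []):
            for dt in range(-window, window + 1):
                n = t + dt
                if dt == 0 or not (0 <= n < num_frames):
                    continue
                if all(iou(box, other) < iou_thresh for other in candidates[n]):
                    candidates[n].append(box)
    return candidates
```

For example, if the detector finds a person in frames 0 and 2 but misses them in frame 1, calling `propagate_boxes` with `window=1` gives frame 1 a candidate region copied from frame 0, and the nearly identical box from frame 2 is skipped as a duplicate.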