【Challenges】
- severe camera motion
- variation in human appearance and pose
- cluttered background and occlusion
- viewpoint and illumination changes
【Abstract】
Recently dense trajectories were shown to be an efficient video representation for action recognition and achieved state-of-the-art results on a variety of datasets. This paper improves their performance by taking into account camera motion to correct them. To estimate camera motion, we match feature points between frames using SURF descriptors and dense optical flow, which are shown to be complementary. These matches are then used to robustly estimate a homography with RANSAC. Human motion is in general different from camera motion and generates inconsistent matches. To improve the estimation, a human detector is employed to remove these matches. Given the estimated camera motion, we remove trajectories consistent with it. We also use this estimation to cancel out camera motion from the optical flow. This significantly improves motion-based descriptors, such as HOF and MBH. Experimental results on four challenging action datasets (i.e., Hollywood2, HMDB51, Olympic Sports and UCF50) significantly outperform the current state of the art.
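I. Understanding Dense Trajectory (see Dense Trajectory for details)
Dense Trajectory (DT) is an algorithm for extracting densely tracked trajectories from video; descriptors are usually computed on patches taken along these trajectories. Tracking interest points from frame to frame in a video sequence produces a trajectory; tracking interest points that are densely sampled in every frame produces dense trajectories.
The algorithm, in brief (see Dense Trajectory for details):
1. Compute the optical flow field over the whole video sequence
2. Densely sample pixels
3. Track each sampled point, using the optical flow to determine its next position:
$P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * w)|_{(\bar{x}_t, \bar{y}_t)}$
where M is a median filter, w is the optical flow field, and $(\bar{x}_t, \bar{y}_t)$ is the rounded position of $(x_t, y_t)$
4. Tracking each point produces a trajectory. To avoid drift, the tracking length is limited
5. Real videos contain camera motion, so a suitable algorithm must be used to remove its effect and obtain the DT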
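Steps 3 and 4 can be sketched in a few lines. This is only a minimal illustration, not the original implementation: the flow fields are assumed to be plain nested lists of (u, v) vectors, the 3x3 median window and the 15-frame length limit are assumed defaults, and the names `median_filtered_flow` and `track_point` are made up for this example.

```python
from statistics import median


def median_filtered_flow(flow, x, y, radius=1):
    """Median of the flow vectors in a small window centred on pixel (x, y)."""
    h, w = len(flow), len(flow[0])
    us, vs = [], []
    for yy in range(max(0, y - radius), min(h, y + radius + 1)):
        for xx in range(max(0, x - radius), min(w, x + radius + 1)):
            u, v = flow[yy][xx]
            us.append(u)
            vs.append(v)
    return median(us), median(vs)


def track_point(flows, start, max_len=15):
    """Follow one sampled point through successive flow fields.

    flows[t][y][x] is the (u, v) flow vector from frame t to frame t + 1.
    Tracking stops after max_len steps (assumed limit, to avoid drift)
    or as soon as the point leaves the frame.
    """
    x, y = start
    trajectory = [(x, y)]
    for flow in flows[:max_len]:
        h, w = len(flow), len(flow[0])
        xi, yi = int(round(x)), int(round(y))  # rounded position
        if not (0 <= xi < w and 0 <= yi < h):
            break
        u, v = median_filtered_flow(flow, xi, yi)
        x, y = x + u, y + v
        trajectory.append((x, y))
    return trajectory
```

With a uniform flow field every tracked point simply drifts along that flow until the length limit is reached, which is an easy way to sanity-check the loop.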
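【Contributions】
1. Improve dense trajectories by explicitly estimating camera motion
2. Remove the outlier matches caused by detected humans when estimating the homography (with RANSAC)
3. Stabilize the optical flow to cancel out camera motion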
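(1) Camera motion estimation
1. Find correspondences between two consecutive frames
-- Extract and match SURF features (robust to motion blur)
-- Sample interest points from the optical flow and track them
2. Merge the SURF and optical flow matches, which gives a more balanced distribution of matched points
3. Use RANSAC to estimate a homography from all the feature matches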
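The RANSAC step can be illustrated with a small NumPy sketch. It assumes the merged SURF and optical flow matches are already available as two (N, 2) arrays `src` and `dst`; the function names, the 1-pixel inlier threshold and the iteration count are assumptions made for this example, not values taken from the paper.

```python
import numpy as np


def fit_homography(src, dst):
    """Least-squares homography (h33 fixed to 1) from 4 or more point pairs."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h, *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)


def project(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]


def ransac_homography(src, dst, thresh=1.0, iters=500, seed=0):
    """Return (H, inlier_mask) for matched points src -> dst."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)  # minimal sample
        H = fit_homography(src[idx], dst[idx])
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    # Refit on all inliers of the best hypothesis.
    return fit_homography(src[best], dst[best]), best
```

In practice one would more likely call a library routine such as OpenCV's `cv2.findHomography` with the `cv2.RANSAC` method; the hand-written loop above is only meant to show what the estimation does.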
(2) Remove inconsistent matches due to humans
1. Human motion is independent of the camera motion, so it produces outlier matches
2. Run a human detector on every frame and track the detections
3. Remove the feature matches inside the human bounding boxes during homography estimation
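Dropping the matches that fall on detected people is a simple filter. The sketch below assumes each match is a pair of points ((x, y), (x2, y2)) and each detection is a box (x_min, y_min, x_max, y_max) in the first frame; both formats and the function names are assumptions for illustration.

```python
def inside_any_box(point, boxes):
    """True if the point lies inside at least one bounding box."""
    x, y = point
    return any(x0 <= x <= x1 and y0 <= y <= y1 for (x0, y0, x1, y1) in boxes)


def drop_human_matches(matches, human_boxes):
    """Keep only matches whose first-frame point is outside every human box."""
    return [(p, q) for (p, q) in matches if not inside_any_box(p, human_boxes)]


matches = [((10, 10), (12, 10)), ((50, 60), (55, 62)), ((200, 40), (202, 40))]
human_boxes = [(40, 40, 100, 120)]
background_matches = drop_human_matches(matches, human_boxes)
```

Only the remaining background matches would then be passed on to the homography estimation.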
(3) Warp optical flow
Use the homography matrix to remove the camera motion and recompute the optical flow; the result is called warped flow.
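A rough way to see the effect of the warped flow: the homography predicts how every pixel would move if only the camera moved, and removing that part leaves the motion of the scene itself. Note that the sketch below takes a shortcut and subtracts the homography-induced displacement from an existing flow field instead of warping the frame and recomputing the flow as described above; the function names and the (H, W, 2) flow layout are assumptions.

```python
import numpy as np


def camera_flow(H, height, width):
    """Displacement of every pixel if it moved only according to homography H."""
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    denom = H[2, 0] * xs + H[2, 1] * ys + H[2, 2]
    u = (H[0, 0] * xs + H[0, 1] * ys + H[0, 2]) / denom - xs
    v = (H[1, 0] * xs + H[1, 1] * ys + H[1, 2]) / denom - ys
    return np.stack([u, v], axis=-1)


def stabilize_flow(flow, H):
    """Subtract the camera-induced displacement from an (H, W, 2) flow field."""
    height, width = flow.shape[:2]
    return flow - camera_flow(H, height, width)
```

For a pure camera translation the stabilized flow is zero everywhere except where something in the scene actually moves, which is why motion descriptors such as HOF and MBH benefit from it.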
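(4) Remove background trajectories
Remove trajectories by thresholding the maximal magnitude of the stabilized motion vectors in the warped optical flow: a trajectory whose maximal magnitude stays below the threshold is consistent with the camera motion and is discarded.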
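The thresholding rule can be written down directly. In this sketch each trajectory is assumed to be a list of stabilized per-frame displacement vectors (u, v) read from the warped flow; the 1-pixel threshold and the function names are assumptions for illustration.

```python
import math


def max_displacement(stabilized_vectors):
    """Largest per-frame motion magnitude along one trajectory."""
    return max(math.hypot(u, v) for u, v in stabilized_vectors)


def remove_background(trajectories, threshold=1.0):
    """Drop trajectories that barely move once camera motion is cancelled."""
    return [t for t in trajectories if max_displacement(t) >= threshold]


background = [(0.1, 0.0)] * 15                 # consistent with camera motion
foreground = [(0.2, 0.1)] * 14 + [(3.0, 0.5)]  # moves on its own in one frame
kept = remove_background([background, foreground])
```

Using the maximum rather than the mean keeps a trajectory as soon as it shows real motion in a single frame.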
(5)
Encode the descriptors with Fisher vectors
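For reference, a Fisher vector for a set of local descriptors can be computed roughly as follows. This is a sketch under assumptions: a diagonal-covariance GMM (weights, means, standard deviations) is taken as already trained, only the gradients with respect to means and standard deviations are kept, and power plus L2 normalization is applied at the end. The function name and argument layout are made up for this example.

```python
import numpy as np


def fisher_vector(X, weights, means, sigmas):
    """Fisher vector of descriptors X (N, D) under a diagonal GMM.

    weights: (K,), means: (K, D), sigmas: (K, D) standard deviations.
    Returns a vector of length 2 * K * D.
    """
    N = X.shape[0]
    diff = (X[:, None, :] - means[None]) / sigmas[None]        # (N, K, D)
    # Log of weighted component densities, up to a shared constant.
    log_p = (-0.5 * (diff ** 2).sum(-1)
             - np.log(sigmas).sum(-1)[None]
             + np.log(weights)[None])                            # (N, K)
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)                    # soft assignments
    g_mu = (gamma[:, :, None] * diff).sum(0) / (N * np.sqrt(weights)[:, None])
    g_sigma = ((gamma[:, :, None] * (diff ** 2 - 1)).sum(0)
               / (N * np.sqrt(2 * weights)[:, None]))
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                       # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)                     # L2 normalization
```

One such vector would be computed per descriptor type for each video and then fed to a classifier, typically a linear SVM.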
【Conclusion】
This paper improves dense trajectories by explicitly estimating camera motion. We show that the performance can be significantly improved by removing background trajectories and warping optical flow with a robustly estimated homography approximating the camera motion. Using a state-of-the-art human detector, potentially inconsistent matches can be removed during camera motion estimation, which makes it more robust. An extensive evaluation on four challenging datasets demonstrates the effectiveness of the proposed approach, and establishes new bounds of performance.
【References】
[1] Action Recognition with Improved Trajectories.pdf
[2] http://www.aiuxian.com/article/p-3173877.html
[3] http://slideplayer.com/slide/6955319/
[4] http://blog.csdn.net/breeze5428/article/details/32706507