
This document is the doctoral dissertation of Ming Yang (Northwestern University, USA), 141 pages in total.




Visual object tracking, i.e. consistently inferring the motion of a desired target from image sequences, is a must-have component to bridge low-level image processing techniques and high-level video content analysis. This has been an active and fruitful research topic in the computer vision community for decades due to both its versatile applications in practice, e.g. in human-computer interaction, security surveillance, robotics, medical imaging and multimedia applications, and its diverse impacts in theory, e.g. Bayesian inference on graphical models, particle filtering, kernel density estimation, and machine learning algorithms. However, long-term robust tracking in unconstrained environments remains a very challenging task, and the difficulties in reality are far from being conquered. The two core challenges of the visual object tracking task are the computational efficiency constraint and the enormous unpredictable variations in targets due to lighting changes, deformations, partial occlusions, camouflage, quick motion, imperfect image quality, etc. More critically, the tracking algorithms have to deal with these variations in an unsupervised manner. All the target variations in on-line applications are unpredictable, so it is extremely hard, if not impossible, to design universal target-specific or non-specific observation models in advance. Therefore, these challenges call for non-stationary target observation models and agile motion estimation paradigms that are intelligent and adaptive to different scenarios. In the thesis, we mainly focus on how to enhance the generality and reliability of object-level visual tracking, which strives to handle enormous variations and takes the computational efficiency constraint into consideration as well. We first present an in-depth analysis of the chicken-and-egg nature of on-line adaptation of target observation models directly using the previous tracking results.
Then, we propose two novel ideas to combat unpredictable variations: context-aware tracking and attentional tracking. In context-aware tracking, the tracker automatically discovers some auxiliary objects that have short-term motion correlation with the target. These auxiliary objects are regarded as the spatial contexts to enhance the target observation model and verify the tracking results. The attentional tracking algorithms enhance the robustness of the observation models by selectively focusing on some discriminative regions inside the targets, or by adaptively tuning the feature granularity and model elasticity. Context-aware tracking aims to search for external informative contexts of targets, whereas attentional tracking tries to identify internal discriminative characteristics of targets; thus they are complementary to each other in some sense. The proposed approaches can tolerate many typical difficult variations, thus greatly enhancing the robustness of the region-based object trackers. Besides single object tracking, we also introduce a new view of multiple target tracking from a game-theoretic perspective, which bridges the joint motion estimation and the Nash Equilibrium of a particular game and has linear complexity with respect to the number of targets. Extensive experiments on challenging real-world test video sequences demonstrate excellent and promising results of the proposed object-level visual tracking algorithms.

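1 Introduction
2 Related Work
3 On-line Adaptation of Appearance Models
4 Context-Aware Visual Tracking
5 Attentional Visual Tracking
6 Game-Theoretic Multiple Target Tracking
7 Conclusion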








