Learning-based tracking algorithms rest on a data-driven procedure. In most cases (with the exception of [13]), a binary classifier that discriminates the object of interest from the background is learned, either offline or online. If the classifier is learned offline, tracking is solved as a detection problem [17, 22]. In [16], temporal smoothing is enforced by casting the detector output as the observation likelihood in a particle filter. Avidan [1] proposes a support vector tracking algorithm that learns a support vector machine (SVM) from the training data using a polynomial kernel; the SVM score, after a Taylor expansion, is analytically maximized in every frame. In [24], Williams et al. build on the relevance vector machine to perform tracking, with temporal fusion applied. In [13], Lepetit et al. artificially generate exemplars (using affine transforms or a 3D model) for each feature point, treat each feature point as a class, and use 1-NN search to determine the class label.
If the classifier is learned online, appearance model updating is embedded in the classifier. In the work of Collins and Liu [6], the appearance model is represented by a set of features selected online according to the variance ratio of the log-likelihood function, which is estimated empirically. Ensemble tracking [2], developed by Avidan, uses AdaBoost to learn the classifier. After tracking, the classifier is updated by adding the most recently tracked results. Because AdaBoost acts as a feature selection process, the ensemble tracker also represents the appearance model by features that are updated over time.
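To make the variance-ratio criterion concrete, the following is a minimal sketch of how candidate features might be ranked online. It assumes the commonly cited form of the criterion, in which the log-likelihood ratio L(i) = log(max(p(i), delta) / max(q(i), delta)) is computed from empirical object and background histograms p and q, and the score is var(L; (p+q)/2) / (var(L; p) + var(L; q)). The function names, the `delta` floor, and the dictionary layout are illustrative assumptions, not details taken from [6].

```python
import numpy as np


def variance_ratio(obj_hist, bg_hist, delta=1e-3):
    """Score one feature by the variance ratio of its log-likelihood ratio.

    obj_hist, bg_hist: empirical histograms of the feature over object and
    background pixels (same binning). delta is an assumed floor that keeps
    the logarithm finite for empty bins.
    """
    p = np.asarray(obj_hist, dtype=float)
    q = np.asarray(bg_hist, dtype=float)
    p = p / p.sum()  # normalize to discrete distributions
    q = q / q.sum()

    # Per-bin log-likelihood ratio between object and background.
    log_ratio = np.log(np.maximum(p, delta) / np.maximum(q, delta))

    def var_under(a):
        # Variance of log_ratio when bins are weighted by distribution a.
        return np.sum(a * log_ratio ** 2) - np.sum(a * log_ratio) ** 2

    between = var_under(0.5 * (p + q))    # spread across both classes
    within = var_under(p) + var_under(q)  # spread inside each class
    return between / (within + 1e-12)     # small constant avoids 0/0


def rank_features(histograms):
    """Return feature names sorted from most to least discriminative.

    histograms: dict mapping a (hypothetical) feature name to a pair
    (obj_hist, bg_hist).
    """
    scores = {name: variance_ratio(p, q) for name, (p, q) in histograms.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, a feature whose object and background histograms are well separated receives a higher score than one whose histograms coincide, so calling `rank_features` on the current frame's histograms and keeping the top few names gives one plausible way to refresh the feature set over time.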
--- from paper : BoostMotion: Boosting a discriminative similarity function for motion estimation