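The general idea in object tracking is to track key points on the target. TLD also tracks points (though not keypoints such as SIFT features). Point tracking is done with optical flow, specifically the Pyramidal Lucas-Kanade tracker. I'll cover that some other time; for now I recommend the Lucas-Kanade Method part of Chapter 10 of Learning OpenCV. Here I only introduce the OpenCV function that implements it and skip the theory and implementation details.
First, the function that tracks points, calcOpticalFlowPyrLK. It finds where the tracked points from the previous frame are located in the current frame. It is called like this: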
calcOpticalFlowPyrLK( img_last,img_curr, points_last , points_curr, status, errs)
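The parameters should be self-explanatory. To add a little: a status of 1 means the corresponding point was found and 0 means it was not; errs is the error. Note: you can pass a single point or a set of points; for a set of points, status and errs are vectors with one entry per point.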
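Now for how the target is tracked. TLD uses the Median-Flow tracker, which the author proposed himself, and adds tracking failure detection on top of it.
Using Forward-Backward error to select the points to track
As mentioned above, TLD does not track keypoints; it tracks something simpler: points that persist stably. So which points are stable? The basic idea of the Median-Flow tracker is to look at the residual after tracking backward (the distance between where a point started and where it ends up after being tracked forward and then back again), and to use the median of all the points' residuals as the criterion for selecting stable points. The yellow point in the figure above, for example, is rejected because its residual is too large. Since stable points can be filtered out this way, there is no need to painstakingly search for keypoints; every point could simply be used as an initial tracking point. Well, every point is still too many, so the author uses the intersections of a grid as the initial tracking points (see the yellow dots inside the box in the figure below).
Flow chart of the Median-Flow tracker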
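To make the forward-backward idea concrete, here is a minimal sketch in plain C++ with no OpenCV dependency. `Pt`, `medianOf`, `fbErrors` and `stableIndices` are hypothetical names introduced only for this illustration; the back-tracked positions are assumed to have been produced already by running the tracker forward and then backward.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { float x, y; };  // hypothetical stand-in for cv::Point2f

// Upper median via nth_element; works on a copy so the caller's order is kept.
float medianOf(std::vector<float> v) {
    assert(!v.empty());
    const std::size_t n = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + n, v.end());
    return v[n];
}

// Forward-backward error of each point: the distance between the original
// point and the position obtained by tracking it forward and then backward.
std::vector<float> fbErrors(const std::vector<Pt>& original,
                            const std::vector<Pt>& backTracked) {
    assert(original.size() == backTracked.size());
    std::vector<float> err(original.size());
    for (std::size_t i = 0; i < original.size(); ++i) {
        err[i] = std::hypot(backTracked[i].x - original[i].x,
                            backTracked[i].y - original[i].y);
    }
    return err;
}

// Indices of the points whose error does not exceed the median error.
std::vector<std::size_t> stableIndices(const std::vector<float>& err) {
    const float med = medianOf(err);
    std::vector<std::size_t> keep;
    for (std::size_t i = 0; i < err.size(); ++i) {
        if (err[i] <= med) keep.push_back(i);
    }
    return keep;
}
```

Using the median as the threshold means the cut-off adapts to each frame pair instead of relying on a fixed pixel value.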
Now for the author's tracking function, TLD::track. It is called like this:
track(img1,img2,points1,points2)
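img1 is the previous frame and img2 is the current frame. points1 and points2 are both output parameters of this function: points1 holds the grid intersections obtained by dividing lastbox, the target region from the previous tracking step, into a grid (the yellow points on the left of the figure above), and points2 holds those points of points1 that appear stably in the current frame (the points in the right-hand image).
Below I walk through track following the flow chart above, adding the extra steps that TLD needs.
The TLD::track function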
- void TLD::track(const Mat& img1, const Mat& img2,vector<Point2f>& points1,vector<Point2f>& points2){
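- // Inputs: previous frame (img1), current frame (img2) and the member lastbox.
- // Outputs: points1 (grid points), points2 (tracked points), plus the members tbb, tconf, tvalid and tracked.
- // 1. Generate the grid points inside lastbox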
- bbPoints(points1,lastbox);
- if (points1.size()<1){
- printf("BB= %d %d %d %d, Points not generated\n",lastbox.x,lastbox.y,lastbox.width,lastbox.height);
- tvalid=false;
- tracked=false;
- return;
- }
- vector<Point2f> points = points1;
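- // 2-4. Frame-to-frame tracking with forward-backward error checking.
- // points is a working copy of points1 that trackf2f filters in place.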
- tracked = tracker.trackf2f(img1,img2,points,points2);
- if (tracked){
- // 5. Predict the bounding box in the current frame
- bbPredict(points,points2,lastbox,tbb);
- // 6. Failure detection: median FB error too large, or box outside the image
- if (tracker.getFB()>10 || tbb.x>img2.cols || tbb.y>img2.rows || tbb.br().x < 1 || tbb.br().y <1){
- tvalid =false;
- tracked = false;
- printf("Too unstable predictions FB error=%f\n",tracker.getFB());
- return;
- }
- // 7. Estimate confidence and validity
- Mat pattern;
- Scalar mean, stdev;
- BoundingBox bb;
- bb.x = max(tbb.x,0);
- bb.y = max(tbb.y,0);
- bb.width = min(min(img2.cols-tbb.x,tbb.width),min(tbb.width,tbb.br().x));
- bb.height = min(min(img2.rows-tbb.y,tbb.height),min(tbb.height,tbb.br().y));
- getPattern(img2(bb),pattern,mean,stdev);
- vector<int> isin;
- float dummy;
- classifier.NNConf(pattern,isin,dummy,tconf);
- tvalid = lastvalid;
- if (tconf>classifier.thr_nn_valid){
- tvalid =true;
- }
- }
- else
- printf("No points tracked\n");
- }
1.Initialize points to grid
Divide bb into a 10*10 grid and store the grid intersections in points. The function is TLD::bbPoints.
-
- void TLD::bbPoints(vector<cv::Point2f>& points,const BoundingBox& bb){
- int max_pts=10;
- int margin_h=0;
- int margin_v=0;
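- // Note: both operands are ints, so the division truncates before ceil() is applied;
- // a box narrower than max_pts pixels would give a step of 0 and the loops below would never advance.
- int stepx = ceil((bb.width-2*margin_h)/max_pts);
- int stepy = ceil((bb.height-2*margin_v)/max_pts);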
- for (int y=bb.y+margin_v;y<bb.y+bb.height-margin_v;y+=stepy){
- for (int x=bb.x+margin_h;x<bb.x+bb.width-margin_h;x+=stepx){
- points.push_back(Point2f(x,y));
- }
- }
- }
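As a hedged variation on the grid step, here is a standalone sketch that uses floating-point division, so that the rounding up really happens, and clamps the step to at least 1. This deliberately differs from the listing above, which divides integers, so the number of points it yields can differ from the author's code. `Pt`, `Box` and `gridPoints` are hypothetical names for this illustration, and the margins are left out because they are 0 in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Pt { float x, y; };                // hypothetical stand-in for cv::Point2f
struct Box { int x, y, width, height; };  // hypothetical stand-in for BoundingBox

// Returns grid points inside bb, at most maxPts per row and per column.
std::vector<Pt> gridPoints(const Box& bb, int maxPts = 10) {
    assert(maxPts > 0);
    // Floating-point division so ceil() rounds up; clamp to 1 so that the
    // loops always advance, even for boxes narrower than maxPts pixels.
    const int stepX = std::max(1, static_cast<int>(
        std::ceil(bb.width / static_cast<double>(maxPts))));
    const int stepY = std::max(1, static_cast<int>(
        std::ceil(bb.height / static_cast<double>(maxPts))));
    std::vector<Pt> pts;
    for (int y = bb.y; y < bb.y + bb.height; y += stepY) {
        for (int x = bb.x; x < bb.x + bb.width; x += stepX) {
            pts.push_back({static_cast<float>(x), static_cast<float>(y)});
        }
    }
    return pts;
}
```

For a 100x50 box this gives steps of 10 and 5 and therefore a 10x10 grid; an empty box gives no points, which is the case TLD::track checks for before tracking.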
2.Track points
3.Estimate tracking error
4.Filter out outliers
These three steps all happen in the function trackf2f. The call chain is tld.processFrame->track->[tracked = tracker.trackf2f(img1,img2,points,points2)]
-
- bool LKTracker::trackf2f(const Mat& img1, const Mat& img2,vector<Point2f> &points1, vector<cv::Point2f> &points2){
- // Forward tracking (img1 -> img2), then backward tracking (img2 -> img1)
- calcOpticalFlowPyrLK( img1,img2, points1, points2, status,similarity, window_size, level, term_criteria, lambda, 0);
- calcOpticalFlowPyrLK( img2,img1, points2, pointsFB, FB_status,FB_error, window_size, level, term_criteria, lambda, 0);
- // Forward-backward error: distance between each original point and its back-tracked position
- for( int i= 0; i<points1.size(); ++i ){
- FB_error[i] = norm(pointsFB[i]-points1[i]);
- }
- // Compute the patch similarity (NCC), then filter on similarity and FB error
- normCrossCorrelation(img1,img2,points1,points2);
- return filterPts(points1,points2);
- }
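normCrossCorrelation(img1,img2,points1,points2) is there because the optical-flow result alone is not fully trusted: it compares the small patches around each point before and after tracking so that more unstable points can be removed. The similarity measure used here is normalized cross-correlation (NCC). The formula is somewhat involved, so I suggest looking it up on Wikipedia; in essence it is still the correlation coefficient mentioned earlier, except that the mean has to be subtracted as part of the computation.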
- void LKTracker::normCrossCorrelation(const Mat& img1,const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2) {
- Mat rec0(10,10,CV_8U);
- Mat rec1(10,10,CV_8U);
- Mat res(1,1,CV_32F);
- for (int i = 0; i < points1.size(); i++) {
- if (status[i] == 1) {
- getRectSubPix( img1, Size(10,10), points1[i],rec0 );
- getRectSubPix( img2, Size(10,10), points2[i],rec1);
- matchTemplate( rec0,rec1, res, CV_TM_CCOEFF_NORMED);
- similarity[i] = ((float *)(res.data))[0];
-
-
- } else {
- similarity[i] = 0.0;
- }
- }
- rec0.release();
- rec1.release();
- res.release();
- }
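For reference, the quantity that matchTemplate returns with CV_TM_CCOEFF_NORMED for two patches of equal size is the zero-mean normalized cross-correlation: the sum of products of the mean-subtracted pixels, divided by the square root of the product of their sums of squares. A minimal standalone sketch follows; `zncc` is a hypothetical helper, the patches are assumed to be flattened into arrays, and returning 0 for a flat patch is a convention chosen for this sketch rather than something taken from the source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Zero-mean normalized cross-correlation of two equally sized patches,
// each given as a flat array of pixel intensities. Result lies in [-1, 1].
double zncc(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size() && !a.empty());
    const double n = static_cast<double>(a.size());
    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        meanA += a[i];
        meanB += b[i];
    }
    meanA /= n;
    meanB /= n;

    double num = 0.0, varA = 0.0, varB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - meanA;  // subtract the mean of each patch
        const double db = b[i] - meanB;
        num += da * db;
        varA += da * da;
        varB += db * db;
    }
    const double denom = std::sqrt(varA * varB);
    // A flat patch has zero variance; this sketch returns 0 in that case.
    return denom > 0.0 ? num / denom : 0.0;
}
```

Because the means are subtracted and the result is normalized, the score is unaffected by a uniform brightness offset or gain between the two frames.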
Everything that needed computing has now been computed, so the points can finally be filtered with filterPts(points1,points2):
-
- bool LKTracker::filterPts(vector<Point2f>& points1,vector<Point2f>& points2){
- // First pass: keep the points that were found and whose similarity is above the median
- simmed = median(similarity);
- size_t i, k;
- for( i=k = 0; i<points2.size(); ++i ){
- if( !status[i])
- continue;
- if(similarity[i]> simmed){
- points1[k] = points1[i];
- points2[k] = points2[i];
- FB_error[k] = FB_error[i];
- k++;
- }
- }
- if (k==0)
- return false;
- points1.resize(k);
- points2.resize(k);
- FB_error.resize(k);
- // Second pass: of those, keep the points whose FB error is at most the median
- fbmed = median(FB_error);
- for( i=k = 0; i<points2.size(); ++i ){
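- // No status check here: every point that survived the first pass was already found,
- // and status[] was not compacted, so status[i] no longer corresponds to index i.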
- if(FB_error[i] <= fbmed){
- points1[k] = points1[i];
- points2[k] = points2[i];
- k++;
- }
- }
- points1.resize(k);
- points2.resize(k);
- if (k>0)
- return true;
- else
- return false;
- }
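One way to sidestep index bookkeeping across several parallel vectors is to keep everything that belongs to a point in a single record. The sketch below shows the same two-stage median filtering in that style. `TrackedPoint`, `medianOf` and `filterPoints` are hypothetical names, and the record only carries an id; in real use it would also hold the two point coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct TrackedPoint {   // hypothetical record: one entry per tracked point
    int id;
    float similarity;   // patch similarity (NCC); 0 for points that were lost
    float fbError;      // forward-backward error in pixels
    bool found;         // status flag reported by the optical-flow call
};

// Upper median via nth_element; works on a copy.
float medianOf(std::vector<float> v) {
    assert(!v.empty());
    const std::size_t n = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + n, v.end());
    return v[n];
}

std::vector<TrackedPoint> filterPoints(const std::vector<TrackedPoint>& in) {
    if (in.empty()) return {};

    // Stage 1: keep found points whose similarity is above the median.
    std::vector<float> sims;
    for (const TrackedPoint& p : in) sims.push_back(p.similarity);
    const float simMed = medianOf(sims);
    std::vector<TrackedPoint> pass1;
    for (const TrackedPoint& p : in) {
        if (p.found && p.similarity > simMed) pass1.push_back(p);
    }
    if (pass1.empty()) return pass1;

    // Stage 2: of the survivors, keep those with FB error at most the median.
    std::vector<float> errs;
    for (const TrackedPoint& p : pass1) errs.push_back(p.fbError);
    const float fbMed = medianOf(errs);
    std::vector<TrackedPoint> pass2;
    for (const TrackedPoint& p : pass1) {
        if (p.fbError <= fbMed) pass2.push_back(p);
    }
    return pass2;
}
```

An empty result corresponds to filterPts returning false.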
5.Update bounding box
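bbPredict(points,points2,lastbox,tbb): points and points2 are the point pairs that survived the filtering above. The displacement and scale change of bb1 (the lastbox argument) are now estimated from points and points2; once both are known, the position tbb of lastbox in the current frame follows.
Displacement estimation
The displacement is estimated as the median of the x and y displacements of all the points, as in the figure above. The scale is estimated as the median, over all pairs of points, of the ratio between the pair's distance in the current frame and its distance in the previous frame. For a single pair of points, the scale ratio is estimated as in the figure below:
Scale estimation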
-
- void TLD::bbPredict(const vector<cv::Point2f>& points1,const vector<cv::Point2f>& points2,
- const BoundingBox& bb1,BoundingBox& bb2) {
- int npoints = (int)points1.size();
- vector<float> xoff(npoints);
- vector<float> yoff(npoints);
- printf("tracked points : %d\n",npoints);
- // Per-point displacement between the two frames
- for (int i=0;i<npoints;i++){
- xoff[i]=points2[i].x-points1[i].x;
- yoff[i]=points2[i].y-points1[i].y;
- }
- float dx = median(xoff);
- float dy = median(yoff);
- float s;
- // Scale: median ratio of pairwise distances (current frame / previous frame)
- if (npoints>1){
- vector<float> d;
- d.reserve(npoints*(npoints-1)/2);
- for (int i=0;i<npoints;i++){
- for (int j=i+1;j<npoints;j++){
- d.push_back(norm(points2[i]-points2[j])/norm(points1[i]-points1[j]));
- }
- }
- s = median(d);
- }
- else {
- s = 1.0;
- }
- float s1 = 0.5*(s-1)*bb1.width;
- float s2 = 0.5*(s-1)*bb1.height;
- printf("s= %f s1= %f s2= %f \n",s,s1,s2);
- bb2.x = round( bb1.x + dx -s1);
- bb2.y = round( bb1.y + dy -s2);
- bb2.width = round(bb1.width*s);
- bb2.height = round(bb1.height*s);
- printf("predicted bb: %d %d %d %d\n",bb2.x,bb2.y,bb2.br().x,bb2.br().y);
- }
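To check the arithmetic by hand, here is a standalone sketch of the same median-based box update without OpenCV. `Pt`, `Box`, `medianOf` and `predictBox` are hypothetical names for this illustration; skipping pairs of coincident points to avoid dividing by zero is an addition of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { float x, y; };                // hypothetical stand-in for cv::Point2f
struct Box { int x, y, width, height; };  // hypothetical stand-in for BoundingBox

// Upper median via nth_element; works on a copy.
float medianOf(std::vector<float> v) {
    assert(!v.empty());
    const std::size_t n = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + n, v.end());
    return v[n];
}

Box predictBox(const std::vector<Pt>& p1, const std::vector<Pt>& p2,
               const Box& bb1) {
    assert(p1.size() == p2.size() && !p1.empty());
    const std::size_t n = p1.size();

    // Displacement: median of the per-point x and y offsets.
    std::vector<float> xoff(n), yoff(n);
    for (std::size_t i = 0; i < n; ++i) {
        xoff[i] = p2[i].x - p1[i].x;
        yoff[i] = p2[i].y - p1[i].y;
    }
    const float dx = medianOf(xoff);
    const float dy = medianOf(yoff);

    // Scale: median ratio of pairwise distances (current / previous frame).
    std::vector<float> ratios;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const float d1 = std::hypot(p1[i].x - p1[j].x, p1[i].y - p1[j].y);
            const float d2 = std::hypot(p2[i].x - p2[j].x, p2[i].y - p2[j].y);
            if (d1 > 0.0f) ratios.push_back(d2 / d1);  // skip coincident points
        }
    }
    const float s = ratios.empty() ? 1.0f : medianOf(ratios);

    // Grow or shrink the box around its centre, then shift it by (dx, dy).
    const float sx = 0.5f * (s - 1.0f) * bb1.width;
    const float sy = 0.5f * (s - 1.0f) * bb1.height;
    Box bb2;
    bb2.x = static_cast<int>(std::lround(bb1.x + dx - sx));
    bb2.y = static_cast<int>(std::lround(bb1.y + dy - sy));
    bb2.width = static_cast<int>(std::lround(bb1.width * s));
    bb2.height = static_cast<int>(std::lround(bb1.height * s));
    return bb2;
}
```

As a worked case, take the box (10,10,20,20) with points at its four corners and its centre. If every point moves away from the centre by a factor of 2 and the centre shifts by (5,0), the median offset is (5,0) and the median ratio is 2, so the predicted box is (5,0,40,40).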
6.Failure detection
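This step is simple. The paper declares a failure of the tracker when the median residual exceeds 10 pixels, which is the threshold that appears in the code below; the residual is the distance between a backward-tracked point and its original position. The program additionally guards against the target flying out of the image.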
- if (tracker.getFB()>10 || tbb.x>img2.cols || tbb.y>img2.rows || tbb.br().x < 1 || tbb.br().y <1){
- tvalid =false;
- tracked = false;
- printf("Too unstable predictions FB error=%f\n",tracker.getFB());
- return;
- }
7.Estimate Confidence and Validity
- Mat pattern;
- Scalar mean, stdev;
- BoundingBox bb;
- bb.x = max(tbb.x,0);
- bb.y = max(tbb.y,0);
- bb.width = min(min(img2.cols-tbb.x,tbb.width),min(tbb.width,tbb.br().x));
- bb.height = min(min(img2.rows-tbb.y,tbb.height),min(tbb.height,tbb.br().y));
- getPattern(img2(bb),pattern,mean,stdev);
- vector<int> isin;
- float dummy;
- classifier.NNConf(pattern,isin,dummy,tconf);
- tvalid = lastvalid;
- if (tconf>classifier.thr_nn_valid){
- tvalid =true;
- }
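This part is straightforward. You can ignore the part that decides whether the trajectory is valid for now; just know that it uses the Conservative Similarity [5.2] of the nearest-neighbor classifier as the score of the tracked target. That score will later be compared against the detector.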
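Going back to the box bb computed at the start of step 7: its purpose is to restrict tbb to the part that lies inside the image before the patch is extracted. A hedged alternative is to write this as a plain rectangle intersection, which also covers a box that sticks out on both sides at once. `Box` and `clipToImage` are hypothetical names for this sketch.

```cpp
#include <algorithm>
#include <cassert>

struct Box { int x, y, width, height; };  // hypothetical stand-in for BoundingBox

// Intersection of box b with the image rectangle [0, cols) x [0, rows).
// Width and height are 0 when the box lies entirely outside the image.
Box clipToImage(const Box& b, int cols, int rows) {
    assert(cols >= 0 && rows >= 0);
    const int x0 = std::max(b.x, 0);
    const int y0 = std::max(b.y, 0);
    const int x1 = std::min(b.x + b.width, cols);
    const int y1 = std::min(b.y + b.height, rows);
    return {x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

For a box that overflows on only one side this gives the same result as the min/max expression in the listing; for example, (-5,-5,20,20) in a 100x100 image becomes (0,0,15,15).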