MATLAB的KLT算法实现视频中人脸跟踪检测

上次已经用卡尔曼滤波实现了人脸跟踪。也取得了一定的效果。。但仍存在一些问题,如人侧脸的时候跟踪效果并不理想。而KLT算法则能很好的解决这个问题。本文一共两部分。。理论部分借鉴博了客园上的博主(https://www.cnblogs.com/moondark/archive/2012/05/12/2497391.html)

一、理论
 近来在研究跟踪,跟踪的方法其实有很多,如粒子滤波(pf)、meanshift跟踪,以及KLT跟踪或叫Lucas光流法,这些方法各自有各自的有点,对于粒子滤波而言,它能够比较好的在全局搜索到最优解,但其求解速度相对较慢,由于其是基于颜色直方图的计算,所以对相同颜色东西不太能够区别,meanshift方法很容易陷入局部最优,但速度还是挺快,所以现在很有一些人是将meanshift跟pf结合做跟踪,恰好在很多方面能够互补。

Kanade-Lucas-Tomasi方法,在跟踪方面表现的也不错,尤其在实时计算速度上,用它来得到的,是很多点的轨迹“trajectory”,并且还有一些发生了漂移的点,所以,得到跟踪点之后要进行一些后期的处理。这里讲的是一种图像点定位的方法,即图像的局部匹配,将图像匹配问题,从传统的滑动窗口搜索方法变为一个求解偏移量d的过程。在求解d的过程中,哪些情况下可以保证一定能够得到d的解,这些情况的点有什么特点(后来会发现,很多时候都是寻找的角点)。
  
  先说KLT算法的几个前提假设:
  1)亮度恒定
  2)时间连续或者是运动是“小运动”
3)空间一致,临近点有相似运动,保持相邻

这几个为什么要这么假设,很直观的讲,如果判断一个视频的相邻两帧I、J在某局部窗口w上是一样的,则在窗口w内有:I(x, y, t) = J(x’, y’, t+τ),亮度恒定的假设(假设1)即为了保证其等号成立不受亮度的影响,假设2是为了保证KLT能够找到点,假设3则为以下原因假设(即对于同一个窗口中,所有点的偏移量都相等):
在窗口w上,所有(x, y)都往一个方向移动了(dx, dy),从而得到(x’, y’),即t时刻的(x, y)点在t+τ时刻为(x+dx, y+dy),所以寻求匹配的问题可化为对以下的式子寻求最小值,或叫做最小化以下式子:
在这里插入图片描述

用积分来表示上述式子,以上式子可等效为:
在这里插入图片描述

这个式子的含义,即找到两副图像中,在W窗口中,I、J的差异,其中I以x-d/2为中心,J以x+d/2为中心,w/2为半径的一个矩形窗口间的差异,好吧,结合我们微积分的知识,函数ε(d)要取得最小值,这个极值点的导数一定为0,即
  在这里插入图片描述

的值为0,由泰勒展开的性质:
在这里插入图片描述

可以得到:
在这里插入图片描述
在这里插入图片描述

于是,问题转化为:
在这里插入图片描述

其中:
在这里插入图片描述
从而,问题即为:
在这里插入图片描述
在这里插入图片描述
即其等式可看作为:
在这里插入图片描述
其中,Z为一个22的矩阵,e为一个21的向量,
在这里插入图片描述
在这里插入图片描述
为了要使d能够得到解,则Z需要满足条件,即Z*Z’矩阵可逆,其中Z’为Z矩阵的转置(ZT),在一般情况下,角点具有这样的特点。

二、代码
程序部分是根据做了两个视频的案例来做的。话不多说,直接上代码

function [faceDet,faceTracks] = trackFaces( videoFilename,fittingModel,outFilename,visualise )
%TRACKFACES tracks all faces in a video
%   INPUT
%       videofilename - path to input video
%       fittingModel - model options for face detection
%       outfilename - path where we want to store our face detections
%       visualise - boolean to turn on visualisation
%   OUTPUT
%       faceDet - bounding box detections indexed by frame id
%       faceTracks - bounding box detections indexed by track id

%  define track structure (track id based)
%     faceTracks(trackID).bbox;
%     faceTracks(trackID).frameid;
    
%  define detection structure (frame id based)
%     faceDet(frameID).trackIDs;
%     faceDet(frameID).bboxes;
%
%   Written by James Charles at University of Leeds 27/04/16
%
%   If you use it then please cite the paper (for which is was developed): 
%     @Article{Charles16b,
%       author       = "Charles, J. and Magee, D. and Hogg.",
%       title        = "Virtual Immortality: Reanimating characters from TV shows",
%       booktitle    = "ECCV Workshop on Virtual/Augmented Reality for Visual Artificial Intelligence (VARVAI)",	
%       year         = "2016",
%     }
%
%   This work is licensed under the Creative Commons Attribution 4.0 International License. 
%   To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or 
%   send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

    faceTracks = [];
    faceDet = [];
    h_img = []; %for visualisation only (image handle)

    %load video object
    vidobj = VideoReader(videoFilename);
    
    %setup face detector
    faceDetector = vision.CascadeObjectDetector();
    faceDetector.MaxSize = round(fittingModel.maxsize*fittingModel.facedetect.imscale);
    faceDetector.MinSize = round(fittingModel.minsize*fittingModel.facedetect.imscale);
    faceDetector.MergeThreshold = fittingModel.facedetect.sensitivity;
    
    %setup trackers
    trackers(1).maxID = 0;
    trackers(1).numActive = 0;
    
    %get starting frame index based on current saved output and also
    %tracker info
    [startIndex,trackers] = getSavedIndex(outFilename);
    
    %load the saved tracks
    if exist(outFilename,'file');
        load(outFilename);
    end
    
    %loop over frames and track
    for i = startIndex:vidobj.NumberOfFrames
        if mod(i,1000)==0
            fprintf('frame %d of %d, number of active trackers: %d\n',i,vidobj.NumberOfFrames,trackers(1).numActive);
        end

        img = read(vidobj,i);
        
        %track all previous faces to this frame
        [faceDet,faceTracks,trackers] = track(faceDet,faceTracks,trackers,img,i);
        
        %detect all faces
        facebboxes = detectFaces(faceDetector,img);
        
        %start new tracks if we detect new faces
        [faceDet,faceTracks,trackers] = addNewFaces(faceDet, faceTracks,trackers,facebboxes,img,i);
        
        %save output every 1000 frames
        saveOutput(outFilename,faceDet,faceTracks,trackers,vidobj.NumberOfFrames,i,1000); 
        
        %visualise
        if visualise
            h_img = vis(img,faceDet,trackers,i,h_img);
        end
    end
    
    %produce tracking structure from faceDet
    if isempty(faceTracks)
        faceTracks = dets2tracks(faceDet);
        frameID = vidobj.NumberOfFrames;
        save(outFilename,'faceDet','frameID','faceTracks');
    end
end

%function to convert faceDet structure to a faceTracks structure
function faceTracks = dets2tracks(faceDet)
    faceTracks(1).bboxes = [];
    faceTracks(1).frameIDs = [];
    
    for frameID = 1:numel(faceDet)
        if ~isempty(faceDet(frameID).trackIDs)
            for t = 1:numel(faceDet(frameID).trackIDs)
                trackID = faceDet(frameID).trackIDs(t);
                if trackID > numel(faceTracks)
                    faceTracks(trackID).bboxes = faceDet(frameID).bboxes(t,:);
                    faceTracks(trackID).frameIDs = frameID;
                else
                    faceTracks(trackID).bboxes = cat(1,faceTracks(trackID).bboxes,faceDet(frameID).bboxes(t,:));
                    faceTracks(trackID).frameIDs = cat(2,faceTracks(trackID).frameIDs,frameID);
                end
            end
        end
    end
end

%function to return start index for tracking
function [index,trackers] = getSavedIndex(outFilename)
    if ~exist(outFilename,'file')
        index = 1;
        trackers(1).maxID = 0;
        trackers(1).numActive = 0;
    else
        data = load(outFilename);
        index = data.frameID + 1; 
        trackers = data.trackers;
    end
end

%function to save the tracks
function saveOutput(outFilename,faceDet,faceTracks,trackers,maxFrames,frameID,stepSize)
    if frameID == maxFrames || mod(frameID,stepSize)==0
        save(outFilename,'faceDet','faceTracks','frameID','trackers');
    end
end

%track all faces to the next frame (destroys trackers when neccessary)
function [faceDet,faceTracks,trackers] = track(faceDet,faceTracks,trackers,img,frameID)
    if trackers(1).maxID == 0 || trackers(1).numActive == 0 
        return
    else
        %for each tracker track bbox to new image
        for i = 1:numel(faceDet(frameID-1).trackIDs)
            trackerID = find(faceDet(frameID-1).trackIDs(i)==cat(1,trackers(:).trackID));  %find tracker associated with this face
            if isempty(trackerID); continue; end %if tracker destroyed then skip
            [isGood,facebbox,tracker] = trackBBox(trackers(trackerID),img,faceDet(frameID-1).bboxes(i,:)); %track the face forward in time
            
            %update faceDet and faceTracks with new data
            if isGood
                [faceDet,faceTracks] = updateFace(faceDet,faceTracks,facebbox,trackers(trackerID).trackID,frameID);
                %update the tracker
                trackers(trackerID).points = tracker.oldPoints;
                trackers(trackerID).tracker = tracker.tracker;
            else
                if numel(faceDet)<frameID
                    faceDet(frameID).bboxes = [];
                end
                %we need to remove this tracker and let the system
                %initialise a new one when it gets a good face detection
                trackers(1).numActive = max(0,trackers(1).numActive - 1);
                release(trackers(trackerID).tracker);
                
                if trackerID == 1
                    if numel(trackers)==1
                        trackers(trackerID).points = [];
                        trackers(trackerID).tracker = [];
                    else
                        maxID = trackers(1).maxID;
                        numActive = trackers(1).numActive;
                        trackers(trackerID) = [];
                        trackers(trackerID).maxID = maxID;
                        trackers(trackerID).numActive = numActive;
                    end
                else
                    trackers(trackerID) = [];
                end
            end
        end
    end
end

function [isGood, facebbox, tracker] = trackBBox(tracker,img,facebbox)
    isGood = false;
    tracker.oldpoints = tracker.points;
    [points, isFound] = step(tracker.tracker, img);
    tracker.points = points(isFound, :);
    oldInliers = tracker.oldpoints(isFound, :);
            
    if size(tracker.points, 1) >= 3 % need at least 4 points
        % Estimate the geometric transformation between the old points
        % and the new points and eliminate outliers
        [xform, oldInliers, visiblePoints] = estimateGeometricTransform(...
            oldInliers, tracker.points, 'similarity', 'MaxDistance', 5);

        % Position bounding box at centre of points
        centre = mean(tracker.points);
        tempFacebbox = [centre(1)-facebbox(3)/2,centre(2)-facebbox(4)/2,facebbox(3),facebbox(4)];
        tempFacebbox = round(tempFacebbox);

        %check if tempFacebbox very different, if so track is not good
        A1 = prod(facebbox(3:4)); %area of new bounding box
        A2 = prod(tempFacebbox(3:4)); %area of old bounding box
        %compute intersection area
        U = rectint(facebbox,tempFacebbox);
            
        %calculate a similarity score and threshold at 0.7
        similarity = (U/A1 + U/A2)/2;
        if similarity > 0.7
            isGood = true;
        else
            isGood = false;
        end
            
        facebbox = tempFacebbox;
        tracker.oldPoints = visiblePoints;
        setPoints(tracker.tracker, tracker.oldPoints);
    end
end

function [faceDet,faceTracks,trackers] = addNewFaces(faceDet, faceTracks,trackers,facebboxes,img,frameID)
    %for each box see if it is a new face
    if ~isempty(facebboxes)
        numDets = size(facebboxes,1);
        for i = 1:numDets
            if ~isempty(faceDet)
                [isNew,matchingFaceID] = isNewFace(faceDet,facebboxes(i,:),frameID);
                if isNew
                    [faceDet,faceTracks,trackers] = addFace(faceDet,faceTracks,trackers,facebboxes(i,:),img,frameID);
                else
                    %update tracker with points in detected face
                    if ~isempty(matchingFaceID)
                        trackerID = find(cat(1,trackers(:).trackID)==faceDet(frameID).trackIDs(matchingFaceID));
                        if ~isempty(trackerID)
                            %re-initialise the tracker
                            trackers(trackerID).points = cat(1,trackers(trackerID).points,getFeaturePoints(img,facebboxes(i,:)));

                            %randomly sample at most 30 points
                            Ridx = randperm(size(trackers(trackerID).points,1));
                            trackers(trackerID).points = trackers(trackerID).points(Ridx(1:min(numel(Ridx),30)),:);
                            setPoints(trackers(trackerID).tracker, trackers(trackerID).points);
                        end
                    end
                end
            else
                [faceDet,faceTracks,trackers] = addFace(faceDet,faceTracks,trackers,facebboxes(i,:),img,frameID);
            end
        end
    end
end

%this function adds a new face and initialises a new tracker
function [faceDet,faceTracks,trackers] = addFace(faceDet,faceTracks,trackers,facebbox,img,frameID)
    numActiveTrackers = trackers(1).numActive;
    
    %get features
    points = getFeaturePoints(img,facebbox);
    
    if ~isempty(points)
        %setup new tracker
        trackerID = numActiveTrackers+1;
        trackers(1).maxID = trackers(1).maxID + 1; %maxID stores the total number of trackers so far
        trackers(1).numActive = trackers(1).numActive + 1;
        trackers(trackerID).trackID = trackers(1).maxID;
        trackers(trackerID).tracker = vision.PointTracker('MaxBidirectionalError', 2);

        %get features to track
        trackers(trackerID).points = points;

        %initialise the tracker
        initialize(trackers(trackerID).tracker,trackers(trackerID).points,img);

        %update the faceDet and faceTracks structure
        [faceDet,faceTracks] = updateFace(faceDet,faceTracks,facebbox,trackers(trackerID).trackID,frameID);
    end
end

%this function updates the faceDet and faceTracks structure with a new
%bounding box
function [faceDet,faceTracks] = updateFace(faceDet,faceTracks,facebbox,trackID,frameID)

    if ~isempty(faceDet)
        if numel(faceDet)<frameID % add new face and frame
            faceDet(frameID).trackIDs = trackID;
            faceDet(frameID).bboxes = facebbox;
            faceDet(frameID).landmarks = [];
        else %append face to current frame
            faceDet(frameID).trackIDs = cat(2,faceDet(end).trackIDs,trackID);
            faceDet(frameID).bboxes = cat(1,faceDet(end).bboxes,facebbox);
            faceDet(frameID).landmarks = [];
        end
    else
        faceDet(frameID).trackIDs = trackID;
        faceDet(frameID).bboxes = facebbox;
        faceDet(frameID).landmarks = [];
    end

end

function bboxes = detectFaces(faceDetector,img)
    bboxes = step(faceDetector, img);   
end

%check if the current bounding box is a new detection
function [newface,matchingFaceID] = isNewFace(faceDet,facebbox,frameID)
    newface = false;
    matchingFaceID = [];
    if isempty(faceDet)
        newface = true;
    elseif numel(faceDet)<frameID
        newface = true;
    else
        %check overlap of facebbox with bounding boxes present in faceDet
        numBBoxes = size(faceDet(frameID).bboxes,1);
        A1 = prod(facebbox(3:4)); %area of new bounding box
        for i = 1:numBBoxes
            A2 = prod(faceDet(frameID).bboxes(i,3:4)); %area of old bounding box
            %compute intersection area
            U = rectint(facebbox,faceDet(frameID).bboxes(i,:));
            
            %calculate a similarity score and threshold at 0.9
            similarity = (U/A1 + U/A2)/2;
            if similarity > 0.6
                newface = false;
                matchingFaceID = i;
                break;
            else
                newface = true;
            end
        end
    end
end

%function to convert a bounding box to points (unused)
function points = bbox2point(bbox)
    points = zeros(4,2);
    points(1,:) = bbox(1:2);
    points(2,:) = points(1,:) + [0 bbox(3)];
    points(3,:) = points(2,:) + [bbox(4) 0];
    points(4,:) = points(1,:) + [bbox(4) 0];
end

%function to convert points to a bounding box (unused)
function bbox = point2bbox(points)
    bbox = zeros(1,4);
    bbox(1) = min(points(:,1));
    bbox(2) = min(points(:,2));
    bbox(3) = max(points(:,1))-bbox(1);
    bbox(4) = max(points(:,2))-bbox(2);
end

%function to return feature points given bounding box and image
function locations = getFeaturePoints(img,bbox)
%     points = detectMinEigenFeatures(rgb2gray(img), 'ROI', bbox);
    points = detectBRISKFeatures(rgb2gray(img), 'ROI', bbox,'MinContrast',0.001);%ROI感兴趣的部分
    locations = points.Location;
    if ~isempty(locations)

        %get cluster close to centre of points
        centre = mean(locations);
        dist = sqrt(sum(bsxfun(@minus,locations,centre).^2,2));
        i = 1;
        idrem = ones(1,size(locations,1));
        while sum(idrem)>=size(locations,1)
            idrem = dist>15*i;
            i = i + 0.5;
        end
        locations(idrem,:) = [];
    end
end

%function to visualise the output
function h_img = vis(img,faceDet,trackers,frameID,h_img)
    if isempty(h_img) %create figure and image pane
        figure
        h_img = imagesc(img); axis image;
    end
    
    if numel(faceDet)>=frameID
        for b = 1:size(faceDet(frameID).bboxes,1)
            img = insertObjectAnnotation(img,'rectangle',faceDet(frameID).bboxes(b,:),sprintf('track: %d',faceDet(frameID).trackIDs(b)));
        end
        for b = 1:size(faceDet(frameID).bboxes,1)
            trackerID = find(cat(1,trackers(:).trackID)==faceDet(frameID).trackIDs(b));
            if ~isempty(trackerID)
                plotPoints = cat(2,trackers(trackerID).points(:,1:2),2*ones(size(trackers(trackerID).points,1),1));
                img = insertObjectAnnotation(img,'circle',plotPoints,' ','TextBoxOpacity',0,'Color','white');
            end
        end

    end
    set(h_img,'cdata',img);
    drawnow  
end






主函数部分就不在此展示了,可以根据自己的需要来写。。对于这部分的代码是需要花点时间去理解和消化的。。跟踪视频没法放两张截图吧在这里插入图片描述
在这里插入图片描述

值得注意的是,KLT算法已经得到了更好的优化,在跟踪检测方面进一步完善。另外KLT在DSP板C6000中也有很广泛的应用,在军事领域,安防监控等方面也具有重要意义。DSP的处理器能很好地运行KLT。对以后的硬件学习中也很有帮助。

PS:谈一点新人对matlab机器学习或者是深度学习的感受吧。。现在明显感觉到matlab在深度学习的劣势了。。运行程序很费力,一两分钟的视频就可能把电脑跑卡。这个方向也确实很吃硬件。另一方面,matlab工程领域上很难在硬件开发板上运行。在转成C语言上也存在很多问题。。Opencv在这方面上稍微有优势,毕竟还有树莓派,openmv等硬件可以跑代码。。兼学python吧。。

评论 8
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值