Last time I implemented face tracking with a Kalman filter and got decent results, but some problems remained: tracking is poor when the subject turns to show a profile. The KLT algorithm handles this case well. This post has two parts; the theory part draws on a post by a blogger on cnblogs (https://www.cnblogs.com/moondark/archive/2012/05/12/2497391.html).
1. Theory
I have been looking into tracking recently. There are many tracking methods, such as particle filters (PF), meanshift tracking, and KLT tracking (also called the Lucas optical-flow method), and each has its strengths. A particle filter can search globally for a good optimum, but it is relatively slow to solve, and since it is usually based on colour histograms it has trouble telling apart objects of the same colour. Meanshift easily falls into local optima, but it is fast, so quite a few people now combine meanshift with a particle filter for tracking, as the two complement each other in many respects.
The Kanade-Lucas-Tomasi method also performs well at tracking, especially in real-time speed. What it yields are the trajectories of many points, some of which drift, so the tracked points need some post-processing. What is described here is a point-localisation method, i.e. local image matching: the matching problem is turned from a traditional sliding-window search into solving for a displacement d. Along the way we will see under which conditions d is guaranteed to have a solution, and what is special about such points (it turns out that, much of the time, they are corner points).
First, the assumptions behind the KLT algorithm:
1) Brightness constancy.
2) Temporal persistence, i.e. the motion between frames is "small motion".
3) Spatial coherence: neighbouring points undergo similar motion and remain neighbours.
Why make these assumptions? Intuitively, to judge that two adjacent frames I and J of a video look the same over some local window w, we require, for every point in w,

$$I(x, y, t) = J(x', y', t+\tau)$$

Assumption 1 (brightness constancy) ensures this equality is not broken by illumination changes; assumption 2 ensures KLT can actually find the corresponding point; and assumption 3 is made for the following reason (namely, that all points within one window share the same displacement):
Over the window w, every point (x, y) moves in the same direction by (dx, dy) to give (x', y'); that is, the point (x, y) at time t is at (x+dx, y+dy) at time t+τ. The matching problem therefore reduces to finding the d that minimizes

$$\epsilon(\mathbf{d}) = \sum_{(x,y) \in w} \left[ J(x+d_x,\, y+d_y) - I(x,\, y) \right]^2$$

Expressed as an integral, with the two windows centred symmetrically about x, the expression above is equivalent to

$$\epsilon(\mathbf{d}) = \iint_W \left[ J\!\left(\mathbf{x}+\tfrac{\mathbf{d}}{2}\right) - I\!\left(\mathbf{x}-\tfrac{\mathbf{d}}{2}\right) \right]^2 w(\mathbf{x})\, d\mathbf{x}$$

The meaning of this expression is the difference between the two images I and J over a window W: I is taken centred at x − d/2 and J centred at x + d/2, over a rectangular window of half-size w/2, with an optional weighting function w(x). Recalling some calculus, for ε(d) to attain its minimum, the derivative at that extremum must be zero, i.e. the value of

$$\frac{\partial \epsilon(\mathbf{d})}{\partial \mathbf{d}} = \iint_W \left[ J\!\left(\mathbf{x}+\tfrac{\mathbf{d}}{2}\right) - I\!\left(\mathbf{x}-\tfrac{\mathbf{d}}{2}\right) \right] \left[ \nabla I(\mathbf{x}) + \nabla J(\mathbf{x}) \right] w(\mathbf{x})\, d\mathbf{x}$$

must be 0. By the first-order Taylor expansions

$$J\!\left(\mathbf{x}+\tfrac{\mathbf{d}}{2}\right) \approx J(\mathbf{x}) + \tfrac{1}{2}\mathbf{d}^{\mathsf T}\nabla J(\mathbf{x}), \qquad I\!\left(\mathbf{x}-\tfrac{\mathbf{d}}{2}\right) \approx I(\mathbf{x}) - \tfrac{1}{2}\mathbf{d}^{\mathsf T}\nabla I(\mathbf{x})$$

we can get

$$J\!\left(\mathbf{x}+\tfrac{\mathbf{d}}{2}\right) - I\!\left(\mathbf{x}-\tfrac{\mathbf{d}}{2}\right) \approx J(\mathbf{x}) - I(\mathbf{x}) + \mathbf{g}^{\mathsf T}(\mathbf{x})\,\mathbf{d}$$

So the problem turns into

$$\frac{\partial \epsilon(\mathbf{d})}{\partial \mathbf{d}} \approx \iint_W \left[ J(\mathbf{x}) - I(\mathbf{x}) + \mathbf{g}^{\mathsf T}(\mathbf{x})\,\mathbf{d} \right] \mathbf{g}(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x} = 0$$

where

$$\mathbf{g}(\mathbf{x}) = \tfrac{1}{2}\left[ \nabla I(\mathbf{x}) + \nabla J(\mathbf{x}) \right]$$

Rearranging, the problem becomes

$$\left[ \iint_W \mathbf{g}(\mathbf{x})\,\mathbf{g}^{\mathsf T}(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x} \right] \mathbf{d} = \iint_W \left[ I(\mathbf{x}) - J(\mathbf{x}) \right] \mathbf{g}(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x}$$

and this equation can be read as the linear system

$$Z\,\mathbf{d} = \mathbf{e}$$

where Z (the integral on the left) is a 2×2 matrix and e (the integral on the right) is a 2×1 vector. For d to have a solution, Z must satisfy a condition: it must be invertible, or equivalently, since Z is symmetric, both of its eigenvalues must be well above zero. In general, corner points are exactly the points with this property.
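To make the derivation concrete, here is a minimal MATLAB sketch of a single KLT step on one window. It is my own illustration rather than code from the referenced post, and it uses the simpler one-sided form of the same equations, approximating g by the gradient of I alone: it accumulates Z from the spatial gradients, accumulates e from the temporal difference, checks the corner condition through the eigenvalues of Z, and solves Zd = e. The function name kltStep and the eigenvalue threshold are illustrative choices.
function d = kltStep(I, J, cx, cy, r)
%KLTSTEP one Lucas-Kanade step for the window of radius r centred at (cx,cy)
% I, J - consecutive grayscale frames as double matrices
% d - estimated displacement [dx; dy], or NaNs if the window is not corner-like
[Ix, Iy] = gradient(I); %spatial gradients, approximating g
gx = Ix(cy-r:cy+r, cx-r:cx+r);
gy = Iy(cy-r:cy+r, cx-r:cx+r);
It = J(cy-r:cy+r, cx-r:cx+r) - I(cy-r:cy+r, cx-r:cx+r); %temporal difference
%build the 2x2 matrix Z and the 2x1 vector e from the derivation above
Z = [sum(gx(:).^2) sum(gx(:).*gy(:)); sum(gx(:).*gy(:)) sum(gy(:).^2)];
e = -[sum(gx(:).*It(:)); sum(gy(:).*It(:))];
%corner condition: both eigenvalues of Z must be well above zero
if min(eig(Z)) < 1e-2 %threshold chosen for illustration only
d = [NaN; NaN]; %aperture problem: d is not recoverable from this window
else
d = Z\e; %solve Z*d = e
end
end
Under assumption 2 (small motion) a single step already lands near the true displacement; in practice the solve is iterated, and wrapped in an image pyramid, to cope with larger motions.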
2. Code
The program was developed around two example videos. Without further ado, here is the code:
function [faceDet,faceTracks] = trackFaces( videoFilename,fittingModel,outFilename,visualise )
%TRACKFACES tracks all faces in a video
% INPUT
% videofilename - path to input video
% fittingModel - model options for face detection
% outfilename - path where we want to store our face detections
% visualise - boolean to turn on visualisation
% OUTPUT
% faceDet - bounding box detections indexed by frame id
% faceTracks - bounding box detections indexed by track id
% define track structure (track id based)
% faceTracks(trackID).bbox;
% faceTracks(trackID).frameid;
% define detection structure (frame id based)
% faceDet(frameID).trackIDs;
% faceDet(frameID).bboxes;
%
% Written by James Charles at University of Leeds 27/04/16
%
% If you use it then please cite the paper (for which it was developed):
% @InProceedings{Charles16b,
% author = "Charles, J. and Magee, D. and Hogg, D.",
% title = "Virtual Immortality: Reanimating characters from TV shows",
% booktitle = "ECCV Workshop on Virtual/Augmented Reality for Visual Artificial Intelligence (VARVAI)",
% year = "2016",
% }
%
% This work is licensed under the Creative Commons Attribution 4.0 International License.
% To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or
% send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
faceTracks = [];
faceDet = [];
h_img = []; %for visualisation only (image handle)
%load video object
vidobj = VideoReader(videoFilename);
%setup face detector
faceDetector = vision.CascadeObjectDetector();
faceDetector.MaxSize = round(fittingModel.maxsize*fittingModel.facedetect.imscale);
faceDetector.MinSize = round(fittingModel.minsize*fittingModel.facedetect.imscale);
faceDetector.MergeThreshold = fittingModel.facedetect.sensitivity;
%setup trackers
trackers(1).maxID = 0;
trackers(1).numActive = 0;
%get starting frame index based on current saved output and also
%tracker info
[startIndex,trackers] = getSavedIndex(outFilename);
%load the saved tracks
if exist(outFilename,'file')
load(outFilename);
end
%loop over frames and track
for i = startIndex:vidobj.NumberOfFrames
if mod(i,1000)==0
fprintf('frame %d of %d, number of active trackers: %d\n',i,vidobj.NumberOfFrames,trackers(1).numActive);
end
img = read(vidobj,i);
%track all previous faces to this frame
[faceDet,faceTracks,trackers] = track(faceDet,faceTracks,trackers,img,i);
%detect all faces
facebboxes = detectFaces(faceDetector,img);
%start new tracks if we detect new faces
[faceDet,faceTracks,trackers] = addNewFaces(faceDet, faceTracks,trackers,facebboxes,img,i);
%save output every 1000 frames
saveOutput(outFilename,faceDet,faceTracks,trackers,vidobj.NumberOfFrames,i,1000);
%visualise
if visualise
h_img = vis(img,faceDet,trackers,i,h_img);
end
end
%produce tracking structure from faceDet
if isempty(faceTracks)
faceTracks = dets2tracks(faceDet);
frameID = vidobj.NumberOfFrames;
save(outFilename,'faceDet','frameID','faceTracks');
end
end
%function to convert faceDet structure to a faceTracks structure
function faceTracks = dets2tracks(faceDet)
faceTracks(1).bboxes = [];
faceTracks(1).frameIDs = [];
for frameID = 1:numel(faceDet)
if ~isempty(faceDet(frameID).trackIDs)
for t = 1:numel(faceDet(frameID).trackIDs)
trackID = faceDet(frameID).trackIDs(t);
if trackID > numel(faceTracks)
faceTracks(trackID).bboxes = faceDet(frameID).bboxes(t,:);
faceTracks(trackID).frameIDs = frameID;
else
faceTracks(trackID).bboxes = cat(1,faceTracks(trackID).bboxes,faceDet(frameID).bboxes(t,:));
faceTracks(trackID).frameIDs = cat(2,faceTracks(trackID).frameIDs,frameID);
end
end
end
end
end
%function to return start index for tracking
function [index,trackers] = getSavedIndex(outFilename)
if ~exist(outFilename,'file')
index = 1;
trackers(1).maxID = 0;
trackers(1).numActive = 0;
else
data = load(outFilename);
index = data.frameID + 1;
trackers = data.trackers;
end
end
%function to save the tracks
function saveOutput(outFilename,faceDet,faceTracks,trackers,maxFrames,frameID,stepSize)
if frameID == maxFrames || mod(frameID,stepSize)==0
save(outFilename,'faceDet','faceTracks','frameID','trackers');
end
end
%track all faces to the next frame (destroys trackers when neccessary)
function [faceDet,faceTracks,trackers] = track(faceDet,faceTracks,trackers,img,frameID)
if trackers(1).maxID == 0 || trackers(1).numActive == 0
return
else
%for each tracker track bbox to new image
for i = 1:numel(faceDet(frameID-1).trackIDs)
trackerID = find(faceDet(frameID-1).trackIDs(i)==cat(1,trackers(:).trackID)); %find tracker associated with this face
if isempty(trackerID); continue; end %if tracker destroyed then skip
[isGood,facebbox,tracker] = trackBBox(trackers(trackerID),img,faceDet(frameID-1).bboxes(i,:)); %track the face forward in time
%update faceDet and faceTracks with new data
if isGood
[faceDet,faceTracks] = updateFace(faceDet,faceTracks,facebbox,trackers(trackerID).trackID,frameID);
%update the tracker
trackers(trackerID).points = tracker.oldPoints;
trackers(trackerID).tracker = tracker.tracker;
else
if numel(faceDet)<frameID
faceDet(frameID).bboxes = [];
end
%we need to remove this tracker and let the system
%initialise a new one when it gets a good face detection
trackers(1).numActive = max(0,trackers(1).numActive - 1);
release(trackers(trackerID).tracker);
if trackerID == 1
if numel(trackers)==1
trackers(trackerID).points = [];
trackers(trackerID).tracker = [];
else
maxID = trackers(1).maxID;
numActive = trackers(1).numActive;
trackers(trackerID) = [];
trackers(trackerID).maxID = maxID;
trackers(trackerID).numActive = numActive;
end
else
trackers(trackerID) = [];
end
end
end
end
end
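%track a single face bounding box into the next frame using its point tracker:
%the box is re-centred on the mean of the tracked points and accepted only if
%it still overlaps the previous box strongly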
function [isGood, facebbox, tracker] = trackBBox(tracker,img,facebbox)
isGood = false;
tracker.oldPoints = tracker.points;
[points, isFound] = step(tracker.tracker, img);
tracker.points = points(isFound, :);
oldInliers = tracker.oldPoints(isFound, :);
if size(tracker.points, 1) >= 3 % need at least 3 points to estimate the transform
% Estimate the geometric transformation between the old points
% and the new points and eliminate outliers
[xform, oldInliers, visiblePoints] = estimateGeometricTransform(...
oldInliers, tracker.points, 'similarity', 'MaxDistance', 5);
% Position bounding box at centre of points
centre = mean(tracker.points);
tempFacebbox = [centre(1)-facebbox(3)/2,centre(2)-facebbox(4)/2,facebbox(3),facebbox(4)];
tempFacebbox = round(tempFacebbox);
%check if tempFacebbox very different, if so track is not good
A1 = prod(facebbox(3:4)); %area of the previous bounding box
A2 = prod(tempFacebbox(3:4)); %area of the predicted bounding box
%compute intersection area
U = rectint(facebbox,tempFacebbox);
%calculate a similarity score and threshold at 0.7
similarity = (U/A1 + U/A2)/2;
isGood = similarity > 0.7;
facebbox = tempFacebbox;
tracker.oldPoints = visiblePoints;
setPoints(tracker.tracker, tracker.oldPoints);
end
end
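%check each fresh detection against the faces already in this frame: genuinely
%new faces get their own tracker, while re-detected faces just have their
%tracker topped up with fresh feature points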
function [faceDet,faceTracks,trackers] = addNewFaces(faceDet, faceTracks,trackers,facebboxes,img,frameID)
%for each box see if it is a new face
if ~isempty(facebboxes)
numDets = size(facebboxes,1);
for i = 1:numDets
if ~isempty(faceDet)
[isNew,matchingFaceID] = isNewFace(faceDet,facebboxes(i,:),frameID);
if isNew
[faceDet,faceTracks,trackers] = addFace(faceDet,faceTracks,trackers,facebboxes(i,:),img,frameID);
else
%update tracker with points in detected face
if ~isempty(matchingFaceID)
trackerID = find(cat(1,trackers(:).trackID)==faceDet(frameID).trackIDs(matchingFaceID));
if ~isempty(trackerID)
%re-initialise the tracker
trackers(trackerID).points = cat(1,trackers(trackerID).points,getFeaturePoints(img,facebboxes(i,:)));
%randomly sample at most 30 points
Ridx = randperm(size(trackers(trackerID).points,1));
trackers(trackerID).points = trackers(trackerID).points(Ridx(1:min(numel(Ridx),30)),:);
setPoints(trackers(trackerID).tracker, trackers(trackerID).points);
end
end
end
else
[faceDet,faceTracks,trackers] = addFace(faceDet,faceTracks,trackers,facebboxes(i,:),img,frameID);
end
end
end
end
%this function adds a new face and initialises a new tracker
function [faceDet,faceTracks,trackers] = addFace(faceDet,faceTracks,trackers,facebbox,img,frameID)
numActiveTrackers = trackers(1).numActive;
%get features
points = getFeaturePoints(img,facebbox);
if ~isempty(points)
%setup new tracker
trackerID = numActiveTrackers+1;
trackers(1).maxID = trackers(1).maxID + 1; %maxID stores the total number of trackers so far
trackers(1).numActive = trackers(1).numActive + 1;
trackers(trackerID).trackID = trackers(1).maxID;
trackers(trackerID).tracker = vision.PointTracker('MaxBidirectionalError', 2);
%get features to track
trackers(trackerID).points = points;
%initialise the tracker
initialize(trackers(trackerID).tracker,trackers(trackerID).points,img);
%update the faceDet and faceTracks structure
[faceDet,faceTracks] = updateFace(faceDet,faceTracks,facebbox,trackers(trackerID).trackID,frameID);
end
end
%this function updates the faceDet and faceTracks structure with a new
%bounding box
function [faceDet,faceTracks] = updateFace(faceDet,faceTracks,facebbox,trackID,frameID)
if ~isempty(faceDet)
if numel(faceDet)<frameID % add new face and frame
faceDet(frameID).trackIDs = trackID;
faceDet(frameID).bboxes = facebbox;
faceDet(frameID).landmarks = [];
else %append face to current frame
faceDet(frameID).trackIDs = cat(2,faceDet(frameID).trackIDs,trackID);
faceDet(frameID).bboxes = cat(1,faceDet(frameID).bboxes,facebbox);
faceDet(frameID).landmarks = [];
end
else
faceDet(frameID).trackIDs = trackID;
faceDet(frameID).bboxes = facebbox;
faceDet(frameID).landmarks = [];
end
end
function bboxes = detectFaces(faceDetector,img)
bboxes = step(faceDetector, img);
end
%check if the current bounding box is a new detection
function [newface,matchingFaceID] = isNewFace(faceDet,facebbox,frameID)
newface = true; %assume new unless it overlaps an existing box in this frame
matchingFaceID = [];
if isempty(faceDet)
newface = true;
elseif numel(faceDet)<frameID
newface = true;
else
%check overlap of facebbox with bounding boxes present in faceDet
numBBoxes = size(faceDet(frameID).bboxes,1);
A1 = prod(facebbox(3:4)); %area of new bounding box
for i = 1:numBBoxes
A2 = prod(faceDet(frameID).bboxes(i,3:4)); %area of old bounding box
%compute intersection area
U = rectint(facebbox,faceDet(frameID).bboxes(i,:));
%calculate a similarity score and threshold at 0.6
similarity = (U/A1 + U/A2)/2;
if similarity > 0.6
newface = false;
matchingFaceID = i;
break;
else
newface = true;
end
end
end
end
%function to convert a bounding box to points (unused)
function points = bbox2point(bbox)
points = zeros(4,2);
points(1,:) = bbox(1:2);
points(2,:) = points(1,:) + [bbox(3) 0];
points(3,:) = points(1,:) + [bbox(3) bbox(4)];
points(4,:) = points(1,:) + [0 bbox(4)];
end
%function to convert points to a bounding box (unused)
function bbox = point2bbox(points)
bbox = zeros(1,4);
bbox(1) = min(points(:,1));
bbox(2) = min(points(:,2));
bbox(3) = max(points(:,1))-bbox(1);
bbox(4) = max(points(:,2))-bbox(2);
end
%function to return feature points given bounding box and image
function locations = getFeaturePoints(img,bbox)
% points = detectMinEigenFeatures(rgb2gray(img), 'ROI', bbox);
points = detectBRISKFeatures(rgb2gray(img), 'ROI', bbox,'MinContrast',0.001); %restrict detection to the face region (ROI)
locations = points.Location;
if ~isempty(locations)
%get cluster close to centre of points
centre = mean(locations);
dist = sqrt(sum(bsxfun(@minus,locations,centre).^2,2));
i = 1;
idrem = ones(1,size(locations,1));
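%grow the distance threshold until at least one point survives the pruning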
while sum(idrem)>=size(locations,1)
idrem = dist>15*i;
i = i + 0.5;
end
locations(idrem,:) = [];
end
end
%function to visualise the output
function h_img = vis(img,faceDet,trackers,frameID,h_img)
if isempty(h_img) %create figure and image pane
figure
h_img = imagesc(img); axis image;
end
if numel(faceDet)>=frameID
for b = 1:size(faceDet(frameID).bboxes,1)
img = insertObjectAnnotation(img,'rectangle',faceDet(frameID).bboxes(b,:),sprintf('track: %d',faceDet(frameID).trackIDs(b)));
end
for b = 1:size(faceDet(frameID).bboxes,1)
trackerID = find(cat(1,trackers(:).trackID)==faceDet(frameID).trackIDs(b));
if ~isempty(trackerID)
plotPoints = cat(2,trackers(trackerID).points(:,1:2),2*ones(size(trackers(trackerID).points,1),1));
img = insertObjectAnnotation(img,'circle',plotPoints,' ','TextBoxOpacity',0,'Color','white');
end
end
end
set(h_img,'cdata',img);
drawnow
end
The main function is not shown here; write one to suit your own needs (a minimal sketch follows below). This part of the code takes some time to understand and digest. The tracking video cannot be embedded here, so a couple of screenshots would have to stand in for it.
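As a rough illustration, a driver script could look like the sketch below. The field names are exactly the ones trackFaces reads from fittingModel, but the concrete values and file names are assumptions made for this sketch, not taken from the original code.
%hypothetical driver for trackFaces; the values here are illustrative guesses
fittingModel.maxsize = [200 200]; %largest face to accept, [height width] in pixels
fittingModel.minsize = [40 40]; %smallest face to accept
fittingModel.facedetect.imscale = 1; %scale factor applied to the size limits
fittingModel.facedetect.sensitivity = 4; %MergeThreshold of vision.CascadeObjectDetector
[faceDet, faceTracks] = trackFaces('input.mp4', fittingModel, 'tracks.mat', true);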
It is worth noting that the KLT algorithm has since been further optimized and refined for tracking and detection. KLT is also widely used on DSP boards such as the TI C6000 series, and it matters for military and security-surveillance applications; DSP processors run KLT well. That is also useful background for future hardware work.
PS: a newcomer's impressions of MATLAB for machine learning and deep learning. MATLAB's disadvantage in deep learning is now obvious to me: programs run laboriously, a one-or-two-minute video can bog down the computer, and this field really does demand strong hardware. On the engineering side, MATLAB is hard to run on hardware development boards, and converting it to C raises many problems. OpenCV has a slight edge here, since hardware such as the Raspberry Pi and OpenMV can run the code. I will study Python alongside.