SDM for Face Alignment: Workflow Overview and MATLAB Implementation (Training)

bbzz2

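The SDM training stage consists of the following tasks:

1. Load the normalized data (400*400 frontal face images and their landmarks).
2. For each normalized image, simulate a face detector by generating 10 perturbed face boxes and the corresponding initial landmarks x0.
3. Compute Δx and Φ, where Δx = x∗ − x0, x∗ is the true shape, and Φ holds the feature vector of every landmark.
4. Solve the least-squares problem to obtain a sequence of regressors {Rk}.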
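As a compact summary of steps 3 and 4, one stage of training can be written as follows. This is a sketch under the usual SDM formulation, with the bias term omitted as in the rest of this post; the sample index i and the stage index k are notation introduced here for illustration.

$$
\Delta x_i^{k} = x_i^{*} - x_i^{k}, \qquad
R_k = \arg\min_{R} \sum_{i} \left\| \Delta x_i^{k} - R\,\Phi_i^{k} \right\|_2^2, \qquad
x_i^{k+1} = x_i^{k} + R_k\,\Phi_i^{k}
$$

Here $\Phi_i^{k}$ denotes the features extracted around the current landmarks $x_i^{k}$ of sample i, and $x_i^{0}$ is the initial shape x0.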

Each step is described below.

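Loading the data

Load the 811 training images and crop each of them with the same method that the previous (preparation) chapter applied to the first image.
The MATLAB code is as follows: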

function [Data] = load_single_data2 ( dbpath_img, dbpath_pts,image_index, options )

%% output format
%{
DATA.
- width_orig: the width of the original image.
- height_orig: the height of the original image.
- img_gray: the cropped image.
- height: the height of the cropped image.
- width: the width of the cropped image.
- shape_gt: ground-truth landmark.
- bbox_gt: bounding box of ground-truth.
%}
slash = options.slash;
dbname = options.datasetName;

imlist = dir([dbpath_img slash '*.*g']);

    %% load images
    img = im2uint8(imread([dbpath_img slash imlist(image_index).name]));
    Data.width_orig  = size(img,2);
    Data.height_orig = size(img,1);

    %% load shape
    Data.shape_gt = double(annotation_load(...
        [dbpath_pts slash imlist(image_index).name(1:end-3) 'pts'] , dbname));

    if 0
        figure(1); imshow(img); hold on;
        draw_shape(Data.shape_gt(:,1),...
            Data.shape_gt(:,2),'y');
        hold off;
        pause;
    end    


    %% get bounding box
    Data.bbox_gt = getbbox(Data.shape_gt);

    %% enlarge region of face
    region     = enlargingbbox(Data.bbox_gt, 2.0);
    region(2)  = double(max(region(2), 1)); % clamp to the image: if the enlarged box extends past the top edge, region(2) drops below 1, so use 1 instead
    region(1)  = double(max(region(1), 1));

    bottom_y   = double(min(region(2) + region(4) - 1, ...
        Data.height_orig));
    right_x    = double(min(region(1) + region(3) - 1, ...
        Data.width_orig)); % keep the box inside the image by taking the smaller of the two values

    img_region = img(region(2):bottom_y, region(1):right_x, :); % crop the face region

    %% recalculate the location of groundtruth shape and bounding box
    Data.shape_gt = bsxfun(@minus, Data.shape_gt,...
        double([region(1) region(2)])); % equivalent to Data.shape_gt - repmat(double([region(1) region(2)]), size(Data.shape_gt,1), 1)
    % move the coordinate origin to the top-left corner of the face region and recompute the landmarks accordingly
    Data.bbox_gt = getbbox(Data.shape_gt); % the top-left corner of the landmark bounding box changes, but its width and height do not


    if size(img_region, 3) == 1
        Data.img_gray = img_region;
    else
       Data.img_gray = rgb2gray(img_region);
    end

    Data.width    = size(img_region, 2);
    Data.height   = size(img_region, 1);

    if 0
        figure(2); imshow(Data.img_gray); hold on;
        draw_shape(Data.shape_gt(:,1),...
            Data.shape_gt(:,2),'y');
        hold off;
        pause;
    end


    %% normalized the image to the mean-shape
    sr = options.canvasSize(1)/Data.width;
    sc = options.canvasSize(2)/Data.height;

    Data.img_gray = imresize(Data.img_gray,options.canvasSize);

    Data.width    = options.canvasSize(1);
    Data.height   = options.canvasSize(2);

    Data.shape_gt = bsxfun(@times, Data.shape_gt, [sr sc]); 

    Data.bbox_gt(1:2) = bsxfun(@times, Data.bbox_gt(1:2), [sr sc]); % added: rescale the box origin
    Data.bbox_gt(3:4) = bsxfun(@times, Data.bbox_gt(3:4), [sr sc]); % added: rescale the box size

    if 0
        figure(3); imshow(Data.img_gray); hold on;
        draw_shape(Data.shape_gt(:,1),...
            Data.shape_gt(:,2),'r');
        hold on; 
        rectangle('Position',  Data.bbox_gt, 'EdgeColor', 'k');
       pause;
    end       




end


function region = enlargingbbox(bbox, scale)
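% As before, the input here is only the bounding box of the landmarks. To cover the whole face,
% shift the top-left corner up and left by half the width and height, then double the width and height (scale = 2).
% The same idea was used earlier in rect = get_correct_region( boxes, shape,Dataa(i).img, 1 );
% The box returned here therefore encloses the whole face.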
region(1) = floor(bbox(1) - (scale - 1)/2*bbox(3));
region(2) = floor(bbox(2) - (scale - 1)/2*bbox(4));

region(3) = floor(scale*bbox(3));
region(4) = floor(scale*bbox(4));

end
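To make the box arithmetic concrete, here is a small worked example. It is only a sketch: the box, image size, and landmark values are made up for illustration, and the statements simply repeat the arithmetic of enlargingbbox and load_single_data2 inline rather than calling them.

```matlab
% Toy values (hypothetical): a landmark box [x y w h] inside a 300x200 image.
bbox  = [100 120 80 60];
scale = 2.0;
imgW  = 300;  imgH = 200;

% Same arithmetic as enlargingbbox: grow the box around its center.
region = [floor(bbox(1) - (scale - 1)/2*bbox(3)), ...
          floor(bbox(2) - (scale - 1)/2*bbox(4)), ...
          floor(scale*bbox(3)), floor(scale*bbox(4))];

% Clamp to the image, as load_single_data2 does.
region(1:2) = max(region(1:2), 1);
right_x  = min(region(1) + region(3) - 1, imgW);
bottom_y = min(region(2) + region(4) - 1, imgH);

% Re-express two toy landmarks in crop coordinates, then scale to the canvas.
canvasSize = [400 400];
shape_gt   = [100 120; 180 180];
cropW = right_x  - region(1) + 1;
cropH = bottom_y - region(2) + 1;
shape_crop   = bsxfun(@minus, shape_gt, region(1:2));
shape_canvas = bsxfun(@times, shape_crop, ...
                      [canvasSize(1)/cropW, canvasSize(2)/cropH]);

disp(region);               % enlarged box
disp([right_x bottom_y]);   % bottom edge is clamped to the image height
disp(shape_canvas);
```

With these numbers the enlarged box is [60 90 160 120], and its bottom edge would fall at row 209, so it is clamped to 200. The crop is therefore shorter than the enlarged box, which is why the horizontal and vertical scale factors differ.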

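Simulating face detection to produce 10 initial shapes

Every image comes with ground-truth points, so we can compute their bounding box. A similar box can also be obtained from OpenCV or another face detector. The two boxes differ slightly, but applying a simple transformation to the OpenCV detection box gives an approximation of the ground-truth box.
We need to perturb the bounding box in order to generate more boxes. How should it be perturbed?

A box has four attributes: x, y, width, and height, so we only need to generate 10 sets of these attributes. Alternatively, look at the problem from another angle: once a new box has been generated, it differs from the original box by four offsets, so we only need to estimate how these offsets are distributed.

In practice, we compute the mean and covariance of the deviation between the initial shape and the ground-truth shape over the 811 images. This gives two 2-D normal distributions, one for (x, y) and one for (width, height), from which we draw random deviations. Applying the sampled deviations to the original box yields the perturbed boxes, and aligning the mean shape to each of the 10 boxes gives 10 initial shapes.
do_learn_variation.m: computes the mean and covariance of the deviations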

function do_learn_variation( options )

%% loading learned shape model
load([options.modelPath options.slash options.datasetName '_ShapeModel.mat']);

imgDir = options.trainingImageDataPath;
ptsDir = options.trainingTruthDataPath;

%% loading data
Data = load_data( imgDir, ptsDir, options );

n = length(Data);

transVec   = zeros(n,2);
scaleVec   = zeros(n,2);

debug = 0;

%% computing the translation and scale vectors %%%%%%%%%%%%%%%%%%%%%%%%%%%%


for i = 1 : n

    %% the information of i-th image
    disp(Data(i).img);

    img   = imread(Data(i).img);
    shape = Data(i).shape;

    %% if detect face using viola opencv
   % boxes = detect_face( img , options );

    %% if using ground-truth
     boxes = [];

    %% predict the face box
    rect = get_correct_region( boxes, shape,img, 1 );

    %% predict initial location
    [initX,initY,width,height] = init_face_location( rect );
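    % Note: the face box computed above is fairly large, typically 4 times the area of the landmark bounding box,
    % so the width and height computed above are half the width and height of rect. As the bounding_box computation shows,
    % the landmark box is extended by half its width and height toward the top-left and the bottom-right,
    % which makes the face box 4 times the area of the landmark box.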

    init_shape = align_init_shape(ShapeModel.MeanShape, ...
                                              initX, initY, width, height);

    if debug
        figure(1); imshow(img); hold on;
        rectangle('Position',  rect, 'EdgeColor', 'g');
        draw_shape(init_shape.XY(1:2:end), init_shape.XY(2:2:end), 'y'); % draw the mean shape placed on this face image
        hold on;
        plot(initX, initY, 'b*'); % center point
        draw_shape(shape(:,1), shape(:,2), 'r');
        hold off;
        pause;
    end

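    [aligned_shape, cropIm] = align_to_mean_shape( ShapeModel, img , ...
        vec_2_shape(init_shape.XY) , options ); % vec_2_shape reshapes the 1-D vector into an N-by-2 array; this call returns the 400*400 image and the initial shape in that normalized frame (the true shape is mapped on the next line)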

    [aligned_true_shape] = align_shape(aligned_shape.TransM,shape_2_vec(shape));

    if debug
        figure(1); imshow(cropIm); hold on;
        draw_shape(aligned_shape.XY(1:2:end), ...
            aligned_shape.XY(2:2:end), 'y');
        draw_shape(aligned_true_shape(1:2:end), ...
            aligned_true_shape(2:2:end), 'r');
        %hold off;
        pause;
    end    

    initVector = vec_2_shape(aligned_shape.XY);
    trueVector = vec_2_shape(aligned_true_shape);

    % centroids of the initial and true shapes (used below for the translation offset)
    meanInitVector  = mean(initVector);
    meanTrueVector  = mean(trueVector);

    %compute bounding box size
    initLeftTop     = min(initVector);
    initRightBottom = max(initVector);

    initFaceSize = abs(initLeftTop - initRightBottom);

    trueLeftTop     = min(trueVector);
    trueRightBottom = max(trueVector);

    trueFaceSize = abs(trueLeftTop - trueRightBottom);

    transVec(i,:) = (meanInitVector - meanTrueVector)./initFaceSize; % divide the translation by the face size so that it does not depend on how large the face is
    scaleVec(i,:) = initFaceSize./trueFaceSize;

    clear img;
    clear xy;

    %    end

end

% compute the mean and covariance matrices of the translation and the scale
[mu_trans,cov_trans] = mean_covariance_of_data ( transVec );
[mu_scale,cov_scale] = mean_covariance_of_data ( scaleVec );

DataVariation.mu_trans  = mu_trans;
DataVariation.cov_trans = cov_trans;
DataVariation.mu_scale  = mu_scale;
DataVariation.cov_scale = cov_scale;

save([options.modelPath options.slash options.datasetName ...
    '_DataVariation.mat'], 'DataVariation');

clear Data;

end
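The helper mean_covariance_of_data is not listed in this post. A minimal sketch of what it presumably computes, assuming it returns the sample mean and the sample covariance of the rows of its input, is shown below on made-up translation offsets (the values in transVec are hypothetical).

```matlab
% Toy translation offsets, one row per training image (hypothetical values).
transVec = [ 0.02 -0.01;
            -0.03  0.04;
             0.01  0.00;
             0.00 -0.03];

% Assumed behavior of mean_covariance_of_data: row-wise sample statistics.
mu_trans  = mean(transVec, 1);   % 1-by-2 sample mean
cov_trans = cov(transVec);       % 2-by-2 sample covariance (normalized by n-1)

disp(mu_trans);
disp(cov_trans);
```

The 1-by-2 mean and 2-by-2 covariance have exactly the shapes that mvnrnd expects in random_init_position.m.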

random_init_position.m: generates the 10 boxes

function [rbbox] = random_init_position( bbox, ...
                                                 DataVariation, nRandInit,options)

rbbox(1,:) = bbox;    

if nRandInit > 1

center = bbox(1:2) + bbox(3:4)/2;                                            

mu_trans  = DataVariation.mu_trans;
cov_trans = DataVariation.cov_trans;
mu_scale  = DataVariation.mu_scale;
cov_scale = DataVariation.cov_scale;

rInit_trans = mvnrnd(mu_trans,cov_trans,nRandInit-1);
%rInit_trans = zeros(nRandInit-1,2);

rCenter = repmat(center,nRandInit-1,1) + ...
                      rInit_trans.*repmat([bbox(3) bbox(4)],nRandInit-1,1);

rInit_scale = mvnrnd(mu_scale,cov_scale,nRandInit-1); % r = mvnrnd(MU,SIGMA,cases) draws cases samples from a normal distribution with mean MU (1*d) and covariance SIGMA (d*d), and returns a cases*d matrix r
%rInit_scale = ones(nRandInit-1,2);

rWidth  = zeros(nRandInit-1,1);
rHeight = zeros(nRandInit-1,1);

for i = 1 : nRandInit - 1
    rWidth(i)  = bbox(3)*rInit_scale(i,1); % the original code divided here
    rHeight(i) = bbox(4)*rInit_scale(i,2);
end

rbbox(2:nRandInit,1:2) = rCenter - [rWidth(:,1) rHeight(:,1)]/2;
rbbox(2:nRandInit,3:4) = [rWidth(:,1) rHeight(:,1)];
% added: keep the perturbed boxes inside the image
rbbox(1:nRandInit,1:2)=max(rbbox(1:nRandInit,1:2),1);
rbbox(1:nRandInit,1:2)=min(rbbox(1:nRandInit,1:2)+rbbox(1:nRandInit,3:4),options.canvasSize(1) )-rbbox(1:nRandInit,3:4);
end

end
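mvnrnd belongs to the Statistics and Machine Learning Toolbox. If that toolbox is not available, the same kind of sample can be drawn with randn and a Cholesky factor. The sketch below repeats the perturbation logic of random_init_position inline under that substitution; the box and the distribution parameters are made-up values, not the ones learned by do_learn_variation.

```matlab
% Toy inputs (hypothetical): one box [x y w h] and made-up statistics.
bbox      = [100 100 200 200];
mu_trans  = [0 0];   cov_trans = [4e-4 0; 0 4e-4];
mu_scale  = [1 1];   cov_scale = [1e-2 0; 0 1e-2];
nRandInit = 10;
m = nRandInit - 1;   % the first box is kept unperturbed

% If U = chol(Sigma) then U'*U = Sigma, so rows of randn(m,2)*U have covariance Sigma.
rTrans = repmat(mu_trans, m, 1) + randn(m, 2) * chol(cov_trans);
rScale = repmat(mu_scale, m, 1) + randn(m, 2) * chol(cov_scale);

% Translation offsets are relative to the box size; scales are multiplicative.
center  = bbox(1:2) + bbox(3:4)/2;
rCenter = repmat(center, m, 1) + rTrans .* repmat(bbox(3:4), m, 1);
rSize   = repmat(bbox(3:4), m, 1) .* rScale;

% Row 1 is the original box, rows 2..nRandInit are the perturbed ones.
rbbox = [bbox; rCenter - rSize/2, rSize];
disp(rbbox);
```

The boundary clamp from the original function is left out here to keep the sketch short.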

resetshape.m: aligns shape_union to bbox

function [shape_initial] = resetshape(bbox, shape_union)
%RESETSHAPE Align the union shape to a bounding box.
%   Function: reset the initial shape according to the bounding box of the
%   groundtruth shape and the union shape for all faces
%   Input:
%       bbox: bounding box of the groundtruth shape, [x y width height]
%       shape_union: union shape
%   Output:
%       shape_initial: the union shape rescaled and translated to fill bbox

% width and height of the union shape
width_union = (max(shape_union(:, 1)) - min(shape_union(:, 1)));
height_union = (max(shape_union(:, 2)) - min(shape_union(:, 2)));

shape_union = bsxfun(@minus, (shape_union), (min(shape_union)));

shape_initial = bsxfun(@times, shape_union, [(bbox(3)/width_union) (bbox(4)/height_union)]);
shape_initial = bsxfun(@plus, shape_initial, double([bbox(1) bbox(2)]));

end
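A quick way to check what resetshape does is to run its three steps on a toy shape and confirm that the result fills the target box exactly. The 5-point shape and the box below are made up for illustration, and the statements are inlined so the snippet runs on its own.

```matlab
% Toy 5-point union shape and a target box [x y w h] (hypothetical values).
shape_union = [0.2 0.3; 0.8 0.3; 0.5 0.6; 0.3 0.9; 0.7 0.9];
bbox        = [50 60 100 120];

% Same steps as resetshape: move to the origin, rescale, translate.
w = max(shape_union(:,1)) - min(shape_union(:,1));
h = max(shape_union(:,2)) - min(shape_union(:,2));
s = bsxfun(@minus, shape_union, min(shape_union));
s = bsxfun(@times, s, [bbox(3)/w, bbox(4)/h]);
shape_initial = bsxfun(@plus, s, bbox(1:2));

% The bounding box of the result should match bbox up to rounding error.
disp([min(shape_initial), max(shape_initial) - min(shape_initial)]);
```

Note that the horizontal and vertical scale factors are independent, so the aspect ratio of the shape changes whenever the box and the shape have different proportions.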

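Computing the landmark differences and feature vectors

Above we obtained 10 initial shapes for each image, so Δx = x∗ − x0 is straightforward to compute. The feature vector Φ is just as easy to obtain. Feature vectors are also called descriptors; we can use either SIFT or HOG features.
local_descriptors: computes the feature vectors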

function [desc] = local_descriptors( img, xy, dsize, dbins, options ) % compute the descriptors

featType = options.descType;

stage = options.current_cascade;

dsize = options.descScale(stage) * size(img,1);

if strcmp(featType,'raw')

    if size(img,3) == 3
        im = im2double(rgb2gray(uint8(img)));
    else
        im = im2double(uint8(img));
    end

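    npts = size(xy,1);

    % NOTE: desc_scale and desc_size are not defined anywhere in this function,
    % so this branch cannot run as listed; they must be set before 'raw' is used.
    for ipts = 1 : npts
        desc(ipts,:) = raw(im,xy(ipts,:),desc_scale,desc_size);
    end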

elseif strcmp(featType,'xx_sift')

%     i = randi([1 68],1,1);
%     rect = [xy(18,:) - [dsize/2 dsize/2] dsize dsize];
%     
%     if 1
%         figure(2); imshow(img); hold on;
%         rectangle('Position',  rect, 'EdgeColor', 'g');
%         hold off;
%         pause;
%     end


    if size(img,3) == 3
        im = im2double(rgb2gray(uint8(img)));
    else
        im = im2double(uint8(img));
    end

    xy = xy - repmat(dsize/2,size(xy,1),2);

    desc = xx_sift(im,xy,'nsb',dbins,'winsize',dsize);

elseif strcmp(featType,'hog')

    if size(img,3) == 3
        im = im2double(rgb2gray(uint8(img)));
    else
        im = im2double(uint8(img));
    end

    npts = size(xy,1);

    for ipts = 1 : npts
        %disp(ipts);
       if isempty(im)
          disp('empty im');
       end
        if isempty(dsize)
          disp('empty dsize');
       end
        desc(ipts,:) = hog(im,xy(ipts,:),dsize);
    end

end

end
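The sketch below shows one way the regression inputs could be assembled from the shapes and descriptors, with one sample per row as linreg.m expects. Everything in it is illustrative: the sizes are tiny, the shapes are random, and toy_desc is a hypothetical stand-in that is neither SIFT nor HOG. In the real pipeline the descriptor would come from local_descriptors.

```matlab
% Toy sizes (the post uses 68 landmarks and 128-D descriptors).
nSamples = 4;  nPts = 3;  descDim = 8;

% Hypothetical stand-in descriptor: nPts-by-descDim, NOT SIFT or HOG.
toy_desc = @(img, xy) repmat(xy(:, 1) + xy(:, 2), 1, descDim);

deltaX = zeros(nSamples, 2*nPts);
Phi    = zeros(nSamples, descDim*nPts);
for i = 1:nSamples
    img        = zeros(400, 400);                % placeholder image
    shape_gt   = 100 + 200*rand(nPts, 2);        % x*: true shape
    shape_init = shape_gt + 10*randn(nPts, 2);   % x0: perturbed initial shape

    d = shape_gt - shape_init;                   % nPts-by-2 offsets
    deltaX(i, :) = reshape(d', 1, []);           % [dx1 dy1 dx2 dy2 ...]

    desc = toy_desc(img, shape_init);            % features at the current shape
    Phi(i, :) = reshape(desc', 1, []);           % concatenate per-point descriptors
end

disp(size(deltaX));
disp(size(Phi));
```

The interleaved x, y layout matches the XY(1:2:end) and XY(2:2:end) indexing used in do_learn_variation.m. The formulas in the next section store samples as columns instead, which is simply the transpose of this layout.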

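Solving the least-squares problem

The problem is:

$$\min_{R}\ \left\| \Delta X - R\,\Phi \right\|_2^2$$

where

$$\Delta X \in \mathbb{R}^{(68 \cdot 2) \times n}, \qquad \Phi \in \mathbb{R}^{(128 \cdot 68) \times n}$$

Here 68 is the number of landmarks, 128 is the dimension of each landmark's feature vector, and n is the number of samples, which is 811 here.
This is clearly a least-squares problem and can be solved directly:

$$
\begin{aligned}
\Delta X &= R\,\Phi \\
\Delta X\,\Phi' &= R\,\Phi\Phi' \\
\Delta X\,\Phi'\,(\Phi\Phi' + \lambda I)^{-1} &= R\,\Phi\Phi'\,(\Phi\Phi' + \lambda I)^{-1} \\
\Delta X\,\Phi'\,(\Phi\Phi' + \lambda I)^{-1} &\approx R
\end{aligned}
$$

The last line treats $\Phi\Phi'(\Phi\Phi' + \lambda I)^{-1}$ as the identity, which holds approximately when λ is small. The resulting R is the ridge-regression (regularized least-squares) solution.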
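The closed-form expression R = ΔX Φ' (Φ Φ' + λI)^(-1) can be checked numerically on synthetic data. The sketch below uses made-up sizes and a randomly generated R_true, with samples stored as columns as in the formulas above. It is meant only to illustrate the formula, not to reproduce the author's training run.

```matlab
% Toy sizes (hypothetical): d outputs, p features, n samples.
d = 6;  p = 20;  n = 500;
R_true = randn(d, p);
Phi    = randn(p, n);
DeltaX = R_true * Phi + 0.01*randn(d, n);   % targets with a little noise
lambda = 1e-3;

% B / A solves R*A = B, which avoids forming the inverse explicitly.
A = Phi * Phi' + lambda * eye(p);
R = (DeltaX * Phi') / A;

relErr = norm(R - R_true, 'fro') / norm(R_true, 'fro');
disp(relErr);   % small when n is much larger than p and the noise is low
```

Using the right-division operator instead of inv is the usual MATLAB idiom here, since it is generally more accurate and faster than computing the inverse.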

The problem can also be solved with a support-vector approach; here we call the SVR solver from liblinear.
linreg.m: solves the least-squares problem

function [R,lambda] = linreg( X , Y , lambda )

%X = [ones(size(X,1),1) X];

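%% method 1: solving linear regression using the closed-form solution %%%%%

% R = (X'*X+eye(size(X,2))*lambda)\X'*Y; % \ and * have equal precedence and are evaluated left to right, so this computes (A\X')*Y; write A\(X'*Y) to solve against X'*Y directly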

%% method 2: using SVR in liblinear %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
featdim   = size(X,2);
shapedim  = size(Y,2);

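% -s 12: L2-regularized L2-loss support vector regression (dual)
% -p 0:  epsilon of the SVR loss set to 0;  -c: cost parameter;  -q: quiet mode
param = sprintf('-s 12 -p 0 -c %f -q', lambda);
%param = sprintf('-s 12 -p 0 -c 0.3 -q');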
R_tmp = zeros( featdim, shapedim );
tic;
for o = 1 : shapedim
    disp(['Training landmarks ' num2str(o)]);
    model = train(Y(:,o),sparse(X),param);
    R_tmp(:,o) = model.w';
end
toc;

R = R_tmp;

end
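linreg.m keeps two commented-out lines, a bias column and the closed-form solve. A sketch of how they could be combined, with one sample per row as in linreg, is shown below on synthetic data. The weights W, the offset b, and all sizes are made up, and this is an illustration of method 1 rather than the code the author actually ran.

```matlab
% Toy data (hypothetical): n samples, p features, q outputs, one sample per row.
n = 300;  p = 5;  q = 2;
X = randn(n, p);
W = randn(p, q);
b = [3 -2];
Y = X*W + repmat(b, n, 1) + 0.01*randn(n, q);
lambda = 1e-6;

% Prepend a column of ones so the first row of R acts as a bias term.
Xb = [ones(n, 1) X];

% Closed-form ridge solution; the parentheses make the solve order explicit.
R = (Xb'*Xb + lambda*eye(p + 1)) \ (Xb'*Y);

b_hat = R(1, :);
W_hat = R(2:end, :);
disp(b_hat);
```

In this form the bias row is regularized together with the weights, which is harmless for a tiny λ but worth keeping in mind for larger values. With samples stored as rows, R here is the transpose of the R in the column-wise formula.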

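Next, we use the learned R to update x0, which in turn updates Δx and Φ.
We then solve a new least-squares problem to obtain a new R, and so on; 5 iterations are sufficient.
The resulting {Rk} can then be used in the test stage. The landmark results after each of the 5 iterations are shown below:
[Figures: landmarks after the 1st, 2nd, 3rd, 4th, and 5th steps]
We can see that with each further iteration the updated landmarks move closer to the true shape.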
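The iteration can be illustrated with a toy cascade. This is a sketch under strong assumptions: the "shapes" are random 4-D vectors, and the feature function feat is a hypothetical stand-in that looks at the true offset through a tanh, whereas real SDM extracts SIFT or HOG features from the image around the current landmarks. Only the structure of the loop (extract features, regress, update, repeat) mirrors the procedure described above.

```matlab
% Toy setup (hypothetical sizes and data).
n = 400;  d = 4;  nStages = 5;  lambda = 1e-3;
x_true = randn(n, d);                  % x*: one true "shape" per row
x      = x_true + 2*randn(n, d);       % x0: perturbed starting shapes

% Stand-in feature: a nonlinear view of the offset plus a bias column.
feat = @(x) [tanh(x_true - x), ones(n, 1)];

R   = cell(nStages, 1);
err = zeros(nStages + 1, 1);
err(1) = sqrt(mean(sum((x_true - x).^2, 2)));   % RMS error before training

for k = 1:nStages
    Phi  = feat(x);                             % features at the current shape
    dX   = x_true - x;                          % regression target for this stage
    R{k} = (Phi'*Phi + lambda*eye(size(Phi, 2))) \ (Phi'*dX);
    x    = x + Phi*R{k};                        % move the shape with the new regressor
    err(k + 1) = sqrt(mean(sum((x_true - x).^2, 2)));
end

disp(err');   % training RMS error after each stage
```

On the training set the RMS error cannot increase from one stage to the next, because the zero regressor is always a feasible solution of the regularized problem. The five matrices stored in R play the role of {Rk}.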