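Preparing the initial data
mean_shape
mean_shape is the average of the ground_truth points over all training images. How should it be computed? Can we simply add up the landmark coordinates and divide by the number of images?
Clearly that would be hasty and inaccurate. Faces differ widely from image to image and are affected by lighting, pose, and other factors, so the average has to be taken in a common, normalized frame. The MATLAB code is given first: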
function mean_shape = calc_meanshape(shapepathlistfile)
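fid = fopen(shapepathlistfile);
if fid == -1
error('cannot open shape list file: %s', shapepathlistfile);
end
shapepathlist = textscan(fid, '%s', 'delimiter', '\n');
fclose(fid);
% textscan always returns a 1x1 cell, so test the list stored inside it
if isempty(shapepathlist{1})
error('no shape file found');
end
% load the first shape only to learn the landmark layout (rows x 2)
shape_header = loadshape(shapepathlist{1}{1});
if isempty(shape_header)
error('invalid shape file');
end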
mean_shape = zeros(size(shape_header));
num_shapes = 0;
for i = 1:length(shapepathlist{1})
shape_i = double(loadshape(shapepathlist{1}{i}));
if isempty(shape_i)
continue;
end
shape_min = min(shape_i, [], 1);
shape_max = max(shape_i, [], 1);
% translate to origin point
shape_i = bsxfun(@minus, shape_i, shape_min);
% resize shape
shape_i = bsxfun(@rdivide, shape_i, shape_max - shape_min);
mean_shape = mean_shape + shape_i;
num_shapes = num_shapes + 1;
end
mean_shape = mean_shape ./ num_shapes;
img = 255 * ones(500, 500, 3);
drawshapes(img, 50 + 400 * mean_shape);
end
function shape = loadshape(path)
% function: load shape from pts file
file = fopen(path);
if file == -1
% fopen failed, so there is no valid handle to close
shape = [];
return;
end
shape = textscan(file, '%d16 %d16', 'HeaderLines', 3, 'CollectOutput', 2);
fclose(file);
shape = shape{1};
end
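Explanation:
Each shape is shifted so that its bounding box starts at the origin and is then divided by the bounding-box size, which maps every landmark into the unit square. In formula form, with Region = [x_min, y_min, width, height] (shape_min and shape_max - shape_min in the code):
$$\frac{S_{gt} - [Region(1), Region(2)]}{[Region(3), Region(4)]} \Rightarrow [0,1] \times [0,1]$$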
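As a quick illustration of this normalization, here is a minimal sketch on a made-up four-point shape. The coordinates and the variable name region are assumptions for the example only; they are not taken from any dataset or from the original code.

```matlab
% Toy shape: 4 landmarks (x, y) in image pixels. Values are made up.
shape_gt = [120 80; 220 80; 170 150; 170 200];

% Tight bounding box of the landmarks: [x_min, y_min, width, height].
shape_min = min(shape_gt, [], 1);
shape_max = max(shape_gt, [], 1);
region = [shape_min, shape_max - shape_min];

% Translate to the origin, then divide each axis by the box size.
shape_norm = bsxfun(@rdivide, bsxfun(@minus, shape_gt, region(1:2)), region(3:4));
disp(shape_norm);
```

After this step every shape lives in the same unit square, so averaging the normalized shapes is meaningful even when the faces have very different sizes and positions.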
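Preparing $\Delta S_t$
The core idea of 3000FPS is:
$$\Delta S_t = W_t \Phi_t(I, S_{t-1})$$
Here $\Delta S_t = S_{gt} - S_{t-1}$ is the residual at stage $t$, $\Phi_t(I, S_{t-1})$ is the feature extraction function, and $W_t$ is the linear regression matrix. As the article 《人脸配准坐标变换解析》 (an analysis of coordinate transforms in face alignment) shows, $\Delta S_t$ has to go through a similarity transform, whereas $\Phi_t(I, S_{t-1})$ does not.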
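To make the update rule concrete, the following is a rough sketch of a single stage with toy numbers. The sizes L and D, the contents of W and Phi, and the convention that the first L entries of W * Phi are x coordinates are all assumptions for illustration; the coordinate-frame normalization of the residual discussed in this section is deliberately left out.

```matlab
% Hypothetical toy sizes: L landmarks, D binary local features.
L = 3; D = 6;

% Previous-stage shape S_{t-1}, one landmark (x, y) per row. Values are made up.
S_prev = [10 10; 20 10; 15 20];

% Phi: binary feature vector, 1 where a feature is active (illustrative pattern).
Phi = [1; 0; 0; 1; 0; 1];

% W_t: (2L x D) regression matrix, filled here with arbitrary numbers.
W = reshape(1:(2 * L * D), 2 * L, D) / 100;

% Stage update: Delta S_t = W_t * Phi_t, then S_t = S_{t-1} + Delta S_t.
dS = reshape(W * Phi, L, 2);
S_t = S_prev + dS;

% With binary features, W * Phi is just the sum of the active columns of W.
dS_lookup = reshape(sum(W(:, logical(Phi)), 2), L, 2);
```

The last line shows why a binary $\Phi_t$ is cheap to evaluate: the matrix product reduces to picking and adding columns of $W_t$.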
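The similarity transform proceeds as follows.
First center both $S_t$ and $S_0$, then solve for the transform $cR$ (scale $c$, rotation $R$) in
$$S_0 = cR\,S_t$$
where $S_0$ and $S_t$ now denote the centered shapes. Once $cR$ is known, apply the same transform to $\Delta S_t$:
$$\Delta\tilde{S}_t = cR\,\Delta S_t$$
The transformed $\Delta\tilde{S}_t$ is what we use to solve for the linear regression matrix $W$.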
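The three steps above can be sketched in base MATLAB without any toolbox. This is only an illustration under stated assumptions: the shapes are toy values, and the least-squares fit is written with complex numbers, which is one standard way to estimate a 2D non-reflective similarity; the training code quoted below relies on fitgeotrans instead.

```matlab
% Toy current shape S_t (rows are landmarks); values are illustrative only.
S_t = [10 20; 40 22; 25 45; 18 60; 33 61];

% Build a mean shape S_0 that is an exact similarity transform of S_t,
% so the recovered cR can be checked against a known answer.
c_true = 0.5; theta = pi / 6;
M_true = c_true * [cos(theta) -sin(theta); sin(theta) cos(theta)];
S_0 = S_t * M_true' + repmat([3 -7], size(S_t, 1), 1);

% Step 1: center both shapes.
St_c = bsxfun(@minus, S_t, mean(S_t, 1));
S0_c = bsxfun(@minus, S_0, mean(S_0, 1));

% Step 2: least-squares fit of S0_c ~ cR * St_c, treating points as complex numbers.
z = complex(St_c(:, 1), St_c(:, 2));
w = complex(S0_c(:, 1), S0_c(:, 2));
alpha = (z' * w) / (z' * z);          % z' is the conjugate transpose
cR = [real(alpha) -imag(alpha); imag(alpha) real(alpha)];

% Step 3: apply the same cR (no translation) to a residual Delta S_t.
dS = [1 0; 0 2; -1 1; 2 2; 0 -1];
dS_tilde = dS * cR';
```

Because a residual is a difference of two shapes, only the scale and rotation part of the transform applies to it; the translation cancels out.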
Code first, from train_model.m starting at line 103:
Param.meanshape = S0(Param.ind_usedpts, :); % keep only the selected landmarks
dbsize = length(Data);
% load('Ts_bbox.mat');
augnumber = Param.augnumber; % number of initial shapes chosen for each face
for i = 1:dbsize
% initialize the shape of the current face image by randomly selecting multiple shapes from other face images
% indice = ceil(dbsize*rand(1, augnumber));
indice_rotate = ceil(dbsize*rand(1, augnumber));
indice_shift = ceil(dbsize*rand(1, augnumber));
scales = 1 + 0.2*(rand([1 augnumber]) - 0.5);
Data{i}.intermediate_shapes = cell(1, Param.max_numstage); % intermediate shapes, one cell per stage
Data{i}.intermediate_bboxes = cell(1, Param.max_numstage);
Data{i}.intermediate_shapes{1} = zeros([size(Param.meanshape), augnumber]); % 68*2*augnumber (augnumber = number of initial shapes set up for image i)
Data{i}.intermediate_bboxes{1} = zeros([augnumber, size(Data{i}.bbox_gt, 2)]); %augnumber*4
Data{i}.shapes_residual = zeros([size(Param.meanshape), augnumber]); % shape residuals, size 68*2*augnumber
Data{i}.tf2meanshape = cell(augnumber, 1);
Data{i}.meanshape2tf = cell(augnumber, 1);
% if Data{i}.isdet == 1
% Data{i}.bbox_facedet = Data{i}.bbox_facedet*ts_bbox;
% end
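% The block below works as follows. If augnumber = 1, each image has a single init_shape, so it is simply set to
% mean_shape. In that case Data{i}.tf2meanshape{1} is effectively the identity, because it maps mean_shape onto
% mean_shape. The later entries are different: for augnumber > 1 the other init_shapes are produced by
% translation, rotation and scaling, or could be picked from the shapes of other images.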
for sr = 1:augnumber
if sr == 1
% estimate the similarity transformation from initial shape to mean shape
% Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_gt, Param.meanshape);
% Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_gt;
Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_facedet, Param.meanshape);
Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_facedet;
% reproject the mean shape onto the face detection bbox
meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape); % identical to Data{i}.intermediate_shapes{1}(:,:, sr)
% estimate the similarity transform between the current shape and the mean shape
Data{i}.tf2meanshape{1} = fitgeotrans(bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
(bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), 'NonreflectiveSimilarity');
Data{i}.meanshape2tf{1} = fitgeotrans((bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), 'NonreflectiveSimilarity');
% calculate the residual shape from initial shape to groundtruth shape under normalization scale
shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);
% transform the shape residual in the image coordinate to the mean shape coordinate
[u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
Data{i}.shapes_residual(:, 1, 1) = u';
Data{i}.shapes_residual(:, 2, 1) = v';
else
% randomly rotate the shape
% shape = resetshape(Data{i}.bbox_gt, Param.meanshape); % Data{indice_rotate(sr)}.shape_gt
shape = resetshape(Data{i}.bbox_facedet, Param.meanshape); % Data{indice_rotate(sr)}.shape_gt
% build a new initial shape from the randomly chosen scale, rotation and translation, then project it onto the bbox
if Param.augnumber_scale ~= 0
shape = scaleshape(shape, scales(sr));
end
if Param.augnumber_rotate ~= 0
shape = rotateshape(shape);
end
if Param.augnumber_shift ~= 0
shape = translateshape(shape, Data{indice_shift(sr)}.shape_gt);
end
Data{i}.intermediate_shapes{1}(:, :, sr) = shape;
Data{i}.intermediate_bboxes{1}(sr, :) = getbbox(shape);
meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape); % reproject the mean shape onto this bbox
Data{i}.tf2meanshape{sr} = fitgeotrans(bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), 'NonreflectiveSimilarity');
Data{i}.meanshape2tf{sr} = fitgeotrans(bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), 'NonreflectiveSimilarity');
shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, sr), [Data{i}.intermediate_bboxes{1}(sr, 3) Data{i}.intermediate_bboxes{1}(sr, 4)]);
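% NOTE: tf2meanshape{1} is used here although tf2meanshape{sr} was just computed above; the commented-out line below uses {sr}
[u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');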
Data{i}.shapes_residual(:, 1, sr) = u';
Data{i}.shapes_residual(:, 2, sr) = v';
% Data{i}.shapes_residual(:, :, sr) = tformfwd(Data{i}.tf2meanshape{sr}, shape_residual(:, 1), shape_residual(:, 2));
end
end
end
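Understanding this code requires the article 《人脸配准坐标变换解析》 mentioned above.
Following that article,
$$\left.\begin{aligned}\bar{S}_0 &= S_0 - \mathrm{mean}(S_0)\\ \bar{S}_1 &= S_1 - \mathrm{mean}(S_1)\end{aligned}\right\} \Rightarrow \bar{S}_0 = c_1 R_1 \bar{S}_1$$
Therefore, from
$$\Delta S = S_g - S_1$$
we obtain
$$\Delta\tilde{S} = c_1 R_1 \Delta S$$
The situation here is slightly special, however, and needs one extra step.
Starting from: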
% reproject the mean shape onto the face detection bbox
meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);
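The definition of resetshape shows that meanshape is mapped into intermediate_bboxes, so that $S_0$ and $S_1$ share the same scale and sit at roughly the same position. In mathematical terms:
$$S_0^{resize} = S_0 \cdot Ratio + [Region(1), Region(2)]$$
Here Ratio is simply the size (width and height) of intermediate_bboxes, and [Region(1), Region(2)] is its origin.
Proceeding exactly as before, the offset cancels when the mean is subtracted:
$$\tilde{S}_0 = S_0^{resize} - \mathrm{mean}(S_0^{resize}) = S_0 \cdot Ratio - \mathrm{mean}(S_0) \cdot Ratio = (S_0 - \mathrm{mean}(S_0)) \cdot Ratio = \bar{S}_0 \cdot Ratio$$
Fitting the similarity transform on these centered shapes then gives $\tilde{S}_0 = Ratio \cdot \bar{S}_0 = \tilde{c}_1 \tilde{R}_1 \bar{S}_1$. (★)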
This corresponds to the following lines of the training code:
Data{i}.tf2meanshape{1} = fitgeotrans(bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
(bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), 'NonreflectiveSimilarity');
Data{i}.tf2meanshape{1} is exactly the $\tilde{c}_1 \tilde{R}_1$ computed here.
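The centering identity behind (★) is easy to check numerically. The sketch below uses a made-up normalized mean shape and a hypothetical bbox in [x, y, width, height] form; it re-implements the formula for $S_0^{resize}$ directly rather than calling resetshape, whose source is not shown here.

```matlab
% Mean shape normalized to [0,1] x [0,1]; values are illustrative.
S0 = [0 0; 1 0; 0.5 0.6; 0.5 1];

% Hypothetical bbox in the form [x, y, width, height].
bbox = [40 60 200 240];
Ratio = bbox(3:4);

% S0_resize = S0 * Ratio + [Region(1), Region(2)], applied per axis.
S0_resize = bsxfun(@plus, bsxfun(@times, S0, Ratio), bbox(1:2));

% Centering removes the offset, leaving only the per-axis scaling.
lhs = bsxfun(@minus, S0_resize, mean(S0_resize, 1));
rhs = bsxfun(@times, bsxfun(@minus, S0, mean(S0, 1)), Ratio);
disp(max(abs(lhs(:) - rhs(:))));
```

The printed difference should be zero up to floating-point rounding, which confirms that the bbox position plays no role once both shapes are centered.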
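What we actually want, though, is
$$\bar{S}_0 = c_1 R_1 \bar{S}_1$$
No need to worry: (★) points the way.
$$c_1 R_1 = \tilde{c}_1 \tilde{R}_1 / Ratio = \tilde{c}_1 \tilde{R}_1 / \mathrm{intermediate\_bboxes}$$
Therefore:
$$\Delta\tilde{S} = \tilde{c}_1 \tilde{R}_1 / \mathrm{intermediate\_bboxes} \cdot \Delta S$$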
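This is what the code does: it first divides the residual by the bbox size and then applies tf2meanshape.
% estimate the similarity transform between the current shape and the mean shape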
Data{i}.tf2meanshape{1} = fitgeotrans(bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))),(bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), 'NonreflectiveSimilarity');
Data{i}.meanshape2tf{1} = fitgeotrans((bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))),bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), 'NonreflectiveSimilarity');
% calculate the residual shape from initial shape to groundtruth shape under normalization scale
shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);
% transform the shape residual in the image coordinate to the mean shape coordinate
[u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
Data{i}.shapes_residual(:, 1, 1) = u';
Data{i}.shapes_residual(:, 2, 1) = v';
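As a sanity check of the relation $c_1 R_1 = \tilde{c}_1 \tilde{R}_1 / Ratio$, here is a small sketch under a simplifying assumption: the bbox is square, so Ratio is a single scalar r. The helper handles to_c, fit_sim and to_mat are hypothetical names introduced for this example, and the complex least-squares fit stands in for fitgeotrans.

```matlab
% Helpers (hypothetical names): points as complex numbers, least-squares
% similarity from centered src to centered dst, and its 2x2 matrix form.
to_c = @(P) complex(P(:, 1), P(:, 2));
fit_sim = @(src, dst) (to_c(src)' * to_c(dst)) / (to_c(src)' * to_c(src));
to_mat = @(a) [real(a) -imag(a); imag(a) real(a)];

% Toy shapes: S1 in image pixels, S0 normalized; both are then centered.
S1 = [100 120; 180 118; 140 170; 140 210];
S0 = [0 0; 1 0; 0.5 0.6; 0.5 1];
S1_bar = bsxfun(@minus, S1, mean(S1, 1));
S0_bar = bsxfun(@minus, S0, mean(S0, 1));

r = 200;                                   % assumed square bbox side length
alpha = fit_sim(S1_bar, S0_bar);           % c1 * R1
alpha_t = fit_sim(S1_bar, r * S0_bar);     % c1~ * R1~ (fit against the resized mean shape)

% Residual in image pixels, mapped to the mean-shape frame in two equivalent ways.
dS = [3 -2; -4 1; 0 5; 2 2];
dS_direct = dS * to_mat(alpha)';           % apply c1 * R1 directly
dS_code = (dS / r) * to_mat(alpha_t)';     % divide by bbox size, then apply c1~ * R1~
```

Both routes give the same normalized residual, which matches the order of operations in the quoted code: divide by the bbox size first, then apply the fitted transform.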