dlib Face Recognition Code Walkthrough

License: CC 4.0 BY-SA

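1 Training the face landmark detector

The face landmark detector is trained with a cascaded regression algorithm (see examples/train_shape_predictor_ex.cpp).

1.1 Principle

1.1.1 Cascaded regression formula

The cascaded regression tree algorithm used in dlib comes from the paper: One Millisecond Face Alignment with an Ensemble of Regression Trees by Vahid Kazemi and Josephine Sullivan, CVPR 2014.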

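First, the cascaded regression formula is:

$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + r_t\left(I, \hat{S}^{(t)}\right)$$

where $\hat{S}^{(t)}$ is the shape estimate at stage $t$ (the coordinates of the 68 landmarks), $I$ is the image (the data), and $r_t$ is the update produced by the stage-$t$ regressor. The regressors are learned with GBDT (gradient boosted decision trees), so each stage learns the residual between the current shape and the ground-truth shape.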

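1.1.2 Solving for the regressors

To solve for the regressor $r_0$, an initial shape must be set first. Assume the training data is given as:
[equation image not available]
Iterating with the cascaded regression formula then gives:

$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + r_t\left(I, \hat{S}^{(t)}\right)$$

$$\Delta S_i^{(t+1)} = S_{\pi_i} - \hat{S}_i^{(t+1)}$$

Here $S_{\pi_i}$ is the ground-truth shape for the $i$-th training sample, so $\Delta S_i^{(t+1)}$ is the residual that the next regressor has to learn.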

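Each $r_t$ (including $r_0$) is solved with the GBDT algorithm, as follows:
[algorithm image not available]
Explanation of the algorithm steps:

1. Initialization: estimate the constant $\gamma$ that minimizes the loss function. This corresponds to a tree with only a root node.

2. (a) Compute the residuals.
(b) Estimate the leaf regions of a regression tree that fits an approximation of the residuals; this gives the regression tree for the current round.
(c) Update the regression tree.
(d) Obtain the regression function.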

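1.1.3 Split points

A regression tree has many split nodes and leaf nodes. Whether a node is split is decided by the following formula:
[equation image not available]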
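Since the formula image is not available here, the following is not a reconstruction of it. It is only a sketch of the kind of test the cited Kazemi and Sullivan paper describes at a split node: the intensity difference of two pixels is compared against a threshold. `GrayImage`, `PixelPos` and `SplitTest` are hypothetical types for illustration, and which child counts as "left" is a convention chosen here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal grayscale image, stored row by row.
struct GrayImage
{
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<unsigned char> pixels;
    int at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

struct PixelPos
{
    std::size_t x;
    std::size_t y;
};

// Parameters of one split node: two pixel positions and a threshold.
struct SplitTest
{
    PixelPos u;
    PixelPos v;
    int threshold;

    // Returns true when I(u) - I(v) > threshold; the sample then goes to the
    // left child (assumed convention), otherwise to the right child.
    bool goes_left(const GrayImage& img) const
    {
        return img.at(u.x, u.y) - img.at(v.x, v.y) > threshold;
    }
};
```

For a two-pixel image with intensities 200 and 50, the difference is 150, so a threshold of 100 sends the sample left while a threshold of 150 sends it right.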

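1.2 Source code

The complete training code for the face landmark detector is given below (with comments):

Function: train your own face landmark detector with dlib (based on dlib/examples/train_shape_predictor_ex)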

#include <dlib/image_processing.h>
#include <dlib/data_io.h>
#include <iostream>

using namespace dlib;
using namespace std;
// ----------------------------------------------------------------------------------------
// Get the interocular distances: the output D[i][j] is the distance between the eyes of the face in objects[i][j]
std::vector<std::vector<double> > get_interocular_distances(
    const std::vector<std::vector<full_object_detection> >& objects
    );
// ----------------------------------------------------------------------------------------

int main(int argc, char** argv)
{
    try
    {
        // I. Preprocessing
        // 1. Load the training set and the test set
        const std::string faces_directory = "faces";
        dlib::array<array2d<unsigned char> > images_train, images_test;
        std::vector<std::vector<full_object_detection> > faces_train, faces_test;

        load_image_dataset(images_train, faces_train, faces_directory + "/training_with_face_landmarks.xml");
        load_image_dataset(images_test, faces_test, faces_directory + "/testing_with_face_landmarks.xml");

        // II. Training
        // 1. Define the trainer
        shape_predictor_trainer trainer;
        // Set the training parameters
        trainer.set_oversampling_amount(300); 
        trainer.set_nu(0.05);
        trainer.set_tree_depth(2);
        trainer.be_verbose();

        // 2. Train and produce the face landmark detector
        shape_predictor sp = trainer.train(images_train, faces_train);

        // III. Testing
        cout << "mean training error: " <<
            test_shape_predictor(sp, images_train, faces_train, get_interocular_distances(faces_train)) << endl;
        cout << "mean testing error:  " <<
            test_shape_predictor(sp, images_test, faces_test, get_interocular_distances(faces_test)) << endl;

        // IV. Saving
        serialize("sp.dat") << sp;
    }
    catch (exception& e)
    {
        cout << "\nexception thrown!" << endl;
        cout << e.what() << endl;
    }
}

// ----------------------------------------------------------------------------------------
double interocular_distance(
    const full_object_detection& det
    )
{
    dlib::vector<double, 2> l, r;
    double cnt = 0;
    // Find the center of the left eye by averaging the points around 
    // the eye.
    for (unsigned long i = 36; i <= 41; ++i)
    {
        l += det.part(i);
        ++cnt;
    }
    l /= cnt;

    // Find the center of the right eye by averaging the points around 
    // the eye.
    cnt = 0;
    for (unsigned long i = 42; i <= 47; ++i)
    {
        r += det.part(i);
        ++cnt;
    }
    r /= cnt;

    // Now return the distance between the centers of the eyes
    return length(l - r);
}
// Compute the interocular distance for every face in every image
std::vector<std::vector<double> > get_interocular_distances(
    const std::vector<std::vector<full_object_detection> >& objects
    )
{
    std::vector<std::vector<double> > temp(objects.size());
    for (unsigned long i = 0; i < objects.size(); ++i)
    {
        for (unsigned long j = 0; j < objects[i].size(); ++j)
        {
            temp[i].push_back(interocular_distance(objects[i][j]));
        }
    }
    return temp;
}
// ----------------------------------------------------------------------------------------
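The interocular distances are passed to test_shape_predictor as per-face scales, so that the reported error does not depend on how large a face appears in the image. As I understand it, the reported value is the mean landmark distance after dividing by that scale. The sketch below shows this normalization for a single face in plain C++; `Point2`, `point_distance` and `normalized_mean_error` are hypothetical names, and dlib's own function averages over all faces in the dataset.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2
{
    double x;
    double y;
};

// Euclidean distance between two points.
double point_distance(const Point2& a, const Point2& b)
{
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Mean over all landmarks of |predicted - truth| / scale, for one face.
// `scale` would be the interocular distance of that face.
double normalized_mean_error(const std::vector<Point2>& predicted,
                             const std::vector<Point2>& truth,
                             double scale)
{
    if (predicted.empty() || scale <= 0.0)
        return 0.0;
    double total = 0.0;
    for (std::size_t k = 0; k < predicted.size(); ++k)
        total += point_distance(predicted[k], truth[k]) / scale;
    return total / predicted.size();
}
```

With two landmarks, one predicted exactly and one off by 5 pixels, and an interocular distance of 10 pixels, the normalized error is (0 + 0.5) / 2 = 0.25. The same 5-pixel miss on a face twice as large would count half as much.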

1.3 Code walkthrough

1.3.1 Preprocessing stage

1. Load the training set and the test set

load_image_dataset(images_train, faces_train, faces_directory + "/training_with_face_landmarks.xml");
loa
