Learning Dlib: Face Comparison


似董非董


Article link: https://blog.csdn.net/u014587351/article/details/83827439


1. The Dlib library
- What the dlib library is for
- Face comparison workflow
2. Afterthoughts

http://dlib.net/ml.html

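1. The Dlib library

This post is a set of notes on using dlib to implement face comparison (a simple login) on an embedded platform. The main libraries involved are OpenCV and dlib, and because the target is embedded, dlib has to be cross-compiled. I consulted many online resources along the way; this post records the details of the whole learning process. If you spot any mistakes, please point them out.
A further note: I first cross-compiled dlib-19.1, but its dnn functionality turned out to be incomplete, so I switched to dlib-19.7. Both OpenCV and dlib were cross-compiled with cmake (not described in detail here).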

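What the dlib library is for

Official description
Official dlib link: http://dlib.net/
The dlib library is actually quite large. Its library index lists: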

Algorithms
API Wrappers
Bayesian Nets
Compression
Containers
Graph Tools
Image Processing
Linear Algebra
Machine Learning
Metaprogramming
Miscellaneous
Networking
Optimization
Parsing

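Let's go straight to the Machine Learning chapter.

Face comparison workflow

Face localization
There are many approaches to face detection. OpenCV ships the classic (antique) Haar feature cascade; its advantage is that detection speed on an embedded platform is acceptable, but its accuracy leaves room for improvement.
Dlib also provides a face detection method, but on a pure CPU it is strikingly slow. Let's look at the classic approach first.
opencv
Classic Haar features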

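// src (input Mat), face (std::vector<Rect>) and facePeople (output Mat) are declared elsewhere
string cascadeName = "haarcascade_frontalface_alt.xml";
    CascadeClassifier faceCascade;
    cvtColor(src,src,CV_BGR2GRAY);//convert to a grayscale image
    equalizeHist(src,src);//equalize the histogram to improve contrast
    if(!faceCascade.load(cascadeName))
    {
        qDebug()<<"load error!";//note: running under Qt
    }
    faceCascade.detectMultiScale(src,face,1.1,2,0|CV_HAAR_SCALE_IMAGE,Size(100,100));
    for(size_t i=0;i<face.size();i++)
    {
        rectangle(src,face[i],Scalar(0,0,255));
    }
    if(!face.empty())//avoid indexing an empty detection result
    {
        src(face[0]).copyTo(facePeople);
    }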

dlib

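#include <dlib/dnn.h>
#include <dlib/image_processing/frontal_face_detector.h>
#include <dlib/image_io.h>
#include <sys/time.h>
// timing variables (named to match their uses below)
struct timeval start;
struct timeval endtime;
unsigned long timer;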
....
    try
    {
gettimeofday(&start,NULL);
        frontal_face_detector detector = get_frontal_face_detector();
            cout << "processing image "<< endl;
            array2d<unsigned char> img;
            load_image(img,"./people.jpg");
            // Make the image bigger by a factor of two.  This is useful since
            // the face detector looks for faces that are about 80 by 80 pixels
            // or larger.  Therefore, if you want to find faces that are smaller
            // than that then you need to upsample the image as we do here by
            // calling pyramid_up().  So this will allow it to detect faces that
            // are at least 40 by 40 pixels in size.  We could call pyramid_up()
            // again to find even smaller faces, but note that every time we
            // upsample the image we make the detector run slower since it must
            // process a larger image.
            pyramid_up(img);
            // Now tell the face detector to give us a list of bounding boxes
            // around all the faces it can find in the image.
            std::vector<rectangle> dets = detector(img);

            cout << "Number of faces detected: " << dets.size() << endl;
            for(rectangle n : dets) {
                cout << "faces detected: " << n << endl;
            }
            gettimeofday(&endtime,NULL);
            timer=1000000*(endtime.tv_sec-start.tv_sec)+endtime.tv_usec-start.tv_usec;
            printf("timer =%lu us\n",timer);
    }
    catch (exception& e)
    {
        cout << "\nexception thrown!" << endl;
        cout << e.what() << endl;
    }
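
The trade-off described in the comments above can be made concrete with a little arithmetic. The sketch below assumes, as those comments state, that the detector needs faces of roughly 80x80 pixels and that each pyramid_up() call enlarges the image by a factor of two in each dimension. The helper names `min_face_size` and `pixel_cost_factor` are hypothetical and are not part of dlib.

```cpp
#include <cassert>

// Smallest detectable face side, in pixels of the original image, after
// `upsamples` calls to pyramid_up(). Assumes an 80-pixel base size and a
// 2x enlargement per call (both taken from the comments in the listing).
inline double min_face_size(int upsamples)
{
    assert(upsamples >= 0);
    double size = 80.0;
    for (int i = 0; i < upsamples; ++i)
        size /= 2.0;   // each 2x upsample halves the minimum face size
    return size;
}

// How many times more pixels the detector has to scan after `upsamples`
// calls: width and height both double, so the pixel count grows 4x per call.
inline double pixel_cost_factor(int upsamples)
{
    assert(upsamples >= 0);
    double factor = 1.0;
    for (int i = 0; i < upsamples; ++i)
        factor *= 4.0;
    return factor;
}
```

Under these assumptions, one upsample brings the minimum face size down to 40 pixels at the price of scanning four times as many pixels, which is one reason the detector gets expensive on a small CPU.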

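By my measurement, on a pure CPU it took nearly 600 ms to locate the faces in the image. On the embedded platform (rk3128, Rockchip) it took nearly a minute to produce a result, so this approach was ruled out for the embedded platform.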
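
Measurements like these can also be taken with the standard library instead of gettimeofday(). The following is a minimal sketch, assuming a C++11 compiler; `time_us` is a hypothetical helper name, not something from dlib or OpenCV.

```cpp
#include <chrono>
#include <cstdint>

// Runs `work` once and returns the elapsed time in microseconds.
// steady_clock is used because it never jumps backwards, unlike the
// wall-clock time that gettimeofday() reports.
template <typename Work>
std::int64_t time_us(Work&& work)
{
    const auto begin = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
}
```

Wrapping each stage in its own call, for example `time_us([&]{ dets = detector(img); })`, would separate the detection time from image loading, which makes PC and embedded numbers easier to compare.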

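Face comparison
The previous step yields the face, that is, the face region. Face comparison can itself be split into two parts:
Feature vector extraction
Use the interface to the pre-trained ResNet model that ships with dlib directly. It returns a 128-dimensional face feature vector. To compare against multiple faces, simply store each 128-dimensional vector together with a label.
Distance matching
Once a feature vector is obtained, compute its Euclidean distance to the labeled feature vectors stored earlier and decide with a nearest-neighbor classifier (KNN). (A distance below 0.6 can generally be taken to mean the same face; the threshold can be tuned slightly in practice.)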
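
The two parts above can be sketched in plain C++ without dlib. This is only an illustration under stated assumptions: descriptors are held in `std::vector<float>`, the gallery is a simple in-memory list, and the names `EnrolledFace`, `descriptor_distance` and `match_face` are hypothetical. The 0.6 default is the threshold mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// One enrolled face: a label plus its stored descriptor (128 values in
// this post's setup; any fixed length works for the sketch).
struct EnrolledFace
{
    std::string label;
    std::vector<float> descriptor;
};

// Euclidean distance between two descriptors of equal length.
inline float descriptor_distance(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        const double d = static_cast<double>(a[i]) - b[i];
        sum += d * d;
    }
    return static_cast<float>(std::sqrt(sum));
}

// Returns the label of the closest enrolled face, or an empty string if
// the gallery is empty or the closest distance is not below `threshold`.
inline std::string match_face(const std::vector<EnrolledFace>& gallery,
                              const std::vector<float>& probe,
                              float threshold = 0.6f)
{
    float best = std::numeric_limits<float>::max();
    std::string best_label;
    for (const EnrolledFace& face : gallery)
    {
        const float d = descriptor_distance(face.descriptor, probe);
        if (d < best)
        {
            best = d;
            best_label = face.label;
        }
    }
    return best < threshold ? best_label : std::string();
}
```

Returning an empty string for "no match" keeps the login decision to a single check; a real system might prefer to return the distance as well so the threshold can be tuned from logs.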
For reference, the Euclidean distance formulas:

2D formula
　　ρ = sqrt( (x1-x2)^2+(y1-y2)^2 )
3D formula
　　ρ = sqrt( (x1-x2)^2+(y1-y2)^2+(z1-z2)^2 )
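
The same pattern extends to any number of dimensions. For the 128-dimensional descriptors used in this post, writing the two vectors as a and b (symbols introduced here for illustration), the distance would be:

```latex
\rho(a, b) = \sqrt{\sum_{i=1}^{128} (a_i - b_i)^2}
```

This is the quantity that dlib's length() evaluates when it is applied to the difference of two descriptor vectors, as in the reference code.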

Reference code:

#include <iostream>
#include <sys/time.h>
#include <stdio.h>
#include <dlib/dnn.h>
#include <dlib/image_io.h>
#include <vector>
#include <opencv2/opencv.hpp>


using namespace dlib;
using namespace std;
using namespace cv;

//time struct
struct timeval starttime;
struct timeval endtime;
unsigned  long timer;

template <template <int, template<typename>class, int, typename> class block, int N, template<typename>class BN, typename SUBNET>
using residual = add_prev1<block<N, BN, 1, tag1<SUBNET>>>;

template <template <int, template<typename>class, int, typename> class block, int N, template<typename>class BN, typename SUBNET>
using residual_down = add_prev2<avg_pool<2, 2, 2, 2, skip1<tag2<block<N, BN, 2, tag1<SUBNET>>>>>>;

template <int N, template <typename> class BN, int stride, typename SUBNET>
using block = BN<con<N, 3, 3, 1, 1, relu<BN<con<N, 3, 3, stride, stride, SUBNET>>>>>;

template <int N, typename SUBNET> using ares = relu<residual<block, N, affine, SUBNET>>;
template <int N, typename SUBNET> using ares_down = relu<residual_down<block, N, affine, SUBNET>>;

template <typename SUBNET> using alevel0 = ares_down<256, SUBNET>;
template <typename SUBNET> using alevel1 = ares<256, ares<256, ares_down<256, SUBNET>>>;
template <typename SUBNET> using alevel2 = ares<128, ares<128, ares_down<128, SUBNET>>>;
template <typename SUBNET> using alevel3 = ares<64, ares<64, ares<64, ares_down<64, SUBNET>>>>;
template <typename SUBNET> using alevel4 = ares<32, ares<32, ares<32, SUBNET>>>;

 using anet_type = loss_metric<fc_no_bias<128, avg_pool_everything
         <alevel0<alevel1<alevel2<
                                alevel3<
                                        alevel4<
                                                max_pool<3, 3, 2, 2, relu<affine<con<32, 7, 7, 2, 2,
                                                        input_rgb_image_sized<150>
                                                >>>>>>>>>>>>;

anet_type net;

int main(int argc, char** argv)
{
    dlib::matrix<rgb_pixel> img1;
    dlib::matrix<rgb_pixel> img2;

    // Resize the image:
    //    cv::Mat img = imread("./people.jpg");
    //    cv::resize(img,img,cv::Size(150,150));
    //    imwrite("face.jpg",img);
    // Note: the face images used in this test are already 150x150; if yours are not, resize them first. Because I am not much to look at and would rather not scare anyone, I am not posting the pictures. Once more: make sure the images being compared are 150x150.
    gettimeofday(&starttime,NULL);

    cout << "compare image "<< endl;
    load_image(img1,"./face.jpg");
    load_image(img2,"./face1.jpg");
    deserialize("dlib_face_recognition_resnet_model_v1.dat") >> net; // load the pre-trained ResNet model

    std::vector<matrix<rgb_pixel>> faces;
    faces.push_back(img1);
    faces.push_back(img2);
    std::vector<matrix<float,0,1>> face_descriptors = net(faces);
    float f = length(face_descriptors[0]-face_descriptors[1]); // Euclidean distance between the two descriptors
    cout<<"the leanth of face_descriptors is"<<f<<endl;
    gettimeofday(&endtime,NULL);
    timer=1000000*(endtime.tv_sec-starttime.tv_sec)+endtime.tv_usec-starttime.tv_usec;
    printf("timer =%lu us\n",timer);
    return 0;
}

The demo above was run on a PC (6th-generation i5, 16 GB RAM, Microsoft 256 GB SSD) and printed:

compare image 
the leanth of face_descriptors is74.3464
timer =607111 us

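About 600 ms. Of course, this demo reads the images from the local disk, so the timing error is large.

2. Afterthoughts

Improving the face localization approach
Since the target here is an embedded platform, using dlib in CPU mode to locate faces is not realistic, and the classic OpenCV approach does not perform very well so far. This part needs some improvement.
I will stop here for now. I have not yet tried dlib's other features, such as its GUI, socket and pthread support; I will find time to try them and get a feel for dlib's performance. This post draws on many online sources, combined with my own experience cross-compiling for an embedded platform.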