[Deep Learning] Face Recognition and Finding a Person in Video: An Implementation

Article link: https://blog.csdn.net/chengcheng1394/article/details/77817194

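This post describes how to detect faces in a given image or video and compare their features. The author uses SeetaFaceEngine for face detection and alignment, then extracts features with a VGG model to compute face similarity. The code and detailed steps are available in the GitHub projects.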


Please credit the source when reposting: http://blog.csdn.net/chengcheng1394/article/details/77817194

This program uses SeetaFaceEngine and cv2.CascadeClassifier for face detection, then uses a VGG model to compare face features.

Code download:

https://github.com/chengstone/FindFaceInVideo

https://github.com/chengstone/SeetaFaceEngine

First, the results:

The results of finding a person in video are in the out folder of the GitHub project: https://github.com/chengstone/FindFaceInVideo/tree/master/VGGFace/out.

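The idea is as follows. The goal is to find a person in a given image or video, and that comes down to two problems: detecting faces, and comparing face features. The higher the similarity between the features of a detected face and those of the target face, the more likely it is that the person has been found.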
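As a rough illustration of the second problem, the comparison can be thought of as scoring two feature vectors and accepting the match when the score is high enough. The sketch below uses cosine similarity. That metric, the function names `cosineSimilarity` and `isSamePerson`, and the 0.8 threshold are all assumptions made for illustration; they are not taken from the project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity between two equal-length feature vectors.
// The result lies in [-1, 1]; 0 is returned if either vector has zero norm.
double cosineSimilarity(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot   += static_cast<double>(a[i]) * b[i];
        normA += static_cast<double>(a[i]) * a[i];
        normB += static_cast<double>(b[i]) * b[i];
    }
    if (normA == 0.0 || normB == 0.0) return 0.0;
    return dot / (std::sqrt(normA) * std::sqrt(normB));
}

// Hypothetical decision rule: treat two faces as the same person when
// their similarity reaches the threshold (the default value is an assumption).
bool isSamePerson(const std::vector<float>& a, const std::vector<float>& b,
                  double threshold = 0.8)
{
    return cosineSimilarity(a, b) >= threshold;
}
```

In practice the two vectors would be the features extracted from the target face and from a face found in the image or video frame, and the threshold would have to be tuned on real data.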

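0. Environment setup

Setup details are omitted here.

You need to install: OpenCV 2, Caffe, and Python 2.7.

For the project's directory structure and usage instructions, see https://github.com/chengstone/FindFaceInVideo/blob/master/README.md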

1. Face detection

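For the face detection step, I originally wanted to train a convolutional neural network that outputs face locations, but the trained model did not perform well, so I had to switch approaches.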

In the end I used SeetaFaceEngine. The original project is at: https://github.com/seetaface/SeetaFaceEngine.

Only its FaceDetection and FaceAlignment modules are used. FaceAlignment/src/test/face_alignment_test.cpp has been modified, and the changes are outlined below. The modified project is at: https://github.com/chengstone/SeetaFaceEngine

Core logic: FaceDetection locates the faces, FaceAlignment aligns each face (with an affine transform), and the aligned face images are saved for the later feature comparison.
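To make the alignment step more concrete, here is a minimal sketch of how a rotation angle can be derived from the two eye positions and then applied to a point. It only illustrates the geometry. `Point2`, `eyeAngleRadians`, and `rotateAbout` are hypothetical names introduced for this example; the actual alignment in the project is done by the modified FaceAlignment code together with OpenCV.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { double x; double y; };

// Angle (in radians) of the line from the left eye to the right eye,
// measured against the x axis. An angle of 0 means the eyes are level.
double eyeAngleRadians(const Point2& leftEye, const Point2& rightEye)
{
    return std::atan2(rightEye.y - leftEye.y, rightEye.x - leftEye.x);
}

// Rotate p about center by -angle, so that a line tilted by `angle`
// ends up horizontal.
Point2 rotateAbout(const Point2& p, const Point2& center, double angle)
{
    const double c = std::cos(-angle);
    const double s = std::sin(-angle);
    const double dx = p.x - center.x;
    const double dy = p.y - center.y;
    Point2 out;
    out.x = center.x + c * dx - s * dy;
    out.y = center.y + s * dx + c * dy;
    return out;
}
```

Applying `rotateAbout` to every landmark with the angle returned by `eyeAngleRadians` puts both eyes on the same horizontal line; an image rotation by the same angle would align the face itself.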

Core code:

seeta::FaceDetection detector("seeta_fd_frontal_v1.0.bin");
bool procFaceImage(string fullpath, string path, string filename, string ext, string dst_path, string in_size)
{
  // Initialize the face detection model
  detector.SetMinFaceSize(40);
  detector.SetScoreThresh(2.f);
  detector.SetImagePyramidScaleFactor(0.8f);
  detector.SetWindowStep(4, 4);

  // Initialize the face alignment model
  seeta::FaceAlignment point_detector((MODEL_DIR + "seeta_fa_v1.1.bin").c_str());

  // Load the image passed in as a parameter, as grayscale, for face detection
  IplImage *img_grayscale = NULL;
  img_grayscale = cvLoadImage(/*(DATA_DIR + "image_0001.jpg")*/fullpath.c_str(), 0);
  if (img_grayscale == NULL)
  {
    printf("%s\n", fullpath.c_str());
    printf("[0]img_grayscale == NULL\n");
    return false;
  }

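  // Shrink oversized images: an image with too many pixels hurts detection quality.
  // The image is halved until it is at most 2048 pixels wide and 1280 pixels high.
  IplImage *outImg = NULL;
  while(img_grayscale->width  > 1024 + 1024 || img_grayscale->height > 768 + 512 ){
    outImg = cvCreateImage(cvSize(img_grayscale->width / 2, img_grayscale->height / 2), 
                                     img_grayscale->depth, 
                                     img_grayscale->nChannels);
    cvPyrDown(img_grayscale, outImg);
    cvReleaseImage(&img_grayscale);  // free the larger image before replacing it
    img_grayscale = outImg;
  }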

  // Run FaceDetection; multiple faces in one image are supported
  printf("detectFace now!\n");
  seeta::ImageData image_data;
  std::vector<seeta::FaceInfo> faces = detectFace(img_grayscale, &image_data);
  if (faces.size() == (0)) {
    printf("[1]detectFace error!\n");
    return false;
  }
  // size() returns size_t, so cast it to match the %d format
  printf("face number = %d\n", static_cast<int>(faces.size()));


  printf("PointDetectLandmarks now!\n");

  // Build the output location and file name for the annotated image
  string result_path = (/*path*/dst_path + "/" + filename + "_result." + ext);
  // Detect 5 facial landmarks
  seeta::FacialLandmark points[5];

  // Load the input image again, this time in color, and shrink it the same way
  {
    IplImage *img_color = cvLoadImage(/*(DATA_DIR + "image_0001.jpg")*/fullpath.c_str(), 1);
    if (img_color == NULL)
    {
      printf("[0]img_color == NULL\n");
      return false;
    }

    while(img_color->width  > 1024 + 1024 || img_color->height > 768 + 512 ){
      outImg = cvCreateImage(cvSize(img_color->width / 2, img_color->height / 2), 
                                       img_color->depth, 
                                       img_color->nChannels);
      cvPyrDown(img_color, outImg);
      cvReleaseImage(&img_color);  // free the larger image before replacing it
      img_color = outImg;
    }

    // Draw a rectangle around each detected face on the color image and save it.
    // The file name looks like: IMG_3001_result.JPG
    for(int idx = 0;idx < faces.size(); idx++){
      cvRectangle(img_color, cvPoint(faces[idx].bbox.x, faces[idx].bbox.y), cvPoint(faces[idx].bbox.x + faces[idx].bbox.width - 1, faces[idx].bbox.y + faces[idx].bbox.height - 1), CV_RGB(255, 0, 0));
    }
    cvSaveImage(result_path.c_str(), img_color);
    //printf("Show result image\n");
    //cvShowImage("result", img_color);
    cvReleaseImage(&img_color);  // the annotated image is no longer needed
  }

  // Main loop: process each detected face
  for(int idx = 0;idx < faces.size(); idx++){
    printf("Proc No.%d\n", idx);
    // Find the landmarks of this face (stored in points)
    point_detector.PointDetectLandmarks(image_data, faces[idx], points);


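    // Reload the color image for this face. Note that this copy is not shrunk like the images above.
    IplImage *img_color = cvLoadImage(/*(DATA_DIR + "image_0001.jpg")*/fullpath.c_str(), 1);
    int pts_num = 5;
    cv::Mat img = cv::cvarrToMat(img_color);
    // Affine transform: align the face based on the eye coordinates. The landmarks give the rotation angle,
    // the color image is rotated (face alignment), and the rotated landmark coordinates are written back to points.
    // Further below, the face is cropped out and saved.
    Mat retImg = getwarpAffineImg(img, points);
    Mat dstResizeImg;
    IplImage* dstimg_tmp = NULL;
    int resize_num = 0;

    IplImage qImg = IplImage(retImg);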


    // Buffers are sized to hold any int as text, so sprintf cannot overflow them
    char ch_idx[16] = {0};
    sprintf(ch_idx, "%d", idx);
    char ch_size[16] = {0};
    sprintf(ch_size, "%d", atoi(in_size.c_str()));

    // Create a grayscale version of the rotated image; a second detection pass on it
    // locates the face so it can be cropped out and saved.
    // The image format conversions here are admittedly convoluted :P
    IplImage *dst_gray = cvCreateImage(cvGetSize(&qImg), qImg.depth, 1);
    cvCvtColor(&qImg, dst_gray, CV_BGR2GRAY);
    seeta::ImageData image_data_inner;
    // Run face detection on the rotated image
    std::vector<seeta::FaceInfo> faces_inner = detectFace(dst_gray, &image_data_inner);
    if (faces_inner.size() == (0)) {
      printf("[2]detectFace error!\n");
      return false;
    }
    char ch_x1[5] = {0};
    char ch_y1[5] = {0};
    char ch_x2[5] = {0};
    char ch_y2[5] = {0};
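    // idx is the index from the main loop. This assumes that the faces detected on the rotated image
    // come back in the same order as the faces detected in the main loop,
    // because the two detection passes take different input images: one is the original image, the other is the rotated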