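I am a newcomer who has just started using dlib for face recognition work. Here I record the questions and lessons I run into in day-to-day use, in the hope that they offer a little help to other beginners.
First, my working environment: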
- dlib: 19.7
- OS: CentOS 7, text mode (no desktop)
Code notes
1. GUI support
#include <dlib/gui_widgets.h>
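When the program runs on a system without a desktop, anything that displays images in a window is not supported. There are two workarounds:
(1) Remove the GUI-related code, which is fairly tedious;
(2) Install the X11 development package (libx11-dev; on CentOS the package is named libX11-devel). Images still cannot be displayed, but the code needs no changes and compilation no longer fails.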
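For option (1), one way to make removing the GUI code less tedious is to guard it with a preprocessor macro, so the same source builds on both desktop and headless machines. The sketch below assumes a hypothetical macro WITH_GUI that you define yourself at compile time (for example with -DWITH_GUI); it is not something dlib provides, and show_result is an illustrative helper name.

```cpp
#include <iostream>
#include <string>

// WITH_GUI is a hypothetical, user-defined macro (pass -DWITH_GUI on desktop builds).
#ifdef WITH_GUI
#include <dlib/gui_widgets.h>
#endif

// The one place where GUI-only code would live.
// Returns which code path was compiled in, so callers can log it.
std::string show_result(const std::string& image_path)
{
#ifdef WITH_GUI
    // Desktop build: window code (e.g. dlib::image_window) would go here.
    std::cout << "displaying " << image_path << " in a window" << std::endl;
    return "window";
#else
    // Headless build: no window is created, only a log line is written.
    std::cout << "processed " << image_path << " (no display available)" << std::endl;
    return "headless";
#endif
}
```

Built without -DWITH_GUI, the function takes the headless branch and never touches gui_widgets.h.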
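2. Image scaling
Some images are inevitably unsuitable for detection as they are: faces that are too large make detection slow, and faces that are too small are not detected at all. Such images need to be rescaled first.
(1) Enlarging is simple; call the existing interface directly:
pyramid_up(img); // enlarge the image in place
(2) Shrinking:
pyramid_down<2> pyr; // pyramid_down<2> halves the image in each dimension
pyr(img); // shrink the image in place
pyramid_up(img, pyr); // enlarge the shrunken image back again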
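Since pyramid_down<2> halves the image on every call, the number of calls needed to bring a large image under a size limit can be estimated in advance. This is a minimal sketch under assumptions: max_side is a hypothetical limit you choose yourself (the notes above do not give one), halvings_needed is an illustrative helper, and the integer halving only approximates the size dlib actually produces.

```cpp
#include <algorithm>

// Estimate how many 2x downsampling steps bring the longer image side
// to at most max_side. Returns 0 if the image is already small enough
// or if max_side is not positive.
int halvings_needed(long width, long height, long max_side)
{
    int steps = 0;
    long longest = std::max(width, height);
    while (max_side > 0 && longest > max_side)
    {
        longest /= 2;  // one pyramid_down<2> step roughly halves each side
        ++steps;
    }
    return steps;
}
```

Keep in mind that rectangles detected on a shrunken image refer to the shrunken coordinates, so they would need to be scaled back to map onto the original image.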
3. Face detection
matrix<rgb_pixel> img;
load_image(img, image_path);
std::vector<rectangle> dets = detector(img);
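When detection finishes, dets holds the bounding rectangles of all faces detected in img.
A few simple operations on this data structure:
dets[i].left(); // x coordinate of the face box's top-left corner
dets[i].top(); // y coordinate of the face box's top-left corner
dets[i].right(); // x coordinate of the face box's bottom-right corner
dets[i].bottom(); // y coordinate of the face box's bottom-right corner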
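When an image contains several faces, a common follow-up is to keep only the largest box. The sketch below uses a hypothetical Box struct as a stand-in for dlib's rectangle so that it compiles without dlib. Width and height are computed as right - left + 1 and bottom - top + 1, which is how I understand dlib's rectangle to define them; treat that as an assumption to check against the dlib documentation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for dlib::rectangle (inclusive pixel coordinates).
struct Box { long left, top, right, bottom; };

// Area in pixels; an inverted (empty) box has area 0.
long box_area(const Box& b)
{
    if (b.right < b.left || b.bottom < b.top) return 0;
    return (b.right - b.left + 1) * (b.bottom - b.top + 1);
}

// Index of the box with the largest area, or -1 if the list is empty.
int largest_box(const std::vector<Box>& boxes)
{
    int best = -1;
    long best_area = -1;
    for (std::size_t i = 0; i < boxes.size(); ++i)
    {
        long area = box_area(boxes[i]);
        if (area > best_area)
        {
            best_area = area;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

With real dlib rectangles the same loop could call dets[i].area() instead of box_area.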
4. Extracting features
auto shape = sp(img, dets[0]); // landmarks of face 0, used to align the face before computing its features
matrix<rgb_pixel> face_chip;
extract_image_chip(img, get_face_chip_details(shape,150,0.25), face_chip);
std::vector<matrix<rgb_pixel>> faces;
faces.push_back(move(face_chip));
std::vector<matrix<float,0,1>> face_descriptors = net(faces);
// face_descriptors[0] is the 128-dimensional feature vector of the face
5. Comparing faces
float close = length(face_descriptors[i]-face_descriptors[j]); // Euclidean distance; smaller means more similar
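The length() of the difference is the Euclidean distance between the two descriptors. The comments in dlib's own dnn_face_recognition_ex.cpp example state that, for this model, descriptors of the same person are generally less than 0.6 apart. The sketch below shows the same computation with plain std::vector<float> so that it runs without dlib; descriptor_distance and same_person are illustrative names, and the 0.6 default is taken from that example rather than from the notes above.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Euclidean distance between two descriptors of equal size.
// Returns +infinity when the sizes differ, so such pairs never match.
float descriptor_distance(const std::vector<float>& a, const std::vector<float>& b)
{
    if (a.size() != b.size())
        return std::numeric_limits<float>::infinity();
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// True when the two descriptors are closer than the threshold.
bool same_person(const std::vector<float>& a, const std::vector<float>& b,
                 float threshold = 0.6f)
{
    return descriptor_distance(a, b) < threshold;
}
```

A stricter threshold reduces false matches at the cost of more misses, so the value is worth tuning on your own data.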
6. Problems when porting the code to a CUDA-enabled server
// Error messages
cudaStreamDestroy() failed. Reason: driver shutting down
cudaFree() failed. Reason: driver shutting down
cudaFreeHost() failed. Reason: driver shutting down
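These errors appear only after the program has run successfully: every feature works as expected, and the messages are printed at the very end. After repeated experiments, I traced the cause to how the models are declared when they are loaded: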
shape_predictor sp;
net_type detect_net;
anet_type feature_net;
Defining these variables as globals is what triggers the problem above. I do not yet understand the detailed reason.
Reference:
https://stackoverflow.com/questions/40979060/cudaerrorcudartunloading-error-29-due-to-driver-shutting-down
Fix: make them local variables, or wrap them in a new class.
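One plausible explanation, in line with the Stack Overflow thread linked above, is that global objects are destroyed after main returns, by which time the CUDA runtime may already be shutting down, so their cleanup calls fail. Wrapping the models in a class whose instance lives inside a function makes the cleanup happen earlier. The sketch below only illustrates the destruction order: FakeModel and FaceEngine are hypothetical stand-ins, not dlib types, and no CUDA is involved.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a model that owns GPU resources.
// Its destructor records when the "release" happens.
struct FakeModel
{
    std::string name;
    std::vector<std::string>* log;
    FakeModel(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {}
    ~FakeModel() { log->push_back("release " + name); }
};

// Hypothetical wrapper class holding the models as members instead of globals.
struct FaceEngine
{
    FakeModel detect_net;
    FakeModel feature_net;
    explicit FaceEngine(std::vector<std::string>* log)
        : detect_net("detect_net", log), feature_net("feature_net", log) {}
};

// The engine is a local object, so its members are released when the inner
// scope ends, before the function (and the program) finishes.
std::vector<std::string> run_pipeline()
{
    std::vector<std::string> log;
    {
        FaceEngine engine(&log);
        log.push_back("work done");
    }  // engine is destroyed here; members go in reverse declaration order
    log.push_back("returning");
    return log;
}
```

In the real program, the equivalent would be constructing the engine inside main (or a function called from it) so that the networks are destroyed before main returns.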
7. A complete example, for reference
/*************************************************************************************************
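func: takes the known path and the unknown path, compares every image under the unknown path against the known people in turn, and finds the most similar person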
*************************************************************************************************/
#include <dlib/dnn.h>
#include <dlib/gui_widgets.h>
#include <dlib/clustering.h>
#include <dlib/string.h>
#include <dlib/image_io.h>
#include <dlib/image_processing/frontal_face_detector.h>
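#include <iostream>
#include <fstream> // std::ifstream
#include <string>
#include <cassert> // assert
#include <cstring> // strlen, strcpy, strrchr, strtok
#include <cstdlib> // malloc, free, atof
#include <dirent.h>
#include <stdio.h>
#include <vector>
#include <unistd.h>
#include <map>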
using namespace dlib;
using namespace std;
// ----------------------------------------------------------------------------------------
// The next bit of code defines a ResNet network. It's basically copied
// and pasted from the dnn_imagenet_ex.cpp example, except we replaced the loss
// layer with loss_metric and made the network somewhat smaller. Go read the introductory
// dlib DNN examples to learn what all this stuff means.
//
// Also, the dnn_metric_learning_on_images_ex.cpp example shows how to train this network.
// The dlib_face_recognition_resnet_model_v1 model used by this example was trained using
// essentially the code shown in dnn_metric_learning_on_images_ex.cpp except the
// mini-batches were made larger (35x15 instead of 5x5), the iterations without progress
// was set to 10000, and the training dataset consisted of about 3 million images instead of
// 55. Also, the input layer was locked to images of size 150.
template <template <int,template<typename>class,int,typename> class block, int N, template<typename>class BN, typename SUBNET>
using residual = add_prev1<block<N,BN,1,tag1<SUBNET>>>;
template <template <int,template<typename>class,int,typename> class block, int N, template<typename>class BN, typename SUBNET>
using residual_down = add_prev2<avg_pool<2,2,2,2,skip1<tag2<block<N,BN,2,tag1<SUBNET>>>>>>;
template <int N, template <typename> class BN, int stride, typename SUBNET>
using block = BN<con<N,3,3,1,1,relu<BN<con<N,3,3,stride,stride,SUBNET>>>>>;
template <int N, typename SUBNET> using ares = relu<residual<block,N,affine,SUBNET>>;
template <int N, typename SUBNET> using ares_down = relu<residual_down<block,N,affine,SUBNET>>;
template <typename SUBNET> using alevel0 = ares_down<256,SUBNET>;
template <typename SUBNET> using alevel1 = ares<256,ares<256,ares_down<256,SUBNET>>>;
template <typename SUBNET> using alevel2 = ares<128,ares<128,ares_down<128,SUBNET>>>;
template <typename SUBNET> using alevel3 = ares<64,ares<64,ares<64,ares_down<64,SUBNET>>>>;
template <typename SUBNET> using alevel4 = ares<32,ares<32,ares<32,SUBNET>>>;
using anet_type = loss_metric<fc_no_bias<128,avg_pool_everything<
alevel0<
alevel1<
alevel2<
alevel3<
alevel4<
max_pool<3,3,2,2,relu<affine<con<32,7,7,2,2,
input_rgb_image_sized<150>
>>>>>>>>>>>>;
// ------------------------------------------------------------------------------------------------------------
// The first thing we are going to do is load all our models. First, since we need to
// find faces in the image we will need a face detector:
frontal_face_detector detector;
// We will also use a face landmarking model to align faces to a standard pose: (see face_landmark_detection_ex.cpp for an introduction)
shape_predictor sp;
// And finally we load the DNN responsible for face recognition.
anet_type net;
// ----------------------------------------------------------------------------------------
// store the file name and features corresponding to them.
struct People
{
string name;
matrix<float,0,1> feature;
};
// ----------------------------------------------------------------------------------------
// Get the file name (without directory or extension) from a path such as /dir/dir/name.jpg.
// Returns a std::string: the earlier version returned a pointer into a buffer it had already freed.
string getFileNameFromPath(const char *path)
{
string fullPath(path);
size_t slash = fullPath.rfind('/');
string name = (slash == string::npos) ? fullPath : fullPath.substr(slash + 1);
// keep everything before the first dot
return name.substr(0, name.find('.'));
}
// ----------------------------------------------------------------------------------------
// List the file names in a directory; the path is a plain directory such as /dir/dir/.
// The file names found are appended to imageList.
int getImageListByDir(const char *path, std::vector <string> &imageList)
{
DIR *directory_pointer;
struct dirent *entry;
if(NULL == path)
{
printf("error: path is null.\n");
return -1; // report failure to the caller
}
if((directory_pointer = opendir(path)) == NULL)
{
printf("Error open\n");
return -1; // report failure to the caller
}
else
{
while((entry = readdir(directory_pointer)) != NULL)
{
if(entry->d_name[0] == '.')
continue;
imageList.push_back(entry->d_name);
//printf("%s\n",entry->d_name);
}
}
closedir(directory_pointer); // release the directory handle
return 0;
}
// ----------------------------------------------------------------------------------------
// This function gets a list of all the files in the directory
std::vector<string> getImageList (
const string& dir
)
{
std::vector<string> imgs;
for (auto img : directory(dir).get_files())
{
imgs.push_back(img);
}
return imgs;
}
string getFileName(string path)
{
string file_name = path; // fall back to the whole path when it contains no '/'
auto i = path.rfind('/');
if (i != string::npos) {
file_name = path.substr(i+1);
}
size_t lastindex = file_name.find_last_of(".");
return file_name.substr(0, lastindex);
}
// load face features from txt files (one float per line)
std::vector<People> LoadingFaceFeatures(std::vector<string> images)
{
std::vector<People> peopleList;
std::ifstream input;
string s;
float f;
matrix<float, 0, 1> feature_array;
feature_array.set_size(128);
for (int i = 0; i < images.size(); i++)
{
string imagePath = images[i];
string peopleName = getFileName(imagePath);
string file_ext = tolower(get_file_extension(imagePath));
if (file_ext != "txt" )
{
cout << "not a feature file (.txt):" << imagePath << endl;
continue;
}
People newPeople;
input.open(imagePath);
assert(input.is_open());
int j = 0;
// one float per line; ignore anything beyond the 128 expected values
while(j < 128 && getline(input, s))
{
f = atof(s.c_str());
feature_array(j) = f; // column vector: index with a single subscript
j++;
}
input.close();
newPeople.name = peopleName;
newPeople.feature = feature_array;
peopleList.push_back(newPeople);
}
return peopleList;
}
std::vector<matrix<float, 0, 1>> extractFaceFeatures(string imagePath)
{
matrix<rgb_pixel> img;
cout << "-----------------extractFaceFeatures: " << imagePath << endl;
load_image(img, imagePath);
std::vector<rectangle> dets = detector(img);
if (dets.size() == 0)
{
cout << "No faces found in image:" << imagePath << endl;
// return an empty list so the caller can skip this image instead of reading dets[0]
return std::vector<matrix<float,0,1>>();
}
else if(dets.size() > 1)
{
cout << "More than one face detected: " << dets.size() << " in " << imagePath << ", using the first" << endl;
}
// for each face extract a copy that has been normalized to 150x150 pixels in size and appropriately rotated and centered.
auto shape = sp(img, dets[0]);
cout << "*************" << endl;
matrix<rgb_pixel> face_chip;
extract_image_chip(img, get_face_chip_details(shape,150,0.25), face_chip);
std::vector<matrix<rgb_pixel>> faces;
faces.push_back(move(face_chip));
// extract face features
std::vector<matrix<float,0,1>> face_descriptors = net(faces);
return face_descriptors;
}
string findPeopleFromList(string imagePath, std::vector<People> peopleList)
{
// Euclidean distance between descriptors: smaller means more similar.
// Only candidates closer than this initial bound are accepted as a match.
float min_dist = 1;
string peopleName = "";
std::vector<matrix<float,0,1>> face_descriptors = extractFaceFeatures(imagePath);
if(face_descriptors.empty())
{
cout << "No face found!" << endl;
return "";
}
cout << "find people in " << imagePath << endl;
for (size_t i = 0; i < peopleList.size(); i++)
{
float dist = length(face_descriptors[0]-peopleList[i].feature);
cout << " distance: " << dist << " \t\t " << peopleList[i].name << endl;
if (dist < min_dist)
{
min_dist = dist;
peopleName = peopleList[i].name;
}
}
return peopleName;
}
int init()
{
detector = get_frontal_face_detector();
deserialize("shape_predictor_5_face_landmarks.dat") >> sp;
deserialize("dlib_face_recognition_resnet_model_v1.dat") >> net;
return 0;
}
int main(int argc, char** argv) try
{
if (argc != 3)
{
cout << "Run this example by invoking it like this: " << endl;
cout << " ./dnn_extracte_face_feature <known_features_dir> <unknown_images_dir>" << endl;
cout << endl;
cout << "You will also need to get the face landmarking model file as well as " << endl;
cout << "the face recognition model file. Download and then decompress these files from: " << endl;
cout << "http://dlib.net/files/shape_predictor_5_face_landmarks.dat.bz2" << endl;
cout << "http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2" << endl;
cout << endl;
return 1;
}
init();
// ----------------------------------------------------------------------------------------
std::vector<string> knownImages = getImageList(argv[1]);
std::vector<People> peopleList = LoadingFaceFeatures(knownImages);
cout << "knownImages ---------------------------------------------------> " << peopleList.size() << endl;
std::vector<string> unknownImages = getImageList(argv[2]);
for (int i = 0; i < unknownImages.size(); i++)
{
string imagePath = unknownImages[i];
string name = findPeopleFromList(imagePath, peopleList);
cout << " " << imagePath << " -> " << name << endl << endl;
}
cout << "hit enter to terminate" << endl;
cin.get();
}
catch (std::exception& e)
{
cout << e.what() << endl;
return 1;
}
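LoadingFaceFeatures in the listing reads one float per line from a .txt file, but the listing does not show how those files are produced. Below is a minimal sketch of a matching writer and reader, assuming that one-value-per-line format. write_feature and read_feature are hypothetical helper names, and std::vector<float> stands in for dlib's matrix<float,0,1> so the code runs without dlib.

```cpp
#include <cstddef>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write one value per line, with enough digits to restore each float exactly.
void write_feature(std::ostream& out, const std::vector<float>& feature)
{
    out.precision(std::numeric_limits<float>::max_digits10);
    for (float v : feature)
        out << v << '\n';
}

// Read one value per line, skipping blank lines.
// Returns true only if every line parsed and exactly expected_size values were read.
bool read_feature(std::istream& in, std::size_t expected_size, std::vector<float>& feature)
{
    feature.clear();
    std::string line;
    while (std::getline(in, line))
    {
        if (line.empty()) continue;
        std::istringstream field(line);
        float v;
        if (!(field >> v)) return false;  // malformed line
        feature.push_back(v);
    }
    return feature.size() == expected_size;
}
```

Checking the value count on load catches truncated or mixed-up files, which the reader in the listing would otherwise accept silently.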