MTCNN人脸及特征点检测---代码应用详解（基于ncnn架构）

最新推荐文章于 2020-12-31 10:51:58 发布

jimfang1987

最新推荐文章于 2020-12-31 10:51:58 发布

阅读量1k

点赞数

本文为博主转载文章，转载出处： https://blog.csdn.net/fuwenyan/article/details/77573755

本文主要讲述当你拿到MTCNN的caffemodel后，如何使用它对一张图里的人脸进行检测和特征点标定。

相当于一个代码实现的解释。因为最近卤煮在用ncnn，所以该代码也是基于ncnn架构做的。 caffe架构同理。

如果你对MTCNN这篇论文还不熟悉，建议先去看原理。也可以用我之前写的相关博客做参考：

MTCNN解读：Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

1. MTCNN关键参数

nms_threshold：非极大值抑制nms筛选人脸框时的IOU阈值，三个网络可单独设定阈值，值设置的过小，nms合并的少，会产生较多冗余计算。示例nms_threshold[3] = { 0.5, 0.7, 0.7 };。

threshold：人脸框得分阈值，三个网络可单独设定阈值，值设置的太小，会有很多框通过，也就增加了计算量，还有可能导致最后不是人脸的框错认为人脸。示例threshold[3] = {0.8, 0.8, 0.8};

minsize ：最小可检测图像，该值大小，可控制图像金字塔的阶层数的参数之一，越小，阶层越多，计算越多。示例minsize = 40;

factor ：生成图像金字塔时候的缩放系数, 范围(0,1)，可控制图像金字塔的阶层数的参数之一，越大，阶层越多，计算越多。示例factor = 0.709;

输入图片的尺寸，minsize和factor共同影响了图像金字塔的阶层数。用户可根据自己的精度需求进行调控。

MTCNN整体过程只管图示如下：

接下来对使用过程进行详细说明：

2. 生成图像金字塔

前面提到，输入图片的尺寸，minsize和factor共同影响了图像金字塔的阶层数。也就是说决定能够生成多少张图。

缩放后的尺寸minL=org_L*(12/minisize)*factor^(n)，n={0,1,2,3,...,N}，缩放尺寸最小不能小于12，也就是缩放到12为止。n的数量也就是能够缩放出图片的数量。

看到上面这个公式应该就明白为啥那三个参数能够影响阶层数了吧。

3. Pnet运算

一般Pnet只做检测和人脸框回归两个任务。忽略下图中的Facial landmark。

虽然网络定义的时候input的size是12*12*3，由于Pnet只有卷积层，我们可以直接将resize后的图像喂给网络进行前传，只是得到的结果就不是1*1*2和1*1*4，而是m*m*2和m*m*4了。这样就不用先从resize的图上截取各种12*12*3的图再送入网络了，而是一次性送入，再根据结果回推每个结果对应的12*12的图在输入图片的什么位置。

针对金字塔中每张图，网络forward计算后都得到了人脸得分以及人脸框回归的结果。人脸分类得分是两个通道的三维矩阵m*m*2，其实对应在网络输入图片上m*m个12*12的滑框，结合当前图片在金字塔图片中的缩放scale，可以推算出每个滑框在原始图像中的具体坐标。

首先要根据得分进行筛选，得分低于阈值的滑框，排除。

然后利用nms非极大值抑制，对剩下的滑框进行合并。nms具体解释，可以参照我上一篇博客：NMS非极大值抑制：用擂台赛带你从原理到代码脑洞大开恍然大悟

当金字塔中所有图片处理完后，再利用nms对汇总的滑框进行合并，然后利用最后剩余的滑框对应的Bbox结果转换成原始图像中像素坐标，也就是得到了人脸框的坐标。

所以，Pnet最终能够得到了一批人脸框。

3. Rnet

Rnet仍然只做检测和人脸框回归两个任务。忽略下图中的Facial landmark。

Rnet的作用是对Pnet得到的人脸框进一步打分筛选，回归人脸框。

将Pnet运算出来的人脸框从原图上截取下来，并且resize到24*24*3，作为Rnet的输入。输出仍然是得分和BBox回归结果。

对得分低于阈值的候选框进行抛弃，剩下的候选框做nms进行合并，然后再将BBox回归结果映射到原始图像的像素坐标上。

所以，Rnet最终得到的是在Pnet结果中精选出来的人脸框。

4. Onet

Onet将检测，人脸框回归和特征点定位，一起做了。

Onet的作用是对Rnet得到的人脸框进一步打分筛选，回归人脸框。同时在每个框上都计算特征点位置。

将Rnet运算出来的人脸框从原图上截取下来，并且resize到48*48*3，作为Onet的输入。输出是得分，BBox回归结果以及landmark位置数据。

分数超过阈值的候选框对应的Bbox回归数据以及landmark数据进行保存。

将Bbox回归数据以及landmark数据映射到原始图像坐标上。

再次实施nms对人脸框进行合并。

经过这层层筛选合并后，最终剩下的Bbox以及其对应的landmark就是我们苦苦追求的结果了。

下面附上代码：（代码以https://github.com/ElegantGod/ncnn/tree/master/mtcnn为base，卤煮在上面做了很多注释以及稍许修改，以助于理解）

#include <stdio.h>
#include <algorithm>
#include <vector>
#include <math.h>
#include <iostream>
#include <time.h>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

#include “net.h”
#include"cpu.h"
using namespace std;
using namespace cv;

struct Bbox
{
float score;
int x1;
int y1;
int x2;
int y2;
float area;
bool exist;
float ppoint[10];
float regreCoord[4];
};

struct orderScore
{
float score;
int oriOrder;
};

void resize_image(ncnn::Mat& srcImage, ncnn::Mat& dstImage)
{
int src_width = srcImage.w;
int src_height = srcImage.h;
int src_channel = srcImage.c;
int dst_width = dstImage.w;
int dst_height = dstImage.h;
int dst_channel = dstImage.c;

if (src_width == dst_width &amp;&amp; src_height == dst_height)
{
	memcpy(dstImage.data, srcImage.data, src_width*src_height*src_channel*sizeof(float));
	return;
}
float lf_x_scl = static_cast&lt;float&gt;(src_width) / dst_width;
float lf_y_Scl = static_cast&lt;float&gt;(src_height) / dst_height;
const float* src_data = srcImage.data;

float* dest_data = dstImage.data;
int src_area = srcImage.cstep;
int src_area2 = 2 * src_area;
int dst_area = dstImage.cstep;
int dst_area2 = 2 * dst_area;

for (int y = 0; y &lt; dst_height; y++) {
	for (int x = 0; x &lt; dst_width; x++) {
		float lf_x_s = lf_x_scl * x;
		float lf_y_s = lf_y_Scl * y;

		int n_x_s = static_cast&lt;int&gt;(lf_x_s);
		n_x_s = (n_x_s &lt;= (src_width - 2) ? n_x_s : (src_width - 2));
		int n_y_s = static_cast&lt;int&gt;(lf_y_s);
		n_y_s = (n_y_s &lt;= (src_height - 2) ? n_y_s : (src_height - 2));

		float lf_weight_x = lf_x_s - n_x_s;
		float lf_weight_y = lf_y_s - n_y_s;

		float dest_val_b = (1 - lf_weight_y) * ((1 - lf_weight_x) *
			src_data[n_y_s * src_width + n_x_s] +
			lf_weight_x * src_data[n_y_s * src_width + n_x_s + 1]) +
			lf_weight_y * ((1 - lf_weight_x) * src_data[(n_y_s + 1) * src_width + n_x_s] +
			lf_weight_x * src_data[(n_y_s + 1) * src_width + n_x_s + 1]);
		float dest_val_g = (1 - lf_weight_y) * ((1 - lf_weight_x) *
			src_data[n_y_s * src_width + n_x_s + src_area] +
			lf_weight_x * src_data[n_y_s * src_width + n_x_s + 1 + src_area]) +
			lf_weight_y * ((1 - lf_weight_x) * src_data[(n_y_s + 1) * src_width + n_x_s + src_area] +
			lf_weight_x * src_data[(n_y_s + 1) * src_width + n_x_s + 1 + src_area]);
		float dest_val_r = (1 - lf_weight_y) * ((1 - lf_weight_x) *
			src_data[n_y_s * src_width + n_x_s + src_area2] +
			lf_weight_x * src_data[n_y_s * src_width + n_x_s + 1 + src_area2]) +
			lf_weight_y * ((1 - lf_weight_x) * src_data[(n_y_s + 1) * src_width + n_x_s + src_area2] +
			lf_weight_x * src_data[(n_y_s + 1) * src_width + n_x_s + 1 + src_area2]);

		dest_data[y * dst_width + x] = static_cast&lt;float&gt;(dest_val_b);
		dest_data[y * dst_width + x + dst_area] = static_cast&lt;float&gt;(dest_val_g);
		dest_data[y * dst_width + x + 2 * dst_area] = static_cast &lt;float&gt;(dest_val_r);
	}
}

}

bool cmpScore(orderScore lsh, orderScore rsh){
if(lsh.score<rsh.score)
return true;
else
return false;
}

class mtcnn{
public:
mtcnn();
void detect(ncnn::Mat& img_, std::vector<Bbox>& finalBbox);
cv::Mat cp_img;
private:
void generateBbox(ncnn::Mat score, ncnn::Mat location, vector<Bbox>& boundingBox_, vector<orderScore>& bboxScore_, float scale);
void nms(vector<Bbox> &boundingBox_, std::vector<orderScore> &bboxScore_, const float overlap_threshold, string modelname=“Union”);
void refineAndSquareBbox(vector<Bbox> &vecBbox, const int &height, const int &width);

ncnn::Net Pnet, Rnet, Onet;
ncnn::Mat img;

float nms_threshold[3];// = { 0.5, 0.7, 0.7 };
float threshold[3];//  = {0.8, 0.8, 0.8};
float mean_vals[3];//  = {127.5, 127.5, 127.5};
float norm_vals[3];//  = {0.0078125, 0.0078125, 0.0078125};
std::vector&lt;Bbox&gt; firstBbox_, secondBbox_,thirdBbox_;
std::vector&lt;orderScore&gt; firstOrderScore_, secondBboxScore_, thirdBboxScore_;
int img_w, img_h;

};

mtcnn::mtcnn(){
for (int i = 0; i < 3; i++)
{
nms_threshold[i]=0.7;// = { 0.5, 0.7, 0.7 };
threshold[i]=0.7;// = {0.8, 0.8, 0.8};
mean_vals[i]=127.5;// = {127.5, 127.5, 127.5};
norm_vals[i]=0.0078125;// = {0.0078125, 0.0078125, 0.0078125};
}
nms_threshold[0] = 0.5;
Pnet.load_param(“E:/Algrithm/MTCNN/MTCNN-master/mtcnn_caffe/model/det1.param”);
Pnet.load_model(“E:/Algrithm/MTCNN/MTCNN-master/mtcnn_caffe/model/det1.bin”);
Rnet.load_param(“E:/Algrithm/MTCNN/MTCNN-master/mtcnn_caffe/model/det2.param”);
Rnet.load_model(“E:/Algrithm/MTCNN/MTCNN-master/mtcnn_caffe/model/det2.bin”);
Onet.load_param(“E:/Algrithm/MTCNN/MTCNN-master/mtcnn_caffe/model/det3.param”);
Onet.load_model(“E:/Algrithm/MTCNN/MTCNN-master/mtcnn_caffe/model/det3.bin”);
//cp_img.create(295, 413, CV_8UC3);
//const char* imagepath = “E:/Algrithm/ncnn/ncnn/x64/Release/test3.jpg”;// argv[1];

//cp_img = cv::imread(imagepath);

}

/generateBbox************/
//根据Pnet的输出结果，由滑框的得分，筛选可能是人脸的滑框，并记录该框的位置、人脸坐标信息、得分以及编号
void mtcnn::generateBbox(ncnn::Mat score, ncnn::Mat location, std::vector<Bbox>& boundingBox_, std::vector<orderScore>& bboxScore_, float scale){
int stride = 2;//Pnet中有一次MP2*2，后续转换的时候相当于stride=2；
int cellsize = 12;
int count = 0;
//score p
float *p = score.channel(1);//score.data + score.cstep;//判定为人脸的概率
//float *plocal = location.data;
Bbox bbox;
orderScore order;
// float max_p = 0;
for(int row=0;row<score.h;row++){
for(int col=0;col<score.w;col++){
//printf(“Pnet prob: %f\n”, p);
//if (p>max_p)
//{
// max_p = p;
//}
if(p>threshold[0]){
bbox.score = p;//记录得分
order.score = p;
order.oriOrder = count;//记录有效滑框的编号
bbox.x1 = round((stridecol+1)/scale);//1212的滑框，换算到原始图像上的坐标
bbox.y1 = round((striderow+1)/scale);
bbox.x2 = round((stridecol+1+cellsize)/scale);
bbox.y2 = round((striderow+1+cellsize)/scale);
bbox.exist = true;
bbox.area = (bbox.x2 - bbox.x1)(bbox.y2 - bbox.y1);
for(int channel=0;channel<4;channel++)
bbox.regreCoord[channel]=location.channel(channel)[0];//人脸框的坐标相关值
boundingBox_.push_back(bbox);
bboxScore_.push_back(order);
count++;
}
p++;
//plocal++;
}
}
//printf(“Pnet max prob: %f\n”,max_p);
}

/nms非极大值抑制******/
void mtcnn::nms(std::vector<Bbox> &boundingBox_, std::vector<orderScore> &bboxScore_, const float overlap_threshold, string modelname){
if(boundingBox_.empty()){
return;
}
std::vector<int> heros;
//sort the score
sort(bboxScore_.begin(), bboxScore_.end(), cmpScore);//cmpScore指定升序排列

int order = 0;
float IOU = 0;
float maxX = 0;
float maxY = 0;
float minX = 0;
float minY = 0;
//规则，站上擂台的擂台主，永远都是胜利者。
while(bboxScore_.size()&gt;0){
    order = bboxScore_.back().oriOrder;//取得分最高勇士的编号ID。
    bboxScore_.pop_back();//勇士出列
    if(order&lt;0)continue;//死的？下一个！（order在(*it).oriOrder = -1;改变）
    heros.push_back(order);//记录擂台主ID
    boundingBox_.at(order).exist = false;//当前这个Bbox为擂台主，签订生死簿。

    for(int num=0;num&lt;boundingBox_.size();num++){
        if(boundingBox_.at(num).exist){//活着的勇士
            //the iou
            maxX = (boundingBox_.at(num).x1&gt;boundingBox_.at(order).x1)?boundingBox_.at(num).x1:boundingBox_.at(order).x1;
            maxY = (boundingBox_.at(num).y1&gt;boundingBox_.at(order).y1)?boundingBox_.at(num).y1:boundingBox_.at(order).y1;
            minX = (boundingBox_.at(num).x2&lt;boundingBox_.at(order).x2)?boundingBox_.at(num).x2:boundingBox_.at(order).x2;
            minY = (boundingBox_.at(num).y2&lt;boundingBox_.at(order).y2)?boundingBox_.at(num).y2:boundingBox_.at(order).y2;
            //maxX1 and maxY1 reuse 
            maxX = ((minX-maxX+1)&gt;0)?(minX-maxX+1):0;
            maxY = ((minY-maxY+1)&gt;0)?(minY-maxY+1):0;
            //IOU reuse for the area of two bbox
            IOU = maxX * maxY;
            if(!modelname.compare("Union"))
                IOU = IOU/(boundingBox_.at(num).area + boundingBox_.at(order).area - IOU);
            else if(!modelname.compare("Min")){
                IOU = IOU/((boundingBox_.at(num).area&lt;boundingBox_.at(order).area)?boundingBox_.at(num).area:boundingBox_.at(order).area);
            }
            if(IOU&gt;overlap_threshold){
                boundingBox_.at(num).exist=false;//如果该对比框与擂台主的IOU够大，挑战者勇士战死
                for(vector&lt;orderScore&gt;::iterator it=bboxScore_.begin(); it!=bboxScore_.end();it++){
                    if((*it).oriOrder == num) {
                        (*it).oriOrder = -1;//勇士战死标志
                        break;
                    }
                }
            }//else 那些距离擂台主比较远迎战者幸免于难，将有机会作为擂台主出现
        }
    }
}
for(int i=0;i&lt;heros.size();i++)
    boundingBox_.at(heros.at(i)).exist = true;//从生死簿上剔除，擂台主活下来了

}
void mtcnn::refineAndSquareBbox(vector<Bbox> &vecBbox, const int &height, const int &width){
if(vecBbox.empty()){
cout<<“Bbox is empty!!”<<endl;
return;
}
float bbw=0, bbh=0, maxSide=0;
float h = 0, w = 0;
float x1=0, y1=0, x2=0, y2=0;
for(vector<Bbox>::iterator it=vecBbox.begin(); it!=vecBbox.end();it++){
if((*it).exist){
bbw = (*it).x2 - (*it).x1 + 1;//滑框的宽高计算
bbh = (*it).y2 - (*it).y1 + 1;
x1 = (*it).x1 + (*it).regreCoord[0]*bbw;//人脸框的位置坐标计算
y1 = (*it).y1 + (*it).regreCoord[1]*bbh;
x2 = (*it).x2 + (*it).regreCoord[2]*bbw;
y2 = (*it).y2 + (*it).regreCoord[3]*bbh;

        w = x2 - x1 + 1;//人脸框宽高
        h = y2 - y1 + 1;
      
        maxSide = (h&gt;w)?h:w;
        x1 = x1 + w*0.5 - maxSide*0.5;
        y1 = y1 + h*0.5 - maxSide*0.5;
        (*it).x2 = round(x1 + maxSide - 1);
        (*it).y2 = round(y1 + maxSide - 1);
        (*it).x1 = round(x1);
        (*it).y1 = round(y1);

        //boundary check
        if((*it).x1&lt;0)(*it).x1=0;
        if((*it).y1&lt;0)(*it).y1=0;
        if((*it).x2&gt;width)(*it).x2 = width - 1;
        if((*it).y2&gt;height)(*it).y2 = height - 1;

        it-&gt;area = (it-&gt;x2 - it-&gt;x1)*(it-&gt;y2 - it-&gt;y1);
    }
}

}
void mtcnn::detect(ncnn::Mat& img_, std::vector<Bbox>& finalBbox_){
img = img_;
img_w = img.w;
img_h = img.h;
img.substract_mean_normalize(mean_vals, norm_vals);//数据预处理,归一化至(-1,1)

float minl = img_w&lt;img_h?img_w:img_h;
int MIN_DET_SIZE = 12;
int minsize = 40;//最小可检测图像，该值大小，控制图像金字塔的阶层数，越小，阶层越多，计算越多。
float m = (float)MIN_DET_SIZE/minsize;
minl *= m;
float factor = 0.409;
int factor_count = 0;
vector&lt;float&gt; scales_;
while(minl&gt;MIN_DET_SIZE){
	if (factor_count &gt; 0){ m = m*factor; }
    scales_.push_back(m);
    minl *= factor;
    factor_count++;
}
orderScore order;
int count = 0;

for (size_t i = 0; i &lt; scales_.size(); i++) {
    int hs = (int)ceil(img_h*scales_[i]);
    int ws = (int)ceil(img_w*scales_[i]);
    ncnn::Mat in(ws, hs, 3);
    resize_image(img, in);//一次次生成图像金字塔中的一层图
    ncnn::Extractor ex = Pnet.create_extractor();
    ex.set_light_mode(true);
	printf("Pnet input width:%d, height:%d, channel:%d\n",in.w,in.h,in.c);
    ex.input("data", in);//Pnet只有卷积层，所以可以接受不同size的input
    ncnn::Mat score_, location_;
    ex.extract("prob1", score_);
	printf("prob1 w:%d, h:%d, ch:%d, first data:%f\n", score_.w, score_.h, score_.c, score_.data[0]);
	//for (int t_w = 0; t_w &lt; score_.w*score_.h*score_.c; t_w++)
	//{
	//	printf("%f, ", score_.data[t_w]);
	//}
    ex.extract("conv4-2", location_);
    std::vector&lt;Bbox&gt; boundingBox_;
    std::vector&lt;orderScore&gt; bboxScore_;
    generateBbox(score_, location_, boundingBox_, bboxScore_, scales_[i]);
    nms(boundingBox_, bboxScore_, nms_threshold[0]);//分会场擂台赛

    for(vector&lt;Bbox&gt;::iterator it=boundingBox_.begin(); it!=boundingBox_.end();it++){
        if((*it).exist){//获胜擂台主得到进入主会场的机会
            firstBbox_.push_back(*it);//主会场花名册
            order.score = (*it).score;
            order.oriOrder = count;
            firstOrderScore_.push_back(order);
            count++;
        }
    }
    bboxScore_.clear();
    boundingBox_.clear();
}
//the first stage's nms
if(count&lt;1)return;
nms(firstBbox_, firstOrderScore_, nms_threshold[0]);//主会场擂台赛
refineAndSquareBbox(firstBbox_, img_h, img_w);
printf("firstBbox_.size()=%d\n", firstBbox_.size());
//for (vector&lt;Bbox&gt;::iterator it = firstBbox_.begin(); it != firstBbox_.end(); it++)
//{
//	cout &lt;&lt; "OK" &lt;&lt; endl;
//	//rectangle(cp_img, Point((*it).x1, (*it).y1), Point((*it).x2, (*it).y2), Scalar(0, 0, 255), 2, 8, 0);
//}
//imshow("Pnet.jpg", cp_img);
//waitKey(1000);
//second stage
count = 0;
for(vector&lt;Bbox&gt;::iterator it=firstBbox_.begin(); it!=firstBbox_.end();it++){
    if((*it).exist){
        ncnn::Mat tempIm;
        copy_cut_border(img, tempIm, (*it).y1, img_h-(*it).y2, (*it).x1, img_w-(*it).x2);
        ncnn::Mat in(24, 24, 3);		
        resize_image(tempIm, in);
        ncnn::Extractor ex = Rnet.create_extractor();
        ex.set_light_mode(true);
        ex.input("data", in);
        ncnn::Mat score, bbox;
        ex.extract("prob1", score);
        ex.extract("conv5-2", bbox);
        if(*(score.data+score.cstep)&gt;threshold[1]){
            for(int channel=0;channel&lt;4;channel++)
                it-&gt;regreCoord[channel]=bbox.channel(channel)[0];//*(bbox.data+channel*bbox.cstep);
            it-&gt;area = (it-&gt;x2 - it-&gt;x1)*(it-&gt;y2 - it-&gt;y1);
            it-&gt;score = score.channel(1)[0];//*(score.data+score.cstep);
            secondBbox_.push_back(*it);
            order.score = it-&gt;score;
            order.oriOrder = count++;
            secondBboxScore_.push_back(order);
        }
        else{
            (*it).exist=false;
        }
    }
}
printf("secondBbox_.size()=%d\n", secondBbox_.size());
if(count&lt;1)return;
nms(secondBbox_, secondBboxScore_, nms_threshold[1]);
refineAndSquareBbox(secondBbox_, img_h, img_w);

//third stage 
count = 0;
for(vector&lt;Bbox&gt;::iterator it=secondBbox_.begin(); it!=secondBbox_.end();it++){
    if((*it).exist){
        ncnn::Mat tempIm;
        copy_cut_border(img, tempIm, (*it).y1, img_h-(*it).y2, (*it).x1, img_w-(*it).x2);
        ncnn::Mat in(48, 48, 3);
        resize_image(tempIm, in);
        ncnn::Extractor ex = Onet.create_extractor();
        ex.set_light_mode(true);
        ex.input("data", in);
        ncnn::Mat score, bbox, keyPoint;
        ex.extract("prob1", score);
        ex.extract("conv6-2", bbox);
        ex.extract("conv6-3", keyPoint);
        if(score.channel(1)[0]&gt;threshold[2]){
            for(int channel=0;channel&lt;4;channel++)
                it-&gt;regreCoord[channel]=bbox.channel(channel)[0];
            it-&gt;area = (it-&gt;x2 - it-&gt;x1)*(it-&gt;y2 - it-&gt;y1);
            it-&gt;score = score.channel(1)[0];
            for(int num=0;num&lt;5;num++){
                (it-&gt;ppoint)[num] = it-&gt;x1 + (it-&gt;x2 - it-&gt;x1)*keyPoint.channel(num)[0];
                (it-&gt;ppoint)[num+5] = it-&gt;y1 + (it-&gt;y2 - it-&gt;y1)*keyPoint.channel(num+5)[0];
            }

            thirdBbox_.push_back(*it);
            order.score = it-&gt;score;
            order.oriOrder = count++;
            thirdBboxScore_.push_back(order);
        }
        else
            (*it).exist=false;
        }
    }

printf("thirdBbox_.size()=%d\n", thirdBbox_.size());
if(count&lt;1)return;
refineAndSquareBbox(thirdBbox_, img_h, img_w);
nms(thirdBbox_, thirdBboxScore_, nms_threshold[2], "Min");
finalBbox_ = thirdBbox_;

firstBbox_.clear();
firstOrderScore_.clear();
secondBbox_.clear();
secondBboxScore_.clear();
thirdBbox_.clear();
thirdBboxScore_.clear();

}

int main(int argc, char** argv)
{
/读图（start）/
const char imagepath ;// argv[1];
if (argc == 2)
{
imagepath = argv[1];
}
else{
imagepath = “E:/Algrithm/ncnn/ncnn/x64/Release/test2.jpg”;
}
cout << imagepath << endl;
cv::Mat cv_img = cv::imread(imagepath);
if (cv_img.data==NULL)
{
fprintf(stderr, “cv::imread %s failed\n”, imagepath);
system(“pause”);
return -1;
}
printf(“img w: %d h:%d ch:%d\n”,cv_img.cols,cv_img.rows,cv_img.channels());
imshow(“img”,cv_img);
waitKey(10);
/读图（end）*****/

/***********MTCNN运算（start）************/
float start = clock();
int times = 1;
ncnn::set_omp_num_threads(4);
for (int cnt = 0; cnt &lt; times; cnt++)
{
	std::vector&lt;Bbox&gt; finalBbox;
	mtcnn Net;
	//OpenCV读出的图片是BGR格式的，需要转为RGB格式，否则检出率会很低。
	ncnn::Mat ncnn_img = ncnn::Mat::from_pixels(cv_img.data, ncnn::Mat::PIXEL_BGR2RGB, cv_img.cols, cv_img.rows);
	Net.detect(ncnn_img, finalBbox);
	for (vector&lt;Bbox&gt;::iterator it = finalBbox.begin(); it != finalBbox.end(); it++){
		if ((*it).exist)
		{
			printf("Bbox [x1,y1], [x2,y2]:[%d,%d], [%d,%d] \n", (*it).x1, (*it).x2, (*it).y1, (*it).y2);
			rectangle(cv_img, Point((*it).x1, (*it).y1), Point((*it).x2, (*it).y2), Scalar(0, 0, 255), 2, 8, 0);
			for (int num = 0; num &lt; 5; num++)
			{
				printf("Landmark [x1,y1]: [%d,%d] \n", (int)*(it-&gt;ppoint + num), (int)*(it-&gt;ppoint + num + 5));
				circle(cv_img, Point((int)*(it-&gt;ppoint + num), (int)*(it-&gt;ppoint + num + 5)), 3, Scalar(0, 255, 255), -1);
			}
		}
	}
}
/***********MTCNN运算（end）************/
printf("MTCNN mean time comsuming: %f ms\n",(clock()-start)/times);
imshow("result.jpg",cv_img);
waitKey(100);
system("pause");
return 0;

}
#endif

终于将mtcnn缕清楚了，神清气爽~~~

jimfang1987

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
MTCNN人脸及特征点检测---代码应用详解（基于ncnn架构）

本文为博主转载文章，转载出处： https://blog.csdn.net/fuwenyan/article/details/77573755本文主要讲述当你拿到MTCNN的caffemodel后，如何使用它对一张图里的人脸进行检测和特征点标定。相当于一个代码实现的解释。因为最近卤煮在用ncnn，所以该代码也是基于ncnn架构做的。 caffe架构同理。如果你对MTC...
复制链接

扫一扫