Summary: Face Detection

 

Face Detection (Part 1)

May 6, 2017, 17:34:16 · HamTam12


Copyright notice: this is an original article by the author; reproduction without permission is prohibited. https://blog.csdn.net/sinat_14916279/article/details/71273892

One day my advisor asked me to build some face-related applications, such as face swapping. The first step, then, is to detect faces automatically. Face detection means finding the faces present in an image and marking their locations accurately with bounding boxes; it is the foundation of facial landmark detection and face recognition. You can search for "Face Detection Benchmark" to find datasets and notable papers, or browse the thinkface forum to collect face detection datasets and methods. Commonly used face detection datasets include FDDB, AFLW, and WIDER FACE. With the rapid development of deep learning in recent years, many excellent face detection algorithms have emerged.
For example, many strong detectors have been evaluated on FDDB, such as the cascaded-CNN method A Convolutional Neural Network Cascade for Face Detection, the improved Faster R-CNN detector Face Detection using Deep Learning: An Improved Faster RCNN Approach, and Finding Tiny Faces, which is remarkably good at detecting small faces. Studying about three such papers carefully is enough; there is little point in reimplementing every one.
In addition, libraries such as OpenCV, Dlib, and libfacedetect provide face detection interfaces. Since face detection is such a fundamental task, many companies have built their own detectors, and some, such as Face++, perform very well.

Below I cover only the common face detection methods that I have personally tried and implemented:

1. Single-CNN face detection
2. Cascaded-CNN face detection
3. OpenCV face detection
4. Dlib face detection
5. libfacedetect face detection
6. Seetaface face detection


1. Single-CNN face detection

The advantages of this method are a simple idea and a simple implementation; the drawback is speed (multi-scale detection on a 1000x600 image can take one or two seconds even on an ordinary GPU). Detection quality is acceptable, but the resulting face boxes are not very precise.
First, train a binary face/non-face classifier. For example, fine-tune a CaffeNet model pretrained on ImageNet with your own face dataset, or train a custom convolutional network from scratch. To detect smaller faces, we generally prefer a smaller network with a smaller input size, which also speeds up prediction.
Then convert the classifier's fully connected layers into convolutional layers, turning it into a fully convolutional network that accepts input images of any size. Passing an image through this network yields a feature map in which each "point" gives the probability that the corresponding receptive field in the original image contains a face; points whose probability exceeds a set threshold become face candidate boxes.
Faces in an image vary in size. The brute-force way to handle this is an image pyramid: rescale the image to multiple sizes and run detection at each scale. Finally, apply non-maximum suppression (NMS) to the candidate boxes collected across all scales to obtain the detection result.
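The pyramid-and-mapping logic above can be sketched in plain Python, independent of Caffe. This is a simplified illustration that starts the pyramid at scale 1.0 (the detection script later also allows upscaling); it assumes the stride-16, 48x48-input network used later in this article:

```python
def pyramid_scales(h, w, net_input=48, factor=0.793700526):
    """Generate pyramid scales until the smaller image side drops below the network input size."""
    scales = []
    scale = 1.0
    min_dim = min(h, w)
    while min_dim >= net_input:
        scales.append(scale)
        scale *= factor   # shrink by the scale step each level
        min_dim *= factor
    return scales

def map_to_original(x, y, scale, stride=16, cell=48):
    """Map a feature-map cell (x, y) at a given pyramid scale back to a box on the original image."""
    return (x * stride / scale, y * stride / scale,
            (x * stride + cell - 1) / scale, (y * stride + cell - 1) / scale)

print(len(pyramid_scales(600, 1000)))   # number of pyramid levels for a 1000x600 image
print(map_to_original(2, 1, 0.5))       # (64.0, 32.0, 158.0, 126.0)
```

Each pyramid level shrinks the image by the factor 0.7937 (about 2^(-1/3)), so faces of any size eventually fit the fixed 48x48 receptive field at some level.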


A download link with the dataset, model files, and code for the Caffe implementation of this method is provided here.

The following describes the Caffe implementation in detail. Since we need to train a CNN classifier that decides whether a patch is a face, prepare positive and negative training samples and build the dataset files Caffe requires (since the network input is 48x48, the raw samples were normalized to 48x48).
The CNN used here is the DeepID convolutional network, whose structure is shown below. Its input is only 48x48, whereas using AlexNet or CaffeNet would add considerable time overhead.
After preparing the network definition train_val.prototxt and the solver configuration solver.prototxt (both included in the download link), start training; 100k iterations yield the caffemodel. To test on the images in the face_test folder, prepare a deploy.prototxt for inference.
The Python script face_test.py for testing a single image is as follows:

# -*- coding: utf-8 -*-
"""
Created on Fri Mar 10 23:02:06 2017
@author: Administrator
"""
import numpy as np
import caffe
size = 48
image_file = 'C:/Users/Administrator/Desktop/caffe/data/face/face_test/0/253_faceimage07068.jpg' # path of the test image
model_def = 'C:/Users/Administrator/Desktop/caffe/models/face/deploy.prototxt'
model_weights = 'C:/Users/Administrator/Desktop/caffe/models/face/_iter_10000.caffemodel'
net = caffe.Net(model_def, model_weights, caffe.TEST)    

# Load the mean file (alternatively, specify the mean values directly)
#mu = np.load('C:/Users/Administrator/Desktop/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy')  ###caffe 自带的文件
#mu = mu.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape}) # set the input shape (1,3,48,48); the size comes from the deploy file
#transformer.set_mean('data', mu)            # subtract the per-channel mean
# Python reads images as H x W x K; caffe needs K x H x W
transformer.set_transpose('data', (2,0,1))  # reorder dimensions: (48,48,3) -> (3,48,48)
# Python stores pixels in [0,1], caffe in [0,255], so rescale
transformer.set_raw_scale('data', 255)      # rescale to [0,255]
transformer.set_channel_swap('data', (2,1,0))   # swap channels: RGB -> BGR
#net.blobs['data'].reshape(1,3,size, size)  # reshape the input to match the deploy file: (num, channels, height, width)

image = caffe.io.load_image(image_file) # load the image; always yields an (h,w,3), RGB, 0~1, float32 image
net.blobs['data'].data[...]  = transformer.preprocess('data', image) # preprocess the loaded image with the transformer above

caffe.set_device(0)
caffe.set_mode_gpu()
output = net.forward()
output_prob = output['prob'][0].argmax()  # index of the most probable class; map it to your own label convention
print output_prob
print output['prob'][0][0] # or print output['prob'][0,1]

The MATLAB script face_test.m for batch testing and computing accuracy is as follows:

%Note: caffe blob order is (N,C,H,W), while matcaffe blobs are (W,H,C,N), i.e. completely reversed
%MATLAB loads images as (h,w,c) RGB, while caffe uses BGR

function test_face()
clear;
addpath('..'); %add the parent directory to the search path
addpath('.'); %add the current directory to the search path
caffe.set_mode_gpu(); %use GPU mode
caffe.set_device(0); %GPU id 0
%caffe.set_mode_cpu();
net_model = 'C:\Users\Administrator\Desktop\caffe\models\face\deploy.prototxt'; %network definition deploy.prototxt
net_weights = 'C:\Users\Administrator\Desktop\caffe\models\face\_iter_10000.caffemodel'; %trained model file
%net_model = 'C:\Users\Administrator\Desktop\caffe\models\face2\deploy.prototxt'; %network definition deploy.prototxt
%net_weights = 'C:\Users\Administrator\Desktop\caffe\models\face2\_iter_100000.caffemodel'; %trained model file
phase = 'test'; %testing, not training
net = caffe.Net(net_model, net_weights, phase); %load the network

tic;
error = 0;
total = 0;
%batch-read images for testing
datadir = 'C:\Users\Administrator\Desktop\caffe\data\face\face_test\0';
imagefiles = dir(datadir);
for i = 3:length(imagefiles)
    im = imread(fullfile(datadir,imagefiles(i).name));
    [input_data,flag] = prepare_image(im); %preprocess the image data
    if flag ~= 1
        continue;
    end
    input_data ={input_data};
    net.forward(input_data); %forward pass

    scores = net.blobs('prob').get_data();

    [best_score,best] = max(scores);
%     fprintf('*****%.3f %d %d\n',best_score,best - 1,classid(i-2));
    best = best - 1; %MATLAB indexing starts at 1; subtract 1 to start from 0
    if best ~= 0
        error = error + 1;
        fprintf('-----error: %d\n',error);
        errorfile = ['error\' imagefiles(i).name];
        %imwrite(im,errorfile);
    end
    total = total + 1;
end
datadir_1 = 'C:\Users\Administrator\Desktop\caffe\data\face\face_test\1';
imagefiles_1 = dir(datadir_1);
for i = 3:length(imagefiles_1)
    im_1 = imread(fullfile(datadir_1,imagefiles_1(i).name));
    [input_data_1,flag] = prepare_image(im_1); %preprocess the image data
    if flag ~= 1
        continue;
    end
    input_data_1 = {input_data_1};
    net.forward(input_data_1); %forward pass

    scores_1 = net.blobs('prob').get_data();

    [best_score_1,best_1] = max(scores_1);
%     fprintf('*****%.3f %d %d\n',best_score,best - 1,classid(i-2));
    best_1 = best_1 - 1; %MATLAB indexing starts at 1; subtract 1 to start from 0
    if best_1 ~= 1
        error = error + 1;
        fprintf('error: %d-----\n',error);
        errorfile = ['face_error\' imagefiles_1(i).name];
        %imwrite(im,errorfile);
    end
    total = total + 1;
end
total_time = toc;
%print the results to the screen
fprintf('total_time: %.3f s\n',total_time);
fprintf('aver_time: %.3f s\n',total_time/total);
fprintf('error/total: %d/%d\n',error,total);
fprintf('accurary: %.4f\n',1.0 - (error*1.0)/total);
%disp(['error/total: ',num2str(error),'/',num2str(length(imagefiles)-2)]);
end

function [im_data,flag] = prepare_image(im)
%d = load('../+caffe/imagenet/ilsvrc_2012_mean.mat');
%mean_data = d.mean_data;

%resize to 227 x 227
im_data = [];
im = imresize(im,[227 227],'bilinear');
%im = imresize(im,[48 48],'bilinear');
[h,w,c] = size(im);
if c ~= 3
    flag = 0;
    return;
end
flag = 1;
%caffe blob order is [w h c num]
%matlab: [h w c] rgb -> caffe: [w h c] bgr
im_data = im(:,:,[3,2,1]); %rgb -> bgr
im_data = permute(im_data,[2,1,3]); %[h w c] -> [w h c]
[w,h,~] = size(im_data);
%the ImageNet mean values are statistically stable and can be used directly
mean_data(:,:,1) = ones(w,h) .* 104; %b
mean_data(:,:,2) = ones(w,h) .* 117; %g
mean_data(:,:,3) = ones(w,h) .* 123; %r

im_data = single(im_data);
%im_data = im_data - single(mean_data); %neither the training nor the test set was mean-subtracted, so skip it here (subtracting only at test time would hurt accuracy)
end

Batch testing on the test set reached an accuracy of 98%.
To use the CNN classifier for detection, replace its fully connected layers with convolutional layers to obtain a fully convolutional network. The modified deploy_full_conv.prototxt is as follows:

name: "face_full_conv_net"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 48 dim: 48 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 3
    stride: 1
    pad: 1
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "conv1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type:  "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 40
    kernel_size: 3
    pad: 1
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "conv2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  convolution_param {
    num_output: 60
    kernel_size: 3
    pad: 1
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "norm3"
  type: "LRN"
  bottom: "conv3"
  top: "conv3"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4"
  convolution_param {
    num_output: 80
    kernel_size: 3
    pad: 1
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "norm4"
  type: "LRN"
  bottom: "conv4"
  top: "conv4"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4"
  top: "pool4"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
#changed from a fully connected layer to a convolutional layer
layer {
  name: "fc5-conv" ### fc5
  type: "Convolution" ### InnerProduct
  bottom: "pool4"
  top: "fc5-conv" ### fc5
  #inner_product_param {
   # num_output: 160
  #}
  convolution_param {
    num_output: 160
    kernel_size: 3
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "fc5-conv"
  top: "fc5-conv"
}
layer {
  name: "drop5"
  type:  "Dropout"
  bottom: "fc5-conv"
  top: "fc5-conv"
  dropout_param {
    dropout_ratio: 0.5
  }
}
#changed from a fully connected layer to a convolutional layer
layer {
  name: "fc6-conv" ### fc6
  type:  "Convolution" ### InnerProduct
  bottom: "fc5-conv"
  top: "fc6-conv"
  #inner_product_param {
   # num_output: 2
  #}
  convolution_param {
    num_output: 2
    kernel_size: 1
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc6-conv"
  top: "prob"
}

The trained _iter_100000.caffemodel must also be converted to its fully convolutional counterpart, _iter_100000_full_conv.caffemodel. The conversion script convert_full_conv.py is as follows:

# -*- coding: utf-8 -*-
"""
Created on Fri Mar 10 21:14:09 2017
@author: Administrator
"""
### First modify deploy.prototxt by hand into the fully convolutional deploy_full_conv.prototxt, paying special attention to how the fully connected layers become convolutional layers
### Then convert the trained classification caffemodel into a fully convolutional caffemodel that accepts inputs of any size and outputs a feature map

import numpy as np
import caffe

model_def = 'C:/Users/Administrator/Desktop/caffe/models/face/deploy.prototxt'
model_weights = 'C:/Users/Administrator/Desktop/caffe/models/face/_iter_100000.caffemodel'
net = caffe.Net(model_def,
                model_weights,
                caffe.TEST)
params = ['fc5', 'fc6']
# fc_params = {name: (weights, biases)}
fc_params = {pr: (net.params[pr][0].data, net.params[pr][1].data) for pr in params}
for fc in params:
    print '{} weights are {} dimensional and biases are {} dimensional'.format(fc, fc_params[fc][0].shape, fc_params[fc][1].shape)

# Load the fully convolutional network to transplant the parameters.
net_full_conv = caffe.Net('./deploy_full_conv.prototxt', 
                          './_iter_100000.caffemodel',
                          caffe.TEST)
params_full_conv = ['fc5-conv', 'fc6-conv']
# conv_params = {name: (weights, biases)}
conv_params = {pr: (net_full_conv.params[pr][0].data, net_full_conv.params[pr][1].data) for pr in params_full_conv}
for conv in params_full_conv:
    print '{} weights are {} dimensional and biases are {} dimensional'.format(conv, conv_params[conv][0].shape, conv_params[conv][1].shape)

for pr, pr_conv in zip(params, params_full_conv):
    conv_params[pr_conv][0].flat = fc_params[pr][0].flat  # flat unrolls the arrays
    conv_params[pr_conv][1][...] = fc_params[pr][1]

net_full_conv.save('./_iter_100000_full_conv.caffemodel')
print 'success'

Finally, deploy_full_conv.prototxt and _iter_100000_full_conv.caffemodel can be used to detect faces in images of any size. The Python script face_detect.py for detecting faces in a single image is as follows:

# -*- coding: utf-8 -*-
import numpy as np
import cv2 # requires OpenCV: copy build\python\2.7\x64\cv2.pyd from the OpenCV install directory into Anaconda2\Lib\site-packages
from operator import itemgetter
import time
import caffe
caffe.set_device(0)
caffe.set_mode_gpu()

def IoU(rect_1, rect_2):
    '''
    :param rect_1: list in format [x11, y11, x12, y12, confidence]
    :param rect_2:  list in format [x21, y21, x22, y22, confidence]
    :return:    returns IoU ratio (intersection over union) of two rectangles
    '''
    x11 = rect_1[0]    # first rectangle top left x
    y11 = rect_1[1]    # first rectangle top left y
    x12 = rect_1[2]    # first rectangle bottom right x
    y12 = rect_1[3]    # first rectangle bottom right y
    x21 = rect_2[0]    # second rectangle top left x
    y21 = rect_2[1]    # second rectangle top left y
    x22 = rect_2[2]    # second rectangle bottom right x
    y22 = rect_2[3]    # second rectangle bottom right y
    x_overlap = max(0, min(x12,x22) -max(x11,x21))
    y_overlap = max(0, min(y12,y22) -max(y11,y21))
    intersection = x_overlap * y_overlap
    union = (x12-x11) * (y12-y11) + (x22-x21) * (y22-y21) - intersection
    return float(intersection) / union

def IoM(rect_1, rect_2):
    '''
    :param rect_1: list in format [x11, y11, x12, y12, confidence]
    :param rect_2:  list in format [x21, y21, x22, y22, confidence]
    :return:    returns IoM ratio (intersection over min-area) of two rectangles
    '''
    x11 = rect_1[0]    # first rectangle top left x
    y11 = rect_1[1]    # first rectangle top left y
    x12 = rect_1[2]    # first rectangle bottom right x
    y12 = rect_1[3]    # first rectangle bottom right y
    x21 = rect_2[0]    # second rectangle top left x
    y21 = rect_2[1]    # second rectangle top left y
    x22 = rect_2[2]    # second rectangle bottom right x
    y22 = rect_2[3]    # second rectangle bottom right y
    x_overlap = max(0, min(x12,x22) -max(x11,x21))
    y_overlap = max(0, min(y12,y22) -max(y11,y21))
    intersection = x_overlap * y_overlap
    rect1_area = (y12 - y11) * (x12 - x11)
    rect2_area = (y22 - y21) * (x22 - x21)
    min_area = min(rect1_area, rect2_area)
    return float(intersection) / min_area

def NMS(rectangles,threshold=0.3):
    '''
    :param rectangles:  list of rectangles, which are lists in format [x11, y11, x12, y12, confidence]
    :return:    list of rectangles after local NMS
    '''
    rectangles = sorted(rectangles, key=itemgetter(4), reverse=True) # sort by confidence in descending order
    result_rectangles = rectangles[:]  # list to return

    '''
    while not result_rectangles == []:
        rect = result_rectangles[0]
        for index in range(1,len(result_rectangles)):
            iou = IoU(rect,result_rectangles[index])
            if
    '''
    number_of_rects = len(result_rectangles)
    #threshold = 0.3     # threshold of IoU of two rectangles
    cur_rect = 0
    while cur_rect < number_of_rects - 1:     # start from first element to second last element
        rects_to_compare = number_of_rects - cur_rect - 1      # elements after current element to compare
        cur_rect_to_compare = cur_rect + 1    # start comparing with element after current
        while rects_to_compare > 0:      # while there is at least one element after current to compare
            if (IoU(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= threshold or IoM(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= 0.3):
                del result_rectangles[cur_rect_to_compare]      # delete the rectangle
                number_of_rects -= 1
            else:
                cur_rect_to_compare += 1    # skip to next rectangle
            rects_to_compare -= 1
        cur_rect += 1   # finished comparing for current rectangle

    return result_rectangles

def face_detection(imgFile) :
    #model_def = 'C:/Users/Administrator/Desktop/caffe/models/face/deploy_full_conv.prototxt' 
    #model_weights = 'C:/Users/Administrator/Desktop/caffe/models/face/_iter_10000_full_conv.caffemodel'
    model_def = 'C:/Users/Administrator/Desktop/caffe/models/face2/deploy_full_conv.prototxt' 
    model_weights = 'C:/Users/Administrator/Desktop/caffe/models/face2/_iter_100000_full_conv.caffemodel'
    net_full_conv = caffe.Net(model_def,
                              model_weights,
                              caffe.TEST)

    mu = np.load('C:/Users/Administrator/Desktop/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy')
    mu = mu.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values
    #print 'mean-subtracted values:' , zip('BGR', mu)

    start_time = time.time()
    scales = [] # pyramid scales; factor below is the scale step
    factor = 0.793700526

    img = cv2.imread(imgFile) # OpenCV loads images as (h,w,c) BGR; caffe blobs are (n,c,h,w)
    print img.shape

    largest = min(2, 4000/max(img.shape[0:2])) # 4000 is an empirical size limit for face detection
    scale = largest
    minD = largest*min(img.shape[0:2])
    while minD >= 48:  # multi-scale loop: continue while the smaller side is at least the 48x48 network input
        scales.append(scale) # record the current scale
        scale *= factor # multiply by the scale step
        minD *= factor # smaller image dimension at the next scale

    true_boxes = []

    for scale in scales:
        scale_img = cv2.resize(img,((int(img.shape[1] * scale), int(img.shape[0] * scale)))) # resize the image to the current scale
        cv2.imwrite('C:/Users/Administrator/Desktop/caffe/scale_img.jpg',scale_img)
        im = caffe.io.load_image('C:/Users/Administrator/Desktop/caffe/scale_img.jpg') # load with caffe's io interface; always yields an (h,w,3), RGB, 0~1, float32 image

        net_full_conv.blobs['data'].reshape(1,3,scale_img.shape[0],scale_img.shape[1]) # reshape the data blob to (1, 3, height, width)
        transformer = caffe.io.Transformer({'data': net_full_conv.blobs['data'].data.shape}) # create a transformer for the data layer
        transformer.set_transpose('data', (2,0,1)) # (h,w,3) -> (3,h,w)
        #transformer.set_mean('data', mu) # the training set was not mean-subtracted, so skip it here too
        transformer.set_raw_scale('data', 255.0) #rescale from [0,1] to [0,255]
        transformer.set_channel_swap('data', (2,1,0)) #RGB -> BGR

        net_full_conv.blobs['data'].data[...] = transformer.preprocess('data', im)
        out = net_full_conv.forward()

        print out['prob'][0,0].shape # shape of the prob output, rows x cols
        #print out['prob'][0].argmax(axis=0)
        featureMap = out['prob'][0,0] # out['prob'][0][0] is the face-probability feature map
        stride = 16 # receptive-field stride on the input
        cellSize = 48 # network input size
        thresh = 0.95
        for (y,x),prob in np.ndenumerate(featureMap):
            if prob > thresh :
                true_boxes.append([float(x*stride)/scale,
                                    float(y*stride)/scale,
                                    float(x*stride + cellSize - 1)/scale,
                                    float(y*stride + cellSize - 1)/scale,
                                    prob])

    true_boxes = NMS(true_boxes,0.2) # non-maximum suppression
    for true_box in true_boxes:
        (x1, y1, x2, y2) = true_box[0:4] # face box coordinates
        cv2.rectangle(img, (int(x1),int(y1)), (int(x2),int(y2)), (0,255,0)) # draw the face box

    end_time = time.time()
    print (end_time-start_time)*1000,'ms'

    cv2.imwrite('output.jpg',img)
    cv2.namedWindow('test win')  
    cv2.imshow('test win', img)          
    cv2.waitKey(0)  
    cv2.destroyWindow('test win')

if __name__ == "__main__":
    imgFile = 'C:/Users/Administrator/Desktop/caffe/matlab/demo/1.jpg'
    face_detection(imgFile)

 



Face Detection (Part 2)

May 6, 2017, 22:10:20 · HamTam12


Copyright notice: this is an original article by the author; reproduction without permission is prohibited. https://blog.csdn.net/sinat_14916279/article/details/71305094

2. Cascaded-CNN face detection
This approach detects faces with a cascade of networks, following the CVPR 2015 paper A Convolutional Neural Network Cascade for Face Detection. It uses the cascaded 12-net, 24-net, and 48-net for face classification, and the 12-calibration-net, 24-calibration-net, and 48-calibration-net for refining face box locations. It can detect faces as small as 12x12. Compared with the single-CNN method, it is much faster and produces more accurate boxes, with high precision and recall, achieving the best FDDB score at the time.


Paper download
GitHub source code
GitHub pretrained models

The author's code is clean, readable, and concisely commented. The basic pipeline of the cascaded-CNN detector is as follows:
(1) 12-net: First train a small face/non-face binary classifier whose input is a 12x12 image, and convert its final fully connected layer into a convolutional layer. The resulting fully convolutional network (12-full-conv-net) accepts images of any size and outputs a feature map in which each value is the face probability of the corresponding receptive field. At test time, suppose the minimum face size to detect is K x K, e.g. 40 x 40: scale the image by 12/K, feed the whole image into the trained 12x12 fully convolutional network to obtain the feature map, and threshold out low-scoring locations. This discards the vast majority of uninteresting regions while keeping a manageable number of candidate boxes.
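The 12/K rescaling in step (1) can be made concrete with a small sketch (pure Python; the function names here are illustrative, though the author's code contains an analogous `find_initial_scale`):

```python
def initial_scale(min_face_size, net_input=12):
    """To detect faces of at least K x K pixels with a 12x12 network,
    shrink the image so a K-pixel face becomes 12 pixels, i.e. scale by 12/K."""
    return net_input / float(min_face_size)

def box_on_original(x, y, min_face_size, net_input=12):
    """A detection at (x, y) on the shrunken image corresponds to a
    min_face_size x min_face_size box on the original image."""
    s = initial_scale(min_face_size, net_input)
    return (x / s, y / s, x / s + min_face_size, y / s + min_face_size)

print(initial_scale(48))            # 0.25: shrink to a quarter to detect 48x48 faces
print(box_on_original(6, 3, 48))    # (24.0, 12.0, 72.0, 60.0)
```

Detecting larger minimum face sizes means shrinking the image more, which is why raising min_face_size speeds the cascade up so dramatically.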
(2) 12-calibration-net: Train a calibration network with 12x12 input to correct the box boundaries produced by 12-net. It is essentially a 45-class classifier that decides whether the face in the current crop is shifted left/right or up/down, or too large or too small, covering:
3 x-shifts: -0.17, 0, 0.17
3 y-shifts: -0.17, 0, 0.17
5 scales: 0.83, 0.91, 1.0, 1.1, 1.21
At test time, every candidate box from 12-net is fed to 12-calibration-net, and its position is adjusted according to the classification result.
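The 3 x 3 x 5 = 45 calibration patterns can be enumerated as follows. This is a sketch: the pattern values come from the paper, and the adjustment formula follows the paper's definition; the averaging over all high-scoring patterns that the paper actually performs is omitted here for brevity:

```python
# offsets and scales from the cascade-CNN paper: 3 x-shifts x 3 y-shifts x 5 scales = 45 patterns
X_OFFSETS = [-0.17, 0.0, 0.17]
Y_OFFSETS = [-0.17, 0.0, 0.17]
SCALES = [0.83, 0.91, 1.0, 1.1, 1.21]

PATTERNS = [(sn, xn, yn) for sn in SCALES for xn in X_OFFSETS for yn in Y_OFFSETS]

def calibrate(box, pattern):
    """Apply one calibration pattern (sn, xn, yn) to a box (x, y, w, h):
    undo the shift by xn*w and yn*h and the scaling by sn."""
    x, y, w, h = box
    sn, xn, yn = pattern
    return (x - xn * w / sn, y - yn * h / sn, w / sn, h / sn)

print(len(PATTERNS))                              # 45 classes
print(calibrate((100, 100, 50, 50), (1.0, 0.0, 0.0)))  # the identity pattern leaves the box unchanged
```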
(3) localNMS: Apply local non-maximum suppression to the candidates calibrated by 12-calibration-net, discarding overlapping low-scoring boxes and keeping the higher-scoring ones.
(4) 24-net: Train a face classifier network with 24x24 input. At test time, resize each candidate box surviving localNMS to 24x24, feed it to 24-net to decide whether it is a face, and keep boxes scoring above a threshold.
(5) 24-calibration-net: Similarly, train a boundary calibration network with 24x24 input to correct the boxes kept by 24-net; candidate crops are resized to 24x24, and everything else matches 12-calibration-net.
(6) localNMS: Apply local non-maximum suppression again to the boxes calibrated by 24-calibration-net, discarding overlapping low-scoring boxes and keeping the higher-scoring ones.
(7) 48-net: Train a more accurate 48x48 face/non-face classifier. At test time, resize the candidates surviving localNMS to 48x48, feed them to 48-net, and keep the high-scoring ones.
(8) globalNMS: Apply global non-maximum suppression to all boxes from 48-net, keeping the best face boxes.
(9) 48-calibration-net: Train a boundary calibration network with 48x48 input. At test time, resize the best boxes from globalNMS to 48x48 and perform a final boundary calibration.

Therefore, we first separately train the face/non-face networks 12-net, 24-net, and 48-net, as well as the calibration networks 12-calibration-net, 24-calibration-net, and 48-calibration-net. At test time, multi-scale detection is done with an image pyramid: any input image passes through 12-net (12-full-conv-net) -> 12-calibration-net -> localNMS -> 24-net -> 24-calibration-net -> localNMS -> 48-net -> globalNMS -> 48-calibration-net, and the final boxes are the detection result.
The difference between localNMS and globalNMS (Non-maximum Suppression, NMS) is that the former uses only IoU (Intersection over Union), the ratio of intersection to union, while the latter also uses IoM (Intersection over Min-area), the ratio of intersection to the smaller of the two areas, to filter overlapping candidates.
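The IoU/IoM distinction is easy to see numerically: when a small box is nested inside a much larger one, IoU stays low (so localNMS keeps both), but IoM is close to 1 (so globalNMS suppresses the nested box). A minimal sketch:

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    xo = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # horizontal overlap
    yo = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # vertical overlap
    inter = xo * yo
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / float(union)

def iom(a, b):
    """Intersection over the smaller box's area."""
    xo = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    yo = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = xo * yo
    min_area = min((a[2]-a[0])*(a[3]-a[1]), (b[2]-b[0])*(b[3]-b[1]))
    return inter / float(min_area)

big = (0, 0, 100, 100)
small = (10, 10, 40, 40)    # fully nested inside big
print(iou(big, small))      # 0.09: IoU alone would keep both boxes
print(iom(big, small))      # 1.0: the IoM criterion suppresses the nested box
```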
I did not retrain the networks myself but used the author's pretrained models. The script face_cascade_fullconv_single_crop_single_image.py in the CNN_face_detection-master\face_detection folder tests a single image; the code is below. Note that the minimum face size min_face_size can be set according to your needs, and it directly determines speed: if the faces to detect are fairly large, e.g. 128x128 or above, detection runs in real time.

import numpy as np
import cv2
import time
import os
#from operator import itemgetter
from load_model_functions import *
from face_detection_functions import *

# ==================  caffe  ======================================
#caffe_root = '/home/anson/caffe-master/'  # this file is expected to be in {caffe_root}/examples
import sys
#sys.path.insert(0, caffe_root + 'python')
import caffe
# ==================  load models  ======================================
net_12c_full_conv, net_12_cal, net_24c, net_24_cal, net_48c, net_48_cal = \
    load_face_models(loadNet=True)

nets = (net_12c_full_conv, net_12_cal, net_24c, net_24_cal, net_48c, net_48_cal)

start_time = time.time()

read_img_name = 'C:/Users/Administrator/Desktop/caffe/matlab/demo/1.jpg'
img = cv2.imread(read_img_name)     # BGR

print img.shape

min_face_size = 48 # minimum face size to detect; smaller values find smaller faces but slow detection down sharply
stride = 5 # step size (not actually used)

# caffe_image = np.true_divide(img, 255)      # convert to caffe style (0~1 BGR)
# caffe_image = caffe_image[:, :, (2, 1, 0)]
img_forward = np.array(img, dtype=np.float32)
img_forward -= np.array((104, 117, 123))

rectangles = detect_faces_net(nets, img_forward, min_face_size, stride, True, 2, 0.05)
for rectangle in rectangles:    # draw rectangles
        cv2.rectangle(img, (rectangle[0], rectangle[1]), (rectangle[2], rectangle[3]), (255, 0, 0), 2)

end_time = time.time()
print 'aver_time = ',(end_time-start_time)*1000,'ms'

cv2.imshow('test img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

The function detect_faces_net() used above is implemented in face_detection_functions.py in the CNN_face_detection-master\face_detection folder, which contains the detection procedure for the whole cascade. It is elegantly written, with a clear structure and comments; I modified it for my own needs as follows:

import numpy as np
import cv2
import time
from operator import itemgetter
# ==================  caffe  ======================================
#caffe_root = '/home/anson/caffe-master/'  # this file is expected to be in {caffe_root}/examples
import sys
#sys.path.insert(0, caffe_root + 'python')
import caffe

def find_initial_scale(net_kind, min_face_size):
    '''
    :param net_kind: what kind of net (12, 24, or 48)
    :param min_face_size: minimum face size
    :return:    returns scale factor
    '''
    return float(min_face_size) / net_kind
def resize_image(img, scale):
    '''
    :param img: original img
    :param scale: scale factor
    :return:    resized image
    '''
    height, width, channels = img.shape
    new_height = int(height / scale)     # resized new height
    new_width = int(width / scale)       # resized new width
    new_dim = (new_width, new_height)
    img_resized = cv2.resize(img, new_dim)      # resized image
    return img_resized
def draw_rectangle(net_kind, img, face):
    '''
    :param net_kind: what kind of net (12, 24, or 48)
    :param img: image to draw on
    :param face: # list of info. in format [x, y, scale]
    :return:    nothing
    '''
    x = face[0]
    y = face[1]
    scale = face[2]
    original_x = int(x * scale)      # corresponding x and y at original image
    original_y = int(y * scale)
    original_x_br = int(x * scale + net_kind * scale)    # bottom right x and y
    original_y_br = int(y * scale + net_kind * scale)
    cv2.rectangle(img, (original_x, original_y), (original_x_br, original_y_br), (255,0,0), 2)
def IoU(rect_1, rect_2):
    '''
    :param rect_1: list in format [x11, y11, x12, y12, confidence, current_scale]
    :param rect_2:  list in format [x21, y21, x22, y22, confidence, current_scale]
    :return:    returns IoU ratio (intersection over union) of two rectangles
    '''
    x11 = rect_1[0]    # first rectangle top left x
    y11 = rect_1[1]    # first rectangle top left y
    x12 = rect_1[2]    # first rectangle bottom right x
    y12 = rect_1[3]    # first rectangle bottom right y
    x21 = rect_2[0]    # second rectangle top left x
    y21 = rect_2[1]    # second rectangle top left y
    x22 = rect_2[2]    # second rectangle bottom right x
    y22 = rect_2[3]    # second rectangle bottom right y
    x_overlap = max(0, min(x12,x22) -max(x11,x21))
    y_overlap = max(0, min(y12,y22) -max(y11,y21))
    intersection = x_overlap * y_overlap
    union = (x12-x11) * (y12-y11) + (x22-x21) * (y22-y21) - intersection
    return float(intersection) / union
def IoM(rect_1, rect_2):
    '''
    :param rect_1: list in format [x11, y11, x12, y12, confidence, current_scale]
    :param rect_2:  list in format [x21, y21, x22, y22, confidence, current_scale]
    :return:    returns IoM ratio (intersection over min-area) of two rectangles
    '''
    x11 = rect_1[0]    # first rectangle top left x
    y11 = rect_1[1]    # first rectangle top left y
    x12 = rect_1[2]    # first rectangle bottom right x
    y12 = rect_1[3]    # first rectangle bottom right y
    x21 = rect_2[0]    # second rectangle top left x
    y21 = rect_2[1]    # second rectangle top left y
    x22 = rect_2[2]    # second rectangle bottom right x
    y22 = rect_2[3]    # second rectangle bottom right y
    x_overlap = max(0, min(x12,x22) -max(x11,x21))
    y_overlap = max(0, min(y12,y22) -max(y11,y21))
    intersection = x_overlap * y_overlap
    rect1_area = (y12 - y11) * (x12 - x11)
    rect2_area = (y22 - y21) * (x22 - x21)
    min_area = min(rect1_area, rect2_area)
    return float(intersection) / min_area
def localNMS(rectangles):
    '''
    :param rectangles:  list of rectangles, which are lists in format [x11, y11, x12, y12, confidence, current_scale],
                        sorted from highest confidence to smallest
    :return:    list of rectangles after local NMS
    '''
    result_rectangles = rectangles[:]  # list to return
    number_of_rects = len(result_rectangles)
    threshold = 0.3     # threshold of IoU of two rectangles
    cur_rect = 0
    while cur_rect < number_of_rects - 1:     # start from first element to second last element
        rects_to_compare = number_of_rects - cur_rect - 1      # elements after current element to compare
        cur_rect_to_compare = cur_rect + 1    # start comparing with element after current
        while rects_to_compare > 0:      # while there is at least one element after current to compare
            if (IoU(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= threshold) \
                    and (result_rectangles[cur_rect][5] == result_rectangles[cur_rect_to_compare][5]):  # scale is same

                del result_rectangles[cur_rect_to_compare]      # delete the rectangle
                number_of_rects -= 1
            else:
                cur_rect_to_compare += 1    # skip to next rectangle
            rects_to_compare -= 1
        cur_rect += 1   # finished comparing for current rectangle

    return result_rectangles
def globalNMS(rectangles):
    '''
    :param rectangles:  list of rectangles, which are lists in format [x11, y11, x12, y12, confidence, current_scale],
                        sorted from highest confidence to smallest
    :return:    list of rectangles after global NMS
    '''
    result_rectangles = rectangles[:]  # list to return
    number_of_rects = len(result_rectangles)
    threshold = 0.3     # threshold of IoU of two rectangles
    cur_rect = 0
    while cur_rect < number_of_rects - 1:     # start from first element to second last element
        rects_to_compare = number_of_rects - cur_rect - 1      # elements after current element to compare
        cur_rect_to_compare = cur_rect + 1    # start comparing with element after current
        while rects_to_compare > 0:      # while there is at least one element after current to compare
            if IoU(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= 0.2  \
                    or ((IoM(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= threshold)):
                        #and (result_rectangles[cur_rect_to_compare][5] < 0.85)):  # if IoU ratio is higher than threshold #10/12=0.8333?
                del result_rectangles[cur_rect_to_compare]      # delete the rectangle
                number_of_rects -= 1
            else:
                cur_rect_to_compare += 1    # skip to next rectangle
            rects_to_compare -= 1
        cur_rect += 1   # finished comparing for current rectangle

    return result_rectangles

# ====== Below functions (12cal ~ 48cal) take images in style of caffe (0~1 BGR)===
def detect_face_12c(net_12c_full_conv, img, min_face_size, stride,
                    multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    :param img: image to detect faces
    :param min_face_size: minimum face size to detect (in pixels)
    :param stride: stride (in pixels)
    :param multiScale: whether to find faces under multiple scales or not
    :param scale_factor: scale to apply for pyramid
    :param threshold: score of patch must be above this value to pass to next net
    :return:    list of rectangles after global NMS
    '''
    net_kind = 12
    rectangles = []   # list of rectangles [x11, y11, x12, y12, confidence, current_scale] (corresponding to original image)

    current_scale = find_initial_scale(net_kind, min_face_size)     # find initial scale
    caffe_img_resized = resize_image(img, current_scale)      # resized initial caffe image
    current_height, current_width, channels = caffe_img_resized.shape

    while current_height > net_kind and current_width > net_kind:
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))  # switch from H x W x C to C x H x W
        # shape for input (data blob is N x C x H x W), set data
        net_12c_full_conv.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_12c_full_conv.blobs['data'].data[...] = caffe_img_resized_CHW
        # run net and take argmax for prediction
        net_12c_full_conv.forward()
        out = net_12c_full_conv.blobs['prob'].data[0][1, :, :]
        # print out.shape
        out_height, out_width = out.shape

        for current_y in range(0, out_height):
            for current_x in range(0, out_width):
                # total_windows += 1
                confidence = out[current_y, current_x]  # left index is y, right index is x (starting from 0)
                if confidence >= threshold:
                    current_rectangle = [int(2*current_x*current_scale), int(2*current_y*current_scale),
                                             int(2*current_x*current_scale + net_kind*current_scale),
                                             int(2*current_y*current_scale + net_kind*current_scale),
                                             confidence, current_scale]     # find corresponding patch on image
                    rectangles.append(current_rectangle)
        if multiScale is False:
            break
        else:
            caffe_img_resized = resize_image(caffe_img_resized, scale_factor)
            current_scale *= scale_factor
            current_height, current_width, channels = caffe_img_resized.shape

    return rectangles
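The coordinate mapping inside the loop above is worth spelling out: `find_initial_scale` and `resize_image` are helpers defined earlier in the original script, and the factor 2 comes from the 12-net's internal pooling stride, so each cell of the probability map corresponds to a 12x12 window placed every 2 pixels on the resized image. A standalone sketch of that mapping (the function name and the `out_stride` constant are mine):

```python
import numpy as np

def prob_map_to_rects(prob_map, current_scale, net_kind=12, out_stride=2, threshold=0.05):
    """Map each cell of a full-conv probability map back to a window on the
    original image. A cell at (y, x) covers a net_kind x net_kind window on the
    resized image, spaced out_stride pixels apart; multiplying by current_scale
    maps those coordinates back to original-image pixels."""
    rects = []
    ys, xs = np.nonzero(prob_map >= threshold)   # cells confident enough to keep
    for y, x in zip(ys, xs):
        x1 = int(out_stride * x * current_scale)
        y1 = int(out_stride * y * current_scale)
        x2 = int(out_stride * x * current_scale + net_kind * current_scale)
        y2 = int(out_stride * y * current_scale + net_kind * current_scale)
        rects.append([x1, y1, x2, y2, float(prob_map[y, x]), current_scale])
    return rects

# A 2x2 map with one confident cell at (y=1, x=1), scale 4 (i.e. 48-pixel faces):
demo = np.array([[0.01, 0.02], [0.03, 0.9]])
print(prob_map_to_rects(demo, current_scale=4.0))  # → [[8, 8, 56, 56, 0.9, 4.0]]
```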

def cal_face_12c(net_12_cal, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style (0~1 BGR) to detect faces in
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration, sorted by confidence
    '''
    height, width, channels = caffe_img.shape
    result = []
    all_cropped_caffe_img = []

    for cur_rectangle in rectangles:

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]

        cropped_caffe_img = caffe_img[original_y1:original_y2, original_x1:original_x2] # crop image
        all_cropped_caffe_img.append(cropped_caffe_img)

    if len(all_cropped_caffe_img) == 0:
        return []

    output_all = net_12_cal.predict(all_cropped_caffe_img)   # predict through caffe

    for cur_rect in range(len(rectangles)):
        cur_rectangle = rectangles[cur_rect]
        output = output_all[cur_rect]
        prediction = output[0]      # scores for the 45 calibration classes (0~44)

        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]   # ndarray of indices where prediction is larger than threshold

        number_of_cals = len(indices)   # number of calibrations larger than threshold

        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        total_s_change = 0
        total_x_change = 0
        total_y_change = 0

        for current_cal in range(number_of_cals):       # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21

            if cal_label % 9 <= 2:       # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17

            if cal_label % 3 == 0:       # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:     # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17

        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals

        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))

        result.append(cur_result)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                        # reverse, so that it ranks from large to small
    return result
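The long if/elif chain above encodes the 45 calibration classes of the cascade-CNN design: 5 scale factors x 3 x-offsets x 3 y-offsets, with label = s_idx*9 + x_idx*3 + y_idx. The same chain is repeated verbatim in every cal_* function below; a table-driven equivalent (helper names are mine, a sketch rather than the author's code) makes the encoding explicit:

```python
S_CHANGES = [0.83, 0.91, 1.0, 1.10, 1.21]   # scale adjustments
XY_CHANGES = [-0.17, 0.0, 0.17]             # x/y offset adjustments

def decode_cal_label(cal_label):
    """Decode one of the 45 calibration classes (5 scales x 3 x-offsets x
    3 y-offsets) into its (s, x, y) adjustment, mirroring the if/elif chain."""
    s = S_CHANGES[cal_label // 9]
    x = XY_CHANGES[(cal_label % 9) // 3]
    y = XY_CHANGES[cal_label % 3]
    return s, x, y

def apply_calibration(rect, labels, img_w, img_h):
    """Average the adjustments of all labels scoring above threshold and
    apply them to rect = [x1, y1, x2, y2, ...], as cal_face_12c does."""
    x1, y1, x2, y2 = rect[:4]
    w, h = x2 - x1, y2 - y1
    n = len(labels)
    s = sum(decode_cal_label(l)[0] for l in labels) / n
    dx = sum(decode_cal_label(l)[1] for l in labels) / n
    dy = sum(decode_cal_label(l)[2] for l in labels) / n
    new_x1 = int(max(0, x1 - w * dx / s))
    new_y1 = int(max(0, y1 - h * dy / s))
    new_x2 = int(min(img_w, new_x1 + w / s))
    new_y2 = int(min(img_h, new_y1 + h / s))
    return [new_x1, new_y1, new_x2, new_y2] + rect[4:]
```

Factoring the chain out this way would remove the five near-identical copies of it in this script.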

def detect_face_24c(net_24c, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style (0~1 BGR) to detect faces in
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles whose 24-net confidence exceeds the threshold
    '''
    result = []
    all_cropped_caffe_img = []

    for cur_rectangle in rectangles:

        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]

        cropped_caffe_img = caffe_img[y1:y2, x1:x2]     # crop image
        all_cropped_caffe_img.append(cropped_caffe_img)

    if len(all_cropped_caffe_img) == 0:
        return []

    prediction_all = net_24c.predict(all_cropped_caffe_img)   # predict through caffe

    for cur_rect in range(len(rectangles)):
        confidence = prediction_all[cur_rect][1]
        if confidence > 0.05:
            cur_rectangle = rectangles[cur_rect]
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    return result

def cal_face_24c(net_24_cal, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style (0~1 BGR) to detect faces in
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = caffe_img.shape
    result = []

    for cur_rectangle in rectangles:

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = caffe_img[original_y1:original_y2, original_x1:original_x2] # crop image
        output = net_24_cal.predict([cropped_caffe_img])   # predict through caffe
        prediction = output[0]      # scores for the 45 calibration classes (0~44)

        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]   # ndarray of indices where prediction is larger than threshold

        number_of_cals = len(indices)   # number of calibrations larger than threshold

        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue

        total_s_change = 0
        total_x_change = 0
        total_y_change = 0

        for current_cal in range(number_of_cals):       # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21

            if cal_label % 9 <= 2:       # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17

            if cal_label % 3 == 0:       # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:     # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17

        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals

        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))

        result.append(cur_result)

    return result

def detect_face_48c(net_48c, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style (0~1 BGR) to detect faces in
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles whose 48-net confidence exceeds the threshold, sorted by confidence
    '''
    result = []

    for cur_rectangle in rectangles:

        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]

        cropped_caffe_img = caffe_img[y1:y2, x1:x2]     # crop image

        prediction = net_48c.predict([cropped_caffe_img])   # predict one crop at a time through caffe
        confidence = prediction[0][1]

        if confidence > 0.3:
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                        # reverse, so that it ranks from large to small
    return result

def cal_face_48c(net_48_cal, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style (0~1 BGR) to detect faces in
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = caffe_img.shape
    result = []
    for cur_rectangle in rectangles:

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = caffe_img[original_y1:original_y2, original_x1:original_x2] # crop image
        output = net_48_cal.predict([cropped_caffe_img])   # predict through caffe

        prediction = output[0]      # scores for the 45 calibration classes (0~44)

        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]   # ndarray of indices where prediction is larger than threshold

        number_of_cals = len(indices)   # number of calibrations larger than threshold

        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue

        total_s_change = 0
        total_x_change = 0
        total_y_change = 0

        for current_cal in range(number_of_cals):       # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21

            if cal_label % 9 <= 2:       # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17

            if cal_label % 3 == 0:       # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:     # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17

        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals

        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - 1.1 * original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + 1.1 * original_h / s_change))

        result.append(cur_result)

    return result

def detect_faces(nets, img_forward, caffe_image, min_face_size, stride,
                 multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    Complete flow of face cascade detection
    :param nets: 6 nets as a tuple
    :param img_forward: image in normal style after subtracting mean pixel value
    :param caffe_image: image in style of caffe (0~1 BGR)
    :param min_face_size: minimum face size to detect (in pixels)
    :param stride: stride (in pixels)
    :param multiScale: whether to search multiple scales
    :param scale_factor: scale step for the image pyramid
    :param threshold: minimum score for a patch to pass to the next net
    :return: list of rectangles
    '''
    net_12c_full_conv = nets[0]
    net_12_cal = nets[1]
    net_24c = nets[2]
    net_24_cal = nets[3]
    net_48c = nets[4]
    net_48_cal = nets[5]

    rectangles = detect_face_12c(net_12c_full_conv, img_forward, min_face_size,
                                 stride, multiScale, scale_factor, threshold)     # detect faces
    rectangles = cal_face_12c(net_12_cal, caffe_image, rectangles)      # calibration
    rectangles = localNMS(rectangles)      # apply local NMS
    rectangles = detect_face_24c(net_24c, caffe_image, rectangles)
    rectangles = cal_face_24c(net_24_cal, caffe_image, rectangles)      # calibration
    rectangles = localNMS(rectangles)      # apply local NMS
    rectangles = detect_face_48c(net_48c, caffe_image, rectangles)
    rectangles = globalNMS(rectangles)      # apply global NMS
    rectangles = cal_face_48c(net_48_cal, caffe_image, rectangles)      # calibration

    return rectangles
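localNMS and globalNMS are defined elsewhere in the original script and are not part of this excerpt. For reference, a generic greedy IoU-based NMS over rectangles in the [x1, y1, x2, y2, confidence, ...] format used here might look like this (a sketch, not the author's implementation):

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2, ...] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def nms(rectangles, iou_threshold=0.3):
    """Greedy NMS: keep the highest-confidence box, then drop any box
    that overlaps a kept box by more than iou_threshold."""
    rects = sorted(rectangles, key=lambda r: r[4], reverse=True)
    kept = []
    for r in rects:
        if all(iou(r, k) <= iou_threshold for k in kept):
            kept.append(r)
    return kept
```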

# ========== Adjusts net to take one crop of image only during test time ==========
# ====== Below functions take images in normal style after subtracting mean pixel value===
def detect_face_12c_net(net_12c_full_conv, img_forward, min_face_size, stride,
                    multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    Adjusts net to take one crop of image only during test time
    :param img_forward: image in normal style after subtracting the mean pixel value
    :param min_face_size: minimum face size to detect (in pixels)
    :param stride: stride (in pixels)
    :param multiScale: whether to find faces under multiple scales or not
    :param scale_factor: scale to apply for pyramid
    :param threshold: score of patch must be above this value to pass to next net
    :return:    list of candidate rectangles (NMS is applied later in the cascade)
    '''
    net_kind = 12
    rectangles = []   # list of rectangles [x11, y11, x12, y12, confidence, current_scale] (corresponding to original image)

    current_scale = find_initial_scale(net_kind, min_face_size)     # find initial scale
    caffe_img_resized = resize_image(img_forward, current_scale)      # resized initial caffe image
    current_height, current_width, channels = caffe_img_resized.shape

    # print "Shape after resizing : " + str(caffe_img_resized.shape)

    while current_height > net_kind and current_width > net_kind:
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))  # switch from H x W x C to C x H x W
        # shape for input (data blob is N x C x H x W), set data
        net_12c_full_conv.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_12c_full_conv.blobs['data'].data[...] = caffe_img_resized_CHW
        # run net and take argmax for prediction
        net_12c_full_conv.forward()

        out = net_12c_full_conv.blobs['prob'].data[0][1, :, :]
        # print out.shape
        out_height, out_width = out.shape

        for current_y in range(0, out_height):
            for current_x in range(0, out_width):
                # total_windows += 1
                confidence = out[current_y, current_x]  # left index is y, right index is x (starting from 0)
                if confidence >= threshold:
                    current_rectangle = [int(2*current_x*current_scale), int(2*current_y*current_scale),
                                             int(2*current_x*current_scale + net_kind*current_scale),
                                             int(2*current_y*current_scale + net_kind*current_scale),
                                             confidence, current_scale]     # find corresponding patch on image
                    rectangles.append(current_rectangle)
        if multiScale is False:
            break
        else:
            caffe_img_resized = resize_image(caffe_img_resized, scale_factor)
            current_scale *= scale_factor
            current_height, current_width, channels = caffe_img_resized.shape

    return rectangles

def cal_face_12c_net(net_12_cal, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param img_forward: image in normal style after subtracting the mean pixel value
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration, sorted by confidence
    '''

    height, width, channels = img_forward.shape
    result = []
    for cur_rectangle in rectangles:

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = img_forward[original_y1:original_y2, original_x1:original_x2] # crop image

        caffe_img_resized = cv2.resize(cropped_caffe_img, (12, 12))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_12_cal.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_12_cal.blobs['data'].data[...] = caffe_img_resized_CHW
        net_12_cal.forward()

        output = net_12_cal.blobs['prob'].data

        # output = net_12_cal.predict([cropped_caffe_img])   # predict through caffe

        prediction = output[0]      # scores for the 45 calibration classes (0~44)

        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]   # ndarray of indices where prediction is larger than threshold

        number_of_cals = len(indices)   # number of calibrations larger than threshold

        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue

        total_s_change = 0
        total_x_change = 0
        total_y_change = 0

        for current_cal in range(number_of_cals):       # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21

            if cal_label % 9 <= 2:       # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17

            if cal_label % 3 == 0:       # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:     # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17

        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals

        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))

        result.append(cur_result)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                        # reverse, so that it ranks from large to small
    return result

def detect_face_24c_net(net_24c, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param img_forward: image in normal style after subtracting the mean pixel value
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles whose 24-net confidence exceeds the threshold
    '''
    result = []
    for cur_rectangle in rectangles:

        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]

        cropped_caffe_img = img_forward[y1:y2, x1:x2]     # crop image

        caffe_img_resized = cv2.resize(cropped_caffe_img, (24, 24))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_24c.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_24c.blobs['data'].data[...] = caffe_img_resized_CHW
        net_24c.forward()

        prediction = net_24c.blobs['prob'].data

        confidence = prediction[0][1]

        if confidence > 0.9:    # originally 0.05
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    return result

def cal_face_24c_net(net_24_cal, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param img_forward: image in normal style after subtracting the mean pixel value
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration, sorted by confidence
    '''
    height, width, channels = img_forward.shape
    result = []
    for cur_rectangle in rectangles:

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = img_forward[original_y1:original_y2, original_x1:original_x2] # crop image

        caffe_img_resized = cv2.resize(cropped_caffe_img, (24, 24))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_24_cal.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_24_cal.blobs['data'].data[...] = caffe_img_resized_CHW
        net_24_cal.forward()

        output = net_24_cal.blobs['prob'].data

        prediction = output[0]      # scores for the 45 calibration classes (0~44)

        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]   # ndarray of indices where prediction is larger than threshold

        number_of_cals = len(indices)   # number of calibrations larger than threshold

        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue

        total_s_change = 0
        total_x_change = 0
        total_y_change = 0

        for current_cal in range(number_of_cals):       # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21

            if cal_label % 9 <= 2:       # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17

            if cal_label % 3 == 0:       # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:     # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17

        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals

        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))

        result.append(cur_result)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                        # reverse, so that it ranks from large to small

    return result

def detect_face_48c_net(net_48c, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param img_forward: image in normal style after subtracting the mean pixel value
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles whose 48-net confidence exceeds the threshold, sorted by confidence
    '''
    result = []
    for cur_rectangle in rectangles:

        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]

        cropped_caffe_img = img_forward[y1:y2, x1:x2]     # crop image

        caffe_img_resized = cv2.resize(cropped_caffe_img, (48, 48))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_48c.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_48c.blobs['data'].data[...] = caffe_img_resized_CHW
        net_48c.forward()

        prediction = net_48c.blobs['prob'].data

        confidence = prediction[0][1]

        if confidence > 0.95:   # originally 0.1
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles by confidence, largest first;
                                                                        # after sorting and global NMS, the best faces remain
    return result

def cal_face_48c_net(net_48_cal, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param img_forward: image in normal style after subtracting the mean pixel value
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = img_forward.shape
    result = []
    for cur_rectangle in rectangles:

        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = img_forward[original_y1:original_y2, original_x1:original_x2] # crop image
        caffe_img_resized = cv2.resize(cropped_caffe_img, (48, 48))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_48_cal.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_48_cal.blobs['data'].data[...] = caffe_img_resized_CHW
        net_48_cal.forward()

        output = net_48_cal.blobs['prob'].data

        prediction = output[0]      # scores for the 45 calibration classes (0~44)

        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]   # ndarray of indices where prediction is larger than threshold

        number_of_cals = len(indices)   # number of calibrations larger than threshold

        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue

        total_s_change = 0
        total_x_change = 0
        total_y_change = 0

        for current_cal in range(number_of_cals):       # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21

            if cal_label % 9 <= 2:       # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17

            if cal_label % 3 == 0:       # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:     # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17

        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals

        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - 1.1 * original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + 1.1 * original_h / s_change))

        result.append(cur_result)

    return result

def detect_faces_net(nets, img_forward, min_face_size, stride,
                 multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    Complete flow of face cascade detection
    :param nets: 6 nets as a tuple
    :param img_forward: image in normal style after subtracting mean pixel value
    :param min_face_size: minimum face size to detect (in pixels)
    :param stride: stride (in pixels)
    :param multiScale: whether to search multiple scales
    :param scale_factor: scale step for the image pyramid
    :param threshold: minimum score for a patch to pass to the next net
    :return: list of rectangles
    '''
    net_12c_full_conv = nets[0]
    net_12_cal = nets[1]
    net_24c = nets[2]
    net_24_cal = nets[3]
    net_48c = nets[4]
    net_48_cal = nets[5]

    rectangles = detect_face_12c_net(net_12c_full_conv, img_forward, min_face_size,
                                 stride, multiScale, scale_factor, threshold)  # detect faces
    rectangles = cal_face_12c_net(net_12_cal, img_forward, rectangles)      # calibration
    rectangles = localNMS(rectangles)      # apply local NMS
    rectangles = detect_face_24c_net(net_24c, img_forward, rectangles)
    rectangles = cal_face_24c_net(net_24_cal, img_forward, rectangles)      # calibration
    rectangles = localNMS(rectangles)      # apply local NMS
    rectangles = detect_face_48c_net(net_48c, img_forward, rectangles)
    rectangles = globalNMS(rectangles)      # apply global NMS
    rectangles = cal_face_48c_net(net_48_cal, img_forward, rectangles)      # calibration

    return rectangles
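localNMS and globalNMS in the pipeline above suppress overlapping candidates by intersection-over-union; a minimal greedy sketch of the idea (not the project's actual implementation), assuming each rectangle carries [x1, y1, x2, y2, confidence]:

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2, ...] boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(rectangles, overlap_threshold=0.3):
    """Greedy NMS: keep the highest-confidence box, drop boxes that overlap it."""
    boxes = sorted(rectangles, key=lambda r: r[4], reverse=True)
    kept = []
    for box in boxes:
        if all(iou(box, k) <= overlap_threshold for k in kept):
            kept.append(box)
    return kept

# two heavily overlapping candidates and one separate face
rects = [[0, 0, 100, 100, 0.9], [5, 5, 105, 105, 0.8], [200, 200, 300, 300, 0.7]]
print(len(nms(rects)))   # the 0.8 box is suppressed, leaving 2
```

The "local" variant restricts suppression to candidates from the same pyramid scale, while the "global" variant runs across all scales; both reduce to this greedy IoU filter.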


Face Detection (Part 3)

Published 2017-05-07 21:09:24 by HamTam12. Original (in Chinese): https://blog.csdn.net/sinat_14916279/article/details/71374059

3. OpenCV face detection
OpenCV's detector uses Haar-like feature extraction plus an AdaBoost cascade classifier, and lets you set the range of face sizes to search for. Recall is high, but false detections occur from time to time; when the minimum face size is set fairly large, detection is fast enough for real time. The resulting face boxes are not very precise.

See the sample program in tutorial_code\objectDetection under source\samples\cpp in the OpenCV install directory; the official documentation describes the relevant face detection functions.
Below are Python and C++ examples using OpenCV 3 that detect faces in a single image, a video, or a camera stream.
Python version:

# -*- coding: utf-8 -*-
"""
Created on Sun May 07 18:49:11 2017

@author: Administrator
"""
import cv2
import time

def face_detection(ccf, im):
    if len(im.shape) == 3 and im.shape[2] == 3:
        gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)  # the cascade classifier works on grayscale images
    else:
        gray = im
    faces = ccf.detectMultiScale(gray, 1.1, 3, 0, (64, 64), (256, 256))  # multi-scale detection; set the face size range
    for face in faces:  # face: (x, y, width, height)
        cv2.rectangle(im, (face[0], face[1]), (face[0] + face[2], face[1] + face[3]), (0, 0, 255), 2, 8)
    return im

def read_capture(ccf):
#    video = cv2.VideoCapture('xxx.mp4')  # read from a video file
    video = cv2.VideoCapture(0)  # read from the camera
    if video.isOpened():
        success, frame = video.read()
        while success:
            im = face_detection(ccf, frame)
            cv2.imshow('capture face detection', im)
            if cv2.waitKey(1) >= 0:
                break
            success, frame = video.read()
        cv2.destroyAllWindows()
        video.release()

if __name__ == '__main__':
    ccf = cv2.CascadeClassifier('D:/OPENCV3.0.0/opencv/sources/data/haarcascades/haarcascade_frontalface_alt2.xml')  # load the cascade face detector
    start_time = time.time()
    # detect faces in a single image
    im = cv2.imread('C:/Users/Administrator/Desktop/caffe/matlab/demo/1.jpg')
    output = face_detection(ccf, im)

    end_time = time.time()
    print((end_time - start_time) * 1000, 'ms')
    cv2.imshow('face detection', output)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

#    read_capture(ccf)  # detect faces in a video or camera stream
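The scaleFactor argument (1.1 above) controls how finely detectMultiScale samples the image pyramid between minSize and maxSize; a back-of-the-envelope estimate of the number of pyramid levels it visits (my own sketch, not an OpenCV API):

```python
import math

def pyramid_levels(min_size, max_size, scale_factor=1.1):
    """Estimate how many pyramid scales a cascade detector visits when
    searching for faces between min_size and max_size pixels."""
    return int(math.log(max_size / min_size, scale_factor)) + 1

# with the settings above: faces from 64 to 256 pixels, scaleFactor 1.1
print(pyramid_levels(64, 256, 1.1))   # 15
print(pyramid_levels(64, 256, 1.2))   # 8
```

A smaller scaleFactor means more levels and better recall, at a roughly proportional cost in speed; narrowing the minSize/maxSize range is the cheapest way to speed up detection.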

C++ version:

#include <iostream>
#include <opencv2/opencv.hpp>
#include <time.h>
using namespace cv;
using namespace std;

Mat faceDetection(CascadeClassifier ccf, Mat img_src);
void readCapture(CascadeClassifier ccf);
void readVideo(CascadeClassifier ccf);

int main()
{
    //string xmlPath = "D:\\OPENCV3.0.0\\opencv\\sources\\data\\haarcascades\\haarcascade_frontalface_default.xml"; // pretrained cascade xml file
    string xmlPath = "D:\\OPENCV3.0.0\\opencv\\sources\\data\\haarcascades\\haarcascade_frontalface_alt2.xml"; // pretrained cascade xml file (improved version)
    CascadeClassifier ccf; // the cascade classifier, the central class here
    if (!ccf.load(xmlPath))
    {
        cout << "can not open this xml file" << endl;
        system("pause");
        return -1;
    }

    //readCapture(ccf); // detect faces from the camera
    //readVideo(ccf);   // detect faces in a video
    // detect faces in a single image
    Mat img_src = imread("1.jpg");
    Mat img_dst = faceDetection(ccf, img_src);
    imshow("face detection", img_dst);
    waitKey(0);
    return 0;
}

// OpenCV face detection: extract Haar or LBP features, feed them to an AdaBoost cascade classifier
// takes the input image, returns the image with boxes/circles drawn
Mat faceDetection(CascadeClassifier ccf, Mat img_src)
{
    clock_t start, end; // for millisecond-level timing
    start = clock();

    Mat img_dst = img_src.clone(); // copy the input image as the output image

    Mat gray;
    cvtColor(img_src, gray, COLOR_BGR2GRAY); // in the OpenCV 3 naming convention, CV_BGR2GRAY became COLOR_BGR2GRAY
    equalizeHist(gray, gray);

    /*
    void CascadeClassifier::detectMultiScale(InputArray image, vector<Rect>& objects, double scaleFactor=1.1,
    int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size())
    image:        the image to detect on, usually grayscale
    objects:      vector of Rect holding all detected faces, one rectangle per face
    scaleFactor:  pyramid scaling factor, default 1.1
    minNeighbors: minimum number of neighboring detections, default 3
    flags:        legacy parameter, unused in OpenCV 3; default 0
    minSize:      minimum size of a detected face
    maxSize:      maximum size of a detected face
    */
    vector<Rect> faces; // detected face boxes
    ccf.detectMultiScale(gray, faces, 1.1, 3, 0, Size(64 + 32, 64 + 32), Size(256 + 128, 256 + 128)); // the core cascade function: multi-scale detection within a face size range

    end = clock();
    cout << (double)(end - start) / CLOCKS_PER_SEC << endl;

    for (auto face : faces)
    {
        rectangle(img_dst, face, Scalar(0, 0, 255), 2, 8); // draw a rectangle

        // draw a circle instead
        //Point center;
        //center.x = face.x + face.width * 0.5;
        //center.y = face.y + face.height * 0.5;
        //int radius = (face.width + face.height) * 0.25;
        //circle(img_dst, center, radius, Scalar(0, 0, 255), 2, 8);
    }
    return img_dst;
}

// read from the camera
void readCapture(CascadeClassifier ccf)
{
    VideoCapture capture(0);
    if (!capture.isOpened())
    {
        cout << "can not open capture..." << endl;
        return;
    }
    int width = (int)capture.get(CAP_PROP_FRAME_WIDTH); // OpenCV 3 dropped the CV_ prefix: CV_CAP_PROP_FRAME_WIDTH became CAP_PROP_FRAME_WIDTH
    int height = (int)capture.get(CAP_PROP_FRAME_HEIGHT);
    cout << "width: " << width << "," << "height: " << height << endl;
    Mat img;
    while (1)
    {
        Mat frame;
        capture >> frame;
        if (frame.empty())
            break;

        img = faceDetection(ccf, frame);

        imshow("capture face detection", img);
        if (waitKey(1) >= 0) break;
    }
}
// read from a video file
void readVideo(CascadeClassifier ccf)
{
    VideoCapture capture("test.avi");
    int width = (int)capture.get(CAP_PROP_FRAME_WIDTH);
    int height = (int)capture.get(CAP_PROP_FRAME_HEIGHT);
    cout << "width: " << width << "," << "height: " << height << endl;
    Mat img;
    while (1)
    {
        Mat frame;
        capture >> frame;
        if (frame.empty())
            break;

        img = faceDetection(ccf, frame); // face detection

        imshow("video face detection", img);
        if (waitKey(1) >= 0) break;
    }
}

4. Dlib face detection

Dlib's face detector uses the classic histogram-of-oriented-gradients (HOG) features plus a linear classifier, combined with an image pyramid and a sliding window. It is very fast, but out of the box it only detects faces of roughly 80x80 pixels or larger; smaller faces can still be found by upsampling the image first. Compared with OpenCV's detector it gives higher recall, fewer false detections, and more accurate face boxes, at a comparably fast speed.

The dlib website provides face_detection sample code in both C++ and Python; the code is very compact and simply calls the provided interface. The Python code below uses dlib to detect faces in a single image, a video, or a camera stream.

# coding: utf-8
# face detection by dlib

#   This face detector is made using the now classic Histogram of Oriented
#   Gradients (HOG) feature combined with a linear classifier, an image
#   pyramid, and sliding window detection scheme.  This type of object detector
#   is fairly general and capable of detecting many types of semi-rigid objects
#   in addition to human faces.  Therefore, if you are interested in making
#   your own object detectors then read the train_object_detector.py example
#   program.
import dlib
import cv2
import time

def read_capture(detector):
#    video = cv2.VideoCapture('xxx.mp4')  # read from a video file
    video = cv2.VideoCapture(0)  # read from the camera
    if video.isOpened():
        success, frame = video.read()
        while success:
            im = frame.copy()
            # dlib's detector only finds faces of 80x80 or larger; to detect smaller
            # faces, upsample the image first -- each upsampling doubles the image size
            #rects = detector(im, 1)  # upsample once
            rects = detector(im, 0)
            for rect in rects:  # rect.left(), rect.top(), rect.right(), rect.bottom()
                cv2.rectangle(im, (rect.left(), rect.top()), (rect.right(), rect.bottom()), (0, 0, 255), 2, 8)
            cv2.imshow('capture face detection', im)
            if cv2.waitKey(1) >= 0:
                break
            success, frame = video.read()
        cv2.destroyAllWindows()
        video.release()

if __name__ == '__main__':
    detector = dlib.get_frontal_face_detector()  # get dlib's frontal face detector
    start_time = time.time()
    # detect faces in a single image
    im = cv2.imread('C:/Users/Administrator/Desktop/caffe/matlab/demo/1.jpg')
    # dlib's detector only finds faces of 80x80 or larger; to detect smaller
    # faces, upsample the image first -- each upsampling doubles the image size
    #rects = detector(im, 1)  # upsample once
    #rects = detector(im, 0)  # no upsampling
    rects = detector(im, 2)
    for rect in rects:  # rect.left(), rect.top(), rect.right(), rect.bottom()
        cv2.rectangle(im, (rect.left(), rect.top()), (rect.right(), rect.bottom()), (0, 0, 255), 2, 8)
    end_time = time.time()
    print((end_time - start_time) * 1000, 'ms')
    cv2.imshow('dlib face detection', im)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    #read_capture(detector)  # detect faces in a video or camera stream
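Since each upsampling step doubles the image, the smallest face the detector can find shrinks by half per step (the 80-pixel base size is the detector window mentioned above); a quick sketch of that relationship:

```python
def min_detectable_face(upsample_times, base_size=80):
    """Smallest face (in pixels of the original image) a fixed-window
    detector can find after upsampling the image n times."""
    return base_size / (2 ** upsample_times)

for n in range(3):
    print(n, min_detectable_face(n))   # 80.0, 40.0, 20.0
```

Note that each upsampling step also quadruples the pixel count, so detection time grows roughly fourfold per step; detector(im, 2) above trades speed for finding faces down to about 20 pixels.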

5. libfacedetect face detection

libfacedetect is a free face detection library from Shiqi Yu of Shenzhen University, released in binary form and free to use without restriction, whether for research or commercial purposes. The library is extremely simple to call: a single function, compiled from pure C, with no third-party dependencies. After continuous optimization, its detector can process a VGA image in 3.6 ms (minimum face size 48x48), beating OpenCV in detection speed, quality, and box accuracy.
libfacedetect provides four interfaces: frontal, frontal_surveillance, multiview, and multiview_reinforce. multiview_reinforce works best and can detect side-view faces; it is slightly slower than the other interfaces but still faster than OpenCV's frontal detection. In short, it is very effective and very easy to use.
In addition, like dlib, libfacedetect also provides facial landmark detection, and it still works well on side-view faces. See the GitHub page for details.

/*
The MIT License (MIT)

Copyright (c) 2015-2017 Shiqi Yu
shiqi.yu@gmail.com

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
*/

#include <stdio.h>
#include <opencv2/opencv.hpp>
#include "facedetect-dll.h"

//#pragma comment(lib,"libfacedetect.lib")
#pragma comment(lib,"libfacedetect-x64.lib")

//define the buffer size. Do not change the size!
#define DETECT_BUFFER_SIZE 0x20000
using namespace cv;

int main()
{
    //load an image and convert it to gray (single-channel)
    Mat image = imread("C:\\Users\\Administrator\\Desktop\\faces\\39649a2f070828388e658cb3ba99a9014d08f1cc.jpg");
    Mat gray;
    cvtColor(image, gray, CV_BGR2GRAY);


    int * pResults = NULL;
    //pBuffer is used in the detection functions.
    //If you call functions in multiple threads, please create one buffer for each thread!
    unsigned char * pBuffer = (unsigned char *)malloc(DETECT_BUFFER_SIZE);
    if (!pBuffer)
    {
        fprintf(stderr, "Can not alloc buffer.\n");
        return -1;
    }

    int doLandmark = 1;

    ///////////////////////////////////////////
    // frontal face detection / 68 landmark detection
    // it's fast, but cannot detect side view faces
    //////////////////////////////////////////
    //!!! The input image must be a gray one (single-channel)
    //!!! DO NOT RELEASE pResults !!!
    pResults = facedetect_frontal(pBuffer, (unsigned char*)(gray.ptr(0)), gray.cols, gray.rows, (int)gray.step,
        1.2f, 2, 48, 0, doLandmark);

    printf("%d faces detected.\n", (pResults ? *pResults : 0));
    Mat result_frontal = image.clone();
    //print the detection results
    for (int i = 0; i < (pResults ? *pResults : 0); i++)
    {
        short * p = ((short*)(pResults + 1)) + 142 * i;
        int x = p[0];
        int y = p[1];
        int w = p[2];
        int h = p[3];
        int neighbors = p[4];
        int angle = p[5];

        printf("face_rect=[%d, %d, %d, %d], neighbors=%d, angle=%d\n", x, y, w, h, neighbors, angle);
        rectangle(result_frontal, Rect(x, y, w, h), Scalar(0, 255, 0), 2);
        if (doLandmark)
        {
            for (int j = 0; j < 68; j++)
                circle(result_frontal, Point((int)p[6 + 2 * j], (int)p[6 + 2 * j + 1]), 1, Scalar(0, 255, 0));
        }
    }
    imshow("Results_frontal", result_frontal);

    /////////////////////////////////////////////
    //// frontal face detection designed for video surveillance / 68 landmark detection
    //// it can detect faces with bad illumination.
    ////////////////////////////////////////////
    ////!!! The input image must be a gray one (single-channel)
    ////!!! DO NOT RELEASE pResults !!!
    //pResults = facedetect_frontal_surveillance(pBuffer, (unsigned char*)(gray.ptr(0)), gray.cols, gray.rows, (int)gray.step,
    //  1.2f, 2, 48, 0, doLandmark);
    //printf("%d faces detected.\n", (pResults ? *pResults : 0));
    //Mat result_frontal_surveillance = image.clone();;
    ////print the detection results
    //for (int i = 0; i < (pResults ? *pResults : 0); i++)
    //{
    //  short * p = ((short*)(pResults + 1)) + 142 * i;
    //  int x = p[0];
    //  int y = p[1];
    //  int w = p[2];
    //  int h = p[3];
    //  int neighbors = p[4];
    //  int angle = p[5];

    //  printf("face_rect=[%d, %d, %d, %d], neighbors=%d, angle=%d\n", x, y, w, h, neighbors, angle);
    //  rectangle(result_frontal_surveillance, Rect(x, y, w, h), Scalar(0, 255, 0), 2);
    //  if (doLandmark)
    //  {
    //      for (int j = 0; j < 68; j++)
    //          circle(result_frontal_surveillance, Point((int)p[6 + 2 * j], (int)p[6 + 2 * j + 1]), 1, Scalar(0, 255, 0));
    //  }
    //}
    //imshow("Results_frontal_surveillance", result_frontal_surveillance);


    /////////////////////////////////////////////
    //// multiview face detection / 68 landmark detection
    //// it can detect side view faces, but slower than facedetect_frontal().
    ////////////////////////////////////////////
    ////!!! The input image must be a gray one (single-channel)
    ////!!! DO NOT RELEASE pResults !!!
    //pResults = facedetect_multiview(pBuffer, (unsigned char*)(gray.ptr(0)), gray.cols, gray.rows, (int)gray.step,
    //  1.2f, 2, 48, 0, doLandmark);

    //printf("%d faces detected.\n", (pResults ? *pResults : 0));
    //Mat result_multiview = image.clone();;
    ////print the detection results
    //for (int i = 0; i < (pResults ? *pResults : 0); i++)
    //{
    //  short * p = ((short*)(pResults + 1)) + 142 * i;
    //  int x = p[0];
    //  int y = p[1];
    //  int w = p[2];
    //  int h = p[3];
    //  int neighbors = p[4];
    //  int angle = p[5];

    //  printf("face_rect=[%d, %d, %d, %d], neighbors=%d, angle=%d\n", x, y, w, h, neighbors, angle);
    //  rectangle(result_multiview, Rect(x, y, w, h), Scalar(0, 255, 0), 2);
    //  if (doLandmark)
    //  {
    //      for (int j = 0; j < 68; j++)
    //          circle(result_multiview, Point((int)p[6 + 2 * j], (int)p[6 + 2 * j + 1]), 1, Scalar(0, 255, 0));
    //  }
    //}
    //imshow("Results_multiview", result_multiview);


    /////////////////////////////////////////////
    //// reinforced multiview face detection / 68 landmark detection
    //// it can detect side view faces, better but slower than facedetect_multiview().
    ////////////////////////////////////////////
    ////!!! The input image must be a gray one (single-channel)
    ////!!! DO NOT RELEASE pResults !!!
    //pResults = facedetect_multiview_reinforce(pBuffer, (unsigned char*)(gray.ptr(0)), gray.cols, gray.rows, (int)gray.step,
    //  1.2f, 3, 48, 0, doLandmark);

    //printf("%d faces detected.\n", (pResults ? *pResults : 0));
    //Mat result_multiview_reinforce = image.clone();;
    ////print the detection results
    //for (int i = 0; i < (pResults ? *pResults : 0); i++)
    //{
    //  short * p = ((short*)(pResults + 1)) + 142 * i;
    //  int x = p[0];
    //  int y = p[1];
    //  int w = p[2];
    //  int h = p[3];
    //  int neighbors = p[4];
    //  int angle = p[5];

    //  printf("face_rect=[%d, %d, %d, %d], neighbors=%d, angle=%d\n", x, y, w, h, neighbors, angle);
    //  rectangle(result_multiview_reinforce, Rect(x, y, w, h), Scalar(0, 255, 0), 2);
    //  if (doLandmark)
    //  {
    //      for (int j = 0; j < 68; j++)
    //          circle(result_multiview_reinforce, Point((int)p[6 + 2 * j], (int)p[6 + 2 * j + 1]), 1, Scalar(0, 255, 0));
    //  }
    //}
    //imshow("Results_multiview_reinforce", result_multiview_reinforce);

    waitKey();

    //release the buffer
    free(pBuffer);

    return 0;
}
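The pointer arithmetic above implies a flat layout of 142 shorts per face: x, y, w, h, neighbors, angle, then 68 (x, y) landmark pairs. As a sanity check, the same layout can be unpacked in Python (a sketch of the layout only, not a binding to the actual DLL):

```python
SHORTS_PER_FACE = 6 + 68 * 2  # x, y, w, h, neighbors, angle + 68 (x, y) landmarks

def parse_results(shorts, n_faces):
    """Unpack a flat list of shorts (the data following the face count)
    into per-face dicts mirroring the C result layout."""
    faces = []
    for i in range(n_faces):
        p = shorts[i * SHORTS_PER_FACE:(i + 1) * SHORTS_PER_FACE]
        faces.append({
            'rect': tuple(p[0:4]),   # x, y, w, h
            'neighbors': p[4],
            'angle': p[5],
            'landmarks': [(p[6 + 2 * j], p[6 + 2 * j + 1]) for j in range(68)],
        })
    return faces

# one fake face: rect (10, 20, 64, 64), 5 neighbors, angle 0, landmarks all at (0, 0)
buf = [10, 20, 64, 64, 5, 0] + [0] * (68 * 2)
print(parse_results(buf, 1)[0]['rect'])   # (10, 20, 64, 64)
```

This is also why pBuffer must not be released between calls: the rectangles and landmarks all live inside that single detection buffer.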

6. Seetaface face detection

SeetaFace was developed by the face recognition group led by Shiguang Shan at the Institute of Computing Technology, Chinese Academy of Sciences. The code is implemented in C++ with no third-party dependencies and is released under the BSD-2 license, free for both academia and industry. The SeetaFace engine contains the three core modules needed to build a fully automatic face recognition system: face detection (SeetaFace Detection), facial landmark localization (SeetaFace Alignment), and face feature extraction and matching (SeetaFace Identification). It is a complete, fully open-source baseline face recognition system. See the GitHub repository.
SeetaFace Detection combines traditional hand-crafted features with multilayer perceptrons (MLPs) in a cascade structure. It reaches 84.4% recall on FDDB (at 100 false positives) and processes VGA images in real time on a single i7 CPU.
The detector is built around a funnel-structured cascade (FuSt) designed specifically for multi-pose face detection, following a coarse-to-fine design philosophy that balances speed and accuracy.
As shown in Figure 1, the top of the FuSt cascade consists of several fast LAB cascade classifiers, each targeting a different pose, followed by several MLP cascades based on SURF features, and finally a single unified MLP cascade (also on SURF features) that handles candidate windows of all poses; overall the structure is wide at the top and narrow at the bottom, like a funnel. Going down, the classifiers and their features become progressively more complex, so face windows are kept while non-face candidates that are increasingly hard to distinguish from faces are rejected.

I compiled and tested it under Ubuntu. The FaceDetection module builds without much trouble; FaceAlign and FaceIdentification have a few small pitfalls, but following the two blog posts referenced in the original, the test results are quite good.
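The funnel structure can be sketched as several pose-specific coarse stages feeding one shared fine stage; the classifiers below are placeholder scoring functions, not SeetaFace's actual LAB/SURF models:

```python
def funnel_cascade(windows, coarse_stages, fine_stage, threshold=0.5):
    """FuSt-style filtering: a window survives the wide top if any
    pose-specific coarse classifier accepts it; one unified fine
    classifier then scores the survivors."""
    survivors = [w for w in windows
                 if any(stage(w) > threshold for stage in coarse_stages)]
    return [w for w in survivors if fine_stage(w) > threshold]  # narrow bottom

# toy example: windows are (x, y, size, frontal_score, profile_score)
windows = [(0, 0, 48, 0.9, 0.1), (10, 10, 48, 0.2, 0.8), (20, 20, 48, 0.1, 0.1)]
frontal = lambda w: w[3]
profile = lambda w: w[4]
fine = lambda w: max(w[3], w[4])   # the unified stage sees all poses
print(len(funnel_cascade(windows, [frontal, profile], fine)))   # 2
```

The point of the shape is cost: the cheap pose-specific stages reject most windows early, so the expensive unified stage only ever sees a small pool of candidates.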


Summary

Comparing the six face detection methods above: OpenCV, dlib, libfacedetect, and Seetaface all provide detection interfaces that can be called directly. In my experience, detection quality ranks roughly dlib ≈ libfacedetect ≈ seetaface > opencv, and for speed, libfacedetect > dlib ≈ seetaface ≈ opencv. The first two methods use deep learning: a single-CNN detector is simple to implement but mediocre in quality and slow, while the cascade-CNN approach comes close to dlib in both quality and speed, though it may be more troublesome to train. For typical application needs, all five methods other than the single CNN are adequate; libfacedetect offers the best real-time performance, and Seetaface's advantage is that it comes with a complete face recognition pipeline.
1. Single-CNN face detection
2. Cascade-CNN face detection
3. OpenCV face detection
4. Dlib face detection
5. libfacedetect face detection
6. Seetaface face detection
Beyond these, there are many other excellent algorithms. In China, Face++ has been very successful in this area and now offers SDKs for face detection, comparison, search, attributes, and landmarks (apparently paid, but very well done); Baidu has done similar work, as has practically every major company, so I will not list them all. Future posts will cover facial landmark detection (face alignment) and the applications built on top of it.
