(2) Hybrid Edge AI Face Detection

寒冰屋

Last modified 2022-02-28 21:47:00

First published 2022-02-27 18:20:25

Original article: https://www.codeproject.com/Articles/5306636/Hybrid-Edge-AI-Face-Detection

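Introduction

Face recognition is an area of artificial intelligence (AI) in which deep learning (DL) has had great success over the past decade. The best face recognition systems can recognize people in images and video as accurately as humans can, or even better. The two main base stages of face recognition are person verification and identification.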

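In the first (current) half of this article series, we will:

Discuss the existing AI face detection methods and develop a program to run a pretrained DNN model
Consider face alignment and implement some alignment algorithms using face landmarks
Run the face detection DNN on a Raspberry Pi device, explore its performance, and consider possible ways to run it faster and to detect faces in real time
Create a simple face database and fill it with faces extracted from images or videos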

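We assume that you are familiar with DNNs, Python, Keras, and TensorFlow.

In the previous article, we discussed the principles of face detection and face recognition. In this one, we'll look at specific face detection methods and implement one of them.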

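Face Detection Methods

Face detection is the first stage of any face recognition process. It is a critical step that affects all the subsequent ones, so it requires a robust approach that minimizes detection error. There are many face detection methods; we'll focus on the AI-based ones.

We'd like to mention the following modern face detection methods: Max-Margin Object Detection (MMOD), Single-Shot Detector (SSD), Multi-task Cascaded Convolutional Networks (MTCNN), and You Only Look Once (YOLO).

The MMOD model requires too many resources to run on an edge device. The fastest DNN is YOLO; it provides fairly good precision when detecting faces in real-world video. The most precise of the methods above is SSD. Its processing speed is sufficient for use on low-power devices.

The main drawback of the YOLO and SSD methods is that they provide no information about facial landmarks. As we'll see later, this information is important for face alignment.

MTCNN provides good precision and also finds facial landmarks. It is lightweight enough to run on resource-constrained edge devices.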
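
To illustrate why landmarks matter for alignment, here is a minimal sketch that estimates the in-plane tilt of a face from its two eye points. The function name eye_roll_angle and the sample coordinates are assumptions made for this illustration; they are not part of MTCNN or of the alignment code developed later in the series.

```python
import math

def eye_roll_angle(left_eye, right_eye):
    """Return the angle, in degrees, of the line through the two eyes.

    Points are (x, y) pixel coordinates with y growing downward,
    as in image coordinates. 0 means the eyes are level.
    """
    (lx, ly) = left_eye
    (rx, ry) = right_eye
    return math.degrees(math.atan2(ry - ly, rx - lx))

# Hypothetical landmark positions, for illustration only
level = eye_roll_angle((100, 120), (160, 120))   # eyes on one horizontal line
tilted = eye_roll_angle((100, 120), (160, 180))  # second eye is lower in the image
print(level, tilted)
```

A detector that returns only a bounding box gives no way to estimate this angle, which is one reason a landmark-aware detector is convenient when alignment is planned.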

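MTCNN Detector

In this series, we'll use a free Keras implementation of the MTCNN detector. You can install this library into your Python environment with the standard pip command (pip install mtcnn). It requires OpenCV 4.1 and TensorFlow 2.0 (or later versions).

You can test whether MTCNN was installed successfully by running this simple Python code: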

import mtcnn

print(mtcnn.__version__)

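The output should show the version of the installed library: 0.1.0.

With the library installed, we can write the MTCNN-based code for a simple face detector: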
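
The OpenCV and TensorFlow requirements can be checked in a similar way. The sketch below uses the standard importlib.metadata module (Python 3.8 or later); the helper name installed_version is an assumption for this example, and opencv-python is only one of several PyPI distributions that provide OpenCV, so a None result for it does not necessarily mean OpenCV is missing.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a pip distribution, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# PyPI distribution names; adjust if a different OpenCV build is installed
for name in ("mtcnn", "opencv-python", "tensorflow"):
    print(name, installed_version(name))
```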

import os
import time
import numpy as np
import copy
import mtcnn
from mtcnn import MTCNN
import cv2

class MTCNN_Detector:    
    def __init__(self, min_size, min_confidence):
        self.min_size = min_size
        self.f_detector = MTCNN(min_face_size=min_size)
        self.min_confidence = min_confidence
    
    def detect(self, frame):
        faces = self.f_detector.detect_faces(frame)
        
        detected = []
        for (i, face) in enumerate(faces):
            f_conf = face['confidence']
            if f_conf>=self.min_confidence:
                detected.append(face)
        
        return detected
    
    def extract(self, frame, face):
        (x1, y1, w, h) =  face['box']
        (l_eye, r_eye, nose, mouth_l, mouth_r) = Utils.get_keypoints(face)
        
        f_cropped = copy.deepcopy(face)
        move = (-x1, -y1)
        l_eye = Utils.move_point(l_eye, move)
        r_eye = Utils.move_point(r_eye, move)
        nose = Utils.move_point(nose, move)
        mouth_l = Utils.move_point(mouth_l, move)
        mouth_r = Utils.move_point(mouth_r, move)
            
        f_cropped['box'] = (0, 0, w, h)
        f_img = frame[y1:y1+h, x1:x1+w].copy()
            
        f_cropped = Utils.set_keypoints(f_cropped, (l_eye, r_eye, nose, mouth_l, mouth_r))
        
        return (f_cropped, f_img)

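The detector class has a constructor with two parameters: min_size, the minimum size of a face in pixels, and min_confidence, the minimum confidence required to accept a detected object as a face. The detect method uses the internal MTCNN detector to get the faces in a frame and then keeps only the detected objects whose confidence is at least min_confidence. The last method, extract, crops the face image out of the frame and shifts the keypoints into the coordinates of the cropped image.

We also need the following Utils class: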
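
The effect of the min_confidence threshold can be seen in isolation with a few mock detections. This is only a sketch: filter_faces is a hypothetical stand-alone version of the filtering loop in detect, and the sample values are invented rather than produced by MTCNN.

```python
def filter_faces(faces, min_confidence):
    """Keep only the detections whose confidence reaches the threshold."""
    return [f for f in faces if f['confidence'] >= min_confidence]

# Hypothetical detections (values invented for illustration)
sample = [
    {'box': (10, 20, 50, 60), 'confidence': 0.99},
    {'box': (200, 40, 35, 45), 'confidence': 0.72},
    {'box': (5, 5, 31, 33), 'confidence': 0.48},
]

print(len(filter_faces(sample, 0.5)))   # prints 2
print(len(filter_faces(sample, 0.95)))  # prints 1
```

A low threshold such as 0.5 keeps more candidates at the cost of more false positives; a strict one such as 0.95 keeps only the detections the network is most sure about.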

class Utils:    
    @staticmethod
    def draw_face(face, color, frame, draw_points=True, draw_rect=True, n_data=None):
        (x1, y1, w, h) =  face['box']
        confidence = face['confidence']
        x2 = x1+w
        y2 = y1+h
        if draw_rect:
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, 1)
        y3 = y1-12
        if not (n_data is None):
            (name, conf) = n_data
            text = name+ (" %.3f" % conf)
        else:
            text = "%.3f" % confidence
        
        cv2.putText(frame, text, (x1, y3), cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 1, cv2.LINE_AA)
        if draw_points:
            (l_eye, r_eye, nose, mouth_l, mouth_r) = Utils.get_keypoints(face)
            Utils.draw_point(l_eye, color, frame)
            Utils.draw_point(r_eye, color, frame)
            Utils.draw_point(nose, color, frame)
            Utils.draw_point(mouth_l, color, frame)
            Utils.draw_point(mouth_r, color, frame)
        
    @staticmethod
    def get_keypoints(face):
        keypoints = face['keypoints']
        l_eye = keypoints['left_eye']
        r_eye = keypoints['right_eye']
        nose = keypoints['nose']
        mouth_l = keypoints['mouth_left']
        mouth_r = keypoints['mouth_right']
        return (l_eye, r_eye, nose, mouth_l, mouth_r)
    
    @staticmethod
    def set_keypoints(face, points):
        (l_eye, r_eye, nose, mouth_l, mouth_r) = points
        keypoints = face['keypoints']
        keypoints['left_eye'] = l_eye
        keypoints['right_eye'] = r_eye
        keypoints['nose'] = nose
        keypoints['mouth_left'] = mouth_l
        keypoints['mouth_right'] = mouth_r
        
        return face
        
    @staticmethod
    def move_point(point, move):
        (x, y) = point
        (dx, dy) = move
        res = (x+dx, y+dy)
        return res
        
    @staticmethod
    def draw_point(point, color, frame):
        (x, y) =  point
        x1 = x-1
        y1 = y-1
        x2 = x+1
        y2 = y+1
        cv2.rectangle(frame, (x1, y1), (x2, y2), color, 1)
        
    @staticmethod
    def draw_faces(faces, color, frame, draw_points=True, draw_rect=True, names=None):
        for (i, face) in enumerate(faces):
            n_data = None
            if not (names is None):
                n_data = names[i]
            Utils.draw_face(face, color, frame, draw_points, draw_rect, n_data)

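In the MTCNN detector's output, each face object is a dictionary with the following keys: box, confidence, and keypoints. The keypoints item is itself a dictionary containing the facial landmark data: left_eye, right_eye, nose, mouth_left, and mouth_right. The Utils class provides simple access to the face data and implements several functions for manipulating that data and for drawing bounding boxes around the faces in an image.

Face Detection in Images

Now we can write the Python code to detect faces in an image: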
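
To make the structure concrete, here is a hand-written face object in that format, together with a compact sketch of the coordinate shift that the extract method performs. The numeric values are invented, and to_crop_coords is a hypothetical helper that condenses extract; it is not part of the mtcnn library.

```python
import copy

# Hypothetical detection in the format described above (values invented)
face = {
    'box': (120, 80, 60, 70),  # x, y, width, height
    'confidence': 0.998,
    'keypoints': {
        'left_eye': (138, 105),
        'right_eye': (163, 104),
        'nose': (151, 120),
        'mouth_left': (140, 135),
        'mouth_right': (162, 134),
    },
}

def to_crop_coords(face):
    """Return a copy of the face with coordinates relative to its own box."""
    (x, y, w, h) = face['box']
    shifted = copy.deepcopy(face)  # leave the original detection untouched
    shifted['box'] = (0, 0, w, h)
    shifted['keypoints'] = {
        name: (px - x, py - y)
        for (name, (px, py)) in face['keypoints'].items()
    }
    return shifted

cropped = to_crop_coords(face)
print(cropped['keypoints']['left_eye'])  # prints (18, 25)
```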

d = MTCNN_Detector(30, 0.5)
print("Detector loaded.")

f_file = r"C:\PI_FR\frames\frame_5_02.png"
fimg = cv2.imread(f_file)

faces = d.detect(fimg)

for face in faces:
    print(face)

Utils.draw_faces(faces, (0, 0, 255), fimg, True, True)

res_path = r"C:\PI_FR\detect"
f_base = os.path.basename(f_file)
r_file = os.path.join(res_path, f_base+"_detected.png")
cv2.imwrite(r_file, fimg)

for (i, face) in enumerate(faces):
    (f_cropped, f_img) = d.extract(fimg, face)
    Utils.draw_faces([f_cropped], (255, 0, 0), f_img, True, False)
    dfname = os.path.join(res_path, f_base + ("_%06d" % i) + ".png")
    cv2.imwrite(dfname, f_img)

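Running the code above produces an annotated image in the detect folder.

The detector found all three faces with high confidence, about 99%. We also get the cropped faces in the same directory.

Running the same code on different frames lets us test detection in a variety of cases. Here are the results for two frames.

The results show that the detector is able to find faces with glasses and that it successfully detects a baby's face.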
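
Note that the code above appends its suffix to the full base name, so the output files are named like frame_5_02.png_detected.png. If cleaner names are preferred, one option is to strip the extension first. The helper result_name below is a hypothetical addition, not part of the original code.

```python
import os

def result_name(src_path, suffix, ext=".png"):
    """Build '<stem><suffix><ext>' from a source path, dropping its extension."""
    stem = os.path.splitext(os.path.basename(src_path))[0]
    return stem + suffix + ext

print(result_name("frames/frame_5_02.png", "_detected"))  # frame_5_02_detected.png
print(result_name("frames/frame_5_02.png", "_%06d" % 0))  # frame_5_02_000000.png
```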

Face Detection in Video

Having tested the detector on separate images, let's now write the code for detecting faces in a video:

class VideoFD:    
    def __init__(self, detector):
        self.detector = detector
    
    def detect(self, video, save_path = None, align = False, draw_points = False):
        detection_num = 0
        capture = cv2.VideoCapture(video)
        img = None

        dname = 'AI face detection'
        cv2.namedWindow(dname, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(dname, 960, 720)
        
        frame_count = 0
        dt = 0
        face_num = 0
        # Capture all frames
        while True:
            (ret, frame) = capture.read()
            if frame is None:
                break
            frame_count = frame_count+1
            
            t1 = time.time()
            faces = self.detector.detect(frame)
            t2 = time.time()
            p_count = len(faces)
            detection_num += p_count
            dt = dt + (t2-t1)
            
            if (not (save_path is None)) and (len(faces)>0) :
                f_base = os.path.basename(video)
                for (i, face) in enumerate(faces):
                    (f_cropped, f_img) = self.detector.extract(frame, face)
                    if (not (f_img is None)) and (not f_img.size==0):
                        if draw_points:
                            Utils.draw_faces([f_cropped], (255, 0, 0), f_img, draw_points, False)
                        face_num = face_num+1
                        dfname = os.path.join(save_path, f_base + ("_%06d" % face_num) + ".png") 
                        cv2.imwrite(dfname, f_img)
            
            if len(faces)>0:
                Utils.draw_faces(faces, (0, 0, 255), frame)
            
            # Display the resulting frame
            cv2.imshow(dname,frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
            
        capture.release()
        cv2.destroyAllWindows()    
        
        # Guard against division by zero when no frame could be read
        fps = frame_count/dt if dt > 0 else 0.0
        
        return (detection_num, fps)

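The VideoFD class simply wraps our MTCNN detector implementation and feeds it the frames extracted from a video file. It uses the VideoCapture class from the OpenCV library.

We can launch the video detector with the following code: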
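
The frame-reading part of this class can also be written as a small generator, which separates capture handling from detection. The sketch below is an alternative arrangement, not the article's implementation: iter_frames accepts any object that behaves like cv2.VideoCapture (read() returning an (ok, frame) pair, plus release()), and FakeCapture is a hypothetical stand-in so that the example runs without a video file.

```python
def iter_frames(capture):
    """Yield frames from a capture object until it runs out, then release it."""
    try:
        while True:
            (ok, frame) = capture.read()
            if not ok or frame is None:
                break
            yield frame
    finally:
        # Runs on normal exhaustion and when the consumer stops early
        capture.release()

class FakeCapture:
    """Stand-in for cv2.VideoCapture, for illustration only."""
    def __init__(self, frames):
        self.frames = list(frames)
        self.released = False

    def read(self):
        if self.frames:
            return (True, self.frames.pop(0))
        return (False, None)

    def release(self):
        self.released = True

cap = FakeCapture(["frame0", "frame1", "frame2"])
frames = list(iter_frames(cap))
print(frames, cap.released)
```

With OpenCV, the same generator would be driven by iter_frames(cv2.VideoCapture(path)), where path is the video file name.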

d = MTCNN_Detector(50, 0.95)
vd = VideoFD(d)
v_file = r"C:\PI_FR\video\5_3.mp4"

save_path = r"C:\PI_FR\detect"
(f_count, fps) = vd.detect(v_file, save_path, False, False)

print("Face detections: "+str(f_count))
print("FPS: "+str(fps))

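The test shows good results: faces were detected in most frames of the video file. The processing speed on a Core i7 CPU is about 20 FPS. For a task as demanding as face detection, that is impressive.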
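
Keep in mind that the FPS value returned by VideoFD.detect counts only the time spent inside the detector, not frame decoding, drawing, or display. The sketch below isolates that measurement so it can be applied to any per-frame function; measure_fps and the sleeping stand-in detector are assumptions for this example, not part of the code above.

```python
import time

def measure_fps(process, frames):
    """Run process(frame) on every frame and return (frame_count, fps).

    Only the time spent inside process() is counted. The fps value is 0.0
    when nothing was timed, which avoids a division by zero.
    """
    count = 0
    elapsed = 0.0
    for frame in frames:
        t1 = time.perf_counter()
        process(frame)
        elapsed += time.perf_counter() - t1
        count += 1
    fps = count / elapsed if elapsed > 0 else 0.0
    return (count, fps)

# Hypothetical stand-in for a detector: sleeps about 5 ms per frame
(count, fps) = measure_fps(lambda frame: time.sleep(0.005), range(20))
print(count, round(fps, 1))
```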