Advanced OpenCV (8): Gender and Age Recognition

In this tutorial, we discuss an interesting application of deep learning to human faces: estimating a person's age and gender from a single image. We briefly cover the main ideas of the underlying paper and give step-by-step instructions for running the models with OpenCV.

1. Gender and Age Classification with CNNs

The authors use a fairly simple convolutional neural network architecture, similar to CaffeNet and AlexNet. The network has 3 convolutional layers, 2 fully connected layers, and a final output layer. The layers are detailed below (a sketch of a comparable architecture follows the list).

  • Conv1: the first convolutional layer has 96 filters with a kernel size of 7.
  • Conv2: the second convolutional layer has 256 filters with a kernel size of 5.
  • Conv3: the third convolutional layer has 384 filters with a kernel size of 3.
  • The two fully connected layers have 512 nodes each.

The authors trained the models on the Adience dataset.
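As a concrete illustration, here is a minimal PyTorch sketch of a network in the same spirit. Only the filter counts and kernel sizes (96x7, 256x5, 384x3, two FC layers of 512) come from the description above; the strides, pooling, and dropout settings are assumptions modeled on typical AlexNet-style designs, not the authors' exact Caffe definition.

import torch.nn as nn

# Sketch of a Levi-Hassner-style network: 3 conv layers + 2 FC layers + output.
# Strides/pooling/dropout are assumed, not taken from the original Caffe model.
class AgeGenderCNN(nn.Module):
    def __init__(self, num_classes):  # num_classes: 2 for gender, 8 for age
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=7, stride=4), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, num_classes),  # softmax is applied at inference time
        )

    def forward(self, x):  # x: (N, 3, 227, 227)
        return self.classifier(self.features(x))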

1.1 Gender Prediction

We treat gender prediction as a classification problem. The output layer of the gender prediction network is a softmax layer with 2 nodes, representing the two classes "Male" and "Female".

1.2 Age Prediction

Ideally, age prediction should be treated as a regression problem, since we expect a real number as output. However, accurately estimating age via regression is challenging: even humans cannot pinpoint a person's age by looking at them, though we can usually tell whether someone is in their 20s or 30s. For this reason, it is sensible to frame the problem as classification, where we estimate the age group the person falls into. For example, ages in the range 0-2 form one class, 4-6 another, and so on.

The Adience dataset has 8 classes, divided into the following age groups: [(0-2), (4-6), (8-12), (15-20), (25-32), (38-43), (48-53), (60-100)]. The age prediction network therefore has 8 nodes in its final softmax layer, one per age range.
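To make the classification framing concrete, here is a tiny NumPy illustration of how an 8-node softmax output maps to an age bucket; the raw scores below are made up for the example.

import numpy as np

ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)', '(25-32)', '(38-43)', '(48-53)', '(60-100)']

scores = np.array([0.1, 0.2, 0.3, 0.5, 2.4, 0.9, 0.4, 0.2])  # hypothetical raw outputs
probs = np.exp(scores) / np.exp(scores).sum()                 # softmax
print(ageList[int(probs.argmax())], float(probs.max()))       # -> (25-32) and its confidence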

Keep in mind that predicting age from a single image is not an easy problem: perceived age depends on many factors, and people of the same age can look very different around the world. People also try very hard to hide their real age!

2. Code Tutorial

The code can be divided into four parts:

  • 1. Detect faces
  • 2. Detect gender
  • 3. Detect age
  • 4. Display the output

2.1 Code Listing

Let's look at the code for gender and age prediction using the DNN module in OpenCV.

Model files: https://pan.baidu.com/s/1eVd4kEczt4diGApc6BbQFQ
Extraction code: 123a

(1)Python

# Usage
# python AgeGender.py --input sample1.jpg

# Import the required modules
import cv2 as cv
import math
import time
import argparse

def getFaceBox(net, frame, conf_threshold=0.7):
    frameOpencvDnn = frame.copy()
    frameHeight = frameOpencvDnn.shape[0]
    frameWidth = frameOpencvDnn.shape[1]
    blob = cv.dnn.blobFromImage(frameOpencvDnn, 1.0, (300, 300), [104, 117, 123], True, False)

    net.setInput(blob)
    detections = net.forward()
    bboxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            x1 = int(detections[0, 0, i, 3] * frameWidth)
            y1 = int(detections[0, 0, i, 4] * frameHeight)
            x2 = int(detections[0, 0, i, 5] * frameWidth)
            y2 = int(detections[0, 0, i, 6] * frameHeight)
            bboxes.append([x1, y1, x2, y2])
            cv.rectangle(frameOpencvDnn, (x1, y1), (x2, y2), (0, 255, 0), int(round(frameHeight/150)), 8)
    return frameOpencvDnn, bboxes


parser = argparse.ArgumentParser(description='Use this script to run age and gender recognition using OpenCV.')
parser.add_argument('--input', help='Path to input image or video file. Skip this argument to capture frames from a camera.')
parser.add_argument("--device", default="cpu", help="Device to inference on")

args = parser.parse_args()

faceProto = "opencv_face_detector.pbtxt"
faceModel = "opencv_face_detector_uint8.pb"

ageProto = "age_deploy.prototxt"
ageModel = "age_net.caffemodel"

genderProto = "gender_deploy.prototxt"
genderModel = "gender_net.caffemodel"

MODEL_MEAN_VALUES = (78.4263377603, 87.7689143744, 114.895847746)
ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)', '(25-32)', '(38-43)', '(48-53)', '(60-100)']
genderList = ['Male', 'Female']

# Load the networks
ageNet = cv.dnn.readNet(ageModel, ageProto)
genderNet = cv.dnn.readNet(genderModel, genderProto)
faceNet = cv.dnn.readNet(faceModel, faceProto)


if args.device == "cpu":
    ageNet.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
    genderNet.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
    faceNet.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
    print("Using CPU device")
elif args.device == "gpu":
    ageNet.setPreferableBackend(cv.dnn.DNN_BACKEND_CUDA)
    ageNet.setPreferableTarget(cv.dnn.DNN_TARGET_CUDA)

    genderNet.setPreferableBackend(cv.dnn.DNN_BACKEND_CUDA)
    genderNet.setPreferableTarget(cv.dnn.DNN_TARGET_CUDA)

    faceNet.setPreferableBackend(cv.dnn.DNN_BACKEND_CUDA)
    faceNet.setPreferableTarget(cv.dnn.DNN_TARGET_CUDA)
    print("Using GPU device")


# Open a video file, image file, or camera stream
cap = cv.VideoCapture(args.input if args.input else 0)
padding = 20
while cv.waitKey(1) < 0:
    # Read a frame
    t = time.time()
    hasFrame, frame = cap.read()
    if not hasFrame:
        cv.waitKey()
        break

    frameFace, bboxes = getFaceBox(faceNet, frame)
    if not bboxes:
        print("No face Detected, Checking next frame")
        continue

    for bbox in bboxes:
        # print(bbox)
        face = frame[max(0,bbox[1]-padding):min(bbox[3]+padding,frame.shape[0]-1),max(0,bbox[0]-padding):min(bbox[2]+padding, frame.shape[1]-1)]

        blob = cv.dnn.blobFromImage(face, 1.0, (227, 227), MODEL_MEAN_VALUES, swapRB=False)
        genderNet.setInput(blob)
        genderPreds = genderNet.forward()
        gender = genderList[genderPreds[0].argmax()]
        # print("Gender Output : {}".format(genderPreds))
        print("Gender : {}, conf = {:.3f}".format(gender, genderPreds[0].max()))

        ageNet.setInput(blob)
        agePreds = ageNet.forward()
        age = ageList[agePreds[0].argmax()]
        print("Age Output : {}".format(agePreds))
        print("Age : {}, conf = {:.3f}".format(age, agePreds[0].max()))

        label = "{},{}".format(gender, age)
        cv.putText(frameFace, label, (bbox[0], bbox[1]-10), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2, cv.LINE_AA)
        cv.imshow("Age Gender Demo", frameFace)
        # cv.imwrite("age-gender-out-{}".format(args.input),frameFace)
    print("time : {:.3f}".format(time.time() - t))


 
# Example cmake configuration for building OpenCV with CUDA support (required for the "gpu" device option):
# cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=~/opencv_gpu -DINSTALL_PYTHON_EXAMPLES=OFF -DINSTALL_C_EXAMPLES=OFF -DOPENCV_ENABLE_NONFREE=ON -DOPENCV_EXTRA_MODULES_PATH=~/cv2_gpu/opencv_contrib/modules -DPYTHON_EXECUTABLE=~/env/bin/python3 -DBUILD_EXAMPLES=ON -DWITH_CUDA=ON -DWITH_CUDNN=ON -DOPENCV_DNN_CUDA=ON  -DENABLE_FAST_MATH=ON -DCUDA_FAST_MATH=ON  -DWITH_CUBLAS=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.2 -DOpenCL_LIBRARY=/usr/local/cuda-10.2/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda-10.2/include/ ..

(2)C++

// Usage
//./AgeGender sample1.jpg
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/dnn.hpp>
#include <tuple>
#include <iostream>
#include <opencv2/opencv.hpp>
#include <iterator>
using namespace cv;
using namespace cv::dnn;
using namespace std;

tuple<Mat, vector<vector<int>>> getFaceBox(Net net, Mat &frame, double conf_threshold)
{
    Mat frameOpenCVDNN = frame.clone();
    int frameHeight = frameOpenCVDNN.rows;
    int frameWidth = frameOpenCVDNN.cols;
    double inScaleFactor = 1.0;
    Size size = Size(300, 300);
    // std::vector<int> meanVal = {104, 117, 123};
    Scalar meanVal = Scalar(104, 117, 123);

    cv::Mat inputBlob;
    inputBlob = cv::dnn::blobFromImage(frameOpenCVDNN, inScaleFactor, size, meanVal, true, false);

    net.setInput(inputBlob, "data");
    cv::Mat detection = net.forward("detection_out");

    cv::Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

    vector<vector<int>> bboxes;

    for(int i = 0; i < detectionMat.rows; i++)
    {
        float confidence = detectionMat.at<float>(i, 2);

        if(confidence > conf_threshold)
        {
            int x1 = static_cast<int>(detectionMat.at<float>(i, 3) * frameWidth);
            int y1 = static_cast<int>(detectionMat.at<float>(i, 4) * frameHeight);
            int x2 = static_cast<int>(detectionMat.at<float>(i, 5) * frameWidth);
            int y2 = static_cast<int>(detectionMat.at<float>(i, 6) * frameHeight);
            vector<int> box = {x1, y1, x2, y2};
            bboxes.push_back(box);
            cv::rectangle(frameOpenCVDNN, cv::Point(x1, y1), cv::Point(x2, y2), cv::Scalar(0, 255, 0),2, 4);
        }
    }

    return make_tuple(frameOpenCVDNN, bboxes);
}

int main(int argc, char** argv)
{
    string faceProto = "opencv_face_detector.pbtxt";
    string faceModel = "opencv_face_detector_uint8.pb";

    string ageProto = "age_deploy.prototxt";
    string ageModel = "age_net.caffemodel";

    string genderProto = "gender_deploy.prototxt";
    string genderModel = "gender_net.caffemodel";

    Scalar MODEL_MEAN_VALUES = Scalar(78.4263377603, 87.7689143744, 114.895847746);

    vector<string> ageList = {"(0-2)", "(4-6)", "(8-12)", "(15-20)", "(25-32)",
      "(38-43)", "(48-53)", "(60-100)"};

    vector<string> genderList = {"Male", "Female"};


    cout << "USAGE : ./AgeGender <videoFile> " << endl;
    cout << "USAGE : ./AgeGender <device> " << endl;
    cout << "USAGE : ./AgeGender <videoFile> <device>" << endl;

    string device = "cpu";

    string videoFile = "0";

    // Parse the command-line arguments
    if (argc == 2)
    {   
      if((string)argv[1] == "gpu")
        device = "gpu";
      else if((string)argv[1] == "cpu")
        device = "cpu";
      else
        videoFile = argv[1];
    }
    else if (argc == 3)
    {
        videoFile = argv[1];
        if((string)argv[2] == "gpu")
            device = "gpu";
    }

    // Load the networks
    Net ageNet = readNet(ageModel, ageProto);
    Net genderNet = readNet(genderModel, genderProto);
    Net faceNet = readNet(faceModel, faceProto);

    if (device == "cpu")
    {
        cout << "Using CPU device" << endl;
        ageNet.setPreferableBackend(DNN_BACKEND_OPENCV);
        genderNet.setPreferableBackend(DNN_BACKEND_OPENCV);
        faceNet.setPreferableBackend(DNN_BACKEND_OPENCV);
    }
    else if (device == "gpu")
    {
        cout << "Using GPU device" << endl;
        ageNet.setPreferableBackend(DNN_BACKEND_CUDA);
        ageNet.setPreferableTarget(DNN_TARGET_CUDA);

        genderNet.setPreferableBackend(DNN_BACKEND_CUDA);
        genderNet.setPreferableTarget(DNN_TARGET_CUDA);

        faceNet.setPreferableBackend(DNN_BACKEND_CUDA);
        faceNet.setPreferableTarget(DNN_TARGET_CUDA);
    }


    VideoCapture cap;
    if (videoFile.length() > 1)
        cap.open(videoFile);
    else
        cap.open(0);
    int padding = 20;
    while(waitKey(1) < 0) {
      // read frame
      Mat frame;
      cap.read(frame);
      if (frame.empty())
      {
          waitKey();
          break;
      }

      vector<vector<int>> bboxes;
      Mat frameFace;
      tie(frameFace, bboxes) = getFaceBox(faceNet, frame, 0.7);

      if(bboxes.size() == 0) {
        cout << "No face detected, checking next frame." << endl;
        continue;
      }
      for (auto it = begin(bboxes); it != end(bboxes); ++it) {
        Rect rec(it->at(0) - padding, it->at(1) - padding, it->at(2) - it->at(0) + 2*padding, it->at(3) - it->at(1) + 2*padding);
        rec &= Rect(0, 0, frame.cols, frame.rows); // clamp to the frame to avoid out-of-range ROIs
        Mat face = frame(rec); // take the ROI of the box on the frame

        Mat blob;
        blob = blobFromImage(face, 1, Size(227, 227), MODEL_MEAN_VALUES, false);
        genderNet.setInput(blob);
        vector<float> genderPreds = genderNet.forward();
        // Find the index of the maximum element;
        // std::distance + std::max_element acts as argmax() in C++
        int max_index_gender = std::distance(genderPreds.begin(), max_element(genderPreds.begin(), genderPreds.end()));
        string gender = genderList[max_index_gender];
        cout << "Gender: " << gender << endl;

        /* // Uncomment to iterate over the genderPreds vector
        for(auto it = begin(genderPreds); it != end(genderPreds); ++it) {
          cout << *it << endl;
        }
        */

        ageNet.setInput(blob);
        vector<float> agePreds = ageNet.forward();
        /* // Uncomment to iterate over the agePreds vector
        cout << "PRINTING AGE_PREDS" << endl;
        for(auto it = agePreds.begin(); it != agePreds.end(); ++it) {
          cout << *it << endl;
        }
        */

        // Find the index of the maximum value in the agePreds vector
        int max_indice_age = std::distance(agePreds.begin(), max_element(agePreds.begin(), agePreds.end()));
        string age = ageList[max_indice_age];
        cout << "Age: " << age << endl;
        string label = gender + ", " + age; // label
        cv::putText(frameFace, label, Point(it->at(0), it->at(1) -15), cv::FONT_HERSHEY_SIMPLEX, 0.9, Scalar(0, 255, 255), 2, cv::LINE_AA);
        imshow("Frame", frameFace);
        imwrite("out.jpg",frameFace);
      }

    }
}

2.2 Code Walkthrough

2.2.1 Face Detection

We use the DNN face detector for face detection. The model is only 2.7 MB and is very fast even on a CPU. Face detection is performed by the getFaceBox function, shown below first in C++ and then in Python.

tuple<Mat, vector<vector<int>>> getFaceBox(Net net, Mat &frame, double conf_threshold)
{
    Mat frameOpenCVDNN = frame.clone();
    int frameHeight = frameOpenCVDNN.rows;
    int frameWidth = frameOpenCVDNN.cols;
    double inScaleFactor = 1.0;
    Size size = Size(300, 300);
    // std::vector<int> meanVal = {104, 117, 123};
    Scalar meanVal = Scalar(104, 117, 123);

    cv::Mat inputBlob;
    cv::dnn::blobFromImage(frameOpenCVDNN, inputBlob, inScaleFactor, size, meanVal, true, false);

    net.setInput(inputBlob, "data");
    cv::Mat detection = net.forward("detection_out");

    cv::Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

    vector<vector<int>> bboxes;

    for(int i = 0; i < detectionMat.rows; i++)
    {
        float confidence = detectionMat.at<float>(i, 2);

        if(confidence > conf_threshold)
        {
            int x1 = static_cast<int>(detectionMat.at<float>(i, 3) * frameWidth);
            int y1 = static_cast<int>(detectionMat.at<float>(i, 4) * frameHeight);
            int x2 = static_cast<int>(detectionMat.at<float>(i, 5) * frameWidth);
            int y2 = static_cast<int>(detectionMat.at<float>(i, 6) * frameHeight);
            vector<int> box = {x1, y1, x2, y2};
            bboxes.push_back(box);
            cv::rectangle(frameOpenCVDNN, cv::Point(x1, y1), cv::Point(x2, y2), cv::Scalar(0, 255, 0),2, 4);
        }
    }

    return make_tuple(frameOpenCVDNN, bboxes);
}

def getFaceBox(net, frame, conf_threshold=0.7):
    frameOpencvDnn = frame.copy()
    frameHeight = frameOpencvDnn.shape[0]
    frameWidth = frameOpencvDnn.shape[1]
    blob = cv.dnn.blobFromImage(frameOpencvDnn, 1.0, (300, 300), [104, 117, 123], True, False)

    net.setInput(blob)
    detections = net.forward()
    bboxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            x1 = int(detections[0, 0, i, 3] * frameWidth)
            y1 = int(detections[0, 0, i, 4] * frameHeight)
            x2 = int(detections[0, 0, i, 5] * frameWidth)
            y2 = int(detections[0, 0, i, 6] * frameHeight)
            bboxes.append([x1, y1, x2, y2])
            cv.rectangle(frameOpencvDnn, (x1, y1), (x2, y2), (0, 255, 0), int(round(frameHeight/150)), 8)
    return frameOpencvDnn, bboxes

2.2.2 Gender Prediction

We load the gender network into memory and pass the detected face through it. The forward pass gives the probability (confidence) for each of the two classes; we take the larger of the two outputs as the final gender prediction.

string genderProto = "gender_deploy.prototxt";
string genderModel = "gender_net.caffemodel";
Net genderNet = readNet(genderModel, genderProto);

vector<string> genderList = {"Male", "Female"};

blob = blobFromImage(face, 1, Size(227, 227), MODEL_MEAN_VALUES, false);
genderNet.setInput(blob);
// string gender_preds;
vector<float> genderPreds = genderNet.forward();
// Find the index of the maximum element;
// std::distance + std::max_element acts as argmax() in C++
int max_index_gender = std::distance(genderPreds.begin(), max_element(genderPreds.begin(), genderPreds.end()));
string gender = genderList[max_index_gender];

genderProto = "gender_deploy.prototxt"
genderModel = "gender_net.caffemodel"
genderNet = cv.dnn.readNet(genderModel, genderProto)

genderList = ['Male', 'Female']

blob = cv.dnn.blobFromImage(face, 1, (227, 227), MODEL_MEAN_VALUES, swapRB=False)
genderNet.setInput(blob)
genderPreds = genderNet.forward()
gender = genderList[genderPreds[0].argmax()]
print("Gender Output : {}".format(genderPreds))
print("Gender : {}".format(gender))

2.2.3 Age Prediction

We load the age network and run a forward pass to get the output. Since the architecture is similar to that of the gender network, we take the maximum over all eight outputs to get the predicted age group.

string ageProto = "age_deploy.prototxt";
string ageModel = "age_net.caffemodel";
Net ageNet = readNet(ageModel, ageProto);

vector<string> ageList = {"(0-2)", "(4-6)", "(8-12)", "(15-20)", "(25-32)", "(38-43)", "(48-53)", "(60-100)"};

ageNet.setInput(blob);
vector<float> agePreds = ageNet.forward();
int max_indice_age = distance(agePreds.begin(), max_element(agePreds.begin(), agePreds.end()));
string age = ageList[max_indice_age];

ageProto = "age_deploy.prototxt"
ageModel = "age_net.caffemodel"
ageNet = cv.dnn.readNet(ageModel, ageProto)

ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)', '(25-32)', '(38-43)', '(48-53)', '(60-100)']

ageNet.setInput(blob)
agePreds = ageNet.forward()
age = ageList[agePreds[0].argmax()]
print("Gender Output : {}".format(agePreds))
print("Gender : {}".format(age))

2.3 Display the Output

We overlay the network outputs on the input image and display the result using the imshow function.

string label = gender + ", " + age; // label
cv::putText(frameFace, label, Point(it->at(0), it->at(1) -20), cv::FONT_HERSHEY_SIMPLEX, 0.9, Scalar(0, 255, 255), 2, cv::LINE_AA);
imshow("Frame", frameFace);

label = "{}, {}".format(gender, age)
cv.putText(frameFace, label, (bbox[0], bbox[1]-20), cv.FONT_HERSHEY_SIMPLEX, 0.8, (255, 0, 0), 3, cv.LINE_AA)
cv.imshow("Age Gender Demo", frameFace)

2.4 Results

(Result images: detected faces annotated with the predicted gender and age group.)

3. Conclusion

Although the gender prediction network performs well, the age prediction network falls short of our expectations. We looked for an explanation in the paper and found the following confusion matrix for the age prediction model.
(Confusion matrix of the age prediction model, from the paper.)
From this confusion matrix we can make the following observations:

  • Prediction accuracy is relatively high for the 0-2, 4-6, 8-12, and 25-32 age groups (see the diagonal elements).
  • The output is heavily biased toward the 25-32 age group (see the row for 25-32). This means the network easily confuses ages between 15 and 43: even when the actual age is in 15-20 or 38-43, there is a high chance the predicted group will be 25-32. This is also apparent in the Results section.
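
If you want to build a similar confusion matrix for your own labeled test set, a minimal sketch is shown below; the label arrays are made-up placeholders.

import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=8):
    # cm[t][p] counts samples whose true group is t and predicted group is p
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical ground-truth vs. predicted age-group indices (0..7)
print(confusion_matrix([3, 4, 5, 5], [4, 4, 4, 5]))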

Apart from this, we observed that the accuracy of the models improves when we apply some padding around the detected face. This may be because the training inputs were standard face photos rather than the tightly cropped faces we get after face detection.
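
The scripts above already apply this idea through the padding variable; as a standalone sketch, a helper performing the same clamped, padded crop could look like this (the function name is illustrative):

def padded_crop(frame, box, padding=20):
    # box is [x1, y1, x2, y2]; clamp the padded region to the frame
    # so the crop never reaches out of bounds
    x1, y1, x2, y2 = box
    h, w = frame.shape[:2]
    return frame[max(0, y1 - padding):min(y2 + padding, h - 1),
                 max(0, x1 - padding):min(x2 + padding, w - 1)]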

We also experimented with face alignment before running the predictions and found that the results improved for some examples but got worse for others. Using alignment may be a good idea if you are mostly working with non-frontal faces.
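
For reference, here is a minimal alignment sketch that rotates a face crop so the eye line becomes horizontal. It assumes eye coordinates are already available from some landmark detector (not covered in this tutorial); the function name and interface are illustrative only.

import math
import cv2 as cv

def align_face(face, left_eye, right_eye):
    # left_eye / right_eye are (x, y) pixel coordinates within the crop
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    angle = math.degrees(math.atan2(dy, dx))        # tilt of the eye line
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv.getRotationMatrix2D(center, angle, 1.0)  # rotate about the eye midpoint
    return cv.warpAffine(face, M, (face.shape[1], face.shape[0]))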

References

https://learnopencv.com/age-gender-classification-using-opencv-deep-learning-c-python/
