MTCNN + Deep_Sort for Multi-Target Face Tracking: The Deep_Sort Part (II)

沙皮狗de忧伤

Category: Project. Tags: face detection, deep learning, MTCNN, tracking

Article link: https://blog.csdn.net/weixin_38106878/article/details/100009101

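Preface:

This write-up is for reference and learning only; I hope to share and exchange experience with everyone.
My writing is not great, so I go straight to the code; please bear with me. The post does not explain the algorithms either, so if you want to understand them, look up dedicated explanatory posts. A link to the project's GitHub repository is at the end of the article.
Reference project: https://github.com/Qidian213/deep_sort_yolov3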

Contents
I. Real-time video face detection with MTCNN
II. Combining Deep_Sort with MTCNN
III. Result screenshots

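I. Real-time video face detection with MTCNN
My previous post covered this part: how to train the face detection model and how to run real-time face detection on video. See that post, "MTCNN real-time face detection", so you do not get lost.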

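II. Combining the Deep_Sort tracking algorithm with MTCNN face detection
deep_sort is an upgraded version of the SORT algorithm that improves multi-target tracking accuracy (I have not worked on tracking before, so this is about all I know). It is also fast enough that, combined with MTCNN for face tracking, it can run in real time.
Here is the rough idea behind my code, followed by the code itself.
[deep_sort needs the face bounding boxes, i.e. the coordinates of the detection boxes. From those coordinates it predicts each target's state at the next time step and associates new detections with existing tracks, which is what makes tracking work. So the recipe is simple: feed the boxes from MTCNN face detection to deep_sort, and you have face tracking.]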
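As a rough illustration of the box hand-off described above: deep_sort's Detection class takes boxes as (top-left x, top-left y, width, height), while many detectors report two corners. The sketch below assumes corner-format input (x1, y1, x2, y2); the helper name tlbr_to_tlwh is hypothetical and is not part of the project, whose detect_video reportedly performs this conversion already.

```python
import numpy as np


def tlbr_to_tlwh(boxes):
    """Convert (x1, y1, x2, y2) boxes to (x, y, width, height).

    Hypothetical helper for illustration; assumes corner-format input.
    """
    boxes = np.asarray(boxes, dtype=float)
    if boxes.size == 0:
        # No detections in this frame: return an empty (0, 4) array.
        return np.zeros((0, 4))
    tlwh = boxes.reshape(-1, 4).copy()
    # Width and height are the bottom-right corner minus the top-left corner.
    tlwh[:, 2:] -= tlwh[:, :2]
    return tlwh


example = tlbr_to_tlwh([[10, 20, 50, 80]])
print(example.tolist())  # [[10.0, 20.0, 40.0, 60.0]]
```

Handling the empty case explicitly keeps frames without any face from breaking the later steps.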

1. Setting up the deep_sort environment
The previous post on MTCNN face detection already set up the base environment. On top of that, install the following packages with pip:

pip install opencv-python
pip install numpy 
pip install scikit-learn
pip install pillow

2. Combining deep_sort with the MTCNN detector
Here is the main code, mtcnn_sort.py:

#-*-coding: utf-8-*-
#Author: lxz-HXY

'''
deep_sort combine with MTCNN  to follow people's face!!!
'''
from __future__ import division, print_function, absolute_import
import os
from timeit import time
import warnings
import sys
import cv2
import numpy as np   
import gc
from multiprocessing import Process, Manager 

from training.mtcnn_model import P_Net, R_Net,O_Net
from tools.loader import TestLoader
from detection.MtcnnDetector import MtcnnDetector
from detection.detector import Detector
from detection.fcn_detector import FcnDetector

from deep_sort import preprocessing
from deep_sort import nn_matching
from deep_sort.detection import Detection
from deep_sort.tracker import Tracker
from tools import generate_detections as gdet
from deep_sort.detection import Detection as ddet
warnings.filterwarnings('ignore')

def mtcnn(stage):
    detectors = [None, None, None]
    if stage in ['pnet', 'rnet', 'onet']:
        modelPath = './tmp/model/pnet/'
        a = [b[5:-6] for b in os.listdir(modelPath) if b.startswith('pnet-') and b.endswith('.index')]
        maxEpoch = max(map(int, a)) # auto match a max epoch model
        modelPath = os.path.join(modelPath, "pnet-%d"%(maxEpoch))
        print("Use PNet model: %s"%(modelPath))
        detectors[0] = FcnDetector(P_Net,modelPath) 
    if stage in ['rnet', 'onet']:
        modelPath = './tmp/model/rnet/'
        a = [b[5:-6] for b in os.listdir(modelPath) if b.startswith('rnet-') and b.endswith('.index')]
        maxEpoch = max(map(int, a))
        modelPath = os.path.join(modelPath, "rnet-%d"%(maxEpoch))
        print("Use RNet model: %s"%(modelPath))
        detectors[1] = Detector(R_Net, 24, 1, modelPath)
    if stage in ['onet']:
        modelPath = './tmp/model/onet/'
        a = [b[5:-6] for b in os.listdir(modelPath) if b.startswith('onet-') and b.endswith('.index')]
        maxEpoch = max(map(int, a))
        modelPath = os.path.join(modelPath, "onet-%d"%(maxEpoch))
        print("Use ONet model: %s"%(modelPath))
        detectors[2] = Detector(O_Net, 48, 1, modelPath)
    return detectors

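'''
Python multiprocessing:
receive: this process reads frames from the camera
realse: this process consumes frames and runs face detection and tracking
'''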
def receive(stack):
    top = 100
    # url = ' '
    cap = cv2.VideoCapture(0)
    ret, frame = cap.read()
    while True:
        ret, frame = cap.read()
        if ret:
            stack.append(frame)
            if len(stack) >= top:
                print('Stack is full.........')
                del stack[:50]
                gc.collect()

def realse(stack):
    print('Begin to get frame......')
    # Work around GPU memory hogging: let TensorFlow allocate memory on demand
    import tensorflow as tf
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    session = tf.Session(config=config)

    tmp = []
    max_cosine_distance = 0.3
    nn_budget = None
    nms_max_overlap = 1.0
    # deep_sort
    model_filename = 'mars-small128.pb'
    encoder = gdet.create_box_encoder(model_filename,batch_size=1)
    metric = nn_matching.NearestNeighborDistanceMetric("cosine", max_cosine_distance, nn_budget)
    tracker = Tracker(metric)
    # mtcnn
    detectors = mtcnn('onet')
    mtcnnDetector = MtcnnDetector(detectors=detectors, min_face_size = 24, threshold=[0.9, 0.6, 0.7])
    while True:
        if len(stack) > 0:
            frame = stack.pop()
            frame = cv2.resize(frame, (int(frame.shape[1]/3), int(frame.shape[0]/3)))
            frame = np.array(frame)
            # The boxes returned by MTCNN here have already been converted to the
            # format deep_sort expects: top-left corner plus width and height (tlwh).
            tmp, _ = mtcnnDetector.detect_video(frame)     
            features = encoder(frame, tmp)
            detections = [Detection(bbox, 1.0, feature) for bbox, feature in zip(tmp, features)]
            # Run non-maxima suppression.
            boxes = np.array([d.tlwh for d in detections])
            scores = np.array([d.confidence for d in detections])
            indices = preprocessing.non_max_suppression(boxes, nms_max_overlap, scores)
            detections = [detections[i] for i in indices]
            # Call the tracker
            tracker.predict()
            tracker.update(detections)
            for track in tracker.tracks:
                if not track.is_confirmed() or track.time_since_update > 1:
                    continue 
                bbox = track.to_tlbr()
                cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), (255,0,0), 1)
                cv2.putText(frame, str(track.track_id),(int(bbox[0]), int(bbox[1])), 0, 0.5, (0,255,0), 1)
    
            for det in detections:
                bbox = det.to_tlbr()
                cv2.rectangle(frame,(int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])),(0,0,255), 1)

            cv2.imshow("Detected", frame)

            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
    cv2.destroyAllWindows()

if __name__=='__main__':
    t = Manager().list()
    t1 = Process(target=receive, args=(t,))
    t2 = Process(target=realse, args=(t,))
    t1.start()
    t2.start()
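    # receive() loops forever, so wait for the display process (which exits
    # when 'q' is pressed) and then stop the capture process.
    t2.join()
    t1.terminate()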

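The code should be easy to follow: it runs face detection and passes the box positions to the deep_sort algorithm. When you use it, check the file paths [for example, the MTCNN model paths and the deep_sort file paths]. Once all prerequisites are in place, just run the script; a single press of Enter gets you face tracking.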
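Since wrong paths are the most likely stumbling block, here is a minimal sketch of a checkpoint lookup that fails with a readable message. It assumes the '<prefix>-<epoch>.index' file naming used in mtcnn_sort.py; the helper name latest_checkpoint is hypothetical and not part of the project.

```python
import os
import re
import tempfile


def latest_checkpoint(model_dir, prefix):
    """Return '<model_dir>/<prefix>-<N>' for the largest epoch N found.

    Hypothetical helper; looks for files named '<prefix>-<N>.index'.
    """
    if not os.path.isdir(model_dir):
        raise FileNotFoundError("model directory not found: %s" % model_dir)
    pattern = re.compile(r"^%s-(\d+)\.index$" % re.escape(prefix))
    matches = (pattern.match(name) for name in os.listdir(model_dir))
    epochs = [int(m.group(1)) for m in matches if m]
    if not epochs:
        raise FileNotFoundError(
            "no '%s-*.index' files in %s" % (prefix, model_dir))
    return os.path.join(model_dir, "%s-%d" % (prefix, max(epochs)))


# Demo on a throwaway directory with fake checkpoint index files.
with tempfile.TemporaryDirectory() as demo_dir:
    for epoch in (2, 10, 7):
        open(os.path.join(demo_dir, "pnet-%d.index" % epoch), "w").close()
    demo_path = latest_checkpoint(demo_dir, "pnet")

print(os.path.basename(demo_path))  # pnet-10
```

Comparing epochs as integers matters here: a plain string sort would rank "pnet-7" above "pnet-10".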

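If the code raises errors:
1. Check that your CUDA + cuDNN versions match your TensorFlow version [see blog posts with the version compatibility table];
2. Check that all dependencies are installed.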
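For the second point, a small sketch like the one below can report which packages are missing before you run the tracker. The package-to-module mapping is my own assumption based on the pip commands and imports above. Note also that tf.ConfigProto and tf.Session, as used in the script, are TensorFlow 1.x APIs, so a TensorFlow 2.x install will fail on those lines even if the package is present.

```python
import importlib.util

# pip package name -> importable module name (assumed from the post's
# pip commands and the script's imports).
REQUIRED = {
    "opencv-python": "cv2",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "pillow": "PIL",
    "tensorflow": "tensorflow",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
print("Missing packages:", ", ".join(missing) if missing else "none")
```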

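3. Project file structure
[Image: screenshot of the project directory structure]
In the screenshot, the deep_sort folder and the mars-small128.pb file are the core files of the deep_sort algorithm; all other files come from the MTCNN face detection part.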

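III. Result screenshots
Only one screenshot is shown here, and the results are honestly not as good as you might hope. If you stay in front of the camera and keep your face fairly still, without large movements, tracking works well. If you move your head a lot, it works less well and the track ID changes. The likely reason is that deep_sort targets pedestrian tracking: the mars-small128.pb appearance model was trained on pedestrians, not faces. With enough data, training a face-specific tracking model should give better results. If you do take on that work, I would be glad to compare notes.
[The image is my own screenshot; if it infringes on anything, I will remove it immediately. Please do not repost or reuse it without permission.]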
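To make the ID-switch behaviour a little more concrete: the script builds a "cosine" metric with max_cosine_distance = 0.3, so an appearance match is only accepted when the cosine distance between feature vectors is below that threshold. The sketch below shows just this appearance check on made-up 2-D vectors (real features from mars-small128.pb are 128-dimensional); deep_sort's full association also involves motion gating and IoU matching, which are not shown.

```python
import numpy as np

MAX_COSINE_DISTANCE = 0.3  # same value as in mtcnn_sort.py


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(np.dot(a, b))


# Toy vectors, purely illustrative: a stored track feature, a similar
# feature (face barely moved), and a very different one (head turned).
track_feature = [1.0, 0.0]
still_face = cosine_distance(track_feature, [0.9, 0.1])
turned_face = cosine_distance(track_feature, [0.2, 1.0])

print(still_face < MAX_COSINE_DISTANCE)   # True: appearance still matches
print(turned_face < MAX_COSINE_DISTANCE)  # False: appearance match rejected
```

If a pedestrian-trained encoder produces very different features for the same face at different angles, the appearance match is rejected in this way, which would be consistent with the ID changes described above.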

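Finally, here is the GitHub address of the whole project. I hope this post was simple and clear enough. If it helps your study or research, a like is welcome, and a star on the project would be appreciated. Thanks!
GitHub: https://github.com/YingXiuHe/MTCNN-Deep_Sort-.git
[Upcoming posts: implementing real-time face recognition, step by step]

Comments and discussion are welcome; please forgive any shortcomings. Let's learn from each other!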
