Deep Learning: The DFace Framework and Video Face Detection

大郎拱白菜

Article link: https://blog.csdn.net/u013591306/article/details/99633789

I. Preparation

1. Anaconda3 environment: pytorch-gpu

CUDA: 8.0
cuDNN: 5.1
For environment setup, see https://blog.csdn.net/hhy_csdn/article/details/82263078

II. Modifying the Code

Activate the environment:

activate pytorch-gpu

Run test_image.py by executing python test_image.py.
It fails with the following error:

Traceback (most recent call last):
  File "test_image.py", line 20, in <module>
    bboxs, landmarks = mtcnn_detector.detect_face(img)
  File "K:\Desktop\face_detect\DFace-win64\src\core\detect.py", line 602, in detect_face
    boxes, boxes_align = self.detect_pnet(img)
  File "K:\Desktop\face_detect\DFace-win64\src\core\detect.py", line 263, in detect_pnet
    cls_map, reg = self.pnet_detector(feed_imgs)
  File "D:\Program Files\Anaconda3\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "K:\Desktop\face_detect\DFace-win64\src\core\models.py", line 97, in forward
    x = self.pre_layer(x)
  File "D:\Program Files\Anaconda3\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Program Files\Anaconda3\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\container.py", line 67, in forward
    input = module(input)
  File "D:\Program Files\Anaconda3\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Program Files\Anaconda3\envs\pytorch-gpu\lib\site-packages\torch\nn\modules\conv.py", line 277, in forward
    self.padding, self.dilation, self.groups)
  File "D:\Program Files\Anaconda3\envs\pytorch-gpu\lib\site-packages\torch\nn\functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: Input type (CUDADoubleTensor) and weight type (CUDAFloatTensor) should be the same

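The error reports a data type mismatch somewhere in detect.py: the input image is of type double, while the weights in the model file are all of type float. After repeated checking, the problem was traced to line 255 of ./src/core/detect.py, with the same issue at two later call sites. Only the following changes are needed:

Line 255
change feed_imgs.append(image_tensor) to
feed_imgs.append(image_tensor.float())

Line 394
change cls_map, reg = self.rnet_detector(feed_imgs) to
cls_map, reg = self.rnet_detector(feed_imgs.float())

Line 514
change cls_map, reg, landmark = self.onet_detector(feed_imgs) to
cls_map, reg, landmark = self.onet_detector(feed_imgs.float())

Casting image_tensor and feed_imgs to float resolves this error.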
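The origin of the double type can be seen without PyTorch. The following is a minimal NumPy-only sketch; the array is a made-up stand-in for a loaded image, not DFace data. Casting with np.float (an alias of Python's float in older NumPy releases) produces 64-bit values, and torch.from_numpy keeps an array's dtype, so such an array reaches the network as a double tensor. A 32-bit cast corresponds to the float weights:

```python
import numpy as np

# Hypothetical stand-in for an image loaded with OpenCV: H x W x C, uint8.
image = np.zeros((12, 12, 3), dtype=np.uint8)

# image.astype(np.float) in older NumPy is equivalent to this 64-bit cast.
as_double = image.astype(np.float64)
# A 32-bit cast matches float (32-bit) model weights.
as_single = image.astype(np.float32)

print(as_double.dtype)  # float64
print(as_single.dtype)  # float32
```

Calling .float() on the tensor, as in the changes above, performs the same narrowing on the PyTorch side.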

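Running the script again produces another error. This one says that, when the bounding boxes are drawn, the height and width of the bbox are both negative.

After two all-nighters of debugging, the problem turned out to be in another file, at line 20 of ./src/core/image_tools.py.
Part of image_tools.py looks like this.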

import torchvision.transforms as transforms
import torch
from torch.autograd.variable import Variable
import numpy as np

transform = transforms.ToTensor()

def convert_image_to_tensor(image):
    """convert an image to pytorch tensor

        Parameters:
        ----------
        image: numpy array , h * w * c

        Returns:
        -------
        image_tensor: pytorch.FloatTensor, c * h * w
        """
    image = image.astype(np.float)
    return transform(image)
    # return transform(image)

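According to what I found online, ToTensor() converts a PIL.Image or numpy.ndarray of shape (H x W x C) with pixel values in [0, 255] into a torch.FloatTensor of shape (C x H x W) with pixel values in [0.0, 1.0].
However, when I printed the value of transform(image), it was still in [0, 255]. The model was presumably trained on data distributed in [0, 1], so the bbox results were computed incorrectly.
The fix is to change the last line of the function to return transform(image)/255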

With that, it finally works.

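PS: I repeated these steps on another computer with the same environment and PyTorch version, but there ToTensor() returned values in [0, 1], so ValueError: negative dimensions are not allowed never appeared. Quite baffling. So when debugging, first check whether image_tools.py is actually behaving as expected.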
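Because the output range of ToTensor() differed between the two machines, an unconditional division by 255 would be wrong on the second one. The current torchvision documentation notes that ToTensor() rescales a numpy.ndarray only when its dtype is np.uint8 and otherwise returns the values unscaled. Since convert_image_to_tensor casts the image to float first, a difference in torchvision versions may be the cause, though that is an assumption this post does not verify. One possible guard, sketched here with NumPy on a channel-first array (the helper name ensure_unit_range is hypothetical and not part of DFace), rescales only when the values clearly exceed 1:

```python
import numpy as np

def ensure_unit_range(chw):
    """Return a float32 copy of `chw` with values in [0, 1].

    Hypothetical helper: divides by 255 only if the data looks like
    it is still in the [0, 255] range.
    """
    chw = np.asarray(chw, dtype=np.float32)
    if chw.size and chw.max() > 1.0:
        chw = chw / 255.0
    return chw

raw = np.array([[[0.0, 127.5, 255.0]]])   # still in [0, 255]
scaled = np.array([[[0.0, 0.5, 1.0]]])    # already in [0, 1]

print(ensure_unit_range(raw).tolist())     # [[[0.0, 0.5, 1.0]]]
print(ensure_unit_range(scaled).tolist())  # [[[0.0, 0.5, 1.0]]]
```

The heuristic has a known weakness: an almost entirely black image whose [0, 255] values never exceed 1 would be left unscaled. Checking the dtype before conversion is more reliable where that is possible.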

III. Video Detection

Instead of using the bundled vision.py for visualization, this step uses OpenCV. The complete code of the modified test_image.py is as follows.

import cv2
import numpy as np
from src.core.detect import create_mtcnn_net, MtcnnDetector

if __name__ == '__main__':
    p_model_path = "./model_store/pnet_epoch.pt"
    r_model_path = "./model_store/rnet_epoch.pt"
    o_model_path = "./model_store/onet_epoch.pt"
    pnet, rnet, onet = create_mtcnn_net(p_model_path, r_model_path, o_model_path, use_cuda=True)
    mtcnn_detector = MtcnnDetector(pnet=pnet, rnet=rnet, onet=onet, min_face_size=24)
    # Open the default camera; pass a video file path instead to read from a file
    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        # Capture frame by frame
        ret, frame = cap.read()
        if not ret:
            # No frame available (camera error or end of the video)
            break
        # In the original test_image.py, detect_face returns (bboxs, landmarks);
        # keep only the boxes, and tolerate an empty result
        result = mtcnn_detector.detect_face(frame)
        bboxs = result[0] if len(result) > 0 else []

        # Draw bounding boxes only when at least one face was detected
        if len(bboxs) > 0:
            bboxs = np.round(bboxs).astype('int32')
            for box in bboxs:
                # Plain Python ints for the rectangle corners
                x1, y1, x2, y2 = (int(v) for v in box[:4])
                cv2.rectangle(frame, (x1, y1), (x2, y2), (55, 255, 155), 3)
        cv2.imshow('video_face_detect', frame)

        # Press q to quit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Release the capture once everything is done
    cap.release()
    cv2.destroyAllWindows()
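Since negative box sizes caused trouble earlier, it can help to sanitize the detections before drawing them. The following is a minimal sketch, assuming each detection row starts with x1, y1, x2, y2 (the four columns the drawing loop above reads). The helper name boxes_to_pixels and the sample values are hypothetical and not part of DFace:

```python
import numpy as np

def boxes_to_pixels(bboxs, frame_shape):
    """Round boxes to int pixels, clip to the frame, drop empty boxes.

    Assumes each row starts with x1, y1, x2, y2; any further columns,
    such as a score, are ignored.
    """
    boxes = np.asarray(bboxs, dtype=np.float64)
    if boxes.size == 0:
        return np.empty((0, 4), dtype=np.int32)
    # Accept a single box (1-D) as well as an N x K array of boxes.
    boxes = boxes.reshape(-1, boxes.shape[-1])[:, :4]
    boxes = np.round(boxes).astype(np.int32)
    height, width = frame_shape[:2]
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, width - 1)
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, height - 1)
    # Keep only boxes with positive width and height.
    keep = (boxes[:, 2] > boxes[:, 0]) & (boxes[:, 3] > boxes[:, 1])
    return boxes[keep]

# Made-up detections: x1, y1, x2, y2, score
detections = np.array([
    [10.4, 20.6, 50.2, 80.9, 0.99],   # ordinary box
    [-5.0, 3.0, 700.0, 90.0, 0.80],   # extends past the frame
    [30.0, 30.0, 20.0, 40.0, 0.50],   # negative width
])
frame_shape = (100, 640, 3)  # height, width, channels

print(boxes_to_pixels(detections, frame_shape).tolist())
# [[10, 21, 50, 81], [0, 3, 639, 90]]
print(boxes_to_pixels(np.array([]), frame_shape).shape)
# (0, 4)
```

In the video loop, the returned rows could be passed to cv2.rectangle directly after converting each coordinate with int().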
