doppia provides three face detection methods: DPM, HeadHunter, and HeadHunter_baseline. The latter two are used identically and differ only in the model they load, while DPM is largely separate from the rest of doppia and is built and run with Matlab. This section covers how to use HeadHunter.
The face detection module lives in src/applications/objects_detection/. If the build succeeded you will find the executable objects_detection there. The program supports several kinds of object detection; what it actually detects is determined by the configuration file. Face detection uses eccv2014_face_detection_pascal.config.ini, which we can take as a template and adapt. The main entries to change are:
save_detections = true - set this to true to save the detection results.
process_folder = /opt/wangchao/new-fddb/ - the directory holding your images. Note that only a flat directory is supported: images stored in subdirectories of this path will not be read.
method = gpu_channels - selects the compute backend (CPU, GPU, etc.); we pick the GPU.
model = ../../../data/trained_models/face_detection/headhunter.proto.bin - as mentioned above, HeadHunter and HeadHunter_baseline differ only in the model chosen here. We use HeadHunter; to use the other method, simply point this path at its model file instead.
score_threshold = 0.5 - the detection threshold. Lowering it raises recall and correspondingly lowers precision; see the paper for how different thresholds affect the results.
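The recall/precision trade-off behind score_threshold can be illustrated with toy numbers (the scores and labels below are made up for the illustration, not taken from the paper):

```python
# Toy scored detections: (score, is_a_real_face). Made-up numbers.
detections = [(0.9, True), (0.7, True), (0.6, False), (0.4, True), (0.3, False)]
total_faces = 4  # ground-truth faces in this toy set (one is never detected)

def recall_precision(threshold):
    # keep only detections scoring at or above the threshold
    kept = [is_face for score, is_face in detections if score >= threshold]
    true_positives = sum(kept)
    return true_positives / total_faces, true_positives / len(kept)

# lowering the threshold raises recall and lowers precision
print(recall_precision(0.5))  # recall 0.50, precision ~0.67
print(recall_precision(0.3))  # recall 0.75, precision 0.60
```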
The remaining options can be left at their defaults for now.
Running ./objects_detection --help lists the full set of options. The program is then started with:
./objects_detection -c eccv2014_face_detection_pascal.config.ini --gui.disable true --gpu.device_id 1
Here we pass two extra options, --gui.disable and --gpu.device_id. The program gives command-line options precedence over the configuration file, so we can reuse one config file and override individual settings from the outside, which is handy for testing. These two are also the ones we use most often: --gui.disable toggles the GUI; since we run on a cloud host we set it to true and disable the GUI. --gpu.device_id selects the GPU device (default 0); our machine has two cards and we want the second one, so we set it to 1.
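This "command line beats config file" precedence is the usual layered-configuration pattern; a minimal sketch of the idea in plain Python (an illustration only, not doppia's actual option parser):

```python
# Layered configuration: config-file values < command-line overrides.
def effective_options(config_file_options, cli_options):
    """Merge option dicts; keys given on the command line win."""
    merged = dict(config_file_options)
    merged.update(cli_options)
    return merged

config_file = {"gui.disable": "false", "gpu.device_id": "0",
               "score_threshold": "0.5"}
cli = {"gui.disable": "true", "gpu.device_id": "1"}

options = effective_options(config_file, cli)
print(options["gui.disable"])      # "true"  (CLI override wins)
print(options["score_threshold"])  # "0.5"   (config-file value kept)
```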
After start-up the program prints a lot of output, and a careful reader may spot the following error message:
Error parsing text-format doppia_protobuf.DetectorModel: 2:2: Message type "doppia_protobuf.DetectorModel" has no field named "E".
This message can be ignored; it does not affect the run. According to the developers (roughly their explanation), protobuf tries one representation of the model first and, when parsing it fails, falls back to another one that works, so the error only means the fallback path was taken and the program is running correctly.
When the run finishes, a folder with a name like 2015_09_18_79659_recordings appears in the current directory. It contains detections.data_sequence, which is the result we are after, though it clearly still needs further processing.
The script tools/objects_detection/detections_to_caltech.py under the doppia root converts the .data_sequence into a form we can read. It takes two options, -i and -o, which specify the .data_sequence file and the output location respectively.
Running it produces txt files in the output location, one per image and named after it, holding the detected face boxes and their confidence scores. With that, face detection with doppia is done and we have the results we wanted.
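Each detection line in those txt files follows the `[x, y, w, h, score]` layout that the script writes out (comma-separated); a small parser for one such line (the sample values are made up):

```python
def parse_detection_line(line):
    """Parse one 'x, y, w, h, score' detection line into floats."""
    x, y, w, h, score = [float(v) for v in line.split(",")]
    return {"x": x, "y": y, "w": w, "h": h, "score": score}

# example line in the format written by detections_to_caltech.py
det = parse_detection_line("12.0, 34.5, 20.64, 48.0, 0.87")
print(det["score"])  # 0.87
```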
Two scripts are attached below: a modified detections_to_caltech.py and a face cropping script. The modified detections_to_caltech.py stores the detections for all images in a single txt file in a fixed format, and the cropping script reads that txt and cuts the faces out of the corresponding images.
Reference paper: Download paper
Finally, here are the results of my FDDB tests of doppia and dlib:
|  | Recall | Precision | Processing time per image |
|---|---|---|---|
| doppia | 87% | 87.2% | ~14 s |
| dlib | 76.8% | 99.3% | 0.2 s |
fddb_crop.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys
from PIL import Image


def crop_img_by_half_center(src_file_path, dest_file_path, box):
    print(src_file_path)
    im = Image.open(src_file_path)
    # box is one detection line in the form "x, y, w, h, score"
    boxs = box.split(' ')
    # keep only the integer part of the first four fields
    # (strip the trailing comma, then the decimal part)
    for i in range(0, 4):
        boxs[i] = boxs[i].split(',')[0].split('.')[0]
    x0, y0 = int(boxs[0]), int(boxs[1])
    w, h = int(boxs[2]), int(boxs[3])
    x, y = im.size
    # convert (x, y, w, h) to corner coordinates,
    # pushing the left edge one box width further left
    x1 = x0 + w
    y1 = y0 + h
    x0 = x0 - w
    # add margins of up to a third of the (widened) box size on the
    # top, right and bottom; the crop may extend past the image borders
    cut1 = min(y0, (y1 - y0) // 3)
    cut2 = min(x, (x1 - x0) // 3)
    cut3 = min(y, (y1 - y0) // 3)
    new_im = im.crop((x0, y0 - cut1, x1 + cut2, y1 + cut3))
    new_im.save(dest_file_path)


def walk_through_the_folder_for_crop(list_file_path):
    # the txt file written by detections_to_caltech.py contains, per image:
    # the image name, the number of detections, then one detection per line
    with open(list_file_path, 'r') as fp:
        while True:
            line = fp.readline()
            if not line:
                break
            name = line.strip('\n')
            src_path = 'new-fddb/' + name + '.jpg'
            num = int(fp.readline())
            for i in range(1, num + 1):
                # pass the whole "x, y, w, h, score" line to the crop function
                det_line = fp.readline().strip('\n')
                dest_img_path = 'out-fddb/' + name + str(i) + '.jpg'
                crop_img_by_half_center(src_path, dest_img_path, det_line)


if __name__ == '__main__':
    walk_through_the_folder_for_crop(sys.argv[1])
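The box geometry in crop_img_by_half_center above can be isolated as a pure function, which makes it easier to check (the function name and signature here are mine, not from the script):

```python
def expand_box(x0, y0, w, h, img_w, img_h):
    """Reproduce the crop-box geometry of crop_img_by_half_center:
    widen the box by one width to the left, then add margins of up to
    a third of the (widened) box size on the top, right and bottom."""
    x1 = x0 + w          # right edge
    y1 = y0 + h          # bottom edge
    x0 = x0 - w          # left edge pushed left by one box width
    top = min(y0, (y1 - y0) // 3)
    right = min(img_w, (x1 - x0) // 3)
    bottom = min(img_h, (y1 - y0) // 3)
    return (x0, y0 - top, x1 + right, y1 + bottom)

# a 30x60 box at (100, 90) inside a 640x480 image
print(expand_box(100, 90, 30, 60, 640, 480))  # (70, 70, 150, 170)
```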
detections_to_caltech.py (modified):

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function

import os
import os.path
import sys
from optparse import OptionParser

local_dir = os.path.dirname(sys.argv[0])
sys.path.append(os.path.join(local_dir, ".."))
sys.path.append(os.path.join(local_dir, "../data_sequence"))
sys.path.append(os.path.join(local_dir, "../helpers"))

from detections_pb2 import Detections, Detection
from data_sequence import DataSequence


def open_data_sequence(data_filepath):
    assert os.path.exists(data_filepath)
    the_data_sequence = DataSequence(data_filepath, Detections)

    def data_sequence_reader(data_sequence):
        while True:
            data = data_sequence.read()
            if data is None:
                # end of the sequence; under PEP 479 (Python 3.7+) a bare
                # 'raise StopIteration' inside a generator would be an error
                return
            yield data

    return data_sequence_reader(the_data_sequence)


def parse_arguments():
    parser = OptionParser()
    parser.description = \
        "This program takes a detections.data_sequence created by " \
        "./objects_detection and converts it into the Caltech dataset " \
        "evaluation format"
    parser.add_option("-i", "--input", dest="input_path",
                      metavar="FILE", type="string",
                      help="path to the .data_sequence file")
    parser.add_option("-o", "--output", dest="output_path",
                      metavar="FILE", type="string",
                      help="path of the txt file that will be created")
    (options, args) = parser.parse_args()
    print(options, args)

    if options.input_path:
        if not os.path.exists(options.input_path):
            parser.error("Could not find the input file")
    else:
        parser.error("'input' option is required to run this program")
    if not options.output_path:
        parser.error("'output' option is required to run this program")
    return options


def create_caltech_detections(detections_sequence, output_path):
    """Write all detections into a single txt file: for each image, the
    image name, the number of detections, then one 'x, y, w, h, score'
    line per detection."""
    text_file = open(output_path, "w")
    for detections in detections_sequence:
        text_file.write(os.path.splitext(detections.image_name)[0] + '\n')
        text_file.write(str(len(detections.detections)) + '\n')
        for detection in detections.detections:
            if detection.object_class != Detection.Pedestrian:
                continue
            box = detection.bounding_box
            min_x, min_y = box.min_corner.x, box.min_corner.y
            width = box.max_corner.x - box.min_corner.x
            height = box.max_corner.y - box.min_corner.y

            adjust_width = True
            if adjust_width:
                # in v3.0 they use 0.41 as the aspect ratio,
                # before v3.0 they used 0.43 (measured in their result files)
                # aspect_ratio = 0.41
                aspect_ratio = 0.43
                center_x = (box.max_corner.x + box.min_corner.x) / 2.0
                width = height * aspect_ratio
                min_x = center_x - (width / 2)

            # data is [x, y, w, h, score]
            detection_data = [min_x, min_y, width, height, detection.score]
            detection_line = ", ".join([str(x) for x in detection_data]) + "\n"
            text_file.write(detection_line)
    text_file.close()


def detections_to_caltech(input_path, output_path):
    detections_sequence = open_data_sequence(input_path)
    # convert the data sequence to the Caltech text format
    create_caltech_detections(detections_sequence, output_path)


def main():
    options = parse_arguments()
    detections_to_caltech(options.input_path, options.output_path)


if __name__ == "__main__":
    main()
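The width adjustment inside create_caltech_detections keeps the box center and height fixed and recomputes the width from the Caltech aspect ratio; as a standalone function (extracted here for illustration):

```python
def adjust_width_to_aspect(min_x, max_x, height, aspect_ratio=0.43):
    """Recompute the box width as height * aspect_ratio, keeping the
    horizontal center fixed (the adjustment done in
    create_caltech_detections)."""
    center_x = (max_x + min_x) / 2.0
    width = height * aspect_ratio
    new_min_x = center_x - width / 2
    return new_min_x, width

# a box spanning x in [10, 60] with height 100
min_x, width = adjust_width_to_aspect(10.0, 60.0, 100.0)
print(width)   # 43.0
print(min_x)   # 13.5
```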