Understanding the Joint Cascade Face Detection and Alignment (JDA) Face Detection Algorithm

Latest recommended article published on 2020-06-17 17:59:21

荪荪


Introduction to JDA

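JDA (Joint Cascade Face Detection and Alignment) [1] was, at the time of writing, one of the more advanced face detection algorithms. It combines cascade detection with alignment, for three reasons. First, alignment matters a great deal for downstream face recognition. Second, as the authors argue in Section 2, features sampled around the landmarks help the classifier separate faces from non-faces more accurately. Third, running the two tasks jointly lets each improve the other, and they share computation, which saves time.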

JDA training pipeline

Train the random forest (here a random classification/regression forest)
Extract the LBF (local binary features [2])
Run a global regression on the LBF
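The last two steps can be sketched as follows. This is a minimal illustration that assumes the LBF encoding described in [2]: every tree contributes a one-hot vector marking the leaf a sample falls into, the vectors are concatenated, and a linear map turns the result into a shape increment. The function names and the `weights` layout are hypothetical, and how the weights are learned is not shown.

```python
def lbf_active_indices(leaf_ids, leaves_per_tree):
    """Positions of the 1-entries in the concatenated one-hot (LBF) vector.

    leaf_ids[i] is the leaf reached in tree i; leaves_per_tree[i] is that
    tree's number of leaves. Both are assumed inputs for this sketch.
    """
    indices = []
    base = 0
    for leaf, n_leaves in zip(leaf_ids, leaves_per_tree):
        if not 0 <= leaf < n_leaves:
            raise ValueError("leaf id out of range")
        indices.append(base + leaf)
        base += n_leaves
    return indices


def global_regression(active, weights):
    """Apply a linear map to a binary vector.

    Because the LBF vector is 0/1, the product reduces to summing the
    weight rows of the active entries. weights[i] is the (hypothetical)
    row for LBF dimension i.
    """
    out = [0.0] * len(weights[0])
    for i in active:
        for j, w in enumerate(weights[i]):
            out[j] += w
    return out
```

Storing only the active indices keeps the feature sparse: a forest with many trees still produces exactly one active entry per tree.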

Training the random forest

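JDA as a whole is divided into T stages. In each stage, K mixed classification/regression decision trees are trained, and each tree regresses only one landmark. Every tree is trained the same way, so understanding how a single tree is trained is enough.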
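The layout of stages and trees can be sketched as a flat training plan. The round-robin assignment of trees to landmarks (`k % num_landmarks`) is an assumption made for illustration; the text above only says that each tree handles a single landmark. The function name is hypothetical.

```python
def build_tree_plan(num_stages, trees_per_stage, num_landmarks):
    """List (stage, tree_index, landmark_index) for every tree to train.

    Assumption: trees are assigned to landmarks round-robin within a stage.
    """
    plan = []
    for t in range(1, num_stages + 1):          # stages are numbered 1..T
        for k in range(trees_per_stage):        # K trees per stage
            plan.append((t, k, k % num_landmarks))
    return plan
```

For example, `build_tree_plan(2, 5, 3)` yields ten entries, with the fourth tree of stage 1 wrapping around to landmark 0.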

Training a single tree proceeds as follows:

Extract local features
Train one mixed classification/regression decision tree
Train the whole forest

Extracting local features

Section 4.2 of [1] states the feature extraction method explicitly:

Specifically, we generate three scales of images by down sampling the input image to half and one fourth. To generate a feature, we randomly choose an image scale, pick up two random facial points in the current shape, generate two random offsets with respect to the points and take the difference of the two offsetted pixels as the feature.

The "facial point" mentioned in the quote is a landmark, that is, a feature point on the face.

Features are therefore extracted as follows:

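Extract FN features (for example, 500) from each image; FN is the dimensionality of the resulting feature vector.
For each feature, randomly choose a scale (the original image, the half-size image, or the quarter-size image), then randomly choose two of the L landmarks. Around each chosen landmark, pick a random point within a given radius; the displacement from the landmark to that point is the offset.
Compute the feature value as the difference between the two offset pixels.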

In summary, each sample image yields a feature vector of dimension FN.
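The extraction steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: images are assumed to be grayscale lists of rows, downsampling is done by naive subsampling, offsets are stored in original-image pixels, coordinates are clamped at the image border, and all names (`PixelDiffFeature`, `generate_features`, `extract`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelDiffFeature:
    scale: int        # 0 = original, 1 = half size, 2 = quarter size
    landmark_a: int   # index of the first landmark
    landmark_b: int   # index of the second landmark
    offset_a: tuple   # (dx, dy) relative to landmark_a, original-image pixels
    offset_b: tuple   # (dx, dy) relative to landmark_b


def build_pyramid(image):
    """Original, half and quarter images; subsampling stands in for real resizing."""
    half = [row[::2] for row in image[::2]]
    quarter = [row[::2] for row in half[::2]]
    return [image, half, quarter]


def random_offset(rng, radius):
    """Rejection-sample an integer offset inside a disc of the given radius."""
    while True:
        dx, dy = rng.randint(-radius, radius), rng.randint(-radius, radius)
        if dx * dx + dy * dy <= radius * radius:
            return (dx, dy)


def generate_features(rng, fn, num_landmarks, radius):
    """Draw FN random feature definitions (scale, two landmarks, two offsets)."""
    features = []
    for _ in range(fn):
        a, b = rng.sample(range(num_landmarks), 2)
        features.append(PixelDiffFeature(rng.randrange(3), a, b,
                                         random_offset(rng, radius),
                                         random_offset(rng, radius)))
    return features


def pixel_at(pyramid, scale, x, y):
    """Read a pixel at original-image coordinates from the chosen scale, clamped."""
    img = pyramid[scale]
    col = min(max(int(x) // (1 << scale), 0), len(img[0]) - 1)
    row = min(max(int(y) // (1 << scale), 0), len(img) - 1)
    return img[row][col]


def evaluate(feature, pyramid, shape):
    """Difference of the two offset pixels; shape is a list of (x, y) landmarks."""
    xa, ya = shape[feature.landmark_a]
    xb, yb = shape[feature.landmark_b]
    pa = pixel_at(pyramid, feature.scale,
                  xa + feature.offset_a[0], ya + feature.offset_a[1])
    pb = pixel_at(pyramid, feature.scale,
                  xb + feature.offset_b[0], yb + feature.offset_b[1])
    return pa - pb


def extract(features, image, shape):
    """Return the FN-dimensional feature vector for one image and its current shape."""
    pyramid = build_pyramid(image)
    return [evaluate(f, pyramid, shape) for f in features]
```

Because the offsets are tied to the current shape, the same feature definitions sample different pixels as the shape estimate moves from stage to stage.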

Training one mixed classification/regression decision tree

Also in Section 4.2 of [1], the authors describe how the decision tree is trained:

In the split test of each internal node, we randomly choose to either minimize the binary entropy for classification (with probability $\rho$) or the variance of the facial point increments for regression (with probability $1-\rho$). Intuitively, the parameter $\rho$ should be larger in earlier stages to favor classification learning and reject easy negative samples more quickly. It should be smaller in later stages to favor regression learning and improve the alignment accuracy. We empirically make this parameter linearly decreasing with respect to the regression stage number t, that is, $\rho(t) = 1 - 0.1t,\ t = 1, \dots, T$.
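The split test described in the quote can be sketched as below. This is a simplified illustration under several assumptions: the classification probability is passed in as `rho`, thresholds are searched exhaustively over the observed feature values, labels are 1 for faces and 0 for non-faces, and the regression variance is computed over positive samples only (non-faces carry no landmark increment). The function names are hypothetical.

```python
import math
import random


def binary_entropy(labels):
    """Entropy in bits of a list of 0/1 labels (0.0 for an empty or pure list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def increment_variance(increments):
    """Variance of (dx, dy) landmark increments, summed over both coordinates."""
    if not increments:
        return 0.0
    n = len(increments)
    mx = sum(dx for dx, _ in increments) / n
    my = sum(dy for _, dy in increments) / n
    return sum((dx - mx) ** 2 + (dy - my) ** 2 for dx, dy in increments) / n


def weighted_cost(left, right, impurity):
    """Size-weighted average impurity of the two children."""
    total = len(left) + len(right)
    if total == 0:
        return 0.0
    return (len(left) * impurity(left) + len(right) * impurity(right)) / total


def choose_split(rng, rho, features, labels, increments):
    """Pick (mode, feature index, threshold) for one internal node.

    With probability rho the node minimizes binary entropy (classification);
    otherwise it minimizes increment variance (regression).
    """
    classify = rng.random() < rho
    best_cost, best_feature, best_threshold = math.inf, None, None
    for f in range(len(features[0])):
        for thr in sorted({row[f] for row in features}):
            left = [i for i, row in enumerate(features) if row[f] < thr]
            right = [i for i, row in enumerate(features) if row[f] >= thr]
            if not left or not right:
                continue  # a split must send samples to both children
            if classify:
                cost = weighted_cost([labels[i] for i in left],
                                     [labels[i] for i in right], binary_entropy)
            else:
                # Assumption: only positive (face) samples carry increments.
                cost = weighted_cost(
                    [increments[i] for i in left if labels[i] == 1],
                    [increments[i] for i in right if labels[i] == 1],
                    increment_variance)
            if cost < best_cost:
                best_cost, best_feature, best_threshold = cost, f, thr
    mode = "classification" if classify else "regression"
    return mode, best_feature, best_threshold
```

Setting `rho` close to 1 makes almost every node a classification node, which matches the quote's intuition for early stages; lowering it shifts nodes toward regression.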

[1] Dong Chen, Shaoqing Ren, Yichen Wei, Xudong Cao, Jian Sun. Joint Cascade Face Detection and Alignment. European Conference on Computer Vision (ECCV), 2014.

[2] Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun. Face Alignment at 3000 FPS via Regressing Local Binary Features. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1685-1692.