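From a quick search: reading notes on face detection and recognition papers
Detection papers tend to be full of details, so only the main ideas are summarized here.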
Joint Training of Cascaded CNN for Face Detection
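Advantages of a cascade:
handle unbalanced distribution of negative and positive samples. In the early stages, weak classifiers can reject most false negatives. In the later stages, stronger classifiers can save computation with less proposals. Faster R-CNN is one example.
The three-stage structure here is similar to cascaded CNN, except that localization uses bbox regression instead of a bounding-box calibration network. See 4.3. Testing pipeline: the input to each stage is cropped from the original image using the boxes that passed the threshold in the previous stage. In effect, the first stage decides at a coarse granularity whether a face is present and roughly where it is, then the ROI is cropped out and later stages detect and localize at a finer granularity (do not confuse this with an image pyramid; the design goals differ). The second stage performs hard negative sample mining and the third performs harder negative mining, so the stages deal with progressively more difficult samples.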
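The testing pipeline described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: `crop`, `run_cascade`, `mean_brightness`, and the `(score_fn, threshold)` stage format are hypothetical names, images are plain nested lists, and the bbox regression step is left out.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in original-image pixels
Image = List[List[float]]         # toy grayscale image as nested lists
Stage = Tuple[Callable[[Image], float], float]  # (score_fn, threshold), hypothetical format


def crop(image: Image, box: Box) -> Image:
    """Cut the box out of the ORIGINAL image (not out of a previous crop)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def run_cascade(image: Image, proposals: Sequence[Box], stages: Sequence[Stage]) -> List[Box]:
    """Pass boxes through the stages in order, keeping only those above each threshold."""
    boxes = list(proposals)
    for score_fn, threshold in stages:
        # Each stage only sees the boxes that survived the previous stage.
        boxes = [b for b in boxes if score_fn(crop(image, b)) >= threshold]
    return boxes


def mean_brightness(patch: Image) -> float:
    """Stand-in for a CNN face score: average pixel value of the patch."""
    pixels = [v for row in patch for v in row]
    return sum(pixels) / len(pixels)


# 8x8 image with a bright 4x4 "face" in the middle.
image = [[1.0 if 2 <= x < 6 and 2 <= y < 6 else 0.0 for x in range(8)] for y in range(8)]
proposals = [(0, 0, 4, 4), (2, 2, 6, 6), (4, 4, 8, 8)]
stages = [(mean_brightness, 0.2), (mean_brightness, 0.9)]  # loose first, strict later
print(run_cascade(image, proposals, stages))  # [(2, 2, 6, 6)]
```

In a real detector each stage would be a different network of increasing capacity, and the surviving boxes would also be refined before the next crop.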
Also note how the training procedure is designed:
The principle is to make the threshold as high as possible while keeping the recall, so as to reject as many proposals as possible in the earlier stages.
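One way to read this principle: for each stage, take the scores that stage assigns to known faces and choose the largest threshold that still keeps the required fraction of them. The sketch below assumes exactly that; `pick_threshold` and `target_recall` are hypothetical names, and the paper's actual procedure may differ.

```python
import math
from typing import Sequence


def pick_threshold(positive_scores: Sequence[float], target_recall: float) -> float:
    """Largest threshold t such that at least target_recall of the positives score >= t."""
    if not positive_scores:
        raise ValueError("need at least one positive score")
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    ranked = sorted(positive_scores, reverse=True)
    # Number of positives that must survive this stage.
    keep = math.ceil(target_recall * len(ranked))
    # The score of the last positive we must keep is the highest usable threshold.
    return ranked[keep - 1]


scores = [0.9, 0.1, 0.5, 0.7]        # toy face scores at one stage
print(pick_threshold(scores, 0.75))  # 0.5
```

Any proposal scoring below the returned value is rejected at that stage, so a higher threshold means fewer boxes reach the more expensive later stages.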
The paper handles face scale variation with an image pyramid; each pyramid level is fed in as the input shown in the figure above.
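To make the cost of an image pyramid concrete, here is a minimal sketch that lists the resize factors a fixed-size detector window would need. The function name `pyramid_scales` and the parameters `window`, `min_face`, and `factor` are assumptions for illustration, not values from the paper.

```python
from typing import List, Tuple


def pyramid_scales(image_size: Tuple[int, int], window: int = 12,
                   min_face: int = 24, factor: float = 0.5 ** 0.5) -> List[float]:
    """Resize factors such that faces from min_face pixels upward map onto the window size."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be in (0, 1)")
    scales = []
    scale = window / min_face            # smallest face shrinks to exactly one window
    side = min(image_size) * scale       # shorter image side after resizing
    while side >= window:                # stop once the window no longer fits
        scales.append(scale)
        scale *= factor
        side *= factor
    return scales


print(pyramid_scales((96, 96), window=12, min_face=24, factor=0.5))  # [0.5, 0.25, 0.125]
```

Every entry in the list is one full forward pass of the cascade, which is the overhead the next paper tries to avoid.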
Scale-Aware Face Detection
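How CNNs usually handle multi-scale face detection:
either fitting a large single model to faces across a large scale range or multi-scale testing (such as the image pyramid in the previous paper).
Both options add considerable computation. The idea in this paper is to first use a CNN to estimate the distribution of face scales in the image, and then zoom the image in or out accordingly.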
In this way, the face detection procedure can be divided into face scale estimation and single scale detection.
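A rough sketch of the hand-off between the two steps: pick the peaks of an estimated scale histogram, then turn each peak into a zoom factor that brings faces of that size to the size the single-scale detector expects. The bin layout (`log2_min`, `bin_width`), `target_size`, `min_prob`, and both function names are assumptions made for this example.

```python
from typing import List, Sequence


def scale_proposals(hist: Sequence[float], min_prob: float = 0.3) -> List[int]:
    """Indices of histogram bins that are local maxima with value >= min_prob."""
    peaks = []
    for i, p in enumerate(hist):
        left = hist[i - 1] if i > 0 else float("-inf")
        right = hist[i + 1] if i + 1 < len(hist) else float("-inf")
        # ">=" on the left and ">" on the right picks one bin per plateau.
        if p >= min_prob and p >= left and p > right:
            peaks.append(i)
    return peaks


def zoom_for_bin(bin_index: int, log2_min: float = 3.0,
                 bin_width: float = 0.5, target_size: float = 64.0) -> float:
    """Zoom factor that maps the face size of a bin onto the detector's target size."""
    face_size = 2.0 ** (log2_min + bin_index * bin_width)  # bins are uniform in log2(size)
    return target_size / face_size


hist = [0.0, 0.1, 0.8, 0.2, 0.0, 0.4, 0.9, 0.3]    # toy SPN output
peaks = scale_proposals(hist)
print(peaks)                                        # [2, 6]
print([zoom_for_bin(i) for i in peaks])             # [4.0, 1.0]
```

With two peaks the detector runs twice, once per zoom, instead of once per pyramid level.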
As shown in the figure:
Stage 1: Scale Proposal Network (SPN):
Note how the ground-truth histogram for the SPN is generated.
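These notes do not spell out the construction, so the following is only one plausible way to build such a target, not necessarily the paper's formulation: place a Gaussian bump at the log2 size of every annotated face and take the per-bin maximum. The bin layout, `sigma`, and the name `ground_truth_histogram` are all assumptions.

```python
import math
from typing import List, Sequence


def ground_truth_histogram(face_sizes: Sequence[float], num_bins: int = 12,
                           log2_min: float = 3.0, bin_width: float = 0.5,
                           sigma: float = 0.4) -> List[float]:
    """Soft target over log2(face size) bins; each face contributes a Gaussian bump."""
    hist = [0.0] * num_bins
    for size in face_sizes:
        center = math.log2(size)                      # face position on the log2 axis
        for i in range(num_bins):
            bin_center = log2_min + i * bin_width
            value = math.exp(-((bin_center - center) ** 2) / (2.0 * sigma ** 2))
            hist[i] = max(hist[i], value)             # max keeps every value in [0, 1]
    return hist


hist = ground_truth_histogram([16, 64])               # two faces, 16 px and 64 px
print([round(v, 2) for v in hist])
```

Smoothing the target this way tolerates small errors in the annotated box size, since neighbouring bins also receive a non-zero label.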
Stage 2: detector (RPN)
Since the face size variation is already handled in the first stage, in this stage, we only use an RPN with one anchor. The largest detectable face size is set to be twice the size of the smallest detectable face. This configuration is enough to achieve high accuracy while keeping average zooms per image low and the RPN computationally cheap. The RPN we use is called Single-Scale RPN, since it has only one anchor and has a narrow face size coverage.
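Because the detector covers a factor-of-two range of face sizes, a small number of zooms is enough when the faces in an image cluster around a few scales. The greedy plan below is a sketch under that assumption only; `plan_zooms` and `detect_min` (the smallest detectable face size, in pixels) are hypothetical names and values.

```python
from typing import List, Sequence


def plan_zooms(face_sizes: Sequence[float], detect_min: float = 32.0) -> List[float]:
    """Greedy zoom factors so every face lands in [detect_min, 2 * detect_min] after zooming."""
    zooms = []
    covered_up_to = 0.0
    for size in sorted(face_sizes):
        if size <= 0:
            raise ValueError("face sizes must be positive")
        if size > covered_up_to:
            # Start a new zoom at the smallest uncovered face; it also covers sizes up to 2x.
            zooms.append(detect_min / size)
            covered_up_to = 2.0 * size
    return zooms


print(plan_zooms([16, 20, 30, 64, 100]))  # [2.0, 0.5]
```

Five faces spanning 16 to 100 pixels need two detector passes here, which is what keeps the average zooms per image low.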
Possible future extensions
The proposed method can also be applied to general object detection problems. Moreover, the SPN is essentially a weakly-supervised detector, which could be used to generate coarse region proposals and further improves speed. SPN can also share convolution layers with RPN to further reduce model size.
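An open question: could a cascade still be used here to improve efficiency? Other detection speed-up methods seen so far: reducing the number of proposals, YOLO and similar one-stage detectors, R-FCN, and Light-Head R-CNN.
Also, scale came up a lot in these two papers, e.g. the image pyramid and the SPN. It is worth thinking about them together with the FPN paper: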