【Face Detection】《Face Detection using Deep Learning: An Improved Faster RCNN Approach》

bryant_meng

Article link: https://blog.csdn.net/bryant_meng/article/details/104864105

Neurocomputing-2018

1 Background and Motivation

Better face detection benefits many subsequent face-related applications, such as face verification, face recognition, and face clustering.

Traditional face detection methods (e.g., Viola-Jones) rely on hand-crafted features, and each individual component is optimized separately rather than end to end, which often leaves the whole detection pipeline sub-optimal.

In recent years CNNs have delivered strong results across the major CV tasks, and as they became widespread, many researchers turned to deep learning for face detection.

Face detection can generally be viewed as a special type of object detection task, so existing methods are mostly built on the R-CNN pipeline.

The authors extend Faster R-CNN (the best method in the R-CNN series), apply several strategies on top of it, and take first place on the Face Detection Dataset and Benchmark (FDDB).

2 Advantages / Contributions

The paper proposes a new scheme for face detection by improving the Faster RCNN framework and takes first place on FDDB (the contribution is mostly on the engineering side).

3 Method

feature concatenation
hard negative mining
multi-scale training
Convert bbox to ellipses

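The model is trained on the WIDER FACE dataset, which is also used to generate the hard negatives. See the experiments section below for the full procedure.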

3.1 Feature Concatenation

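In Faster RCNN, RoI pooling is attached only to the last feature map, which may omit some features (deeper feature maps have larger receptive fields but coarser granularity).

The authors apply RoI pooling to feature maps from multiple stages, concatenate the pooled results (H and W should be identical after pooling), and then apply a 1x1 Conv to restore the original number of channels, so as to capture more fine-grained details of the RoIs.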
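The pooling, concatenation, and 1x1 convolution described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' Caffe implementation: feature maps are nested lists of shape C x H x W, and the helper names (`roi_max_pool`, `scale_roi`, `conv1x1`, `concat_roi_features`), the strides, and the random weights are all hypothetical.

```python
import random

def roi_max_pool(fmap, roi, out_size):
    """Max-pool one RoI on a C x H x W map to C x out_size x out_size.

    roi = (x0, y0, x1, y1) in feature-map cells, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    pooled = []
    for channel in fmap:
        grid = []
        for i in range(out_size):
            # Bin edges; the max(...) keeps every bin at least one cell tall.
            ys = y0 + (y1 - y0) * i // out_size
            ye = max(ys + 1, y0 + (y1 - y0) * (i + 1) // out_size)
            row = []
            for j in range(out_size):
                xs = x0 + (x1 - x0) * j // out_size
                xe = max(xs + 1, x0 + (x1 - x0) * (j + 1) // out_size)
                row.append(max(channel[y][x]
                               for y in range(ys, ye) for x in range(xs, xe)))
            grid.append(row)
        pooled.append(grid)
    return pooled

def scale_roi(roi, stride):
    """Map an image-space box to feature-map cells for a given stride."""
    x0, y0, x1, y1 = roi
    fx0, fy0 = x0 // stride, y0 // stride
    # Round the far edge up and keep at least one cell.
    fx1 = max(fx0 + 1, -(-x1 // stride))
    fy1 = max(fy0 + 1, -(-y1 // stride))
    return fx0, fy0, fx1, fy1

def conv1x1(feat, weights):
    """1x1 convolution: weights[o][c] mixes input channels at each position."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[sum(wt * feat[c][y][x] for c, wt in enumerate(w_row))
              for x in range(w)] for y in range(h)]
            for w_row in weights]

def concat_roi_features(stages, roi, out_size, weights):
    """stages: list of (feature_map, stride) pairs from different depths."""
    pooled = []
    for fmap, stride in stages:
        # Same out_size for every stage, so channels can simply be stacked.
        pooled += roi_max_pool(fmap, scale_roi(roi, stride), out_size)
    return conv1x1(pooled, weights)

def rand_map(c, h, w):
    return [[[random.random() for _ in range(w)] for _ in range(h)] for _ in range(c)]

random.seed(0)
# Two hypothetical stages of a 64x64 image: 2 channels at stride 4, 3 at stride 8.
stages = [(rand_map(2, 16, 16), 4), (rand_map(3, 8, 8), 8)]
# 2 + 3 = 5 concatenated channels are reduced back to 3 (the last stage's count).
weights = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(3)]
out = concat_roi_features(stages, (8, 8, 40, 48), out_size=2, weights=weights)
print(len(out), len(out[0]), len(out[0][0]))  # channels, height, width
```

Pooling every stage to the same output size is what makes the channel-wise concatenation possible; in a real detector the 1x1 weights would be learned rather than random.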

3.2 Hard Negative Mining

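The authors mix hard negative samples into the negative samples.

Hard negatives are the regions where the network has failed to make a correct prediction.

This happens when proposals are turned into RoIs, i.e., when preparing the training samples for the head (not at the anchor-to-proposal stage, which trains the RPN). The positive-to-negative ratio is 1:3 and the IoU threshold is 0.5.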
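A minimal sketch of this RoI sampling step, assuming proposals are labeled by their best IoU against the ground truth and that previously mined hard negatives fill the negative slots first. The function names, the `batch_size` value, and the toy boxes are illustrative assumptions, not values taken from the paper.

```python
import random

def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def sample_rois(proposals, gt_boxes, hard_negatives, batch_size, rng,
                iou_thresh=0.5):
    """Return (positive_indices, negative_indices) at a 1:3 ratio.

    hard_negatives is a set of proposal indices mined in an earlier pass.
    """
    best = [max((iou(p, g) for g in gt_boxes), default=0.0) for p in proposals]
    pos = [i for i, v in enumerate(best) if v >= iou_thresh]
    neg = [i for i, v in enumerate(best) if v < iou_thresh]

    n_pos = min(len(pos), batch_size // 4)   # one quarter positives
    n_neg = min(len(neg), 3 * n_pos)         # three negatives per positive

    hard = [i for i in neg if i in hard_negatives]
    easy = [i for i in neg if i not in hard_negatives]
    rng.shuffle(hard)
    rng.shuffle(easy)
    # Hard negatives come first so they are always picked while slots remain.
    return rng.sample(pos, n_pos), (hard + easy)[:n_neg]

gt = [(0, 0, 10, 10)]
proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60),
             (70, 70, 80, 80), (20, 20, 30, 30), (90, 90, 99, 99),
             (40, 0, 50, 10)]
pos_idx, neg_idx = sample_rois(proposals, gt, hard_negatives={5},
                               batch_size=4, rng=random.Random(0))
print(pos_idx, neg_idx)
```

With `batch_size=4` the sketch picks one positive and three negatives, and proposal 5 (the assumed hard negative) is always among the negatives.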

3.3 Multi-Scale Training

Randomly assign one of three scales to each image before it is fed into the network.

The shorter side is resized to one of 480, 600, or 750 pixels, and the longer side is capped at 1200.
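The scale assignment can be sketched as follows. The three shorter-side values and the 1200 cap come from the text; the function name `pick_scale` and the rule for shrinking the scale when the cap is hit are assumptions modeled on common Faster R-CNN preprocessing.

```python
import random

SHORT_SIDES = (480, 600, 750)  # candidate shorter-side lengths from the text
MAX_LONG_SIDE = 1200           # cap on the longer side

def pick_scale(width, height, rng):
    """Return (scale, new_width, new_height) for one training image."""
    target = rng.choice(SHORT_SIDES)
    short, long_side = min(width, height), max(width, height)
    scale = target / short
    # If the longer side would exceed the cap, shrink the scale to fit.
    if scale * long_side > MAX_LONG_SIDE:
        scale = MAX_LONG_SIDE / long_side
    return scale, round(width * scale), round(height * scale)

rng = random.Random(0)
for _ in range(3):
    print(pick_scale(1024, 768, rng))
# A very wide image: the cap wins regardless of the sampled target.
print(pick_scale(3000, 1000, rng))
```

Because the scale is drawn per image, the same image can be seen at different resolutions across epochs.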

Multi-scale training is something I have unfortunately not tried in practice yet.

4 Experiments

Caffe, VGG-16, Faster R-CNN

4.1 Datasets

FDDB face detection benchmark: 5,171 faces in 2,845 images
WIDER FACE (larger-scale face data than FDDB), including various detection challenges such as occlusions, difficult poses, and low-resolution and out-of-focus faces.

4.2 Experimental Setup

Step 1: use the WIDER FACE training and validation sets as training data to train VGG16 + Faster RCNN.

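Each face is scored with the system in the table below (a normal face scores 0). Faces scoring above 2 are discarded, as are images with more than 1,000 annotations.

(Scoring table image not reproduced here.)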
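A small sketch of this filtering rule, under two assumptions: the per-face score is already computed (the scoring table itself is not available here, so the values below are made up), and the score threshold is applied per face annotation rather than per image. The name `filter_dataset` is hypothetical.

```python
MAX_SCORE = 2            # faces scoring above this are ignored
MAX_ANNOTATIONS = 1000   # images with more annotations are dropped

def filter_dataset(images):
    """images: dict of image_id -> list of (box, score). Returns a filtered dict."""
    kept = {}
    for image_id, faces in images.items():
        if len(faces) > MAX_ANNOTATIONS:
            continue  # drop overcrowded images entirely
        kept[image_id] = [(box, s) for box, s in faces if s <= MAX_SCORE]
    return kept

# Toy data with made-up scores.
images = {
    "a.jpg": [((0, 0, 10, 10), 0), ((5, 5, 20, 20), 2.5)],
    "crowd.jpg": [((0, 0, 1, 1), 0)] * 1001,
}
kept = filter_dataset(images)
print({k: len(v) for k, v in kept.items()})
```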
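Step 2: run the trained model over the WIDER FACE dataset. Proposals with a score above 0.8 and an IoU below 0.5 against the ground truth are treated as hard negatives. Then train for 100,000 iterations at a fixed learning rate as the hard negative mining procedure, making sure each time that the hard negatives selected in the previous pass are sampled as RoIs.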
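The selection rule for hard negatives can be sketched as below. The 0.8 and 0.5 thresholds come from the text; the function names and the toy detections are illustrative, and the IoU is assumed to be the best overlap with any ground-truth box.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def mine_hard_negatives(detections, gt_boxes, score_thresh=0.8, iou_thresh=0.5):
    """detections: list of (box, score). Returns confident boxes that miss every face."""
    hard = []
    for box, score in detections:
        best_iou = max((iou(box, g) for g in gt_boxes), default=0.0)
        if score > score_thresh and best_iou < iou_thresh:
            hard.append(box)
    return hard

gt = [(10, 10, 50, 50)]
detections = [
    ((12, 12, 52, 52), 0.95),      # overlaps the face well: true positive
    ((100, 100, 140, 140), 0.90),  # confident, no overlap: hard negative
    ((200, 200, 240, 240), 0.30),  # low score: ignored
    ((30, 30, 70, 70), 0.85),      # confident, poor overlap: hard negative
]
hard = mine_hard_negatives(detections, gt)
print(hard)
```

The mined boxes would then be fed back into RoI sampling so that each fine-tuning pass is guaranteed to see them as negatives.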