Machine Vision Technology: Lecture 10 Object Detection
- Introduction of object detection
- Face Detection
- Pedestrian Detection
Object Detection and Challenges
Object detection design challenges:
- How to efficiently search for likely objects
  - Even simple models require searching hundreds of thousands of positions and scales.
- Feature design and scoring
  - How should appearance be modeled?
  - What features correspond to the object?
- How to deal with different viewpoints
  - Often, separate models are trained for a few different viewpoints.
Face Detection
Challenges of face detection:
- Sliding window = tens of thousands of location/scale evaluations
  - A megapixel image has about $10^6$ pixels and a comparable number of candidate face locations.
- Faces are rare: 0–10 per image
  - For computational efficiency, spend as little time as possible on the non-face windows.
  - For a 1 Mpix image, to avoid having a false positive in every image, the false positive rate has to be less than $10^{-6}$.
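To get a feel for the numbers above, here is a minimal sketch that counts sliding-window evaluations over all positions and scales. The 24-pixel base window, the stride of 4, and the scale step of 1.25 are illustrative assumptions, not values from the lecture.

```python
def count_windows(width, height, window=24, stride=4, scale_step=1.25):
    """Count sliding-window evaluations over all positions and scales.

    All default parameters are assumptions chosen for illustration.
    """
    total = 0
    size = float(window)
    while size <= min(width, height):
        s = int(size)
        # Let the stride grow in proportion to the window size.
        step = max(1, int(stride * size / window))
        nx = (width - s) // step + 1   # horizontal positions
        ny = (height - s) // step + 1  # vertical positions
        total += nx * ny
        size *= scale_step
    return total


n = count_windows(1000, 1000)
print("windows per 1 Mpix image:", n)
# To expect fewer than one false positive per image, the per-window
# false positive rate must stay below 1 / n.
print("required per-window false positive rate <", 1.0 / n)
```

With a stride of 1 the count approaches the number of pixels per scale, which is where the $10^{-6}$ bound comes from.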
Sliding Window Face Detection with Viola-Jones 2001
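Viola-Jones uses boosting, a machine learning algorithm. A brief introduction to boosting:
- A simple algorithm for learning robust classifiers
- Provides an efficient algorithm for sparse visual feature selection
- Easy to implement; does not require external optimization tools
1. Find a classifier $h_i(x)$ whose accuracy is greater than 0.5.
2. Increase the weights of the misclassified examples.
3. Repeat steps 1-2.
Combining several such classifiers gives the final classifier: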
$$h(x) = \alpha_1 h_1(x) + \alpha_2 h_2(x) + \alpha_3 h_3(x) + \cdots$$
where $h(x)$ is the strong classifier, $h_i(x)$ is a weak classifier, $x$ is the feature vector, and $\alpha_i$ is the weight.
Each weak classifier:
$$h_j(x) = \begin{cases} 1 & \text{if } f_j(x) > \theta_j \\ 0 & \text{otherwise} \end{cases}$$
where $f_j(x)$ is the value of a rectangle feature and $\theta_j$ is the threshold, as shown in the figure below.
So the final strong classifier is:
$$h(x) = \begin{cases} 1 & \sum\limits_{t=1}^{T} \alpha_t h_t(x) > \frac{1}{2} \sum\limits_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise} \end{cases}$$
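The boosting loop and the strong classifier above can be sketched as follows. This is a minimal sketch, not the paper's implementation: it assumes the discrete AdaBoost variant with labels in {0, 1}, searches thresholds exhaustively, and adds a polarity `p` so a stump can also fire on `f < theta` (the notes only show `f > theta`). The function names and the toy data are made up for illustration.

```python
import numpy as np


def train_adaboost(F, y, T=3):
    """F: (n_samples, n_features) array of feature values f_j(x).
    y: labels in {0, 1}. Returns a list of (j, theta, p, alpha)."""
    n, m = F.shape
    w = np.full(n, 1.0 / n)  # start with uniform example weights
    stumps = []
    for _ in range(T):
        w = w / w.sum()  # normalize weights
        best = None
        # Step 1: pick the stump with the lowest weighted error.
        for j in range(m):
            for theta in np.unique(F[:, j]):
                for p in (1, -1):  # polarity is an assumption, see lead-in
                    pred = (p * F[:, j] > p * theta).astype(int)
                    err = np.sum(w * (pred != y))
                    if best is None or err < best[0]:
                        best = (err, j, theta, p, pred)
        err, j, theta, p, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        beta = err / (1.0 - err)
        alpha = np.log(1.0 / beta)
        # Step 2: shrink the weights of correctly classified examples, so
        # misclassified ones weigh relatively more after normalization.
        w = w * np.where(pred == y, beta, 1.0)
        stumps.append((j, theta, p, alpha))
    return stumps


def predict(stumps, F):
    """Strong classifier: 1 if sum(alpha_t h_t(x)) > 0.5 * sum(alpha_t)."""
    score = np.zeros(F.shape[0])
    total = 0.0
    for j, theta, p, alpha in stumps:
        score += alpha * (p * F[:, j] > p * theta)
        total += alpha
    return (score > 0.5 * total).astype(int)


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
F = np.array([[1, 5], [2, 1], [3, 4], [4, 2]], dtype=float)
y = np.array([0, 0, 1, 1])
stumps = train_adaboost(F, y, T=3)
print(predict(stumps, F))
```

In the face detector each column of `F` would hold one rectangle feature evaluated on every training window, so choosing a stump also chooses a feature.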
Viola & Jones algorithm:
- A "paradigmatic" method for real-time object detection
- Training is slow, but detection is very fast
- Key ideas:
  - Integral images for fast feature evaluation
  - Boosting for feature selection
  - Attentional cascade for fast rejection of non-face windows, so that less time is spent on non-face windows
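The attentional cascade idea can be sketched as a chain of stages ordered from cheap to expensive, where a window is dropped as soon as any stage rejects it. The function name, the stage scores, and the thresholds below are hypothetical toy values, not taken from the lecture or the paper.

```python
def cascade_classify(window, stages):
    """Run a window through the stages in order.

    stages: list of (score_fn, threshold), cheap stages first.
    Returns (is_face, number_of_stages_evaluated).
    """
    evaluated = 0
    for score_fn, threshold in stages:
        evaluated += 1
        if score_fn(window) < threshold:
            return False, evaluated  # rejected early, later stages skipped
    return True, evaluated


# Toy stages: each "window" is a dict of precomputed stage scores.
stages = [
    (lambda w: w["cheap"], 0.5),
    (lambda w: w["medium"], 0.5),
    (lambda w: w["expensive"], 0.5),
]
background = {"cheap": 0.1, "medium": 0.9, "expensive": 0.9}
face = {"cheap": 0.9, "medium": 0.8, "expensive": 0.7}

print(cascade_classify(background, stages))  # (False, 1)
print(cascade_classify(face, stages))        # (True, 3)
```

Because nearly all windows are background, most of them exit after the first stage, which keeps the average cost per window low.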
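See the paper for details; I did not fully follow this part of the lecture. The integral image is somewhat like the rectangle formula for a two-dimensional joint distribution function.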
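A small sketch of the integral image makes the analogy concrete: the sum over any rectangle comes from four lookups, just as a rectangle probability comes from four values of a joint distribution function. The helper names and the left-minus-right sign convention for the two-rectangle feature are assumptions made for illustration.

```python
import numpy as np


def integral_image(img):
    """ii[y, x] holds the sum of img[:y, :x].

    A zero row and column are padded on the top and left so that
    rectangle sums need no boundary checks.
    """
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]


def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle feature: left half minus right half (assumed sign)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


img = np.arange(16).reshape(4, 4)  # toy 4x4 "image" with values 0..15
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))          # 30, same as img[1:3, 1:3].sum()
print(two_rect_feature(ii, 0, 0, 4, 4))  # -16
```

Once `ii` is built, the cost of a rectangle feature is constant regardless of its size, which is what makes evaluating many features per window affordable.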
Pedestrian Detection
Histograms of oriented gradients for human detection 2005
HoG Feature.