I. Overview
1.paper: Stacked Hourglass Networks for Human Pose Estimation
2.code: http://www-personal.umich.edu/~alnewell/pose
3.Time: 201603
4.task: human pose estimation
5.eval_dataset: FLIC and MPII benchmarks
6.Advantage:
- features from all scales
- consolidated to best capture the various spatial relationships associated with the body.
(The name presumably comes from the structure: stacked hourglasses, which it really does resemble, haha~)
II. Method
1) Overall network architecture
2) Details
Goal: capture information at every scale
(1) Structure of a single hourglass block (similar to U-Net)
Key property of the hourglass structure: symmetry
(2)Hourglass design detail
down_sample: Conv + max pooling
up_sample: nearest neighbor upsampling of the lower resolution, followed by an elementwise addition of the two sets of features.
Once upsampling reaches the network's output resolution, two consecutive 1x1 Conv layers follow and produce the final output.
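The down-sample / up-sample / add data flow described above can be sketched in plain Python on 2D lists. This is only an illustration of the symmetric structure, not the authors' implementation: the function names are made up for this note, the `layer` argument is a hypothetical stand-in for the conv/residual layers (identity by default), and the input height and width are assumed to be divisible by 2**depth.

```python
def max_pool_2x2(x):
    """Halve the resolution by taking the max over each 2x2 window."""
    h, w = len(x), len(x[0])
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


def upsample_nearest_2x(x):
    """Double the resolution by repeating each value in a 2x2 block."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add(a, b):
    """Elementwise addition of two feature maps of equal size."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def hourglass(x, depth, layer=lambda t: t):
    """Recursive hourglass data flow; `layer` is a placeholder (assumption)."""
    skip = layer(x)                    # branch kept at the current resolution
    low = layer(max_pool_2x2(x))       # down-sample to half resolution
    if depth > 1:
        low = hourglass(low, depth - 1, layer)  # nested, smaller hourglass
    low = layer(low)
    up = upsample_nearest_2x(low)      # nearest neighbor upsampling
    return add(skip, up)               # merge the two sets of features


print(hourglass([[1, 2], [3, 4]], depth=1))  # -> [[5, 6], [7, 8]]
```

Because every pooling step is mirrored by one upsampling step, the output always has the same size as the input, which is what makes the block stackable.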
(3) Layer implementation
a. Replacing one 5x5 Conv with two 3x3 Convs, together with Residual (skip connection), improves network performance.
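One commonly cited reason for the 3x3 substitution is that two stacked 3x3 convolutions cover the same 5x5 receptive field with fewer weights. The quick count below is a sketch to check that arithmetic; the channel width of 256 and the helper names are assumptions chosen for illustration, and biases are ignored.

```python
def conv_weights(kernel, in_ch, out_ch):
    """Weight count of a square conv layer, ignoring biases."""
    return kernel * kernel * in_ch * out_ch


def stacked_receptive_field(kernels):
    """Receptive field of stride-1 convs applied one after another."""
    rf = 1
    for k in kernels:
        rf += k - 1
    return rf


channels = 256  # assumed width, for illustration only

one_5x5 = conv_weights(5, channels, channels)
two_3x3 = 2 * conv_weights(3, channels, channels)

print(stacked_receptive_field([5]), one_5x5)     # -> 5 1638400
print(stacked_receptive_field([3, 3]), two_3x3)  # -> 5 1179648
```

At equal channel width the two 3x3 layers use 18/25 of the weights of the single 5x5 layer, and they add one extra nonlinearity between them.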
Residual Module used in the hourglass network:
b. Structure between two blocks
3) Loss function
- A loss is computed for every hourglass block; the output of one hourglass block is the input of the next hourglass block.
- MSE loss, the same as Tompson [15]
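The per-block loss can be sketched as follows. This is a minimal illustration rather than the authors' training code: it assumes each hourglass block outputs a heatmap of the same size, that every block is compared against the same ground-truth heatmap, and that the per-block losses are summed with equal weight. The function names are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two heatmaps given as 2D lists."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            count += 1
    return total / count


def intermediate_supervision_loss(stack_outputs, target):
    """Sum the MSE of every block's prediction against the same target."""
    return sum(mse(pred, target) for pred in stack_outputs)


target = [[0.0, 1.0], [0.0, 0.0]]   # toy ground-truth heatmap
block1 = [[0.0, 0.0], [0.0, 0.0]]   # first block misses the joint
block2 = [[0.0, 0.5], [0.0, 0.0]]   # second block is closer

print(intermediate_supervision_loss([block1, block2], target))  # -> 0.3125
```

Supervising every block means the early hourglasses also receive a direct training signal, instead of relying only on gradients passed back from the final output.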