The Chinese University of Hong Kong (CUHK): pose

http://www.ee.cuhk.edu.hk/~xgwang/projectpage_structured_feature_pose.html



Structured Feature Learning for Pose Estimation



Abstract

In this paper, we propose a structured feature learning framework to reason the correlations among body joints at the feature level in human pose estimation. Different from existing approaches of modeling structures on score maps or predicted labels, feature maps preserve substantially richer descriptions of body joints. The relationships between feature maps of joints are captured with the introduced geometrical transform kernels, which can be easily implemented with a convolution layer. Features and their relationships are jointly learned in an end-to-end learning system. A bi-directional tree structured model is proposed, so that the feature channels at a body joint can well receive information from other joints. The proposed framework improves feature learning substantially. With very simple post processing, it reaches the best mean PCP on the LSP and FLIC datasets. Compared with the baseline of learning features at each joint separately with ConvNet, the mean PCP has been improved by 18% on FLIC.

How to cite

@inproceedings{chu2016structure,
	  	title 		={Structured Feature Learning for Pose Estimation},
	  	author 		={Chu, Xiao and Ouyang, Wanli and Li, Hongsheng and Wang, Xiaogang},
	  	booktitle 	={CVPR},
	  	year 		={2016},
	}
	

Predictions & Codes


Our paper is now available: [Paper]

Our results are ready for download: [Predictions]

The proto files are released: [Proto Files]

The full code is now available: [Full code]

Steps to train your own model:
  1. Build Caffe
  2. Data preparation: run "Data_prepare.m" in MATLAB, then run "ConvertLMDB.sh" to generate the LMDB data
  3. Model training: run "Baseline.sh"
  4. Model testing: select one model, write its directory in "TestModel.m", then run "TestModel.m"
For more qualitative results, please refer to the supplementary material: [Supplementary]

Experimental Results


Comparison of strict PCP results on the Leeds Sport Pose (LSP) Dataset using Observer-Centric (OC) annotations.

Method                       | Torso | Head | Upper Arms | Lower Arms | Upper Legs | Lower Legs | Mean
Ours                         | 95.4  | 89.6 | 77.0       | 65.2       | 87.6       | 83.2       | 81.1
Xianjie Chen et al., NIPS'14 | 92.7  | 87.8 | 69.2       | 55.4       | 82.9       | 77.0       | 75.0
Pishchulin et al., ICCV'13   | 88.7  | 85.6 | 61.5       | 44.9       | 78.8       | 73.4       | 69.2
Ouyang et al., CVPR'14       | 85.8  | 83.1 | 63.3       | 46.6       | 76.5       | 72.2       | 68.6
Ramakrishna et al., ECCV'14  | 88.1  | 80.9 | 62.3       | 39.1       | 78.9       | 73.4       | 67.6
Eichner&Ferrari, ACCV'12     | 86.2  | 80.1 | 56.5       | 37.4       | 74.3       | 69.3       | 64.3
Pishchulin et al., CVPR'13   | 87.5  | 78.1 | 54.2       | 33.9       | 75.7       | 68.0       | 62.9
Yang&Ramanan, CVPR'11        | 84.1  | 77.1 | 52.5       | 35.9       | 69.5       | 65.6       | 60.8
Kiefel&Gehler, ECCV'14       | 84.4  | 78.4 | 53.3       | 27.4       | 74.4       | 67.1       | 60.7
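
The Mean column of the LSP table appears consistent with an average over ten limbs rather than over six columns. The sketch below checks this for the "Ours" row. The limb counts in `PART_COUNTS` (one torso, one head, and a left/right pair for each arm and leg part) are an assumption inferred from the numbers, not something stated on this page, and `weighted_mean_pcp` is a hypothetical helper name.

```python
# Assumed number of limbs behind each column: one torso, one head,
# and a left/right pair for each arm and leg part (an inference, not from the page).
PART_COUNTS = {
    "Torso": 1,
    "Head": 1,
    "Upper Arms": 2,
    "Lower Arms": 2,
    "Upper Legs": 2,
    "Lower Legs": 2,
}


def weighted_mean_pcp(row):
    """Average the per-part PCP values, weighting each column by its limb count."""
    total = sum(PART_COUNTS[part] * row[part] for part in PART_COUNTS)
    return total / sum(PART_COUNTS.values())


# "Ours" row of the LSP table.
ours = {
    "Torso": 95.4,
    "Head": 89.6,
    "Upper Arms": 77.0,
    "Lower Arms": 65.2,
    "Upper Legs": 87.6,
    "Lower Legs": 83.2,
}

ours_mean = weighted_mean_pcp(ours)
print(round(ours_mean, 1))  # compare with the Mean column (81.1)
```

A plain average of the six columns would give about 83.0 for the same row, which does not match the table, so the limb-weighted reading seems the more plausible one.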

Comparison of strict PCP results on the Frames Labeled In Cinema (FLIC) Dataset using Observer-Centric (OC) annotations.

Method                       | Upper Arms | Lower Arms | Mean
Ours                         | 97.9       | 92.4       | 95.2
Xianjie Chen et al., NIPS'14 | 97.0       | 86.8       | 91.9
Tompson et al., NIPS'14      | 93.7       | 80.9       | 87.3
MODEC, CVPR'13               | 84.4       | 52.1       | 68.3

Comparison of PDJ curves of elbows and wrists on the Frames Labeled In Cinema (FLIC) Dataset using Observer-Centric (OC) annotations. The compared curves are for Xianjie Chen et al., NIPS'14; Tompson et al., NIPS'14; DeepPose, CVPR'14; and MODEC, CVPR'13.

[Figure: FLIC PDJ curves]
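
For readers unfamiliar with PDJ (percentage of detected joints), the following is a minimal sketch of how one point on such a curve can be computed. It assumes the common definition in which a joint counts as detected when its predicted location lies within a given fraction of the torso diameter from the ground truth. The function name `pdj`, the toy coordinates, and the torso diameters are illustrative assumptions, not the authors' evaluation code.

```python
import math


def pdj(pred, gt, torso_diameters, threshold):
    """Fraction of joints predicted within threshold * torso diameter of ground truth.

    pred, gt: lists of (x, y) positions, one entry per annotated joint.
    torso_diameters: torso diameter of the person each joint belongs to.
    """
    hits = 0
    for (px, py), (gx, gy), diameter in zip(pred, gt, torso_diameters):
        if math.hypot(px - gx, py - gy) <= threshold * diameter:
            hits += 1
    return hits / len(gt)


# Toy data (hypothetical): four wrist annotations, torso diameter 50 px each.
gt = [(10, 10), (20, 20), (30, 30), (40, 40)]
pred = [(10, 11), (20, 24), (30, 30), (52, 40)]  # errors of 1, 4, 0 and 12 px
torso = [50, 50, 50, 50]

# Sweeping the threshold gives one PDJ curve.
curve = [pdj(pred, gt, torso, t) for t in (0.05, 0.1, 0.2, 0.25)]
print(curve)
```

Plotting `curve` against the thresholds yields a monotonically non-decreasing line of the kind compared in the figure.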


Our method


1. Motivation
Independent prediction of body joint locations from appearance score maps can be refined by modeling the spatial relationships among correlated body joints. On score maps, however, the information at a location is summarized into a single probability value, and detailed information indicating the attributes of the body joint is missing. This information is valuable, and its absence makes structural learning among body joints much less effective. We observe that these types of information are well preserved at the feature level, as shown in Fig. 2.

Figure 2. Examples of response maps of different images to the same feature channels. (a) A feature channel for the neck. (b) A feature channel for the left wrist. (c) A feature channel for the left lower arm.


2. Geometrical transform kernels
We show that under a fully convolutional neural network (FCN), messages can be passed between feature maps through the introduced geometrical transform kernels. The FCN filters and the kernels can be jointly learned. Convolution with asymmetric kernels can geometrically shift the feature responses, as illustrated in Fig. 3: (a) is a feature map assumed to follow a Gaussian distribution; (b) shows different kernels for illustration; (c) shows the transformed feature maps after convolution. The feature map is shifted in different directions, and the shifted maps sum up to different values.

Figure 3.
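
The shifting effect can be reproduced with a toy example. The sketch below is illustrative only: it uses a hand-written zero-padded cross-correlation (the operation that convolution layers in most deep-learning frameworks compute), a 7x7 Gaussian bump as the feature map, and three hand-picked 3x3 kernels. In the proposed framework the kernels are learned, so the specific kernels, sizes, and helper names here are assumptions.

```python
import math


def correlate2d_same(feature, kernel):
    """Zero-padded 2D cross-correlation that keeps the input size."""
    h, w = len(feature), len(feature[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * feature[yy][xx]
            out[y][x] = acc
    return out


def argmax2d(m):
    """(row, col) of the largest entry."""
    return max((v, (y, x)) for y, row in enumerate(m) for x, v in enumerate(row))[1]


# (a) Feature map: Gaussian bump peaking at row 3, column 3.
feature = [[math.exp(-((y - 3) ** 2 + (x - 3) ** 2) / 2.0) for x in range(7)]
           for y in range(7)]

# (b) Asymmetric kernels: a single off-centre weight.
k_right = [[0, 0, 0], [1.0, 0, 0], [0, 0, 0]]  # weight on the left  -> response moves right
k_down = [[0, 1.0, 0], [0, 0, 0], [0, 0, 0]]   # weight on the top   -> response moves down
k_left = [[0, 0, 0], [0, 0, 0.5], [0, 0, 0]]   # weight on the right -> response moves left, halved

# (c) Transformed feature maps.
shifted_right = correlate2d_same(feature, k_right)
shifted_down = correlate2d_same(feature, k_down)
shifted_left = correlate2d_same(feature, k_left)

print(argmax2d(feature), argmax2d(shifted_right), argmax2d(shifted_down), argmax2d(shifted_left))
```

The peak moves one pixel in a direction determined by where the kernel's weight sits, and the kernel's magnitude rescales the response, which is why the transformed maps sum up to different values.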


3. Feature update
We propose to build dependencies at the feature level. The process by which one set of feature maps influences another is shown in Fig. 4. These kernels can be implemented with convolution, and the relationships can be learned in an end-to-end learning system.

Fig. 4. An example of updating feature maps by passing information between joints
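
A one-dimensional toy version of such an update is sketched below. It assumes the update takes the form "target features plus transformed source features, followed by a ReLU"; this additive form, the joint names, the hand-set kernel, and the helper names are assumptions for illustration and should not be read as the exact formulation used in the paper.

```python
def correlate1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that keeps the input length."""
    half = len(kernel) // 2
    n = len(signal)
    return [
        sum(kernel[j] * signal[i + j - half]
            for j in range(len(kernel)) if 0 <= i + j - half < n)
        for i in range(n)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def update_feature(target, source, kernel):
    """Pass a message from `source` to `target` through a transform kernel."""
    message = correlate1d_same(source, kernel)
    return relu([t + m for t, m in zip(target, message)])


# Hypothetical 1D feature channels along an arm.
elbow = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # confident response at position 3
wrist = [0.0, 0.2, 0.0, 0.0, 0.0, 0.2, 0.0]  # ambiguous: positions 1 and 5

# Hand-set kernel encoding "the wrist lies two positions to the right of the elbow".
elbow_to_wrist = [1.0, 0.0, 0.0, 0.0, 0.0]

updated_wrist = update_feature(wrist, elbow, elbow_to_wrist)
print(updated_wrist)
```

After the update, the wrist response that agrees with the elbow's position is reinforced while the other candidate is left unchanged, which is the disambiguating effect the message passing is meant to provide.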


4. Bi-directional tree
It is important to design a proper information flow between body joints, so that features at a joint can be optimized by receiving messages from highly correlated joints without being disturbed by distant, less correlated joints. A bi-directional tree-structured model is proposed. The proposed model connects correlated joints and passes messages in both directions along the tree. Therefore, every joint can receive information from all of its neighboring joints. The pipeline in Fig. 5 has three stages: (1) Original feature maps for body joints. (2) Refine the feature maps by information passing in a structured feature learning layer. (2,a) and (2,b) show the details of the bi-directional tree, which has information flows in opposite directions. The process of updating feature maps is also illustrated. (3) Predict score maps for joints based on feature maps. A dashed line is a copy operation and a solid line is a convolution.

Fig. 5. Our pipeline for pose estimation.
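
The information flow of a bi-directional tree can be sketched without any feature maps by tracking only which joints have contributed to each joint's features. The skeleton below is a hypothetical five-joint tree and the function names are illustrative; combining the two directions with a set union stands in for however the real model merges the two copies of the feature maps.

```python
# Hypothetical toy skeleton, given as child -> parent (None marks the root).
PARENT = {
    "neck": None,
    "head": "neck",
    "l_shoulder": "neck",
    "l_elbow": "l_shoulder",
    "l_wrist": "l_elbow",
}


def depth(joint, parent_map):
    d = 0
    while parent_map[joint] is not None:
        joint = parent_map[joint]
        d += 1
    return d


def pass_leaves_to_root(parent_map):
    """One direction: each joint sends its accumulated information to its parent."""
    info = {j: {j} for j in parent_map}
    # Visit deepest joints first so a child is complete before its parent reads it.
    for j in sorted(parent_map, key=lambda j: depth(j, parent_map), reverse=True):
        p = parent_map[j]
        if p is not None:
            info[p] |= info[j]
    return info


def pass_root_to_leaves(parent_map):
    """Opposite direction: each joint receives accumulated information from its parent."""
    info = {j: {j} for j in parent_map}
    # Visit shallow joints first so a parent is complete before its children read it.
    for j in sorted(parent_map, key=lambda j: depth(j, parent_map)):
        p = parent_map[j]
        if p is not None:
            info[j] |= info[p]
    return info


up = pass_leaves_to_root(PARENT)      # starts from an untouched copy of the features
down = pass_root_to_leaves(PARENT)    # starts from a second untouched copy
combined = {j: up[j] | down[j] for j in PARENT}

for joint in PARENT:
    print(joint, sorted(combined[joint]))
```

In this toy run each joint ends up with information from its parent and from all of its children, while a single direction alone would give it only one side of the tree.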



