Paper reading: Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition

Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition

(2019 CVPR)

Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yanfeng Wang, and Qi Tian

Notes

 

Contributions

  1. We propose the A-link inference module (AIM) to infer actional links, which capture action-specific latent dependencies. The actional links are combined with structural links to form generalized skeleton graphs.
  2. We propose the actional-structural graph convolution network (AS-GCN) to extract useful spatial and temporal information based on the multiple graphs.
  3. We introduce an additional future pose prediction head to predict future poses, which also improves the recognition performance by capturing more detailed action patterns.
  4. The AS-GCN outperforms several state-of-the-art methods on two large-scale data sets. As a side product, AS-GCN is also able to precisely predict the future poses.

 


 

Method

Actional Links (A-links)

To capture richer dependencies, we introduce an encoder-decoder structure, called A-link inference module, to capture action-specific latent dependencies, i.e. actional links, directly from actions.

1. Encoder. The functionality of the encoder is to estimate the states of the A-links given the 3D joint positions across time; that is,

where C is the number of A-link types. The encoder produces A-links by propagating information between joints and links iteratively to learn link features.

2. Decoder. The functionality of the decoder is to predict the future 3D joint positions conditioned on the A-links inferred by the encoder and the previous poses; that is,

The decoder predicts future joint positions based on the inferred A-links.

3. AGC. Given the input X_in, the actional graph convolution (AGC) is

where W is the trainable weight to capture feature importance. Note that the AIM is used to warm up the A-links in the pretraining process; during the training of action recognition and pose prediction, the A-links are further optimized by forward-passing only the encoder of the AIM.
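The AGC step can be illustrated with a small sketch. It assumes the form X_act = sum over c of norm(A^(c)) X_in W^(c), where each A^(c) is the n-by-n matrix for A-link type c and norm is a row normalization; the formula itself is not reproduced in these notes, so the exact form, the helper names (`matmul`, `row_normalize`, `agc`) and the toy data are assumptions made for illustration only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_normalize(adj):
    """Divide each row by its sum; rows that sum to zero stay zero (assumed normalization)."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def agc(x_in, a_links, weights):
    """Sum over A-link types c of row_normalize(A_c) @ X_in @ W_c."""
    n, d_out = len(x_in), len(weights[0][0])
    out = [[0.0] * d_out for _ in range(n)]
    for a_c, w_c in zip(a_links, weights):
        term = matmul(matmul(row_normalize(a_c), x_in), w_c)
        out = [[o + t for o, t in zip(ro, rt)] for ro, rt in zip(out, term)]
    return out

# Toy input: 3 joints, 2 features per joint, C = 2 A-link types (hypothetical values).
x_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a_links = [
    [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stands in for a trained weight W_c
x_act = agc(x_in, a_links, [identity, identity])
print(x_act)
```

Each joint's output is a mix of the features of the joints it is linked to, with one weight matrix per link type, so links between joints that are not physically connected can still carry information.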

 

 

Structural Links (S-links)

With the L-order polynomial, we define the structural graph convolution (SGC), which can directly reach the L-hop neighbors to increase the receptive field. The SGC is formulated as

where M and W are the trainable weights to capture edge weights and feature importance.
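To make the effect of the polynomial order concrete, here is a minimal sketch. It assumes the SGC sums, over orders l = 1..L, the terms (M_l ∘ Â^l) X_in W_l, with Â a normalized adjacency matrix of the skeleton and ∘ the element-wise product; the formula is not reproduced in these notes, so this form, the helper names and the 3-joint chain are assumptions, and any further details of the paper's formulation are not modelled.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def hadamard(a, b):
    """Element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sgc(x_in, a_hat, edge_masks, weights):
    """Sum over orders l = 1..L of (M_l * A_hat^l) @ X_in @ W_l, with L = len(edge_masks)."""
    power, out = None, None
    for m_l, w_l in zip(edge_masks, weights):
        # Build A_hat^l incrementally from A_hat^(l-1).
        power = a_hat if power is None else matmul(power, a_hat)
        term = matmul(matmul(hadamard(m_l, power), x_in), w_l)
        if out is None:
            out = term
        else:
            out = [[o + t for o, t in zip(ro, rt)] for ro, rt in zip(out, term)]
    return out

# Hypothetical chain skeleton 0 - 1 - 2, row-normalized, no self-loops.
a_hat = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
x_in = [[0.0], [0.0], [1.0]]           # only joint 2 carries a signal
ones = [[1.0] * 3 for _ in range(3)]   # stands in for a trained edge mask M_l
w = [[1.0]]                            # stands in for a trained weight W_l

one_hop = sgc(x_in, a_hat, [ones], [w])
two_hop = sgc(x_in, a_hat, [ones, ones], [w, w])
print(one_hop, two_hop)
```

With L = 1, joint 0 receives nothing from joint 2 because they are two hops apart; with L = 2 the second-order term lets the signal reach it in a single layer, which is the enlarged receptive field described above.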

 

 

Actional-Structural Graph Convolution Block

 

 

AS-GCN (Backbone network)

 

 

Multitasking of AS-GCN

1. Action recognition head. To classify actions, we construct a recognition head following the backbone network. We apply global average pooling over the joint and temporal dimensions of the feature maps output by the backbone network to obtain a feature vector, which is finally fed into a softmax classifier to obtain the predicted class label. The loss function for action recognition is the standard cross-entropy loss.

2. Future pose prediction head. To predict future poses, we construct a prediction module following the backbone network. We use several AS-GCN blocks to decode the high-level feature maps extracted from the historical data and obtain the predicted future 3D joint positions.
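The recognition head described in item 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the feature-map layout ([channels][frames][joints]), the linear layer `class_weights` in front of the softmax, and all numbers are assumptions.

```python
import math

def global_avg_pool(features):
    """Average a [channels][frames][joints] feature map over frames and joints."""
    return [sum(sum(frame) for frame in ch) / (len(ch) * len(ch[0])) for ch in features]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

# Hypothetical backbone output: 2 channels, 2 frames, 2 joints.
features = [
    [[1.0, 3.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.0, 4.0]],
]
pooled = global_avg_pool(features)                      # one value per channel
class_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 3 classes x 2 channels (assumed)
logits = [sum(w * f for w, f in zip(row, pooled)) for row in class_weights]
probs = softmax(logits)
loss = cross_entropy(probs, label=2)
print(pooled, probs, loss)
```

Pooling over both frames and joints leaves one value per channel, so the classifier sees a fixed-length vector regardless of sequence length or joint count.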

When we train the recognition head and the future pose prediction head together, recognition performance improves.
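These notes do not say how the two heads' losses are combined. One common choice for joint training is a weighted sum, sketched here under that assumption; the mean-squared-error prediction loss, the weight `lam` and the toy poses are all hypothetical.

```python
def mse(pred, target):
    """Mean squared error over all joints and coordinates."""
    diffs = [(p - t) ** 2 for jp, jt in zip(pred, target) for p, t in zip(jp, jt)]
    return sum(diffs) / len(diffs)

def joint_loss(recognition_loss, predicted_pose, true_pose, lam=0.5):
    """Assumed joint objective: recognition loss + lam * pose prediction loss."""
    return recognition_loss + lam * mse(predicted_pose, true_pose)

# Two joints with (x, y, z) coordinates; values are made up for illustration.
predicted = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
target = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
total = joint_loss(0.4, predicted, target, lam=0.5)
print(total)
```

Because both heads share the backbone, gradients from the prediction term also update the shared features, which is one way to read the reported improvement in recognition.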

 


 

Results

 
