NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
(2019 TPAMI)
Jun Liu, Amir Shahroudy, Mauricio Perez, Gang Wang, Ling-Yu Duan, and Alex C. Kot
Note
Paper: https://arxiv.org/pdf/1905.04757.pdf
GitHub: https://github.com/shahroudy/NTURGB-D
Dataset links:
- https://rose1.ntu.edu.sg/dataset/actionRecognition/
- https://drive.google.com/open?id=1CUZnBtYwifVXS21yVg62T-vrPVayso5H (only NTU RGB+D skeleton data)
- https://drive.google.com/open?id=1tEbuaEqMxAV7dNc4fqu1O4M7mC6CJ50w (only NTU RGB+D 120 skeleton data)
Related dataset walkthrough:
- [NTU RGB+D dataset] https://blog.csdn.net/qq_36627158/article/details/119907320
Contribution
1、Introduces a large-scale dataset for RGB+D human action recognition, collected from 106 distinct subjects and containing more than 114 thousand video samples and 8 million frames. The dataset covers 120 action classes, including daily, mutual, and health-related activities.
2、Investigates a novel one-shot 3D activity recognition problem on this dataset; a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for the task, which yields promising results for recognizing novel action classes.
Comparison with Other Datasets
Details of NTU RGB+D 120
1、114,480 RGB+D video samples
2、120 action categories in total
- 82 daily actions (eating, writing, sitting down, moving objects, etc.),
- 12 health-related actions (blowing nose, vomiting, staggering, falling down, etc.), and
- 26 mutual actions (handshaking, pushing, hitting, hugging, etc.).
3、106 distinct human subjects.
4、RGB videos, depth sequences, skeleton data (3D locations of 25 major body joints), and infrared frames
5、hardware: Microsoft Kinect v2
6、155 different camera viewpoints.
7、The subjects in this dataset are in a wide range of age distribution (from 10 to 57) and from different cultural backgrounds (15 countries)
8、various environmental conditions (96 different backgrounds with illumination variation)
9、cross-subject and cross-setup evaluation protocols
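Each sample's filename encodes its metadata. Per the project's GitHub repository, filenames follow the pattern SsssCcccPpppRrrrAaaa (setup, camera, performer/subject, replication, action class), e.g. `S018C001P008R001A120.skeleton`. A minimal parsing sketch (the helper name `parse_ntu_name` is my own):

```python
import re

# NTU RGB+D naming convention: S=setup, C=camera, P=performer, R=replication, A=action.
NAME_RE = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

def parse_ntu_name(filename):
    """Return (setup, camera, performer, replication, action) as ints."""
    m = NAME_RE.search(filename)
    if m is None:
        raise ValueError(f"not an NTU-style filename: {filename}")
    return tuple(int(g) for g in m.groups())

# Example: sample from setup 18, camera 1, subject 8, take 1, action 120.
print(parse_ntu_name("S018C001P008R001A120.skeleton"))  # (18, 1, 8, 1, 120)
```

These parsed IDs are what the cross-subject (performer ID) and cross-setup (setup ID) protocols below are defined over.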
Newly Added Actions Compared to the NTU RGB+D
(1) Fine-grained hand/finger motions.
Fine-grained hand and finger motions, such as “make ok sign” and “snapping fingers”.
(2) Fine-grained object-related individual actions.
The body movements are not significant and the involved objects are relatively small, such as “counting money” and “play magic cube”. [These are likely hard to distinguish from skeleton data alone.]
(3) Object-related mutual actions.
Interactions with objects, such as “wield knife towards other person” and “hit other person with object”. [These are likely hard to distinguish from skeleton data alone.]
(4) Different actions with similar posture patterns but with different motion speeds.
Some actions share similar posture patterns but differ in motion speed. For example, “grab other person’s stuff” is a newly added action whose main difference from “touch other person’s pocket (steal)” is the motion speed.
(5) Different actions with similar body motions but with different objects involved.
Some actions have very similar body motions but involve different objects. For example, the motions in the newly added action “put on bag/backpack” are similar to those in the existing action “put on jacket”. [These are likely hard to distinguish from skeleton data alone.]
(6) Different actions with similar objects involved but with different body motions.
Some actions involve the same objects but differ in body motion, such as “put on bag/backpack” and “take something out of a bag/backpack”.
Benchmark Evaluations
1、Cross-Subject Evaluation
- the 106 subjects are split into training and testing groups; each group consists of 53 subjects.
- The IDs of the training subjects in this evaluation are: 1, 2, 4, 5, 8, 9, 13, 14, 15, 16, 17, 18, 19, 25, 27, 28, 31, 34, 35, 38, 45, 46, 47, 49, 50, 52, 53, 54, 55, 56, 57, 58, 59, 70, 74, 78, 80, 81, 82, 83, 84, 85, 86, 89, 91, 92, 93, 94, 95, 97, 98, 100, 103.
- The remaining subjects are reserved for testing.
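The cross-subject split reduces to a membership test on the performer ID. A minimal sketch (the function name `cross_subject_split` is my own; the ID list is copied from the protocol above):

```python
# Training subject IDs for the cross-subject protocol (from the paper).
TRAIN_SUBJECTS = {
    1, 2, 4, 5, 8, 9, 13, 14, 15, 16, 17, 18, 19, 25, 27, 28, 31, 34, 35,
    38, 45, 46, 47, 49, 50, 52, 53, 54, 55, 56, 57, 58, 59, 70, 74, 78,
    80, 81, 82, 83, 84, 85, 86, 89, 91, 92, 93, 94, 95, 97, 98, 100, 103,
}

def cross_subject_split(performer_id):
    """Assign a sample to 'train' or 'test' by its performer (P) ID."""
    return "train" if performer_id in TRAIN_SUBJECTS else "test"

print(len(TRAIN_SUBJECTS))       # 53 training subjects
print(cross_subject_split(1))    # train
print(cross_subject_split(3))    # test
```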
2、Cross-Setup Evaluation
- All samples with even collection-setup IDs are used for training, and those with odd setup IDs for testing, i.e., 16 of the 32 setups are used for training and the other 16 are reserved for testing.
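The cross-setup split is just a parity check on the setup ID. A minimal sketch (the function name `cross_setup_split` is my own):

```python
def cross_setup_split(setup_id):
    """Even setup IDs go to training, odd IDs to testing (16 setups each)."""
    return "train" if setup_id % 2 == 0 else "test"

print(cross_setup_split(2))   # train
print(cross_setup_split(17))  # test
```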
APSR FRAMEWORK FOR ONE-SHOT 3D ACTION RECOGNITION (TODO: read up on one-shot learning)