NTU-60

Citation Author(s): Wang Qicong
Submitted by: Qicong Wang
Last updated: Mon, 11/04/2024 - 19:34
DOI: 10.21227/tqfb-7n73

Categories:

Artificial Intelligence

Keywords:

NTU-60

Abstract

This dataset contains RGB+D videos and skeleton data of human behavior. The data were captured from 40 human subjects by 3 Microsoft Kinect V2 cameras and comprise 56,880 samples in 60 action categories, totaling 4 million frames; no sample exceeds 300 frames. Each body skeleton is recorded with 25 joints. The dataset defines two evaluation protocols, Cross-Subject (Xsub) and Cross-View (Xview). In the Xsub protocol, the training set contains 40,320 samples from 20 subjects, and the remaining 16,560 samples are used for testing. In the Xview protocol, the 37,920 samples captured by cameras 2 and 3 are used for training, and the remaining 18,960 samples, captured by camera 1, are used for testing. We follow these two protocols and report Top-1 accuracy.
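The two protocols can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's official tooling. It assumes sample names follow the pattern `SsssCcccPpppRrrrAaaa` (setup, camera, performer, replication, action), which is not stated on this page. The function names are hypothetical, and the Xsub training subject IDs are left as a parameter because the abstract does not list them.

```python
import re

# Assumed file-name pattern (not stated on this page):
# S = setup, C = camera, P = performer (subject), R = replication, A = action.
NAME_RE = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

# Sample counts quoted in the abstract.
TOTAL_SAMPLES = 56_880
XSUB_COUNTS = {"train": 40_320, "test": 16_560}
XVIEW_COUNTS = {"train": 37_920, "test": 18_960}


def parse_name(name):
    """Extract the integer IDs encoded in a sample name."""
    match = NAME_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    keys = ("setup", "camera", "subject", "replication", "action")
    return dict(zip(keys, (int(g) for g in match.groups())))


def xview_split(name):
    """Cross-View: cameras 2 and 3 are training data, camera 1 is test data."""
    return "train" if parse_name(name)["camera"] in (2, 3) else "test"


def xsub_split(name, train_subjects):
    """Cross-Subject: samples from the given subject IDs are training data."""
    return "train" if parse_name(name)["subject"] in train_subjects else "test"


def top1_accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Both protocols partition the same 56,880 samples.
assert sum(XSUB_COUNTS.values()) == TOTAL_SAMPLES
assert sum(XVIEW_COUNTS.values()) == TOTAL_SAMPLES
```

For example, `xview_split("S001C001P001R001A001")` assigns a camera-1 sample to the test set. `xsub_split` needs the 20 training subject IDs from the protocol definition before it reproduces the Xsub counts.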