NTU-60

Citation Author(s):
Wang
Qicong
Submitted by:
QICONG WANG
Last updated:
Mon, 11/04/2024 - 14:34
DOI:
10.21227/tqfb-7n73
License:
Categories:
Keywords:

Abstract 

This dataset contains RGB+D videos and skeleton data for human behavior. The data were captured by 3 Microsoft Kinect V2 cameras from 40 human subjects and comprise 56,880 samples in 60 categories, totaling 4 million frames; the maximum number of frames in any sample is 300. 25 joints are recorded for each body skeleton. The dataset provides two original evaluation protocols, Cross-Subject (Xsub) and Cross-View (Xview). In the Xsub protocol, the training set contains 40,320 samples from 20 subjects, and the remaining 16,560 samples are used for testing. In the Xview protocol, the 37,920 samples captured by cameras 2 and 3 are used for training, and the remaining 18,960 samples, captured by camera 1, are used for testing. We follow these two settings and report the Top-1 accuracy of experimental results.
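The two protocols can be illustrated with a minimal Python sketch. It assumes that sample names follow a pattern such as S001C002P003R002A013 (setup, camera, performer, replication, action); that naming scheme is not stated on this page, so check it against the downloaded files. The helper names (parse_name, xview_split, xsub_split) are hypothetical, and the list of 20 training subjects for Xsub is not given here, so it is passed in as a parameter.

```python
import re

# ASSUMPTION: sample names look like "S001C002P003R002A013"
# (setup, camera, performer, replication, action). Not stated on this page.
NAME_RE = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

# Sample counts quoted in the abstract.
TOTAL_SAMPLES = 56_880
XSUB_TRAIN, XSUB_TEST = 40_320, 16_560
XVIEW_TRAIN, XVIEW_TEST = 37_920, 18_960

# Per the abstract: cameras 2 and 3 are used for training, camera 1 for testing.
XVIEW_TRAIN_CAMERAS = frozenset({2, 3})


def parse_name(name):
    """Extract the integer IDs from a sample name (hypothetical helper)."""
    match = NAME_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    setup, camera, performer, replication, action = (int(g) for g in match.groups())
    return {
        "setup": setup,
        "camera": camera,
        "performer": performer,
        "replication": replication,
        "action": action,
    }


def xview_split(name):
    """Cross-View: assign a sample to 'train' or 'test' by camera ID."""
    camera = parse_name(name)["camera"]
    return "train" if camera in XVIEW_TRAIN_CAMERAS else "test"


def xsub_split(name, train_subjects):
    """Cross-Subject: assign by performer ID.

    train_subjects must be supplied by the caller; the 20 training
    subject IDs are not listed on this page.
    """
    performer = parse_name(name)["performer"]
    return "train" if performer in train_subjects else "test"


# Both protocols should partition the full set of samples.
assert XSUB_TRAIN + XSUB_TEST == TOTAL_SAMPLES
assert XVIEW_TRAIN + XVIEW_TEST == TOTAL_SAMPLES
```

For example, under these assumptions xview_split("S001C001P001R001A001") places the sample in the test set because it comes from camera 1.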

Instructions: 

human skeleton
