THU-HRIA dataset(Human-Robot Interactive Action Dataset From The Perspective Of Service Robot)

Citation Author(s):
Tsinghua University
Submitted by:
Jazon Wang
Last updated:
Wed, 03/15/2023 - 08:57
Data Format:
0 ratings - Please login to submit your rating.


Most of the existing human action datasets are common human actions in daily scenes(e.g. NTU RGB+D series, Kinetics series), not created for Human-Robot Interaction(HRI), and most of them are not collected based on the perspective of the service robot, which can not meet the needs of vision-based interactive action recognition.

This dataset is named as “Human-Robot Interactive Action Dataset From The Perspective Of Service Robot (THU-HRIA dataset) “ ,which is created for indoor service robot action interaction, collected by Tsinghua University.

THU-HRIA Dataset uses a kinectV2 as multimodal sensor data collector, which can collect  RGB, infrared, depth, bodypoints and other forms of data. In terms of specific content, the THU-HRIA dataset includes 8 common actions in both daily restaurant scene and robot interactive scene, totally 576 action samples. The eight action classes are: 1- waving ; 2- calling gesture; 3- beckoning; 4- stepping back; 5-eating; 6- pass by; 7-fall; 8:using phone. For each action, we set different specifications like distance, situation and repetition. Please see the description for more details.


1.     Download link:

Due to the problem of file uploading, the uploaded data only has the format of key point type. The original file is being uploaded. If you are particularly urgent, please use the following external link channels:


External download link: Quark disk link

Please contact us to get the file extraction code


2.       detailed information

Number of actions8 action , 576samples in total

Acquisition format   RGB, depth, infrared, bodypoints, etc

Dataset size       242 GB (including the original file size of all formats)

Action type8types:1- waving ; 2- calling gesture; 3- beckoning; 4- stepping back; 5-eating; 6- pass by; 7-fall; 8:using phone.

Acquisition equipment     Microsoft KinectV2

Camera height  The camera is placed at a height of 1.2m (robot height), and the angle of view is 15 ° upward

Action durationStarting from 0, usually around 150 frames (about 5s)

RepetitionsRepeat 2 times for each action

Performer  Maximum number of people : 1

view   Single view (simulated robot)

scene   Single scene, indoor

situation    Sitting/standing; Front/side; From left to right/from right to left

distance     Fixed distance, long distance: 3m; Short distance 1.5m


3.  Citation: The article to which this dataset belongs is currently not published. If you need to quote, please directly quote this website or contact us. After the paper is published, we will update the quoted content as soon as possible.


4.  Contact us

If you want more information or have any questions about this dataset, please contact     or   ,Jazon Wang . We will try our best to help you.