Wild-SHARD: Smartphone Sensor-based Human Activity Recognition Dataset in Wild

Citation Author(s):
Nurul Amin Choudhury, National Institute of Technology Silchar
Badal Soni, National Institute of Technology Silchar
Submitted by:
Nurul Choudhury
Last updated:
Thu, 06/13/2024 - 03:01
DOI:
10.21227/dc2h-a428

Abstract 

Wild-SHARD presents a novel Human Activity Recognition (HAR) dataset collected in an uncontrolled, real-world (wild) environment to address a limitation of existing datasets, which often lack non-simulated data. Our dataset comprises a time series of Activities of Daily Living (ADLs) captured using multiple smartphone models, such as the Samsung Galaxy F62, Samsung Galaxy A30s, Poco X2, OnePlus 9 Pro and many more. With their varied sensor manufacturers, these devices enhance data variability and robustness. The sensor module, consisting of accelerometers and gyroscopes, was mounted in the front pockets (vertically, phone earpiece side up) of 40 adult subjects of diverse ages, genders, weights, and heights. Subjects performed activities naturally to capture authentic ADL data, including sitting, walking, standing, running, and navigating stairs indoors and outdoors. The dataset was collected at a sampling frequency of 100 Hz, covering six ADLs: sitting, walking, standing, running, going upstairs, and going downstairs. The attributes recorded in the dataset include acceleration due to gravity, linear acceleration, gravity, rotational rate, rotational vector, and the cosine of the rotational vector. This comprehensive dataset structure, as detailed in the provided equations, aims to improve real-time activity recognition and overall system performance by offering high-quality, realistic sensor data.

Instructions: 

Introduction-
This dataset contains Human Activity Recognition (HAR) data collected in an uncontrolled, real-world (wild) environment. The data was collected using multiple smartphone models, such as the Samsung Galaxy F62, Samsung Galaxy A30s, Poco X2, OnePlus 9 Pro and many more. The dataset is intended to improve real-time activity recognition and overall system performance by providing high-quality, non-simulated sensor data.

Data Acquisition-
Numerous sensor-based HAR datasets are publicly available and offer high-quality data at different sampling frequencies with various sensor combinations. However, very few datasets consist of time-series ADL data collected in uncontrolled environments. Due to this lack of non-simulated data, real-time activity recognition systems often fall short in overall performance.
To address this problem, we created a state-of-the-art HAR dataset in an uncontrolled environment. The data collection devices differed in model and sensor type, which increased data variability and robustness. The sensor module was mounted in the front pockets (vertically, with the phone earpiece side up) of various subjects, who performed different activities naturally to capture authentic ADL data.

Subjects-
A total of 40 subjects were considered, including adults of different ages, genders, weights, and heights. All subjects were above 20 years old, with weights in the range of 60–94 kg and heights in the range of 165–186 cm. The dataset covers the following six ADLs:
Sitting
Walking
Standing
Running
Going Upstairs
Going Downstairs
Data Collection-
The activities were performed as follows:
Running, walking, sitting, and standing were conducted outdoors.
Upstairs and downstairs activities were performed in a departmental building.
The sensors used for data collection were accelerometers and gyroscopes with a sampling frequency of 100 Hz. The dataset structure is as follows:
$$
F_{n \times 16} =
\begin{bmatrix}
a_{1x}^{0} & a_{2y}^{0} & a_{3z}^{0} & \cdots & a_{14y}^{0} & a_{15z}^{0} & a_{16}^{0} \\
a_{1x}^{1} & a_{2y}^{1} & a_{3z}^{1} & \cdots & a_{14y}^{1} & a_{15z}^{1} & a_{16}^{1} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
a_{1x}^{n} & a_{2y}^{n} & a_{3z}^{n} & \cdots & a_{14y}^{n} & a_{15z}^{n} & a_{16}^{n}
\end{bmatrix}
$$

$$
Dataset_{n \times 17} = \{\, F \mid ClassLabels \,\}
$$
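Under this structure, each row of F is one 100 Hz sample of the 16 sensor attributes, and the full dataset appends a class-label column to form the n×17 matrix. A minimal NumPy sketch (the attribute values and the label coding are synthetic placeholders, not taken from the actual files):

```python
import numpy as np

n = 5                          # number of samples (rows)
F = np.random.rand(n, 16)      # F_{n x 16}: 16 sensor attributes per sample
labels = np.full((n, 1), 2.0)  # e.g. class 2 = "standing" (label coding is an assumption)

# Dataset_{n x 17} = {F | ClassLabels}: feature matrix with the label column appended
dataset = np.hstack([F, labels])
print(dataset.shape)  # (5, 17)
```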
The attributes in the 3D-axis are:
1. Acceleration due to Gravity in x, y, z - axis.
2. Linear Acceleration in x, y, z - axis.
3. Gravity in x, y, z - axis.
4. Rotational Rate in x, y, z - axis.
5. Rotational Vector in x, y, z - axis.
6. cos(Rotational Rate in X axis)
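The sixth attribute is derived from the fourth. As an illustration, one sample could be assembled as below (the field names and values are hypothetical stand-ins, not the dataset's actual column headers):

```python
import math

# One sample's triaxial attributes (synthetic values)
sample = {
    "grav_acc_x": 0.1, "grav_acc_y": 9.7, "grav_acc_z": 0.3,  # 1. acceleration due to gravity
    "lin_acc_x": 0.02, "lin_acc_y": 0.10, "lin_acc_z": 0.05,  # 2. linear acceleration
    "gravity_x": 0.1, "gravity_y": 9.8, "gravity_z": 0.2,     # 3. gravity
    "rot_rate_x": 0.5, "rot_rate_y": 0.1, "rot_rate_z": 0.0,  # 4. rotational rate
    "rot_vec_x": 0.01, "rot_vec_y": 0.02, "rot_vec_z": 0.00,  # 5. rotational vector
}
# 6. derived attribute: cosine of the x-axis rotational rate
sample["cos_rot_rate_x"] = math.cos(sample["rot_rate_x"])
print(len(sample))  # 16 attributes per sample, matching the columns of F
```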

File Structure-
data/: Contains the raw sensor data and corresponding activity labels.
README.md: This file.
LICENSE: License information for the dataset.

Usage-
To use this dataset, download the data files from IEEE Dataport and load them into your preferred data analysis tool. The data can be used to train and test machine learning and deep learning models for activity recognition tasks.
Citation-
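As a sketch, a recording could be segmented into fixed-length windows before model training (the 2-second window length, majority-vote labelling, and synthetic data below are assumptions for illustration, not part of the dataset specification):

```python
import numpy as np

FS = 100         # sampling frequency (Hz), as stated above
WINDOW = 2 * FS  # 2-second windows -> 200 samples each

def segment(features: np.ndarray, labels: np.ndarray, window: int = WINDOW):
    """Split an (n, 16) feature array into non-overlapping windows,
    labelling each window by the majority class of its samples."""
    X, y = [], []
    for start in range(0, len(features) - window + 1, window):
        seg = features[start:start + window]
        seg_labels = labels[start:start + window]
        # majority vote over the window's per-sample labels
        values, counts = np.unique(seg_labels, return_counts=True)
        X.append(seg)
        y.append(values[np.argmax(counts)])
    return np.stack(X), np.array(y)

# Synthetic stand-in for one subject's recording (real data comes from data/)
feats = np.random.rand(1000, 16)
labs = np.repeat([0, 1], 500)  # two activities back-to-back
X, y = segment(feats, labs)
print(X.shape, y.shape)  # (5, 200, 16) (5,)
```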
If you use this dataset in your research, please cite the following paper:
[1] N. A. Choudhury and B. Soni, "An Adaptive Batch Size-Based-CNN-LSTM Framework for Human Activity Recognition in Uncontrolled Environment," in IEEE Transactions on Industrial Informatics, vol. 19, no. 10, pp. 10379-10387, Oct. 2023, doi: 10.1109/TII.2022.3229522.

[2] N. A. Choudhury and B. Soni, "An Efficient and Lightweight Deep Learning Model for Human Activity Recognition on Raw Sensor Data in Uncontrolled Environment," in IEEE Sensors Journal, vol. 23, no. 20, pp. 25579-25586, 15 Oct.15, 2023, doi: 10.1109/JSEN.2023.3312478.

[3] N. A. Choudhury and B. Soni, "In-depth analysis of design & development for sensor-based human activity recognition system," Multimedia Tools and Applications, 2023, doi: 10.1007/s11042-023-16423-5.

Contact-
For any questions or further information, please contact:
Nurul Amin Choudhury
National Institute of Technology Silchar, Assam, INDIA - 78010
nurul0400@gmail.com
