Respiration and Exhaled Hydration Dataset Based on Data Augmentation

Citation Author(s):
Sagheer Khan, The University of Edinburgh
Aaesha Alzaabi, The University of Edinburgh
Imran Saied, The University of Edinburgh
Tughrul Arslan, The University of Edinburgh
Submitted by: Sagheer Khan
Last updated: Tue, 03/12/2024 - 22:08
DOI: 10.21227/rzc7-ma61

Abstract 

Healthcare 4.0 introduces the notion of the "Digital Twin" (DT), a digital model that captures an individual's biological traits. This technology enables tailored treatment approaches, supports timely interventions, monitors respiratory issues, and offers decision support to healthcare professionals.
Large-scale Digital Twin (DT) implementations need comprehensive patient data for accurate monitoring and decision support, particularly when ML and DL are used. Because respiration and exhaled hydration data are limited, novel statistical time-series and frequency-domain data augmentation methods are used to generate a larger synthetic respiration dataset. An ESP32 Wi-Fi sensor with channel state information (CSI) capability collects respiration data as time series, while a Stepped Monopole RF sensor operating from 0.5 GHz to 5 GHz collects exhaled hydration data in decibels (dB). For both datasets, breathing rates of 12, 20, and 28 breaths per minute (BPM) are considered in this multi-sensor data collection experiment.
Statistical time-domain and frequency-domain methods yield a total of 15 larger synthetic datasets. Signal processing methods are then applied to the augmented respiration data to reduce noise and accurately determine each subject's BPM from the raw signal. The exhaled hydration data from the experiment, by contrast, requires no preprocessing or noise reduction. The resulting larger pre-processed dataset is used for BPM and exhaled hydration classification with Deep Learning (DL) and Machine Learning (ML), providing decision support to healthcare practitioners.
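As an illustration of how a breathing rate can be read from a respiration time series, the following minimal sketch picks the dominant spectral peak of a single signal. It is not the signal processing pipeline used to build this dataset. The sampling rate `fs`, the 6 to 40 BPM search band, and the function name `estimate_bpm` are assumptions made for the example, and the input is a synthetic sine wave rather than recorded CSI data.

```python
import numpy as np


def estimate_bpm(signal, fs, min_bpm=6.0, max_bpm=40.0):
    """Estimate breaths per minute from the dominant spectral peak.

    fs (Hz) and the search band are assumptions; neither is stated in
    the dataset description.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC offset
    window = np.hanning(x.size)           # taper to limit spectral leakage
    spectrum = np.abs(np.fft.rfft(x * window))
    bpm_axis = np.fft.rfftfreq(x.size, d=1.0 / fs) * 60.0
    band = (bpm_axis >= min_bpm) & (bpm_axis <= max_bpm)
    if not band.any():
        raise ValueError("signal too short to resolve the requested band")
    return float(bpm_axis[band][np.argmax(spectrum[band])])


# Synthetic stand-in: 60 s of a 20 BPM sine wave plus noise, sampled at 20 Hz.
fs = 20.0
t = np.arange(0.0, 60.0, 1.0 / fs)
rng = np.random.default_rng(0)
synthetic = np.sin(2 * np.pi * (20.0 / 60.0) * t) + 0.3 * rng.standard_normal(t.size)
print(estimate_bpm(synthetic, fs))
```

With a 60 s window the frequency resolution is 1 BPM, so longer recordings give a finer estimate.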

Instructions: 

The respiration and exhaled hydration datasets cover breathing rates of 12 BPM, 20 BPM, and 28 BPM. Data augmentation yields a total of 15 patient datasets, provided in "Exhaled Hydration - Augmented Dataset" and "Respiration - Augmented Dataset." These datasets are in CSV format, where P1 represents patient 1, and so on. The respiration dataset, collected with the ESP32 Wi-Fi sensor, includes 52 subcarriers for each BPM. For instance, in the CSV file named "P1-12BPMCSV," each of the 52 columns is one subcarrier and each row is one sample of that subcarrier. This dataset is in raw form.
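The layout described above (one column per subcarrier, one row per sample) can be checked when loading a file. The sketch below is one possible way to do this with NumPy. The helper name `load_respiration_csv` and the `skip_header` option are assumptions, since the description does not say whether the files contain a header row, and the demonstration uses a randomly generated stand-in file rather than a file from the dataset.

```python
import os
import tempfile

import numpy as np

N_SUBCARRIERS = 52  # stated in the dataset description


def load_respiration_csv(path, skip_header=0):
    """Load one respiration CSV as a (samples, subcarriers) float array.

    skip_header is an assumption: set it to 1 if the file starts with
    a header row.
    """
    data = np.atleast_2d(
        np.genfromtxt(path, delimiter=",", skip_header=skip_header)
    )
    if data.shape[1] != N_SUBCARRIERS:
        raise ValueError(
            f"expected {N_SUBCARRIERS} columns, found {data.shape[1]}"
        )
    return data


# Demonstration on a synthetic stand-in file (random values, not real data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_respiration.csv")
    demo = np.random.default_rng(0).normal(size=(200, N_SUBCARRIERS))
    np.savetxt(demo_path, demo, delimiter=",")
    csi = load_respiration_csv(demo_path)

n_samples, n_subcarriers = csi.shape
first_subcarrier = csi[:, 0]  # time series of the first subcarrier
print(n_samples, n_subcarriers)
```

To use it on the dataset, pass the path of a downloaded respiration file in place of `demo_path`.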
The exhaled hydration dataset, collected with the RF sensor, has one signal for each BPM. In the CSV file named "12BPM - P1 - Exhaled Hydration," the single column holds the signal in decibels (dB), and each row corresponds to a frequency component at which a value was recorded. This dataset does not require additional preprocessing.
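An exhaled hydration file can be read as a single column of dB values and paired with a frequency axis for plotting or inspection. The sketch below shows one way to do this. It assumes that the rows are evenly spaced across the sensor's 0.5 GHz to 5 GHz band, which the description does not confirm, and the helper names `load_hydration_csv` and `frequency_axis_ghz` are hypothetical. The demonstration uses a synthetic curve, not recorded data.

```python
import os
import tempfile

import numpy as np


def load_hydration_csv(path, skip_header=0):
    """Load one exhaled hydration CSV as a 1-D array of dB values."""
    values = np.genfromtxt(path, delimiter=",", skip_header=skip_header)
    return np.ravel(values)


def frequency_axis_ghz(n_points, start_ghz=0.5, stop_ghz=5.0):
    """Assumed axis: n_points evenly spaced over the sensor band."""
    return np.linspace(start_ghz, stop_ghz, n_points)


# Demonstration on a synthetic stand-in: a flat -10 dB curve with a dip
# placed near 2 GHz (made-up values, not real data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_hydration.csv")
    grid = np.linspace(0.5, 5.0, 101)
    demo_db = -10.0 - 5.0 * np.exp(-((grid - 2.0) ** 2) / 0.1)
    np.savetxt(demo_path, demo_db)
    signal_db = load_hydration_csv(demo_path)

freq_ghz = frequency_axis_ghz(signal_db.size)
dip_ghz = freq_ghz[np.argmin(signal_db)]  # frequency of the lowest dB value
print(f"{signal_db.size} points, deepest dip near {dip_ghz:.2f} GHz")
```

If the dataset documentation specifies the actual frequency points, that list should replace the evenly spaced axis assumed here.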
