Modified FedHome
- Submitted by: Yaser Banad
- Last updated: Mon, 06/03/2024 - 15:23
- DOI: 10.21227/5m4q-m158
Abstract
This paper presents a comparative study of sampling methods within the FedHome framework, which is designed for personalized in-home health monitoring. FedHome uses federated learning (FL) and generative convolutional autoencoders (GCAE) to train models on decentralized edge devices while prioritizing data privacy. A notable challenge in this domain is class imbalance in health data: critical events such as falls are underrepresented, which degrades model performance. To address this, the study evaluates six oversampling techniques using stratified K-fold cross-validation: SMOTE, Borderline-SMOTE, Random OverSampler, SMOTE-Tomek, SVM-SMOTE, and SMOTE-ENN. The methods are tested on FedHome’s public implementation over 200 training rounds, with and without stratified K-fold cross-validation. The findings indicate that SMOTE-ENN achieves the most consistent test accuracy, with a standard deviation range of 0.0167-0.0176. In contrast, SMOTE and SVM-SMOTE show higher variability, with wider standard deviation ranges of 0.0157-0.0180 and 0.0155-0.0180, respectively. Random OverSampler likewise shows a wide range of 0.0155-0.0176. SMOTE-Tomek, with a range of 0.0160-0.0175, is also comparatively stable, though less so than SMOTE-ENN. These results highlight the potential of SMOTE-ENN to improve the reliability and accuracy of personalized health monitoring systems within the FedHome framework.
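The idea of combining stratified K-fold splitting with oversampling can be illustrated with a small, standard-library-only sketch. It is not the project's implementation (the project installs imblearn for its samplers). The function names and toy labels are hypothetical, plain random duplication stands in for the samplers named in the abstract, and oversampling only the training folds is one common arrangement that is assumed here.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds that each keep the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a fair share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def random_oversample(indices, labels, seed=0):
    """Duplicate minority-class indices at random until every class matches the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy labels (hypothetical data): 18 "walk" samples and 6 "fall" samples.
labels = ["walk"] * 18 + ["fall"] * 6
folds = stratified_folds(labels, k=3)

for held_out, test_idx in enumerate(folds):
    # Train on every fold except the held-out one; the test fold stays untouched.
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    balanced_idx = random_oversample(train_idx, labels, seed=held_out)
    before = Counter(labels[i] for i in train_idx)
    after = Counter(labels[i] for i in balanced_idx)
    print(f"fold {held_out}: train {dict(before)} -> {dict(after)}")
```

Each fold keeps the 3:1 class ratio of the toy data, and only the training portion is rebalanced, so the held-out fold still reflects the original class distribution.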
Installation
To set up the project, install the required packages. Because the project uses PyTorch for training, a CUDA-enabled build is recommended so that training can run on a GPU. The installation command for your operating system and GPU is available at the following link:
PyTorch Installation Guide: https://pytorch.org/get-started/locally/
For a standard installation without GPU support, you can simply run the following command in your terminal or command prompt:
pip install torch
-----------------------------------
Additionally, you need to install the following packages:
pip install torchvision scikit-learn ujson opacus==0.15.0 h5py imblearn calmsize
-----------------------------------
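After installation, a short check like the following can confirm that the packages are importable. This is a minimal sketch, not part of the project: the list of import names is an assumption derived from the pip commands above (scikit-learn is imported as sklearn), and the helper name missing_modules is hypothetical.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "sklearn", "ujson",
    "opacus", "h5py", "imblearn", "calmsize",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
    import torch
    # Reports whether PyTorch can see a CUDA-capable GPU.
    print("CUDA available:", torch.cuda.is_available())
```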
Dataset Generation
To generate the dataset, execute the following command in the dataset directory:
python generate_har.py
-----------------------------------
Running the Code
To run the code, execute the following command in the system folder using the command prompt or terminal:
python -u main.py -lr 0.01 -lbs 10 -nc 30 -jr 1 -nb 6 -data har -m harcnn -algo FedHome -gr 200 -fd x -did 1 > har-FedHome-cross-validation-o-fold-x.out
-----------------------------------
Note: The -fd x argument sets the fold number used for the run. Replace x with the desired fold number. The graphs are saved as JPG files in the current folder.
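To run several folds in sequence, the command can be wrapped in a small script. The sketch below is an illustration only: the helper names are hypothetical, the script is assumed to be started from the system folder, and the fold numbers in the commented example are an assumption, since the valid range depends on the project's fold count.

```python
import subprocess
import sys


def build_command(fold):
    """Return the training command shown above with x replaced by the fold number."""
    return [
        sys.executable, "-u", "main.py",
        "-lr", "0.01", "-lbs", "10", "-nc", "30", "-jr", "1", "-nb", "6",
        "-data", "har", "-m", "harcnn", "-algo", "FedHome",
        "-gr", "200", "-fd", str(fold), "-did", "1",
    ]


def log_name(fold):
    """Mirror the log file name used above, with x replaced by the fold number."""
    return f"har-FedHome-cross-validation-o-fold-{fold}.out"


def run_folds(folds):
    """Run the folds one after another, writing each run's output to its own log."""
    for fold in folds:
        with open(log_name(fold), "w") as log:
            subprocess.run(build_command(fold), stdout=log, check=True)


# Example (assumed fold numbers; adjust to the folds the project defines):
# run_folds([1, 2, 3])
```

Passing check=True stops the loop at the first failing run, so a broken fold does not go unnoticed among the log files.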
To recreate all the graphs presented in the paper, run the graph.py script, which is located in the graph folder, with this command:
python graph.py
-----------------------------------
Dataset Files
- Modified FedHome Project.zip (4.31 MB)
Documentation
Attachment | Size |
---|---|
readme.txt | 1.63 KB |