Standard Dataset
TNCOVID_19
- Citation Author(s):
- Submitted by: khem poudel
- Last updated: Fri, 01/13/2023 - 14:32
- DOI: 10.21227/6856-5h18
- Data Format:
- License:
- Categories:
- Keywords:
Abstract
Data preprocessing is a fundamental stage in deep learning modeling and the cornerstone of reliable data analytics. Deep learning models require large amounts of training data to be effective; small training sets often lead to overfitting and poor performance on larger datasets. One solution to this problem is parallelization in data modeling, which allows the model to fit the training data more effectively, leading to higher accuracy on large datasets and better overall performance. In this research, we developed a novel approach that uses parallel computing tools such as MPI and MPI4Py to handle the data preprocessing and deep learning modeling processes. As a case study, the technique is applied to COVID-19 data from the state of Tennessee, USA. Finally, we demonstrate the effectiveness of our approach by comparing it with existing methods that do not use parallel computing tools such as MPI4Py. Our results show promising outcomes for deploying parallel computing in modeling to reduce high computational cost.
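The abstract does not describe the preprocessing pipeline itself, so the following is only a minimal sketch of the general scatter, process, gather pattern that MPI4Py supports, not the authors' implementation. The helper names (`clean_chunk`, `split`, `parallel_clean`), the cleaning rule, and the sample values are hypothetical and are not taken from the TNCOVID_19 data. If `mpi4py` is not installed, the sketch falls back to a single process.

```python
# Illustrative sketch only: all helper names and sample values are assumptions.
try:
    from mpi4py import MPI  # needs an MPI runtime to be installed
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    comm, rank, size = None, 0, 1  # serial fallback when mpi4py is missing


def clean_chunk(chunk):
    """Hypothetical preprocessing step: drop missing or negative counts."""
    return [float(x) for x in chunk if x is not None and x >= 0]


def split(data, n):
    """Split data into n contiguous, nearly equal chunks."""
    k, r = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_clean(data):
    """Clean data across all ranks; the full result is returned on rank 0."""
    if comm is None:
        return clean_chunk(data)
    # Only the root rank holds the full data and builds the chunk list.
    chunks = split(data, size) if rank == 0 else None
    local = clean_chunk(comm.scatter(chunks, root=0))
    gathered = comm.gather(local, root=0)
    if rank == 0:
        return [x for part in gathered for x in part]
    return None  # non-root ranks only contribute their chunk


if __name__ == "__main__":
    # Made-up daily counts with one missing and one invalid entry.
    raw = [12, None, 7, -1, 30, 18] if rank == 0 else None
    result = parallel_clean(raw)
    if rank == 0:
        print(result)
```

Saved under a file name of your choice (for example `preprocess.py`, a placeholder), the sketch can be launched on several processes with `mpiexec -n 4 python preprocess.py`. Because the cleaning rule treats each value independently, the gathered result does not depend on how many ranks are used.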
Documentation
| Attachment | Size |
|---|---|
| README.md | 880 bytes |