Datasets
Standard Dataset
Lungs Disease Dataset (4 types)
- Submitted by: Amrita Tripathi
- Last updated: Tue, 03/26/2024 - 10:33
- DOI: 10.21227/c5ax-qj62
Abstract
Any damage that affects the normal functioning of the lungs is termed a lung disease,
which can prove fatal if not detected early. To address this challenge, two techniques are proposed
for lung disease classification, supporting medical professionals in diagnosing the disease and providing
preventive measures at an early stage. The proposed Model 1 uses a custom MobileNetV2L2 architecture that
builds upon the MobileNetV2 framework through fine-tuning and customization. This model incorporates a
ridge (L2) regularizer within its dense layer to enhance its performance. The proposed Model 2, built on a CNN as
its foundational block, is fine-tuned with ELU as the activation function, replacing ReLU, and incorporates
the L2 regularization technique. The proposed research utilizes two publicly available datasets collected
from Kaggle: DS1 (Data Set 1), the Lung Disease 5-class dataset, and DS2 (Data Set 2), the Lung Disease
4-class dataset. The proposed Model 1 performs better
than state-of-the-art techniques such as EfficientNet B0, InceptionV3, ResNet, and InceptionResNetV2. It
achieved a training accuracy of 99.53%, validation accuracy of 100%, and test accuracy of 95.51%. The
proposed Model 2 provides outstanding performance, with a training accuracy of 96.79%, validation
accuracy of 91.56%, and testing accuracy of 99.26%. The proposed research serves as a valuable
tool for doctors, providing a second opinion in the diagnostic process.
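The abstract mentions two ingredients that are easy to state precisely: ELU used in place of ReLU, and an L2 (ridge) penalty added to the loss. The following is a minimal standard-library sketch of those two formulas only, not the authors' model code. The function names, the ELU scale `alpha=1.0`, and the penalty strength `lam=0.01` are assumptions chosen for illustration; the source does not give these values.

```python
import math


def relu(x):
    """ReLU: negative inputs are clipped to zero."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """ELU: identity for x > 0, smooth saturation toward -alpha for x <= 0.

    alpha=1.0 is an assumed default, not a value taken from the source.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def l2_penalty(weights, lam):
    """Ridge (L2) penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam=0.01):
    """Total loss = data loss + L2 penalty (lam=0.01 is an assumed value)."""
    return data_loss + l2_penalty(weights, lam)


if __name__ == "__main__":
    # Unlike ReLU, ELU keeps a non-zero output for negative inputs.
    for x in (-2.0, -0.5, 0.0, 1.5):
        print(x, relu(x), elu(x))
    print(regularized_loss(0.5, [1.0, -2.0]))
```

In a deep learning framework the same ideas are normally expressed as an activation setting and a per-layer weight regularizer rather than hand-written functions.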
The dataset contains chest X-ray images. It was prepared from various existing datasets, which I combined after removing duplicate images with VisiPics. It has 4 types of lung disease and a folder of normal lungs. I augmented the dataset by a factor of 6, so there are roughly 10,000 images.
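To make "augmented by a factor of 6" concrete, here is a small sketch in plain Python that turns each image into six outputs. It assumes the factor counts the original plus five transformed copies, and it uses flips and 90-degree rotations purely as stand-ins: the description does not say which transforms were actually applied, and a real chest X-ray pipeline would more likely use small rotations, shifts, or zooms. All function names are hypothetical.

```python
def hflip(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]


def vflip(img):
    """Reverse the row order (top-bottom flip)."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img):
    """Return 6 images per input: the original plus 5 transformed copies.

    The choice of transforms is an illustrative assumption, not the
    dataset author's actual augmentation recipe.
    """
    r90 = rot90(img)
    r180 = rot90(r90)
    r270 = rot90(r180)
    return [img, hflip(img), vflip(img), r90, r180, r270]


def augment_dataset(images):
    """Apply augment() to every image and flatten the result."""
    return [variant for img in images for variant in augment(img)]


if __name__ == "__main__":
    # Images are represented as nested lists of pixel values.
    tiny = [[1, 2], [3, 4]]
    print(len(augment_dataset([tiny, tiny])))
```

Under this reading, a final count of roughly 10,000 images would correspond to on the order of 1,700 source images, though the description does not state the pre-augmentation count.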