Standard Dataset
Dataset_for_review
- Submitted by: Sabina Rakhmetu...
- Last updated: Fri, 07/12/2024 - 04:54
- DOI: 10.21227/kpkn-0j05
Abstract
Machine learning (ML) in the medical domain faces challenges due to limited high-quality data. This study addresses the scarcity of echocardiography (echoCG) images by generating synthetic data with state-of-the-art generative models. We evaluated a cycle-consistent generative adversarial network (CycleGAN), the contrastive unpaired translation (CUT) method, and a latent diffusion model (Stable Diffusion 1.5). We describe our methodology, image samples, and evaluation strategy, including a user study in which cardiologists and surgeons assessed the quality and medical soundness of the generated samples. This research highlights the potential of synthetic data to improve ML applications in healthcare, with the aim of enhancing diagnostic accuracy and patient outcomes.
The "Images" folder contains subfolders named after the prompts used to generate the images in them with Stable Diffusion. The "Random-Statistics-Sample-main" folder contains subfolders with images generated using CycleGAN and CUT, as well as real images sampled from the EchoNET dataset. The file names of the CUT and CycleGAN images include the epoch number of the generation process. Outside of those two folders there are .csv tables describing the doctors who participated in the study (tables labeled "doctor") and the marks that they gave to each image (tables labeled "question").
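The layout described above could be indexed with a short script like the following. This is a minimal sketch, not part of the dataset: the function names, the set of image file extensions, and the `epoch_100`-style file name pattern are all assumptions, since the exact naming scheme and .csv columns are not documented here. Adjust them after inspecting the downloaded files.

```python
import re
from pathlib import Path

# Assumption: images are stored in one of these common formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def images_by_prompt(images_dir):
    """Map each prompt-named subfolder to the sorted image paths inside it."""
    result = {}
    for sub in sorted(Path(images_dir).iterdir()):
        if sub.is_dir():
            result[sub.name] = sorted(
                p for p in sub.rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES
            )
    return result


def find_tables(root):
    """Group the .csv files directly under root by the label in their name."""
    tables = {"doctor": [], "question": []}
    for path in sorted(Path(root).glob("*.csv")):
        for label in tables:
            if label in path.name.lower():
                tables[label].append(path)
    return tables


def epoch_from_name(filename):
    """Return the epoch number found in a file name, or None.

    Assumption: the epoch appears as 'epoch' followed by digits,
    optionally separated by '_' or '-'. The real pattern may differ.
    """
    match = re.search(r"epoch[_-]?(\d+)", filename, re.IGNORECASE)
    return int(match.group(1)) if match else None
```

For example, `images_by_prompt("Images")` would return one entry per prompt subfolder, and `find_tables(".")` would separate the "doctor" tables from the "question" tables so they can be loaded with any .csv reader.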