AI in Medical Applications
We present the SynSUM benchmark, a synthetic dataset linking unstructured clinical notes to structured background variables. The dataset consists of 10,000 artificial patient records, each containing tabular variables (such as symptoms, diagnoses, and underlying conditions) and an associated clinical note describing the fictional patient encounter in the domain of respiratory diseases. The tabular portion of the data is generated through a Bayesian network, where both the causal structure between the variables and the conditional probabilities are proposed by an expert based on domain knowledge.
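
To illustrate how tabular records can be drawn from an expert-specified Bayesian network, here is a minimal sketch of ancestral sampling: each variable is sampled only after its parents, using a conditional probability table. The three variables, the structure, and every probability below are hypothetical placeholders chosen for illustration; they are not the SynSUM network or its values.

```python
import random

# Hypothetical toy network: pneumonia -> fever, pneumonia -> cough.
# All probabilities are illustrative assumptions, not values from SynSUM.
P_PNEUMONIA = 0.05
P_FEVER_GIVEN = {True: 0.80, False: 0.10}   # P(fever | pneumonia)
P_COUGH_GIVEN = {True: 0.90, False: 0.20}   # P(cough | pneumonia)


def sample_record(rng):
    """Draw one record by sampling each variable after its parents."""
    pneumonia = rng.random() < P_PNEUMONIA
    fever = rng.random() < P_FEVER_GIVEN[pneumonia]
    cough = rng.random() < P_COUGH_GIVEN[pneumonia]
    return {"pneumonia": pneumonia, "fever": fever, "cough": cough}


def sample_records(n, seed=0):
    """Draw n independent records with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [sample_record(rng) for _ in range(n)]


records = sample_records(10_000)
```

In a real pipeline the structure and tables would come from the domain expert, and a dedicated Bayesian network library would likely replace the hand-written dictionaries.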
![](https://ieee-dataport.org/sites/default/files/styles/3x2/public/tags/images/dna-3598439_1920.jpg?itok=dq6kcJl6)
This dataset compiles an analysis of state-of-the-art techniques and systems for seizure detection and classification, drawn from published papers and studies. For each publication it records detailed metadata, including the year, methodology, seizure types (both ILAE-2017 and paper-specific), datasets used, and biomarker utilization. It also provides performance metrics such as accuracy, sensitivity, specificity, false-positive rate, and AUC-ROC, along with technical details about the machine learning models, feature extraction techniques, and biomarkers.
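
As a reminder of how several of the listed metrics relate to one another, the following sketch derives them from confusion-matrix counts. The function name and the example counts are hypothetical and do not come from the dataset. The false-positive rate here is the fraction of negatives flagged as positive; seizure-detection studies often report false alarms per hour instead, which would additionally require the recording duration.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from raw counts.

    Assumes at least one positive and one negative example, so that
    no denominator is zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),          # true-positive rate (recall)
        "specificity": tn / (tn + fp),          # true-negative rate
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }


# Illustrative counts only.
metrics = detection_metrics(tp=40, fp=5, tn=45, fn=10)
```

AUC-ROC is omitted because it cannot be derived from a single set of counts; it needs the classifier's scores across thresholds.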