Standard Dataset
Luna 16
- Submitted by: Nan Wang
- Last updated: Sun, 01/12/2025 - 22:01
- DOI: 10.21227/0kjp-g187
Abstract
About the data
The data come from the publicly available LIDC/IDRI database, which is released under the Creative Commons Attribution 3.0 Unported License. Scans with a slice thickness greater than 2.5 mm were excluded, leaving 888 CT scans in total. The LIDC/IDRI database also contains annotations collected during a two-phase annotation process by 4 experienced radiologists. Each radiologist marked every lesion they identified as a non-nodule, a nodule < 3 mm, or a nodule >= 3 mm. See this publication for the details of the annotation process. The reference standard of the LUNA16 challenge consists of all nodules >= 3 mm accepted by at least 3 out of 4 radiologists. Annotations that are not included in the reference standard (non-nodules, nodules < 3 mm, and nodules annotated by only 1 or 2 radiologists) are referred to as irrelevant findings. The list of irrelevant findings is provided with the evaluation script (annotations_excluded.csv).
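The inclusion rules described above can be illustrated with a short sketch. This is not the challenge's own code: the `Finding` record, its field names, and the helper functions are hypothetical and serve only to restate the three thresholds from the text (slice thickness of at most 2.5 mm, nodule diameter of at least 3 mm, agreement of at least 3 of 4 radiologists).

```python
from dataclasses import dataclass

# Thresholds taken from the dataset description.
MAX_SLICE_THICKNESS_MM = 2.5  # scans with thicker slices are excluded
MIN_DIAMETER_MM = 3.0         # reference nodules must be >= 3 mm
MIN_READERS = 3               # accepted by at least 3 of 4 radiologists


@dataclass
class Finding:
    """Hypothetical record for one annotated finding (not the official schema)."""
    scan_id: str
    diameter_mm: float
    is_nodule: bool   # False for marks labelled "non-nodule"
    n_readers: int    # number of radiologists (0-4) who accepted the finding


def scan_is_included(slice_thickness_mm: float) -> bool:
    """A scan is kept only if its slice thickness is at most 2.5 mm."""
    return slice_thickness_mm <= MAX_SLICE_THICKNESS_MM


def in_reference_standard(f: Finding) -> bool:
    """True for nodules >= 3 mm accepted by at least 3 of 4 radiologists."""
    return (
        f.is_nodule
        and f.diameter_mm >= MIN_DIAMETER_MM
        and f.n_readers >= MIN_READERS
    )


def split_findings(findings):
    """Separate reference-standard nodules from irrelevant findings."""
    reference = [f for f in findings if in_reference_standard(f)]
    irrelevant = [f for f in findings if not in_reference_standard(f)]
    return reference, irrelevant
```

Under these assumptions, a 5 mm nodule marked by only two radiologists lands in the irrelevant list, as does any non-nodule or sub-3 mm finding.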
This dataset contains the downloadable data for the LUNA16 challenge, available at https://luna16.grand-challenge.org/.
The LUNA16 dataset, used in this study, is a benchmark medical imaging dataset designed for lung nodule detection in low-dose CT scans. It includes high-resolution 3D CT images and annotated nodule locations, enabling robust evaluation of anomaly detection frameworks. In our research, we preprocessed the data by normalizing pixel intensities, resampling to uniform voxel sizes, and segmenting lung regions to extract nodule patches. These patches were used to train and evaluate our proposed framework, achieving superior accuracy and computational efficiency compared to existing methods. Detailed processing steps and code snippets are available for reproducibility.
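As a rough illustration of the preprocessing steps named above, the following sketch covers intensity normalization, resampling to a uniform voxel size, and patch extraction. It is not the authors' pipeline: the Hounsfield window of -1000 to 400, the 1 mm target spacing, the nearest-neighbour resampling, the 32-voxel patch size, and all function names are assumptions chosen for illustration. Lung segmentation is left out.

```python
import numpy as np


def normalize_hu(volume, hu_min=-1000.0, hu_max=400.0):
    """Clip intensities to an assumed HU window and scale to [0, 1]."""
    clipped = np.clip(volume.astype(np.float32), hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)


def resample_nearest(volume, spacing, new_spacing=(1.0, 1.0, 1.0)):
    """Resample a (z, y, x) volume to new_spacing by nearest-neighbour lookup.

    spacing and new_spacing are voxel sizes in mm, one value per axis.
    """
    spacing = np.asarray(spacing, dtype=np.float64)
    new_spacing = np.asarray(new_spacing, dtype=np.float64)
    # Output shape that covers the same physical extent at the new spacing.
    new_shape = np.round(np.array(volume.shape) * spacing / new_spacing)
    new_shape = np.maximum(new_shape.astype(int), 1)
    # For each output index, pick a source index along that axis.
    index_per_axis = []
    for axis, n in enumerate(new_shape):
        src = (np.arange(n) * new_spacing[axis] / spacing[axis]).astype(int)
        index_per_axis.append(np.minimum(src, volume.shape[axis] - 1))
    return volume[np.ix_(*index_per_axis)]


def extract_patch(volume, center, size=32):
    """Cut a cubic patch of side `size` around a (z, y, x) voxel center.

    The volume is zero-padded so patches near the border keep their shape.
    """
    half = size // 2
    padded = np.pad(volume, half, mode="constant", constant_values=0.0)
    z, y, x = (int(c) for c in center)
    # After padding by `half`, the patch starts at the original center index.
    return padded[z:z + size, y:y + size, x:x + size]
```

A typical call order under these assumptions would be `normalize_hu`, then `resample_nearest` with the scan's voxel spacing, then `extract_patch` at each annotated nodule location converted to voxel coordinates. An interpolating resampler would usually give smoother results than the nearest-neighbour lookup used here, which was chosen only to keep the sketch dependent on NumPy alone.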