Citation Author(s):
Hunan Normal University
Hunan Normal University
Submitted by:
Jintao Zhai
Last updated:
Mon, 12/11/2023 - 06:46
Data Format:


UTERUS: The uterus dataset (Huang et al. 2021) was collected from the HIFU Pro2008 treatment device of Shenzhen ProHuiren Company. The dataset comprises 495 ultrasound monitoring images of the uterus acquired during HIFU treatment, with 330 images randomly selected for training, 50 for validation, and 115 for testing. The treatment target is the tumor region in each ultrasound image, as marked by professional doctors, and the image size is 448×544 pixels.
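
The random 330/50/115 partition described above can be reproduced in spirit with a seeded shuffle. This is only a minimal sketch: the function name `split_indices` and the seed value are assumptions made for illustration, and the actual split used for UTERUS is not specified here.

```python
import random


def split_indices(n_total, n_train, n_val, n_test, seed=0):
    """Randomly partition range(n_total) into three disjoint index lists.

    The seed is an illustrative assumption; it only makes the split repeatable.
    """
    if n_train + n_val + n_test != n_total:
        raise ValueError("split sizes must sum to the dataset size")
    indices = list(range(n_total))
    # A private Random instance avoids touching the global random state.
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Sizes taken from the UTERUS description: 495 images split 330/50/115.
train_idx, val_idx, test_idx = split_indices(495, 330, 50, 115)
```

Fixing the seed keeps the three subsets identical across runs, which matters when several models are compared on the same test images.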

BUSI: The breast ultrasound dataset consists of 780 images with pixel-wise breast cancer annotations (Al-Dhabyani et al. 2020). The dataset was collected with two ultrasonic imaging devices, the LOGIQ E9 and the LOGIQ E9 Agile. BUSI includes three classes of breast cases: 437 benign, 210 malignant, and 133 normal. All images were preprocessed by Al-Dhabyani et al. (2020). In the experiments in this paper, we used only the benign and malignant cases, for a total of 647 images.
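
The class filtering described above can be sketched as follows. The record layout (a list of `(path, label)` pairs), the placeholder file names, and the helper `select_lesion_cases` are hypothetical and are not part of the BUSI release. The benign count of 437 is the value implied by the 780-image total and the 647-image benign-plus-malignant subset.

```python
# Per-class counts; the benign value follows from 780 - 210 - 133.
BUSI_COUNTS = {"benign": 437, "malignant": 210, "normal": 133}


def select_lesion_cases(records, keep=("benign", "malignant")):
    """Keep only the records whose label is in `keep`."""
    return [(path, label) for path, label in records if label in keep]


# Placeholder records standing in for the real image files (hypothetical names).
records = [
    (f"{label}_{i}.png", label)
    for label, count in BUSI_COUNTS.items()
    for i in range(count)
]

lesion_records = select_lesion_cases(records)
```

Dropping the normal class leaves only images that contain a lesion to segment.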

BUSC: The Mendeley ultrasound dataset (Paulo Sergio Rodrigues, 2017) includes 100 benign images and 150 malignant images. The original resolution of the ultrasound images is 64×64 pixels, later resized to 128×128 pixels. The dataset was built for classification, so no ground-truth images are provided. Therefore, the benign and malignant tumor images were annotated with the help of an experienced radiologist for model training.

BUS: The UDIAT dataset was collected at the UDIAT Diagnostic Centre of the Parc Tauli Corporation, Sabadell, Spain, using a Siemens ACUSON scanner. The dataset contains 163 US images: 109 benign and 54 malignant cases, with only one lesion per image.

STU: The dataset, comprising 42 breast ultrasound images from the Hospital of Shantou University, is relatively small. Despite its limited size, it shares the same segmentation object as BUSI. Consequently, we designate this dataset as the unseen test set to assess the generalization ability of various models.

DDTI: The thyroid nodule dataset contains 637 ultrasound images with pixel-wise thyroid nodule annotations (Pedraza et al. 2015). DDTI was collected with two ultrasonic imaging devices, the TOSHIBA Nemio 30 and the TOSHIBA Nemio MX. We adopt the data preprocessed by Gong et al. (2023).

TN3K: The dataset comprises 3493 ultrasound images annotated for thyroid nodules at Zhujiang Hospital, South Medical University. Following Gong et al.'s methodology, the TN3K dataset is partitioned into subsets of 2303, 576, and 614 images for training, validation, and testing, respectively. In this paper, we use only the 614 test images as an unseen dataset to evaluate all the foundation models.


We investigate the generalization of the proposed method and construct a large ultrasonic dataset named LUD.