Cancer Data

Studies indicate that the occurrence and development of lung adenocarcinoma (LUAD) are regulated by ferroptosis and long non-coding RNAs (lncRNAs). However, the role of a ferroptosis-related lncRNA signature in the prognosis of LUAD is unclear. This study aimed to identify a ferroptosis-related lncRNA signature for predicting the prognosis of LUAD. RNA expression profiles and clinical data of LUAD patients were downloaded from public databases. A Cox regression model was used to construct a multi-lncRNA signature.
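
As a rough illustration of how a multi-lncRNA signature from a Cox model is typically applied, the sketch below computes a per-patient risk score as the sum of coefficient times expression (the Cox linear predictor). The lncRNA names, coefficients, expression values, and the median split into high- and low-risk groups are all hypothetical assumptions for illustration; the abstract does not state them, and fitting the Cox model itself is not shown.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Cox linear predictor: sum of coefficient * expression over signature lncRNAs."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption made for this sketch.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients, as if taken from an already fitted Cox model.
coefficients = {"lncRNA_A": 0.5, "lncRNA_B": -0.25}

# Hypothetical expression values for three patients.
patients = {
    "P1": {"lncRNA_A": 2.0, "lncRNA_B": 4.0},
    "P2": {"lncRNA_A": 4.0, "lncRNA_B": 0.0},
    "P3": {"lncRNA_A": 1.0, "lncRNA_B": 8.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

A positive coefficient raises the score as expression increases, while a negative coefficient lowers it, so the sign indicates whether a given lncRNA is associated with higher or lower hazard.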


CT RECIST response, as measured by the change in tumor diameter, can accurately reflect the objective response rate for advanced NSCLC patients. However, there is an obvious discordance between CT RECIST response and prognostic indicators. Thus, our study aimed to identify a new CT RECIST response indicator at the early treatment stage that reflects prognosis more accurately. We studied 916 tumor lesions obtained through deep learning and found that the shape of the lesions was irregular.
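
For context, the conventional diameter-based classification can be sketched as follows. This is a simplified version of the standard RECIST 1.1 target-lesion thresholds, not the new indicator proposed by the study; the function name and parameters are hypothetical, and rules for lymph nodes, non-target lesions, and new lesions are deliberately left out.

```python
def recist_response(baseline_mm, nadir_mm, current_mm):
    """Classify target-lesion response from sums of diameters (in mm).

    Simplified RECIST 1.1 thresholds:
      CR: all target lesions have disappeared (sum is 0)
      PD: at least a 20% increase over the nadir and at least 5 mm absolute increase
      PR: at least a 30% decrease from baseline
      SD: none of the above
    """
    if current_mm == 0:
        return "CR"
    if current_mm - nadir_mm >= 5 and current_mm >= 1.2 * nadir_mm:
        return "PD"
    if current_mm <= 0.7 * baseline_mm:
        return "PR"
    return "SD"


# Hypothetical measurements, for illustration only.
examples = [
    recist_response(100, 100, 60),  # 40% shrinkage from baseline
    recist_response(100, 50, 65),   # regrowth of 15 mm (30%) over the nadir
    recist_response(100, 100, 90),  # 10% shrinkage only
    recist_response(100, 40, 0),    # complete disappearance
]
```

Because the classification depends only on diameters, two lesions with the same diameter change but different shapes receive the same label, which is one reason a diameter-only measure may be limited for irregular lesions.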


The prognostic survival dataset, Pancreatic Cancer Survival based on Preoperative Features (PCSPF), was constructed to explore the impact of key preoperative features on prognosis based on the follow-up data of patients with pancreatic cancer at Changhai Hospital, Shanghai, China.


The data included here are the model training results associated with the paper "Distribution-Driven Augmentation of Real-World Datasets for Improved Cancer Diagnostics With Machine Learning". The paper focuses on using kernel density estimators to curate datasets by balancing classes and filling missing values through synthetically generated data. Additionally, the manuscript proposes a technique for joining distinct datasets so that a model can be trained with the necessary features from multiple datasets, as a type of transfer learning.
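
The general idea of kernel-density-based augmentation can be sketched as follows. This is a minimal single-feature illustration under stated assumptions, not the paper's implementation: it uses a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and hypothetical function names and data. Drawing from a Gaussian KDE amounts to picking an observed value at random and adding Gaussian noise scaled by the bandwidth.

```python
import random
from statistics import stdev


def kde_sample(values, n, bandwidth=None, rng=None):
    """Draw n synthetic values from a Gaussian KDE fitted to `values`.

    Requires at least two observed values so that a spread can be estimated.
    """
    rng = rng or random.Random(0)
    if bandwidth is None:
        # Normal-reference rule of thumb: 1.06 * sigma * n^(-1/5)
        bandwidth = 1.06 * stdev(values) * len(values) ** -0.2
    # Pick a random observation as the kernel centre, then add kernel noise.
    return [rng.gauss(rng.choice(values), bandwidth) for _ in range(n)]


def balance_classes(data, rng=None):
    """Oversample every class up to the size of the largest class."""
    rng = rng or random.Random(0)
    target = max(len(v) for v in data.values())
    return {
        label: list(v) + kde_sample(v, target - len(v), rng=rng)
        for label, v in data.items()
    }


def fill_missing(column, rng=None):
    """Replace None entries with draws from the KDE of the observed entries."""
    rng = rng or random.Random(0)
    observed = [x for x in column if x is not None]
    draws = iter(kde_sample(observed, len(column) - len(observed), rng=rng))
    return [x if x is not None else next(draws) for x in column]


# Hypothetical one-feature data, for illustration only.
balanced = balance_classes({
    "benign": [1.0, 1.2, 0.9, 1.1],
    "malignant": [3.0, 3.4],
})
filled = fill_missing([2.0, None, 2.5, None, 3.0])
```

Real records have many correlated features, so a practical version would need a multivariate density estimate or per-class conditioning; the one-dimensional form is used here only to keep the mechanism visible.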


The pathology files of 194 colon cancer patients, 137 breast cancer patients, 124 gastric cancer patients, and 169 thyroid cancer patients who were referred to the healthcare facilities of Qazvin Province, Iran, were examined for age, sex, surgery type, and pathological information. We collected the information between 2010 and 2020.


This dataset contains 37 estrogen receptor immunohistochemistry (ER-IHC) whole slide images (WSIs) obtained from Universiti Malaya Medical Centre (UMMC), Malaysia. The WSIs were scanned using a 3DHistech Pannoramic DESK scanner at 20x magnification, with approximate dimensions of 80,000 pixels in width and 200,000 pixels in height per WSI.


This dataset is extracted from the International Skin Imaging Collaboration (ISIC). Similar datasets are used for the annual ISIC Challenge, which presents an opportunity for the computer science community to produce algorithms that can outperform professional dermatologists. The submitted dataset contains approximately 1,000 images of malignant melanomas, as well as approximately 1,000 images of benign melanomas.


The medical community strives continually to improve the quality of care patients receive. Predictions of prognosis are essential for doctors and patients to choose a course of treatment. Recent years have witnessed the development of numerous new cancer survival prediction models. Most attempts to predict the prognosis of people with malignant development rely on classification techniques. We could experiment with significantly different results using only a subset of SEER (Surveillance, Epidemiology,


The development of high-throughput sequencing technologies, i.e., Next Generation Sequencing (NGS), is revolutionizing the exploration of cancer. Sequence datasets are highly complex, and mutations can occur randomly in DNA or RNA sequences, making cells sicker or less fit. Abnormal growth and behavior of genes in cells cause cancer. Cells carrying cancer driver genes grow when mutations occur. Identification of cancer driver genes is a critical and challenging issue for researchers.