Thyroid Cancer

Citation Author(s):
Aminul Haque
Submitted by:
S M Aminul Haque
DOI:
10.21227/fabw-q842

Abstract

The medical community continually strives to improve the quality of care that patients receive. Prognosis predictions are essential when doctors and patients choose a course of treatment. Recent years have seen the development of numerous new cancer survival prediction models, and most attempts to predict the prognosis of patients with malignancies rely on classification techniques. Using only a subset of the SEER (Surveillance, Epidemiology, and End Results) data, we obtained significantly different results across experiments. The models were built with machine learning techniques, using univariate feature selection and correlation analysis. We illustrate the variation in results and the differences in impurity that arise from varying the amount of data and the critical factors used. Seventeen critical factors were identified for evaluating the effectiveness of an estimation technique. The most effective machine learning algorithms were Logistic Regression, Gradient Boosting Classifier, Random Forest, Extra Trees, Light Gradient Boosting, AdaBoost Classifier, and Hist Gradient Boosting.
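The abstract mentions selecting univariate features and calculating correlations but does not include the code used. As a rough illustration only, the following minimal sketch ranks each feature by the absolute value of its Pearson correlation with a binary label and keeps the top k. The helper names `pearson` and `select_top_k` are hypothetical, the data are synthetic stand-ins rather than SEER records, and the small value of k is chosen for the toy data (the abstract reports seventeen factors).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_k(rows, labels, k):
    """Score each feature column on its own (univariate) and return the
    sorted indices of the k columns most correlated with the label."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


# Synthetic stand-in data (an assumption, NOT SEER records):
# columns 0-2 track the label, columns 3-5 are pure noise.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
rows = []
for y in labels:
    informative = [y + rng.gauss(0, 0.5) for _ in range(3)]
    noise = [rng.gauss(0, 1) for _ in range(3)]
    rows.append(informative + noise)

selected = select_top_k(rows, labels, k=3)
print("Selected feature indices:", selected)
```

Scoring each column independently keeps the selection cheap and easy to inspect, at the cost of ignoring interactions between features; the selected columns would then be passed to whichever classifier is being evaluated.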

Instructions:

Obtained from SEER data.