Heart Disease Dataset (Comprehensive)
- Submitted by: MANU SIDDHARTHA
- DOI: 10.21227/dz4t-cm36
Abstract
This heart disease dataset was curated by combining five popular heart disease datasets that were already available independently but had not been combined before. The five datasets are merged over 11 common features, which makes this the largest heart disease dataset available so far for research purposes. The five datasets used for its curation are:
- Cleveland
- Hungarian
- Switzerland
- Long Beach VA
- Statlog (Heart) Data Set.
The combined dataset consists of 1190 instances with 11 features. The source datasets were collected and combined in one place to help advance research on machine learning and data mining algorithms related to coronary artery disease (CAD) and, ultimately, to advance clinical diagnosis and early treatment.
Instructions:
This dataset can be used to build a predictive machine learning model for early-stage heart disease detection.
How should the zeros in the cholesterol column be handled?
In reply to How to deal with the by Ayisha COK
Check for outliers first. If there are too many, replace the zeros with the median value; otherwise, use the mean value.
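The rule of thumb in this reply could be sketched roughly as below. This is only an illustration using the Python standard library: the function name `impute_zero_cholesterol`, the 1.5×IQR outlier test, and the 5% threshold for "too many" outliers are all assumptions made for the example, not something the dataset documentation specifies.

```python
from statistics import mean, median, quantiles


def impute_zero_cholesterol(values, outlier_share=0.05):
    """Replace zeros (treated as missing) with the median or mean of the non-zero values.

    Hypothetical helper: the 1.5*IQR rule and the outlier_share threshold
    are illustrative assumptions, not part of the dataset documentation.
    """
    observed = [v for v in values if v != 0]
    if len(observed) < 2:
        return list(values)  # too little data to estimate a fill value

    # Flag outliers among the observed values with the 1.5*IQR rule.
    q1, _, q3 = quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    n_outliers = sum(1 for v in observed if v < low or v > high)

    # Many outliers: the median is more robust. Otherwise use the mean.
    if n_outliers / len(observed) > outlier_share:
        fill = median(observed)
    else:
        fill = mean(observed)
    return [fill if v == 0 else v for v in values]
```

Computing the fill value from the non-zero entries only matters here: if the zeros were left in, they would pull both the mean and the median down.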
This dataset includes 272 duplicate records; notably, all of the Statlog data is already in the original dataset. Also, every location where data was previously missing appears to have simply been set to 0. User beware.
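One way to check the duplicate count reported here is to drop exact-match rows and see how many are removed. The sketch below uses only the Python standard library; `drop_duplicate_rows` is a hypothetical helper, and it assumes that "duplicate" means two rows with identical values in every column.

```python
def drop_duplicate_rows(rows):
    """Return (unique_rows, n_removed), keeping the first occurrence of each row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique, len(rows) - len(unique)


# Small in-memory example; real rows would come from the dataset file.
rows = [[63, 1, 145], [63, 1, 145], [41, 0, 130]]
unique, removed = drop_duplicate_rows(rows)
```

Exact-match deduplication cannot tell a repeated record from two different patients who happen to share the same 11 feature values, so the removed count is best read as an upper bound on true duplicates.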
Hi, the attribute 'st slope' takes the values 0, 1, 2, 3, but document.pdf says its values are 1, 2, 3. What does 0 mean?
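A quick way to find out how many rows carry a code that the documentation does not list is to count the values in the column. The sketch below is standard-library Python; `undocumented_codes` is a hypothetical helper, and the documented set {1, 2, 3} is taken from this comment's description of document.pdf rather than checked against the file itself.

```python
from collections import Counter


def undocumented_codes(values, documented):
    """Count each code that appears in the data but not in the documented set."""
    counts = Counter(values)
    return {code: n for code, n in counts.items() if code not in documented}


# Illustrative values only, not taken from the dataset.
st_slope = [1, 2, 0, 2, 3, 1, 0]
extra = undocumented_codes(st_slope, documented={1, 2, 3})
```

Given the earlier comment that missing values appear to have been set to 0, a small count of zeros here might indicate missing data rather than a fourth category, but the source does not confirm that.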