Heart Disease Dataset (Comprehensive)

Citation Author(s):
Manu Siddhartha
Liverpool John Moores University
Submitted by:
MANU SIDDHARTHA
Last updated:
Fri, 11/06/2020 - 04:17
DOI:
10.21227/dz4t-cm36

Abstract 

This heart disease dataset was curated by combining five popular heart disease datasets that were previously available only independently. The five datasets are combined over 11 common features, which makes this the largest heart disease dataset available so far for research purposes. The five datasets used for its curation are:

  1. Cleveland
  2. Hungarian
  3. Switzerland
  4. Long Beach VA
  5. Statlog (Heart) Data Set.

This dataset consists of 1190 instances with 11 features. The source datasets were collected and combined in one place to help advance research on machine learning and data mining algorithms related to coronary artery disease (CAD), and ultimately to advance clinical diagnosis and early treatment.
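As a rough illustration of what combining sources "over common features" means, the sketch below keeps only the columns shared by every source table and stacks the rows. The column names (`age`, `chol`, `thal`, `target`) and the two toy tables are hypothetical, not the dataset's actual schema, and this is not necessarily how the curator performed the merge.

```python
def combine_on_common_features(tables):
    """Stack rows from several non-empty tables, keeping only shared columns."""
    # Each table is a list of dict rows; take the column set from its first row.
    column_sets = [set(table[0].keys()) for table in tables]
    common = set.intersection(*column_sets)
    # Preserve the column order of the first table for readability.
    ordered = [col for col in tables[0][0] if col in common]
    combined = []
    for table in tables:
        for row in table:
            combined.append({col: row[col] for col in ordered})
    return ordered, combined


# Toy stand-ins for two source datasets (hypothetical values and column names).
source_a = [{"age": 63, "chol": 233, "thal": 6, "target": 0}]
source_b = [{"age": 40, "chol": 289, "target": 0},
            {"age": 49, "chol": 180, "target": 1}]

columns, rows = combine_on_common_features([source_a, source_b])
print(columns)    # columns shared by both sources
print(len(rows))  # total number of stacked rows
```

Columns that exist in only some sources (here `thal`) are dropped, which is why the combined table has fewer features than any single source might offer.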

Instructions: 

This dataset can be used for building a predictive machine learning model for early-stage heart disease detection.

Comments

ok

Submitted by Ashish Basnet on Fri, 02/12/2021 - 12:45

1

Submitted by moslem darvishi on Mon, 03/01/2021 - 05:16

2

Submitted by Kevin zhang on Wed, 03/10/2021 - 08:14

This dataset includes 272 duplicate records; notably, all of the Statlog data is already present in the original dataset. Also, every location where data was previously missing appears to have simply been set to 0. User beware.

Submitted by Jeremy Huckins on Thu, 05/27/2021 - 10:39
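Both issues reported here can be checked with a few lines of code. The following is a minimal sketch on toy rows; the column names and values are hypothetical, and a real check would load the downloaded file instead.

```python
from collections import Counter


def count_duplicate_rows(rows):
    """Count rows that exactly repeat an earlier row."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(n - 1 for n in counts.values())


def count_zeros(rows, column):
    """Count rows whose value in `column` is exactly 0 (a possible missing-value marker)."""
    return sum(1 for row in rows if row[column] == 0)


rows = [
    {"age": 63, "chol": 233, "target": 0},
    {"age": 40, "chol": 0, "target": 1},
    {"age": 63, "chol": 233, "target": 0},  # exact repeat of the first row
]
print(count_duplicate_rows(rows))
print(count_zeros(rows, "chol"))
```

Note that an exact-match check only flags rows that are identical in every column; two different patients could in principle share identical values, so a flagged row is a candidate duplicate rather than a proven one.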

Hello, I will be working with this dataset soon. If possible, could you share in full and clearly what you have found out about its problems?

Submitted by Arman Daliri on Sat, 06/12/2021 - 16:39

Your name
Rabia Almamlook

Submitted by Rabia Almamlook on Sat, 11/13/2021 - 15:07

Hey Jeremy, how would you suggest dealing with the missing data?

Submitted by Amir Jayousi on Wed, 12/14/2022 - 18:15

Thank you!

Submitted by JIAN HAO LOO on Sat, 09/18/2021 - 15:16

Thanks

Submitted by Aqil Azmi on Sun, 09/26/2021 - 16:04

cholesterol has 172 (14.5%) zeros

Submitted by Chee Hong Wong on Wed, 09/29/2021 - 06:35
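The reported percentage is consistent with the 1190 instances stated in the abstract, as a quick check shows:

```python
total_instances = 1190  # instance count stated in the abstract
zero_cholesterol = 172  # zero count reported in the comment above

share = 100 * zero_cholesterol / total_instances
print(f"{share:.1f}%")  # prints 14.5%
```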

ok

Submitted by Vincent Udechukwu on Thu, 12/22/2022 - 01:37

azhe

Submitted by EJO LIU on Mon, 10/16/2023 - 07:55

How should the zeros in the cholesterol column be dealt with?

Submitted by Ayisha COK on Thu, 07/04/2024 - 18:38

Check for outliers first. If there are too many, replace the zeros with the median value; otherwise, use the mean value.

Submitted by Arun M on Thu, 09/12/2024 - 22:25
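A minimal sketch of the suggestion above, assuming that zeros in the cholesterol column stand for missing values (as other comments on this page suggest). The function name `impute_zeros` and the sample values are hypothetical. The fill value is computed from the non-zero entries only, so the placeholder zeros do not drag it down.

```python
from statistics import mean, median


def impute_zeros(values, strategy="median"):
    """Replace zeros with the median or mean of the non-zero values."""
    observed = [v for v in values if v != 0]
    if not observed:
        return list(values)  # nothing to estimate from; leave unchanged
    fill = median(observed) if strategy == "median" else mean(observed)
    return [fill if v == 0 else v for v in values]


chol = [233, 0, 289, 180, 0, 564]  # 564 acts as an outlier in this toy sample
print(impute_zeros(chol, "median"))
print(impute_zeros(chol, "mean"))
```

In this toy sample the single large value pulls the mean well above the median, which is the reason for preferring the median when outliers are common. Either way, imputed values are estimates, and it may be worth keeping a flag that records which entries were filled in.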

This dataset includes 272 duplicate records; notably, all of the Statlog data is already present in the original dataset. Also, every location where data was previously missing appears to have simply been set to 0. User beware.

Submitted by jess wei on Tue, 10/01/2024 - 19:54

Documentation

Attachment: documentation.pdf
Size: 410.81 KB