HEART DISEASE DATASET

Citation Author(s):
KEXIN ZHENG
Submitted by:
KEXIN ZHENG
Last updated:
Thu, 01/02/2025 - 03:43
DOI:
10.21227/49m4-zh81

Abstract 

Big data technology is developing rapidly and now plays an important role in many areas of life, from healthcare and financial services to smart city construction and personalized recommendations. By analyzing and processing massive volumes of data, it improves the accuracy of decision-making and supports better resource allocation, higher operational efficiency, and innovation. The use of big data to prevent and control heart disease is on the rise. By leveraging big data, we can better understand heart disease patterns, optimize prevention strategies, and develop more effective treatments. However, the class imbalance in medical data on heart disease makes such research difficult, and addressing it requires careful data preprocessing. In addition, identifying the truly useful features and the most effective predictive models for heart disease is a challenge amid the massive amount of medical data. This is an imbalanced dataset on heart disease in which healthy respondents far outnumber those with heart disease.

Instructions: 

Data Source: Behavioral Risk Factor Surveillance System (2015)
Data information: This dataset contains 253,680 rows and 22 attributes. Of those rows, 229,787 respondents did not have heart disease and 23,893 did.
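To make the imbalance concrete, the sketch below derives the positive share, the negative-to-positive ratio, and one common choice of class weights from the two counts stated above. The weighting formula (n_samples / (n_classes * class_count)) is an assumption chosen for illustration, not a method prescribed by this dataset; it is the same heuristic scikit-learn uses for `class_weight="balanced"`. The function name `balanced_class_weights` is hypothetical.

```python
# Class counts as stated in the dataset description.
N_NEGATIVE = 229_787  # HeartDiseaseorAttack == 0
N_POSITIVE = 23_893   # HeartDiseaseorAttack == 1
N_TOTAL = N_NEGATIVE + N_POSITIVE


def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count).

    Illustrative heuristic only: rarer classes receive larger weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


counts = {0: N_NEGATIVE, 1: N_POSITIVE}
weights = balanced_class_weights(counts)
prevalence = N_POSITIVE / N_TOTAL          # share of rows with heart disease
imbalance_ratio = N_NEGATIVE / N_POSITIVE  # negatives per positive

print(f"rows: {N_TOTAL}")
print(f"positive share: {prevalence:.2%}")
print(f"negative:positive ratio: {imbalance_ratio:.2f}:1")
print(f"class weights: {weights}")
```

Roughly one row in eleven is a positive case, so a model that always predicts "no heart disease" would already be right about 90% of the time. That is why accuracy alone is a weak metric for this dataset.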
Attributes,Range,Interpretation:
HeartDiseaseorAttack,0/1,Target; 1: Has heart disease
HighBP,0/1,1: Diagnosed with high blood pressure
HighChol,0/1,1: Diagnosed with high cholesterol
CholCheck,0/1,1: Cholesterol checked within the past 5 years
BMI,12-98,Body Mass Index value
Smoker,0/1,1: Smoker
Stroke,0/1,1: Cardiovascular disease or stroke
Diabetes,0-2,0: No diabetes or only during pregnancy; 1: Pre-diabetes or borderline; 2: Diagnosed diabetes
PhysActivity,0/1,1: Physically active
Fruits,0/1,1: Usually eats fruit
Veggies,0/1,1: Usually eats vegetables
HvyAlcoholConsump,0/1,1: Heavy alcohol consumption
AnyHealthcare,0/1,1: Has medical care
NoDocbcCost,0/1,1: Could not see a doctor because of cost
GenHlth,1-5,General health level; 1 is the best
MentHlth,0-30,Total days in a month with mental health challenges
PhysHlth,0-30,Total days in a month with physical health challenges
DiffWalk,0/1,1: Has difficulty walking
Sex,0/1,1: Male
Age,1-13,Coded in 5-year age groups: 1 = 18–24 up to 13 = 80 and older
Education,1-6,Education level; 6 is the highest
Income,1-8,Income level; 8 is the highest
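The attribute table above can be turned into a simple range check before any modeling. The following is a minimal sketch, assuming the data is available as one record per row keyed by the attribute names listed above. The names `SCHEMA`, `validate_row`, and `age_group_label` are hypothetical helpers, and the intermediate age bands are derived from the stated 5-year increments rather than listed explicitly in the source.

```python
# Inclusive (min, max) range for each attribute, transcribed from the table.
SCHEMA = {
    "HeartDiseaseorAttack": (0, 1), "HighBP": (0, 1), "HighChol": (0, 1),
    "CholCheck": (0, 1), "BMI": (12, 98), "Smoker": (0, 1), "Stroke": (0, 1),
    "Diabetes": (0, 2), "PhysActivity": (0, 1), "Fruits": (0, 1),
    "Veggies": (0, 1), "HvyAlcoholConsump": (0, 1), "AnyHealthcare": (0, 1),
    "NoDocbcCost": (0, 1), "GenHlth": (1, 5), "MentHlth": (0, 30),
    "PhysHlth": (0, 30), "DiffWalk": (0, 1), "Sex": (0, 1), "Age": (1, 13),
    "Education": (1, 6), "Income": (1, 8),
}


def age_group_label(code):
    """Map an Age code (1-13) to its age band, assuming 5-year increments."""
    if not 1 <= code <= 13:
        raise ValueError(f"Age code out of range: {code}")
    if code == 1:
        return "18-24"
    if code == 13:
        return "80+"
    low = 25 + 5 * (code - 2)  # code 2 -> 25-29, ..., code 12 -> 75-79
    return f"{low}-{low + 4}"


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, (lo, hi) in SCHEMA.items():
        if name not in row:
            problems.append(f"{name}: missing")
            continue
        try:
            value = float(row[name])
        except (TypeError, ValueError):
            problems.append(f"{name}: not numeric ({row[name]!r})")
            continue
        if not lo <= value <= hi:
            problems.append(f"{name}: {value} outside [{lo}, {hi}]")
    return problems


# Made-up records for illustration only; these are not rows from the dataset.
sample = {name: lo for name, (lo, hi) in SCHEMA.items()}
sample.update({"BMI": 28, "Age": 9})
bad = dict(sample, BMI=150)

print(validate_row(sample))            # no problems expected
print(validate_row(bad))               # BMI flagged as out of range
print(age_group_label(sample["Age"]))  # age band for code 9
```

Because `validate_row` converts each value with `float()`, it also accepts the string values produced by `csv.DictReader` when the file is read with Python's standard library.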