Heart disease is the [leading cause of death in the United States](https://www.cdc.gov/heartdisease/facts.htm). Researchers use a range of data mining techniques to help health care professionals diagnose it. In this project, you will examine the relationship between the maximum heart rate a person can achieve during exercise and the likelihood of developing heart disease. Using multiple logistic regression, you will adjust for the confounding effects of age and gender. This project uses the Cleveland heart disease dataset.
1. Heart disease and potential risk factors
2. Converting diagnosis class into outcome variable
3. Identifying important clinical variables
4. Explore the associations graphically (i)
5. Explore the associations graphically (ii)
6. Explore the associations graphically (iii)
7. Putting all three variables in one model
8. Extracting useful information from the model output
9. Predicted probabilities from our model
10. Model performance metrics
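The last three tasks (reading the model output, computing predicted probabilities, and measuring performance) can be sketched in a few lines. The example below is a minimal illustration in Python, not the project's own code: the coefficient values, variable names (`age`, `sex`, `max_heart_rate`), and the 0.5 threshold are all hypothetical placeholders, and none of them come from a model fitted to the Cleveland data.

```python
import math


def odds_ratios(coefficients):
    """Exponentiate log-odds coefficients to get odds ratios."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


def predicted_probability(intercept, coefficients, values):
    """Apply the inverse logit to the linear predictor b0 + sum(b_i * x_i)."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def classification_metrics(actual, probabilities, threshold=0.5):
    """Compare thresholded predictions with the observed 0/1 outcomes."""
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": accuracy,
        "classification_error": 1.0 - accuracy,
    }


# HYPOTHETICAL coefficients on the log-odds scale. Signs and magnitudes are
# placeholders for illustration only, not estimates from any real data.
INTERCEPT = 3.0
COEFFICIENTS = {"age": 0.03, "sex": 1.5, "max_heart_rate": -0.04}

# A made-up patient profile (sex coded as 0/1 by assumption).
patient = {"age": 45, "sex": 1, "max_heart_rate": 150}

if __name__ == "__main__":
    print(odds_ratios(COEFFICIENTS))
    print(predicted_probability(INTERCEPT, COEFFICIENTS, patient))
```

Because the model holds age and sex fixed while maximum heart rate varies, each odds ratio is read as the effect of one variable with the other two held constant, which is what adjusting for confounders means in practice.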
Senior Data Scientist at Uptake
Amy Yang is a Senior Data Scientist at Uptake, where she conducts industrial analytics and builds prediction models for major industries, helping them increase productivity, security, safety, and reliability. She began using R for simulation and statistical analysis during her studies at the University of Pennsylvania, where she received her MS degree in Biostatistics. She also teaches R programming and statistics courses for graduate students.