Coronary Heart Disease Prediction - A Shiny App

Anoop Swarup
June 15, 2018

Project for Coursera “Developing Data Products” Course

Could you or your patient have "Coronary Heart Disease"?

Enter some data on the risk factors:

  • Age - Age of the patient
  • LDL - Low Density Lipoprotein Cholesterol Level
  • Tobacco - Cumulative Tobacco Usage (kg)
  • Typea - Type A Behavior
  • Famhist - Family History of Heart Disease (Present / Absent)

You are then given an estimate of the Coronary Heart Disease (CHD) risk. The following slides describe the model behind this web-based Shiny app.
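As a rough sketch of the step from inputs to estimate: the five values can be packed into a one-row data frame and passed to a fitted logistic model, which returns a probability. The helper name `chd_risk`, the simulated stand-in data, and the rule used to generate it are all assumptions made for illustration; the app itself relies on a model trained on the SAheart data.

```r
# Simulated stand-in data with the same column names as the app's inputs.
# These values are NOT the SAheart data; they only make the sketch runnable.
set.seed(1)
n <- 200
demo <- data.frame(
  age     = sample(15:64, n, replace = TRUE),
  ldl     = round(runif(n, 1, 15), 2),
  tobacco = round(runif(n, 0, 30), 2),
  typea   = sample(13:78, n, replace = TRUE),
  famhist = factor(sample(c("Absent", "Present"), n, replace = TRUE),
                   levels = c("Absent", "Present"))
)
# Arbitrary generating rule so that the outcome depends on the inputs.
lin <- -4 + 0.05 * demo$age + 0.15 * demo$ldl + 0.08 * demo$tobacco
demo$chd <- rbinom(n, 1, plogis(lin))

# Logistic regression on the five risk factors.
fit <- glm(chd ~ age + tobacco + typea + ldl + famhist,
           data = demo, family = binomial)

# Hypothetical helper: turn the five inputs into a probability of CHD.
chd_risk <- function(model, age, ldl, tobacco, typea, famhist) {
  new_patient <- data.frame(
    age = age, ldl = ldl, tobacco = tobacco, typea = typea,
    famhist = factor(famhist, levels = c("Absent", "Present"))
  )
  # type = "response" returns a probability rather than log-odds.
  unname(predict(model, newdata = new_patient, type = "response"))
}

risk <- chd_risk(fit, age = 50, ldl = 5.2, tobacco = 3, typea = 55,
                 famhist = "Present")
```

Here `risk` is a single number between 0 and 1. Because the model in this sketch is fitted to simulated data, the value itself carries no medical meaning.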

The Data and Model Development

  • The regression model is based on data from a study by Rousseauw et al. (1983), published in the South African Medical Journal. We first fitted a generalized linear model (logistic regression) using all the variables in the 'SAheart' dataset.
fit <- glm(factor(chd) ~ ., data=SAheart, family = binomial)
  • This full model identified the statistically significant predictors, which we then used in the model behind the Shiny app: tobacco, ldl, famhist, typea, and age.

  • We partitioned the 'SAheart' data into training (70%) and test (30%) datasets. The model was then built on the training dataset, and tested on the test dataset.
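The two steps above (screening predictors with a full model, then splitting the data) can be sketched in base R. This is a minimal illustration under stated assumptions: the data frame `demo` is simulated and only mimics some SAheart column names, the 0.05 significance cutoff is an assumed threshold, and the plain random split with `sample()` stands in for whatever partitioning function was actually used.

```r
# Simulated stand-in for SAheart (some column names match; values are made up).
set.seed(42)
n <- 300
demo <- data.frame(
  age     = sample(15:64, n, replace = TRUE),
  ldl     = round(runif(n, 1, 15), 2),
  tobacco = round(runif(n, 0, 30), 2),
  typea   = sample(13:78, n, replace = TRUE),
  obesity = round(runif(n, 15, 45), 2),
  famhist = factor(sample(c("Absent", "Present"), n, replace = TRUE))
)
# Arbitrary generating rule so that the outcome depends on some inputs.
lin <- -4 + 0.05 * demo$age + 0.15 * demo$ldl + 0.08 * demo$tobacco
demo$chd <- rbinom(n, 1, plogis(lin))

# Step 1: fit the full model and list the terms with p < 0.05 (assumed cutoff).
full_fit <- glm(factor(chd) ~ ., data = demo, family = binomial)
coefs <- summary(full_fit)$coefficients
p_values <- coefs[-1, "Pr(>|z|)"]          # drop the intercept row
significant <- names(p_values)[p_values < 0.05]

# Step 2: random 70/30 split into training and test sets.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
trainSA <- demo[train_idx, ]
testSA  <- demo[-train_idx, ]
```

Keeping the test rows out of model fitting is what makes the test-set accuracy an honest estimate of how the model performs on unseen patients.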

More on the Model

Model built using the caret package:

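library(caret)
# Logistic regression on the five selected predictors (assumes chd is stored as a factor)
modFit <- train(chd ~ age + tobacco + typea + ldl + famhist, method = "glm", family = "binomial", data = trainSA)
testing_prediction <- predict(modFit, testSA)
# confusionMatrix() takes the predictions first, then the reference (true) labels
confMat <- confusionMatrix(testing_prediction, testSA$chd)
paste("Prediction accuracy - test:", round(confMat$overall["Accuracy"], 2))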
[1] "Prediction accuracy - test: 0.7"

The model achieved an accuracy of 77% on the training dataset and 70% on the test dataset.
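For readers unfamiliar with the metric, accuracy is the share of cases on the diagonal of the confusion matrix. The sketch below shows the arithmetic in base R on ten made-up labels; the vectors `actual` and `predicted` are hypothetical and are not results from the app.

```r
# Made-up labels for ten hypothetical test patients (1 = CHD, 0 = no CHD).
actual    <- factor(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1), levels = c(0, 1))
predicted <- factor(c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0), levels = c(0, 1))

# Rows are predictions, columns are the true classes.
conf_table <- table(Predicted = predicted, Actual = actual)

# Accuracy = correctly classified cases / all cases.
accuracy <- sum(diag(conf_table)) / sum(conf_table)   # 6 / 10 = 0.6
```

The same number can be obtained with `mean(predicted == actual)`; the table form is shown because it also exposes the false positives and false negatives.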

The Shiny App

  • The Shiny app is hosted on shinyapps.io and is available for anyone to use at:

https://alphasig.shinyapps.io/HeartPredict/

  • Please note that although the model is built on real-world data, its prediction is in no way a substitute for seeing a doctor.