Shiny App: Prediction of Diabetes Mellitus

John Slough
August 15, 2015

Project for Coursera: Developing Data Products

Does Your Patient Have Diabetes?

Enter the patient's data:

  • number of times pregnant
  • plasma glucose concentration from a 2 hour oral glucose tolerance test
  • diastolic blood pressure
  • triceps skin fold thickness
  • 2 hour serum insulin
  • BMI
  • age
  • Diabetes Pedigree Function (explained on the app)

The app then returns the predicted probability of a diabetes diagnosis!

The Data and Developing the Model

Pima Indians Diabetes dataset from the UCI Machine Learning Repository

  • Results may not generalize to other populations
  • Based on a generalized linear model (logistic regression)
  • Data partitioned into training (75%) and testing (25%)
  • Accuracy of approximately 80% achieved on the testing dataset.
  • The prediction does not in any way take the place of seeing a doctor
  • Educational/informational purposes only
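
As a rough illustration of the 75%/25% partition and the accuracy figure mentioned above, here is a minimal Python sketch using only the standard library. It is not the app's actual code: the function names, the fixed random seed, and the 0.5 classification threshold are all assumptions made for this example.

```python
import random

def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle the rows and split them into training and testing sets."""
    shuffled = list(rows)
    # A fixed seed (an assumption here) makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of cases where the thresholded prediction matches the label."""
    # The 0.5 threshold is an assumption; the source does not state one.
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for yhat, y in zip(predicted, labels) if yhat == y)
    return correct / len(labels)

# Placeholder rows standing in for patient records.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))

# Made-up probabilities and labels, for illustration only.
print(accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0]))
```

Accuracy is measured on the held-out testing set only, so it reflects performance on patients the model did not see during fitting.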

Data Visualization

A visualization of the logistic regression model using the plasma glucose variable. The plot overlays the model's prediction line on a dot plot of the data, with box plots for positive and negative diabetes diagnoses.

Model Summary

Below is a summary of the final model:

               Estimate Std. Error Pr(>|z|)
(Intercept)     -9.7210     1.3777   0.0000
times_pregnant   0.1286     0.0656   0.0500
plasma_glucose   0.0430     0.0071   0.0000
diastolic_BP    -0.0031     0.0139   0.8249
tri_skin_fold    0.0260     0.0192   0.1753
insulin         -0.0010     0.0014   0.4845
BMI              0.0479     0.0314   0.1270
d_ped            0.9843     0.4807   0.0406
age              0.0160     0.0201   0.4251
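
To show how these estimates turn into a probability, here is a minimal Python sketch (standard library only) that applies the inverse logit to the linear predictor. The coefficients are copied from the table above; the function name and the example patient values are hypothetical, and this is not the app's own code.

```python
import math

# Estimates copied from the model summary table.
COEFFICIENTS = {
    "times_pregnant": 0.1286,
    "plasma_glucose": 0.0430,
    "diastolic_BP": -0.0031,
    "tri_skin_fold": 0.0260,
    "insulin": -0.0010,
    "BMI": 0.0479,
    "d_ped": 0.9843,
    "age": 0.0160,
}
INTERCEPT = -9.7210

def predict_probability(patient):
    """Return the predicted probability of diabetes for one patient."""
    # Linear predictor (log-odds): intercept plus coefficient times value.
    log_odds = INTERCEPT
    for name, coefficient in COEFFICIENTS.items():
        log_odds += coefficient * patient[name]  # KeyError if a field is missing
    # The inverse logit maps log-odds to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical patient, invented for illustration only.
example_patient = {
    "times_pregnant": 2,
    "plasma_glucose": 120,
    "diastolic_BP": 70,
    "tri_skin_fold": 20,
    "insulin": 80,
    "BMI": 30,
    "d_ped": 0.5,
    "age": 35,
}
print(round(predict_probability(example_patient), 3))
```

Because the plasma_glucose estimate is positive, raising the glucose value while holding the other inputs fixed raises the predicted probability.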

Go here to access the Shiny app:
https://sloughje.shinyapps.io/extra

and here for the code:
https://github.com/SloughJE/Coursera-Developing-Data-Products