Demo of PLOTLY: Kaggle Heart Disease UCI

Robert Jeenchen Chen, MD, MPH

11/24/2019

Variables

1. age

2. sex: 0 female, 1 male

3. cp (chest pain): 1 typical angina, 2 atypical angina, 3 non-anginal pain, 4 asymptomatic

4. trestbps (resting blood pressure)

5. chol (serum cholesterol in mg/dl)

6. fbs (fasting blood sugar > 120 mg/dl): 0 N, 1 Y

7. restecg (resting ECG): 0 normal, 1 ST-T change, 2 LVH

8. thalach (maximum heart rate)

9. exang (exercise induced angina): 0 N, 1 Y

10. oldpeak (ST depression induced by exercise)

11. slope (slope of the ST segment at peak exercise): 1 upsloping, 2 flat, 3 downsloping

12. ca (# of lesion coronary arteries by cardiac cath: 0-3)

13. thal (viability perfusion scan): 3 normal, 6 fixed defect, 7 reversible defect

14. target (diseased coronary artery): 0 N, 1 Y
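
The 0/1 codes listed above can be turned into labeled factors before plotting, so that plot legends read "female"/"male" rather than 0/1. The sketch below is illustrative only: `label_binary_vars` is a hypothetical helper that does not appear in the original analysis, and it assumes the data frame uses the column names and the 0/1 coding given in the variable list. Multi-level variables such as cp or slope could be handled the same way, once their codes have been checked against the actual file with table().

```r
# Hypothetical helper (not from the original slides): label the 0/1 columns
# using the codebook listed above. Assumes columns sex, fbs, exang, target.
label_binary_vars <- function(df) {
  yes_no <- c("no", "yes")
  df$sex    <- factor(df$sex,    levels = c(0, 1), labels = c("female", "male"))
  df$fbs    <- factor(df$fbs,    levels = c(0, 1), labels = yes_no)
  df$exang  <- factor(df$exang,  levels = c(0, 1), labels = yes_no)
  df$target <- factor(df$target, levels = c(0, 1), labels = yes_no)
  df  # any value other than 0 or 1 becomes NA
}
```

A possible use, assuming the data frame is named heartdata as in the plotting code: heartdata <- label_binary_vars(heartdata).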

Exploratory Data Analysis with Plotly

Source: Kaggle.com

https://www.kaggle.com/ronitf/heart-disease-uci

EDA by K-means:

https://rjcc.shinyapps.io/UCI_Heart_EDA_KM/

Acknowledgements

1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.

2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.

3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.

4. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.

Donor: David W. Aha

Scatterplots with Categorical Color

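library(plotly)
# heartdata is assumed to hold the Kaggle file, e.g. heartdata <- read.csv("heart.csv")
# Maximum heart rate vs. age, colored by one categorical variable per plot
plot_ly(heartdata, x = ~age, y = ~thalach, type = "scatter", mode = "markers", color = ~factor(slope))
plot_ly(heartdata, x = ~age, y = ~thalach, type = "scatter", mode = "markers", color = ~factor(sex))
plot_ly(heartdata, x = ~age, y = ~thalach, type = "scatter", mode = "markers", color = ~factor(fbs))
plot_ly(heartdata, x = ~age, y = ~thalach, type = "scatter", mode = "markers", color = ~factor(exang))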
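
The four calls above differ only in the grouping variable, so they can be generated in a loop. The following is a minimal sketch, not the author's code: `scatter_by_group` and `plot_all_groups` are hypothetical helper names, and the data frame is assumed to contain age, thalach, and the listed grouping columns.

```r
# Hypothetical helper: one age-vs-thalach scatterplot for a grouping variable.
# Requires the plotly package; df is assumed to have age, thalach and `group`.
scatter_by_group <- function(df, group) {
  stopifnot(group %in% names(df))
  # Build the one-sided formula ~factor(<group>) used for the color mapping
  color_formula <- stats::as.formula(paste0("~factor(", group, ")"))
  plotly::plot_ly(df, x = ~age, y = ~thalach, type = "scatter",
                  mode = "markers", color = color_formula)
}

# Hypothetical helper: build the same plot for several grouping variables
# and return them as a named list.
plot_all_groups <- function(df, groups = c("slope", "sex", "fbs", "exang")) {
  plots <- lapply(groups, function(g) scatter_by_group(df, g))
  names(plots) <- groups
  plots
}
```

With the data loaded, plot_all_groups(heartdata)$sex would display the plot colored by sex.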

Continuous Color Scatterplots

plot_ly(heartdata, x = ~chol, y = ~thalach, type = "scatter", mode = "markers", color = ~age)

Scatterplot Sizing

plot_ly(heartdata, x = ~trestbps, y = ~thalach, type = "scatter", mode = "markers",
        color = ~chol, size = ~age)

3D Scatterplot

plot_ly(heartdata, x = ~trestbps, y = ~chol, z = ~thalach,
        type = "scatter3d", mode = "markers", color = ~age)

Histograms

plot_ly(heartdata, x = ~chol, type = "histogram", color = ~factor(fbs))
plot_ly(heartdata, x = ~thalach, type = "histogram", color = ~age)
plot_ly(heartdata, x = ~trestbps, type = "histogram", color = ~age)

Interim Summary

  1. The plots above show preliminary relationships among scale (continuous), nominal, and ordinal variables.
  2. Predictor (Xi) and outcome (Y) variables should be assigned next.
  3. Hypotheses should then be proposed so that statistical models can be built for inference.
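
As one possible illustration of points 2 and 3, the sketch below treats target as the outcome (Y) and fits a logistic regression on a few continuous predictors (Xi). This is an assumption made for illustration, not a model proposed in these slides: the choice of predictors is arbitrary, `fit_heart_logit` is a hypothetical name, and target is assumed to be coded 0/1.

```r
# Illustrative only: the predictors below are an arbitrary choice, not a
# model from the original slides. Assumes target is coded 0/1 and that
# df has the columns age, thalach and chol.
fit_heart_logit <- function(df) {
  stats::glm(target ~ age + thalach + chol, data = df,
             family = stats::binomial())
}
```

With the data loaded, summary(fit_heart_logit(heartdata)) would print the coefficient table for inspection.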

You are welcome to email me!

EDA by K-means: https://rjcc.shinyapps.io/UCI_Heart_EDA_KM/