INTRODUCTION

OBJECTIVE

  1. To develop a heart disease prediction model using logistic regression and to identify the significant predictors.
  2. To compare the performance of logistic regression with that of other classification models in detecting heart disease.

Import relevant libraries

library(dplyr)
library(Rcpp)
library(lattice)
library(ggplot2)
library(proto)
library(RSQLite)
library(gsubfn)
library(caret)
library(sqldf)
library(Amelia)
library(BinMat)
library(tidyr)
library(tidyverse)
library(MASS)
library(Hmisc)
library(Formula)
library(klaR)
library(e1071)
library(survival)
library(mlbench)
library(readr)
library(skimr)
library(DataExplorer)
library(funModeling)
library(ROCR)
library(knitr)
library(kableExtra)
library(GGally)
library(rsample)
library(viridisLite)
library(yardstick)
library(parsnip)
library(recipes)

Data Source and Description

This section describes the source of the data and gives a high-level explanation of the features (extracted from https://archive.ics.uci.edu/ml/datasets/Heart+Disease).

The data used for this project is an open-source dataset obtained from the UCI Machine Learning Repository. It consists of 303 rows and 14 columns. The last column is the target feature, which indicates the presence of heart disease.

Load the Dataset

#heart<-read.csv("processed_cleveland_ori_data.csv", header = T)
heart <- read.csv("D:/01a Prog DS (Thursday)/01 Project/cleveland.csv", header = F)
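
The raw file marks missing values with "?", and the structure output later in this section shows Age being read as character. A minimal sketch of a more defensive import follows. It assumes the file may begin with a UTF-8 byte-order mark, which is an assumption and not confirmed by the source; the temporary file and its three rows are made up for illustration.

```r
# Build a tiny stand-in file: a UTF-8 byte-order mark followed by three rows,
# with "?" used as the missing-value marker (illustrative data only).
tmp <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)),
           charToRaw("63,1,0\n67,1,?\n38,1,3\n")),
         tmp)

# fileEncoding = "UTF-8-BOM" drops a leading byte-order mark if present;
# na.strings = "?" turns the placeholder into NA at read time.
demo <- read.csv(tmp, header = FALSE,
                 na.strings = "?", fileEncoding = "UTF-8-BOM")

str(demo)
colSums(is.na(demo))
```

With both arguments set, every column in the toy file is read as numeric and the "?" entry arrives as NA, so no later string replacement would be needed.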

Assign Names to Each Column

# List out column names
names <- c("Age",
           "Sex",
           "Chest_Pain_Type",
           "Resting_Blood_Pressure",
           "Serum_Cholesterol",
           "Fasting_Blood_Sugar",
           "Resting_ECG",
           "Max_Heart_Rate_Achieved",
           "Exercise_Induced_Angina",
           "ST_Depression_Exercise",
           "Peak_Exercise_ST_Segment",
           "Num_Major_Vessels_Flouro",
           "Thalassemia",
           "target")

# Apply column names to the dataframe
colnames(heart) <- names

The details of all the features are listed below:

  • Age: Age of subject

  • Sex: Sex of subject: 0 = female, 1 = male

  • Chest-pain type: Type of chest pain experienced by the subject: 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic

  • Resting Blood Pressure: Resting blood pressure in mm Hg

  • Serum Cholesterol: Serum cholesterol in mg/dl

  • Fasting Blood Sugar: Fasting blood sugar level relative to 120 mg/dl: 0 = fasting blood sugar <= 120 mg/dl, 1 = fasting blood sugar > 120 mg/dl

  • Resting ECG: Resting electrocardiographic results: 0 = normal, 1 = ST-T wave abnormality, 2 = left ventricular hypertrophy

  • Max Heart Rate Achieved: Maximum heart rate achieved by the subject

  • Exercise Induced Angina: 0 = no, 1 = yes

  • ST Depression Induced by Exercise Relative to Rest: ST depression of subject

  • Peak Exercise ST Segment: Slope of the peak exercise ST segment: 1 = upsloping, 2 = flat, 3 = downsloping

  • Number of Major Vessels (0-3) Visible on Fluoroscopy: Number of major vessels visible under fluoroscopy

  • Thal: Form of thalassemia: 3 = normal, 6 = fixed defect, 7 = reversible defect

  • target: Indicates whether the subject has heart disease: 0 = absence, 1-4 = heart disease present
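
The integer codes listed above can be attached to readable labels with factor(). The following is a minimal sketch on a toy data frame (the three rows are illustrative, not taken from the dataset); only two of the coded features are shown, and the same pattern would apply to the others.

```r
# Toy rows using the codes from the feature list (values are illustrative).
demo <- data.frame(
  Sex             = c(1, 0, 1),
  Chest_Pain_Type = c(1, 4, 3)
)

# Map each integer code to its label from the feature list.
demo$Sex <- factor(demo$Sex, levels = c(0, 1),
                   labels = c("female", "male"))
demo$Chest_Pain_Type <- factor(
  demo$Chest_Pain_Type,
  levels = 1:4,
  labels = c("typical angina", "atypical angina",
             "non-anginal pain", "asymptomatic")
)

# Levels that do not occur in the data still appear, with a count of 0.
table(demo$Chest_Pain_Type)
```

Declaring all levels explicitly keeps the level order stable even when a code happens to be absent from a subset of the data.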

Data Examination Using Head, Tail, Structure, Summary, Dimension, Glimpse

# HEAD / TAIL
# head() and tail() show the first and last 6 rows by default.

head(heart)
##     Age Sex Chest_Pain_Type Resting_Blood_Pressure Serum_Cholesterol
## 1 63   1               1                    145               233
## 2    67   1               4                    160               286
## 3    67   1               4                    120               229
## 4    37   1               3                    130               250
## 5    41   0               2                    130               204
## 6    56   1               2                    120               236
##   Fasting_Blood_Sugar Resting_ECG Max_Heart_Rate_Achieved
## 1                   1           2                     150
## 2                   0           2                     108
## 3                   0           2                     129
## 4                   0           0                     187
## 5                   0           2                     172
## 6                   0           0                     178
##   Exercise_Induced_Angina ST_Depression_Exercise Peak_Exercise_ST_Segment
## 1                       0                    2.3                        3
## 2                       1                    1.5                        2
## 3                       1                    2.6                        2
## 4                       0                    3.5                        3
## 5                       0                    1.4                        1
## 6                       0                    0.8                        1
##   Num_Major_Vessels_Flouro Thalassemia target
## 1                        0           6      0
## 2                        3           3      2
## 3                        2           7      1
## 4                        0           3      0
## 5                        0           3      0
## 6                        0           3      0
tail(heart)
##     Age Sex Chest_Pain_Type Resting_Blood_Pressure Serum_Cholesterol
## 298  57   0               4                    140               241
## 299  45   1               1                    110               264
## 300  68   1               4                    144               193
## 301  57   1               4                    130               131
## 302  57   0               2                    130               236
## 303  38   1               3                    138               175
##     Fasting_Blood_Sugar Resting_ECG Max_Heart_Rate_Achieved
## 298                   0           0                     123
## 299                   0           0                     132
## 300                   1           0                     141
## 301                   0           0                     115
## 302                   0           2                     174
## 303                   0           0                     173
##     Exercise_Induced_Angina ST_Depression_Exercise Peak_Exercise_ST_Segment
## 298                       1                    0.2                        2
## 299                       0                    1.2                        2
## 300                       0                    3.4                        2
## 301                       1                    1.2                        2
## 302                       0                    0.0                        2
## 303                       0                    0.0                        1
##     Num_Major_Vessels_Flouro Thalassemia target
## 298                        0           7      1
## 299                        0           7      1
## 300                        2           7      2
## 301                        1           7      3
## 302                        1           3      1
## 303                        ?           3      0
# Structure of the dataset
str(heart)
## 'data.frame':    303 obs. of  14 variables:
##  $ Age                     : chr  "63" "67" "67" "37" ...
##  $ Sex                     : int  1 1 1 1 0 1 0 0 1 1 ...
##  $ Chest_Pain_Type         : int  1 4 4 3 2 2 4 4 4 4 ...
##  $ Resting_Blood_Pressure  : int  145 160 120 130 130 120 140 120 130 140 ...
##  $ Serum_Cholesterol       : int  233 286 229 250 204 236 268 354 254 203 ...
##  $ Fasting_Blood_Sugar     : int  1 0 0 0 0 0 0 0 0 1 ...
##  $ Resting_ECG             : int  2 2 2 0 2 0 2 0 2 2 ...
##  $ Max_Heart_Rate_Achieved : int  150 108 129 187 172 178 160 163 147 155 ...
##  $ Exercise_Induced_Angina : int  0 1 1 0 0 0 0 1 0 1 ...
##  $ ST_Depression_Exercise  : num  2.3 1.5 2.6 3.5 1.4 0.8 3.6 0.6 1.4 3.1 ...
##  $ Peak_Exercise_ST_Segment: int  3 2 2 3 1 1 3 1 2 3 ...
##  $ Num_Major_Vessels_Flouro: chr  "0" "3" "2" "0" ...
##  $ Thalassemia             : chr  "6" "3" "7" "3" ...
##  $ target                  : int  0 2 1 0 0 0 3 0 2 1 ...
# Summary of the dataset
summary(heart)
##      Age                 Sex         Chest_Pain_Type Resting_Blood_Pressure
##  Length:303         Min.   :0.0000   Min.   :1.000   Min.   : 94.0         
##  Class :character   1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:120.0         
##  Mode  :character   Median :1.0000   Median :3.000   Median :130.0         
##                     Mean   :0.6799   Mean   :3.158   Mean   :131.7         
##                     3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:140.0         
##                     Max.   :1.0000   Max.   :4.000   Max.   :200.0         
##  Serum_Cholesterol Fasting_Blood_Sugar  Resting_ECG     Max_Heart_Rate_Achieved
##  Min.   :126.0     Min.   :0.0000      Min.   :0.0000   Min.   : 71.0          
##  1st Qu.:211.0     1st Qu.:0.0000      1st Qu.:0.0000   1st Qu.:133.5          
##  Median :241.0     Median :0.0000      Median :1.0000   Median :153.0          
##  Mean   :246.7     Mean   :0.1485      Mean   :0.9901   Mean   :149.6          
##  3rd Qu.:275.0     3rd Qu.:0.0000      3rd Qu.:2.0000   3rd Qu.:166.0          
##  Max.   :564.0     Max.   :1.0000      Max.   :2.0000   Max.   :202.0          
##  Exercise_Induced_Angina ST_Depression_Exercise Peak_Exercise_ST_Segment
##  Min.   :0.0000          Min.   :0.00           Min.   :1.000           
##  1st Qu.:0.0000          1st Qu.:0.00           1st Qu.:1.000           
##  Median :0.0000          Median :0.80           Median :2.000           
##  Mean   :0.3267          Mean   :1.04           Mean   :1.601           
##  3rd Qu.:1.0000          3rd Qu.:1.60           3rd Qu.:2.000           
##  Max.   :1.0000          Max.   :6.20           Max.   :3.000           
##  Num_Major_Vessels_Flouro Thalassemia            target      
##  Length:303               Length:303         Min.   :0.0000  
##  Class :character         Class :character   1st Qu.:0.0000  
##  Mode  :character         Mode  :character   Median :0.0000  
##                                              Mean   :0.9373  
##                                              3rd Qu.:2.0000  
##                                              Max.   :4.0000
# DIMENSION
# Displays the dimensions of the table. The output takes the form of row, column.
dim(heart)
## [1] 303  14
# GLIMPSE
# Prints one line per column, showing its data type and first few values,
# which gives a compact vertical preview of the dataset.
glimpse(heart)
## Rows: 303
## Columns: 14
## $ Age                      <chr> "63", "67", "67", "37", "41", "56", "62", ~
## $ Sex                      <int> 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, ~
## $ Chest_Pain_Type          <int> 1, 4, 4, 3, 2, 2, 4, 4, 4, 4, 4, 2, 3, 2, 3, ~
## $ Resting_Blood_Pressure   <int> 145, 160, 120, 130, 130, 120, 140, 120, 130, ~
## $ Serum_Cholesterol        <int> 233, 286, 229, 250, 204, 236, 268, 354, 254, ~
## $ Fasting_Blood_Sugar      <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, ~
## $ Resting_ECG              <int> 2, 2, 2, 0, 2, 0, 2, 0, 2, 2, 0, 2, 2, 0, 0, ~
## $ Max_Heart_Rate_Achieved  <int> 150, 108, 129, 187, 172, 178, 160, 163, 147, ~
## $ Exercise_Induced_Angina  <int> 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, ~
## $ ST_Depression_Exercise   <dbl> 2.3, 1.5, 2.6, 3.5, 1.4, 0.8, 3.6, 0.6, 1.4, ~
## $ Peak_Exercise_ST_Segment <int> 3, 2, 2, 3, 1, 1, 3, 1, 2, 3, 2, 2, 2, 1, 1, ~
## $ Num_Major_Vessels_Flouro <chr> "0", "3", "2", "0", "0", "0", "2", "0", "1", ~
## $ Thalassemia              <chr> "6", "3", "7", "3", "3", "3", "3", "3", "7", ~
## $ target                   <int> 0, 2, 1, 0, 0, 0, 3, 0, 2, 1, 0, 0, 2, 0, 0, ~
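
The structure and glimpse output above show Age stored as character even though every printed value looks numeric. That usually means at least one entry carries a stray non-digit character; a byte-order mark at the start of the file is one possible cause, though that is an assumption. A small sketch for locating and cleaning such entries is given below, using a made-up vector rather than the real column.

```r
# Illustrative vector: the first entry carries a stray leading character.
age_raw <- c("x63", "67", "37", "41")

# Entries that fail a plain numeric conversion are the ones to inspect.
bad <- is.na(suppressWarnings(as.numeric(age_raw)))
age_raw[bad]

# Keep digits only, then convert to integer.
age_clean <- as.integer(gsub("[^0-9]", "", age_raw))
age_clean
```

Stripping everything except digits is only safe for a column such as Age that holds whole non-negative numbers; a column with decimals or signs would need a different pattern.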

Statistical Information and Data Exploration

# Skim
# This function is a good addition to the summary function. 
# It displays most of the numerical attributes from summary, but it also 
# displays missing values, more quantile information and an inline histogram for each variable
skim(heart)
Data summary
Name heart
Number of rows 303
Number of columns 14
_______________________
Column type frequency:
character 3
numeric 11
________________________
Group variables None

Variable type: character

skim_variable            n_missing complete_rate min max empty n_unique whitespace
Age                              0             1   2   5     0       42          0
Num_Major_Vessels_Flouro         0             1   1   1     0        5          0
Thalassemia                      0             1   1   1     0        4          0

Variable type: numeric

skim_variable            n_missing complete_rate   mean    sd  p0   p25   p50   p75  p100 hist
Sex                              0             1   0.68  0.47   0   0.0   1.0   1.0   1.0 ▃▁▁▁▇
Chest_Pain_Type                  0             1   3.16  0.96   1   3.0   3.0   4.0   4.0 ▁▃▁▅▇
Resting_Blood_Pressure           0             1 131.69 17.60  94 120.0 130.0 140.0 200.0 ▃▇▅▁▁
Serum_Cholesterol                0             1 246.69 51.78 126 211.0 241.0 275.0 564.0 ▃▇▂▁▁
Fasting_Blood_Sugar              0             1   0.15  0.36   0   0.0   0.0   0.0   1.0 ▇▁▁▁▂
Resting_ECG                      0             1   0.99  0.99   0   0.0   1.0   2.0   2.0 ▇▁▁▁▇
Max_Heart_Rate_Achieved          0             1 149.61 22.88  71 133.5 153.0 166.0 202.0 ▁▂▅▇▂
Exercise_Induced_Angina          0             1   0.33  0.47   0   0.0   0.0   1.0   1.0 ▇▁▁▁▃
ST_Depression_Exercise           0             1   1.04  1.16   0   0.0   0.8   1.6   6.2 ▇▂▁▁▁
Peak_Exercise_ST_Segment         0             1   1.60  0.62   1   1.0   2.0   2.0   3.0 ▇▁▇▁▁
target                           0             1   0.94  1.23   0   0.0   0.0   2.0   4.0 ▇▃▂▂▁
# Analyzing categorical variables
# freq function runs for all factor or character variables automatically:
freq(heart)
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
##      Age frequency percentage cumulative_perc
## 1     58        19       6.27            6.27
## 2     57        17       5.61           11.88
## 3     54        16       5.28           17.16
## 4     59        14       4.62           21.78
## 5     52        13       4.29           26.07
## 6     51        12       3.96           30.03
## 7     60        12       3.96           33.99
## 8     44        11       3.63           37.62
## 9     56        11       3.63           41.25
## 10    62        11       3.63           44.88
## 11    41        10       3.30           48.18
## 12    64        10       3.30           51.48
## 13    67         9       2.97           54.45
## 14    42         8       2.64           57.09
## 15    43         8       2.64           59.73
## 16    45         8       2.64           62.37
## 17    53         8       2.64           65.01
## 18    55         8       2.64           67.65
## 19    61         8       2.64           70.29
## 20    63         8       2.64           72.93
## 21    65         8       2.64           75.57
## 22    46         7       2.31           77.88
## 23    48         7       2.31           80.19
## 24    50         7       2.31           82.50
## 25    66         7       2.31           84.81
## 26    47         5       1.65           86.46
## 27    49         5       1.65           88.11
## 28    35         4       1.32           89.43
## 29    39         4       1.32           90.75
## 30    68         4       1.32           92.07
## 31    70         4       1.32           93.39
## 32    40         3       0.99           94.38
## 33    69         3       0.99           95.37
## 34    71         3       0.99           96.36
## 35    34         2       0.66           97.02
## 36    37         2       0.66           97.68
## 37    38         2       0.66           98.34
## 38    29         1       0.33           98.67
## 39    74         1       0.33           99.00
## 40    76         1       0.33           99.33
## 41    77         1       0.33           99.66
## 42 63         1       0.33          100.00
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

##   Num_Major_Vessels_Flouro frequency percentage cumulative_perc
## 1                        0       176      58.09           58.09
## 2                        1        65      21.45           79.54
## 3                        2        38      12.54           92.08
## 4                        3        20       6.60           98.68
## 5                        ?         4       1.32          100.00
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

##   Thalassemia frequency percentage cumulative_perc
## 1           3       166      54.79           54.79
## 2           7       117      38.61           93.40
## 3           6        18       5.94           99.34
## 4           ?         2       0.66          100.00
## [1] "Variables processed: Age, Num_Major_Vessels_Flouro, Thalassemia"
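
The frequency tables above reveal "?" entries in two columns, which is why skim() reported no missing values. A quick way to count such placeholders across every column is sketched below on a toy data frame (the values are illustrative).

```r
# Toy data frame mixing character and numeric columns (illustrative values).
demo <- data.frame(
  Num_Major_Vessels_Flouro = c("0", "?", "2", "?"),
  Thalassemia              = c("3", "7", "?", "6"),
  Sex                      = c(1, 0, 1, 1)
)

# Count "?" per column; comparing a numeric column with "?" simply yields FALSE.
placeholder_counts <- sapply(demo, function(col) sum(col == "?", na.rm = TRUE))
placeholder_counts
```

Running a check like this before calling is.na() helps avoid concluding that a dataset is complete when its gaps are encoded as text.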
# Analyzing numerical variables
# Quantitatively
# profiling_num runs for all numerical/integer variables automatically:
profiling_num(heart)
##                    variable        mean    std_dev variation_coef   p_01  p_05
## 1                       Sex   0.6798680  0.4672988      0.6873376   0.00   0.0
## 2           Chest_Pain_Type   3.1584158  0.9601256      0.3039896   1.00   1.0
## 3    Resting_Blood_Pressure 131.6897690 17.5997477      0.1336455 100.00 108.0
## 4         Serum_Cholesterol 246.6930693 51.7769175      0.2098840 149.00 175.1
## 5       Fasting_Blood_Sugar   0.1485149  0.3561979      2.3983990   0.00   0.0
## 6               Resting_ECG   0.9900990  0.9949713      1.0049210   0.00   0.0
## 7   Max_Heart_Rate_Achieved 149.6072607 22.8750033      0.1529004  95.02 108.1
## 8   Exercise_Induced_Angina   0.3267327  0.4697945      1.4378558   0.00   0.0
## 9    ST_Depression_Exercise   1.0396040  1.1610750      1.1168436   0.00   0.0
## 10 Peak_Exercise_ST_Segment   1.6006601  0.6162261      0.3849825   1.00   1.0
## 11                   target   0.9372937  1.2285357      1.3107265   0.00   0.0
##     p_25  p_50  p_75  p_95   p_99    skewness kurtosis  iqr        range_98
## 1    0.0   1.0   1.0   1.0   1.00 -0.77109346 1.594585  1.0          [0, 1]
## 2    3.0   3.0   4.0   4.0   4.00 -0.83758103 2.586189  1.0          [1, 4]
## 3  120.0 130.0 140.0 160.0 180.00  0.70253461 3.845881 20.0      [100, 180]
## 4  211.0 241.0 275.0 326.9 406.74  1.12987410 7.398208 64.0   [149, 406.74]
## 5    0.0   0.0   0.0   1.0   1.00  1.97680346 4.907752  0.0          [0, 1]
## 6    0.0   1.0   2.0   2.0   2.00  0.01980163 1.013773  2.0          [0, 2]
## 7  133.5 153.0 166.0 181.9 191.96 -0.53478437 2.927602 32.5 [95.02, 191.96]
## 8    0.0   0.0   1.0   1.0   1.00  0.73885058 1.545900  1.0          [0, 1]
## 9    0.0   0.8   1.6   3.4   4.20  1.26342552 4.530193  1.6        [0, 4.2]
## 10   1.0   2.0   2.0   3.0   3.00  0.50579573 2.363050  1.0          [1, 3]
## 11   0.0   0.0   2.0   3.0   4.00  1.05324831 2.843788  2.0          [0, 4]
##          range_80
## 1          [0, 1]
## 2          [2, 4]
## 3      [110, 152]
## 4  [188.8, 308.8]
## 5          [0, 1]
## 6          [0, 2]
## 7    [116, 176.6]
## 8          [0, 1]
## 9        [0, 2.8]
## 10         [1, 2]
## 11         [0, 3]
# Graphically
# plot_num(), like profiling_num(), runs automatically for all numerical/integer variables:
plot_num(heart)
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

# Describe from Hmisc Package
# Analyzing numerical and categorical at the same time
describe(heart)
## heart 
## 
##  14  Variables      303  Observations
## --------------------------------------------------------------------------------
## Age 
##        n  missing distinct 
##      303        0       42 
## 
## lowest : 29    34    35    37    38   , highest: 71    74    76    77    63
## --------------------------------------------------------------------------------
## Sex 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      303        0        2    0.653      206   0.6799   0.4367 
## 
## --------------------------------------------------------------------------------
## Chest_Pain_Type 
##        n  missing distinct     Info     Mean      Gmd 
##      303        0        4    0.865    3.158    1.008 
##                                   
## Value          1     2     3     4
## Frequency     23    50    86   144
## Proportion 0.076 0.165 0.284 0.475
## --------------------------------------------------------------------------------
## Resting_Blood_Pressure 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##      303        0       50    0.995    131.7    19.41      108      110 
##      .25      .50      .75      .90      .95 
##      120      130      140      152      160 
## 
## lowest :  94 100 101 102 104, highest: 174 178 180 192 200
## --------------------------------------------------------------------------------
## Serum_Cholesterol 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##      303        0      152        1    246.7    55.91    175.1    188.8 
##      .25      .50      .75      .90      .95 
##    211.0    241.0    275.0    308.8    326.9 
## 
## lowest : 126 131 141 149 157, highest: 394 407 409 417 564
## --------------------------------------------------------------------------------
## Fasting_Blood_Sugar 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      303        0        2    0.379       45   0.1485   0.2538 
## 
## --------------------------------------------------------------------------------
## Resting_ECG 
##        n  missing distinct     Info     Mean      Gmd 
##      303        0        3     0.76   0.9901    1.003 
##                             
## Value          0     1     2
## Frequency    151     4   148
## Proportion 0.498 0.013 0.488
## --------------------------------------------------------------------------------
## Max_Heart_Rate_Achieved 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##      303        0       91        1    149.6    25.73    108.1    116.0 
##      .25      .50      .75      .90      .95 
##    133.5    153.0    166.0    176.6    181.9 
## 
## lowest :  71  88  90  95  96, highest: 190 192 194 195 202
## --------------------------------------------------------------------------------
## Exercise_Induced_Angina 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      303        0        2     0.66       99   0.3267   0.4414 
## 
## --------------------------------------------------------------------------------
## ST_Depression_Exercise 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##      303        0       40    0.964     1.04    1.225      0.0      0.0 
##      .25      .50      .75      .90      .95 
##      0.0      0.8      1.6      2.8      3.4 
## 
## lowest : 0.0 0.1 0.2 0.3 0.4, highest: 4.0 4.2 4.4 5.6 6.2
## --------------------------------------------------------------------------------
## Peak_Exercise_ST_Segment 
##        n  missing distinct     Info     Mean      Gmd 
##      303        0        3    0.798    1.601   0.6291 
##                             
## Value          1     2     3
## Frequency    142   140    21
## Proportion 0.469 0.462 0.069
## --------------------------------------------------------------------------------
## Num_Major_Vessels_Flouro 
##        n  missing distinct 
##      303        0        5 
## 
## lowest : ? 0 1 2 3, highest: ? 0 1 2 3
##                                         
## Value          ?     0     1     2     3
## Frequency      4   176    65    38    20
## Proportion 0.013 0.581 0.215 0.125 0.066
## --------------------------------------------------------------------------------
## Thalassemia 
##        n  missing distinct 
##      303        0        4 
##                                   
## Value          ?     3     6     7
## Frequency      2   166    18   117
## Proportion 0.007 0.548 0.059 0.386
## --------------------------------------------------------------------------------
## target 
##        n  missing distinct     Info     Mean      Gmd 
##      303        0        5    0.832   0.9373     1.25 
## 
## lowest : 0 1 2 3 4, highest: 0 1 2 3 4
##                                         
## Value          0     1     2     3     4
## Frequency    164    55    36    35    13
## Proportion 0.541 0.182 0.119 0.116 0.043
## --------------------------------------------------------------------------------
# Determine the number of observations in each level of the dependent variable
# (drop_na() has no effect here because "?" has not yet been converted to NA)
heart %>% 
  drop_na() %>%
  group_by(target) %>%
  count() %>% 
  ungroup() %>%
  kable(align = rep("c", 2)) %>% kable_styling("full_width" = F)
target n
0 164
1 55
2 36
3 35
4 13
#Identify the different levels of Thalassemia
heart %>% 
  drop_na() %>%
  group_by(Thalassemia) %>%
  count() %>% 
  ungroup() %>%
  kable(align = rep("c", 2)) %>% kable_styling("full_width" = F)
Thalassemia n
? 2
3 166
6 18
7 117

Data Pre-processing

# The section below checks for missing values and performs missing value imputation
# (median for Num_Major_Vessels_Flouro, most frequent level for Thalassemia)

heart$Num_Major_Vessels_Flouro[which(heart$Num_Major_Vessels_Flouro== "?")] <- NA
heart$Thalassemia[which(heart$Thalassemia== "?")] <- NA
colSums(is.na(heart))
##                      Age                      Sex          Chest_Pain_Type 
##                        0                        0                        0 
##   Resting_Blood_Pressure        Serum_Cholesterol      Fasting_Blood_Sugar 
##                        0                        0                        0 
##              Resting_ECG  Max_Heart_Rate_Achieved  Exercise_Induced_Angina 
##                        0                        0                        0 
##   ST_Depression_Exercise Peak_Exercise_ST_Segment Num_Major_Vessels_Flouro 
##                        0                        0                        4 
##              Thalassemia                   target 
##                        2                        0
# Change the data type
heart$Num_Major_Vessels_Flouro <- as.numeric(heart$Num_Major_Vessels_Flouro)

# Obtain the median value
median.result_heart <- median(heart$Num_Major_Vessels_Flouro, na.rm = TRUE)
median.result_heart
## [1] 0
# Missing value imputation
# Replace NA in Num_Major_Vessels_Flouro with the median computed above
heart$Num_Major_Vessels_Flouro[is.na(heart$Num_Major_Vessels_Flouro)] <- median.result_heart
# Thalassemia is a categorical code, so replace NA with its most frequent level ("3")
heart$Thalassemia[is.na(heart$Thalassemia)] <- "3"
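
Median imputation for a numeric column and most-frequent-level imputation for a categorical one can be sketched with two small helpers. The function names impute_median and impute_mode are hypothetical (they do not come from any package), and the vectors are illustrative.

```r
# Hypothetical helper: fill NA with the median of the observed values.
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Hypothetical helper: fill NA with the most frequent observed value.
impute_mode <- function(x) {
  counts <- table(x)                       # table() ignores NA by default
  x[is.na(x)] <- names(counts)[which.max(counts)]
  x
}

vessels <- c(0, 3, NA, 0, 1, 0, NA)        # numeric count: use the median
thal    <- c("3", "7", NA, "3", "6")       # categorical code: use the mode

impute_median(vessels)
impute_mode(thal)
```

Computing the fill value from the data, instead of typing it as a literal, keeps the imputation consistent if the dataset or the cleaning steps before it change.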
# Recode the 'target' Feature Into a Binary Class

# Any value above 0 in column ‘target’ indicates the presence of heart disease, 
# we can combine all levels > 0 together so the classification predictions are 
# binary – Yes or No (1 or 0). 

heart$target <- ifelse(heart$target== 0, yes = 0, no=1)

# Check the latest class for 'target'
str(heart)
## 'data.frame':    303 obs. of  14 variables:
##  $ Age                     : chr  "63" "67" "67" "37" ...
##  $ Sex                     : int  1 1 1 1 0 1 0 0 1 1 ...
##  $ Chest_Pain_Type         : int  1 4 4 3 2 2 4 4 4 4 ...
##  $ Resting_Blood_Pressure  : int  145 160 120 130 130 120 140 120 130 140 ...
##  $ Serum_Cholesterol       : int  233 286 229 250 204 236 268 354 254 203 ...
##  $ Fasting_Blood_Sugar     : int  1 0 0 0 0 0 0 0 0 1 ...
##  $ Resting_ECG             : int  2 2 2 0 2 0 2 0 2 2 ...
##  $ Max_Heart_Rate_Achieved : int  150 108 129 187 172 178 160 163 147 155 ...
##  $ Exercise_Induced_Angina : int  0 1 1 0 0 0 0 1 0 1 ...
##  $ ST_Depression_Exercise  : num  2.3 1.5 2.6 3.5 1.4 0.8 3.6 0.6 1.4 3.1 ...
##  $ Peak_Exercise_ST_Segment: int  3 2 2 3 1 1 3 1 2 3 ...
##  $ Num_Major_Vessels_Flouro: num  0 3 2 0 0 0 2 0 1 0 ...
##  $ Thalassemia             : chr  "6" "3" "7" "3" ...
##  $ target                  : num  0 1 1 0 0 0 1 0 1 1 ...
# Copy the clean data into a new DF (for model creation purpose)
heart1 <- heart
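
The binary recode of target can be sanity-checked by cross-tabulating the original codes against the recoded ones. The sketch below uses an illustrative vector on the original 0-4 scale and an equivalent comparison-based recode in place of ifelse().

```r
# Illustrative severity codes on the original 0-4 scale.
target_raw <- c(0, 2, 1, 0, 3, 4, 0)

# Any value above 0 counts as disease present.
target_bin <- as.integer(target_raw > 0)

# Each original code should map to exactly one recoded value.
table(original = target_raw, recoded = target_bin)
```

In the resulting table, row 0 should have counts only in the recoded 0 column and rows 1 to 4 only in the recoded 1 column.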

Recoding categorical features into characters

# Select categorical vars, recode them to their character values, convert to long format
heart
##       Age Sex Chest_Pain_Type Resting_Blood_Pressure Serum_Cholesterol
## 1   63   1               1                    145               233
## 2      67   1               4                    160               286
## 3      67   1               4                    120               229
## 4      37   1               3                    130               250
## 5      41   0               2                    130               204
## 6      56   1               2                    120               236
## 7      62   0               4                    140               268
## 8      57   0               4                    120               354
## 9      63   1               4                    130               254
## 10     53   1               4                    140               203
## 11     57   1               4                    140               192
## 12     56   0               2                    140               294
## 13     56   1               3                    130               256
## 14     44   1               2                    120               263
## 15     52   1               3                    172               199
## 16     57   1               3                    150               168
## 17     48   1               2                    110               229
## 18     54   1               4                    140               239
## 19     48   0               3                    130               275
## 20     49   1               2                    130               266
## 21     64   1               1                    110               211
## 22     58   0               1                    150               283
## 23     58   1               2                    120               284
## 24     58   1               3                    132               224
## 25     60   1               4                    130               206
## 26     50   0               3                    120               219
## 27     58   0               3                    120               340
## 28     66   0               1                    150               226
## 29     43   1               4                    150               247
## 30     40   1               4                    110               167
## 31     69   0               1                    140               239
## 32     60   1               4                    117               230
## 33     64   1               3                    140               335
## 34     59   1               4                    135               234
## 35     44   1               3                    130               233
## 36     42   1               4                    140               226
## 37     43   1               4                    120               177
## 38     57   1               4                    150               276
## 39     55   1               4                    132               353
## 40     61   1               3                    150               243
## 41     65   0               4                    150               225
## 42     40   1               1                    140               199
## 43     71   0               2                    160               302
## 44     59   1               3                    150               212
## 45     61   0               4                    130               330
## 46     58   1               3                    112               230
## 47     51   1               3                    110               175
## 48     50   1               4                    150               243
## 49     65   0               3                    140               417
## 50     53   1               3                    130               197
## 51     41   0               2                    105               198
## 52     65   1               4                    120               177
## 53     44   1               4                    112               290
## 54     44   1               2                    130               219
## 55     60   1               4                    130               253
## 56     54   1               4                    124               266
## 57     50   1               3                    140               233
## 58     41   1               4                    110               172
## 59     54   1               3                    125               273
## 60     51   1               1                    125               213
## 61     51   0               4                    130               305
## 62     46   0               3                    142               177
## 63     58   1               4                    128               216
## 64     54   0               3                    135               304
## 65     54   1               4                    120               188
## 66     60   1               4                    145               282
## 67     60   1               3                    140               185
## 68     54   1               3                    150               232
## 69     59   1               4                    170               326
## 70     46   1               3                    150               231
## 71     65   0               3                    155               269
## 72     67   1               4                    125               254
## 73     62   1               4                    120               267
## 74     65   1               4                    110               248
## 75     44   1               4                    110               197
## 76     65   0               3                    160               360
## 77     60   1               4                    125               258
## 78     51   0               3                    140               308
## 79     48   1               2                    130               245
## 80     58   1               4                    150               270
## 81     45   1               4                    104               208
## 82     53   0               4                    130               264
## 83     39   1               3                    140               321
## 84     68   1               3                    180               274
## 85     52   1               2                    120               325
## 86     44   1               3                    140               235
## 87     47   1               3                    138               257
## 88     53   0               3                    128               216
## 89     53   0               4                    138               234
## 90     51   0               3                    130               256
## 91     66   1               4                    120               302
## 92     62   0               4                    160               164
## 93     62   1               3                    130               231
## 94     44   0               3                    108               141
## 95     63   0               3                    135               252
## 96     52   1               4                    128               255
## 97     59   1               4                    110               239
## 98     60   0               4                    150               258
## 99     52   1               2                    134               201
## 100    48   1               4                    122               222
## 101    45   1               4                    115               260
## 102    34   1               1                    118               182
## 103    57   0               4                    128               303
## 104    71   0               3                    110               265
## 105    49   1               3                    120               188
## 106    54   1               2                    108               309
## 107    59   1               4                    140               177
## 108    57   1               3                    128               229
## 109    61   1               4                    120               260
## 110    39   1               4                    118               219
## 111    61   0               4                    145               307
## 112    56   1               4                    125               249
## 113    52   1               1                    118               186
## 114    43   0               4                    132               341
## 115    62   0               3                    130               263
## 116    41   1               2                    135               203
## 117    58   1               3                    140               211
## 118    35   0               4                    138               183
## 119    63   1               4                    130               330
## 120    65   1               4                    135               254
## 121    48   1               4                    130               256
## 122    63   0               4                    150               407
## 123    51   1               3                    100               222
## 124    55   1               4                    140               217
## 125    65   1               1                    138               282
## 126    45   0               2                    130               234
## 127    56   0               4                    200               288
## 128    54   1               4                    110               239
## 129    44   1               2                    120               220
## 130    62   0               4                    124               209
## 131    54   1               3                    120               258
## 132    51   1               3                     94               227
## 133    29   1               2                    130               204
## 134    51   1               4                    140               261
## 135    43   0               3                    122               213
## 136    55   0               2                    135               250
## 137    70   1               4                    145               174
## 138    62   1               2                    120               281
## 139    35   1               4                    120               198
## 140    51   1               3                    125               245
## 141    59   1               2                    140               221
## 142    59   1               1                    170               288
## 143    52   1               2                    128               205
## 144    64   1               3                    125               309
## 145    58   1               3                    105               240
## 146    47   1               3                    108               243
## 147    57   1               4                    165               289
## 148    41   1               3                    112               250
## 149    45   1               2                    128               308
## 150    60   0               3                    102               318
## 151    52   1               1                    152               298
## 152    42   0               4                    102               265
## 153    67   0               3                    115               564
## 154    55   1               4                    160               289
## 155    64   1               4                    120               246
## 156    70   1               4                    130               322
## 157    51   1               4                    140               299
## 158    58   1               4                    125               300
## 159    60   1               4                    140               293
## 160    68   1               3                    118               277
## 161    46   1               2                    101               197
## 162    77   1               4                    125               304
## 163    54   0               3                    110               214
## 164    58   0               4                    100               248
## 165    48   1               3                    124               255
## 166    57   1               4                    132               207
## 167    52   1               3                    138               223
## 168    54   0               2                    132               288
## 169    35   1               4                    126               282
## 170    45   0               2                    112               160
## 171    70   1               3                    160               269
## 172    53   1               4                    142               226
## 173    59   0               4                    174               249
## 174    62   0               4                    140               394
## 175    64   1               4                    145               212
## 176    57   1               4                    152               274
## 177    52   1               4                    108               233
## 178    56   1               4                    132               184
## 179    43   1               3                    130               315
## 180    53   1               3                    130               246
## 181    48   1               4                    124               274
## 182    56   0               4                    134               409
## 183    42   1               1                    148               244
## 184    59   1               1                    178               270
## 185    60   0               4                    158               305
## 186    63   0               2                    140               195
## 187    42   1               3                    120               240
## 188    66   1               2                    160               246
## 189    54   1               2                    192               283
## 190    69   1               3                    140               254
## 191    50   1               3                    129               196
## 192    51   1               4                    140               298
## 193    43   1               4                    132               247
## 194    62   0               4                    138               294
## 195    68   0               3                    120               211
## 196    67   1               4                    100               299
## 197    69   1               1                    160               234
## 198    45   0               4                    138               236
## 199    50   0               2                    120               244
## 200    59   1               1                    160               273
## 201    50   0               4                    110               254
## 202    64   0               4                    180               325
## 203    57   1               3                    150               126
## 204    64   0               3                    140               313
## 205    43   1               4                    110               211
## 206    45   1               4                    142               309
## 207    58   1               4                    128               259
## 208    50   1               4                    144               200
## 209    55   1               2                    130               262
## 210    62   0               4                    150               244
## 211    37   0               3                    120               215
## 212    38   1               1                    120               231
## 213    41   1               3                    130               214
## 214    66   0               4                    178               228
## 215    52   1               4                    112               230
## 216    56   1               1                    120               193
## 217    46   0               2                    105               204
## 218    46   0               4                    138               243
## 219    64   0               4                    130               303
## 220    59   1               4                    138               271
## 221    41   0               3                    112               268
## 222    54   0               3                    108               267
## 223    39   0               3                     94               199
## 224    53   1               4                    123               282
## 225    63   0               4                    108               269
## 226    34   0               2                    118               210
## 227    47   1               4                    112               204
## 228    67   0               3                    152               277
## 229    54   1               4                    110               206
## 230    66   1               4                    112               212
## 231    52   0               3                    136               196
## 232    55   0               4                    180               327
## 233    49   1               3                    118               149
## 234    74   0               2                    120               269
## 235    54   0               3                    160               201
## 236    54   1               4                    122               286
## 237    56   1               4                    130               283
## 238    46   1               4                    120               249
## 239    49   0               2                    134               271
## 240    42   1               2                    120               295
## 241    41   1               2                    110               235
## 242    41   0               2                    126               306
## 243    49   0               4                    130               269
## 244    61   1               1                    134               234
## 245    60   0               3                    120               178
## 246    67   1               4                    120               237
## 247    58   1               4                    100               234
## 248    47   1               4                    110               275
## 249    52   1               4                    125               212
## 250    62   1               2                    128               208
## 251    57   1               4                    110               201
## 252    58   1               4                    146               218
## 253    64   1               4                    128               263
## 254    51   0               3                    120               295
## 255    43   1               4                    115               303
## 256    42   0               3                    120               209
## 257    67   0               4                    106               223
## 258    76   0               3                    140               197
## 259    70   1               2                    156               245
## 260    57   1               2                    124               261
## 261    44   0               3                    118               242
## 262    58   0               2                    136               319
## 263    60   0               1                    150               240
## 264    44   1               3                    120               226
## 265    61   1               4                    138               166
## 266    42   1               4                    136               315
## 267    52   1               4                    128               204
## 268    59   1               3                    126               218
## 269    40   1               4                    152               223
## 270    42   1               3                    130               180
## 271    61   1               4                    140               207
## 272    66   1               4                    160               228
## 273    46   1               4                    140               311
## 274    71   0               4                    112               149
## 275    59   1               1                    134               204
## 276    64   1               1                    170               227
## 277    66   0               3                    146               278
## 278    39   0               3                    138               220
## 279    57   1               2                    154               232
## 280    58   0               4                    130               197
## 281    57   1               4                    110               335
## 282    47   1               3                    130               253
## 283    55   0               4                    128               205
## 284    35   1               2                    122               192
## 285    61   1               4                    148               203
## 286    58   1               4                    114               318
## 287    58   0               4                    170               225
## 288    58   1               2                    125               220
## 289    56   1               2                    130               221
## 290    56   1               2                    120               240
## 291    67   1               3                    152               212
## 292    55   0               2                    132               342
## 293    44   1               4                    120               169
## 294    63   1               4                    140               187
## 295    63   0               4                    124               197
## 296    41   1               2                    120               157
## 297    59   1               4                    164               176
## 298    57   0               4                    140               241
## 299    45   1               1                    110               264
## 300    68   1               4                    144               193
## 301    57   1               4                    130               131
## 302    57   0               2                    130               236
## 303    38   1               3                    138               175
##     Fasting_Blood_Sugar Resting_ECG Max_Heart_Rate_Achieved
## 1                     1           2                     150
## 2                     0           2                     108
## 3                     0           2                     129
## 4                     0           0                     187
## 5                     0           2                     172
## 6                     0           0                     178
## 7                     0           2                     160
## 8                     0           0                     163
## 9                     0           2                     147
## 10                    1           2                     155
## 11                    0           0                     148
## 12                    0           2                     153
## 13                    1           2                     142
## 14                    0           0                     173
## 15                    1           0                     162
## 16                    0           0                     174
## 17                    0           0                     168
## 18                    0           0                     160
## 19                    0           0                     139
## 20                    0           0                     171
## 21                    0           2                     144
## 22                    1           2                     162
## 23                    0           2                     160
## 24                    0           2                     173
## 25                    0           2                     132
## 26                    0           0                     158
## 27                    0           0                     172
## 28                    0           0                     114
## 29                    0           0                     171
## 30                    0           2                     114
## 31                    0           0                     151
## 32                    1           0                     160
## 33                    0           0                     158
## 34                    0           0                     161
## 35                    0           0                     179
## 36                    0           0                     178
## 37                    0           2                     120
## 38                    0           2                     112
## 39                    0           0                     132
## 40                    1           0                     137
## 41                    0           2                     114
## 42                    0           0                     178
## 43                    0           0                     162
## 44                    1           0                     157
## 45                    0           2                     169
## 46                    0           2                     165
## 47                    0           0                     123
## 48                    0           2                     128
## 49                    1           2                     157
## 50                    1           2                     152
## 51                    0           0                     168
## 52                    0           0                     140
## 53                    0           2                     153
## 54                    0           2                     188
## 55                    0           0                     144
## 56                    0           2                     109
## 57                    0           0                     163
## 58                    0           2                     158
## 59                    0           2                     152
## 60                    0           2                     125
## 61                    0           0                     142
## 62                    0           2                     160
## 63                    0           2                     131
## 64                    1           0                     170
## 65                    0           0                     113
## 66                    0           2                     142
## 67                    0           2                     155
## 68                    0           2                     165
## 69                    0           2                     140
## 70                    0           0                     147
## 71                    0           0                     148
## 72                    1           0                     163
## 73                    0           0                      99
## 74                    0           2                     158
## 75                    0           2                     177
## 76                    0           2                     151
## 77                    0           2                     141
## 78                    0           2                     142
## 79                    0           2                     180
## 80                    0           2                     111
## 81                    0           2                     148
## 82                    0           2                     143
## 83                    0           2                     182
## 84                    1           2                     150
## 85                    0           0                     172
## 86                    0           2                     180
## 87                    0           2                     156
## 88                    0           2                     115
## 89                    0           2                     160
## 90                    0           2                     149
## 91                    0           2                     151
## 92                    0           2                     145
## 93                    0           0                     146
## 94                    0           0                     175
## 95                    0           2                     172
## 96                    0           0                     161
## 97                    0           2                     142
## 98                    0           2                     157
## 99                    0           0                     158
## 100                   0           2                     186
## 101                   0           2                     185
## 102                   0           2                     174
## 103                   0           2                     159
## 104                   1           2                     130
## 105                   0           0                     139
## 106                   0           0                     156
## 107                   0           0                     162
## 108                   0           2                     150
## 109                   0           0                     140
## 110                   0           0                     140
## 111                   0           2                     146
## 112                   1           2                     144
## 113                   0           2                     190
## 114                   1           2                     136
## 115                   0           0                      97
## 116                   0           0                     132
## 117                   1           2                     165
## 118                   0           0                     182
## 119                   1           2                     132
## 120                   0           2                     127
## 121                   1           2                     150
## 122                   0           2                     154
## 123                   0           0                     143
## 124                   0           0                     111
## 125                   1           2                     174
## 126                   0           2                     175
## 127                   1           2                     133
## 128                   0           0                     126
## 129                   0           0                     170
## 130                   0           0                     163
## 131                   0           2                     147
## 132                   0           0                     154
## 133                   0           2                     202
## 134                   0           2                     186
## 135                   0           0                     165
## 136                   0           2                     161
## 137                   0           0                     125
## 138                   0           2                     103
## 139                   0           0                     130
## 140                   1           2                     166
## 141                   0           0                     164
## 142                   0           2                     159
## 143                   1           0                     184
## 144                   0           0                     131
## 145                   0           2                     154
## 146                   0           0                     152
## 147                   1           2                     124
## 148                   0           0                     179
## 149                   0           2                     170
## 150                   0           0                     160
## 151                   1           0                     178
## 152                   0           2                     122
## 153                   0           2                     160
## 154                   0           2                     145
## 155                   0           2                      96
## 156                   0           2                     109
## 157                   0           0                     173
## 158                   0           2                     171
## 159                   0           2                     170
## 160                   0           0                     151
## 161                   1           0                     156
## 162                   0           2                     162
## 163                   0           0                     158
## 164                   0           2                     122
## 165                   1           0                     175
## 166                   0           0                     168
## 167                   0           0                     169
## 168                   1           2                     159
## 169                   0           2                     156
## 170                   0           0                     138
## 171                   0           0                     112
## 172                   0           2                     111
## 173                   0           0                     143
## 174                   0           2                     157
## 175                   0           2                     132
## 176                   0           0                      88
## 177                   1           0                     147
## 178                   0           2                     105
## 179                   0           0                     162
## 180                   1           2                     173
## 181                   0           2                     166
## 182                   0           2                     150
## 183                   0           2                     178
## 184                   0           2                     145
## 185                   0           2                     161
## 186                   0           0                     179
## 187                   1           0                     194
## 188                   0           0                     120
## 189                   0           2                     195
## 190                   0           2                     146
## 191                   0           0                     163
## 192                   0           0                     122
## 193                   1           2                     143
## 194                   1           0                     106
## 195                   0           2                     115
## 196                   0           2                     125
## 197                   1           2                     131
## 198                   0           2                     152
## 199                   0           0                     162
## 200                   0           2                     125
## 201                   0           2                     159
## 202                   0           0                     154
## 203                   1           0                     173
## 204                   0           0                     133
## 205                   0           0                     161
## 206                   0           2                     147
## 207                   0           2                     130
## 208                   0           2                     126
## 209                   0           0                     155
## 210                   0           0                     154
## 211                   0           0                     170
## 212                   0           0                     182
## 213                   0           2                     168
## 214                   1           0                     165
## 215                   0           0                     160
## 216                   0           2                     162
## 217                   0           0                     172
## 218                   0           2                     152
## 219                   0           0                     122
## 220                   0           2                     182
## 221                   0           2                     172
## 222                   0           2                     167
## 223                   0           0                     179
## 224                   0           0                      95
## 225                   0           0                     169
## 226                   0           0                     192
## 227                   0           0                     143
## 228                   0           0                     172
## 229                   0           2                     108
## 230                   0           2                     132
## 231                   0           2                     169
## 232                   0           1                     117
## 233                   0           2                     126
## 234                   0           2                     121
## 235                   0           0                     163
## 236                   0           2                     116
## 237                   1           2                     103
## 238                   0           2                     144
## 239                   0           0                     162
## 240                   0           0                     162
## 241                   0           0                     153
## 242                   0           0                     163
## 243                   0           0                     163
## 244                   0           0                     145
## 245                   1           0                      96
## 246                   0           0                      71
## 247                   0           0                     156
## 248                   0           2                     118
## 249                   0           0                     168
## 250                   1           2                     140
## 251                   0           0                     126
## 252                   0           0                     105
## 253                   0           0                     105
## 254                   0           2                     157
## 255                   0           0                     181
## 256                   0           0                     173
## 257                   0           0                     142
## 258                   0           1                     116
## 259                   0           2                     143
## 260                   0           0                     141
## 261                   0           0                     149
## 262                   1           2                     152
## 263                   0           0                     171
## 264                   0           0                     169
## 265                   0           2                     125
## 266                   0           0                     125
## 267                   1           0                     156
## 268                   1           0                     134
## 269                   0           0                     181
## 270                   0           0                     150
## 271                   0           2                     138
## 272                   0           2                     138
## 273                   0           0                     120
## 274                   0           0                     125
## 275                   0           0                     162
## 276                   0           2                     155
## 277                   0           2                     152
## 278                   0           0                     152
## 279                   0           2                     164
## 280                   0           0                     131
## 281                   0           0                     143
## 282                   0           0                     179
## 283                   0           1                     130
## 284                   0           0                     174
## 285                   0           0                     161
## 286                   0           1                     140
## 287                   1           2                     146
## 288                   0           0                     144
## 289                   0           2                     163
## 290                   0           0                     169
## 291                   0           2                     150
## 292                   0           0                     166
## 293                   0           0                     144
## 294                   0           2                     144
## 295                   0           0                     136
## 296                   0           0                     182
## 297                   1           2                      90
## 298                   0           0                     123
## 299                   0           0                     132
## 300                   1           0                     141
## 301                   0           0                     115
## 302                   0           2                     174
## 303                   0           0                     173
##     Exercise_Induced_Angina ST_Depression_Exercise Peak_Exercise_ST_Segment
## 1                         0                    2.3                        3
## 2                         1                    1.5                        2
## 3                         1                    2.6                        2
## 4                         0                    3.5                        3
## 5                         0                    1.4                        1
## 6                         0                    0.8                        1
## 7                         0                    3.6                        3
## 8                         1                    0.6                        1
## 9                         0                    1.4                        2
## 10                        1                    3.1                        3
## 11                        0                    0.4                        2
## 12                        0                    1.3                        2
## 13                        1                    0.6                        2
## 14                        0                    0.0                        1
## 15                        0                    0.5                        1
## 16                        0                    1.6                        1
## 17                        0                    1.0                        3
## 18                        0                    1.2                        1
## 19                        0                    0.2                        1
## 20                        0                    0.6                        1
## 21                        1                    1.8                        2
## 22                        0                    1.0                        1
## 23                        0                    1.8                        2
## 24                        0                    3.2                        1
## 25                        1                    2.4                        2
## 26                        0                    1.6                        2
## 27                        0                    0.0                        1
## 28                        0                    2.6                        3
## 29                        0                    1.5                        1
## 30                        1                    2.0                        2
## 31                        0                    1.8                        1
## 32                        1                    1.4                        1
## 33                        0                    0.0                        1
## 34                        0                    0.5                        2
## 35                        1                    0.4                        1
## 36                        0                    0.0                        1
## 37                        1                    2.5                        2
## 38                        1                    0.6                        2
## 39                        1                    1.2                        2
## 40                        1                    1.0                        2
## 41                        0                    1.0                        2
## 42                        1                    1.4                        1
## 43                        0                    0.4                        1
## 44                        0                    1.6                        1
## 45                        0                    0.0                        1
## 46                        0                    2.5                        2
## 47                        0                    0.6                        1
## 48                        0                    2.6                        2
## 49                        0                    0.8                        1
## 50                        0                    1.2                        3
## 51                        0                    0.0                        1
## 52                        0                    0.4                        1
## 53                        0                    0.0                        1
## 54                        0                    0.0                        1
## 55                        1                    1.4                        1
## 56                        1                    2.2                        2
## 57                        0                    0.6                        2
## 58                        0                    0.0                        1
## 59                        0                    0.5                        3
## 60                        1                    1.4                        1
## 61                        1                    1.2                        2
## 62                        1                    1.4                        3
## 63                        1                    2.2                        2
## 64                        0                    0.0                        1
## 65                        0                    1.4                        2
## 66                        1                    2.8                        2
## 67                        0                    3.0                        2
## 68                        0                    1.6                        1
## 69                        1                    3.4                        3
## 70                        0                    3.6                        2
## 71                        0                    0.8                        1
## 72                        0                    0.2                        2
## 73                        1                    1.8                        2
## 74                        0                    0.6                        1
## 75                        0                    0.0                        1
## 76                        0                    0.8                        1
## 77                        1                    2.8                        2
## 78                        0                    1.5                        1
## 79                        0                    0.2                        2
## 80                        1                    0.8                        1
## 81                        1                    3.0                        2
## 82                        0                    0.4                        2
## 83                        0                    0.0                        1
## 84                        1                    1.6                        2
## 85                        0                    0.2                        1
## 86                        0                    0.0                        1
## 87                        0                    0.0                        1
## 88                        0                    0.0                        1
## 89                        0                    0.0                        1
## 90                        0                    0.5                        1
## 91                        0                    0.4                        2
## 92                        0                    6.2                        3
## 93                        0                    1.8                        2
## 94                        0                    0.6                        2
## 95                        0                    0.0                        1
## 96                        1                    0.0                        1
## 97                        1                    1.2                        2
## 98                        0                    2.6                        2
## 99                        0                    0.8                        1
## 100                       0                    0.0                        1
## 101                       0                    0.0                        1
## 102                       0                    0.0                        1
## 103                       0                    0.0                        1
## 104                       0                    0.0                        1
## 105                       0                    2.0                        2
## 106                       0                    0.0                        1
## 107                       1                    0.0                        1
## 108                       0                    0.4                        2
## 109                       1                    3.6                        2
## 110                       0                    1.2                        2
## 111                       1                    1.0                        2
## 112                       1                    1.2                        2
## 113                       0                    0.0                        2
## 114                       1                    3.0                        2
## 115                       0                    1.2                        2
## 116                       0                    0.0                        2
## 117                       0                    0.0                        1
## 118                       0                    1.4                        1
## 119                       1                    1.8                        1
## 120                       0                    2.8                        2
## 121                       1                    0.0                        1
## 122                       0                    4.0                        2
## 123                       1                    1.2                        2
## 124                       1                    5.6                        3
## 125                       0                    1.4                        2
## 126                       0                    0.6                        2
## 127                       1                    4.0                        3
## 128                       1                    2.8                        2
## 129                       0                    0.0                        1
## 130                       0                    0.0                        1
## 131                       0                    0.4                        2
## 132                       1                    0.0                        1
## 133                       0                    0.0                        1
## 134                       1                    0.0                        1
## 135                       0                    0.2                        2
## 136                       0                    1.4                        2
## 137                       1                    2.6                        3
## 138                       0                    1.4                        2
## 139                       1                    1.6                        2
## 140                       0                    2.4                        2
## 141                       1                    0.0                        1
## 142                       0                    0.2                        2
## 143                       0                    0.0                        1
## 144                       1                    1.8                        2
## 145                       1                    0.6                        2
## 146                       0                    0.0                        1
## 147                       0                    1.0                        2
## 148                       0                    0.0                        1
## 149                       0                    0.0                        1
## 150                       0                    0.0                        1
## 151                       0                    1.2                        2
## 152                       0                    0.6                        2
## 153                       0                    1.6                        2
## 154                       1                    0.8                        2
## 155                       1                    2.2                        3
## 156                       0                    2.4                        2
## 157                       1                    1.6                        1
## 158                       0                    0.0                        1
## 159                       0                    1.2                        2
## 160                       0                    1.0                        1
## 161                       0                    0.0                        1
## 162                       1                    0.0                        1
## 163                       0                    1.6                        2
## 164                       0                    1.0                        2
## 165                       0                    0.0                        1
## 166                       1                    0.0                        1
## 167                       0                    0.0                        1
## 168                       1                    0.0                        1
## 169                       1                    0.0                        1
## 170                       0                    0.0                        2
## 171                       1                    2.9                        2
## 172                       1                    0.0                        1
## 173                       1                    0.0                        2
## 174                       0                    1.2                        2
## 175                       0                    2.0                        2
## 176                       1                    1.2                        2
## 177                       0                    0.1                        1
## 178                       1                    2.1                        2
## 179                       0                    1.9                        1
## 180                       0                    0.0                        1
## 181                       0                    0.5                        2
## 182                       1                    1.9                        2
## 183                       0                    0.8                        1
## 184                       0                    4.2                        3
## 185                       0                    0.0                        1
## 186                       0                    0.0                        1
## 187                       0                    0.8                        3
## 188                       1                    0.0                        2
## 189                       0                    0.0                        1
## 190                       0                    2.0                        2
## 191                       0                    0.0                        1
## 192                       1                    4.2                        2
## 193                       1                    0.1                        2
## 194                       0                    1.9                        2
## 195                       0                    1.5                        2
## 196                       1                    0.9                        2
## 197                       0                    0.1                        2
## 198                       1                    0.2                        2
## 199                       0                    1.1                        1
## 200                       0                    0.0                        1
## 201                       0                    0.0                        1
## 202                       1                    0.0                        1
## 203                       0                    0.2                        1
## 204                       0                    0.2                        1
## 205                       0                    0.0                        1
## 206                       1                    0.0                        2
## 207                       1                    3.0                        2
## 208                       1                    0.9                        2
## 209                       0                    0.0                        1
## 210                       1                    1.4                        2
## 211                       0                    0.0                        1
## 212                       1                    3.8                        2
## 213                       0                    2.0                        2
## 214                       1                    1.0                        2
## 215                       0                    0.0                        1
## 216                       0                    1.9                        2
## 217                       0                    0.0                        1
## 218                       1                    0.0                        2
## 219                       0                    2.0                        2
## 220                       0                    0.0                        1
## 221                       1                    0.0                        1
## 222                       0                    0.0                        1
## 223                       0                    0.0                        1
## 224                       1                    2.0                        2
## 225                       1                    1.8                        2
## 226                       0                    0.7                        1
## 227                       0                    0.1                        1
## 228                       0                    0.0                        1
## 229                       1                    0.0                        2
## 230                       1                    0.1                        1
## 231                       0                    0.1                        2
## 232                       1                    3.4                        2
## 233                       0                    0.8                        1
## 234                       1                    0.2                        1
## 235                       0                    0.0                        1
## 236                       1                    3.2                        2
## 237                       1                    1.6                        3
## 238                       0                    0.8                        1
## 239                       0                    0.0                        2
## 240                       0                    0.0                        1
## 241                       0                    0.0                        1
## 242                       0                    0.0                        1
## 243                       0                    0.0                        1
## 244                       0                    2.6                        2
## 245                       0                    0.0                        1
## 246                       0                    1.0                        2
## 247                       0                    0.1                        1
## 248                       1                    1.0                        2
## 249                       0                    1.0                        1
## 250                       0                    0.0                        1
## 251                       1                    1.5                        2
## 252                       0                    2.0                        2
## 253                       1                    0.2                        2
## 254                       0                    0.6                        1
## 255                       0                    1.2                        2
## 256                       0                    0.0                        2
## 257                       0                    0.3                        1
## 258                       0                    1.1                        2
## 259                       0                    0.0                        1
## 260                       0                    0.3                        1
## 261                       0                    0.3                        2
## 262                       0                    0.0                        1
## 263                       0                    0.9                        1
## 264                       0                    0.0                        1
## 265                       1                    3.6                        2
## 266                       1                    1.8                        2
## 267                       1                    1.0                        2
## 268                       0                    2.2                        2
## 269                       0                    0.0                        1
## 270                       0                    0.0                        1
## 271                       1                    1.9                        1
## 272                       0                    2.3                        1
## 273                       1                    1.8                        2
## 274                       0                    1.6                        2
## 275                       0                    0.8                        1
## 276                       0                    0.6                        2
## 277                       0                    0.0                        2
## 278                       0                    0.0                        2
## 279                       0                    0.0                        1
## 280                       0                    0.6                        2
## 281                       1                    3.0                        2
## 282                       0                    0.0                        1
## 283                       1                    2.0                        2
## 284                       0                    0.0                        1
## 285                       0                    0.0                        1
## 286                       0                    4.4                        3
## 287                       1                    2.8                        2
## 288                       0                    0.4                        2
## 289                       0                    0.0                        1
## 290                       0                    0.0                        3
## 291                       0                    0.8                        2
## 292                       0                    1.2                        1
## 293                       1                    2.8                        3
## 294                       1                    4.0                        1
## 295                       1                    0.0                        2
## 296                       0                    0.0                        1
## 297                       0                    1.0                        2
## 298                       1                    0.2                        2
## 299                       0                    1.2                        2
## 300                       0                    3.4                        2
## 301                       1                    1.2                        2
## 302                       0                    0.0                        2
## 303                       0                    0.0                        1
##     Num_Major_Vessels_Flouro Thalassemia target
## 1                          0           6      0
## 2                          3           3      1
## 3                          2           7      1
## 4                          0           3      0
## 5                          0           3      0
## 6                          0           3      0
## 7                          2           3      1
## 8                          0           3      0
## 9                          1           7      1
## 10                         0           7      1
## 11                         0           6      0
## 12                         0           3      0
## 13                         1           6      1
## 14                         0           7      0
## 15                         0           7      0
## 16                         0           3      0
## 17                         0           7      1
## 18                         0           3      0
## 19                         0           3      0
## 20                         0           3      0
## 21                         0           3      0
## 22                         0           3      0
## 23                         0           3      1
## 24                         2           7      1
## 25                         2           7      1
## 26                         0           3      0
## 27                         0           3      0
## 28                         0           3      0
## 29                         0           3      0
## 30                         0           7      1
## 31                         2           3      0
## 32                         2           7      1
## 33                         0           3      1
## 34                         0           7      0
## 35                         0           3      0
## 36                         0           3      0
## 37                         0           7      1
## 38                         1           6      1
## 39                         1           7      1
## 40                         0           3      0
## 41                         3           7      1
## 42                         0           7      0
## 43                         2           3      0
## 44                         0           3      0
## 45                         0           3      1
## 46                         1           7      1
## 47                         0           3      0
## 48                         0           7      1
## 49                         1           3      0
## 50                         0           3      0
## 51                         1           3      0
## 52                         0           7      0
## 53                         1           3      1
## 54                         0           3      0
## 55                         1           7      1
## 56                         1           7      1
## 57                         1           7      1
## 58                         0           7      1
## 59                         1           3      0
## 60                         1           3      0
## 61                         0           7      1
## 62                         0           3      0
## 63                         3           7      1
## 64                         0           3      0
## 65                         1           7      1
## 66                         2           7      1
## 67                         0           3      1
## 68                         0           7      0
## 69                         0           7      1
## 70                         0           3      1
## 71                         0           3      0
## 72                         2           7      1
## 73                         2           7      1
## 74                         2           6      1
## 75                         1           3      1
## 76                         0           3      0
## 77                         1           7      1
## 78                         1           3      0
## 79                         0           3      0
## 80                         0           7      1
## 81                         0           3      0
## 82                         0           3      0
## 83                         0           3      0
## 84                         0           7      1
## 85                         0           3      0
## 86                         0           3      0
## 87                         0           3      0
## 88                         0           3      0
## 89                         0           3      0
## 90                         0           3      0
## 91                         0           3      0
## 92                         3           7      1
## 93                         3           7      0
## 94                         0           3      0
## 95                         0           3      0
## 96                         1           7      1
## 97                         1           7      1
## 98                         2           7      1
## 99                         1           3      0
## 100                        0           3      0
## 101                        0           3      0
## 102                        0           3      0
## 103                        1           3      0
## 104                        1           3      0
## 105                        3           7      1
## 106                        0           7      0
## 107                        1           7      1
## 108                        1           7      1
## 109                        1           7      1
## 110                        0           7      1
## 111                        0           7      1
## 112                        1           3      1
## 113                        0           6      0
## 114                        0           7      1
## 115                        1           7      1
## 116                        0           6      0
## 117                        0           3      0
## 118                        0           3      0
## 119                        3           7      1
## 120                        1           7      1
## 121                        2           7      1
## 122                        3           7      1
## 123                        0           3      0
## 124                        0           7      1
## 125                        1           3      1
## 126                        0           3      0
## 127                        2           7      1
## 128                        1           7      1
## 129                        0           3      0
## 130                        0           3      0
## 131                        0           7      0
## 132                        1           7      0
## 133                        0           3      0
## 134                        0           3      0
## 135                        0           3      0
## 136                        0           3      0
## 137                        0           7      1
## 138                        1           7      1
## 139                        0           7      1
## 140                        0           3      0
## 141                        0           3      0
## 142                        0           7      1
## 143                        0           3      0
## 144                        0           7      1
## 145                        0           7      0
## 146                        0           3      1
## 147                        3           7      1
## 148                        0           3      0
## 149                        0           3      0
## 150                        1           3      0
## 151                        0           7      0
## 152                        0           3      0
## 153                        0           7      0
## 154                        1           7      1
## 155                        1           3      1
## 156                        3           3      1
## 157                        0           7      1
## 158                        2           7      1
## 159                        2           7      1
## 160                        1           7      0
## 161                        0           7      0
## 162                        3           3      1
## 163                        0           3      0
## 164                        0           3      0
## 165                        2           3      0
## 166                        0           7      0
## 167                        1           3      0
## 168                        1           3      0
## 169                        0           7      1
## 170                        0           3      0
## 171                        1           7      1
## 172                        0           7      0
## 173                        0           3      1
## 174                        0           3      0
## 175                        2           6      1
## 176                        1           7      1
## 177                        3           7      0
## 178                        1           6      1
## 179                        1           3      0
## 180                        3           3      0
## 181                        0           7      1
## 182                        2           7      1
## 183                        2           3      0
## 184                        0           7      0
## 185                        0           3      1
## 186                        2           3      0
## 187                        0           7      0
## 188                        3           6      1
## 189                        1           7      1
## 190                        3           7      1
## 191                        0           3      0
## 192                        3           7      1
## 193                        1           7      1
## 194                        3           3      1
## 195                        0           3      0
## 196                        2           3      1
## 197                        1           3      0
## 198                        0           3      0
## 199                        0           3      0
## 200                        0           3      1
## 201                        0           3      0
## 202                        0           3      0
## 203                        1           7      0
## 204                        0           7      0
## 205                        0           7      0
## 206                        3           7      1
## 207                        2           7      1
## 208                        0           7      1
## 209                        0           3      0
## 210                        0           3      1
## 211                        0           3      0
## 212                        0           7      1
## 213                        0           3      0
## 214                        2           7      1
## 215                        1           3      1
## 216                        0           7      0
## 217                        0           3      0
## 218                        0           3      0
## 219                        2           3      0
## 220                        0           3      0
## 221                        0           3      0
## 222                        0           3      0
## 223                        0           3      0
## 224                        2           7      1
## 225                        2           3      1
## 226                        0           3      0
## 227                        0           3      0
## 228                        1           3      0
## 229                        1           3      1
## 230                        1           3      1
## 231                        0           3      0
## 232                        0           3      1
## 233                        3           3      1
## 234                        1           3      0
## 235                        1           3      0
## 236                        2           3      1
## 237                        0           7      1
## 238                        0           7      1
## 239                        0           3      0
## 240                        0           3      0
## 241                        0           3      0
## 242                        0           3      0
## 243                        0           3      0
## 244                        2           3      1
## 245                        0           3      0
## 246                        0           3      1
## 247                        1           7      1
## 248                        1           3      1
## 249                        2           7      1
## 250                        0           3      0
## 251                        0           6      0
## 252                        1           7      1
## 253                        1           7      0
## 254                        0           3      0
## 255                        0           3      0
## 256                        0           3      0
## 257                        2           3      0
## 258                        0           3      0
## 259                        0           3      0
## 260                        0           7      1
## 261                        1           3      0
## 262                        2           3      1
## 263                        0           3      0
## 264                        0           3      0
## 265                        1           3      1
## 266                        0           6      1
## 267                        0           3      1
## 268                        1           6      1
## 269                        0           7      1
## 270                        0           3      0
## 271                        1           7      1
## 272                        0           6      0
## 273                        2           7      1
## 274                        0           3      0
## 275                        2           3      1
## 276                        0           7      0
## 277                        1           3      0
## 278                        0           3      0
## 279                        1           3      1
## 280                        0           3      0
## 281                        1           7      1
## 282                        0           3      0
## 283                        1           7      1
## 284                        0           3      0
## 285                        1           7      1
## 286                        3           6      1
## 287                        2           6      1
## 288                        1           7      0
## 289                        0           7      0
## 290                        0           3      0
## 291                        0           7      1
## 292                        0           3      0
## 293                        0           6      1
## 294                        2           7      1
## 295                        0           3      1
## 296                        0           3      0
## 297                        2           6      1
## 298                        0           7      1
## 299                        0           7      1
## 300                        2           7      1
## 301                        1           7      1
## 302                        1           3      1
## 303                        1           3      0
hd_long_fact_tbl <- heart  %>% 
  dplyr::select(Sex, Chest_Pain_Type, Fasting_Blood_Sugar, Resting_ECG, Exercise_Induced_Angina,Peak_Exercise_ST_Segment,Thalassemia,target) %>%
   mutate(Sex = recode_factor(Sex, `0` = "female", 
                                  `1` = "male" ),
         Chest_Pain_Type = recode_factor(Chest_Pain_Type, `1` = "typical",   
                                                          `2` = "atypical",
                                                          `3` = "non-angina", 
                                                          `4` = "asymptomatic"),
         Fasting_Blood_Sugar = recode_factor(Fasting_Blood_Sugar, `0` = "<= 120 mg/dl", 
                                                                  `1` = "> 120 mg/dl"),
         Resting_ECG = recode_factor(Resting_ECG, `0` = "normal",
                                                  `1` = "ST-T abnormality",
                                                  `2` = "LV hypertrophy"),
         Exercise_Induced_Angina = recode_factor(Exercise_Induced_Angina, `0` = "no",
                                                                          `1` = "yes"),
         Peak_Exercise_ST_Segment = recode_factor(Peak_Exercise_ST_Segment, `1` = "upsloping",
                                                                            `2` = "flat",
                                                                            `3` = "downsloping"),
         Thalassemia = recode_factor(Thalassemia, `3` = "normal",
                                                  `6` = "fixed defect",
                                                  `7` = "reversible defect")) %>%
  gather(key = "key", value = "value", -target)
## Warning: attributes are not identical across measure variables;
## they will be dropped
#Visualize with bar plot
hd_long_fact_tbl %>% 
  ggplot(aes(value)) +
    geom_bar(aes(x        = value, 
                 fill     = target), 
                 alpha    = .6, 
                 position = "dodge", 
                 color    = "black",
                 width    = .8
             ) +
    labs(x = "",
         y = "",
         title = "Scaled Effect of Categorical Variables") +
    theme(
         axis.text.y  = element_blank(),
         axis.ticks.y = element_blank()) +
    facet_wrap(~ key, scales = "free", nrow = 4) +
    scale_fill_manual(
         values = c("#fde725ff", "#20a486ff"),
         name   = "Heart\nDisease",
         labels = c("No HD", "Yes HD"))

#Must gather() data first in order to facet wrap by key 
#(default gather call puts all var names into new key col)
#hd_long_cont_tbl <- heart  %>%
 #dplyr::select(Age, Resting_Blood_Pressure, Serum_Cholesterol, Max_Heart_Rate_Achieved,
         #ST_Depression_Exercise, Num_Major_Vessels_Flouro, target) %>% 
  #gather(key   = "key", 
         #value = "value",
         #-target)

#Visualize numeric variables as boxplots
#hd_long_cont_tbl %>% 
  #ggplot(aes(y = value)) +
       #geom_boxplot(aes(fill =  target),
                      #alpha  = .6,
                      #fatten = .7) +
        #labs(x = "",
             #y = "",
             #title = "Boxplots for Numeric Variables") +
      #scale_fill_manual(
            #values = c("#fde725ff", "#20a486ff"),
            #name   = "Heart\nDisease",
            #labels = c("No HD", "Yes HD")) +
      #theme(
         #axis.text.x  = element_blank(),
         #axis.ticks.x = element_blank()) +
      #facet_wrap(~ key, 
                 #scales = "free", 
                 #ncol   = 2) 

Data Visualization

# Count observations per chest pain type (named cp_counts to avoid masking base::table)
cp_counts <- table(as.numeric(heart$Chest_Pain_Type))
pie(cp_counts)

From the pie chart, we can observe that most individuals present with asymptomatic chest pain, followed by non-anginal pain.

The faceted plots for categorical and numeric variables suggest the following conditions are associated with increased prevalence of heart disease.

  1. Asymptomatic angina chest pain (relative to typical angina chest pain, atypical angina pain, or non-angina pain)
  2. Presence of exercise induced angina
  3. Lower fasting blood sugar
  4. Flat or downsloping peak exercise ST segment
  5. Presence of left ventricle hypertrophy
  6. Male
  7. Higher thalassemia score
  8. Higher age
  9. Lower max heart rate achieved
  10. Higher resting blood pressure
  11. Higher cholesterol
  12. Higher ST depression induced by exercise relative to rest

We can’t all be cardiologists but these do seem to pass the eye check. Particularly: age, blood pressure, cholesterol, and sex all point in the right direction based on what we generally know about the world around us. This provides a nice phase gate to let us proceed with the analysis.

Highly correlated variables can lead to overly complicated models or wonky predictions. The ggcorr() function from GGally package provides a nice, clean correlation matrix of the numeric variables. The default method is Pearson which I use here first. Pearson isn’t ideal if the data is skewed or has a lot of outliers so I’ll check using the rank-based Kendall method as well.

Correlation Analysis

Correlation analysis gives us an understanding of the strength and direction of the relationship between features. For example, +0.8 indicates a very strong positive relationship, while 0 indicates no linear relationship.

Two methods, Pearson and Kendall, are used for the correlation analysis in this study.
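
To illustrate why both methods are worth checking, here is a minimal base-R sketch on made-up data (the vectors x and y are hypothetical and are not taken from the heart dataset). It assumes a perfectly monotone relationship with one extreme outlier, which is the kind of situation where the rank-based Kendall coefficient and the Pearson coefficient can disagree.

```r
# Toy data (hypothetical): y increases with x, but the last value is an outlier.
x <- 1:10
y <- c(2, 4, 5, 7, 9, 11, 12, 14, 16, 100)

# Pearson measures linear association and is pulled around by the outlier.
pearson <- cor(x, y, method = "pearson")

# Kendall depends only on the ordering of the values, so the outlier does not
# matter as long as the ranks stay in the same order.
kendall <- cor(x, y, method = "kendall")

print(round(c(pearson = pearson, kendall = kendall), 3))
```

Because y is strictly increasing in x, the Kendall coefficient is exactly 1 here, while the Pearson coefficient is noticeably lower. When the two methods agree closely, as they do in this report, outliers and skew are unlikely to be driving the correlations.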

#Correlation matrix using Pearson method, default method is Pearson
heart %>% ggcorr(high       = "#20a486ff",
                                   low        = "#fde725ff",
                                   label      = TRUE, 
                                   hjust      = .75, 
                                   size       = 3, 
                                   label_size = 3,
                                   nbreaks    = 5
                                              ) +
  labs(title = "Correlation Matrix",
  subtitle = "Pearson Method Using Pairwise Observations")
## Warning in ggcorr(., high = "#20a486ff", low = "#fde725ff", label = TRUE, : data
## in column(s) 'Age', 'Thalassemia' are not numeric and were ignored

#Correlation matrix using Kendall method
heart %>% ggcorr(method     = c("pairwise", "kendall"),
                                   high       = "#20a486ff",
                                   low        = "#fde725ff",
                                   label      = TRUE, 
                                   hjust      = .75, 
                                   size       = 3, 
                                   label_size = 3,
                                   nbreaks    = 5
                                   ) +
  labs(title = "Correlation Matrix",
  subtitle = "Kendall Method Using Pairwise Observations")
## Warning in ggcorr(., method = c("pairwise", "kendall"), high = "#20a486ff", :
## data in column(s) 'Age', 'Thalassemia' are not numeric and were ignored

Based on Pearson and Kendall’s correlation result, the factors that showed correlation >= 40% with the target feature are: Chest Pain type, Num of Major Vessels, Exercise Induced Angina, and ST Depression.

There are only minor differences between the Pearson and Kendall results. No variables appear to be highly correlated (i.e. > 50%). As such, it seems reasonable to keep all 14 original variables as we proceed to the modelling section. Additional steps can be used at the modelling stage to identify the statistically significant features.

Modelling

Data Preparation For Modelling

The plan is to split the original data set into a training group (70%) and a testing group (30%). The training group is used to fit the model, while the testing group is used to evaluate predictions. In rsample, the initial_split() function creates a split object, which is an efficient way to store both the training and testing sets, and the training() and testing() functions extract the corresponding dataframes from it when needed. The modelling code below performs the same 70/30 split with createDataPartition() from caret.
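
As a rough sketch of the rsample workflow described above, the example below splits a small simulated data frame. The data frame toy and the names split_obj, train_tbl and test_tbl are illustrative assumptions, not objects from this analysis; only initial_split(), training() and testing() come from rsample.

```r
library(rsample)  # provides initial_split(), training(), testing()

# Hypothetical stand-in for the heart data: 100 rows with a binary target.
set.seed(100)
toy <- data.frame(
  x      = rnorm(100),
  target = factor(rep(c(0, 1), times = c(55, 45)))
)

# 70/30 split; strata = target keeps the class balance similar in both parts.
split_obj <- initial_split(toy, prop = 0.70, strata = target)

# Extract the two data frames from the split object.
train_tbl <- training(split_obj)
test_tbl  <- testing(split_obj)

print(c(train = nrow(train_tbl), test = nrow(test_tbl)))
```

Every row ends up in exactly one of the two sets, so the row counts add up to the size of the original data frame.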

Create Logistic Regression Model

Logistic regression is a form of regression analysis. A binary logistic regression is used when the target feature is a dichotomy, having only two values, for example 0 or 1, or Yes or No. In this project, it is used to predict the presence of heart disease. It can also indicate which conditions are associated with the presence of heart disease.
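
The idea can be sketched on simulated data before applying it to the heart dataset. The example below is a minimal illustration only: the variables x, y and fit, and the chosen coefficients -0.5 and 1.2, are assumptions made for the simulation and do not come from this study.

```r
# Simulated data (hypothetical): one numeric predictor and a binary outcome.
set.seed(1)
n <- 500
x <- rnorm(n)
p <- plogis(-0.5 + 1.2 * x)          # true P(y = 1 | x) from the logistic function
y <- rbinom(n, size = 1, prob = p)

# Fit a binary logistic regression.
fit <- glm(y ~ x, family = binomial)

# Coefficients are on the log-odds scale; exponentiate to obtain odds ratios.
odds_ratios <- exp(coef(fit))

# type = "response" returns predicted probabilities rather than log-odds.
prob_hat <- predict(fit, newdata = data.frame(x = c(-1, 0, 1)), type = "response")
print(round(prob_hat, 3))
```

A positive coefficient (odds ratio above 1) means that higher values of the predictor are associated with a higher probability of the outcome, which is how the coefficient tables later in this section can be read.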

# Read the Caret Library
library(caret)
'%ni%' <- Negate('%in%')  # define 'not in' func
options(scipen=999)  # prevents printing scientific notations.

# Prep Training and Test dataset
# First, create a Train-test split with 70% data included in the training set
set.seed(100)
trainDataIndex <- createDataPartition(heart1$target, p=0.7, list = F)  # 70% training data
trainData <- heart1[trainDataIndex, ]
testData <- heart1[-trainDataIndex, ]

# Build Logistic Model (Using All Features)
logitmod1 <- glm(target ~ Age + Sex + Chest_Pain_Type + Resting_Blood_Pressure +
                   Serum_Cholesterol + Fasting_Blood_Sugar + Resting_ECG +
                   Max_Heart_Rate_Achieved + Exercise_Induced_Angina +
                   ST_Depression_Exercise + Peak_Exercise_ST_Segment +
                   Num_Major_Vessels_Flouro + Thalassemia,
                 family = "binomial", data = trainData)

# Print the model summary
summary(logitmod1)
## 
## Call:
## glm(formula = target ~ Age + Sex + Chest_Pain_Type + Resting_Blood_Pressure + 
##     Serum_Cholesterol + Fasting_Blood_Sugar + Resting_ECG + Max_Heart_Rate_Achieved + 
##     Exercise_Induced_Angina + ST_Depression_Exercise + Peak_Exercise_ST_Segment + 
##     Num_Major_Vessels_Flouro + Thalassemia, family = "binomial", 
##     data = trainData)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.70796  -0.29831  -0.00013   0.20624   2.47762  
## 
## Coefficients:
##                             Estimate  Std. Error z value Pr(>|z|)    
## (Intercept)               -20.826675 6522.640154  -0.003 0.997452    
## Age34                       0.916971 7751.573625   0.000 0.999906    
## Age35                      15.292539 6522.639838   0.002 0.998129    
## Age37                      -1.361105 7719.002128   0.000 0.999859    
## Age38                      15.931586 6522.638764   0.002 0.998051    
## Age39                      16.802190 6522.641395   0.003 0.997945    
## Age40                      15.259690 6522.638789   0.002 0.998133    
## Age41                      13.912322 6522.638702   0.002 0.998298    
## Age42                      12.722683 6522.638870   0.002 0.998444    
## Age43                      12.592353 6522.638994   0.002 0.998460    
## Age44                      14.685295 6522.638658   0.002 0.998204    
## Age45                      13.138367 6522.638733   0.002 0.998393    
## Age46                      12.893359 6522.638869   0.002 0.998423    
## Age47                      15.692236 6522.638756   0.002 0.998080    
## Age48                      10.731397 6522.639944   0.002 0.998687    
## Age49                      13.698389 6522.639723   0.002 0.998324    
## Age50                      13.257435 6522.639576   0.002 0.998378    
## Age51                      11.605676 6522.638754   0.002 0.998580    
## Age52                      12.206813 6522.638855   0.002 0.998507    
## Age53                      -7.476687 6929.428706  -0.001 0.999139    
## Age54                      11.352843 6522.638751   0.002 0.998611    
## Age55                      14.177270 6522.639090   0.002 0.998266    
## Age56                      14.051898 6522.639028   0.002 0.998281    
## Age57                      13.649477 6522.638667   0.002 0.998330    
## Age58                      12.011296 6522.638729   0.002 0.998531    
## Age59                      14.265998 6522.638692   0.002 0.998255    
## Age60                      15.716961 6522.638784   0.002 0.998077    
## Age61                      15.444033 6522.638766   0.002 0.998111    
## Age62                      11.136944 6522.638944   0.002 0.998638    
## Age63                      15.611120 6522.639018   0.002 0.998090    
## Age64                       9.540608 6522.639142   0.001 0.998833    
## Age65                      13.005983 6522.638740   0.002 0.998409    
## Age66                      11.670180 6522.638779   0.002 0.998572    
## Age67                      13.227907 6522.638869   0.002 0.998382    
## Age68                      10.691786 6522.638924   0.002 0.998692    
## Age69                      11.043980 6522.646017   0.002 0.998649    
## Age70                      27.651256 7288.812690   0.004 0.996973    
## Age71                      -2.825241 7742.837763   0.000 0.999709    
## Age74                      -4.821633 9224.404119  -0.001 0.999583    
## Age76                      -2.549170 9224.404083   0.000 0.999780    
## Age77                      26.566605 9224.404210   0.003 0.997702    
## Age63                   -3.739744 9224.404076   0.000 0.999677    
## Sex                         1.974530    1.011735   1.952 0.050982 .  
## Chest_Pain_Type             0.745306    0.327136   2.278 0.022710 *  
## Resting_Blood_Pressure      0.017929    0.018716   0.958 0.338073    
## Serum_Cholesterol           0.014237    0.008057   1.767 0.077220 .  
## Fasting_Blood_Sugar        -0.158072    1.056833  -0.150 0.881102    
## Resting_ECG                 0.767514    0.367585   2.088 0.036799 *  
## Max_Heart_Rate_Achieved    -0.042724    0.019933  -2.143 0.032079 *  
## Exercise_Induced_Angina     0.830954    0.762069   1.090 0.275541    
## ST_Depression_Exercise      0.123910    0.370983   0.334 0.738377    
## Peak_Exercise_ST_Segment    0.655676    0.733402   0.894 0.371312    
## Num_Major_Vessels_Flouro    1.733654    0.486418   3.564 0.000365 ***
## Thalassemia6                0.143307    1.320759   0.109 0.913597    
## Thalassemia7                2.618716    0.861268   3.041 0.002362 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 293.584  on 212  degrees of freedom
## Residual deviance:  99.031  on 158  degrees of freedom
## AIC: 209.03
## 
## Number of Fisher Scoring iterations: 17
# Re-run the logistic regression with a reduced set of predictors

logitmod2 <- glm(target ~ Chest_Pain_Type+Resting_Blood_Pressure+Exercise_Induced_Angina+Num_Major_Vessels_Flouro+Thalassemia, family = "binomial", data=trainData)

summary(logitmod2)
## 
## Call:
## glm(formula = target ~ Chest_Pain_Type + Resting_Blood_Pressure + 
##     Exercise_Induced_Angina + Num_Major_Vessels_Flouro + Thalassemia, 
##     family = "binomial", data = trainData)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7817  -0.4983  -0.2581   0.5417   2.5019  
## 
## Coefficients:
##                          Estimate Std. Error z value      Pr(>|z|)    
## (Intercept)              -6.44547    1.79949  -3.582      0.000341 ***
## Chest_Pain_Type           0.59300    0.21625   2.742      0.006103 ** 
## Resting_Blood_Pressure    0.01464    0.01106   1.324      0.185665    
## Exercise_Induced_Angina   1.27346    0.43939   2.898      0.003753 ** 
## Num_Major_Vessels_Flouro  1.26868    0.25465   4.982 0.00000062885 ***
## Thalassemia6              1.44373    0.76835   1.879      0.060244 .  
## Thalassemia7              2.53388    0.43588   5.813 0.00000000613 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 293.58  on 212  degrees of freedom
## Residual deviance: 159.04  on 206  degrees of freedom
## AIC: 173.04
## 
## Number of Fisher Scoring iterations: 5
# Apply the model to predict the testdata
# pred holds, for each test observation, the predicted probability that heart disease is present.
# In logistic regression, type = 'response' must be set to obtain predicted probabilities.

pred <- predict(logitmod2, newdata = testData, type = "response")

# Measure the accuracy of prediction in the test data
# The common practice is to take the probability cutoff as 0.5.
# If the predicted probability is > 0.5, the observation is classified as an event (heart disease = yes);
# otherwise it is classified as a non-event (heart disease = no).

y_pred_num <- ifelse(pred > 0.5, 1, 0)
y_pred <- factor(y_pred_num, levels=c(0, 1))
y_act <- testData$target

# Result : Prediction Accuracy (Proportion of predicted target that matches with actual target)
mean(y_pred == y_act)
## [1] 0.8222222
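
A single accuracy figure hides how the errors are distributed between the two classes. The following base-R sketch, on made-up probabilities and labels, shows how a confusion matrix at the 0.5 cutoff breaks accuracy down into sensitivity and specificity. The names prob, actual, predicted, cm, sens and spec are illustrative and are not objects from this analysis.

```r
# Hypothetical predicted probabilities and true labels (not the report's data).
prob   <- c(0.92, 0.81, 0.64, 0.55, 0.48, 0.37, 0.30, 0.22, 0.15, 0.05)
actual <- factor(c(1, 1, 0, 1, 1, 0, 0, 1, 0, 0), levels = c(0, 1))

# Classify as an event when the probability exceeds the 0.5 cutoff.
predicted <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))

# Rows are predicted classes, columns are actual classes.
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

accuracy <- sum(diag(cm)) / sum(cm)          # share of correct predictions
sens     <- cm["1", "1"] / sum(cm[, "1"])    # true positive rate
spec     <- cm["0", "0"] / sum(cm[, "0"])    # true negative rate

print(c(accuracy = accuracy, sensitivity = sens, specificity = spec))
```

In this toy case the accuracy is 0.7, made up of a sensitivity of 0.6 and a specificity of 0.8, so the classifier misses events more often than it raises false alarms.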
# Plot ROC Curve
# install.packages("InformationValue")
library(InformationValue)
## 
## Attaching package: 'InformationValue'
## The following objects are masked from 'package:yardstick':
## 
##     npv, precision, sensitivity, specificity
## The following objects are masked from 'package:caret':
## 
##     confusionMatrix, precision, sensitivity, specificity
InformationValue::plotROC(y_act, pred)

InformationValue::AUROC(y_act, pred)
## [1] 0.8603671
## Interpretation of ROC
# The 'Receiver Operating Characteristic' (ROC) curve summarises the trade-off between
# sensitivity and false positive rate across all cutoffs. The area under the ROC curve
# can be used as an evaluation metric to compare the efficacy of models.

Interpretation of the Logistic Regression Model Result

The logistic regression model achieves an accuracy of 82% on the test dataset, with an area under the ROC curve (AUC) of 0.86.

In the reduced model, the features significant at the 5% level are Chest_Pain_Type, Exercise_Induced_Angina, Num_Major_Vessels_Flouro and Thalassemia (level 7, reversible defect). Resting_Blood_Pressure is retained in the model but is not significant (p = 0.19).

This finding is largely consistent with the Pearson correlation result.

Understanding ROC

The ROC curve is a way of analyzing how sensitivity and specificity behave over the full range of probability cutoffs, from 0 to 1. Ideally, with a perfect model, all events have a probability score of 1 and all non-events a score of 0, and the area under the ROC curve is exactly 1. Tracing the curve from the bottom left, the probability cutoff decreases from 1 towards 0. With a good model, most real events are predicted as events, giving high sensitivity and a low false positive rate (FPR). In that case the curve rises steeply and covers a large area before reaching the top right. Therefore, the larger the area under the ROC curve, the better the model.
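
The cutoff sweep described above can be sketched by hand in base R. The labels and scores below are made up for illustration and are not the predictions from this report; the sketch is not how InformationValue computes its result internally, only an assumed equivalent calculation.

```r
# Hypothetical scores and labels (1 = event, 0 = non-event).
labels <- c(1, 1, 0, 1, 0, 1, 0, 0, 1, 0)
scores <- c(0.95, 0.85, 0.80, 0.70, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10)

# Sweep the cutoff from high to low; predict "event" when score >= cutoff.
cutoffs <- c(Inf, sort(unique(scores), decreasing = TRUE))
tpr <- sapply(cutoffs, function(k) mean(scores[labels == 1] >= k))  # sensitivity
fpr <- sapply(cutoffs, function(k) mean(scores[labels == 0] >= k))  # 1 - specificity

# Area under the curve by the trapezoidal rule.
auc_trap <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# Cross-check: AUC equals the share of (event, non-event) pairs in which the
# event receives the higher score (ties count as one half).
pos <- scores[labels == 1]
neg <- scores[labels == 0]
auc_rank <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))

print(c(trapezoid = auc_trap, rank_based = auc_rank))
```

Both calculations give 0.72 for this toy data: 18 of the 25 event/non-event pairs are ranked correctly. Plotting tpr against fpr with plot(fpr, tpr, type = "s") draws the corresponding curve.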

Create Naive Bayes (NB) Model

NB is a classification technique based on Bayes’ Theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that, within a class, the presence of a particular feature is unrelated to the presence of any other feature.
A Naive Bayes model is easy to build and scales well to very large data sets. Despite its simplicity, it can be competitive with more sophisticated classification methods.
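
To make the independence assumption concrete, the sketch below computes a Naive Bayes posterior by hand for two categorical features. The data frame toy and the helper naive_posterior() are hypothetical and written only for this illustration; the caret model fitted below handles numeric features and tuning parameters that this sketch ignores.

```r
# Hypothetical toy data: two binary features and a binary class.
toy <- data.frame(
  angina = c("yes", "yes", "no", "no", "yes", "no", "no", "yes"),
  male   = c("yes", "no", "yes", "no", "yes", "yes", "no", "yes"),
  class  = c("hd", "hd", "hd", "no_hd", "hd", "no_hd", "no_hd", "no_hd")
)

# Under the naive assumption, features are independent given the class, so
# P(class | x) is proportional to P(class) * P(angina | class) * P(male | class).
# No Laplace smoothing is applied in this sketch.
naive_posterior <- function(data, angina, male) {
  classes <- unique(data$class)
  score <- sapply(classes, function(k) {
    d <- data[data$class == k, ]
    prior <- nrow(d) / nrow(data)
    prior * mean(d$angina == angina) * mean(d$male == male)
  })
  score / sum(score)  # normalise so the posteriors sum to 1
}

post <- naive_posterior(toy, angina = "yes", male = "yes")
print(round(post, 3))
```

For a male case with angina, the unnormalised scores are 0.5 * 0.75 * 0.75 for "hd" and 0.5 * 0.25 * 0.5 for "no_hd", which gives a posterior of 9/11 (about 0.82) for "hd".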

indxTrain <- createDataPartition(y = heart$target,p = 0.70,list = FALSE)
training <- heart[indxTrain,]
testing <- heart[-indxTrain,]

#Check that the class proportions are preserved in the split
prop.table(table(heart$target)) * 100
## 
##        0        1 
## 54.12541 45.87459
prop.table(table(training$target)) * 100
## 
##        0        1 
## 54.46009 45.53991
#create objects x which holds the predictor variables and y which holds the response variables
x = training[,-14]
y = training$target

y <- as.factor(y)
defaultW <- getOption("warn") 
options(warn = -1) 
model = train(x,y,'nb',trControl=trainControl(method='cv',number=10))
model
## Naive Bayes 
## 
## 213 samples
##  13 predictor
##   2 classes: '0', '1' 
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 192, 192, 191, 191, 191, 191, ... 
## Resampling results across tuning parameters:
## 
##   usekernel  Accuracy   Kappa    
##   FALSE      0.8341775  0.6599184
##    TRUE      0.8198918  0.6278793
## 
## Tuning parameter 'fL' was held constant at a value of 0
## Tuning
##  parameter 'adjust' was held constant at a value of 1
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were fL = 0, usekernel = FALSE and adjust
##  = 1.
#Model Evaluation
#Predict testing set
Predict <- predict(model,newdata = testing ) 
#Get the confusion matrix to see accuracy value and other parameter values
#caret:: is given explicitly because InformationValue also exports a confusionMatrix()
#with a different signature (actuals, predictedScores, threshold)
caret::confusionMatrix(Predict, as.factor(testing$target))
options(warn = defaultW)

Conclusion on the performance of Naive Bayes Model

The Naive Bayes model performs almost the same as Logistic Regression: accuracy is about 83% for ‘true’ predictions and about 81% for ‘false’ predictions. Overall, the prediction accuracy is about 82%, which is similar to Logistic Regression.

Create Random Forest Model

Random forest is a supervised machine learning algorithm that is widely used for classification and regression problems. It builds decision trees on different bootstrap samples of the data and takes their majority vote for classification, or their average for regression.
One of the most important features of the random forest algorithm is that it can handle data sets containing both continuous and categorical variables, and it can predict either a continuous response (regression) or a categorical response (classification). It is often reported to perform particularly well on classification problems.
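
The aggregation step can be sketched without fitting any trees. The vote and prediction matrices below are hypothetical stand-ins for the output of individual trees, and majority_vote is an illustrative helper, not part of the randomForest package.

```r
# Hypothetical class votes from five trees (columns) for four patients (rows)
votes <- matrix(c("1", "1", "0", "1", "0",
                  "0", "0", "0", "1", "0",
                  "1", "0", "1", "1", "1",
                  "0", "1", "0", "0", "1"),
                nrow = 4, byrow = TRUE)

# Classification: each row takes the class that most trees voted for
majority_vote <- function(v) names(which.max(table(v)))
forest_class <- apply(votes, 1, majority_vote)
print(forest_class)  # "1" "0" "1" "0"

# Regression: hypothetical numeric predictions from three trees are averaged
tree_preds <- matrix(c(2.0, 3.0, 4.0,
                       1.0, 1.5, 2.0),
                     nrow = 2, byrow = TRUE)
forest_value <- rowMeans(tree_preds)
print(forest_value)  # 3.0 1.5
```

An odd number of trees is used here so that a two-class vote can never tie.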

# Import library
library(randomForest)
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
## 
##     margin
## The following object is masked from 'package:dplyr':
## 
##     combine
# To control the sampling permutation
set.seed(100)

# Change column 'target' to factor
heart$target <- as.factor(heart$target)

# Check the latest class of 'target'
str(heart)
## 'data.frame':    303 obs. of  14 variables:
##  $ Age                     : chr  "63" "67" "67" "37" ...
##  $ Sex                     : int  1 1 1 1 0 1 0 0 1 1 ...
##  $ Chest_Pain_Type         : int  1 4 4 3 2 2 4 4 4 4 ...
##  $ Resting_Blood_Pressure  : int  145 160 120 130 130 120 140 120 130 140 ...
##  $ Serum_Cholesterol       : int  233 286 229 250 204 236 268 354 254 203 ...
##  $ Fasting_Blood_Sugar     : int  1 0 0 0 0 0 0 0 0 1 ...
##  $ Resting_ECG             : int  2 2 2 0 2 0 2 0 2 2 ...
##  $ Max_Heart_Rate_Achieved : int  150 108 129 187 172 178 160 163 147 155 ...
##  $ Exercise_Induced_Angina : int  0 1 1 0 0 0 0 1 0 1 ...
##  $ ST_Depression_Exercise  : num  2.3 1.5 2.6 3.5 1.4 0.8 3.6 0.6 1.4 3.1 ...
##  $ Peak_Exercise_ST_Segment: int  3 2 2 3 1 1 3 1 2 3 ...
##  $ Num_Major_Vessels_Flouro: num  0 3 2 0 0 0 2 0 1 0 ...
##  $ Thalassemia             : chr  "6" "3" "7" "3" ...
##  $ target                  : Factor w/ 2 levels "0","1": 1 2 2 1 1 1 2 1 2 2 ...
# Split dataset into training and testing set with probability 75% & 25% 
rf_sample <- sample(2, nrow(heart), replace = TRUE, prob = c(0.75, 0.25))
rf_train <- heart[rf_sample==1,]
rf_test <- heart[rf_sample==2,]
str(rf_train)
## 'data.frame':    228 obs. of  14 variables:
##  $ Age                     : chr  "63" "67" "67" "37" ...
##  $ Sex                     : int  1 1 1 1 0 1 0 1 1 1 ...
##  $ Chest_Pain_Type         : int  1 4 4 3 2 2 4 4 4 4 ...
##  $ Resting_Blood_Pressure  : int  145 160 120 130 130 120 120 130 140 140 ...
##  $ Serum_Cholesterol       : int  233 286 229 250 204 236 354 254 203 192 ...
##  $ Fasting_Blood_Sugar     : int  1 0 0 0 0 0 0 0 1 0 ...
##  $ Resting_ECG             : int  2 2 2 0 2 0 0 2 2 0 ...
##  $ Max_Heart_Rate_Achieved : int  150 108 129 187 172 178 163 147 155 148 ...
##  $ Exercise_Induced_Angina : int  0 1 1 0 0 0 1 0 1 0 ...
##  $ ST_Depression_Exercise  : num  2.3 1.5 2.6 3.5 1.4 0.8 0.6 1.4 3.1 0.4 ...
##  $ Peak_Exercise_ST_Segment: int  3 2 2 3 1 1 1 2 3 2 ...
##  $ Num_Major_Vessels_Flouro: num  0 3 2 0 0 0 0 1 0 0 ...
##  $ Thalassemia             : chr  "6" "3" "7" "3" ...
##  $ target                  : Factor w/ 2 levels "0","1": 1 2 2 1 1 1 1 2 2 1 ...
# Running the Random Forest model
rf <- randomForest(target~., data=rf_train, proximity=TRUE) 
print(rf)
## 
## Call:
##  randomForest(formula = target ~ ., data = rf_train, proximity = TRUE) 
##                Type of random forest: classification
##                      Number of trees: 500
## No. of variables tried at each split: 3
## 
##         OOB estimate of  error rate: 17.98%
## Confusion matrix:
##     0  1 class.error
## 0 111 16   0.1259843
## 1  25 76   0.2475248
#Checking the accuracy of training and testing set
#caret:: is given explicitly because InformationValue also exports a confusionMatrix()
#with a different signature (actuals, predictedScores, threshold)
p1 <- predict(rf, rf_train)
caret::confusionMatrix(p1, rf_train$target)
p2 <- predict(rf, rf_test)
caret::confusionMatrix(p2, rf_test$target)
plot(rf)

Interpretation of Random Forest (RF) Result

The result shows an OOB (out-of-bag) estimate of error rate of 17.98%. Therefore, the overall accuracy of RF is about 82%.
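
The 82% figure follows directly from the OOB confusion matrix printed by print(rf) earlier in this section. The short check below re-enters those four counts by hand; the object names are chosen here for illustration.

```r
# OOB confusion matrix copied from the print(rf) output (rows = actual class)
oob <- matrix(c(111, 16,
                 25, 76),
              nrow = 2, byrow = TRUE,
              dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

oob_accuracy <- sum(diag(oob)) / sum(oob)    # (111 + 76) / 228
oob_error    <- 1 - oob_accuracy
class_error  <- 1 - diag(oob) / rowSums(oob) # per-class error rates

print(round(100 * oob_error, 2))     # 17.98
print(round(100 * oob_accuracy, 2))  # 82.02
print(round(class_error, 4))         # 0.1260 0.2475
```

The per-class errors show that the model misclassifies patients with heart disease (class 1) about twice as often as patients without it.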

Model Comparison

Overall, all three machine learning models showed a similar accuracy of about 82%. However, Logistic Regression is deemed the better model here because its output identifies the significant features, which allows for better model interpretation and implementation.

Conclusion

In conclusion, with the increasing number of deaths due to heart disease, it has become important to develop a system that predicts heart disease effectively and accurately. The motivation for this study was to find the most efficient ML algorithm for the detection of heart disease. This study compares the accuracy of the Logistic Regression, Naive Bayes and Random Forest algorithms for predicting heart disease using a dataset from the UCI Machine Learning Repository. The results indicate that Logistic Regression is the most suitable algorithm, with an accuracy of 82% for the prediction of heart disease. The advantages of Logistic Regression are its ease of interpretation and implementation, and that it identifies the significant features in the model. In contrast, RF provides similar accuracy, but its interpretation and implementation are rather complex.

In future, the work can be enhanced by developing a web application based on the Logistic Regression algorithm and by using a larger dataset than the one used in this analysis. This would help to provide better results and help health professionals predict heart disease effectively and efficiently.

Create a Data Profiling Report (in HTML)

heart3 <- heart
# Recode some categorical variables as factors for report generation

heart3$Num_Major_Vessels_Flouro = factor(heart3$Num_Major_Vessels_Flouro, levels = c("0","1","2","3"), labels = c(0,1,2,3))

heart3$Thalassemia = factor(heart3$Thalassemia, levels = c("3","6","7"), labels = c(3,6,7))

# Reporting
# Create_report in DataExplorer
# Pull a full data profile of your data frame. 
# It will produce an html file with the basic statistics, structure, missing data, 
# distribution visualizations, correlation matrix and principal component analysis for your data frame

DataExplorer::create_report(heart3)
## Output created: report.html

References