LOADING DATA INTO THE R ENVIRONMENT

TRAINING THE DECISION TREE MODEL

Running the Training Model

## CART 
## 
## 499478 samples
##      7 predictor
##      2 classes: 'N', 'Y' 
## 
## No pre-processing
## Resampling: Cross-Validated (2 fold) 
## Summary of sample sizes: 249740, 249738 
## Resampling results across tuning parameters:
## 
##   cp            Accuracy   Kappa     
##   0.0006257795  0.5631980  0.08617113
##   0.0124805811  0.5600907  0.06606325
##   0.0323348577  0.5498601  0.02283332
## 
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.0006257795.
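
The selection rule stated in the last two lines of the output can be replayed on the printed numbers. The following is a minimal base-R sketch, not the original training code: the `results` data frame is simply retyped from the table above, and the variable names are illustrative.

```r
# Resampling results copied from the printed tuning table.
results <- data.frame(
  cp       = c(0.0006257795, 0.0124805811, 0.0323348577),
  Accuracy = c(0.5631980, 0.5600907, 0.5498601),
  Kappa    = c(0.08617113, 0.06606325, 0.02283332)
)

# Selection rule stated in the output: keep the row with the largest Accuracy.
best <- results[which.max(results$Accuracy), ]
print(best)

# Spread between the best and worst candidate, to show how flat the grid is.
accuracy_range <- diff(range(results$Accuracy))
print(round(accuracy_range, 4))
```

The smallest cp (the least-pruned tree) is selected, but the three candidates differ by only about 1.3 percentage points of accuracy.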

Variable Importance in Decision Tree Model

TESTING THE DECISION TREE MODEL

Confusion Matrix at 50% Cut-Off Probability

## Confusion Matrix and Statistics
## 
##          Actual
## Predicted     Y     N
##         Y 53928 40264
##         N 13812 16864
##                                           
##                Accuracy : 0.5669          
##                  95% CI : (0.5642, 0.5697)
##     No Information Rate : 0.5425          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.0947          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.7961          
##             Specificity : 0.2952          
##          Pos Pred Value : 0.5725          
##          Neg Pred Value : 0.5497          
##              Prevalence : 0.5425          
##          Detection Rate : 0.4319          
##    Detection Prevalence : 0.7543          
##       Balanced Accuracy : 0.5456          
##                                           
##        'Positive' Class : Y               
## 
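
The statistics above all follow from the four cell counts. As a cross-check, the sketch below recomputes them in base R with 'Y' as the positive class; the variable names are illustrative and the counts are copied from the matrix.

```r
# Cell counts copied from the confusion matrix above ('Y' is the positive class).
tp <- 53928  # predicted Y, actual Y
fp <- 40264  # predicted Y, actual N
fn <- 13812  # predicted N, actual Y
tn <- 16864  # predicted N, actual N
n  <- tp + fp + fn + tn

accuracy    <- (tp + tn) / n
sensitivity <- tp / (tp + fn)   # share of actual Y predicted as Y
specificity <- tn / (tn + fp)   # share of actual N predicted as N
ppv         <- tp / (tp + fp)
npv         <- tn / (tn + fn)
prevalence  <- (tp + fn) / n
balanced    <- (sensitivity + specificity) / 2

# Cohen's kappa: observed agreement corrected for chance agreement.
p_chance <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2
kappa    <- (accuracy - p_chance) / (1 - p_chance)

metrics <- c(Accuracy = accuracy, Kappa = kappa,
             Sensitivity = sensitivity, Specificity = specificity,
             PPV = ppv, NPV = npv,
             Prevalence = prevalence, BalancedAccuracy = balanced)
print(round(metrics, 4))
```

Because the No Information Rate equals the prevalence of 'Y' (0.5425), the accuracy of 0.5669 is only about 2.4 percentage points better than always predicting 'Y'.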

Performance Metrics at Different Cut-Off Probabilities

##    cutoff  Accuracy Sensitivity Specificity      kappa
## 1    0.00 0.5424929   1.0000000   0.0000000 0.00000000
## 2    0.05 0.5424929   1.0000000   0.0000000 0.00000000
## 3    0.10 0.5424929   1.0000000   0.0000000 0.00000000
## 4    0.15 0.5424929   1.0000000   0.0000000 0.00000000
## 5    0.20 0.5424929   1.0000000   0.0000000 0.00000000
## 6    0.25 0.5424929   1.0000000   0.0000000 0.00000000
## 7    0.30 0.5424929   1.0000000   0.0000000 0.00000000
## 8    0.35 0.5424929   1.0000000   0.0000000 0.00000000
## 9    0.40 0.5424929   1.0000000   0.0000000 0.00000000
## 10   0.45 0.5578771   0.9440213   0.1000035 0.04710024
## 11   0.50 0.5669347   0.7961027   0.2951968 0.09473542
## 12   0.55 0.5669347   0.7961027   0.2951968 0.09473542
## 13   0.60 0.4575071   0.0000000   1.0000000 0.00000000
## 14   0.65 0.4575071   0.0000000   1.0000000 0.00000000
## 15   0.70 0.4575071   0.0000000   1.0000000 0.00000000
## 16   0.75 0.4575071   0.0000000   1.0000000 0.00000000
## 17   0.80 0.4575071   0.0000000   1.0000000 0.00000000
## 18   0.85 0.4575071   0.0000000   1.0000000 0.00000000
## 19   0.90 0.4575071   0.0000000   1.0000000 0.00000000
## 20   0.95 0.4575071   0.0000000   1.0000000 0.00000000
## 21   1.00 0.4575071   0.0000000   1.0000000 0.00000000
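
The long runs of identical rows are expected for a single decision tree: every observation in a leaf receives the same predicted probability, so the metrics change only when the cut-off crosses one of those leaf values. The table suggests that all of the tree's predicted probabilities of 'Y' lie roughly between 0.40 and 0.60. The sketch below reproduces the effect on a small made-up data set. The helper `metrics_at_cutoff`, the toy vectors, and the `>=` classification rule are assumptions for illustration, not the code that produced the table.

```r
# Hypothetical helper: classify as 'Y' when P(Y) is at least the cut-off,
# then compute the same four metrics as in the table above.
metrics_at_cutoff <- function(prob_y, actual, cutoff) {
  predicted_y <- prob_y >= cutoff
  actual_y    <- actual == "Y"
  tp <- sum(predicted_y & actual_y)
  fp <- sum(predicted_y & !actual_y)
  fn <- sum(!predicted_y & actual_y)
  tn <- sum(!predicted_y & !actual_y)
  n  <- length(actual)
  accuracy <- (tp + tn) / n
  p_chance <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2
  data.frame(cutoff      = cutoff,
             Accuracy    = accuracy,
             Sensitivity = tp / (tp + fn),
             Specificity = tn / (tn + fp),
             kappa       = (accuracy - p_chance) / (1 - p_chance))
}

# Toy data (made up): three "leaves" with probabilities 0.42, 0.48 and 0.57.
prob_y <- c(0.42, 0.42, 0.48, 0.48, 0.57, 0.57, 0.57, 0.57)
actual <- c("N",  "Y",  "N",  "N",  "Y",  "Y",  "N",  "Y")

# Evaluate the same grid of cut-offs as the table: 0.00, 0.05, ..., 1.00.
cutoffs <- seq(0, 1, by = 0.05)
sweep <- do.call(rbind, lapply(cutoffs, function(k) {
  metrics_at_cutoff(prob_y, actual, k)
}))
print(sweep, row.names = FALSE)
```

In the toy output, the rows for cut-offs 0.50 and 0.55 are identical because no leaf probability falls between them, which mirrors rows 11 and 12 of the table above.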

AUC (Area Under the Curve)

## [1] 0.5475826
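
An AUC of 0.5476 is only slightly above 0.5, the value expected from random ranking. The AUC can be read as the probability that a randomly chosen 'Y' case receives a higher predicted probability than a randomly chosen 'N' case, with ties counted as one half. The sketch below computes it that way in base R using the rank-sum (Mann-Whitney) form. The function name `auc_rank` and the toy vectors are illustrative assumptions; the report does not show how its AUC was computed.

```r
# AUC via the rank-sum (Mann-Whitney) identity.
# rank() assigns average ranks to ties, so tied pairs count as one half.
auc_rank <- function(prob_y, actual) {
  is_pos <- actual == "Y"
  n_pos  <- sum(is_pos)
  n_neg  <- sum(!is_pos)
  r      <- rank(prob_y)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data (made up): a tree-like score with only three distinct values.
prob_y <- c(0.42, 0.42, 0.48, 0.48, 0.57, 0.57, 0.57, 0.57)
actual <- c("N",  "Y",  "N",  "N",  "Y",  "Y",  "N",  "Y")

toy_auc <- auc_rank(prob_y, actual)
print(toy_auc)
```

With only a few distinct predicted probabilities, as a single tree produces, many Y/N pairs are tied, which pulls the AUC toward 0.5.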