## Loading required package: ggplot2
## Loading required package: lattice
## [1] "state" "account_length"
## [3] "area_code" "international_plan"
## [5] "voice_mail_plan" "number_vmail_messages"
## [7] "total_day_minutes" "total_day_calls"
## [9] "total_day_charge" "total_eve_minutes"
## [11] "total_eve_calls" "total_eve_charge"
## [13] "total_night_minutes" "total_night_calls"
## [15] "total_night_charge" "total_intl_minutes"
## [17] "total_intl_calls" "total_intl_charge"
## [19] "number_customer_service_calls" "churn"
##      state      account_length         area_code     international_plan
##  WV     : 125   Min.   :  1.0   area_code_408:1016   no :3625
##  MN     : 102   1st Qu.: 73.0   area_code_415:1997   yes: 376
##  OH     :  99   Median :100.0   area_code_510: 988
##  VA     :  98   Mean   :100.4
##  AL     :  97   3rd Qu.:127.0
##  NY     :  97   Max.   :243.0
##  (Other):3383
##  voice_mail_plan number_vmail_messages total_day_minutes total_day_calls
##  no :2949        Min.   : 0.000        Min.   :  0.0     Min.   :  0
##  yes:1052        1st Qu.: 0.000        1st Qu.:144.1     1st Qu.: 87
##                  Median : 0.000        Median :180.0     Median :100
##                  Mean   : 7.664        Mean   :180.2     Mean   :100
##                  3rd Qu.:16.000        3rd Qu.:215.9     3rd Qu.:113
##                  Max.   :52.000        Max.   :351.5     Max.   :163
##
##  total_day_charge total_eve_minutes total_eve_calls total_eve_charge
##  Min.   : 0.00    Min.   :  0.0     Min.   :  0.0   Min.   : 0.00
##  1st Qu.:24.50    1st Qu.:166.7     1st Qu.: 87.0   1st Qu.:14.17
##  Median :30.60    Median :201.3     Median :100.0   Median :17.11
##  Mean   :30.63    Mean   :201.0     Mean   :100.1   Mean   :17.08
##  3rd Qu.:36.70    3rd Qu.:234.9     3rd Qu.:113.0   3rd Qu.:19.97
##  Max.   :59.76    Max.   :363.7     Max.   :170.0   Max.   :30.91
##
##  total_night_minutes total_night_calls total_night_charge total_intl_minutes
##  Min.   : 23.2       Min.   : 33.0     Min.   : 1.040     Min.   : 0.00
##  1st Qu.:168.2       1st Qu.: 86.0     1st Qu.: 7.570     1st Qu.: 8.50
##  Median :201.9       Median :100.0     Median : 9.090     Median :10.30
##  Mean   :201.3       Mean   : 99.9     Mean   : 9.057     Mean   :10.26
##  3rd Qu.:235.1       3rd Qu.:113.0     3rd Qu.:10.580     3rd Qu.:12.00
##  Max.   :395.0       Max.   :175.0     Max.   :17.770     Max.   :20.00
##
##  total_intl_calls total_intl_charge number_customer_service_calls churn
##  Min.   : 0.00    Min.   :0.000     Min.   :0.000                 yes: 566
##  1st Qu.: 3.00    1st Qu.:2.300     1st Qu.:1.000                 no :3435
##  Median : 4.00    Median :2.780     Median :1.000
##  Mean   : 4.42    Mean   :2.769     Mean   :1.561
##  3rd Qu.: 6.00    3rd Qu.:3.240     3rd Qu.:2.000
##  Max.   :19.00    Max.   :5.400     Max.   :9.000
##

##
## dt_pred yes  no
##     yes  34  16
##     no  107 842
##
## 16 34 107 842
## 1 1 1 1
## Confusion Matrix and Statistics
##
##           Reference
## Prediction yes  no
##        yes  34  16
##        no  107 842
##
## Accuracy : 0.8769
## 95% CI : (0.8549, 0.8966)
## No Information Rate : 0.8589
## P-Value [Acc > NIR] : 0.0539
##
## Kappa : 0.3046
##
## Mcnemar's Test P-Value : 4.857e-16
##
## Sensitivity : 0.24113
## Specificity : 0.98135
## Pos Pred Value : 0.68000
## Neg Pred Value : 0.88725
## Prevalence : 0.14114
## Detection Rate : 0.03403
## Detection Prevalence : 0.05005
## Balanced Accuracy : 0.61124
##
## 'Positive' Class : yes
##
## [1] "Precision is 0.68; recall is 0.24; F measure is 0.36"
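The three values on the line above can be recomputed directly from the decision-tree confusion matrix counts. The following is a minimal base R sketch, with "yes" as the positive class as in the caret output; the variable names are illustrative and not taken from the original script.

```r
# Counts from the decision-tree confusion matrix
# (rows = prediction, columns = reference).
tp <- 34   # predicted yes, actually yes
fp <- 16   # predicted yes, actually no
fn <- 107  # predicted no, actually yes
tn <- 842  # predicted no, actually no

precision <- tp / (tp + fp)   # reported by caret as Pos Pred Value
recall    <- tp / (tp + fn)   # reported by caret as Sensitivity
f_measure <- 2 * precision * recall / (precision + recall)

print(round(c(precision = precision, recall = recall, f_measure = f_measure), 2))
```

The low recall shows that the tree misses most churners even though overall accuracy is close to 0.88, which is why accuracy alone is a weak summary for this imbalanced outcome.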
##
## 444 2756
## 1 1
## [1] 3201
## [1] 4001
##
## 453 2748
## 1 1
##
## 566 3435
## 1 1
## CART
##
## 4001 samples
## 19 predictor
## 2 classes: 'yes', 'no'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 3 times)
## Summary of sample sizes: 3601, 3602, 3601, 3601, 3600, 3601, ...
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa
## 0.07067138 0.8845272 0.36768761
## 0.07950530 0.8682849 0.20918381
## 0.10070671 0.8593670 0.07678755
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.07067138.
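The training-set sizes in the summary line (3600 to 3602) are consistent with 10 folds that are stratified by class. The base R sketch below reproduces that range from the class counts, assuming each class is split into 10 chunks whose sizes differ by at most one; `fold_sizes` is a hypothetical helper, not a caret function.

```r
# Class counts of the 4001-row training set (from the churn summary).
class_counts <- c(yes = 566, no = 3435)
k <- 10

# Split n items into k folds whose sizes differ by at most one.
fold_sizes <- function(n, k) {
  base <- n %/% k
  extra <- n %% k
  c(rep(base + 1, extra), rep(base, k - extra))
}

yes_folds <- fold_sizes(class_counts[["yes"]], k)  # 56 or 57 per fold
no_folds  <- fold_sizes(class_counts[["no"]], k)   # 343 or 344 per fold

# A held-out fold combines one "yes" chunk with one "no" chunk, so its size
# lies between the two extremes; the training set is everything else.
held_out_range <- c(min(yes_folds) + min(no_folds),
                    max(yes_folds) + max(no_folds))
training_range <- sum(class_counts) - rev(held_out_range)
print(training_range)
```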
## CART
##
## 4001 samples
## 19 predictor
## 2 classes: 'yes', 'no'
##
## No pre-processing
## Resampling: Bootstrapped (10 reps)
## Summary of sample sizes: 4001, 4001, 4001, 4001, 4001, 4001, ...
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa
## 0.07067138 0.8875121 0.3967175
## 0.07950530 0.8741046 0.2801062
## 0.10070671 0.8622449 0.1201894
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.07067138.
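Every bootstrap replicate has 4001 rows because rows are drawn with replacement, and the rows that are never drawn form the hold-out set for that replicate. A minimal base R sketch of a single replicate follows; the seed and variable names are arbitrary choices made for this illustration.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible
n <- 4001

# One bootstrap replicate: draw n row indices with replacement.
boot_idx <- sample.int(n, size = n, replace = TRUE)

# Rows never drawn are "out-of-bag" and act as the hold-out set.
oob_idx <- setdiff(seq_len(n), boot_idx)

oob_fraction <- length(oob_idx) / n
expected_oob <- (1 - 1 / n)^n   # approaches exp(-1), roughly 0.368
print(c(observed = oob_fraction, expected = expected_oob))
```

With only 10 replicates, the resampled accuracy and kappa estimates are likely to be noisier than they would be with a larger number of replicates.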
## CART
##
## 4001 samples
## 19 predictor
## 2 classes: 'yes', 'no'
##
## No pre-processing
## Resampling: Leave-One-Out Cross-Validation
## Summary of sample sizes: 4000, 4000, 4000, 4000, 4000, 4000, ...
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa
## 0.07067138 0.8790302 0.31774690
## 0.07950530 0.8707823 0.29475395
## 0.10070671 0.8360410 -0.04038408
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.07067138.
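The three CART summaries differ only in their resampling scheme. The sketch below shows `trainControl` settings that would produce those "Resampling:" lines; the original script is not shown, so the helper `fit_cart` and its argument `train_data` are hypothetical names used for illustration.

```r
library(caret)

# Resampling schemes matching the three "Resampling:" lines in the output.
ctrl_repeated_cv <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
ctrl_bootstrap   <- trainControl(method = "boot", number = 10)
ctrl_loocv       <- trainControl(method = "LOOCV")

# Hypothetical helper: fit a CART model (rpart) that predicts churn.
# `train_data` is assumed to be a data frame with a factor column `churn`.
fit_cart <- function(train_data, ctrl) {
  train(churn ~ ., data = train_data, method = "rpart", trControl = ctrl)
}
```

A call such as `fit_cart(train_data, ctrl_repeated_cv)` would then tune `cp` over caret's default grid. Leave-one-out refits the model once per row for each candidate `cp`, so it is by far the slowest of the three.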
## randomForest 4.7-1.2
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
##
## margin
## Confusion Matrix and Statistics
##
##           Reference
## Prediction yes  no
##        yes  67   2
##        no   46 685
##
## Accuracy : 0.94
## 95% CI : (0.9212, 0.9554)
## No Information Rate : 0.8588
## P-Value [Acc > NIR] : 2.023e-13
##
## Kappa : 0.7046
##
## Mcnemar's Test P-Value : 5.417e-10
##
## Sensitivity : 0.59292
## Specificity : 0.99709
## Pos Pred Value : 0.97101
## Neg Pred Value : 0.93707
## Prevalence : 0.14125
## Detection Rate : 0.08375
## Detection Prevalence : 0.08625
## Balanced Accuracy : 0.79500
##
## 'Positive' Class : yes
##
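The accuracy, kappa, and no-information rate in the random forest report follow from the four confusion matrix counts. A minimal base R sketch of that arithmetic is shown here; the object names are illustrative.

```r
# Random forest confusion matrix (rows = prediction, columns = reference).
# matrix() fills column by column, so the first two values are the "yes" column.
cm <- matrix(c(67, 46, 2, 685), nrow = 2,
             dimnames = list(prediction = c("yes", "no"),
                             reference  = c("yes", "no")))

n <- sum(cm)
observed_acc <- sum(diag(cm)) / n                     # 752 / 800 = 0.94
expected_acc <- sum(rowSums(cm) * colSums(cm)) / n^2  # agreement expected by chance
cohen_kappa  <- (observed_acc - expected_acc) / (1 - expected_acc)
no_info_rate <- max(colSums(cm)) / n                  # always predicting the majority class

print(c(accuracy = observed_acc, kappa = cohen_kappa, nir = no_info_rate))
```

Kappa corrects accuracy for the agreement that class imbalance alone would produce, so it separates the random forest from the single tree more clearly than accuracy does.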