Objective
Predictive analysis of customer attrition using the
CreditCardData from the AER package and the classification algorithm
from the rpart package.
Data Prep
- Created a new binary variable Attrition_Flag.
- Split the data (10127 observations) into training and validation
datasets (70%/30%, 7204/2923).
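The two preparation steps can be sketched in base R as follows. This is only an illustration on made-up data: the data frame `cards`, the column `closed_account`, and the seed are hypothetical stand-ins, and the exact group sizes depend on the splitting procedure and seed that were actually used.

```r
set.seed(123)  # assumed seed, used here only so the sketch is reproducible

n <- 10127  # number of observations reported above

# Hypothetical stand-in for the real data: an ID and a made-up status column
cards <- data.frame(
  id = seq_len(n),
  closed_account = runif(n) < 0.16
)

# Binary flag stored as a two-level factor
cards$Attrition_Flag <- factor(
  ifelse(cards$closed_account, "Attrited Customer", "Existing Customer"),
  levels = c("Attrited Customer", "Existing Customer")
)

# Draw 70% of the row indices for training; the remaining rows are validation
train_idx <- sample(n, size = floor(0.7 * n))
train <- cards[train_idx, ]
validation <- cards[-train_idx, ]

split_sizes <- c(train = nrow(train), validation = nrow(validation))
print(split_sizes)
```

Sampling row indices once keeps the two sets disjoint, so no customer appears in both training and validation.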
The Tree

Two interpretations from the tree
1. Existing customer:
- Has Total_Trans_Amt >= 5265
- Has Total_Trans_Ct < 79
2. Attrited customer:
- Has Total_Ct_Chng_Q4_Q1 < 0.65
- Has Customer_Age >= 38
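A tree of this kind can be fitted with `rpart()` using `method = "class"`. The sketch below is not the model from this report: it uses a small synthetic data frame (`toy`) whose column names follow the report but whose values and attrition rule are invented, purely to show the calls involved.

```r
library(rpart)  # recommended package shipped with standard R installations

set.seed(42)
n <- 1000

# Synthetic stand-in data: names follow the report, values are made up
toy <- data.frame(
  Total_Trans_Ct  = rpois(n, 65),
  Total_Trans_Amt = round(runif(n, 500, 15000)),
  Customer_Age    = sample(26:70, n, replace = TRUE)
)

# Invented rule so the tree has a signal to find:
# customers with few transactions usually attrite
toy$Attrition_Flag <- factor(
  ifelse(toy$Total_Trans_Ct < 60 & runif(n) < 0.95,
         "Attrited Customer", "Existing Customer")
)

# Fit a classification tree on all remaining columns
fit <- rpart(Attrition_Flag ~ ., data = toy, method = "class")
print(fit)  # text listing of the splits, one line per node

# Classify a new, hypothetical customer
new_customer <- data.frame(Total_Trans_Ct = 40,
                           Total_Trans_Amt = 3000,
                           Customer_Age = 45)
predicted <- predict(fit, newdata = new_customer, type = "class")
print(predicted)
```

Each path from the root to a leaf in the printed listing reads as a rule of the same form as the two interpretations above.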
Variable Importance
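Total_Relationship_Count is perceived by many as a strong predictor of
attrition. However, the data shows that it is not a good predictor: it
ranks only sixth in importance (222.38), well behind Total_Trans_Ct
(776.69) and Total_Trans_Amt (592.39).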
##           Total_Trans_Ct          Total_Trans_Amt      Total_Revolving_Bal
##               776.690803               592.394514               439.793691
##    Avg_Utilization_Ratio      Total_Ct_Chng_Q4_Q1 Total_Relationship_Count
##               394.522619               250.915330               222.382317
##     Total_Amt_Chng_Q4_Q1             Credit_Limit             Customer_Age
##                85.742262                82.814053                41.963318
##          Avg_Open_To_Buy           Months_on_book    Contacts_Count_12_mon
##                29.870754                19.025366                 5.307008
##          Dependent_count            Card_Category           Marital_Status
##                 4.712738                 3.079317                 2.465411
##                CLIENTNUM
##                 1.067827
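To put these scores on a common scale, each one can be expressed as a share of the total. The sketch below copies the values printed above into a named vector (a fitted rpart object exposes the same kind of vector as its `variable.importance` component); the names `share` and `rank_of` are introduced here for illustration only.

```r
# Importance scores copied from the output above
importance <- c(
  Total_Trans_Ct = 776.690803, Total_Trans_Amt = 592.394514,
  Total_Revolving_Bal = 439.793691, Avg_Utilization_Ratio = 394.522619,
  Total_Ct_Chng_Q4_Q1 = 250.915330, Total_Relationship_Count = 222.382317,
  Total_Amt_Chng_Q4_Q1 = 85.742262, Credit_Limit = 82.814053,
  Customer_Age = 41.963318, Avg_Open_To_Buy = 29.870754,
  Months_on_book = 19.025366, Contacts_Count_12_mon = 5.307008,
  Dependent_count = 4.712738, Card_Category = 3.079317,
  Marital_Status = 2.465411, CLIENTNUM = 1.067827
)

# Share of the total importance carried by each variable
share <- importance / sum(importance)

# Position of a variable when sorted from most to least important
rank_of <- function(variable) {
  which(names(sort(importance, decreasing = TRUE)) == variable)
}

print(round(100 * share, 1))
print(rank_of("Total_Relationship_Count"))
```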

Model Accuracy
Using the validation data, it is observed that the model:
- Correctly predicts 81.3% of the attrited customers.
- Misclassifies existing customers as attrited 4.2% of the time.
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 2990
##
##
##                                | validation_tree$Attrition_Flag_predicted
## validation_tree$Attrition_Flag | Attrited Customer | Existing Customer |         Row Total |
## -------------------------------|-------------------|-------------------|-------------------|
##              Attrited Customer |               392 |                90 |               482 |
##                                |          1210.390 |           241.884 |                   |
##                                |             0.813 |             0.187 |             0.161 |
##                                |             0.787 |             0.036 |                   |
##                                |             0.131 |             0.030 |                   |
## -------------------------------|-------------------|-------------------|-------------------|
##              Existing Customer |               106 |              2402 |              2508 |
##                                |           232.619 |            46.486 |                   |
##                                |             0.042 |             0.958 |             0.839 |
##                                |             0.213 |             0.964 |                   |
##                                |             0.035 |             0.803 |                   |
## -------------------------------|-------------------|-------------------|-------------------|
##                   Column Total |               498 |              2492 |              2990 |
##                                |             0.167 |             0.833 |                   |
## -------------------------------|-------------------|-------------------|-------------------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 1731.379 d.f. = 1 p = 0
##
## Pearson's Chi-squared test with Yates' continuity correction
## ------------------------------------------------------------
## Chi^2 = 1725.829 d.f. = 1 p = 0
##
##
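The two rates quoted in this section can be recomputed from the four cell counts in the table. The following base R sketch does that; the object and metric names (`conf`, `sensitivity`, `false_alarm`, and so on) are chosen here for illustration and do not come from the original analysis.

```r
# Validation counts copied from the cross table above
# rows = actual class, columns = predicted class
conf <- matrix(c(392,   90,
                 106, 2402),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual    = c("Attrited", "Existing"),
                               predicted = c("Attrited", "Existing")))

# Share of attrited customers that the model catches (row proportion)
sensitivity <- conf["Attrited", "Attrited"] / sum(conf["Attrited", ])

# Share of existing customers wrongly flagged as attrited (row proportion)
false_alarm <- conf["Existing", "Attrited"] / sum(conf["Existing", ])

# Share of predicted attritions that are real (column proportion)
precision <- conf["Attrited", "Attrited"] / sum(conf[, "Attrited"])

# Share of all validation customers classified correctly
accuracy <- sum(diag(conf)) / sum(conf)

metrics <- round(c(sensitivity = sensitivity, false_alarm = false_alarm,
                   precision = precision, accuracy = accuracy), 3)
print(metrics)
# sensitivity 0.813, false_alarm 0.042, precision 0.787, accuracy 0.934

# Pearson chi-squared statistic without continuity correction
chi2 <- unname(chisq.test(conf, correct = FALSE)$statistic)
print(chi2)
```

Sensitivity and the false-alarm rate are the "N / Row Total" entries of the table, while precision is the "N / Col Total" entry of the top-left cell.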