Predicting Customer Retention

Let’s load our required packages:

The structure of the data we will use for our models:

'data.frame':   1000 obs. of  10 variables:
 $ pick      : chr  "OCC" "ATT" "OCC" "OCC" ...
 $ income    : chr  "<7.5" "45-75" "" "" ...
 $ moves     : chr  "0" "2" "0" "2" ...
 $ age       : chr  "35-44" "25-34" "" "65+" ...
 $ education : chr  "HS" "HS" "" "<HS" ...
 $ employment: chr  "F" "F" "" "R" ...
 $ usage     : int  9 2 6 7 0 0 3 1 0 2 ...
 $ nonpub    : chr  "YES" "YES" "NO" "NO" ...
 $ reachout  : chr  "NO" "NO" "NO" "NO" ...
 $ card      : chr  "NO" "NO" "YES" "NO" ...

Converting blank character fields to missing data codes
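A minimal sketch of this step, assuming the data frame is named att as in the text. The three-row data frame below is a made-up stand-in, not the study data:

```r
# Toy stand-in for the att data frame (illustrative values only)
att <- data.frame(
  pick   = c("OCC", "ATT", "OCC"),
  income = c("<7.5", "45-75", ""),
  usage  = c(9L, 2L, 6L),
  stringsAsFactors = FALSE
)

# Find the character columns, then recode empty strings as NA
is_chr <- vapply(att, is.character, logical(1))
att[is_chr] <- lapply(att[is_chr], function(x) {
  x[which(x == "")] <- NA_character_
  x
})

str(att)
```

Numeric columns such as usage are left untouched because only character columns are selected.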

Convert character fields to factor fields
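One way to do the conversion, sketched on a small made-up data frame (the real att data is not reproduced here). factor() sorts levels alphabetically by default, which is why ATT comes before OCC and why ordered categories such as income do not appear in their natural order unless levels are given explicitly:

```r
# Toy stand-in for the att data frame (illustrative values only)
att <- data.frame(
  pick  = c("OCC", "ATT", "OCC"),
  card  = c("NO", NA, "YES"),
  usage = c(9L, 2L, 6L),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor; NA values stay NA
is_chr <- vapply(att, is.character, logical(1))
att[is_chr] <- lapply(att[is_chr], factor)

levels(att$pick)
str(att)
```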

Let’s check the revised structure of the att data frame:

'data.frame':   1000 obs. of  10 variables:
 $ pick      : Factor w/ 2 levels "ATT","OCC": 2 1 2 2 2 2 2 2 2 2 ...
 $ income    : Factor w/ 7 levels "<7.5",">75","15-25",..: 1 6 NA NA NA NA 4 3 NA 4 ...
 $ moves     : Factor w/ 9 levels ">10","0","1",..: 2 4 2 4 2 2 2 2 2 2 ...
 $ age       : Factor w/ 6 levels "18-24","25-34",..: 3 2 NA 6 6 6 4 5 5 4 ...
 $ education : Factor w/ 6 levels "<HS",">BA","BA",..: 5 5 NA 1 5 NA 1 5 1 1 ...
 $ employment: Factor w/ 7 levels "D","F","H","P",..: 2 2 NA 5 3 NA 2 5 2 3 ...
 $ usage     : int  9 2 6 7 0 0 3 1 0 2 ...
 $ nonpub    : Factor w/ 2 levels "NO","YES": 2 2 1 1 1 1 1 1 1 1 ...
 $ reachout  : Factor w/ 2 levels "NO","YES": 1 1 1 1 1 1 1 1 1 1 ...
 $ card      : Factor w/ 2 levels "NO","YES": 1 1 2 1 1 1 1 2 1 1 ...

Listwise case deletion on the response, usage, and marketing factors (pick, usage, reachout, card), which leaves 981 of the 1,000 cases:

  pick         usage        reachout   card    
 ATT:502   Min.   :  0.00   NO :919   NO :701  
 OCC:479   1st Qu.:  1.00   YES: 62   YES:280  
           Median :  6.00                      
           Mean   : 16.32                      
           3rd Qu.: 23.00                      
           Max.   :291.00                      
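Listwise deletion drops any row with a missing value in the variables used by the model. The sketch below assumes the working data frame is called attwork (the name in the glm call later on) and uses a four-row made-up data frame in place of the real data:

```r
# Toy stand-in for the att data frame (illustrative values only)
att <- data.frame(
  pick     = factor(c("OCC", "ATT", "OCC", "ATT")),
  usage    = c(9L, 2L, 6L, 0L),
  reachout = factor(c("NO", NA, "NO", "YES")),
  card     = factor(c("NO", "NO", NA, "YES")),
  income   = factor(c(NA, "45-75", "<7.5", NA))
)

# Keep only the model variables, then drop rows with any NA among them.
# Missing income does not cost a row because income is not selected.
model_vars <- c("pick", "usage", "reachout", "card")
attwork <- na.omit(att[, model_vars])

nrow(attwork)
summary(attwork)
```

Restricting to the model variables before calling na.omit() matters here: income alone has 215 missing values, so deleting on all ten columns would discard far more than 19 cases.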

Provide an overview of the full data (all 1,000 cases, before deletion):

  pick         income        moves        age      education    employment      usage       
 ATT:504   15-25  :185   0      :597   18-24: 61   <HS :153   F      :548   Min.   :  0.00  
 OCC:496   25-35  :171   1      :221   25-34:214   >BA : 60   R      :215   1st Qu.:  1.00  
           7.5-15 :114   2      : 88   35-44:203   BA  :150   H      : 93   Median :  6.00  
           35-45  :107   3      : 38   45-54:152   Coll:187   P      : 67   Mean   : 16.34  
           <7.5   : 96   4      : 16   55-64:153   HS  :361   U      : 26   3rd Qu.: 23.00  
           (Other):112   (Other): 23   65+  :184   Voc : 54   (Other): 25   Max.   :291.00  
           NA's   :215   NA's   : 17   NA's : 33   NA's: 35   NA's   : 26                   
  nonpub    reachout     card    
 NO  :808   NO  :919   NO  :702  
 YES :188   YES : 62   YES :281  
 NA's:  4   NA's: 19   NA's: 17

Examine the relationship between usage and choice of service provider. Switchers tend to have lower usage.

Plotting the probability smooth for usage and switching:

Create a mosaic plot using the vcd package

Fitting our logistic regression model


Call:
glm(formula = att_spec, family = binomial, data = attwork)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.3240  -1.1878  -0.5689   1.0693   2.6588  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.338302   0.087095   3.884 0.000103 ***
usage       -0.013080   0.003362  -3.890 0.000100 ***
reachoutYES -0.869531   0.323727  -2.686 0.007231 ** 
cardYES     -0.475578   0.149350  -3.184 0.001451 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1359.4  on 980  degrees of freedom
Residual deviance: 1307.0  on 977  degrees of freedom
AIC: 1315

Number of Fisher Scoring iterations: 4
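The coefficients are on the log-odds scale. With a factor response, R's binomial glm treats the first level (ATT) as failure and models the probability of the other level (OCC), so a negative coefficient lowers the odds of choosing OCC. The sketch below exponentiates the estimates copied from the summary to get odds ratios, and converts the linear predictor to a probability for two hypothetical customers (the customer profiles are illustrative assumptions):

```r
# Estimates copied from the summary output above
coefs <- c("(Intercept)" = 0.338302,
           usage         = -0.013080,
           reachoutYES   = -0.869531,
           cardYES       = -0.475578)

# Odds ratios: multiplicative change in the odds of OCC per unit change
odds_ratios <- exp(coefs)
round(odds_ratios, 3)

# Hypothetical customer 1: zero usage, no reachout, no card
p_baseline <- plogis(coefs[["(Intercept)"]])

# Hypothetical customer 2: usage of 20, no reachout, holds a card
eta <- coefs[["(Intercept)"]] + 20 * coefs[["usage"]] + coefs[["cardYES"]]
p_card_user <- plogis(eta)

c(baseline = p_baseline, card_user = p_card_user)
```

Read this way, reachout roughly halves the odds of choosing OCC (odds ratio of about 0.42), and holding the card multiplies them by about 0.62.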

Printing the analysis of deviance tests

Analysis of Deviance Table

Model: binomial, link: logit

Response: pick

Terms added sequentially (first to last)

         Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
NULL                       980     1359.4              
usage     1   33.303       979     1326.1 7.886e-09 ***
reachout  1    8.872       978     1317.2  0.002895 ** 
card      1   10.227       977     1307.0  0.001384 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
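Each p-value in the table comes from comparing the drop in deviance with a chi-square distribution on the degrees of freedom used by the added term. The following sketch reproduces that arithmetic from the printed values and adds an overall likelihood ratio test of the full model against the null model; the overall test is an addition for illustration and is not part of the original output:

```r
# Deviance reductions copied from the table above (1 df each)
dev_drop <- c(usage = 33.303, reachout = 8.872, card = 10.227)

# Upper-tail chi-square probabilities give the sequential p-values
p_values <- pchisq(dev_drop, df = 1, lower.tail = FALSE)
signif(p_values, 4)

# Overall test: null deviance minus residual deviance on 980 - 977 = 3 df
overall_drop <- 1359.4 - 1307.0
overall_p <- pchisq(overall_drop, df = 980 - 977, lower.tail = FALSE)
overall_p
```

Because the terms are added sequentially, these tests depend on the order usage, reachout, card; a different order would generally give different deviance reductions for each term.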

Compute predicted probability of switching service providers
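A sketch of how the predicted probabilities might be computed, using simulated data in place of attwork. The formula matches the terms in the output above, and att_spec and Predict_Pick are names that appear in the output; toy, att_fit, and Prob_Switching are hypothetical names, and the simulated values are not the study data:

```r
set.seed(1)
n <- 200

# Simulated stand-in for attwork (illustrative only)
toy <- data.frame(
  usage    = rpois(n, lambda = 15),
  reachout = factor(sample(c("NO", "YES"), n, replace = TRUE,
                           prob = c(0.9, 0.1)), levels = c("NO", "YES")),
  card     = factor(sample(c("NO", "YES"), n, replace = TRUE,
                           prob = c(0.7, 0.3)), levels = c("NO", "YES"))
)
eta <- 0.3 - 0.01 * toy$usage
toy$pick <- factor(ifelse(runif(n) < plogis(eta), "OCC", "ATT"),
                   levels = c("ATT", "OCC"))

# Fit the logistic regression; the second factor level (OCC) is the event
att_spec <- pick ~ usage + reachout + card
att_fit  <- glm(att_spec, family = binomial, data = toy)

# type = "response" returns probabilities rather than log-odds
toy$Prob_Switching <- predict(att_fit, type = "response")

# Classify with a 50 percent cut-off
toy$Predict_Pick <- factor(ifelse(toy$Prob_Switching > 0.5, "OCC", "ATT"),
                           levels = c("ATT", "OCC"))
table(toy$Predict_Pick, toy$pick)
```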

Plotting Predicted Probability of Switching


Confusion Matrix (rows = Predicted Service Provider, columns = Actual Service Provider)
      
       ATT OCC
  AT&T 250 159
  OCC  252 320

Percent Accuracy:  58.1
            pick
Predict_Pick ATT OCC <NA>
        AT&T 250 159    0
        OCC  252 320    0
        <NA>   0   0    0
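The accuracy figure can be checked directly from the counts in the confusion matrix. This sketch rebuilds the matrix by hand and also reports the share of each actual group that was classified correctly; the per-class figures are an illustrative addition:

```r
# Counts copied from the confusion matrix above
# (matrix() fills column by column, so each column is one actual class)
confusion <- matrix(c(250, 252, 159, 320), nrow = 2,
                    dimnames = list(Predicted = c("ATT", "OCC"),
                                    Actual    = c("ATT", "OCC")))

# Overall accuracy: correctly classified cases over all cases
accuracy <- sum(diag(confusion)) / sum(confusion)
round(100 * accuracy, 1)

# Share of each actual class that was predicted correctly
per_class <- diag(confusion) / colSums(confusion)
round(100 * per_class, 1)
```

The model identifies actual OCC customers more reliably than actual ATT customers, about half of whom are misclassified at the 50 percent cut-off.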

Plot predicted Service Provider (50 percent cut-off)

Plotting the classification tree result from rpart


Visualizing the random forest model fit to the training data

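Checking that both partitions were created with complete data. The three tables below give the provider counts for the full working data (981 cases) and for the two partitions (328 and 653 cases):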


ATT OCC 
502 479 

ATT OCC 
168 160 

ATT OCC 
334 319 
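A random split of this kind could be produced as follows. The two-thirds training fraction is an assumption suggested by the counts above (653 and 328 of 981), and the seed, the names train and test, and the simulated data are all hypothetical:

```r
set.seed(1)
n <- 981

# Simulated stand-in for the working data (illustrative only)
toy <- data.frame(
  id   = seq_len(n),
  pick = factor(sample(c("ATT", "OCC"), n, replace = TRUE))
)

# Draw about two-thirds of the row indices for training; the rest is test
train_idx <- sample(seq_len(n), size = round(2 / 3 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Check the partitions: class counts, no overlap, no missing values
table(toy$pick)
table(test$pick)
table(train$pick)
```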

Plotting ROC for logistic regression

Plotting ROC for support vector machines

Plotting ROC for random forest

Plotting ROC for Naive Bayes

Plotting ROC for Neural Network
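Each ROC curve plots the true positive rate against the false positive rate as the classification cut-off moves from strict to lenient. The sketch below builds a curve in base R from simulated scores rather than from any of the fitted models above, and computes the area under the curve two ways; roc_points is a hypothetical helper, and the original plots may have been drawn with a dedicated package:

```r
set.seed(1)
n <- 300
actual <- rbinom(n, 1, 0.5)  # 1 = OCC, 0 = ATT (toy labels)
score  <- plogis(rnorm(n, mean = ifelse(actual == 1, 0.5, -0.5)))

# True and false positive rates at every distinct cut-off
roc_points <- function(score, actual) {
  thresholds <- sort(unique(c(-Inf, score, Inf)), decreasing = TRUE)
  tpr <- vapply(thresholds,
                function(t) mean(score[actual == 1] >= t), numeric(1))
  fpr <- vapply(thresholds,
                function(t) mean(score[actual == 0] >= t), numeric(1))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}
roc <- roc_points(score, actual)

# Area under the curve by the trapezoid rule
auc_trap <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)

# Same area from the rank-sum (Mann-Whitney) identity
r  <- rank(score)
n1 <- sum(actual == 1)
n0 <- sum(actual == 0)
auc_rank <- (sum(r[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

c(trapezoid = auc_trap, rank = auc_rank)

# Draw the curve only in an interactive session
if (interactive()) {
  plot(roc$fpr, roc$tpr, type = "l",
       xlab = "False positive rate", ylab = "True positive rate")
  abline(0, 1, lty = 2)  # reference line for a random classifier
}
```

Computing the same curve for each model on the same test partition puts logistic regression, support vector machines, random forest, Naive Bayes, and the neural network on a common scale for comparison.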

