Visualizations

Exploratory Analysis

Missing Values in Attributes

Churn Percentage

Logistic Regression

Train Model Dataset Summary


Call:
glm(formula = Churn ~ ., family = "binomial", data = train)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.1180  -0.6679  -0.2652   0.6420   3.3752  

Coefficients: (6 not defined because of singularities)
                                        Estimate Std. Error z value Pr(>|z|)
(Intercept)                            -5.115592   1.564618  -3.270  0.00108 **
gender                                  0.011379   0.079005   0.144  0.88548
SeniorCitizen                           0.238729   0.102858   2.321  0.02029 *
Partner                                 0.069621   0.094717   0.735  0.46231
Dependents                             -0.192751   0.109240  -1.764  0.07765 .
PhoneService                            0.811077   0.777516   1.043  0.29687
MultipleLines                           0.672549   0.212669   3.162  0.00156 **
InternetService.xFiber.optic            2.300187   0.954902   2.409  0.01600 *
InternetService.xNo                    -2.356390   0.965796  -2.440  0.01469 *
OnlineSecurity.xNo.internet.service           NA         NA      NA       NA
OnlineSecurity.xYes                    -0.034687   0.216036  -0.161  0.87244
OnlineBackup.xNo.internet.service             NA         NA      NA       NA
OnlineBackup.xYes                       0.138864   0.211080   0.658  0.51062
DeviceProtection.xNo.internet.service         NA         NA      NA       NA
DeviceProtection.xYes                   0.243534   0.213608   1.140  0.25425
TechSupport.xNo.internet.service              NA         NA      NA       NA
TechSupport.xYes                       -0.082466   0.218595  -0.377  0.70599
StreamingTV.xNo.internet.service              NA         NA      NA       NA
StreamingTV.xYes                        0.885013   0.392658   2.254  0.02420 *
StreamingMovies.xNo.internet.service          NA         NA      NA       NA
StreamingMovies.xYes                    0.878475   0.392274   2.239  0.02513 *
Contract.xOne.year                     -0.767124   0.134757  -5.693 1.25e-08 ***
Contract.xTwo.year                     -1.475999   0.219192  -6.734 1.65e-11 ***
PaperlessBilling                        0.360428   0.090683   3.975 7.05e-05 ***
PaymentMethod.xCredit.card..automatic. -0.009159   0.138118  -0.066  0.94713
PaymentMethod.xElectronic.check         0.346067   0.115756   2.990  0.00279 **
PaymentMethod.xMailed.check            -0.065202   0.141331  -0.461  0.64455
tenure_year.x1.2.years                  0.216034   0.189237   1.142  0.25362
tenure_year.x2.3.years                  0.865652   0.314879   2.749  0.00597 **
tenure_year.x3.4.years                  1.919536   0.451151   4.255 2.09e-05 ***
tenure_year.x4.5.years                  2.483840   0.583321   4.258 2.06e-05 ***
tenure_year.x5.6.years                  3.292625   0.726888   4.530 5.91e-06 ***
tenure                                 -2.377305   0.320891  -7.408 1.28e-13 ***
MonthlyCharges                         -1.827483   1.143597  -1.598  0.11004
TotalCharges                            0.317981   0.199783   1.592  0.11147
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 5704.4  on 4929  degrees of freedom
Residual deviance: 3965.6  on 4901  degrees of freedom
AIC: 4023.6

Number of Fisher Scoring iterations: 6
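
The summary above reports coefficients on the log-odds scale, which are easier to read as odds ratios. The following is a minimal sketch in Python (not the report's own R code) that exponentiates a few estimates copied from the table and checks the deviance figures. It assumes the response is ungrouped 0/1 data, in which case AIC equals the residual deviance plus twice the number of estimated parameters. The variable names are illustrative only.

```python
import math

# Selected estimates copied from the coefficient table above (log-odds scale).
coefficients = {
    "Contract.xOne.year": -0.767124,
    "Contract.xTwo.year": -1.475999,
    "PaperlessBilling": 0.360428,
    "PaymentMethod.xElectronic.check": 0.346067,
}

# exp(estimate) is the multiplicative change in the odds of churn,
# holding the other predictors fixed.
for name, estimate in coefficients.items():
    print(f"{name:35s} odds ratio = {math.exp(estimate):.3f}")

# Deviance figures copied from the summary above.
null_deviance = 5704.4
residual_deviance = 3965.6
null_df = 4929
residual_df = 4901

# Estimated parameters: the slopes (difference in df) plus the intercept.
n_params = (null_df - residual_df) + 1

# ASSUMPTION: ungrouped 0/1 response, so deviance = -2 * log-likelihood
# and AIC = residual deviance + 2 * (number of parameters).
aic = residual_deviance + 2 * n_params

# McFadden's pseudo R-squared under the same assumption.
mcfadden_r2 = 1 - residual_deviance / null_deviance

print(f"parameters = {n_params}, AIC = {aic:.1f}, McFadden R^2 = {mcfadden_r2:.3f}")
```

An odds ratio below 1 (as for the one-year and two-year contract terms) means lower odds of churn relative to the reference level. The recomputed AIC can be compared with the value printed in the summary as a consistency check.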

Observations Train Dataset

************************************************************  
                   Observations on the Train set    
 ***********************************************************  

                Churn Baseline Accuracy (class proportions)
                       0            1                       
                  0.7346856     0.2653144 

                    confusion matrix (rows: actual, columns: predicted)
                       1       0                       
                 1    720     588 
                 0    349     3273 

 Accuracy                                       :  0.8099391 
 Misclassification rate or Error Rate           :  0.1900609 
 Sensitivity (TPR)                              :  0.5504587 
 Specificity (TNR)                              :  0.9036444 
 False Negative Rate (FNR)                      :  0.4495413 
 False positive rate (FPR)                      :  0.0963556 
 Negative predictive value (NPV)                :  0.8477078 
 Positive predictive value / Precision (PPV)    :  0.6735267 
 Positive DLR (Diagnostic likelihood ratio)     :  5.712784 
 Negative DLR (Diagnostic likelihood ratio)     :  0.4974759 
 F-Score                                        :  0.6058056 
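
Every metric in the listing above derives from the four cells of the confusion matrix. The following is a minimal sketch in Python (not the report's own R code) of those definitions. It assumes rows are actual classes and columns are predicted classes, a reading consistent with the row totals matching the baseline proportions (1308 / 4930 ≈ 0.265). The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return common binary-classification metrics from the four cell counts."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fnr": 1 - sensitivity,
        "fpr": 1 - specificity,
        "ppv": precision,
        "npv": npv,
        # Likelihood ratios: LR+ = TPR / FPR, LR- = FNR / TNR.
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Training-set counts, ASSUMING rows = actual class and columns = predicted class.
train = confusion_metrics(tp=720, fn=588, fp=349, tn=3273)
for name, value in train.items():
    print(f"{name:12s} {value:.7f}")
```

Accuracy and F-score do not depend on which axis holds the actual classes, but sensitivity and precision swap if the matrix is read the other way, so the orientation is worth stating next to any such table. The same function applies to the test-set counts, for example `confusion_metrics(tp=294, fn=267, fp=152, tn=1400)`.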
 F-Score                                        :  0.6058056 

Observations Test Dataset

************************************************************  
                   Observations on the Test set    
 ***********************************************************  

                    confusion matrix (rows: actual, columns: predicted)
                       1       0                       
                 1    294     267 
                 0    152     1400 

 Accuracy                                       :  0.8017037 
 Misclassification rate or Error Rate           :  0.1982963 
 Sensitivity (TPR)                              :  0.5240642 
 Specificity (TNR)                              :  0.9020619 
 False Negative Rate (FNR)                      :  0.4759358 
 False positive rate (FPR)                      :  0.0979381 
 Negative predictive value (NPV)                :  0.839832 
 Positive predictive value / Precision (PPV)    :  0.6591928 
 Positive DLR (Diagnostic likelihood ratio)     :  5.350971 
 Negative DLR (Diagnostic likelihood ratio)     :  0.5276089 
 F-Score                                        :  0.5839126 

Decision Tree

Conclusion

Summary

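Accuracy is the model's strongest metric: about 0.81 on the training set and 0.80 on the test set, against a baseline of 0.73 from always predicting no churn. That is a good sign for a first attempt. Both the train and test ROC curves hug the upper-left corner and show high area-under-the-curve values. With such strong models ...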


------------------------------------------------------------------  
    Area Under the Receiver Operating Characteristic (ROC) Curve   
 ------------------------------------------------------------------  
 Area Under the Curve (AUC)                :  0.8572307 
 ------------------------------------------------------------------  




    Our model has an AUC of about 0.86, which indicates good discrimination between customers who churn and those who do not. A model that guessed at random would produce an ROC curve along the 45-degree diagonal, which corresponds to an AUC of 0.5.
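
To illustrate why random guessing gives an AUC near 0.5, here is a small sketch in Python using synthetic data, not the report's data. It computes AUC through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The function name `auc`, the sample size, and the seed are all illustrative assumptions.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half. This pairwise form favors clarity over speed.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


rng = random.Random(0)  # fixed seed so the run is repeatable

# SYNTHETIC labels with roughly the churn rate reported above (about 27%).
labels = [1 if rng.random() < 0.27 else 0 for _ in range(2000)]

# Scores unrelated to the labels: a classifier that guesses at random.
random_scores = [rng.random() for _ in labels]

# Scores that separate the two classes perfectly.
perfect_scores = [float(y) for y in labels]

print(f"random guessing: AUC = {auc(labels, random_scores):.3f}")
print(f"perfect ranking: AUC = {auc(labels, perfect_scores):.3f}")
```

The random scores land close to 0.5 and the perfect scores give exactly 1.0, which brackets the value reported for the model.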