1. Introduction

Churn refers to the loss of customers from a company's contractual customer base. It is an important measure for any business with a subscription-based service model, including mobile telephone networks. In this paper, churn records which customers ended their subscriptions with the company in the previous month.

Churn matters to a company because it reflects the company's service quality and customer satisfaction. It also indicates the strength of the company's position in the market.

This paper investigates the factors that govern churn in a telecommunications company and analyses their effects on churn.

2. Overview of the Study

Our study concerns churn in a telecommunications company and how it is affected by factors such as gender, senior citizenship, partnership status, dependents, the services subscribed to, contract type, and monthly and total charges.

Our analysis of churn sheds light on how companies can hold their place in the market and how they can benefit from understanding the factors affecting churn.

3. An Empirical Field Study of Churn in a Telecommunications Company

3.1 Overview

Our specific objective is to determine what factors affect the churn in a company most and how they affect it.

This matters because no company wants its customer base to be in flux. Companies seek a dominant market position, which is possible only if they keep churn low.

This paper provides insight into how churn is affected by different variables.

Hypothesis H1: The churn in the company depends on the gender of its customers, their age range, their partnership status, whether they have dependents, the services they have subscribed to, their contract types, and their total charges.

3.2 Data

For this study, we obtained data from the IBM training website. The data can be downloaded from (https://community.watsonanalytics.com/wp-content/uploads/2015/03/WA_Fn-UseC_-Telco-Customer-Churn.csv?cm_mc_uid=60666461180615162966768&cm_mc_sid_50200000=1517419332&cm_mc_sid_52640000=1517419332). The dataset contains information about the customers, including their customer ID, gender, age range, partnership status, dependents, the services they have subscribed to, their contract types, their payment methods and their total charges.

3.3 Model

In order to test Hypothesis H1, we proposed the following model:

Model: \[\log\frac{P(Churn)}{1 - P(Churn)} = \beta_0 + \beta_1TotalCharges + \beta_2tenure + \beta_3gender + \beta_4PhoneService + \beta_5SeniorCitizen + \beta_6MultipleLines + \beta_7InternetService + \beta_8Contract\]

The model must be estimated with logistic regression because the dependent variable, Churn, is a two-level factor (Yes/No).
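
For reference, the logistic link can be written out as follows. Here \eta is a symbol introduced only for illustration; it denotes the linear combination of predictors on the right-hand side of the model, and x_j is any single predictor with coefficient \beta_j.

\[P(Churn = Yes) = \frac{1}{1 + e^{-\eta}}, \qquad \frac{P(Churn = Yes)}{1 - P(Churn = Yes)} = e^{\eta}\]

Under this form, a one-unit increase in x_j multiplies the odds of churn by e^{\beta_j}, so a negative coefficient lowers the odds and a positive coefficient raises them.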

# Read the data; stringsAsFactors = TRUE keeps Churn and the other text columns as factors
ibm.df <- read.csv("IBM.csv", stringsAsFactors = TRUE)
# Split the dataset into training and test subsets
train <- ibm.df[1:6950, ]
test <- ibm.df[6951:7043, ]
# Logistic regression of Churn on the selected predictors
model <- glm(Churn ~ TotalCharges + tenure + gender + PhoneService + SeniorCitizen +
               MultipleLines + InternetService + Contract,
             data = train, family = binomial)
summary(model)
## 
## Call:
## glm(formula = Churn ~ TotalCharges + tenure + gender + PhoneService + 
##     SeniorCitizen + MultipleLines + InternetService + Contract, 
##     family = binomial, data = train)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7307  -0.7093  -0.3050   0.8173   3.5096  
## 
## Coefficients:
##                              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 3.483e-01  1.234e-01   2.821 0.004786 ** 
## TotalCharges                3.120e-04  6.316e-05   4.939 7.86e-07 ***
## tenure                     -6.203e-02  5.885e-03 -10.540  < 2e-16 ***
## genderMale                 -1.477e-02  6.417e-02  -0.230 0.817903    
## PhoneServiceYes            -7.805e-01  1.295e-01  -6.026 1.68e-09 ***
## SeniorCitizenYes            3.299e-01  8.173e-02   4.036 5.43e-05 ***
## MultipleLinesYes            3.037e-01  7.862e-02   3.863 0.000112 ***
## InternetServiceFiber optic  1.083e+00  9.291e-02  11.653  < 2e-16 ***
## InternetServiceNo          -7.178e-01  1.277e-01  -5.619 1.92e-08 ***
## ContractOne year           -8.096e-01  1.050e-01  -7.708 1.28e-14 ***
## ContractTwo year           -1.675e+00  1.727e-01  -9.700  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 8029.0  on 6938  degrees of freedom
## Residual deviance: 5888.3  on 6928  degrees of freedom
##   (11 observations deleted due to missingness)
## AIC: 5910.3
## 
## Number of Fisher Scoring iterations: 6

We estimated the effect of total charges, phone service, internet service, etc. on churn with the simplest model, using logistic regression.
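
A model fitted on a training subset can be checked against held-out rows. The following is a minimal sketch of that step, not the analysis performed in this paper: it runs on simulated data (sim.df and every coefficient used to generate it are hypothetical), it uses a random 80/20 split rather than the first-6950-rows split above, and the 0.5 classification threshold is an assumption.

```r
# Simulated stand-in for the churn data (hypothetical values, not the IBM dataset).
set.seed(42)
n <- 1000
sim.df <- data.frame(
  tenure   = sample(0:72, n, replace = TRUE),
  Contract = factor(sample(c("Month-to-month", "One year", "Two year"),
                           n, replace = TRUE))
)
# Assumed data-generating process: longer tenure and longer contracts lower churn.
eta <- 0.5 - 0.05 * sim.df$tenure - 0.8 * (as.integer(sim.df$Contract) - 1)
sim.df$Churn <- factor(ifelse(runif(n) < plogis(eta), "Yes", "No"),
                       levels = c("No", "Yes"))

# Random split, so that row order cannot bias the training and test subsets.
train.idx <- sample(seq_len(n), size = 0.8 * n)
sim.train <- sim.df[train.idx, ]
sim.test  <- sim.df[-train.idx, ]

sim.model <- glm(Churn ~ tenure + Contract, data = sim.train, family = binomial)

# Predicted churn probabilities for the held-out rows, thresholded at 0.5.
p.hat <- predict(sim.model, newdata = sim.test, type = "response")
pred  <- factor(ifelse(p.hat > 0.5, "Yes", "No"), levels = c("No", "Yes"))

# Confusion matrix and overall accuracy on the held-out rows.
confusion <- table(Predicted = pred, Actual = sim.test$Churn)
accuracy  <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

With the real data, the same predict() call could be applied to the test data frame defined earlier; rows with missing TotalCharges would yield NA predictions and need to be handled first.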

3.4 Results

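We found empirical support for H1 for most predictors. The odds of churn decrease as tenure increases (coefficient -0.062 per month). Similarly, the odds of churn decrease with longer contracts: relative to month-to-month contracts, the coefficients are -0.810 for one-year and -1.675 for two-year contracts. A phone service subscription also lowers the odds of churn (-0.781), whereas fiber optic internet service raises them (1.083). Gender is the only predictor that is not statistically significant (p = 0.82). The remaining effects can be read from the regression output in the same way.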
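
Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios that are easier to interpret. The sketch below does this for four estimates copied from the regression summary in Section 3.3; the vector name log.odds is introduced here for illustration only.

```r
# Coefficient estimates copied from the regression summary (log-odds scale).
log.odds <- c(
  "tenure"           = -6.203e-02,
  "PhoneServiceYes"  = -7.805e-01,
  "ContractOne year" = -8.096e-01,
  "ContractTwo year" = -1.675e+00
)

# exp() of a logit coefficient is the multiplicative change in the odds of churn
# for a one-unit increase (or, for a factor level, relative to the baseline level).
odds.ratios <- exp(log.odds)
print(round(odds.ratios, 3))
```

An odds ratio below 1 means lower odds of churn. With the fitted model object available, exp(coef(model)) gives the same quantities for every term at once.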

4. Conclusion

This paper was motivated by the need for research that improves our understanding of how a company's churn is influenced by customer-related factors. We observe that the odds of churn can be reduced by moving customers onto longer contracts and by signing customers up for phone service. It is also important for the company to keep its charges low, as higher total charges are associated with higher odds of churn.

Appendix 1

Descriptive statistics

# Summarize the Data
library(psych)
describe(ibm.df)
                  vars    n    mean      sd  median trimmed     mad   min
customerID*          1 7043 3522.00 2033.28 3522.00 3522.00 2610.86  1.00
gender*              2 7043    1.50    0.50    2.00    1.51    0.00  1.00
SeniorCitizen*       3 7043    1.16    0.37    1.00    1.08    0.00  1.00
Partner*             4 7043    1.48    0.50    1.00    1.48    0.00  1.00
Dependents*          5 7043    1.30    0.46    1.00    1.25    0.00  1.00
tenure               6 7043   32.37   24.56   29.00   31.43   32.62  0.00
PhoneService*        7 7043    1.90    0.30    2.00    2.00    0.00  1.00
MultipleLines*       8 7043    1.42    0.49    1.00    1.40    0.00  1.00
InternetService*     9 7043    1.87    0.74    2.00    1.84    1.48  1.00
OnlineSecurity*     10 7043    1.29    0.45    1.00    1.23    0.00  1.00
OnlineBackup*       11 7043    1.34    0.48    1.00    1.31    0.00  1.00
DeviceProtection*   12 7043    1.34    0.48    1.00    1.30    0.00  1.00
TechSupport*        13 7043    1.29    0.45    1.00    1.24    0.00  1.00
StreamingTV*        14 7043    1.38    0.49    1.00    1.36    0.00  1.00
StreamingMovies*    15 7043    1.39    0.49    1.00    1.36    0.00  1.00
Contract*           16 7043    1.69    0.83    1.00    1.61    0.00  1.00
PaperlessBilling*   17 7043    1.59    0.49    2.00    1.62    0.00  1.00
PaymentMethod*      18 7043    2.57    1.07    3.00    2.59    1.48  1.00
MonthlyCharges      19 7043   64.76   30.09   70.35   64.97   35.66 18.25
TotalCharges        20 7032 2283.30 2266.77 1397.47 1970.14 1812.92 18.80
Churn*              21 7043    1.27    0.44    1.00    1.21    0.00  1.00
                      max  range  skew kurtosis    se
customerID*       7043.00 7042.0  0.00    -1.20 24.23
gender*              2.00    1.0 -0.02    -2.00  0.01
SeniorCitizen*       2.00    1.0  1.83     1.36  0.00
Partner*             2.00    1.0  0.07    -2.00  0.01
Dependents*          2.00    1.0  0.87    -1.23  0.01
tenure              72.00   72.0  0.24    -1.39  0.29
PhoneService*        2.00    1.0 -2.73     5.43  0.00
MultipleLines*       2.00    1.0  0.32    -1.90  0.01
InternetService*     3.00    2.0  0.21    -1.15  0.01
OnlineSecurity*      2.00    1.0  0.94    -1.11  0.01
OnlineBackup*        2.00    1.0  0.65    -1.57  0.01
DeviceProtection*    2.00    1.0  0.66    -1.57  0.01
TechSupport*         2.00    1.0  0.92    -1.15  0.01
StreamingTV*         2.00    1.0  0.48    -1.77  0.01
StreamingMovies*     2.00    1.0  0.46    -1.79  0.01
Contract*            3.00    2.0  0.63    -1.27  0.01
PaperlessBilling*    2.00    1.0 -0.38    -1.86  0.01
PaymentMethod*       4.00    3.0 -0.17    -1.21  0.01
MonthlyCharges     118.75  100.5 -0.22    -1.26  0.36
TotalCharges      8684.80 8666.0  0.96    -0.23 27.03
Churn*               2.00    1.0  1.06    -0.87  0.01

Distribution of gender based on churn

gender2<-xtabs(~ibm.df$Churn+ibm.df$gender)
gender2
##             ibm.df$gender
## ibm.df$Churn Female Male
##          No    2549 2625
##          Yes    939  930

Distribution of Senior Citizenship based on churn

sc2<-xtabs(~ibm.df$Churn+ibm.df$SeniorCitizen)
sc2
##             ibm.df$SeniorCitizen
## ibm.df$Churn   No  Yes
##          No  4508  666
##          Yes 1393  476
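
Raw counts are easier to compare once converted to within-group shares. The sketch below re-enters the counts from the senior-citizenship table above by hand (sc.counts is a name introduced for illustration) and normalises each column.

```r
# Counts copied from the cross-tabulation above (rows: Churn, columns: SeniorCitizen).
# matrix() fills column by column, so each pair of values is one SeniorCitizen group.
sc.counts <- matrix(c(4508, 1393, 666, 476), nrow = 2,
                    dimnames = list(Churn = c("No", "Yes"),
                                    SeniorCitizen = c("No", "Yes")))

# margin = 2 normalises each column, giving the churn share within each group.
sc.rates <- prop.table(sc.counts, margin = 2)
print(round(sc.rates, 3))
```

On these counts, roughly 42% of senior citizens churned compared with roughly 24% of other customers. With the data frame loaded, prop.table(sc2, margin = 2) should give the same shares without retyping the counts.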

Variation of tenure

boxplot(ibm.df$tenure,horizontal = TRUE, main="Tenure of Subscribers",xlab="Months",col="grey")

hist(ibm.df$tenure,breaks = 30,main = "Frequency of Tenure Months",xlab = "Tenure",col="grey")

Average Tenure of all the Subscribers w.r.t churn

aggregate(tenure~Churn,data = ibm.df,mean)
##   Churn   tenure
## 1    No 37.56997
## 2   Yes 17.97913

Dependency of Churn on certain subscriptions

xtabs(~ibm.df$Churn+ibm.df$PhoneService)
##             ibm.df$PhoneService
## ibm.df$Churn   No  Yes
##          No   512 4662
##          Yes  170 1699
xtabs(~ibm.df$Churn+ibm.df$MultipleLines)
##             ibm.df$MultipleLines
## ibm.df$Churn   No  Yes
##          No  3053 2121
##          Yes 1019  850
xtabs(~ibm.df$Churn+ibm.df$InternetService)
##             ibm.df$InternetService
## ibm.df$Churn  DSL Fiber optic   No
##          No  1962        1799 1413
##          Yes  459        1297  113

Dependency of Churn on Contracts

xtabs(~ibm.df$Churn+ibm.df$Contract)
##             ibm.df$Contract
## ibm.df$Churn Month-to-month One year Two year
##          No            2220     1307     1647
##          Yes           1655      166       48
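
One way to check whether the association in this table could be due to chance is a chi-squared test of independence. This test is not part of the analysis reported in this paper; the sketch below is an illustrative addition that re-enters the counts from the table above by hand (contract.counts is a name introduced here).

```r
# Counts copied from the cross-tabulation above (rows: Churn, columns: Contract).
contract.counts <- matrix(c(2220, 1655, 1307, 166, 1647, 48), nrow = 2,
                          dimnames = list(Churn = c("No", "Yes"),
                                          Contract = c("Month-to-month",
                                                       "One year",
                                                       "Two year")))

# Pearson chi-squared test of independence between churn and contract type.
contract.test <- chisq.test(contract.counts)
print(contract.test)
```

A small p-value indicates that churn and contract type are unlikely to be independent. It does not by itself show that longer contracts cause lower churn.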

Variation of Total Charges

boxplot(ibm.df$TotalCharges,horizontal = TRUE, main="Total Charges of Subscribers",xlab="Amount",col="grey")

hist(ibm.df$TotalCharges,breaks = 30,main = "Frequency of Total Charges ",xlab = "Amount",col="grey")

Average Total Charge of all the Subscribers w.r.t churn

aggregate(TotalCharges~Churn,data = ibm.df,mean)
##   Churn TotalCharges
## 1    No     2555.344
## 2   Yes     1531.796

Coefficient Plot

library(coefplot)
coefplot(model, intercept=FALSE)

Appendix 2

Variation of monthly charges with tenure and Churn

# scatterplot() comes from the car package
library(car)
scatterplot(MonthlyCharges~tenure|Churn,data = ibm.df,cex=0.5)

Variation of Total charges with tenure and Churn

scatterplot(TotalCharges~tenure|Churn,data = ibm.df,cex=0.5)

How Tenure varies with Contracts.

aggregate(tenure~Contract,data = ibm.df,mean)
##         Contract   tenure
## 1 Month-to-month 18.03665
## 2       One year 42.04481
## 3       Two year 56.73510
boxplot(tenure~Contract,data = ibm.df,horizontal=TRUE,col="grey",main="Variation of Tenure with Contracts",xlab="Tenure")
