Set working directory

Read my dataset

telco_cust_churn <- read.csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")

Introduction

Customer churn is a major challenge in today’s competitive environment, and data analytics is critical for firms that want to understand and reduce it. In this exploratory study, I examine a dataset of 7,043 observations and 21 variables covering customer attributes, service subscriptions, and demographic information to gain insight into churn behavior. Using visualization, I aim to provide practical recommendations for startups looking to improve customer retention and operational efficiency. The objectives are:

  1. Analyse the distribution of churn within the dataset to determine the frequency and degree of customer attrition.

  2. Investigate the relationship between churn and key demographic characteristics such as gender, senior-citizen status, and household composition (partner and dependents).

  3. Investigate how contract type, payment method, and service subscriptions affect churn rates.

  4. Visualize the monthly and total charges for churned and retained clients to find viable customer retention pricing options.

  5. Provide enterprises with meaningful data and ideas for improving client retention and optimizing profitability.

Loading necessary libraries

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Exploring the data

head(telco_cust_churn)
##   customerID gender SeniorCitizen Partner Dependents tenure PhoneService
## 1 7590-VHVEG Female             0     Yes         No      1           No
## 2 5575-GNVDE   Male             0      No         No     34          Yes
## 3 3668-QPYBK   Male             0      No         No      2          Yes
## 4 7795-CFOCW   Male             0      No         No     45           No
## 5 9237-HQITU Female             0      No         No      2          Yes
## 6 9305-CDSKC Female             0      No         No      8          Yes
##      MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection
## 1 No phone service             DSL             No          Yes               No
## 2               No             DSL            Yes           No              Yes
## 3               No             DSL            Yes          Yes               No
## 4 No phone service             DSL            Yes           No              Yes
## 5               No     Fiber optic             No           No               No
## 6              Yes     Fiber optic             No           No              Yes
##   TechSupport StreamingTV StreamingMovies       Contract PaperlessBilling
## 1          No          No              No Month-to-month              Yes
## 2          No          No              No       One year               No
## 3          No          No              No Month-to-month              Yes
## 4         Yes          No              No       One year               No
## 5          No          No              No Month-to-month              Yes
## 6          No         Yes             Yes Month-to-month              Yes
##               PaymentMethod MonthlyCharges TotalCharges Churn
## 1          Electronic check          29.85        29.85    No
## 2              Mailed check          56.95      1889.50    No
## 3              Mailed check          53.85       108.15   Yes
## 4 Bank transfer (automatic)          42.30      1840.75    No
## 5          Electronic check          70.70       151.65   Yes
## 6          Electronic check          99.65       820.50   Yes
summary(telco_cust_churn)
##   customerID           gender          SeniorCitizen      Partner         
##  Length:7043        Length:7043        Min.   :0.0000   Length:7043       
##  Class :character   Class :character   1st Qu.:0.0000   Class :character  
##  Mode  :character   Mode  :character   Median :0.0000   Mode  :character  
##                                        Mean   :0.1621                     
##                                        3rd Qu.:0.0000                     
##                                        Max.   :1.0000                     
##                                                                           
##   Dependents            tenure      PhoneService       MultipleLines     
##  Length:7043        Min.   : 0.00   Length:7043        Length:7043       
##  Class :character   1st Qu.: 9.00   Class :character   Class :character  
##  Mode  :character   Median :29.00   Mode  :character   Mode  :character  
##                     Mean   :32.37                                        
##                     3rd Qu.:55.00                                        
##                     Max.   :72.00                                        
##                                                                          
##  InternetService    OnlineSecurity     OnlineBackup       DeviceProtection  
##  Length:7043        Length:7043        Length:7043        Length:7043       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  TechSupport        StreamingTV        StreamingMovies      Contract        
##  Length:7043        Length:7043        Length:7043        Length:7043       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  PaperlessBilling   PaymentMethod      MonthlyCharges    TotalCharges   
##  Length:7043        Length:7043        Min.   : 18.25   Min.   :  18.8  
##  Class :character   Class :character   1st Qu.: 35.50   1st Qu.: 401.4  
##  Mode  :character   Mode  :character   Median : 70.35   Median :1397.5  
##                                        Mean   : 64.76   Mean   :2283.3  
##                                        3rd Qu.: 89.85   3rd Qu.:3794.7  
##                                        Max.   :118.75   Max.   :8684.8  
##                                                         NA's   :11      
##     Churn          
##  Length:7043       
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
colSums(is.na(telco_cust_churn))
##       customerID           gender    SeniorCitizen          Partner 
##                0                0                0                0 
##       Dependents           tenure     PhoneService    MultipleLines 
##                0                0                0                0 
##  InternetService   OnlineSecurity     OnlineBackup DeviceProtection 
##                0                0                0                0 
##      TechSupport      StreamingTV  StreamingMovies         Contract 
##                0                0                0                0 
## PaperlessBilling    PaymentMethod   MonthlyCharges     TotalCharges 
##                0                0                0               11 
##            Churn 
##                0

Count and remove NA in the dataset

na_count <- sum(is.na(telco_cust_churn))
telco_cust_churn <- na.omit(telco_cust_churn)

Removing the 11 rows with missing values (all in TotalCharges) leaves 7,032 observations of 21 variables.

Visualize Churn distribution

The plot shows that most customers in the dataset did not churn: the count for “No” is considerably higher than the count for “Yes”. The dataset is therefore imbalanced, with more retained customers than churned customers. This matters for modelling and predicting churn, because imbalanced classes can distort model performance. Any later modelling will need further examination, and possibly rebalancing methods, to address this.
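The plotting code for this figure is not shown, so the following is only a minimal sketch of how the churn counts and shares could be computed and drawn. It assumes `dplyr` (loaded above) and base graphics, and it uses a small made-up data frame, `toy_churn`, as a stand-in for `telco_cust_churn`; the values are illustrative and are not taken from the real data.

```r
library(dplyr)

# Toy stand-in for telco_cust_churn (illustrative values only, not real data)
toy_churn <- data.frame(
  Churn = c("No", "No", "No", "Yes", "No", "Yes", "No", "No")
)

# Count customers per churn status and convert the counts to shares
churn_summary <- toy_churn %>%
  count(Churn, name = "customers") %>%
  mutate(share = customers / sum(customers))

print(churn_summary)

# Bar chart of the counts; base graphics keep the sketch dependency-free
barplot(
  setNames(churn_summary$customers, churn_summary$Churn),
  xlab = "Churn", ylab = "Number of customers",
  main = "Churn distribution (toy data)"
)
```

Replacing `toy_churn` with `telco_cust_churn` would give the same summary for the full dataset. The `share` column makes the degree of imbalance explicit rather than leaving it to be read off the bar heights.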

Visualizing Churn by Contract type

The data show that customers on longer contracts are more likely to stay with the service, while those on month-to-month contracts are more likely to leave. This suggests that committing customers to a longer contract may increase loyalty and allow the company to keep them for longer. It is easy to see why some companies, such as EE, want to commit customers to a longer term.
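Raw counts can hide differences in group size, so a churn rate per contract type is a useful companion to the plot. The sketch below is one way to compute it with `dplyr`; `toy_churn` and its values are made up for illustration, and only the column names `Contract` and `Churn` come from the dataset.

```r
library(dplyr)

# Toy stand-in for telco_cust_churn (illustrative values only, not real data)
toy_churn <- data.frame(
  Contract = c("Month-to-month", "Month-to-month", "Month-to-month",
               "Month-to-month", "One year", "One year",
               "Two year", "Two year"),
  Churn    = c("Yes", "Yes", "Yes", "No", "No", "Yes", "No", "No")
)

# Churn rate per contract type: share of customers whose Churn is "Yes"
churn_by_contract <- toy_churn %>%
  group_by(Contract) %>%
  summarise(
    customers  = n(),
    churned    = sum(Churn == "Yes"),
    churn_rate = mean(Churn == "Yes"),
    .groups    = "drop"
  )

print(churn_by_contract)

# Grouped bar chart of churned vs retained counts per contract type
contract_counts <- table(toy_churn$Churn, toy_churn$Contract)
barplot(
  contract_counts, beside = TRUE, legend.text = TRUE,
  xlab = "Contract", ylab = "Number of customers",
  main = "Churn by contract type (toy data)"
)
```

Running the same pipeline on `telco_cust_churn` would put a number on how much more often month-to-month customers leave.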

Visualizing Churn by Gender

The data show no visible effect of gender on churn: males and females have similar counts of churned and retained customers. Contract length, by contrast, is clearly associated with lower churn regardless of gender. This suggests that contract type matters more for customer retention than gender does.
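A visual comparison cannot by itself establish that an effect is or is not significant. One possible check, which is not part of the original analysis, is a chi-squared test of independence on the gender-by-churn table. The sketch below uses base R's `chisq.test` on made-up counts; `toy_churn` is a hypothetical stand-in for `telco_cust_churn`.

```r
# Toy stand-in for telco_cust_churn (illustrative counts only, not real data)
toy_churn <- data.frame(
  gender = rep(c("Female", "Male"), times = c(55, 56)),
  Churn  = c(rep(c("No", "Yes"), times = c(40, 15)),   # 55 female customers
             rep(c("No", "Yes"), times = c(42, 14)))   # 56 male customers
)

# Contingency table of gender (rows) against churn status (columns)
gender_table <- table(toy_churn$gender, toy_churn$Churn)
print(gender_table)

# Row-wise proportions: churn share within each gender
print(prop.table(gender_table, margin = 1))

# Chi-squared test of independence between gender and churn
gender_test <- chisq.test(gender_table)
print(gender_test$p.value)
```

A large p-value is consistent with gender and churn being independent, but it does not prove it. On the real data the table would be built as `table(telco_cust_churn$gender, telco_cust_churn$Churn)`.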

Visualizing distribution of Monthly Charges by Churn

Customers with lower monthly charges appear less likely to churn, as shown by the taller bars at the low end of the x-axis. Customers with higher monthly charges appear more likely to churn, as the bars shrink while monthly charges rise. This suggests that pricing can influence customer retention.

When split by churn status, customers who do not churn tend to have lower monthly charges, whereas churned customers tend to have higher monthly charges. This underlines the relevance of pricing tactics in customer retention initiatives.
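The comparison of monthly charges by churn status can also be summarised numerically. The following sketch, again on a made-up `toy_churn` data frame rather than the real data, computes the median and mean monthly charge per group with `dplyr` and draws a base-graphics boxplot as an alternative view to the histogram.

```r
library(dplyr)

# Toy stand-in for telco_cust_churn (illustrative values only, not real data)
toy_churn <- data.frame(
  MonthlyCharges = c(20.5, 25.0, 45.3, 60.0, 70.7, 85.2, 95.1, 99.6),
  Churn          = c("No", "No", "No", "No", "Yes", "No", "Yes", "Yes")
)

# Median and mean monthly charge for churned and retained customers
charges_by_churn <- toy_churn %>%
  group_by(Churn) %>%
  summarise(
    customers     = n(),
    median_charge = median(MonthlyCharges),
    mean_charge   = mean(MonthlyCharges),
    .groups       = "drop"
  )

print(charges_by_churn)

# Side-by-side boxplots of monthly charges for each churn status
boxplot(
  MonthlyCharges ~ Churn, data = toy_churn,
  xlab = "Churn", ylab = "Monthly charges",
  main = "Monthly charges by churn status (toy data)"
)
```

The median is reported alongside the mean because charge distributions are often skewed, and the median is less sensitive to a few very high bills.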

Visualizing distribution of Total Charges by Churn

Customers with lower total charges are more prone to churn, as indicated by the taller bars at the low end of the horizontal axis. Customers with larger total charges are less likely to churn, as shown by the bars shrinking as total charges rise. This implies that higher-spending customers are more loyal and less likely to churn.

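When split by churn status, churned customers typically have lower total charges, whereas retained customers have higher total charges. This suggests that customer loyalty may be related to overall expenditure, with higher-spending customers showing stronger loyalty. Because total charges accumulate over a customer's tenure, part of this pattern may simply reflect that customers who stay longer have had more time to accrue charges.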
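Total charges grow with the number of months a customer has been billed, so it can help to look at total charges next to tenure before reading them as a sign of loyalty. The sketch below is a possible check rather than part of the original analysis. It uses a made-up `toy_churn` data frame, and `charge_per_month` is a hypothetical helper column introduced only for this illustration.

```r
library(dplyr)

# Toy stand-in for telco_cust_churn (illustrative values only, not real data)
toy_churn <- data.frame(
  tenure       = c(1, 2, 8, 34, 45, 60, 72, 3),
  TotalCharges = c(29.85, 108.15, 820.50, 1889.50,
                   1840.75, 5400.00, 7900.00, 210.00),
  Churn        = c("No", "Yes", "Yes", "No", "No", "No", "No", "Yes")
)

# Compare tenure, total charges, and charges per month of tenure by churn
tenure_view <- toy_churn %>%
  filter(tenure > 0) %>%                        # avoid dividing by zero tenure
  mutate(charge_per_month = TotalCharges / tenure) %>%
  group_by(Churn) %>%
  summarise(
    median_tenure           = median(tenure),
    median_total            = median(TotalCharges),
    median_charge_per_month = median(charge_per_month),
    .groups                 = "drop"
  )

print(tenure_view)

# Strength of the linear relationship between tenure and total charges
tenure_total_cor <- cor(toy_churn$tenure, toy_churn$TotalCharges)
print(tenure_total_cor)
```

If the gap between groups narrows or reverses once charges are expressed per month of tenure, the difference in total charges is likely driven by how long customers stay rather than by how much they spend.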

Across the study, several clear trends in customer attrition emerge: contract type, monthly charges, and total charges are each associated with churn, while gender is not.

By combining these insights, organisations can better understand the factors that influence customer churn and adjust their strategy to increase retention rates. To promote customer engagement and loyalty, businesses can concentrate on providing value-added services or offering incentives for longer-term commitments.