regorK

Forecasting Customer Churn in Regork’s Telecom Expansion

regorK

Introduction

Introduction

  • At Regork, we are trying to be differentiators and test out our waters in the telecommunications sector. This means we need to understand how our customers are using and leaving our services. This is important for our company because it can help us understand the most influential reasons why people are no longer using our services. This analysis will allow you to help understand our current customers as well as know what resonates well with our loyal customers so we can try and replicate that moving forward when marketing to a new customer base.

    Methodology

  • For our methodology, we will be using machine learning to help identify what the most important features are that might cause someone to leave. The machine learning models that we will be evaluating are logistic regression, multivariate adaptive regression splines (MARS), and random forest. We will assess these models by looking at the area under curve (AUC) and evaluating the confusion matrix for each model.

Proposed Solution

  • Our solution introduces a discount program targeting key churn factors—tenure and total charges. Financial projections suggest the program’s costs are outweighed by its benefits, with a 10% discount for the most vulnerable customer segment (customers who have only stayed with us for less than 12 months) poised to significantly reduce churn. The expected net gain from this program will help grow customer loyalty and company profitability.

Packages Required

We show the packages used below:

  • tidyverse: Used to tidy up our data
  • tidymodels: Allows us to do our ML workflow
  • dplyr: Applied for data manipulation and transformation
  • ggplot2: Vital with creating visually appealing graphs
  • DT: Create visually appealing tabular data
  • gridExtra: Allows us to plot in a grid format
  • pdp: Allows us to plot partial dependencies
  • earth: Allows us to do multivariate adaptive regression splines
  • vip Used to plot feature importance
library(tidyverse)
library(tidymodels)
library(dplyr)
library(ggplot2)
library(DT)
library(gridExtra)
library(pdp)
library(earth)
library(vip)

Data Preparation

Overview of Our Data Set

Showcase a Sample of the Retention Data

retention <- read_csv("data/customer_retention.csv")

# Pull in our retention dataset
datatable(head(retention, 15, width = auto))

Transformation / Meta Data

Below is an overview of the metadata and transformations used in this dataset:

  • Transformations:
    • Changed the Status variable into a factor
    • Dropped any null values from our dataset

Retention

retention <- retention %>% 
  dplyr::mutate(Status = as.factor(Status))

retention <- drop_na(retention)

glimpse(retention) 
## Rows: 6,988
## Columns: 20
## $ Gender           <chr> "Female", "Male", "Male", "Male", "Female", "Female",…
## $ SeniorCitizen    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ Partner          <chr> "Yes", "No", "No", "No", "No", "No", "No", "No", "Yes…
## $ Dependents       <chr> "No", "No", "No", "No", "No", "No", "Yes", "No", "No"…
## $ Tenure           <dbl> 1, 34, 2, 45, 2, 8, 22, 10, 28, 62, 13, 16, 58, 49, 2…
## $ PhoneService     <chr> "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No", …
## $ MultipleLines    <chr> "No phone service", "No", "No", "No phone service", "…
## $ InternetService  <chr> "DSL", "DSL", "DSL", "DSL", "Fiber optic", "Fiber opt…
## $ OnlineSecurity   <chr> "No", "Yes", "Yes", "Yes", "No", "No", "No", "Yes", "…
## $ OnlineBackup     <chr> "Yes", "No", "Yes", "No", "No", "No", "Yes", "No", "N…
## $ DeviceProtection <chr> "No", "Yes", "No", "Yes", "No", "Yes", "No", "No", "Y…
## $ TechSupport      <chr> "No", "No", "No", "Yes", "No", "No", "No", "No", "Yes…
## $ StreamingTV      <chr> "No", "No", "No", "No", "No", "Yes", "Yes", "No", "Ye…
## $ StreamingMovies  <chr> "No", "No", "No", "No", "No", "Yes", "No", "No", "Yes…
## $ Contract         <chr> "Month-to-month", "One year", "Month-to-month", "One …
## $ PaperlessBilling <chr> "Yes", "No", "Yes", "No", "Yes", "Yes", "Yes", "No", …
## $ PaymentMethod    <chr> "Electronic check", "Mailed check", "Mailed check", "…
## $ MonthlyCharges   <dbl> 29.85, 56.95, 53.85, 42.30, 70.70, 99.65, 89.10, 29.7…
## $ TotalCharges     <dbl> 29.85, 1889.50, 108.15, 1840.75, 151.65, 820.50, 1949…
## $ Status           <fct> Current, Current, Left, Current, Left, Left, Current,…
print(colSums(is.na(retention))) 
##           Gender    SeniorCitizen          Partner       Dependents 
##                0                0                0                0 
##           Tenure     PhoneService    MultipleLines  InternetService 
##                0                0                0                0 
##   OnlineSecurity     OnlineBackup DeviceProtection      TechSupport 
##                0                0                0                0 
##      StreamingTV  StreamingMovies         Contract PaperlessBilling 
##                0                0                0                0 
##    PaymentMethod   MonthlyCharges     TotalCharges           Status 
##                0                0                0                0

Exploratory Data Analysis

What Contract Types have the Highest Attrition Rates?

Regork_colors <- c("#0071ce", "#4b7f42") 

# Retention Rate By Contract Type
ggplot(retention, aes(x = Contract, fill = Status)) +
  geom_bar(position = "fill", color = "black") +
  scale_fill_manual(values = Regork_colors) +
  theme_minimal() +
  labs(y = "Retention Rate (%)", 
       title = "Retention Rate by Contract Type") +
  theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        axis.text.y = element_text(size = 10),
        axis.title = element_text(size = 14, face = "bold"),
        plot.title = element_text(size = 18, face = "bold", hjust = 0.5))

Looking at the different types of contracts that we offer (month-to-month, one-year, and two-year), it looks like people who are on a month-to-month contract are more likely to leave as opposed to those who are on a one-year or two-year contract.

How Does Retention Rate Vary Among Customers Based on TV and Internet Service?

retention %>%
  group_by(StreamingTV, Status) %>%
  ggplot(aes(x = StreamingTV, fill = Status)) + 
  geom_bar(position = "fill", color = "black") +
  scale_fill_manual(values = Regork_colors) +  
  labs(y = "Retention Rate (%)", 
       title = "Retention Rate Based on TV Streaming Package")  +
  theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        axis.text.y = element_text(size = 10),
        axis.title = element_text(size = 14, face = "bold"),
        plot.title = element_text(size = 18, face = "bold", hjust = 0.5))

This graph shows the retention rate of those with the streaming TV service, without the TV service, and those with the TV service but not the internet service. Those three categories are then broken into two proportions, current customers and those that have left.

One insight that can be pulled from the graph is that the proportion of current customers to leaving customers is nearly identical for those with internet and TV streaming compared to those with internet but no TV Streaming.

How is customer status distributed based on whether they’re senior citizens or not?

plot2data <- retention %>% 
  mutate( 
    Gender = as.factor(Gender), 
    SeniorCitizen = factor(SeniorCitizen, levels = c(0, 1), labels = c("No", "Yes")), 
    Partner = as.factor(Partner), 
    Dependents =     as.factor(Dependents), 
    PhoneService = as.factor(PhoneService), 
    MultipleLines = as.factor(MultipleLines), 
    Status = as.factor(Status) 
  ) 

ggplot(plot2data, aes(x = SeniorCitizen, fill = Status)) + 
  geom_bar(position = "dodge") + 
  labs(x = "Senior Citizen?", y = "Count", fill = "Status") + 
  scale_fill_manual(values = Regork_colors) + 
  ggtitle("Distribution of customer status\nbased on whether they were a senior citizen") + 
  theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        axis.text.y = element_text(size = 10),
        axis.title = element_text(size = 14, face = "bold"),
        plot.title = element_text(size = 18, face = "bold", hjust = 0.5))

From the graph, we can see that there is a larger number of current customers who are not senior citizens compared to those who are. The number of customers who have churned is lower than the number of current customers for all categories. The difference between the number of current and churned customers is larger among non-senior citizens than among senior citizens, suggesting that the churn proportion is much larger for the number of senior citizens. However, the customer segment of non-senior citizens certainly outweighs the senior one.

Machine Learning

Machine Learning Data Preprocessing

Preprocessing Steps

  • Data Spliting: First, we are going to split our dataset with a 70/30 split. 70% is going to be allocated to our training data set, while 30% will be used for our testing data set, to assess how well our data responds to unseen data.
  • K-Fold Cross-Validation: Next, we set up our k-folds validation, which allows us to assess the model’s generalization ability and helps mitigate the risks of overfitting or underfitting to a particular train-test split.
  • Data Preparation with Recipe: Last, we created a recipe to normalize all of our numeric predictors. Then, we dummy encoded all of our categorical predictors to allow our machine learning models to incorporate them.
set.seed(123)
split <- initial_split(retention, prop = 0.7, strata = "Status")
retention_train <- training(split)
retention_test <- testing(split)


set.seed(123)
kfolds <- vfold_cv(retention_train, v = 5, strata = Status)


retention_recipe <- recipe(Status ~ ., data = retention_train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

Logistic Regression Model

Setting up our Logstic Regression model

logistic_results <- logistic_reg() %>%
  fit_resamples(Status ~ ., kfolds)
collect_metrics(logistic_results)
## # A tibble: 2 × 6
##   .metric  .estimator  mean     n std_err .config             
##   <chr>    <chr>      <dbl> <int>   <dbl> <chr>               
## 1 accuracy binary     0.799     5 0.00401 Preprocessor1_Model1
## 2 roc_auc  binary     0.845     5 0.00521 Preprocessor1_Model1

Let’s define our Logistic regression model . Then, we will asses the model based on the ROC from our model, which is 0.845!

logistic_final_fit <- logistic_reg() %>%
  fit(Status ~ ., data = retention_train) 

Let go ahead and fit our logistic regression model on our training data.

logistic_final_fit %>%
  predict(retention_test) %>%
  bind_cols(retention_test %>% select(Status)) %>%
  conf_mat(truth = Status, estimate = .pred_class)
##           Truth
## Prediction Current Left
##    Current    1362  225
##    Left        178  332
vip(logistic_final_fit$fit, num_features = 20)

This code creates a confusion matrix, which allows us to see which predictions the model got correct and which predictions it got wrong on our testing set.

It also shows us what logistic regression thinks the most important features are for people churning.

Multivariate Adaptive Regression Splines (MARS) Model - Optimal Model

Setting up our MARS model

mars_mod <- mars(num_terms = tune(), prod_degree = tune()) %>%
  set_mode("classification")

mars_grid <- grid_regular(num_terms(range = c(1,100)), prod_degree(), levels = 25)

retention_wf <- workflow() %>%
  add_recipe(retention_recipe) %>%
  add_model(mars_mod)

tuning_results <- retention_wf %>%
  tune_grid(resamples = kfolds, grid = mars_grid)

tuning_results %>%
  collect_metrics() %>%
  filter(.metric == "roc_auc") %>%
  arrange(desc(mean))
## # A tibble: 50 × 8
##    num_terms prod_degree .metric .estimator  mean     n std_err .config         
##        <int>       <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>           
##  1        21           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  2        25           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  3        29           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  4        34           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  5        38           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  6        42           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  7        46           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  8        50           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
##  9        54           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
## 10        58           1 roc_auc binary     0.849     5 0.00486 Preprocessor1_M…
## # ℹ 40 more rows

Let’s define our MARS model for classification and use a grid to tune our hyperparameters. From there, we will collect the best ROC from our model, which is 0.849!

best_hyperparameters <- select_best(tuning_results, metric = "roc_auc")

mars_final_wf <- workflow() %>%
  add_recipe(retention_recipe) %>%
  add_model(mars_mod) %>%
  finalize_workflow(best_hyperparameters)


mars_final_fit <- mars_final_wf %>%
  fit(data = retention_train)

# Predicting on our test set
mars_final_fit %>%
  predict(retention_test) %>%
  bind_cols(retention_test %>% select(Status)) %>%
  conf_mat(truth = Status, estimate = .pred_class)
##           Truth
## Prediction Current Left
##    Current    1383  242
##    Left        157  315
mars_final_fit %>%
  predict(retention_test, type = "prob") %>%
  mutate(truth = retention_test$Status) %>%
  roc_auc(truth, .pred_Current)
## # A tibble: 1 × 3
##   .metric .estimator .estimate
##   <chr>   <chr>          <dbl>
## 1 roc_auc binary         0.847

Proceeding with fitting the final workflow to the training data, we then predict on the test set, evaluate model performance using a confusion matrix, and calculate the AUC score.

The results of the model prediction on the test set show that out of the instances labeled as “Current” the model correctly predicted 1383 people that are current and incorrectly classified 242 people as “Left”. Similarly, for instances labeled as “Left” the model correctly predicted 315 people that have left and misclassified 157 people as “Current”.

Additionally, the AUC score is calculated as 0.847 which is similar to our training AUC.

mars_final_fit %>%
  extract_fit_parsnip() %>%
  vip()

We can see here that our most important features on why someone is likely to leave are their total charges, the tenure of their account, and the monthly charges they acquire.

As a person responsible for making business decisions, what else are you learning from the observations in this section?

Our model did well on both the training and testing dataset so therefore we believe moving forward it will do well with unseen data in the future

Random Forest Model

Setting up our Random Forest model

First, we trained our model with default hyperparameter values and used the same 5-fold cross validation object and achieved the mean 5-fold cross-validated AUC of 0.836. After that, we arrived at the optimal random forest model with the highest AUC score of 0.847 after tuning the following values:

  • trees: tune of values ranging from 15 to 300 (From the rule of thumb of starting with 10 times the number of features)
  • mtry: tune for values ranging from 2 to 12 (Starting with five evenly spaced values of mtry across the range 2-p centered at the recommended default √p)
  • min_n: tune for values ranging from 1 to 20

Assess a total of 5 values from each parameter (levels = 5).

rf_mod <- rand_forest( 
  mode = "classification", 
  trees = tune(), 
  mtry = tune(), 
  min_n = tune() 
  ) %>% 
  set_engine("ranger", importance = "impurity") 
# create the hyperparameter grid 

rf_hyper_grid <- grid_regular( 
  trees(range = c(15, 300)), 
  mtry(range = c(2, 12)), 
  min_n(range = c(1, 20)), 
  levels = 5 
  ) 

# train our model across the hyper parameter grid 

set.seed(123) 
rf_results <- tune_grid(rf_mod, retention_recipe, resamples = kfolds, grid = rf_hyper_grid) 
# model results 

show_best(rf_results, metric = "roc_auc") 
## # A tibble: 5 × 9
##    mtry trees min_n .metric .estimator  mean     n std_err .config              
##   <int> <int> <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>                
## 1     4    86    20 roc_auc binary     0.847     5 0.00463 Preprocessor1_Model1…
## 2     4   228    20 roc_auc binary     0.846     5 0.00442 Preprocessor1_Model1…
## 3     4   300    20 roc_auc binary     0.845     5 0.00498 Preprocessor1_Model1…
## 4     4   300    15 roc_auc binary     0.844     5 0.00404 Preprocessor1_Model0…
## 5     4   157    10 roc_auc binary     0.844     5 0.00467 Preprocessor1_Model0…
rf_best_hyperparameters <- select_best(rf_results, metric = "roc_auc")

rf_final_wf <- workflow() %>%
  add_recipe(retention_recipe) %>%
  add_model(rf_mod) %>%
  finalize_workflow(rf_best_hyperparameters)


rf_final_fit <- rf_final_wf %>%
  fit(data = retention_train)

# Predicting on our test set
rf_final_fit %>%
  predict(retention_test) %>%
  bind_cols(retention_test %>% select(Status)) %>%
  conf_mat(truth = Status, estimate = .pred_class)
##           Truth
## Prediction Current Left
##    Current    1396  268
##    Left        144  289
rf_final_fit %>%
  predict(retention_test, type = "prob") %>%
  mutate(truth = retention_test$Status) %>%
  roc_auc(truth, .pred_Current)
## # A tibble: 1 × 3
##   .metric .estimator .estimate
##   <chr>   <chr>          <dbl>
## 1 roc_auc binary         0.845

This random forest model demonstrates a stronger ability to identify customers who are still with the service rather than those who have left. The higher number of false positives, where the model predicts customers as “Current” when they have actually “Left,” indicates a tendency of the model to be overly optimistic about customer retention. Conversely, the lower number of false negatives suggests that the model is more cautious about predicting that a customer has churned. This pattern reveals a bias in the model towards predicting customer continuation, which may lead to an underestimation of the churn rate.

rf_final_fit %>%  
   predict(retention_train, type = "prob") %>% 
   mutate(truth = retention_train$Status) %>% 
   roc_curve(truth, .pred_Current) %>% 
   autoplot() 

Summary

In terms of relative importance, how would you rate the predictors in your model? As a business manager, which factors would you focus on (for example, you could invest in offering some incentives or promotions) to decrease the chances of customers leaving?

mars_final_fit %>%
  extract_fit_parsnip() %>%
  vip()

In our optimal model, the most influential variables are ‘Total Charges’, ‘Tenure’, and ‘MonthlyCharges’. Following those variables are ‘PaymentMethod_Electronic.check’ and ‘OnlineSecurity_Yes’.

As a business manager, we would focus on understanding why these predictors are influential. Firstly, if high total charges and monthly charges are a primary reason for customers leaving, we’d look into restructuring pricing or providing incentives/programs that could potentially provide additional value to our customers. If customers with longer tenure are more loyal, we’d explore rewards for customer loyalty. Additionally, improving service quality in the area of tech support and online security could address specific concerns that lead to customer churn. Offering incentives or promotions tied to electronic payment methods, long-term contracts, and bundled services with phone and online backup could also be effective strategies to decrease churn.

Collect all the customers from the test dataset that you predict are going to leave

Customers that would leave:

customers_left <- mars_final_fit %>%
  predict(retention_test) %>%
  bind_cols(retention_test) %>% 
  filter(.pred_class == "Left")

customers_left
## # A tibble: 472 × 21
##    .pred_class Gender SeniorCitizen Partner Dependents Tenure PhoneService
##    <fct>       <chr>          <dbl> <chr>   <chr>       <dbl> <chr>       
##  1 Left        Female             0 Yes     No             28 Yes         
##  2 Left        Male               1 No      No              1 Yes         
##  3 Left        Male               0 No      No              5 Yes         
##  4 Left        Female             0 No      No              2 Yes         
##  5 Left        Female             0 Yes     No             47 Yes         
##  6 Left        Female             0 No      No             12 Yes         
##  7 Left        Female             1 Yes     No             25 Yes         
##  8 Left        Female             0 No      No             13 Yes         
##  9 Left        Female             0 No      No              4 Yes         
## 10 Left        Female             0 Yes     No              2 Yes         
## # ℹ 462 more rows
## # ℹ 14 more variables: MultipleLines <chr>, InternetService <chr>,
## #   OnlineSecurity <chr>, OnlineBackup <chr>, DeviceProtection <chr>,
## #   TechSupport <chr>, StreamingTV <chr>, StreamingMovies <chr>,
## #   Contract <chr>, PaperlessBilling <chr>, PaymentMethod <chr>,
## #   MonthlyCharges <dbl>, TotalCharges <dbl>, Status <fct>

Proportion of Predicted Churns

proportion_left <- mars_final_fit %>%
  predict(retention_test) %>%
  bind_cols(retention_test) %>% 
  count(.pred_class) %>%
  spread(key = .pred_class, value = n) %>%
  mutate(proportion_Left = Left / (Left + Current)*100) %>%
  pull(proportion_Left)

proportion_left
## [1] 22.50835

Exploratory Data Analysis of Predicted Churend Customers

ggplot(customers_left) +
 geom_histogram(aes(x = Tenure), position="identity") +
  ggtitle("Distribution of Tenure for Predicted Churned Customers") +
  labs(x = "Tenure (in Months)", y = "Count") + 
  scale_x_continuous(limits = c(0, 60), breaks = seq(0, 60, by = 6)) +
  scale_y_continuous(limits = c(0, 100)) +
  theme_minimal() +
  theme(text = element_text(size = 12), 
        axis.title = element_text(size = 14), 
        plot.title = element_text(size = 16, hjust = 0.5))

From the graph above, we can see that a significant number of predicted churns occur within the first few months of tenure, with the highest peak at the very beginning of 0-6 months range and 6-12 months range. The number of predicted churns decreases steadily as the tenure increases, indicating that customers are less likely to churn as they stay longer with the service.

Proportion of predicted churns that have only used the service for less than 12 months:

proportion_under_12_months <- customers_left %>%
  summarise(under_12_months = sum(Tenure < 12) / n() * 100) %>%
  pull(under_12_months)

proportion_under_12_months
## [1] 71.61017

Total monthly charges of predicted churns that only used the service for less than 12 months:

total_monthlycharges_under_12_months <- customers_left %>%
  filter(Tenure < 12) %>%
  summarise(total_monthlycharges = sum(MonthlyCharges)) %>%
  pull(total_monthlycharges)

total_monthlycharges_under_12_months
## [1] 24927.5
BAplot1 <- ggplot(customers_left) +
 geom_histogram(aes(x = MonthlyCharges), position = "identity") +
 scale_x_continuous(labels = scales::dollar_format()) +
  labs(x = "Monthly Charges", y = "Count") +
  ggtitle("Distribution of Monthly Charges for Predicted Churned Customers") +
  theme_minimal()

BAplot2 <- ggplot(customers_left) +
 geom_histogram(aes(x = TotalCharges), position = "identity") +
 scale_x_continuous(labels = scales::dollar_format()) +
  labs(x = "Total Charges", y = "Count") +
  ggtitle("Distribution of Total Charges for Predicted Churned Customers") +
  theme_minimal()

gridExtra::grid.arrange(BAplot1, BAplot2)

From the second graph above, we can interpret that the customers who are predicted to churn tend to be on the higher end of the monthly charge scale. For total charges, there is a heavy skew towards the lower end, with a significant number of customers having total charges of less than $1,000. This might suggest that customers who have not invested a lot into the service (have just started using the service - lower total charges) but generally have a higher monthly charges are more likely to churn.

Overall, the service might be losing customers early on, which could indicate issues with initial customer satisfaction or early value realization. Customers with higher monthly charges are more inclined to churn, suggesting that pricing is an important factor in the decision to leave. In addition, customers might not perceive enough value over time to have long-term commitment.

What is the predicted loss in revenue per month if no action is taken?

The predicted revenue loss if not action is taken:

customers_left %>%
  summarize(lost_revenue = sum(MonthlyCharges))
## # A tibble: 1 × 1
##   lost_revenue
##          <dbl>
## 1       37229.

If no proactive measures are implemented, Regork may experience a significant loss in revenue. Based on our model, we’ve estimated a potential revenue decline of approximately $37,229. This figure represents the aggregated monthly charges of customers identified by the model as high-risk for departure. While the actual number of customers who may leave—currently estimated at 472 (around 22.5% of customer base)—is not certain, these individuals represent the most vulnerable demographic according to our data. This underscores the importance of targeted retention strategies to mitigate the risk of this projected loss.

Propose an incentive scheme to your manager to retain these customers. Use your model to justify your proposal. You can do this by performing a cost benefit analysis (comparing the cost of the incentive plan to the benefit of retaining the customers)

Based on our MARS model, we’ve identified critical factors contributing to customer churn, such as Tenure and Total Charges. We propose a discount program that rewards newer customers with discounts, targeting customers who have only used our service less than 6 or 12 months.

The anticipated cost of this program, when offset against the predicted retention-led revenue, indicates a net gain for the company. For instance, offering a 10% discount for customers who have only stayed with us for less than 12 months prevents 71% of predicted churns, the savings in retained revenue would substantially exceed the expense of the discounts applied. The estimated cost of this discount program is $2,492.70, while the potential revenue retained is $24,927.50. This presents us with a net benefit of $22,435, making the incentive scheme a profitable investment for the company.

As a result, the investment in our loyalty program not only strengthens our relationship with current customers but also ensures sustained profitability.

Conclusion

Our solution introduces a discount program targeting key churn factors—tenure and total charges. Financial projections suggest the program’s costs are outweighed by its benefits, with a 10% discount for the most vulnerable customer segment (customers who have only stayed with us for less than 12 months) poised to significantly reduce churn. The expected net gain from this program affirms its potential to bolster both customer loyalty and company profitability.

Limitations

In order to enhance the strategy based on our analysis, we believe that we can focus on deeper customer segmentation to tailor retention efforts more effectively and introduce personalized incentives that resonate with high-risk churn segments. For instance, when we analyzed the test dataset, we noticed that all of the predicted churned customers were enrolled in the month-to-month contract type. Looking at different strategies to convert month-to-month customers to longer-duration contract customers can be significantly impactful. Our proposed incentives also focus mainly on more recent customers who have only used the service for less than 12 months. Additionally, looking deeper into devising specific loyalty programs that reward tenure can further cover other customer segments that are vulnerable to churning. This approach should be underpinned by a thorough cost-benefit analysis to ensure that our strategies are both effective, economically viable, and sustainable in terms of profit and inferred value to customers.