This analysis of the Telco Customer Churn dataset focuses specifically on customers who subscribe to internet services. The objective is to examine how tenure relates to contract type and total charges, while also exploring differences in payment methods across internet service categories. In addition, the analysis evaluates the demographic composition of the customer base (particularly senior citizen status ) and investigates the adoption of value-added services such as Online Security.
Together, these factors provide insight into customer behavior patterns and potential drivers of churn within the internet service segment.
Customers by Internet Service stacked by Senior Citizen Status
Fiber optic is the company’s largest internet service segment and contains the highest concentration of senior citizens. By comparison, DSL customers and those without internet service are predominantly non-senior, indicating that demographic differences exist but do not fundamentally alter the overall service distribution.
Although senior citizens represent a notable portion of the Fiber Optic segment, the overall customer distribution across internet services remains heavily driven by service type rather than demographic segmentation. Therefore, senior status will not be treated as a key differentiating factor in the subsequent churn-focused analysis. The analysis will instead prioritize tenure, contract type, total charges, and payment behavior as primary explanatory variables.
# visual - 2 bar graph
#data preparation
internet_senior_df <- df %>%
mutate(
SeniorCitizenLabel = ifelse(SeniorCitizen == 1,
"Senior Citizen",
"Not Senior Citizen")
) %>%
count(InternetService, SeniorCitizenLabel) %>%
data.frame()
internet_senior_df
## InternetService SeniorCitizenLabel n
## 1 DSL Not Senior Citizen 2162
## 2 DSL Senior Citizen 259
## 3 Fiber optic Not Senior Citizen 2265
## 4 Fiber optic Senior Citizen 831
## 5 No Not Senior Citizen 1474
## 6 No Senior Citizen 52
# plotting the bar chart
internet_senior_bar <- ggplot(
internet_senior_df,
aes(x = InternetService, y = n, fill = SeniorCitizenLabel)
) +
geom_bar(stat = "identity", colour = "black") +
geom_text(
aes(label = comma(n)), #add comma to show the numbers better
position = position_stack(vjust = 0.5),
size = 4
) +
scale_fill_manual(
values = c(
"Senior Citizen" = "coral",
"Not Senior Citizen" = "aquamarine"
)
) +
labs(
title = "Customers by Internet Service\n(Stacked by Senior Citizen Status)",
x = "Internet Service",
y = "Number of Customers",
fill = "Senior Status"
) +
scale_y_continuous(labels = comma) + #add comma to numbers
theme_light() +
theme(plot.title = element_text(hjust = 0.5))
internet_senior_bar
Customers with Internet Service with added Online Security
The majority of customers with internet service do not subscribe to Online Security as an add-on feature. However, without detailed pricing information or insight into how these service options are positioned and communicated to customers, it is difficult to determine the underlying drivers of this adoption gap. While Online Security adoption remains below 40%, the root cause is unclear. Is the additional cost perceived as too high relative to its value, or is customer awareness of the service limited? Further business investigation into pricing strategy and service presentation would provide greater clarity into whether adoption is constrained by cost, communication, or perceived value.
#prepping data for pie chart
pie_df <- df %>%
dplyr::filter(InternetService != "No") %>%
dplyr::mutate(
online_security = dplyr::case_when(
OnlineSecurity == "Yes" ~ "Has Online Security",
OnlineSecurity == "No" ~ "No Online Security",
TRUE ~ NA_character_
)
) %>%
dplyr::filter(!is.na(online_security)) %>%
dplyr::count(online_security) %>%
dplyr::mutate(
percent = n / sum(n),
label = scales::percent(percent, accuracy = 0.1)
)
pie_df
## online_security n percent label
## <char> <int> <num> <char>
## 1: Has Online Security 2019 0.3659598 36.6%
## 2: No Online Security 3498 0.6340402 63.4%
#plotting pie chart
ggplot(pie_df, aes(x = "", y = n, fill = online_security)) +
geom_col(width = 1, color = "white") +
coord_polar(theta = "y") +
geom_text(aes(label = label), position = position_stack(vjust = 0.5), size = 4) +
labs(
title = "Customers With Internet Service and Online Security",
fill = ""
) +
theme_void() +
theme(plot.title = element_text(hjust = 0.5))
Average Monthly Charges by Tenure and the Type of Payment Method
The purpose of this analysis is to determine whether payment behavior is correlated with internet service type. Specifically, I wanted to assess whether customers within different internet service categories (DSL, Fiber optic, or No Internet) demonstrate differing patterns in their use of electronic payment methods (electronic check, credit card, or automatic bank transfer) versus traditional mailed check payments.
From the distribution, there appears to be a measurable association between internet service tier and payment method. Fiber optic customers show a significantly higher concentration of electronic check usage compared to DSL and non-internet customers. In contrast, customers without internet service disproportionately rely on mailed checks. DSL customers fall somewhere in between, with a more balanced distribution across payment types. While this does not establish causation, it does indicate a clear correlation between service type and payment behavior.
In addition to payment behavior, I evaluated tenure and average monthly charges to understand how revenue evolves across the customer lifecycle. The data shows a positive correlation between tenure and average monthly charges across all internet service categories, though the magnitude of this relationship differs substantially by service tier.
For Fiber optic customers, average monthly charges increase from approximately $77–$80 during the first 10 months of tenure to roughly $105–$107 by 60–70 months. This represents an increase of nearly $30 per month over time. DSL customers follow a similar trajectory, increasing from approximately $44–$48 in early tenure to about $75–$77 at higher tenure levels — also an increase of roughly $30 per month. In contrast, customers without internet service show only a modest increase from about $20 to $23 as tenure increases.
These results suggest two important patterns:
There is a clear positive relationship between tenure and monthly revenue, particularly among internet-enabled customers.
Revenue growth is significantly more pronounced in Fiber optic and DSL segments compared to customers without internet service.
Together, the findings indicate that longer-tenured internet customers generate substantially higher recurring revenue, and payment behavior varies meaningfully by service tier. This reinforces the importance of understanding service-level dynamics when evaluating revenue performance and potential churn exposure.
line_df <- df %>%
filter(!is.na(tenure), !is.na(MonthlyCharges), !is.na(InternetService)) %>%
group_by(InternetService, tenure) %>%
dplyr::summarise(
avg_monthly = mean(MonthlyCharges, na.rm = TRUE),
.groups = "drop"
) %>%
data.frame()
names(line_df)
## [1] "InternetService" "tenure" "avg_monthly"
head(line_df)
## InternetService tenure avg_monthly
## 1 DSL 0 64.94000
## 2 DSL 1 44.21061
## 3 DSL 2 47.75061
## 4 DSL 3 47.18451
## 5 DSL 4 48.24590
## 6 DSL 5 50.53023
hi_lo <- line_df %>%
group_by(InternetService) %>%
filter(
avg_monthly == min(avg_monthly) |
avg_monthly == max(avg_monthly)
) %>%
ungroup() %>%
data.frame()
hi_lo
## InternetService tenure avg_monthly
## 1 DSL 1 44.21061
## 2 DSL 72 76.87594
## 3 Fiber optic 1 77.58766
## 4 Fiber optic 72 107.15552
## 5 No 16 19.80789
## 6 No 65 23.10417
inter_service_line <- ggplot(line_df, aes(x = tenure, y = avg_monthly, group = InternetService, color = InternetService)) +
geom_line(size = 1) +
geom_point(size = 2) +
geom_point(data = hi_lo, aes(x = tenure, y = avg_monthly), size = 3, shape = 21, fill = "white", color = "black") +
geom_label_repel(data = hi_lo, aes(label = paste0("$", round(avg_monthly, 2))),
box.padding = 1,
point.padding = 1,
size = 3,
color = "grey40",
segment.color = "darkblue") +
labs(
title = "Average Monthly Charges by Tenure\n(Min & Max per Internet Service)",
x = "Tenure (Months)",
y = "Average Monthly Charges",
color = "Internet Service") +
scale_y_continuous(
labels = dollar,
breaks = seq(
0,
ceiling(max(line_df$avg_monthly) / 20) * 20,by = 20)) + #do this to show the $ amounts in the y axis is a cleaner fashion
theme_light() +
theme(plot.title = element_text(hjust = 0.5))
inter_service_line
if ("package:plotly" %in% search()) detach("package:plotly", unload = TRUE)
#preparing data for donuts
donut_df <- df %>%
count(InternetService, PaymentMethod) %>%
group_by(InternetService) %>%
mutate(
pct = n / sum(n),
label = percent(pct, accuracy = 0.1)
) %>%
ungroup() %>%
data.frame()
donut_df
## InternetService PaymentMethod n pct label
## 1 DSL Bank transfer (automatic) 566 0.23378769 23.4%
## 2 DSL Credit card (automatic) 594 0.24535316 24.5%
## 3 DSL Electronic check 648 0.26765799 26.8%
## 4 DSL Mailed check 613 0.25320116 25.3%
## 5 Fiber optic Bank transfer (automatic) 646 0.20865633 20.9%
## 6 Fiber optic Credit card (automatic) 597 0.19282946 19.3%
## 7 Fiber optic Electronic check 1595 0.51518088 51.5%
## 8 Fiber optic Mailed check 258 0.08333333 8.3%
## 9 No Bank transfer (automatic) 332 0.21756225 21.8%
## 10 No Credit card (automatic) 331 0.21690695 21.7%
## 11 No Electronic check 122 0.07994758 8.0%
## 12 No Mailed check 741 0.48558322 48.6%
#plotting donut charts
#remove thick white borders so it does not look like there are pieces missing from the donut
payment_donut <- ggplot(donut_df, aes(x = 2, y = pct, fill = PaymentMethod)) +
geom_bar(stat = "identity", width = 1, color = NA) +
coord_polar(theta = "y") +
xlim(0.5, 2.5) +
geom_text(
aes(label = label),
position = position_stack(vjust = 0.5),
size = 4
) +
facet_wrap(~InternetService) +
labs(
title = "Payment Method Distribution by Internet Service",
fill = "Payment Method"
) +
theme_void() +
theme(
plot.title = element_text(hjust = 0.5)
)
payment_donut
Churn Risk Heat Maps for Internet Only Customers by Tenure compared by Contract Type and Lifetime Value
Churn Risk by Tenure and Customer Lifetime Value
The first heat map highlights a clear inverse relationship between tenure and churn risk, particularly as customer lifetime value increases.
Customers in the 0–12 month range with low lifetime value ($0–$1,500) exhibit the highest churn risk, reaching 51.5%. Even customers with slightly higher lifetime value in early tenure still show elevated churn rates (40–46%). This suggests that the first year of the customer lifecycle represents the most volatile period, regardless of spend level.
As tenure increases, churn risk declines materially across nearly all value tiers. For customers in the 49–72 month range, churn risk falls into the low single digits for most lifetime value segments. Notably, high-value customers ($3,000+) with tenure beyond 49 months demonstrate churn rates below 10%.
This pattern indicates:
A strong negative correlation between tenure and churn risk.
Higher lifetime value customers become significantly more stable as tenure increases.
Early-tenure customers represent the greatest retention vulnerability, especially within lower value tiers.
From a revenue perspective, the riskiest segment is low-tenure, moderate-value customers — they generate revenue but have a high probability of churn.
Churn Risk by Tenure and Contract Type (Internet Customers Only)
The second heat map isolates internet customers and introduces contract type as a structural driver of churn.
The most significant finding is the consistently elevated churn rate among month-to-month contracts, particularly in early tenure. Customers on month-to-month contracts in the 0–12 month range exhibit a churn rate of 58.3%, the highest observed in the analysis.
Even as tenure increases, month-to-month contracts maintain materially higher churn rates compared to fixed-term contracts:
At 61–72 months, month-to-month churn remains at 21.5%, still significantly higher than one-year or two-year contracts.
In contrast:
One-year contracts show moderate churn (10–18%), decreasing gradually over tenure.
Two-year contracts exhibit the lowest churn across all tenure ranges, remaining near 0–5%.
This suggests:
Contract structure is a primary churn driver.
Month-to-month agreements significantly increase churn risk independent of tenure.
Long-term contracts materially stabilize customer retention
#visual #5 - heatmap
#group tenure in monthly groups by Lifetime Value
heat_df_lifecycle <- df %>%
filter(!is.na(TotalCharges)) %>%
mutate(
ChurnFlag = ifelse(Churn == "Yes", 1, 0),
# Tenure grouping (same structure as before)
TenureGroup = cut(
tenure,
breaks = c(0, 12, 24, 36, 48, 60, 72), #group the tenure months
labels = c("0–12", "13–24", "25–36", "37–48", "49–60", "61–72"),
include.lowest = TRUE
),
# Business lifecycle value grouping
ChargeGroup = cut(
TotalCharges,
breaks = c(0, 500, 1500, 3000, 5000, Inf), #group the $ amounts to avoid scientific notation
labels = c("$0–$500",
"$500–$1,500",
"$1,500–$3,000",
"$3,000–$5,000",
"$5,000+"),
include.lowest = TRUE
)
) %>%
group_by(TenureGroup, ChargeGroup) %>%
summarise(
churn_rate = mean(ChurnFlag, na.rm = TRUE),
n = n(),
.groups = "drop"
) %>%
data.frame()
heat_df_lifecycle
## TenureGroup ChargeGroup churn_rate n
## 1 0–12 $0–$500 0.467316514 1744
## 2 0–12 $500–$1,500 0.515081206 431
## 3 13–24 $0–$500 0.058823529 238
## 4 13–24 $500–$1,500 0.327380952 504
## 5 13–24 $1,500–$3,000 0.407801418 282
## 6 25–36 $0–$500 0.000000000 18
## 7 25–36 $500–$1,500 0.049586777 242
## 8 25–36 $1,500–$3,000 0.274944568 451
## 9 25–36 $3,000–$5,000 0.363636364 121
## 10 37–48 $500–$1,500 0.036842105 190
## 11 37–48 $1,500–$3,000 0.164835165 182
## 12 37–48 $3,000–$5,000 0.268817204 372
## 13 37–48 $5,000+ 0.444444444 18
## 14 49–60 $500–$1,500 0.032051282 156
## 15 49–60 $1,500–$3,000 0.103448276 87
## 16 49–60 $3,000–$5,000 0.123867069 331
## 17 49–60 $5,000+ 0.251937984 258
## 18 61–72 $500–$1,500 0.007246377 138
## 19 61–72 $1,500–$3,000 0.000000000 165
## 20 61–72 $3,000–$5,000 0.032653061 245
## 21 61–72 $5,000+ 0.097788126 859
# Plot
#show the higher churn rate in red with the lower churn rate in green
ggplot(
heat_df_lifecycle,
aes(x = TenureGroup,
y = ChargeGroup,
fill = churn_rate)
) +
geom_tile(color = "white") +
geom_text(
aes(label = percent(churn_rate, accuracy = 0.1)),
size = 3
) +
labs(
title = "Churn Risk Heat Map\nby Tenure and Customer Lifetime Value",
x = "Tenure (Months)",
y = "Total Charges (Customer Value Stage)"
) +
scale_fill_gradient2(
low = "darkgreen", # Low churn = strong retention
mid = "lightyellow",
high = "red", # High churn = risk
midpoint = mean(heat_df_lifecycle$churn_rate, na.rm = TRUE),
labels = percent,
name = "Churn Rate"
) +
theme_light() +
theme(plot.title = element_text(hjust = 0.5))
#visual - 6 Heat Map Tenure, charges and churn
#only focus on customers with internet service by Contract type
#add in the same tenure grouping
heat_contract_internet_df <- df %>%
filter(InternetService != "No") %>%
mutate(
ChurnFlag = ifelse(Churn == "Yes", 1, 0),
TenureGroup = cut(
tenure,
breaks = c(0, 12, 24, 36, 48, 60, 72), #group tenure months
labels = c("0–12", "13–24", "25–36", "37–48", "49–60", "61–72"),
include.lowest = TRUE
)
) %>%
group_by(TenureGroup, Contract) %>%
summarise(
churn_rate = mean(ChurnFlag, na.rm = TRUE),
n = n(),
.groups = "drop"
) %>%
data.frame()
heat_contract_internet_df
## TenureGroup Contract churn_rate n
## 1 0–12 Month-to-month 0.58281445 1606
## 2 0–12 One year 0.18367347 49
## 3 0–12 Two year 0.00000000 18
## 4 13–24 Month-to-month 0.41095890 657
## 5 13–24 One year 0.14705882 102
## 6 13–24 Two year 0.00000000 26
## 7 25–36 Month-to-month 0.34977578 446
## 8 25–36 One year 0.10497238 181
## 9 25–36 Two year 0.00000000 35
## 10 37–48 Month-to-month 0.35099338 302
## 11 37–48 One year 0.15887850 214
## 12 37–48 Two year 0.05063291 79
## 13 49–60 Month-to-month 0.27896996 233
## 14 49–60 One year 0.15162455 277
## 15 49–60 Two year 0.04968944 161
## 16 61–72 Month-to-month 0.21495327 107
## 17 61–72 One year 0.13286713 286
## 18 61–72 Two year 0.04200542 738
# Plot heat map
#use the same heatmap colors to indicate churn rate
#only focus on customers with internet
ggplot(
heat_contract_internet_df,
aes(x = TenureGroup,
y = Contract,
fill = churn_rate)
) +
geom_tile(color = "white") +
geom_text(
aes(label = percent(churn_rate, accuracy = 0.1)),
size = 4
) +
labs(
title = "Churn Risk Heat Map (Internet Customers Only)\nby Tenure and Contract Type",
x = "Tenure (Months)",
y = "Contract Type"
) +
scale_fill_gradient2(
low = "darkgreen", # Low churn
mid = "lightyellow",
high = "red", # High churn
midpoint = mean(heat_contract_internet_df$churn_rate, na.rm = TRUE),
labels = percent,
name = "Churn Rate"
) +
theme_light() +
theme(plot.title = element_text(hjust = 0.5))
The analysis reveals several consistent patterns across internet service type, payment behavior, tenure, revenue, and churn risk.
Tenure demonstrates a strong negative correlation with churn. Customers within their first 12 months exhibit the highest churn rates, particularly those on month-to-month contracts. As tenure increases, churn risk declines materially across all segments, with long-tenured customers under two-year contracts showing the greatest stability.
Contract structure is a significant driver of churn. Month-to-month customers consistently exhibit elevated churn rates compared to one-year and two-year contracts, indicating that commitment length materially impacts retention outcomes.
Revenue trends show a positive correlation between tenure and average monthly charges, especially among Fiber optic and DSL customers. Longer-tenured internet customers generate substantially higher recurring revenue compared to early-tenure customers and those without internet service.
Payment behavior varies meaningfully by internet service type, with Fiber optic customers more heavily utilizing electronic payment methods, while non-internet customers rely more on mailed checks. Although causality cannot be inferred, the association suggests behavioral differences across service tiers.
Collectively, the findings indicate that early-tenure, month-to-month internet customers represent the highest churn exposure, while long-tenured customers under fixed-term contracts drive the most stable and valuable revenue. Retention strategies should therefore prioritize stabilizing customers within their first year and incentivizing longer-term contract adoption.