For subscription-based organizations, customer churn is a significant problem because retaining existing customers is generally more economical than acquiring new ones. For music streaming services, understanding the factors that influence whether a customer renews their subscription is critical for improving retention strategies and long-term profitability. The analysis focuses on the training dataset and uses visualisation techniques to compare customers who renewed their subscription with those who churned.
The analysis uses two datasets: a training dataset containing 850 customers and a testing dataset containing 150 customers. Using the sub_training.csv dataset, a visual exploration is carried out to understand the relationship between whether a customer renews their subscription (the variable called "renewed") and each of the other potential predictor variables, such as age, spend, gender, lor (length of relationship), contact recency, and num_complaints.
# Load the tidyverse (provides read_csv() and ggplot2)
library(tidyverse)
# Import the training and testing datasets
sub_training <- read_csv("sub_training.csv")
sub_testing <- read_csv("sub_testing.csv")
The datasets used in this analysis consist of a training dataset and
a testing dataset provided by the music subscription company. The
training dataset contains 850 customer observations and the testing
dataset contains 150 customer observations. Both datasets are imported
into R using the read_csv() function, which the tidyverse loads
from its readr package.
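As a quick check on the import step, the sketch below confirms the expected row count, counts missing values per column, and converts the outcome to a factor so that plots treat it as a category. It runs on a small toy tibble rather than the real file. The helper name check_import, the toy values, and the "Yes"/"No" coding of renewed are all assumptions made for illustration, not details taken from the company data.

```r
library(tidyverse)

# Toy stand-in for sub_training (illustrative values only, not real data).
# In the actual analysis this would be read_csv("sub_training.csv").
toy_training <- tibble(
  renewed = c("Yes", "No", "Yes", "No"),  # assumed coding of the outcome
  spend   = c(120, 80, 150, 60),
  gender  = c("Female", "Male", "Male", "Female")
)

# Hypothetical helper: stop if the row count is unexpected,
# then return the number of missing values in each column.
check_import <- function(data, expected_rows) {
  stopifnot(nrow(data) == expected_rows)
  data %>% summarise(across(everything(), ~ sum(is.na(.x))))
}

missing_counts <- check_import(toy_training, expected_rows = 4)
print(missing_counts)

# Store the outcome as a factor so ggplot2 treats it as a grouping variable.
toy_training <- toy_training %>% mutate(renewed = factor(renewed))
```

With the real data, the same call would use expected_rows = 850 for the training set and expected_rows = 150 for the testing set.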
This section explores the relationship between subscription renewal status and the potential predictor variables. The plots are mainly used to compare customers who renewed their subscription with those who churned.
# Boxplots of each numeric predictor, split by renewal status
ggplot(data = sub_training) + geom_boxplot(mapping = aes(x = renewed, y = spend)) + labs(title = "Customer Spend by Renewal Status")
ggplot(data = sub_training) + geom_boxplot(mapping = aes(x = renewed, y = lor)) + labs(title = "Length of Relationship by Renewal Status")
ggplot(data = sub_training) + geom_boxplot(mapping = aes(x = renewed, y = age)) + labs(title = "Customer Age by Renewal Status")
ggplot(data = sub_training) + geom_boxplot(mapping = aes(x = renewed, y = num_complaints)) + labs(title = "Number of Complaints by Renewal Status")
ggplot(data = sub_training) + geom_boxplot(mapping = aes(x = renewed, y = num_contacts)) + labs(title = "Number of Customer Contacts by Renewal Status")
# Grouped bar chart of renewal status within each gender
ggplot(data = sub_training) + geom_bar(mapping = aes(x = gender, fill = renewed), position = "dodge") + labs(title = "Renewal Status by Gender", x = "Gender", y = "Customers", fill = "Renewed")
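The boxplots above all follow the same template, so one possible alternative is to reshape the numeric predictors into long format and draw them as facets of a single figure. The sketch below shows the idea on a toy tibble. The values and the "Yes"/"No" coding of renewed are invented for illustration, and only three of the predictors are included to keep the example short.

```r
library(tidyverse)

# Toy stand-in for sub_training (illustrative values only, not real data).
toy_training <- tibble(
  renewed = factor(c("Yes", "No", "Yes", "No", "Yes", "No")),
  spend   = c(120, 80, 150, 60, 130, 90),
  lor     = c(36, 12, 48, 6, 24, 18),
  age     = c(45, 28, 52, 31, 39, 26)
)

# Reshape so each row holds one predictor value for one customer.
long_training <- toy_training %>%
  pivot_longer(
    cols = c(spend, lor, age),
    names_to = "predictor",
    values_to = "value"
  )

# One boxplot panel per predictor; free y scales because the units differ.
facet_plot <- ggplot(data = long_training) +
  geom_boxplot(mapping = aes(x = renewed, y = value)) +
  facet_wrap(~ predictor, scales = "free_y") +
  labs(
    title = "Numeric Predictors by Renewal Status",
    x = "Renewed",
    y = "Value"
  )

print(facet_plot)
```

Placing the panels side by side makes it easier to see which predictors separate the two groups most clearly, at the cost of the individual titles used in the separate plots.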