Dataset

The dataset used in this report is about customer churn from a telecommunication company. Customer churn relates to whether a customer leaves or remains with a company. It contains one month of customer data from the teleco company. The raw data contains 7043 rows (customers) and 21 columns (features). It contains one character identifier, 7 binary variables, 3 numeric variables (one integer and two double) and 10 categorical variables.

  1. customerID - Character
  2. gender - Binary
  3. SeniorCitizen - Binary
  4. Partner - Binary
  5. Dependents - Binary
  6. tenure - Integer
  7. PhoneService - Binary
  8. MultipleLines - Categorical
  9. InternetService - Categorical
  10. OnlineSecurity - Categorical
  11. OnlineBackup - Categorical
  12. DeviceProtection - Categorical
  13. TechSupport - Categorical
  14. StreamingTV - Categorical
  15. StreamingMovies - Categorical
  16. Contract - Categorical
  17. PaperlessBilling - Binary
  18. PaymentMethod - Categorical
  19. MonthlyCharges - Numeric (double)
  20. TotalCharges - Numeric (double)
  21. Churn - Binary
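
As a minimal sketch of how the types listed above could be checked, the snippet below inspects the storage class of each column and flags columns with exactly two distinct values as binary. The `toy` data frame and its values are hypothetical stand-ins for a few rows of the file, not the real data.

```r
# Hypothetical toy data standing in for a few rows of the telco file.
toy <- data.frame(
  customerID     = c("A-1", "B-2", "C-3"),
  SeniorCitizen  = c(0L, 1L, 0L),
  tenure         = c(1L, 34L, 2L),
  MonthlyCharges = c(29.85, 56.95, 53.85),
  Churn          = c("No", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Storage class of each column, as R holds it after reading the CSV.
col_classes <- sapply(toy, class)
print(col_classes)

# Columns with exactly two distinct values can be treated as binary.
n_levels  <- sapply(toy, function(x) length(unique(x)))
is_binary <- n_levels == 2
print(names(toy)[is_binary])

# Text columns used for grouping are often converted to factors before plotting.
toy$Churn <- factor(toy$Churn, levels = c("No", "Yes"))
```

On the full data the same two `sapply()` calls could be run on `teleco` to confirm the counts of binary, categorical and numeric columns.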

Content: each row represents a customer, and each column contains a customer’s attributes as described in the column metadata.

The data set includes information about:

Customers who left or stayed with the company within the timeframe of one month – the column is called Churn

The data covers a single month of the company's customer records.

It contains information about services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies

It contains information about customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges

It contains information about customer demographics – gender, age range, and if they have partners and dependents

Link: https://www.kaggle.com/blastchar/telco-customer-churn

#Code chunk for setup- Version R Studio:  "1.2.5001"

#install.packages ("sqldf")
#install.packages ("ggplot2")
#install.packages ("dplyr")
#install.packages('gridExtra')
#install.packages('plotly')
#install.packages('ggpubr')
#install.packages('reshape2')
#install.packages('gganimate')

library('reshape2') 
library('sqldf')
library('ggplot2')
library('gridExtra')
library('dplyr')
library('plotly')
library('ggpubr')
library('gganimate') #needed for transition_reveal() in visualisation 4


#Read in File
teleco<-read.csv("C://Users//james//Documents//Teleco.csv", header = TRUE)
View(teleco)

glimpse(teleco)
Rows: 7,043
Columns: 21
$ customerID       <chr> "7590-VHVEG", "5575-GNVDE", "3668-QPYBK", "77...
$ gender           <chr> "Female", "Male", "Male", "Male", "Female", "...
$ SeniorCitizen    <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ Partner          <chr> "Yes", "No", "No", "No", "No", "No", "No", "N...
$ Dependents       <chr> "No", "No", "No", "No", "No", "No", "Yes", "N...
$ tenure           <int> 1, 34, 2, 45, 2, 8, 22, 10, 28, 62, 13, 16, 5...
$ PhoneService     <chr> "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes"...
$ MultipleLines    <chr> "No phone service", "No", "No", "No phone ser...
$ InternetService  <chr> "DSL", "DSL", "DSL", "DSL", "Fiber optic", "F...
$ OnlineSecurity   <chr> "No", "Yes", "Yes", "Yes", "No", "No", "No", ...
$ OnlineBackup     <chr> "Yes", "No", "Yes", "No", "No", "No", "Yes", ...
$ DeviceProtection <chr> "No", "Yes", "No", "Yes", "No", "Yes", "No", ...
$ TechSupport      <chr> "No", "No", "No", "Yes", "No", "No", "No", "N...
$ StreamingTV      <chr> "No", "No", "No", "No", "No", "Yes", "Yes", "...
$ StreamingMovies  <chr> "No", "No", "No", "No", "No", "Yes", "No", "N...
$ Contract         <chr> "Month-to-month", "One year", "Month-to-month...
$ PaperlessBilling <chr> "Yes", "No", "Yes", "No", "Yes", "Yes", "Yes"...
$ PaymentMethod    <chr> "Electronic check", "Mailed check", "Mailed c...
$ MonthlyCharges   <dbl> 29.85, 56.95, 53.85, 42.30, 70.70, 99.65, 89....
$ TotalCharges     <dbl> 29.85, 1889.50, 108.15, 1840.75, 151.65, 820....
$ Churn            <chr> "No", "No", "Yes", "No", "Yes", "Yes", "No", ...
#Structure of DF#
str(teleco)
'data.frame':   7043 obs. of  21 variables:
 $ customerID      : chr  "7590-VHVEG" "5575-GNVDE" "3668-QPYBK" "7795-CFOCW" ...
 $ gender          : chr  "Female" "Male" "Male" "Male" ...
 $ SeniorCitizen   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Partner         : chr  "Yes" "No" "No" "No" ...
 $ Dependents      : chr  "No" "No" "No" "No" ...
 $ tenure          : int  1 34 2 45 2 8 22 10 28 62 ...
 $ PhoneService    : chr  "No" "Yes" "Yes" "No" ...
 $ MultipleLines   : chr  "No phone service" "No" "No" "No phone service" ...
 $ InternetService : chr  "DSL" "DSL" "DSL" "DSL" ...
 $ OnlineSecurity  : chr  "No" "Yes" "Yes" "Yes" ...
 $ OnlineBackup    : chr  "Yes" "No" "Yes" "No" ...
 $ DeviceProtection: chr  "No" "Yes" "No" "Yes" ...
 $ TechSupport     : chr  "No" "No" "No" "Yes" ...
 $ StreamingTV     : chr  "No" "No" "No" "No" ...
 $ StreamingMovies : chr  "No" "No" "No" "No" ...
 $ Contract        : chr  "Month-to-month" "One year" "Month-to-month" "One year" ...
 $ PaperlessBilling: chr  "Yes" "No" "Yes" "No" ...
 $ PaymentMethod   : chr  "Electronic check" "Mailed check" "Mailed check" "Bank transfer (automatic)" ...
 $ MonthlyCharges  : num  29.9 57 53.9 42.3 70.7 ...
 $ TotalCharges    : num  29.9 1889.5 108.2 1840.8 151.7 ...
 $ Churn           : chr  "No" "No" "Yes" "No" ...
ncol(teleco)
[1] 21
nrow(teleco)
[1] 7043
length(teleco)
[1] 21
dim(teleco)
[1] 7043   21
#Summary of data#
summary(teleco)
  customerID           gender          SeniorCitizen      Partner         
 Length:7043        Length:7043        Min.   :0.0000   Length:7043       
 Class :character   Class :character   1st Qu.:0.0000   Class :character  
 Mode  :character   Mode  :character   Median :0.0000   Mode  :character  
                                       Mean   :0.1621                     
                                       3rd Qu.:0.0000                     
                                       Max.   :1.0000                     
                                                                          
  Dependents            tenure      PhoneService       MultipleLines     
 Length:7043        Min.   : 0.00   Length:7043        Length:7043       
 Class :character   1st Qu.: 9.00   Class :character   Class :character  
 Mode  :character   Median :29.00   Mode  :character   Mode  :character  
                    Mean   :32.37                                        
                    3rd Qu.:55.00                                        
                    Max.   :72.00                                        
                                                                         
 InternetService    OnlineSecurity     OnlineBackup      
 Length:7043        Length:7043        Length:7043       
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
                                                         
                                                         
                                                         
                                                         
 DeviceProtection   TechSupport        StreamingTV       
 Length:7043        Length:7043        Length:7043       
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
                                                         
                                                         
                                                         
                                                         
 StreamingMovies      Contract         PaperlessBilling  
 Length:7043        Length:7043        Length:7043       
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
                                                         
                                                         
                                                         
                                                         
 PaymentMethod      MonthlyCharges    TotalCharges       Churn          
 Length:7043        Min.   : 18.25   Min.   :  18.8   Length:7043       
 Class :character   1st Qu.: 35.50   1st Qu.: 401.4   Class :character  
 Mode  :character   Median : 70.35   Median :1397.5   Mode  :character  
                    Mean   : 64.76   Mean   :2283.3                     
                    3rd Qu.: 89.85   3rd Qu.:3794.7                     
                    Max.   :118.75   Max.   :8684.8                     
                                     NA's   :11                         
#Look at missing data for all columns in df
teleco %>%
  summarise(across(everything(), ~ sum(is.na(.x)))) #funs() is deprecated in dplyr

#Remove Missing Values
teleco <-na.omit(teleco)
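
The summary above reports 11 missing values in TotalCharges, so if that is the only column with missing values, `na.omit()` would leave 7,032 rows. As a hedged sketch, the rows that are about to be dropped can be set aside and inspected first. The `toy` data frame below is hypothetical.

```r
# Hypothetical toy data: one customer has no TotalCharges recorded.
toy <- data.frame(
  tenure       = c(0L, 12L, 5L),
  TotalCharges = c(NA, 350.4, 99.5)
)

# Number of missing values in each column (base R equivalent of the dplyr check).
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows that na.omit() would drop, kept aside so they can be inspected first.
dropped <- toy[!complete.cases(toy), ]
cleaned <- na.omit(toy)

print(nrow(dropped))
print(nrow(cleaned))
```

Looking at `dropped` before discarding it shows whether the missing values cluster in a particular group of customers.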

1. Introduction

As technology advances, telecommunication service providers compete with each other to retain customers (Amin et al., 2017). Annual customer churn within telecommunication companies is estimated to range from 20% to 40% (Ahn et al., 2006). Retaining a customer base is important for these companies in order to maintain their profit levels, and it is frequently noted that acquiring a new customer is more expensive than retaining an existing one (Ahn et al., 2006). Telecommunication companies should therefore prioritise retaining customers and reducing brand switching within their existing customer base over acquiring new customers. As new services become available to customers, it is also important for these companies to monitor trends in the industry so that they can maintain the standard of service customers expect at a fair price point. This report aims to investigate some of the causes of customer churn within the telecommunication industry.

2. User Story

Primary Groups or Individuals: The primary individuals I will be communicating to will be senior management in a Teleco company.

What does audience care about?: The audience will want to understand what is causing customers to leave their company. Teleco companies compete strongly with each other today and will want to retain as many customers as possible. They will want to know whether there is anything in their business model they can improve to retain more customers, and whether there is something customers are not satisfied with. To better understand customer churn, they need to understand customer behaviour and the triggers that cause customers to leave.

What action does audience need to take?: Once the insights have been provided, senior management will need to assess whether to improve or refine their business model to make it more appealing and to avoid customers leaving. They will need to conduct an assessment to ensure the services they offer are up to date with industry standards and current trends in the market. For example, they may need to improve their streaming service if it is not up to industry standard, or review their pricing model if it is being undercut by a competitor.

Benefits: Providing the insights to senior management will allow them to identify key reasons why customers are leaving rather than being retained. This will allow senior management to act and make the changes necessary to avoid losing customers and to retain as many as possible. Having a stable and large customer base gives a company a good image among the public. This in turn may increase referrals from existing customers and will help keep the company's profit level stable.

Risks: If senior management do not act on the insights provided, they may lose a large number of customers. Their services may also fall far behind current industry standards, and they will not know what is required to retain their customer base and keep it satisfied with their service. If the company continues to lose customers, it will in turn lose profit. It also risks getting a bad reputation among the public through word of mouth from customers who are leaving.

Overall, the aim of this report is to provide key insights into what is causing a lack of customer retention in a teleco company.

3. Data Exploration and wrangling

In the section below I have conducted some data exploration and created some simple plots of variables of interest. I have created bar graphs looking at churn, gender, contract type, phone service, internet service, movie streaming and TV streaming. I have also wrangled the data and created data frames, some of which I used to look for possible trends and some of which I use later in my visualisations. For example, I have counted how many customers did or did not churn, grouped by contract type, which is used in visualisation 1.
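
The grouped counts described above can also be produced without SQL. The sketch below is a base R alternative using `table()`, run on a hypothetical `toy` data frame rather than the real file. One difference from a SQL `group by` is that combinations with no customers appear with a count of 0 instead of being left out.

```r
# Hypothetical toy data with the two columns used for visualisation 1.
toy <- data.frame(
  Contract = c("Month-to-month", "Month-to-month", "One year",
               "Two year", "Month-to-month", "One year"),
  Churn    = c("Yes", "No", "No", "No", "Yes", "Yes")
)

# Cross-tabulate, then reshape to one row per Contract/Churn pair.
# Unlike SQL "group by", empty combinations are kept with a count of 0.
counts <- as.data.frame(
  table(Contract = toy$Contract, Churn = toy$Churn),
  responseName = "count"
)
print(counts)
```

The resulting data frame has the same three columns (Contract, Churn, count) that the bar graphs in this report expect.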

#Data Exploration and wrangling 

#Look at Churn Count
Churn_count <- sqldf('Select count(churn) as count, churn from teleco group by churn')
View(Churn_count)

ggplot(data=Churn_count, aes(x=Churn, y=count, fill = Churn)) + geom_bar(stat="identity")



#Look at gender distribution
Gender <- sqldf('Select count(gender) as count, gender from teleco group by gender')
View(Gender)

ggplot(data=Gender, aes(x=gender, y=count, fill = gender)) + geom_bar(stat="identity")



#Look at Contract Type
Contract_count <- sqldf('Select count(Contract) as count, Contract from teleco group by Contract')
View(Contract_count)

ggplot(data=Contract_count, aes(x=Contract, y=count, fill = Contract)) + geom_bar(stat="identity")



#Look at Phone Service
Phone_service <- sqldf('Select count(PhoneService) as count, PhoneService from teleco group by PhoneService')
View(Phone_service)

ggplot(data=Phone_service, aes(x=PhoneService, y=count, fill = PhoneService)) + geom_bar(stat="identity")


#Look at Internet Service
Internet_service <- sqldf('Select count(InternetService) as count, InternetService from teleco group by InternetService')
View(Internet_service)

ggplot(data=Internet_service, aes(x=InternetService, y=count, fill = InternetService)) + geom_bar(stat="identity")



#Look at Movie Service
Movie_service <- sqldf('Select count(StreamingMovies) as count, StreamingMovies from teleco group by StreamingMovies')
View(Movie_service)

ggplot(data=Movie_service, aes(x=StreamingMovies, y=count, fill = StreamingMovies)) + geom_bar(stat="identity")



#Look at TV Service
TV_service <- sqldf('Select count(StreamingTV) as count, StreamingTV from teleco group by StreamingTV')
View(TV_service)

ggplot(data=TV_service, aes(x=StreamingTV, y=count, fill = StreamingTV)) + geom_bar(stat="identity")




#Look at Churn for Contract
Churn_contract <- sqldf('Select count(churn) as count, churn, contract from teleco  group by contract, churn order by churn, count desc')
View(Churn_contract)

df <- subset(Churn_contract, Churn_contract$Churn == 'Yes')
View(df)

#Look at Monthly Charges
Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn from teleco group by MonthlyCharges,churn order by churn, count desc')
View(Charges)


#Look at Payment Method
Payment <- sqldf('Select count(churn) as count, churn, PaymentMethod from teleco  group by PaymentMethod, churn order by churn, count desc')
View(Payment)

#Look at Internet Service
Internet <- sqldf('Select count(churn) as count, churn, InternetService from teleco group by InternetService, churn order by churn, count desc')
View(Internet)

#Look at Tenure
Tenure <- sqldf('Select count(churn) as count, Tenure, churn from teleco group by Tenure,churn order by churn, count desc')
View(Tenure)

#Look at Streaming TV
Tv <- sqldf('Select count(churn) as count, StreamingTv, churn from teleco group by StreamingTv,churn order by churn, count desc')
View(Tv)

#Look at Streaming Movies
Movies <- sqldf('Select count(churn) as count, StreamingMovies, churn from teleco group by StreamingMovies,churn order by churn,count desc')
View(Movies)

#Look at Churn by gender (the overall gender counts were already computed above)
Gender_churn <- sqldf('Select count(churn) as count, gender, churn from teleco group by gender, churn')
View(Gender_churn)

3. Visualisations

3.1 Visualisation 1: Relationship between contract type and customer churn

From the data exploration there appears to be a relationship between contract type and customer retention. Customers on a monthly contract are leaving in greater numbers than customers on a one year or two year contract. This suggests to me that there may be a relationship between the cost of monthly fees and customer retention, which will be investigated with a visualisation. The first visualisation presented to senior management will be a bar graph showing that most customers who have left were on a monthly contract rather than a one year or two year contract. In my previous iterations, which can be viewed further on in this report, I experimented with the position of the bars. I decided to use a bar graph as I found that it was the best method to portray this data: it effectively shows the large difference between the contract types. I had originally created a static plot, but for my final iteration I decided to make the plot interactive. This will allow senior management to interact with the visualisation and filter the data however they wish to view it.
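
Because the contract types have different numbers of customers, raw counts can be complemented with a churn rate per contract. The snippet below is a minimal sketch of that calculation with `prop.table()`. The `toy` data is hypothetical and its rates are not the real ones.

```r
# Hypothetical toy data: contract type and churn outcome per customer.
toy <- data.frame(
  Contract = c(rep("Month-to-month", 4), rep("One year", 2), rep("Two year", 2)),
  Churn    = c("Yes", "Yes", "Yes", "No", "No", "Yes", "No", "No")
)

# Rows are contract types, columns are churn outcomes.
tab <- table(toy$Contract, toy$Churn)

# margin = 1 makes each row sum to 1, giving the share of churners per contract.
rates <- prop.table(tab, margin = 1)
churn_rate <- rates[, "Yes"]
print(round(churn_rate, 2))
```

A rate close to the same value for every contract would suggest the difference in counts is mostly due to group size, while clearly different rates would support the relationship described above.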

#Contract Plot
View(Churn_contract)

Churn_contract2 <- melt(Churn_contract)
Using Churn, Contract as id variables
View (Churn_contract2)


#Contract Plot Final Iteration (Static)
Contract_Plot <- ggplot(data=Churn_contract, aes(x=Contract, y=count, fill = Churn)) + geom_bar(stat="identity", position= 'dodge') + ylab('Churn Count') + ggtitle('Relationship Between Customer Churn and Contract Type')


#Contract Plot Final Iteration (Interactive)
ggplotly(Contract_Plot) %>%
  layout(
    title = 'Relationship Between Churn and Contract Type'
  )


3.2 Visualisation 2: Relationship between Monthly Charges and Customer Churn

Senior management can see from visualisation 1 that most customers leaving are on a monthly contract. The next visualisation presented to senior management will be a boxplot showing that customers who are leaving are also paying higher monthly fees. This could be an indication to senior management that they need to assess their pricing plan. I had experimented with different plot types such as a line graph and a scatter plot, which can be viewed in my previous iterations further on in the report. I used a boxplot as I found that it was the best plot to convey the message: it is easy to read and effectively shows the difference in monthly fees between customers who are leaving and customers who are staying with the company. However, senior management will now want to understand why these customers are paying so much more in monthly charges than other customers. This will be explored in the next visualisation.
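
The numbers behind a boxplot like this can also be printed directly, which may help when quoting figures to senior management. The sketch below computes the median and interquartile range of monthly charges for each churn group on a hypothetical `toy` data frame.

```r
# Hypothetical toy data: monthly charges for customers who stayed or left.
toy <- data.frame(
  Churn          = c("No", "No", "No", "Yes", "Yes", "Yes"),
  MonthlyCharges = c(20, 50, 65, 70, 85, 100)
)

# Median monthly charge per churn group: the centre line of each box.
medians <- tapply(toy$MonthlyCharges, toy$Churn, median)

# Interquartile range per group: the height of each box.
iqrs <- tapply(toy$MonthlyCharges, toy$Churn, IQR)

print(medians)
print(iqrs)
```

Running the same two `tapply()` calls on `teleco` would give the values that the boxplot draws.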



#Monthly Charges Final Iteration
ggplot(teleco, aes(y= MonthlyCharges, x = "Churn", fill = Churn)) + geom_boxplot()+  xlab(" ") +ggtitle('Relationship Between Monthly Charges and Customer Churn ')


3.3 Visualisation 3: Comparison of Monthly charges for Services Offered

From the previous visualisation presented to senior management we can identify that customers who are leaving are paying high monthly fees. Senior management will want to question why these customers are paying high monthly fees, so the aim of the next visualisation is to look at how much the cost of the services they provide differs. It investigates whether there is a big difference in the monthly fees for the services offered. The services compared in this visualisation are TV streaming, movie streaming and internet service. I decided to continue to use box plots as they are effective in communicating whether there is a large difference between the monthly fees of certain services. I also decided to arrange the plots together in order to compare them. From the visualisation senior management can see that there is a wide gap in monthly fees for the high end services they provide.
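
As a hedged sketch of how the gap between service levels could be quantified alongside the box plots, the snippet below takes the median monthly charge per internet service type with `aggregate()`. The `toy` values are hypothetical and only illustrate the calculation.

```r
# Hypothetical toy data: internet service type and monthly charge per customer.
toy <- data.frame(
  InternetService = c("DSL", "DSL", "Fiber optic", "Fiber optic", "No", "No"),
  MonthlyCharges  = c(45, 55, 80, 100, 19, 21)
)

# One row per service level with its median monthly charge.
by_service <- aggregate(MonthlyCharges ~ InternetService, data = toy, FUN = median)

# Gap between the most and least expensive service level.
gap <- max(by_service$MonthlyCharges) - min(by_service$MonthlyCharges)

print(by_service)
print(gap)
```

The same formula could be reused with StreamingTV or StreamingMovies on the left of the `~` grouping to compare the other services.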

#compare Internet Type and monthly charges

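#InternetService is added to the group by so that each row carries a well-defined service label
Internet_Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn, InternetService from teleco group by MonthlyCharges, churn, InternetService order by churn, count desc')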
View(Internet_Charges)


Internet_plot <-ggplot(Internet_Charges, aes(y= MonthlyCharges, x= " ", fill = InternetService)) + geom_boxplot()+  xlab(" Churn") + scale_fill_ordinal() + theme_minimal()

#compare Streaming TV and monthly charges

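#StreamingTV is added to the group by so that each row carries a well-defined service label
TV_Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn, StreamingTV from teleco group by MonthlyCharges, churn, StreamingTV order by churn, count desc')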
View(TV_Charges)



Tv_plot <- ggplot(TV_Charges, aes(y= MonthlyCharges, x= " ", fill = StreamingTV)) + geom_boxplot()+  xlab(" Churn") + scale_fill_ordinal() + theme_minimal()


#compare Streaming Movies and monthly charges

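#StreamingMovies is added to the group by so that each row carries a well-defined service label
Movie_Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn, StreamingMovies from teleco group by MonthlyCharges, churn, StreamingMovies order by churn, count desc')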
View(Movie_Charges)



Movie_plot <- ggplot(Movie_Charges, aes(y= MonthlyCharges, x= " ", fill = StreamingMovies)) + geom_boxplot()+  xlab(" Churn") + scale_fill_ordinal() + theme_minimal()


#Final Iteration

charges_comparison <- ggarrange( Tv_plot, Movie_plot,Internet_plot, ncol = 2, nrow = 2)

annotate_figure(charges_comparison,top = text_grob("Comparing Service Monthly Charges", color = "black", face = "bold", size = 14))


3.4 Visualisation 4: Relationship Between Churn and Tenure

The final visualisation presented to senior management looks at how tenure is related to customer churn. There tends to be a trend that the longer customers are retained, the less likely they are to stop using a company's service. This can be seen in the line graph below. I have chosen an animated line graph to represent the data, with one line for each churn type (yes and no). The line graph was chosen as there are two numeric variables being represented (tenure and count of customers). From my iterations, which can be seen below, I found the line graph the best method to represent this data. From the graph it can be identified that the largest numbers of customers who churned had been with the company for only around 10 months or less. As customers' tenure with the company increased, their likelihood of churning decreased. This should prompt senior management to put more focus on retaining the customers who have joined within the past few months.
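
The tenure at which churn peaks can be read off the graph, but it can also be checked numerically. The sketch below counts churned customers at each tenure value and picks the largest. The `toy` data is hypothetical, so the result printed here says nothing about the real file.

```r
# Hypothetical toy data: tenure in months and churn outcome per customer.
toy <- data.frame(
  tenure = c(1, 1, 1, 2, 2, 10, 10, 24, 60, 72),
  Churn  = c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "No", "No", "No")
)

# Count churned customers at each tenure value.
churned <- toy[toy$Churn == "Yes", ]
churn_by_tenure <- table(churned$tenure)

# Tenure (in months) with the largest number of churned customers.
peak_tenure <- as.integer(names(churn_by_tenure)[which.max(churn_by_tenure)])
print(peak_tenure)
```

Applying the same steps to `teleco` would give the exact month at which the churn line is highest.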



p <- teleco %>%
  group_by(tenure, Churn) %>%
  summarise(count = n(), .groups = 'drop') %>%
  ggplot(aes(tenure, count, color = Churn)) +
  geom_line(size = 2) + ggtitle('Relationship Between Tenure And Customer Churn') + theme(plot.title = element_text(hjust = 0.5))

p + geom_point() + transition_reveal(tenure)


4. Previous Iterations

4.1 Visualisation 1 Iterations: Relationship between contract type and customer churn

In this section I will show previous iterations of the plots created for this report. The previous iterations for visualisation 1 are shown first below. I experimented with flipping the position of the bars and also with stacking them. However, I decided to group the bars in the end, as I found that stacking made it more difficult to compare the sizes for each churn type. Grouping the bars remedied this. I had also created a static plot that I felt looked good, but in order to add more interactivity I decided to make the final plot interactive.


#Contract Plot Iteration 1
ggplot(data=Churn_contract, aes(x=count, y=Contract, fill = Churn)) + geom_histogram(stat="identity") 
Ignoring unknown parameters: binwidth, bins, pad

#Contract Plot Iteration 2
ggplot(data=Churn_contract, aes(x=count, y=Contract, fill = Churn)) + geom_histogram(stat="identity", position = 'dodge') 
Ignoring unknown parameters: binwidth, bins, pad

#Contract Plot Iteration 3 (Static)
ggplot(data=Churn_contract, aes(x=Contract, y=count, fill = Churn)) + geom_bar(stat="identity", position= 'dodge') + ylab('Churn Count') + ggtitle('Relationship Between Customer Churn and Contract Type')


4.2 Visualisation 2 Iterations: Relationship between Monthly Charges and Customer Churn

Previous iterations for visualisation 2 can be seen below. I experimented with different plot types such as a line graph, a scatter plot and a frequency polygon. However, I felt these plots did not represent the data in a clean manner and that the box plot was much better at conveying the message to stakeholders. I also chose the box plot as I wanted to have a diverse range of plots in the report. The box plot allows senior management to quickly identify median values and the difference in monthly charges being paid by customers who are leaving the company.
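
The centre line of a box plot is the median rather than the mean, and the two can differ noticeably when a few customers pay much more than the rest. The sketch below illustrates this with `boxplot.stats()` on hypothetical charges.

```r
# Hypothetical monthly charges with one unusually high value.
charges <- c(20, 25, 30, 35, 110)

# Five-number summary used to draw the box: lower whisker, lower hinge,
# median, upper hinge, upper whisker. Points beyond the whiskers are in $out.
bp <- boxplot.stats(charges)

print(bp$stats[3])   # median, the line drawn inside the box
print(mean(charges)) # mean, pulled upwards by the high value
print(bp$out)        # values drawn as individual outlier points
```

In this toy case the median is 30 while the mean is 44, so describing the centre line as the mean would overstate typical charges.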

#Monthly Charges Plot Iteration 1
View(Charges) 

ggplot(Charges,aes(x = MonthlyCharges, y=count, color = Churn))+ geom_line(size=2) + ylab('Churn Count') +ggtitle('Relationship Between Monthly Charges and Customer Churn')


#Monthly Charges Iteration 2
ggplot(Charges,aes(x = MonthlyCharges, y=count, color = Churn))+ geom_point() + ylab('Churn Count') +ggtitle('Relationship Between Monthly Charges and Customer Churn')


#Monthly Charges Iteration 3
ggplot(Charges,aes(x = MonthlyCharges, color = Churn))+ geom_freqpoly(size=2) + ylab('Churn Count') +ggtitle('Relationship Between Monthly Charges and Customer Churn')


4.3 Visualisation 3 Iterations: Comparison of Monthly charges for Services Offered

Previous iterations for visualisation 3 can be seen below. I had to experiment with the number of rows and columns. As can be seen in my previous iterations, the plots were overlapping each other and did not present in a tidy manner. I also experimented with including a title for each of the plots, but these titles also overlapped each other. I decided to remove the titles and include just one title for the overall plot, which made it look neater, and I felt that the legend for each plot conveyed what the monthly charges were for.


#Iteration 1
grid.arrange(Internet_plot, Tv_plot, Movie_plot, nrow=2, widths = c(2, 1, 1),layout_matrix = rbind(c(1, 2, NA),c(3, 3, 4)))


#Iteration 2
grid.arrange(Internet_plot, Tv_plot, Movie_plot, nrow=2, widths = c(2, 1, 1))


4.4 Visualisation 4 Iterations: Relationship between Tenure and Customer Churn

Previous iterations for visualisation 4 can be seen below. I first experimented with different plot types such as histograms and scatterplots. I found that the frequency polygon looked the cleanest and conveyed the message most effectively. However, when I attempted to animate this plot it did not work very effectively. I experimented with transition functions such as transition_states, but the plot did not animate correctly. Therefore, instead of using a frequency polygon I switched to a line graph, which was used in my final visualisation and could be animated correctly.
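
One possible reason the frequency polygon was harder to animate is that its counts are computed inside the plot by binning, so there is no explicit count column to reveal along. This is an assumption on my part rather than something I verified. As a sketch, the bin counts can be computed beforehand with `hist(plot = FALSE)`, giving an ordinary x/y table like the one used for the line graph. The tenure values and the 12 month bin width are hypothetical.

```r
# Hypothetical tenure values in months.
tenure <- c(1, 2, 2, 5, 11, 14, 25, 30, 47, 72)

# Bin edges every 12 months; plot = FALSE returns the counts without drawing.
breaks <- seq(0, 72, by = 12)
binned <- hist(tenure, breaks = breaks, plot = FALSE)

# Explicit table of bin midpoints and counts that a line geom could use.
bin_counts <- data.frame(mid = binned$mids, count = binned$counts)
print(bin_counts)
```

A data frame like `bin_counts` can be passed to `geom_line()` in the same way as the grouped tenure counts in the final visualisation.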


#Tenure Plot Iteration 1
ggplot(data=Tenure, aes(x=tenure, y=count, fill = Churn)) + geom_histogram(stat="identity") 
Ignoring unknown parameters: binwidth, bins, pad

#Tenure Plot Iteration 2
ggplot(Tenure,aes(x = tenure, y=count, color = Churn))+ geom_point() + ylab('Churn Count') +ggtitle('Relationship Between Tenure and Customer Churn')


#Tenure Plot Iteration 3
Tenure_Iteration4 <- ggplot(teleco,aes(x = tenure, color = Churn))+ geom_freqpoly(size=2)+theme_minimal()

Tenure_Iteration4

Tenure_Iteration4 + transition_reveal(tenure)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.


References

  1. Ahn, J. H., Han, S. P., & Lee, Y. S. (2006). Customer churn analysis: Churn determinants and mediation effects of partial defection in the Korean mobile telecommunications service industry. Telecommunications Policy, 30(10-11), 552-568.

  2. Amin, A., Anwar, S., Adnan, A., Nawaz, M., Alawfi, K., Hussain, A., & Huang, K. (2017). Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing, 237, 242-254.

---
title: "Data Visualisation CA2"
author: 'D19125650: Jamie Baxter TU060 - Part Time'
output: html_notebook
---
# Dataset
The dataset used in this report is about customer churn from a telecommunication company. Customer churn relates to whether a customer leaves or remains with a company. It contains one month of customer data from the teleco company. The raw data contains 7043 rows (customers) and 21 columns (features). It contains one character identifier, 7 binary variables, 3 numeric variables (one integer and two double) and 10 categorical variables.


1. customer_id - Character
2. gender - Binary
3. SeniorCitizen  - Binary
4. Partner - Binary
5. Dependents - Binary
6. Tenure - Integer
7. PhoneService - Binary
8. MultipleLines - Categorical
9. InternetService - Categorical
10. OnlineSecurity - Categorical
11. OnlineBackup - Categorical
12. DeviceProtection - Categorical
13. TechSupport - Categorical
14. StreamingTV - Categorical
15. StreamingMovies - Categorical
16. Contract - Categorical
17. PaperlessBilling - Binary
18. PaymentMethod - Categorical
19. MonthlyCharges - Integer
20. TotalCharges - Integer
21. Churn - Binary

Content
Each row represents a customer, and each column contains a customer attribute described in the column metadata.

The data set includes information about:

Customers who left or stayed with the company within the timeframe of one month – the column is called Churn

The data covers a period of one month at the company.

It contains information about services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies

It contains information about customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges

It contains information about customer demographics – gender, age range, and if they have partners and dependents

Link: https://www.kaggle.com/blastchar/telco-customer-churn



```{r}
#Code chunk for setup- Version R Studio:  "1.2.5001"

#install.packages('reshape2')
#install.packages('sqldf')
#install.packages('ggplot2')
#install.packages('dplyr')
#install.packages('gridExtra')
#install.packages('plotly')
#install.packages('ggpubr')
#install.packages('gganimate')


library('reshape2') 
library('sqldf')
library('ggplot2')
library('gridExtra')
library('dplyr')
library('plotly')
library('ggpubr')
library('gganimate')


#Read in File
teleco<-read.csv("C://Users//james//Documents//Teleco.csv", header = TRUE)
View(teleco)

glimpse(teleco)

#Structure of DF#
str(teleco)
ncol(teleco)
nrow(teleco)
length(teleco)

dim(teleco)

#Summary of data#
summary(teleco)

#Look at missing data for all columns in df
#across() replaces the deprecated funs() helper (needs dplyr >= 1.0.0)
teleco %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

#Remove Missing Values
teleco <-na.omit(teleco)

```
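
The missing-value check and `na.omit()` step in the chunk above can be illustrated on a small data frame. This is only a sketch: `toy` and its values are invented for illustration and are not taken from the teleco data.

```r
# Toy stand-in for the teleco data frame (all values are made up)
toy <- data.frame(
  tenure = c(1, 24, NA, 60),
  MonthlyCharges = c(29.85, 56.95, 53.85, NA),
  Churn = c("No", "No", "Yes", "No")
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# na.omit() drops every row that has at least one missing value
toy_clean <- na.omit(toy)
rows_dropped <- nrow(toy) - nrow(toy_clean)
print(rows_dropped)
```

Comparing the row count before and after `na.omit()` is a quick way to confirm how many customers were removed from the analysis.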
# 1. Introduction
As technology advances, telecommunication providers compete with each other to retain customers (Amin et al., 2017). Annual customer churn within telecommunication companies is estimated to range from 20% to 40% (Ahn et al., 2006). Retaining a customer base is important for telecommunication companies in order to maintain their profit level. It is also widely noted that acquiring a new customer is more expensive than retaining an existing one (Ahn et al., 2006). Telecommunication companies should therefore focus on retaining customers, and on reducing brand switching within their existing customer base, rather than on acquiring new customers. As new services become available to customers, it is important for companies to monitor trends in the industry and to maintain the standard of service that customers expect at a fair price point. This report aims to investigate some of the causes of customer churn within the telecommunication industry.

# 2. User Story

**Primary Groups or Individuals:** The primary individuals I will be communicating with are senior management in a teleco company.

**What does audience care about?:** The audience will want to understand what is causing customers to leave their company. Teleco companies compete strongly with each other today and will want to retain as many customers as possible. They will want to know whether there is anything in their business model that they can improve to retain more customers. Is there something that customers are not satisfied with? To better understand customer churn, they need to understand their customers' behaviour and the triggers that are causing customers to leave.

**What action does audience need to take?:** Once the insights have been provided, senior management will need to assess whether they should improve or refine their business model to make it more appealing to customers and to avoid customers leaving. They will need to conduct an assessment to ensure the services they offer are up to date with industry standards and current trends in the market. For example, they may need to improve their streaming service if it is not up to industry standard, or their pricing model may be undercut by a competitor.

**Benefits:** The benefit of providing the insights to senior management is that they will be able to identify key reasons why customers are leaving and not being retained. This will allow senior management to act and make the changes necessary to avoid losing customers and to retain as many as possible. Having a stable and large customer base gives a company a good image among the public. This in turn may increase referrals from existing customers and will help ensure that the company's profit level remains stable.


**Risks:** The risk of not acting on the insights provided is that the company may lose a large number of customers. Its services may also fall far behind current industry standards, and it will not have any insight into what is required to retain its customer base and keep customers satisfied with the service. If the company continues to lose customers, it will in turn lose profit. It also risks getting a bad reputation among the public through word of mouth from customers who are leaving.

Overall, the aim of this report is to provide key insights into what is causing a lack of customer retention in a teleco company.

# 3. Data Exploration and wrangling
In the section below I have conducted some data exploration and created some simple plots of variables of interest. I have created bar graphs looking at churn, gender, contract type, phone service, internet service, movie streaming and TV streaming. I have also wrangled the data and created some data frames, which I used to look at possible trends and some of which I will use later in my visualisations. For example, I have looked at the count of customers who did or did not churn, grouped by contract type, which will be used in visualisation 1.

```{r datadesc1}
#Data Exploration and wrangling 

#Look at Churn Count
Churn_count <- sqldf('Select count(churn) as count, churn from teleco group by churn')
View(Churn_count)

ggplot(data=Churn_count, aes(x=Churn, y=count, fill = Churn)) + geom_bar(stat="identity")


#Look at gender distribution
Gender <- sqldf('Select count(gender) as count, gender from teleco group by gender')
View(Gender)

ggplot(data=Gender, aes(x=gender, y=count, fill = gender)) + geom_bar(stat="identity")


#Look at Contract Type
Contract_count <- sqldf('Select count(Contract) as count, Contract from teleco group by Contract')
View(Contract_count)

ggplot(data=Contract_count, aes(x=Contract, y=count, fill = Contract)) + geom_bar(stat="identity")


#Look at Phone Service
Phone_service <- sqldf('Select count(PhoneService) as count, PhoneService from teleco group by PhoneService')
View(Phone_service)

ggplot(data=Phone_service, aes(x=PhoneService, y=count, fill = PhoneService)) + geom_bar(stat="identity")

#Look at Internet Service
Internet_service <- sqldf('Select count(InternetService) as count, InternetService from teleco group by InternetService')
View(Internet_service)

ggplot(data=Internet_service, aes(x=InternetService, y=count, fill = InternetService)) + geom_bar(stat="identity")


#Look at Movie Service
Movie_service <- sqldf('Select count(StreamingMovies) as count, StreamingMovies from teleco group by StreamingMovies')
View(Movie_service)

ggplot(data=Movie_service, aes(x=StreamingMovies, y=count, fill = StreamingMovies)) + geom_bar(stat="identity")


#Look at TV Service
TV_service <- sqldf('Select count(StreamingTV) as count, StreamingTV from teleco group by StreamingTV')
View(TV_service)

ggplot(data=TV_service, aes(x=StreamingTV, y=count, fill = StreamingTV)) + geom_bar(stat="identity")



#Look at Churn for Contract
Churn_contract <- sqldf('Select count(churn) as count, churn, contract from teleco  group by contract, churn order by churn, count desc')
View(Churn_contract)

df <- subset(Churn_contract, Churn_contract$Churn == 'Yes')
View(df)

#Look at Monthly Charges
Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn from teleco group by MonthlyCharges,churn order by churn, count desc')
View(Charges)


#Look at Payment Method
Payment <- sqldf('Select count(churn) as count, churn, PaymentMethod from teleco  group by PaymentMethod, churn order by churn, count desc')
View(Payment)

#Look at Internet Service
Internet <- sqldf('Select count(churn) as count, churn, InternetService from teleco group by InternetService, churn order by churn, count desc')
View(Internet)

#Look at Tenure
Tenure <- sqldf('Select count(churn) as count, Tenure, churn from teleco group by Tenure,churn order by churn, count desc')
View(Tenure)

#Look at Streaming TV
Tv <- sqldf('Select count(churn) as count, StreamingTv, churn from teleco group by StreamingTv,churn order by churn, count desc')
View(Tv)

#Look at Streaming Movies
Movies <- sqldf('Select count(churn) as count, StreamingMovies, churn from teleco group by StreamingMovies,churn order by churn,count desc')
View(Movies)

#Look at churn by gender (the Gender data frame was already created above)
Gender_churn <- sqldf('Select count(churn) as count, gender, churn from teleco group by gender, churn')
View(Gender_churn)

```
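
The grouped counts above are produced with `sqldf`. As a rough sketch of the same idea in `dplyr`, the count of customers by contract type and churn can be written with `count()`. The `toy` data frame below is invented for illustration; only the column names mirror the teleco data.

```r
library(dplyr)

# Toy stand-in for the teleco data frame (values are made up)
toy <- data.frame(
  Contract = c("Month-to-month", "Month-to-month", "Month-to-month",
               "One year", "Two year", "Two year"),
  Churn = c("Yes", "Yes", "No", "No", "No", "Yes")
)

# One row per Contract/Churn combination, with the number of customers in each,
# ordered like the sqldf query: by Churn, then by descending count
churn_contract_toy <- toy %>%
  count(Contract, Churn, name = "count") %>%
  arrange(Churn, desc(count))

print(churn_contract_toy)
```

Either approach gives a long-format table with one row per group, which is the shape `geom_bar(stat = "identity")` expects.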

# 3. Visualisations

## 3.1 Visualisation 1: Relationship between contract type and customer churn
From the data exploration there appears to be a relationship between contract type and customer retention. Customers who are on a monthly contract are leaving more often than customers on a one year or two year contract. This suggests to me that there may be a relationship between monthly fees and customer retention, which is something that will be investigated with a visualisation. The first visualisation presented to senior management will be a bar graph showing that most customers who have left were on a monthly contract rather than a one year or two year contract. In my previous iterations, which can be viewed further on in this report, I experimented with the position of the bars. I decided to use a bar graph as I found that it was the best method to portray this data. The bar graph effectively shows the large difference between the contract types. I had originally created a static plot, but for my final iteration I decided to make the plot interactive. This will allow senior management to interact with the visualisation and filter the data however they wish to view it.
```{r datadesc2}
#Contract Plot
View(Churn_contract)

Churn_contract2 <- melt(Churn_contract)

View (Churn_contract2)


#Contract Plot Final Iteration (Static)
Contract_Plot <- ggplot(data=Churn_contract, aes(x=Contract, y=count, fill = Churn)) + geom_bar(stat="identity", position= 'dodge') + ylab('Churn Count') + ggtitle('Relationship Between Customer Churn and Contract Type')


#Contract Plot Final Iteration (Interactive)
ggplotly(Contract_Plot) %>%
  layout(
    title = 'Relationship Between Churn and Contract Type'
  )


```
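
Raw counts can be influenced by how many customers are on each contract type, so it may also help to look at the share of customers who churned within each contract. The following is a minimal sketch using base R on invented data; `toy` is a hypothetical stand-in, not the teleco data.

```r
# Toy stand-in for the teleco data frame (values are made up)
toy <- data.frame(
  Contract = c("Month-to-month", "Month-to-month", "Month-to-month",
               "One year", "Two year", "Two year"),
  Churn = c("Yes", "Yes", "No", "No", "No", "Yes")
)

# Cross-tabulate contract type against churn
contract_table <- table(toy$Contract, toy$Churn)

# margin = 1 converts each row (contract type) to proportions that sum to 1
contract_rates <- prop.table(contract_table, margin = 1)

print(round(contract_rates, 2))
```

The "Yes" column of `contract_rates` is then the churn rate for each contract type, which can be compared directly across groups of different sizes.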


## 3.2 Visualisation 2: Relationship between Monthly Charges and Customer Churn
Senior management can see from visualisation 1 that most customers who are leaving are on a monthly contract. The next visualisation presented to senior management will be a box plot showing that customers who are leaving are also paying higher monthly fees. This could be an indication to senior management that they need to assess their pricing plan. I experimented with different plot types, such as a line graph and a scatter plot, which can be viewed in my previous iterations further on in the report. I used a box plot as I found that it was the best plot to convey the message, and it effectively shows the difference in monthly fees between customers who are leaving and customers who are staying with the company. The box plot is easy to read and understand. However, senior management will now want to understand why these customers are paying so much more in monthly charges than other customers. This will be explored in the next visualisation.
```{r datadesc3}


#Monthly Charges Final Iteration
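#Churn is mapped to x as a variable (not the constant string "Churn") so each group gets its own labelled box
ggplot(teleco, aes(x = Churn, y = MonthlyCharges, fill = Churn)) + geom_boxplot() + xlab('Churn') + ggtitle('Relationship Between Monthly Charges and Customer Churn')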


```
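
The comparison shown by the box plot can also be summarised numerically. The sketch below computes the median and interquartile range of monthly charges for each churn group with base R; the `toy` values are invented for illustration and are not results from the teleco data.

```r
# Toy stand-in for the teleco data frame (values are made up)
toy <- data.frame(
  Churn = c("No", "No", "No", "No", "Yes", "Yes", "Yes"),
  MonthlyCharges = c(20, 30, 40, 50, 70, 80, 90)
)

# Median monthly charge per churn group (the centre line of each box)
median_charges <- tapply(toy$MonthlyCharges, toy$Churn, median)

# Interquartile range per churn group (roughly the height of each box)
iqr_charges <- tapply(toy$MonthlyCharges, toy$Churn, IQR)

print(median_charges)
print(iqr_charges)
```

Reporting these figures alongside the plot gives senior management concrete numbers to go with the visual difference.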

## 3.3 Visualisation 3: Comparison of Monthly charges for Services Offered
From the previous visualisation presented to senior management we can identify that customers who are leaving are paying high monthly fees. Senior management will want to question why these customers are paying high monthly fees, so the aim of the next visualisation is to look at how much difference there is in the cost of the services they provide. It investigates whether there is a big difference in the monthly fees for the services offered. The services compared in this visualisation are TV streaming, movie streaming and internet service. I decided to continue to use box plots as they are effective in communicating whether there is a large difference between the monthly fees of certain services. I also decided to arrange the plots together in order to compare them. From the visualisation senior management can see that there is a wide gap in monthly fees for the high end services they provide.

```{r}
#compare Internet Type and monthly charges

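#InternetService is included in the group by so each row carries the service value of its own group
Internet_Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn, InternetService from teleco group by MonthlyCharges, churn, InternetService order by churn, count desc')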
View(Internet_Charges)


Internet_plot <-ggplot(Internet_Charges, aes(y= MonthlyCharges, x= " ", fill = InternetService)) + geom_boxplot()+  xlab(" Churn") + scale_fill_ordinal() + theme_minimal()

#compare Streaming TV and monthly charges

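#StreamingTV is included in the group by so each row carries the service value of its own group
TV_Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn, StreamingTV from teleco group by MonthlyCharges, churn, StreamingTV order by churn, count desc')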
View(TV_Charges)



Tv_plot <- ggplot(TV_Charges, aes(y= MonthlyCharges, x= " ", fill = StreamingTV)) + geom_boxplot()+  xlab(" Churn") + scale_fill_ordinal() + theme_minimal()


#compare Streaming Movies and monthly charges

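#StreamingMovies is included in the group by so each row carries the service value of its own group
Movie_Charges <- sqldf('Select count(churn) as count, MonthlyCharges, churn, StreamingMovies from teleco group by MonthlyCharges, churn, StreamingMovies order by churn, count desc')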
View(Movie_Charges)



Movie_plot <- ggplot(Movie_Charges, aes(y= MonthlyCharges, x= " ", fill = StreamingMovies)) + geom_boxplot()+  xlab(" Churn") + scale_fill_ordinal() + theme_minimal()


#Final Iteration

charges_comparison <- ggarrange( Tv_plot, Movie_plot,Internet_plot, ncol = 2, nrow = 2)

annotate_figure(charges_comparison,top = text_grob("Comparing Service Monthly Charges", color = "black", face = "bold", size = 14))




```
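
The box plots above are drawn from aggregated query results, so each distinct charge value counts once regardless of how many customers pay it. As an alternative sketch, a box plot can be built directly from customer-level rows so that every customer contributes to the distribution. The `toy` data frame and the name `internet_plot_raw` are hypothetical and used only for illustration.

```r
library(ggplot2)

# Toy customer-level rows (values are made up)
toy <- data.frame(
  InternetService = rep(c("DSL", "Fiber optic", "No"), each = 3),
  MonthlyCharges = c(45, 55, 60, 80, 95, 105, 19, 20, 25)
)

# One box per internet service, built from one row per customer
internet_plot_raw <- ggplot(toy, aes(x = InternetService, y = MonthlyCharges,
                                     fill = InternetService)) +
  geom_boxplot() +
  xlab("Internet service") +
  ylab("Monthly charges") +
  theme_minimal()

print(internet_plot_raw)
```

Mapping the service to the x axis as well as the fill places a label under each box, so the plot can be read without relying on the legend alone.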


## 3.4 Visualisation 4: Relationship Between Churn and Tenure
The final visualisation presented to senior management looks at how tenure is related to customer churn. There tends to be a trend that the longer customers are retained, the less likely they are to stop using a company's service. This can be seen in the line graph below. I have chosen an animated line graph to represent the data. Each line represents a churn type (yes and no). The line graph was chosen as there are two numeric variables being represented (tenure and count of customers). From my iterations, which can be seen below, I found the line graph to be the best method to represent this data. From the graph it can be identified that the highest number of customers that churned were only with the company for 10 months. As customers' tenure with the company increased, their likelihood of churning decreased. This should prompt senior management to put more focus on customers who have joined within the past few months and on retaining these customers.
```{r}


p <- teleco %>%
  group_by(tenure, Churn) %>%
  summarise(count = n(), .groups = 'drop') %>%
  ggplot(aes(tenure, count, color = Churn)) +
  geom_line(size = 2) + ggtitle('Relationship Between Tenure And Customer Churn') + theme(plot.title = element_text(hjust = 0.5))

p + geom_point() + transition_reveal(tenure)

```
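
The tenure at which the most customers churned can also be read off the counts directly rather than from the plot. The sketch below does this with base R on invented data; `toy` is a hypothetical stand-in, and its peak is not a result from the teleco data.

```r
# Toy stand-in for the teleco data frame (values are made up)
toy <- data.frame(
  tenure = c(1, 1, 1, 2, 2, 12, 12, 24, 60, 60),
  Churn = c("Yes", "Yes", "No", "Yes", "No", "No", "Yes", "No", "No", "No")
)

# Count customers for every tenure/Churn combination
tenure_counts <- as.data.frame(
  table(tenure = toy$tenure, Churn = toy$Churn),
  responseName = "count",
  stringsAsFactors = FALSE
)
tenure_counts$tenure <- as.numeric(tenure_counts$tenure)

# Keep only churned customers and find the tenure with the largest count
churned <- subset(tenure_counts, Churn == "Yes")
peak_tenure <- churned$tenure[which.max(churned$count)]

print(peak_tenure)
```

Running the same steps on the full data frame gives a figure that can be checked against the peak visible in the line graph.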

# 4. Previous Iterations
## 4.1 Visualisation 1 Iterations: Relationship between contract type and customer churn
In this section I will show previous iterations of the plots created for this report. The previous iterations for visualisation 1 are shown first. I experimented with flipping the position of the bars and also with stacking them. However, I decided to group the bars in the end, as I found that stacking made it more difficult to compare the sizes for each churn type. Grouping the bars remedied this. I had also created a static plot that I felt looked good, but in order to add more interactivity I decided to make the final plot interactive.
```{r}

#Contract Plot Iteration 1
ggplot(data=Churn_contract, aes(x=count, y=Contract, fill = Churn)) + geom_histogram(stat="identity") 

#Contract Plot Iteration 2
ggplot(data=Churn_contract, aes(x=count, y=Contract, fill = Churn)) + geom_histogram(stat="identity", position = 'dodge') 

#Contract Plot Iteration 3 (Static)
ggplot(data=Churn_contract, aes(x=Contract, y=count, fill = Churn)) + geom_bar(stat="identity", position= 'dodge') + ylab('Churn Count') + ggtitle('Relationship Between Customer Churn and Contract Type')


```

## 4.2 Visualisation 2 Iterations: Relationship between Monthly Charges and Customer Churn
Previous iterations for visualisation 2 can be seen below. I experimented with different plot types, such as a line graph, a scatter plot and a frequency polygon. However, I felt these plots did not represent the data in a clean manner, and the box plot was much better at conveying the message to stakeholders. I also chose the box plot as I wanted to have a diverse range of plots in the report. The box plot allows senior management to quickly identify median values and the difference in monthly charges being paid by customers who are leaving the company.



```{r}
#Monthly Charges Plot Iteration 1
View(Charges) 

ggplot(Charges,aes(x = MonthlyCharges, y=count, color = Churn))+ geom_line(size=2) + ylab('Churn Count') +ggtitle('Relationship Between Monthly Charges and Customer Churn')

#Monthly Charges Iteration 2
ggplot(Charges,aes(x = MonthlyCharges, y=count, color = Churn))+ geom_point() + ylab('Churn Count') +ggtitle('Relationship Between Monthly Charges and Customer Churn')

#Monthly Charges Iteration 3
ggplot(Charges,aes(x = MonthlyCharges, color = Churn))+ geom_freqpoly(size=2) + ylab('Churn Count') +ggtitle('Relationship Between Monthly Charges and Customer Churn')


```

## 4.3 Visualisation 3 Iterations: Comparison of Monthly charges for Services Offered
Previous iterations for visualisation 3 can be seen below. I had to experiment with the number of rows and columns. As can be seen in my previous iterations, the plots were overlapping each other and were not presented in a tidy manner. I also experimented with including a title for each of the plots, but these titles also overlapped each other. I decided to remove the individual titles and include just one title for the overall plot, which made it look neater, and I felt that the legend for each plot conveyed what the monthly charges were for.


```{r}

#Iteration 1
grid.arrange(Internet_plot, Tv_plot, Movie_plot, nrow=2, widths = c(2, 1, 1),layout_matrix = rbind(c(1, 2, NA),c(3, 3, 4)))

#Iteration 2
grid.arrange(Internet_plot, Tv_plot, Movie_plot, nrow=2, widths = c(2, 1, 1))


```

## 4.4 Visualisation 4 Iterations: Relationship between Tenure and Customer Churn
Previous iterations for visualisation 4 can be seen below. I first experimented with different plot types such as a histogram and a scatter plot. I found that the frequency polygon plot looked the cleanest and conveyed the message most effectively. However, when I attempted to animate this plot it did not work very effectively. I experimented with transition functions such as transition_states, but the plot did not animate correctly. Therefore, instead of using a frequency polygon plot, I switched to a line graph, which was used in my final visualisation and could be animated correctly.

```{r}

#Tenure Plot Iteration 1
ggplot(data=Tenure, aes(x=tenure, y=count, fill = Churn)) + geom_histogram(stat="identity") 


#Tenure Plot Iteration 2
ggplot(Tenure,aes(x = tenure, y=count, color = Churn))+ geom_point() + ylab('Churn Count') +ggtitle('Relationship Between Tenure and Customer Churn')

#Tenure Plot Iteration 3
Tenure_Iteration4 <- ggplot(teleco,aes(x = tenure, color = Churn))+ geom_freqpoly(size=2)+theme_minimal()

Tenure_Iteration4

Tenure_Iteration4 + transition_reveal(tenure)

```


# References

1) Ahn, J. H., Han, S. P., & Lee, Y. S. (2006). Customer churn analysis: Churn determinants and mediation effects of partial defection in the Korean mobile telecommunications service industry. Telecommunications policy, 30(10-11), 552-568.

2) Amin, A., Anwar, S., Adnan, A., Nawaz, M., Alawfi, K., Hussain, A., & Huang, K. (2017). Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing, 237, 242-254.