library(tidyverse)
library(kableExtra)
game_churn <- read_csv("game_churn.csv")
game_churn <- game_churn %>%
  rename(churn = Churn,
         level = Level,
         start_moves = `Start Moves`,
         extra_moves = `Extra Moves`,
         used_moves = `Used Moves`,
         buy_more_moves = `Buy More Moves`,
         used_coins = `Used Coins`,
         end_type = `End Type`,
         play_time_sec = `Play Time Sec`,
         rolling_losses = `Rolling Losses`,
         scores = Scores,
         datetime = Datetime,
         hour = Hour,
         day = Day)
Game Churn
Introduction
Churn is the process by which customers stop using a company’s services or products. Our goal is to find out how many players churn and to measure how churn correlates with different parts of our game, so that we can see what may be causing it. The company can then change the game to keep players playing.
Setup
First we need to set up our script. We load the libraries and create our object from the data set. Then, purely for ease of use, we rename the variables to snake case.
Analysis
In our analysis we use correlation to narrow down what is causing our churn, working from the statistics provided in the data set.
Churn Rate Table
Our first step is to find out the rate at which people stop playing our game. We tackle this with a simple table showing the percentage of players that stay in our game and the percentage that don’t. Here Yes means the player has churned and No means the player has not churned.
churn_rate_table <- game_churn %>%
  group_by(churn) %>%
  summarise(count = n()) %>%
  mutate(churn_percentage = round(count / sum(count) * 100, 1)) %>%
  arrange(desc(count))
knitr::kable(churn_rate_table,
             align = "lrr",
             col.names = c("Churn", "Number of Players", "Percentage of Churn"),
             caption = "Churn Rate",
             table.attr = 'data-quarto-disable-processing = "true"') %>%
  kable_styling(full_width = FALSE) %>%
  row_spec(2, color = "white", background = "green") %>%
  row_spec(1, color = "white", background = "red")
| Churn | Number of Players | Percentage of Churn |
|---|---|---|
| No | 8896 | 65.7 |
| Yes | 4652 | 34.3 |
Findings
We can see that the churn percentage is 34.3%. This means that 34.3% of players stop playing our game. That is more than one third of our player base, which is a large share, so we need to find the cause and address it.
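To get a feel for how precisely this rate is estimated, an interval can be computed from the two counts in the table. This is a minimal sketch, assuming each row of the data set is one independent player, which the table labels suggest but the data set itself may not guarantee. The variable names are ours.

```r
# Counts taken from the churn rate table
churned  <- 4652
retained <- 8896
total    <- churned + retained

# Share of players that churned, in percent
churn_rate <- churned / total * 100

# 95% confidence interval for the proportion, in percent (base R, stats::prop.test)
ci <- prop.test(churned, total)$conf.int * 100

round(churn_rate, 1)
round(ci, 1)
```

A narrow interval would suggest that the headline rate is stable enough to compare against the per-group rates later in the report.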
Correlation Between Churn and Other Variables Heatmap
Instead of blindly going and guessing what variables may have something to do with our high churn rate, we are going to create a correlation heatmap to find out which variables have the strongest connection with churn. The way our correlation heatmap works is:
- The closer the number is to 1, the stronger the positive correlation
- The closer the number is to -1, the stronger the negative correlation
- The closer the number is to 0, the weaker the correlation
A positive correlation means that as the variable goes up, so does churn, while a negative correlation means that as the variable goes up, churn goes down.
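Because churn is a Yes/No value, it has to be turned into 1 and 0 before a correlation can be computed. The sketch below shows the idea on a small made-up data frame, using only base R. The values and the name `toy` are hypothetical and are not taken from our data set.

```r
# Made-up example data, for illustration only
toy <- data.frame(
  churn = c("Yes", "No", "No", "Yes", "No", "No"),
  level = c(1, 5, 8, 2, 9, 4)
)

# Encode churn as 1 (churned) / 0 (stayed)
toy$churn_num <- ifelse(toy$churn == "Yes", 1, 0)

# Pearson correlation between the 0/1 indicator and a numeric variable
r <- cor(toy$churn_num, toy$level)

# In this toy data the churned players sit at the lowest levels, so r is negative
r
```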
game_churn %>%
  mutate(Churn_Num = ifelse(churn == "Yes", 1, 0)) %>%
  mutate(end_win = ifelse(end_type == "Win", 1, 0),
         end_lose = ifelse(end_type == "Lose", 1, 0),
         end_quit = ifelse(end_type == "Quit", 1, 0),
         end_restart = ifelse(end_type == "Restart", 1, 0)) %>%
  select(Churn_Num, where(is.numeric)) %>%
  cor(use = "complete.obs") %>%
  as.data.frame() %>%
  rownames_to_column("Var1") %>%
  pivot_longer(-Var1, names_to = "Var2", values_to = "Correlation") %>%
  filter(Var1 == "Churn_Num", Var2 != "Churn_Num") %>%
  mutate(Var2 = reorder(Var2, Correlation)) %>%
  ggplot(aes(x = Var2, y = Var1, fill = Correlation)) +
  geom_tile() +
  geom_text(aes(label = round(Correlation, 2)), size = 3) +
  scale_fill_gradient2(low = "red", mid = "yellow", high = "green", midpoint = 0, limits = c(-1, 1)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.title = element_blank(),
        axis.text.y = element_blank()) +
  scale_x_discrete(labels = c(level = "Level",
                              start_moves = "Start Moves",
                              extra_moves = "Extra Moves",
                              used_moves = "Moves Used",
                              buy_more_moves = "Moves Bought",
                              play_time_sec = "Play Time",
                              rolling_losses = "Rolling Losses",
                              used_coins = "Coins Spent",
                              scores = "Score",
                              hour = "Hour of Day",
                              end_win = "Win Level",
                              end_lose = "Lose Level",
                              end_quit = "Quit Level",
                              end_restart = "Restart Level")) +
  labs(title = "Correlation of Variables with Churn")
Findings
We found that the strongest correlation is a negative one between level and churn. This is to be expected, since players are less likely to churn the more time they have invested in the game, while newer players are much more likely to churn. The same goes for score. What is interesting is that start moves and used moves have the second strongest negative correlation. We assume that these variables reflect the skill level of our players (more starting moves and more moves used suggest more knowledge of the game), so churn likely drops as skill goes up, but we will need a separate graphic to check that. Most intriguing of all is the rolling losses variable. From the correlation it seems that churn drops as losses accumulate. One would expect frustration from playing to push churn much higher, but numerous losses appear to bring motivation rather than frustration to the players. Lastly, the end type. Winning a level is associated with lower churn, while losing, quitting or restarting a level is associated with higher churn, as expected. This may seem to go against our findings on rolling losses, but it is important to remember that one loss and many losses are entirely different experiences.
Churn Based on Playtime and Level
In this graphic we delve deeper into the connection between experienced players and churning. We are using a bubble plot because there is far too much data for a scatterplot to be useful. This way players are combined into bubbles sized by how many of them there are. We are looking to confirm our suspicion that churn goes down as level and time spent go up.
bubble_data <- game_churn %>%
  group_by(level, churn) %>%
  # .groups = "drop" returns an ungrouped result and avoids the regrouping message
  summarise(avg_play_time = mean(play_time_sec), count = n(), .groups = "drop")
ggplot(bubble_data, aes(x = level, y = avg_play_time, size = count, color = churn)) +
  geom_point(alpha = 0.4) +
  scale_size(range = c(2, 12)) +
  scale_y_continuous(limits = c(0, 2000)) +
  labs(title = "Play Time vs Level by Churn",
       x = "Level",
       y = "Average Play Time (sec)",
       size = "Number of Players",
       color = "Churn") +
  theme_minimal()
Findings
From the chart we can see that most of our churning players are at the lower levels with lower average play time. This confirms that players who spend more time in the game and reach higher levels are less likely to churn. At the same time it shows us that most players are at lower levels, which may speak to the difficulty of the game.
Churn Based on Rolling Losses
Earlier we found that the correlation between rolling losses and churn is negative, which was quite an interesting outcome. In this histogram we will find out a bit more about it. The bars overlap, so the more times the green bar fits into the red bar, the lower the percentage of churn.
ggplot(data = game_churn) +
  geom_histogram(mapping = aes(x = rolling_losses, fill = churn),
                 position = "identity", binwidth = 3, alpha = 0.5) +
  scale_x_continuous(limits = c(0, 50)) +
  scale_y_continuous(limits = c(0, 2200)) +
  # y = NULL removes the axis title
  labs(title = "Overlapping Histogram of Rolling Losses by Churn",
       x = "Rolling Losses",
       y = NULL)
Findings
We can clearly see that the churn bar shrinks more quickly than the no-churn bar as rolling losses grow. This supports our earlier hypothesis that players are less likely to stop playing the more losses in a row they have, pointing to the possibility that losses motivate players to beat the difficult levels rather than quit. Another possibility is that the more difficult levels, which make players lose more, usually come late in the game (higher levels), and from our earlier analysis we know that players at higher levels are less likely to quit the game for good.
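One way to separate these two explanations is to compare churn rates by rolling losses within level bands, so that players at similar levels are compared with each other. The following is a minimal sketch on made-up data. The cut-offs (level 10, 5 losses), the band names and the `toy` data frame are all assumptions chosen for illustration.

```r
# Made-up example data, for illustration only
toy <- data.frame(
  level          = c(1, 2, 3, 2, 1, 3, 20, 22, 25, 21, 24, 23),
  rolling_losses = c(0, 1, 0, 6, 7, 8, 1, 0, 2, 9, 12, 10),
  churn          = c("Yes", "No", "Yes", "Yes", "No", "Yes",
                     "No", "No", "Yes", "No", "Yes", "No")
)

# Hypothetical cut-offs that split players into coarse bands
toy$level_band <- ifelse(toy$level <= 10, "low level", "high level")
toy$loss_band  <- ifelse(toy$rolling_losses <= 5, "0-5 losses", "6+ losses")

# Share of churned players in every level band / loss band combination
rate_by_band <- tapply(toy$churn == "Yes",
                       list(toy$level_band, toy$loss_band),
                       mean)
round(rate_by_band, 2)
```

If churn still falls with more losses inside each level band, level alone does not explain the pattern. If the difference disappears, the level explanation becomes more plausible.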
Churn Based on End Type
This table shows what percentage of churn our game experiences at each end type. The goal is to find out which end type makes our players want to leave the most and, ideally, to reduce the number of times players end a level that way.
endtype_churn_table <- game_churn %>%
  count(churn, end_type) %>%
  group_by(end_type) %>%
  mutate(percentage = round(n / sum(n) * 100, 1)) %>%
  ungroup() %>%
  select(-n) %>%
  pivot_wider(
    names_from = end_type,
    values_from = percentage
  ) %>%
  arrange(churn)
knitr::kable(
  endtype_churn_table,
  align = "lrrrr",
  caption = "Churn % by End Type",
  col.names = c("Churn", "Lose %", "Quit %", "Restart %", "Win %"),
  table.attr = 'data-quarto-disable-processing="true"') %>%
  kable_styling(full_width = FALSE) %>%
  row_spec(2, color = "white", background = "green") %>%
  row_spec(1, color = "white", background = "red")
| Churn | Lose % | Quit % | Restart % | Win % |
|---|---|---|---|---|
| No | 63.8 | 47.1 | 68.3 | 70.4 |
| Yes | 36.2 | 52.9 | 31.7 | 29.6 |
Findings
The highest churn rate, at more than 50%, occurs when players quit a level. Quitting is one of the steps towards churning, so that is to be expected, but the percentage is still very high, and we need to make players more likely to stay when they are struggling with the game. The other end types are around the average. Losing is slightly above the average churn rate, but that is to be expected.
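The percentages in the table are computed within each end type, so every column adds up to 100. The same calculation can be sketched in base R with `table()` and `prop.table()`. The data below are made up and only show the mechanics.

```r
# Made-up example data, for illustration only
toy <- data.frame(
  end_type = c("Win", "Win", "Win", "Lose", "Lose",
               "Quit", "Quit", "Quit", "Quit", "Restart"),
  churn    = c("No", "No", "Yes", "No", "Yes",
               "Yes", "Yes", "No", "Yes", "No")
)

# Cross-tabulate churn (rows) against end type (columns)
counts <- table(churn = toy$churn, end_type = toy$end_type)

# margin = 2 turns counts into shares within each column (end type)
pct <- round(prop.table(counts, margin = 2) * 100, 1)
pct
```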
Churn Based on Score
A boxplot showing churn based on score will show us whether players with a higher score are more likely to stay and players with a lower score are more likely to churn.
ggplot(data = game_churn) +
  geom_boxplot(mapping = aes(x = churn, y = scores), outlier.colour = "red") +
  labs(title = "Churn Based on Score",
       x = "Churn",
       y = "Scores")
Findings
We can see exactly what we expected: players with a higher score are more likely to keep playing and players with a lower score are more likely to churn, although the difference in the medians is small. This points to the conclusion that, while higher-scoring players are less likely to churn, score is not a key indicator of churn and is not the main problem making players churn.
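Whether the gap between the two medians is more than noise can be checked with a rank-based test. This is a minimal sketch, assuming a Wilcoxon rank-sum test is an acceptable choice here (the report itself does not run one) and using made-up scores.

```r
# Made-up example data, for illustration only
toy <- data.frame(
  churn  = c("No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes"),
  scores = c(520, 610, 480, 700, 450, 590, 400, 530)
)

# Median score per churn group, the same statistic the boxplot draws as its centre line
medians <- tapply(toy$scores, toy$churn, median)

# Wilcoxon rank-sum test for a shift in scores between the two groups
test <- wilcox.test(scores ~ churn, data = toy)

medians
test$p.value
```

With the full data set, even a small difference in medians may produce a small p-value, so the size of the gap matters as much as the test result.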
Game Skill by Churn
Earlier we talked about moves used and starting moves being an indicator of a player’s skill and experience. This means that good players will be positioned along a straight line at a 45 degree angle, showing near perfect utilization of moves (no moves wasted, no moves needing to be purchased). In this scatterplot we aim to find out if this skill is connected to lower churn levels.
ggplot(data = game_churn) +
  geom_point(mapping = aes(x = start_moves, y = used_moves, colour = churn)) +
  labs(title = "Churn Based on Level of Skill",
       x = "Starting Moves",
       y = "Moves Used")
Findings
Here we can see that near perfect utilization of available moves goes together with lower churn rates: the points along the 45 degree line show minimal churn. This leads us to believe that players who feel they are good at the game are less likely to give it up.
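The visual impression can be backed with a simple number: the ratio of moves used to starting moves, which equals 1 on the 45 degree line. The sketch below uses made-up data, and the 10% tolerance for counting a player as being on the line is an arbitrary assumption.

```r
# Made-up example data, for illustration only
toy <- data.frame(
  start_moves = c(20, 25, 30, 20, 25, 30, 20, 30),
  used_moves  = c(20, 24, 30, 12, 35, 18, 21, 29),
  churn       = c("No", "No", "No", "Yes", "Yes", "Yes", "No", "Yes")
)

# Utilization of 1 means every starting move was used and none were added
toy$utilization <- toy$used_moves / toy$start_moves

# Hypothetical tolerance: within 10% of the 45 degree line
toy$on_line <- abs(toy$utilization - 1) <= 0.1

# Share of churned players on the line (TRUE) and off the line (FALSE)
churn_by_line <- tapply(toy$churn == "Yes", toy$on_line, mean)
churn_by_line
```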
Churn Based on Spending
In this bar chart we aim to find out whether in-game spending has any effect on our churn rate. This is a very important metric, as spending directly affects our revenue via microtransactions. Our goal is to maximize profits and minimize churn rate.
buy_moves_churn <- game_churn %>%
  count(buy_more_moves, churn) %>%
  group_by(buy_more_moves) %>%
  mutate(percentage = n / sum(n) * 100) %>%
  ungroup()
ggplot(buy_moves_churn, aes(x = buy_more_moves, y = percentage, fill = churn)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(round(percentage, 1), "%")),
            position = position_stack(vjust = 0.5), size = 3, color = "white") +
  ylab("Churn Percentage (%)") +
  xlab("Moves Bought") +
  ggtitle("Percentage of Churn with Bought Moves")
Findings
We can see that with no extra moves bought the churn rate is nearly identical to our overall churn rate, meaning that buying no moves likely has no effect on churn. Interestingly, when players buy one or two moves the churn rate drops sharply. This is likely because players do not want to leave right after spending money, as that would be a waste of both time and money. However, the churn rate goes up significantly when players buy three or four moves, which points to the frustration of having to buy more and more moves to finish a level. Players don’t want to be required to spend too much or too often just to finish a level, and this leads to a rise in churn.
Conclusion
We found a lot of interesting information related to churn. We now have an idea of what may be causing it and how to address it.
Findings:
- Players at lower levels are more likely to churn
- Players with more rolling losses churn less often
- Players are most likely to churn after quitting a level
- Players with higher scores churn less often
- More skilled players are less likely to churn
- Buying three or four moves goes along with a higher churn rate
Recommendations:
- Quicker progression at lower levels
- Make game challenging for experienced players (more fun)
- Make finishing a level rewarding (less likely to quit)
- Add a tutorial to teach new players strategy to become skilled
- Make levels easier to finish when one or two moves are purchased