Project Description

Candy Crush Saga is a hit mobile game developed by King (part of Activision|Blizzard) that is played by millions of people all around the world. In this project we will work with a real Candy Crush data set and use this data to estimate level difficulty.

1 Candy Crush Saga

Candy Crush Saga is a hit mobile game developed by King (part of Activision|Blizzard) that is played by millions of people all around the world. The game is structured as a series of levels where players need to match similar candy together to (hopefully) clear the level and keep progressing on the level map.

Candy Crush has more than 3000 levels, and new ones are added every week. That is a lot of levels! And with that many levels, it’s important to get level difficulty just right. Too easy and the game gets boring; too hard and players become frustrated and quit playing.

In this project, we will see how we can use data collected from players to estimate level difficulty.

2 The data set

The dataset we will use contains one week of data from a sample of players who played Candy Crush back in 2014. The data also comes from a single episode, that is, a set of 15 levels. It has the following columns: player_id, dt, level, num_attempts, and num_success.

The granularity of the dataset is player, date, and level. That is, there is a row for every player, day, and level recording the total number of attempts and how many of those resulted in a win.

We define the granularity of a dataset as the lowest level of detail of the observations. Here that means the combination of level, player_id, and dt. The rest of the columns are the facts recorded at that level of detail, that is, what happened for a given player, on a given day, on a given level. Sometimes we refer to the two types of columns as id columns (level, player_id, dt) and variable columns (num_attempts, num_success).

## # A tibble: 6 x 5
##   player_id                      dt         level num_attempts num_success
##   <chr>                          <date>     <int>        <int>       <int>
## 1 6dd5af4c7228fa353d505767143f5… 2014-01-04     4            3           1
## 2 c7ec97c39349ab7e4d39b4f74062e… 2014-01-01     8            4           1
## 3 c7ec97c39349ab7e4d39b4f74062e… 2014-01-05    12            6           0
## 4 a32c5e9700ed356dc8dd5bb3230c5… 2014-01-03    11            1           1
## 5 a32c5e9700ed356dc8dd5bb3230c5… 2014-01-07    15            6           0
## 6 b94d403ac4edf639442f93eeffdc7… 2014-01-01     8            8           1
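
The granularity described above can be checked directly. The following is a minimal sketch, assuming dplyr is available; the player ids are made up for illustration (the real ids are long hashes), and only the column names come from the data set.

```r
library(dplyr)

# Toy rows in the same shape as the data set (hypothetical player ids)
data <- tibble(
  player_id    = c("player_a", "player_b", "player_b", "player_c"),
  dt           = as.Date(c("2014-01-04", "2014-01-01", "2014-01-05", "2014-01-03")),
  level        = c(4L, 8L, 12L, 11L),
  num_attempts = c(3L, 4L, 6L, 1L),
  num_success  = c(1L, 1L, 0L, 1L)
)

# If (player_id, dt, level) is the granularity, no combination of the
# three id columns may appear more than once.
duplicated_ids <- data %>%
  count(player_id, dt, level) %>%
  filter(n > 1)

nrow(duplicated_ids)  # 0 when the id columns identify each row uniquely
```

Running the same check on the full data set would confirm that each row really is one player, one day, and one level.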

3 Checking the data set

Now that we have loaded the dataset, let’s count how many players we have in the sample and how many days’ worth of data we have.

## [1] "Number of players:"
## [1] 6814
## [1] "Period for which we have data:"
## [1] "2014-01-01"
## [1] "2014-01-07"
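
Counts like the ones above could be obtained roughly as follows. This is a sketch on a toy stand-in for the data (made-up ids and dates), assuming dplyr is available; it is not necessarily the code that produced the output shown.

```r
library(dplyr)

# Toy stand-in for the real data (hypothetical ids and values)
data <- tibble(
  player_id = c("player_a", "player_b", "player_b", "player_c"),
  dt        = as.Date(c("2014-01-04", "2014-01-01", "2014-01-05", "2014-01-03")),
  level     = c(4L, 8L, 12L, 11L)
)

n_players <- n_distinct(data$player_id)  # number of unique players
period    <- range(data$dt)              # first and last date in the data

print("Number of players:")
print(n_players)
print("Period for which we have data:")
print(period)
```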

Within each Candy Crush episode, there is a mix of easier and tougher levels. Luck and individual skill make the number of attempts required to pass a level different from player to player. The assumption is that difficult levels require more attempts on average than easier ones. That is, the harder a level is, the lower the probability of passing that level in a single attempt.

A simple approach is to model each attempt as a Bernoulli process: a binary outcome (you either win or lose) characterized by a single parameter \(p_{win}\), the probability of winning the level in a single attempt. This probability can be estimated for each level as:

\[p_{win} = \frac{\sum wins}{\sum attempts}\]

For example, let’s say a level has been played 10 times and 2 of those attempts ended in a win. Then the estimated probability of winning in a single attempt would be \(p_{win} = 2 / 10 = 20\%\).

Now, let’s compute the difficulty \(p_{win}\) separately for each of the 15 levels.

## # A tibble: 15 x 4
##    level attempts  wins  p_win
##    <int>    <int> <int>  <dbl>
##  1     1     1322   818 0.619 
##  2     2     1285   666 0.518 
##  3     3     1546   662 0.428 
##  4     4     1893   705 0.372 
##  5     5     6937   634 0.0914
##  6     6     1591   668 0.420 
##  7     7     4526   614 0.136 
##  8     8    15816   641 0.0405
##  9     9     8241   670 0.0813
## 10    10     3282   617 0.188 
## 11    11     5575   603 0.108 
## 12    12     6868   659 0.0960
## 13    13     1327   686 0.517 
## 14    14     2772   777 0.280 
## 15    15    30374  1157 0.0381
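
The aggregation behind a table like the one above can be sketched as follows, assuming dplyr is available. The toy rows are made up; level 1 is chosen so that it reproduces the 2-wins-in-10-attempts example.

```r
library(dplyr)

# Toy data: two levels, a few player-day rows each (values are made up)
data <- tibble(
  level        = c(1L, 1L, 1L, 2L, 2L),
  num_attempts = c(2L, 3L, 5L, 10L, 10L),
  num_success  = c(1L, 1L, 0L, 1L, 0L)
)

# Sum attempts and wins per level, then divide to estimate p_win
difficulty <- data %>%
  group_by(level) %>%
  summarise(
    attempts = sum(num_attempts),
    wins     = sum(num_success)
  ) %>%
  mutate(p_win = wins / attempts)

difficulty
```

Note that the wins and attempts are summed first and divided afterwards, which pools all attempts on a level rather than averaging per-player win rates.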

Modeling the probability of winning a level (\(p_{win}\)) as a Bernoulli process is, of course, a simplification. In reality, this probability will also depend on the skill of each player, and a player could learn from past attempts and play better every time. Including those effects is a refinement that we will leave for another occasion.

4 Plotting difficulty profile

Great! We now have the difficulty for all 15 levels in the episode. Keep in mind that, as we measure difficulty as the probability of passing a level in a single attempt, a lower value (a smaller probability of winning the level) implies a higher level difficulty.

Now that we have the difficulty of the episode we should plot it. Let’s plot a line graph with the levels on the X-axis and the difficulty (\(p_{win}\)) on the Y-axis. We call this plot the difficulty profile of the episode.
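
One way to draw such a difficulty profile is sketched below, assuming ggplot2 is available. The per-level totals are copied from the difficulty table; the axis labels and percent formatting are illustrative choices, not taken from the original plot.

```r
library(ggplot2)

# Per-level totals copied from the difficulty table
difficulty <- data.frame(
  level    = 1:15,
  attempts = c(1322, 1285, 1546, 1893, 6937, 1591, 4526, 15816,
               8241, 3282, 5575, 6868, 1327, 2772, 30374),
  wins     = c(818, 666, 662, 705, 634, 668, 614, 641,
               670, 617, 603, 659, 686, 777, 1157)
)
difficulty$p_win <- difficulty$wins / difficulty$attempts

# Line graph: levels on the X-axis, p_win on the Y-axis
profile_plot <- ggplot(difficulty, aes(x = level, y = p_win)) +
  geom_line() +
  scale_x_continuous(breaks = 1:15) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Level", y = "Probability of winning in one attempt")

profile_plot
```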

5 Spotting hard levels

What constitutes a hard level is subjective. However, to keep things simple, we could define a threshold of difficulty, say 10%, and label levels with \(p_{win} < 10\%\) as hard. It’s relatively easy to spot these hard levels on the plot, but we can make the plot more friendly by explicitly highlighting the hard levels.
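
The highlighting could look something like the sketch below, assuming ggplot2 is available. The dashed threshold line and the coloured points are one possible design, and the column name `label` is introduced here only for illustration.

```r
library(ggplot2)

# Per-level totals copied from the difficulty table
difficulty <- data.frame(
  level    = 1:15,
  attempts = c(1322, 1285, 1546, 1893, 6937, 1591, 4526, 15816,
               8241, 3282, 5575, 6868, 1327, 2772, 30374),
  wins     = c(818, 666, 662, 705, 634, 668, 614, 641,
               670, 617, 603, 659, 686, 777, 1157)
)
difficulty$p_win <- difficulty$wins / difficulty$attempts

# Flag levels below the 10% threshold as hard
threshold <- 0.10
difficulty$label <- ifelse(difficulty$p_win < threshold, "hard", "not hard")

hard_plot <- ggplot(difficulty, aes(x = level, y = p_win)) +
  geom_line() +
  geom_point(aes(colour = label), size = 3) +
  geom_hline(yintercept = threshold, linetype = "dashed") +
  scale_colour_manual(values = c("hard" = "red", "not hard" = "grey40")) +
  scale_x_continuous(breaks = 1:15) +
  scale_y_continuous(labels = scales::percent)

hard_plot

# Levels that fall below the threshold
difficulty$level[difficulty$label == "hard"]
```

With the totals from the table, levels 5, 8, 9, 12, and 15 fall below the 10% threshold.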

6 Computing uncertainty

As data scientists, we should always report some measure of the uncertainty of any numbers we provide. After all, another sample collected tomorrow might give us slightly different values for the difficulties. Here we will simply use the standard error as a measure of uncertainty:

\[\sigma_{error} \approx \frac{\sigma_{sample}}{\sqrt n}\]

Here \(n\) is the number of datapoints and \(\sigma_{sample}\) is the sample standard deviation. For a Bernoulli process, the sample standard deviation is:

\[\sigma_{sample} = \sqrt{p_{win}(1-p_{win})}\]

Therefore, we can calculate the standard error like this:

\[\sigma_{error} \approx \sqrt{\frac{p_{win}(1-p_{win})}{n}}\]

We already have all we need in the difficulty data frame! Every level has been played \(n\) times (the attempts column) and we have its difficulty \(p_{win}\). Now, let’s calculate the standard error for each level.
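
The formula translates almost directly into code. The sketch below uses base R and the per-level totals from the difficulty table, with \(n\) taken to be the number of attempts; the column name `error` is an assumption made here.

```r
# Per-level totals copied from the difficulty table
difficulty <- data.frame(
  level    = 1:15,
  attempts = c(1322, 1285, 1546, 1893, 6937, 1591, 4526, 15816,
               8241, 3282, 5575, 6868, 1327, 2772, 30374),
  wins     = c(818, 666, 662, 705, 634, 668, 614, 641,
               670, 617, 603, 659, 686, 777, 1157)
)
difficulty$p_win <- difficulty$wins / difficulty$attempts

# Standard error of a Bernoulli proportion: sqrt(p * (1 - p) / n)
difficulty$error <- sqrt(
  difficulty$p_win * (1 - difficulty$p_win) / difficulty$attempts
)

difficulty[, c("level", "p_win", "error")]
```

Levels with many attempts, such as level 15, end up with the smallest standard errors because \(n\) sits in the denominator.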

There are many ways we could calculate the uncertainty around the difficulty estimates. We could, for example, have used bootstrap estimation or Bayesian modeling. However, calculating standard errors is a very quick way of getting uncertainty estimates that in many cases are good enough.

7 Showing uncertainty

Now that we have a measure of uncertainty for each level’s difficulty estimate, let’s use error bars to show this uncertainty in the plot. We will set the length of the error bars to one standard error. The upper limit and the lower limit of each error bar should then be \(p_{win} + \sigma_{error}\) and \(p_{win} - \sigma_{error}\), respectively.
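
A sketch of the plot with error bars, assuming ggplot2 is available. The bar width and axis formatting are illustrative choices, and the column name `error` is an assumption made here.

```r
library(ggplot2)

# Per-level totals copied from the difficulty table
difficulty <- data.frame(
  level    = 1:15,
  attempts = c(1322, 1285, 1546, 1893, 6937, 1591, 4526, 15816,
               8241, 3282, 5575, 6868, 1327, 2772, 30374),
  wins     = c(818, 666, 662, 705, 634, 668, 614, 641,
               670, 617, 603, 659, 686, 777, 1157)
)
difficulty$p_win <- difficulty$wins / difficulty$attempts
difficulty$error <- sqrt(
  difficulty$p_win * (1 - difficulty$p_win) / difficulty$attempts
)

# Error bars span one standard error below and above each estimate
error_plot <- ggplot(difficulty, aes(x = level, y = p_win)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = p_win - error, ymax = p_win + error), width = 0.3) +
  geom_hline(yintercept = 0.10, linetype = "dashed") +
  scale_x_continuous(breaks = 1:15) +
  scale_y_continuous(labels = scales::percent)

error_plot
```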

8 A final metric

It looks like our difficulty estimates are pretty precise! Using this plot, a level designer can quickly spot where the hard levels are and also see if there seem to be too many hard levels in the episode.

One question a level designer might ask is: “How likely is it that a player will complete the episode without losing a single time?” Let’s calculate this using the estimated level difficulties!


The probability of two independent events happening is simply the product of the individual probabilities. So the probability of winning both level 1 and level 2 on the first attempt would be

p_win[1] * p_win[2]

To extend this to all 15 levels in the episode, you can use the prod function, which multiplies all the numbers in a vector together (that is, takes the product of all the vector elements).

## [1] 9.447141e-12
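
The calculation can be reproduced in base R from the wins and attempts listed in the difficulty table. This is a sketch under the same independence assumption as above; the vector names are chosen here for illustration.

```r
# Per-level totals copied from the difficulty table
attempts <- c(1322, 1285, 1546, 1893, 6937, 1591, 4526, 15816,
              8241, 3282, 5575, 6868, 1327, 2772, 30374)
wins     <- c(818, 666, 662, 705, 634, 668, 614, 641,
              670, 617, 603, 659, 686, 777, 1157)

p_win <- wins / attempts

# Probability of winning levels 1 and 2 on the first attempt
p_win[1] * p_win[2]

# Probability of winning all 15 levels on the first attempt,
# treating the levels as independent
p <- prod(p_win)
p
```

If a fixed-point display is preferred over scientific notation, `format(p, scientific = FALSE)` prints the same number with its leading zeros.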

The probability p gets printed out using scientific notation. So a probability of 9.447e-12 is the same as 0.000000000009447. That is, a really small probability.

9 Should our level designer worry?

Given the probability we just calculated, our level designer should not worry that a lot of players might complete the episode without losing a single time.