Tombola

TODO write intro about tombola

Mathematical analysis

The objective is to find, at each point of the game, the probability that a card has made tombola, and likewise for the minor prizes (ambo, terna, etc.).

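The probability that the next drawn number is on a card depends on two factors: how many numbers on the card have not been drawn yet and how many numbers are left in the drum.

The probability that a card gets a “hit” (the number drawn is on the card) is:

\[P(H_{t, k}) = \frac{c-k}{d - t}\] where \(c\) is the number of numbers on the card, \(d\) is the total number of numbers in the drum at the start of the game, \(t\) is the number of numbers already drawn and \(k\) is the number of hits the card already has.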
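As a small illustration of the formula, the sketch below evaluates the hit probability for a standard game. The function name `hit_probability` and its default arguments (15 numbers per card, 90 in the drum) are assumptions made for this example only.

```r
# Hit probability for a card with k hits when t numbers have been drawn.
# The defaults (15 numbers per card, 90 in the drum) are the standard
# tombola values used later in this document.
hit_probability <- function(t, k, n_c = 15, n_d = 90) {
  (n_c - k) / (n_d - t)
}

first_draw <- hit_probability(t = 0, k = 0)    # 15 / 90
late_draw  <- hit_probability(t = 80, k = 14)  # 1 / 10
print(c(first_draw, late_draw))
```

At the start of the game a card has a 1 in 6 chance of a hit; a card missing a single number with 10 numbers left in the drum has a 1 in 10 chance.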

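From this formula it is possible to calculate \(N_{t,k}\), the fraction of cards in class \(k\) (cards with exactly \(k\) hits) at time \(t\) (after \(t\) draws).

This is the sum of two components: the cards that were in class \(k-1\) at time \(t-1\) and got a hit on draw \(t\), and the cards that were already in class \(k\) at time \(t-1\) and did not get a hit.

In math: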

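\[N_{t,k} = N_{t-1, k-1}P(H_{t-1, k-1}) + N_{t-1, k}(1 - P(H_{t-1, k}))\]

The hit probabilities are evaluated at \(t-1\) because draw \(t\) is made when \(t-1\) numbers have already left the drum.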
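As a quick check of the recurrence, the first two draws of a standard game (\(c = 15\), \(d = 90\), all cards starting in class 0) can be worked out by hand, taking the hit probability of each draw with the numbers already drawn before it. The values agree with the first rows of the table in the Result section:

\[N_{1,0} = 1 \cdot \left(1 - \frac{15}{90}\right) \approx 0.833, \qquad N_{1,1} = 1 \cdot \frac{15}{90} \approx 0.167\]

\[N_{2,1} = N_{1,0}\,\frac{15}{89} + N_{1,1}\left(1 - \frac{14}{89}\right) \approx 0.1404 + 0.1404 \approx 0.281\]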

These equations are all that is needed to model the tombola game, so now the model can be implemented in code.

Implement in code

Step 1

Naive implementation of the functions in code

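n_c <- 15 # numbers on a card
n_d <- 90 # numbers in the drum
# probability of a hit for a card with k hits when t numbers have been drawn
p_H <- function(t,k){
  (n_c - k) / (n_d - t)
}
# fraction of cards in class k at time t (no stopping condition yet)
N <- function(t, k){
  N(t-1, k-1) * p_H(t-1, k-1) + N(t-1, k) * (1 - p_H(t-1,k) )
}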

Step 2

The function N calls itself recursively, but as written the recursion never stops. The solution is to hardcode the initial state into the function for t = 0: before any number is drawn, all cards are in class 0.

N <- function(t, k){
  if(t == 0){
    ifelse(k == 0, 1, 0) # 1 means all the cards are in class 0
  }
  else{
    N(t-1, k-1) * p_H(t-1, k-1) + N(t-1, k) * (1 - p_H(t-1,k)) 
  }
}
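To get a feeling for the cost of this recursive version, the sketch below counts how many times the function is invoked. The counter `n_calls` and the name `N_counted` are hypothetical additions for this illustration; the recursion itself is the same as above.

```r
n_c <- 15  # numbers on a card
n_d <- 90  # numbers in the drum

p_H <- function(t, k) (n_c - k) / (n_d - t)

n_calls <- 0  # hypothetical counter, not part of the original code

N_counted <- function(t, k) {
  n_calls <<- n_calls + 1  # record every invocation
  if (t == 0) {
    if (k == 0) 1 else 0
  } else {
    N_counted(t - 1, k - 1) * p_H(t - 1, k - 1) +
      N_counted(t - 1, k) * (1 - p_H(t - 1, k))
  }
}

value <- N_counted(10, 1)
print(n_calls)  # 2^11 - 1 = 2047 calls for t = 10
```

Every call with t > 0 makes two further calls, so the count roughly doubles with each extra time step.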

Step 3

The function N now works but is very slow, because the number of recursive calls grows exponentially with t. We therefore use memoization to cache the results, so that each (t, k) pair is computed only once.

In R this can be done simply with the memoise function from the memoise package:

library(memoise)
N <- memoise(N)
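The idea behind memoization can be sketched in base R with an explicit cache. This is only an illustration of the concept and makes no claim about how the memoise package works internally; the names `cache` and `N_cached`, and the early return for impossible classes, are assumptions of this sketch.

```r
n_c <- 15  # numbers on a card
n_d <- 90  # numbers in the drum

p_H <- function(t, k) (n_c - k) / (n_d - t)

cache <- new.env()  # hypothetical cache: maps "t,k" keys to computed values

N_cached <- function(t, k) {
  # Shortcut added for this sketch: impossible classes hold no cards.
  if (k < 0 || k > t) return(0)
  if (t == 0) return(1)  # here k == 0: all cards start in class 0

  key <- paste(t, k, sep = ",")
  cached <- cache[[key]]
  if (!is.null(cached)) return(cached)  # reuse a stored result

  value <- N_cached(t - 1, k - 1) * p_H(t - 1, k - 1) +
    N_cached(t - 1, k) * (1 - p_H(t - 1, k))
  cache[[key]] <- value  # store for later calls
  value
}

tombola_at_end <- N_cached(90, 15)  # every number has been drawn
print(tombola_at_end)               # close to 1: every card is full
```

Each (t, k) pair is computed once and then read back from the cache, so the work grows with the number of distinct pairs rather than exponentially with t.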

Step 4

Finally, we want the code to be flexible and allow for different values of n_c and n_d, so we wrap both functions in a function that returns N for a given game.

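# this function returns the N function for a particular game
get_game_N <- function(n_c, n_d){

  # probability of a hit for a card with k hits when t numbers have been drawn
  p_H <- function(t,k){
    (n_c - k) / (n_d - t)
  }

  # fraction of cards in class k at time t
  N <- function(t, k){
    if(t == 0){
      ifelse(k == 0, 1, 0) # 1 means all the cards are in class 0
    }
    else{
      N(t-1, k-1) * p_H(t-1, k-1) + N(t-1, k) * (1 - p_H(t-1,k))
    }
  }
  N <- memoise(N) # the recursive calls inside N now use the cached version
  return(N)
}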

Result

Tombola

With all this code in place, it is possible to calculate the probabilities for each class at all time steps.

library(tidyverse) # %>%, tibble, dplyr, tidyr, purrr and ggplot2 are used below

all_game_status <- function(n_c, n_d){
  N <- get_game_N(n_c, n_d)
  game_status <- matrix(nrow = n_d, ncol=n_c+1)

  for (t in 1:n_d){
    for (k in 0:n_c){
      game_status[t, k+1] <- N(t, k) # k+1 because R indexes start from 1, while classes start from 0
    }
  }

  game_status <- game_status %>%
    as.data.frame() %>%
    as_tibble() %>%
    mutate(time = 1:n_d) %>%
    relocate(time)

  names(game_status) <- c("time", 0:n_c)

  game_status

}
tombola <- all_game_status(15, 90) # numbers for tombola: 15 per card, 90 in the drum
tombola
## # A tibble: 90 × 17
##     time   `0`   `1`    `2`     `3`      `4`       `5`      `6`      `7`     `8`
##    <int> <dbl> <dbl>  <dbl>   <dbl>    <dbl>     <dbl>    <dbl>    <dbl>   <dbl>
##  1     1 0.833 0.167 0      0       0        0          0        0       0      
##  2     2 0.693 0.281 0.0262 0       0        0          0        0       0      
##  3     3 0.575 0.354 0.0670 0.00387 0        0          0        0       0      
##  4     4 0.476 0.396 0.114  0.0134  0.000534 0          0        0       0      
##  5     5 0.393 0.415 0.161  0.0287  0.00233  0.0000683  0        0       0      
##  6     6 0.323 0.416 0.205  0.0493  0.00608  0.000362   8.04e-6  0       0      
##  7     7 0.266 0.404 0.243  0.0740  0.0123   0.00112    5.02e-5  8.61e-7 0      
##  8     8 0.218 0.384 0.273  0.101   0.0214   0.00262    1.79e-4  6.23e-6 8.30e-8
##  9     9 0.178 0.358 0.295  0.130   0.0334   0.00517    4.79e-4  2.53e-5 6.83e-7
## 10    10 0.145 0.329 0.310  0.158   0.0480   0.00906    1.06e-3  7.60e-5 3.12e-6
## # … with 80 more rows, and 7 more variables: 9 <dbl>, 10 <dbl>, 11 <dbl>,
## #   12 <dbl>, 13 <dbl>, 14 <dbl>, 15 <dbl>
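As an independent cross-check (not part of the derivation above), the number of hits after t draws without replacement follows a hypergeometric distribution, so a row of the table can be compared with base R's `dhyper`. The helper name `class_fractions` is an assumption of this sketch.

```r
n_c <- 15  # numbers on a card
n_d <- 90  # numbers in the drum

# dhyper(x, m, n, k): probability of x "white" items in k draws from an urn
# with m white and n black items. Here the card numbers play the white role.
class_fractions <- function(t) {
  dhyper(0:n_c, m = n_c, n = n_d - n_c, k = t)
}

row_2 <- class_fractions(2)
print(round(row_2[1:3], 4))  # classes 0, 1 and 2 after two draws
```

The first three values should match the row for time 2 in the table above (0.693, 0.281, 0.0262) up to rounding.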
# tidy dataframe for easier plotting
tombola_td <- tombola %>% 
  pivot_longer(-time, names_to = "numbers", values_to = "fraction") %>% 
  mutate(numbers = factor(numbers, levels = numbers %>%  as.integer() %>% unique())) 

Here are the results for tombola: the probability that a card has made tombola (all 15 numbers hit) at each time.

tombola %>% 
  ggplot() +
  geom_line(aes(time, `15`)) +
  labs(y="probability of tombola")

The same for all classes:

tombola_td %>% 
  ggplot() +
  geom_line(aes(time, fraction)) +
  facet_wrap(~numbers)

Probability of having made at least k hits at each time

# for each class, sum the fractions of that class and of all higher classes
totalprob <- function(x){
  end <- length(x)
  map_dbl(seq_along(x),~sum(x[.x:end]))
}
tombola_td %>%
  group_by(time) %>% 
  mutate(cum_fraction = totalprob(fraction)) %>% # one value per row, so mutate instead of summarize
  ggplot() +
  geom_line(aes(time, cum_fraction)) +
  facet_wrap(~numbers)
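The "at least k hits" values are tail sums over the classes. As a minimal sketch, the same quantity can be obtained in base R with a reversed cumulative sum; the name `tail_sums` and the example fractions are made up for this illustration.

```r
# Tail sums: element i is the sum of x[i], x[i + 1], ..., x[length(x)].
tail_sums <- function(x) {
  rev(cumsum(rev(x)))
}

fractions <- c(0.5, 0.3, 0.2)  # made-up fractions for classes 0, 1, 2
at_least <- tail_sums(fractions)
print(at_least)  # 1.0 0.5 0.2
```

The first element is always the sum of all fractions, since every card has at least 0 hits.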

Minor prizes

The approach is the same as for tombola, but instead of a card with 15 numbers we treat a single row as a card, in this case with 5 numbers. Class 2 then corresponds to ambo, class 3 to terna, and so on.

tombola_minor <- all_game_status(5, 90)
tombola_minor_td <- tombola_minor %>% 
  pivot_longer(-time, names_to = "numbers", values_to = "fraction") %>% 
  mutate(numbers = factor(numbers, levels = numbers %>%  as.integer() %>% unique()))
tombola_minor_td %>% 
  ggplot() +
  geom_line(aes(time, fraction)) +
  facet_wrap(~numbers)

tombola_minor_td %>%
  group_by(time) %>% 
  mutate(cum_fraction = totalprob(fraction)) %>% # one value per row, so mutate instead of summarize
  ggplot() +
  geom_line(aes(time, cum_fraction)) +
  facet_wrap(~numbers)
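For a single row, the probability of having reached a given prize can also be sketched directly with the hypergeometric tail function `phyper` from base R. This treats one row of 5 numbers in isolation, as in the text above; the function name `prize_probability` is an assumption of this example.

```r
row_size <- 5   # numbers in one row of a card
n_d      <- 90  # numbers in the drum

# P(at least `hits` numbers of a given row are among the first t draws).
# phyper(q, m, n, k, lower.tail = FALSE) returns P(X > q), hence hits - 1.
prize_probability <- function(hits, t) {
  phyper(hits - 1, m = row_size, n = n_d - row_size, k = t, lower.tail = FALSE)
}

ambo_after_2  <- prize_probability(hits = 2, t = 2)  # (5/90) * (4/89)
terna_after_3 <- prize_probability(hits = 3, t = 3)  # (5/90) * (4/89) * (3/88)
print(c(ambo_after_2, terna_after_3))
```

These are probabilities for one specific row; a full card has three rows, which this sketch does not combine.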

Extra

Calculating the total number of possible games and possible cards in tombola.

combination <- function(n,k){
  factorial(n) / (factorial(k) * factorial(n - k))
}
permutation <- function(n,k){
  factorial(n) / factorial(n-k)
}

Total number of possible “games” (orderings in which the 90 numbers can be drawn):

permutation(90, 90)
## [1] 1.485716e+138

Number of possible cards (sets of 15 numbers out of 90):

combination(90, 15)
## [1] 4.579567e+16
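Both quantities are also available through base R functions. The sketch below reproduces the two results and shows why `choose` may be preferable to the factorial formula for larger inputs; the variable names are chosen for this example.

```r
# Base R already provides both quantities.
n_games <- factorial(90)   # orderings of the 90 numbers
n_cards <- choose(90, 15)  # sets of 15 numbers out of 90
print(c(n_games, n_cards))

# 171! is too large for double precision, so a formula built on
# factorial() breaks down for larger n, while choose() still works.
pairs_of_200 <- choose(200, 2)
print(pairs_of_200)  # 19900
```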