TODO write intro about tombola
The objective is to work out, at each point of the game, the probability that a card makes tombola, and likewise for the minor prizes (ambo, terna, etc.).
The probability that a card has a drawn number depends on two factors: the number of free numbers on the card and the number of numbers in the drum.
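The probability that a card has a “hit” (the number drawn is on the card) is:
\[P(H_{t, k}) = \frac{c-k}{d - t}\] where \(c\) is the number of numbers on the card, \(d\) is the number of numbers in the drum at the start of the game, \(k\) is the number of hits the card already has, and \(t\) is the number of draws already made.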
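From this formula it is possible to calculate the fraction of cards in class \(k\) (cards with exactly \(k\) hits) at time \(t\).
This is the sum of two components: the cards that were in class \(k-1\) at time \(t-1\) and had a hit, and the cards that were already in class \(k\) at time \(t-1\) and did not have a hit.
In math: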
\[N_{t,k} = N_{t-1, k-1}P(H_{t-1, k-1}) + N_{t-1, k}(1 - P(H_{t-1, k}))\]
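To make the recurrence concrete, here is a minimal sketch that applies it by hand for the first two draws of a standard game. It assumes that the hit probability for a draw is evaluated with the number of draws completed before it; `hit_prob`, `N1` and `N2` are illustrative names, not part of the code that follows.

```r
n_c <- 15  # numbers on a card
n_d <- 90  # numbers in the drum

# chance of a hit on the next draw, for a card with k hits after t draws
hit_prob <- function(t, k) (n_c - k) / (n_d - t)

# state after the first draw: the card either has 0 hits or 1 hit
N1 <- c(k0 = 1 - hit_prob(0, 0), k1 = hit_prob(0, 0))

# state after the second draw, one recurrence step per class
N2 <- c(
  k0 = N1[["k0"]] * (1 - hit_prob(1, 0)),
  k1 = N1[["k0"]] * hit_prob(1, 0) + N1[["k1"]] * (1 - hit_prob(1, 1)),
  k2 = N1[["k1"]] * hit_prob(1, 1)
)

round(N2, 4)
sum(N2)  # the classes cover every case, so the fractions add up to 1
```

Each class receives the cards that move up from the class below plus the cards that stay where they are, which is exactly the two-term sum in the equation.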
These equations are all that is needed to model the tombola game, so now it can be implemented in code.
Naive implementation of the functions in code
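# probability of a hit on the next draw, for a card with k hits after t draws
# (n_c and n_d are expected to exist as global variables at this stage)
p_H <- function(t, k){
  (n_c - k) / (n_d - t)
}
# fraction of cards in class k after t draws (no stopping condition yet)
N <- function(t, k){
  N(t-1, k-1) * p_H(t-1, k-1) + N(t-1, k) * (1 - p_H(t-1, k))
}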
The function N calls itself recursively, but as written the recursion never stops, so the solution is to hardcode the initial state for t = 0 into the function.
N <- function(t, k){
  if(t == 0){
    ifelse(k == 0, 1, 0) # 1 means all the cards are in class 0
  }
  else{
    N(t-1, k-1) * p_H(t-1, k-1) + N(t-1, k) * (1 - p_H(t-1, k))
  }
}
The function N now works but is very slow, because the number of times it calls itself grows exponentially with t, so we use memoization to cache the results.
In R this can be done simply with the memoise package:
library(memoise)
N <- memoise(N)
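To see why caching matters, the following sketch counts how many times the plain recursion is entered. It is only an illustration: `N_plain`, `p_hit`, `count_calls` and the global counter `calls` are hypothetical names introduced here, and the sketch assumes the standard game with 15 numbers per card and 90 in the drum.

```r
n_c <- 15  # numbers on a card
n_d <- 90  # numbers in the drum
calls <- 0 # global counter of function calls

p_hit <- function(t, k) (n_c - k) / (n_d - t)

# same recurrence as N, without any caching, plus a call counter
N_plain <- function(t, k) {
  calls <<- calls + 1
  if (t == 0) return(if (k == 0) 1 else 0)
  N_plain(t - 1, k - 1) * p_hit(t - 1, k - 1) +
    N_plain(t - 1, k) * (1 - p_hit(t - 1, k))
}

# reset the counter, evaluate N_plain(t, k) once and report the number of calls
count_calls <- function(t, k) {
  calls <<- 0
  N_plain(t, k)
  calls
}

sapply(c(5, 10, 15), function(t) count_calls(t, 0))
```

Every call with t > 0 triggers two more calls, so a single evaluation at time t costs 2^(t+1) - 1 calls. With a cache, each (t, k) pair is computed only once.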
Finally, we want the system to be flexible and allow for different values of n_c and n_d, so the functions are wrapped in a function that builds N for a given game.
# this function returns the N function for a particular game
get_game_N <- function(n_c, n_d){
  p_H <- function(t, k){
    (n_c - k) / (n_d - t)
  }
  N <- function(t, k){
    if(t == 0){
      ifelse(k == 0, 1, 0) # 1 means all the cards are in class 0
    }
    else{
      N(t-1, k-1) * p_H(t-1, k-1) + N(t-1, k) * (1 - p_H(t-1, k))
    }
  }
  N <- memoise(N) # memoise() comes from the memoise package
  return(N)
}
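As an alternative to recursion plus memoization, the same recurrence can be filled in bottom-up, one time step after another. The sketch below is not the approach used in this post; `game_table` is a hypothetical helper that returns a matrix with one row per time step from 0 to n_d and one column per class from 0 to n_c.

```r
game_table <- function(n_c, n_d) {
  # rows are t = 0..n_d, columns are k = 0..n_c (both shifted by 1 for R indexing)
  tab <- matrix(0, nrow = n_d + 1, ncol = n_c + 1)
  tab[1, 1] <- 1  # at t = 0 every card is in class 0
  for (t in 1:n_d) {
    left <- n_d - (t - 1)  # numbers still in the drum before draw t
    for (k in 0:min(t, n_c)) {
      # cards already in class k that do not get a hit
      stay <- tab[t, k + 1] * (1 - (n_c - k) / left)
      # cards in class k - 1 that get a hit
      move <- if (k > 0) tab[t, k] * (n_c - (k - 1)) / left else 0
      tab[t + 1, k + 1] <- stay + move
    }
  }
  tab
}

tab <- game_table(15, 90)
tab[3, 1:3]  # classes 0, 1 and 2 after two draws
```

This version needs no extra package and no deep recursion, at the price of always computing the whole table even when only one value is wanted.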
With this code it is possible to calculate the probabilities for each class at every time step.
library(tidyverse) # for %>%, as_tibble(), mutate(), relocate(), pivot_longer() and ggplot()
all_game_status <- function(n_c, n_d){
  N <- get_game_N(n_c, n_d)
  # one row per time step (1..n_d), one column per class (0..n_c)
  game_status <- matrix(nrow = n_d, ncol = n_c + 1)
  for (t in 1:n_d){
    for (k in 0:n_c){
      game_status[t, k+1] <- N(t, k) # k+1 because R indexes start from 1, while classes start from 0
    }
  }
  game_status <- game_status %>%
    as.data.frame() %>%
    as_tibble() %>%
    mutate(time = 1:n_d) %>%
    relocate(time)
  names(game_status) <- c("time", 0:n_c)
  game_status
}
tombola <- all_game_status(15, 90) # numbers for tombola: 15 per card, 90 in the drum
tombola
## # A tibble: 90 × 17
## time `0` `1` `2` `3` `4` `5` `6` `7` `8`
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.833 0.167 0 0 0 0 0 0 0
## 2 2 0.693 0.281 0.0262 0 0 0 0 0 0
## 3 3 0.575 0.354 0.0670 0.00387 0 0 0 0 0
## 4 4 0.476 0.396 0.114 0.0134 0.000534 0 0 0 0
## 5 5 0.393 0.415 0.161 0.0287 0.00233 0.0000683 0 0 0
## 6 6 0.323 0.416 0.205 0.0493 0.00608 0.000362 8.04e-6 0 0
## 7 7 0.266 0.404 0.243 0.0740 0.0123 0.00112 5.02e-5 8.61e-7 0
## 8 8 0.218 0.384 0.273 0.101 0.0214 0.00262 1.79e-4 6.23e-6 8.30e-8
## 9 9 0.178 0.358 0.295 0.130 0.0334 0.00517 4.79e-4 2.53e-5 6.83e-7
## 10 10 0.145 0.329 0.310 0.158 0.0480 0.00906 1.06e-3 7.60e-5 3.12e-6
## # … with 80 more rows, and 7 more variables: 9 <dbl>, 10 <dbl>, 11 <dbl>,
## # 12 <dbl>, 13 <dbl>, 14 <dbl>, 15 <dbl>
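As a cross-check on the table, the number of hits after t draws without replacement follows a hypergeometric distribution, so base R's `dhyper` should give the same fractions. This is a sketch for verification only; `hyper_status` is a hypothetical helper name.

```r
n_c <- 15  # numbers on a card
n_d <- 90  # numbers in the drum

# dhyper(x, m, n, k): x hits, m numbers on the card, n numbers not on the card, k draws
hyper_status <- function(t, k) dhyper(k, n_c, n_d - n_c, t)

round(hyper_status(t = 2, k = 0:2), 4)
# → 0.6929 0.2809 0.0262
```

These values agree with the row for time 2 in the output above.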
# tidy dataframe for easier plotting
tombola_td <- tombola %>%
  pivot_longer(-time, names_to = "numbers", values_to = "fraction") %>%
  mutate(numbers = factor(numbers, levels = numbers %>% as.integer() %>% unique()))
Here are the results for tombola, that is the probability that a card has all 15 numbers:
tombola %>%
  ggplot() +
  geom_line(aes(time, `15`)) +
  labs(y="probability of tombola")
And for all classes:
tombola_td %>%
  ggplot() +
  geom_line(aes(time, fraction)) +
  facet_wrap(~numbers)
Probability of having at least k hits at each time:
# for each position in x, sum from that position to the end of the vector
totalprob <- function(x){
  end <- length(x)
  map_dbl(seq_along(x), ~sum(x[.x:end]))
}
tombola_td %>%
  group_by(time) %>%
  # within each time step the classes are in increasing order, as totalprob requires
  mutate(cum_fraction = totalprob(fraction)) %>%
  ungroup() %>%
  ggplot() +
  geom_line(aes(time, cum_fraction)) +
  facet_wrap(~numbers)
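The "at least k" sum is a reverse cumulative sum, which can also be written in base R without purrr. The sketch below uses a hypothetical name, `totalprob_base`, and a made-up three-class vector purely to show the shape of the result.

```r
# reverse the vector, accumulate, then reverse back
totalprob_base <- function(x) rev(cumsum(rev(x)))

# made-up fractions for classes 0, 1 and 2 (illustrative values only)
fractions <- c(0.5, 0.3, 0.2)
totalprob_base(fractions)
# → 1.0 0.5 0.2
```

The first element is always the total (at least 0 hits), and the last element equals the fraction in the highest class.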
For the minor prizes the approach is the same as for tombola, but instead of a card with 15 numbers we treat a single row as the card, which in this case has 5 numbers.
tombola_minor <- all_game_status(5, 90) # 5 numbers per row, 90 in the drum
tombola_minor_td <- tombola_minor %>%
  pivot_longer(-time, names_to = "numbers", values_to = "fraction") %>%
  mutate(numbers = factor(numbers, levels = numbers %>% as.integer() %>% unique()))
# fraction of rows in each class over time
tombola_minor_td %>%
  ggplot() +
  geom_line(aes(time, fraction)) +
  facet_wrap(~numbers)
# probability that a row has at least k hits at each time
tombola_minor_td %>%
  group_by(time) %>%
  mutate(cum_fraction = totalprob(fraction)) %>%
  ungroup() %>%
  ggplot() +
  geom_line(aes(time, cum_fraction)) +
  facet_wrap(~numbers)
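The probability that a given row has won each minor prize by a certain draw can also be sketched directly with the hypergeometric upper tail from base R. Here `at_least` and `prizes` are hypothetical names, and the prize thresholds beyond ambo (2 hits) and terna (3 hits) are an assumption based on the usual prize ladder, quaterna (4) and cinquina (5).

```r
n_row <- 5   # numbers in one row of a card
n_d   <- 90  # numbers in the drum

# P(at least k hits in the row after t draws): phyper with lower.tail = FALSE gives P(X > k - 1)
at_least <- function(k, t) phyper(k - 1, n_row, n_d - n_row, t, lower.tail = FALSE)

# assumed hit thresholds for each minor prize
prizes <- c(ambo = 2, terna = 3, quaterna = 4, cinquina = 5)

# probability that one given row has reached each prize after 30 draws
sapply(prizes, at_least, t = 30)
```

This is the probability for one specific row, not the probability that somebody in the room has already claimed the prize, which also depends on the number of cards in play.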
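Calculating the total number of possible games and possible cards in tombola.
# number of ways to choose k items out of n, ignoring order
combination <- function(n, k){
  factorial(n) / (factorial(k) * factorial(n - k))
}
# number of ordered sequences of k items out of n
permutation <- function(n, k){
  factorial(n) / factorial(n - k)
}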
Total number of possible “games” (orderings of the 90 numbers):
permutation(90, 90)
## [1] 1.485716e+138
Number of possible cards (sets of 15 numbers out of 90):
combination(90, 15)
## [1] 4.579567e+16
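Base R already provides these counts, so the factorial-based helpers can be checked against `choose` and `lfactorial`. This is a small sketch for comparison; working on the log scale is a common way to avoid overflow when the numbers involved are larger than the ones used here.

```r
# number of 15-number cards, using the built-in binomial coefficient
n_cards <- choose(90, 15)

# number of orderings of the 90 numbers, computed on the log scale
log_games <- lfactorial(90)   # natural log of 90!
n_games <- exp(log_games)

n_cards
n_games
```

For 90 numbers both routes stay within the range of double precision, which is why the direct factorial formulas above work.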