The project calculates the probabilities of outcomes when n number of fair 6-sided dice are rolled. It then attempts to estimate the probabilities using Normal Distribution and demonstrates graphically the exact and estimated values.
library(polynom)
library(ggplot2)
library(gridExtra)
We consider sum of outcomes of n number of a 6-sided fair dice rolls. The probability of obtaing p can be expressed as the coeffecient of xp in the expression (1/6n)(1 + x + x2 + x3 + x4 + x5 + x6)n. This is the mathematical aspect of the project. Check more details here
A function is created that takes number of dice, n as the input and returns histogram of probability of outcomes. It calculates probabilities by the above mentioned method.
pd <- function(n){
s = 6
p<-polynomial(c(0,rep(1/s,s)))
k <- data.frame(as.vector(p^n)[-(1:n)])
k$no <- n:(s*n)
names(k)[1] <- "p"
ggplot(data.frame(k)) +
geom_bar(stat = "identity",aes(no, p), size = 1, fill = "orange", col = "black") +
labs(title = paste(n, ifelse(n == 1,"die roll", "dice rolls"), sep = " "),
x = "Outcome", y = "Probability")
}
Probability Distribution of a single die roll, 6 dice rolls, 9 dice rolls and 25 dice rolls are shown.
grid.arrange(pd(1), pd(6),pd(9),pd(25), ncol = 2)
These probabilities are estimated via normal distribution with the same mean and variance as that of the data.
For a single 6-sided fair dice roll, mean and variance can be calculated as follows:
E(X) = (1+2+3+4+5+6)/6 = 3.5
Var(X) = E(X2)-(E(X))2 = (12 + 22 + 32 + 42 + 52 + 62)/6 - 3.52 = 35/12
In case of n number of 6-sided fair dice rolls, since we assume these dice rolls to be independent. So, we can calculate mean and variance of sum by simply adding n times the mean and variance of single die roll case. So we get:
E(S) = 3.5n
Var(S) = (35/12)n
The previous function is modified to overlap the output histogram with normal distribution with parameters calculated above
pd <- function(n){
s = 6
p<-polynomial(c(0,rep(1/s,s)))
k <- data.frame(as.vector(p^n)[-(1:n)])
k$no <- n:(s*n)
names(k)[1] <- "p"
ggplot(data.frame(k)) +
geom_bar(stat = "identity",aes(no, p), size = 1, fill = "orange", col = "black") +
stat_function(fun = dnorm, size = 1.3, col = "blue",
args = list(mean = (s+1)/2*n, sd = sqrt(n*(s^2-1)/12))) +
labs(title = paste(n, ifelse(n == 1,"die roll", "dice rolls"), sep = " "),
x = "Outcome", y = "Probability")
}
Normal Estimations of the above probability distributions are shown
grid.arrange(pd(1), pd(6),pd(9),pd(25), ncol = 2)
Estimation is observed to get better as n increases.