Simulating the Monty Hall Problem

Simulate the Problem Using R

library(ggplot2)

# number of times to repeat the experiment
iter<-  10000 

# defining the doors
doors <- c("goat","goat","car")

# initialize dataframe to store the result per iteration


monte_hall <- function(iteration){
        
        contestant_door <-  sample(doors, size = iteration, replace = TRUE)
        i=1:iteration
        
        #stick_win which is equal to 1 if the contestant_door in current i is car, 0 otherwise.
        #switch_win which is equal to 0 if the contestant_door is equal to car, 1 otherwise.
        stick_win <- ifelse(contestant_door == 'car',1,0)
        switch_win <- ifelse(contestant_door == 'car',0,1)
        
        stick_prob <- cumsum(stick_win)/i
        switch_prob <- cumsum(switch_win)/i
        
        #store result in a dataframe
        results <- data.frame(i=i, contestant_door=contestant_door, 
                              stick_win=stick_win,  switch_win=switch_win,
                              stick_prob=stick_prob, switch_prob=switch_prob)
        
        return(results)
}

monte_hall_results <- monte_hall(iter)

Let’s check our simulated data. First let’s check the contestant_door column or the result of the door that the contestant pick, of 10k repetitions, 1/3 should be results to car, and 2/3 to goat. Then let’s print some of the data.

summary <- table(monte_hall_results$contestant_door)

df_summary <- data.frame(label = names(summary), count = matrix(summary))

print(df_summary)

##   label count
## 1   car  3333
## 2  goat  6667

head(monte_hall_results)

##   i contestant_door stick_win switch_win stick_prob switch_prob
## 1 1            goat         0          1       0.00        1.00
## 2 2            goat         0          1       0.00        1.00
## 3 3            goat         0          1       0.00        1.00
## 4 4             car         1          0       0.25        0.75
## 5 5             car         1          0       0.40        0.60
## 6 6             car         1          0       0.50        0.50

You can see that if the contestant play the show 10k times, 0.3333, 0.6667 of the time we got the car which is very close to the theoritical probability of winning (\(1/3\) or 33.33%).

Let’s summarize the chance of winning if we stick vs we switch.

print(paste("Chance of winning if we always stick from our first choice: ", sum(monte_hall_results$stick_win)/iter))

## [1] "Chance of winning if we always stick from our first choice:  0.3333"

print(paste("Chance of winning if we switch from our first choice: ", sum(monte_hall_results$switch_win)/iter))

## [1] "Chance of winning if we switch from our first choice:  0.6667"

Let’s plot the estimated probability of winning by sticking to the first choise vs switching to the other door by the number of iterations:

g <- ggplot(monte_hall_results, mapping = aes(x=i, y=stick_prob))
g + geom_line(color="#3333ff") +
        geom_hline(yintercept = 1/3, color = '#8080ff') +
        geom_line(aes(y=switch_prob), color= "#ff751a") +
        geom_hline(yintercept = 2/3, color = '#ff944d') +
        ylab('Est.Probability')+xlab('Iteration') +
        geom_label(data = data.frame(label = c('switch', 'stick'), i = c(10000,10000), stick_prob = c(0.75,0.25)), 
                   aes(label = label, color=label),
                   show.legend = FALSE
                   )+
        scale_color_manual(values = c("#3333ff", "#ff751a"))+
        ggtitle("Estimated Probability of Winning")

Simulating the Monty Hall Problem

Nilrey Jim D. Cornites

8/16/2020

The Monty Hall Problem

Simulate the Problem Using R