The Monty Hall Problem: Part II

This Lab is an extension of the Monty Hall Problem that was implemented in Lab 01.

In this lab we will be practicing loops constructors, and collectors.

Last week we worked to build five distinct functions, one for each step of the game:

  1. Game set-up
  2. Contestant selects first door
  3. Host opens a goat door
  4. Contestant decides to stay or switch doors
  5. Host reveals prize to determine if they won

We now have all of the pieces that we need to start analyzing our problem. Which strategy is best in this scenario? Should we stay with our first selection? Or should we switch doors when given the option?

We can answer some of these hard analytical questions either (1) with an explicit closed-form mathematical solution (i.e. a proof), or (2) using simulation to examine the effectiveness of both strategies by playing the game over and over and looking at which strategy yields higher returns.

Functions From Lab 01

Using the functions you created last week or those provided in the solutions, package them all together into a single play_game() function which executes each step of a single game in order.

create_game <- function()
{
  a.game <- sample(x=c("car", "goat", "goat"), size=3, replace=F)
  return(a.game)
}

select_door <- function()
{
  doors <- c(1,2,3)
  a.pick <- sample(doors, 1)
  return(a.pick)
}

open_goat_door <- function(a.game, a.pick)
{
  
  opened.door <- (which(a.game == "goat")[1])
  if(opened.door == a.pick){
    opened.door <- (which(a.game =="goat")[2])
  } else {
    opened.door <- opened.door
  }
  return(opened.door)
}

change_door <- function(stay=T, opened.door, a.pick)
{
  doors <- c(1,2,3)
  if (stay==TRUE){
    final.pick <- a.pick
  } 
  if (stay==FALSE){ 
    final.pick <- doors[!doors %in% c(opened.door, a.pick)] #not opened door and not a.pick
  }

  return(final.pick) 
}

determine_winner <- function(final.pick, game) 
{
  if(game[final.pick] == "car"){
    return ("WIN")
  } else if (game[final.pick] == "goat"){
    return ("LOSE")
  }
}


The play_game function

play_game <- function( )
{
  new.game <- create_game()
  first.pick <- select_door()
  opened.door <- open_goat_door( new.game, first.pick )
  final.pick.stay <- change_door( stay=T, opened.door, first.pick )
  final.pick.switch <- change_door( stay=F, opened.door, first.pick )
  outcome.stay <- determine_winner( final.pick.stay, new.game  )
  outcome.switch <- determine_winner( final.pick.switch, new.game )
  
  # game.results <- bundle the results
  # return( <<< game.results >>> )
  
  strategy <- c("stay","switch")
  outcome <- c(outcome.stay,outcome.switch)
  game.results <- data.frame( strategy, outcome,
                              stringsAsFactors=F )
  return( game.results )
}

play_game()


Adding the Game to a Loop

When running simulations for inferential purposes the general rule of thumb is they should run at least 10,000 times in order for the simulated statistics to converge close to the actual theoretical value (average proportion of wins achieved by each strategy in this case).

Building the table of results with three doors, one car:

results.df <- NULL   # collector

for( i in 1:10000 )  # iterator
{
  game.outcome <- play_game()
  # binding step
  results.df <- rbind( results.df, game.outcome )
}

table( results.df ) 
##         outcome
## strategy LOSE  WIN
##   stay   6677 3323
##   switch 3323 6677
#table as proportions 
table( results.df ) %>% 
  prop.table( margin=1 ) %>%  # row proportions
  round( 2 )
##         outcome
## strategy LOSE  WIN
##   stay   0.67 0.33
##   switch 0.33 0.67


Part One: With Five Doors, Is SWITCH still the Dominant Strategy?

The first challenge question required you build a game with 5 doors: 3 goats and 2 cars. The rest of the game is the same except instead of opening a single goat door after the contestant makes their initial selection, the host now opens a goat door AND a car door.

In this new game set-up, is SWITCH still the dominant strategy for contestants?

Using the solutions provided for Lab 01 (or your own functions if you were able to successfully implement it) wrap the game steps into a single play_game() function. Similar to the example above, return a data frame that contains results from one game.

Use a loop to build a simulation that plays the game 10,000 times. Create a table to report the results.

Building the functions:

build_doors <- function( n=5 ){ return( 1:n ) }

create_game <- function( )
{
    a.game <- sample( x=rep( c("goat","car"), c(3,2) ), 
                      size=5, replace=F )
    return( a.game )
}

select_door <- function( )
{
  doors <- build_doors() 
  a.pick <- sample( doors, size=1 )
  return( a.pick )  # number between 1 and 5
}

open_doors <- function( game, a.pick )
{
   # reveal one car and one goat
   
   doors <- build_doors()
   
   if( game[ a.pick ] == "car" )
   { 
     opened.car.door <- doors[ game == "car" & doors != a.pick ]
     goat.doors <- doors[ game != "car" ] 
     opened.goat.door <- sample( goat.doors, size=1 )
     opened.doors <- c( opened.car.door, opened.goat.door )
   }
   
   if( game[ a.pick ] == "goat" )
   { 
     opened.car.door <- sample( doors[game=="car"], size=1 )
     available.goat.doors <- doors[ game != "car" & doors != a.pick ] 
     opened.goat.door <- sample( available.goat.doors, size=1 )
     opened.doors <- c( opened.car.door, opened.goat.door )
   }
   return( opened.doors ) # two numbers
}

change_door <- function( stay=T, opened.doors, a.pick )
{
   doors <- build_doors()
   
   if( stay )
   {
     final.pick <- a.pick
   }
   if( ! stay )
   {
     available.doors <- 
        doors[  ! ( doors %in% opened.doors | doors == a.pick )  ]
     final.pick  <- sample( available.doors, size=1 ) 
   }
  
   return( final.pick )  # number between 1 and 5
}

determine_winner <- function( final.pick, game )
{
   if( game[ final.pick ] == "car" )
   {
      return( "WIN" )
   }
   if( game[ final.pick ] == "goat" )
   {
      return( "LOSE" )
   }
}


The play_game function with five doors:

play_game <- function( )
{
  new.game <- create_game()
  first.pick <- select_door()
  opened.door <- open_doors( new.game, first.pick )

  final.pick.stay <- change_door( stay=T, opened.door, first.pick )
  final.pick.switch <- change_door( stay=F, opened.door, first.pick )

  outcome.stay <- determine_winner( final.pick.stay, new.game  )
  outcome.switch <- determine_winner( final.pick.switch, new.game )
  
  strategy <- c("stay","switch")
  outcome <- c(outcome.stay,outcome.switch)
  game.results <- data.frame( strategy, outcome,
                              stringsAsFactors=F )
  return( game.results )
}

play_game()


Building the table of results with five doors, two cars:

results.df <- NULL   # collector

for( i in 1:10000 )  # iterator
{
  game.outcome <- play_game()
  # binding step
  results.df <- rbind( results.df, game.outcome )
}

table( results.df ) 
##         outcome
## strategy LOSE  WIN
##   stay   5980 4020
##   switch 6991 3009
table( results.df ) %>% 
  prop.table( margin=1 ) %>%  # row proportions
  round( 2 )
##         outcome
## strategy LOSE WIN
##   stay    0.6 0.4
##   switch  0.7 0.3


From this table, it appears that staying with your initial pick of doors has a slightly higher chance of winning (increase chance by 10%).


Part Two: Ten Doors, How many Cars and Goats?

The second challenge question asks that you create a game that allows the user to specify (1) how many goats and (2) how many cars are included in the game, also determining the number of doors the contestants can choose from. The host only opens one goat door, similar to the original game.

Using solutions for the second challenge problem on Lab-01, create a wrapper play_game() function and build a simulation loop the same as the last problem.

Create three separate simulations, each with 10 doors total but analyzing three different scenarios.

  • 1 car, 9 goats
  • 2 cars, 8 goats
  • 3 cars, 7 goats

Use your results to answer the following questions:

Q1: Is SWITCH still the dominant strategy? Report your chances of winning for each strategy for each scenario.

Q2: How much better off are you switching in the original game? How much did it improve your chances of winning?

  • Pr(win|switch)−Pr(win|stay)

Q3: How much better off are you switching in each of the three scenario with 10 doors and 1 to 3 cars?

num.goats <- 7
num.cars <- 3

create_game <- function( num.goats, num.cars )
{
    a.game <- sample( x=rep( c("goat","car"), c(num.goats,num.cars) ), 
                      size=(num.goats+num.cars), replace=F )
    return( a.game )
}

build_doors <- function( n ){ return( 1:n ) }

# we had to add a 'game' argument so we can get length(game)

select_door <- function( game )
{
  doors <- build_doors( n=length(game) ) 
  a.pick <- sample( doors, size=1 )
  return( a.pick )  # number between 1 and N
}

build_doors <- function( n ){ return( 1:n ) }

create_game <- function( num.goats, num.cars )
{
    a.game <- sample( x=rep( c("goat","car"), c(num.goats,num.cars) ), 
                      size=(num.goats+num.cars), replace=F )
    return( a.game )
}

select_door <- function( game )
{
  doors <- build_doors( n=length(game) ) 
  a.pick <- sample( doors, size=1 )
  return( a.pick )  # number between 1 and N
}

open_goat_door <- function( game, a.pick )
{
   doors <- build_doors( n=length(game) )
   doors.that.can.be.opened <- 
      doors[ ! ( game == "car" | doors == a.pick ) ]
   opened.door <- sample( doors.that.can.be.opened, size=1 )
   return( opened.door ) # number between 1 and N
}

change_door <- function( stay=T, game, opened.door, a.pick )
{
   doors <- build_doors( length(game) )
   
   if( stay )
   {
     final.pick <- a.pick
   }
   if( ! stay )
   {
     available.doors <- 
        doors[ doors != opened.door & doors != a.pick ]
     final.pick  <- sample( available.doors, size=1 ) 
   }
  
   return( final.pick )  # number between 1 and N
}

determine_winner <- function( final.pick, game )
{
   if( game[ final.pick ] == "car" )
   {
      return( "WIN" )
   }
   if( game[ final.pick ] == "goat" )
   {
      return( "LOSE" )
   }
}


The play_game function with ten doors:

play_game <- function( num.goats=7, num.cars=3 )
{

  new.game <- create_game( num.goats, num.cars )
  first.pick <- select_door( game=new.game )
  opened.door <- open_goat_door( new.game, first.pick )

  final.pick.stay <- change_door( stay=T, new.game, opened.door, first.pick )
  final.pick.switch <- change_door( stay=F, new.game, opened.door, first.pick )

  outcome.stay <- determine_winner( final.pick.stay, new.game  )
  outcome.switch <- determine_winner( final.pick.switch, new.game )
  
  strategy <- c("stay","switch")
  outcome <- c(outcome.stay,outcome.switch)
  game.results <- data.frame( strategy, outcome,
                              stringsAsFactors=F )
  return( game.results )
}

play_game()


Building the table of results with ten doors, three cars:

results <- NULL
goats <- 7
cars <- 3

for( i in 1:100000 )  
{
  game.outcome <- play_game( goats, cars )
  results[[i]] <- game.outcome 
}

results.df <- dplyr::bind_rows( results )

table( results.df ) %>% 
  prop.table( margin=1 ) %>% 
  round( 2 )
##         outcome
## strategy LOSE  WIN
##   stay   0.70 0.30
##   switch 0.66 0.34


Building the table of results with ten doors, two cars:

results <- NULL
goats <- 8
cars <- 2

for( i in 1:100000 )  
{
  game.outcome <- play_game( goats, cars )
  results[[i]] <- game.outcome 
}

results.df <- dplyr::bind_rows( results )

table( results.df ) %>% 
  prop.table( margin=1 ) %>% 
  round( 2 )
##         outcome
## strategy LOSE  WIN
##   stay   0.80 0.20
##   switch 0.78 0.22


Building the table of results with ten doors, one cars:

results <- NULL
goats <- 9
cars <- 1

for( i in 1:100000 )  
{
  game.outcome <- play_game( goats, cars )
  results[[i]] <- game.outcome 
}

results.df <- dplyr::bind_rows( results )

table( results.df ) %>% 
  prop.table( margin=1 ) %>% 
  round( 2 )
##         outcome
## strategy LOSE  WIN
##   stay   0.90 0.10
##   switch 0.89 0.11


Q1: Is SWITCH still the dominant strategy? Report your chances of winning for each strategy for each scenario.

Of all of the scenarios above, the only time it is in your favor to keep the initial door pick is with five doors and two cars. All others give you a better chance of winning if you switch.

Q2: How much better off are you switching in the original game? How much did it improve your chances of winning? - Pr(win|switch)−Pr(win|stay)

In the original game, your chances of winning increase by (0.67 - 0.33 = 0.34) 34% if you switch doors.

Q3: How much better off are you switching in each of the three scenario with 10 doors and 1 to 3 cars?

10 doors, 1 car: your chance of winning increases by 0.01 or 1%

10 doors, 2 cars: your chance of winning increases by 0.03 or 3%

10 doors, 3 cars: your chance of winning increases by 0.04 or 4%