The Monty Hall Problem

Probability with R

0.1 Monty Hall Problem

Suppose that there are three closed doors on the set of Let's Make a Deal, a TV game show presented by Monty Hall. Behind one of these doors is a car; behind the other two are goats. The contestant does not know where the car is, but Monty Hall does.

The contestant selects a door, but the outcome is not immediately evident. Monty opens one of the remaining “wrong” doors. If the contestant has already chosen the correct door, Monty is equally likely to open either of the two remaining doors.

After Monty has shown a goat behind the door that he opens, the contestant is always given the option to switch doors. What is the probability of winning the car if she stays with her first choice? What if she decides to switch?

0.1.1 Implementation (part 1)

We have 3 doors to choose from, so we define the integer sequence from 1 to 3. The sample() command takes a sample of size n from a specified set of values. Here we just want to select one door to be the “correct” door and another to be the “selected” door (i.e. the one the contestant picks).

These events are independent. We will perform the selection for both doors separately, but this can be implemented in one command.

Doors = 1:3 
Correct = sample(Doors,1)
Choice = sample(Doors,1)
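
The one-command version mentioned above might look like the following sketch. The name Picks is introduced here for illustration only; replace = TRUE is what keeps the two draws independent.

```r
Doors = 1:3

# Draw both doors in a single call. replace = TRUE allows the same door
# to be drawn twice, so the contestant may happen to pick the correct door.
Picks = sample(Doors, 2, replace = TRUE)

Correct = Picks[1]   # door hiding the car
Choice = Picks[2]    # door picked by the contestant
```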

A wrong door must be selected to be opened, and the door selected by the contestant can’t be chosen either. First let us identify the doors that must stay closed, then find the ones we can choose from to open.

StayClosed = union(Correct, Choice)
CanOpen = setdiff(Doors, StayClosed)

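The sample() command poses an interesting problem. Let’s look at the help file (?sample) to see what it is: when its first argument is a single number n (with n at least 1), sample() draws from the sequence 1:n rather than from the one-element vector.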

Create a single-element vector (let’s call it Vec), with that single element being 4. Now sample a single value from it. You may expect to get 4 each time, but sample() doesn’t work that way in this instance.

Vec=c(4)
sample(Vec,1)

## [1] 1

sample(Vec,1)

## [1] 4

sample(Vec,1)

## [1] 3

sample(Vec,1)

## [1] 1
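
The behaviour can be checked directly. The sketch below resets the random seed (42 is an arbitrary choice) and compares draws from the one-element vector with draws from 1:4; the names FromVec and FromRange are introduced here for illustration.

```r
Vec = c(4)

set.seed(42)   # arbitrary seed, used only to make the comparison repeatable
FromVec = sample(Vec, 5, replace = TRUE)

set.seed(42)   # same seed again, so both calls use the same random numbers
FromRange = sample(1:4, 5, replace = TRUE)

# TRUE means sample(Vec, ...) behaved exactly like sample(1:4, ...)
identical(FromVec, FromRange)
```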

To overcome this we need a control statement: sample() is called only when there is more than one door that can be opened. Otherwise the single remaining door is opened directly.

if(length(CanOpen)>1)
{
Open = sample(CanOpen,1) 
}else {Open=CanOpen}
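
As an alternative to the if/else guard, the sampling can be done on positions rather than values, which is the approach suggested in the examples of the sample() help file. SafeSample, Candidates and Picked are hypothetical names used only in this sketch.

```r
# Sample by index so that a one-element vector returns its own element.
# Assumes x has at least one element.
SafeSample = function(x, size = 1) {
  x[sample.int(length(x), size)]
}

SafeSample(c(4))                 # returns the element 4 itself

Candidates = c(2, 3)             # illustrative set of doors that may be opened
Picked = SafeSample(Candidates)  # one of 2 or 3
```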

NotOpen = setdiff(Doors,Open)

Stick = Choice   # Sticking keeps the original choice; named only to aid the narrative

Switch = setdiff(NotOpen,Choice)

# Was sticking the right decision? Or was switching?
# The following logical statements will tell us.

Stick==Correct

## [1] TRUE

Switch==Correct

## [1] FALSE

0.1.2 Writing a Function

We are going to create a function to help us simulate a solution for the Monty Hall Problem. The function is constructed using code we have seen already. The output of the function is returned as a named vector with three elements:

[Correct] : The number of the correct door.
[Choice] : The door that the contestant chose originally, and the door selected if the contestant decided to “stick”.
[Switch] : The door selected if the contestant chose to “switch”.

With three doors, the Correct value must necessarily match exactly one of the other two elements.

MHfunc <- function(numdoors=3){
 Doors = 1:numdoors 
 Correct = sample(Doors,1)
 Choice = sample(Doors,1)
 
 StayClosed = union(Correct, Choice)
 CanOpen = setdiff(Doors, StayClosed)
 
 if(length(CanOpen)>1)
    {
    Open = sample(CanOpen,1) # safe here: CanOpen has more than one element
    }else {Open=CanOpen}

 NotOpen = setdiff(Doors,Open)
 Switch = setdiff(NotOpen,Choice)

 
 MHoutput=c(Correct,Choice,Switch)
 names(MHoutput)= c("Correct","Choice","Switch")
 return(MHoutput)
}
MHfunc()

## Correct  Choice  Switch 
##       2       2       3

No arguments need to be passed to the function. Each call makes a fresh random draw, so the output varies from call to call. Another call might give:

MHfunc()

## Correct  Choice  Switch 
##       2       2       1
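
The claim that Correct always matches exactly one of Choice and Switch can be checked empirically. In the sketch below, MHfunc is repeated in compact form so that the snippet runs on its own; Results, StickWon and SwitchWon are names introduced here, and 200 games is an arbitrary number.

```r
# Compact restatement of MHfunc from the text (three doors assumed).
MHfunc <- function(numdoors=3){
 Doors = 1:numdoors
 Correct = sample(Doors,1)
 Choice = sample(Doors,1)
 CanOpen = setdiff(Doors, union(Correct, Choice))
 if(length(CanOpen)>1) {Open = sample(CanOpen,1)} else {Open = CanOpen}
 Switch = setdiff(setdiff(Doors,Open), Choice)
 c(Correct=Correct, Choice=Choice, Switch=Switch)
}

# One row per simulated game, with columns Correct, Choice and Switch.
Results = t(replicate(200, MHfunc()))

StickWon = Results[, "Correct"] == Results[, "Choice"]
SwitchWon = Results[, "Correct"] == Results[, "Switch"]

# TRUE if exactly one of the two strategies wins in every game.
all(xor(StickWon, SwitchWon))
```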

0.1.3 Transforming logical values

The as.numeric() command can be used to convert logical values, such as TRUE or FALSE, into binary form (i.e. 1 and 0).

LogVec=c(TRUE,FALSE,TRUE)
as.numeric(LogVec)

## [1] 1 0 1
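
Arithmetic functions such as sum() and mean() perform the same conversion implicitly, which is a convenient way to count or average logical results. A small sketch using the same vector:

```r
LogVec = c(TRUE, FALSE, TRUE)

sum(LogVec)    # number of TRUE values
mean(LogVec)   # proportion of TRUE values
```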

# > output = MHfunc()
# > output
# Correct  Choice  Switch 
#       1       2       1 
# > output[1]
# Correct 
#       1 
# > output[2]
# Choice 
#      2 
# > output[1]==output[2]
# Correct 
#   FALSE 
# > output[1]==output[3]
# Correct 
#    TRUE
# > as.numeric(output[1]==output[2])
# [1] 0
# > as.numeric(output[1]==output[3])
# [1] 1

0.1.4 A Simulation Study of the Monty Hall Problem

Let us perform a simulation study of the Monty Hall problem. We place the function we have already written inside a loop and count how often each strategy wins, which makes for more readable output.

StickCount = 0
SwitchCount = 0
M=1000

for(i in 1:M)
 {
 output=MHfunc()
 Correct=output[1]
 Choice=output[2]
 Switch=output[3]
 # Add 1 to a counter whenever the corresponding strategy wins

 StickCount = StickCount + as.numeric(Choice  ==  Correct)
 SwitchCount = SwitchCount + as.numeric(Switch  ==  Correct)
}
StickCount

## [1] 342

SwitchCount

## [1] 658

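Results vary from run to run; another run gave the output below. In each case the split is approximately 2 to 1 in favour of the switching option, consistent with the theoretical win probabilities of 2/3 for switching and 1/3 for sticking.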

# > StickCount
# [1] 323
# > SwitchCount
# [1] 677
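
The simulated counts can be compared with exact probabilities obtained by enumerating the nine equally likely (Correct, Choice) pairs. This is a minimal sketch for the three-door case: Cases, PStick, PSwitch, SimStick and SimSwitch are names introduced here, and the simulated counts are the ones from the first run reported above.

```r
# All 9 equally likely combinations of correct door and first choice.
Cases = expand.grid(Correct = 1:3, Choice = 1:3)

# Sticking wins when the first choice is correct.
PStick = mean(Cases$Correct == Cases$Choice)

# With three doors Monty always reveals a goat, so switching wins
# exactly when the first choice was wrong.
PSwitch = 1 - PStick

# Simulated proportions from the first run (342 and 658 out of M = 1000).
SimStick = 342 / 1000
SimSwitch = 658 / 1000

c(PStick = PStick, SimStick = SimStick, PSwitch = PSwitch, SimSwitch = SimSwitch)
```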