Mr. and Mrs. Smith have two children, one of whom is a daughter. What is the probability that the other one is a boy?
We can model the set of outcomes by \(\Omega=\{FF,FM,MF,MM\}\), which we endow with the uniform probability measure. The interpretation of, e.g., \(FM\) is that the first observed child is female and the second is male. We then want to calculate \(P(\{FM,MF,MM\}|\{FF,FM,MF\})\), the probability that at least one child is a boy given that at least one is a girl. Using the definition of conditional probability, we get \[P(\{FM,MF,MM\}|\{FF,FM,MF\})=\frac{P(\{FM,MF,MM\}\cap\{FF,FM,MF\})}{P(\{FF,FM,MF\})} = \frac{P(\{FM,MF\})}{P(\{FF,FM,MF\})}=\frac{2}{3}.\]
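As a sanity check, the conditional probability can also be computed exactly by listing the four outcomes. The following is only a minimal sketch: the names `omega`, `has_girl`, `has_boy` and `p_exact` are illustrative choices, and the ratio of counts is valid only because the measure on \(\Omega\) is uniform.

```r
# Enumerate the four equally likely outcomes (first child, second child).
omega <- expand.grid(C1 = c("F", "M"), C2 = c("F", "M"),
                     stringsAsFactors = FALSE)

# Conditioning event: at least one daughter.
has_girl <- omega$C1 == "F" | omega$C2 == "F"
# Event of interest: at least one boy.
has_boy <- omega$C1 == "M" | omega$C2 == "M"

# P(A | B) = P(A and B) / P(B); under the uniform measure
# this reduces to a ratio of outcome counts.
p_exact <- sum(has_boy & has_girl) / sum(has_girl)
p_exact
```

The result equals \(2/3\), in agreement with the calculation above.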
We now use R to simulate the experiment many times and estimate the desired probability from the observed frequencies.
Solution 1. For the set of outcomes, we assume the following correspondence: \(FF\leftrightarrow 1\), \(FM\leftrightarrow 2\), \(MF\leftrightarrow 3\) and \(MM\leftrightarrow 4\).
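N=1000000
# draw N families uniformly from the four outcomes
S=sample(x=1:4,size=N,replace=TRUE)
# keep families with at least one daughter (S<4), then take the fraction that are not FF
prob=mean(S[S<4]!=1)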
prob
## [1] 0.6668444
Solution 2. The goal is to use data frames.
N=1000000
df=data.frame(C1=sample(x=c("F","M"),size=N,replace=TRUE),
C2=sample(x=c("F","M"),size=N,replace=TRUE))
df=df[df$C1=="F" | df$C2=="F",] # keep only families with at least one female
prob=nrow(df[df$C1=="M" | df$C2=="M",])/nrow(df) # fraction of those families that also have a boy
prob
## [1] 0.6666653
In this game, each of \(K>3\) players tosses a fair coin. A player wins a game if his/her outcome differs from those of all the other players.
Identify a sample space \(\Omega\) and endow it with a suitable probability measure.
Compute the probability of having a winner.
Write R code that simulates the game.
Let \(T_K\) be the random variable denoting the number of games that must be played until a winner is observed. Find the law and the expected value of \(T_K\).
A natural choice is \(\Omega=\{0,1\}^K\), where \(0\) and \(1\) represent the outcomes “Heads” and “Tails”, respectively. The set \(\Omega\) is endowed with the uniform probability measure.
Using \(\sigma\)-additivity, the desired probability is \(p_K=\sum_{i=1}^K \left(P(\{a_i\})+P(\{b_i\})\right)\), where \(a_i\) (respectively, \(b_i\)) is a vector of size \(K\) containing all zeros (ones) except on coordinate \(i\), which contains a one (zero). These \(2K\) outcomes are pairwise distinct, and \(P(\{a_i\})=P(\{b_i\})=\frac{1}{2^K}\), so we obtain \(p_K = \frac{2K}{2^K} = \frac{K}{2^{K-1}}\).
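The formula for \(p_K\) can be checked exactly for a small \(K\) by enumerating all of \(\Omega\). This is only a sketch: the names `omega`, `ones` and `p_exact` are illustrative, and \(K=5\) is chosen here simply to match the simulations in this section.

```r
K <- 5
# All 2^K outcomes of K coin tosses, one row per outcome.
omega <- expand.grid(rep(list(0:1), K))
# Number of ones in each outcome.
ones <- rowSums(omega)

# A winner exists when exactly one coin differs from all the others,
# i.e. the outcome contains exactly one 1 or exactly one 0.
p_exact <- mean(ones == 1 | ones == K - 1)

# Compare the enumeration with the closed form K / 2^(K-1).
c(enumeration = p_exact, formula = K / 2^(K - 1))
```

Both entries should coincide; for \(K=5\) the value is \(5/16\).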
Solution 1
N=1000000
K=5
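# N games, one column of coin tosses per player
S=replicate(K, sample(x=0:1,N,replace=TRUE))
# a winner exists when exactly one coin differs from all the others
p_K=mean(rowSums(S)==1 | rowSums(S)==K-1)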
p_K
## [1] 0.312763
Solution 2 (with data frames)
N=1000000
K=5
# one column of N coin tosses per player
df_parties = data.frame(x1 = sample(x=c(0,1),size=N,replace=TRUE))
for(i in 2:K) {
  df_parties = cbind(df_parties, data.frame(x = sample(x=c(0,1),size=N,replace=TRUE)))
  names(df_parties)[i]=paste0("x",i)
}
# number of ones observed in each game
df_parties$sum = 0
for(i in 1:K) {
  df_parties$sum = df_parties$sum + df_parties[,paste0("x",i)]
}
# a winner exists when exactly one coin differs from all the others
p_K=mean(df_parties$sum==1 | df_parties$sum==K-1)
p_K
## [1] 0.313461
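For the question about \(T_K\), a possible line of reasoning, assuming that successive games are independent: each game produces a winner with probability \(p_K\), so \(T_K\) follows a geometric law, \(P(T_K=n)=(1-p_K)^{n-1}p_K\) for \(n\geq 1\), with expected value \(E[T_K]=\frac{1}{p_K}=\frac{2^{K-1}}{K}\). The sketch below compares this value with a simulation. It samples \(T_K\) directly with `rgeom` instead of replaying the coin tosses; the seed and the smaller value of `N` are arbitrary choices made for this illustration.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
N <- 100000
K <- 5
p_K <- K / 2^(K - 1)

# rgeom() returns the number of failures before the first success,
# so adding 1 gives the number of games played until a winner appears.
T_K <- rgeom(N, prob = p_K) + 1

# Empirical mean versus the theoretical expectation 1 / p_K.
c(simulated = mean(T_K), exact = 1 / p_K)
```

For \(K=5\) the exact expectation is \(16/5=3.2\) games, and the simulated mean should be close to it.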