Investigation 3A:
Two contests

Author

Javi Zialcita, Chris Opyrchal

Published

December 5, 2024

You may work with a group of as many as three students on this investigation, submitting one report with all names, provided that you all contribute to the work.

Target contest

Suppose that Mabel and Nora play a game in which they each throw a ball at a target, repeatedly and independently. Each has a certain probability of successfully hitting the target with a throw. The game ends after each has made a pre-specified number of throws. The winner of the game is the person who hits the target with more throws. If they hit the target with the same number of throws, then the game results in a tie.

Write R code for simulating this random process. Inputs should be N (number of games to be simulated), n (number of throws for each person per game), p_m and p_n (success probabilities for Mabel and Nora). Outputs should be the approximate probability that Mabel wins, the approximate probability that Nora wins, and the approximate probability that the game results in a tie.

Some advice: Do not use any loops in your code. Make use of the rbinom command. Start with a small-ish number of repetitions until you are confident that your code runs correctly. Examine graphs and summary statistics to make sure that your results look reasonable.

a) Run this R code to simulate 1,000,000 games with 10 throws per game, where Mabel has an 80% chance and Nora has a 70% chance of hitting the target.

  N <- 1000000  # number of games
  n <- 10       # number of throws per game
  p_m <- 0.80   # Mabel's success probability
  p_n <- 0.70   # Nora's success probability
  
  # Simulate the number of successful hits for each person
  mabel_hits <- rbinom(N, n, p_m)
  nora_hits <- rbinom(N, n, p_n)
  
  # Determine the outcomes of the game (win, loss, or tie)
  mabel_wins <- sum(mabel_hits > nora_hits)
  nora_wins <- sum(mabel_hits < nora_hits)
  ties <- sum(mabel_hits == nora_hits)

  # Calculate approximate probabilities
  prob_mabel_win <- mabel_wins / N
  prob_nora_win <- nora_wins / N
  prob_tie <- ties / N

  # Print the results
  cat("Approximate probability that Mabel wins:", prob_mabel_win, "\n")
Approximate probability that Mabel wins: 0.604855 
  cat("Approximate probability that Nora wins:", prob_nora_win, "\n")
Approximate probability that Nora wins: 0.215819 
  cat("Approximate probability of a tie:", prob_tie, "\n")
Approximate probability of a tie: 0.179326 
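As a cross-check on the simulation, the three probabilities can also be computed exactly, because each person's hit count is binomial and the two counts are independent. The sketch below is not part of the assigned simulation. It builds the joint probability table with dbinom and outer and adds up the cells where Mabel has more, fewer, or the same number of hits. The names hits, joint, m_hits, n_hits, and the exact_* variables are introduced here for illustration.

```r
n <- 10       # throws per game, as in part a)
p_m <- 0.80   # Mabel's success probability
p_n <- 0.70   # Nora's success probability

hits <- 0:n   # possible hit counts for either player

# joint[i, j] = P(Mabel gets hits[i]) * P(Nora gets hits[j]), by independence
joint <- outer(dbinom(hits, n, p_m), dbinom(hits, n, p_n))

# Hit counts for Mabel (rows) and Nora (columns), aligned with `joint`
m_hits <- row(joint) - 1
n_hits <- col(joint) - 1

# Add up the joint probabilities over each outcome region
exact_mabel_win <- sum(joint[m_hits > n_hits])
exact_nora_win  <- sum(joint[m_hits < n_hits])
exact_tie       <- sum(joint[m_hits == n_hits])

cat("Exact probability that Mabel wins:", exact_mabel_win, "\n")
cat("Exact probability that Nora wins:", exact_nora_win, "\n")
cat("Exact probability of a tie:", exact_tie, "\n")
```

If the simulation code is correct, these exact values should agree with the simulated ones to about two or three decimal places.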

b) Write more code to produce the difference [(number of successes for Mabel) minus (number of successes for Nora)] for each game. Examine a histogram of these differences. Also report the mean and standard deviation of these differences.

  # Calculate the difference (number of successes for Mabel - number of successes for Nora)
  differences <- mabel_hits - nora_hits
  
  # Plot histogram of differences
  hist(differences, main="Histogram of Differences (Mabel - Nora)", xlab="Difference (Mabel - Nora)", breaks=30)

  # Calculate and report the mean and standard deviation of the differences
  mean_diff <- mean(differences)
  std_dev_diff <- sd(differences)
  
  cat("Mean of differences:", mean_diff, "\n")
Mean of differences: 1.000986 
  cat("Standard deviation of differences:", std_dev_diff, "\n")
Standard deviation of differences: 1.924506 

c) Use properties of binomial distributions, and properties of expected value and variance, to calculate the (theoretical) expected value and standard deviation for this difference. Verify that the simulation results come close to the theoretical values.

  expected_diff <- n * p_m - n * p_n
  variance_diff <- n * p_m * (1 - p_m) + n * p_n * (1 - p_n)
  std_dev_theory <- sqrt(variance_diff)
  
  cat("Theoretical expected difference:", expected_diff, "\n")
Theoretical expected difference: 1 
  cat("Theoretical standard deviation:", std_dev_theory, "\n")
Theoretical standard deviation: 1.923538 

d) Re-run the code with 25 throws per person per game, then 100 throws per person per game, then 500 throws per person per game. Report the three approximate probabilities (Mabel wins, Nora wins, tie) for each of these sample sizes.

# Different sample sizes
throws_list <- c(25, 100, 500)

for (n in throws_list) {
  # Simulate the number of successful hits for each person
  mabel_hits <- rbinom(N, n, p_m)
  nora_hits <- rbinom(N, n, p_n)
  
  # Determine the outcomes of the game (win, loss, or tie)
  mabel_wins <- sum(mabel_hits > nora_hits)
  nora_wins <- sum(mabel_hits < nora_hits)
  ties <- sum(mabel_hits == nora_hits)
  
  # Calculate approximate probabilities
  prob_mabel_win <- mabel_wins / N
  prob_nora_win <- nora_wins / N
  prob_tie <- ties / N
  
  # Print the results
  cat("\nResults for", n, "throws per person per game:\n")
  cat("Approximate probability that Mabel wins:", prob_mabel_win, "\n")
  cat("Approximate probability that Nora wins:", prob_nora_win, "\n")
  cat("Approximate probability of a tie:", prob_tie, "\n")
}

Results for 25 throws per person per game:
Approximate probability that Mabel wins: 0.74555 
Approximate probability that Nora wins: 0.161077 
Approximate probability of a tie: 0.093373 

Results for 100 throws per person per game:
Approximate probability that Mabel wins: 0.940595 
Approximate probability that Nora wins: 0.042284 
Approximate probability of a tie: 0.017121 

Results for 500 throws per person per game:
Approximate probability that Mabel wins: 0.999872 
Approximate probability that Nora wins: 9.8e-05 
Approximate probability of a tie: 3e-05 
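The for loop above works, but the same comparison can be written without an explicit loop, in the spirit of the advice given earlier. The following is a minimal sketch, assuming a hypothetical helper named simulate_target_contest (not part of the original report) and a smaller N so that it runs quickly. Note that vapply still iterates over the three sample sizes internally; each individual simulation remains fully vectorized through rbinom.

```r
set.seed(2024)   # arbitrary seed, used only so the sketch is reproducible

# Hypothetical helper: the three approximate probabilities for one setting
simulate_target_contest <- function(N, n, p_m, p_n) {
  mabel_hits <- rbinom(N, n, p_m)
  nora_hits <- rbinom(N, n, p_n)
  c(mabel_wins = mean(mabel_hits > nora_hits),
    nora_wins  = mean(mabel_hits < nora_hits),
    tie        = mean(mabel_hits == nora_hits))
}

throws_list <- c(25, 100, 500)

# One column of probabilities per number of throws
results <- vapply(
  throws_list,
  function(n) simulate_target_contest(N = 100000, n = n, p_m = 0.80, p_n = 0.70),
  numeric(3)
)
colnames(results) <- paste(throws_list, "throws")

print(round(results, 4))
```

Collecting the results in one matrix makes it easier to compare the three sample sizes side by side than reading separate blocks of printed output.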

e) Write a few sentences to describe how the three probabilities change as the number of throws increases.

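As the number of throws per game increases, the probability that Mabel wins rises steadily toward 1: about 0.605 with 10 throws, 0.746 with 25, 0.941 with 100, and 0.9999 with 500. Over the same range, the probability that Nora wins falls from about 0.216 to about 0.0001, and the probability of a tie falls from about 0.179 to about 0.00003. This makes sense because the expected difference in hits grows in proportion to n (it equals 0.1n here), while the standard deviation of the difference grows only in proportion to the square root of n, so it becomes increasingly unlikely that chance variation overcomes Mabel's higher success probability.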
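One way to see why Mabel's advantage becomes decisive is to compare the theoretical mean of the difference with its standard deviation for each number of throws, using the same formulas as in part c). The sketch below is an illustration only; the variable name ratio is introduced here and is not part of the original report.

```r
p_m <- 0.80                    # Mabel's success probability
p_n <- 0.70                    # Nora's success probability
n <- c(10, 25, 100, 500)       # throws per game used in parts a) and d)

# Theoretical mean and standard deviation of (Mabel's hits - Nora's hits)
expected_diff <- n * (p_m - p_n)
sd_diff <- sqrt(n * (p_m * (1 - p_m) + p_n * (1 - p_n)))

# How many standard deviations the mean difference lies above zero
ratio <- expected_diff / sd_diff

print(data.frame(n, expected_diff, sd_diff, ratio))
```

The ratio grows like the square root of n, so with more throws a non-positive difference (a tie or a win for Nora) sits further out in the lower tail of the distribution.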

Puzzle contest

Suppose that Omar and Pablo play a game in which each of them solves a puzzle as quickly as possible. The time needed to solve the puzzle follows an exponential distribution for each person, independently from person to person. The winner of the game is the person who takes less time to solve the puzzle.

Write R code for simulating this random process. Inputs should be N (number of games to be simulated), lambda_o and lambda_p (lambda parameter values for times to solve the puzzle for Omar and Pablo). Outputs should be the approximate probability that Omar wins, the approximate probability that Pablo wins, and the approximate probability that the game results in a tie.

Some advice: Do not use any loops in your code. Make use of the rexp command. Start with a small-ish number of repetitions until you are confident that your code runs correctly. Examine graphs and summary statistics to make sure that your results look reasonable.

f) Run this R code to simulate 1,000,000 games, where Omar’s parameter value is 1 and Pablo’s is 2.

  N <- 1000000  # number of games
  lambda_o <- 1  # Omar's rate parameter
  lambda_p <- 2  # Pablo's rate parameter
  
  # Simulate the times to solve the puzzle for Omar and Pablo
  omar_times <- rexp(N, lambda_o)
  pablo_times <- rexp(N, lambda_p)
  
  # Determine the outcomes of the game (win, loss, or tie)
  omar_wins <- sum(omar_times < pablo_times)
  pablo_wins <- sum(omar_times > pablo_times)
  ties <- sum(omar_times == pablo_times)
  
  # Calculate approximate probabilities
  prob_omar_win <- omar_wins / N
  prob_pablo_win <- pablo_wins / N
  prob_tie <- ties / N
  
  # Print the results
  cat("Approximate probability that Omar wins:", prob_omar_win, "\n")
Approximate probability that Omar wins: 0.333432 
  cat("Approximate probability that Pablo wins:", prob_pablo_win, "\n")
Approximate probability that Pablo wins: 0.666568 
  cat("Approximate probability of a tie:", prob_tie, "\n")
Approximate probability of a tie: 0 

g) Write more code to produce the difference [(Omar’s time) minus (Pablo’s time)] for each game. Examine a histogram of these differences. Also report the mean and standard deviation of these differences.

  differences <- omar_times - pablo_times
  
  # Plot histogram of differences
  hist(differences, main="Histogram of Differences (Omar - Pablo)", xlab="Difference (Omar - Pablo)", breaks=50)

  # Calculate and report the mean and standard deviation of the differences
  mean_diff <- mean(differences)
  std_dev_diff <- sd(differences)
  
  cat("Mean of differences:", mean_diff, "\n")
Mean of differences: 0.50051 
  cat("Standard deviation of differences:", std_dev_diff, "\n")
Standard deviation of differences: 1.119909 

h) Use properties of exponential distributions, and properties of expected value and variance, to calculate the (theoretical) expected value and standard deviation for this difference. Verify that the simulation results come close to the theoretical values.

  # Theoretical expected value and standard deviation for the difference
  expected_diff <- (1 / lambda_o) - (1 / lambda_p)
  variance_diff <- (1 / lambda_o^2) + (1 / lambda_p^2)
  std_dev_theory <- sqrt(variance_diff)
  
  cat("Theoretical expected difference:", expected_diff, "\n")
Theoretical expected difference: 0.5 
  cat("Theoretical standard deviation:", std_dev_theory, "\n")
Theoretical standard deviation: 1.118034 

i) Re-run the code with parameter values of 2 and 3, and then with parameter values of 3 and 4, and then with parameter values of 4 and 5, respectively for Omar and Pablo.

  # Define the different parameter sets
  param_sets <- list(c(2, 3), c(3, 4), c(4, 5))
  
  for (params in param_sets) {
    lambda_o <- params[1]
    lambda_p <- params[2]
    
    # Simulate the times to solve the puzzle for Omar and Pablo
    omar_times <- rexp(N, lambda_o)
    pablo_times <- rexp(N, lambda_p)
    
    # Determine the outcomes of the game (win, loss, or tie)
    omar_wins <- sum(omar_times < pablo_times)
    pablo_wins <- sum(omar_times > pablo_times)
    ties <- sum(omar_times == pablo_times)
    
    # Calculate approximate probabilities
    prob_omar_win <- omar_wins / N
    prob_pablo_win <- pablo_wins / N
    prob_tie <- ties / N
    
    # Print the results for each parameter set
    cat("\nResults for lambda_o =", lambda_o, "and lambda_p =", lambda_p, ":\n")
    cat("Approximate probability that Omar wins:", prob_omar_win, "\n")
    cat("Approximate probability that Pablo wins:", prob_pablo_win, "\n")
    cat("Approximate probability of a tie:", prob_tie, "\n")
  }

Results for lambda_o = 2 and lambda_p = 3 :
Approximate probability that Omar wins: 0.399848 
Approximate probability that Pablo wins: 0.600152 
Approximate probability of a tie: 0 

Results for lambda_o = 3 and lambda_p = 4 :
Approximate probability that Omar wins: 0.429011 
Approximate probability that Pablo wins: 0.570989 
Approximate probability of a tie: 0 

Results for lambda_o = 4 and lambda_p = 5 :
Approximate probability that Omar wins: 0.444296 
Approximate probability that Pablo wins: 0.555704 
Approximate probability of a tie: 0 

j) Write a few sentences to describe how the three probabilities change as the parameter values increase.

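As both lambda values increase while staying 1 apart, Omar’s probability of winning increases (about 0.333, 0.400, 0.429, and 0.444 for lambda_o = 1, 2, 3, and 4), and Pablo’s probability of winning decreases correspondingly (about 0.667, 0.600, 0.571, and 0.556). Both probabilities appear to be moving toward 0.5, because the two rates become closer in relative terms. The probability of a tie stays at 0 in every case, since the solving times are continuous and two independent times are essentially never exactly equal.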

k) Make a conjecture for the theoretical probability that Omar wins the game, in terms of the two parameter values lambda_o and lambda_p.
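One conjecture consistent with the simulation results above is that the probability that Omar wins equals lambda_o / (lambda_o + lambda_p): the simulated values of about 0.333, 0.400, 0.429, and 0.444 are close to 1/3, 2/5, 3/7, and 4/9. The sketch below compares the conjectured values with fresh simulations for the same four pairs of rates. The seed, the smaller N, and the names simulated, conjectured, and conjecture_check are choices made here for illustration.

```r
set.seed(2024)                 # arbitrary seed, used only for reproducibility
N <- 200000                    # smaller than 1,000,000 to keep the check fast

lambda_o <- c(1, 2, 3, 4)      # Omar's rates from parts f) and i)
lambda_p <- c(2, 3, 4, 5)      # Pablo's rates from parts f) and i)

# Simulated probability that Omar's time is shorter, one value per rate pair
simulated <- mapply(
  function(lo, lp) mean(rexp(N, lo) < rexp(N, lp)),
  lambda_o, lambda_p
)

# Conjectured probability that Omar wins
conjectured <- lambda_o / (lambda_o + lambda_p)

conjecture_check <- data.frame(lambda_o, lambda_p, simulated, conjectured)
print(conjecture_check)
```

If the conjecture holds, the simulated and conjectured columns should differ only by simulation error, which is on the order of 0.001 for this N.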

REMEMBER TO:

  1. Click the “Render” button and save to an html file.
  2. Look over the html file to make sure that it looks fine. If it does not look fine, edit the file and then click on “Render” again.
  3. Submit the html file as your report for Investigation 3A via Canvas.
  4. If you are working with a group, only one member of the group should submit the file, but make sure that all group member names appear at the top of the file.