Introduction

Monte Carlo simulation is a statistical technique used to model probabilistic (or “stochastic”) processes and estimate the odds of a variety of outcomes. The concept was first popularized right after World War II as a tool for studying nuclear fission. The name refers to an uncle of mathematician Stanislaw Ulam who loved playing the odds at the Monte Carlo casino (then a world symbol of gambling, like Las Vegas today).

To get started, load the tidyverse package and set the random number generator seed (for example with set.seed()) in your setup R chunk, so that your results are reproducible. In the following problems you will make heavy use of the functions

  • sample(x, size, replace = FALSE, prob = NULL),

  • replicate(n, expr, simplify = "array").
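As a quick orientation, the sketch below shows how these two functions are typically combined. The seed value, the object names (rolls, flips, sums), and the dice and coin examples are illustrative assumptions, not part of the assignment.

```r
set.seed(1)  # assumed seed; any fixed value makes the run reproducible

# Ten rolls of a fair die: replace = TRUE lets a face come up more than once
rolls <- sample(1:6, size = 10, replace = TRUE)

# Five flips of a biased coin: prob gives one weight per element of x
flips <- sample(c("H", "T"), size = 5, replace = TRUE, prob = c(0.7, 0.3))

# replicate() re-evaluates the expression n times and collects the results;
# here each evaluation is the sum of two dice
sums <- replicate(1000, sum(sample(1:6, size = 2, replace = TRUE)))

mean(sums)  # average of the simulated sums; the exact expectation is 7
```

Because replicate() re-runs the whole expression each time, every replication draws fresh random numbers.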

Antoine Gombaud’s question

Conduct a simulation to answer the question below that was initially posed by Antoine Gombaud (a famous gambler in the 17th century).

Which is more likely:

  1. getting at least one 6 when rolling a single fair six-sided die 4 times,
  2. getting at least one pair of sixes when two fair six-sided dice are thrown 24 times?

The number of simulation replications you choose is at your discretion, but if you choose a number too small the results will not be accurate.
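One possible way to set up this simulation is sketched below. The helper names game1() and game2(), the seed, and the choice of 50,000 replications are assumptions made for illustration; other designs work equally well. The exact probabilities are included only as a sanity check on the estimates.

```r
set.seed(2024)   # assumed seed
n_reps <- 50000  # assumed number of replications

# Game 1: TRUE if at least one 6 appears in 4 rolls of a single die
game1 <- function() {
  any(sample(1:6, size = 4, replace = TRUE) == 6)
}

# Game 2: TRUE if at least one double six appears in 24 throws of two dice
game2 <- function() {
  die1 <- sample(1:6, size = 24, replace = TRUE)
  die2 <- sample(1:6, size = 24, replace = TRUE)
  any(die1 == 6 & die2 == 6)
}

# The proportion of TRUE results estimates each probability
p1 <- mean(replicate(n_reps, game1()))
p2 <- mean(replicate(n_reps, game2()))

# Exact values via the complement rule, for comparison
exact1 <- 1 - (5 / 6)^4
exact2 <- 1 - (35 / 36)^24

c(game1 = p1, exact1 = exact1, game2 = p2, exact2 = exact2)
```

The two exact values differ by less than 0.03, so a small number of replications may not separate them reliably.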

Enough?

How many Monte Carlo experiments are enough? Perform a Monte Carlo simulation to estimate the probability of getting “Heads” in a fair coin toss. Use ggplot() to plot the probability estimate on the y-axis and the iteration (the number of Monte Carlo experiments so far) on the x-axis. As the number of iterations gets large, you should see the estimate stabilize at around 0.50.
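A minimal sketch of one approach follows: simulate all the flips at once, then compute the running proportion of heads with cumsum(). The data frame name coin, its column names, the seed, and the 5,000 flips are assumptions for illustration; ggplot2 is attached here directly, although loading tidyverse attaches it as well.

```r
library(ggplot2)

set.seed(42)     # assumed seed
n_flips <- 5000  # assumed number of Monte Carlo experiments

# Code Heads as 1 and Tails as 0 so that a cumulative sum counts the heads
flips <- sample(c(1, 0), size = n_flips, replace = TRUE)

coin <- data.frame(
  iteration = seq_len(n_flips),
  # running estimate: heads so far divided by flips so far
  estimate  = cumsum(flips) / seq_len(n_flips)
)

p <- ggplot(coin, aes(x = iteration, y = estimate)) +
  geom_line() +
  geom_hline(yintercept = 0.5, linetype = "dashed") +  # true probability
  labs(x = "Number of Monte Carlo experiments",
       y = "Estimated probability of Heads")
p
```

The early part of the curve typically swings widely, which gives a visual sense of how many experiments are needed before the estimate settles.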

Birthday problem

Conduct a Monte Carlo simulation to answer the following questions related to the birthday problem or birthday paradox.

  1. What is the probability of at least two people sharing the same birthday (month and day) from a random sample of 23 individuals?
  2. What is the probability of at least two people sharing the same birthday (month and day) from a random sample of 70 individuals?
  3. Create a plot using ggplot() with the number of individuals on the x-axis and the probability of at least one shared birthday on the y-axis. You should simulate the probability of at least two people sharing the same birthday for each number of individuals from 2 to 100.

You may ignore leap years and assume 365 days per year. The number of simulation replications you choose is at your discretion, but if you choose a number too small the results will not be accurate.
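The three parts can share one helper, as in the sketch below. The function names shared_birthday() and p_shared(), the data frame birthdays, the seed, and the replication counts are all assumptions for illustration. Birthdays are drawn uniformly from days 1 to 365, which matches the stated simplification.

```r
library(ggplot2)

set.seed(365)   # assumed seed
n_reps <- 2000  # assumed replications per group size for the plot

# TRUE if at least two of n people share a birthday
shared_birthday <- function(n) {
  any(duplicated(sample(1:365, size = n, replace = TRUE)))
}

# Estimated probability of a shared birthday in a group of n people
p_shared <- function(n, reps = n_reps) {
  mean(replicate(reps, shared_birthday(n)))
}

# Parts 1 and 2: more replications for the two headline estimates
p_23 <- p_shared(23, reps = 20000)
p_70 <- p_shared(70, reps = 20000)

# Part 3: one estimate for each group size from 2 to 100
birthdays <- data.frame(people = 2:100)
birthdays$probability <- sapply(birthdays$people, p_shared)

ggplot(birthdays, aes(x = people, y = probability)) +
  geom_line() +
  labs(x = "Number of individuals",
       y = "Estimated probability of a shared birthday")
```

With only 2,000 replications per point the plotted curve will look slightly jagged. Raising n_reps smooths it at the cost of run time.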