Extra Credit

  1. Simulate choosing a card from a shuffled deck 10 times with replacement, and count how many times you draw a diamond.

    # General set-up for the simulation
    set.seed(123)
    x <- 1 # number of realizations (set to 1 for now)
    # P(Diamond) = 13/52 because a standard deck of 52 cards contains 13 diamonds.
    realization.sim <- rbinom(n = x, size = 10, prob = 13/52)

As we increase the number of realizations, the sample mean moves closer and closer to the theoretical value, decreasing from 3.4 with 5 realizations to 2.535 with 200 realizations (the theoretical mean is 2.5). The variance, however, does not move steadily toward its theoretical value as the number of realizations grows. At 5 realizations the variance is 1.8. At 20 realizations it jumps to 2.273684. At 100 realizations it drops back down to 1.807677. Lastly, at 200 realizations it drops to 1.64701. The theoretical variance is 1.875, and the closest sample variance actually occurred at 100 realizations. This is not surprising: because the values are simulated, the spread of the data is random, so the sample variance is random as well, especially for smaller samples. The distribution of the realizations also approaches the theoretical PMF as the number of realizations grows. Comparing the simulations, the one with 200 realizations is the closest to the PMF.
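The comparison described above can be sketched in R as follows. This is a minimal sketch, not necessarily the code that produced the reported numbers: the vector `n.realizations` and the single call to `set.seed(123)` before all four simulations are assumptions, so the sample means and variances it prints may differ from the values quoted in the text.

```r
# Sketch: compare sample mean and variance with the binomial theory for
# several numbers of realizations. `n.realizations` and the seed placement
# are assumptions for illustration.
set.seed(123)
size <- 10      # cards drawn per realization (with replacement)
p <- 13/52      # P(Diamond)
n.realizations <- c(5, 20, 100, 200)

theory.mean <- size * p            # n * p
theory.var  <- size * p * (1 - p)  # n * p * (1 - p)

# One row per simulation: number of realizations, sample mean, sample variance
results <- t(sapply(n.realizations, function(k) {
  sim <- rbinom(n = k, size = size, prob = p)
  c(realizations = k, mean = mean(sim), variance = var(sim))
}))

print(results)
cat("Theoretical mean:", theory.mean,
    " Theoretical variance:", theory.var, "\n")
```

The theoretical values come straight from the binomial formulas: 10 × 0.25 = 2.5 for the mean and 10 × 0.25 × 0.75 = 1.875 for the variance.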

  1. Let’s say for a new driver, the average time it takes to drive from UOG to GPO is 20 minutes with a standard deviation of 3 minutes and X, the time to drive between UOG and GPO, is normally distributed.

As the number of observations grows, the sample mean approaches the theoretical value, going from 18.94294 with 5 observations to 19.82672 with 200 observations (the theoretical expected value is 20). The sample variance fluctuates as the number of observations grows because the observations are simulated. The simulation furthest from the theoretical variance (3^2 = 9) is the one with only 5 observations, at 17.49525. Comparing the histograms, we can see that as the number of observations gets larger, the histogram starts to take the shape of a normal distribution like the PDF. The histograms at 100 and 200 observations are the ones that look, for the most part, normal.
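A simulation along these lines could look like the sketch below. The sample sizes in `n.obs` are taken from the discussion above (20 is assumed by analogy with the first problem), and the seed is an assumption, so the printed values need not match the ones reported. Instead of drawing a plot, the sketch compares the histogram densities of the largest sample with the normal PDF at the bin midpoints.

```r
# Sketch: simulate drive times X ~ N(mean = 20, sd = 3) for several sample
# sizes. The seed and `n.obs` are illustrative assumptions.
set.seed(123)
mu <- 20
sigma <- 3
n.obs <- c(5, 20, 100, 200)

sims <- lapply(n.obs, function(n) rnorm(n = n, mean = mu, sd = sigma))

# Sample mean and variance for each sample size
summary.table <- data.frame(
  n = n.obs,
  mean = sapply(sims, mean),
  variance = sapply(sims, var)
)
print(summary.table)
cat("Theoretical mean:", mu, " Theoretical variance:", sigma^2, "\n")

# Histogram densities of the largest sample vs. the normal PDF at bin midpoints
h <- hist(sims[[4]], plot = FALSE)
comparison <- data.frame(
  midpoint = h$mids,
  empirical = h$density,
  theoretical = dnorm(h$mids, mean = mu, sd = sigma)
)
print(comparison)
```

To see the same comparison graphically, one option is `hist(sims[[4]], freq = FALSE)` followed by `curve(dnorm(x, mean = mu, sd = sigma), add = TRUE)`.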

  1. The Poisson distribution is a probability distribution that models the number of events within a fixed time interval. A Poisson random variable takes discrete values, typically counts. The probability mass function is given as:

    \[ f(x) = \frac{\lambda^{x}_{p} e^{-\lambda_{p}}}{x!} \]

    where x ∈ {0, 1, …} and λp represents the mean number of occurrences. The notation X ∼ POIS(λp) means that X follows a Poisson distribution with parameter λp. (dpois, ppois, qpois, and rpois are the R functions for this distribution.)
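    As a short worked derivation (a standard result, included here for reference when comparing simulated values against theory), the mean follows directly from the probability mass function:

    \[ E[X] = \sum_{x=0}^{\infty} x \, \frac{\lambda_{p}^{x} e^{-\lambda_{p}}}{x!} = \lambda_{p} e^{-\lambda_{p}} \sum_{x=1}^{\infty} \frac{\lambda_{p}^{x-1}}{(x-1)!} = \lambda_{p} e^{-\lambda_{p}} e^{\lambda_{p}} = \lambda_{p} \]

    The same index shift gives E[X(X − 1)] = λp², so the variance is also λp:

    \[ \mathrm{Var}(X) = E[X(X-1)] + E[X] - (E[X])^{2} = \lambda_{p}^{2} + \lambda_{p} - \lambda_{p}^{2} = \lambda_{p} \]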

As the number of hours grows, the simulated distribution approaches the PMF, with 100 hours giving the closest match. The sample mean and variance, however, actually stray from the theoretical mean and variance (for a Poisson distribution, both equal λp). I think this is because of outlying data as lambda increases.
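A check of this kind can be sketched in R as below. The rate `lambda.p <- 4` and the hour counts 5 and 20 are hypothetical placeholders (only the 100-hour case is mentioned above), and the seed is also an assumption, so the output illustrates the method only and does not reproduce the reported results.

```r
# Sketch: simulate hourly Poisson counts and compare with theory.
# `lambda.p` and the smaller entries of `hours` are hypothetical values.
set.seed(123)
lambda.p <- 4
hours <- c(5, 20, 100)

sims <- lapply(hours, function(h) rpois(n = h, lambda = lambda.p))

# Sample mean and variance per simulation; both are lambda.p in theory
summary.table <- data.frame(
  hours = hours,
  mean = sapply(sims, mean),
  variance = sapply(sims, var)
)
print(summary.table)

# Empirical proportions vs. the Poisson PMF for the 100-hour simulation
sim <- sims[[3]]
counts <- 0:max(sim)
empirical <- as.numeric(table(factor(sim, levels = counts))) / length(sim)
theoretical <- dpois(counts, lambda = lambda.p)
print(data.frame(count = counts, empirical = empirical, theoretical = theoretical))
```

Rerunning the sketch with different seeds gives a rough sense of how much the sample mean and variance move around from chance alone at each number of hours.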