Distribution

For the first week, we have a simple warm-up exercise for the discussion. Using R, generate 100 simulations of 30 samples each from a distribution (other than normal) of your choice. Graph the sampling distribution of means. Graph the sampling distribution of the minimum.

library(tidyverse)
library(rbokeh)

list_x <- NULL
for(i in 1:1000){
  x <- rgamma(30, shape = 1.25, scale = .5)
  list_x[[i]] <- list(values = x, mean = mean(x), min = min(x))
}

figure(width = 750, height = 450) %>%
  ly_hist(list_x %>% map_dbl("mean"), breaks = 40, freq = FALSE, color = "blue") %>%
  ly_hist(list_x %>% map_dbl("min") , breaks = 40, freq = FALSE, color = "red") %>%
  x_axis(label = "Value") 

Red = Minimum, Blue = Mean

What did the simulation of the means demonstrate?

The simulation of the means demonstrate the central limit theorem such that even if samples are taken from a non normal distribution, at a sufficiently large size, the means will approximate a normal distribution.

What about the distribution of the minimum demonstrate?

We see that the theorem does not hold for the minimum values, the distribution of the minimum does not approximate a normal distribution.