Distribution
For the first week, we have a simple warm-up exercise for the discussion. Using R, generate 100 simulations of 30 samples each from a distribution (other than normal) of your choice. Graph the sampling distribution of means. Graph the sampling distribution of the minimum.
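library(tidyverse)
library(rbokeh)
# The exercise asks for 100 simulations of 30 draws each.
n_sims <- 100
# Preallocate one list slot per simulation.
list_x <- vector("list", n_sims)
for (i in seq_len(n_sims)) {
  # Draw 30 values from a gamma distribution (a right-skewed, non-normal choice).
  x <- rgamma(30, shape = 1.25, scale = .5)
  list_x[[i]] <- list(values = x, mean = mean(x), min = min(x))
}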
figure(width = 750, height = 450) %>%
  ly_hist(list_x %>% map_dbl("mean"), breaks = 40, freq = FALSE, color = "blue") %>%
  ly_hist(list_x %>% map_dbl("min"), breaks = 40, freq = FALSE, color = "red") %>%
  x_axis(label = "Value")
Red = Minimum, Blue = Mean
What did the simulation of the means demonstrate?
The simulation of the means demonstrates the central limit theorem: even when samples are drawn from a non-normal distribution, the distribution of the sample means approaches a normal distribution once the sample size is sufficiently large (here, 30 draws per sample).
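One way to check this claim numerically is to compare the simulated means against the values the central limit theorem predicts for a gamma distribution: a centre of shape × scale and a standard error of sqrt(shape × scale² / n). The sketch below assumes the same gamma parameters used in the simulation; the seed, the larger number of simulations (10,000), and the variable names are illustrative choices, not part of the original exercise.

```r
# Illustrative check of the CLT prediction; the seed and n_sims are arbitrary choices.
set.seed(1)
shape  <- 1.25
scale  <- 0.5
n      <- 30
n_sims <- 10000

# Each column of `draws` is one simulated sample of size n.
draws <- matrix(rgamma(n * n_sims, shape = shape, scale = scale), nrow = n)
sample_means <- colMeans(draws)

# Theoretical mean and standard error of the sample mean for a gamma distribution.
theory_mean <- shape * scale              # 0.625
theory_se   <- sqrt(shape * scale^2 / n)  # about 0.102

# Print simulated and theoretical values side by side.
print(c(simulated_mean = mean(sample_means), theory_mean = theory_mean))
print(c(simulated_se = sd(sample_means), theory_se = theory_se))
```

If the theorem is doing its job, the simulated values should sit close to the theoretical ones, and a histogram of `sample_means` should look roughly bell-shaped.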
What does the distribution of the minimum demonstrate?
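The theorem does not hold for the minimum. The central limit theorem applies to sums and averages, not to extremes, so the distribution of the sample minimum does not approximate a normal distribution; here it stays bounded below by zero and skewed to the right.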
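A rough way to quantify the difference is to compare the sample skewness of the two sampling distributions, since a normal distribution has skewness zero. The sketch below again assumes the gamma parameters from the simulation; the `skewness` helper, the seed, and the number of simulations are hypothetical choices made for illustration.

```r
# Illustrative comparison of skewness; the seed and n_sims are arbitrary choices.
set.seed(1)
n      <- 30
n_sims <- 10000

# Each column of `draws` is one simulated sample of size n.
draws <- matrix(rgamma(n * n_sims, shape = 1.25, scale = 0.5), nrow = n)
sample_means <- colMeans(draws)
sample_mins  <- apply(draws, 2, min)

# Hypothetical helper: moment-based sample skewness (zero for a symmetric distribution).
skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

skew_mean <- skewness(sample_means)
skew_min  <- skewness(sample_mins)

# Print both skewness values and the range of the minima.
print(c(skew_mean = skew_mean, skew_min = skew_min))
print(range(sample_mins))
```

The skewness of the means should come out much closer to zero than that of the minima, and every minimum is positive because the gamma distribution only takes positive values.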