We will draw random samples from continuous distributions at several sample sizes.
Consider the Normal distribution \(N(\mu = 5, \sigma = 1.2)\). We will draw random samples from this distribution and examine the histograms they produce. We expect that as the sample size increases, the sample will approximate the true distribution increasingly well.
Plot the density curve of \(N(\mu = 5, \sigma = 1.2)\):
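mean <- 5
sd <- 1.2  # matches sigma = 1.2 in the distribution stated above
# grid covering the mean plus or minus 4 standard deviations
x <- seq(-4, 4, length = 100) * sd + mean
hx <- dnorm(x, mean, sd)
# set up empty axes, then draw the density curve on them
plot(x, hx, type = "n", xlab = "x", ylab = "",
     main = "normal distribution", axes = TRUE)
lines(x, hx)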
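To connect the density curve with the expectation stated earlier, the following sketch draws a single sample at a few sizes and overlays the true density on each histogram. The seed, the three sample sizes, and the names `mu`, `sigma`, `n_values`, and `draws` are illustrative assumptions, not part of the original exercise.

```r
set.seed(1)  # assumed seed, only for reproducibility
mu <- 5
sigma <- 1.2
n_values <- c(10, 100, 1000)  # illustrative sample sizes

par(mfrow = c(1, 3))
for (n in n_values) {
  draws <- rnorm(n, mu, sigma)
  # freq = FALSE puts the histogram on the density scale
  hist(draws, freq = FALSE, xlab = "x", main = paste0("n = ", n))
  # overlay the true N(mu, sigma) density on the current plot
  curve(dnorm(x, mu, sigma), add = TRUE)
}
```

With only 10 draws the histogram is typically ragged; with 1000 it should follow the curve much more closely.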
For each of the sample sizes \(4, 7, 10, 15, 20, 30, 40, 80, 1000\), draw 1000 random samples of that size, compute the mean of each sample, and plot a histogram of the resulting sample means.
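size <- c(4, 7, 10, 15, 20, 30, 40, 80, 1000)
sim_num <- 1000  # number of samples drawn at each sample size
mean <- 5
sd <- 1.2
# mean of one random sample of size n from N(mean, sd)
sampleM <- function(n, mean, sd){
  sample <- rnorm(n, mean, sd)
  meanS <- mean(sample)
  return(meanS)
}
par(mfrow = c(3, 3))  # 3 x 3 grid, one panel per sample size
for(i in seq_along(size)){
  # sim_num sample means at this sample size
  data <- replicate(sim_num, sampleM(size[i], mean, sd), simplify = "array")
  hist(data, xlab = "sample mean", main = paste0("Sample Size: ", size[i]))
}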
What do you notice? I noticed that as the sample size increases, the histogram of sample means becomes narrower and more tightly concentrated around the true mean, and its shape more clearly resembles a normal distribution.
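One way to quantify this narrowing is to compare the empirical standard deviation of the simulated sample means with the theoretical value \(\sigma / \sqrt{n}\). This is a minimal sketch; the seed, the reduced set of sample sizes, and the names `mu`, `sigma`, `sizes`, `empirical_sd`, and `theoretical_sd` are illustrative assumptions.

```r
set.seed(42)  # assumed seed, only for reproducibility
mu <- 5
sigma <- 1.2
sim_num <- 1000
sizes <- c(4, 20, 80, 1000)  # illustrative subset of the sizes above

# empirical standard deviation of sim_num sample means at each size
empirical_sd <- sapply(sizes, function(n) {
  sd(replicate(sim_num, mean(rnorm(n, mu, sigma))))
})
# theoretical standard deviation of the sample mean
theoretical_sd <- sigma / sqrt(sizes)

print(data.frame(n = sizes, empirical_sd, theoretical_sd))
```

The two columns should agree closely, and both shrink as \(n\) grows.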
Consider the Gamma distribution \(\text{Gamma}(\alpha = 2, \beta = 1.5)\), with shape \(\alpha\) and scale \(\beta\). We will draw random samples from this distribution and examine the histograms they produce. We expect that as the sample size increases, the sample will approximate the true distribution increasingly well.
Plot the density curve of \(\text{Gamma}(\alpha = 2, \beta = 1.5)\):
alpha <- 2    # shape
beta <- 1.5   # scale
x <- seq(0, 10, by = 0.01)
hx <- dgamma(x, shape = alpha, scale = beta)
# set up empty axes, then draw the density curve on them
plot(x, hx, type = "n", xlab = "x", ylab = "",
     main = "gamma distribution", axes = TRUE)
lines(x, hx)
For each of the sample sizes \(4, 7, 10, 15, 20, 30, 40, 80, 1000\), draw 1000 random samples of that size, compute the mean of each sample, and plot a histogram of the resulting sample means.
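size <- c(4, 7, 10, 15, 20, 30, 40, 80, 1000)
sim_num <- 1000  # number of samples drawn at each sample size
alpha <- 2       # shape
beta <- 1.5      # scale
# mean of one random sample of size n from Gamma(shape = alpha, scale = beta)
sampleM <- function(n, alpha, beta){
  sample <- rgamma(n, shape = alpha, scale = beta)
  meanS <- mean(sample)
  return(meanS)
}
par(mfrow = c(3, 3))  # 3 x 3 grid, one panel per sample size
for(i in seq_along(size)){
  # pass the Gamma parameters, not the Normal mean and sd
  data <- replicate(sim_num, sampleM(size[i], alpha, beta), simplify = "array")
  hist(data, xlab = "sample mean", main = paste0("Sample Size: ", size[i]))
}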
What do you notice?
I noticed that as the sample size increases, the histogram of sample means loses its right skew, becomes narrower, and looks increasingly like a normal distribution, as the central limit theorem predicts.
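This trend can be made precise with a short calculation, assuming that \(\beta\) is the scale parameter (as in the `dgamma` call above) and that the draws are independent. Under those assumptions, the mean \(\bar{X}_n\) of \(n\) draws from \(\text{Gamma}(\alpha, \beta)\) is itself Gamma distributed, with shape \(n\alpha\) and scale \(\beta / n\):

\[
E[\bar{X}_n] = \alpha\beta = 3, \qquad
\text{Var}(\bar{X}_n) = \frac{\alpha\beta^2}{n} = \frac{4.5}{n}, \qquad
\text{skewness}(\bar{X}_n) = \frac{2}{\sqrt{n\alpha}} = \sqrt{\frac{2}{n}}.
\]

The skewness falls from about \(0.71\) at \(n = 4\) to about \(0.045\) at \(n = 1000\), which is why the histograms look increasingly symmetric.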