Illustrate via simulation and associated explanatory text the properties of the distribution of the mean of 40 exponential(0.2)s.
From the assignment, we know that the sample size is 40 and the number of simulations is 1000. The theoretical exponential distribution has a mean of 1/lambda, a standard deviation equal to the mean (1/lambda), and a variance of 1/lambda^2.
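As a quick check of these theoretical values, the sketch below draws a large sample with `rexp` and compares its mean, sd, and variance with 1/lambda and 1/lambda^2. The seed and the 100000 draws are assumptions chosen for illustration; they are not part of the analysis that follows.

```r
# Illustrative check only: the seed and number of draws are assumptions
set.seed(1)
lambda <- 0.2
draws <- rexp(100000, lambda)   # rexp uses the rate, so the mean is 1/lambda

check <- c(mean = mean(draws), sd = sd(draws), var = var(draws))
theory <- c(mean = 1/lambda, sd = 1/lambda, var = 1/lambda^2)

# Simulated and theoretical values side by side
round(rbind(simulated = check, theoretical = theory), 3)
```

With this many draws, the simulated row should sit close to the theoretical row (5, 5, 25).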
Packages we use:
library(ggplot2)
library(knitr)
Simulate the sample means, then calculate the actual and theoretical mean and sd (the variance and a summary data frame follow further down):
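n <- 40          # sample size
nosim <- 1000    # number of simulations
lambda <- 0.2    # rate of the exponential distribution
# each row of the matrix holds one sample of n exponentials
Matrix <- matrix(rexp(nosim * n, lambda), nosim)
data <- apply(Matrix, 1, mean)   # mean of each sample
actual_mean <- mean(data)
theo_mean <- 1/lambda
actual_sd <- sd(data)
theo_sd <- 1/lambda * (1/sqrt(n))   # sd of the sample mean: (1/lambda)/sqrt(n)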
actual_mean
## [1] 4.995171
theo_mean
## [1] 5
Now plot a histogram of the simulated sample means, with a normal density curve that uses the actual mean and sd.
ggplot(data = data.frame(data), aes(x = data)) +
  geom_histogram(binwidth = 0.1, aes(y = ..density..), fill = NA,
                 color = "purple") +
  geom_vline(xintercept = actual_mean, size = 1, color = "blue", linetype = "dashed") +
  # normal curve with the actual mean and sd
  stat_function(fun = dnorm, color = "blue", size = 1,
                args = list(mean = actual_mean, sd = actual_sd)) +
  ylab("density") + xlab("sample mean") +
  ggtitle("simulation distribution with 40 exponentials and lambda 0.2") +
  scale_x_continuous(breaks = seq(0, 8, 1))
We see that the actual mean (4.995171) is very close to the theoretical mean (5).
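One way to judge "very close" is to compare the gap with the Monte Carlo standard error of the average of 1000 sample means, which is (1/lambda)/sqrt(n)/sqrt(nosim) = 0.025 here. The gap reported above, 5 - 4.995171 = 0.004829, is well inside that. The sketch below repeats the comparison on a fresh simulation; the seed and the names `sim_means`, `diff_mean`, and `mc_se` are assumptions for illustration, so its numbers will not match the run above.

```r
# Sketch: compare |actual mean - theoretical mean| with its Monte Carlo standard error.
# The seed is an assumption, so results differ from the run shown in the report.
set.seed(2)
n <- 40; nosim <- 1000; lambda <- 0.2
sim_means <- rowMeans(matrix(rexp(nosim * n, lambda), nosim))

diff_mean <- abs(mean(sim_means) - 1/lambda)
mc_se <- (1/lambda) / sqrt(n) / sqrt(nosim)   # sd of the average of nosim sample means

c(difference = diff_mean, mc_se = mc_se, ratio = diff_mean / mc_se)
```

A ratio of roughly 2 or less means the difference is what random simulation noise alone would produce.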
actual_var<-var(data)
theo_var<-1/(lambda^2*n)
Form a data frame showing the mean, sd, and var:
df <- data.frame(name = c("actual", "theoretical"),
                 mean = c(actual_mean, theo_mean),
                 sd = c(actual_sd, theo_sd),
                 var = c(actual_var, theo_var))
df
##          name     mean        sd       var
## 1      actual 4.995171 0.7975797 0.6361334
## 2 theoretical 5.000000 0.7905694 0.6250000
We see that the actual variance (0.6361) is close to the theoretical variance (0.625).
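The theoretical values follow from the variance of a mean of n independent draws, each with variance 1/lambda^2. Written out for lambda = 0.2 and n = 40:

```latex
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{1/\lambda^2}{n}
  = \frac{1}{0.2^2 \cdot 40} = \frac{25}{40} = 0.625,
\qquad
\operatorname{sd}(\bar{X}) = \frac{1/\lambda}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
```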
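Now we plot the distribution of the means of 40 exponentials together with the theoretical normal distribution (mean 1/lambda, sd (1/lambda)/sqrt(n), as the central limit theorem suggests), and look at the difference between their means.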
ggplot(data = data.frame(data), aes(x = data)) +
  geom_histogram(binwidth = 0.1, aes(y = ..density..), fill = NA,
                 color = "purple") +
  geom_vline(xintercept = actual_mean, size = 1, color = "blue", linetype = "dashed") +
  geom_vline(xintercept = theo_mean, size = 1, color = "red", linetype = "dashed") +
  # normal curve with the actual mean and sd
  stat_function(fun = dnorm, color = "blue", size = 1,
                args = list(mean = actual_mean, sd = actual_sd)) +
  # normal curve with the theoretical mean and sd
  stat_function(fun = dnorm, color = "red", size = 1,
                args = list(mean = theo_mean, sd = theo_sd)) +
  ylab("density") + xlab("mean") +
  ggtitle("Actual distribution(blue) and theoretical distribution(red)") +
  scale_x_continuous(breaks = seq(0, 8, 1))
Legend: the red lines are the theoretical density curve and the theoretical mean; the blue lines are the actual density curve and the actual mean.
In the plot, the blue curve (actual) is very close to the red curve (theoretical), and the histogram of sample means follows both curves closely. Therefore, the distribution of the mean of 40 exponentials is approximately normal.
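The visual comparison can be backed by a simple numeric check: under normality, about 95% of the sample means should fall within 1.96 theoretical standard deviations of the theoretical mean. The sketch below computes that share on a fresh simulation and draws a normal Q-Q plot. The seed and the names `sim_means` and `coverage` are assumptions for illustration, not part of the analysis above.

```r
# Sketch of a numeric normality check (the seed is an assumption for illustration)
set.seed(3)
n <- 40; nosim <- 1000; lambda <- 0.2
sim_means <- rowMeans(matrix(rexp(nosim * n, lambda), nosim))

theo_mean <- 1/lambda
theo_sd <- (1/lambda) / sqrt(n)

# Share of sample means inside theo_mean +/- 1.96 * theo_sd; about 0.95 under normality
coverage <- mean(abs(sim_means - theo_mean) <= qnorm(0.975) * theo_sd)
coverage

# Q-Q plot: points lying near the line indicate approximate normality
qqnorm(sim_means)
qqline(sim_means, col = "red")
```

Some curvature in the upper tail of the Q-Q plot is to be expected, because the mean of 40 exponentials is still slightly right-skewed.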