Part One of this report investigates the exponential distribution in R and compares the distribution of its sample means with what the Central Limit Theorem predicts. Part Two analyzes the ToothGrowth data, comparing tooth growth by supplement type (supp) and dose.
The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. We set lambda = 0.2 for all simulations. The code below draws 40 exponentials 1000 times and stores each set of 40 as one row of a matrix.
set.seed(1)
lambda <- 0.2
n <- 40
numsim <- 1000
data <- matrix(rexp(n*numsim, lambda), numsim)
Next, we calculate the theoretical and actual (simulated) mean, standard deviation, and variance of the sample means.
theoMean <- 1/lambda
rowMean <- apply(data, 1, mean)
actlMean <- mean(rowMean)
theoSTDEV <- ((1/lambda) * (1/sqrt(n)))
actSTDEV <- sd(rowMean)
theoVariance <- theoSTDEV^2
actVariance <- var(rowMean)
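The theoretical values in the code above follow from standard properties of the exponential distribution. As a short worked derivation (the symbols $X_i$, $\bar{X}$, and $\lambda$ are notation introduced here, not taken from the report), for $n = 40$ independent draws with rate $\lambda = 0.2$:

$$
E[X_i] = \frac{1}{\lambda} = 5, \qquad \operatorname{Var}(X_i) = \frac{1}{\lambda^2} = 25
$$

$$
E[\bar{X}] = \frac{1}{\lambda} = 5, \qquad \operatorname{Var}(\bar{X}) = \frac{1}{\lambda^2 n} = \frac{25}{40} = 0.625, \qquad \operatorname{SD}(\bar{X}) = \frac{1}{\lambda \sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
$$

These correspond to theoMean, theoVariance, and theoSTDEV respectively.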
The graph shows a histogram of the 1000 sample means. The yellow dashed vertical line marks the theoretical mean and the black vertical line marks the actual mean. The red curve is the normal density with the theoretical mean and standard deviation, and the blue curve is the normal density with the actual mean and standard deviation. The table compares the theoretical and actual means, standard deviations, and variances.
library(ggplot2) # needed for ggplot() and the geoms below
dfRowMeans <- data.frame(rowMean) # convert to data.frame for ggplot
mp <- ggplot(dfRowMeans, aes(x=rowMean))
mp <- mp + geom_histogram(binwidth = 0.2, fill="orange", color="black", aes(y = after_stat(density))) # after_stat() replaces the deprecated ..density.. (ggplot2 >= 3.3.0)
mp <- mp + labs(title="Density of Means of 40 Exponentials", x="Mean of 40 Selections", y="Density")
mp <- mp + geom_vline(xintercept=actlMean,size=1.0, color="black")
mp <- mp + stat_function(fun=dnorm,args=list(mean=actlMean, sd=actSTDEV),color = "blue", size = 1.0)
mp <- mp + geom_vline(xintercept = theoMean, size = 1.0, color = "yellow", linetype = "longdash")
mp <- mp + stat_function(fun = dnorm, args = list(mean = theoMean, sd = theoSTDEV),color = "red", size = 1.0)
mp <- mp + theme_bw() + theme(plot.title = element_text(hjust = 0.5))
mp
| Variable | Theoretical Value | Actual Value |
|---|---|---|
| Mean | 5 | 4.9900252 |
| Standard Deviation | 0.7905694 | 0.7859435 |
| Variance | 0.625 | 0.6177072 |
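The table above could be rebuilt directly from the simulation rather than typed by hand. The following is a minimal sketch, assuming the same seed and settings as the report; the names sims and comparison are hypothetical and not part of the original code, and rowMeans() is used here in place of apply(data, 1, mean).

```r
# Re-run the simulation with the report's settings so the block is self-contained
set.seed(1)
lambda <- 0.2
n <- 40
numsim <- 1000
sims <- matrix(rexp(n * numsim, lambda), nrow = numsim) # one row per simulation
rowMean <- rowMeans(sims)                               # mean of each set of 40 draws

# Hypothetical comparison table; rows mirror the table in the report
comparison <- data.frame(
  Variable    = c("Mean", "Standard Deviation", "Variance"),
  Theoretical = c(1 / lambda, (1 / lambda) / sqrt(n), (1 / lambda)^2 / n),
  Actual      = c(mean(rowMean), sd(rowMean), var(rowMean))
)
print(comparison)
```

In an R Markdown document, passing comparison to knitr::kable() would be one way to render it as a table, assuming the knitr package is installed.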
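As the graph shows, the distribution of the 1000 sample means is approximately normal, as the Central Limit Theorem predicts: the histogram closely follows both normal curves, and the actual mean and variance are close to their theoretical values.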
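The visual impression of normality can be supplemented with a quick numerical check. The sketch below is one possible approach, not part of the original analysis: it standardizes the sample means with the theoretical mean and standard deviation and measures how many fall within 1.96 standard deviations, which would be about 95% for a normal distribution. The names z and coverage are hypothetical.

```r
# Re-run the simulation with the report's settings so the block is self-contained
set.seed(1)
lambda <- 0.2
n <- 40
numsim <- 1000
rowMean <- rowMeans(matrix(rexp(n * numsim, lambda), nrow = numsim))

# Standardize each sample mean using the theoretical mean and standard deviation
z <- (rowMean - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized means within +/- 1.96 (roughly 0.95 if approximately normal)
coverage <- mean(abs(z) <= 1.96)
print(coverage)

# Normal Q-Q plot: points near the reference line indicate approximate normality
qqnorm(z)
qqline(z)
```

Some departure in the tails of the Q-Q plot would not be surprising, since the mean of 40 exponentials is still slightly right-skewed.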