In this project we investigate the exponential distribution in R and compare the distribution of its sample means with what the Central Limit Theorem predicts.
We illustrate, via simulation and associated explanatory text, the properties of the distribution of the mean of 40 exponentials. We are going to:
* Show the sample mean and compare it to the theoretical mean of the distribution.
* Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
* Show that the distribution is approximately normal.
The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. Both the mean and the standard deviation of the exponential distribution are 1/lambda. We set lambda = 0.2 for all of the simulations. Here we create the distribution of 1000 averages of 40 exponentials, which means we run a thousand simulations. For reproducibility, set.seed() fixes the starting state of the random number generator, so every run generates the same values.
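For reference, the theoretical quantities that the simulation is compared against follow from standard properties of the exponential distribution and the Central Limit Theorem. The symbols $X_i$, $\bar{X}_n$, $\mu$ and $\sigma^2$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
X_1,\dots,X_n &\overset{\text{iid}}{\sim} \operatorname{Exp}(\lambda), \qquad
  \mu = E[X_i] = \frac{1}{\lambda}, \qquad
  \sigma^2 = \operatorname{Var}(X_i) = \frac{1}{\lambda^2} \\
E[\bar{X}_n] &= \frac{1}{\lambda} = \frac{1}{0.2} = 5 \\
\operatorname{Var}(\bar{X}_n) &= \frac{\sigma^2}{n} = \frac{1}{\lambda^2 n} = \frac{25}{40} = 0.625 \\
\operatorname{sd}(\bar{X}_n) &= \frac{1}{\lambda\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.791
\end{aligned}
```

With $\lambda = 0.2$ and $n = 40$, the Central Limit Theorem suggests that the sample means should be approximately normal with mean 5 and standard deviation of about 0.791.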
set.seed(123456)
n <- 40
Simulations <- 1000
Lambda <- 0.2
# Preallocate the result vector, then store the mean of each simulated sample
SampleMean <- numeric(Simulations)
for (i in seq_len(Simulations)) {
  SampleMean[i] <- mean(rexp(n, Lambda))
}
mean(SampleMean)
## [1] 5.022915
The output shows that the sample mean of about 5.02 is very close to the theoretical mean of the distribution, 1/lambda = 5.
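The same comparison can be made for the variance. The following is a minimal sketch, assuming the same seed and parameters as the simulation above; the names TheoreticalVar and SampleVar are introduced here for illustration and are not part of the original analysis. No output is shown because it was not part of the original report.

```r
# Re-create the 1000 sample means with the same seed and parameters as above
set.seed(123456)
n <- 40
Simulations <- 1000
Lambda <- 0.2
SampleMean <- numeric(Simulations)
for (i in seq_len(Simulations)) {
  SampleMean[i] <- mean(rexp(n, Lambda))
}

# Theoretical variance of the mean of n exponentials: (1/lambda)^2 / n
TheoreticalVar <- (1 / Lambda)^2 / n

# Empirical variance of the simulated sample means
SampleVar <- var(SampleMean)

print(c(sample = SampleVar, theoretical = TheoreticalVar))
```

If the simulation behaves as the theory suggests, the two printed values should be close to each other, with the theoretical value equal to 0.625.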
Show that the distribution is approximately normal:
hist(SampleMean, breaks = n, probability = TRUE, col = "light blue", xlab = "Sample Means")
# Overlay the normal density predicted by the Central Limit Theorem
x <- seq(min(SampleMean), max(SampleMean), length.out = 100)
lines(x, dnorm(x, mean = 1/Lambda, sd = 1/Lambda/sqrt(n)), col = "red")
qqnorm(SampleMean, col = "light blue")
qqline(SampleMean, col = "Red")
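The plots above give a visual impression of normality. As a rough numeric complement, one could check how many standardized sample means fall within the central 95% range of a standard normal. This is only a sketch under the same seed and parameters as above; the names z and Coverage are hypothetical, and this check is not part of the original analysis.

```r
# Re-create the 1000 sample means with the same seed and parameters as above
set.seed(123456)
n <- 40
Simulations <- 1000
Lambda <- 0.2
SampleMean <- numeric(Simulations)
for (i in seq_len(Simulations)) {
  SampleMean[i] <- mean(rexp(n, Lambda))
}

# Standardize the sample means with the theoretical mean and standard error
z <- (SampleMean - 1 / Lambda) / (1 / Lambda / sqrt(n))

# Under a standard normal, about 95% of values lie within +/- qnorm(0.975)
Coverage <- mean(abs(z) <= qnorm(0.975))
print(Coverage)
```

A proportion near 0.95 would be consistent with the approximately normal shape seen in the histogram and Q-Q plot.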
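In the second part of the project we analyze the ToothGrowth data from the R datasets package and compare tooth length (len) by supplement type (supp) and dose.
library(datasets)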
data(ToothGrowth)
library(ggplot2)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
dim(ToothGrowth)
## [1] 60 3
summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
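# Bars stack the individual observations, so each bar height is the total length per dose and supplement
ggplot(data = ToothGrowth, aes(x = as.factor(dose), y = len, fill = supp)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ supp) +
  xlab("Dose (mg/day)") +
  ylab("Tooth length")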
hypoth1 <- t.test(len ~ supp, data = ToothGrowth)
hypoth1$conf.int
## [1] -0.1710156 7.5710156
## attr(,"conf.level")
## [1] 0.95
hypoth1$p.value
## [1] 0.06063451
hypoth2<-t.test(len ~ supp, data = subset(ToothGrowth, dose == 0.5))
hypoth2$conf.int
## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95
hypoth2$p.value
## [1] 0.006358607
hypoth3<-t.test(len ~ supp, data = subset(ToothGrowth, dose == 1))
hypoth3$conf.int
## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95
hypoth3$p.value
## [1] 0.001038376
hypoth4<-t.test(len ~ supp, data = subset(ToothGrowth, dose == 2))
hypoth4$conf.int
## [1] -3.79807 3.63807
## attr(,"conf.level")
## [1] 0.95
hypoth4$p.value
## [1] 0.9638516
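At doses of 0.5 and 1.0 mg/day, the OJ supplement produces significantly greater tooth growth than VC (p = 0.006 and p = 0.001; both 95% confidence intervals lie above zero). At the higher dose of 2.0 mg/day, OJ and VC yield similar tooth growth (p = 0.96; the confidence interval is roughly centered on zero). Pooled over all doses, the difference is not significant at the 5% level (p = 0.061; the confidence interval includes zero). Consequently, we cannot conclude that one supplement type is more effective than the other in all scenarios.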
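The three dose-specific comparisons can also be collected in a single table, which makes the confidence intervals and p-values easier to read side by side. This is a minimal sketch rather than the original analysis code; the names Doses and DoseTests and the column names are hypothetical. It relies on the default behavior of t.test, which is Welch's test (var.equal = FALSE).

```r
library(datasets)

# Doses present in ToothGrowth (0.5, 1 and 2)
Doses <- sort(unique(ToothGrowth$dose))

# Run one Welch two-sample t-test per dose and keep the key numbers
DoseTests <- do.call(rbind, lapply(Doses, function(d) {
  tt <- t.test(len ~ supp, data = subset(ToothGrowth, dose == d))
  data.frame(dose = d,
             ci.lower = tt$conf.int[1],
             ci.upper = tt$conf.int[2],
             p.value = tt$p.value)
}))

print(DoseTests)
```

Each row should reproduce the interval and p-value reported for hypoth2, hypoth3 and hypoth4 above, with the difference taken as OJ minus VC.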
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18362)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_3.2.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.2 knitr_1.24 magrittr_1.5 tidyselect_0.2.5
## [5] munsell_0.5.0 colorspace_1.4-1 R6_2.4.0 rlang_0.4.1
## [9] plyr_1.8.4 stringr_1.4.0 dplyr_0.8.3 tools_3.6.1
## [13] grid_3.6.1 gtable_0.3.0 xfun_0.8 withr_2.1.2
## [17] htmltools_0.3.6 yaml_2.2.0 lazyeval_0.2.2 digest_0.6.20
## [21] assertthat_0.2.1 tibble_2.1.3 crayon_1.3.4 reshape2_1.4.3
## [25] purrr_0.3.2 glue_1.3.1 evaluate_0.14 rmarkdown_1.16
## [29] labeling_0.3 stringi_1.4.3 compiler_3.6.1 pillar_1.4.2
## [33] scales_1.0.0 pkgconfig_2.0.2