First a simulation exercise and second an inferential analysis exercise.
1. Simulation Exercises:
1.1 Investigate the exponential distribution in R
lambda <- 0.2 # value of lambda from the instructions
n <- 1000 # number of simulations
hist(rexp(n, rate = 1/lambda)) # histogram of 1000 simulated exponentials (rate = 1/lambda, so the theoretical mean is lambda)

mean(rexp(n, rate = 1/lambda)) # mean of a fresh sample of 1000 draws
## [1] 0.1957426
var(rexp(n, rate = 1/lambda)) # variance of another fresh sample of 1000 draws
## [1] 0.0430937
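As a sanity check on the numbers above, the simulated moments can be compared with the theoretical ones. This is a minimal sketch, assuming the same parameterization as the code above (`rexp` called with rate = 1/lambda, so the theoretical mean is lambda and the theoretical variance is lambda^2). The seed and the names `sims`, `theory` and `observed` are illustrative additions, not part of the original run.

```r
set.seed(1)                        # illustrative seed, not from the original run
lambda <- 0.2
n <- 1000
sims <- rexp(n, rate = 1/lambda)   # one sample, reused for every summary

theory <- c(mean = lambda, variance = lambda^2)          # 0.2 and 0.04
observed <- c(mean = mean(sims), variance = var(sims))
round(rbind(theory, observed), 4)  # side-by-side comparison
```

Storing the sample once means the histogram, mean and variance would all describe the same 1000 draws, which makes the comparison easier to read.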
1.2 Investigate the distribution of averages of 40 exponentials and compare with 1.1
mns <- replicate(1000, mean(rexp(40, rate = 1/lambda))) # 1000 means, each of 40 exponentials
hist(mns) # histogram of above

mean(mns) #mean of above
## [1] 0.1995004
var(mns) # variance of above
## [1] 0.001064762
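Explanation of the difference in variances: the variance in 1.2 is much smaller than in 1.1 because the variance of an average of 40 independent draws is the single-draw variance divided by 40. With rate = 1/lambda the single-draw variance is lambda^2 = 0.04, so the averages should have variance 0.04/40 = 0.001, close to the simulated 0.001064762. The CLT explains the shape: the histogram of averages is approximately normal, unlike the skewed histogram in 1.1.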
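The relationship between the two variances can be checked directly. The sketch below assumes the same setup as above (rate = 1/lambda, samples of size 40, 1000 repetitions). The seed and the names `size`, `var_single` and `var_mean_theory` are illustrative.

```r
set.seed(2)                           # illustrative seed
lambda <- 0.2
size <- 40                            # number of exponentials per average
mns <- replicate(1000, mean(rexp(size, rate = 1/lambda)))

var_single <- lambda^2                # theoretical variance of one draw
var_mean_theory <- var_single / size  # theoretical variance of an average of 40
c(theory = var_mean_theory, simulated = var(mns))
```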
2. Inferential Analysis: analyze the ToothGrowth data from the R datasets package.
i. Load the ToothGrowth data
ii. Provide a basic summary of the data.
data("ToothGrowth")
x <- ToothGrowth$len #vector of len
mean(x) #mean len
## [1] 18.81333
y <- mean(x) # store the mean in y for further analysis
var(x) # variance of len
## [1] 58.51202
y + c(-1, 1) * qnorm(0.975) * sd(x)/sqrt(length(x)) # 95% confidence interval for mean len (normal approximation)
## [1] 16.87783 20.74884
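With 60 observations the normal quantile is a reasonable approximation, but a t quantile with n - 1 degrees of freedom gives a slightly wider interval. A minimal sketch using base R's `qt` and `t.test`, offered only as a cross-check on the interval above and not as the original method. The names `ci_manual` and `ci_ttest` are illustrative.

```r
data("ToothGrowth")
x <- ToothGrowth$len
n <- length(x)

# t-based 95% interval computed by hand
ci_manual <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)

# the same interval as reported by a one-sample t-test
ci_ttest <- as.numeric(t.test(x)$conf.int)

ci_manual
ci_ttest
```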
iii. Hypothesis tests to compare tooth growth by supp and dose
iv. Conclusions and the assumptions
# Test Ho: supplement type (supp) has no effect on len
t.test(len ~ supp, paired = FALSE, var.equal = TRUE, data = ToothGrowth)
##
## Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 58, p-value = 0.06039
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1670064 7.5670064
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
# Test Ho: doses 1 and 2 have the same effect on len
ToothGrowt12 <- subset(ToothGrowth, dose %in% c(1, 2))# new subset with doses 1 & 2
t.test(len ~ dose, paired = FALSE, var.equal = TRUE, data = ToothGrowt12)
##
## Two Sample t-test
##
## data: len by dose
## t = -4.9005, df = 38, p-value = 1.811e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.994387 -3.735613
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
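Assumptions: the two groups in each test are independent samples (paired = FALSE) with equal population variances (var.equal = TRUE).
Conclusions: For supp, p = 0.06039 is not below .05 and the 95 percent interval (-0.167, 7.567) contains 0, so we cannot reject Ho: supplement type has no effect on len. For dose, p = 1.811e-05 is below .05 and the 95 percent interval (-8.994, -3.736) excludes 0, so we reject Ho: doses 1 and 2 have the same effect on len.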
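The equal-variance assumption can be relaxed. As a hedged cross-check (not part of the original analysis), the sketch below reruns both comparisons with Welch's t-test, which `t.test` performs when `var.equal = FALSE`. The object names `welch_supp`, `tg12` and `welch_dose` are illustrative.

```r
data("ToothGrowth")

# Welch test: supplement type vs len
welch_supp <- t.test(len ~ supp, data = ToothGrowth, var.equal = FALSE)

# Welch test: dose 1 vs dose 2
tg12 <- subset(ToothGrowth, dose %in% c(1, 2))
welch_dose <- t.test(len ~ dose, data = tg12, var.equal = FALSE)

# p-values to compare against the pooled-variance tests
c(supp = welch_supp$p.value, dose = welch_dose$p.value)
```

If the Welch p-values fall on the same side of .05 as the pooled ones, the conclusions do not depend on the equal-variance assumption.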