Question1

Illustrate via simulation and associated explanatory text the properties of the distribution of the mean of 40 exponentials. You should

Show the sample mean and compare it to the theoretical mean of the distribution.

Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

Show that the distribution is approximately normal.

lambda<-0.2
print(paste("theoretical mean is",as.character( 1/lambda)))
## [1] "theoretical mean is 5"
mns = NULL
for (i in 1 : 1000) mns = c(mns, mean(rexp(40,lambda)))
print(paste("sample mean is",as.character(mean(mns))))  
## [1] "sample mean is 4.9860331933177"

variance

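variance<-var(mns)
# var(mns) estimates sigma^2/40, so sqrt(var(mns))*sqrt(40) estimates the population standard deviation sigma
true_sd<-sqrt(variance)*sqrt(40)
# for an exponential distribution the standard deviation is 1/lambda (the variance is 1/lambda^2)
print(paste("theoretical standard deviation is", as.character(1/lambda)))  
## [1] "theoretical standard deviation is 5"
print(paste("estimated standard deviation is", as.character(true_sd)))
## [1] "estimated standard deviation is 5.1326598639625"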
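The comparison the prompt asks for can also be made directly on the variance of the mean. For exponentials with rate lambda, the mean of n draws has theoretical variance (1/lambda)^2/n, which is 0.625 for lambda = 0.2 and n = 40. The following is a minimal sketch, not the original run: the seed (`set.seed(1)`) is an arbitrary assumption added for reproducibility, and the names `n`, `nsim`, `sim_means`, `theoretical_var` and `sample_var` are introduced here for illustration.

```r
lambda <- 0.2   # rate, as in the code above
n      <- 40    # number of exponentials averaged per simulation
nsim   <- 1000  # number of simulated means

set.seed(1)     # assumption: arbitrary seed, not used in the original run
sim_means <- replicate(nsim, mean(rexp(n, lambda)))

theoretical_var <- (1 / lambda)^2 / n  # variance of the mean of n exponentials
sample_var      <- var(sim_means)      # variance observed across the simulated means

print(paste("theoretical variance of the mean is", theoretical_var))
print(paste("sample variance of the mean is", round(sample_var, 3)))
```

Using `replicate` avoids growing a vector inside a loop, but it draws the same kind of sample as the `for` loop above.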

Test whether the distribution of the means is approximately normal

qqnorm(mns);qqline(mns,col=2)
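A histogram with the normal density predicted by the central limit theorem overlaid is another way to look at the same question. This is a sketch under stated assumptions: the simulation is repeated with an arbitrary seed, and `sim_means`, `clt_mean` and `clt_sd` are illustrative names that do not appear in the original code.

```r
lambda <- 0.2
n      <- 40

set.seed(1)  # assumption: arbitrary seed for reproducibility
sim_means <- replicate(1000, mean(rexp(n, lambda)))

# Normal approximation suggested by the central limit theorem
clt_mean <- 1 / lambda               # mean of the sampling distribution
clt_sd   <- (1 / lambda) / sqrt(n)   # standard error of the mean

# Density-scaled histogram so the normal curve is on the same scale
hist(sim_means, breaks = 30, freq = FALSE,
     main = "Means of 40 exponentials", xlab = "sample mean")
curve(dnorm(x, mean = clt_mean, sd = clt_sd), add = TRUE, col = 2, lwd = 2)
```

If the normal approximation is reasonable, the bars should follow the red curve closely, with at most a slight right skew inherited from the exponential distribution.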

Question2

Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package.
(a) Load the ToothGrowth data and perform some basic exploratory data analyses (b) Provide a basic summary of the data. (c) Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there’s other approaches worth considering) (d) State your conclusions and the assumptions needed for your conclusions.

load data

data("ToothGrowth")  

Take a first look at the data

dim(ToothGrowth) 
## [1] 60  3
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
head(ToothGrowth,10)
##     len supp dose
## 1   4.2   VC  0.5
## 2  11.5   VC  0.5
## 3   7.3   VC  0.5
## 4   5.8   VC  0.5
## 5   6.4   VC  0.5
## 6  10.0   VC  0.5
## 7  11.2   VC  0.5
## 8  11.2   VC  0.5
## 9   5.2   VC  0.5
## 10  7.0   VC  0.5
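The output above shows the structure and the first rows. For the basic summary requested in part (b), a short sketch such as the following could be used. It relies only on base R functions, and `cell_counts` and `group_means` are illustrative names, not part of the original analysis.

```r
data("ToothGrowth")

# Five-number summary and mean of tooth length
print(summary(ToothGrowth$len))

# Number of observations in each supp x dose cell
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)

# Mean tooth length for each combination of supplement and dose
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)
```

The table of cell counts is worth checking before any testing, because it shows whether the design is balanced across supplements and doses.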

Compare tooth length between the two supplements at each dose

dose_group<-sort(unique(ToothGrowth$dose))

for (level in dose_group){
  # Welch two-sample t-test of len by supp (OJ vs VC) within this dose
  result<-t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == level, ])
  print(paste("For dose",as.character(level),"p-value is",as.character(result$p.value)))
} 
## [1] "For dose 0.5 p-value is 0.0063586067640968"
## [1] "For dose 1 p-value is 0.00103837587229988"
## [1] "For dose 2 p-value is 0.963851588723373"
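The prompt also allows confidence intervals, and the object returned by `t.test` already carries one in `conf.int`. The sketch below collects the 95% interval for the difference in mean length (OJ minus VC, following the factor level order) at each dose. The list name `ci_by_dose` is an assumption introduced for illustration.

```r
data("ToothGrowth")

ci_by_dose <- list()  # hypothetical container: one interval per dose

for (d in sort(unique(ToothGrowth$dose))) {
  # Welch two-sample t-test within a single dose level
  res <- t.test(len ~ supp, data = subset(ToothGrowth, dose == d))
  ci_by_dose[[as.character(d)]] <- as.numeric(res$conf.int)
  ci <- round(res$conf.int, 2)
  print(paste("For dose", d, "95% CI for mean(OJ) - mean(VC) is", ci[1], "to", ci[2]))
}
```

An interval that excludes zero corresponds to a two-sided p-value below 5%, so the intervals and the p-values above should lead to the same decisions.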

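Conclusion:

At doses 0.5 and 1 the p-values are below 5%, so we reject the hypothesis that the two supplement types give the same mean tooth length. At dose 2 the p-value is about 0.96, so there is no evidence of a difference. The supplement type therefore appears to matter at the two lower doses but not at the highest one. These conclusions assume independent observations in each group and approximately normal tooth lengths; t.test defaults to the unpaired Welch test, which does not assume equal variances.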