by Mughundhan Chandrasekar - May/14/2016

1. Load the ToothGrowth data and perform some basic exploratory data analyses

library(datasets)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.5
library(graphics)
library(lattice)
Exploring the contents of the dataset
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
ToothGrowth[1:10,]
##     len supp dose
## 1   4.2   VC  0.5
## 2  11.5   VC  0.5
## 3   7.3   VC  0.5
## 4   5.8   VC  0.5
## 5   6.4   VC  0.5
## 6  10.0   VC  0.5
## 7  11.2   VC  0.5
## 8  11.2   VC  0.5
## 9   5.2   VC  0.5
## 10  7.0   VC  0.5
table(ToothGrowth$dose, ToothGrowth$supp)
##      
##       OJ VC
##   0.5 10 10
##   1   10 10
##   2   10 10
Exploring dataset by plotting A
ggplot(data=ToothGrowth, aes(x=as.factor(dose), y=len, fill=supp)) +
    geom_bar(stat="identity") +
    facet_grid(. ~ supp) +
    xlab("Dosage in milligrams") +
    ylab("Tooth length") +
    guides(fill=guide_legend(title="Supplement Type"))
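
Note that `geom_bar(stat="identity")` stacks the ten observations in each dose-by-supplement cell, so the bar heights in plot A are sums of lengths rather than typical values. A boxplot is one possible alternative that shows the distribution within each cell. This is only a sketch: the object name `p_box` and the axis label are our choices, not the original author's.

```r
library(ggplot2)

# Boxplot of tooth length per dose level, one panel per supplement type.
# p_box is a hypothetical name used only for this illustration.
p_box <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
    geom_boxplot() +
    facet_grid(. ~ supp) +
    xlab("Dose (mg/day)") +
    ylab("Tooth length") +
    guides(fill = guide_legend(title = "Supplement Type"))
print(p_box)
```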

Exploring dataset by plotting B

xyplot(len~dose|supp, ToothGrowth,
       main="Scatterplots by Supplement Type and Dosage",
       ylab="Length", xlab="Dose")

2. Provide a basic summary of the data

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
Basic Summary
1. 60 observations
2. len: length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs, 10 per supplement-by-dose combination
3. OJ: orange juice as delivery method
4. VC: ascorbic acid as delivery method
5. dose: three dose levels of Vitamin C (0.5, 1 and 2 mg/day)
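
The overall summary does not show how length varies across the six supplement-by-dose cells. A per-cell breakdown could be sketched with `aggregate()`; the name `cell_means` is chosen here for illustration only.

```r
library(datasets)

# Mean tooth length for each supplement-by-dose combination.
# cell_means is a hypothetical name, not part of the original analysis.
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

The resulting data frame has one row per cell, with the mean in the `len` column.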

3. Using confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose

t.test(len ~ supp, data = ToothGrowth)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333
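
`t.test()` returns an object of class `htest`, so the quantities printed above can also be pulled out programmatically instead of being read off the console. A minimal sketch, with `res` and `contains_zero` as illustrative names:

```r
library(datasets)

# Welch two-sample t-test of length by supplement, pooled over all doses.
res <- t.test(len ~ supp, data = ToothGrowth)

res$p.value     # two-sided p-value
res$conf.int    # 95% confidence interval for mean(OJ) - mean(VC)
res$estimate    # the two group means

# The interval contains zero when its bounds have opposite signs.
contains_zero <- res$conf.int[1] < 0 && res$conf.int[2] > 0
print(contains_zero)
```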

4. Repeating the same test separately at each dose level

t.test(len ~ supp, ToothGrowth[ToothGrowth$dose == .5, ])
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98
t.test(len ~ supp, ToothGrowth[ToothGrowth$dose == 1, ])
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77
t.test(len ~ supp, ToothGrowth[ToothGrowth$dose == 2, ])
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14
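
The three calls above differ only in the dose value, so they could be run in a loop and their key numbers collected in one table. This is a sketch under our own naming (`doses`, `by_dose`), not the author's code.

```r
library(datasets)

# Run the OJ-vs-VC Welch t-test separately at each dose level and
# collect the p-value and confidence bounds in one matrix.
doses <- sort(unique(ToothGrowth$dose))
by_dose <- t(sapply(doses, function(d) {
    res <- t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])
    c(dose = d, p.value = res$p.value,
      lower = res$conf.int[1], upper = res$conf.int[2])
}))
print(by_dose)
```

Because three tests are run, a multiplicity correction such as `p.adjust(by_dose[, "p.value"], method = "holm")` could be applied before drawing conclusions.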

Conclusion

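Group means rise with dose for both supplements (OJ: 13.23, 22.70, 26.06; VC: 7.98, 16.77, 26.14 at 0.5, 1 and 2 mg), so higher doses are associated with longer teeth, although the tests above compare supplements rather than doses. Pooled over all doses, the supplement comparison gives a p-value of 0.06 and a 95% confidence interval that contains zero, so at the 5% level we cannot reject the null hypothesis that supplement type has no effect on tooth length. Within dose levels the picture differs: OJ yields significantly longer teeth than VC at 0.5 mg (p = 0.006) and 1 mg (p = 0.001), while there is no detectable difference at 2 mg (p = 0.96).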
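
The dose effect mentioned in the conclusion is not tested directly by the supplement comparisons in sections 3 and 4. One way to check it, offered here as a sketch rather than as the author's method, is to run Welch t-tests between adjacent dose levels with both supplements pooled. The names `low_mid` and `mid_high` are ours.

```r
library(datasets)

# Welch t-tests between adjacent dose levels, pooling both supplements.
# dose is numeric; t.test() treats its two remaining values as groups.
low_mid  <- t.test(len ~ dose, data = subset(ToothGrowth, dose %in% c(0.5, 1)))
mid_high <- t.test(len ~ dose, data = subset(ToothGrowth, dose %in% c(1, 2)))

low_mid$conf.int    # interval for mean(0.5 mg) - mean(1 mg)
mid_high$conf.int   # interval for mean(1 mg) - mean(2 mg)
```

An interval that lies entirely below zero indicates that the higher dose has the larger mean length.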