Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package.

1. Loading the ToothGrowth data and basic exploratory data analyses

library(datasets)
data(ToothGrowth)
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
tail(ToothGrowth)
##     len supp dose
## 55 24.8   OJ    2
## 56 30.9   OJ    2
## 57 26.4   OJ    2
## 58 27.3   OJ    2
## 59 29.4   OJ    2
## 60 23.0   OJ    2
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

2. Basic summary of the data

mean(ToothGrowth$len)
## [1] 18.81333
sd(ToothGrowth$len)
## [1] 7.649315
library(ggplot2)
bp <- ggplot(ToothGrowth,  aes(x=factor(dose),y=len, fill=factor(dose)))
bp <- bp + geom_boxplot() + facet_grid(.~supp)
bp <- bp + labs(x = "Dosage (milligrams)", y = "Length of teeth")
bp
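
The overall mean and standard deviation above pool both supplements and all three doses. As a rough complement to the boxplot, the per-cell means can be tabulated with base R's `tapply`. This is only a sketch; the object name `cell_means` is introduced here for illustration and is not part of the original analysis.

```r
library(datasets)

# Mean tooth length for every supplement/dose combination.
# Rows are supplements (OJ, VC); columns are doses (0.5, 1, 2).
cell_means <- with(ToothGrowth, tapply(len, list(supp = supp, dose = dose), mean))
cell_means

# Number of observations in each cell, to confirm the design is balanced.
cell_counts <- with(ToothGrowth, table(supp, dose))
cell_counts
```

The cell means should agree with the "sample estimates" reported by the per-dose t-tests in the next section.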

3. Confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose

Tooth growth by dose:

dostest<- subset(ToothGrowth, dose %in% c(0.5, 2.0))
dt1<- t.test(len ~ dose, data = dostest)
dt1
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100
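
The test above compares only the lowest and highest doses. A possible extension, not part of the original analysis, is to run the same Welch t-test on the adjacent dose pairs (0.5 vs 1, and 1 vs 2). The helper `compare_doses` below is a hypothetical name introduced for this sketch.

```r
library(datasets)

# Hypothetical helper: Welch two-sample t-test of tooth length
# between two dose levels d1 and d2.
compare_doses <- function(d1, d2) {
  pair <- ToothGrowth[ToothGrowth$dose %in% c(d1, d2), ]
  t.test(len ~ dose, data = pair)
}

low_mid  <- compare_doses(0.5, 1)
mid_high <- compare_doses(1, 2)

# 95% confidence intervals for the difference in means
# (lower dose minus higher dose, following the factor order).
low_mid$conf.int
mid_high$conf.int
```

If both intervals exclude zero, this would suggest that tooth length increases at each step in dose, not only between the two extremes.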

Tooth growth by supp:

supptest1 <- subset(ToothGrowth, dose %in% c(0.5))
supptest2 <- subset(ToothGrowth, dose %in% c(1))
supptest3 <- subset(ToothGrowth, dose %in% c(2))
st1<- t.test(len ~ supp, data = supptest1)
st1
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98
st2<- t.test(len ~ supp, data = supptest2)
st2
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77
st3<- t.test(len ~ supp, data = supptest3)
st3
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14

4. The conclusions

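The p-value of 4.398e-14 for the dose comparison lets us reject the null hypothesis that mean tooth length is the same at dosages 0.5 and 2. The 95 percent confidence interval for the difference (-18.16 to -12.83) lies entirely below zero, so the higher dosage is associated with greater tooth length.

At the lower dosages (0.5 and 1 milligram) the OJ supplement is more effective:

- At dosage 0.5, comparing the means between the supplements gives p-value = 0.006359, so there is a significant difference between the means of the two groups.
- At dosage 1, comparing the means between the supplements gives p-value = 0.001038, so there is a significant difference between the means of the two groups.

At the highest dosage (2 milligrams) the difference between the supplements disappears: there is no statistically significant difference between the means of the two groups (p-value = 0.9639).

These conclusions assume that the samples are independent, since the tests are unpaired, and that tooth length within each group is approximately normally distributed. The Welch test does not assume equal variances.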
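
Because three supplement comparisons are made (one per dose), it may be worth checking whether the conclusions survive a multiple-comparison adjustment. The sketch below recomputes the three p-values and applies Holm's method through `p.adjust`. The adjustment is an addition for illustration, not something the original analysis performed, and the names `p_raw` and `p_holm` are assumptions of this sketch.

```r
library(datasets)

doses <- sort(unique(ToothGrowth$dose))

# Welch t-test of len by supp within each dose; keep only the p-value.
p_raw <- sapply(doses, function(d) {
  t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])$p.value
})
names(p_raw) <- doses

# Holm adjustment for the three simultaneous comparisons.
p_holm <- p.adjust(p_raw, method = "holm")

data.frame(dose = doses, p_raw = p_raw, p_holm = p_holm)
```

Holm's method controls the family-wise error rate without assuming independence between the tests, which makes it a reasonably conservative default here.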