Synopsis

This report analyzes the ToothGrowth data from the R datasets package. It summarizes the data and uses confidence intervals and hypothesis tests to compare tooth growth by supplement type (supp) and dose. The goal is to assess the effect of Vitamin C on tooth growth in guinea pigs by comparing mean tooth length across dosages and supplement types.

  data(ToothGrowth)
  str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
A glimpse of the data:
head(ToothGrowth, 10)
##     len supp dose
## 1   4.2   VC  0.5
## 2  11.5   VC  0.5
## 3   7.3   VC  0.5
## 4   5.8   VC  0.5
## 5   6.4   VC  0.5
## 6  10.0   VC  0.5
## 7  11.2   VC  0.5
## 8  11.2   VC  0.5
## 9   5.2   VC  0.5
## 10  7.0   VC  0.5

Summary

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
length(ToothGrowth$len)
## [1] 60
plot(ToothGrowth)  # scatterplot matrix of all 60 observations

The plot shows the three Vitamin C dose levels: 0.5, 1 and 2 mg/day.

Dose Variable

 ToothGrowth$dose <- as.factor(ToothGrowth$dose)
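
As a quick check on the conversion, the sketch below lists the factor levels and counts the animals in each supplement-by-dose cell. It works on a local copy, here called `tg` (an illustrative name, not part of the original analysis), so the `ToothGrowth` object in the session is not modified.

```r
# Local copy so the session's ToothGrowth object is left untouched
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Levels created by the conversion
levels(tg$dose)

# Number of animals in each supplement-by-dose cell
cell_counts <- table(tg$supp, tg$dose)
cell_counts
```

A balanced table (the same count in every cell) means the later group comparisons are based on equally sized samples.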

Analysis

sm <- split(ToothGrowth$len, ToothGrowth$supp)
sapply(sm,mean)
##       OJ       VC 
## 20.66333 16.96333
library(ggplot2)
ggplot(aes(x = supp, y = len), data = ToothGrowth) + geom_boxplot(aes(fill = supp)) +
  xlab("Supplement") + ylab("Length") +
  guides(fill = guide_legend(title = "Supplement type"))

Effect of Vitamin C dose on tooth length

dm <- split(ToothGrowth$len, ToothGrowth$dose)
sapply(dm, mean)
##    0.5      1      2 
## 10.605 19.735 26.100
ggplot(aes(x=dose, y=len), data=ToothGrowth) + geom_boxplot(aes(fill=dose)) + 
        xlab("Dose [mg]") +ylab("Length") 
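
The supplement means and dose means above are computed separately. A minimal sketch of how the two groupings could be combined into one table of means per supplement and dose is shown here; `tg` and `cell_means` are illustrative names, and `aggregate()` is used as one of several possible ways to do this.

```r
# Local copy of the data; dose is left numeric here
tg <- datasets::ToothGrowth

# Mean tooth length for every supplement-by-dose combination
cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
cell_means
```

Reading the table row by row shows whether the gap between OJ and VC is the same at every dose, which the marginal means alone cannot reveal.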

CI and Hypothesis test

t.test(len ~ supp, data = ToothGrowth, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

The 95% confidence interval (-0.171, 7.571) contains zero and the p-value (0.06063) exceeds 0.05, so we cannot reject the null hypothesis that mean tooth length is the same for both supplement types.
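
To make the origin of these numbers concrete, the following sketch recomputes the Welch statistic, the Welch-Satterthwaite degrees of freedom, the 95% interval and the two-sided p-value by hand. The variable names (`tg`, `oj`, `vc`, and so on) are illustrative and not part of the original analysis.

```r
# Local copy of the data and the two supplement groups
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Squared standard error of each group mean (unequal variances)
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se <- sqrt(se2_oj + se2_vc)

# Welch t statistic for the difference in means (OJ minus VC)
diff_means <- mean(oj) - mean(vc)
t_stat <- diff_means / se

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# 95% confidence interval and two-sided p-value
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se
p_value <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, df = df, lower = ci[1], upper = ci[2], p = p_value), 4)
```

The rounded values should agree with the `t.test` output above (t = 1.9153, df = 55.309), since the function applies the same formulas.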

Effect of dose level on tooth growth

d1 <- subset(ToothGrowth, dose %in% c(.5, 1.0))
d2 <- subset(ToothGrowth, dose %in% c(.5, 2.0))
d3 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))

t.test(len ~ dose, data = d1, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1 
##            10.605            19.735
t.test(len ~ dose, data = d2, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100
t.test(len ~ dose, data = d3, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2 
##          19.735          26.100

The 95% confidence intervals are (-11.98, -6.28) for doses 0.5 vs 1.0, (-18.16, -12.83) for 0.5 vs 2.0, and (-9.00, -3.73) for 1.0 vs 2.0. None of them contains zero and all three p-values are far below 0.05, so we reject the null hypothesis in each comparison: a higher dose is associated with greater mean tooth length.
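
Because three pairwise tests are run on the same data, one optional check, not part of the original analysis, is to adjust the p-values for multiple comparisons. The sketch below applies a Bonferroni correction with `p.adjust()` to the three p-values copied from the output above; the vector names are illustrative.

```r
# Raw p-values copied from the three Welch tests above
p_raw <- c("0.5 vs 1" = 1.268e-07,
           "0.5 vs 2" = 4.398e-14,
           "1 vs 2"   = 1.906e-05)

# Bonferroni adjustment: each p-value is multiplied by the number of tests
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_bonf

# Check whether every comparison is still significant at the 5% level
all(p_bonf < 0.05)
```

Under this correction the conclusion about dose does not change, because even the largest adjusted p-value remains well below 0.05.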

Supplement effect at each dose level

ds1 <- subset(ToothGrowth, dose  == .5)
ds2 <- subset(ToothGrowth, dose == 1.0)
ds3 <- subset(ToothGrowth, dose ==  2.0)


t.test(len ~ supp, data = ds1, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98
t.test(len ~ supp, data = ds2, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77
t.test(len ~ supp, data = ds3, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14

At dose 0.5 the 95% confidence interval is (1.72, 8.78) and at dose 1.0 it is (2.80, 9.06). Both exclude zero, so we reject the null hypothesis and conclude that OJ gives greater mean tooth length than VC at these doses. At dose 2.0 the interval (-3.80, 3.64) contains zero (p = 0.9639), so we cannot reject the null hypothesis of no difference between the supplements.
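
The three per-dose tests can also be collected into a single table, which makes the pattern across doses easier to compare. This is a minimal sketch using `split()` and `lapply()`; the names `tg`, `by_dose` and `results` are illustrative and do not appear in the original analysis.

```r
# Local copy of the data; dose is left numeric here
tg <- datasets::ToothGrowth

# Run the Welch test of len by supp separately within each dose level
by_dose <- lapply(split(tg, tg$dose), function(d) {
  res <- t.test(len ~ supp, data = d, var.equal = FALSE)
  c(lower = res$conf.int[1], upper = res$conf.int[2], p.value = res$p.value)
})

# One row per dose: CI bounds for (OJ - VC) and the p-value
results <- do.call(rbind, by_dose)
round(results, 4)
```

Rows whose interval lies entirely above zero correspond to doses where OJ is associated with greater tooth length than VC.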

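In conclusion, when pooled across doses the supplement type shows no significant effect on tooth growth at the 5% level (p = 0.06), although OJ is associated with greater growth than VC at the 0.5 and 1.0 doses and the two are indistinguishable at 2.0. Increasing the dose level is associated with increased tooth growth.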