Overview

This is a statistical analysis of the ToothGrowth dataset. The goal is to compare tooth growth across two supplement types and three dose levels.

The ToothGrowth dataset is described by the R documentation as follows:

“The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”

1. Load the ToothGrowth data and perform some basic exploratory data analyses

The dataset contains 60 observations of 3 variables:

  1. len (numeric): tooth length
  2. supp (factor): supplement type (VC or OJ)
  3. dose (numeric): dose in milligrams/day

library(datasets)
library(dplyr)
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
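
Because the comparisons that follow are made within each dose, it can help to confirm the cell sizes and cell means first. The following is a minimal sketch using base R only; the object names `cell_n` and `cell_means` are illustrative and not part of the original analysis.

```r
library(datasets)

# Number of animals in each supplement-by-dose cell
cell_n <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_n)

# Mean tooth length in each supplement-by-dose cell
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

The cell means printed here should agree with the `sample estimates` reported by the t-tests in section 3.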

2. Provide a basic summary of the data.

The conditioning plot below shows tooth length against dose, with one panel per supplement type.

require(graphics)
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
       xlab = "ToothGrowth data: length vs dose, given type of supplement")
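
A grouped box plot is another way to view the same data, and it may make the spread within each supplement and dose combination easier to see than the smoothed panels. This is an optional sketch, not the plot used in the original analysis; the colours and axis labels are arbitrary choices.

```r
library(datasets)

# One box per supplement/dose combination. boxplot() also returns the
# summary statistics it draws, which are kept here in `bp`.
bp <- boxplot(len ~ supp + dose, data = ToothGrowth,
              col = c("orange", "lightblue"),
              xlab = "Supplement and dose (mg/day)",
              ylab = "Tooth length")
```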

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.

In the plot from the previous section, OJ appears to produce greater tooth growth than VC at doses of 0.5 and 1.0 milligrams/day. At 2.0 milligrams/day, however, no difference is visible. A two-sample t-test at each dose is therefore used to check whether the difference in mean tooth length between the supplements is statistically significant. The null hypothesis in each test is that the two means are equal.

First, let's compare the two supplements at a dose of 0.5 milligrams/day using a t-test. The 95 percent confidence interval for the difference in means does not contain zero and the p-value is below 0.05, so the null hypothesis of equal means is rejected at the 5% level.

# Compare dose as a number, consistent with the other dose levels
OJ_05 <- filter(ToothGrowth, supp == "OJ", dose == 0.5)
VC_05 <- filter(ToothGrowth, supp == "VC", dose == 0.5)
t.test(OJ_05$len, VC_05$len)
## 
##  Welch Two Sample t-test
## 
## data:  OJ_05$len and VC_05$len
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean of x mean of y 
##     13.23      7.98
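
To make the numbers in this output less opaque, the Welch statistic, its degrees of freedom, and the confidence interval can be recomputed by hand. This is a sketch for the 0.5 milligrams/day comparison only; the variable names (`vx`, `vy`, `se`, and so on) are illustrative, and the formulas are the standard Welch-Satterthwaite ones rather than anything specific to this report.

```r
library(datasets)

# Tooth lengths for the two supplements at 0.5 mg/day
x <- with(ToothGrowth, len[supp == "OJ" & dose == 0.5])
y <- with(ToothGrowth, len[supp == "VC" & dose == 0.5])

# Squared standard error contributed by each group
vx <- var(x) / length(x)
vy <- var(y) / length(y)
se <- sqrt(vx + vy)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(x) - mean(y)) / se
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value and 95% confidence interval for the mean difference
p_value <- 2 * pt(-abs(t_stat), df)
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df) * se

print(c(t = t_stat, df = df, p = p_value))
print(ci)
```

The printed values should reproduce the `t`, `df`, `p-value`, and confidence interval shown in the `t.test` output above.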

Now, let's look at both supplements at a dose of 1.0 milligrams/day. Once again the 95 percent confidence interval does not contain zero and the p-value is below 0.05, so the null hypothesis of equal means is rejected at the 5% level.

OJ_10 <- filter(ToothGrowth, supp == "OJ", dose == 1)
VC_10 <- filter(ToothGrowth, supp == "VC", dose == 1)
t.test(OJ_10$len, VC_10$len)
## 
##  Welch Two Sample t-test
## 
## data:  OJ_10$len and VC_10$len
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean of x mean of y 
##     22.70     16.77

Finally, let's look at both supplements at a dose of 2.0 milligrams/day. This time the 95 percent confidence interval does contain zero and the p-value is far above 0.05, so the null hypothesis of equal means cannot be rejected.

OJ_20 <- filter(ToothGrowth, supp == "OJ", dose == 2)
VC_20 <- filter(ToothGrowth, supp == "VC", dose == 2)
t.test(OJ_20$len, VC_20$len)
## 
##  Welch Two Sample t-test
## 
## data:  OJ_20$len and VC_20$len
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean of x mean of y 
##     26.06     26.14
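
The three tests above repeat the same steps, so they can also be run in a loop and collected into one table for side-by-side comparison. This is a sketch using base R's formula interface to `t.test` instead of the `filter` calls above; the `results` data frame and its column names are illustrative.

```r
library(datasets)

# Run the same Welch test at every dose level and keep the key numbers.
# With the formula interface the difference is OJ minus VC, because "OJ"
# is the first level of the supp factor.
doses <- sort(unique(ToothGrowth$dose))
results <- do.call(rbind, lapply(doses, function(d) {
  tt <- t.test(len ~ supp, data = subset(ToothGrowth, dose == d))
  data.frame(dose     = d,
             diff     = unname(tt$estimate[1] - tt$estimate[2]),
             ci_lower = tt$conf.int[1],
             ci_upper = tt$conf.int[2],
             p_value  = tt$p.value)
}))
print(results)
```

Each row should match the corresponding `t.test` output above, which makes it easy to see that only the 2.0 milligrams/day interval spans zero.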

4. State your conclusions and the assumptions needed for your conclusions.

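At the 5% significance level, orange juice is associated with greater mean tooth length than ascorbic acid (coded as VC) at doses of 0.5 and 1.0 milligrams/day. At 2.0 milligrams/day, the data show no significant difference between the two supplements.

These conclusions assume that the guinea pigs were assigned to the supplement and dose groups at random, that the groups are independent (unpaired) samples, and that tooth length within each group is approximately normally distributed. The Welch test used here does not require equal variances across groups.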
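
The t-tests rely on tooth length being roughly normally distributed within each group. One informal way to probe that assumption, sketched here with base R's `shapiro.test`, is to test each supplement and dose cell separately. With only 10 animals per cell the test has little power, so the result is a rough check at best. This check is not part of the original analysis, and the `normality` object and `shapiro_p` column name are illustrative.

```r
library(datasets)

# Shapiro-Wilk p-value for each supplement-by-dose cell
normality <- aggregate(len ~ supp + dose, data = ToothGrowth,
                       FUN = function(v) shapiro.test(v)$p.value)
names(normality)[3] <- "shapiro_p"
print(normality)
```

Small p-values would indicate cells where the normality assumption deserves a closer look, for example with a normal Q-Q plot.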