This is a statistical analysis of the ToothGrowth dataset. The goal is to compare tooth growth under two supplement types and three dose levels.
The ToothGrowth dataset is described by the R documentation as follows:
“The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”
The dataset contains 60 observations of 3 variables.
library(datasets)
library(dplyr)
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
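Because the analysis compares supplements within each dose, it can help to tabulate the six supplement/dose cells before plotting. The following is a minimal sketch, not part of the original analysis; the name `cell_summary` and its column names are illustrative assumptions, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(datasets)
library(dplyr)

# Group size, mean, and standard deviation of tooth length
# for each supplement/dose combination (2 supplements x 3 doses)
cell_summary <- ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(
    n        = n(),
    mean_len = mean(len),
    sd_len   = sd(len),
    .groups  = "drop"
  )

print(cell_summary)
```

Each cell should contain 10 animals, which is the sample size behind every t-test in the sections that follow.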
This plot gives a summary of tooth length by supplement and by dose.
require(graphics)
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
xlab = "ToothGrowth data: length vs dose, given type of supplement")
From the plot in the previous section, OJ appears to produce greater tooth growth than VC at doses of 0.5 and 1.0 mg/day, while at 2.0 mg/day no difference is visually apparent. A two-sample t-test at each dose is therefore used to check whether the differences in means are statistically significant. The null hypothesis is that the two means are equal.
First, let's compare the two supplements at a dose of 0.5 milligrams/day using a t-test. Note that the 95 percent confidence interval for the difference in means (OJ minus VC) does not contain zero, so the null hypothesis of equal means is rejected at the 5% significance level.
# dose is numeric, so compare against a number rather than the string "0.5"
OJ_05 <- filter(ToothGrowth, supp == "OJ", dose == 0.5)
VC_05 <- filter(ToothGrowth, supp == "VC", dose == 0.5)
t.test(OJ_05$len,VC_05$len)
##
## Welch Two Sample t-test
##
## data: OJ_05$len and VC_05$len
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean of x mean of y
## 13.23 7.98
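To make the output above less of a black box, the Welch statistic, its degrees of freedom, the p-value, and the confidence interval can be recomputed by hand. This is a sketch using the standard Welch and Welch-Satterthwaite formulas; the variable names (`oj`, `vc`, `se2_oj`, `df_welch`, and so on) are illustrative and do not come from the original analysis.

```r
library(datasets)

# Tooth lengths for each supplement at the 0.5 mg/day dose
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]

# Squared standard error of each group mean
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(oj) - mean(vc)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se_diff

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

These values should agree with what `t.test(oj, vc)` reports, since `t.test` does not assume equal variances by default.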
Now, let's compare the two supplements at a dose of 1.0 milligrams/day. Once again the 95 percent confidence interval does not contain zero, so the null hypothesis of equal means is rejected at the 5% significance level.
OJ_10 <- filter(ToothGrowth, supp == "OJ", dose == 1)
VC_10 <- filter(ToothGrowth, supp == "VC", dose == 1)
t.test(OJ_10$len,VC_10$len)
##
## Welch Two Sample t-test
##
## data: OJ_10$len and VC_10$len
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean of x mean of y
## 22.70 16.77
Finally, let's compare the two supplements at a dose of 2.0 milligrams/day. This time the 95 percent confidence interval does contain zero, so we fail to reject the null hypothesis: the data show no statistically significant difference between the two population means.
OJ_20 <- filter(ToothGrowth, supp == "OJ", dose == 2)
VC_20 <- filter(ToothGrowth, supp == "VC", dose == 2)
t.test(OJ_20$len,VC_20$len)
##
## Welch Two Sample t-test
##
## data: OJ_20$len and VC_20$len
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean of x mean of y
## 26.06 26.14
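The three tests above repeat the same steps for each dose, so they can also be run in one pass. The sketch below uses the formula interface of `t.test` and collects the results in a data frame; the names `doses`, `results`, and `comparison` are illustrative assumptions. Because `supp` has levels "OJ" then "VC", the difference is OJ minus VC, matching the tests above.

```r
library(datasets)

doses <- sort(unique(ToothGrowth$dose))

# One Welch two-sample t-test per dose, comparing len between supplements
results <- lapply(doses, function(d) {
  t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])
})

# Collect the difference in means (OJ - VC), its 95% CI, and the p-value
comparison <- data.frame(
  dose    = doses,
  diff    = sapply(results, function(r) unname(r$estimate[1] - r$estimate[2])),
  ci_low  = sapply(results, function(r) r$conf.int[1]),
  ci_high = sapply(results, function(r) r$conf.int[2]),
  p_value = sapply(results, function(r) r$p.value)
)

print(comparison)
```

Laying the three intervals side by side makes it easy to see which ones exclude zero.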
It can be concluded, at the 95% confidence level, that orange juice produces longer teeth than ascorbic acid at doses of 0.5 and 1.0 milligrams/day. At 2.0 milligrams/day, however, there is no statistically significant difference between ascorbic acid (a form of vitamin C and coded as VC) and orange juice.