This is an R Markdown document that provides a basic inferential data analysis of the R ToothGrowth data. The ToothGrowth data shows the effect of vitamin C on tooth growth in guinea pigs. It looks as follows:
data("ToothGrowth")
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
The response is the length of teeth (len) in each of 10 guinea pigs at each of three dose levels (dose) of Vitamin C (0.5, 1, and 2 mg) with each of two delivery methods (supp), orange juice (OJ) or ascorbic acid (VC).
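As a quick check on the layout described above, the number of observations in each supplement/dose cell can be tabulated. This is a minimal sketch using only base R; the name `cell_counts` is illustrative and does not appear elsewhere in this analysis.

```r
data("ToothGrowth")

# Cross-tabulate delivery method (rows) against dose (columns);
# each cell holds the number of observations for that combination.
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)
```

Each of the six cells should hold the same number of observations, which is what lets the later tests compare groups of equal size.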
Plotting tooth length against dose for each guinea pig, with one panel per supplement, shows that tooth length tends to increase with dosage for both supplements.
if (!require(ggplot2)) stop("This script requires ggplot2 to be installed")
if (!require(data.table)) stop("This script requires data.table to be installed")
tg <- data.table(ToothGrowth)
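# Add an animal index 1-10 within each supplement/dose block (column V2);
# this assumes the rows are ordered in blocks of 10
tg <- cbind(tg, rep(1:10, 6))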
g <- ggplot(tg, aes(x = dose, y = len, group = factor(V2)))
g <- g + geom_line(size = 1, aes(colour = factor(V2)))
g <- g + geom_point(size = 10, pch = 21, fill = "salmon", alpha = 0.5)
g <- g + facet_grid(. ~ supp)
g
Let's test the null hypothesis that increasing the dosage of a particular supplement (say, ascorbic acid, VC) from 0.5 mg to 2.0 mg does not increase tooth length.
g1 <- tg[supp == "VC" & dose == 0.5, len]
g2 <- tg[supp == "VC" & dose == 2.0, len]
t.test(g2, g1, paired = TRUE)
##
## Paired t-test
##
## data: g2 and g1
## t = 9.7912, df = 9, p-value = 4.264e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 13.9643 22.3557
## sample estimates:
## mean of the differences
## 18.16
The t-statistic is positive and the result is significant (p = 4.264e-06), so the null hypothesis can be rejected for ascorbic acid (VC). Increasing the dosage from 0.5 mg to 2.0 mg is associated with a mean increase in tooth length of 18.16, with a 95 percent confidence interval (13.96 to 22.36) that does not include 0.
Doing the same for orange juice shows similar results:
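To make the reported statistic concrete, the paired t-test above can be reproduced by hand from the per-pair differences: the statistic is the mean difference divided by its standard error, on n - 1 degrees of freedom. The following is a sketch in base R; the names `vc_low`, `vc_high`, `d`, `t_stat`, `p_val`, and `ci` are illustrative and not part of the original analysis.

```r
data("ToothGrowth")

# Tooth lengths for ascorbic acid (VC) at the lowest and highest dose
vc_low  <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]
vc_high <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 2.0]

# A paired test works on element-wise differences (row i paired with row i)
d <- vc_high - vc_low
n <- length(d)

# t statistic: mean difference over its standard error
se     <- sd(d) / sqrt(n)
t_stat <- mean(d) / se

# Two-sided p-value and 95% confidence interval on n - 1 degrees of freedom
p_val <- 2 * pt(-abs(t_stat), df = n - 1)
ci    <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(c(t = t_stat, df = n - 1, p = p_val))
print(ci)
```

The values should agree with the `t.test(g2, g1, paired = TRUE)` output, since that call performs the same calculation.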
g3 <- tg[supp == "OJ" & dose == 0.5, len]
g4 <- tg[supp == "OJ" & dose == 2.0, len]
t.test(g4, g3, paired = TRUE)
##
## Paired t-test
##
## data: g4 and g3
## t = 7.4919, df = 9, p-value = 3.724e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 8.956042 16.703958
## sample estimates:
## mean of the differences
## 12.83
Now let's test the null hypothesis that, at the same dose, the two supplements do not differ in their effect on tooth length. We start with the 0.5 mg dose.
t.test(g3, g1, paired = TRUE)
##
## Paired t-test
##
## data: g3 and g1
## t = 2.9791, df = 9, p-value = 0.01547
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.263458 9.236542
## sample estimates:
## mean of the differences
## 5.25
This test shows a significant result (p = 0.01547), with a positive t-statistic and a confidence interval that does not include 0, so the null hypothesis can be rejected. At a dosage of 0.5 mg, orange juice is associated with longer teeth than ascorbic acid.
However, at higher doses, for example at 2.0 mg, the null hypothesis cannot be rejected.
t.test(g4, g2, paired = TRUE)
##
## Paired t-test
##
## data: g4 and g2
## t = -0.042592, df = 9, p-value = 0.967
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.328976 4.168976
## sample estimates:
## mean of the differences
## -0.08
The confidence interval includes zero, the t-statistic is small and negative, and the result is not significant (p = 0.967). The null hypothesis that orange juice and ascorbic acid, at a dosage of 2.0 mg, produce the same tooth length therefore cannot be rejected. This is also apparent in the figure below:
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
xlab = "ToothGrowth data: length vs dose, given type of supplement")
The figure shows that, within each supplement group, tooth length trends upward as dosage increases. However, the two groups converge: by the time dosage reaches 2.0 mg, mean length is about 26 in both.
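The group means behind this comparison can also be tabulated directly. This is a minimal sketch with base R's `aggregate`; the names `group_means`, `oj_2mg`, and `vc_2mg` are illustrative.

```r
data("ToothGrowth")

# Mean tooth length for every supplement/dose combination
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)

# Means at the highest dose, one per supplement
oj_2mg <- group_means$len[group_means$supp == "OJ" & group_means$dose == 2]
vc_2mg <- group_means$len[group_means$supp == "VC" & group_means$dose == 2]
print(oj_2mg - vc_2mg)
```

The final difference corresponds to the "mean of the differences" reported by `t.test(g4, g2, paired = TRUE)`.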
Note that for each of the above hypotheses, we are using the paired = TRUE argument for the t-tests. This treats the ten measurements in each group as coming from the same ten animals, matched row by row. It therefore assumes that the two supplements do not contaminate each other, and that there is an appropriate washout period between the delivery of these supplements to the guinea pigs.
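If the pairing assumption were in doubt, one way to gauge how much the conclusions depend on it would be to rerun a comparison as an unpaired Welch t-test, which treats the two groups as independent samples with possibly unequal variances. The sketch below does this for the ascorbic acid comparison. It is offered as a sensitivity check under that alternative assumption, not as the method used in this analysis, and the names `vc_low`, `vc_high`, and `welch` are illustrative.

```r
data("ToothGrowth")

# Tooth lengths for ascorbic acid (VC) at the lowest and highest dose
vc_low  <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]
vc_high <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 2.0]

# Welch two-sample t-test: groups treated as independent,
# variances not assumed equal
welch <- t.test(vc_high, vc_low, paired = FALSE, var.equal = FALSE)
print(welch)
```

The estimated difference in means is the same as in the paired test; only the standard error and degrees of freedom change, so comparing the two outputs shows how much the pairing contributes to the result.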