Overview
- In this document, I begin by providing a basic overview of the ToothGrowth dataset in R, which contains data from an experiment measuring tooth growth of guinea pigs when given different doses of Vitamin C through two different delivery methods. I then seek to use confidence intervals and hypothesis testing to draw conclusions from the data.
Exploratory Analysis
- Let’s first look at a basic summary of the data:
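library(ggplot2)     # plotting functions used below
library(dplyr)       # provides filter() for subsetting
data <- ToothGrowth  # built-in dataset from the 'datasets' package
str(data)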
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
- As we can see, there are 60 observations of 3 variables: Tooth Length (len), Supplement Type (supp), and Dosage (dose).
- It is not clear from the documentation provided within R whether this is a paired dataset.
- An examination of the documentation from the original study shows that the data are not paired: 60 different guinea pigs were used, so each observation comes from a separate animal.
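As a quick check on the structure described above, the following sketch cross-tabulates supplement type against dose using base R only. Equal cell counts show that the design is balanced; they do not by themselves show that the groups are unpaired, which rests on the study documentation.

```r
# ToothGrowth ships with base R (package 'datasets'); no extra packages needed.
data <- ToothGrowth

# Cross-tabulate supplement type against dose to see the group sizes.
cell_counts <- table(data$supp, data$dose)
print(cell_counts)

# A balanced design has the same number of observations in every cell.
balanced <- length(unique(as.vector(cell_counts))) == 1
print(balanced)
```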
g <- ggplot(data, aes(dose, len))
g + geom_point() + facet_wrap(~ supp) +
geom_smooth(method = "lm", se = FALSE) +
ggtitle("Tooth Growth by Dosage and Method")

- It appears that there is an increase in tooth growth as the dosage increases for both supplement types.
g <- ggplot(data, aes(supp, len))
g + geom_boxplot(aes(fill = supp)) + facet_wrap(~ dose) +
ggtitle("Tooth Growth by Method and Dosage")

- Here, we can see that orange juice tends to be associated with greater tooth growth than the aqueous Vitamin C solution, most visibly at the 0.5 mg and 1.0 mg doses; at 2.0 mg the two methods look similar.
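The pattern in the boxplots can also be read off numerically. The sketch below, which uses base R only, computes the mean tooth length for each supplement and dose combination and the difference between methods at each dose. The object names (`means`, `diff_OJ_minus_VC`) are illustrative and not part of the original analysis.

```r
data <- ToothGrowth

# Mean tooth length for every supplement/dose combination (2 x 3 matrix).
means <- tapply(data$len, list(supp = data$supp, dose = data$dose), mean)
print(round(means, 2))

# Difference between delivery methods at each dose (orange juice minus Vitamin C).
diff_OJ_minus_VC <- means["OJ", ] - means["VC", ]
print(round(diff_OJ_minus_VC, 2))
```

A positive difference means the orange juice group had the longer mean tooth length at that dose.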
Hypothesis Testing
- Now we will use formal statistical methods to compare tooth growth by supplement method and dosage.
Comparing by Dosage
- Our initial exploratory analysis showed that tooth length increased when dosage increased.
- To test this formally, let’s break this down into three different sets of comparisons:
- Comparing the 0.5 mg dose with the 1.0 mg dose
- Comparing the 0.5 mg dose with the 2.0 mg dose
- Comparing the 1.0 mg dose with the 2.0 mg dose
- Let us assume the sampling distribution of the difference in sample means follows a t distribution.
- Our hypotheses are as follows:
- \(H_0: \mu_{1.0} - \mu_{0.5} = 0\); \(H_A: \mu_{1.0} - \mu_{0.5} \not= 0\)
- \(H_0: \mu_{2.0} - \mu_{0.5} = 0\); \(H_A: \mu_{2.0} - \mu_{0.5} \not= 0\)
- \(H_0: \mu_{2.0} - \mu_{1.0} = 0\); \(H_A: \mu_{2.0} - \mu_{1.0} \not= 0\)
- We will use a two-sample t test at the 95% confidence level. By default, R’s t.test() does not assume equal variances (Welch’s test).
sub1 <- filter(data, dose == 0.5 | dose == 1.0)
sub2 <- filter(data, dose == 0.5 | dose == 2.0)
sub3 <- filter(data, dose == 1.0 | dose == 2.0)
t1 <- t.test(len ~ dose, data = sub1)
t2 <- t.test(len ~ dose, data = sub2)
t3 <- t.test(len ~ dose, data = sub3)
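To make the mechanics of the test explicit, here is a minimal sketch that reproduces the first comparison (0.5 mg vs 1.0 mg) by hand. It assumes the unequal-variance (Welch) form, which is the default for `t.test`, and the variable names are illustrative only.

```r
data <- ToothGrowth
x <- data$len[data$dose == 0.5]
y <- data$len[data$dose == 1.0]

# Squared standard error contributed by each group.
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)
se_diff <- sqrt(se2_x + se2_y)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat <- (mean(x) - mean(y)) / se_diff
df <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value and 95% confidence interval for mean(x) - mean(y).
p_value <- 2 * pt(-abs(t_stat), df)
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df) * se_diff

# Compare with the built-in test.
ref <- t.test(x, y)
print(all.equal(t_stat, unname(ref$statistic)))
print(all.equal(ci, as.vector(ref$conf.int)))
```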
- Here’s what we can conclude:
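- 0.5 mg vs 1.0 mg: null hypothesis rejected, with a p-value far below 0.05 (approximately 0 after rounding). The 95% confidence interval is approximately (-11.98, -6.28).
- 0.5 mg vs 2.0 mg: null hypothesis rejected, with a p-value far below 0.05 (approximately 0 after rounding). The 95% confidence interval is approximately (-18.16, -12.83).
- 1.0 mg vs 2.0 mg: null hypothesis rejected, with a p-value far below 0.05 (approximately 0 after rounding). The 95% confidence interval is approximately (-9.00, -3.73).
- In each case t.test() reports the lower-dose mean minus the higher-dose mean, so the negative intervals indicate longer teeth at the higher dose.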
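The p-values and intervals quoted above can be pulled directly from the objects that `t.test` returns. The following is a sketch only: `compare_doses` is a hypothetical helper introduced here for illustration, and it uses base R subsetting rather than `filter`.

```r
data <- ToothGrowth

# Hypothetical helper: run the two-sample test for one pair of doses
# and return the quantities of interest as a one-row data frame.
compare_doses <- function(low, high) {
  sub <- data[data$dose %in% c(low, high), ]
  res <- t.test(len ~ dose, data = sub)
  data.frame(
    comparison = paste(low, "vs", high),
    p_value    = res$p.value,
    ci_lower   = res$conf.int[1],
    ci_upper   = res$conf.int[2]
  )
}

results <- rbind(
  compare_doses(0.5, 1.0),
  compare_doses(0.5, 2.0),
  compare_doses(1.0, 2.0)
)
print(results)
```

Printing the unrounded p-values avoids having to report them as "approximately 0".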
Comparing by Supplement Method
- Our initial exploratory analysis seemed to show that, at the same dose, Orange Juice tended to have a greater effect on tooth growth than the Vitamin C solution.
- Let’s assume the sampling distribution of the difference in sample means follows a t distribution.
- Our hypotheses are as follows:
- \(H_0: \mu_{VC} - \mu_{OJ} = 0\); \(H_A: \mu_{VC} - \mu_{OJ} \not= 0\)
- We will use a two-sample t test at the 95% confidence level.
t_supp <- t.test(len ~ supp, data = data)  # named t_supp to avoid masking base R's t()
- Here’s what we can conclude:
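- This test gives us a p-value of 0.06 and a 95% confidence interval of (-0.17, 7.57) for the difference in means (OJ minus VC, following the order of the factor levels).
- Because the interval contains 0 and the p-value exceeds 0.05, we fail to reject the null hypothesis. There is no statistically significant difference in tooth growth between supplement methods at the 5% level.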
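Because the hypotheses are written in terms of VC and OJ, it helps to confirm which direction the reported interval refers to. The sketch below checks the factor level order and shows that the interval is for the first level minus the second; the variable names are illustrative.

```r
data <- ToothGrowth

# t.test() with a formula subtracts the second factor level from the first.
print(levels(data$supp))

res <- t.test(len ~ supp, data = data)

# Group means as reported by the test, and their difference (first minus second).
group_means <- res$estimate
diff_first_minus_second <- unname(group_means[1] - group_means[2])
print(group_means)
print(diff_first_minus_second)

# The null value 0 lies inside the interval exactly when the lower bound
# is negative and the upper bound is positive.
contains_zero <- res$conf.int[1] < 0 && res$conf.int[2] > 0
print(contains_zero)
```

Swapping the order of the groups flips the sign of the interval but does not change the p-value or the conclusion.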
Summary
- This document used hypothesis testing to draw conclusions about C. I. Bliss’s scientific study regarding the effect of Vitamin C on tooth growth in guinea pigs.
- After initial analysis, we examined the effect of dosage amounts on tooth growth and also the effect of supplement type on tooth growth.
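- To do this, we assumed:
- The data were IID, with each observation coming from a different guinea pig (unpaired groups)
- The sampling distribution of the means followed a t distribution
- We also fixed the confidence level at 95% (a significance level of 0.05).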
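- We concluded that dosage had a statistically significant effect on tooth growth: all three pairwise dose comparisons rejected the null hypothesis, with higher doses associated with greater growth.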
- We also concluded that there was not a statistically significant difference in tooth growth between delivery methods.