Overview

Exploratory Analysis

str(data)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
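library(ggplot2)

# Tooth length against dose, one panel per supplement, with a linear fit per panel
g <- ggplot(data, aes(dose, len))
g + geom_point() + facet_wrap(~ supp) +
  geom_smooth(method = "lm", se = FALSE) +
  ggtitle("Tooth Growth by Dosage and Method")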

# Tooth length by supplement, one panel per dose
g <- ggplot(data, aes(supp, len))
g + geom_boxplot(aes(fill = supp)) + facet_wrap(~ dose) +
  ggtitle("Tooth Growth by Method and Dosage")
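
The plots can be complemented with a table of group means. The following is a minimal sketch, assuming `data` is R's built-in `ToothGrowth` dataset (the `str()` output above matches its structure, but the report does not show how `data` was loaded). The name `group_means` is introduced here for illustration only.

```r
# Assumption: `data` is the built-in ToothGrowth dataset.
data <- ToothGrowth

# Mean tooth length for each supplement/dose combination.
group_means <- aggregate(len ~ supp + dose, data = data, FUN = mean)
print(group_means)
```

Each row of `group_means` corresponds to one box in the second plot, so the table gives a numeric counterpart to the visual comparison.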

Hypothesis Testing

Comparing by Dosage

  • Our initial exploratory analysis showed that tooth length increased when dosage increased.
  • To test this formally, let’s break this down into three different sets of comparisons:
    1. Comparing the 0.5 mg dose with the 1.0 mg dose
    2. Comparing the 0.5 mg dose with the 2.0 mg dose
    3. Comparing the 1.0 mg dose with the 2.0 mg dose
  • Let us assume the sampling distribution of the difference in sample means follows a t distribution.
  • Our hypotheses are as follows:
    1. \(H_0: \mu_{1.0} - \mu_{0.5} = 0\); \(H_A: \mu_{1.0} - \mu_{0.5} \not= 0\)
    2. \(H_0: \mu_{2.0} - \mu_{0.5} = 0\); \(H_A: \mu_{2.0} - \mu_{0.5} \not= 0\)
    3. \(H_0: \mu_{2.0} - \mu_{1.0} = 0\); \(H_A: \mu_{2.0} - \mu_{1.0} \not= 0\)
  • We will be using a two-sample t test at the 95% confidence level (a significance level of 0.05).
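library(dplyr)

# One subset per pairwise dose comparison
sub1 <- filter(data, dose == 0.5 | dose == 1.0)
sub2 <- filter(data, dose == 0.5 | dose == 2.0)
sub3 <- filter(data, dose == 1.0 | dose == 2.0)
# Two-sample t tests of tooth length between the two doses in each subset
t1 <- t.test(len ~ dose, data = sub1)
t2 <- t.test(len ~ dose, data = sub2)
t3 <- t.test(len ~ dose, data = sub3)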
  • Here’s what we can conclude:
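    1. Null hypothesis rejected, with a p-value of approximately 0 (well below 0.05). The 95% confidence interval for the difference in means (0.5 mg minus 1.0 mg) is approximately (-11.98, -6.28).
    2. Null hypothesis rejected, with a p-value of approximately 0 (well below 0.05). The 95% confidence interval for the difference in means (0.5 mg minus 2.0 mg) is approximately (-18.16, -12.83).
    3. Null hypothesis rejected, with a p-value of approximately 0 (well below 0.05). The 95% confidence interval for the difference in means (1.0 mg minus 2.0 mg) is approximately (-9.00, -3.73).
  • Note that t.test reports each interval as the lower-dose mean minus the higher-dose mean, the reverse of the order used in the hypotheses. Every interval lies entirely below zero, so the higher dose is associated with greater tooth length in all three comparisons.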
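
The p-values and intervals quoted above can be read directly from the objects returned by `t.test`. The following is a minimal sketch, assuming `data` is the built-in `ToothGrowth` dataset; the helper `summarise_test` and the table `dose_results` are hypothetical names introduced for illustration, and base R subsetting stands in for `filter`.

```r
# Assumption: `data` is the built-in ToothGrowth dataset.
data <- ToothGrowth

# Hypothetical helper: run the two-sample t test for one pair of doses
# and collect the p-value and 95% confidence interval in one row.
summarise_test <- function(low, high) {
  sub <- data[data$dose %in% c(low, high), ]
  tt <- t.test(len ~ dose, data = sub)  # Welch test, the t.test default
  data.frame(
    comparison = paste(low, "vs", high),
    p_value    = tt$p.value,
    ci_lower   = tt$conf.int[1],  # interval is for mean(low) - mean(high)
    ci_upper   = tt$conf.int[2]
  )
}

dose_results <- rbind(
  summarise_test(0.5, 1.0),
  summarise_test(0.5, 2.0),
  summarise_test(1.0, 2.0)
)
print(dose_results, digits = 3)
```

Printing the unrounded `p_value` column shows how small each p-value is, which is more informative than reporting it as approximately 0.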

Comparing by Supplement Method

  • Our initial exploratory analysis seemed to show that, at the same dose, orange juice tended to have a greater effect on tooth growth than the vitamin C solution.
  • Let’s assume the sampling distribution of the difference in sample means follows a t distribution.
  • Our hypotheses are as follows:
    • \(H_0: \mu_{VC} - \mu_{OJ} = 0\); \(H_A: \mu_{VC} - \mu_{OJ} \not= 0\)
  • We will be using a two-sample t test at the 95% confidence level (a significance level of 0.05).
t <- t.test(len ~ supp, data = data)
  • Here’s what we can conclude:
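    • This test gives us a p-value of 0.06 and a 95% confidence interval of (-0.17, 7.57) for the difference in means. Because the levels of supp are ordered "OJ", "VC", t.test reports this interval as OJ minus VC.
    • We cannot reject the null hypothesis: the p-value exceeds 0.05 and the interval contains zero. Pooled across all doses, there is no statistically significant difference in tooth growth between supplement methods.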
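
As a minimal sketch of how these figures can be inspected, assuming `data` is the built-in `ToothGrowth` dataset (the name `supp_test` is introduced here for illustration):

```r
# Assumption: `data` is the built-in ToothGrowth dataset.
data <- ToothGrowth

# Two-sample t test of tooth length between supplements, pooled over doses.
supp_test <- t.test(len ~ supp, data = data)

supp_test$estimate   # group means, in factor-level order (OJ, then VC)
supp_test$p.value    # compare against the 0.05 significance level
supp_test$conf.int   # 95% interval for mean(OJ) - mean(VC)

# TRUE when the interval contains zero, i.e. the null is not rejected at 5%.
contains_zero <- supp_test$conf.int[1] < 0 && supp_test$conf.int[2] > 0
print(contains_zero)
```

Checking `supp_test$estimate` first makes the sign of the interval unambiguous, since the order of the group means is the order of subtraction.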

Summary