Overview

In this second portion of the class project, we analyze the ToothGrowth data in the R datasets package.

The dataset records the response, the length of odontoblasts (cells responsible for tooth growth), in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

This project has four parts:

  1. Load the ToothGrowth data and perform some basic exploratory data analyses.
  2. Provide a basic summary of the data.
  3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)
  4. State your conclusions and the assumptions needed for your conclusions.

Exercise

Loading add-on package

library(ggplot2)

1. Load the ToothGrowth data and perform some basic exploratory data analyses

data(ToothGrowth)  # ToothGrowth dataset
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
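As a quick check of the design described in the overview, the sketch below cross-tabulates supp and dose with base R's `table`. It uses only the built-in dataset; the name `design` is introduced here for illustration.

```r
# Cross-tabulate delivery method against dose to count animals per cell
data(ToothGrowth)
design <- table(supp = ToothGrowth$supp, dose = ToothGrowth$dose)
print(design)

# TRUE when every supp/dose combination has the same number of animals
length(unique(as.vector(design))) == 1
```

With 60 animals spread over 2 delivery methods and 3 doses, a balanced design gives 10 guinea pigs per cell.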

2. Provide a basic summary of the data.

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
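
Box plots of tooth length by supplement, first for all doses combined and then separately for each dose:

ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot(aes(fill = supp))

ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot(aes(fill = supp)) +
  facet_wrap(~ dose)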
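The box plots can be complemented with numeric group summaries. As a minimal sketch, `aggregate` from base R computes the mean tooth length for each combination of supp and dose; `group_means` is a name chosen here for illustration.

```r
data(ToothGrowth)

# Mean tooth length for every combination of delivery method and dose
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)
```

These six values are the same group means that the per-dose t-tests in part 3 report as sample estimates.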

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.

The t-test by dose:

t.test(ToothGrowth$len[ToothGrowth$dose == 1], ToothGrowth$len[ToothGrowth$dose == 0.5],
       paired = FALSE, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 0.5]
## t = 6.4766, df = 37.986, p-value = 6.342e-08
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  6.753323      Inf
## sample estimates:
## mean of x mean of y 
##    19.735    10.605
t.test(ToothGrowth$len[ToothGrowth$dose == 2], ToothGrowth$len[ToothGrowth$dose == 1],
       paired = FALSE, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 2] and ToothGrowth$len[ToothGrowth$dose == 1]
## t = 4.9005, df = 37.101, p-value = 9.532e-06
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  4.17387     Inf
## sample estimates:
## mean of x mean of y 
##    26.100    19.735

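Note: Increasing the dose increases mean tooth length. Both one-sided tests (1 vs. 0.5 mg/day and 2 vs. 1 mg/day) give p-values far below 0.05, so I reject the null hypothesis of no increase in each case.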
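To make explicit what the Welch test computes, here is a sketch that rebuilds the first comparison (dose 1 vs. dose 0.5) by hand from the standard Welch formulas. The variable names (`x`, `y`, `vx`, `vy`, `t_stat`, `df_welch`, `lower`) are illustrative and not part of the original analysis.

```r
data(ToothGrowth)
x <- ToothGrowth$len[ToothGrowth$dose == 1]    # higher dose
y <- ToothGrowth$len[ToothGrowth$dose == 0.5]  # lower dose

# Squared standard error contributed by each group
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
df_welch <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# One-sided p-value for the alternative "mean of x is greater than mean of y"
p_value <- pt(t_stat, df_welch, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval for the difference
lower <- (mean(x) - mean(y)) - qt(0.95, df_welch) * sqrt(vx + vy)

round(c(t = t_stat, df = df_welch, lower = lower), 3)
p_value
```

The results can be compared with the `t.test` output for the same comparison (t = 6.4766, df = 37.986, lower bound 6.753).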

The t-test by supp:

t.test(len ~ supp, data = ToothGrowth)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

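Note: The p-value for OJ vs. VC is 0.06, which is greater than 0.05, so I fail to reject the null hypothesis of equal means when all doses are pooled. The 95% confidence interval (-0.17 to 7.57) contains 0, which is consistent with this. However, the picture changes when each dose is examined separately.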
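The link between the confidence interval and the test decision can be checked directly. The sketch below stores the test result and asks whether the interval for the difference in means covers 0; `res`, `ci`, and `contains_zero` are names introduced here for illustration.

```r
data(ToothGrowth)
res <- t.test(len ~ supp, data = ToothGrowth)

# 95% confidence interval for mean(OJ) - mean(VC)
ci <- res$conf.int
contains_zero <- ci[1] <= 0 && ci[2] >= 0

ci
contains_zero        # TRUE: a difference of 0 is compatible with the data
res$p.value < 0.05   # FALSE: the two-sided test does not reject at the 5% level
```

For a two-sided test at the 5% level, the 95% interval contains 0 exactly when the p-value is above 0.05, so the two checks always agree.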

The t-test by supp for each dose:

t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == 0.5, ])
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98
t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == 1, ])
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77
t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == 2, ])
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14

Note:

At a dose of 0.5 mg/day, the p-value for OJ vs. VC is 0.006, which is less than 0.05, so I reject the null hypothesis of equal means.

At a dose of 1 mg/day, the p-value for OJ vs. VC is 0.001, which is less than 0.05, so I reject the null hypothesis of equal means.

At a dose of 2 mg/day, the p-value for OJ vs. VC is 0.964, which is greater than 0.05, so I fail to reject the null hypothesis of equal means.
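The three per-dose comparisons can also be run in one pass. As a sketch, the code below splits the data by dose, applies the same Welch t-test to each subset, and tabulates the decisions; `by_dose`, `p_values`, and `results` are illustrative names.

```r
data(ToothGrowth)

# Split the data by dose and run the OJ vs. VC Welch t-test within each subset
by_dose <- split(ToothGrowth, ToothGrowth$dose)
p_values <- sapply(by_dose, function(d) t.test(len ~ supp, data = d)$p.value)

# Collect the results and flag doses where the null hypothesis is rejected at 0.05
results <- data.frame(dose = names(p_values),
                      p_value = round(unname(p_values), 4),
                      reject_null = unname(p_values < 0.05))
print(results)
```

This gives the same three p-values as the separate calls, in a form that is easier to compare across doses.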

4. State your conclusions and the assumptions needed for your conclusions.

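  • Increasing the dose increases mean tooth length.

  • Orange juice (OJ) gives greater mean tooth length than ascorbic acid (VC) at doses of 0.5 and 1 mg/day. At 2 mg/day there is no significant difference between the two delivery methods.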

Assumptions:

  • Each guinea pig belongs to exactly one dose and delivery-method group, so the groups are treated as independent (unpaired) samples.

  • Tooth length within each group is approximately normally distributed, as the t-test requires.

  • The group variances are not assumed to be equal, which is why Welch's t-test is used throughout.