Overview

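In this study, we perform an inferential analysis of the ToothGrowth data in the R datasets package, which records tooth length for 60 guinea pigs given one of two supplement types (OJ or VC) at one of three dose levels.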

Load ToothGrowth Data

library(datasets)
data("ToothGrowth")
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

Basic Summary of Data

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

The ToothGrowth data frame has 60 observations on 3 variables.
- len: Tooth length
- supp: Supplement type (VC or OJ)
- dose: Dose in milligrams/day
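As a quick check on how the 60 observations are spread across the experimental groups, the sketch below cross-tabulates supp against dose. The names cell_counts and dose_levels are our own, chosen for illustration.

```r
library(datasets)
data("ToothGrowth")

# Cross-tabulate supplement type against dose to see the group sizes.
cell_counts <- with(ToothGrowth, table(supp, dose))
print(cell_counts)

# Distinct dose levels present in the data.
dose_levels <- sort(unique(ToothGrowth$dose))
print(dose_levels)
```

Although dose is stored as a numeric variable, it takes only a few distinct values, which is why the later comparisons treat it as a grouping factor.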

Basic Exploratory Analysis

We first look at how len varies with supp and with dose separately (the plot code is in Appendix A).

Then we look at how len varies with supp and dose together.
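A numeric companion to the plots can be sketched with aggregate(), which computes the mean tooth length for each grouping. This is only one possible summary, and the object names (means_by_supp, means_by_dose, means_by_both) are ours.

```r
library(datasets)
data("ToothGrowth")

# Mean tooth length by supplement type alone.
means_by_supp <- aggregate(len ~ supp, data = ToothGrowth, FUN = mean)

# Mean tooth length by dose alone.
means_by_dose <- aggregate(len ~ dose, data = ToothGrowth, FUN = mean)

# Mean tooth length for every supplement/dose combination.
means_by_both <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

print(means_by_supp)
print(means_by_dose)
print(means_by_both)
```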

Confidence Intervals and Hypothesis Tests

T-test for mean difference by supplement type

\(H_0:\) There is no difference between OJ and VC’s effects on tooth length

## [1] -0.1710156  7.5710156
## attr(,"conf.level")
## [1] 0.95
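To make clear where this interval comes from, the sketch below rebuilds it by hand. It assumes the t.test() defaults used in Appendix B, which means an unpaired test that does not pool the group variances (Welch's t-test). All variable names are ours.

```r
library(datasets)
data("ToothGrowth")

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Variance of each group mean (sample variance divided by group size).
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)

# Standard error of the difference in means, variances not pooled.
se_diff <- sqrt(v_oj + v_vc)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# 95% interval: difference in means plus/minus t quantile times standard error.
manual_ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se_diff
print(manual_ci)
```

The interval is for the OJ mean minus the VC mean, because OJ is the first level of the supp factor.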

T-tests for mean difference by dosage level

\(H_0:\) There is no difference between dosage levels’ (0.5 vs. 1) effects on tooth length

## [1] -11.983781  -6.276219
## attr(,"conf.level")
## [1] 0.95

\(H_0:\) There is no difference between dosage levels’ (0.5 vs. 2) effects on tooth length

## [1] -18.15617 -12.83383
## attr(,"conf.level")
## [1] 0.95

\(H_0:\) There is no difference between dosage levels’ (1 vs. 2) effects on tooth length

## [1] -8.996481 -3.733519
## attr(,"conf.level")
## [1] 0.95

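None of the 95% CIs for the three t-tests above contains 0, so we reject the null hypothesis in each case. Each interval is for the mean at the lower dose minus the mean at the higher dose, so the negative bounds indicate greater tooth length at the higher dose.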
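The three pairwise dose comparisons can also be generated in a loop rather than written out one by one. The sketch below is one way to do it, using combn() to enumerate the pairs; the names dose_pairs, dose_cis and excludes_zero are ours.

```r
library(datasets)
data("ToothGrowth")

# All pairs of dose levels, one pair per column.
dose_pairs <- combn(sort(unique(ToothGrowth$dose)), 2)

# Welch t-test for each pair, keeping only the 95% confidence interval.
dose_cis <- apply(dose_pairs, 2, function(pair) {
  pair_data <- ToothGrowth[ToothGrowth$dose %in% pair, ]
  ci <- t.test(len ~ dose, data = pair_data)$conf.int
  c(low_dose = pair[1], high_dose = pair[2], lower = ci[1], upper = ci[2])
})
dose_cis <- as.data.frame(t(dose_cis))

# Flag intervals that lie entirely on one side of 0.
dose_cis$excludes_zero <- dose_cis$lower > 0 | dose_cis$upper < 0
print(dose_cis)
```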

T-tests for mean difference by supplement type at same dosage levels

\(H_0:\) There is no difference between OJ and VC’s effects on tooth length when dosage is 0.5 mg/day

## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95

\(H_0:\) There is no difference between OJ and VC’s effects on tooth length when dosage is 1 mg/day

## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95

\(H_0:\) There is no difference between OJ and VC’s effects on tooth length when dosage is 2 mg/day

## [1] -3.79807  3.63807
## attr(,"conf.level")
## [1] 0.95
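The same supplement comparison at each dosage level can be collected into one table, which makes it easier to see which intervals contain 0. This is a sketch with our own names (supp_cis, contains_zero), not the code used to produce the output above.

```r
library(datasets)
data("ToothGrowth")

# Welch t-test of OJ vs. VC within each dose level.
supp_cis <- t(sapply(sort(unique(ToothGrowth$dose)), function(d) {
  dose_data <- ToothGrowth[ToothGrowth$dose == d, ]
  ci <- t.test(len ~ supp, data = dose_data)$conf.int
  c(dose = d, lower = ci[1], upper = ci[2])
}))
supp_cis <- as.data.frame(supp_cis)

# TRUE where the interval for (OJ mean - VC mean) contains 0.
supp_cis$contains_zero <- supp_cis$lower <= 0 & supp_cis$upper >= 0
print(supp_cis)
```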

Conclusions and Assumptions

For each t-test above, we examine the 95% confidence interval for the difference in means. If the CI contains 0, we fail to reject the null hypothesis at the 5% significance level; if it does not, we reject it. The tests lead to the following conclusions:

  1. When looking at supplement type alone, we fail to reject the null hypothesis: the data do not show a statistically significant difference between OJ and VC’s effects on tooth length
  2. When looking at dosage levels alone, higher dosage is associated with greater tooth length
  3. When dosage level is held fixed, OJ has a greater effect on tooth length than VC at dosage levels 0.5 and 1.0. We cannot state the same at dosage level 2.0
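The decision rule behind these conclusions can be written as a small helper. The function ci_decision() below is hypothetical (it is not part of R or of the original analysis). For a two-sided test at the 5% level, a 95% CI that contains 0 corresponds to a p-value above 0.05, so the sketch also prints the p-value for comparison.

```r
library(datasets)
data("ToothGrowth")

# Hypothetical helper: decide on H0 from a two-element confidence interval.
ci_decision <- function(ci, null_value = 0) {
  if (ci[1] <= null_value && null_value <= ci[2]) {
    "fail to reject H0"
  } else {
    "reject H0"
  }
}

# Supplement type alone: the interval contains 0.
supp_test <- t.test(len ~ supp, data = ToothGrowth)
print(ci_decision(supp_test$conf.int))
print(supp_test$p.value)

# Supplement type at dose 0.5: the interval lies above 0.
low_dose_test <- t.test(len ~ supp,
                        data = ToothGrowth[ToothGrowth$dose == 0.5, ])
print(ci_decision(low_dose_test$conf.int))
print(low_dose_test$p.value)
```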

The above conclusions are based on the following assumptions:
  1. The sample of 60 guinea pigs was randomly drawn from the population
  2. The dosage levels and delivery methods were randomly assigned to the guinea pigs in the sample
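The t.test() calls in Appendix B also rely on the function's defaults, which treat the two groups as independent samples with possibly unequal variances. The sketch below spells those defaults out and, for comparison only, shows the interval obtained if the variances were instead assumed equal. The object names are ours.

```r
library(datasets)
data("ToothGrowth")

# Call as written in Appendix B, relying on the defaults.
default_ci <- t.test(len ~ supp, data = ToothGrowth)$conf.int

# Same call with the relevant defaults made explicit.
explicit_ci <- t.test(len ~ supp, data = ToothGrowth,
                      var.equal = FALSE, conf.level = 0.95)$conf.int

# Alternative for comparison: pooled-variance (Student) t-test.
pooled_ci <- t.test(len ~ supp, data = ToothGrowth,
                    var.equal = TRUE)$conf.int

print(default_ci)
print(explicit_ci)
print(pooled_ci)
```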

Appendix A - R code for plots

library(ggplot2)
library(gridExtra)
## Boxplot of tooth length by supplement type
g <- ggplot(ToothGrowth)
g1 <- g + geom_boxplot(aes(x = supp, y = len, fill = supp))
g1 <- g1 + labs(title = "Tooth Length by Supplement Type",
                x = "Supplement Type", y = "Tooth Length") +
        theme(legend.position = "none")
## Boxplot of tooth length by dosage
g2 <- g + geom_boxplot(aes(x = as.factor(dose), y = len, fill = as.factor(dose)))
g2 <- g2 + labs(title = "Tooth Length by Dosage", x = "Dosage", y = "Tooth Length") +
        theme(legend.position = "none")
## Show the two boxplots side by side
grid.arrange(g1, g2, nrow = 1)
## Boxplots of tooth length by supplement type, one panel per dosage level
g <- ggplot(ToothGrowth, aes(x = supp, y = len, color = supp))
g <- g + geom_boxplot() +
        facet_grid(. ~ dose, labeller = label_both) +
        labs(title = "Tooth Length vs Dosage Level by Supplement Type",
             x = "Supplement Type", y = "Tooth Length") +
        theme(legend.position = "none")
g

Appendix B - R code for t-tests

## Supplement type (OJ vs. VC), all dosage levels together
t.test(len ~ supp, data = ToothGrowth)$conf.int
## Pairwise dosage comparisons, both supplement types together
t.test(len ~ dose, data = subset(ToothGrowth, dose %in% c(0.5, 1.0)))$conf.int
t.test(len ~ dose, data = subset(ToothGrowth, dose %in% c(0.5, 2.0)))$conf.int
t.test(len ~ dose, data = subset(ToothGrowth, dose %in% c(1.0, 2.0)))$conf.int
## Supplement type (OJ vs. VC) within each dosage level
t.test(len ~ supp, data = subset(ToothGrowth, dose == 0.5))$conf.int
t.test(len ~ supp, data = subset(ToothGrowth, dose == 1))$conf.int
t.test(len ~ supp, data = subset(ToothGrowth, dose == 2))$conf.int