Statistical Inference Project Part 2

Overview

We’re going to analyze the ToothGrowth data in the R datasets package using confidence intervals and hypothesis tests to compare tooth growth by supp and dose.

1. Load the data

library(datasets)
tg <- ToothGrowth

2. Provide a basic summary of the data.

The ?ToothGrowth command gives us the details on this dataset. The purpose of the experiment was to test the effect of vitamin C on tooth growth in guinea pigs.

“The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”

The format is a data frame with 3 variables:

[,1]  len   numeric  Tooth length
[,2]  supp  factor   Supplement type (VC or OJ)
[,3]  dose  numeric  Dose in milligrams/day

We can also look at some summaries of the data frame to find out a bit more:

str(tg)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

summary(tg)

##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

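The str() output shows that dose is stored as a numeric variable. Because the experiment used only three dose levels (0.5, 1, and 2 mg/day), we convert it to a factor so it can serve as a grouping variable.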

tg$dose <- as.factor(tg$dose)
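
As a quick check on the design after this conversion, the sketch below tabulates how many animals fall into each supplement-by-dose cell and the mean tooth length per cell. The names tg_check, counts and means are illustrative and not part of the original analysis.

```r
library(datasets)

# Work on a copy so the sketch does not depend on earlier steps.
tg_check <- ToothGrowth
tg_check$dose <- as.factor(tg_check$dose)

# Number of animals in each supplement-by-dose cell.
counts <- table(tg_check$supp, tg_check$dose)
print(counts)

# Mean tooth length in each cell.
means <- tapply(tg_check$len, list(tg_check$supp, tg_check$dose), mean)
print(round(means, 2))
```

Equal cell counts would indicate a balanced design, which makes the group comparisons in the next section easier to interpret.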

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)

We’ll start by plotting the data to form some hypotheses.

First, does dose matter?

plot(len ~ dose, data = tg)

It looks like it could: tooth length appears to increase with dose.

Second, does the type of supplement matter?

plot(len ~ supp, data = tg)

It also looks like it could: OJ appears to give somewhat longer teeth than VC, although the two boxes overlap considerably.
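
The two plots above look at dose and supplement separately. As a possible complement, the sketch below draws one box per supplement-and-dose combination, so both factors can be compared in a single figure. The names tg_plot and group, and the colour choices, are assumptions made for illustration.

```r
library(datasets)

tg_plot <- ToothGrowth
tg_plot$dose <- as.factor(tg_plot$dose)

# interaction() builds a six-level grouping factor with the supplement
# varying fastest: OJ.0.5, VC.0.5, OJ.1, VC.1, OJ.2, VC.2.
tg_plot$group <- interaction(tg_plot$supp, tg_plot$dose)

# Colours are recycled across the six boxes: orange for OJ, light blue for VC.
boxplot(len ~ group, data = tg_plot,
        xlab = "Supplement and dose (mg/day)",
        ylab = "Tooth length",
        col = c("orange", "lightblue"))
```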

So, which hypotheses do we need to test?

Orange juice (OJ) has a larger impact on odontoblast growth than ascorbic acid (VC).

We’ll test this with a Welch two-sample t-test:

# Unpaired (the default) Welch t-test; variances are not assumed equal
t.test(len ~ supp, var.equal = FALSE, data = tg)

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

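It’s close, but the p-value of 0.061 is above 0.05 and the 95% confidence interval (-0.17 to 7.57) contains zero, so at the 5% significance level we cannot reject the null hypothesis that OJ and VC produce the same mean growth. The test is two-sided, so strictly it compares OJ and VC for any difference in mean growth, not only for OJ being larger.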
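
To make the test output less of a black box, here is a minimal sketch that recomputes the Welch statistic, degrees of freedom, p-value and confidence interval by hand from the group means and variances. The variable names (oj, vc, v_oj, v_vc, se, t_stat, df_welch, p_value, ci) are illustrative. The printed values should agree with the t.test() output above up to rounding.

```r
library(datasets)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error contributed by each group.
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)
se <- sqrt(v_oj + v_vc)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat <- (mean(oj) - mean(vc)) / se
df_welch <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for mean(OJ) - mean(VC).
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```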

Larger doses of OJ result in more odontoblast growth.

# Keep only the lowest and highest OJ doses, then run an unpaired Welch t-test
tg_oj <- subset(tg, supp == "OJ" & dose != 1)
t.test(len ~ dose, var.equal = FALSE, data = tg_oj)

## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -7.817, df = 14.668, p-value = 1.324e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -16.335241  -9.324759
## sample estimates:
## mean in group 0.5   mean in group 2 
##             13.23             26.06

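The p-value is far below 0.05 and the 95% confidence interval for the difference (0.5 mg/day minus 2 mg/day), -16.34 to -9.32, lies entirely below zero. We reject the null hypothesis: mean growth is higher at 2 mg/day of OJ than at 0.5 mg/day.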

Larger doses of VC result in more odontoblast growth.

# Keep only the lowest and highest VC doses, then run an unpaired Welch t-test
tg_vc <- subset(tg, supp == "VC" & dose != 1)
t.test(len ~ dose, var.equal = FALSE, data = tg_vc)

## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -10.388, df = 14.327, p-value = 4.682e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -21.90151 -14.41849
## sample estimates:
## mean in group 0.5   mean in group 2 
##              7.98             26.14

The p-value is again far below 0.05 and the 95% confidence interval (-21.90 to -14.42) excludes zero, so the same conclusion holds for VC.
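
Since the two dose comparisons follow the same recipe, they could also be run through one small helper and collected in a table. This is only a sketch: dose_test, results and summary_tab are hypothetical names, and the helper starts from the raw ToothGrowth data rather than the tg object used above.

```r
library(datasets)

# Hypothetical helper: Welch test of dose 0.5 vs dose 2 within one supplement.
dose_test <- function(supp_code, data = ToothGrowth) {
  d <- data[data$supp == supp_code & data$dose %in% c(0.5, 2), ]
  d$dose <- factor(d$dose)  # two levels only: "0.5" and "2"
  t.test(len ~ dose, data = d, var.equal = FALSE)
}

results <- lapply(c(OJ = "OJ", VC = "VC"), dose_test)

# One row per supplement: 95% CI for mean(0.5) - mean(2) and the p-value.
summary_tab <- data.frame(
  supp      = names(results),
  diff_low  = sapply(results, function(r) r$conf.int[1]),
  diff_high = sapply(results, function(r) r$conf.int[2]),
  p_value   = sapply(results, function(r) r$p.value)
)
print(summary_tab)
```

Keeping the subsetting and the test in one function means both supplements are guaranteed to be tested with the same settings.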

4. State your conclusions and the assumptions needed for your conclusions.

I’ve concluded:

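At the 5% significance level, we can’t conclude that OJ promotes more growth than VC when all doses are pooled (p = 0.061).
For both VC and OJ, the 2 mg/day dose promotes significantly more growth than the 0.5 mg/day dose (p < 0.001 in both tests).

These conclusions assume that the groups being compared are independent samples (the tests are unpaired), that tooth length within each group is roughly normally distributed, and that the group variances may differ (Welch’s test does not require equal variances).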