1. Overview

This study analyzes the effect of vitamin C on tooth growth in guinea pigs using the ToothGrowth data from the R datasets package. It uses hypothesis tests and their associated confidence intervals to compare tooth growth by supplement type and by dose. Only techniques seen in class are used, even though other approaches may be worth considering.

2. Summary of the data

The data record the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

The ToothGrowth data set is a data frame with 60 observations on 3 variables.

##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

Since dose takes only three values (0.5, 1 and 2 mg/day), we convert the variable dose to a factor.

3. Basic exploratory analysis

First we investigate the effect of supplement delivery method on tooth length by comparing the average tooth length of guinea pigs that were given vitamin C via orange juice with that of guinea pigs that were given it via ascorbic acid.

3.1. Effect of Vitamin C supplement delivery method on tooth length

Mean tooth length by delivery method:

##       OJ       VC 
## 20.66333 16.96333

A basic exploratory analysis indicates that mean tooth length is higher when vitamin C is delivered through orange juice (20.66) than through ascorbic acid (16.96). A two-sample t-test for the difference in means of two independent groups will be used to check whether this difference is statistically significant. Plot 1 in the appendix illustrates this comparison.

3.2. Effect of Vitamin C supplement dose on tooth length

Mean tooth length by supplement dose:

##    0.5      1      2 
## 10.605 19.735 26.100

A basic exploratory analysis indicates that mean tooth length increases with the dose of vitamin C (10.61, 19.74 and 26.10 for 0.5, 1 and 2 mg/day). Two-sample t-tests for the difference in means of two independent groups will be used to check whether these differences are statistically significant. Plot 2 in the appendix illustrates this comparison.
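The group means in sections 3.1 and 3.2 can be complemented with the spread and size of each group, which is what the t-tests later rely on. The following is a minimal sketch; the helper name `group_summary` is introduced here for illustration and is not part of the original analysis.

```r
library(datasets)
data(ToothGrowth)

# Mean, standard deviation and size of tooth length within each level
# of a grouping variable (hypothetical helper, for illustration only)
group_summary <- function(group) {
  by_group <- split(ToothGrowth$len, group)
  data.frame(
    mean = sapply(by_group, mean),
    sd   = sapply(by_group, sd),
    n    = lengths(by_group)
  )
}

group_summary(ToothGrowth$supp)  # one row per delivery method
group_summary(ToothGrowth$dose)  # one row per dose level
```

Reporting the standard deviations next to the means makes it easier to judge whether an equal-variance assumption would be reasonable.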

4. Statistical Inference

4.1. Assumptions

  1. The experiment is done with random assignment of guinea pigs to different dose level categories and supplement type to control for confounders that might affect the outcome.
  2. Members of the sample population, i.e. the 60 guinea pigs, are representative of the entire population of guinea pigs. This assumption allows us to generalize the results.
  3. For the t-tests, the variances of the two groups being compared are assumed to be different (Welch's t-test). This is a weaker assumption than assuming that the variances are equal.
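For reference, the unequal-variance (Welch) statistic behind assumption 3 can be written as follows. The notation is introduced here and is not part of the original report: for group i, the sample mean is \bar{x}_i, the sample variance is s_i^2 and the group size is n_i.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}
```

The approximate degrees of freedom \nu are generally not an integer, which is why the test output reports values such as df = 55.309.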

4.2. Effect of supplement delivery method on tooth length

Our null hypothesis, H0, states that the means of the two groups are equal. The alternative hypothesis, Ha, states that the two means are different. We perform a two-sided test.

## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$supp == "OJ"] and ToothGrowth$len[ToothGrowth$supp == "VC"]
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

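Since the p-value of this test is 0.06, which is above the 0.05 significance level, and the 95% confidence interval contains zero, we do not have enough evidence to reject the null hypothesis. We therefore cannot conclude that the vitamin C delivery method has an effect on tooth length. Note that failing to reject H0 does not prove that there is no effect.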
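As a check on the t.test output above, the statistic, degrees of freedom, p-value and confidence interval can be recomputed by hand from the group means and variances. This is an illustrative sketch using the standard Welch formulas. The variable names are introduced here, and the results should agree with the reported values up to rounding.

```r
library(datasets)
data(ToothGrowth)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error contributed by each group
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
diff_means <- mean(oj) - mean(vc)
t_stat <- diff_means / sqrt(se2_oj + se2_vc)
df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_stat), df)
ci <- diff_means + c(-1, 1) * qt(0.975, df) * sqrt(se2_oj + se2_vc)

round(c(t = t_stat, df = df, p = p_value), 4)
round(ci, 4)
```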

4.3. Effect of supplement dose on tooth length

We now test the effect of supplement dose on tooth length. Because the exploratory analysis showed mean tooth length rising steadily with vitamin C dose, we test only the two adjacent pairs of dose levels: 0.5 mg versus 1.0 mg, and 1.0 mg versus 2.0 mg. Our null hypothesis, H0, is that the two groups have equal means. The alternative hypothesis, Ha, is that the two groups have different means. We perform a two-sided test.

Comparing mean tooth length between the 0.5 mg and 1 mg dose levels:

## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 0.5] and ToothGrowth$len[ToothGrowth$dose == 1]
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean of x mean of y 
##    10.605    19.735

Comparing mean tooth length between the 1 mg and 2 mg dose levels:

## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 2]
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean of x mean of y 
##    19.735    26.100

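For both pairs of dose levels, the p-value is well below 0.05 and the 95% confidence interval lies entirely below zero. We therefore reject the null hypothesis, H0, in both cases and conclude that, under the assumptions of section 4.1, increasing the dose level leads to an increase in tooth length.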
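The two dose comparisons can also be run with the formula interface of t.test on a subset of the data, which avoids repeating the indexing expressions. This is a sketch rather than the code used for the report. The helper name `compare_doses` is hypothetical, and the data are reloaded so that dose is numeric.

```r
library(datasets)
data(ToothGrowth)

# Welch two-sample t-test of tooth length for one pair of dose levels
# (hypothetical helper, for illustration only)
compare_doses <- function(low, high) {
  pair <- subset(ToothGrowth, dose %in% c(low, high))
  t.test(len ~ dose, data = pair, var.equal = FALSE)
}

res_low  <- compare_doses(0.5, 1)
res_high <- compare_doses(1, 2)

# Collect the p-values and confidence limits side by side
data.frame(
  comparison = c("0.5 vs 1", "1 vs 2"),
  p_value    = c(res_low$p.value, res_high$p.value),
  ci_lower   = c(res_low$conf.int[1], res_high$conf.int[1]),
  ci_upper   = c(res_low$conf.int[2], res_high$conf.int[2])
)
```

With the formula interface the groups are ordered by dose level, so the sign of the difference (lower dose minus higher dose) matches the output shown above.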

5. Conclusion

The following conclusions can be drawn from the hypothesis tests above:

  1. Supplement type has no statistically significant effect on tooth growth at the 5% level (p = 0.06).

  2. Increasing the dose level leads to increased tooth growth.

APPENDIX - R code and plots not shown in the body of the report:

Summarizing the data

library(datasets)
library(ggplot2)
data(ToothGrowth)
set.seed(25)
summary(ToothGrowth)
str(ToothGrowth)
# Convert dose to a factor
ToothGrowth$dose<-as.factor(ToothGrowth$dose)

str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...

Basic exploratory analysis

Effect of Vitamin C supplement delivery method on tooth length

# calculate the mean tooth length of guinea pigs for each of the two supplement delivery methods
meansupp = split(ToothGrowth$len, ToothGrowth$supp)
sapply(meansupp, mean)
##       OJ       VC 
## 20.66333 16.96333

Plot 1 - Effect of Vitamin C supplement delivery method on tooth length

# Plot tooth length ('len') vs. the supplement delivery method ('supp')
ggplot(aes(x=supp, y=len), data=ToothGrowth) + geom_boxplot(aes(fill=supp)) +
        xlab("Supplement type") + ylab("Tooth length") + ggtitle("Plot 1: Tooth Length vs. Supplement Delivery Method")

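Effect of Vitamin C supplement dose on tooth length

# calculate the mean tooth length of guinea pigs for each of the three dose levels
meandose = split(ToothGrowth$len, ToothGrowth$dose)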
sapply(meandose, mean)
##    0.5      1      2 
## 10.605 19.735 26.100

Plot 2 - Effect of Vitamin C supplement dose on tooth length

# Plot tooth length ('len') vs. the vitamin C dose ('dose')
ggplot(aes(x=dose, y=len), data=ToothGrowth) + geom_boxplot(aes(fill=dose)) +
        xlab("Dose in milligrams") + ylab("Tooth length") + ggtitle("Plot 2: Tooth Length vs. Dose Amount")

Statistical inference

Two-sided t.test - Effect of Vitamin C supplement delivery method on tooth length

t.test(ToothGrowth$len[ToothGrowth$supp=="OJ"], ToothGrowth$len[ToothGrowth$supp=="VC"], paired = FALSE, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$supp == "OJ"] and ToothGrowth$len[ToothGrowth$supp == "VC"]
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

Two-sided t.test - Effect of Vitamin C supplement dose on tooth length

# assuming unequal variances between the two groups
t.test(ToothGrowth$len[ToothGrowth$dose==0.5], ToothGrowth$len[ToothGrowth$dose==1], paired = FALSE, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 0.5] and ToothGrowth$len[ToothGrowth$dose == 1]
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean of x mean of y 
##    10.605    19.735
t.test(ToothGrowth$len[ToothGrowth$dose==1], ToothGrowth$len[ToothGrowth$dose==2], paired = FALSE, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 2]
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean of x mean of y 
##    19.735    26.100