Overview

In the second portion of the project, we analyze the ToothGrowth data in the R datasets package. The ToothGrowth data set consists of the length of odontoblasts (the cells responsible for tooth growth) in each of 60 guinea pigs. Each animal received one of three Vitamin C dosage levels (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice (OJ) or ascorbic acid (VC), giving 10 animals for each combination of dose and delivery method.

The dataset contains 60 observations of 3 variables: len (tooth length), supp (delivery method, OJ or VC), and dose (Vitamin C dose in mg/day).

Load the ToothGrowth data and perform some basic exploratory data analyses

# Load libraries
library(ggplot2)
# Load the data
data(ToothGrowth)
# Convert dose to a factor so it can be used as a grouping variable
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

Provide a basic summary of the data.

summary(ToothGrowth)
##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90
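
As a quick check of the design described in the overview, the observations can be cross-tabulated by delivery method and dose. This is a minimal sketch: `tg` and `counts` are names introduced here for illustration, and the data are read through `datasets::ToothGrowth` so that the ToothGrowth object modified above is left untouched. Each cell should hold 10 observations if the design is balanced as described.

```r
# Work on a local copy so the ToothGrowth object modified above is not overwritten
tg <- datasets::ToothGrowth

# Count observations for each delivery method / dose combination
counts <- table(tg$supp, tg$dose, dnn = c("supp", "dose"))
print(counts)

# Mean tooth length per combination, to read alongside the summary
print(aggregate(len ~ supp + dose, data = tg, FUN = mean))
```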

Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.

Boxplots

# Plot tooth length ('len') vs. supplement delivery method ('supp') broken out by the dose amount ('dose')
ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot(aes(fill = supp)) +
  facet_grid(~ dose) +
  xlab("Supplement Delivery") +
  ylab("Tooth Length") +
  ggtitle("Tooth Length vs. Delivery Method \nby Dose Amount") +
  theme(plot.title = element_text(lineheight = .8, face = "bold"))

Confidence Intervals and Hypothesis Testing

Does the tooth length of the guinea pigs depend on the delivery method?

len<-ToothGrowth$len
supp<-ToothGrowth$supp
dose<-ToothGrowth$dose
t.test(len[supp=="OJ"], len[supp=="VC"], paired = FALSE, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len[supp == "OJ"] and len[supp == "VC"]
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

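The p-value of this test (0.06063) is above the 0.05 significance level, and the 95% confidence interval (-0.171, 7.571) contains zero. There is therefore not enough evidence to reject the null hypothesis that mean tooth length is the same for both delivery methods.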
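
To make the quantities in the Welch test output less opaque, the sketch below recomputes the t statistic, the Welch-Satterthwaite degrees of freedom, the p-value, and the 95% confidence interval by hand. It is an illustration rather than part of the original analysis; all variable names (`tg`, `oj`, `vc`, `se`, and so on) are assumptions introduced here.

```r
# Local copy of the data; the two samples are split by delivery method
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Squared standard error contributed by each group
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se <- sqrt(se2_oj + se2_vc)

# Welch t statistic: difference in means divided by its standard error
diff_means <- mean(oj) - mean(vc)
t_stat <- diff_means / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- diff_means + c(-1, 1) * qt(0.975, df_welch) * se

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

These values should agree with the `t.test` output above, since `t.test` with `var.equal = FALSE` uses the same formulas.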

Does the tooth length of the guinea pigs depend on the Vitamin C dosage?

t.test(len[dose==2], len[dose==1], paired = FALSE, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  len[dose == 2] and len[dose == 1]
## t = 4.9005, df = 38, p-value = 1.811e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  3.735613 8.994387
## sample estimates:
## mean of x mean of y 
##    26.100    19.735

The p-value of this test is 1.811e-05, far below 0.05, so we reject the null hypothesis that the mean tooth length is the same at the 1 mg and 2 mg doses. The 95% confidence interval (3.736, 8.994) lies entirely above zero, which indicates that raising the dose from 1 mg to 2 mg has a positive effect on tooth length.
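
The test above covers only the 1 mg and 2 mg groups. One possible way to extend the comparison to every pair of dose levels is sketched below. This is an assumption-laden illustration, not part of the original analysis: it uses Welch tests (`var.equal = FALSE`) rather than the pooled-variance test above, applies no multiple-comparison correction, and the names `tg`, `dose_pairs`, and `p_values` are hypothetical.

```r
# Local copy of the data with dose kept numeric
tg <- datasets::ToothGrowth

# Pairs of dose levels to compare (higher dose first)
dose_pairs <- list(c(1, 0.5), c(2, 1), c(2, 0.5))
p_values <- numeric(length(dose_pairs))

for (i in seq_along(dose_pairs)) {
  hi <- dose_pairs[[i]][1]
  lo <- dose_pairs[[i]][2]
  # Welch two-sample t-test on tooth length for the two dose groups
  res <- t.test(tg$len[tg$dose == hi], tg$len[tg$dose == lo],
                paired = FALSE, var.equal = FALSE)
  p_values[i] <- res$p.value
  cat(sprintf("dose %s vs %s: p = %.3g, 95%% CI = [%.2f, %.2f]\n",
              hi, lo, res$p.value, res$conf.int[1], res$conf.int[2]))
}
```

If several pairs are tested this way, an adjustment such as `p.adjust(p_values, method = "bonferroni")` could be considered before drawing conclusions.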

Conclusion

From the analysis above, we conclude that the difference in tooth growth between the two delivery methods is not statistically significant at the 5% level, whereas increasing the dose from 1 mg to 2 mg leads to significantly greater tooth growth.

Assumptions: