data("ToothGrowth")
library("knitr")
library("ggplot2")
library("fitdistrplus")
## Loading required package: MASS

Overview

In this project we perform a basic analysis to determine whether supplement type (orange juice or ascorbic acid) and dose influence tooth growth. The null hypothesis, \(H_0\), states that mean tooth growth does not differ between the two supplements or across dose levels.
summary(ToothGrowth); str(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
#hist(ToothGrowth$len)

Analysis

Length vs Supplement

We evaluated whether supplement type (orange juice or ascorbic acid) is related to the length of tooth growth. Because the p-value exceeds 0.05 (p = 0.06063) and the 95 percent confidence interval includes zero, we fail to reject the null hypothesis of no difference in mean tooth length between the two supplements. However, when comparing length of tooth growth across dosage levels, the findings are different.
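# Test by 'supp': Welch two-sample t-test of length between the two supplements
oj <- ToothGrowth[ToothGrowth$supp == 'OJ', ]  # orange juice rows (not used by the test below)
vc <- ToothGrowth[ToothGrowth$supp == 'VC', ]  # ascorbic acid rows (not used by the test below)
t.test(len ~ supp, paired = FALSE, data = ToothGrowth)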
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333
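
To see where the reported statistic comes from, the Welch t statistic and its Welch-Satterthwaite degrees of freedom can be reproduced by hand. This is a minimal sketch rather than part of the original analysis; the names `x`, `y`, `se2_x`, `se2_y`, `t_stat`, `df_welch`, and `p_value` are illustrative.

```r
data("ToothGrowth")

# Tooth lengths for each supplement group
x <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
y <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch statistic: difference in means divided by its standard error
t_stat <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

round(c(t = t_stat, df = df_welch, p = p_value), 4)
```

These values should agree with the `t.test` output above (t = 1.9153, df = 55.309, p-value = 0.06063), since `t.test` uses the unequal-variance form by default.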

Length vs Dosage - 0.5 vs 1.0

As the very small p-value shows (p < 0.05), we reject the null hypothesis. Mean tooth length differs between the 0.5 and 1.0 doses, with the higher dose associated with greater growth.
#test by 'dose = 0.5' and 1.0
t.test(len ~ dose, paired = FALSE, data = subset(ToothGrowth, ToothGrowth$dose %in% c(.5, 1)))
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1 
##            10.605            19.735
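
The decision rule used in these tests can be made explicit by reading the p-value and confidence interval from the object that `t.test` returns. The sketch below assumes a significance level of 0.05; the names `alpha`, `res`, `reject_null`, and `ci_excludes_zero` are illustrative.

```r
data("ToothGrowth")

alpha <- 0.05  # conventional significance level (an assumption of this sketch)

# Same comparison as above: dose 0.5 versus dose 1.0
res <- t.test(len ~ dose, paired = FALSE,
              data = subset(ToothGrowth, dose %in% c(0.5, 1)))

# Reject H0 when the p-value falls below alpha
reject_null <- res$p.value < alpha

# Equivalent check: the 95 percent confidence interval does not contain zero
ci_excludes_zero <- res$conf.int[1] > 0 || res$conf.int[2] < 0

reject_null
ci_excludes_zero
```

For a two-sided test at the matching level, the two checks lead to the same decision.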

Length vs Dosage - 1.0 vs 2.0

Once again, we observe a small p-value (p < 0.05) and reject the null hypothesis. Mean tooth length is greater at the 2.0 dose than at the 1.0 dose.
t.test(len ~ dose, paired = FALSE, data = subset(ToothGrowth, ToothGrowth$dose %in% c(1, 2)))
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2 
##          19.735          26.100

Length vs Dosage - 2.0 vs 0.5

Finally, we observe another small p-value (p < 0.05) and reject the null hypothesis. The difference is largest between the smallest and the largest dose, which differ by a factor of four: the estimated difference in means is about 15.5, compared with about 9.1 and 6.4 in the two previous tests.
t.test(len ~ dose, paired = FALSE, data = subset(ToothGrowth, ToothGrowth$dose %in% c(.5, 2)))
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100
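
Because three pairwise dose comparisons are run on the same data, it may be worth checking that the conclusions survive a multiple-comparison correction. The sketch below is a supplementary check, not part of the original analysis; it applies a Bonferroni adjustment with `p.adjust` to the p-values of the three Welch tests. The helper `welch_p` and the names `dose_pairs`, `p_raw`, and `p_adj` are hypothetical.

```r
data("ToothGrowth")

# Hypothetical helper: Welch p-value for one pair of dose levels
welch_p <- function(doses) {
  t.test(len ~ dose, paired = FALSE,
         data = subset(ToothGrowth, dose %in% doses))$p.value
}

dose_pairs <- list("0.5 vs 1" = c(0.5, 1),
                   "1 vs 2"   = c(1, 2),
                   "0.5 vs 2" = c(0.5, 2))
p_raw <- sapply(dose_pairs, welch_p)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_adj <- p.adjust(p_raw, method = "bonferroni")

signif(cbind(raw = p_raw, bonferroni = p_adj), 4)
```

With p-values as small as those reported above, tripling them leaves all three comparisons well below 0.05.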
Lastly, we present a boxplot of tooth growth by delivery method and dose. It shows that tooth growth increases with dose, but with diminishing returns: mean length roughly doubles when the dose goes from 0.5 to 1.0 (10.6 to 19.7), yet rises by only about a third when the dose doubles again from 1.0 to 2.0 (19.7 to 26.1). This further supports our statistical analysis of the p-values from the t-tests.
boxplot(len~supp*dose, data = ToothGrowth, col=c("blue", "goldenrod"), xlab = "Delivery Method by Dose", ylab = "Tooth Growth (mm)")
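
The pattern in the boxplot can also be checked numerically. The following sketch computes the mean length in each supplement-by-dose cell and the ratio of mean length between consecutive doses; the names `cell_means`, `dose_means`, and `ratios` are illustrative.

```r
data("ToothGrowth")

# Mean tooth length for each supplement-by-dose cell shown in the boxplot
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
cell_means

# Mean length per dose, pooled over supplements
dose_means <- tapply(ToothGrowth$len, ToothGrowth$dose, mean)

# Ratio of each dose's mean to the previous dose's mean
# (each dose level is double the one before it)
ratios <- dose_means[-1] / dose_means[-length(dose_means)]
round(ratios, 2)
```

A ratio near 2 would mean that length doubles along with the dose; ratios below 2 indicate that the gain per doubling shrinks.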

Conclusion

The goal of the exercise was to explore a data set and test for relationships between variables. We did not find a statistically significant difference in tooth growth between the two supplements at the 0.05 level, but we did find a significant relationship between tooth growth and dosage: higher doses were associated with greater growth.

Appendix

ggplot(data = ToothGrowth, aes(x = len)) + geom_density() + xlim(-5,45) + geom_vline(xintercept = median(ToothGrowth$len), size = .75, color = "blue")

descdist(ToothGrowth$len)

## summary statistics
## ------
## min:  4.2   max:  33.9 
## median:  19.25 
## mean:  18.81333 
## estimated sd:  7.649315 
## estimated skewness:  -0.1499519 
## estimated kurtosis:  2.045018
#fit.uniform <- fitdist(ToothGrowth$len, distr = "unif")
#plot(fit.uniform)