Overview

The purpose of this analysis is to compare guinea pig tooth growth by supplement and dose using the ToothGrowth data set. I will follow these steps: 1) load the ToothGrowth data and perform some basic exploratory data analyses; 2) provide a basic summary of the data; 3) use confidence intervals and hypothesis tests to compare tooth growth by supplement and dose; 4) state my conclusions and the assumptions necessary for those conclusions.

Load the ToothGrowth data and perform basic exploratory data analyses

library(lattice)
## Warning: package 'lattice' was built under R version 3.5.3
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
table(ToothGrowth$supp,ToothGrowth$dose)
##     
##      0.5  1  2
##   OJ  10 10 10
##   VC  10 10 10
# One panel per dose; factor() makes each panel strip show the dose level
bwplot(len ~ supp | factor(dose), data = ToothGrowth)

Basic summary of the data

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
aggregate(ToothGrowth$len,list(ToothGrowth$dose,ToothGrowth$supp)
          ,FUN=function(x) c(x_mean = mean(x), x_sd = sd(x)))
##   Group.1 Group.2  x.x_mean    x.x_sd
## 1     0.5      OJ 13.230000  4.459709
## 2     1.0      OJ 22.700000  3.910953
## 3     2.0      OJ 26.060000  2.655058
## 4     0.5      VC  7.980000  2.746634
## 5     1.0      VC 16.770000  2.515309
## 6     2.0      VC 26.140000  4.797731

As you can see, mean tooth length increases with dose for both supplements. Orange juice (OJ) appears more effective than ascorbic acid (VC) when the dose is 0.5 or 1.0 milligrams per day. At 2.0 milligrams per day, the two supplements have nearly identical mean lengths (26.06 vs. 26.14).

Use confidence intervals & hypothesis tests to compare tooth growth by supplement and dose

I test the null hypothesis that the two supplements, orange juice and vitamin C (ascorbic acid), do not differ in their effect on tooth length.

t.test(len ~ supp, data = ToothGrowth)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

The 95% confidence interval includes 0 and the p-value (0.06063) is greater than the 0.05 threshold, so the null hypothesis cannot be rejected.

Now I will compare the doses to each other to determine whether mean tooth length differs between dose levels. Each pair of doses is tested twice: t1 assumes equal variances and t2 does not.

d_5 <- ToothGrowth[which(ToothGrowth$dose==.5),1]
d_10 <- ToothGrowth[which(ToothGrowth$dose==1),1]
d_20 <- ToothGrowth[which(ToothGrowth$dose==2),1]
d_510_t1 <- t.test(d_5, d_10, paired=FALSE, var.equal=TRUE)
d_510_t2 <- t.test(d_5, d_10, paired=FALSE, var.equal=FALSE)
d_510 <- data.frame("p-value"=c(d_510_t1$p.value, d_510_t2$p.value),
                          "Conf-Low"=c(d_510_t1$conf[1],d_510_t2$conf[1]),
                          "Conf-High"=c(d_510_t1$conf[2],d_510_t2$conf[2]),
                           row.names=c("t1","t2"), "Dose"="[0.5..1]")
d_520_t1 <- t.test(d_5, d_20, paired=FALSE, var.equal=TRUE)
d_520_t2 <- t.test(d_5, d_20, paired=FALSE, var.equal=FALSE)
d_520 <- data.frame("p-value"=c(d_520_t1$p.value, d_520_t2$p.value),
                            "Conf-Low"=c(d_520_t1$conf[1],d_520_t2$conf[1]),
                            "Conf-High"=c(d_520_t1$conf[2],d_520_t2$conf[2]), 
                            row.names=c("t1","t2"), "Dose"="[0.5..2]")
d_1020_t1 <- t.test(d_10, d_20, paired=FALSE, var.equal=TRUE)
d_1020_t2 <- t.test(d_10, d_20, paired=FALSE, var.equal=FALSE)
d_1020 <- data.frame("p-value"=c(d_1020_t1$p.value, d_1020_t2$p.value),
                           "Conf-Low"=c(d_1020_t1$conf[1],d_1020_t2$conf[1]),
                           "Conf-High"=c(d_1020_t1$conf[2],d_1020_t2$conf[2]), 
                           row.names=c("t1","t2"), "Dose"="[1..2]")
doseTot <- rbind(d_510,d_520,d_1020)
doseTot
##          p.value   Conf.Low  Conf.High     Dose
## t1  1.266297e-07 -11.983748  -6.276252 [0.5..1]
## t2  1.268301e-07 -11.983781  -6.276219 [0.5..1]
## t11 2.837553e-14 -18.153519 -12.836481 [0.5..2]
## t21 4.397525e-14 -18.156167 -12.833833 [0.5..2]
## t12 1.810829e-05  -8.994387  -3.735613   [1..2]
## t22 1.906430e-05  -8.996481  -3.733519   [1..2]
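
The three pairwise comparisons can also be written more compactly. The sketch below is an alternative, not the code that produced the table above: it runs only the Welch (unequal variance) test for each pair. The helper compare_doses, the column names, and the Bonferroni adjustment are my own additions for illustration.

```r
data(ToothGrowth)

# All unordered pairs of dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(sort(unique(ToothGrowth$dose)), 2, simplify = FALSE)

# Hypothetical helper: Welch t-test for one pair of doses, returned as one row
compare_doses <- function(pair) {
  low  <- ToothGrowth$len[ToothGrowth$dose == pair[1]]
  high <- ToothGrowth$len[ToothGrowth$dose == pair[2]]
  tt   <- t.test(low, high, paired = FALSE, var.equal = FALSE)
  data.frame(dose_low  = pair[1],
             dose_high = pair[2],
             p_value   = tt$p.value,
             conf_low  = tt$conf.int[1],
             conf_high = tt$conf.int[2])
}

dose_welch <- do.call(rbind, lapply(dose_pairs, compare_doses))

# Assumption: adjust for three comparisons (not done in the original analysis)
dose_welch$p_bonferroni <- p.adjust(dose_welch$p_value, method = "bonferroni")
dose_welch
```

Building the pairs with combn avoids repeating the same block for each comparison, and the adjusted p-values show whether the conclusions still hold once the three tests are considered together.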

Conclusions & assumptions

For every pair of doses, the p-value is below 0.05 and the confidence interval does not contain zero, so we can reject the null hypothesis and conclude that dose does affect tooth length. With all doses pooled, the difference between the two supplements is not significant at the 5% level. The group means suggest that orange juice delivers more tooth growth than ascorbic acid at doses of 0.5 and 1.0 mg/day and about the same amount at 2.0 mg/day, but these per-dose statements are read off the summary table and were not formally tested in the analysis above.
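
The per-dose supplement statements could be checked with a separate Welch t-test at each dose. The following is only a minimal sketch and not part of the original analysis; the names by_dose and supp_by_dose are introduced here for illustration.

```r
data(ToothGrowth)

# Split the data into one data frame per dose level ("0.5", "1", "2")
by_dose <- split(ToothGrowth, ToothGrowth$dose)

# Within each dose, test len by supp (OJ vs. VC) and keep the key numbers
supp_by_dose <- do.call(rbind, lapply(names(by_dose), function(d) {
  tt <- t.test(len ~ supp, data = by_dose[[d]])
  data.frame(dose      = as.numeric(d),
             p_value   = tt$p.value,
             conf_low  = tt$conf.int[1],
             conf_high = tt$conf.int[2])
}))
supp_by_dose
```

Each row gives the p-value and 95% confidence interval for the OJ minus VC difference in mean length at one dose, which can be compared directly with the statements above.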

Assumptions