ToothGrowth Exploratory Data Analysis

Author: Wesley

1. Load ToothGrowth data.frame

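library(datasets)
library(dplyr)   # provides %>%, group_by() and summarize() used below
library(ggplot2) # provides ggplot() used below
data(ToothGrowth)
str(ToothGrowth) # Structure of the ToothGrowth data.frame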
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

2. Provide a basic summary of the data

ToothGrowth %>%
    group_by(supp) %>%
    summarize(mean_len  = mean(len),
              mean_dose = mean(dose),
              sd_len    = sd(len),
              sd_dose   = sd(dose))
## Source: local data frame [2 x 5]
## 
##   supp mean_len mean_dose   sd_len   sd_dose
## 1   OJ 20.66333  1.166667 6.605561 0.6342703
## 2   VC 16.96333  1.166667 8.266029 0.6342703
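# stat = "identity" stacks the individual observations, so each bar's
# height is the summed tooth length at that dose, not the mean
ggplot(data = ToothGrowth, aes(as.factor(dose), len, fill = supp)) +
    geom_bar(stat = "identity") +
    facet_grid(. ~ supp) +
    xlab("Dose (mg)") +
    ylab("Tooth length") +
    scale_fill_brewer(name = "Supplement", type = "qual")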

Fig: Tooth length at each dose, compared between the two supplements (OJ and VC)
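Because the bars in the figure stack every observation, their heights are sums rather than averages. The per-group means that the later tests compare can be tabulated directly. The following is a minimal sketch using base R only; the names `cell_means` and `cell_sizes` are introduced here for illustration and are not part of the original analysis.

```r
library(datasets)

# Mean tooth length for every supplement/dose combination
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
names(cell_means)[names(cell_means) == "len"] <- "mean_len"

# Number of observations in each combination (same row order as above)
cell_sizes <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = length)
cell_means$n <- cell_sizes$len

print(cell_means)
```

Each of the six supplement/dose groups should contain the same number of observations, which makes the group means directly comparable.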

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose

# split the data up by dosages
d0.5 <- subset(ToothGrowth, dose == 0.5)
d1.0 <- subset(ToothGrowth, dose == 1.0)
d2.0 <- subset(ToothGrowth, dose == 2.0)

# conduct a Welch two-sample t test between supplements at each dosage
# (unpaired is the default for the formula interface, so `paired` is omitted;
# recent R releases reject that argument when a formula is used)
test0.5 <- t.test(len ~ supp, var.equal = FALSE, data = d0.5)
test1.0 <- t.test(len ~ supp, var.equal = FALSE, data = d1.0)
test2.0 <- t.test(len ~ supp, var.equal = FALSE, data = d2.0)

Dosage  P-value    95% Confidence Interval (OJ - VC)  Significant (P-value < 0.05)
0.5     0.0063586  [ 1.7190573, 8.7809427]            Yes
1.0     0.0010384  [ 2.8021482, 9.0578518]            Yes
2.0     0.9638516  [-3.7980705, 3.6380705]            No
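The rows of the table above can be assembled from the `t.test` results rather than copied by hand. The sketch below is one way to do so, assuming the same Welch test at each dosage; the helper `welch_row` and the `results` data frame are hypothetical names introduced for this example.

```r
library(datasets)

# Run the Welch two-sample t test (OJ vs VC) at one dosage and
# return a one-row data frame with the quantities shown in the table.
welch_row <- function(d) {
  tt <- t.test(len ~ supp,
               data = ToothGrowth[ToothGrowth$dose == d, ],
               var.equal = FALSE)
  data.frame(dose        = d,
             p_value     = tt$p.value,
             ci_lower    = tt$conf.int[1],  # lower bound for mean(OJ) - mean(VC)
             ci_upper    = tt$conf.int[2],  # upper bound for mean(OJ) - mean(VC)
             significant = tt$p.value < 0.05)
}

# One row per dosage, stacked into a single table
results <- do.call(rbind, lapply(c(0.5, 1.0, 2.0), welch_row))
print(results)
```

Keeping the test in one function means the three dosages are guaranteed to be analysed with identical settings.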

Assumptions

  • Absence of confounding factors
  • Samples are unpaired, with unequal variances.
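To see what the unequal-variance assumption changes in practice, the Welch test can be compared with the pooled-variance test at a single dosage. This is only an illustrative sketch; the variable names `d_low`, `welch` and `pooled` are assumptions made for this example, and the pooled test is not part of the original analysis.

```r
library(datasets)

# Observations at the lowest dosage only
d_low <- subset(ToothGrowth, dose == 0.5)

# Welch test (unequal variances) versus pooled-variance Student test
welch  <- t.test(len ~ supp, data = d_low, var.equal = FALSE)
pooled <- t.test(len ~ supp, data = d_low, var.equal = TRUE)

# With equal group sizes the t statistic is the same for both tests;
# the Welch test differs by using fewer degrees of freedom.
comparison <- data.frame(
  test      = c("Welch", "Pooled"),
  statistic = unname(c(welch$statistic, pooled$statistic)),
  df        = unname(c(welch$parameter, pooled$parameter)),
  p_value   = c(welch$p.value, pooled$p.value)
)
print(comparison)
```

The Welch version is the more cautious choice when there is no reason to believe the two supplement groups share a common variance.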

4. Conclusion

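Based on the t tests, the two supplements VC and OJ led to significantly different tooth growth at dosages 0.5 and 1.0 (P-value < 0.05). Both confidence intervals for the OJ minus VC difference lie above zero, so OJ was associated with greater growth at these dosages. At dosage 2.0, however, there is no significant difference: the confidence interval contains zero.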