Overview

This report investigates the effect of vitamin C on tooth growth in guinea pigs. The data come from the ToothGrowth dataset, which is included in the datasets package in R.

Load the required libraries and the ToothGrowth dataset.

library(ggplot2)
library(datasets)
library(dplyr)
data(ToothGrowth)

Examine the data.

str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

Dose takes only three distinct values (0.5, 1, and 2), so it is treated as a factor for grouping rather than as a continuous numeric variable.

ToothGrowth$dose <- factor(ToothGrowth$dose)
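
As a quick sanity check on the conversion, the sketch below works on a fresh copy of the data (the name tg is an illustrative assumption, not part of the original analysis) and confirms that the factor has the three expected levels with equal group sizes.

```r
# Work on a fresh copy so the check does not depend on earlier modifications.
tg <- datasets::ToothGrowth

# dose is stored as a numeric column in the packaged dataset.
is.numeric(tg$dose)

# Convert to a factor; the levels are the sorted unique dose values.
tg$dose <- factor(tg$dose)
levels(tg$dose)

# Each dose level should contain the same number of observations.
table(tg$dose)
```

If numeric dose values are needed again later, as.numeric(as.character(tg$dose)) recovers them; calling as.numeric() directly on the factor would return the level codes 1, 2, 3 instead.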

View a basic summary of the data.

summary(ToothGrowth)
##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90

Create boxplots that show tooth length by dosage and supplement method.

ggplot(ToothGrowth, aes(x = dose, y = len)) + geom_boxplot() +
  xlab("dose") +
  ylab("tooth length")

ggplot(ToothGrowth, aes(x = supp, y = len)) + geom_boxplot() +
  xlab("supplement type") +
  ylab("tooth length")

The first boxplot (dose on the x-axis) suggests that tooth length increases with dose. The second boxplot (supplement type on the x-axis) is much less conclusive, so a formal statistical test is needed to assess the effect of supplement type on tooth length.
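
The visual impression from the boxplots can also be checked numerically. The following is a minimal sketch using base R's tapply(); the object names (tg, mean_by_dose, mean_by_supp, mean_by_cell) are illustrative assumptions and do not appear in the original analysis.

```r
# Fresh copy of the data with dose as a factor.
tg <- datasets::ToothGrowth
tg$dose <- factor(tg$dose)

# Mean tooth length at each dose level.
mean_by_dose <- tapply(tg$len, tg$dose, mean)

# Mean tooth length for each supplement type.
mean_by_supp <- tapply(tg$len, tg$supp, mean)

# Mean tooth length for every supplement/dose combination.
mean_by_cell <- tapply(tg$len, list(supp = tg$supp, dose = tg$dose), mean)

print(round(mean_by_dose, 2))
print(round(mean_by_supp, 2))
print(round(mean_by_cell, 2))
```

Group means alone do not establish significance; they only summarize the pattern that the plots show.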

Statistical analysis of supplement type on tooth length

To determine whether supplement type has a statistically significant effect on tooth growth, a two-sample t-test is performed. First, a vector of tooth lengths is created for each supplement type.

oj_len <- filter(ToothGrowth, supp == "OJ")$len
vc_len <- filter(ToothGrowth, supp == "VC")$len

The t-test is run on the two vectors. By default, t.test() performs Welch's test, which does not assume equal variances in the two groups.

t.test(oj_len, vc_len)
## 
##  Welch Two Sample t-test
## 
## data:  oj_len and vc_len
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

The p-value (0.0606) is greater than 0.05, and the 95 percent confidence interval for the difference in means contains zero. Both indicate that the null hypothesis of equal means cannot be rejected at the 5% significance level. Based on these data, with all doses combined, there is no statistically significant evidence that supplement type affects tooth growth.
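
To make the t-test output less of a black box, the Welch statistic, its degrees of freedom (the Welch-Satterthwaite approximation), the two-sided p-value, and the confidence interval can be reproduced by hand. This is a minimal sketch; the variable names (oj, vc, se2_oj, se2_vc, and so on) are illustrative assumptions rather than part of the original analysis.

```r
# Tooth lengths for each supplement type, taken from a fresh copy of the data.
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Squared standard error of each group mean: s^2 / n.
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Welch t statistic: difference in means over the combined standard error.
mean_diff <- mean(oj) - mean(vc)
se_diff <- sqrt(se2_oj + se2_vc)
t_stat <- mean_diff / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference in means.
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- mean_diff + c(-1, 1) * qt(0.975, df_welch) * se_diff

# These values should agree with the t.test() output shown earlier.
print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

The same test can also be requested with the formula interface, t.test(len ~ supp, data = ToothGrowth), which avoids building the two vectors explicitly.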