Tooth Growth Statistical Tests

Overview

The purpose of this analysis is to compare tooth growth (len) across supplement type (supp) and dose. We first take a brief look at the data, then run a few hypothesis tests, and finally offer conclusions.

Section 1 - Load and Explore the data

#install.packages("gmodels")
library(ggplot2)
library(knitr)
library(gmodels)
library(dplyr)

data("ToothGrowth")
data <- as.data.frame(ToothGrowth)

str(data)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

table(data$dose)

## 
## 0.5   1   2 
##  20  20  20

data$dose <- factor(data$dose)

So far we have seen what the data looks like. It is a small dataset, with only 60 observations of 3 variables. The supp and dose variables appear to be evenly distributed grouping variables (dose takes just three values, with 20 observations each), and len is the measurement recorded for each combination of the two. After reviewing the dose column, I converted it to a factor with 3 levels.
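As a quick visual check of the structure described above, the following is a minimal sketch of a box plot of len for every supp and dose combination. It uses base R's boxplot() to stay dependency-free; the names tg and bp are illustrative and not part of the original analysis.

```r
# Self-contained copy of the data so the sketch runs on its own
tg <- as.data.frame(datasets::ToothGrowth)
tg$dose <- factor(tg$dose)

# One box per supp/dose combination (OJ.0.5, VC.0.5, OJ.1, ...)
bp <- boxplot(len ~ supp + dose, data = tg,
              xlab = "Supplement and dose", ylab = "Tooth length (len)")

# boxplot() invisibly returns the statistics it drew, including group sizes
print(bp$n)
```

The returned object also carries the group labels in bp$names, which can help when checking that each box corresponds to the intended combination.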

Section 2 - Summary of Data

summary(data)

##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90

The summary confirms that supp and dose can be treated as factors, and that len ranges from 4.20 to 33.90. The cross tabulation below shows how the observations are distributed across dose and supp.

CrossTable(data$supp, data$dose)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  60 
## 
##  
##              | data$dose 
##    data$supp |       0.5 |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|-----------|
##           OJ |        10 |        10 |        10 |        30 | 
##              |     0.000 |     0.000 |     0.000 |           | 
##              |     0.333 |     0.333 |     0.333 |     0.500 | 
##              |     0.500 |     0.500 |     0.500 |           | 
##              |     0.167 |     0.167 |     0.167 |           | 
## -------------|-----------|-----------|-----------|-----------|
##           VC |        10 |        10 |        10 |        30 | 
##              |     0.000 |     0.000 |     0.000 |           | 
##              |     0.333 |     0.333 |     0.333 |     0.500 | 
##              |     0.500 |     0.500 |     0.500 |           | 
##              |     0.167 |     0.167 |     0.167 |           | 
## -------------|-----------|-----------|-----------|-----------|
## Column Total |        20 |        20 |        20 |        60 | 
##              |     0.333 |     0.333 |     0.333 |           | 
## -------------|-----------|-----------|-----------|-----------|
## 
##
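The cross tabulation shows a balanced design, but not how len varies across the cells. A minimal sketch of a per-cell mean, assuming the same ToothGrowth data, could look like this (tg and group_means are illustrative names, not part of the original analysis):

```r
# Self-contained copy of the data so the sketch runs on its own
tg <- as.data.frame(datasets::ToothGrowth)
tg$dose <- factor(tg$dose)

# Mean tooth length for each supp/dose combination (one row per cell)
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(group_means)
```

Because every cell has the same number of observations, the average of these six cell means equals the overall mean of len.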

Section 3 - Tests and Confidence Intervals

H0 - For each comparison, there is no difference in mean tooth length (len) between the two groups being compared.

doseA <- subset(data, dose %in% c(0.5, 1.0))
doseB <- subset(data, dose %in% c(0.5, 2.0))
doseC <- subset(data, dose %in% c(1.0, 2.0))

# Welch two-sample t-tests (unequal variances); the groups are unpaired, which is t.test's default
test_supp <- t.test(len ~ supp, var.equal = FALSE, data = data)
test_doseA <- t.test(len ~ dose, var.equal = FALSE, data = doseA)
test_doseB <- t.test(len ~ dose, var.equal = FALSE, data = doseB)
test_doseC <- t.test(len ~ dose, var.equal = FALSE, data = doseC)
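The calls above do not assume equal variances, so they are Welch two-sample t-tests. To make explicit what is being computed, here is a hedged sketch that rebuilds the supp comparison by hand and checks it against t.test(). The variable names are illustrative, and the 95% level is t.test's default rather than something stated in the analysis.

```r
# Self-contained copy of the data so the sketch runs on its own
tg <- as.data.frame(datasets::ToothGrowth)
x <- tg$len[tg$supp == "OJ"]
y <- tg$len[tg$supp == "VC"]

# Standard error of the difference in means, without pooling the variances
vx <- var(x) / length(x)
vy <- var(y) / length(y)
se <- sqrt(vx + vy)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(x) - mean(y)) / se
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_stat), df)
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df) * se

# Reference result from the built-in test for comparison
ref <- t.test(x, y, var.equal = FALSE)
print(c(manual = p_value, t.test = ref$p.value))
```

The interval is for mean(OJ) minus mean(VC), which matches the order t.test() uses for the factor levels of supp.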

kable(data.frame("p-value" = c(test_supp$p.value, test_doseA$p.value, test_doseB$p.value, test_doseC$p.value),
  "Lower Limit" = c(test_supp$conf.int[1], test_doseA$conf.int[1], test_doseB$conf.int[1], test_doseC$conf.int[1]),
  "Upper Limit" = c(test_supp$conf.int[2], test_doseA$conf.int[2], test_doseB$conf.int[2], test_doseC$conf.int[2]),
  row.names = c("supp", "dose 0.5 & 1.0", "dose 0.5 & 2.0", "dose 1.0 & 2.0")
))

	p.value	Lower.Limit	Upper.Limit
supp	0.0606345	-0.1710156	7.571016
dose 0.5 & 1.0	0.0000001	-11.9837813	-6.276219
dose 0.5 & 2.0	0.0000000	-18.1561665	-12.833834
dose 1.0 & 2.0	0.0000191	-8.9964805	-3.733519
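Since four tests are run on the same data, one optional follow-up is to adjust the p-values for multiple comparisons. The sketch below is not part of the original analysis; it assumes Holm's method via p.adjust() as one reasonable choice, and pair_p is a hypothetical helper introduced only for illustration.

```r
# Self-contained copy of the data so the sketch runs on its own
tg <- as.data.frame(datasets::ToothGrowth)

# Hypothetical helper: Welch t-test p-value for one pair of dose levels
pair_p <- function(a, b) {
  sub <- tg[tg$dose %in% c(a, b), ]
  sub$dose <- factor(sub$dose)
  t.test(len ~ dose, data = sub, var.equal = FALSE)$p.value
}

# Unadjusted p-values for the same four comparisons
p_raw <- c(
  "supp"           = t.test(len ~ supp, data = tg, var.equal = FALSE)$p.value,
  "dose 0.5 & 1.0" = pair_p(0.5, 1),
  "dose 0.5 & 2.0" = pair_p(0.5, 2),
  "dose 1.0 & 2.0" = pair_p(1, 2)
)

# Holm adjustment controls the family-wise error rate across the four tests
p_holm <- p.adjust(p_raw, method = "holm")
print(cbind(raw = p_raw, holm = p_holm))
```

Holm's method never lowers a p-value, so any comparison that is not significant before adjustment remains so afterwards.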

Section 4 - Conclusion