This is a basic analysis of the ToothGrowth data in base R. The original description of the data is reproduced below:
The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
This was originally sourced from C. I. Bliss (1952) The Statistics of Bioassay. Academic Press.
Each figure and output will have a number next to it that references its code chunk in the appendix.
First, let’s see what the data generally look like. Then let’s create a violin plot so we can see how the tooth growth is distributed among different factor levels.
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
The “len” variable varies noticeably across the dataset, from 4.20 to 33.90. The violin plots break this variability down by dosage. They show that under both VC and OJ, odontoblast length increased with dosage. The increase under VC was less variable overall and more linear.
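The pattern read off the violin plots can be cross-checked numerically. The sketch below is not part of the original analysis: it tabulates the mean and standard deviation of `len` for each supplement and dose with base R's `aggregate()`. The name `group_stats` is made up for illustration.

```r
# Illustrative cross-check (not part of the original analysis):
# mean and standard deviation of len for each supplement/dose combination.
data(ToothGrowth)

group_stats <- aggregate(
  len ~ supp + dose,
  data = ToothGrowth,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# One row per supplement/dose cell; the len column holds both statistics
print(group_stats)
```

If the means rise with dose within each supplement, the table agrees with the plots; the standard deviations give a rough sense of how the spread differs between VC and OJ.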
Let’s test whether the increases in odontoblast length are statistically significant for both ascorbic acid and orange juice. Since the sample sizes are quite small, the results of these tests should be interpreted with caution.
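To make the sample-size caveat concrete, the following sketch (an illustration, not part of the original analysis) counts the animals in each supplement and dose combination. The name `cell_counts` is hypothetical.

```r
# Illustrative only: number of guinea pigs per supplement/dose cell.
data(ToothGrowth)

cell_counts <- with(ToothGrowth, table(supp, dose))
print(cell_counts)
```

Each cell holds 10 animals, so every pairwise test compares two groups of 10 observations.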
| Parameters | T.Statistic | P.Value | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|
| VC 1mg - 0.5mg | 7.46 | 0.0000 | 6.31 | 11.27 |
| VC 2mg - 1mg | 5.47 | 0.0001 | 5.69 | 13.05 |
| VC 2mg - 0.5mg | 10.39 | 0.0000 | 14.42 | 21.90 |
| OJ 1mg - 0.5mg | 5.05 | 0.0001 | 5.52 | 13.42 |
| OJ 2mg - 1mg | 2.25 | 0.0392 | 0.19 | 6.53 |
| OJ 2mg - 0.5mg | 7.82 | 0.0000 | 9.32 | 16.34 |
Above is a table of six Welch two-sample t-tests, which do not assume equal variances. For each supplement, the difference in mean odontoblast length was tested between each pair of dose levels, and a 95% confidence interval was computed for the difference between the two group means. Because equal variances cannot be verified with samples this small, the variances were not pooled.
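For readers who want to see what an unequal-variance test computes, here is a minimal sketch that reproduces one comparison (VC, 1 mg versus 0.5 mg) by hand and compares it with `t.test()`. It is an illustration rather than the code used for the table; the names `x`, `y`, `se2_x`, `se2_y`, `t_manual`, `df_manual`, `ci_manual`, and `welch` are made up for this example.

```r
# Illustrative only: reproduce one Welch test by hand (VC, 1 mg vs 0.5 mg).
data(ToothGrowth)
x <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 1]
y <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]

# Squared standard errors of the two group means (variances are not pooled)
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic: difference in means over the unpooled standard error
t_manual <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# 95% confidence interval for the difference in means
ci_manual <- (mean(x) - mean(y)) +
  c(-1, 1) * qt(0.975, df_manual) * sqrt(se2_x + se2_y)

# Built-in equivalent for comparison
welch <- t.test(x, y, var.equal = FALSE)

print(c(t = t_manual, df = df_manual))
print(ci_manual)
print(welch)
```

The hand-computed statistic and interval should agree with the `t.test()` output and with the first row of the table. Passing the higher dose as `x` gives the difference as high dose minus low dose directly, so no sign flip is needed here.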
All six p-values are below 0.05, so there is evidence that increasing the dose of VC or OJ increased mean odontoblast length. The evidence is strong for five of the comparisons and weakest for OJ 2mg - 1mg (p = 0.0392), whose confidence interval has a lower bound of only 0.19. Zero, the value corresponding to no difference in means, falls outside every confidence interval. The main assumptions are that the animals were properly randomized to groups and that no confounding variables caused the increases in length. The observations within each of the two samples in each test are assumed to be independent and identically distributed.
# Load the data and print summary statistics
library(datasets)
data(ToothGrowth)
summary(ToothGrowth)
# Install vioplot only if it is missing, then load it quietly
if (!requireNamespace("vioplot", quietly = TRUE)) install.packages("vioplot")
suppressWarnings(suppressMessages(library(vioplot)))
# Violin plots of length by dose, one panel per supplement
par(mfrow = c(1, 2))
vc <- ToothGrowth[ToothGrowth$supp == "VC", ]
oj <- ToothGrowth[ToothGrowth$supp == "OJ", ]
vioplot(vc$len[vc$dose == 0.5], vc$len[vc$dose == 1], vc$len[vc$dose == 2],
        names = c("0.5mg", "1.0mg", "2.0mg"), col = "blue")
title("Ascorbic Acid (VC)", ylab = "Odontoblast Length", xlab = "Dosage")
vioplot(oj$len[oj$dose == 0.5], oj$len[oj$dose == 1], oj$len[oj$dose == 2],
        names = c("0.5mg", "1.0mg", "2.0mg"), col = "red")
title("Orange Juice (OJ)", ylab = "Odontoblast Length", xlab = "Dosage")
# Each group keeps exactly two dose levels for one supplement:
# dose <= 1 keeps 0.5 and 1, dose >= 1 keeps 1 and 2, dose != 1 keeps 0.5 and 2
group1 <- ToothGrowth[ToothGrowth$supp == "VC" & ToothGrowth$dose <= 1, ]
group2 <- ToothGrowth[ToothGrowth$supp == "VC" & ToothGrowth$dose >= 1, ]
group3 <- ToothGrowth[ToothGrowth$supp == "VC" & ToothGrowth$dose != 1, ]
group4 <- ToothGrowth[ToothGrowth$supp == "OJ" & ToothGrowth$dose <= 1, ]
group5 <- ToothGrowth[ToothGrowth$supp == "OJ" & ToothGrowth$dose >= 1, ]
group6 <- ToothGrowth[ToothGrowth$supp == "OJ" & ToothGrowth$dose != 1, ]
# Welch two-sample t-tests (variances not pooled). The paired argument is omitted:
# unpaired is the default, and newer R releases reject paired in the formula interface.
test1 <- t.test(len ~ dose, var.equal = FALSE, data = group1)
test2 <- t.test(len ~ dose, var.equal = FALSE, data = group2)
test3 <- t.test(len ~ dose, var.equal = FALSE, data = group3)
test4 <- t.test(len ~ dose, var.equal = FALSE, data = group4)
test5 <- t.test(len ~ dose, var.equal = FALSE, data = group5)
test6 <- t.test(len ~ dose, var.equal = FALSE, data = group6)
Parameters<-c("VC 1mg - 0.5mg", "VC 2mg - 1mg", "VC 2mg - 0.5mg","OJ 1mg - 0.5mg", "OJ 2mg - 1mg", "OJ 2mg - 0.5mg")
# t.test() reports low dose minus high dose, so the signs are flipped
# to express each difference as high dose minus low dose
tests <- list(test1, test2, test3, test4, test5, test6)
T.Statistic <- round(sapply(tests, function(res) -unname(res$statistic)), digits = 2)
P.Value <- round(sapply(tests, function(res) res$p.value), digits = 4)
Confidence.Interval <- round(t(sapply(tests, function(res) sort(-res$conf.int[1:2]))), digits = 2)
Test.Results<-data.frame(Parameters,T.Statistic,P.Value,Confidence.Interval)
colnames(Test.Results)[4:5]<-c("Lower 95% CI", "Upper 95% CI")
# Install knitr only if it is missing, then render the results as a table
if (!requireNamespace("knitr", quietly = TRUE)) install.packages("knitr")
library(knitr)
kable(Test.Results)