In this report we explore the ToothGrowth data in the R datasets package and use hypothesis tests to compare tooth growth by supplement type (supp) and dose.
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
unique(ToothGrowth$dose)
## [1] 0.5 1.0 2.0
From the basic exploratory analysis above and the references listed in the appendix below, we see that there are:
* 60 observations, one from each of 60 guinea pigs.
* 2 supplement types: OJ (orange juice) and VC (ascorbic acid), with 30 observations each.
* 3 dose levels: 0.5, 1.0, and 2.0 milligrams per day.
For further exploratory analysis with plots, see the appendix below.
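The counts listed above can be checked directly with a cross-tabulation. The following is a minimal sketch using base R only; the names `cell_counts` and `cell_means` are illustrative and do not appear in the original analysis.

```r
data(ToothGrowth)

# Number of observations in each supplement-by-dose cell
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)

# Mean tooth length in each supplement-by-dose cell
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

Each of the six cells holds 10 observations, so every test in this report compares groups of equal size.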
# Test: OJ vs. VC
t <- t.test(len ~ supp, data = ToothGrowth, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 0.06063451
t$conf.int[1:2]
## [1] -0.1710156 7.5710156
Result: With a p-value of 0.0606 (6.06%) and a 95% confidence interval of (-0.171, 7.571) for mean(OJ) - mean(VC), which contains zero, we fail to reject the null hypothesis at the 5% level that mean tooth length is the same for the two supplement types.
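To make explicit what `var.equal = FALSE` does, the sketch below recomputes the Welch t statistic, the Welch-Satterthwaite degrees of freedom, the p-value, and the confidence interval by hand, then compares them with `t.test()`. The variable names (`oj`, `vc`, `se2_oj`, `df_welch`, and so on) are illustrative assumptions, not part of the original analysis.

```r
data(ToothGrowth)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean (variances are not pooled)
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch t statistic for mean(OJ) - mean(VC)
t_stat <- (mean(oj) - mean(vc)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval
p_manual  <- 2 * pt(-abs(t_stat), df = df_welch)
ci_manual <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se_diff

# Reference result from t.test() for comparison
ref <- t.test(len ~ supp, data = ToothGrowth, paired = FALSE, var.equal = FALSE)
print(all.equal(p_manual, ref$p.value))
print(all.equal(ci_manual, ref$conf.int[1:2]))
```

Both comparisons should agree up to floating-point tolerance, since `t.test()` applies the same formulas when `var.equal = FALSE`.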
# Test: Dose level comparison - 0.5 vs. 2.0
doseL1L3 <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
t <- t.test(len ~ dose, data = doseL1L3, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 4.397525e-14
t$conf.int[1:2]
## [1] -18.15617 -12.83383
Result: With a p-value of 4.4e-14, far below 5%, and a 95% confidence interval of (-18.156, -12.834) for mean(0.5) - mean(2.0), which excludes zero, we reject the null hypothesis that mean tooth length is the same at the two dose levels. The 2.0 mg dose results in significantly longer teeth than the 0.5 mg dose.
# Test: Dose level comparison - 1.0 vs. 2.0
doseL2L3 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))
t <- t.test(len ~ dose, data = doseL2L3, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 1.90643e-05
t$conf.int[1:2]
## [1] -8.996481 -3.733519
Result: With a p-value of 1.9e-05, far below 5%, and a 95% confidence interval of (-8.996, -3.734) for mean(1.0) - mean(2.0), which excludes zero, we reject the null hypothesis that mean tooth length is the same at the two dose levels. The 2.0 mg dose results in significantly longer teeth than the 1.0 mg dose.
# Test: Dose level comparison - 0.5 vs. 1.0
doseL1L2 <- subset(ToothGrowth, dose %in% c(0.5, 1.0))
t <- t.test(len ~ dose, data = doseL1L2, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 1.268301e-07
t$conf.int[1:2]
## [1] -11.983781 -6.276219
Result: With a p-value of 1.27e-07, far below 5%, and a 95% confidence interval of (-11.984, -6.276) for mean(0.5) - mean(1.0), which excludes zero, we reject the null hypothesis that mean tooth length is the same at the two dose levels. The 1.0 mg dose results in significantly longer teeth than the 0.5 mg dose.
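The three pairwise dose comparisons above repeat the same steps, so they can also be run in one loop. This is a minimal sketch, assuming base R only; `dose_pairs`, `pair_data`, and `dose_results` are hypothetical names introduced for illustration.

```r
data(ToothGrowth)

# All pairs of dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(sort(unique(ToothGrowth$dose)), 2, simplify = FALSE)

# Run a Welch t-test for each pair and collect the results in a data frame
dose_results <- do.call(rbind, lapply(dose_pairs, function(pair) {
  pair_data <- subset(ToothGrowth, dose %in% pair)
  tt <- t.test(len ~ dose, data = pair_data, paired = FALSE, var.equal = FALSE)
  data.frame(low_dose  = pair[1],
             high_dose = pair[2],
             p_value   = tt$p.value,
             ci_lower  = tt$conf.int[1],
             ci_upper  = tt$conf.int[2])
}))

print(dose_results)
```

Each row's confidence interval is for mean(low_dose) - mean(high_dose), matching the sign convention of the individual tests above.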
# Test: OJ vs. VC on dose at 0.5 mg
doseL1 <- subset(ToothGrowth, dose == 0.5)
t <- t.test(len ~ supp, data = doseL1, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 0.006358607
t$conf.int[1:2]
## [1] 1.719057 8.780943
Result: With a p-value of 0.0064, below 5%, and a 95% confidence interval of (1.719, 8.781) for mean(OJ) - mean(VC), which excludes zero, we reject the null hypothesis that mean tooth length is the same for the two supplement types. At the 0.5 mg dose, OJ results in significantly longer teeth than VC.
# Test: OJ vs. VC on dose at 1.0 mg
doseL2 <- subset(ToothGrowth, dose == 1.0)
t <- t.test(len ~ supp, data = doseL2, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 0.001038376
t$conf.int[1:2]
## [1] 2.802148 9.057852
Result: With a p-value of 0.0010, below 5%, and a 95% confidence interval of (2.802, 9.058) for mean(OJ) - mean(VC), which excludes zero, we reject the null hypothesis that mean tooth length is the same for the two supplement types. At the 1.0 mg dose, OJ results in significantly longer teeth than VC.
# Test: OJ vs. VC on dose at 2.0 mg
doseL3 <- subset(ToothGrowth, dose == 2.0)
t <- t.test(len ~ supp, data = doseL3, paired = FALSE, var.equal = FALSE)
t$p.value
## [1] 0.9638516
t$conf.int[1:2]
## [1] -3.79807 3.63807
Result: With a p-value of 0.964 (96.39%) and a 95% confidence interval of (-3.798, 3.638) for mean(OJ) - mean(VC), which contains zero, we fail to reject the null hypothesis that mean tooth length is the same for the two supplement types. At the 2.0 mg dose, there is no significant difference in tooth length between OJ and VC.
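The three per-dose supplement comparisons can be gathered into one table in the same way. The sketch below also adds a Holm-adjusted p-value column as an optional check that is not part of the original analysis; `supp_by_dose` and `p_holm` are hypothetical names.

```r
data(ToothGrowth)

# One Welch t-test of OJ vs. VC within each dose level
supp_by_dose <- do.call(rbind, lapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  tt <- t.test(len ~ supp, data = d, paired = FALSE, var.equal = FALSE)
  data.frame(dose     = d$dose[1],
             p_value  = tt$p.value,
             ci_lower = tt$conf.int[1],
             ci_upper = tt$conf.int[2])
}))

# Optional (assumption, not in the original report): Holm adjustment
# for having run three tests on the same data set
supp_by_dose$p_holm <- p.adjust(supp_by_dose$p_value, method = "holm")

print(supp_by_dose)
```

Under this adjustment the conclusions at the 5% level stay the same: the OJ vs. VC difference remains significant at 0.5 mg and 1.0 mg and not significant at 2.0 mg.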
Appendix
References:
* R documentation of ToothGrowth data.
* Description improvement of ToothGrowth data.
# Tooth length by supp type
library(ggplot2)  # ggplot() and the geoms below come from the ggplot2 package
ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  labs(title = "Guinea Pig Tooth Growth Analysis",
       x = "Supp Type - Orange Juice (OJ) vs. Ascorbic Acid (VC)",
       y = "Length of Tooth") +
  scale_fill_discrete(name = "Supp Type")
# Tooth length by dose (dose is converted to a factor so each level gets its own box)
ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = factor(dose))) +
  geom_boxplot() +
  labs(title = "Guinea Pig Tooth Growth Analysis",
       x = "Dose (mg/day)", y = "Length of Tooth") +
  scale_fill_discrete(name = "Dose")
# Tooth length by supp type and dose (one panel per dose level)
ggplot(data = ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_grid(. ~ dose) +
  labs(title = "Guinea Pig Tooth Growth Analysis\nGrouped by Dose (mg/day)",
       x = "Supp Type - Orange Juice (OJ) vs. Ascorbic Acid (VC)",
       y = "Length of Tooth") +
  scale_fill_discrete(name = "Supp Type")