We’re going to analyze the ToothGrowth data in the R
datasets package. The response is the length of odontoblasts (cells
responsible for tooth growth) in 60 guinea pigs. Each animal received
one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of
two delivery methods, orange juice or ascorbic acid (a form of vitamin C
and coded as VC).
A data frame with 60 observations on 3 variables.
Source: C. I. Bliss (1952). The Statistics of Bioassay. Academic Press.
library("ggplot2")
data("ToothGrowth")
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
# Convert the dose column from numeric to factor
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
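# geom_col() stacks the individual observations, so each bar's height is the
# summed tooth length of the 10 animals in that dose/supplement group
g <- ggplot(ToothGrowth, aes(x = dose, y = len, fill = supp))
g <- g + geom_col()
g <- g + labs(title = "Length of Tooth by Doses for Both Supplements",
              x = "Doses", y = "Tooth Length", fill = "Supplements")
g <- g + facet_grid(~supp, scales = "free")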
g
The plot suggests that vitamin C delivered through orange juice is more effective at the lower doses: at 0.5 and 1 mg/day, tooth length is greater with orange juice than with ascorbic acid.
At 2 mg/day, however, tooth length appears to be about the same regardless of the delivery method.
These are only initial impressions. We need hypothesis tests to check whether they hold up.
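Before testing, it can help to put numbers next to the visual impressions. The following is a minimal sketch, not part of the original analysis: it tabulates the mean tooth length for each supplement/dose group with `aggregate()`. The names `tg`, `group_means`, and `mean_gap` are illustrative, and the code works on a fresh copy of the data so the `ToothGrowth` object used elsewhere is left untouched.

```r
# Work on a local copy so the ToothGrowth object used elsewhere is untouched.
tg <- datasets::ToothGrowth

# Mean tooth length for each supplement/dose combination (10 animals per cell).
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(group_means)

# Difference in group means (OJ minus VC) at each dose level.
# aggregate() returns rows ordered by dose, so both vectors run 0.5, 1, 2.
oj <- group_means$len[group_means$supp == "OJ"]
vc <- group_means$len[group_means$supp == "VC"]
mean_gap <- data.frame(dose = sort(unique(tg$dose)), oj_minus_vc = oj - vc)
print(mean_gap)
```

A positive `oj_minus_vc` means orange juice has the higher group mean at that dose. These are descriptive summaries only and say nothing yet about statistical significance.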
Our null hypothesis is that mean tooth length is the same for both delivery methods (VC and OJ); the alternative hypothesis is that mean tooth length differs between them.
t.test(len ~ supp, data = ToothGrowth)$conf.int
## [1] -0.1710156 7.5710156
## attr(,"conf.level")
## [1] 0.95
t.test(len ~ supp, data = ToothGrowth)$p.value
## [1] 0.06063451
Since the p-value (0.0606) is slightly greater than the
significance level of 0.05 and the 95% confidence
interval contains zero, there is not enough evidence to reject
the null hypothesis.
We therefore fail to reject the null hypothesis that the delivery method has no effect on tooth length. This does not prove that the two methods are equivalent; it only means that this test, which pools all dose levels, did not detect a significant difference.
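To make the link between the p-value and the confidence interval concrete, here is a minimal sketch that reproduces the default (Welch, unequal-variance) two-sample test by hand. It assumes `t.test()` was called with its defaults, as above; the variable names (`tg`, `se`, `welch_df`, and so on) are illustrative and not from the original analysis.

```r
# Local copy of the data, split by delivery method.
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Standard error of the difference in means, without assuming equal variances.
se <- sqrt(var(oj) / length(oj) + var(vc) / length(vc))
t_stat <- (mean(oj) - mean(vc)) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df <- se^4 / ((var(oj) / length(oj))^2 / (length(oj) - 1) +
                    (var(vc) / length(vc))^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for mean(OJ) - mean(VC).
p_value <- 2 * pt(-abs(t_stat), welch_df)
conf_int <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, welch_df) * se

print(c(t = t_stat, df = welch_df, p = p_value))
print(conf_int)
```

The interval is the observed difference plus or minus a margin of error, so it contains zero exactly when the two-sided p-value exceeds 0.05. That is why the two criteria used above always agree.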
Next we compare dose levels. Our null hypothesis here is that mean tooth length is the same at two different dose levels; the alternative hypothesis is that mean tooth length differs between the two doses. We test each pair of dose levels in turn.
dose0.5_1.0 <- subset(ToothGrowth, dose %in% c(0.5, 1.0))
t.test(len ~ dose, data = dose0.5_1.0)$conf.int
## [1] -11.983781 -6.276219
## attr(,"conf.level")
## [1] 0.95
t.test(len ~ dose, data = dose0.5_1.0)$p.value
## [1] 1.268301e-07
dose0.5_2.0 <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
t.test(len ~ dose, data = dose0.5_2.0)$conf.int
## [1] -18.15617 -12.83383
## attr(,"conf.level")
## [1] 0.95
t.test(len ~ dose, data = dose0.5_2.0)$p.value
## [1] 4.397525e-14
dose1.0_2.0 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))
t.test(len ~ dose, data = dose1.0_2.0)$conf.int
## [1] -8.996481 -3.733519
## attr(,"conf.level")
## [1] 0.95
t.test(len ~ dose, data = dose1.0_2.0)$p.value
## [1] 1.90643e-05
In all three t-tests above, the p-values are far below the significance level of 0.05 and none of the confidence intervals contains zero.
We therefore reject the null hypothesis for every pair of doses. Each interval is for the lower-dose mean minus the higher-dose mean and lies entirely below zero, so we conclude that increasing the dose level leads to an increase in tooth length.
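Because three pairwise tests were run on the same data, one optional check is whether the conclusion survives a multiple-comparison correction. The sketch below is an addition, not part of the original analysis: it applies a Bonferroni adjustment with `p.adjust()` to the three p-values reported above. The vector names `p_raw` and `p_bonf` are illustrative.

```r
# p-values copied from the three dose comparisons reported above.
p_raw <- c("0.5 vs 1" = 1.268301e-07,
           "0.5 vs 2" = 4.397525e-14,
           "1 vs 2"   = 1.90643e-05)

# Bonferroni multiplies each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)

# TRUE where the comparison is still significant at the 0.05 level.
print(p_bonf < 0.05)
```

Bonferroni is a conservative choice; a less strict method such as `"holm"` could be passed to `p.adjust()` instead.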