# preload library
library(ggplot2)
library(gridExtra) # to make multiplots
## Warning: package 'gridExtra' was built under R version 3.4.3
data(ToothGrowth)
Let’s look at the data and its characteristics
head(ToothGrowth)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
Let’s look at the data with only 1 category: dose and supp:
g1 <- qplot(dose, len, data = ToothGrowth) + geom_boxplot(aes(fill = factor(dose)))
g2 <- qplot(supp, len, data = ToothGrowth) + geom_boxplot(aes(fill = supp))
grid.arrange(arrangeGrob(g1, g2, ncol = 2, nrow = 1), bottom = "Figure 1: len variation with respect to dose (left) and supp (right)")
Now let’s look closer at each domain, ie. supp and dose, and their corresponding subdomains, ie. dose, supp.
g3 <- qplot(dose, len, data = ToothGrowth, facets = ~supp, ylab = "Tooth length") +
geom_boxplot(aes(fill = factor(dose))) + labs(fill = "Dose")
g4 <- qplot(supp, len, data = ToothGrowth, facets = ~dose, ylab = "Tooth length") +
geom_boxplot(aes(fill = supp)) + labs(fill = "Supp")
grid.arrange(arrangeGrob(g3, g4, ncol = 2, nrow = 1), bottom = "Figure 2: len variation by subdivision with respect to supp (left) and dose (right)")
len
, supp
and dose
. There are 2 types of supp
: OJ and VC; and 3 types of dose
: 0.5, 1.0 and 2.0.supp
and dose
(Figure 1) suggests potential differences in dose(s) but not so clear differences in supp(s).dose
and supp
(Figure 2) shows better separation between the subdivisions of each category. Particularly, between supp(s), len
seems to increase with higher dose
(Figure 2 left); between dose(s), len
seems to be higher with supp
OJ than VC, except at dose
= 2.0. This is an noticeable and interesting observation that we will later investigate in detail.I will compare tooth growth len with respect to 2 categories: supp and dose. Because the generality of the comparison is similar between the cases, I will use two different methods:
* confidence interval for comparing tooth growth len vs. supp (a)
* hypothesis test for comparing tooth growth len vs. dose (b)
Please note that during exploratory analysis above, we have an impression that supp OJ is more effective than VC, and respectively similar with dose 0.5, 1.0 and 2.0. However, for generality, I assume we’re looking if they’re different. Therefore, in the following comparisons, I use 2-tail test.
Especially for this data, since we observe similar effects at dose = 2.0, it is worthy to do an additional specific test between 2 supp(s) at this dose. For this test, I use 2 sided test because we don’t know if any of them is larger/smaller than the other. (c)
# extract data (only first column)
lenOJ <- subset(ToothGrowth, supp == "OJ")[, 1]
lenVC <- subset(ToothGrowth, supp == "VC")[, 1]
From figure 1 (fight), we can assume and test if OJ performs better than VC. In this case, the test is equivilent to a one sided test. The confidence interval of that test is:
(mean(lenOJ) - mean(lenVC)) + c(-1, 1) * qt(0.95, length(lenVC) +
length(lenOJ) - 2) * sqrt((var(lenVC) + var(lenOJ))/2) *
sqrt(1/length(lenVC) + 1/length(lenOJ))
## [1] 0.4708204 6.9291796
We see that zero doesn’t fall in the confidence interval. This suggests our hypothesis is not true. Therefore OJ actually does perform better than VC.
On the other hand, if we just want to check if the two supp(s) produce different tooth growth, we need to use a two sided test, the confidence interval for that hypothesis is:
(mean(lenOJ) - mean(lenVC)) + c(-1, 1) * qt(0.975, length(lenVC) +
length(lenOJ) - 2) * sqrt((var(lenVC) + var(lenOJ))/2) *
sqrt(1/length(lenVC) + 1/length(lenOJ))
## [1] -0.1670064 7.5670064
Now zero lies in the confidence interval. It means that there is no significant difference between these supp(s).
Ideally, we need to make 3 tests, comparing:
* dose 0.5 vs dose 1.0
* dose 1.0 vs dose 2.0
* dose 0.5 vs dose 2.0
In all of them, the null hypothesis is that there is no difference between the features we’re looking at. For example, in the first test, \(H_0=\) “there is no difference between dose 0.5 and dose 1.0”. I use \(\alpha=0.05\).
# extract data
len05 <- subset(ToothGrowth, dose == 0.5)[, 1]
len10 <- subset(ToothGrowth, dose == 1)[, 1]
len20 <- subset(ToothGrowth, dose == 2)[, 1]
t.test(len05, len10, alternative = "two.sided", paired = FALSE,
var.equal = FALSE, conf.level = 0.95)
##
## Welch Two Sample t-test
##
## data: len05 and len10
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean of x mean of y
## 10.605 19.735
The test returns an extremely low probability (1.27E-7) for the hypothesis of no difference between dose 0.5 and dose 1.0. With this result, we reject the null. Given what we observed in figure 1, we can conclude that dose 1.0 has significant higher effect on tooth growth compared to dose 0.5.
t.test(len10, len20, alternative = "two.sided", paired = FALSE,
var.equal = FALSE, conf.level = 0.95)
##
## Welch Two Sample t-test
##
## data: len10 and len20
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean of x mean of y
## 19.735 26.100
Again, the test returns an extremely low probability (1.91E-5) for the hypothesis of no difference between dose 1.0 and dose 2.0. With this result, we reject the null. Given what we observed in figure 1, we can conclude that dose 2.0 has significant higher effect on tooth growth compared to dose 1.0.
From these 2 tests, we can infer that dose 2.0 has significant higher effect on tooth growth compare to dose 0.5. The last test with these 2 dose(s) is not necessary.
# extract data
lenOJ20 <- subset(ToothGrowth, supp == "OJ" & dose == 2)[, 1]
lenVC20 <- subset(ToothGrowth, supp == "VC" & dose == 2)[, 1]
# run test
t.test(lenVC20, lenOJ20, alternative = "two.sided", paired = FALSE,
var.equal = FALSE, conf.level = 0.95)
##
## Welch Two Sample t-test
##
## data: lenVC20 and lenOJ20
## t = 0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.63807 3.79807
## sample estimates:
## mean of x mean of y
## 26.14 26.06
This test returns a very high probability (96%). It is also supported by the fact that zero stays within the 95% confidence interval. This result suggsets we fail to reject to null. In other words, there is no significant difference of tooth growth between supp OJ and VC at dose 2.0 mg.
A quick exploratory analysis shows intersting insights into the data. A general look at only 1 category (Figure 1) doesn’t provide as detail as that using 2 or more categories (Figure 2).
Generally speaking, supp
OJ has more effect on len
than VC. However this is confirmed only under the assumption of a one-sided test. Under a two-sided test, normally happens if we don’t perform exploratory analysis, they appear to be similar.
Generally speaking, higher dose
increases len
observation. This is true with the assumption of considering 1 category analysis (Figure 1). If we look at the more detail separation (Figure 2), one of my tests has shown that the statement is not true at dose
2.0 where there is no significant difference between supp
OJ and VC.