We will perform a statistical analysis of the ToothGrowth data in the R datasets package. The response is the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods (orange juice or ascorbic acid), giving 10 animals per combination.
First, we group the ToothGrowth data by supp (delivery method) and dose (dose in milligrams per day), then calculate the mean tooth length for each group.
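Before summarising, it can help to confirm the layout of the data. The following is a minimal sketch using only base R; the name `cell_counts` is illustrative and not part of the original analysis.

```r
# ToothGrowth ships with base R's datasets package, so no extra install is needed.
data(ToothGrowth)

# Show the structure: len (numeric), supp (factor: OJ, VC), dose (numeric).
str(ToothGrowth)

# Cross-tabulate delivery method by dose to check that the design is balanced.
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)
```

Each of the six cells in the table should hold the same number of animals, which is what makes the group-by-group comparisons below straightforward.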
library(dplyr)
# Mean tooth length for each delivery method and dose combination
avg_growth = ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(len = mean(len))
Next, we plot the group means as a basic exploratory analysis to see what the data suggest.
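library(ggplot2)
# Mean tooth length against dose, one line per delivery method
g = ggplot(avg_growth, aes(dose, len, colour = supp)) +
  geom_point(size = 5) +
  geom_line(linewidth = 1)  # linewidth needs ggplot2 >= 3.4.0; older versions use size = 1
# Relabel the axes and the legend
g + xlab("Dose (mg/day)") +
  ylab("Avg. Tooth Length") +
  scale_colour_discrete(name = "Delivery Method",
                        breaks = c("OJ", "VC"),
                        labels = c("Orange Juice", "Ascorbic Acid"))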
From the plot above, Orange Juice appears more effective than Ascorbic Acid for tooth growth at the 0.5 mg and 1.0 mg doses.
To test this hypothesis statistically, we can run a two-sample t-test comparing the two delivery methods within each dose group, assuming that the subjects in the two samples are unrelated (independent samples). R's t.test defaults to Welch's version of the test, which does not assume equal variances.
Here is the t-test for the 0.5 mg dose group.
oj0.5 = ToothGrowth %>% filter(supp == "OJ" & dose == 0.5)
vc0.5 = ToothGrowth %>% filter(supp == "VC" & dose == 0.5)
t.test(oj0.5$len, vc0.5$len)
##
## Welch Two Sample t-test
##
## data: oj0.5$len and vc0.5$len
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean of x mean of y
## 13.23 7.98
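To make the output above less of a black box, here is a hedged sketch that reproduces the Welch statistic, degrees of freedom, and p-value by hand for the 0.5 mg group and compares them with `t.test`. It uses base R subsetting rather than dplyr, and the variable names (`x`, `y`, `se2_x`, `t_manual`, and so on) are illustrative assumptions, not part of the original analysis.

```r
# Tooth lengths for the two delivery methods at the 0.5 mg dose.
x <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
y <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]

# Squared standard error of each sample mean.
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic: difference in means over the unpooled standard error.
t_manual <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_manual <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value from the t distribution.
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Compare with the built-in test; each comparison should print TRUE.
built_in <- t.test(x, y)
print(all.equal(t_manual, unname(built_in$statistic)))
print(all.equal(df_manual, unname(built_in$parameter)))
print(all.equal(p_manual, built_in$p.value))
```

The non-integer degrees of freedom in the output come from this approximation, which is why they differ from the 18 a pooled-variance test would report for two samples of 10.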
Here is the t-test for the 1.0 mg dose group.
oj1.0 = ToothGrowth %>% filter(supp == "OJ" & dose == 1.0)
vc1.0 = ToothGrowth %>% filter(supp == "VC" & dose == 1.0)
t.test(oj1.0$len, vc1.0$len)
##
## Welch Two Sample t-test
##
## data: oj1.0$len and vc1.0$len
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean of x mean of y
## 22.70 16.77
And here is the t-test for the 2.0 mg dose group.
oj2.0 = ToothGrowth %>% filter(supp == "OJ" & dose == 2.0)
vc2.0 = ToothGrowth %>% filter(supp == "VC" & dose == 2.0)
t.test(oj2.0$len, vc2.0$len)
##
## Welch Two Sample t-test
##
## data: oj2.0$len and vc2.0$len
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean of x mean of y
## 26.06 26.14
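From the tests and plot above, at the 5% significance level the mean tooth length is significantly greater with Orange Juice than with Ascorbic Acid at the 0.5 mg dose (p = 0.006, 95% CI for the difference 1.72 to 8.78) and at the 1.0 mg dose (p = 0.001, 95% CI 2.80 to 9.06). At the 2.0 mg dose, however, there is no significant difference between Orange Juice and Ascorbic Acid (p = 0.96, 95% CI -3.80 to 3.64, which includes zero).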
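The three tests above repeat the same steps, so they could also be run in one pass. The sketch below is one possible way to do that in base R, using the formula interface of `t.test`; the `results` data frame and its column names are illustrative assumptions, not part of the original analysis.

```r
# Run the same Welch test within each dose level and collect the key numbers.
doses <- sort(unique(ToothGrowth$dose))

results <- do.call(rbind, lapply(doses, function(d) {
  subset_d <- ToothGrowth[ToothGrowth$dose == d, ]
  # Formula interface: len split by the two-level factor supp (OJ minus VC).
  tt <- t.test(len ~ supp, data = subset_d)
  data.frame(
    dose     = d,
    t        = unname(tt$statistic),
    df       = unname(tt$parameter),
    p_value  = tt$p.value,
    ci_lower = tt$conf.int[1],
    ci_upper = tt$conf.int[2]
  )
}))

print(results)
```

Collecting the results in one table makes it easier to compare the p-values and confidence intervals across doses side by side.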