We analyze the ToothGrowth data for guinea pigs from the R datasets package, focusing on the following hypotheses:
H0: Vitamin C has no effect on tooth length.
H1: The dosage of Vitamin C has a positive effect on tooth length.
H2: Orange juice (natural vitamin C) has a bigger effect on tooth length than ascorbic acid (synthetic vitamin C).
The random variable is the tooth length, measured in 10 guinea pigs for each combination of three dosage levels of Vitamin C (0.5, 1, and 2 mg) and two delivery methods (orange juice or ascorbic acid). The data are stored in a data frame with 60 observations on 3 variables (len renamed Length, supp renamed SupplementType, and dose renamed Dose).
# Loading the dataset and looking at the data frame content.
library(datasets)
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
# Renaming variables
names(ToothGrowth) <- c("Length", "SupplementType", "Dose")
levels(ToothGrowth$SupplementType) <- c("OrangeJuice", "AscorbicAcid")
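The design described above (10 animals per supplement-by-dose combination) can be checked directly from the data. The following is a minimal sketch; `tg` and `cell_counts` are names chosen here for illustration, and the block works on a fresh copy of the dataset so that it runs on its own.

```r
# Fresh copy of the dataset, renamed the same way as in the report.
tg <- datasets::ToothGrowth
names(tg) <- c("Length", "SupplementType", "Dose")
levels(tg$SupplementType) <- c("OrangeJuice", "AscorbicAcid")

# Count the observations in each supplement-by-dose cell.
cell_counts <- table(tg$SupplementType, tg$Dose)
print(cell_counts)
```

A balanced table (the same count in every cell) means the per-dose comparisons below are based on equally sized groups.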
## Investigating Hypothesis 1
# Dose 0.5 mg versus 1 mg: Welch two-sample t-test
# (unpaired is the default; recent R releases may reject 'paired' in the formula interface)
Length1 <- subset(ToothGrowth, Dose %in% c(0.5, 1))
t.test(Length ~ Dose, var.equal = FALSE, data = Length1)
##
## Welch Two Sample t-test
##
## data: Length by Dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5 mean in group 1
## 10.605 19.735
# Dose 1 mg versus 2 mg: Welch two-sample t-test (unpaired by default)
Length2 <- subset(ToothGrowth, Dose %in% c(1, 2))
t.test(Length ~ Dose, var.equal = FALSE, data = Length2)
##
## Welch Two Sample t-test
##
## data: Length by Dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
Conclusion: In both tests the confidence interval excludes 0 and the p-value is below 0.05. We can therefore reject the null hypothesis and conclude that the Vitamin C dosage has an effect on tooth length: the mean lengths differ significantly between 0.5 mg and 1 mg and between 1 mg and 2 mg. Since the group means increase with dose (10.605, 19.735, and 26.100; see also the means and standard deviations in the appendix), we conclude that the higher the Vitamin C dosage, the longer the teeth.
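Because H1 is directional (a higher dose gives longer teeth), a one-sided Welch test is a natural complement to the two-sided tests above. The sketch below is one possible way to run it, not part of the original analysis; `tg`, `low_vs_mid`, and `mid_vs_high` are names chosen here.

```r
# Fresh copy of the dataset, renamed the same way as in the report.
tg <- datasets::ToothGrowth
names(tg) <- c("Length", "SupplementType", "Dose")

# With a formula, the groups are ordered by dose, so alternative = "less"
# tests "mean length at the lower dose < mean length at the higher dose".
low_vs_mid <- t.test(Length ~ Dose, data = subset(tg, Dose %in% c(0.5, 1)),
                     alternative = "less")
mid_vs_high <- t.test(Length ~ Dose, data = subset(tg, Dose %in% c(1, 2)),
                      alternative = "less")

# One-sided p-values for the two dose steps.
print(c(low_vs_mid = low_vs_mid$p.value, mid_vs_high = mid_vs_high$p.value))
```

Since both t statistics are negative, each one-sided p-value is half of the corresponding two-sided p-value.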
## Investigating Hypothesis 2
# Ascorbic acid versus orange juice at 0.5 mg: Welch two-sample t-test (unpaired by default)
Length3 <- subset(ToothGrowth, Dose == 0.5)
t.test(Length ~ SupplementType, var.equal = FALSE, data = Length3)
##
## Welch Two Sample t-test
##
## data: Length by SupplementType
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OrangeJuice mean in group AscorbicAcid
## 13.23 7.98
# Ascorbic acid versus orange juice at 1 mg: Welch two-sample t-test (unpaired by default)
Length4 <- subset(ToothGrowth, Dose == 1)
t.test(Length ~ SupplementType, var.equal = FALSE, data = Length4)
##
## Welch Two Sample t-test
##
## data: Length by SupplementType
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OrangeJuice mean in group AscorbicAcid
## 22.70 16.77
# Ascorbic acid versus orange juice at 2 mg: Welch two-sample t-test (unpaired by default)
Length5 <- subset(ToothGrowth, Dose == 2)
t.test(Length ~ SupplementType, var.equal = FALSE, data = Length5)
##
## Welch Two Sample t-test
##
## data: Length by SupplementType
## t = -0.0461, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OrangeJuice mean in group AscorbicAcid
## 26.06 26.14
Conclusion: For 0.5 mg and 1 mg, the confidence interval excludes 0 and the p-value is below 0.05. We can therefore reject the null hypothesis and, because the orange juice group has the higher mean in both cases, conclude that orange juice has a bigger effect on tooth length than ascorbic acid at these dosages. For 2 mg, we cannot reject the null hypothesis (p = 0.9639) and cannot conclude that orange juice has a bigger effect than ascorbic acid. We previously deduced that 2 mg has the biggest effect on tooth length; at that dosage, the delivery method does not appear to make a significant difference.
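Three tests are run for H2, one per dose, so it may be worth checking whether the conclusions survive a multiple-comparison correction. The sketch below is an illustration under that assumption: it repeats the per-dose comparison as a one-sided Welch test (orange juice greater than ascorbic acid) and applies the Holm adjustment with `p.adjust`. The names `tg`, `doses`, `p_raw`, and `p_holm` are chosen here.

```r
# Fresh copy of the dataset, renamed the same way as in the report.
tg <- datasets::ToothGrowth
names(tg) <- c("Length", "SupplementType", "Dose")
levels(tg$SupplementType) <- c("OrangeJuice", "AscorbicAcid")

doses <- sort(unique(tg$Dose))

# One-sided Welch test per dose: is the orange juice mean greater?
# OrangeJuice is the first factor level, so "greater" matches H2.
p_raw <- sapply(doses, function(d) {
  t.test(Length ~ SupplementType, data = subset(tg, Dose == d),
         alternative = "greater")$p.value
})
names(p_raw) <- paste0("dose_", doses)

# Holm adjustment for the three comparisons.
p_holm <- p.adjust(p_raw, method = "holm")
print(round(rbind(p_raw, p_holm), 4))
```

The Holm method is only one of several options offered by `p.adjust`; it is used here because it controls the family-wise error rate without further assumptions.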
# Plotting the dataset with a boxplot and a coplot (code ideas taken from R tutorials)
library(graphics)
# Graph 1
boxplot(Length ~ Dose, data = ToothGrowth, boxwex = 0.25, at = 1:3 - 0.2,
        subset = SupplementType == "AscorbicAcid", col = "yellow",
        main = "ToothGrowth data: length vs dose",
        xlab = "Vitamin C dosage (mg)", ylab = "Tooth Length",
        xlim = c(0.5, 3.5), ylim = c(0, 40), yaxs = "i")
boxplot(Length ~ Dose, data = ToothGrowth, add = TRUE, boxwex = 0.25, at = 1:3 + 0.2,
        subset = SupplementType == "OrangeJuice", col = "orange")
legend(2.8, 11, c("Ascorbic Acid", "Orange Juice"), fill = c("yellow", "orange"))
# Graph 2
coplot(Length ~ Dose | SupplementType, data = ToothGrowth, panel = panel.smooth, col="red", bg = "orange", pch = 21, bar.bg = c(fac="orange"), xlab = "ToothGrowth data: length vs dose", ylab="Tooth Length")
Our analysis assumes the following: (1) The samples are unpaired. Even though we lack information about the study design, it is fair to assume that tooth growth was measured in independent guinea pigs, given the nature of the study and the time needed to grow teeth. (2) The population variances are unequal, because guinea pigs may differ in how well they absorb ascorbic acid from natural and synthetic sources.
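The second assumption can be probed, at least informally, by comparing the group variances and by checking how much the result changes between the Welch and the pooled-variance t-test. The sketch below does this for the 0.5 mg groups only; it is an illustration, not part of the original analysis, and `tg`, `half_mg`, `vt`, `welch`, and `pooled` are names chosen here.

```r
# Fresh copy of the dataset, renamed the same way as in the report.
tg <- datasets::ToothGrowth
names(tg) <- c("Length", "SupplementType", "Dose")
levels(tg$SupplementType) <- c("OrangeJuice", "AscorbicAcid")
half_mg <- subset(tg, Dose == 0.5)

# F test comparing the two group variances (orange juice / ascorbic acid).
vt <- var.test(Length ~ SupplementType, data = half_mg)

# Same comparison with and without the equal-variance assumption.
welch <- t.test(Length ~ SupplementType, data = half_mg, var.equal = FALSE)
pooled <- t.test(Length ~ SupplementType, data = half_mg, var.equal = TRUE)

print(c(variance_ratio = unname(vt$estimate),
        welch_p = welch$p.value,
        pooled_p = pooled$p.value))
```

With equal group sizes the Welch and pooled t statistics coincide and only the degrees of freedom differ, so the choice of `var.equal` has a limited influence on these particular comparisons.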
# Orange juice versus ascorbic acid, all dosages pooled (not really needed)
t.test(Length ~ SupplementType, var.equal = FALSE, data = ToothGrowth)
##
## Welch Two Sample t-test
##
## data: Length by SupplementType
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OrangeJuice mean in group AscorbicAcid
## 20.66333 16.96333
Conclusion: When all dosages are pooled into the same samples, the t-test shows no significant difference between the delivery methods at the 0.05 level (p = 0.06063; the confidence interval includes 0). The research question needs to be investigated in more detail, for example dose by dose.
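One possible way to look at delivery method and dosage together is a two-way analysis of variance with an interaction term. This is a different technique from the t-tests used in this report and is shown only as a sketch; `tg`, `DoseF`, `fit`, and `anova_table` are names chosen here, and treating the dose as a three-level factor is an assumption of the sketch.

```r
# Fresh copy of the dataset, renamed the same way as in the report.
tg <- datasets::ToothGrowth
names(tg) <- c("Length", "SupplementType", "Dose")
levels(tg$SupplementType) <- c("OrangeJuice", "AscorbicAcid")

# Treat dose as a categorical factor with three levels.
tg$DoseF <- factor(tg$Dose)

# Two-way ANOVA: main effects of supplement and dose plus their interaction.
fit <- aov(Length ~ SupplementType * DoseF, data = tg)
anova_table <- summary(fit)[[1]]
print(anova_table)
```

The interaction row of the table indicates whether the difference between delivery methods depends on the dose, which is the pattern suggested by the per-dose t-tests.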
d1 <- ToothGrowth$Length[ToothGrowth$SupplementType=="AscorbicAcid" & ToothGrowth$Dose==0.5 ]; mean(d1); sd(d1);
## [1] 7.98
## [1] 2.746634
d2 <- ToothGrowth$Length[ToothGrowth$SupplementType=="OrangeJuice" & ToothGrowth$Dose==0.5 ]; mean(d2); sd(d2);
## [1] 13.23
## [1] 4.459709
d3 <- ToothGrowth$Length[ToothGrowth$SupplementType=="AscorbicAcid" & ToothGrowth$Dose==1 ]; mean(d3); sd(d3);
## [1] 16.77
## [1] 2.515309
d4 <- ToothGrowth$Length[ToothGrowth$SupplementType=="OrangeJuice" & ToothGrowth$Dose==1 ]; mean(d4); sd(d4);
## [1] 22.7
## [1] 3.910953
d5 <- ToothGrowth$Length[ToothGrowth$SupplementType=="AscorbicAcid" & ToothGrowth$Dose==2 ]; mean(d5); sd(d5);
## [1] 26.14
## [1] 4.797731
d6 <- ToothGrowth$Length[ToothGrowth$SupplementType=="OrangeJuice" & ToothGrowth$Dose==2 ]; mean(d6); sd(d6);
## [1] 26.06
## [1] 2.655058
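The six group means and standard deviations above can also be produced in one step. The following is a compact sketch using `tapply`; `tg`, `cell_mean`, and `cell_sd` are names chosen here.

```r
# Fresh copy of the dataset, renamed the same way as in the report.
tg <- datasets::ToothGrowth
names(tg) <- c("Length", "SupplementType", "Dose")
levels(tg$SupplementType) <- c("OrangeJuice", "AscorbicAcid")

# Mean and standard deviation of Length for every supplement-by-dose cell.
# Rows are supplement types, columns are doses ("0.5", "1", "2").
cell_mean <- with(tg, tapply(Length, list(SupplementType, Dose), mean))
cell_sd <- with(tg, tapply(Length, list(SupplementType, Dose), sd))

print(round(cell_mean, 2))
print(round(cell_sd, 2))
```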
Bliss, C. I. (1952) The Statistics of Bioassay. Academic Press.
McNeil, D. R. (1977) Interactive Data Analysis. New York: Wiley.
Boxplots: using 'at =' and adding boxplots; example idea by Roger Bivand.