This study examines the effect of vitamin C on tooth growth in guinea pigs using the ToothGrowth data from the R datasets package. It uses confidence intervals and hypothesis tests to compare tooth growth by supplement and dose. Only techniques seen in class are used, even though other approaches may be worth considering.
The data record the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
The ToothGrowth data set is a data frame with 60 observations on 3 variables.
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
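As a quick check of the design described above, the observations can be cross-tabulated by delivery method and dose. This is a minimal sketch using base R's `table()`; the names `tg` and `design` are introduced here for illustration only and are not part of the report's own code.

```r
library(datasets)

# Work on a fresh copy of the built-in data set
tg <- datasets::ToothGrowth

# Count the animals in each delivery method x dose combination
design <- table(supp = tg$supp, dose = tg$dose)
print(design)
```

In this data set every cell holds 10 animals (60 animals over 2 x 3 cells), so the design is balanced.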
Since there are only three dose levels of vitamin C (0.5, 1, and 2 mg), we convert the variable dose to a factor.
First we investigate the effect of supplement delivery method on tooth length by comparing the average tooth length of guinea pigs given vitamin C via orange juice with that of guinea pigs given ascorbic acid.
Mean tooth length by delivery method:
## OJ VC
## 20.66333 16.96333
A basic exploratory analysis suggests that tooth length is greater when vitamin C is delivered through orange juice (OJ). A two-sample t test for the difference in the means of two independent groups will be used to assess whether this difference is statistically significant. Plot 1 in the appendix illustrates this comparison.
Mean tooth length by supplement dose:
## 0.5 1 2
## 10.605 19.735 26.100
A basic exploratory analysis suggests that tooth length increases with the dose of vitamin C. Two-sample t tests for the difference in the means of two independent groups will be used to assess whether these differences are statistically significant. Plot 2 in the appendix illustrates this comparison.
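The group means shown above can also be computed with `tapply()`, together with the group sizes and standard deviations that feed into the t tests. This is only a sketch, not the code used for this report (that code is in the appendix); the helper name `group_summary` and the variables `tg`, `by_supp`, and `by_dose` are hypothetical.

```r
library(datasets)

tg <- datasets::ToothGrowth

# Hypothetical helper: size, mean and standard deviation of `values` per group
group_summary <- function(values, groups) {
  sizes <- tapply(values, groups, length)
  data.frame(
    n    = as.vector(sizes),
    mean = as.vector(tapply(values, groups, mean)),
    sd   = as.vector(tapply(values, groups, sd)),
    row.names = names(sizes)
  )
}

by_supp <- group_summary(tg$len, tg$supp)  # rows: OJ, VC
by_dose <- group_summary(tg$len, tg$dose)  # rows: 0.5, 1, 2
print(by_supp)
print(by_dose)
```

The `mean` columns should match the two tables above; the `sd` column gives a first impression of whether the group variances look similar, which is relevant to the choice between a pooled and an unpooled t test.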
Our null hypothesis, H0, states that the means of the two groups are equal. The alternative hypothesis, Ha, states that the two means are different. We perform a two-sided test.
##
## Welch Two Sample t-test
##
## data: ToothGrowth$len[ToothGrowth$supp == "OJ"] and ToothGrowth$len[ToothGrowth$supp == "VC"]
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean of x mean of y
## 20.66333 16.96333
Since the p-value of this test is 0.06, which is above the 0.05 significance level, and the 95% confidence interval contains zero, we do not have enough evidence to reject the null hypothesis. We therefore cannot conclude that the vitamin C delivery method has an effect on tooth length.
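To make the link between the t statistic, the degrees of freedom, the p-value, and the confidence interval explicit, the Welch test for the two delivery methods can be reproduced by hand. The sketch below assumes the standard Welch formulas (unpooled standard error and Welch-Satterthwaite degrees of freedom); all variable names are illustrative.

```r
library(datasets)

tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Variance of each group mean (variances are not pooled)
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)

# Standard error of the difference in means
se <- sqrt(v_oj + v_vc)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(oj) - mean(vc)) / se
dof    <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference
p_value <- 2 * pt(-abs(t_stat), dof)
ci      <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, dof) * se

print(c(t = t_stat, df = dof, p = p_value))
print(ci)
```

The interval contains zero exactly when the two-sided p-value exceeds 0.05, which is why the two criteria quoted above always agree.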
Next we test the effect of supplement dose on tooth length. Because our exploratory analysis indicated a strong positive association between vitamin C dose and tooth length, we test only two pairs of adjacent doses: 0.5 mg versus 1.0 mg, and 1.0 mg versus 2.0 mg. Our null hypothesis, H0, is that the two groups have equal means. The alternative hypothesis, Ha, is that the two groups have different means. We perform a two-sided test.
##
## Welch Two Sample t-test
##
## data: ToothGrowth$len[ToothGrowth$dose == 0.5] and ToothGrowth$len[ToothGrowth$dose == 1]
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean of x mean of y
## 10.605 19.735
##
## Welch Two Sample t-test
##
## data: ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 2]
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean of x mean of y
## 19.735 26.100
For both dose level pairs, the p-value is well below 0.05 and the 95% confidence interval does not contain zero. Therefore we reject the null hypothesis, H0, and conclude that increasing the dose level leads to an increase in tooth length.
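The two dose comparisons can also be collected into a single table by using the formula interface of `t.test()` and extracting the confidence interval and p-value from the returned object. This is a sketch under the same assumptions as the tests above (independent groups, unequal variances, which is the default of `t.test()`); the helper `compare_doses` and the variable `results` are hypothetical names.

```r
library(datasets)

tg <- datasets::ToothGrowth

# Hypothetical helper: Welch test of len between two dose levels
compare_doses <- function(data, low, high) {
  pair <- data[data$dose %in% c(low, high), ]
  res  <- t.test(len ~ dose, data = pair)  # Welch test is the default
  data.frame(
    comparison = paste(low, "vs", high),
    diff       = unname(res$estimate[1] - res$estimate[2]),
    lower      = res$conf.int[1],
    upper      = res$conf.int[2],
    p_value    = res$p.value
  )
}

results <- rbind(compare_doses(tg, 0.5, 1), compare_doses(tg, 1, 2))
print(results)
```

Each row reports the difference in mean tooth length (lower dose minus higher dose), so negative values indicate longer teeth at the higher dose.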
The following conclusions can be drawn from the hypothesis tests above:
Supplement type has no statistically significant effect on tooth growth at the 5% level.
Increasing the dose level leads to increased tooth growth.
library(datasets)
library(ggplot2)
data(ToothGrowth)
set.seed(25)
summary(ToothGrowth)
str(ToothGrowth)
# Convert dose to a factor
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
# Calculate the mean tooth length of guinea pigs for each supplement delivery method
meansupp <- split(ToothGrowth$len, ToothGrowth$supp)
sapply(meansupp, mean)
## OJ VC
## 20.66333 16.96333
# Plot tooth length ('len') vs. the supplement delivery method ('supp')
ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot(aes(fill = supp)) +
  xlab("Supplement type") + ylab("Tooth length") +
  ggtitle("Plot 1: Tooth Length vs. Supplement Delivery Method")
# Calculate the mean tooth length of guinea pigs for each dose level
meandose <- split(ToothGrowth$len, ToothGrowth$dose)
sapply(meandose, mean)
## 0.5 1 2
## 10.605 19.735 26.100
# Plot tooth length ('len') vs. the vitamin C dose ('dose')
ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_boxplot(aes(fill = dose)) +
  xlab("Dose in milligrams") + ylab("Tooth length") +
  ggtitle("Plot 2: Tooth Length vs. Dose Amount")
# Statistical inference: Welch two-sample t tests
t.test(ToothGrowth$len[ToothGrowth$supp=="OJ"], ToothGrowth$len[ToothGrowth$supp=="VC"], paired = FALSE, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: ToothGrowth$len[ToothGrowth$supp == "OJ"] and ToothGrowth$len[ToothGrowth$supp == "VC"]
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean of x mean of y
## 20.66333 16.96333
# assuming unequal variances between the two groups
t.test(ToothGrowth$len[ToothGrowth$dose==0.5], ToothGrowth$len[ToothGrowth$dose==1], paired = FALSE, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: ToothGrowth$len[ToothGrowth$dose == 0.5] and ToothGrowth$len[ToothGrowth$dose == 1]
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean of x mean of y
## 10.605 19.735
t.test(ToothGrowth$len[ToothGrowth$dose==1], ToothGrowth$len[ToothGrowth$dose==2], paired = FALSE, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 2]
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean of x mean of y
## 19.735 26.100