This report analyzes the ToothGrowth data from the R datasets package, which records the effect of Vitamin C on tooth growth in guinea pigs. It summarizes the data and then uses confidence intervals and hypothesis tests to compare mean tooth length by supplement type (supp) and by dose (dose).
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
head(ToothGrowth, 10)
##     len supp dose
## 1   4.2   VC  0.5
## 2  11.5   VC  0.5
## 3   7.3   VC  0.5
## 4   5.8   VC  0.5
## 5   6.4   VC  0.5
## 6  10.0   VC  0.5
## 7  11.2   VC  0.5
## 8  11.2   VC  0.5
## 9   5.2   VC  0.5
## 10  7.0   VC  0.5
summary(ToothGrowth)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
length(ToothGrowth$len)
## [1] 60
plot(ToothGrowth)
The pairwise scatterplot shows the three Vitamin C dose levels (0.5, 1, and 2 mg/day).
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
sm <- split(ToothGrowth$len, ToothGrowth$supp)
sapply(sm,mean)
## OJ VC
## 20.66333 16.96333
library(ggplot2)
ggplot(aes(x=supp, y=len), data=ToothGrowth) + geom_boxplot(aes(fill=supp)) +
xlab("Supplement") + ylab("Length") +
guides(fill=guide_legend(title="Supplement type"))
dm <- split(ToothGrowth$len, ToothGrowth$dose)
sapply(dm, mean)
## 0.5 1 2
## 10.605 19.735 26.100
ggplot(aes(x=dose, y=len), data=ToothGrowth) + geom_boxplot(aes(fill=dose)) +
xlab("Dose [mg/day]") + ylab("Length")
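The means by supplement and by dose shown above are marginal averages, which can hide differences between supplements at a particular dose. As a minimal sketch, the cell means for every supplement/dose combination can be tabulated with `aggregate()`. The names `tg` and `cell_means` are introduced here only for illustration, and `tg` is a fresh copy of the dataset so the example does not touch the `ToothGrowth` object used in the rest of the analysis.

```r
# Local copy of the dataset (hypothetical name `tg`), independent of ToothGrowth above
tg <- datasets::ToothGrowth

# Mean tooth length for each supplement/dose combination (6 cells of 10 animals)
cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(cell_means)
```

The six cell means should match the per-dose group means reported by the supplement t-tests later in this report.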
t.test(len ~ supp, data = ToothGrowth, var.equal = FALSE, paired = FALSE)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
The 95% confidence interval (-0.171, 7.571) contains zero and the p-value (0.061) exceeds 0.05, so we cannot reject the null hypothesis that mean tooth length is the same for both supplement types.
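To make the origin of these numbers concrete, the following sketch recomputes the Welch statistic, the Welch-Satterthwaite degrees of freedom, and the 95% confidence interval by hand for the supplement comparison. The variable names (`tg`, `oj`, `vc`, `se`, `t_stat`, `df_welch`, `ci`) are illustrative choices, not part of the original analysis.

```r
# Local copy of the dataset (hypothetical name `tg`)
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Per-group variance of the sample mean
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)

# Standard error of the difference in means (unequal variances)
se <- sqrt(v_oj + v_vc)

# Welch t statistic for the difference OJ - VC
t_stat <- (mean(oj) - mean(vc)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided 95% confidence interval for the difference in means
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se

print(c(t = t_stat, df = df_welch, lower = ci[1], upper = ci[2]))
```

These values should agree with the `t.test` output above. Because the interval is centred on the observed difference and its half-width is `qt(0.975, df_welch) * se`, it contains zero exactly when the two-sided p-value is above 0.05.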
d1 <- subset(ToothGrowth, dose %in% c(.5, 1.0))
d2 <- subset(ToothGrowth, dose %in% c(.5, 2.0))
d3 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))
t.test(len ~ dose, data = d1, paired = FALSE, var.equal = FALSE )
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5 mean in group 1
## 10.605 19.735
t.test(len ~ dose, data = d2, paired = FALSE, var.equal = FALSE )
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.15617 -12.83383
## sample estimates:
## mean in group 0.5 mean in group 2
## 10.605 26.100
t.test(len ~ dose, data = d3, paired = FALSE, var.equal = FALSE )
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
The 95% confidence intervals are (-11.984, -6.276) for 0.5 vs. 1.0, (-18.156, -12.834) for 0.5 vs. 2.0, and (-8.996, -3.734) for 1.0 vs. 2.0. None of them contains zero and all p-values are far below 0.05, so in each comparison we reject the null hypothesis of equal means: the higher dose is associated with greater mean tooth length.
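Because three dose comparisons are made on the same data, a cautious reader may want to adjust the p-values for multiple testing. The sketch below applies a Bonferroni correction with `p.adjust()` to the three p-values reported above. The correction is an addition for illustration and was not part of the original analysis; the names `p_raw` and `p_adj` are hypothetical.

```r
# p-values copied from the three Welch t-tests reported above
p_raw <- c("0.5 vs 1" = 1.268e-07,
           "0.5 vs 2" = 4.398e-14,
           "1 vs 2"   = 1.906e-05)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_adj <- p.adjust(p_raw, method = "bonferroni")
print(p_adj)

# TRUE where the difference is still significant at the 5% level
print(p_adj < 0.05)
```

Even after this conservative adjustment all three comparisons stay well below 0.05, so the conclusion about dose does not depend on the correction.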
ds1 <- subset(ToothGrowth, dose == .5)
ds2 <- subset(ToothGrowth, dose == 1.0)
ds3 <- subset(ToothGrowth, dose == 2.0)
t.test(len ~ supp, data = ds1, paired = FALSE, var.equal = FALSE )
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
## 13.23 7.98
t.test(len ~ supp, data = ds2, paired = FALSE, var.equal = FALSE )
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC
## 22.70 16.77
t.test(len ~ supp, data = ds3, paired = FALSE, var.equal = FALSE )
##
## Welch Two Sample t-test
##
## data: len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OJ mean in group VC
## 26.06 26.14
Within each dose level the picture differs. At 0.5 and 1.0 the confidence intervals (1.719, 8.781) and (2.802, 9.058) exclude zero, so we reject the null hypothesis and conclude that OJ yields greater mean tooth length than VC at those doses. At 2.0 the interval (-3.798, 3.638) contains zero (p = 0.964), so we cannot reject the null hypothesis of equal means.
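The three per-dose tests can also be run in a single pass, which puts the intervals side by side. This is a minimal sketch assuming the same Welch test as above; `tg`, `by_dose`, and `ci_table` are illustrative names.

```r
# Local copy of the dataset (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# Run the OJ vs. VC Welch t-test separately within each dose level
by_dose <- lapply(split(tg, tg$dose), function(d) {
  res <- t.test(len ~ supp, data = d, paired = FALSE, var.equal = FALSE)
  c(lower = res$conf.int[1], upper = res$conf.int[2], p.value = res$p.value)
})

# One row per dose: 95% CI bounds for OJ - VC and the p-value
ci_table <- do.call(rbind, by_dose)
print(ci_table)
```

Rows whose interval excludes zero correspond to doses where the supplements differ significantly.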
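In conclusion, increasing the dose leads to increased tooth growth. Supplement type has no statistically significant effect overall at the 5% level, although OJ outperforms VC at the 0.5 and 1.0 doses; at the 2.0 dose the two supplements are indistinguishable.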