This project analyses the ToothGrowth data in the R datasets package. The structure of this analysis is as follows:
The data set records the effect of vitamin C on tooth growth in guinea pigs. The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
pairs(ToothGrowth, main = "ToothGrowth data", pch = 21, bg = c("red", "green"))
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth, pch = 21, bg = c("red", "green"),
xlab = "ToothGrowth data: length vs dose, given type of supplement")
library(ggplot2)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
ggplot(ToothGrowth, aes(x = dose, y = len)) + geom_boxplot(aes(fill = dose)) + xlab("Dose Amount") + ylab("Tooth Length") + facet_grid(~ supp) + ggtitle("Length vs Dose by type of supplement")
knitr::kable(summary(ToothGrowth))
| len | supp | dose |
|---|---|---|
| Min. : 4.20 | OJ:30 | 0.5:20 |
| 1st Qu.:13.07 | VC:30 | 1 :20 |
| Median :19.25 | NA | 2 :20 |
| Mean :18.81 | NA | NA |
| 3rd Qu.:25.27 | NA | NA |
| Max. :33.90 | NA | NA |
library(dplyr)
knitr::kable(ToothGrowth %>% group_by(dose, supp) %>% summarize(mean = mean(len), sd = sd(len)))
| dose | supp | mean | sd |
|---|---|---|---|
| 0.5 | OJ | 13.23 | 4.459708 |
| 0.5 | VC | 7.98 | 2.746634 |
| 1 | OJ | 22.70 | 3.910953 |
| 1 | VC | 16.77 | 2.515309 |
| 2 | OJ | 26.06 | 2.655058 |
| 2 | VC | 26.14 | 4.797731 |
The observations are independent and identically distributed (i.i.d.) within each group.
The variance of tooth growth differs across supplements and dosages.
Tooth growth follows a normal distribution.
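These assumptions can be inspected informally before running the tests. The following is a minimal sketch using base R only. The Shapiro-Wilk test is one common choice for a normality check and is an assumption here, not part of the original analysis. The names `tg`, `group_var` and `group_p` are illustrative.

```r
# Work on a pristine copy so earlier conversions of dose do not matter
tg <- datasets::ToothGrowth

# Sample variance of len in each supplement/dose group (2 x 3 matrix)
group_var <- tapply(tg$len, list(tg$supp, tg$dose), var)
print(round(group_var, 2))

# Shapiro-Wilk p-value per group; with n = 10 per group the test has low power
group_p <- tapply(tg$len, list(tg$supp, tg$dose),
                  function(x) shapiro.test(x)$p.value)
print(round(group_p, 3))
```

Unequal group variances are the reason the tests in this report use `var.equal=FALSE` (Welch's test) rather than a pooled variance.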
Null hypothesis: no difference in growth between the two supplements.
Alternative hypothesis: higher growth with orange juice.
t.test(len ~ supp, data = ToothGrowth, alternative = "greater", var.equal = FALSE, conf.level = 0.95)
Welch Two Sample t-test
data: len by supp
t = 1.9153, df = 55.309, p-value = 0.03032
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
0.4682687 Inf
sample estimates:
mean in group OJ mean in group VC
20.66333 16.96333
With a p-value of 0.03032, below the 0.05 significance level, the null hypothesis can be rejected. In other words, if the supplements had no differing effect, there would be roughly a 3% chance of observing a difference in mean growth at least this large.
The result therefore supports orange juice as the better growth promoter.
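To make the reported statistic less of a black box, the Welch t statistic, the Welch-Satterthwaite degrees of freedom and the one-sided p-value can be recomputed by hand. This is a sketch in base R; the variable names are illustrative, and the values should agree with the `t.test` output above.

```r
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Squared standard errors of the two group means
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(oj) - mean(vc)) / sqrt(se2_oj + se2_vc)
df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# One-sided p-value for the alternative "OJ mean is greater"
p_value <- pt(t_stat, df, lower.tail = FALSE)
print(c(t = t_stat, df = df, p = p_value))
```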
Null hypothesis: no difference in growth between dosages.
Alternative hypothesis: higher growth with higher dosage.
# Reload the data set so that dose is numeric again (it was converted to a factor above)
data(ToothGrowth)
data <- ToothGrowth %>% filter(dose < 2)
t.test(len ~ dose, data = data, alternative = "less", var.equal = FALSE, conf.level = 0.95)
Welch Two Sample t-test
data: len by dose
t = -6.4766, df = 37.986, p-value = 6.342e-08
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -6.753323
sample estimates:
mean in group 0.5 mean in group 1
10.605 19.735
The null hypothesis is rejected with the above p-value for the comparison of 0.5 and 1 mg/day.
data <- ToothGrowth %>% filter(dose > 0.5)
t.test(len ~ dose, data = data, alternative = "less", var.equal = FALSE, conf.level = 0.95)
Welch Two Sample t-test
data: len by dose
t = -4.9005, df = 37.101, p-value = 9.532e-06
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -4.17387
sample estimates:
mean in group 1 mean in group 2
19.735 26.100
The null hypothesis is also rejected with the above p-value for the comparison of 1 and 2 mg/day.
Given these very small p-values, the data support the alternative hypothesis that a higher dosage yields higher growth.
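The two dose comparisons follow the same pattern, so they can be wrapped in a small function. This is a sketch only: `compare_doses` is a hypothetical helper that does not appear in the original analysis, and it uses base R `subset` in place of `dplyr::filter`.

```r
tg <- datasets::ToothGrowth  # dose is numeric in the packaged data

# Hypothetical helper: one-sided Welch test that the lower dose gives less growth
compare_doses <- function(low, high, tooth = tg) {
  pair <- subset(tooth, dose %in% c(low, high))
  t.test(len ~ dose, data = pair, alternative = "less",
         var.equal = FALSE, conf.level = 0.95)
}

res_low  <- compare_doses(0.5, 1)
res_high <- compare_doses(1, 2)
print(c(p_0.5_vs_1 = res_low$p.value, p_1_vs_2 = res_high$p.value))
```

Reading from `datasets::ToothGrowth` avoids depending on whether `dose` has been converted to a factor earlier in the session.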
The two supplements seem to give similar results at the 2 mg/day dose (see the table of group means), so this needs to be verified.
Null hypothesis: no difference in growth between the two supplements at the 2 mg/day dosage.
Alternative hypothesis: a difference in growth between the two supplements at the 2 mg/day dosage.
data <- ToothGrowth %>% filter(dose == 2)
t.test(len ~ supp, data = data, alternative = "two.sided", var.equal = FALSE, conf.level = 0.95)
Welch Two Sample t-test
data: len by supp
t = -0.046136, df = 14.04, p-value = 0.9639
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.79807 3.63807
sample estimates:
mean in group OJ mean in group VC
26.06 26.14
The null hypothesis cannot be rejected with the above p-value. There is not enough evidence to support the alternative hypothesis.
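The same conclusion can be read from the confidence interval: a two-sided 95% interval that contains 0 corresponds to failing to reject the null hypothesis at the 5% level. This is a brief sketch in base R, with illustrative variable names.

```r
tg <- datasets::ToothGrowth
high_dose <- subset(tg, dose == 2)

res <- t.test(len ~ supp, data = high_dose, alternative = "two.sided",
              var.equal = FALSE, conf.level = 0.95)

# TRUE when the interval for mean(OJ) - mean(VC) spans zero
ci <- res$conf.int
contains_zero <- ci[1] < 0 && ci[2] > 0
print(contains_zero)
```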
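With the present analysis, and under the assumptions stated above, we can infer that:
Orange juice is associated with greater tooth growth than ascorbic acid overall, higher doses of vitamin C are associated with greater tooth growth, and at 2 mg/day there is no detectable difference between the two supplements.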