=========================================================================================================================
Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package.

1. Load the ToothGrowth data and perform some basic exploratory data analyses.
2. Provide a basic summary of the data.
3. Use confidence intervals and hypothesis tests to compare tooth growth by supp and dose. (Use the techniques from class even if there are other approaches worth considering.)
4. State your conclusions and the assumptions needed for your conclusions.

library(knitr)
opts_knit$set(progress=FALSE, verbose = TRUE)
opts_chunk$set(echo=TRUE, message=FALSE, tidy=TRUE, comment=NA,
fig.path="figure/", fig.keep="high", fig.width=10, fig.height=6,
fig.align="center")
Load needed libraries.
require(plyr)
require(ggplot2)
=============================================================================================================================
library(datasets)
boxplot(len ~ supp * dose, data = ToothGrowth, xlab = "Supp Dose", ylab = "Tooth Length",
main = "Boxplot of Tooth Growth Data")
**The boxplot shows that, on average, tooth length increases as the dose increases.**
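
To put numbers behind the visual impression from the boxplot, one option is to tabulate mean tooth length by group. This is a minimal sketch using base R's `aggregate` and `tapply`; the names `group_means` and `dose_means` are introduced here for illustration only.

```r
library(datasets)

# Mean tooth length for each supplement/dose combination
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)

# Mean tooth length per dose, pooled over both supplements
dose_means <- tapply(ToothGrowth$len, ToothGrowth$dose, mean)
print(dose_means)
```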
=============================================================================================================================
library(datasets)
x <- ToothGrowth
summary(x)
      len        supp         dose
 Min.   : 4.2   OJ:30   Min.   :0.50
 1st Qu.:13.1   VC:30   1st Qu.:0.50
 Median :19.2           Median :1.00
 Mean   :18.8           Mean   :1.17
 3rd Qu.:25.3           3rd Qu.:2.00
 Max.   :33.9           Max.   :2.00
The ToothGrowth dataset records tooth growth (len) in 60 guinea pigs, each given one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (OJ) or ascorbic acid (VC).
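
The layout of the experiment can be checked directly from the data. The sketch below inspects the structure and cross-tabulates delivery method against dose, which should show the same number of animals in each of the six supplement/dose cells; `design` is an illustrative name.

```r
library(datasets)

# Structure of the data: 60 observations of 3 variables
str(ToothGrowth)

# Cross-tabulate delivery method against dose to see the design
design <- table(ToothGrowth$supp, ToothGrowth$dose)
print(design)
```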
=============================================================================================================================
alpha <- 0.05 # significance level for two-sided 95% z critical values
z.half.alpha <- qnorm(1 - alpha/2)
c(-z.half.alpha, z.half.alpha)
[1] -1.96 1.96
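
The tests that follow are t-tests, so their confidence intervals are built from t quantiles rather than the z values above. As a rough illustration, assuming n = 30 observations per supplement group (as in the summary output), the two critical values can be compared side by side; `z_crit` and `t_crit` are names introduced here.

```r
alpha <- 0.05
n <- 30  # assumed group size, taken from the summary output (OJ:30, VC:30)

# Two-sided critical values: normal versus t with n - 1 degrees of freedom
z_crit <- qnorm(1 - alpha / 2)
t_crit <- qt(1 - alpha / 2, df = n - 1)
print(round(c(z = z_crit, t = t_crit), 3))
```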
t.test(x$len[x$supp == "OJ"], x$len[x$supp == "VC"], paired = TRUE) # Hypothesis 1 that OJ does not improve growth more than VC
Paired t-test
data: x$len[x$supp == "OJ"] and x$len[x$supp == "VC"]
t = 3.303, df = 29, p-value = 0.00255
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
1.409 5.991
sample estimates:
mean of the differences
3.7
t.test(x$len, x$dose) # Hypothesis 2 that dosage improves growth
Welch Two Sample t-test
data: x$len and x$dose
t = 17.81, df = 59.8, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
15.66 19.63
sample estimates:
mean of x mean of y
18.813 1.167
=============================================================================================================================
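**With Hypothesis 1 (the paired t-test): the 95% confidence interval for the mean difference, 1.409 to 5.991, does not contain 0, and the p-value of 0.00255 is below 0.05, so the null hypothesis of no difference between the supplements is rejected. As run, the test indicates that OJ is associated with greater tooth growth than VC, by about 3.7 units on average. This result depends on treating the OJ and VC observations as matched pairs, an assumption the data do not obviously support.**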
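
The paired test above matches the i-th OJ row with the i-th VC row. If the two supplement groups are instead treated as independent samples (an assumption, motivated by the dataset documentation describing 60 separate animals), a Welch two-sample test is one alternative. A sketch using the formula interface of `t.test`, with `welch_supp` as an illustrative name:

```r
library(datasets)

# Welch two-sample t-test of length by supplement (unpaired, unequal variances,
# which are the defaults of t.test)
welch_supp <- t.test(len ~ supp, data = ToothGrowth)
print(welch_supp)

# Confidence interval for mean(OJ) - mean(VC)
print(welch_supp$conf.int)
```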
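**With Hypothesis 2 (the Welch two-sample t-test): the 95% confidence interval for the difference in means, 15.66 to 19.63, does not contain 0, and the p-value is below 2.2e-16, so the null hypothesis of equal means is rejected. Note that the test as run compares the mean of len (18.813) with the mean of dose (1.167), two quantities measured on different scales, so on its own it does not establish that increased dosage affects tooth growth; the boxplot is the evidence for that trend here.**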
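
Hypothesis 2 concerns how tooth length changes with dose. One way to test that directly, sketched here under the assumption that the three dose groups are independent samples, is to compare len between each pair of dose levels. No multiple-comparison adjustment is applied, and `dose_pairs`, `dose_tests`, and `dose_summary` are illustrative names.

```r
library(datasets)

# Compare tooth length between each pair of dose levels with Welch t-tests
dose_pairs <- list(c(0.5, 1), c(1, 2), c(0.5, 2))
dose_tests <- lapply(dose_pairs, function(pair) {
  subset_data <- ToothGrowth[ToothGrowth$dose %in% pair, ]
  t.test(len ~ dose, data = subset_data)
})
names(dose_tests) <- sapply(dose_pairs, paste, collapse = " vs ")

# Collect confidence limits (lower dose minus higher dose) and p-values
dose_summary <- t(sapply(dose_tests, function(tt) {
  c(lower = tt$conf.int[1], upper = tt$conf.int[2], p.value = tt$p.value)
}))
print(dose_summary)
```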
Based on these results, tooth growth appears to be greater at larger doses. These conclusions assume that the observations are independent and that tooth length is approximately normally distributed within each group. Judging from the boxplot and the design of the tests, the delivery method is also assumed to be independent of the dose size.
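
Whether the delivery method behaves the same way at every dose can be probed by comparing OJ and VC separately within each dose level. The following is only a sketch, assuming independent groups and Welch tests with no multiple-comparison adjustment; `supp_by_dose` is an illustrative name.

```r
library(datasets)

# Welch t-test of OJ versus VC within each dose level
supp_by_dose <- sapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  tt <- t.test(len ~ supp, data = d)
  c(lower = tt$conf.int[1], upper = tt$conf.int[2], p.value = tt$p.value)
})

# One column per dose; rows hold the OJ - VC confidence limits and the p-value
print(round(supp_by_dose, 4))
```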