The purpose of this short study is to analyze the ToothGrowth data set from R's datasets package. The data set records tooth growth (the length of odontoblasts) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods (orange juice or ascorbic acid), giving 10 animals per combination of dose and delivery method.
The analysis includes the following:
Basic exploratory analysis with a summary of the data
Use of confidence intervals and hypothesis tests to compare tooth growth by supplement type (supp) and dose
Conclusions of the analysis
library(ggplot2)
df <- ToothGrowth
ggplot(df, aes(x=as.factor(dose), y = len, col = supp)) + geom_point(size = 5)
summary(df)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
The exploratory plot suggests that tooth length increases with dose. Delivery by orange juice (OJ) appears to produce more tooth growth than ascorbic acid (VC) when the dose is 1.0 mg or less.
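The visual impression can be checked numerically. The following is a minimal sketch, not part of the original analysis, that tabulates the mean tooth length for each supplement and dose combination and the gap between the two delivery methods at each dose. The names `cell_means` and `gap` are illustrative.

```r
# Mean tooth length for every supplement/dose combination.
# `cell_means` and `gap` are illustrative names, not from the original analysis.
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
cell_means <- cell_means[order(cell_means$dose, cell_means$supp), ]
print(cell_means)

# Difference in mean length (OJ minus VC) at each dose level.
oj <- cell_means$len[cell_means$supp == "OJ"]
vc <- cell_means$len[cell_means$supp == "VC"]
gap <- setNames(oj - vc, sort(unique(ToothGrowth$dose)))
print(gap)
```

A positive entry in `gap` means orange juice had the higher mean length at that dose. These are descriptive differences only and say nothing about statistical significance.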
Let the null hypothesis be “the mean tooth length is the same for the two doses being compared.”
First, compare the 0.5 mg and 1.0 mg doses.
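# Keep only the 0.5 mg and 1.0 mg groups
df_05_1 <- subset(df, dose %in% c(0.5, 1))
# Welch two-sample t-test: independent (unpaired) groups, unequal variances
t.test(len ~ dose, var.equal = FALSE, data = df_05_1)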
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1
##            10.605            19.735
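The 95% confidence interval for the difference in means (-11.98, -6.28) does not include zero, and the p-value (1.268e-07) is far below 0.05. We therefore reject the null hypothesis: the mean tooth length differs between the 0.5 mg and 1.0 mg doses.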
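To make the reported numbers less of a black box, here is a minimal sketch that recomputes the Welch t statistic, the Welch-Satterthwaite degrees of freedom, the 95% confidence interval, and the two-sided p-value by hand for the 0.5 mg versus 1.0 mg comparison. The variable names are illustrative and do not appear in the original analysis.

```r
# Recompute the Welch two-sample t-test by hand (0.5 mg vs 1.0 mg).
# All variable names here are illustrative.
x <- ToothGrowth$len[ToothGrowth$dose == 0.5]
y <- ToothGrowth$len[ToothGrowth$dose == 1]

# Squared standard error of each group mean
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)
se_diff <- sqrt(se2_x + se2_y)

# Test statistic for the difference in means (0.5 mg minus 1.0 mg)
t_stat <- (mean(x) - mean(y)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# 95% confidence interval and two-sided p-value
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df_welch) * se_diff
p_value <- 2 * pt(-abs(t_stat), df_welch)

print(c(t = t_stat, df = df_welch, lower = ci[1], upper = ci[2], p = p_value))
```

The values should agree with the `t.test` output above up to rounding. The interval is negative because the difference is taken as the lower dose minus the higher dose.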
Second, compare the 0.5 mg and 2.0 mg doses, and then the 1.0 mg and 2.0 mg doses.
df_05_2 <- subset(df, dose %in% c(0.5,2))
t.test(len ~ dose, var.equal = FALSE, data = df_05_2)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2
##            10.605            26.100
df_1_2 <- subset(df, dose %in% c(1,2))
t.test(len ~ dose, var.equal = FALSE, data = df_1_2)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
##          19.735          26.100
Both comparisons yield 95% confidence intervals that exclude zero and very small p-values. Together with the first comparison, we therefore reject the null hypothesis, “the mean tooth length is the same for the two doses being compared,” for all three pairs of doses. Dose has an effect on mean tooth length: in every comparison, the higher dose has the greater mean length.
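The three pairwise tests can also be collected in one table. This is a minimal sketch under the same assumptions as the tests above (independent groups, Welch test with unequal variances). The names `dose_pairs` and `results` are illustrative and not part of the original analysis.

```r
# Run the Welch t-test for every pair of dose levels and collect the results.
# `dose_pairs` and `results` are illustrative names.
dose_pairs <- combn(sort(unique(ToothGrowth$dose)), 2, simplify = FALSE)

results <- do.call(rbind, lapply(dose_pairs, function(pair) {
  low  <- ToothGrowth$len[ToothGrowth$dose == pair[1]]
  high <- ToothGrowth$len[ToothGrowth$dose == pair[2]]
  tt <- t.test(low, high)  # Welch test is the default (var.equal = FALSE)
  data.frame(
    dose_low  = pair[1],
    dose_high = pair[2],
    ci_lower  = tt$conf.int[1],
    ci_upper  = tt$conf.int[2],
    p_value   = tt$p.value
  )
}))
print(results)
```

Each row corresponds to one of the comparisons reported above, with the difference taken as the lower dose minus the higher dose.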