Introduction

The purpose of this short study is to analyze the ToothGrowth data set that ships with R (in the datasets package). The data set contains 60 observations of tooth length: 10 guinea pigs at each combination of three dose levels of Vitamin C (0.5, 1, and 2 mg) and two delivery methods (orange juice or ascorbic acid).

The analysis includes the following:

  1. Basic exploratory analysis with summary

  2. Use of confidence intervals and hypothesis tests to compare tooth growth by supp and dose

  3. Conclusions of the analysis

Analysis

  1. Exploratory Analysis and Summary of the data set
library(ggplot2)
df <- ToothGrowth
ggplot(df, aes(x=as.factor(dose), y = len, col = supp)) + geom_point(size = 5)

summary(df)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

The exploratory plot suggests that tooth length increases with dose. Delivery by orange juice (OJ) appears to be associated with greater tooth growth than ascorbic acid (VC) at doses of 1.0 mg or less.
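
The visual impression can be checked numerically. The following is a minimal sketch, not part of the original analysis, that tabulates the mean tooth length for each supplement and dose combination using base R's `aggregate`; the names `group_means`, `oj`, and `vc` are introduced here for illustration only.

```r
# Mean tooth length for each supplement/dose combination
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)

# Difference in group means (OJ minus VC) at each dose level.
# aggregate() returns rows ordered by supp within dose, so both
# vectors below are ordered by increasing dose.
oj <- group_means$len[group_means$supp == "OJ"]
vc <- group_means$len[group_means$supp == "VC"]
print(data.frame(dose = unique(group_means$dose), oj_minus_vc = oj - vc))
```

A positive `oj_minus_vc` at a given dose would be consistent with the pattern seen in the plot, although a difference in means alone says nothing about statistical significance.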

  2. Use of confidence intervals and hypothesis tests to compare tooth growth by supp and dose

For each comparison, the null hypothesis is “the mean tooth length is the same for the two dosages.”

First, compare the 0.5 mg and 1.0 mg doses.

df_05_1 <- subset(df, dose %in% c(0.5,1))
t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = df_05_1)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1 
##            10.605            19.735

The 95% confidence interval (-11.98, -6.28) does not include zero, and the p-value (1.268e-07) is far below 0.05. We therefore reject the null hypothesis for this pair of doses.
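
To make clear where these numbers come from, the Welch statistic, its degrees of freedom (the Welch-Satterthwaite approximation), and the confidence interval can be recomputed by hand. This is an illustrative sketch rather than part of the original report; the variable names (`x`, `y`, `vx`, `vy`, `t_stat`, `df_welch`, `ci`, `p_value`) are chosen here for the example.

```r
# Manual Welch two-sample t-test for the 0.5 mg vs 1.0 mg doses
x <- ToothGrowth$len[ToothGrowth$dose == 0.5]
y <- ToothGrowth$len[ToothGrowth$dose == 1]

# Variance of each sample mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Standard error of the difference in means
se <- sqrt(vx + vy)

# Test statistic and Welch-Satterthwaite degrees of freedom
t_stat   <- (mean(x) - mean(y)) / se
df_welch <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value and 95% confidence interval for the difference
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci      <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df_welch) * se

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

The values should agree with the `t.test` output above up to rounding.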

Second, compare the 0.5 mg and 2.0 mg doses, and then the 1.0 mg and 2.0 mg doses.

df_05_2 <- subset(df, dose %in% c(0.5,2))
t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = df_05_2)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100

df_1_2 <- subset(df, dose %in% c(1,2))
t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = df_1_2)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2 
##          19.735          26.100
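
The section heading also mentions a comparison by supp, which the tests above do not cover. As a hedged sketch of how that comparison could be set up (it is not part of the original analysis, and no results are claimed here), the same Welch test can be run for OJ versus VC within each dose level. The name `by_dose` is hypothetical.

```r
# Welch t-test of OJ vs VC, run separately within each dose level
by_dose <- lapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = d)
})

# Collect the p-value and 95% confidence interval for each dose
supp_summary <- t(sapply(by_dose, function(res) {
  c(ci_lower = res$conf.int[1],
    ci_upper = res$conf.int[2],
    p_value  = res$p.value)
}))
print(supp_summary)
```

Each row of `supp_summary` corresponds to one dose; a confidence interval that excludes zero would indicate a difference between delivery methods at that dose.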

Conclusion

All three comparisons give 95% confidence intervals that do not include zero, along with very low p-values. Therefore, we can conclude that the null hypothesis, “the mean tooth length is the same for the two dosages,” needs to be rejected for every pair of doses. The dosage has an influence on the average tooth length, with higher doses associated with longer teeth.
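
The dose comparisons behind this conclusion can be reproduced in a single pass. The following sketch, written under the assumption that only base R is available, loops over every pair of doses and collects the confidence intervals and p-values into one table; `dose_pairs` and `results` are illustrative names.

```r
# All pairwise dose comparisons with Welch's t-test, gathered in one table
dose_pairs <- combn(c(0.5, 1, 2), 2, simplify = FALSE)

results <- do.call(rbind, lapply(dose_pairs, function(pair) {
  d   <- ToothGrowth[ToothGrowth$dose %in% pair, ]
  res <- t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = d)
  data.frame(low_dose  = pair[1],
             high_dose = pair[2],
             ci_lower  = res$conf.int[1],
             ci_upper  = res$conf.int[2],
             p_value   = res$p.value)
}))
print(results)

# TRUE for a row when its 95% confidence interval excludes zero
print(results$ci_lower > 0 | results$ci_upper < 0)
```

Because three tests are run on the same data, a reader may also wish to consider a multiple-comparison adjustment; none is applied in the analysis here.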