Overview

This project analyses the ToothGrowth data in the R datasets package. The structure of this analysis is as follows:

  1. Basic exploratory data analysis
  2. Basic summary of the data
  3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose
  4. Conclusion

Exploratory Data Analysis

The data set records the effect of vitamin C on tooth growth in guinea pigs. The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

pairs(ToothGrowth, main = "ToothGrowth data", pch = 21, bg = c("red", "green"))

coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth, pch = 21, bg = c("red", "green"),
       xlab = "ToothGrowth data: length vs dose, given type of supplement")

library(ggplot2)
# Treat dose as a factor so that each dose level gets its own box
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
ggplot(ToothGrowth, aes(x = dose, y = len)) + geom_boxplot(aes(fill = dose)) + facet_grid(~ supp) +
  xlab("Dose Amount") + ylab("Tooth Length") + ggtitle("Length vs Dose by type of supplement")

Data Summary

knitr::kable(summary(ToothGrowth))
      len          supp      dose
 Min.   : 4.20    OJ:30    0.5:20
 1st Qu.:13.07    VC:30    1  :20
 Median :19.25             2  :20
 Mean   :18.81
 3rd Qu.:25.27
 Max.   :33.90
library(dplyr)
knitr::kable(ToothGrowth %>% group_by(dose, supp) %>% summarize(mean = mean(len), sd = sd(len)))
 dose   supp    mean         sd
 0.5    OJ     13.23   4.459708
 0.5    VC      7.98   2.746634
 1      OJ     22.70   3.910953
 1      VC     16.77   2.515309
 2      OJ     26.06   2.655058
 2      VC     26.14   4.797731

Comparison of tooth growth

Assumptions for hypothesis testing

    Observations are independent and identically distributed (i.i.d.) within each group

    Variances of growth differ between supplements and dosages

    Tooth growth follows a normal distribution
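
The variance and normality assumptions can be checked informally. The following is a minimal sketch that is not part of the original analysis: it computes the sample variance and a Shapiro-Wilk p-value for each supplement-by-dose group. The names group_var and shapiro_p are illustrative, and with only 10 observations per group the normality test has little power, so the results are a rough guide at best.

```r
# Reload the built-in data set so that dose is numeric
data(ToothGrowth)

# Sample variance of tooth length in each supplement-by-dose group
# (rows: supplement, columns: dose)
group_var <- with(ToothGrowth, tapply(len, list(supp, dose), var))

# Shapiro-Wilk p-value per group; small values would cast doubt on normality
shapiro_p <- with(ToothGrowth,
                  tapply(len, list(supp, dose),
                         function(x) shapiro.test(x)$p.value))

print(round(group_var, 2))
print(round(shapiro_p, 3))
```

Visibly unequal group variances would be consistent with the choice of var.equal=FALSE (Welch's test) in the comparisons that follow.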
    

Supplement comparison

Null hypothesis: no difference in growth between the two supplements

Alternative hypothesis: higher growth with orange juice

t.test(len~supp,data=ToothGrowth, alternative="greater", paired=FALSE, var.equal=FALSE, conf.level=0.95)

    Welch Two Sample t-test

data:  len by supp
t = 1.9153, df = 55.309, p-value = 0.03032
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 0.4682687       Inf
sample estimates:
mean in group OJ mean in group VC 
        20.66333         16.96333 

With a p-value of 0.03032, below the 0.05 significance level, the null hypothesis can be rejected. If the null hypothesis were true, there would be roughly a 3% chance of observing a difference in mean growth at least this large.

The test therefore supports orange juice as the better growth promoter.
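
To make the link between the reported statistic and the p-value explicit, the sketch below recomputes the one-sided p-value and the lower confidence bound from the numbers printed in the test output. The variable names are illustrative, and because the inputs are the rounded values shown above, the results agree with the output only to a few decimal places.

```r
# Values copied from the Welch test output above
t_stat   <- 1.9153
df_welch <- 55.309

# One-sided p-value: probability of a t statistic at least this large
# if the true difference in means were zero
p_one_sided <- pt(t_stat, df = df_welch, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval for mean(OJ) - mean(VC)
diff_means  <- 20.66333 - 16.96333
std_err     <- diff_means / t_stat          # standard error implied by t
lower_bound <- diff_means - qt(0.95, df = df_welch) * std_err

print(p_one_sided)
print(lower_bound)
```

Because the lower bound is above zero, the interval excludes a zero difference, which is the same conclusion as the p-value falling below 0.05.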

Dosage comparison

Null hypothesis: no difference in growth between dosages

Alternative hypothesis: higher growth with higher dosage

# Reload the data set so that dose is numeric again (it was converted to a factor for plotting)
data(ToothGrowth)
data <- ToothGrowth %>% filter(dose < 2)

t.test(len~dose,data=data, alternative="less", paired=FALSE, var.equal=FALSE, conf.level=0.95)

    Welch Two Sample t-test

data:  len by dose
t = -6.4766, df = 37.986, p-value = 6.342e-08
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
      -Inf -6.753323
sample estimates:
mean in group 0.5   mean in group 1 
           10.605            19.735 

With the above p-value, the null hypothesis is rejected for 0.5 versus 1 mg/day.

data <- ToothGrowth %>% filter(dose >0.5)

t.test(len~dose,data=data, alternative="less", paired=FALSE, var.equal=FALSE, conf.level=0.95)

    Welch Two Sample t-test

data:  len by dose
t = -4.9005, df = 37.101, p-value = 9.532e-06
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
     -Inf -4.17387
sample estimates:
mean in group 1 mean in group 2 
         19.735          26.100 

The null hypothesis is also rejected with the above p-value for 1 versus 2 mg/day.

These very small p-values support the alternative hypothesis that a higher dosage gives higher growth.
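
The two adjacent-dose comparisons can also be run through one small helper. The sketch below uses base R only; the function compare_doses and the Holm adjustment are illustrative additions, not part of the original analysis. The adjustment is one optional way to account for running more than one test on the same data.

```r
# Reload the built-in data set so that dose is numeric
data(ToothGrowth)

# Hypothetical helper: one-sided Welch test that the lower dose gives
# less growth than the higher dose; returns the p-value
compare_doses <- function(low, high, dat = ToothGrowth) {
  pair <- dat[dat$dose %in% c(low, high), ]
  t.test(len ~ dose, data = pair,
         alternative = "less", var.equal = FALSE)$p.value
}

p_values <- c("0.5 vs 1" = compare_doses(0.5, 1),
              "1 vs 2"   = compare_doses(1, 2))

# Optional Holm adjustment for the two comparisons
p_holm <- p.adjust(p_values, method = "holm")

print(p_values)
print(p_holm)
```

If the adjusted p-values remain far below 0.05, the conclusion about dosage does not depend on whether a correction is applied.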

Supplement comparison at 2 mg/day

The two supplements appear to give similar growth at this dosage, so a separate test is required to verify whether they differ.

Null hypothesis: no difference in growth between the supplements at the 2 mg/day dosage

Alternative hypothesis: a difference in growth between the supplements at the 2 mg/day dosage

data <- ToothGrowth %>% filter(dose ==2)

t.test(len~supp,data=data, alternative="two.sided", paired=FALSE, var.equal=FALSE, conf.level=0.95)

    Welch Two Sample t-test

data:  len by supp
t = -0.046136, df = 14.04, p-value = 0.9639
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.79807  3.63807
sample estimates:
mean in group OJ mean in group VC 
           26.06            26.14 

The null hypothesis cannot be rejected with the above p-value. The data do not provide enough evidence to support the alternative hypothesis.
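
As a cross-check, the Welch statistic, degrees of freedom, p-value and confidence interval can be reproduced from the group means and standard deviations listed in the Data Summary table. This is a sketch under the assumption of 10 animals per supplement-by-dose group (60 animals over 6 groups); the variable names are illustrative.

```r
# Summary statistics for the 2 mg/day groups, from the Data Summary table
n       <- 10                       # assumed animals per group
mean_oj <- 26.06; sd_oj <- 2.655058
mean_vc <- 26.14; sd_vc <- 4.797731

# Squared standard errors of the two group means
v_oj <- sd_oj^2 / n
v_vc <- sd_vc^2 / n

# Welch t statistic for mean(OJ) - mean(VC)
std_err <- sqrt(v_oj + v_vc)
t_stat  <- (mean_oj - mean_vc) / std_err

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_oj + v_vc)^2 / (v_oj^2 / (n - 1) + v_vc^2 / (n - 1))

# Two-sided p-value and 95% confidence interval
p_two_sided <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (mean_oj - mean_vc) + c(-1, 1) * qt(0.975, df = df_welch) * std_err

print(c(t = t_stat, df = df_welch, p = p_two_sided))
print(ci)
```

The interval is centred close to zero and spans several units in either direction, which is why the test cannot distinguish the two supplements at this dosage.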

Conclusions

With the present analysis we can infer that:

  1. Orange juice is a better growth promoter in guinea pig teeth.
  2. Tooth growth increases with the amount of supplement over the dosages tested.
  3. At the highest dosage (2 mg/day), no difference in growth between the supplements is detected.
