Synopsis

The aim of this project is to analyse the ToothGrowth data:

-provide a basic summary of the data,

-use hypothesis tests to compare tooth growth by supp (orange juice (OJ) or ascorbic acid (VC)) and dose (0.5, 1 or 2 mg).

Analysis and data processing

First, the ToothGrowth data is loaded and an exploratory boxplot is drawn:

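data(ToothGrowth)
# Treat dose as a categorical variable so that each level gets its own box
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

library(ggplot2)
# Tooth length by dose, one panel per supplement type
ggplot(ToothGrowth, aes(x = dose, y = len)) +
    ylab("Tooth length") +
    geom_boxplot() +
    facet_wrap(~ supp) +
    theme_minimal()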

Figure: boxplots of tooth length by dose, one panel per supplement (OJ and VC).

Second, the mean, standard deviation and variance of tooth length are calculated for each supplement:

library(plyr)
ddply(ToothGrowth,.(supp),summarise,Mean=mean(len),Sd=sd(len), Var=var(len))
##   supp  Mean    Sd   Var
## 1   OJ 20.66 6.606 43.63
## 2   VC 16.96 8.266 68.33

Means, standard deviations and variances for each dose level are calculated:

ddply(ToothGrowth,.(dose),summarise,Mean=mean(len),Sd=sd(len), Var=var(len))
##   dose  Mean    Sd   Var
## 1  0.5 10.61 4.500 20.25
## 2    1 19.73 4.415 19.50
## 3    2 26.10 3.774 14.24

Means, standard deviations and variances for each dose level in each supplement are calculated:

ddply(ToothGrowth,.(supp, dose),summarise,Mean=mean(len),Sd=sd(len), Var=var(len))
##   supp dose  Mean    Sd    Var
## 1   OJ  0.5 13.23 4.460 19.889
## 2   OJ    1 22.70 3.911 15.296
## 3   OJ    2 26.06 2.655  7.049
## 4   VC  0.5  7.98 2.747  7.544
## 5   VC    1 16.77 2.515  6.327
## 6   VC    2 26.14 4.798 23.018
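
The grouped summaries above rely on plyr. As a minimal sketch, assuming only base R is available, the same per-group statistics can be obtained with aggregate(). The names tg, summary_stats and by_supp_dose are illustrative and not part of the original analysis.

```r
# Work on a private copy so the session's ToothGrowth object is left untouched.
tg <- datasets::ToothGrowth

# Mean, standard deviation and variance of a numeric vector.
summary_stats <- function(x) c(Mean = mean(x), Sd = sd(x), Var = var(x))

# One row per supplement/dose combination.
by_supp_dose <- aggregate(len ~ supp + dose, data = tg, FUN = summary_stats)

# aggregate() stores the three statistics as a matrix column; flatten it
# into ordinary columns named Mean, Sd and Var.
by_supp_dose <- cbind(by_supp_dose[c("supp", "dose")], by_supp_dose$len)
print(by_supp_dose)
```

Dropping dose (or supp) from the right-hand side of the formula gives the per-supplement (or per-dose) summary instead.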

The boxplot and the group means show that mean tooth length in guinea pigs increases with the Vitamin C dose. There also appears to be a difference between the delivery methods.

To check whether these differences are statistically significant, 10 t-tests are conducted. The code for the t-tests is in the appendix. The null hypothesis is that there is no difference in mean tooth length between delivery methods or doses; the alternative hypothesis is that the means differ. Because of the multiple comparisons, the p-values are adjusted with the Benjamini & Yekutieli (BY) correction, which controls the false discovery rate (FDR). BY is chosen because the t-tests are not independent of each other (the same groups appear in several comparisons), whereas the Benjamini & Hochberg correction assumes independent or positively dependent tests. The Bonferroni correction remains valid under dependence, but it controls the family-wise error rate rather than the FDR and is more conservative. The test results are as follows:
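
To make the BY adjustment less of a black box, here is a minimal sketch that computes it by hand and compares it with p.adjust(). The p-values in p_raw are hypothetical and are not the ones from this analysis; by_adjust is an illustrative helper name.

```r
# Illustrative raw p-values (hypothetical, NOT taken from this analysis).
p_raw <- c(0.001, 0.008, 0.020, 0.041, 0.300)

by_adjust <- function(p) {
    m <- length(p)
    c_m <- sum(1 / seq_len(m))         # harmonic sum: the extra penalty for dependence
    o <- order(p, decreasing = TRUE)   # largest p-value first
    scaled <- c_m * m / (m:1) * p[o]   # p_(i) * m * c(m) / i for ranks i = m, ..., 1
    adj <- pmin(1, cummin(scaled))     # enforce monotonicity and cap at 1
    adj[order(o)]                      # restore the input order
}

p_by <- by_adjust(p_raw)
print(cbind(raw = p_raw, BY = p_by, builtin = p.adjust(p_raw, method = "BY")))
```

Without the factor c_m the same computation would be the Benjamini & Hochberg adjustment, so BY-adjusted p-values are never smaller than the BH-adjusted ones.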

##              Groups Pvalues    BY
## 1          OJ vs VC   0.003 0.012
## 2          0.5 vs 1   0.000 0.000
## 3            1 vs 2   0.000 0.002
## 4    0.5 OJ vs 1 OJ   0.002 0.012
## 5      1 OJ vs 2 OJ   0.084 0.273
## 6    0.5 VC vs 1 VC   0.000 0.002
## 7      1 VC vs 2 VC   0.000 0.003
## 8  0.5 OJ vs 0.5 VC   0.015 0.057
## 9      1 OJ vs 1 VC   0.008 0.034
## 10     2 OJ vs 2 VC   0.967 1.000

As the corrected results show, only tests no. 5, 8 and 10 are not significant at the 5% significance level. Before the Benjamini & Yekutieli correction, test no. 8 was significant. The corrected p-values are used to interpret the results.
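
The reading of the table can be reproduced mechanically. The sketch below re-enters the rounded p-values reported in the table above and flags each test at the 5% level; the data frame name reported and its column names are illustrative.

```r
# Raw and BY-adjusted p-values as reported in the results table (rounded to 3 decimals).
reported <- data.frame(
    test = 1:10,
    raw  = c(0.003, 0.000, 0.000, 0.002, 0.084, 0.000, 0.000, 0.015, 0.008, 0.967),
    by   = c(0.012, 0.000, 0.002, 0.012, 0.273, 0.002, 0.003, 0.057, 0.034, 1.000)
)

alpha <- 0.05
reported$sig_raw <- reported$raw < alpha   # significant before correction
reported$sig_by  <- reported$by < alpha    # significant after BY correction

# Tests that are not significant after the correction.
not_sig_after <- reported$test[!reported$sig_by]
# Tests that were significant only before the correction.
lost <- reported$test[reported$sig_raw & !reported$sig_by]

print(not_sig_after)
print(lost)
```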

Conclusions and assumptions

The conclusions from the data analysis are as follows:

-At higher dose levels the guinea pigs' mean tooth length was higher (at 2 mg the mean length was 26.1, compared to 10.61 at 0.5 mg). The only exception was found when orange juice was the delivery method: mean tooth length did not differ significantly between the 1 mg and 2 mg doses of Vitamin C (the means were 22.7 and 26.06 respectively).

-When orange juice was the Vitamin C delivery method, mean tooth length was greater than with ascorbic acid only at the 1 mg dose (the means were 22.7 and 16.77 respectively).

The assumptions for this analysis are:

-each guinea pig was assigned to a combination of dosage and supplement type in such a way that the t-tests could use the dependent (paired) samples methodology,

-the sample of 60 guinea pigs is representative of all guinea pigs (they were randomly picked from the population), so conclusions about the population can be drawn from the sample,

-variance is unequal in all groups.
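
The first and third assumptions correspond to arguments of t.test(). As a hedged sketch, the code below runs the supplement comparison twice: once in the dependent-samples form used in this analysis, and once as an independent-samples Welch test, which is not the method used here and is shown only for comparison. Note that the unequal-variance setting (var.equal = FALSE) only has an effect in the independent-samples form. The names tg, oj, vc, p_paired and p_welch are illustrative.

```r
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Dependent-samples test, as in this analysis: observations are paired by position.
p_paired <- t.test(oj, vc, paired = TRUE)$p.value

# Independent-samples Welch test (NOT used in this analysis):
# var.equal = FALSE, the default, does not assume equal variances.
p_welch <- t.test(oj, vc, paired = FALSE, var.equal = FALSE)$p.value

print(c(paired = p_paired, welch = p_welch))
```

The two forms can give noticeably different p-values, so the conclusions depend on the pairing assumption being justified.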

Appendix

Code for t-tests:

# Tooth lengths for one supplement and/or dose level (NULL selects all).
len_of <- function(supp = NULL, dose = NULL) {
    keep <- rep(TRUE, nrow(ToothGrowth))
    if (!is.null(supp)) keep <- keep & ToothGrowth$supp == supp
    if (!is.null(dose)) keep <- keep & ToothGrowth$dose == dose
    ToothGrowth$len[keep]
}

# p-value of a paired t-test; both groups must have the same size.
paired_p <- function(x, y) t.test(x, y, paired = TRUE)$p.value

# The two-vector form is used throughout because recent R versions
# (4.4.0 and later) reject paired = TRUE in the formula interface of t.test().
pvalues <- c(
    paired_p(len_of(supp = "OJ"), len_of(supp = "VC")),
    paired_p(len_of(dose = "0.5"), len_of(dose = "1")),
    paired_p(len_of(dose = "1"), len_of(dose = "2")),
    paired_p(len_of("OJ", "0.5"), len_of("OJ", "1")),
    paired_p(len_of("OJ", "1"), len_of("OJ", "2")),
    paired_p(len_of("VC", "0.5"), len_of("VC", "1")),
    paired_p(len_of("VC", "1"), len_of("VC", "2")),
    paired_p(len_of("OJ", "0.5"), len_of("VC", "0.5")),
    paired_p(len_of("OJ", "1"), len_of("VC", "1")),
    paired_p(len_of("OJ", "2"), len_of("VC", "2"))
)

Groups=c("OJ vs VC", "0.5 vs 1", "1 vs 2", "0.5 OJ vs 1 OJ", "1 OJ vs 2 OJ", 
          "0.5 VC vs 1 VC", "1 VC vs 2 VC", "0.5 OJ vs 0.5 VC", 
          "1 OJ vs 1 VC","2 OJ vs 2 VC")

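# BY-adjusted p-values, rounded for display
BY <- round(p.adjust(pvalues, method = "BY"), 3)
# Raw p-values in fixed notation
Pvalues <- format(round(pvalues, 3), scientific = FALSE)
# Combine labels, raw p-values and adjusted p-values into one table and print it
results <- data.frame(Groups, Pvalues, BY)
results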