In this document we analyse the ToothGrowth data set from the R package ‘datasets’.
The ToothGrowth data record the tooth growth response (length) of guinea pigs at three dose levels of Vitamin C (0.5, 1, and 2 mg) with two delivery methods (orange juice or ascorbic acid), with 10 guinea pigs for each combination of dose and delivery method (60 observations in total).
We will analyse tooth growth and compare the effect of the supplement type at each dose level.
Because we make multiple comparisons, we will also adjust our p-values to limit false positives (Type-1 errors).
Loading the data
## Loading required libraries
library(datasets)
library(ggplot2)
library(dplyr)
library(magrittr)
library(knitr)
tgdata<-ToothGrowth
## Converting 'dose' to a factor, as there are only three dose levels
tgdata$dose<-as.factor(tgdata$dose)
Here we can see the basic summary of the data.
summary(tgdata)
##       len        supp     dose
##  Min.   : 4.20   OJ:30   0.5:20
##  1st Qu.:13.07   VC:30   1  :20
##  Median :19.25           2  :20
##  Mean   :18.81
##  3rd Qu.:25.27
##  Max.   :33.90
Let’s also visualise the data with a plot of the individual observations and the mean length per dose for each supplement.
ggplot(tgdata,aes(x=dose,y=len,group=supp,color=supp))+
  # 'fun' replaces the deprecated 'fun.y' argument (ggplot2 >= 3.3.0)
  geom_line(stat="summary",fun="mean")+geom_point()+
  labs(x="Dose",y="Length",color="Supplement",title="Tooth Growth By Dose and Supplement")
Here we compare the effect of supplement type (supp) on length (len) at each dose level (dose).
In this section we split the data by dose level so that a separate t-test can be run on each subset.
For each of the dose levels ‘0.5 mg’, ‘1 mg’ and ‘2 mg’ we subset the data and then create two groups, one per supplement type.
#Sub-setting 0.5 mg dose level data
halfmg<-tgdata[tgdata$dose==0.5,]
halfmg_summ<-data.frame(halfmg %>% group_by(supp) %>% summarise(mean=mean(len),sd=sd(len)))
vc0.5mg<-halfmg[halfmg$supp=='VC',]
oj0.5mg<-halfmg[halfmg$supp=='OJ',]
#Sub-setting 1 mg dose level data
onemg<-tgdata[tgdata$dose==1,]
onemg_summ<-data.frame(onemg %>% group_by(supp) %>% summarise(mean=mean(len),sd=sd(len)))
vc1mg<-onemg[onemg$supp=='VC',]
oj1mg<-onemg[onemg$supp=='OJ',]
#Sub-setting 2 mg dose level data
twomg<-tgdata[tgdata$dose==2,]
twomg_summ<-data.frame(twomg %>% group_by(supp) %>% summarise(mean=mean(len),sd=sd(len)))
vc2mg<-twomg[twomg$supp=='VC',]
oj2mg<-twomg[twomg$supp=='OJ',]
Let’s see the basic summary of these subsets of data that we have extracted.
Mean and Standard Deviation for Dose Level ‘0.5 mg’
| supp | mean | sd |
|---|---|---|
| OJ | 13.23 | 4.459708 |
| VC | 7.98 | 2.746634 |
Mean and Standard Deviation for Dose Level ‘1 mg’
| supp | mean | sd |
|---|---|---|
| OJ | 22.70 | 3.910953 |
| VC | 16.77 | 2.515309 |
Mean and Standard Deviation for Dose Level ‘2 mg’
| supp | mean | sd |
|---|---|---|
| OJ | 26.06 | 2.655058 |
| VC | 26.14 | 4.797731 |
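The three per-dose summaries above can also be produced in a single call. The following is a minimal sketch using base R's aggregate(); the object name summ is an assumption made for illustration and is not part of the original analysis.

```r
# Minimal sketch: mean and standard deviation of len for every
# dose/supplement combination in one call (base R only).
library(datasets)

summ <- aggregate(len ~ supp + dose, data = ToothGrowth,
                  FUN = function(x) c(mean = mean(x), sd = sd(x)))

# aggregate() stores the two statistics in one matrix column;
# do.call(data.frame, ...) flattens it into ordinary columns.
summ <- do.call(data.frame, summ)
names(summ) <- c("supp", "dose", "mean", "sd")

print(summ)
```

Each row of summ corresponds to one row of the three tables above, so the values can be checked against them directly.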
The hypotheses that we test on each of the three subsets are as follows.
Let \(\mu_{oj}\) be the mean of the population given supplement ‘OJ’ and \(\mu_{vc}\) be the mean of the population given supplement ‘VC’.
Our null and alternative hypotheses are then
\(H_0 : \mu_{oj}\leq\mu_{vc}\) i.e. supplement ‘OJ’ has an effect on length less than or equal to that of ‘VC’
\(H_a : \mu_{oj}>\mu_{vc}\) i.e. supplement ‘OJ’ has a greater effect on length than ‘VC’
We run a one-sided Welch t-test of these hypotheses on each of the three subsets and store the results.
t0.5test<-t.test(oj0.5mg$len,vc0.5mg$len,paired = FALSE,var.equal = FALSE,alternative = "greater")
t1test<-t.test(oj1mg$len,vc1mg$len,paired = FALSE,var.equal = FALSE,alternative = "greater")
t2test<-t.test(oj2mg$len,vc2mg$len,paired = FALSE,var.equal = FALSE,alternative = "greater")
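To make explicit what these calls compute, the sketch below reproduces Welch's t statistic, the Welch-Satterthwaite degrees of freedom and the one-sided p-value by hand for the 0.5 mg dose and compares them with t.test. The helper welch_by_hand is hypothetical and written only for this illustration.

```r
library(datasets)

# Hypothetical helper (not part of the original analysis): Welch's t
# statistic, Welch-Satterthwaite degrees of freedom and the one-sided
# p-value for the alternative mean(x) > mean(y).
welch_by_hand <- function(x, y) {
  vx <- var(x) / length(x)   # squared standard error of mean(x)
  vy <- var(y) / length(y)   # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p_greater <- pt(t_stat, df, lower.tail = FALSE)
  c(t = t_stat, df = df, p = p_greater)
}

half <- subset(ToothGrowth, dose == 0.5)
oj <- half$len[half$supp == "OJ"]
vc <- half$len[half$supp == "VC"]

by_hand <- welch_by_hand(oj, vc)
builtin <- t.test(oj, vc, var.equal = FALSE, alternative = "greater")

print(by_hand)
print(c(builtin$statistic, builtin$parameter, p = builtin$p.value))
```

The two printed vectors should agree, which confirms that var.equal = FALSE selects the unpooled (Welch) standard error rather than a pooled one.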
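We will also compare the t statistic from each ‘t.test’ with the 97.5% quantile of the t distribution. For these one-sided tests that corresponds to a 2.5% significance level, which is stricter than the usual 5% level (whose critical value would be the 95% quantile).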
t0.5comp97.5<-t0.5test$statistic>qt(0.975,t0.5test$parameter)
t1comp97.5<-t1test$statistic>qt(0.975,t1test$parameter)
t2comp97.5<-t2test$statistic>qt(0.975,t2test$parameter)
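A quantile comparison and a p-value comparison are two views of the same decision: for a one-sided "greater" test, the statistic exceeds the 1 - alpha quantile exactly when the p-value is below alpha. The sketch below checks this for the 1 mg dose at two significance levels; the names one, tt and res are assumptions made for illustration.

```r
library(datasets)

one <- subset(ToothGrowth, dose == 1)
# supp has levels OJ, VC, so "greater" tests mean(OJ) > mean(VC)
tt <- t.test(len ~ supp, data = one, alternative = "greater")

alpha <- c(0.05, 0.025)                    # one-sided significance levels
crit <- qt(1 - alpha, df = tt$parameter)   # matching critical values

res <- data.frame(
  alpha = alpha,
  critical = crit,
  reject_by_quantile = unname(tt$statistic) > crit,
  reject_by_pvalue = tt$p.value < alpha
)
print(res)
```

The two reject columns agree row by row, so the quantile check adds no information beyond the p-value; it only fixes the significance level in advance.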
Since we are making multiple comparisons, the chance of at least one Type-1 error (false positive) increases, so it is a good idea to adjust the p-values obtained from all three tests. Here we use the Benjamini-Hochberg method, which controls the false discovery rate.
padjusted<-p.adjust(c(t0.5test$p.value,t1test$p.value,t2test$p.value),method="BH")
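The Benjamini-Hochberg adjustment itself is short enough to write out. The sketch below applies it by hand to the rounded raw p-values this analysis reports for the three doses and compares the result with p.adjust; the function bh_by_hand is a hypothetical helper for illustration only.

```r
# Rounded raw p-values reported in this analysis for 0.5, 1 and 2 mg
p <- c(0.0031793, 0.0005192, 0.5180742)

# Hypothetical helper: Benjamini-Hochberg adjustment written out step by step
bh_by_hand <- function(p) {
  n <- length(p)
  ord <- order(p, decreasing = TRUE)   # largest p-value first
  scaled <- p[ord] * n / (n:1)         # p * n / rank (rank n for the largest)
  adj <- pmin(1, cummin(scaled))       # enforce monotonicity, cap at 1
  adj[order(ord)]                      # restore the original order
}

by_hand <- bh_by_hand(p)
builtin <- p.adjust(p, method = "BH")

print(by_hand)
print(builtin)
```

With only three tests the adjustment is mild: the smallest p-value is multiplied by 3, the middle one by 3/2, and the largest is left unchanged.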
Now we combine the results of all the t-tests across dose levels into one table.
dose<-c(0.5,1,2)
p.values<-c(t0.5test$p.value,t1test$p.value,t2test$p.value)
tComp97.5<-c(t0.5comp97.5,t1comp97.5,t2comp97.5)
results<-data.frame(dose,p.values,tComp97.5,padjusted)
names(results)<-c("Dose (mg)","P-Values (H0)","tcalc>97.5","P-Adjusted")
kable(results,format = "markdown")
| Dose (mg) | P-Values (H0) | tcalc>97.5 | P-Adjusted |
|---|---|---|---|
| 0.5 | 0.0031793 | TRUE | 0.0047690 |
| 1.0 | 0.0005192 | TRUE | 0.0015576 |
| 2.0 | 0.5180742 | FALSE | 0.5180742 |
For the 2 mg dose the p-value of the one-sided test is large (0.518), so we cannot reject the null hypothesis that OJ has an effect less than or equal to that of VC. A large p-value is not in itself evidence for the null hypothesis, so it is better to follow up with a two-sided test that directly compares the two means.
For this new test our hypotheses are
\(H_0 : \mu_{oj}=\mu_{vc}\) i.e. supplement ‘OJ’ has equal effect on length as ‘VC’
\(H_a : \mu_{oj}\neq\mu_{vc}\) i.e. supplement ‘OJ’ and ‘VC’ have different effects on length.
We run a two-sided t-test of these hypotheses and repeat the quantile comparison, this time on the absolute value of the t statistic.
t2testextended<-t.test(oj2mg$len,vc2mg$len,paired = FALSE,var.equal = FALSE)
t2testextended
##
## Welch Two Sample t-test
##
## data: oj2mg$len and vc2mg$len
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean of x mean of y
## 26.06 26.14
abs(t2testextended$statistic)>qt(0.975,t2testextended$parameter)
## t
## FALSE
Here the p-value (0.9639) is very high and the 95% confidence interval (-3.80 to 3.64) contains 0, so we fail to reject the null hypothesis: the data show no detectable difference between the effects of supplements ‘OJ’ and ‘VC’ at the ‘2 mg’ dose level.
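The 95 percent confidence interval in the output above can be rebuilt from the same ingredients as the test statistic. The following is a sketch under the Welch (unequal variance) assumptions already used in this analysis; the variable names are chosen for illustration.

```r
library(datasets)

two <- subset(ToothGrowth, dose == 2)
oj <- two$len[two$supp == "OJ"]
vc <- two$len[two$supp == "VC"]

vx <- var(oj) / length(oj)   # squared standard error of mean(oj)
vy <- var(vc) / length(vc)   # squared standard error of mean(vc)
se <- sqrt(vx + vy)          # standard error of the difference in means
df <- (vx + vy)^2 / (vx^2 / (length(oj) - 1) + vy^2 / (length(vc) - 1))

# difference in means plus/minus the 97.5% t quantile times the standard error
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df) * se

print(ci)
print(as.numeric(t.test(oj, vc, var.equal = FALSE)$conf.int))
```

Because the interval spans zero, the two-sided test at the 5% level cannot reject equality of the means, which matches the p-value reported above.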
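From the results we can conclude the following about our data.
Do delivery methods and/or dosage affect tooth growth in guinea pigs?
The delivery method matters at the lower doses: at 0.5 mg and 1 mg, orange juice (OJ) gives significantly greater tooth growth than ascorbic acid (VC), even after adjusting the p-values. At 2 mg the two delivery methods have approximately equal effects, and no significant difference was detected. Mean length also rises with dose for both supplements (see the group means above), although the effect of dose was not formally tested here.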