Overview

In this document we analyse the ToothGrowth data set from the R package ‘datasets’.

The ToothGrowth data record the tooth growth response (length) of 60 guinea pigs. Each animal received one of three dose levels of Vitamin C (0.5, 1, and 2 mg) by one of two delivery methods (orange juice, coded ‘OJ’, or ascorbic acid, coded ‘VC’), giving 10 animals for each combination of dose and delivery method.

We analyse tooth growth and compare the effect of the two supplements at each dose level.

Because we make multiple comparisons, we also adjust our ‘p-values’ to limit the risk of Type-1 errors.

Summary

Loading the data

## Loading required libraries
library(datasets)
library(ggplot2)
library(dplyr)
library(magrittr)
library(knitr)
tgdata <- ToothGrowth

## Converting 'dose' to a factor, as there are only three dose levels
tgdata$dose <- as.factor(tgdata$dose)
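
As a quick sanity check on the study design, the number of observations per supplement and dose can be tabulated. This is a minimal sketch in base R; `cell_counts` is a name introduced here for illustration and is not part of the original analysis.

```r
# Count observations for every supplement/dose combination.
# `cell_counts` is an illustrative name, not part of the original analysis.
cell_counts <- table(supp = ToothGrowth$supp, dose = ToothGrowth$dose)
print(cell_counts)

# The design is balanced: every cell holds 10 animals (60 in total).
all(cell_counts == 10)
```

A balanced design like this keeps the per-dose comparisons between supplements on an equal footing, since each group mean rests on the same number of animals.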

Summary

Here we can see the basic summary of the data.

summary(tgdata)
##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90

Plot

Let’s also visualise the data by plotting length against dose for each supplement.

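## Lines join the mean length per dose for each supplement
## ('fun' replaces the deprecated 'fun.y' in ggplot2 3.3.0 and later)
ggplot(tgdata,aes(x=dose,y=len,group=supp,color=supp))+
stat_summary(fun=mean,geom="line")+geom_point()+
labs(x="Dose",y="Length",color="Supplement",title="Tooth Growth By Dose and Supplement")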

Assumptions

Inferential Analysis

Here we compare the effect of supplement type (supp) on length (len) at each dose level (dose).

Sub-setting of data

In this section we split the data by dose level so that a separate ‘t.test’ can be run on each subset.

For each of the dose levels ‘0.5 mg’, ‘1 mg’ and ‘2 mg’ we subset the data and then create two groups, one for each supplement type.

#Sub-setting 0.5 mg dose level data
halfmg<-tgdata[tgdata$dose==0.5,]
halfmg_summ<-data.frame(halfmg %>% group_by(supp) %>% summarise(mean=mean(len),sd=sd(len)))
vc0.5mg<-halfmg[halfmg$supp=='VC',]
oj0.5mg<-halfmg[halfmg$supp=='OJ',]

#Sub-setting 1 mg dose level data
onemg<-tgdata[tgdata$dose==1,]
onemg_summ<-data.frame(onemg %>% group_by(supp) %>% summarise(mean=mean(len),sd=sd(len)))
vc1mg<-onemg[onemg$supp=='VC',]
oj1mg<-onemg[onemg$supp=='OJ',]

#Sub-setting 2 mg dose level data
twomg<-tgdata[tgdata$dose==2,]
twomg_summ<-data.frame(twomg %>% group_by(supp) %>% summarise(mean=mean(len),sd=sd(len)))
vc2mg<-twomg[twomg$supp=='VC',]
oj2mg<-twomg[twomg$supp=='OJ',]
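
The three blocks above repeat the same steps for each dose. As a compact cross-check, the same means and standard deviations can be computed in one call. This is a sketch using base R's `aggregate()` on the original `ToothGrowth` data; `cell_summary` is a name assumed here for illustration.

```r
# Mean and standard deviation of length for each dose/supplement cell.
# `cell_summary` is an illustrative name, not part of the original analysis.
cell_summary <- aggregate(
  len ~ supp + dose,
  data = ToothGrowth,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# The `len` column is a two-column matrix holding the mean and the sd.
print(cell_summary)
```

The explicit per-dose subsets are still useful afterwards, because the t-tests need the individual observations rather than the summaries.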

Summary Of Subsets of Data

Let’s see the basic summary of these subsets of data that we have extracted.

Mean and Standard Deviation for Dose Level ‘0.5 mg’

| supp | mean  | sd       |
|:-----|------:|---------:|
| OJ   | 13.23 | 4.459708 |
| VC   |  7.98 | 2.746634 |

Mean and Standard Deviation for Dose Level ‘1 mg’

| supp | mean  | sd       |
|:-----|------:|---------:|
| OJ   | 22.70 | 3.910953 |
| VC   | 16.77 | 2.515309 |

Mean and Standard Deviation for Dose Level ‘2 mg’

| supp | mean  | sd       |
|:-----|------:|---------:|
| OJ   | 26.06 | 2.655058 |
| VC   | 26.14 | 4.797731 |

Hypothesis Testing and P-Values Calculation

The hypotheses that we test on each of the three subsets of data are as follows.

Let \(\mu_{oj}\) be the mean of the population given supplement ‘OJ’ and \(\mu_{vc}\) be the mean of the population given supplement ‘VC’.

Hence our null and alternative hypotheses are

\(H_0 : \mu_{oj}\leq\mu_{vc}\) i.e. supplement ‘OJ’ has an effect on length less than or equal to that of ‘VC’

\(H_a : \mu_{oj}>\mu_{vc}\) i.e. supplement ‘OJ’ has more effect on length than ‘VC’

We run a one-sided Welch t-test of these hypotheses on each of the three subsets and store the results.

t0.5test<-t.test(oj0.5mg$len,vc0.5mg$len,paired = FALSE,var.equal = FALSE,alternative = "greater")
t1test<-t.test(oj1mg$len,vc1mg$len,paired = FALSE,var.equal = FALSE,alternative = "greater")
t2test<-t.test(oj2mg$len,vc2mg$len,paired = FALSE,var.equal = FALSE,alternative = "greater")
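
To make clear what `t.test` computes with `var.equal = FALSE`, the following sketch reproduces the Welch statistic, the Welch-Satterthwaite degrees of freedom and the one-sided p-value by hand for the 0.5 mg dose. The variable names (`oj`, `vc`, `t_manual` and so on) are introduced here for illustration only.

```r
# Welch t statistic computed by hand for the 0.5 mg dose.
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]

# Squared standard error of each group mean.
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Difference in means divided by its standard error.
t_manual <- (mean(oj) - mean(vc)) / sqrt(se2_oj + se2_vc)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_manual <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# One-sided p-value for H_a: mean(OJ) > mean(VC).
p_manual <- pt(t_manual, df_manual, lower.tail = FALSE)

c(t = t_manual, df = df_manual, p = p_manual)
```

These values should agree with `t.test(oj, vc, alternative = "greater")`, which is a convenient way to confirm the direction of the alternative hypothesis.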

Comparing With 97.5 Quantile

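We also compare the t statistic from each ‘t.test’ with the ‘97.5%’ quantile of the t distribution with the same degrees of freedom. For a one-sided test, this corresponds to a significance level of 2.5%.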

t0.5comp97.5<-t0.5test$statistic>qt(0.975,t0.5test$parameter)
t1comp97.5<-t1test$statistic>qt(0.975,t1test$parameter)
t2comp97.5<-t2test$statistic>qt(0.975,t2test$parameter)
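
Comparing the statistic with a quantile and comparing the p-value with a significance level are two views of the same decision. The sketch below illustrates this for the 0.5 mg dose, assuming a one-sided significance level of 0.025 to match the 97.5% quantile; `alpha`, `by_quantile` and `by_p_value` are illustrative names.

```r
# One-sided Welch test for the 0.5 mg dose.
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]
test <- t.test(oj, vc, alternative = "greater")

# Assumed significance level; 1 - alpha gives the 97.5% quantile.
alpha <- 0.025

# Decision from the critical value and decision from the p-value.
by_quantile <- unname(test$statistic > qt(1 - alpha, test$parameter))
by_p_value <- test$p.value < alpha

c(by_quantile = by_quantile, by_p_value = by_p_value)
```

Both entries always agree for an upper-tailed test, so the quantile comparison adds no information beyond the p-value; it mainly serves as a cross-check.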

Adjusting P-Values Using Benjamini Hochberg Method

Since we are making multiple comparisons, the probability of at least one Type-1 error increases. It is therefore a good idea to adjust the ‘P-Values’ obtained from all our tests, here using the ‘Benjamini Hochberg’ method, which controls the false discovery rate.

padjusted<-p.adjust(c(t0.5test$p.value,t1test$p.value,t2test$p.value),method="BH")
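
The adjustment itself is simple enough to reproduce by hand. The sketch below applies the Benjamini-Hochberg steps to the three unadjusted p-values reported in the results table of this document; `p_raw`, `ord`, `scaled` and `p_bh` are names introduced for illustration.

```r
# Unadjusted one-sided p-values for the 0.5, 1 and 2 mg doses,
# copied from the results table reported in this document.
p_raw <- c(0.0031793, 0.0005192, 0.5180742)
m <- length(p_raw)

# Order from largest to smallest, scale each p-value by m / rank,
# then take a running minimum so the adjusted values stay monotone.
ord <- order(p_raw, decreasing = TRUE)
scaled <- p_raw[ord] * m / (m:1)

# Put the adjusted values back in the original order, capped at 1.
p_bh <- numeric(m)
p_bh[ord] <- pmin(1, cummin(scaled))

p_bh
```

The smallest p-value is multiplied by 3, the middle one by 3/2 and the largest is left unchanged, which should match the output of `p.adjust(p_raw, method = "BH")`.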

Combining Results Of All the Statistics

Now we combine the results of all the t-tests that we have run across dose levels.

dose<-c(0.5,1,2)
p.values<-c(t0.5test$p.value,t1test$p.value,t2test$p.value)
tComp97.5<-c(t0.5comp97.5,t1comp97.5,t2comp97.5)
results<-data.frame(dose,p.values,tComp97.5,padjusted)
names(results)<-c("Dose (mg)","P-Values (H0)","tcalc>97.5","P-Adjusted")
kable(results,format = "markdown")
| Dose (mg) | P-Values (H0) | tcalc>97.5 | P-Adjusted |
|----------:|--------------:|:-----------|-----------:|
|       0.5 |     0.0031793 | TRUE       |  0.0047690 |
|       1.0 |     0.0005192 | TRUE       |  0.0015576 |
|       2.0 |     0.5180742 | FALSE      |  0.5180742 |
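
The per-dose tests and the adjustment can also be expressed as a short loop over dose levels. This is an alternative sketch, not the code used to produce the table above; it relies on the formula interface of `t.test` and on `OJ` being the first level of `supp`, and the names `by_dose`, `tests` and `p_raw` are illustrative.

```r
# Run the same one-sided Welch test within each dose level.
# With the formula interface the first level of `supp` (OJ) is the first
# group, so alternative = "greater" tests mean(OJ) > mean(VC).
by_dose <- split(ToothGrowth, ToothGrowth$dose)
tests <- lapply(by_dose, function(d) {
  t.test(len ~ supp, data = d, alternative = "greater")
})

# Collect the raw p-values and adjust them together.
p_raw <- sapply(tests, function(tt) tt$p.value)
data.frame(
  dose = names(p_raw),
  p.value = unname(p_raw),
  p.adjusted = unname(p.adjust(p_raw, method = "BH"))
)
```

Keeping all tests in one list makes it harder to adjust only a subset of the p-values by mistake.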

Evaluating 2 mg Test Further

For the 2 mg dose the one-sided test gave a large p-value (0.518), so we could not reject the null hypothesis that OJ has an effect less than or equal to that of VC. That result alone does not tell us whether the two means differ at all, so it is worth also testing the null hypothesis that the two means are equal, using a two-sided test.

For this new test our hypotheses are

\(H_0 : \mu_{oj}=\mu_{vc}\) i.e. supplement ‘OJ’ has equal effect on length as ‘VC’

\(H_a : \mu_{oj}\neq\mu_{vc}\) i.e. supplement ‘OJ’ and ‘VC’ have different effects on length.

We run a two-sided ‘t.test’ of these hypotheses, followed by the quantile comparison.

t2testextended<-t.test(oj2mg$len,vc2mg$len,paired = FALSE,var.equal = FALSE)
t2testextended
## 
##  Welch Two Sample t-test
## 
## data:  oj2mg$len and vc2mg$len
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean of x mean of y 
##     26.06     26.14
abs(t2testextended$statistic)>qt(0.975,t2testextended$parameter)
##     t 
## FALSE

Here the p-value (0.9639) is very high and the 95% confidence interval for the difference in means (-3.80 to 3.64) contains zero, so we fail to reject the null hypothesis. The data give no evidence that supplements ‘OJ’ and ‘VC’ differ in effect at the ‘2 mg’ dose level.
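
The link between the two-sided test and its confidence interval can be checked directly: the test fails to reject equality at the 5% level exactly when the 95% interval for the difference covers zero. A small sketch, with `two_mg`, `ci` and `covers_zero` as illustrative names:

```r
# Two-sided Welch test for the 2 mg dose via the formula interface.
two_mg <- subset(ToothGrowth, dose == 2)
ci <- t.test(len ~ supp, data = two_mg)$conf.int

# TRUE when the 95% interval for mean(OJ) - mean(VC) covers zero.
covers_zero <- ci[1] < 0 && ci[2] > 0
covers_zero
```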

Interpretation of the results

From the results we can conclude the following about our data.

Final Interpretation To the Question

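Do delivery methods and/or dosage affect tooth growth in guinea pigs?

The delivery method matters at the lower doses: at 0.5 mg and 1 mg, orange juice (OJ) produces significantly greater tooth growth than ascorbic acid (VC), even after adjusting the p-values. At 2 mg the two delivery methods have approximately equal effects, and we find no significant difference between them. Mean length also rises with dose for both supplements in the subset summaries, although the effect of dose was not tested formally here.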