Overview

Exploring the ToothGrowth dataset by applying t confidence intervals and hypothesis tests.

This analysis is part II of the Statistical Inference course project on Coursera.

#Load dataset
data(ToothGrowth)

#Sample
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
#Dataset description
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

Here len is the tooth length, supp is the supplement type (OJ is orange juice and VC is vitamin C given as ascorbic acid), and dose is the dose level in mg/day.
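Before plotting, it can help to check how the 60 observations are spread across the supplement and dose groups. The following is a minimal sketch, not part of the original analysis; the names `cell_counts` and `cell_means` are illustrative.

```r
data(ToothGrowth)

# Number of observations in each supplement-by-dose cell
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)

# Mean tooth length in each supplement-by-dose cell
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

A balanced design (the same count in every cell) makes the group comparisons further down easier to interpret.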

The boxplots below compare tooth length across dose levels for the VC and OJ supplement types.

#Two panels side by side, one per supplement type
par(mfrow=c(1,2))

#Tooth length by dose for the VC (ascorbic acid) group
vc_data <- subset(ToothGrowth, supp=="VC")
boxplot(len ~ dose, data = vc_data, col = "red", 
        xlab = "Ascorbic acid", ylab = "Tooth Length", main = "VC Boxplot")

#Tooth length by dose for the OJ (orange juice) group
oj_data <- subset(ToothGrowth, supp=="OJ")
boxplot(len ~ dose, data = oj_data, col = "orange", 
        xlab = "Orange juice", ylab = "Tooth Length", main = "OJ Boxplot")

Comparing tooth growth by supp and dose

In this section, the supp and dose groups are compared using confidence intervals and hypothesis tests. The null hypothesis states that there is no difference in mean tooth length between treatments. The significance level is set to 0.05.

#T test and confidence interval for independent groups,
#assuming unequal variances, splitting groups by supp variable.

#Extract len as numeric vectors; t.test's default method has no data argument
g1 <- subset(ToothGrowth, supp=="VC")$len
g2 <- subset(ToothGrowth, supp=="OJ")$len
t.test(g1,  g2, paired = FALSE, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  g1 and g2
## t = -1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.5710156  0.1710156
## sample estimates:
## mean of x mean of y 
##  16.96333  20.66333

As the p-value (0.06063) is greater than 0.05 and the 95% confidence interval contains 0, we do not have enough evidence to reject the null hypothesis. So we cannot conclude that there is a difference between supplement types.
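To make the Welch test less of a black box, the statistic, degrees of freedom, p-value and confidence interval can be reproduced by hand. This is a sketch under the same assumptions as the test above (independent groups, unequal variances); the variable names are illustrative.

```r
data(ToothGrowth)

vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# Squared standard error contributed by each group
se2_vc <- var(vc) / length(vc)
se2_oj <- var(oj) / length(oj)
se_diff <- sqrt(se2_vc + se2_oj)

# Welch t statistic for the difference in means (VC minus OJ)
t_stat <- (mean(vc) - mean(oj)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_vc + se2_oj)^2 /
  (se2_vc^2 / (length(vc) - 1) + se2_oj^2 / (length(oj) - 1))

# Two-sided p-value and 95% confidence interval
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (mean(vc) - mean(oj)) + c(-1, 1) * qt(0.975, df_welch) * se_diff

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

The values should agree with the t.test output shown above, up to rounding.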

#T test and confidence interval, assuming equal variances,
#splitting groups by dose variable (0.5 mg and 1 mg)

#dose is numeric, so compare against numbers rather than strings
g1 <- subset(ToothGrowth, dose==0.5)$len
g2 <- subset(ToothGrowth, dose==1)$len
t.test(g1,  g2, paired = FALSE, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  g1 and g2
## t = -6.4766, df = 38, p-value = 1.266e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983748  -6.276252
## sample estimates:
## mean of x mean of y 
##    10.605    19.735

As the p-value is less than 0.05, we reject the null hypothesis: increasing the supplement dose from 0.5 mg to 1 mg (pooling OJ and VC) is associated with greater tooth length.

#T test and confidence interval, assuming equal variances,
#splitting groups by dose variable (1 mg and 2 mg)
g1 <- subset(ToothGrowth, dose==1)$len
g2 <- subset(ToothGrowth, dose==2)$len
t.test(g1,  g2, paired = FALSE, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  g1 and g2
## t = -4.9005, df = 38, p-value = 1.811e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.994387 -3.735613
## sample estimates:
## mean of x mean of y 
##    19.735    26.100

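As the p-value (1.811e-05) is less than 0.05 and the 95% confidence interval excludes 0, we reject the null hypothesis: increasing the supplement dose from 1 mg to 2 mg is also associated with significantly greater tooth length.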
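The two dose comparisons can also be collected into one table, which puts each p-value next to the 0.05 threshold. This is a minimal sketch using the same pooled-variance setting as the tests above; `compare_doses` and `dose_results` are hypothetical names introduced for illustration.

```r
data(ToothGrowth)

# Hypothetical helper: pooled-variance t-test of tooth length
# between two dose levels, returned as a one-row data frame
compare_doses <- function(low, high) {
  x <- ToothGrowth$len[ToothGrowth$dose == low]
  y <- ToothGrowth$len[ToothGrowth$dose == high]
  res <- t.test(x, y, paired = FALSE, var.equal = TRUE)
  data.frame(low = low, high = high,
             p_value = res$p.value,
             ci_lower = res$conf.int[1],
             ci_upper = res$conf.int[2])
}

dose_results <- rbind(compare_doses(0.5, 1), compare_doses(1, 2))

# Flag comparisons that are significant at the 0.05 level
dose_results$significant <- dose_results$p_value < 0.05
print(dose_results)
```

Because two tests are run on the same data, a stricter threshold (for example via `p.adjust`) may be worth considering.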

Conclusions

The tests above do not show a significant difference between supplement types at the 0.05 level, so with all doses pooled we cannot conclude that OJ and VC differ in their effect on tooth length.
Dose, however, is a clear factor in tooth growth: mean tooth length increases significantly both from 0.5 mg to 1 mg and from 1 mg to 2 mg. The exploratory boxplots point the same way, with tooth length rising with dose for both supplements.