Exploring the ToothGrowth dataset with t confidence intervals and hypothesis tests.
This analysis is Part II of the Coursera Statistical Inference course project.
#Load dataset
data(ToothGrowth)
#Sample
head(ToothGrowth)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
#Dataset description
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
Here OJ is orange juice and VC is vitamin C given as ascorbic acid.
The boxplots below compare tooth length across dose levels for the VC and OJ supplement types.
par(mfrow=c(1,2))
vc_data <- subset(ToothGrowth, supp=="VC")
s <- factor(vc_data$supp)
boxplot(len ~ s*dose, data = vc_data, col = "red",
xlab = "Ascorbic acid", ylab = "Tooth Length", main = "VC Boxplot")
oj_data <- subset(ToothGrowth, supp=="OJ")
s <- factor(oj_data$supp)
boxplot(len ~ s*dose, data = oj_data, col = "orange",
xlab = "Orange juice", ylab = "Tooth Length", main = "OJ Boxplot")
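A numeric summary can complement the boxplots. The sketch below tabulates the mean tooth length and the group size for each supplement/dose combination; the names `cell_means`, `cell_sizes` and `cell_summary` are illustrative and not part of the original analysis.

```r
# Load the dataset (ships with base R's datasets package).
data(ToothGrowth)

# Mean tooth length for each supplement/dose combination.
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

# Number of observations in each combination.
cell_sizes <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = length)

# Join both tables; the suffixes distinguish the two `len` columns.
cell_summary <- merge(cell_means, cell_sizes,
                      by = c("supp", "dose"),
                      suffixes = c("_mean", "_n"))
print(cell_summary)
```

Reading the means alongside the plots makes it easier to see how much tooth length changes between dose levels within each supplement.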
In this section, the supp and dose variables are compared using confidence intervals and hypothesis tests. The null hypothesis states that there is no difference in mean tooth length between treatments. The significance level is set at 0.05.
#t confidence interval for independent groups, assuming unequal variances,
#splitting groups by the supp variable.
g1 <- subset(ToothGrowth, supp=="VC")$len
g2 <- subset(ToothGrowth, supp=="OJ")$len
t.test(g1, g2, paired = FALSE, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: g1 and g2
## t = -1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -7.5710156 0.1710156
## sample estimates:
## mean of x mean of y
## 16.96333 20.66333
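Since the p-value (0.06063) is greater than 0.05 and the 95 percent confidence interval contains 0, we do not have enough evidence to reject the null hypothesis. We therefore cannot conclude that mean tooth length differs between supplement types.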
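To see where the numbers in the Welch output come from, the statistic, degrees of freedom, p-value and interval can be reproduced by hand. This is a minimal sketch using the standard Welch and Welch-Satterthwaite formulas; the variable names (`vc`, `oj`, `se2_vc`, and so on) are illustrative.

```r
data(ToothGrowth)
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# Squared standard error of each group mean.
se2_vc <- var(vc) / length(vc)
se2_oj <- var(oj) / length(oj)
se_diff <- sqrt(se2_vc + se2_oj)

# Welch t statistic: difference in means over its standard error.
mean_diff <- mean(vc) - mean(oj)
t_stat <- mean_diff / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (se2_vc + se2_oj)^2 /
  (se2_vc^2 / (length(vc) - 1) + se2_oj^2 / (length(oj) - 1))

# Two-sided p-value and 95 percent confidence interval.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- mean_diff + c(-1, 1) * qt(0.975, df = df_welch) * se_diff

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

The values should agree with the `t.test` output above up to rounding.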
#t confidence interval for independent groups, assuming equal variances,
#splitting groups by the dose variable (0.5 mg and 1 mg)
g1 <- subset(ToothGrowth, dose==0.5)$len
g2 <- subset(ToothGrowth, dose==1)$len
t.test(g1, g2, paired = FALSE, var.equal = TRUE)
##
## Two Sample t-test
##
## data: g1 and g2
## t = -6.4766, df = 38, p-value = 1.266e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983748 -6.276252
## sample estimates:
## mean of x mean of y
## 10.605 19.735
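Since the p-value is below 0.05 and the confidence interval lies entirely below 0, we reject the null hypothesis: increasing the supplement dose (OJ and VC pooled) from 0.5 mg to 1 mg is associated with greater tooth length.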
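The same 0.5 mg versus 1 mg comparison can also be written with the formula interface of `t.test`, which avoids building the two groups by hand and makes the decision rule explicit. This is a sketch only; `low_doses`, `res`, `alpha` and `reject_null` are illustrative names.

```r
data(ToothGrowth)

# Keep only the two dose levels being compared.
low_doses <- subset(ToothGrowth, dose %in% c(0.5, 1))

# Pooled-variance two-sample t-test of len by dose.
res <- t.test(len ~ dose, data = low_doses, var.equal = TRUE)

# Decision at the 0.05 significance level.
alpha <- 0.05
reject_null <- res$p.value < alpha

print(res$conf.int)   # 95 percent interval for mean(0.5 mg) - mean(1 mg)
print(reject_null)
```

Because the groups are ordered by dose level, the difference is taken as 0.5 mg minus 1 mg, matching the sign of the output above.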
#t confidence interval for independent groups, assuming equal variances,
#splitting groups by the dose variable (1 mg and 2 mg)
g1 <- subset(ToothGrowth, dose==1)$len
g2 <- subset(ToothGrowth, dose==2)$len
t.test(g1, g2, paired = FALSE, var.equal = TRUE)
##
## Two Sample t-test
##
## data: g1 and g2
## t = -4.9005, df = 38, p-value = 1.811e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.994387 -3.735613
## sample estimates:
## mean of x mean of y
## 19.735 26.100
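Since the p-value (1.811e-05) is below 0.05 and the confidence interval excludes 0, we reject the null hypothesis: increasing the supplement dose from 1 mg to 2 mg is also associated with significantly greater tooth length.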
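The tests above do not show a significant difference between supplement types when all doses are pooled, so on this evidence we cannot prefer OJ over VC or the reverse. Failing to reject the null hypothesis is not proof that both forms of vitamin C have the same effect.
Dose, however, is clearly associated with tooth growth: both the increase from 0.5 mg to 1 mg and the increase from 1 mg to 2 mg give significantly greater mean tooth length. The exploratory boxplots likewise show tooth length rising with dose for both supplements.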