In this project, we analyze the ToothGrowth data in the R datasets package. The response is the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs, 10 at each combination of three dose levels of vitamin C (0.5, 1, and 2 mg/day) and two delivery methods (orange juice or ascorbic acid).
We load the data and perform some basic exploratory data analyses.
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
We encode dose as a factor and summarize the data.
## len supp dose
## Min. : 4.20 OJ:30 0.5:20
## 1st Qu.:13.07 VC:30 1 :20
## Median :19.25 2 :20
## Mean :18.81
## 3rd Qu.:25.27
## Max. :33.90
## supp
## dose OJ VC
## 0.5 10 10
## 1 10 10
## 2 10 10
## [1] "number of NAs"
## [1] 0
Tooth length ranges from 4.20 to 33.90 with an overall mean of 18.81. There are no missing values, and each combination of dose level and delivery method has 10 observations.
Next, we draw boxplots of tooth length against dose and against delivery method.
As panel (a) Dose shows, tooth length generally increases with the vitamin C dose, regardless of the delivery method. Panel (b) Supp shows no clear difference in tooth length between the two delivery methods. In the following sections, we use t-tests to check these impressions because the sample size is small. In the boxplot of dose combined with delivery method, orange juice seems to perform better at the lower doses, but gives results similar to ascorbic acid at the 2 mg dose.
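The summary above can be complemented by per-group statistics. The following is a minimal sketch, not part of the original analysis; the object names `cell_stats`, `supp_means`, and `dose_means` are illustrative assumptions.

```r
library(datasets)
data(ToothGrowth)

# Mean, standard deviation, and count of tooth length for each supp x dose cell
cell_stats <- aggregate(len ~ supp + dose, data = ToothGrowth,
                        FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))
print(cell_stats)

# Marginal means by delivery method and by dose level
supp_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
dose_means <- tapply(ToothGrowth$len, ToothGrowth$dose, mean)
print(supp_means)
print(dose_means)
```

Looking at the cell means next to the marginal means makes it easier to see whether an apparent effect of one factor is consistent across the levels of the other.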
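We examine three null hypotheses in this section: (1) the dose level has no effect on tooth length; (2) the delivery method has no effect on tooth length; and (3) at each dose level, the two delivery methods have the same effect on tooth length.
Here we group the data by the three dose levels and perform a t-test between each pair of doses (0.5 vs 1, 1 vs 2, and 0.5 vs 2).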
##
## Pairwise comparisons using t tests with non-pooled SD
##
## data: len and dose
##
## 0.5 1
## 1 2.5e-07 -
## 2 1.3e-13 1.9e-05
##
## P value adjustment method: holm
As we can see, the Holm-adjusted p-value for every pair is far smaller than 0.05, so we reject the null hypothesis at the 5% significance level. In other words, the vitamin C dose level has a significant effect on tooth length.
Now, let’s perform the t-test for the second hypothesis. Here, we group the data by the two delivery methods.
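To make the Holm adjustment reported in the output more concrete, the sketch below recomputes the three unadjusted Welch p-values and applies `p.adjust` to them. It is an illustration rather than the original analysis code; the names `dose_pairs`, `raw_p`, and `holm_p` are assumptions introduced here.

```r
library(datasets)
data(ToothGrowth)

len  <- ToothGrowth$len
dose <- factor(ToothGrowth$dose)

# Unadjusted Welch t-test p-values for the three pairs of dose levels
dose_pairs <- list(c("0.5", "1"), c("0.5", "2"), c("1", "2"))
raw_p <- sapply(dose_pairs, function(p)
  t.test(len[dose == p[1]], len[dose == p[2]])$p.value)
names(raw_p) <- sapply(dose_pairs, paste, collapse = " vs ")

# Holm step-down adjustment: the smallest p-value is multiplied by 3,
# the next by 2, the largest by 1, with monotonicity enforced
holm_p <- p.adjust(raw_p, method = "holm")
print(rbind(raw = raw_p, holm = holm_p))
```

The adjusted values should agree with those printed by `pairwise.t.test(len, dose, pool.sd = FALSE)`, which applies the same adjustment by default.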
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
The 95% confidence interval (-0.1710156, 7.5710156) contains zero and the p-value (0.06063) exceeds 0.05, so we cannot reject the null hypothesis at the 5% significance level. That is, pooled over all doses, the data do not show a significant difference between orange juice (OJ) and ascorbic acid (VC) in their effect on tooth length.
Hypothesis 3 consists of three sub-hypotheses: OJ and VC have the same effect on tooth length at each of the three dose levels. Here we check each sub-hypothesis using a 95% confidence interval.
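For readers who want to see where the numbers in the Welch output come from, here is a minimal sketch that reproduces the t statistic, degrees of freedom, p-value, and confidence interval by hand. The variable names (`oj`, `vc`, `se2_oj`, `se2_vc`, `welch_df`, and so on) are illustrative assumptions, not part of the original analysis.

```r
library(datasets)
data(ToothGrowth)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(oj) - mean(vc)) / se_diff
welch_df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_stat), welch_df)
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, welch_df) * se_diff

print(c(t = t_stat, df = welch_df, p = p_value))
print(ci)
```

The interval contains zero exactly when the two-sided p-value exceeds 0.05, which is why the two criteria lead to the same decision here.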
## [1] "0.5" "1" "2"
## [[1]]
## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95
##
## [[2]]
## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95
##
## [[3]]
## [1] -3.79807 3.63807
## attr(,"conf.level")
## [1] 0.95
The results show that we reject the null hypothesis at the 0.5 and 1 mg dose levels, where the confidence interval excludes zero, but fail to reject it at the 2 mg level. The t-tests confirm our initial impression that orange juice performs better at the lower doses but gives results similar to ascorbic acid at the 2 mg dose.
In summary, at the 5% significance level, the vitamin C dose level has an effect on tooth length. Pooled over all doses, we find no significant difference between orange juice (OJ) and ascorbic acid (VC). Finally, at the 0.5 and 1 mg dose levels, OJ and VC show different effects on tooth length, but a similar effect at 2 mg.
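The three intervals can also be collected into a single table, which is easier to read than the list output. This is a sketch under the assumption that one Welch t-test per dose level is what is wanted; the names `by_dose` and `per_dose` and the added p-value column are illustrative and not part of the original analysis.

```r
library(datasets)
data(ToothGrowth)

# One Welch t-test (OJ vs VC) per dose level
by_dose <- split(ToothGrowth, ToothGrowth$dose)
per_dose <- t(sapply(by_dose, function(d) {
  tt <- t.test(len ~ supp, data = d)
  c(lower = tt$conf.int[1], upper = tt$conf.int[2], p_value = tt$p.value)
}))

# Rows are dose levels; columns are the 95% CI bounds and the unadjusted p-value
print(round(per_dose, 4))
```

Note that these three tests are not adjusted for multiple comparisons; `p.adjust(per_dose[, "p_value"], method = "holm")` would be one way to do so.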
print("Chuck 1")
library(datasets)
data(ToothGrowth)
?ToothGrowth
## starting httpd help server ... done
str(ToothGrowth)
## [1] "Chuck 1"
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
print("Chuck 2")
# factorising dose
ToothGrowth$dose <- factor(ToothGrowth$dose)
# summarize the data
summary(ToothGrowth)
with(ToothGrowth, table(dose,supp))
# NA data
print("number of NAs"); sum(is.na(ToothGrowth))
## [1] "Chuck 2"
## len supp dose
## Min. : 4.20 OJ:30 0.5:20
## 1st Qu.:13.07 VC:30 1 :20
## Median :19.25 2 :20
## Mean :18.81
## 3rd Qu.:25.27
## Max. :33.90
## supp
## dose OJ VC
## 0.5 10 10
## 1 10 10
## 2 10 10
## [1] "number of NAs"
## [1] 0
print("Chuck 3")
par(mfrow=c(1,2))
boxplot(len ~ dose, data = ToothGrowth, main="(a) Dose",xlab="mg", ylab="length(mm)")
boxplot(len ~ supp, data = ToothGrowth, main="(b) Supp")
print("Chuck 4")
par(mfrow=c(1,1))
boxplot(len ~ dose*supp, data = ToothGrowth,ylab="length(mm)")
## [1] "Chuck 3"
## [1] "Chuck 4"
print("Chuck 4")
with(ToothGrowth, pairwise.t.test(len,dose, pool.sd=FALSE))
## [1] "Chuck 4"
##
## Pairwise comparisons using t tests with non-pooled SD
##
## data: len and dose
##
## 0.5 1
## 1 2.5e-07 -
## 2 1.3e-13 1.9e-05
##
## P value adjustment method: holm
print("Chuck 5")
with(ToothGrowth, t.test(len~supp))
## [1] "Chuck 5"
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
print("Chuck 6")
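# dose levels (named dose_levels to avoid masking base::levels)
dose_levels <- levels(ToothGrowth$dose); dose_levels
# Welch t-test of len by supp within each dose level
re <- lapply(dose_levels, function(d)
  t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])
)
# 95% confidence interval for the OJ - VC difference at each dose level
lapply(re, function(r) r$conf.int)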
## [1] "Chuck 6"
## [1] "0.5" "1" "2"
## [[1]]
## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95
##
## [[2]]
## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95
##
## [[3]]
## [1] -3.79807 3.63807
## attr(,"conf.level")
## [1] 0.95