About the data

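In this project, we analyze the ToothGrowth data in the R datasets package. The response is the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods: orange juice (OJ) or ascorbic acid (a form of vitamin C, coded VC). This gives 10 animals for each combination of dose and delivery method.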

Basic inferential data analysis

We load the data and perform some basic exploratory data analyses.

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

We encode dose as a factor and summarize the data.

##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90                   
##      supp
## dose  OJ VC
##   0.5 10 10
##   1   10 10
##   2   10 10
## [1] "number of NAs"
## [1] 0

The tooth length ranges from 4.20 to 33.90, with an overall mean of 18.81. There are no missing values (NAs), and each combination of dose level and delivery method has 10 observations.

We now draw boxplots of tooth length against dose and against delivery method, and then against both together.

As panel (a) Dose shows, tooth length generally increases with the vitamin C dose, regardless of delivery method. Panel (b) Supp shows no clear difference in tooth length between the two delivery methods. Because the samples are small, we use t-tests in the sections that follow. In the combined dose-by-supp boxplot, orange juice appears to perform better at the lower dosages but gives a result similar to Vitamin C (VC) at the 2 mg dosage.
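
The visual impression from the boxplots can be cross-checked against the group means. The following is a minimal sketch and not part of the original analysis; the names `tg`, `group_means`, and `mean_table` are introduced here purely for illustration.

```r
# Work on a local copy so the built-in dataset is left untouched
tg <- datasets::ToothGrowth
tg$dose <- factor(tg$dose)

# Mean tooth length for every delivery method x dose combination
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(group_means)

# The same means as a dose-by-supp table, which makes it easier
# to compare OJ and VC row by row
mean_table <- with(tg, tapply(len, list(dose = dose, supp = supp), mean))
print(round(mean_table, 2))
```

Reading the table row by row shows whether the gap between OJ and VC narrows as the dose increases, which is the pattern the boxplots suggest.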

Hypotheses

We examine three null hypotheses in this section:

  1. Different levels of vitamin C dosage have the same effect on tooth length.
  2. Orange Juice (OJ) and Vitamin C (VC) have the same effect on tooth length.
  3. OJ and VC have the same effect on tooth length when the dosage is the same.

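Hypothesis 1: Different levels of vitamin C dosage have the same effect on tooth length.

Here we group the data by the three dose levels and perform a t-test with non-pooled standard deviations between each pair of doses (0.5 vs 1, 1 vs 2, and 0.5 vs 2), adjusting the p-values with Holm's method.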

## 
##  Pairwise comparisons using t tests with non-pooled SD 
## 
## data:  len and dose 
## 
##   0.5     1      
## 1 2.5e-07 -      
## 2 1.3e-13 1.9e-05
## 
## P value adjustment method: holm

As we can see, the adjusted p-value for every pair is far smaller than 0.05, so we reject the null hypothesis at the 5% significance level for each comparison. In other words, the level of vitamin C dosage has a statistically significant effect on tooth length.
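
To make the two steps behind these numbers explicit, the sketch below runs the three Welch t-tests separately and then applies Holm's adjustment with `p.adjust()`. It is an illustration under the assumption that this mirrors what `pairwise.t.test(..., pool.sd = FALSE)` does internally; the names `tg`, `len_by_dose`, `p_raw`, and `p_holm` are introduced here and do not appear in the original code.

```r
tg <- datasets::ToothGrowth

# Tooth lengths split by dose level (list names are "0.5", "1", "2")
len_by_dose <- split(tg$len, tg$dose)

# Unadjusted Welch t-test p-value for each pair of dose levels
p_raw <- c(
  "0.5 vs 1" = t.test(len_by_dose[["0.5"]], len_by_dose[["1"]])$p.value,
  "0.5 vs 2" = t.test(len_by_dose[["0.5"]], len_by_dose[["2"]])$p.value,
  "1 vs 2"   = t.test(len_by_dose[["1"]],   len_by_dose[["2"]])$p.value
)

# Holm's step-down adjustment controls the family-wise error rate
p_holm <- p.adjust(p_raw, method = "holm")
print(signif(p_holm, 2))

# Cross-check against pairwise.t.test(), which performs both steps at once
pw <- pairwise.t.test(tg$len, tg$dose, pool.sd = FALSE)
print(pw$p.value)
```

Adjusting for multiple comparisons matters here because three tests are run on the same data; without it, the chance of at least one false rejection would exceed 5%.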

Hypothesis 2: Orange Juice (OJ) and Vitamin C (VC) have the same effect on tooth length.

Now, let’s apply the t-test to the second hypothesis. Here, we group the data by the two delivery methods.

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

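The result shows that the 95% confidence interval (-0.171, 7.571) contains zero, and the p-value (0.061) is above 0.05, so we fail to reject the null hypothesis at the 5% significance level. In other words, when all doses are pooled, the data do not provide sufficient evidence that orange juice (OJ) and Vitamin C (VC) differ in their effect on tooth length. This is not the same as showing that the two delivery methods are equivalent.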
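
The interval and p-value reported above can be reproduced from the Welch formulas directly. The following sketch is an illustration only; the variable names (`oj`, `vc`, `df_welch`, and so on) are introduced here and are not part of the original analysis.

```r
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Difference in sample means (OJ minus VC) and its standard error
diff_means <- mean(oj) - mean(vc)
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)
se <- sqrt(v_oj + v_vc)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided 95% confidence interval and p-value
ci <- diff_means + c(-1, 1) * qt(0.975, df_welch) * se
p_value <- 2 * pt(-abs(diff_means / se), df_welch)

print(round(c(lower = ci[1], upper = ci[2], df = df_welch, p = p_value), 4))
```

Because the two-sided test and the interval are built from the same t statistic, the 95% interval contains zero exactly when the p-value exceeds 0.05.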

Hypothesis 3: OJ and VC have the same effect on tooth length when the dosage is the same.

Hypothesis 3 contains three sub-hypotheses: that OJ and VC have the same effect on tooth length at each of the three dose levels. Here we check each sub-hypothesis using the 95% confidence interval for the difference in means (OJ minus VC).

## [1] "0.5" "1"   "2"  
## [[1]]
## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95
## 
## [[2]]
## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95
## 
## [[3]]
## [1] -3.79807  3.63807
## attr(,"conf.level")
## [1] 0.95

As the results show, the confidence intervals at the 0.5 and 1 mg dose levels lie entirely above zero, so we reject the null hypothesis at those levels. The interval at the 2 mg level contains zero, so we fail to reject the null hypothesis there. The t-tests are consistent with our initial impression that orange juice performs better at the lower dosages but gives a result similar to Vitamin C (VC) at the 2 mg dosage.
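
The per-dose intervals above are easier to read next to the estimated differences and p-values. The sketch below collects them in one data frame; it is an illustration under the same Welch t-test setup, and the names `tg`, `by_dose`, and `results`, as well as the Holm-adjusted column, are additions that do not appear in the original analysis.

```r
tg <- datasets::ToothGrowth

# One data frame per dose level, named "0.5", "1", "2"
by_dose <- split(tg, tg$dose)

# Welch t-test of len by supp within each dose; collect one row per dose
results <- do.call(rbind, lapply(names(by_dose), function(d) {
  tt <- t.test(len ~ supp, data = by_dose[[d]])
  data.frame(
    dose    = d,
    diff    = unname(tt$estimate[1] - tt$estimate[2]),  # mean(OJ) - mean(VC)
    lower   = tt$conf.int[1],
    upper   = tt$conf.int[2],
    p_value = tt$p.value
  )
}))

# Optional addition (not in the original analysis): Holm adjustment
# across the three per-dose tests
results$p_holm <- p.adjust(results$p_value, method = "holm")
print(results)
```

Splitting the data first and passing each piece through `data =` keeps the model formula readable and avoids repeating the subsetting condition on both sides of the formula.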

Conclusions

In summary, at the 5% significance level, the vitamin C dose level has a significant effect on tooth length. Pooled over all doses, we find no significant difference between orange juice (OJ) and Vitamin C (VC). Finally, at the 0.5 and 1 mg dose levels OJ yields significantly longer teeth than VC, while at 2 mg the two delivery methods give similar results.

Code Chunks

print("Chunk 1")
library(datasets)
data(ToothGrowth)
# ?ToothGrowth   # opens the dataset's help page in an interactive session
str(ToothGrowth)
## [1] "Chunk 1"
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
print("Chunk 2")

# convert dose to a factor
ToothGrowth$dose <- factor(ToothGrowth$dose)

# summarize the data
summary(ToothGrowth)
with(ToothGrowth, table(dose, supp))

# count missing values
print("number of NAs"); sum(is.na(ToothGrowth))
## [1] "Chunk 2"
##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90                   
##      supp
## dose  OJ VC
##   0.5 10 10
##   1   10 10
##   2   10 10
## [1] "number of NAs"
## [1] 0
print("Chunk 3")
# side-by-side boxplots of tooth length by dose and by delivery method
par(mfrow = c(1, 2))
boxplot(len ~ dose, data = ToothGrowth, main = "(a) Dose", xlab = "mg", ylab = "length(mm)")
boxplot(len ~ supp, data = ToothGrowth, main = "(b) Supp")

# single boxplot of tooth length by dose within each delivery method
par(mfrow = c(1, 1))
boxplot(len ~ dose * supp, data = ToothGrowth, ylab = "length(mm)")

## [1] "Chunk 3"
print("Chunk 4")
# pairwise t-tests between dose levels, non-pooled SD, Holm-adjusted p-values
with(ToothGrowth, pairwise.t.test(len, dose, pool.sd = FALSE))
## [1] "Chunk 4"
## 
##  Pairwise comparisons using t tests with non-pooled SD 
## 
## data:  len and dose 
## 
##   0.5     1      
## 1 2.5e-07 -      
## 2 1.3e-13 1.9e-05
## 
## P value adjustment method: holm
print("Chunk 5")
# Welch two-sample t-test of tooth length by delivery method
with(ToothGrowth, t.test(len ~ supp))
## [1] "Chunk 5"
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333
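print("Chunk 6")
# avoid naming the variable `levels`, which would mask the base function
dose_levels <- levels(ToothGrowth$dose); dose_levels

# Welch t-test of len by supp within each dose level

re <- lapply(dose_levels, function(d)
    t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])
    )

# 95% confidence interval for mean(OJ) - mean(VC) at each dose level
lapply(re, function(r) r$conf.int)
## [1] "Chunk 6"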
## [1] "0.5" "1"   "2"  
## [[1]]
## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95
## 
## [[2]]
## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95
## 
## [[3]]
## [1] -3.79807  3.63807
## attr(,"conf.level")
## [1] 0.95