Overview

The data set “ToothGrowth” from the R datasets package is analysed. It records tooth length for 60 guinea pigs, 10 at each combination of three dose levels of vitamin C (0.5, 1, and 2 mg) and two delivery methods: orange juice and ascorbic acid (levels OJ and VC of the supp variable).

General Goals

  1. Perform an exploratory data analysis highlighting basic features of the data.
  2. Perform some relevant confidence intervals and/or tests.
  3. Interpret the results of the tests and/or intervals in the context of the problem.
  4. Describe the assumptions needed for conclusions.

This work

  1. Exploratory data analysis
require(datasets)
# Load  data 
mydata<-ToothGrowth
# Summary of the data
summary(mydata)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
# Structure of the data 
str(mydata)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
# Assign new column names
colnames(mydata)<- c("length", "supplement", "dose")
require(ggplot2)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.2.3
ggplot(data=mydata,aes(x=supplement,y=length))+
  geom_boxplot(aes(fill=supplement))+
  geom_rug()+ 
  facet_grid(.~dose)+
  xlab('Supplement')+ylab('Length')+ggtitle('Effects on Tooth Length for each Method and Dose')+
  guides(fill="none")+theme_bw()+
  theme(axis.text=element_text(size=13),axis.title= element_text(size=15,vjust= 0.6,face='bold')
        ,title=element_text(size=15,vjust = 2,face='bold'))

Observations: the plot above shows that, at dose levels of 0.5 and 1 mg, tooth length is greater with the orange juice supplement than with ascorbic acid. At a dose level of 2 mg, however, the two delivery methods (orange juice and ascorbic acid) seem to give similar tooth lengths.
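The visual impression can be backed by a small numeric summary. The sketch below tabulates the mean, standard deviation and group size of tooth length for each supplement/dose combination. It works on ToothGrowth directly, with its original column names (len, supp, dose), so that it runs on its own; the name cell_summary is illustrative and not part of the original analysis.

```r
# Mean, standard deviation and size of each supplement/dose group.
library(datasets)

cell_summary <- aggregate(
  len ~ supp + dose,
  data = ToothGrowth,
  FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x))
)

# aggregate() stores the three statistics in one matrix column;
# flatten it into the ordinary columns len.mean, len.sd and len.n.
cell_summary <- do.call(data.frame, cell_summary)
print(cell_summary)
```

Comparing the len.mean values of the OJ and VC rows within each dose gives the same comparison as the boxplots, in numbers.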

  2. Confidence intervals and t-tests

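We use t-tests to check the above observations. For each dose level, the null hypothesis is that the mean tooth length is the same for the two supplements. The calls below use the default settings of t.test, i.e. a two-sided Welch two-sample test that does not assume equal variances; the reported confidence interval is for the difference in means, orange juice minus ascorbic acid.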
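To make the test concrete, the following sketch recomputes the statistic, p-value and 95% confidence interval for the 0.5 mg dose by hand, assuming the Welch form that t.test applies by default, and then compares them with the built-in function. The variable names (oj, vc, welch_df, and so on) are illustrative and not part of the original analysis.

```r
library(datasets)

# Tooth lengths for each supplement at the 0.5 mg dose.
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]

# Estimated variance of each sample mean.
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)

# Welch statistic and Welch-Satterthwaite degrees of freedom.
diff_means <- mean(oj) - mean(vc)
std_error  <- sqrt(v_oj + v_vc)
t_stat     <- diff_means / std_error
welch_df   <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for mean(OJ) - mean(VC).
p_value <- 2 * pt(-abs(t_stat), welch_df)
ci      <- diff_means + c(-1, 1) * qt(0.975, welch_df) * std_error

# Compare with the built-in test; both calls should return TRUE.
reference <- t.test(oj, vc)
all.equal(p_value, reference$p.value)
all.equal(ci, as.numeric(reference$conf.int))
```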

# Dose level of 0.5 mg
T_test_1<-with(mydata[mydata$dose == 0.5,],t.test(length~supplement))
T_test_1$p.value; T_test_1$conf.int # p-value and the 95% confidence interval 
## [1] 0.006358607
## [1] 1.719057 8.780943
## attr(,"conf.level")
## [1] 0.95
# Dose level of 1 mg
T_test_2<-with(mydata[mydata$dose == 1.,],t.test(length~supplement))
T_test_2$p.value; T_test_2$conf.int # p-value and the 95% confidence interval 
## [1] 0.001038376
## [1] 2.802148 9.057852
## attr(,"conf.level")
## [1] 0.95
# Dose level of 2 mg
T_test_3<-with(mydata[mydata$dose == 2.,],t.test(length~supplement))
T_test_3$p.value; T_test_3$conf.int # p-value and the 95% confidence interval 
## [1] 0.9638516
## [1] -3.79807  3.63807
## attr(,"conf.level")
## [1] 0.95

The above t-tests show that the null hypothesis is rejected in the first two cases, i.e. for dose levels of 0.5 and 1 mg (p-values of 0.0064 and 0.0010, both < 0.05, and 95% confidence intervals that do not contain zero). At these doses the orange juice and ascorbic acid supplements therefore differ in their effect on tooth length; the intervals, computed for orange juice minus ascorbic acid, lie entirely above zero, so orange juice gives the greater mean length. However, in the last case (2 mg), we fail to reject the null hypothesis (p-value = 0.9639 > 0.05, and the 95% confidence interval contains zero). The data thus give no evidence of a difference between the two delivery methods at the dose level of 2 mg, which is not the same as showing that they are equivalent.
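The three tests can also be collected into a single table, which makes the decision at each dose easier to read off. This is only a sketch of one possible arrangement: it uses ToothGrowth with its original column names so that it runs on its own, and the names results and reject_null are illustrative.

```r
library(datasets)

doses <- sort(unique(ToothGrowth$dose))

# One Welch t-test per dose level, gathered into a data frame.
results <- do.call(rbind, lapply(doses, function(d) {
  test <- t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])
  data.frame(
    dose     = d,
    p_value  = test$p.value,
    ci_lower = test$conf.int[1],
    ci_upper = test$conf.int[2]
  )
}))

# Flag the doses where the null hypothesis is rejected at the 5% level.
results$reject_null <- results$p_value < 0.05
print(results)
```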

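Conclusions: the t-tests support the initial observations from the exploratory data analysis. Orange juice produces greater tooth length than ascorbic acid at dose levels of 0.5 and 1 mg, whereas no significant difference between the supplements is found at the dose level of 2 mg. These conclusions assume that the guinea pigs are independent of one another (unpaired groups) and that tooth length within each group is approximately normally distributed; equal variances are not assumed, since t.test uses the Welch correction by default.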
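The t-tests rely on tooth length being approximately normally distributed within each group. One possible check, offered here as an assumption rather than as part of the original analysis, is a Shapiro-Wilk test per supplement/dose group. With only 10 animals per group the test has little power, so a large p-value is weak reassurance at best. The names normality and shapiro_p are illustrative.

```r
library(datasets)

# Shapiro-Wilk p-value for each of the six supplement/dose groups.
normality <- aggregate(
  len ~ supp + dose,
  data = ToothGrowth,
  FUN = function(x) shapiro.test(x)$p.value
)
names(normality)[3] <- "shapiro_p"
print(normality)
```

Small values of shapiro_p would suggest a departure from normality in that group; a normal Q-Q plot (qqnorm) is a common visual complement.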