Overview

The second project analyzes the ToothGrowth data. The dataset contains 60 observations of three variables. The response is the tooth length in each of 10 guinea pigs at each of three dose levels of vitamin C (0.5, 1, and 2 mg) with each of two delivery methods, orange juice (OJ) or ascorbic acid (VC). The relationship between tooth length, supplement type, and dose is explored with boxplots and then checked with t-tests.

Loading Data

library(datasets)
data(ToothGrowth)
tooth <- ToothGrowth
str(tooth) # check structure
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

Effect of supplement type and vitamin C dose on tooth length

MeanSupp = split(ToothGrowth$len, ToothGrowth$supp)
sapply(MeanSupp, mean)
##       OJ       VC 
## 20.66333 16.96333
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.3.1
ggplot(aes(x=supp, y=len), data=ToothGrowth) + geom_boxplot(aes(fill=supp))+ 
        xlab("Supplement_type") +ylab("Tooth_length") 

MeanDose = split(ToothGrowth$len, ToothGrowth$dose)
sapply(MeanDose, mean)
##    0.5      1      2 
## 10.605 19.735 26.100
tooth$dose <- factor(tooth$dose) # convert class of dose into factor
head(tooth) 
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
ggplot(tooth, aes(supp, len, fill = supp)) + geom_boxplot()

ggplot(tooth, aes(dose, len, fill = dose)) + geom_boxplot()

ggplot(aes(y = len, x = dose, fill = supp), data = tooth) + geom_boxplot()
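
The boxplot of dose by supplement can be complemented with a table of group means for each of the six supplement/dose combinations. This is a minimal sketch using `aggregate()` from base R; the object name `group_means` is introduced here for illustration and is not part of the original analysis.

```r
library(datasets)

# Mean tooth length for every supplement/dose combination (6 groups of 10 animals)
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
group_means
```

Reading the rows pairwise by dose shows how the gap between OJ and VC changes as the dose increases.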

Inferential Statistics

len1<-ToothGrowth$len
supp1<-ToothGrowth$supp
dose1<-ToothGrowth$dose
sapply(MeanSupp, var)
##       OJ       VC 
## 43.63344 68.32723
t.test(len1[supp1=="OJ"], len1[supp1=="VC"], paired = FALSE, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  len1[supp1 == "OJ"] and len1[supp1 == "VC"]
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

The p-value of this test is 0.06063, which is close to but above the 5% significance level, and the 95% confidence interval (-0.171, 7.571) includes 0. This can be interpreted as insufficient evidence to reject the null hypothesis that the two supplements give the same mean tooth length, keeping in mind that the 0.05 threshold is only a conventional cutoff.
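
The same Welch comparison can also be written with the formula interface of `t.test()`, which makes it easy to pull the p-value and confidence interval out of the result. This is a minimal sketch; the object name `supp_test` is an assumption introduced for illustration.

```r
library(datasets)

# Welch two-sample t-test of tooth length by supplement type (formula interface)
supp_test <- t.test(len ~ supp, data = ToothGrowth, var.equal = FALSE)

supp_test$p.value    # two-sided p-value
supp_test$conf.int   # 95% confidence interval for mean(OJ) - mean(VC)

# TRUE when the confidence interval contains 0
supp_test$conf.int[1] < 0 && supp_test$conf.int[2] > 0
```

For a two-sided test, the 95% interval contains 0 exactly when the p-value is above 0.05, so the two checks always agree.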

Test of tooth length between vitamin C dose levels (2 mg vs. 1 mg)

t.test(len1[dose1==2], len1[dose1==1], paired = FALSE, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  len1[dose1 == 2] and len1[dose1 == 1]
## t = 4.9005, df = 38, p-value = 1.811e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  3.735613 8.994387
## sample estimates:
## mean of x mean of y 
##    26.100    19.735

The p-value of the above test is 1.811e-05, far below 0.05, so the null hypothesis of equal mean tooth length at the 1 mg and 2 mg doses can be rejected. This test pools both supplement types. The boxplots suggest that tooth length increases with dose for both supplements. Orange juice tends to be associated with longer teeth, especially at the lower dose levels. At the 2 mg dose, the two supplement types show little difference in tooth growth.
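
The statement that the supplement effect depends on the dose level can be checked by comparing OJ and VC separately within each dose. The sketch below is one possible way to do that with Welch t-tests; the names `dose_levels`, `at_dose`, and `p_by_dose` are introduced here for illustration and are not part of the original analysis.

```r
library(datasets)

# Welch t-test of OJ vs VC within each dose level (0.5, 1, and 2 mg)
dose_levels <- sort(unique(ToothGrowth$dose))
p_by_dose <- sapply(dose_levels, function(d) {
  at_dose <- ToothGrowth[ToothGrowth$dose == d, ]   # 20 animals at this dose
  t.test(len ~ supp, data = at_dose, var.equal = FALSE)$p.value
})
names(p_by_dose) <- dose_levels
p_by_dose
```

Because this runs three tests on the same data, an adjustment such as `p.adjust(p_by_dose, method = "bonferroni")` may be appropriate before drawing conclusions.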