This report analyses the ToothGrowth data from the R datasets package. The aims of this analysis are:
1. Perform some basic exploratory data analysis and provide a basic summary of the data.
2. Compare tooth growth by supp and dose using confidence intervals and hypothesis tests.
3. State the conclusions and the assumptions that lead to them.
library(datasets)
library(ggplot2)
p <- ggplot(ToothGrowth, aes(factor(dose), len))
p + geom_boxplot(aes(fill = supp)) + facet_grid(. ~ supp) + labs(title = "Figure 1: Tooth Growth by Supp and Dose")
q <- ggplot(ToothGrowth, aes(factor(supp), len))
q + geom_boxplot(aes(fill = supp)) + facet_grid(. ~ dose) + labs(title = "Figure 2: Difference of Tooth Growth by Supp for each Dose")
Figure 1 shows that both supps are associated with tooth growth: for OJ and for VC alike, tooth length increases across the three doses of 0.5, 1.0 and 2.0.
Figure 2 shows the same relationship between dose and tooth growth (the higher the dose, the greater the growth) and compares the two supps at each dose. At doses of 0.5 and 1.0, OJ has a larger effect than VC. At a dose of 2.0, however, VC appears to have a slightly larger effect than OJ.
library(plyr)
data(ToothGrowth)
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
summary(ToothGrowth)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
TGSummary <- ddply(ToothGrowth,.(dose,supp),summarize, mean = mean(len), sd = sd(len))
#as.factor(TGSummary$dose)
TGSummary
##   dose supp  mean       sd
## 1  0.5   OJ 13.23 4.459709
## 2  0.5   VC  7.98 2.746634
## 3  1.0   OJ 22.70 3.910953
## 4  1.0   VC 16.77 2.515309
## 5  2.0   OJ 26.06 2.655058
## 6  2.0   VC 26.14 4.797731
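As a cross-check on the table above, the same group statistics can be computed in base R without plyr. This is only a minimal sketch: the names `group_means` and `group_sds` are illustrative and are not part of the original analysis.

```r
library(datasets)

# Mean and standard deviation of len for every dose (rows) by supp (columns).
# group_means and group_sds are hypothetical names used only in this sketch.
group_means <- with(ToothGrowth, tapply(len, list(dose = dose, supp = supp), mean))
group_sds   <- with(ToothGrowth, tapply(len, list(dose = dose, supp = supp), sd))

round(group_means, 2)
round(group_sds, 2)
```

The result is a 3 x 2 matrix for each statistic, which makes it easy to read the OJ and VC values side by side at each dose.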
#ToothGrowth <-transform(ToothGrowth, dose = as.factor(dose))
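# Split into six groups ordered OJ.0.5, VC.0.5, OJ.1, VC.1, OJ.2, VC.2; column 1 of each group is len
s <- split(ToothGrowth, list(ToothGrowth$supp, ToothGrowth$dose))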
t1 <- t.test(s[[1]][[1]],s[[2]][[1]], paired = TRUE, alternative = "greater")
t1
##
## Paired t-test
##
## data: s[[1]][[1]] and s[[2]][[1]]
## t = 2.9791, df = 9, p-value = 0.007736
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 2.019552 Inf
## sample estimates:
## mean of the differences
## 5.25
t2 <- t.test(s[[3]][[1]],s[[4]][[1]], paired = TRUE, alternative = "greater")
t2
##
## Paired t-test
##
## data: s[[3]][[1]] and s[[4]][[1]]
## t = 3.3721, df = 9, p-value = 0.004115
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 2.706401 Inf
## sample estimates:
## mean of the differences
## 5.93
t3 <- t.test(s[[5]][[1]],s[[6]][[1]], paired = TRUE, alternative = "greater")
t3
##
## Paired t-test
##
## data: s[[5]][[1]] and s[[6]][[1]]
## t = -0.042592, df = 9, p-value = 0.5165
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -3.523109 Inf
## sample estimates:
## mean of the differences
## -0.08
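The t statistic, the confidence bound and the p-value are separate components of the object returned by `t.test`, and they are easy to mix up when read off the printed output. The sketch below repeats the dose 0.5 test from above, indexing the groups by name rather than by position, and extracts each component individually.

```r
library(datasets)

# Same split as in the report; the group names come from supp and dose.
s <- split(ToothGrowth, list(ToothGrowth$supp, ToothGrowth$dose))

# Dose 0.5: OJ versus VC, with the same settings as t1 above.
t1 <- t.test(s[["OJ.0.5"]]$len, s[["VC.0.5"]]$len,
             paired = TRUE, alternative = "greater")

t1$statistic  # t statistic
t1$parameter  # degrees of freedom
t1$conf.int   # one-sided 95% interval: lower bound, then Inf
t1$p.value    # p-value
```

With `alternative = "greater"` the interval is one-sided, so only its lower bound is informative.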
For dose 0.5, the lower bound of the one-sided 95% confidence interval is 2.0196 and the p-value is 0.007736. Since the p-value is less than 0.05, we reject the null hypothesis: OJ gives greater tooth growth than VC at this dose.
For dose 1.0, the lower bound of the confidence interval is 2.7064 and the p-value is 0.004115. Since the p-value is less than 0.05, we again reject the null hypothesis: OJ gives greater tooth growth than VC at this dose.
For dose 2.0, the lower bound of the confidence interval is -3.5231 and the p-value is 0.5165. Since the p-value is greater than 0.05, we fail to reject the null hypothesis: there is no evidence that OJ gives greater tooth growth than VC at this dose.
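The tests above use `paired = TRUE`. The ToothGrowth help page describes 60 guinea pigs, each given a single combination of dose and supp, so the OJ and VC groups could instead be treated as independent samples. As a hedged alternative, and not the method used in this report, the sketch below runs an unpaired Welch test for dose 0.5. The names `oj`, `vc` and `welch` are illustrative.

```r
library(datasets)

# Tooth lengths for each supp at dose 0.5 (hypothetical helper names).
oj <- subset(ToothGrowth, supp == "OJ" & dose == 0.5)$len
vc <- subset(ToothGrowth, supp == "VC" & dose == 0.5)$len

# Welch two-sample t-test: unpaired, equal variances not assumed.
welch <- t.test(oj, vc, paired = FALSE, var.equal = FALSE,
                alternative = "greater")

welch$estimate  # the two group means
welch$conf.int  # one-sided 95% interval for the difference in means
welch$p.value   # p-value under the independent-samples assumption
```

The same call can be repeated for doses 1.0 and 2.0 to check whether the conclusions depend on the pairing assumption.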