Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package.
Below we show the structure, the first few rows, and summary statistics of the dataset.
Basic summary of the data
library(datasets)
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
summary(ToothGrowth)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
Number of rows: 60; number of columns: 3.
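The dimensions quoted above can be read directly from the data frame. A minimal sketch; the names `tg`, `n_rows` and `n_cols` are illustrative and not part of the original analysis:

```r
# Work on a copy so the ToothGrowth object used elsewhere is left untouched
tg <- datasets::ToothGrowth

n_rows <- nrow(tg)  # number of observations
n_cols <- ncol(tg)  # number of variables

cat("Number of rows:", n_rows, "and columns:", n_cols, "\n")
```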
Loading some libraries
library(ggplot2)
library(GGally)
Below we perform exploratory data analyses
# Stacked bars: each bar stacks the individual tooth lengths of one dose/supplement group
ggplot(data=ToothGrowth, aes(x=as.factor(dose), y=len, fill=supp)) + geom_bar(stat="identity") +
facet_grid(. ~ supp) + xlab("Dose") + ylab("Length") +
guides(fill=guide_legend(title="Supplement type"))
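Because several observations share each dose, the stacked bars above show group totals rather than typical values. As a complementary view, a box plot shows the spread of len within each dose and supplement. This is only a sketch of one alternative; the object names `tg` and `p` are assumptions introduced here.

```r
library(ggplot2)

# Copy the data so the original ToothGrowth object is not modified
tg <- datasets::ToothGrowth

# One box per dose, one panel per supplement type
p <- ggplot(tg, aes(x = factor(dose), y = len, fill = supp)) +
  geom_boxplot() +
  facet_grid(. ~ supp) +
  xlab("Dose") + ylab("Length") +
  guides(fill = guide_legend(title = "Supplement"))

print(p)
```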
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
summary(ToothGrowth)
##       len        supp     dose
##  Min.   : 4.20   OJ:30   0.5:20
##  1st Qu.:13.07   VC:30   1  :20
##  Median :19.25           2  :20
##  Mean   :18.81
##  3rd Qu.:25.27
##  Max.   :33.90
table(ToothGrowth$supp, ToothGrowth$dose)
##
##      0.5  1  2
##   OJ  10 10 10
##   VC  10 10 10
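The table shows a balanced design: every supp/dose cell holds 10 observations. A small sketch of checking this programmatically (the names `tg`, `counts` and `is_balanced` are illustrative):

```r
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Cross-tabulate supplement by dose and test whether all cell counts agree
counts <- table(tg$supp, tg$dose)
is_balanced <- all(counts == counts[1])
is_balanced
```

Balance matters for the ANOVA that follows: with equal cell sizes, the sequential sums of squares reported by aov do not depend on the order in which the terms are entered.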
Using a two-way ANOVA with an interaction term (aov), we summarize the hypothesis tests and confidence intervals for the variables supp and dose in a single view.
an <- aov(len ~ supp * dose, data=ToothGrowth)
summary(an)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## supp         1  205.4   205.4  15.572 0.000231 ***
## dose         2 2426.4  1213.2  92.000  < 2e-16 ***
## supp:dose    2  108.3    54.2   4.107 0.021860 *
## Residuals   54  712.1    13.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
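Each F value in the table is the effect's mean square divided by the residual mean square, and the p-value is the upper tail of the F distribution with the corresponding degrees of freedom. For supp, for example, 205.4 / 13.2 is roughly 15.57. The sketch below redoes this calculation from a refitted model; `tg`, `fit` and `tab` are names assumed here for illustration.

```r
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)
fit <- aov(len ~ supp * dose, data = tg)

# summary() returns a list; its first element is the ANOVA table
tab <- summary(fit)[[1]]

# Rows are in model order: supp, dose, supp:dose, Residuals
ms_supp  <- tab[1, "Mean Sq"]
ms_resid <- tab[4, "Mean Sq"]

# F statistic and its upper-tail p-value
f_supp <- ms_supp / ms_resid
p_supp <- pf(f_supp, df1 = tab[1, "Df"], df2 = tab[4, "Df"], lower.tail = FALSE)

c(F = f_supp, p = p_supp)
```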
confint(an)
##                   2.5 %    97.5 %
## (Intercept)  10.9276907 15.532309
## suppVC       -8.5059571 -1.994043
## dose1         6.2140429 12.725957
## dose2         9.5740429 16.085957
## suppVC:dose1 -5.2846186  3.924619
## suppVC:dose2  0.7253814  9.934619
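The intervals above are t-based: each is the coefficient estimate plus or minus a t quantile (on the 54 residual degrees of freedom) times the standard error. With R's default treatment contrasts, the intercept is the mean of the reference cell (OJ at dose 0.5) and the other coefficients are differences relative to it. A sketch reproducing the intervals by hand, with illustrative names (`tg`, `fit`, `manual_ci`):

```r
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)
fit <- aov(len ~ supp * dose, data = tg)

est   <- coef(fit)                         # coefficient estimates
se    <- sqrt(diag(vcov(fit)))             # their standard errors
tcrit <- qt(0.975, df = df.residual(fit))  # two-sided 95% critical value

manual_ci <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
manual_ci
```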
print(model.tables(an,"means"),digits=3)
## Tables of means
## Grand mean
##
## 18.81333
##
## supp
## supp
##    OJ    VC
## 20.66 16.96
##
## dose
## dose
##   0.5     1     2
## 10.60 19.73 26.10
##
## supp:dose
##     dose
## supp 0.5   1     2
##   OJ 13.23 22.70 26.06
##   VC  7.98 16.77 26.14
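The supp:dose table is simply the mean of len within each combination of the two factors, and the supp and dose tables are the corresponding marginal means. A sketch of computing them directly with base R (all object names are illustrative):

```r
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Mean length per supplement (rows) and dose (columns)
cell_means <- tapply(tg$len, list(supp = tg$supp, dose = tg$dose), mean)
round(cell_means, 2)

# Marginal means for each factor
supp_means <- tapply(tg$len, tg$supp, mean)
dose_means <- tapply(tg$len, tg$dose, mean)
```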
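Based on the results above, and assuming that the sampling distribution of the group means is approximately normal, both supplement type and dose have a statistically significant effect on tooth length (supp: p = 0.000231; dose: p < 2e-16). On average, OJ is associated with longer teeth than VC (20.66 vs. 16.96). The significant supp:dose interaction (p = 0.0219) indicates that this gap depends on dose: it is clear at doses 0.5 and 1 but nearly disappears at dose 2 (26.06 vs. 26.14).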
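Two quick follow-ups can make this conclusion more concrete: a normality check on the model residuals, and a comparison of OJ against VC within each dose. This is a sketch of one possible check, not part of the original analysis; the Shapiro-Wilk test and the per-dose Welch t-tests are choices assumed here, and `tg`, `fit`, `normality` and `p_by_dose` are illustrative names.

```r
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)
fit <- aov(len ~ supp * dose, data = tg)

# Shapiro-Wilk test on the residuals; a small p-value would suggest non-normality
normality <- shapiro.test(residuals(fit))
normality$p.value

# Welch two-sample t-test of OJ vs VC, run separately for each dose level
p_by_dose <- sapply(levels(tg$dose), function(d) {
  t.test(len ~ supp, data = tg[tg$dose == d, ])$p.value
})
p_by_dose
```

No multiple-comparison adjustment is applied in this sketch; p.adjust could be used if the three tests are interpreted together.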