Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package.
The ToothGrowth dataset records tooth growth (the length of odontoblasts) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1 and 2 mg/day) by one of two delivery methods: orange juice (coded OJ) or ascorbic acid (coded VC).
First we load the data and inspect it by printing its first records (head) and its structure (str):
library(datasets)
data(ToothGrowth)
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
The data frame has 3 columns: len (numeric, tooth length), supp (factor with levels OJ and VC) and dose (numeric).
We use the summary command to describe the data:
summary(ToothGrowth)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
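The ToothGrowth dataset consists of 60 rows, 30 for each supplement. A boxplot of tooth length by supplement and dose gives a first look at the data: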
boxplot(len ~ supp * dose, data=ToothGrowth, ylab="Tooth Length", main="Tooth Growth Data")
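To complement the boxplot, the group sizes and group means can also be tabulated. The following is a minimal sketch in base R; the names cell.counts and cell.means are illustrative and do not appear in the original analysis.

```r
library(datasets)
data(ToothGrowth)

# Number of guinea pigs in each supplement/dose cell
cell.counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell.counts)

# Mean tooth length in each supplement/dose cell
cell.means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell.means)
```

The design is balanced, so every cell of cell.counts should hold the same number of animals, and the rows of cell.means should match the group means reported by the t-tests in the following sections.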
I run one Welch two-sample t-test per dose, comparing the two supplements:
1. Dose = 0.5
lowest.dose <- ToothGrowth[ToothGrowth$dose == 0.5, ]
# paired is left at its default (unpaired); R 4.4.0 and later reject a paired argument in the formula method
t.test(len ~ supp, var.equal=FALSE, data=lowest.dose)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
## 13.23 7.98
The 95 percent confidence interval (1.719, 8.781) does not contain 0 and the p-value is 0.006, so we reject the null hypothesis of equal means. At this dose, mean tooth length is higher with OJ (13.23) than with VC (7.98).
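To make explicit where these numbers come from, the Welch statistic, its degrees of freedom and the confidence interval can be recomputed by hand for the 0.5 dose. This is only a sketch of the standard Welch formulas; the variable names (oj, vc, se2.oj, se2.vc, t.stat, welch.df, ci) are chosen for illustration.

```r
library(datasets)
data(ToothGrowth)

# Tooth lengths at the 0.5 dose, split by supplement
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]

# Squared standard error contributed by each group: s^2 / n
se2.oj <- var(oj) / length(oj)
se2.vc <- var(vc) / length(vc)

# Welch t statistic: difference in means over the combined standard error
t.stat <- (mean(oj) - mean(vc)) / sqrt(se2.oj + se2.vc)

# Welch-Satterthwaite approximation to the degrees of freedom
welch.df <- (se2.oj + se2.vc)^2 /
  (se2.oj^2 / (length(oj) - 1) + se2.vc^2 / (length(vc) - 1))

# Two-sided p-value and 95 percent confidence interval
p.value <- 2 * pt(-abs(t.stat), welch.df)
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, welch.df) * sqrt(se2.oj + se2.vc)

print(c(t = t.stat, df = welch.df, p.value = p.value))
print(ci)
```

The printed values should agree with the t.test output above (t = 3.1697, df = 14.969), which is a useful check that the test was specified as intended.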
2. Dose = 1.0
medium.dose <- ToothGrowth[ToothGrowth$dose == 1.0, ]
t.test(len ~ supp, var.equal=FALSE, data=medium.dose)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC
## 22.70 16.77
The 95 percent confidence interval (2.802, 9.058) does not contain 0 and the p-value is 0.001, so we reject the null hypothesis of equal means. At this dose, mean tooth length is higher with OJ (22.70) than with VC (16.77).
3. Dose = 2.0
high.dose <- ToothGrowth[ToothGrowth$dose == 2.0, ]
t.test(len ~ supp, var.equal=FALSE, data=high.dose)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = -0.0461, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OJ mean in group VC
## 26.06 26.14
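The 95 percent confidence interval (-3.798, 3.638) contains 0 and the p-value is 0.96, so we cannot reject the null hypothesis of equal means. The group means (26.06 for OJ, 26.14 for VC) are nearly identical, so I find no evidence that the supplements differ at this dose.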
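The three per-dose tests can also be run in one pass and collected into a small table. The sketch below assumes a helper named test.at.dose, which is hypothetical and introduced only for illustration; it applies the same Welch test as above to each dose.

```r
library(datasets)
data(ToothGrowth)

# Run the OJ versus VC Welch test at one dose and keep the key numbers
test.at.dose <- function(d) {
  subset.d <- ToothGrowth[ToothGrowth$dose == d, ]
  res <- t.test(len ~ supp, var.equal = FALSE, data = subset.d)
  c(dose = d,
    ci.lower = res$conf.int[1],
    ci.upper = res$conf.int[2],
    p.value = res$p.value)
}

# One row per dose, one column per reported quantity
by.dose <- t(sapply(c(0.5, 1, 2), test.at.dose))
print(round(by.dose, 4))
```

Collecting the results this way puts the three intervals side by side, which makes it easier to see that only the interval for the highest dose straddles 0.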
Now we repeat the tests within each supplement, comparing adjacent dose levels:
OJ Supplement:
OJ.lowest <- ToothGrowth[ToothGrowth$supp == 'OJ' & ToothGrowth$dose == 0.5, ]
OJ.medium <- ToothGrowth[ToothGrowth$supp == 'OJ' & ToothGrowth$dose == 1.0, ]
OJ.largest <- ToothGrowth[ToothGrowth$supp == 'OJ' & ToothGrowth$dose == 2.0, ]
t.test(OJ.lowest$len, OJ.medium$len, paired=FALSE, var.equal=FALSE)
##
## Welch Two Sample t-test
##
## data: OJ.lowest$len and OJ.medium$len
## t = -5.0486, df = 17.698, p-value = 8.785e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.415634 -5.524366
## sample estimates:
## mean of x mean of y
## 13.23 22.70
1. Low versus medium dose for OJ
The confidence interval (-13.416, -5.524) does not contain 0 and the mean is lower at the low dose (13.23 versus 22.70), so increasing the dose from 0.5 to 1 increases tooth growth.
t.test(OJ.medium$len, OJ.largest$len, paired=FALSE, var.equal=FALSE)
##
## Welch Two Sample t-test
##
## data: OJ.medium$len and OJ.largest$len
## t = -2.2478, df = 15.842, p-value = 0.0392
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.5314425 -0.1885575
## sample estimates:
## mean of x mean of y
## 22.70 26.06
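2. Medium versus high dose for OJ
The confidence interval (-6.531, -0.189) does not contain 0, although only narrowly (p-value 0.039), and the mean is lower at the medium dose (22.70 versus 26.06), so increasing the dose from 1 to 2 increases tooth growth.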
VC Supplement:
VC.lowest <- ToothGrowth[ToothGrowth$supp == 'VC' & ToothGrowth$dose == 0.5, ]
VC.medium <- ToothGrowth[ToothGrowth$supp == 'VC' & ToothGrowth$dose == 1.0, ]
VC.largest <- ToothGrowth[ToothGrowth$supp == 'VC' & ToothGrowth$dose == 2.0, ]
t.test(VC.lowest$len, VC.medium$len, paired=FALSE, var.equal=FALSE)
##
## Welch Two Sample t-test
##
## data: VC.lowest$len and VC.medium$len
## t = -7.4634, df = 17.862, p-value = 6.811e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.265712 -6.314288
## sample estimates:
## mean of x mean of y
## 7.98 16.77
3. Low versus medium dose for VC
The confidence interval (-11.266, -6.314) does not contain 0 and the mean is lower at the low dose (7.98 versus 16.77), so increasing the dose from 0.5 to 1 increases tooth growth.
t.test(VC.medium$len, VC.largest$len, paired=FALSE, var.equal=FALSE)
##
## Welch Two Sample t-test
##
## data: VC.medium$len and VC.largest$len
## t = -5.4698, df = 13.6, p-value = 9.156e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.054267 -5.685733
## sample estimates:
## mean of x mean of y
## 16.77 26.14
4. Medium versus high dose for VC
The confidence interval (-13.054, -5.686) does not contain 0 and the mean is lower at the medium dose (16.77 versus 26.14), so increasing the dose from 1 to 2 increases tooth growth.
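The four within-supplement comparisons share the same pattern, so they can be expressed with a single helper. The sketch below assumes a hypothetical function compare.doses, introduced here for illustration; it runs the same unpaired Welch test on the two dose groups of one supplement.

```r
library(datasets)
data(ToothGrowth)

# Welch test of tooth length between two doses of one supplement
compare.doses <- function(supp, low, high) {
  x <- ToothGrowth$len[ToothGrowth$supp == supp & ToothGrowth$dose == low]
  y <- ToothGrowth$len[ToothGrowth$supp == supp & ToothGrowth$dose == high]
  res <- t.test(x, y, paired = FALSE, var.equal = FALSE)
  data.frame(supp = supp, low = low, high = high,
             ci.lower = res$conf.int[1], ci.upper = res$conf.int[2],
             p.value = res$p.value)
}

# One row per comparison of adjacent doses
by.supp <- rbind(compare.doses("OJ", 0.5, 1), compare.doses("OJ", 1, 2),
                 compare.doses("VC", 0.5, 1), compare.doses("VC", 1, 2))
print(by.supp)
```

Because the lower dose is always passed first, a confidence interval that lies entirely below 0 means the higher dose has the larger mean.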
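Summarizing the previous analysis:
OJ yields greater tooth growth than VC at doses of 0.5 and 1.
There is no significant difference between OJ and VC at a dose of 2.
For the same supplement, larger doses increase tooth growth.
We have used the following assumptions:
Variances are not assumed to be equal (Welch test).
The data are not paired (each guinea pig received a single treatment).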
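One point the analysis does not address is that seven tests were each run at the 5 percent level without any adjustment for multiple comparisons. As a hedged sketch, and not part of the original analysis, the p-values can be recomputed and adjusted with Holm's method using base R's p.adjust; the helpers welch.p and cell are hypothetical names.

```r
library(datasets)
data(ToothGrowth)

# p-value of an unpaired Welch test between two vectors of tooth lengths
welch.p <- function(x, y) t.test(x, y, var.equal = FALSE)$p.value

# Tooth lengths for one supplement/dose cell
cell <- function(supp, dose) {
  ToothGrowth$len[ToothGrowth$supp == supp & ToothGrowth$dose == dose]
}

# Raw p-values for the seven tests run in this report
p.raw <- c(
  "OJ vs VC at 0.5" = welch.p(cell("OJ", 0.5), cell("VC", 0.5)),
  "OJ vs VC at 1"   = welch.p(cell("OJ", 1),   cell("VC", 1)),
  "OJ vs VC at 2"   = welch.p(cell("OJ", 2),   cell("VC", 2)),
  "OJ 0.5 vs 1"     = welch.p(cell("OJ", 0.5), cell("OJ", 1)),
  "OJ 1 vs 2"       = welch.p(cell("OJ", 1),   cell("OJ", 2)),
  "VC 0.5 vs 1"     = welch.p(cell("VC", 0.5), cell("VC", 1)),
  "VC 1 vs 2"       = welch.p(cell("VC", 1),   cell("VC", 2))
)

# Holm adjustment controls the family-wise error rate
p.holm <- p.adjust(p.raw, method = "holm")
print(round(cbind(raw = p.raw, holm = p.holm), 5))
```

Among the comparisons reported as significant, the OJ comparison between doses 1 and 2 has the raw p-value closest to 0.05 (0.0392), so it is the conclusion most sensitive to such a correction.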