Analyzing the ToothGrowth dataset

Project Simulation

The ToothGrowth dataset records tooth growth in guinea pigs that received Vitamin C at one of three dose levels (0.5, 1 and 2 mg) by one of two delivery methods (orange juice or ascorbic acid).

Load the ToothGrowth data and perform some basic exploratory data analyses

Load the data, then inspect it by printing its first records (head) and its structure (str):

library(datasets)
data(ToothGrowth)
head(ToothGrowth)

##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

str(ToothGrowth)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

The dataset has 3 columns: len (numeric), supp (factor with levels OJ and VC) and dose (numeric).
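A quick cross-tabulation can confirm how the 60 observations are spread over the supplement and dose combinations. This is a minimal sketch that assumes the standard copy of ToothGrowth from the datasets package; the names cell.counts and dose.levels are illustrative only.

```r
library(datasets)
data(ToothGrowth)

# Count the observations in each supplement-by-dose cell
cell.counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell.counts)

# List the distinct dose levels present in the data
dose.levels <- sort(unique(ToothGrowth$dose))
print(dose.levels)
```

If the design is balanced, every cell of the table holds the same number of animals, which matters for the two-sample comparisons later on.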

Provide a basic summary of the data

We use the summary command to describe the data:

summary(ToothGrowth)

##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

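The ToothGrowth dataset consists of 60 rows, 30 per supplement. A boxplot of len by supp and dose shows how tooth length varies across the six groups: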

boxplot(len ~  supp * dose, data=ToothGrowth, ylab="Tooth Length", main="Tooth Growth Data")
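The group means behind the boxplot can also be tabulated directly. A minimal sketch using base R's aggregate; the name group.means is illustrative only.

```r
library(datasets)
data(ToothGrowth)

# Mean tooth length for every supplement-by-dose combination
group.means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group.means)
```

These are the same group means that the t-tests in the next section report under "sample estimates".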

Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose

I run one Welch two-sample t-test per dose, comparing the two supplements:

1. Dose = 0.5

lowest.dose <- ToothGrowth[ToothGrowth$dose == 0.5, ]
t.test(len ~ supp, paired=FALSE, var.equal=FALSE, data=lowest.dose)

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98

The 95 percent confidence interval (1.719, 8.781) does not contain 0, and the p-value is 0.006. At this dose, mean tooth length is significantly higher with OJ (13.23) than with VC (7.98).
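To see where the numbers in the dose 0.5 output come from, the Welch statistic, its degrees of freedom and the confidence interval can be recomputed by hand. This is a sketch for illustration, not a replacement for t.test; all variable names are hypothetical.

```r
library(datasets)
data(ToothGrowth)

low <- ToothGrowth[ToothGrowth$dose == 0.5, ]
oj <- low$len[low$supp == "OJ"]
vc <- low$len[low$supp == "VC"]

# Variance of each sample mean (sample variance divided by sample size)
v.oj <- var(oj) / length(oj)
v.vc <- var(vc) / length(vc)

# Standard error of the difference in means, without assuming equal variances
se.diff <- sqrt(v.oj + v.vc)

# Welch t statistic for OJ minus VC
t.stat <- (mean(oj) - mean(vc)) / se.diff

# Welch-Satterthwaite approximation to the degrees of freedom
df.welch <- (v.oj + v.vc)^2 /
  (v.oj^2 / (length(oj) - 1) + v.vc^2 / (length(vc) - 1))

# Two-sided 95 percent confidence interval for the difference in means
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df.welch) * se.diff

print(c(t = t.stat, df = df.welch, lower = ci[1], upper = ci[2]))
```

The values should agree with the t, df and confidence interval lines printed by t.test for this dose.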

2. Dose = 1.0

medium.dose <- ToothGrowth[ToothGrowth$dose == 1.0, ]
t.test(len ~ supp, paired=FALSE, var.equal=FALSE, data=medium.dose)

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77

The 95 percent confidence interval (2.802, 9.058) does not contain 0, and the p-value is 0.001. At this dose, mean tooth length is again significantly higher with OJ (22.70) than with VC (16.77).

3. Dose = 2.0

high.dose <- ToothGrowth[ToothGrowth$dose == 2.0, ]
t.test(len ~ supp, paired=FALSE, var.equal=FALSE, data=high.dose)

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.0461, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14

The 95 percent confidence interval (-3.798, 3.638) contains 0 and the group means are nearly equal (26.06 for OJ, 26.14 for VC), so the test finds no significant difference between the two supplements at this dose.
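The three per-dose tests above can also be run in one pass and collected into a small table. A minimal sketch, assuming the same t.test arguments as above; by.dose and dose.summary are illustrative names.

```r
library(datasets)
data(ToothGrowth)

# Run the same Welch test (OJ versus VC) within each dose level
by.dose <- lapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  tt <- t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = d)
  c(lower = tt$conf.int[1], upper = tt$conf.int[2], p.value = tt$p.value)
})

# One row per dose: confidence bounds and p-value
dose.summary <- do.call(rbind, by.dose)
print(round(dose.summary, 4))
```

Reading the rows side by side makes the pattern easier to see: the interval excludes 0 at the two lower doses and straddles 0 at the highest dose.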

Now we repeat these tests within each supplement, comparing adjacent dose levels:

OJ Supplement:

OJ.lowest <- ToothGrowth[ToothGrowth$supp == 'OJ' & ToothGrowth$dose == 0.5, ]
OJ.medium <- ToothGrowth[ToothGrowth$supp == 'OJ' & ToothGrowth$dose == 1.0, ]
OJ.largest <- ToothGrowth[ToothGrowth$supp == 'OJ' & ToothGrowth$dose == 2.0, ]

t.test(OJ.lowest$len, OJ.medium$len, paired=FALSE, var.equal=FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  OJ.lowest$len and OJ.medium$len
## t = -5.0486, df = 17.698, p-value = 8.785e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -13.415634  -5.524366
## sample estimates:
## mean of x mean of y 
##     13.23     22.70

Comparison

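1. Low versus medium dose for OJ

The confidence interval (-13.416, -5.524) does not contain 0, and the mean is lower at the low dose (13.23) than at the medium dose (22.70), so raising the OJ dose from 0.5 to 1.0 is associated with greater tooth growth.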

t.test(OJ.medium$len, OJ.largest$len, paired=FALSE, var.equal=FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  OJ.medium$len and OJ.largest$len
## t = -2.2478, df = 15.842, p-value = 0.0392
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.5314425 -0.1885575
## sample estimates:
## mean of x mean of y 
##     22.70     26.06

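2. Medium versus high dose for OJ

The confidence interval (-6.531, -0.189) does not contain 0, and the mean is lower at the medium dose (22.70) than at the high dose (26.06), so raising the OJ dose from 1.0 to 2.0 is associated with greater tooth growth. With p = 0.039 and an interval that comes close to 0, this is the weakest of the four dose effects.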

VC Supplement:

VC.lowest <- ToothGrowth[ToothGrowth$supp == 'VC' & ToothGrowth$dose == 0.5, ]
VC.medium <- ToothGrowth[ToothGrowth$supp == 'VC' & ToothGrowth$dose == 1.0, ]
VC.largest <- ToothGrowth[ToothGrowth$supp == 'VC' & ToothGrowth$dose == 2.0, ]

t.test(VC.lowest$len, VC.medium$len, paired=FALSE, var.equal=FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  VC.lowest$len and VC.medium$len
## t = -7.4634, df = 17.862, p-value = 6.811e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.265712  -6.314288
## sample estimates:
## mean of x mean of y 
##      7.98     16.77

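3. Low versus medium dose for VC

The confidence interval (-11.266, -6.314) does not contain 0, and the mean is lower at the low dose (7.98) than at the medium dose (16.77), so raising the VC dose from 0.5 to 1.0 is associated with greater tooth growth.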

t.test(VC.medium$len, VC.largest$len, paired=FALSE, var.equal=FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  VC.medium$len and VC.largest$len
## t = -5.4698, df = 13.6, p-value = 9.156e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -13.054267  -5.685733
## sample estimates:
## mean of x mean of y 
##     16.77     26.14

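4. Medium versus high dose for VC

The confidence interval (-13.054, -5.686) does not contain 0, and the mean is lower at the medium dose (16.77) than at the high dose (26.14), so raising the VC dose from 1.0 to 2.0 is associated with greater tooth growth.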
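Because several tests are run on the same data, one optional check is to adjust the four dose-comparison p-values for multiple testing. The Holm adjustment below is not part of the original analysis; it is a sketch using base R's p.adjust, and the helper dose.p and the vector names are hypothetical.

```r
library(datasets)
data(ToothGrowth)

# Hypothetical helper: Welch test of tooth length between two doses
# of one supplement, returning only the p-value
dose.p <- function(supplement, low, high) {
  x <- ToothGrowth$len[ToothGrowth$supp == supplement & ToothGrowth$dose == low]
  y <- ToothGrowth$len[ToothGrowth$supp == supplement & ToothGrowth$dose == high]
  t.test(x, y, paired = FALSE, var.equal = FALSE)$p.value
}

# Unadjusted p-values for the four adjacent-dose comparisons
raw.p <- c(
  OJ.low.vs.medium  = dose.p("OJ", 0.5, 1.0),
  OJ.medium.vs.high = dose.p("OJ", 1.0, 2.0),
  VC.low.vs.medium  = dose.p("VC", 0.5, 1.0),
  VC.medium.vs.high = dose.p("VC", 1.0, 2.0)
)

# Holm step-down adjustment across the four tests
adj.p <- p.adjust(raw.p, method = "holm")
print(signif(cbind(raw = raw.p, holm = adj.p), 4))
```

Comparing the raw and adjusted columns shows whether any conclusion depends on ignoring the number of tests performed.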

Giovanni Melo Carvalho Viglioni

Thursday, September 24, 2015
