Prelims

In this project we will analyze the ToothGrowth data in the R datasets package.

Steps

These are the steps we are going to follow:

  1. Load the ToothGrowth data and perform some basic exploratory data analyses.
  2. Provide a basic summary of the data.
  3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.
  4. State your conclusions and the assumptions needed for your conclusions.

Load Data

The dataset we will load captures the effect of vitamin C on tooth growth in guinea pigs.

data("ToothGrowth")
data <- ToothGrowth

Data Summary

Next, we present a summary of the data.

summary(data)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

We can also inspect the structure of the data.

str(data)            # structure of data
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
class(data)          # "data.frame"
## [1] "data.frame"
sapply(data, class)  # show classes of all columns
##       len      supp      dose 
## "numeric"  "factor" "numeric"
typeof(data)         # "list"
## [1] "list"
names(data)          # show list components
## [1] "len"  "supp" "dose"
dim(data)            # dimensions of object, if any
## [1] 60  3
head(data)           # extract first few (default 6) parts
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
tail(data,1)        # extract last row
##    len supp dose
## 60  23   OJ    2

Finally, we tabulate the number of observations for each combination of supplement and dose, and plot tooth length by supplement, by dose, and by both together using boxplots.

counts <- table(data$supp, data$dose)  # avoid masking the table() function
counts
##     
##      0.5  1  2
##   OJ  10 10 10
##   VC  10 10 10
layout(matrix(c(1,2,3,3), 2, 2, byrow = TRUE), heights=c(4,5))

boxplot(len ~ supp, data = data, main = "Tooth Growth", xlab = "Supplement", ylab = "Length", col = c("blue", "red"))

boxplot(len ~ dose, data = data, main = "Tooth Growth", xlab = "Dose", ylab = "Length", col = c("blue", "red"))

boxplot(len ~ interaction(supp, dose), data = data, main = "Tooth Growth", xlab = "Supplement / Dose", ylab = "Length", col = c("blue", "red"))
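The visual impression from the boxplots can be backed by per-group means. The following is a minimal sketch, assuming the built-in `ToothGrowth` data frame; the names `tg` and `cell_means` are introduced here for illustration only and are not part of the original analysis.

```r
# Mean tooth length for each supplement/dose combination.
# `tg` and `cell_means` are illustrative names, not from the original report.
tg <- datasets::ToothGrowth

cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(cell_means)
```

Each row of `cell_means` corresponds to one box in the third boxplot, so the table and the chart can be read side by side.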

Compare tooth growth by supp and dose by using confidence intervals and/or hypothesis tests

As the previous boxplots suggest, supplement OJ appears to yield more tooth growth than supplement VC at the 0.5 mg and 1.0 mg doses. At the highest dose (2.0 mg), however, tooth growth is quite similar for the two supplements.

Supplement OJ vs VC (Hypothesis Test)

Let our null hypothesis, \(H_0\), state that there is no difference in mean tooth growth between supplement OJ and supplement VC. Let our alternative hypothesis, \(H_a\), state that supplement OJ produces more tooth growth than supplement VC.
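In symbols, writing \(\mu_{OJ}\) and \(\mu_{VC}\) for the mean tooth length under each supplement (notation introduced here for illustration, not taken from the original analysis), the null and alternative hypotheses of this one-sided test can be stated as:

\[
H_0 : \mu_{OJ} = \mu_{VC}
\qquad \text{versus} \qquad
H_a : \mu_{OJ} > \mu_{VC}
\]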

First, we split the data.

OJ = data$len[data$supp == 'OJ']
VC = data$len[data$supp == 'VC']

Then, we perform a one-sided Welch t-test.

t.test(OJ, VC, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  OJ and VC
## t = 1.9153, df = 55.309, p-value = 0.03032
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.4682687       Inf
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

As the reader can see, the p-value, 0.03032 (about 3.0%), is lower than 5%. Hence, we can reject the null hypothesis at the 5% significance level. Based on this result, the data suggest that supplement OJ produces more tooth growth on average than supplement VC.
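To make the test output less of a black box, the Welch statistic, its Welch-Satterthwaite degrees of freedom, and the one-sided p-value can be recomputed by hand. This is a sketch under the same settings as the call above (unpaired, unequal variances, alternative "greater"); all variable names are illustrative and not part of the original analysis.

```r
# Recompute the Welch two-sample t-test for OJ vs. VC step by step.
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Squared standard error contributed by each group: s^2 / n.
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch t statistic: difference in means over its standard error.
t_stat <- (mean(oj) - mean(vc)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# One-sided p-value for the alternative "OJ mean is greater".
p_value <- pt(t_stat, welch_df, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval.
ci_lower <- (mean(oj) - mean(vc)) - qt(0.95, welch_df) * se_diff

print(c(t = t_stat, df = welch_df, p = p_value, ci_lower = ci_lower))
```

The printed values should agree with the `t.test` output shown above, which is a quick way to confirm which variant of the test was run.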

Dosage 0.5 mg vs. 1.0 mg vs. 2.0 mg (Hypothesis Test)

Let our null hypothesis, \(H_0\), state that there is no difference in mean tooth growth between the 0.5 mg, 1.0 mg, and 2.0 mg doses. Let our alternative hypothesis, \(H_a\), state that a larger dose produces more tooth growth. We check this with two pairwise comparisons: 0.5 mg vs. 1.0 mg, and 1.0 mg vs. 2.0 mg.

First, we split the data.

pointFive = data$len[data$dose == 0.5]
one       = data$len[data$dose == 1.0]
two       = data$len[data$dose == 2.0]

Then, we perform a one-sided t-test on the 0.5 mg and 1.0 mg doses.

t.test(pointFive, one, alternative = "less", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  pointFive and one
## t = -6.4766, df = 37.986, p-value = 6.342e-08
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##       -Inf -6.753323
## sample estimates:
## mean of x mean of y 
##    10.605    19.735

As the reader can see, the p-value, 6.342e-08 (about 0.000006%), is far lower than 5%. Hence, we can reject the null hypothesis at the 5% significance level. Based on this result, the data suggest that the 1.0 mg dose produces more tooth growth than the 0.5 mg dose.

Subsequently, we perform a one-sided t-test on the 1.0 mg and 2.0 mg doses.

t.test(one, two, alternative = "less", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  one and two
## t = -4.9005, df = 37.101, p-value = 9.532e-06
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##      -Inf -4.17387
## sample estimates:
## mean of x mean of y 
##    19.735    26.100

As the reader can see, the p-value, 9.532e-06 (about 0.00095%), is far lower than 5%. Hence, we can reject the null hypothesis at the 5% significance level. Based on this result, the data suggest that the 2.0 mg dose produces more tooth growth than the 1.0 mg dose.

Hence, the data support the claim that larger doses produce more tooth growth. However, we still have to evaluate whether the two supplements, OJ and VC, differ at a given dose.
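Because p-values this small are easy to misread when converted to percentages, it can help to pull them out of the test objects and do the conversion in code. The sketch below assumes the same one-sided Welch tests as above; the names `len_by_dose` and `p_table` are illustrative and not part of the original analysis.

```r
# Extract the p-values of the two dose comparisons and express them as percentages.
tg <- datasets::ToothGrowth

# split() returns a list with one vector of lengths per dose: "0.5", "1", "2".
len_by_dose <- split(tg$len, tg$dose)

p_05_vs_1 <- t.test(len_by_dose[["0.5"]], len_by_dose[["1"]],
                    alternative = "less")$p.value
p_1_vs_2  <- t.test(len_by_dose[["1"]], len_by_dose[["2"]],
                    alternative = "less")$p.value

p_table <- data.frame(
  comparison = c("0.5 mg vs. 1.0 mg", "1.0 mg vs. 2.0 mg"),
  p_value    = c(p_05_vs_1, p_1_vs_2)
)
p_table$percent <- 100 * p_table$p_value  # a percentage is the proportion times 100

print(p_table)
```

Reporting the raw proportion next to the percentage makes it harder to drop or double-apply the factor of 100.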

2.0 mg dose for supplement OJ vs. VC (Hypothesis Test)

Let our null hypothesis, \(H_0\), state that there is no difference in mean tooth growth between a 2.0 mg dose of supplement OJ and a 2.0 mg dose of supplement VC. Let our alternative hypothesis, \(H_a\), state that mean tooth growth differs between the two supplements at the 2.0 mg dose. Because the boxplots do not suggest a direction at this dose, the test is two-sided.

First, we split the data.

OJtwo       = data$len[data$supp=='OJ' & data$dose == 2.0]
VCtwo       = data$len[data$supp=='VC' & data$dose == 2.0]

Then, we perform a two-sided t-test comparing supplements OJ and VC at the 2.0 mg dose.

t.test(OJtwo, VCtwo, alternative = "two.sided", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  OJtwo and VCtwo
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean of x mean of y 
##     26.06     26.14

As the reader can see, the p-value, 0.9639 (about 96.4%), is greater than 5%, and the 95% confidence interval contains zero. Hence, we cannot reject the null hypothesis. Based on this result, there is insufficient evidence that tooth growth differs between a 2.0 mg dose of supplement OJ and a 2.0 mg dose of supplement VC.
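The same OJ vs. VC comparison can be repeated at every dose level, which is one way to check the pattern the boxplots hint at. The following is a sketch only: it assumes two-sided Welch tests throughout, uses the formula interface of `t.test`, and introduces the illustrative names `by_dose` and `ci_contains_zero`. No correction for multiple comparisons is applied.

```r
# Two-sided Welch t-test of OJ vs. VC within each dose level.
tg <- datasets::ToothGrowth

by_dose <- lapply(sort(unique(tg$dose)), function(d) {
  cell <- tg[tg$dose == d, ]
  res <- t.test(len ~ supp, data = cell)  # difference is OJ minus VC
  data.frame(
    dose    = d,
    diff    = unname(res$estimate[1] - res$estimate[2]),
    ci_low  = res$conf.int[1],
    ci_high = res$conf.int[2],
    p_value = res$p.value
  )
})
by_dose <- do.call(rbind, by_dose)

# A 95% interval that contains zero means the difference is not
# significant at the 5% level.
by_dose$ci_contains_zero <- by_dose$ci_low < 0 & by_dose$ci_high > 0

print(by_dose)
```

The row for the 2.0 mg dose should reproduce the test shown above; the other two rows are additional comparisons that the original analysis does not report.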

Conclusions

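We can conclude that, pooled across doses, supplement OJ is associated with more tooth growth than supplement VC. Larger doses are also associated with more tooth growth. However, we cannot show that tooth growth differs between a 2.0 mg dose of supplement OJ and a 2.0 mg dose of supplement VC. These conclusions rest on the assumptions behind the Welch t-tests used above: the groups are independent (not paired), the observations within each group are approximately normally distributed, and the group variances are not required to be equal.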
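Step 4 asks for the assumptions behind the conclusions, and t-tests rely on each group being roughly normally distributed. One rough way to probe this is a Shapiro-Wilk test per supplement/dose group. This is a sketch, not part of the original analysis: the names `normality` and `shapiro_p` are illustrative, and with only 10 observations per group the test has little power, so it should be read alongside the boxplots rather than on its own.

```r
# Shapiro-Wilk normality test for tooth length within each supplement/dose group.
tg <- datasets::ToothGrowth

normality <- aggregate(
  len ~ supp + dose,
  data = tg,
  FUN = function(x) shapiro.test(x)$p.value
)
names(normality)[names(normality) == "len"] <- "shapiro_p"

print(normality)
```

A small `shapiro_p` would flag a group whose distribution departs from normality; large values do not prove normality, they only fail to contradict it.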