Exercise

Loading add-on package

library(ggplot2)

1. Load the ToothGrowth data and perform some basic exploratory data analyses

data(ToothGrowth)  # ToothGrowth dataset
str(ToothGrowth)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

2. Provide a basic summary of the data.

summary(ToothGrowth)

##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
         geom_boxplot(aes(fill = supp))

ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
         geom_boxplot(aes(fill = supp)) + facet_wrap(~ dose)
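The box plots can be backed up with a numeric table of group means and cell sizes. The following is a minimal sketch using base R's `aggregate()` and `table()`; the object names `group_means` and `cell_counts` are introduced here for illustration and are not part of the original analysis.

```r
data(ToothGrowth)

# Mean tooth length for each supplement/dose combination
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)

# Number of observations in each supplement/dose cell
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)
```

The group means should match the "sample estimates" reported by the per-dose t-tests later in this report.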

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.

The t-test by dose:

t.test(ToothGrowth$len[ToothGrowth$dose == 1], ToothGrowth$len[ToothGrowth$dose == 0.5],
       paired = FALSE, alternative = "greater")

## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 0.5]
## t = 6.4766, df = 37.986, p-value = 6.342e-08
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  6.753323      Inf
## sample estimates:
## mean of x mean of y 
##    19.735    10.605

t.test(ToothGrowth$len[ToothGrowth$dose == 2], ToothGrowth$len[ToothGrowth$dose == 1],
       paired = FALSE, alternative = "greater")

## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len[ToothGrowth$dose == 2] and ToothGrowth$len[ToothGrowth$dose == 1]
## t = 4.9005, df = 37.101, p-value = 9.532e-06
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  4.17387     Inf
## sample estimates:
## mean of x mean of y 
##    26.100    19.735

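Note: Increasing the dose increases mean tooth length. Both one-sided tests give p-values far below 0.05, so I reject the null hypothesis of no increase in each case.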
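The two dose comparisons can also be run through one small function so that the p-values and confidence bounds end up in a single table. This is only a sketch: `compare_doses()` and `dose_results` are hypothetical names introduced here, and the function simply wraps the same `t.test()` call used above.

```r
data(ToothGrowth)

# Hypothetical helper: one-sided Welch t-test that the higher dose
# gives a greater mean tooth length than the lower dose
compare_doses <- function(high, low) {
  t.test(ToothGrowth$len[ToothGrowth$dose == high],
         ToothGrowth$len[ToothGrowth$dose == low],
         paired = FALSE, alternative = "greater")
}

res_low  <- compare_doses(1, 0.5)
res_high <- compare_doses(2, 1)

# Pull the key numbers out of the returned "htest" objects
dose_results <- data.frame(
  comparison = c("1 vs 0.5", "2 vs 1"),
  p_value    = c(res_low$p.value, res_high$p.value),
  ci_lower   = c(res_low$conf.int[1], res_high$conf.int[1])
)
print(dose_results)
```

The `p_value` and `ci_lower` columns should agree with the p-values and lower confidence bounds printed in the two test outputs above.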

The t-test by supp:

t.test(len ~ supp, data = ToothGrowth)  # unpaired Welch test is the default

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

Note: The p-value for OJ vs. VC is 0.06, which is greater than 0.05, so I fail to reject the null hypothesis that the two supplements give the same mean tooth length. But…

The t-test by supp for each dose:

t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == 0.5, ])  # unpaired by default

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98

t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == 1, ])  # unpaired by default

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77

t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == 2, ])  # unpaired by default

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14

Note:

At a dose of 0.5, the p-value for OJ vs. VC is 0.006, which is less than 0.05, so I reject the null hypothesis of equal means.

At a dose of 1.0, the p-value for OJ vs. VC is 0.001, which is less than 0.05, so I reject the null hypothesis of equal means.

At a dose of 2.0, the p-value for OJ vs. VC is 0.964, which is greater than 0.05, so I fail to reject the null hypothesis of equal means.
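The three per-dose comparisons can be collected in one pass. The sketch below loops over the dose levels with `sapply()` and keeps only the p-value of each Welch test; `doses` and `p_by_dose` are names introduced here for illustration. It relies on the formula interface of `t.test()` being unpaired by default.

```r
data(ToothGrowth)

# Dose levels present in the data: 0.5, 1 and 2
doses <- sort(unique(ToothGrowth$dose))

# Welch t-test of len by supp within each dose level; keep only the p-value
p_by_dose <- sapply(doses, function(d) {
  t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])$p.value
})
names(p_by_dose) <- doses

print(round(p_by_dose, 4))
```

The resulting vector should reproduce the p-values reported in the three outputs above, which makes the pattern across doses easier to see at a glance.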

4. State your conclusions and the assumptions needed for your conclusions.

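Increasing the dose increases mean tooth length.
Orange juice (OJ) gives a greater mean tooth length than VC at doses of 0.5 and 1.0; at a dose of 2.0 there is no significant difference between the two supplements.
These conclusions assume that the groups are independent (the tests are unpaired), that tooth length is approximately normally distributed within each group, and that the group variances may differ (Welch's t-test does not require equal variances).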

Statistical Inference Course Project 2