Instructions

The project consists of two parts:

1. A simulation exercise.
2. Basic inferential data analysis.

I will create a report to answer the questions. Given the nature of the series, I will use knitr to create the reports and convert them to PDF.

Each PDF report should be no more than 3 pages, with up to 3 pages of supporting appendix material if needed (code, figures, etc.).

Part 2

In the second part, I will analyze the tooth growth data in the R datasets package.

I will perform the following procedures:

1. Load the ToothGrowth data and perform some basic exploratory data analyses.

2. Provide a basic summary of the data.

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)

4. State your conclusions and the assumptions needed for your conclusions.

Synopsis

The dataset records the effect of Vitamin C on tooth growth in guinea pigs. The outcome is tooth length, measured in groups of 10 guinea pigs, each group receiving one of three dose levels of Vitamin C (0.5, 1.0 and 2.0 mg) by one of two delivery methods (orange juice or ascorbic acid).

The data frame has 60 observations and 3 variables:

[,1] len is the tooth length (numeric).

[,2] supp is the supplement type, “VC” for ascorbic acid and “OJ” for orange juice.

[,3] dose is the dose in milligrams (numeric).

1. Load the ToothGrowth data and perform some basic exploratory data analyses.

library(datasets)
library(ggplot2)
data(ToothGrowth)
dim(ToothGrowth)
## [1] 60  3
head(ToothGrowth, 10)
##     len supp dose
## 1   4.2   VC  0.5
## 2  11.5   VC  0.5
## 3   7.3   VC  0.5
## 4   5.8   VC  0.5
## 5   6.4   VC  0.5
## 6  10.0   VC  0.5
## 7  11.2   VC  0.5
## 8  11.2   VC  0.5
## 9   5.2   VC  0.5
## 10  7.0   VC  0.5
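
As a quick check on the design described in the synopsis, the observations can be cross-tabulated by delivery method and dose. This is a minimal sketch; `tg` and `group_sizes` are names I introduce here for illustration, and the block works on a fresh copy of the data so it does not depend on the earlier steps.

```r
library(datasets)

# Fresh copy of the data, so this check is independent of earlier steps.
tg <- datasets::ToothGrowth

# Count observations for every supplement/dose combination.
group_sizes <- table(tg$supp, tg$dose)
print(group_sizes)
```

Each of the six cells should hold the same number of animals, which is what makes the later group-by-group comparisons balanced.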

2. Provide a basic summary of the data.

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
c(round(mean(ToothGrowth$len),2), round(sd(ToothGrowth$len),2),round(var(ToothGrowth$len),2))
## [1] 18.81  7.65 58.51
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
summary(ToothGrowth)
##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90
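
The overall summary hides the differences between groups, so a per-group summary may be a useful complement. The sketch below assumes a fresh copy of the data (`tg`, an illustrative name) and uses `tapply` to compute the mean and standard deviation of `len` for each supplement/dose combination.

```r
library(datasets)

# Fresh copy, with dose still numeric.
tg <- datasets::ToothGrowth

# Mean and standard deviation of tooth length per supplement (rows) and dose (columns).
group_means <- tapply(tg$len, list(supp = tg$supp, dose = tg$dose), mean)
group_sds   <- tapply(tg$len, list(supp = tg$supp, dose = tg$dose), sd)

print(round(group_means, 2))
print(round(group_sds, 2))
```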

3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.

Graphical analysis using boxplots

# Boxplots of tooth length by dose, one panel per supplement type
ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = factor(dose))) +
    geom_boxplot(notch = FALSE) +
    facet_grid(. ~ supp) +
    scale_x_discrete("Dosage (mg)") +
    scale_y_continuous("Tooth Length") +
    scale_fill_discrete(name = "Dose (mg)") +
    ggtitle("Effect of Supplement Method and Dosage on Tooth Growth")

95% confidence intervals (CI) for the mean tooth length in each group, computed with the normal quantile qnorm(0.975):

I. 0.5 mg dosage: OJ 13.23 (10.47, 15.99), VC 7.98 (6.28, 9.68)

# Tooth lengths for each delivery method at the 0.5 mg dose
x <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
y <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]
# Mean and 95% CI (normal quantile) for OJ, then for VC
d05 <- c(round(mean(x), 2),
         round(mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(length(x)), 2),
         round(mean(y), 2),
         round(mean(y) + c(-1, 1) * qnorm(0.975) * sd(y) / sqrt(length(y)), 2))
d05

OJ-mean OJ-lower OJ-upper VC-mean VC-lower VC-upper

## [1] 13.23 10.47 15.99  7.98  6.28  9.68
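
With only 10 observations per group, a Student t quantile is a common alternative to the normal quantile. The sketch below is not part of the original calculation: it recomputes the OJ interval at 0.5 mg both ways so the two can be compared, using a fresh copy of the data and illustrative names (`tg`, `ci_normal`, `ci_t`).

```r
library(datasets)

# Fresh copy, with dose still numeric.
tg <- datasets::ToothGrowth
x  <- tg$len[tg$supp == "OJ" & tg$dose == 0.5]

n  <- length(x)
se <- sd(x) / sqrt(n)  # standard error of the mean

# Same centre and standard error, different quantile.
ci_normal <- mean(x) + c(-1, 1) * qnorm(0.975) * se
ci_t      <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(round(rbind(normal = ci_normal, t = ci_t), 2))
```

The t interval is the wider of the two, because the t quantile with 9 degrees of freedom exceeds the normal quantile; it is the same interval that `t.test(x)$conf.int` reports.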

II. 1.0 mg dosage: OJ 22.70 (20.28, 25.12), VC 16.77 (15.21, 18.33)

# Tooth lengths for each delivery method at the 1.0 mg dose
x <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 1]
y <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 1]
# Mean and 95% CI (normal quantile) for OJ, then for VC
d10 <- c(round(mean(x), 2),
         round(mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(length(x)), 2),
         round(mean(y), 2),
         round(mean(y) + c(-1, 1) * qnorm(0.975) * sd(y) / sqrt(length(y)), 2))
d10

OJ-mean OJ-lower OJ-upper VC-mean VC-lower VC-upper

## [1] 22.70 20.28 25.12 16.77 15.21 18.33

III. 2.0 mg dosage: OJ 26.06 (24.41, 27.71), VC 26.14 (23.17, 29.11)

# Tooth lengths for each delivery method at the 2.0 mg dose
x <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 2]
y <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 2]
# Mean and 95% CI (normal quantile) for OJ, then for VC
d20 <- c(round(mean(x), 2),
         round(mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(length(x)), 2),
         round(mean(y), 2),
         round(mean(y) + c(-1, 1) * qnorm(0.975) * sd(y) / sqrt(length(y)), 2))
d20

OJ-mean OJ-lower OJ-upper VC-mean VC-lower VC-upper

## [1] 26.06 24.41 27.71 26.14 23.17 29.11
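
Comparing two separate intervals by eye is only an informal test. A more direct option is a two-sample t-test of OJ against VC at each dose. The sketch below is an addition, not the calculation used above: it assumes independent groups, relies on the Welch (unequal variance) default of `t.test`, and uses illustrative names (`tg`, `compare_supp`, `supp_tests`).

```r
library(datasets)

# Fresh copy, with dose still numeric.
tg <- datasets::ToothGrowth

# Welch two-sample t-test of len by supp (OJ minus VC) at one dose level.
compare_supp <- function(d) {
  sub <- tg[tg$dose == d, ]
  res <- t.test(len ~ supp, data = sub)
  c(dose    = d,
    lower   = res$conf.int[1],
    upper   = res$conf.int[2],
    p.value = res$p.value)
}

# One row per dose: CI for the difference in means and the p-value.
supp_tests <- t(sapply(c(0.5, 1, 2), compare_supp))
print(round(supp_tests, 4))
```

An interval for the difference that excludes zero points to a difference between the delivery methods at that dose; an interval that contains zero does not.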

4. State your conclusions and the assumptions needed for your conclusions.

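Based on the 95% confidence intervals above, and assuming that the six groups are independent random samples and that the normal approximation behind the intervals is reasonable: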

1. For dosages of 0.5 and 1.0 mg, orange juice provides more tooth growth than ascorbic acid;

2. For a dosage of 2.0 mg, the confidence intervals overlap, so there is no evidence of a difference in tooth growth between the two supplement methods;

3. Higher dosages of Vitamin C provide more tooth growth, independent of supplement method.
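
The dose effect can also be checked directly with two-sample t-tests between adjacent dose levels. The sketch below is an addition under stated assumptions: it pools both supplement methods within each dose (20 animals per dose), treats the groups as independent, uses the Welch default of `t.test`, and introduces the illustrative names `tg`, `compare_dose` and `dose_tests`.

```r
library(datasets)

# Fresh copy, with dose still numeric.
tg <- datasets::ToothGrowth

# Welch two-sample t-test of the higher dose against the lower dose,
# pooling both supplement methods.
compare_dose <- function(low, high) {
  len_low  <- tg$len[tg$dose == low]
  len_high <- tg$len[tg$dose == high]
  res <- t.test(len_high, len_low)
  c(low     = low,
    high    = high,
    lower   = res$conf.int[1],
    upper   = res$conf.int[2],
    p.value = res$p.value)
}

dose_tests <- rbind(compare_dose(0.5, 1), compare_dose(1, 2))
print(signif(dose_tests, 4))
```

Because the supplement methods are pooled here, this check speaks to the overall dose effect only; repeating it within each supplement would be needed to examine each method separately.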