Statistical Inference

Statistical Inference: Peer Assessment 2

Jenina Halitsky

September 20, 2014

=========================================================================================================================

Synopsis

In the second portion of the class, we analyze the ToothGrowth data in the R datasets package. The assignment has four parts:

1. Load the ToothGrowth data and perform some basic exploratory data analyses.
2. Provide a basic summary of the data.
3. Use confidence intervals and hypothesis tests to compare tooth growth by supp and dose. (Use the techniques from class even if there are other approaches worth considering.)
4. State your conclusions and the assumptions needed for your conclusions.

Setup Configurations & Libraries

        library(knitr)
        opts_knit$set(progress=FALSE, verbose = TRUE)
        opts_chunk$set(echo=TRUE, message=FALSE, tidy=TRUE, comment=NA,
                       fig.path="figure/", fig.keep="high", fig.width=10, fig.height=6,
                       fig.align="center")

Load needed libraries.

require(plyr)
require(ggplot2)

=============================================================================================================================

Question 1: Load the ToothGrowth data and perform some basic exploratory data analyses.

library(datasets)
boxplot(len ~ supp * dose, data = ToothGrowth, xlab = "Supp Dose", ylab = "Tooth Length", 
    main = "Boxplot of Tooth Growth Data")

[Figure: Boxplot of Tooth Growth Data; x-axis: Supp Dose, y-axis: Tooth Length]

Question 1 Solution

**The boxplot shows that, on average, tooth length increases as the dose increases.**
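The trend visible in the boxplot can also be checked numerically. The following is a minimal sketch, not part of the original analysis, that uses base R's aggregate() to compute the mean tooth length per dose and per supplement/dose combination. The names dose_means and cell_means are illustrative.

```r
library(datasets)

# Mean tooth length at each dose level (pooled over both supplements)
dose_means <- aggregate(len ~ dose, data = ToothGrowth, FUN = mean)
print(dose_means)

# Mean tooth length for each supplement/dose combination
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

If the means in dose_means rise from one dose level to the next, the numbers agree with the reading of the boxplot.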

=============================================================================================================================

Question 2: Provide a basic summary of the data.

library(datasets)
x <- ToothGrowth
summary(x)
      len       supp         dose     
 Min.   : 4.2   OJ:30   Min.   :0.50  
 1st Qu.:13.1   VC:30   1st Qu.:0.50  
 Median :19.2           Median :1.00  
 Mean   :18.8           Mean   :1.17  
 3rd Qu.:25.3           3rd Qu.:2.00  
 Max.   :33.9           Max.   :2.00  

Question 2 Solution

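The ToothGrowth dataset records tooth growth (len) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods: orange juice (OJ) or ascorbic acid (VC). The summary shows 30 observations per delivery method, with tooth lengths ranging from 4.2 to 33.9 and a mean of 18.8.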
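A quick structural check makes the design of the dataset concrete. This sketch is an addition using only base R: it reports the dimensions of the data and counts the observations in each supplement/dose combination. The name design is illustrative.

```r
library(datasets)

# Dimensions: rows are observations; columns are len, supp, dose
dim(ToothGrowth)

# Number of observations in each supplement/dose combination
design <- table(ToothGrowth$supp, ToothGrowth$dose)
print(design)
```

Equal counts in every cell indicate a balanced design, which keeps comparisons across supplements and doses straightforward.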

=============================================================================================================================

Question 3: Use confidence intervals and hypothesis tests to compare tooth growth by supp and dose. (Use the techniques from class even if there are other approaches worth considering.)

alpha <- 0.05  # significance level for a two-sided 95% confidence interval (z values)
z.half.alpha <- qnorm(1 - alpha/2)
c(-z.half.alpha, z.half.alpha)
[1] -1.96  1.96
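The z quantiles above apply when the sampling distribution is treated as normal. The t-tests in this section use Student's t quantiles instead, which are slightly wider for small samples. A minimal sketch comparing the two, assuming 29 degrees of freedom (the value reported by the paired test in this section); the names z_crit and t_crit are illustrative:

```r
alpha <- 0.05

# Two-sided critical value from the standard normal distribution
z_crit <- qnorm(1 - alpha / 2)

# Two-sided critical value from Student's t with 29 degrees of freedom
# (29 is taken from the paired test output in this section)
t_crit <- qt(1 - alpha / 2, df = 29)

round(c(z = z_crit, t = t_crit), 3)
```

The t critical value is the one that determines the width of the confidence intervals reported by t.test().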
t.test(x$len[x$supp == "OJ"], x$len[x$supp == "VC"], paired = TRUE)  # Hypothesis 1 that OJ does not improve growth more than VC

    Paired t-test

data:  x$len[x$supp == "OJ"] and x$len[x$supp == "VC"]
t = 3.303, df = 29, p-value = 0.00255
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.409 5.991
sample estimates:
mean of the differences 
                    3.7 
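The paired test above matches OJ and VC observations by row position. Because each guinea pig in ToothGrowth received only one delivery method, an unpaired comparison is a reasonable alternative. The sketch below is an addition, not the author's method: it runs Welch's two-sample t-test on the same two groups. The names oj, vc, and welch_supp are illustrative.

```r
library(datasets)

# Tooth lengths for each delivery method, treated as independent groups
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Welch two-sample t-test (no pairing, unequal variances allowed)
welch_supp <- t.test(oj, vc, paired = FALSE, var.equal = FALSE)

welch_supp$conf.int  # 95% confidence interval for mean(OJ) - mean(VC)
welch_supp$p.value   # two-sided p-value
```

If this interval includes 0, the data do not show a difference between delivery methods at the 5% level once the pairing assumption is dropped.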
t.test(x$len, x$dose)  # Hypothesis 2 that dosage improves growth

    Welch Two Sample t-test

data:  x$len and x$dose
t = 17.81, df = 59.8, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 15.66 19.63
sample estimates:
mean of x mean of y 
   18.813     1.167 
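The call above compares the mean of len with the mean of dose, so it does not test whether growth differs between dose groups. One way to test the dose effect directly, sketched here as an addition under the assumption that the dose groups are independent with possibly unequal variances, is to compare tooth length between the lowest (0.5) and highest (2) doses. The names low, high, and welch_dose are illustrative.

```r
library(datasets)

# Tooth lengths at the lowest and highest dose levels
low  <- ToothGrowth$len[ToothGrowth$dose == 0.5]
high <- ToothGrowth$len[ToothGrowth$dose == 2]

# Welch two-sample t-test for mean(high) - mean(low)
welch_dose <- t.test(high, low, paired = FALSE, var.equal = FALSE)

welch_dose$conf.int  # 95% confidence interval for the difference in means
welch_dose$p.value   # two-sided p-value
```

An interval that lies entirely above 0 would support the claim that the higher dose is associated with greater tooth growth. The same pattern can be repeated for the other pairs of dose levels.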

=============================================================================================================================

Question 4: State your conclusions and the assumptions needed for your conclusions.

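Hypothesis 1 (paired t-test): the 95% confidence interval for the mean difference between OJ and VC, 1.409 to 5.991, does not contain 0, and the p-value (0.00255) is below 0.05, so the null hypothesis of no difference is rejected. As run, the test indicates that OJ produced more growth than VC on average. This conclusion depends on the pairing assumption: the test matches OJ and VC observations by row, but each guinea pig received only one delivery method, so the result should be treated with caution.

Hypothesis 2 (Welch two-sample t-test): the 95% confidence interval, 15.66 to 19.63, does not contain 0, and the p-value is below 2.2e-16, so the null hypothesis of equal means is rejected. However, this test compares the mean of len (18.813) with the mean of dose (1.167), which are measured on different scales, so on its own it does not show that higher doses increase growth. That conclusion rests mainly on the boxplot, where tooth length rises with dose.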

Question 4 Solution

Based on the boxplot and the tests above, tooth growth is greater at larger doses, and this pattern appears under both delivery methods. These conclusions assume that the measurements are independent and approximately normally distributed within each group, as required by the t-tests used.