Part 2: Basic Inferential Data Analysis Instructions

In the second part of the project, we analyze the ToothGrowth data from the R datasets package.

Part 2.1: Load the ToothGrowth data and perform some basic exploratory data analyses

data("ToothGrowth")
dim(ToothGrowth)

## [1] 60  3

str(ToothGrowth)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
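As a small addition to the exploration above, the following sketch counts how many observations fall into each supplement and dose combination. The names `tg` and `cell_counts` are introduced here for illustration only and are not part of the original analysis.

```r
# Work on a fresh copy of the built-in data set (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# Cross-tabulate supplement type against dose level
cell_counts <- table(supp = tg$supp, dose = tg$dose)
cell_counts
```

A balanced layout (the same count in every cell) makes the group comparisons later in the report easier to interpret.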

Part 2.2: Provide a basic summary of the data

summary(ToothGrowth)

##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

I will split the data by supplement type (supp) and calculate the mean tooth length (len) for each dose.

library(dplyr)
# as_tibble() replaces tbl_df(), which is deprecated in current dplyr
ToothGrowth <- as_tibble(ToothGrowth)
OJ <- filter(ToothGrowth, supp == "OJ")
VC <- filter(ToothGrowth, supp == "VC")

aggregate(len ~ dose, data = OJ, FUN = mean)

aggregate(len ~ dose, data = VC, FUN = mean)
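The same group means can also be obtained in a single call without splitting the data first. This is only a sketch of an alternative in base R; `tg` and `group_means` are names introduced here for illustration.

```r
# Fresh copy of the built-in data set (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# Mean tooth length for every combination of supplement and dose
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
group_means
```

The result has one row per supplement and dose combination, which puts the OJ and VC means side by side for each dose.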

Part 2.3: Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose

Now we will compare tooth growth by supplement using a t-test.

t.test(len ~ supp, data = ToothGrowth)

## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

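Since the p-value (0.06063) is greater than 0.05 and the 95% confidence interval contains zero, we cannot reject the null hypothesis that the two supplement types produce the same mean tooth growth. Based on this test alone, supplement type does not show a statistically significant effect on tooth growth.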
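The test above pools all three dose levels. As a possible refinement that is not part of the original analysis, the sketch below repeats the same Welch t-test within each dose level. The names `tg` and `supp_by_dose` are assumptions made for this example.

```r
# Fresh copy of the built-in data set (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# Split the data by dose and run the OJ vs. VC Welch t-test in each subset,
# keeping only the p-value of each test
supp_by_dose <- sapply(split(tg, tg$dose), function(d) {
  t.test(len ~ supp, data = d)$p.value
})

# One p-value per dose level, named "0.5", "1" and "2"
round(supp_by_dose, 4)
```

Each of these tests uses only 10 observations per group, so the results should be read with that smaller sample size in mind.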

Now we will compare tooth growth by dose, using a t-test for each pair of dose levels.

Analyze dose = 0.5 vs. dose = 1.0

td_subset1 <- subset(ToothGrowth, dose %in% c(0.5, 1.0))
t.test(len ~ dose, data = td_subset1)

## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1 
##            10.605            19.735

Analyze dose = 1.0 vs. dose = 2.0

td_subset2 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))
t.test(len ~ dose, data = td_subset2)

## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2 
##          19.735          26.100

Analyze dose = 0.5 vs. dose = 2.0

td_subset3 <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
t.test(len ~ dose, data = td_subset3)

## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100

The p-value of each test is far below 0.05 (the largest is 1.906e-05), and none of the 95% confidence intervals contains zero.

Based on these results, we reject the null hypothesis of equal means for each pair of doses and conclude that average tooth length increases with increasing dose.
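The three pairwise dose comparisons can also be written as one loop, which makes it easy to adjust the p-values for multiple testing. The sketch below is an illustration under assumptions rather than the original method: the Bonferroni correction and the names `tg`, `dose_pairs`, `p_values` and `p_adjusted` are introduced here.

```r
# Fresh copy of the built-in data set (hypothetical name `tg`)
tg <- datasets::ToothGrowth

# All pairs of distinct dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(sort(unique(tg$dose)), 2, simplify = FALSE)

# Welch t-test of tooth length for each pair of doses, keeping the p-value
p_values <- sapply(dose_pairs, function(pair) {
  pair_data <- subset(tg, dose %in% pair)
  t.test(len ~ dose, data = pair_data)$p.value
})
names(p_values) <- sapply(dose_pairs, paste, collapse = " vs ")

# Bonferroni adjustment for the three comparisons (an assumption of this sketch)
p_adjusted <- p.adjust(p_values, method = "bonferroni")
p_adjusted
```

If the adjusted p-values stay below 0.05, the conclusion about dose holds even after accounting for the three tests being run together.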

Statistical Inference Course Project Part 2

Johnnery Aldana

23/12/2019
