First we turn off warnings, because they clutter the rendered output.

knitr::opts_chunk$set(warning = FALSE)

1. Overview

We load the ToothGrowth dataset, perform some exploratory analysis on it, and then run statistical tests on tooth length by supplement type and by dose.

2. Data

We run statistical inference on the ToothGrowth dataset, so we first need to load it.

library(datasets)
data("ToothGrowth")

2.1 Summary of the Data

names(ToothGrowth)
## [1] "len"  "supp" "dose"

The variables are:

  1. len: tooth length
  2. supp: supplement type (VC or OJ)
  3. dose: dose in milligrams/day

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

The summary shows 60 observations split evenly between the two supplement types (30 OJ, 30 VC), tooth lengths ranging from 4.20 to 33.90, and doses ranging from 0.5 to 2.0.

2.2 Checking the distinct values

unique(ToothGrowth$len)
##  [1]  4.2 11.5  7.3  5.8  6.4 10.0 11.2  5.2  7.0 16.5 15.2 17.3 22.5 13.6 14.5
## [16] 18.8 15.5 23.6 18.5 33.9 25.5 26.4 32.5 26.7 21.5 23.3 29.5 17.6  9.7  8.2
## [31]  9.4 19.7 20.0 25.2 25.8 21.2 27.3 22.4 24.5 24.8 30.9 29.4 23.0
unique(ToothGrowth$supp)
## [1] VC OJ
## Levels: OJ VC
unique(ToothGrowth$dose)
## [1] 0.5 1.0 2.0
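
Since supp has two levels and dose has three, it can help to confirm how the observations are spread across the six combinations. The following is a minimal sketch, not part of the original analysis; the names tg and cell_counts are introduced here for illustration, and the data is copied so the ToothGrowth object used elsewhere is not modified.

```r
# Work on a fresh copy so the ToothGrowth object used elsewhere is untouched
tg <- datasets::ToothGrowth

# Cross-tabulate supplement type against dose level
cell_counts <- table(supp = tg$supp, dose = tg$dose)
print(cell_counts)
```

A balanced table (the same count in every cell) means group comparisons are not distorted by unequal group sizes.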

3. Exploratory Analysis

We visualize the dataset with boxplots to see how tooth length varies with dose and with delivery method.

3.1 Loading the ggplot2 library

library(ggplot2)

3.2 Tooth Length VS Dose

# Treat dose as a categorical variable so each level gets its own box
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
# Boxplots of tooth length by dose, one panel per delivery method
pl1 <- ggplot(data = ToothGrowth, aes(x = dose, y = len))
pl1 <- pl1 + geom_boxplot(aes(fill = dose))
pl1 <- pl1 + xlab("Dose") + ylab("Length of the Tooth") + ggtitle("Tooth Length VS Dose with respect to Delivery Method")
pl1 <- pl1 + facet_grid(~supp)
pl1

3.3 Tooth Length VS Delivery Method

# Boxplots of tooth length by delivery method, one panel per dose level
pl2 <- ggplot(data = ToothGrowth, aes(x = supp, y = len))
pl2 <- pl2 + geom_boxplot(aes(fill = supp))
pl2 <- pl2 + xlab("Delivery Method") + ylab("Tooth Length") + ggtitle("Tooth Length VS Delivery Method with respect to the Dose")
pl2 <- pl2 + facet_grid(~ dose)
pl2
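
The boxplots can be complemented with a numeric summary of the same groups. The sketch below is an illustrative addition rather than part of the original report; tg and group_means are names introduced here, and the data is copied so the factor conversion above is left alone.

```r
# Fresh copy of the data; dose stays numeric here
tg <- datasets::ToothGrowth

# Mean tooth length for every supplement/dose combination
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(group_means)
```

Each row corresponds to one box in the faceted plots, so the table gives a quick way to read off the centers that the plots show graphically.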

4. Analysis

Comparing tooth growth by supplement type using t.test

t.test(ToothGrowth$len ~ ToothGrowth$supp)
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$len by ToothGrowth$supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333

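The p-value is 0.06063, which is greater than the significance level of 0.05, and the 95 percent confidence interval (-0.171 to 7.571) contains zero. We therefore fail to reject the null hypothesis: at the 5% level, the data do not show a statistically significant difference in tooth growth between the two supplements. This is not proof that the supplement has no effect, and the result is close to the threshold.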
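
The comparison against the 0.05 threshold can also be made programmatically by pulling the p-value and confidence interval out of the object that t.test returns. This is a minimal sketch under the same test settings as above; the names tg, res, alpha, reject_null and ci_contains_zero are introduced here for illustration only.

```r
# Fresh copy of the data so the sketch is self-contained
tg <- datasets::ToothGrowth

# Same Welch two-sample t-test as above, written with a data argument
res <- t.test(len ~ supp, data = tg)

alpha <- 0.05                  # significance level used in this report
p_value <- res$p.value         # p-value of the test
ci <- res$conf.int             # 95 percent confidence interval

reject_null <- p_value < alpha
ci_contains_zero <- ci[1] < 0 && ci[2] > 0

cat("p-value:", signif(p_value, 4), "\n")
cat("Reject null at alpha = 0.05:", reject_null, "\n")
cat("Confidence interval contains zero:", ci_contains_zero, "\n")
```

The two checks agree by construction: for a two-sided test, the 95 percent interval contains zero exactly when the p-value exceeds 0.05.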

Comparing tooth growth between pairs of dose levels

# t-test of tooth length between doses 0.5 and 1.0
tooth_subs <- subset(ToothGrowth, dose %in% c(0.5, 1.0))
t.test(tooth_subs$len ~ tooth_subs$dose)
## 
##  Welch Two Sample t-test
## 
## data:  tooth_subs$len by tooth_subs$dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1 
##            10.605            19.735
# t-test of tooth length between doses 0.5 and 2.0
tooth_subs <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
t.test(tooth_subs$len ~ tooth_subs$dose)
## 
##  Welch Two Sample t-test
## 
## data:  tooth_subs$len by tooth_subs$dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100
# t-test of tooth length between doses 1.0 and 2.0
tooth_subs <- subset(ToothGrowth, dose %in% c(1.0, 2.0))
t.test(tooth_subs$len ~ tooth_subs$dose)
## 
##  Welch Two Sample t-test
## 
## data:  tooth_subs$len by tooth_subs$dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2 
##          19.735          26.100

The p-values for all three tests are far below 0.05, and none of the 95 percent confidence intervals contains zero, so we reject the null hypothesis in each case. Mean tooth length rises with dose (10.605 at 0.5, 19.735 at 1.0, and 26.100 at 2.0), so higher doses are associated with greater tooth growth.
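
Because three tests are run on the same data, one optional safeguard is to adjust the p-values for multiple comparisons. The sketch below is an addition to the original analysis, not something the report itself does: it repeats the three Welch tests in a loop and applies a Bonferroni correction with p.adjust. The names tg, dose_pairs, p_raw and p_adj are introduced here for illustration.

```r
# Fresh copy of the data; dose is numeric here
tg <- datasets::ToothGrowth

# All pairs of dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(sort(unique(tg$dose)), 2, simplify = FALSE)

# Welch t-test of tooth length for each pair of doses
p_raw <- sapply(dose_pairs, function(pair) {
  sub <- tg[tg$dose %in% pair, ]
  t.test(len ~ dose, data = sub)$p.value
})
names(p_raw) <- sapply(dose_pairs, paste, collapse = " vs ")

# Bonferroni correction: multiply each p-value by the number of tests (capped at 1)
p_adj <- p.adjust(p_raw, method = "bonferroni")

print(cbind(raw = p_raw, bonferroni = p_adj))
```

With p-values this small, the adjusted values remain well below 0.05, so the correction does not change the conclusion about dose.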

5. Conclusion

Given the following assumptions:

  1. The sample is representative of the population
  2. The sample means are approximately normally distributed, as the Central Limit Theorem suggests

Reviewing the t-test analysis above, we found no statistically significant overall effect of supplement delivery method on tooth length at the 5% level (p = 0.06063), whereas increased doses are associated with significantly greater tooth length.
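
The normality assumption can be probed informally. The following sketch, which is an illustrative addition and not part of the original report, runs a Shapiro-Wilk test on tooth length within each dose group; tg and shapiro_p are names introduced here. A large p-value does not prove normality, it only means the test found no strong evidence against it.

```r
# Fresh copy of the data; dose is numeric here
tg <- datasets::ToothGrowth

# Shapiro-Wilk normality test on tooth length within each dose group
shapiro_p <- sapply(split(tg$len, tg$dose), function(x) shapiro.test(x)$p.value)

print(round(shapiro_p, 3))
```

With only 20 observations per dose group, this check has limited power, so it is best read alongside the boxplots rather than on its own.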