Introduction

Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package. The tasks are:

1. Load the ToothGrowth data and perform some basic exploratory data analyses.
2. Provide a basic summary of the data.
3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)
4. State your conclusions and the assumptions needed for your conclusions.

Load and analyse data

library(datasets)
data(ToothGrowth)
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

Provide a basic summary of the data.

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
c(round(mean(ToothGrowth$len),3) , round(sd(ToothGrowth$len),3),round(var(ToothGrowth$len),3))
## [1] 18.813  7.649 58.512
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
summary(ToothGrowth)
##       len        supp     dose   
##  Min.   : 4.20   OJ:30   0.5:20  
##  1st Qu.:13.07   VC:30   1  :20  
##  Median :19.25           2  :20  
##  Mean   :18.81                   
##  3rd Qu.:25.27                   
##  Max.   :33.90
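
The overall summary does not show how tooth length varies across the supplement and dose combinations. As a minimal sketch, the per-cell means and standard deviations could be tabulated with `aggregate`. The names `tg`, `cell_summary`, `mean_len` and `sd_len` are illustrative and do not appear in the original analysis.

```r
library(datasets)

# Work on a copy so the original data frame is left untouched.
tg <- ToothGrowth

# Mean and standard deviation of tooth length for each supplement/dose cell.
cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
cell_sds   <- aggregate(len ~ supp + dose, data = tg, FUN = sd)

# Combine into one table (both aggregate() calls return rows in the same order).
cell_summary <- data.frame(
  supp     = cell_means$supp,
  dose     = cell_means$dose,
  mean_len = round(cell_means$len, 2),
  sd_len   = round(cell_sds$len, 2)
)
print(cell_summary)
```

Each of the six cells holds ten observations, so the cell means can be compared directly.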

Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)

Graphical analysis of data:

library(ggplot2)
library(knitr)
ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = factor(dose))) +
    geom_boxplot(notch = FALSE) +
    facet_grid(. ~ supp) +
    scale_x_discrete("Dosage (mg)") +
    scale_y_continuous("Tooth Length") +
    scale_fill_discrete(name = "Dose (mg)") +
    ggtitle("Effect of Supplement Type and Dosage on Tooth Growth")

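The 95% confidence interval for the difference in mean tooth length between the two supplements (one-sided Welch t-test, OJ greater than VC) is: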

OJ <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
VC <- ToothGrowth$len[ToothGrowth$supp == "VC"]

t.test(OJ, VC, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  OJ and VC
## t = 1.9153, df = 55.309, p-value = 0.03032
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.4682687       Inf
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333
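
The test above is one-sided, so it reports only a lower confidence bound. A two-sided version may be useful for comparison. The sketch below reruns the same Welch test with `alternative = "two.sided"`; the lowercase names `oj`, `vc`, `one_sided` and `two_sided` are illustrative.

```r
library(datasets)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# One-sided test, as in the report: H1 is mean(OJ) > mean(VC).
one_sided <- t.test(oj, vc, alternative = "greater",
                    paired = FALSE, var.equal = FALSE)

# Two-sided test: H1 is mean(OJ) != mean(VC); gives a finite two-sided interval.
two_sided <- t.test(oj, vc, alternative = "two.sided",
                    paired = FALSE, var.equal = FALSE)

# Same t statistic in both tests; because the observed t is positive,
# the two-sided p-value is twice the one-sided p-value.
c(one_sided = one_sided$p.value, two_sided = two_sided$p.value)
two_sided$conf.int
```

If the two-sided interval contains 0, a difference in either direction is not established at the 5% level, even when the one-sided test rejects.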

State your conclusions and the assumptions needed for your conclusions.

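My conclusions, based on a 5% significance level (95% confidence), are:

1. The evidence for a relationship between the supplement and tooth length is weak. The one-sided Welch test above gives p = 0.030 with a lower confidence bound of 0.47, which favours OJ over VC. The corresponding two-sided p-value is twice that (about 0.061), so a test of "no difference" in either direction is not rejected at the 5% level. On that basis either supplement could be used. The exploratory analysis suggests that one supplement does better at small doses; this is not further investigated.
2. There is a relationship between the dose and tooth length. The p-values are too small, so the null hypotheses (the difference between doses is 0) have to be rejected.

These conclusions assume that the observations are independent, that the groups being compared are unpaired and may have unequal variances (paired = FALSE and var.equal = FALSE in the t.test call), and that tooth length is approximately normally distributed within each group.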
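
The dose comparisons behind conclusion 2 are not shown in the output above. As a minimal sketch, assuming two-sided Welch t-tests between each pair of dose levels (pooled over both supplements), they might be run as follows. `compare_doses` is a hypothetical helper written for this illustration and is not part of the original analysis.

```r
library(datasets)

# Compare tooth length between two dose levels with a two-sided Welch t-test.
# `compare_doses` is a hypothetical helper, not part of the original report.
compare_doses <- function(low, high, data = ToothGrowth) {
  # as.character() makes this work whether dose is numeric or a factor.
  d <- as.numeric(as.character(data$dose))
  t.test(data$len[d == high], data$len[d == low],
         paired = FALSE, var.equal = FALSE)
}

# The three pairwise comparisons among the dose levels 0.5, 1 and 2.
dose_pairs <- list(c(0.5, 1), c(0.5, 2), c(1, 2))
dose_p <- sapply(dose_pairs, function(p) compare_doses(p[1], p[2])$p.value)
names(dose_p) <- sapply(dose_pairs, function(p) paste(p[1], "vs", p[2]))
dose_p
```

A p-value below 0.05 for a pair means that the null hypothesis of equal mean tooth length at those two doses is rejected at the 5% level.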