Now in the second portion of the class, we're going to analyze the ToothGrowth data in the R datasets package.
1. Load the ToothGrowth data and perform some basic exploratory data analyses.
2. Provide a basic summary of the data.
3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)
4. State your conclusions and the assumptions needed for your conclusions.
library(datasets)
data(ToothGrowth)
head(ToothGrowth)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
# Overall mean, standard deviation, and variance of tooth length
round(c(mean(ToothGrowth$len), sd(ToothGrowth$len), var(ToothGrowth$len)), 3)
## [1] 18.813 7.649 58.512
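The overall figures above pool all six supplement and dose combinations. A minimal sketch of a per-group summary, using base R's `aggregate()`, is given below. The names `tg`, `group_means`, `group_sds`, and `group_summary` are illustrative and not part of the original analysis; `tg` is a fresh copy of the data, so the sketch does not modify `ToothGrowth`.

```r
# Work on a fresh copy so the original ToothGrowth object is left untouched.
tg <- datasets::ToothGrowth

# Mean and standard deviation of tooth length for each supp x dose group.
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
group_sds   <- aggregate(len ~ supp + dose, data = tg, FUN = sd)

# Both aggregates list the groups in the same order, so they can be combined.
group_summary <- data.frame(
  group_means[, c("supp", "dose")],
  mean_len = round(group_means$len, 2),
  sd_len   = round(group_sds$len, 2)
)
group_summary
```

Each row of `group_summary` describes one of the six groups of 10 observations, which makes it easier to see where the two supplements differ.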
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 0.5:20
## 1st Qu.:13.07 VC:30 1 :20
## Median :19.25 2 :20
## Mean :18.81
## 3rd Qu.:25.27
## Max. :33.90
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.5.3
library(knitr)
## Warning: package 'knitr' was built under R version 3.5.3
ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = factor(dose))) +
  geom_boxplot(notch = FALSE) +
  facet_grid(. ~ supp) +
  scale_x_discrete("Dosage (mg)") +
  scale_y_continuous("Tooth Length") +
  scale_fill_discrete(name = "Dose (mg)") +
  ggtitle("Effect of Supplement Type and Dosage on Tooth Growth")
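# Tooth lengths for each supplement, pooled over all doses
OJ <- ToothGrowth$len[ToothGrowth$supp == 'OJ']
VC <- ToothGrowth$len[ToothGrowth$supp == 'VC']
# One-sided Welch t-test: H0 is mean(OJ) - mean(VC) = 0, H1 is mean(OJ) - mean(VC) > 0
t.test(OJ, VC, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)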
##
## Welch Two Sample t-test
##
## data: OJ and VC
## t = 1.9153, df = 55.309, p-value = 0.03032
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.4682687 Inf
## sample estimates:
## mean of x mean of y
## 20.66333 16.96333
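The comparison by dose can be run in the same way as the supplement test. The following is a minimal sketch, assuming two-sided Welch t-tests on each pair of dose levels with no adjustment for running three tests. The names `tg`, `dose_pairs`, and `dose_p` are illustrative and do not come from the original analysis.

```r
# Work on a fresh copy so the original ToothGrowth object is left untouched.
tg <- datasets::ToothGrowth

# Pairs of dose levels to compare.
dose_pairs <- list(c(0.5, 1), c(1, 2), c(0.5, 2))

# Unpaired Welch t-test for each pair; H0 is that the two dose means are equal.
dose_p <- sapply(dose_pairs, function(pair) {
  low  <- tg$len[tg$dose == pair[1]]
  high <- tg$len[tg$dose == pair[2]]
  t.test(high, low, paired = FALSE, var.equal = FALSE)$p.value
})
names(dose_p) <- sapply(dose_pairs, paste, collapse = " vs ")
dose_p
```

A p-value below 0.05 for a pair would lead to rejecting the null hypothesis of equal mean tooth length at those two doses.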
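My conclusions, based on a 5% significance level (95% confidence), are:
1. The one-sided Welch t-test above gives p = 0.030, with a 95% lower confidence bound of 0.47 for the difference in means. Pooled over all doses, there is therefore evidence that OJ is associated with longer teeth than VC. The evidence is borderline: the same test run two-sided has p ≈ 0.061 and would not reject the null hypothesis of no difference. The basic summary also suggests that OJ does better at the smaller doses. This is not further investigated.
2. There is a relationship between the dose and the length of the tooth. The p-values are too small, so the null hypotheses (difference between doses is 0) have to be rejected.
These conclusions assume that the measurements in the compared groups are independent (the tests are unpaired), that tooth length is approximately normally distributed within each group, and that the group variances need not be equal (Welch's test).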
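The supplement effect at individual doses, which the conclusions leave uninvestigated, could be checked with the same kind of test. The sketch below is one possible follow-up, not part of the original analysis: it runs a two-sided Welch t-test on the pooled data and then one per dose level. The names `tg`, `oj`, `vc`, `two_sided`, and `supp_by_dose` are illustrative.

```r
# Work on a fresh copy so the original ToothGrowth object is left untouched.
tg <- datasets::ToothGrowth

# Two-sided version of the pooled supplement comparison.
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]
two_sided <- t.test(oj, vc, alternative = "two.sided",
                    paired = FALSE, var.equal = FALSE)
two_sided$p.value
two_sided$conf.int

# OJ versus VC within each dose level (unpaired Welch t-tests).
supp_by_dose <- sapply(split(tg, tg$dose), function(d) {
  t.test(d$len[d$supp == "OJ"], d$len[d$supp == "VC"],
         paired = FALSE, var.equal = FALSE)$p.value
})
round(supp_by_dose, 4)
```

A p-value below 0.05 at a given dose would point to a supplement difference at that dose. Each per-dose test uses only 10 observations per group, so the normality assumption matters more here than in the pooled test.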