This report analyzes the ToothGrowth data from the R datasets package and carries out a basic inferential data analysis.
The code below loads the data.
The names command lists the variables, and the head command shows the first few rows.
The number of observations in ToothGrowth is obtained with the nrow command.
data(ToothGrowth) #load the data
names(ToothGrowth)
## [1] "len" "supp" "dose"
head(ToothGrowth)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
nrow(ToothGrowth)
## [1] 60
Below is a summary of each variable, followed by the structure of the data frame.
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
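A cross-tabulation of supplement type by dose is a quick way to see how the 60 observations are spread over the six supplement/dose groups, and whether the design is balanced. The following is a minimal sketch using base R's table(); the name counts is introduced here for illustration only.

```r
data(ToothGrowth)

# Count the observations in each supplement/dose combination
# ('counts' is an illustrative name, not part of the original analysis)
counts <- table(ToothGrowth$supp, ToothGrowth$dose)
counts
```

Rows correspond to the supplement types and columns to the dose levels.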
The plot below shows that tooth length tends to increase with dose for both supplement types.
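library(ggplot2)
# Tooth length by supplement type, one panel per dose (qplot is deprecated in recent ggplot2)
ggplot(ToothGrowth, aes(x = supp, y = len)) + geom_point() + facet_grid(. ~ dose)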
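The pattern in the plot can also be read from the group means. The sketch below, which assumes nothing beyond base R's aggregate(), computes the mean tooth length for every supplement/dose combination; group_means is an illustrative name.

```r
data(ToothGrowth)

# Mean tooth length for each supplement/dose combination
# ('group_means' is an illustrative name)
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
group_means
```

Comparing rows within one supplement type shows how the mean length changes with dose; comparing rows within one dose shows the difference between OJ and VC.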
Now we run hypothesis tests on the ToothGrowth data. The first test compares mean tooth length between the two supplement types (OJ and VC) across all doses.
t.test(len ~ supp, data = ToothGrowth)  # unpaired Welch test; paired = FALSE is the default
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
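To make the numbers in this output less opaque, the sketch below recomputes the Welch statistic, its Welch-Satterthwaite degrees of freedom, the two-sided p-value and the 95% confidence interval by hand. It is an illustration of the standard formulas, not the internal code of t.test; all variable names are introduced here.

```r
data(ToothGrowth)
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)
se <- sqrt(v_oj + v_vc)

# Welch t statistic for the difference in means (OJ minus VC)
diff_means <- mean(oj) - mean(vc)
t_stat <- diff_means / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- diff_means + c(-1, 1) * qt(0.975, df_welch) * se

c(t = t_stat, df = df_welch, p = p_value)
ci
```

These values should agree with the t, df, p-value and confidence interval reported by t.test above. Because the interval contains zero, the test does not reject equal means at the 5% level.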
The code below splits the data into three subsets, one for each dose (0.5, 1.0 and 2.0). The same test of len by supp is then run on each subset.
a <- subset(ToothGrowth, dose == 0.5)
b <- subset(ToothGrowth, dose == 1)
c <- subset(ToothGrowth, dose == 2)
t.test(len ~ supp, data = a)  # dose 0.5, unpaired Welch test
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
## 13.23 7.98
t.test(len ~ supp, data = b)  # dose 1.0, unpaired Welch test
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC
## 22.70 16.77
t.test(len ~ supp, data = c)  # dose 2.0, unpaired Welch test
##
## Welch Two Sample t-test
##
## data: len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OJ mean in group VC
## 26.06 26.14
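The three per-dose tests can also be run in a loop and collected into one small table, which makes the p-values and intervals easier to compare side by side. This is a sketch, assuming only base R; the names doses and results are illustrative.

```r
data(ToothGrowth)
doses <- sort(unique(ToothGrowth$dose))

# Run the Welch test of len by supp within each dose and keep the key numbers
# ('results' is an illustrative name)
results <- do.call(rbind, lapply(doses, function(d) {
  tt <- t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])
  data.frame(dose = d,
             p_value = tt$p.value,
             ci_lower = tt$conf.int[1],
             ci_upper = tt$conf.int[2])
}))
results
```

Each row should match the corresponding t.test output above. Since three tests are run, a multiple-comparison adjustment may be worth considering before reading the p-values too literally.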
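The plot suggests that tooth length increases with dose, although the tests above compare supplement types rather than doses. Across all doses, the difference between OJ and VC is not significant at the 5% level: p = 0.061, and the 95% confidence interval (-0.171, 7.571) contains zero. Within doses, OJ is associated with significantly longer teeth at 0.5 (p = 0.006) and 1.0 (p = 0.001), while at 2.0 there is no evidence of a difference (p = 0.964). All tests are unpaired Welch two-sample t-tests (paired = FALSE), which do not assume equal variances.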
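A statement about dose needs a test in which dose is the grouping variable. One possible check, sketched below, compares the lowest and highest doses with the same Welch test; the choice of these two levels and the names low_high and dose_test are assumptions made for illustration.

```r
data(ToothGrowth)

# Keep only the lowest and highest dose so the grouping variable has two levels
low_high <- subset(ToothGrowth, dose %in% c(0.5, 2))

# Welch test of tooth length by dose (0.5 versus 2.0)
dose_test <- t.test(len ~ dose, data = low_high)
dose_test$p.value
dose_test$conf.int
```

A confidence interval that lies entirely below zero would indicate a shorter mean length at dose 0.5 than at dose 2.0.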