We’re going to analyze the ToothGrowth data in the R datasets package.
Load the data:
tg <- ToothGrowth
str(tg)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
The output shows that ‘supp’ has only two levels, OJ and VC. Let's check the distinct values of ‘dose’:
unique(tg$dose)
## [1] 0.5 1.0 2.0
There are only three dose levels (0.5, 1.0 and 2.0), so the analysis compares tooth length across these three doses and the two supplement types.
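Before comparing groups, it can help to confirm how many observations fall into each combination of supplement and dose. The snippet below is a minimal sketch using base R's `table()`; the name `counts` is introduced here for illustration and is not part of the original analysis.

```r
# Cross-tabulate supplement type against dose to see the group sizes.
tg <- ToothGrowth
counts <- table(tg$supp, tg$dose)
print(counts)
```

If every cell holds the same count, the design is balanced and no group dominates the comparisons.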
library(ggplot2)
g <- ggplot(tg, aes(x = dose, y = len)) +
  geom_point(aes(color = supp))
print(g)
g <- ggplot(tg, aes(x = factor(dose), y = len)) +
  geom_boxplot(aes(fill = factor(dose)))
print(g)
g <- ggplot(tg, aes(x = factor(supp), y = len)) +
  geom_boxplot(aes(fill = factor(supp)))
print(g)
g <- ggplot(tg, aes(x = supp, y = len)) +
  geom_boxplot(aes(fill = supp)) +
  facet_wrap(~ dose)
print(g)
head(tg)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
summary(tg)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
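The overall summary does not show how tooth length varies within each group. As a rough sketch, the mean length for every supplement and dose combination can be computed with base R's `aggregate()`; the name `group_means` is an illustrative choice, not something from the original report.

```r
# Mean tooth length for each supplement/dose combination.
tg <- ToothGrowth
group_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(group_means)
```

This gives a numeric counterpart to the faceted boxplot and makes it easier to see which groups the later tests are contrasting.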
Lower <- subset(tg, dose %in% c(0.5, 1.0))
Middle <- subset(tg, dose %in% c(0.5, 2.0))
Upper <- subset(tg, dose %in% c(1.0, 2.0))
Now we run Welch two-sample t-tests comparing tooth length between each pair of doses:
t.test(len ~ dose, var.equal = FALSE, data = Lower)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5 mean in group 1
## 10.605 19.735
t.test(len ~ dose, var.equal = FALSE, data = Middle)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.15617 -12.83383
## sample estimates:
## mean in group 0.5 mean in group 2
## 10.605 26.100
t.test(len ~ dose, var.equal = FALSE, data = Upper)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
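The three dose comparisons above are run as separate tests, so one optional refinement is to adjust their p-values for multiple comparisons. The sketch below uses `pairwise.t.test()` from base R's stats package with a Bonferroni correction. The choice of Bonferroni is an assumption made here for illustration and is not part of the original analysis. Setting `pool.sd = FALSE` keeps a separate variance estimate for each pair, in line with the Welch tests used above.

```r
# All pairwise dose comparisons in one call, with Bonferroni-adjusted p-values.
# pool.sd = FALSE avoids pooling the variance across the three dose groups.
tg <- ToothGrowth
pw <- pairwise.t.test(tg$len, tg$dose,
                      p.adjust.method = "bonferroni",
                      pool.sd = FALSE)
print(pw)
```

The adjusted p-values are stored as a lower-triangular matrix in `pw$p.value`, with one entry per pair of doses.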
Next, a Welch two-sample t-test comparing tooth length between the two supplement types:
t.test(len ~ supp, var.equal = FALSE, data = tg)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
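The test above pools all doses: its p-value of 0.06063 is above 0.05 and the confidence interval includes zero. Since the boxplots were also faceted by dose, a natural follow-up is to repeat the supplement comparison within each dose level. The following is a minimal sketch using `split()` and `lapply()`; the names `by_dose` and `p_values` are illustrative and do not appear in the original analysis.

```r
# Welch t-test of len by supp, run separately for each dose level.
tg <- ToothGrowth
by_dose <- lapply(split(tg, tg$dose), function(d) t.test(len ~ supp, data = d))

# Collect the p-value from each test, named by dose.
p_values <- sapply(by_dose, function(res) res$p.value)
print(p_values)
```

Each element of `by_dose` is a full `t.test` result, so the confidence interval for a given dose can be inspected with, for example, `by_dose[["0.5"]]$conf.int`.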