Use one-way ANOVA to analysis the “EspressoData”. Three brew methods with a measure of the crème on the top. Find the method producing the most crème.
setwd("E:\\mikhilesh\\HU Sem VI ANLY 510 and 506\\ANLY 510 Kao Principals and Applications\\assignment and project")
data1 <- read.csv("Lecture 2 EspressoData.csv")
names(data1)
## [1] "cereme" "brewmethod"
str(data1)
## 'data.frame': 27 obs. of 2 variables:
## $ cereme : num 36.6 39.6 37.7 36 38.5 ...
## $ brewmethod: int 1 1 1 1 1 1 1 1 1 2 ...
summary(data1)
## cereme brewmethod
## Min. :21.02 Min. :1
## 1st Qu.:35.66 1st Qu.:1
## Median :38.52 Median :2
## Mean :44.47 Mean :2
## 3rd Qu.:55.23 3rd Qu.:3
## Max. :73.19 Max. :3
Assumptions
library(moments)
plot(density(data1$cereme))
qqnorm(data1$cereme)
agostino.test(data1$cereme) #skewness test
##
## D'Agostino skewness test
##
## data: data1$cereme
## skew = 0.54679, z = 1.32787, p-value = 0.1842
## alternative hypothesis: data have a skewness
shapiro.test(data1$cereme) #normality test
##
## Shapiro-Wilk normality test
##
## data: data1$cereme
## W = 0.92201, p-value = 0.04414
#Residual plot (lm - linear model)
cereme.lm <- lm(cereme ~ brewmethod, data = data1)
cereme.lm
##
## Call:
## lm(formula = cereme ~ brewmethod, data = data1)
##
## Coefficients:
## (Intercept) brewmethod
## 37.17 3.65
cereme.res <- resid(cereme.lm)
cereme.res
## 1 2 3 4 5 6
## -4.1766667 -1.1666667 -3.0766667 -4.8566667 -2.2966667 -19.7966667
## 7 8 9 10 11 12
## -16.0066667 -6.6366667 -17.7366667 26.3733333 2.2133333 28.7233333
## 13 14 15 16 17 18
## 13.3133333 4.1433333 28.3033333 20.5733333 18.0633333 9.7933333
## 19 20 21 22 23 24
## 8.0733333 -11.4466667 -12.7666667 -8.0066667 -14.5966667 -10.9966667
## 25 26 27
## -10.7866667 -15.4366667 0.2133333
plot(data1$cereme, cereme.res, ylab = "Residual", xlab = "Brew Method", main = "Qauntity of the crème")
abline(0, 0)
bartlett.test(data1$cereme, data1$brewmethod)
##
## Bartlett test of homogeneity of variances
##
## data: data1$cereme and data1$brewmethod
## Bartlett's K-squared = 0.96331, df = 2, p-value = 0.6178
# Variance are equal. p-value = 0.6178 failed to reject the null hypothesis
#Checking variance among groups
tapply(data1$cereme, data1$brewmethod, var)
## 1 2 3
## 53.29088 102.02220 59.30182
#ratio of largest variance to smallest variance 102/53 = 1.9 - is way less than 3 (signifies that there is no issue with failing to reject null hypothesis)
#After checking all the assumptions fulfill our criteria, now we will perform ANOVA
#With only one continuous “outcome” and more than two groups of categorical "input" variables
#IV - 3 brew methods (predictor)
#DV - quantity of crème on the top (outcome)
#Perform the linear model:
#OPTION 1 - directly summary the model
summary(aov(cereme ~ factor(brewmethod), data = data1)) #(Don't forget to factor()ize your predictor - categorical data)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(brewmethod) 2 4065 2032.6 28.41 4.7e-07 ***
## Residuals 24 1717 71.5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#OR OPTION 2 - We can also create a modeland then summary that model seperately
model <- aov(aov(cereme ~ factor(brewmethod), data = data1))
model
## Call:
## aov(formula = aov(cereme ~ factor(brewmethod), data = data1))
##
## Terms:
## factor(brewmethod) Residuals
## Sum of Squares 4065.180 1716.919
## Deg. of Freedom 2 24
##
## Residual standard error: 8.458032
## Estimated effects may be unbalanced
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(brewmethod) 2 4065 2032.6 28.41 4.7e-07 ***
## Residuals 24 1717 71.5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#To find out what group differs from the others
library(pgirmess)
## Warning: package 'pgirmess' was built under R version 3.6.3
#Type 1 - Pairwise comparisons using t tests - Bonferroni Method
pairwise.t.test(data1$cereme, data1$brewmethod, paired = FALSE, p.adjust.method = "bonferroni") #method can be "none", "bonferroni", "holm", "hochberg", "hommel", "BH", or "BY“
##
## Pairwise comparisons using t tests with pooled SD
##
## data: data1$cereme and data1$brewmethod
##
## 1 2
## 2 5.2e-07 -
## 3 0.24 4.4e-05
##
## P value adjustment method: bonferroni
#Bonferroni Method show -> group 1 significantly differs from group 2, and group 2 and group 3 significantly differs from each others, but group 1 and 3 are not significantly different.
#Type 2 - Kurskal-Wallis:
kruskalmc(cereme ~ factor(brewmethod), data = data1)
## Multiple comparison test after Kruskal-Wallis
## p.value: 0.05
## Comparisons
## obs.dif critical.dif difference
## 1-2 14.666667 8.957452 TRUE
## 1-3 3.666667 8.957452 FALSE
## 2-3 11.000000 8.957452 TRUE
#Kurskal-Wallis Method show -> group 1 significantly differs from group 2, and group 2 and group 3 significantly differs from each others, but group 1 and 3 are not significantly different.
#Type 3 - Tukey’s Test:
TukeyHSD(model)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = aov(cereme ~ factor(brewmethod), data = data1))
##
## $`factor(brewmethod)`
## diff lwr upr p adj
## 2-1 28.9 18.942931 38.85707 0.0000005
## 3-1 7.3 -2.657069 17.25707 0.1811000
## 3-2 -21.6 -31.557069 -11.64293 0.0000419
# Tuey's test show -> group 2-1 and 3-2 are significant, 3-1 are nt significantaly different
library(pastecs)
## Warning: package 'pastecs' was built under R version 3.6.3
library(compute.es)
## Warning: package 'compute.es' was built under R version 3.6.3
#First get the relevant stats for each factor level (only relevant output pasted below):
by(data1$cereme, data1$brewmethod, stat.desc)
## data1$brewmethod: 1
## nbr.val nbr.null nbr.na min max range
## 9.0000000 0.0000000 0.0000000 21.0200000 39.6500000 18.6300000
## sum median mean SE.mean CI.mean.0.95 var
## 291.6000000 35.9600000 32.4000000 2.4333533 5.6113228 53.2908750
## std.dev coef.var
## 7.3000599 0.2253105
## ------------------------------------------------------------
## data1$brewmethod: 2
## nbr.val nbr.null nbr.na min max range
## 9.0000000 0.0000000 0.0000000 46.6800000 73.1900000 26.5100000
## sum median mean SE.mean CI.mean.0.95 var
## 551.7000000 62.5300000 61.3000000 3.3668680 7.7640115 102.0222000
## std.dev coef.var
## 10.1006039 0.1647733
## ------------------------------------------------------------
## data1$brewmethod: 3
## nbr.val nbr.null nbr.na min max range
## 9.000000 0.000000 0.000000 32.680000 56.190000 23.510000
## sum median mean SE.mean CI.mean.0.95 var
## 357.300000 37.120000 39.700000 2.566923 5.919334 59.301825
## std.dev coef.var
## 7.700768 0.193974
#n - sample size, M - mean, SD - standard deviation
#brewmethod 1 (n = 9, M = 32.4 , SD = 7.3)
#brewmethod 2 (n = 9, M = 61.3, SD = 10.1)
#brewmethod 3 (n = 9, M = 39.7, SD = 7.7)
#Use these values to compute a few standard Measures of Effect Size (MES or mes) for any pair of interest
#OPtion 1 will compare brewmethod 1 and 2 ~ cereme vs brewmwthod
mes(61.3, 32.4, 10.1, 7.3, 9, 9)
## Mean Differences ES:
##
## d [ 95 %CI] = 3.28 [ 1.86 , 4.69 ]
## var(d) = 0.52
## p-value(d) = 0
## U3(d) = 99.95 %
## CLES(d) = 98.98 %
## Cliff's Delta = 0.98
##
## g [ 95 %CI] = 3.12 [ 1.78 , 4.47 ]
## var(g) = 0.47
## p-value(g) = 0
## U3(g) = 99.91 %
## CLES(g) = 98.64 %
##
## Correlation ES:
##
## r [ 95 %CI] = 0.87 [ 0.67 , 0.95 ]
## var(r) = 0
## p-value(r) = 0
##
## z [ 95 %CI] = 1.32 [ 0.81 , 1.83 ]
## var(z) = 0.07
## p-value(z) = 0
##
## Odds Ratio ES:
##
## OR [ 95 %CI] = 383.22 [ 29.45 , 4987.18 ]
## p-value(OR) = 0
##
## Log OR [ 95 %CI] = 5.95 [ 3.38 , 8.51 ]
## var(lOR) = 1.71
## p-value(Log OR) = 0
##
## Other:
##
## NNT = 1.26
## Total N = 18
#OPtion 2 will compare brewmethod 2 and 3 ~ cereme vs brewmwthod
mes(39.7, 61.3, 7.7, 10.1, 9, 9)
## Mean Differences ES:
##
## d [ 95 %CI] = -2.41 [ -3.62 , -1.19 ]
## var(d) = 0.38
## p-value(d) = 0
## U3(d) = 0.81 %
## CLES(d) = 4.45 %
## Cliff's Delta = -0.91
##
## g [ 95 %CI] = -2.29 [ -3.45 , -1.14 ]
## var(g) = 0.35
## p-value(g) = 0
## U3(g) = 1.1 %
## CLES(g) = 5.26 %
##
## Correlation ES:
##
## r [ 95 %CI] = -0.79 [ -0.92 , -0.51 ]
## var(r) = 0.01
## p-value(r) = 0
##
## z [ 95 %CI] = -1.06 [ -1.57 , -0.56 ]
## var(z) = 0.07
## p-value(z) = 0
##
## Odds Ratio ES:
##
## OR [ 95 %CI] = 0.01 [ 0 , 0.12 ]
## p-value(OR) = 0
##
## Log OR [ 95 %CI] = -4.36 [ -6.56 , -2.16 ]
## var(lOR) = 1.26
## p-value(Log OR) = 0
##
## Other:
##
## NNT = -5.01
## Total N = 18
#OPtion 3 will compare brewmethod 1 and 3 ~ cereme vs brewmwthod
mes(39.7, 32.4, 7.7, 7.3, 9, 9)
## Mean Differences ES:
##
## d [ 95 %CI] = 0.97 [ 0 , 1.95 ]
## var(d) = 0.25
## p-value(d) = 0.07
## U3(d) = 83.47 %
## CLES(d) = 75.43 %
## Cliff's Delta = 0.51
##
## g [ 95 %CI] = 0.93 [ 0 , 1.86 ]
## var(g) = 0.23
## p-value(g) = 0.07
## U3(g) = 82.29 %
## CLES(g) = 74.38 %
##
## Correlation ES:
##
## r [ 95 %CI] = 0.46 [ -0.01 , 0.76 ]
## var(r) = 0.03
## p-value(r) = 0.07
##
## z [ 95 %CI] = 0.5 [ -0.01 , 1 ]
## var(z) = 0.07
## p-value(z) = 0.07
##
## Odds Ratio ES:
##
## OR [ 95 %CI] = 5.84 [ 0.99 , 34.36 ]
## p-value(OR) = 0.07
##
## Log OR [ 95 %CI] = 1.76 [ -0.01 , 3.54 ]
## var(lOR) = 0.82
## p-value(Log OR) = 0.07
##
## Other:
##
## NNT = 2.84
## Total N = 18
#Effect size (more than or equal to) < = 0.1 is small, 0.25 is medium, 0.4 is large (Cohen, 1988)
#Summary of ANOVA:
Observations from the study were analyzed by conducting a one-way analysis of variance using R version 3.6.1. First, all assumptions are met and there is no adjustment made. Results suggest that the brew method 2 has an influence on measure of the crème on the top of the expresso (F(2, 24) = 28.41, p < .001) Continuing the discussion with specifically which brew method produced the signiificantaly differed measures of the crème on the top of the expresso, a Tukey’s hoc test was established. The result suggested that there is a significant difference between brew method 1 and 2 (p-value < 0.001), [and brew method 2 and 3 (p-value < 0.001)] in terms of the measures of the crème on the top of the expresso. The effect was large, Cohen’ D = 3.28 for 1 ad 2, and Cohen’s D = 2.41 for method 2 and 3].