This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
## package 'kableExtra' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\fiona\AppData\Local\Temp\RtmpwV2X5E\downloaded_packages
Do the drugs help? YES
There is a significant interaction between “Drug1” and “Drug2” (F(4,80) = 2.961). Both “Drug1” alone and “Drug2” alone also show significant effects (F(2,80) = 405.411, P < 0.001) (F(2,80) = 313.645, P < 0.001). All assumptions of ANOVA held, so no adjustment was made.
After performing a post-hoc Tukey test:
Explore Data
ggplot(data, aes(x = Drug1, y = Rating, fill = Drug1)) +
theme_bw() +
theme(legend.position = "none") +
scale_fill_brewer(palette = "Dark2") +
geom_boxplot() +
labs(title = "Does Drug1 impact Ratings")
ggplot(data, aes(x = Drug2, y = Rating, fill = Drug2)) +
theme_bw() +
theme(legend.position = "none") +
scale_fill_brewer(palette = "Dark2") +
geom_boxplot() +
labs(title = "Does Drug2 impact Ratings")
Test Data
bartlett.test(Rating ~ Drug1, data = data)
##
## Bartlett test of homogeneity of variances
##
## data: Rating by Drug1
## Bartlett's K-squared = 1.0587, df = 2, p-value = 0.589
bartlett.test(Rating ~ Drug2, data = data)
##
## Bartlett test of homogeneity of variances
##
## data: Rating by Drug2
## Bartlett's K-squared = 0.14381, df = 2, p-value = 0.9306
shapiro.test(data$Rating)
##
## Shapiro-Wilk normality test
##
## data: data$Rating
## W = 0.9755, p-value = 0.09018
summary(aov(Rating ~ Drug1, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Drug1 2 1550 774.9 48.48 7.97e-15 ***
## Residuals 86 1374 16.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Rating ~ Drug2, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Drug2 2 1199 599.5 29.89 1.4e-10 ***
## Residuals 86 1725 20.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No Blocking !
Factorial
summary(aov(Rating ~ Drug1 + Drug2, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Drug1 2 1549.7 774.9 370.8 <2e-16 ***
## Drug2 2 1198.9 599.5 286.9 <2e-16 ***
## Residuals 84 175.5 2.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Rating ~ Drug1 + Drug2 + Drug1:Drug2, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Drug1 2 1549.7 774.9 405.411 <2e-16 ***
## Drug2 2 1198.9 599.5 313.645 <2e-16 ***
## Drug1:Drug2 4 22.6 5.7 2.961 0.0246 *
## Residuals 80 152.9 1.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
data <-
data %>%
mutate(cocktail = factor(paste(Drug1,Drug2)))
ggplot(data, aes(x = cocktail, y = Rating, fill = cocktail)) +
theme_bw() +
theme(legend.position = "none") +
scale_fill_brewer(palette = "Set1") +
geom_boxplot() +
labs(title = "Does the cocktail impact Ratings")
summary(aov(Rating ~ Drug1 + Drug2 + Drug1:Drug2, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Drug1 2 1549.7 774.9 405.411 <2e-16 ***
## Drug2 2 1198.9 599.5 313.645 <2e-16 ***
## Drug1:Drug2 4 22.6 5.7 2.961 0.0246 *
## Residuals 80 152.9 1.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Rating ~ Drug1 + Drug2 + cocktail, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Drug1 2 1549.7 774.9 405.411 <2e-16 ***
## Drug2 2 1198.9 599.5 313.645 <2e-16 ***
## cocktail 4 22.6 5.7 2.961 0.0246 *
## Residuals 80 152.9 1.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Post-hoc test
TukeyHSD(aov(Rating ~ Drug1 + Drug2 + Drug1:Drug2, data = data))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Rating ~ Drug1 + Drug2 + Drug1:Drug2, data = data)
##
## $Drug1
## diff lwr upr p adj
## b-a -4.512792 -5.365725 -3.659859 0
## c-a -10.310345 -11.177377 -9.443313 0
## c-b -5.797553 -6.650486 -4.944620 0
##
## $Drug2
## diff lwr upr p adj
## y-x -4.189467 -5.057015 -3.321919 0
## z-x -9.005432 -9.866199 -8.144666 0
## z-y -4.815965 -5.661522 -3.970409 0
##
## $`Drug1:Drug2`
## diff lwr upr p adj
## b:x-a:x -2.922222e+00 -4.9474745 -0.896970 0.0005097
## c:x-a:x -9.666667e+00 -11.7445316 -7.588802 0.0000000
## a:y-a:x -2.922222e+00 -4.9474745 -0.896970 0.0005097
## b:y-a:x -8.722222e+00 -10.7474745 -6.696970 0.0000000
## c:y-a:x -1.342222e+01 -15.4474745 -11.396970 0.0000000
## a:z-a:x -8.022222e+00 -10.0474745 -5.996970 0.0000000
## b:z-a:x -1.276768e+01 -14.7488433 -10.786510 0.0000000
## c:z-a:x -1.872222e+01 -20.7474745 -16.696970 0.0000000
## c:x-b:x -6.744444e+00 -8.7696967 -4.719192 0.0000000
## a:y-b:x 7.105427e-15 -1.9712358 1.971236 1.0000000
## b:y-b:x -5.800000e+00 -7.7712358 -3.828764 0.0000000
## c:y-b:x -1.050000e+01 -12.4712358 -8.528764 0.0000000
## a:z-b:x -5.100000e+00 -7.0712358 -3.128764 0.0000000
## b:z-b:x -9.845455e+00 -11.7713685 -7.919541 0.0000000
## c:z-b:x -1.580000e+01 -17.7712358 -13.828764 0.0000000
## a:y-c:x 6.744444e+00 4.7191922 8.769697 0.0000000
## b:y-c:x 9.444444e-01 -1.0808078 2.969697 0.8583631
## c:y-c:x -3.755556e+00 -5.7808078 -1.730303 0.0000028
## a:z-c:x 1.644444e+00 -0.3808078 3.669697 0.2078478
## b:z-c:x -3.101010e+00 -5.0821766 -1.119844 0.0001161
## c:z-c:x -9.055556e+00 -11.0808078 -7.030303 0.0000000
## b:y-a:y -5.800000e+00 -7.7712358 -3.828764 0.0000000
## c:y-a:y -1.050000e+01 -12.4712358 -8.528764 0.0000000
## a:z-a:y -5.100000e+00 -7.0712358 -3.128764 0.0000000
## b:z-a:y -9.845455e+00 -11.7713685 -7.919541 0.0000000
## c:z-a:y -1.580000e+01 -17.7712358 -13.828764 0.0000000
## c:y-b:y -4.700000e+00 -6.6712358 -2.728764 0.0000000
## a:z-b:y 7.000000e-01 -1.2712358 2.671236 0.9674422
## b:z-b:y -4.045455e+00 -5.9713685 -2.119541 0.0000001
## c:z-b:y -1.000000e+01 -11.9712358 -8.028764 0.0000000
## a:z-c:y 5.400000e+00 3.4287642 7.371236 0.0000000
## b:z-c:y 6.545455e-01 -1.2713685 2.580459 0.9750171
## c:z-c:y -5.300000e+00 -7.2712358 -3.328764 0.0000000
## b:z-a:z -4.745455e+00 -6.6713685 -2.819541 0.0000000
## c:z-a:z -1.070000e+01 -12.6712358 -8.728764 0.0000000
## c:z-b:z -5.954545e+00 -7.8804594 -4.028631 0.0000000
Charting
data %>%
group_by(Drug1, Drug2) %>%
summarize(
count = n(),
mean = mean(Rating),
sd = sd(Rating),
.groups = "keep") %>%
kable(
caption = "Does the cocktail impact Ratings",
col.names = c("Drug 1", "Drug 2", "Count", "Mean", "Standard Deviation"),
digits = c(0,0,0,2,2)) %>%
kable_styling() %>%
collapse_rows(columns = c(1))
| Drug 1 | Drug 2 | Count | Mean | Standard Deviation |
|---|---|---|---|---|
| a | x | 9 | 36.22 | 1.30 |
| a | y | 10 | 33.30 | 1.34 |
| a | z | 10 | 28.20 | 1.32 |
| b | x | 10 | 33.30 | 1.77 |
| b | y | 10 | 27.50 | 1.35 |
| b | z | 11 | 23.45 | 1.21 |
| c | x | 9 | 26.56 | 1.13 |
| c | y | 10 | 22.80 | 1.55 |
| c | z | 10 | 17.50 | 1.35 |
Do the main effects of “PG” and “Condition” impact “Valuation”? Yes
1.There is evidence to suggest that there is difference in group means among Conditions(F(19.61,2), p<0.05) * No evidence of different variance * There is evidence of non-normality for Valuation (w = 0.988, p = 0.036)
2.There is evidence to suggest that there is difference in group means among Productor Gamble( t=-6.0463,df = 237.76, p<0.05) * No evidence of different variance * There is evidence of non-normality for Valuation (w = 0.982, p = 0.036)
data <- read_excel("C:/Users/fiona/OneDrive/Desktop/Anly 510 Fall 2022/Assignments/Factorial ANOVA/ANOVAExample2.xlsx")
data <-
data %>%
mutate(
PG = factor(ProductorGamble),
Condition = factor(Condition),
Session = factor(Session),
ProductorGamble = NULL)
Explore Data
ggplot(data, aes(x = PG, y = Valuation, fill = PG)) +
theme_bw() +
theme(legend.position = "none") +
scale_fill_brewer(palette = "Dark2") +
geom_boxplot() +
labs(title = "Does PG impact Valuation")
ggplot(data, aes(x = Condition, y = Valuation, fill = Condition)) +
theme_bw() +
theme(legend.position = "none") +
scale_fill_brewer(palette = "Dark2") +
geom_boxplot() +
labs(title = "Does Condition impact Valuation")
Test Data
bartlett.test(Valuation ~ PG, data = data)
##
## Bartlett test of homogeneity of variances
##
## data: Valuation by PG
## Bartlett's K-squared = 0.11719, df = 1, p-value = 0.7321
bartlett.test(Valuation ~ Condition, data = data)
##
## Bartlett test of homogeneity of variances
##
## data: Valuation by Condition
## Bartlett's K-squared = 0.57097, df = 2, p-value = 0.7516
bartlett.test(Valuation ~ Session, data = data)
##
## Bartlett test of homogeneity of variances
##
## data: Valuation by Session
## Bartlett's K-squared = 0.48295, df = 2, p-value = 0.7855
shapiro.test(data$Valuation)
##
## Shapiro-Wilk normality test
##
## data: data$Valuation
## W = 0.97406, p-value = 0.0002225
xtabs(~PG + Condition + Session, data = data)
## , , Session = 1
##
## Condition
## PG Buyer Chooser Seller
## Gamble 14 13 13
## Product 13 14 13
##
## , , Session = 2
##
## Condition
## PG Buyer Chooser Seller
## Gamble 13 14 13
## Product 13 13 14
##
## , , Session = 3
##
## Condition
## PG Buyer Chooser Seller
## Gamble 13 13 14
## Product 14 13 13
summary(aov(Valuation ~ PG, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## PG 1 429.3 429.3 36.56 5.69e-09 ***
## Residuals 238 2795.1 11.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Valuation ~ Condition, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## Condition 2 457.9 228.95 19.61 1.31e-08 ***
## Residuals 237 2766.5 11.67
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Factorial + Blocking
summary(aov(Valuation ~ PG + Condition + Session, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## PG 1 429.3 429.3 44.180 2.09e-10 ***
## Condition 2 457.9 229.0 23.560 4.76e-10 ***
## Session 2 63.1 31.6 3.249 0.0406 *
## Residuals 234 2274.0 9.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Valuation ~ PG*Condition*Session, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## PG 1 429.3 429.3 44.813 1.75e-10 ***
## Condition 2 457.9 229.0 23.898 3.98e-10 ***
## Session 2 63.1 31.6 3.295 0.0389 *
## PG:Condition 2 55.2 27.6 2.882 0.0581 .
## PG:Session 2 12.0 6.0 0.624 0.5366
## Condition:Session 4 23.7 5.9 0.619 0.6492
## PG:Condition:Session 4 56.2 14.0 1.467 0.2133
## Residuals 222 2126.9 9.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Charting
data %>%
group_by(PG) %>%
summarize(
count = n(),
mean = mean(Valuation),
sd = sd(Valuation)) %>%
kable(
caption = "Does the main effect of PG impact Valuation",
col.names = c("PG", "Count", "Mean", "Standard Deviation"),
digits = c(0,0,2,2)) %>%
kable_styling()
| PG | Count | Mean | Standard Deviation |
|---|---|---|---|
| Gamble | 120 | 4.89 | 3.37 |
| Product | 120 | 7.57 | 3.48 |
data %>%
group_by(Condition) %>%
summarize(
count = n(),
mean = mean(Valuation),
sd = sd(Valuation)) %>%
kable(
caption = "Does the main effect of Condition impact Valuation",
col.names = c("Condition", "Count", "Mean", "Standard Deviation"),
digits = c(0,0,2,2)) %>%
kable_styling()
| Condition | Count | Mean | Standard Deviation |
|---|---|---|---|
| Buyer | 80 | 4.40 | 3.30 |
| Chooser | 80 | 6.55 | 3.37 |
| Seller | 80 | 7.74 | 3.58 |
An interaction occurs if the effect of one independent variable on the dependent variable (the main effect) changes based on the level of the other independent variable. Therefore, if we do not include interactions in factorial experiments, we might interpret the main effect incorrectly because in this case, we ignore the effect of the presence of the other independent variable on the main effect (which might be significant depending on the level of the other independent variable).
In order to get a robust estimate, we must always include the lower order components if they are not significant. For e.g. If we have an interaction of type PQR, then we must include PQ, QR, P*R, P, Q, and R. The lower order components will be measuring an altogether different effect than what they were without an interaction. In case, we ignore the lower order components than we are missing an effect created by them on the output. In addition, these components may be insignificant but would help in evaluating the hypothesis.