Factorial ANOVA

#Setup Make sure you check which packages you haven’t installed and do that first Also, don’t forget to set your working directory

library(ggplot2)
library(ggpubr)
library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ✔ purrr   0.3.5      
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

library(effectsize)
library(tidyr)
library(rstatix)

## 
## Attaching package: 'rstatix'
## 
## The following objects are masked from 'package:effectsize':
## 
##     cohens_d, eta_squared
## 
## The following object is masked from 'package:stats':
## 
##     filter

df <- read.csv("FacANOVAData.csv")

#Factorial ANOVA: Between Subjects Today, we’re going to run a couple factorial ANOVAs: one that is all between subjects, and one that is mixed between and within subjects. The first ANOVA will be all between subjects, looking at how perceptions of how this country is doing (Right Track or Wrong Direction) and political party (Republican, Democrat, Independent, Other) affect views of Republicans in 2019. Our first IV, political party, has 4 levels; our second IV, perceptions about the country has two levels. Our DV is views of Republicans (Republican favorability)

##Visualizing the Data There’s a few ways you can look at this data, but let’s start with a bar graph:

ggplot(df, aes(fill=PARTY, y=repFav, x=direction)) + 
    geom_bar(position="dodge", stat="identity")+
  xlab("Direction Country is Going In")+
  ylab("Republican Favorability")

We can also look at a line graph (though maybe a little wrong for these data):

ggline(df, x = "direction", y = "repFav", color = "PARTY",
       add = c("mean_se"))+
  xlab("Direction Country is Going In")+
  ylab("Republican Favorability")

You can also reverse the lines/colors and the axes for these graphs, if it makes logical sense to do so.

ggline(df, x = "PARTY", y = "repFav", color = "direction",
       add = c("mean_se"))+
  xlab("Political Party")+
  ylab("Republican Favorability")

We may also want to know what these values are, so we can find the cell means:

cellDescript <- with(df, aggregate(x=list(Mean=repFav,SD=repFav),
                                  by=list(F1=direction, F2=PARTY),
                                  FUN=mean_sd) )
View(cellDescript)

## Warning in format.data.frame(x0): corrupt data frame: columns will be truncated
## or padded with NAs

##Running a Factorial ANOVA Now we can run an ANOVA! We haven’t covered ANOVA in lecture or in lab yet, but we’re basically looking at our outcome of interest (Republican favorability) and seeing if that score depends on political party and the direction participants believe the country is going The code: outcome variable predicted by our first IV times our second IV

model1 <- aov(repFav~PARTY*direction,data=df)

summary(model1)

##                   Df Sum Sq Mean Sq F value   Pr(>F)    
## PARTY              3  566.4  188.80  601.29  < 2e-16 ***
## direction          1  202.9  202.88  646.13  < 2e-16 ***
## PARTY:direction    3   13.2    4.41   14.06 4.38e-09 ***
## Residuals       2472  776.2    0.31                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

First row = simple/main effect for political party Second row = simple/main effect for direction country is going in Third row = interaction between both IVs We found some significant results! But we need to break down that interaction.

2x3 - first is the rows 2nd is columns - 2 numbers becauses two variables and the value of each number reps the number of levels

library(report)
report(model1)

## Warning: Could not find Sum-of-Squares for the (Intercept) in the ANOVA table.

## The ANOVA (formula: repFav ~ PARTY * direction) suggests that:
## 
##   - The main effect of PARTY is statistically significant and large (F(3, 2472) =
## 601.29, p < .001; Eta2 (partial) = 0.42, 95% CI [0.40, 1.00])
##   - The main effect of direction is statistically significant and large (F(1,
## 2472) = 646.13, p < .001; Eta2 (partial) = 0.21, 95% CI [0.18, 1.00])
##   - The interaction between PARTY and direction is statistically significant and
## small (F(3, 2472) = 14.06, p < .001; Eta2 (partial) = 0.02, 95% CI [8.70e-03,
## 1.00])
## 
## Effect sizes were labelled following Field's (2013) recommendations.

omega squared is for smaller sample size and parital eta is for a larger sample size

library(effectsize)
omega_squared(model1)

## # Effect Size for ANOVA (Type I)
## 
## Parameter       | Omega2 (partial) |       95% CI
## -------------------------------------------------
## PARTY           |             0.42 | [0.40, 1.00]
## direction       |             0.21 | [0.18, 1.00]
## PARTY:direction |             0.02 | [0.01, 1.00]
## 
## - One-sided CIs: upper bound fixed at [1.00].

##Simple Effects Calculation People tend to do some of the following to examine the simple effects. Here, we can see if there is a difference in how political parties see the Republican Party if they think the country is going in the “Right Direction”:

simpEff1 <- df %>% filter(direction=="Right Direction") %>% 
  aov(repFav~PARTY,data=.) 

summary(simpEff1)

##              Df Sum Sq Mean Sq F value Pr(>F)    
## PARTY         3  47.38   15.79   43.92 <2e-16 ***
## Residuals   878 315.74    0.36                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

we have to do simple effects to do tukeys You would then need to do a post-hoc test on this:

TukeyHSD(simpEff1)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = repFav ~ PARTY, data = .)
## 
## $PARTY
##                              diff          lwr       upr     p adj
## Independent-Democrat   0.43540485  0.243330696 0.6274790 0.0000000
## Other-Democrat         0.47506693  0.167030668 0.7831032 0.0004516
## Republican-Democrat    0.74743040  0.563710960 0.9311498 0.0000000
## Other-Independent      0.03966208 -0.233043981 0.3123682 0.9821041
## Republican-Independent 0.31202555  0.196981691 0.4270694 0.0000000
## Republican-Other       0.27236347  0.005475941 0.5392510 0.0434294

We can see that the only comparison that wasn’t significant was the group comparing Other and Independent, which might make sense

This is a lot to take in… We can also do this for when people think the country is on the “Wrong Track.”

simpEff2 <- df %>% filter(direction=="Wrong Track") %>% 
  aov(repFav~PARTY,data=.) 

summary(simpEff2)

##               Df Sum Sq Mean Sq F value Pr(>F)    
## PARTY          3  122.0   40.66   140.7 <2e-16 ***
## Residuals   1594  460.5    0.29                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

And break it down again:

TukeyHSD(simpEff2)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = repFav ~ PARTY, data = .)
## 
## $PARTY
##                             diff        lwr       upr     p adj
## Independent-Democrat   0.2246251 0.14907537 0.3001749 0.0000000
## Other-Democrat         0.4170004 0.24907905 0.5849218 0.0000000
## Republican-Democrat    0.9658452 0.84184420 1.0898462 0.0000000
## Other-Independent      0.1923753 0.02165092 0.3630997 0.0198932
## Republican-Independent 0.7412201 0.61344890 0.8689912 0.0000000
## Republican-Other       0.5488448 0.35182878 0.7458607 0.0000000

Here, they’re all significant. We can look at the cell means to determine which is higher and what’s going on. We can see that the largest discrepancies are between Republicans and Democrats

We can also break down the simple effects going the other way, such as do Republicans have different ratings of favorability depending on their perceptions of the direction this country is going in?

df %>% filter(PARTY=="Republican") %>% 
  t.test(repFav~direction,data=.)

## 
##  Welch Two Sample t-test
## 
## data:  repFav by direction
## t = 7.5997, df = 220.68, p-value = 8.392e-13
## alternative hypothesis: true difference in means between group Right Direction and group Wrong Track is not equal to 0
## 95 percent confidence interval:
##  0.3304286 0.5618040
## sample estimates:
## mean in group Right Direction     mean in group Wrong Track 
##                      2.841808                      2.395692

library(lsr)
df %>% filter(PARTY=="Republican") %>% 
  cohensD(repFav~direction,data=.)

## [1] 0.7681856

This can be done with all of the political parties, but be careful of Type I error!

df %>% filter(PARTY=="Independent") %>% 
  t.test(repFav~direction,data=.)

## 
##  Welch Two Sample t-test
## 
## data:  repFav by direction
## t = 19.768, df = 548.41, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Right Direction and group Wrong Track is not equal to 0
## 95 percent confidence interval:
##  0.7883318 0.9622899
## sample estimates:
## mean in group Right Direction     mean in group Wrong Track 
##                      2.529782                      1.654472

##Effect Size Now, you might want to find effect size. We’ll stick with partial eta-squared here:

partial_eta_squared(model1)

##           PARTY       direction PARTY:direction 
##      0.42187204      0.20721657      0.01677177

#Factorial ANOVA: Mixed Design The second ANOVA will be mixed, looking at how political party of the participant and political party asked about affects their favorability ratings.

##Pivoting the Data Maybe you knew this was coming, but now we have to pivot our data to properly analyze within subjects data.

df1 <- pivot_longer(df,cols=c("repFav","demFav"),names_to="partyRated",values_to="favorability")

##Visualizing the Data Again, there’s a few ways you can look at this data, but let’s start with a bar graph:

ggplot(df1, aes(fill=PARTY, y=favorability, x=partyRated)) + 
    geom_bar(position="dodge", stat="identity")+
  xlab("Party Being Rated")+
  ylab("Favorability")

We can also look at a line graph (though maybe a little wrong for this data):

ggline(df1, x = "partyRated", y = "favorability", color = "PARTY", add = c("mean_se"))+
  xlab("Party Being Rated")+
  ylab("Favorability")

You can also reverse the lines/colors and the axes for these graphs, if it makes logical sense to do so.

ggline(df1, x = "PARTY", y = "favorability", color = "partyRated", add = c("mean_se"))+
  xlab("Participant Political Party")+
  ylab("Favorability")

We may also want to know what these values are, so we can find the cell means:

cellDescript2 <- with(df1, aggregate(x=list(Mean=favorability,SD=favorability),
                                  by=list(F1=partyRated, F2=PARTY),
                                  FUN=mean_sd) )
View(cellDescript2)

## Warning in format.data.frame(x0): corrupt data frame: columns will be truncated
## or padded with NAs

##Running a Mixed Factorial ANOVA Now we can run an ANOVA! The technique is similar to what you’ve learned before, but a little different to include the other variables.

model2 <- anova_test(data=df1, dv=favorability, wid=X, between=PARTY, within=partyRated, type=3, detailed=T, effect.size="pes")
get_anova_table(model2)

## ANOVA Table (type III tests)
## 
##             Effect DFn  DFd       SSn      SSd         F        p p<.05   pes
## 1      (Intercept)   1 2476 11359.607  634.428 44333.453 0.00e+00     * 0.947
## 2            PARTY   3 2476     6.848  634.428     8.908 7.19e-06     * 0.011
## 3       partyRated   1 2476    16.736 1242.471    33.351 8.65e-09     * 0.013
## 4 PARTY:partyRated   3 2476  1262.829 1242.471   838.857 0.00e+00     * 0.504

the argument effect.size should be set to “pes” which is partial eta squares

This gives us SS and df but not MS. But we can just add a new column with those in there!

model2$MSn <- model2$SSn/model2$DFn
model2$MSd <- model2$SSd/model2$DFd

Now, rerun the get_anova_table(model2) code; that line only!

##Simple Effects Calculation People tend to do some of the following to examine the simple effects. Here, we can see if there is a difference in how Democrats rate favorability of political parties:

df1 %>% filter(PARTY=="Democrat") %>% 
  t.test(favorability~partyRated,data=.,paired=T)

## 
##  Paired t-test
## 
## data:  favorability by partyRated
## t = 54.54, df = 885, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  1.385242 1.488660
## sample estimates:
## mean difference 
##        1.436951

We can also do this for Independents:

df1 %>% filter(PARTY=="Independent") %>% 
  t.test(favorability~partyRated,data=.,paired=T)

## 
##  Paired t-test
## 
## data:  favorability by partyRated
## t = 9.1399, df = 864, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.2900072 0.4486223
## sample estimates:
## mean difference 
##       0.3693148

We can also break down the simple effects going the other way, such as does favorability of Democrats vary by political party?

simpEff3 <- df1 %>% filter(partyRated=="demFav") %>% 
  aov(favorability~PARTY,data=.) 

summary(simpEff3)

##               Df Sum Sq Mean Sq F value Pr(>F)    
## PARTY          3  703.3  234.42   656.2 <2e-16 ***
## Residuals   2476  884.6    0.36                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

And break it down again:

TukeyHSD(simpEff3)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = favorability ~ PARTY, data = .)
## 
## $PARTY
##                              diff        lwr         upr     p adj
## Independent-Democrat   -0.6107954 -0.6842425 -0.53734828 0.0000000
## Other-Democrat         -0.8024500 -0.9577863 -0.64711363 0.0000000
## Republican-Democrat    -1.3832969 -1.4637908 -1.30280305 0.0000000
## Other-Independent      -0.1916546 -0.3471990 -0.03611011 0.0084586
## Republican-Independent -0.7725015 -0.8533962 -0.69160676 0.0000000
## Republican-Other       -0.5808469 -0.7398402 -0.42185364 0.0000000

And again, for how they feel about Republicans:

simpEff4 <- df1 %>% filter(partyRated=="repFav") %>% 
  aov(favorability~PARTY,data=.) 

summary(simpEff4)

##               Df Sum Sq Mean Sq F value Pr(>F)    
## PARTY          3  566.4   188.8   471.1 <2e-16 ***
## Residuals   2476  992.3     0.4                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

And break it down again:

TukeyHSD(simpEff4)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = favorability ~ PARTY, data = .)
## 
## $PARTY
##                             diff         lwr       upr     p adj
## Independent-Democrat   0.4568409  0.37904916 0.5346327 0.0000000
## Other-Democrat         0.5912340  0.42670896 0.7557591 0.0000000
## Republican-Democrat    1.2437650  1.15850962 1.3290203 0.0000000
## Other-Independent      0.1343931 -0.03035241 0.2991385 0.1542656
## Republican-Independent 0.7869240  0.70124409 0.8726040 0.0000000
## Republican-Other       0.6525310  0.48413267 0.8209293 0.0000000

Factorial ANOVA

Janine Medina

11/8/2022