One-way analysis of variance (ANOVA) in R

Anna Shirokanova and Olesya Volchenko

February 27, 2023

This seminar

ANOVA is used:

See more: https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide.php

F-test

Source: https://www.youtube.com/watch?v=0Vj2V2qRU10

Example from the World Values Survey data, Russia

Context to the analysis:

For an entertaining overview, see: http://www.worldvaluessurvey.org/WVSContents.jsp?CMSID=Findings

Our research question is: “Do Russians of different levels of education have the same level of emancipative values?”

Let’s explore the issue. First, get the data:

World Value Survey, Russia subset

library(foreign)
wvs <- read.spss("/Users/olesyavolchenko/Yandex.Disk.localized/teaching/DA BA 2023/seminars/WV6_Russia.sav", to.data.frame = T, use.value.labels = T)
wvsrus <- subset(wvs, V2 == "Russia")

dim(wvsrus) # there are 2,500 observations of 430 variables
## [1] 2500  430

Variables

wvsrus$RESEMAVAL1 <-  as.numeric(as.character(wvsrus$RESEMAVAL)) # emancipative values is a continuous var

Emancipative Values

library(ggplot2)
library(scales)
ggplot(wvsrus, aes(x = RESEMAVAL1)) + 
  geom_density() +
  xlim(0, 1)

summary(wvsrus$RESEMAVAL1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.0000  0.2998  0.3917  0.3931  0.4850  0.9259      22

Education levels in the data:

par(mar = c(5, 25, 2, 2))
barplot(table(wvsrus$V248)/nrow(wvsrus)*100, las = 2, horiz = T, xlim = c(0, 60))

Groups as they are now are very different in numbers, which may affect the F-test.

Recoding education levels (for the purpose of this example)

Let’s merge smaller groups together.

wvsrus$educ <- rep(NA, length(wvsrus$V248)) # create a new, empty variable

wvsrus$educ[wvsrus$V248 == "No formal education" | 
            wvsrus$V248 == "Incomplete primary school" |
            wvsrus$V248 == "Complete primary school"|
            wvsrus$V248 == "Incomplete secondary school: university-preparatory type"|
            wvsrus$V248 == "Incomplete secondary school: technical/ vocational type"] <- "primary" 

wvsrus$educ[wvsrus$V248 == "Complete secondary school: university-preparatory type"] <- "secondary gen"

wvsrus$educ[wvsrus$V248 == "Complete secondary school: technical/ vocational type"] <- "secondary vocational"

wvsrus$educ[wvsrus$V248 == "Some university-level education, without degree" | 
            wvsrus$V248 == "University - level education, with degree" ] <- "tertiary" # merge
wvsrus$educ <- as.factor(wvsrus$educ)

library(dplyr)
wvsrus <- select(wvsrus, RESEMAVAL1, educ) # select the variables you need to work with

New groups

par(mar = c(3,10,0,3))
barplot(table(wvsrus$educ)/nrow(wvsrus)*100, 
        horiz = T, 
        xlim = c(0, 60), 
        las = 2) # we get 4 groups of comparable size, which is good for the example

Values descriptives across new groups

library(psych)
library(magrittr)
library(knitr)
library(kableExtra)

describeBy(wvsrus$RESEMAVAL1, wvsrus$educ, mat = TRUE) %>% #create dataframe
  select(Education = group1, N=n, Mean=mean, SD=sd, Median=median, Min=min, Max=max, 
                Skew=skew, Kurtosis=kurtosis, st.error = se) %>% 
  kable(align=c("lrrrrrrrr"), digits=2, row.names = FALSE,
        caption="Emancipative Values by Education Level") %>% 
  kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
Emancipative Values by Education Level
Education N Mean SD Median Min Max Skew Kurtosis st.error
primary 356 0.35 0.13 0.34 0.06 0.74 0.41 -0.08 0.01
secondary gen 363 0.38 0.13 0.38 0.08 0.76 0.05 -0.59 0.01
secondary vocational 973 0.39 0.13 0.39 0.00 0.93 0.07 0.04 0.00
tertiary 774 0.42 0.13 0.43 0.03 0.79 0.03 -0.16 0.00

The groups are of comparable size now, and the continuous variable is normally distributed within the groups (this is good).

Boxplot: Emancipative Values by Education Level

ggplot(wvsrus, aes(x = educ, y = RESEMAVAL1)) +
  geom_boxplot() +
  theme_classic()

From this boxplot, we see that the Y variable is distributed rather normally in across the education groups and that the level of emancipative values is slightly higher among the better educated respondents.

ANOVA test

ANOVA is a parametric test that rests on three assumptions:

If those assumptions are not met by the data, the classic ANOVA test is not reliable.

Homogeneity of variances

library(car)
leveneTest(wvsrus$RESEMAVAL1 ~ wvsrus$educ)
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value Pr(>F)
## group    3  0.4393 0.7249
##       2462

P-value is equal to 0.72, which means that variances are equal, i.e., you can indicate in the ANOVA test that var.equal = T.

F-test

oneway.test(wvsrus$RESEMAVAL1 ~ wvsrus$educ, var.equal = T) 
## 
##  One-way analysis of means
## 
## data:  wvsrus$RESEMAVAL1 and wvsrus$educ
## F = 31.738, num df = 3, denom df = 2462, p-value < 2.2e-16
aov.out <- aov(wvsrus$RESEMAVAL1 ~ wvsrus$educ)
# another function of ANOVA to be used here; implies that var.equal = T
summary(aov.out)
##               Df Sum Sq Mean Sq F value Pr(>F)    
## wvsrus$educ    3   1.58  0.5258   31.74 <2e-16 ***
## Residuals   2462  40.79  0.0166                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 34 пропущенных наблюдений удалены

F(3, 2462) = 31.738, p-value < 0.001 means that the difference in the level of emancipative values across education groups is statistically significant.

Normality of residuals

plot(aov.out, 2)

The data points lie mostly along the diagonal line, which means that the distribution of residuals is close to normal.

layout(matrix(1:4, 2, 2))
plot(aov.out)

In normally distributed residuals, you will see a straight red line in the two upper graphs, and a straight line along the diagonal in the Q-Q plot.

A more formal way to check normality of residuals:

anova.res <- residuals(object = aov.out)
describe(anova.res) # look at skew and kurtosis, should be < 2
##    vars    n mean   sd median trimmed  mad  min  max range skew kurtosis se
## X1    1 2466    0 0.13      0       0 0.14 -0.4 0.53  0.93  0.1    -0.12  0
shapiro.test(x = anova.res) # if the p-value is > .05, the distribution IS normal.
## 
##  Shapiro-Wilk normality test
## 
## data:  anova.res
## W = 0.99824, p-value = 0.008871
hist(anova.res)

Here, skew and kurtosis are OK for a normal distribution. However, the Shapiro-Wilk test says that it is not normal. Visual analysis and skew and kurtosis agree here, which is why we can conclude that the normality assumption holds.

Post hoc tests

To inspect which groups contribute to the statistical significance of this test, run post hoc (“after analysis”) tests.

Here, you have two options:

  1. if variances are equal across groups, use Tukey’s post hoc test;

  2. if variances are unequal, use pairwise t-test with Bonferroni’s correction.

Tukey ‘Honestly Significant Differences’

TukeyHSD(aov.out)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = wvsrus$RESEMAVAL1 ~ wvsrus$educ)
## 
## $`wvsrus$educ`
##                                          diff          lwr        upr     p adj
## secondary gen-primary              0.03380885  0.009126274 0.05849143 0.0024662
## secondary vocational-primary       0.04705791  0.026561133 0.06755468 0.0000000
## tertiary-primary                   0.07771337  0.056522529 0.09890421 0.0000000
## secondary vocational-secondary gen 0.01324905 -0.007102517 0.03360062 0.3379127
## tertiary-secondary gen             0.04390452  0.022854091 0.06495494 0.0000005
## tertiary-secondary vocational      0.03065546  0.014717825 0.04659310 0.0000048
plot(TukeyHSD(aov.out))

To make all labels visible in the plot, arrange margins and change the direction of labels

par(mar = c(5, 15, 3, 1))
Tukey <- TukeyHSD(aov.out)
plot(Tukey, las = 2)

We can see that all but one differences between pairs of groups are statistically significant. The difference between “secondary gen” and “secondary vocational” is not statistically significant.

Bonferroni post hoc

pairwise.t.test(wvsrus$RESEMAVAL1, wvsrus$educ, 
                adjust = "bonferroni") # for unequal variances, pool.sd = T
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  wvsrus$RESEMAVAL1 and wvsrus$educ 
## 
##                      primary secondary gen secondary vocational
## secondary gen        0.00087 -             -                   
## secondary vocational 2.0e-08 0.09434       -                   
## tertiary             < 2e-16 3.6e-07       2.4e-06             
## 
## P value adjustment method: holm

We see that all but one pairs have a statistically significant difference between the means. If you have trouble reading such p-values, use the options parameter:

options(scipen = 999)
pairwise.t.test(wvsrus$RESEMAVAL1, wvsrus$educ, 
                adjust = "bonferroni") 
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  wvsrus$RESEMAVAL1 and wvsrus$educ 
## 
##                      primary              secondary gen secondary vocational
## secondary gen        0.00087              -             -                   
## secondary vocational 0.00000002           0.09434       -                   
## tertiary             < 0.0000000000000002 0.00000036    0.00000244          
## 
## P value adjustment method: holm

The results are the same as in the TukeyHSD test.

Non-parametric equivalent of ANOVA

kruskal.test(RESEMAVAL1 ~ educ, data = wvsrus)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  RESEMAVAL1 by educ
## Kruskal-Wallis chi-squared = 92.577, df = 3, p-value <
## 0.00000000000000022

With KW chi-square(3) = 92.6, p-value is < .001, which means that the mean ranks of the education groups are not the same. This results confirms what we saw earlier in the ANOVA test.

If Kruskal-Wallis test is significant, you can run non-parametric post hoc tests, e.g. Dunn’s test:

library(DescTools) 
DunnTest(RESEMAVAL1 ~ educ, data = wvsrus)
## 
##  Dunn's test of multiple comparisons using rank sums : holm  
## 
##                                    mean.rank.diff                 pval    
## secondary gen-primary                    196.7600              0.00042 ***
## secondary vocational-primary             266.8025         0.0000000073 ***
## tertiary-primary                         426.8090 < 0.0000000000000002 ***
## secondary vocational-secondary gen        70.0425              0.10971    
## tertiary-secondary gen                   230.0490         0.0000015181 ***
## tertiary-secondary vocational            160.0065         0.0000092217 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

These results show that all but one pair of groups have statistically significant differences in their medians.

See: https://stats.stackexchange.com/questions/126686/how-to-read-the-results-of-dunns-test

This also confirms what we saw above for the ANOVA post hoc tests.

Effect size

Now that you have established the statistical significance, you may think, what is the substantive significance of group membership in determining the level of emancipative values.

To answer this question, look at the share of variance explained by education groups in all the variance of emancipative values.

This is called the ‘effect size’ in ANOVA, or omega-squared. The larger it is, the more important your finding.

Omega-squared varies from 0 to 1. Interpretation: 0.01- < 0.06 (small effect), 0.06 - < 0.14 (moderate effect) and >= 0.14 (large effect).

library(sjstats)
anova_stats(aov.out) # replace the 'aov.out' with the name of your ANOVA resulting object
## term        |   df |  sumsq | meansq | statistic | p.value | etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f | power
## ---------------------------------------------------------------------------------------------------------------------------------------------
## wvsrus$educ |    3 |  1.578 |  0.526 |    31.738 |  < .001 | 0.037 |         0.037 |   0.036 |           0.036 |     0.036 |    0.197 |     1
## Residuals   | 2462 | 40.791 |  0.017 |           |         |       |               |         |                 |           |          |

The result is 0.036 - this is a low effect. It means that even though there is a statistically significant difference in the level of emancipative values across education groups and this difference is statistically significant across all but one pairs of groups, in practical terms this effect is modest.

Extra: How is the effect size for ANOVA calculated?

\[ \omega^2 = \frac{(SS_{model} - (df_{model})*(MS_{error}))}{ SS_{model} + MS_{error} + SS_{error}}\]

See: http://www.statisticshowto.com/omega-squared/

What about the effect size for a non-parametric test?

Consider the epsilon-squared: \[ \epsilon^2 = \frac{{\chi^2} - k + 1}{n - k}\] , where the chi-squared is the statistic from the Kruskal-Wallis test, k is the number of groups, and n is the total number of observations.

library(rcompanion)
epsilonSquared(x = wvsrus$RESEMAVAL1, g = wvsrus$educ)
## epsilon.squared 
##           0.037

“The epsilon-squared shows the percentage of variance in the dependent variable explained by the independent variable. Omega-squared varies from 0 to 1. Interpretation: 0.01- < 0.06 (small effect), 0.06 - < 0.14 (moderate effect) and >= 0.14 (large effect).

Read more: https://rpkgs.datanovia.com/rstatix/reference/kruskal_effsize.html

Here, the epsilon-squared equals 0.036, which means a small effect.

Sources and things to think over:

Examples of ANOVA scripts (read through them and take the best solutions):

A nice bonus

you can use ggstatsplot::ggbetweenstats() for beautiful (and informative!) ANOVA plots

library(ggstatsplot)
## You can cite this package as:
##      Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
##      Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167
ggbetweenstats(data = wvsrus,  y = RESEMAVAL1, x = educ, var.equal = T) 

Explore possible areguments of ggbetweenstats() on your own!

To summarise the key points

More advanced and detailed flow chart: