Anna Shirokanova and Olesya Volchenko
Last updated: March 25, 2022
See more: https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide.php
Context to the analysis:
There are theories set to explain on the macro level how societies change when they modernise: get industry, then, information industry, people get rich, live longer lives, etc. What happens to people’s values when life becomes secure and survival guaranteed? Theories that aim to explain these processes are called ‘modernization theories’.
One of the offsprings of modernization theory by R.Inglehart is the recent ‘theory of emancipation’ by C.Welzel (2013, for details, see: https://en.wikipedia.org/wiki/Freedom_Rising). Its main concept is called emancipative values, which are a combination of the freedom of choice and equality of opportunities. These values become especially important to people under certain social conditions, and they motivate further changes in societies by changing life strategies of individuals and putting greater emphasis on human agency.
The concept of emancipative values has been implemented in the World Values Survey cross-national research project. It is an index based on several items. The resulting variable, RESEMAVAL is standardised; it ranges from 0 to 1 where “0” means the lowest possible level of emancipative values, and “1” is the highest possible level of emancipative values.
For an entertaining overview, see: http://www.worldvaluessurvey.org/WVSContents.jsp?CMSID=Findings
Let’s explore the issue. First, get the data:
World Value Survey, Russia subset
## [1] 2500 430
wvsrus$RESEMAVAL1 <- as.numeric(as.character(wvsrus$RESEMAVAL)) # emancipative values is a continuous var## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0000 0.2998 0.3917 0.3931 0.4850 0.9259 22
par(mar = c(5, 25, 2, 2))
barplot(table(wvsrus$V248)/nrow(wvsrus)*100, las = 2, horiz = T, xlim = c(0, 60))Groups as they are now are very different in numbers, which may affect the F-test.
Let’s merge smaller groups together.
wvsrus$educ <- rep(NA, length(wvsrus$V248)) # create a new, empty variable
wvsrus$educ[wvsrus$V248 == "No formal education" |
wvsrus$V248 == "Incomplete primary school" |
wvsrus$V248 == "Complete primary school"|
wvsrus$V248 == "Incomplete secondary school: university-preparatory type"|
wvsrus$V248 == "Incomplete secondary school: technical/ vocational type"] <- "primary"
wvsrus$educ[wvsrus$V248 == "Complete secondary school: university-preparatory type"] <- "secondary gen"
wvsrus$educ[wvsrus$V248 == "Complete secondary school: technical/ vocational type"] <- "secondary vocational"
wvsrus$educ[wvsrus$V248 == "Some university-level education, without degree" |
wvsrus$V248 == "University - level education, with degree" ] <- "tertiary" # merge
wvsrus$educ <- as.factor(wvsrus$educ)par(mar = c(3,10,0,3))
barplot(table(wvsrus$educ)/nrow(wvsrus)*100,
horiz = T,
xlim = c(0, 60),
las = 2) # we get 4 groups of comparable size, which is good for the exampleThe groups are of comparable size now, and the continuous variable is normally distributed within the groups (this is good).
Comment: If group sizes are unequal, the power of the test is based on the smallest group, see more: https://www.theanalysisfactor.com/when-unequal-sample-sizes-are-and-are-not-a-problem-in-anova/
library(dplyr)
wvsrus <- select(wvsrus, RESEMAVAL1, educ) # select the variables you need to work with
summary(wvsrus) ## RESEMAVAL1 educ
## Min. :0.0000 primary :364
## 1st Qu.:0.2998 secondary gen :367
## Median :0.3917 secondary vocational:978
## Mean :0.3931 tertiary :779
## 3rd Qu.:0.4850 NA's : 12
## Max. :0.9259
## NA's :22
library(psych)
library(magrittr)
library(knitr)
library(kableExtra)
describeBy(wvsrus$RESEMAVAL1, wvsrus$educ, mat = TRUE) %>% #create dataframe
select(Education = group1, N = n, Mean = mean, SD = sd, Median = median, Min = min, Max = max, Skew = skew, Kurtosis = kurtosis, st.error = se) %>%
kable(align = c("lrrrrrrrr"), digits = 2, row.names = FALSE,
caption = "Emancipative Values by Education Level") %>%
kable_styling(bootstrap_options = c("bordered", "responsive","striped"), full_width = FALSE)| Education | N | Mean | SD | Median | Min | Max | Skew | Kurtosis | st.error |
|---|---|---|---|---|---|---|---|---|---|
| primary | 356 | 0.35 | 0.13 | 0.34 | 0.06 | 0.74 | 0.41 | -0.08 | 0.01 |
| secondary gen | 363 | 0.38 | 0.13 | 0.38 | 0.08 | 0.76 | 0.05 | -0.59 | 0.01 |
| secondary vocational | 973 | 0.39 | 0.13 | 0.39 | 0.00 | 0.93 | 0.07 | 0.04 | 0.00 |
| tertiary | 774 | 0.42 | 0.13 | 0.43 | 0.03 | 0.79 | 0.03 | -0.16 | 0.00 |
From this boxplot, we see that the Y variable is distributed rather normally in across the education groups and that the level of emancipative values is slightly higher among the better educated respondents.
ANOVA is a parametric test that rests on three assumptions:
If those assumptions are not met by the data, the classic ANOVA test is not reliable.
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 3 0.4393 0.7249
## 2462
P-value is equal to 0.72, which means that variances are equal, i.e., you can indicate in the ANOVA test that var.equal = T.
##
## One-way analysis of means
##
## data: wvsrus$RESEMAVAL1 and wvsrus$educ
## F = 31.738, num df = 3, denom df = 2462, p-value < 2.2e-16
aov.out <- aov(wvsrus$RESEMAVAL1 ~ wvsrus$educ)
# another function of ANOVA to be used here; implies that var.equal = T
summary(aov.out)## Df Sum Sq Mean Sq F value Pr(>F)
## wvsrus$educ 3 1.58 0.5258 31.74 <2e-16 ***
## Residuals 2462 40.79 0.0166
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 34 observations deleted due to missingness
F(3, 2462) = 31.738, p-value < 0.001 means that the difference in the level of emancipative values across education groups is statistically significant.
The data points lie mostly along the diagonal line, which means that the distribution of residuals is close to normal.
In normally distributed residuals, you will see a straight red line in the two upper graphs, and a straight line along the diagonal in the Q-Q plot.
anova.res <- residuals(object = aov.out)
describe(anova.res) # look at skew and kurtosis, should be < 2## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 2466 0 0.13 0 0 0.14 -0.4 0.53 0.93 0.1 -0.12 0
##
## Shapiro-Wilk normality test
##
## data: anova.res
## W = 0.99824, p-value = 0.008871
Here, skew and kurtosis are OK for a normal distribution. However, the Shapiro-Wilk test says that it is not normal. Visual analysis and skew and kurtosis agree here, which is why we can conclude that the normality assumption holds.
To inspect which groups contribute to the statistical significance of this test, run post hoc (“after analysis”) tests.
Here, you have two options:
if the variances are equal across groups, use Tukey’s post hoc test;
if the variances are unequal, use Games-Howell test or pairwise t-test with the Bonferroni or Holm-Bonferroni correction.
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = wvsrus$RESEMAVAL1 ~ wvsrus$educ)
##
## $`wvsrus$educ`
## diff lwr upr p adj
## secondary gen-primary 0.03380885 0.009126274 0.05849143 0.0024662
## secondary vocational-primary 0.04705791 0.026561133 0.06755468 0.0000000
## tertiary-primary 0.07771337 0.056522529 0.09890421 0.0000000
## secondary vocational-secondary gen 0.01324905 -0.007102517 0.03360062 0.3379127
## tertiary-secondary gen 0.04390452 0.022854091 0.06495494 0.0000005
## tertiary-secondary vocational 0.03065546 0.014717825 0.04659310 0.0000048
To make all labels visible in the plot, arrange margins and change the direction of labels
We can see that all but one differences between pairs of groups are statistically significant. The difference between “secondary gen” and “secondary vocational” is not statistically significant.
Disclaimer for those who did not attend the seminar: the following post hoc test is not required by our example; it is shown for educational purposes to demonstrate how one would proceed in case where variances are not homogeneous.
pairwise.t.test(wvsrus$RESEMAVAL1, wvsrus$educ,
p.adjust.method = "bonferroni") # for unequal variances, add pool.sd = F##
## Pairwise comparisons using t tests with pooled SD
##
## data: wvsrus$RESEMAVAL1 and wvsrus$educ
##
## primary secondary gen secondary vocational
## secondary gen 0.0026 - -
## secondary vocational 2.4e-08 0.5660 -
## tertiary < 2e-16 5.4e-07 4.9e-06
##
## P value adjustment method: bonferroni
We see that all but one pairs have a statistically significant difference between the means. If you have trouble reading such p-values, use the options parameter:
options(scipen = 999)
pairwise.t.test(wvsrus$RESEMAVAL1, wvsrus$educ,
p.adjust.method = "bonferroni") ##
## Pairwise comparisons using t tests with pooled SD
##
## data: wvsrus$RESEMAVAL1 and wvsrus$educ
##
## primary secondary gen secondary vocational
## secondary gen 0.0026 - -
## secondary vocational 0.000000024 0.5660 -
## tertiary < 0.0000000000000002 0.000000540 0.000004881
##
## P value adjustment method: bonferroni
The results are the same as in the TukeyHSD test.
Disclaimer for those who did not attend the seminar: the following non-parametric test is not required by our example; it is shown for educational purposes to demonstrate how one would proceed in case where assumptions are not met.
##
## Kruskal-Wallis rank sum test
##
## data: RESEMAVAL1 by educ
## Kruskal-Wallis chi-squared = 92.577, df = 3, p-value <
## 0.00000000000000022
With KW chi-square(3) = 92.6, p-value is < .001, which means that the mean ranks of the education groups are not the same. This results confirms what we saw earlier in the ANOVA test.
If Kruskal-Wallis test is significant, you can run non-parametric post hoc tests, e.g. Dunn’s test:
library(DescTools)
DunnTest(RESEMAVAL1 ~ educ, # the grouping variable MUST be a factor; if it is a character, write `as.factor(educ)`
method = "holm", # even with non-parametric methods, apply an adjustment for multiple comparisons. Holm is OK.
data = wvsrus)##
## Dunn's test of multiple comparisons using rank sums : holm
##
## mean.rank.diff pval
## secondary gen-primary 196.7600 0.00042 ***
## secondary vocational-primary 266.8025 0.0000000073 ***
## tertiary-primary 426.8090 < 0.0000000000000002 ***
## secondary vocational-secondary gen 70.0425 0.10971
## tertiary-secondary gen 230.0490 0.0000015181 ***
## tertiary-secondary vocational 160.0065 0.0000092217 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Another way to do the same test is this:
## # A tibble: 6 x 9
## .y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
## 1 RESEMAVAL1 primary secon~ 356 363 3.70 2.12e- 4 4.23e- 4 ***
## 2 RESEMAVAL1 primary secon~ 356 973 6.05 1.45e- 9 7.27e- 9 ****
## 3 RESEMAVAL1 primary terti~ 356 774 9.36 7.94e-21 4.76e-20 ****
## 4 RESEMAVAL1 second~ secon~ 363 973 1.60 1.10e- 1 1.10e- 1 ns
## 5 RESEMAVAL1 second~ terti~ 363 774 5.08 3.80e- 7 1.52e- 6 ****
## 6 RESEMAVAL1 second~ terti~ 973 774 4.67 3.07e- 6 9.22e- 6 ****
These results show that all but one pair of groups have statistically significant differences in their medians.
See: https://stats.stackexchange.com/questions/126686/how-to-read-the-results-of-dunns-test
This also confirms what we saw above for the ANOVA post hoc tests.
Now that you have established the statistical significance, you may think, what is the substantive significance of group membership in determining the level of emancipative values.
To answer this question, look at the share of variance explained by education groups in all the variance of emancipative values.
This is called the ‘effect size’ in ANOVA, or omega-squared. The larger it is, the more important your finding.
Omega-squared varies from 0 to 1. Interpretation: 0.01- < 0.06 (small effect), 0.06 - < 0.14 (moderate effect) and >= 0.14 (large effect).
library(sjstats)
anova_stats(aov.out) # replace the 'aov.out' with the name of your ANOVA resulting object ## term | df | sumsq | meansq | statistic | p.value | etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f | power
## ---------------------------------------------------------------------------------------------------------------------------------------------
## wvsrus$educ | 3 | 1.578 | 0.526 | 31.738 | < .001 | 0.037 | 0.037 | 0.036 | 0.036 | 0.036 | 0.197 | 1
## Residuals | 2462 | 40.791 | 0.017 | | | | | | | | |
## # Effect Size for ANOVA
##
## Parameter | Omega2 | 95% CI
## -----------------------------------
## wvsrus$educ | 0.04 | [0.02, 1.00]
##
## - One-sided CIs: upper bound fixed at (1).
The omega-squared is 0.036 - this is a small effect. It means that even though there is a statistically significant difference in the level of emancipative values across education groups and this difference is statistically significant across all but one pairs of groups, in practical terms this effect is modest.
Extra: How is the effect size for ANOVA calculated?
\[ \omega^2 = \frac{(SS_{model} - (df_{model})*(MS_{error}))}{ SS_{model} + MS_{error} + SS_{error}}\]
See: http://www.statisticshowto.com/omega-squared/
What about the effect size for a non-parametric test?
Consider the epsilon-squared: \[ \epsilon^2 = \frac{{\chi^2} - k + 1}{n - k}\] where the chi-squared is the statistic from the Kruskal-Wallis test, k is the number of groups, and n is the total number of observations.
## epsilon.squared
## 0.037
## # A tibble: 1 x 5
## .y. n effsize method magnitude
## * <chr> <int> <dbl> <chr> <ord>
## 1 RESEMAVAL1 2500 0.0359 eta2[H] small
“The epsilon-squared shows the percentage of variance in the dependent variable explained by the independent variable. The interpretation values commonly in published literature are: 0.01- < 0.06 (small effect), 0.06 - < 0.14 (moderate effect) and >= 0.14 (large effect).”
Read more: https://rpkgs.datanovia.com/rstatix/reference/kruskal_effsize.html
Here, the epsilon-squared equals 0.036, which means a small effect.
Examples of ANOVA scripts (read through them and take the best solutions):
ggstatsplot::ggbetweenstats() for beautiful ANOVA plots