Although the name of the technique refers to variances, the main goal of ANOVA is to investigate differences in means.
Two-way ANOVA is used to evaluate simultaneously the effect of two different grouping variables on a continuous outcome variable. Other names for it are two-factor design, factorial ANOVA and two-way between-subjects ANOVA.
The repeated-measures ANOVA is used for analyzing data where the same subjects are measured more than once. This test is also referred to as a within-subjects ANOVA or ANOVA with repeated measures.
A Mixed Analysis of Variance (Mixed ANOVA), also known as a Split-Plot ANOVA, is a statistical technique used to analyze the effects of two or more independent variables on a dependent variable when at least one factor varies between subjects and at least one varies within subjects. It combines a between-subjects ANOVA with a repeated-measures ANOVA, so that between-subjects factors, within-subjects factors and their interaction can be examined in a single analysis.
Learn how to:
In a Mixed ANOVA, the independent variables can be of two types:
Between-Subjects Factor: This is similar to the independent variable in a traditional One-Way or Two-Way ANOVA. It categorizes the observations into different groups or conditions, and the interest lies in understanding the differences in means across these groups.
Within-Subjects Factor: Also known as a repeated measures factor, this variable represents factors for which measurements are taken on the same subjects under different conditions. The within-subjects factor allows you to investigate how subjects’ responses change across these conditions.
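Total Sum of Squares (SST)
\(SST= \sum_{a=1}^{b}\sum_{i=1}^{n_a}\sum_{j=1}^{k}(X_{aij} - \bar{X})^2\)
Between-Subjects Sum of Squares for Group (SSB)
\(SSB= k\sum_{a=1}^{b} n_a (\bar{X}_{a\cdot\cdot} - \bar{X})^2\)
Between-Subjects Error Sum of Squares, subjects within groups (\(SSE_B\))
\(SSE_B= k\sum_{a=1}^{b}\sum_{i=1}^{n_a}(\bar{X}_{ai\cdot} - \bar{X}_{a\cdot\cdot})^2\)
Within-Subjects Sum of Squares for Condition (SSW)
\(SSW= N\sum_{j=1}^{k}(\bar{X}_{\cdot\cdot j} - \bar{X})^2\)
Sum of Squares for Interaction (SSI)
\(SSI= \sum_{a=1}^{b} n_a \sum_{j=1}^{k}(\bar{X}_{a\cdot j} - \bar{X}_{a\cdot\cdot} - \bar{X}_{\cdot\cdot j} + \bar{X})^2\)
Within-Subjects Error Sum of Squares (\(SSE_W\))
\(SSE_W=SST-SSB-SSE_B-SSW-SSI\)
where \(b\) is the number of groups (index \(a\)), \(n_a\) is the number of subjects in group \(a\) (index \(i\)), \(N=\sum_{a=1}^{b} n_a\) is the total number of subjects and \(k\) is the number of repeated conditions (index \(j\)). \(X_{aij}\) is the observation for subject \(i\) of group \(a\) under condition \(j\), \(\bar{X}\) is the grand mean, and \(\bar{X}_{a\cdot\cdot}\), \(\bar{X}_{\cdot\cdot j}\), \(\bar{X}_{a\cdot j}\) and \(\bar{X}_{ai\cdot}\) are the group, condition, cell and subject means. When group sizes are unequal, software that reports type III sums of squares can give slightly different values.
Factor_or_Effect | Sum_of_Squares | Degrees_of_Freedom | Mean_Square | F_Statistic |
---|---|---|---|---|
Between-Subjects (Group) | SSB | \(df_B\) | \(MSB = \frac{SSB}{df_B}\) | \(F_{\text{Group}} = \frac{MSB}{MSE_B}\) |
Between-Subjects Error (subjects within groups) | \(SSE_B\) | \(df_{E_B}\) | \(MSE_B = \frac{SSE_B}{df_{E_B}}\) | |
Within-Subjects (Condition) | SSW | \(df_W\) | \(MSW = \frac{SSW}{df_W}\) | \(F_{\text{Condition}} = \frac{MSW}{MSE_W}\) |
Interaction (Group * Condition) | SSI | \(df_I\) | \(MSI = \frac{SSI}{df_I}\) | \(F_{\text{Interaction}} = \frac{MSI}{MSE_W}\) |
Within-Subjects Error | \(SSE_W\) | \(df_{E_W}\) | \(MSE_W = \frac{SSE_W}{df_{E_W}}\) | |
\(df_B\): Degrees of Freedom for the Group factor (number of groups minus 1, \(b-1\)).
\(df_{E_B}\): Degrees of Freedom for the between-subjects error (number of subjects minus the number of groups, \(N-b\)).
\(df_W\): Degrees of Freedom for the Condition factor (number of conditions minus 1, \(k-1\)).
\(df_I\): Degrees of Freedom for the Interaction effect (product of the degrees of freedom for the Group and Condition factors, \((b-1)(k-1)\)).
\(df_{E_W}\): Degrees of Freedom for the within-subjects error (\((N-b)(k-1)\)).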
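As a quick arithmetic check, the degrees of freedom of a mixed design can be worked out from the design sizes alone. The sketch below assumes the sizes of the example analysed later in this tutorial (2 groups, 2 conditions, 21 subjects); the variable names are illustrative and not part of any package. In a mixed design, Group is tested against a between-subjects error term, while Condition and the interaction are tested against a within-subjects error term.

```r
# Design sizes (assumed from the demo data used later in this tutorial)
b <- 2    # number of groups (LOW, HIGH)
k <- 2    # number of repeated conditions (Non_Fatigue, Fatigue)
N <- 21   # number of subjects

# Numerator degrees of freedom for each effect
df_group       <- b - 1
df_condition   <- k - 1
df_interaction <- (b - 1) * (k - 1)

# Denominator (error) degrees of freedom in a mixed design
df_error_between <- N - b              # subjects within groups
df_error_within  <- (N - b) * (k - 1)  # condition x subjects within groups

# If the pairing were ignored and all N * k observations were treated as
# independent, a two-way between-subjects ANOVA would use this error df
df_error_two_way <- N * k - b * k

c(df_group = df_group, df_condition = df_condition,
  df_interaction = df_interaction,
  df_error_between = df_error_between,
  df_error_within = df_error_within,
  df_error_two_way = df_error_two_way)
```

With two conditions the two error terms happen to have the same degrees of freedom; with three or more conditions they would differ.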
‘rstatix’: provides a simple and intuitive pipe-friendly framework, coherent with the ‘tidyverse’ design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses.
‘ggpubr’: provides some easy-to-use functions for creating and customizing ‘ggplot2’- based publication ready plots.
# Load the packages used in this tutorial
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
library(ggpubr)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.2.3
Fatigue can manifest in various ways and impact an individual’s ability to perform tasks effectively.
Fatigue
Fatigue is a condition characterized by a decline in physical and/or cognitive capabilities due to sustained activity, often accompanied by feelings of tiredness, reduced energy levels, and increased effort required to perform tasks.
Groups
Groups in the dataset refer to distinct categories or experimental conditions under which measurements are taken. Each group represents a specific context or scenario that might influence the response variable (velocity).
Velocity
Velocity is a measure of how quickly an object changes its position with respect to time. In this dataset, velocity represents the rate of movement of each individual under the two conditions.
Understanding how fatigue affects velocity is relevant in fields such as sports science, occupational health, and human performance optimization, where it can inform decisions and improvements.
# Demo data
hip <- data.frame(
stringsAsFactors = FALSE,
id = c(1L,2L,3L,4L,5L,6L,7L,8L,
9L,10L,11L,12L,13L,14L,15L,16L,17L,18L,19L,
20L,21L),
Group = c("LOW","LOW","LOW","LOW",
"LOW","LOW","LOW","LOW","LOW","LOW","HIGH","HIGH",
"HIGH","HIGH","HIGH","HIGH","HIGH","HIGH","HIGH",
"HIGH","HIGH"),
Non_Fatigue = c(0.54,0.35,0.69,0.6,0.5,
0.56,0.72,0.3,0.56,0.63,0.4,0.46,0.35,0.7,0.54,
0.46,0.35,0.39,0.62,0.52,0.45),
Fatigue = c(0.6,0.38,0.82,0.5,0.51,
0.68,0.73,0.38,0.7,0.54,0.62,0.37,0.32,0.85,0.73,
0.49,0.56,0.29,0.79,0.54,0.48)
)
head(hip)
## id Group Non_Fatigue Fatigue
## 1 1 LOW 0.54 0.60
## 2 2 LOW 0.35 0.38
## 3 3 LOW 0.69 0.82
## 4 4 LOW 0.60 0.50
## 5 5 LOW 0.50 0.51
## 6 6 LOW 0.56 0.68
The between-subjects factor to compare is the Group of each individual (LOW or HIGH).
The within-subjects factor is Condition: each individual is measured both with and without fatigue (Fatigue and Non_Fatigue).
# Transform data into long format
hip <- hip %>%
gather(key = "Condition", value = "Velocity", Non_Fatigue, Fatigue) %>%
convert_as_factor(id, Condition)
head(hip)
## id Group Condition Velocity
## 1 1 LOW Non_Fatigue 0.54
## 2 2 LOW Non_Fatigue 0.35
## 3 3 LOW Non_Fatigue 0.69
## 4 4 LOW Non_Fatigue 0.60
## 5 5 LOW Non_Fatigue 0.50
## 6 6 LOW Non_Fatigue 0.56
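To make the wide-to-long step concrete, here is a minimal base R sketch of the same reshaping on a three-subject excerpt of the demo data. It is only an illustration of what the long format contains, not the method used above; the object names `wide`, `long` and `conditions` are made up for this example.

```r
# Three subjects from the demo data, in wide format (one row per subject)
wide <- data.frame(
  id = 1:3,
  Group = c("LOW", "LOW", "LOW"),
  Non_Fatigue = c(0.54, 0.35, 0.69),
  Fatigue = c(0.60, 0.38, 0.82)
)

# Columns that hold the repeated measurements
conditions <- c("Non_Fatigue", "Fatigue")

# Long format: one row per subject and condition
long <- data.frame(
  id = factor(rep(wide$id, times = length(conditions))),
  Group = rep(wide$Group, times = length(conditions)),
  Condition = factor(rep(conditions, each = nrow(wide))),
  Velocity = unlist(wide[conditions], use.names = FALSE)
)
long
```

Each subject now appears once per condition, which is the layout the repeated-measures functions expect.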
hip %>%
group_by(Group, Condition) %>%
get_summary_stats(Velocity, type = "mean_sd")
## # A tibble: 4 × 6
## Group Condition variable n mean sd
## <chr> <fct> <fct> <dbl> <dbl> <dbl>
## 1 HIGH Fatigue Velocity 11 0.549 0.186
## 2 HIGH Non_Fatigue Velocity 11 0.476 0.111
## 3 LOW Fatigue Velocity 10 0.584 0.148
## 4 LOW Non_Fatigue Velocity 10 0.545 0.134
Between-Subjects Hypotheses (Main Effect of Group):
Null Hypothesis (H₀): The population mean of the dependent variable is the same at every level of the between-subjects factor (Group).
Alternative Hypothesis (H₁): The population mean of the dependent variable differs between at least two levels of the between-subjects factor (Group).
Within-Subjects Hypotheses (Main Effect of Condition):
Null Hypothesis (H₀): The population mean of the dependent variable is the same at every level of the within-subjects factor (Condition).
Alternative Hypothesis (H₁): The population mean of the dependent variable differs between at least two levels of the within-subjects factor (Condition).
Interaction Effect Hypotheses (Group by Condition Interaction):
Null Hypothesis (H₀): There is no interaction between the between-subjects factor (Group) and the within-subjects factor (Condition): the effect of Condition on the dependent variable is the same in every group.
Alternative Hypothesis (H₁): There is an interaction between Group and Condition: the effect of Condition on the dependent variable differs between groups.
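For the 2 × 2 design used in this tutorial, the three null hypotheses can be written compactly in terms of population means. This is a sketch in notation introduced here for illustration: \(\mu\) with a single subscript denotes a marginal mean and \(\mu\) with two subscripts denotes a cell mean.

```latex
\begin{aligned}
\text{Group:}\quad & H_0:\ \mu_{\text{LOW}} = \mu_{\text{HIGH}} \\
\text{Condition:}\quad & H_0:\ \mu_{\text{Non\_Fatigue}} = \mu_{\text{Fatigue}} \\
\text{Interaction:}\quad & H_0:\ \mu_{\text{LOW},\,\text{Fatigue}} - \mu_{\text{LOW},\,\text{Non\_Fatigue}}
  = \mu_{\text{HIGH},\,\text{Fatigue}} - \mu_{\text{HIGH},\,\text{Non\_Fatigue}}
\end{aligned}
```

The interaction hypothesis states that the change from Non_Fatigue to Fatigue is the same in both groups.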
bxp <- ggboxplot(
hip, x = "Group", y = "Velocity",
color = "Condition", palette = "jco"
)
bxp
# Create boxplot and highlight paired data points
bxp <- ggpaired(
hip, x = "Condition", y = "Velocity", id = "id",
line.color = "gray", linetype = "dashed"
)
bxp
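The mixed ANOVA makes the following assumptions about the data, which are checked in turn below: no significant outliers in any cell of the design, approximately normally distributed data in each cell, homogeneity of variances of the between-subjects factor at each level of the within-subjects factor, and homogeneity of covariances across groups.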
hip %>%
group_by(Group,Condition) %>%
identify_outliers(Velocity)
## # A tibble: 1 × 6
## Group Condition id Velocity is.outlier is.extreme
## <chr> <fct> <fct> <dbl> <lgl> <lgl>
## 1 LOW Non_Fatigue 8 0.3 TRUE FALSE
hip %>%
group_by(Group,Condition) %>%
shapiro_test(Velocity)
## # A tibble: 4 × 5
## Group Condition variable statistic p
## <chr> <fct> <chr> <dbl> <dbl>
## 1 HIGH Fatigue Velocity 0.957 0.729
## 2 HIGH Non_Fatigue Velocity 0.925 0.359
## 3 LOW Fatigue Velocity 0.952 0.690
## 4 LOW Non_Fatigue Velocity 0.927 0.418
Velocity was approximately normally distributed in each Group by Condition cell, as assessed by the Shapiro-Wilk test (p > 0.05 in all four cells).
## Homogeneity of variance assumption
hip %>%
group_by(Condition)%>%
levene_test(Velocity ~ Group)
## # A tibble: 2 × 5
## Condition df1 df2 statistic p
## <fct> <int> <int> <dbl> <dbl>
## 1 Fatigue 1 19 0.331 0.572
## 2 Non_Fatigue 1 19 0.136 0.716
Levene’s test is not significant for either condition (p > 0.05). Therefore, we can assume homogeneity of variances between the groups.
## Homogeneity of covariances assumption
## Compute Box’s M-test:
box_m(hip[, "Velocity", drop = FALSE], hip$Group)
## # A tibble: 1 × 4
## statistic p.value parameter method
## <dbl> <dbl> <dbl> <chr>
## 1 0.202 0.653 1 Box's M-test for Homogeneity of Covariance Matric…
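There was homogeneity of covariances, as assessed by Box’s test of equality of covariance matrices (p = 0.653). Because this test is highly sensitive, a strict threshold of p < 0.001 is commonly used to flag a violation.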
# Compute ANOVA
res.aov <- anova_test(data = hip, dv = Velocity, wid = id,
                      between = Group, within = Condition)
res.aov
## ANOVA Table (type III tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 Group 1 19 0.735 0.402 0.033
## 2 Condition 1 19 5.975 0.024 * 0.038
## 3 Group:Condition 1 19 0.545 0.470 0.004
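where DFn and DFd are the numerator and denominator degrees of freedom, F is the F-statistic, p is the p-value, and ges is the generalized eta squared (effect size).
From the output above, there is no statistically significant two-way interaction between Group and Condition on Velocity, F(1, 19) = 0.545, p = 0.470. The main effect of Condition is significant, F(1, 19) = 5.975, p = 0.024, whereas the main effect of Group is not, F(1, 19) = 0.735, p = 0.402.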
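The F-values in the ANOVA table can be cross-checked by hand. With only two groups and two conditions, each test reduces to a t-test on a per-subject summary: subject means carry the between-subjects information and Fatigue minus Non_Fatigue differences carry the within-subjects information. The following is a minimal base R sketch under that assumption (it would not generalise to more than two levels), using the demo data in its original wide form; the variable names are illustrative.

```r
# Demo data from this tutorial, one value per subject (10 LOW, then 11 HIGH)
group <- rep(c("LOW", "HIGH"), times = c(10, 11))
non_fatigue <- c(0.54, 0.35, 0.69, 0.60, 0.50, 0.56, 0.72, 0.30, 0.56, 0.63,
                 0.40, 0.46, 0.35, 0.70, 0.54, 0.46, 0.35, 0.39, 0.62, 0.52, 0.45)
fatigue <- c(0.60, 0.38, 0.82, 0.50, 0.51, 0.68, 0.73, 0.38, 0.70, 0.54,
             0.62, 0.37, 0.32, 0.85, 0.73, 0.49, 0.56, 0.29, 0.79, 0.54, 0.48)

subj_mean <- (non_fatigue + fatigue) / 2  # between-subjects information
subj_diff <- fatigue - non_fatigue        # within-subjects information

# Group main effect: pooled-variance t-test on the subject means
F_group <- unname(t.test(subj_mean ~ group, var.equal = TRUE)$statistic)^2

# Interaction: pooled-variance t-test on the difference scores
F_interaction <- unname(t.test(subj_diff ~ group, var.equal = TRUE)$statistic)^2

# Condition main effect (type III): tests whether the unweighted average
# of the two group mean differences is zero
n <- tapply(subj_diff, group, length)
m <- tapply(subj_diff, group, mean)
v <- tapply(subj_diff, group, var)
s2_pooled <- sum((n - 1) * v) / (sum(n) - 2)
F_condition <- mean(m)^2 / (s2_pooled * sum(1 / n) / 4)

round(c(Group = F_group, Condition = F_condition,
        Interaction = F_interaction), 3)
```

The three values should agree with the F column of the ANOVA table above up to rounding, each on 1 and 19 degrees of freedom.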
## Alternatively, the same model can be written with a formula:
res.aov3 <- hip %>%
  anova_test(Velocity ~ Group * Condition + Error(id / Condition))
res.aov3
## ANOVA Table (type III tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 Group 1 19 0.735 0.402 0.033
## 2 Condition 1 19 5.975 0.024 * 0.038
## 3 Group:Condition 1 19 0.545 0.470 0.004
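## For comparison: a two-way between-subjects ANOVA that ignores the repeated measures (all 42 observations treated as independent)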
res.aov2 <- hip %>% anova_test(Velocity ~ Group * Condition)
res.aov2
## ANOVA Table (type II tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 Group 1 38 1.286 0.264 0.033
## 2 Condition 1 38 1.544 0.222 0.039
## 3 Group:Condition 1 38 0.136 0.714 0.004
# Visual report
# Show the report for the within-subject variable, here "Condition"
# Corresponding to the row number 2 in the ANOVA table output
bxp +
labs(subtitle = get_test_label(res.aov, row = 2, detailed = TRUE))
Performing pairwise paired t-tests
# pairwise comparisons
pwc <- hip %>%
group_by(Group) %>%
pairwise_t_test(
Velocity ~ Condition, paired = TRUE,
p.adjust.method = "bonferroni"
)
pwc
## # A tibble: 2 × 11
## Group .y. group1 group2 n1 n2 stati…¹ df p p.adj p.adj…²
## * <chr> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 HIGH Velocity Fatigue Non_Fati… 11 11 2.02 10 0.071 0.071 ns
## 2 LOW Velocity Fatigue Non_Fati… 10 10 1.45 9 0.18 0.18 ns
## # … with abbreviated variable names ¹statistic, ²p.adj.signif
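In the grouped call above, the Bonferroni adjustment is applied separately within each group, and each group contains a single comparison, which is why p.adj equals p in the output. If the two tests were instead treated as one family of comparisons, a minimal sketch of the correction with base R's `p.adjust()` would look like this (p-values copied from the output above; `p_raw` is an illustrative name):

```r
# Unadjusted p-values of the two within-group paired t-tests (from the output)
p_raw <- c(HIGH = 0.071, LOW = 0.18)

# Bonferroni correction across both tests: each p-value is multiplied by
# the number of tests (2) and capped at 1
p_family <- p.adjust(p_raw, method = "bonferroni")
p_family
```

Neither comparison is significant before or after this correction, so the conclusion does not change.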
However, because the interaction was not significant, the main effects can be interpreted on their own. Only Condition was significant, so the comparison below pools both groups.
# pairwise comparisons for condition only
pwc <- hip %>%
pairwise_t_test(
Velocity ~ Condition, paired = TRUE,
p.adjust.method = "bonferroni"
)
pwc
## # A tibble: 1 × 10
## .y. group1 group2 n1 n2 statistic df p p.adj p.adj.s…¹
## * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 Velocity Fatigue Non_Fatigue 21 21 2.51 20 0.021 0.021 *
## # … with abbreviated variable name ¹p.adj.signif
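For reference, the same overall comparison can be sketched with base R's `t.test()`. This assumes the two vectors are in the same subject order, which holds for the demo data as entered at the start of the tutorial; the object name `res_paired` is illustrative.

```r
# Demo data from this tutorial, one value per subject, same subject order
non_fatigue <- c(0.54, 0.35, 0.69, 0.60, 0.50, 0.56, 0.72, 0.30, 0.56, 0.63,
                 0.40, 0.46, 0.35, 0.70, 0.54, 0.46, 0.35, 0.39, 0.62, 0.52, 0.45)
fatigue <- c(0.60, 0.38, 0.82, 0.50, 0.51, 0.68, 0.73, 0.38, 0.70, 0.54,
             0.62, 0.37, 0.32, 0.85, 0.73, 0.49, 0.56, 0.29, 0.79, 0.54, 0.48)

# Paired t-test of Fatigue against Non_Fatigue across all 21 subjects
res_paired <- t.test(fatigue, non_fatigue, paired = TRUE)

res_paired$statistic  # t statistic
res_paired$parameter  # degrees of freedom
res_paired$p.value    # unadjusted p-value
res_paired$estimate   # mean of the paired differences
```

With a single comparison, the Bonferroni-adjusted p-value is the same as the unadjusted one, which matches the p and p.adj columns in the output above.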
The key features of Mixed ANOVA are:
In summary, Mixed ANOVA is a powerful tool for analyzing data with both between-subjects and within-subjects factors, making it particularly suitable for experiments where you want to examine how different conditions affect participants over time or across different groups. It allows researchers to uncover nuanced insights into the combined effects of various factors on the dependent variable, facilitating a more comprehensive understanding of the underlying relationships in the data.