Non-parametric Alternative for Two-way ANOVA with repeated measures

In this notebook, I have used two approaches for the non-parametric alternative of the two-way ANOVA and why I used both:

Aligned Rank Transform (ART) ANOVA is a non-parametric approach to factorial ANOVA that enables you to analyze the interaction as well as the main effects.
The nparLD package in R is used for performing non-parametric analysis of longitudinal data (i.e., repeated measures) or factorial experiments where assumptions like normality and homogeneity of variance are violated.

Importing Libraries

library(haven)

## Warning: package 'haven' was built under R version 4.4.3

library(ez)

## Warning: package 'ez' was built under R version 4.4.3

library(tidyverse)

## Warning: package 'ggplot2' was built under R version 4.4.3

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ggpubr)

## Warning: package 'ggpubr' was built under R version 4.4.3

library(rstatix)

## Warning: package 'rstatix' was built under R version 4.4.3

## 
## Attaching package: 'rstatix'
## 
## The following object is masked from 'package:stats':
## 
##     filter

library(ARTool)

## Warning: package 'ARTool' was built under R version 4.4.3

library(emmeans)

## Warning: package 'emmeans' was built under R version 4.4.3

## Welcome to emmeans.
## Caution: You lose important information if you filter this package's results.
## See '? untidy'

library(nparLD)

## Warning: package 'nparLD' was built under R version 4.4.3

## Loading required package: MASS

## Warning: package 'MASS' was built under R version 4.4.3

## 
## Attaching package: 'MASS'
## 
## The following object is masked from 'package:rstatix':
## 
##     select
## 
## The following object is masked from 'package:dplyr':
## 
##     select

library(lme4)

## Warning: package 'lme4' was built under R version 4.4.3

## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## 
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack

Reading source files

paus_df <- read_sav("files/PAUS_English_Translated.sav")

Creating a copy of the mother file

paus_df_copy <- paus_df

Functions

remove_nas <- function(columns, data) {   
  return(data %>% filter(if_all(all_of(columns), ~ !is.na(.)))) 
} 
transform_numeric_to_categorical <- function(column, data) {          
  col_data <-data[[column]]   
  levels <- sort(unique(col_data))   
  label_vec <- as_factor(sort(unique(col_data)))   
  return(ordered(col_data, levels = levels, labels = label_vec)) 
} 
find_shapiro_normal_columns <- function(data) {   
  for (col_name in names(data)) {   
    if (is.numeric(data[[col_name]])) {     
      values <- na.omit(data[[col_name]])          
      if (length(values) >= 3 && length(unique(values)) > 1) {       
        shapiro_result <- shapiro.test(values)              
        if (shapiro_result$p.value > 0.05) {         
          cat("\nColumn:", col_name,             
              "\nW =", round(shapiro_result$statistic, 4),             
              "p-value =", round(shapiro_result$p.value, 4), "\n")       
        }     
      }   
    } 
  } 
}

Converting to Long format

data_long <- paus_df_copy %>%
  gather(key = "time", value = "sleep_sufficiency_score", 
        recovery1.1, recovery1.2, recovery1.3) %>%
  convert_as_factor(sysUserID, chronotyp_tric.1, time)
data_long$sleep_sufficiency_score <- as.numeric(data_long$sleep_sufficiency_score)
data_long <- remove_nas(c("sysUserID", "chronotyp_tric.1", "time","sleep_sufficiency_score"),data_long)
data_long <- data_long[c("sysUserID", "chronotyp_tric.1", "time","sleep_sufficiency_score")]
data_long <- data_long %>%
  group_by(sysUserID) %>%
  filter(n_distinct(time) == 3) %>%
  ungroup()
data_long <- droplevels(data_long)

ART ANOVA

Aligned Rank Transform (ART) ANOVA is a non-parametric approach to factorial ANOVA that enables you to analyze the interaction as well as the main effects.

Reference for the formula used: Aligned Rank Transform

model <- art(sleep_sufficiency_score ~ (time*chronotyp_tric.1) + Error(sysUserID), data = data_long)
summary(model)

## Aligned Rank Transform of Factorial Model
## 
## Call:
## art(formula = sleep_sufficiency_score ~ (time * chronotyp_tric.1) + 
##     Error(sysUserID), data = data_long)
## 
## Column sums of aligned responses (should all be ~0):
##                  time      chronotyp_tric.1 time:chronotyp_tric.1 
##                     0                     0                     0 
## 
## F values of ANOVAs on aligned responses not of interest (should all be ~0):
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0       0       0       0       0

Because the p-value is < 0.05, we can reject the null hypothesis that the mean sleep sufficiency score is the same for the 3 assessments.

To determine which groups contains significant differences, we can do a post-hoc test.For a Friedman Test, the appropriate post-hoc test is the pairwise Wilcoxon rank sum test with a bonferroni correction.

Reference: How to Perform the Friedman Test in R

anova(model)

## Analysis of Variance of Aligned Rank Transformed Data
## 
## Table Type: Repeated Measures Analysis of Variance Table (Type I) 
## Model: Repeated Measures (aov)
## Response: art(sleep_sufficiency_score)
## 
##                         Error Df Df.res F value   Pr(>F)   
## 1 time                  Withn  2    658 5.85397 0.003020 **
## 2 chronotyp_tric.1      syUID  2    329 3.79583 0.023454  *
## 3 time:chronotyp_tric.1 Withn  4    658 0.27428 0.894611   
## ---
## Signif. codes:   0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

An Aligned Rank Transform ANOVA was conducted to examine the effects of time, chronotype group, and their interaction on sleep sufficiency scores.

The results revealed a significant main effect of time, F(2, 658) = 5.75, p = .003, indicating that sleep sufficiency scores differed across time points. There was also a significant main effect of chronotype, F(2, 329) = 3.80, p = .023, suggesting that different chronotype groups had different average levels of sleep sufficiency. However, the interaction between time and chronotype was not significant, F(4, 658) = 0.27, p = .895, indicating that the pattern of change over time did not differ significantly between chronotype groups.

Post-hoc Test

The art.con() function in the ARTool package provides non-parametric pairwise comparisons (contrasts) for models fit with art().

art.con(model, "time")

## NOTE: Results may be misleading due to involvement in interactions

##  contrast                  estimate   SE  df t.ratio p.value
##  recovery1.1 - recovery1.2    -26.6 11.4 658  -2.325  0.0531
##  recovery1.1 - recovery1.3    -37.7 11.4 658  -3.302  0.0029
##  recovery1.2 - recovery1.3    -11.2 11.4 658  -0.977  0.5917
## 
## Results are averaged over the levels of: chronotyp_tric.1 
## P value adjustment: tukey method for comparing a family of 3 estimates

art.con(model, "chronotyp_tric.1")

## NOTE: Results may be misleading due to involvement in interactions

##  contrast                              estimate   SE  df t.ratio p.value
##  chronotyp_tric.11 - chronotyp_tric.12    0.921 35.6 329   0.026  0.9996
##  chronotyp_tric.11 - chronotyp_tric.13   85.165 33.7 329   2.529  0.0318
##  chronotyp_tric.12 - chronotyp_tric.13   84.244 37.8 329   2.228  0.0680
## 
## Results are averaged over the levels of: time 
## P value adjustment: tukey method for comparing a family of 3 estimates

The post-hoc test results are similar to the post-hoc results from the two-way ANOVA with repeated measures I have conducted.

Emmeans

The emmean stands for estimated marginal means (aka least square means or adjusted means) and is the model-adjusted average for each group level (e.g., time point or chronotype), accounting for other variables in the model.

Higher emmean = higher predicted value for your outcome (here: sleep sufficiency score)

mod_time     <- artlm(model, "time")
mod_chrono   <- artlm(model, "chronotyp_tric.1")
emmeans(mod_time, pairwise ~ time, adjust = "tukey")

## NOTE: Results may be misleading due to involvement in interactions

## $emmeans
##  time        emmean   SE  df lower.CL upper.CL
##  recovery1.1    477 15.9 471      446      508
##  recovery1.2    504 15.9 471      472      535
##  recovery1.3    515 15.9 471      484      546
## 
## Results are averaged over the levels of: chronotyp_tric.1 
## Warning: EMMs are biased unless design is perfectly balanced 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                  estimate   SE  df t.ratio p.value
##  recovery1.1 - recovery1.2    -26.6 11.4 658  -2.325  0.0531
##  recovery1.1 - recovery1.3    -37.7 11.4 658  -3.302  0.0029
##  recovery1.2 - recovery1.3    -11.2 11.4 658  -0.977  0.5917
## 
## Results are averaged over the levels of: chronotyp_tric.1 
## P value adjustment: tukey method for comparing a family of 3 estimates

emmeans(mod_chrono, pairwise ~ chronotyp_tric.1, adjust = "tukey")

## NOTE: Results may be misleading due to involvement in interactions

## $emmeans
##  chronotyp_tric.1 emmean   SE  df lower.CL upper.CL
##  1                   527 24.1 329      480      575
##  2                   526 26.0 329      475      578
##  3                   442 25.2 329      393      492
## 
## Results are averaged over the levels of: time 
## Warning: EMMs are biased unless design is perfectly balanced 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                              estimate   SE  df t.ratio p.value
##  chronotyp_tric.11 - chronotyp_tric.12    0.921 35.6 329   0.026  0.9996
##  chronotyp_tric.11 - chronotyp_tric.13   85.165 33.7 329   2.529  0.0318
##  chronotyp_tric.12 - chronotyp_tric.13   84.244 37.8 329   2.228  0.0680
## 
## Results are averaged over the levels of: time 
## P value adjustment: tukey method for comparing a family of 3 estimates

Interpretations

estimate is the Pairwise difference
Sleep sufficiency significantly improved from recovery1.1 to recovery1.3 (p = .0029, estimate = -11.2 for group recovery 1.1 - recovery1.3).
No real difference between recovery1.2 and recovery1.3.
Evening Person Chronotype has significantly lower (estimate = 85.165 for group chronotyp_tric.11 (morningness) - chronotyp_tric.13 (eveningness)) sleep sufficiency compared to Morning Person Chronotype (p = .0318).
Evening Person Chronotype is also lower (estimate = 84.244 for group chronotyp_tric.12 (neither) - chronotyp_tric.13 (eveningness)) than Morning Person Chronotype, but this is only marginal (p = .068).
No difference between Types 1 and 2.

Warning Causes

Grouping Balance

table(data_long$time, data_long$chronotyp_tric.1)

##              
##                 1   2   3
##   recovery1.1 140  87 105
##   recovery1.2 140  87 105
##   recovery1.3 140  87 105

This indicates an unbalance across groups (chronotype levels), which may cause a warning on the results. This can result in imperfect alignment when estimating interaction effects.

Interactions

mod_check <- lmer(sleep_sufficiency_score ~ time * chronotyp_tric.1 + (1 | sysUserID), data = data_long) 
anova(mod_check)

## Analysis of Variance Table
##                       npar  Sum Sq Mean Sq F value
## time                     2 1128.13  564.06  4.9498
## chronotyp_tric.1         2  711.68  355.84  3.1226
## time:chronotyp_tric.1    4  210.83   52.71  0.4625

Ideal result is for the groups F value to not be significant.
If there was a significant F value here, it may also raise a warning on the ART model results.

Non-parametric Analysis of Longitudinal Data

The nparLD package in R is used for performing non-parametric analysis of longitudinal data (i.e., repeated measures) or factorial experiments where assumptions like normality and homogeneity of variance are violated. It’s based on rank-based methods.

nparLD(
  sleep_sufficiency_score ~ time * chronotyp_tric.1,
  subject = data_long$sysUserID,
  data = data_long,
  description = TRUE
)

##  Total number of observations:  996 
##  Total number of subjects:   332 
##  Total number of missing observations:  0 
## 
##  Class level information 
##  ----------------------- 
##  Levels of time (sub-plot factor time) :  3 
##  Levels of chronotyp_tric.1 (whole-plot factor group) :  3 
## 
##  Abbreviations 
##  ----------------------- 
##  RankMeans = Rank means
##  Nobs = Number of observations
##  RTE = Relative treatment effect
##  case2x2 = tests for 2-by-2 design
##  Wald.test = Wald-type test statistic
##  ANOVA.test = ANOVA-type test statistic with Box approximation
##  ANOVA.test.mod.Box = modified ANOVA-type test statistic with Box approximation
##  Wald.test.time = Wald-type test statistic for simple time effect
##  ANOVA.test.time = ANOVA-type test statistic for simple time effect
##  N = Standard Normal Distribution N(0,1)
##  T = Student's T distribution with respective degrees of freedom
##  pattern.time (time effects) = Test against patterned alternatives in time using normal distribution ( no pattern specified ) 
##  pair.comparison = Tests for pairwise comparisions (without specifying a pattern) 
##  pattern.pair.comparison = Test for pairwise comparisons with patterned alternatives in time ( no pattern specified ) 
##  pattern.group (group effects) = Test against patterned alternatives in group ( no pattern specified ) 
##  covariance = Covariance matrix 
##  Note: The description output above will disappear by setting description=FALSE in the input. See the help file for details. 
## 
##  F1 LD F1 Model 
##  ----------------------- 
##  Check that the order of the time and group levels are correct.
##  Time level:   recovery1.1 recovery1.2 recovery1.3 
##  Group level:   1 2 3 
##  If the order is not correct, specify the correct order in time.order or group.order.

## Model: 
## F1 LD F1 Model 
##  
## Call: 
## sleep_sufficiency_score ~ time * chronotyp_tric.1
## 
## Wald-Type Statistc (WTS):
##                       Statistic df     p-value
## chronotyp_tric.1       7.956953  2 0.018714126
## time                  10.335620  2 0.005697031
## chronotyp_tric.1:time  1.291018  4 0.862897588
## 
## ANOVA-Type Statistc (ATS):
##                       Statistic       df     p-value
## chronotyp_tric.1      3.5081462 1.912707 0.031903136
## time                  5.5291804 1.988055 0.004043082
## chronotyp_tric.1:time 0.3370964 3.942431 0.850513084
## 
## Modified ANOVA-Type Statistic for the Whole-Plot Factors:
##                  Statistic      df1      df2    p-value
## chronotyp_tric.1  3.508146 1.912707 262.6609 0.03330941

The p-values from both Wald-Type Statictic and ANOVA-Type Statistic have similar significant variables (main effects from chronotyp_tric.1 and times).