Ecological Validity and the Statistical Modeling of Inter-Subject Variability

1. The Context: Ecology Meets Climbing Science

In the preprint “Arm position but not handedness modulates forearm oxygen recovery dynamics in rock climbers” (Štěpánová et al., 2026), the authors made a brilliant design choice: testing athletes in a \(180^\circ\) overhead position rather than supine. This introduces much-needed ecological validity to climbing science.

However, human physiology in extreme environments is “noisy.” The authors noted massive inter-subject variability—some climbers’ recovery slowed by \(3\times\), while others slowed by up to \(12.6\times\) (suggesting distinct “vasodilator phenotypes” or anatomical bottlenecks). Because the data was highly right-skewed, the authors utilized non-parametric tests (Kruskal-Wallis and Wilcoxon).

Limitation 1: Flattening Magnitude. Rank-based tests flatten absolute magnitude. A \(12.6\times\) slowdown is converted into a simple sequential rank, treating vital biological variance as statistical noise.

Limitation 2: Methodological Inconsistency. The analytical framework splits its assumptions. The authors utilized rank-based non-parametric tests (Kruskal-Wallis/Wilcoxon) for main effects, implying non-normal data. However, they concurrently relied on parametric math (Mean Squares from a repeated-measures ANOVA for ICC, and Hedges’ \(g\) for effect sizes) which inherently assume normality.

The Solution: In complex systems and ecology, we use Generalized Linear Mixed Models (GLMMs) as a unified analytical framework.

A Unified Distribution: By using a Gamma distribution, the GLMM mathematically respects the strictly positive, right-skewed nature of the data natively. It calculates the main effects, the absolute multiplier, and the variance components all under a single, consistent assumption.
Modeling Phenotypes: By using a Random Effect (1 | Subject_ID) for the individual athlete, we mathematically map their unique physiological baseline rather than burying it as error variance.
Consistent Reliability: Because the GLMM natively extracts variance components, it allows for a rigorous Intraclass Correlation (ICC) calculation directly from the model’s environment, entirely eliminating the need to fall back on an invalid ANOVA.

2. Data Preparation

We begin by loading the supplementary data and transforming it from a wide format to a hierarchical long format.

3. The Generalized Linear Mixed Model (GLMM)

We model the Half-Time Recovery (HTR) using a Gamma family with a log link. We include Angle and Ability_Group as fixed effects, and Subject_ID as a random intercept to account for individual baseline differences.

library(lme4)
# 1. Run the intercept-only model in lme4 to match
model_intercept <- glmer(
Value ~ Angle * Ability_Group + Limb + (1 | Subject_ID),
data = df_htr,
family = Gamma(link = "log")
)
summary(model_intercept)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: Value ~ Angle * Ability_Group + Limb + (1 | Subject_ID)
##    Data: df_htr
## 
##       AIC       BIC    logLik -2*log(L)  df.resid 
##    1288.0    1317.8    -635.0    1270.0       195 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5807 -0.5307 -0.1386  0.3450  5.4035 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Subject_ID (Intercept) 0.07091  0.2663  
##  Residual               0.12244  0.3499  
## Number of obs: 204, groups:  Subject_ID, 51
## 
## Fixed effects:
##                         Estimate Std. Error t value Pr(>|z|)    
## (Intercept)              2.30051    0.14484  15.883   <2e-16 ***
## Angle180                 0.87670    0.09146   9.585   <2e-16 ***
## Ability_Group1           0.03253    0.17425   0.187    0.852    
## Ability_Group2           0.10468    0.17560   0.596    0.551    
## LimbN                   -0.02878    0.03991  -0.721    0.471    
## Angle180:Ability_Group1  0.17184    0.11124   1.545    0.122    
## Angle180:Ability_Group2 -0.07200    0.11169  -0.645    0.519    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Ang180 Abl_G1 Abl_G2 LimbN  A180:A_G1
## Angle180    -0.309                                      
## Abilty_Grp1 -0.815  0.257                               
## Abilty_Grp2 -0.809  0.255  0.672                        
## LimbN       -0.141 -0.009  0.004  0.001                 
## Ang180:A_G1  0.254 -0.821 -0.313 -0.210  0.005          
## Ang180:A_G2  0.251 -0.818 -0.211 -0.312  0.015  0.673

model_slope <- glmer(
  Value ~ Angle * Ability_Group + Limb + (1 + Angle | Subject_ID),
  data = df_htr,
  family = Gamma(link = "log")
)
summary(model_slope)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: Value ~ Angle * Ability_Group + Limb + (1 + Angle | Subject_ID)
##    Data: df_htr
## 
##       AIC       BIC    logLik -2*log(L)  df.resid 
##    1172.7    1209.2    -575.4    1150.7       193 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.5366 -0.3995 -0.1034  0.4561  3.7847 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev. Corr 
##  Subject_ID (Intercept) 0.04317  0.2078        
##             Angle180    0.12248  0.3500   -0.33
##  Residual               0.05670  0.2381        
## Number of obs: 204, groups:  Subject_ID, 51
## 
## Fixed effects:
##                         Estimate Std. Error t value Pr(>|z|)    
## (Intercept)              2.27650    0.10287  22.130  < 2e-16 ***
## Angle180                 0.86200    0.18512   4.657 3.22e-06 ***
## Ability_Group1           0.02773    0.12401   0.224    0.823    
## Ability_Group2           0.11544    0.12499   0.924    0.356    
## LimbN                   -0.02646    0.02549  -1.038    0.299    
## Angle180:Ability_Group1  0.16895    0.22488   0.751    0.452    
## Angle180:Ability_Group2 -0.07124    0.22679  -0.314    0.753    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Ang180 Abl_G1 Abl_G2 LimbN  A180:A_G1
## Angle180    -0.163                                      
## Abilty_Grp1 -0.816  0.136                               
## Abilty_Grp2 -0.810  0.135  0.672                        
## LimbN       -0.125 -0.004  0.002  0.000                 
## Ang180:A_G1  0.134 -0.823 -0.165 -0.111  0.003          
## Ang180:A_G2  0.133 -0.816 -0.111 -0.165  0.005  0.672

Methodological Note: > Please note that passing two fitted mixed models into R’s anova() function does not perform a traditional Analysis of Variance on the raw data (such as R’s aov() function). Instead, it executes a Likelihood Ratio Test (LRT). This test formally compares the two architectures to determine if the more complex model (adding individual random slopes) provides a statistically significant improvement in explaining the data compared to the baseline model.

anova(model_intercept, model_slope)

## Data: df_htr
## Models:
## model_intercept: Value ~ Angle * Ability_Group + Limb + (1 | Subject_ID)
## model_slope: Value ~ Angle * Ability_Group + Limb + (1 + Angle | Subject_ID)
##                 npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)    
## model_intercept    9 1288.0 1317.8 -634.99    1270.0                         
## model_slope       11 1172.7 1209.2 -575.35    1150.7 119.27  2  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The Three Methodological Upgrades:

The Absolute Multiplier: The Angle180 estimate quantifies the exact positional penalty. Exponentiating the log-link coefficient (\(\exp(0.862) = 2.37\)) reveals a definitive 2.37x penalty multiplier across all athletes’ baseline recovery times.
Mathematical Proof of Anatomical Bottlenecks: By incorporating a Random Slope (1 + Angle | Subject_ID), the model formally tests the authors’ “thoracic outlet compression” hypothesis. The variance in how individuals react to the 180° position (0.122) is nearly three times larger than their baseline variance (0.043). A Likelihood Ratio Test confirms this fanning-out effect is highly significant (\(p < 2.2 \times 10^{-16}\)), proving that extreme “non-compensator” outliers are not statistical noise, but are driven by individualized structural constraints triggered by arm elevation.
The Robust Null: Even under this highly sensitive random-slope model, the interaction terms (Angle180:Ability_Group) remain statistically non-significant. This mathematically bulletproofs the authors’ controversial conclusion that elite athletes do not possess superior forearm hemodynamics.

4. Visualizing the Signal in the Noise

To truly understand the power of this approach, we must visualize the “shadowy topography” of the raw intra-subject variance underneath the model’s structural predictions.

Visualizing the Data: Raw Noise vs. Modeled Signal

To fully capture both the biological reality of the experiment and the structural findings of the GLMM, we present two distinct visualizations:

Figure 1: Raw Intra-Subject Trajectories This plot maps the unedited, recorded recovery times. Because each athlete was measured on both limbs, there are two distinct trajectories per person. This captures the natural biological “noise”—the minor, random physiological variations in recovery speed between a climber’s left and right forearm during the actual test.

Figure 2: Modeled Individual Trajectories (Random Slopes) This plot maps the conditional predictions generated directly by the statistical model. Because we instructed the model to calculate the positional penalty at the individual level using a random slope (1 + Angle | Subject_ID)—and because the fixed effect difference between limbs proved mathematically negligible—the model isolates a single, unified “penalty trajectory” for each athlete. This filters out the random limb-to-limb noise and exposes the pure physiological signal, visually confirming the severe “fanning out” effect. While physical anatomy was not directly measured, this extreme mathematical variance in how individuals react to arm elevation is exactly the signature we would expect to see if the authors’ hypothesis regarding individualized “anatomical bottlenecks” is correct.

ABOUT THE AUTHOR & THE INSIGHT DATALAB: Dr. Federico J. Villatoro Paz applies an “Ecologist’s Mindset” to sports performance, modeling athletes as complex adaptive systems interacting with chaotic environments.

Research Focus & The V-Rank Engine: Specializing in hierarchical mixed modeling and Bayesian forecasting, his work is grounded in Simon Levin’s (1992) “Problem of Pattern and Scale.” He is the developer of the V-Rank Engine (Patent-Pending), a talent detection and ranking system, an analytical framework that models the gap between an athlete’s raw physiological capacity (micro-scale) and their real-world competitive execution (macro-scale) by treating the competition environment itself as an active, measurable filter.