Mental Rotation Analysis
Sample size by site
| site | n_trials | n_users | n_runs |
|---|---|---|---|
| pilot_mpieva_de | 17635 | 245 | 374 |
| pilot_uniandes_co | 7931 | 248 | 248 |
| pilot_western_ca | 6453 | 129 | 133 |
Raw data
(1) Ability - proportion correct
(2) Angle curves
IRT estimates
(3) Ability - thetas
(4) Difficulty by rotation angle (2PL scalar)
(5) Discrimination by rotation angle (2PL scalar)
Reaction time (RT)
(6) RT by age
(7) Accuracy by RT (raw scores) broken down by age group
(8) Accuracy by RT (IRT scores) broken down by age group
(9) Rotation angle by RT
(10) Rotation angle by RT and age
(11) Histogram - Reaction Time
PHASE 1: Diagnostic Checks (Understanding the Data)
Step 1: Document the 3D Selection Problem
Why: Need to establish that excluding 3D is justified, not arbitrary
A. Correlation: 3D exposure × ability
r = 0.644 (95% CI: [0.599, 0.685])
p < .001
→ INTERPRETATION: LARGE correlation - 3D strongly predicts ability
B. Mean ability by 3D exposure
Saw 3D: M = -0.07 (SD = 0.83, n = 534)
No 3D: M = -1.75 (SD = 0.57, n = 173)
Difference: 1.68 SD (Cohen's d = 2.36)
→ INTERPRETATION: HUGE effect - 3D exposure is not random
C. Selection mechanism (logistic regression)
Cumulative accuracy: OR = 3.32 per 100% accuracy
Trial position: OR = 2.03 per 10 trials
→ INTERPRETATION: 3D exposure is ADAPTIVE - depends on performance
VERDICT: 3D items CANNOT be used as a predictor
- Correlation r = 0.644 is too large to ignore
- Cohen's d = 2.36 is a huge effect size
- Adaptive selection (OR = 3.32) creates endogeneity
→ SOLUTION: Exclude 3D items from RT residualization
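The summary statistics in A and B can be checked by hand. The sketch below is a minimal check, not the original analysis code. It assumes the correlation was computed over all 707 person-runs (534 + 173; the report does not state the n used for the correlation) and uses a Fisher z interval. It also compares two common standardizers for Cohen's d: the reported d = 2.36 appears to correspond to the unweighted average of the two group variances, while a sample-size-weighted pooled SD gives a somewhat smaller value.

```r
# Summary statistics copied from Step 1 (A and B)
r  <- 0.644
n1 <- 534; m1 <- -0.07; s1 <- 0.83   # saw 3D
n0 <- 173; m0 <- -1.75; s0 <- 0.57   # no 3D
n  <- n1 + n0                        # assumption: correlation used all person-runs

# Fisher z confidence interval for the correlation
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci_r <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

# Cohen's d with two choices of standardizer
diff_m        <- m1 - m0
sd_unweighted <- sqrt((s1^2 + s0^2) / 2)
sd_pooled     <- sqrt(((n1 - 1) * s1^2 + (n0 - 1) * s0^2) / (n1 + n0 - 2))
d_unweighted  <- diff_m / sd_unweighted
d_pooled      <- diff_m / sd_pooled

print(round(ci_r, 3))
print(round(c(unweighted = d_unweighted, pooled = d_pooled), 2))
```

Under either standardizer the group difference exceeds two standard deviations, so the verdict above does not hinge on which formula was used.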
Step 2: Check Response Acceleration
Why: Need to know if we should control for trial_number in RT model
RESEARCH QUESTION: Do children speed up during the test?
Hypothesis: May be confounded by item difficulty changes
A. Simple linear model: log(RT) ~ trial_number
β = 0.00286 (p < .001)
B. Controlled model: log(RT) ~ trial_number + angle + 2D/3D
β_trial = -0.00352
✓ Include trial_number in RT residualization
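The sign change between models A and B is what one would expect if harder items become more frequent later in a run. The sketch below illustrates that mechanism with simulated data only. The column names, the effect sizes, and the rule that the probability of a 3D item rises with trial number are hypothetical choices for illustration, not values estimated from the study.

```r
set.seed(1)
n <- 4000

# Hypothetical design: later trials are more likely to be 3D items
trial_number <- sample(1:40, n, replace = TRUE)
is_3d        <- rbinom(n, size = 1, prob = trial_number / 40)
angle        <- sample(seq(0, 180, by = 20), n, replace = TRUE)

# True data-generating model: children speed up (negative trial effect),
# but 3D items and larger angles slow responses down
log_rt <- 0.8 + 1.0 * is_3d + 0.002 * angle - 0.01 * trial_number +
  rnorm(n, sd = 0.5)
sim <- data.frame(log_rt, trial_number, angle, is_3d)

# A. Simple model: the trial effect absorbs the item-difficulty drift
m_simple <- lm(log_rt ~ trial_number, data = sim)

# B. Controlled model: the trial effect net of angle and 2D/3D
m_ctrl <- lm(log_rt ~ trial_number + angle + is_3d, data = sim)

b_simple <- coef(m_simple)[["trial_number"]]
b_ctrl   <- coef(m_ctrl)[["trial_number"]]
print(c(simple = b_simple, controlled = b_ctrl))
```

In this simulation the uncontrolled slope is positive even though the true within-task trend is a speed-up, which mirrors the pattern reported above and motivates keeping trial_number in the residualization model alongside item characteristics.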
Step 3: Check Item Position Effects on Accuracy
Why: Items might be harder later independent of their difficulty
Model: accuracy ~ trial_number + angle + 2D/3D + person + item
β_position = -0.00096 (per trial)
→ INTERPRETATION: No strong position effects on accuracy
PHASE 2: Core Analysis
Step 4: Refit RT Model with Corrections (Option B: Simple Residuals)
Why: Incorporate all the diagnostics into one clean model using simple residualization.
======================================================================
STEP 4: CORRECTED RT RESIDUALIZATION (2D Items Only)
======================================================================
Sample for RT model:
- Trials: 19051 (2D items only)
- People: 621
Fitting: log(RT) ~ angle + trial_number + (1|item)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: log_rt ~ angle + trial_number + (1 | item)
Data: mrot_rt_2d
REML criterion at convergence: 35497.9
Scaled residuals:
Min 1Q Median 3Q Max
-4.2969 -0.6318 -0.0192 0.6590 2.8639
Random effects:
Groups Name Variance Std.Dev.
item (Intercept) 0.007461 0.08638
Residual 0.376188 0.61334
Number of obs: 19051, groups: item, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 8.167e-01 7.639e-02 5.973e+00 10.691 4.07e-05 ***
angle 2.609e-03 6.904e-04 5.742e+00 3.778 0.00998 **
trial_number -9.569e-03 4.930e-04 6.342e+03 -19.408 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) angle
angle -0.903
trial_numbr -0.130 -0.018
Person-level RT Residuals (Option B):
Mean = 0.008 (should be approx 0)
SD = 0.394 (This is the between-person standard deviation)
✓ RT residualization (Simple Method) complete
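Because the outcome is log(RT), the fixed effects in the summary above are easier to read as multiplicative changes in RT. The following back-of-the-envelope conversion uses only numbers copied from the model output. It assumes that angle is coded in degrees, which the output does not state, and the helper name pct_change is introduced here for illustration.

```r
# Estimates copied from the lmer summary in Step 4
b_angle   <- 2.609e-03
b_trial   <- -9.569e-03
var_item  <- 0.007461
var_resid <- 0.376188

# Percent change in RT implied by a log-scale coefficient over `units` steps
pct_change <- function(beta, units = 1) 100 * (exp(beta * units) - 1)

pct_per_10_trials <- pct_change(b_trial, 10)
pct_per_90_angle  <- pct_change(b_angle, 90)  # assumption: angle is in degrees

# Share of log-RT variance attributable to the item random intercept
icc_item <- var_item / (var_item + var_resid)

print(round(c(per_10_trials = pct_per_10_trials,
              per_90_angle  = pct_per_90_angle), 1))
print(round(icc_item, 3))
```

Read this way, responses get roughly 9% faster for every 10 trials, a 90-unit increase in angle goes with roughly 26% longer RTs, and the item intercept accounts for about 2% of the log-RT variance once angle is in the model.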
Step 5: Compare Original vs Corrected RT Residuals
======================================================================
STEP 5: COMPARING ORIGINAL VS CORRECTED RESIDUALS
======================================================================
Correlation: Original (2D+3D) vs Corrected (2D Only): r = 0.412
PHASE 3: Final Bivariate Model
Step 6: Final Bivariate Model with Corrected Residuals
======================================================================
STEP 6: FINAL BIVARIATE MODEL
======================================================================
Final sample: 706 person-runs
Fitting bivariate model...
Formula: mvbind(ability, rt_resid) ~ site + age
Family: MV(gaussian, gaussian)
Links: mu = identity
mu = identity
Formula: ability ~ site + age
rt_resid ~ site + age
Data: model_data_corrected (Number of observations: 706)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
ability_Intercept -1.86 0.13 -2.10 -1.61 1.00 8218 3030
rtresid_Intercept 0.47 0.06 0.35 0.59 1.00 7491 3230
ability_sitepilot_uniandes_co -1.32 0.07 -1.45 -1.19 1.00 7105 3293
ability_sitepilot_western_ca -0.43 0.08 -0.59 -0.28 1.00 7212 3418
ability_age 0.20 0.01 0.17 0.22 1.00 8980 3240
rtresid_sitepilot_uniandes_co -0.14 0.03 -0.20 -0.08 1.00 7518 3134
rtresid_sitepilot_western_ca -0.03 0.04 -0.10 0.05 1.00 6786 2821
rtresid_age -0.05 0.01 -0.06 -0.03 1.00 7579 3304
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_ability 0.77 0.02 0.73 0.81 1.00 10435 3131
sigma_rtresid 0.38 0.01 0.36 0.40 1.00 7655 3018
Residual Correlations:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
rescor(ability,rtresid) 0.10 0.04 0.03 0.17 1.00 8539 3357
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Joint Analysis of Spatial Ability and Processing Speed
Posterior estimates from Bivariate Bayesian Regression
| Outcome | Predictor | Beta | Lower 95% CI | Upper 95% CI | Finding |
|---|---|---|---|---|---|
| Ability (SD) | Site: Colombia (vs. DE) | −1.32 | −1.45 | −1.19 | Significantly Lower Accuracy |
| Ability (SD) | Site: Canada (vs. DE) | −0.43 | −0.59 | −0.28 | Moderately Lower Accuracy |
| Ability (SD) | Age (per year) | 0.20 | 0.17 | 0.22 | Developmental Growth |
| Speed (log s) | Site: Colombia (vs. DE) | −0.14 | −0.20 | −0.08 | Significantly Faster (13%) |
| Speed (log s) | Site: Canada (vs. DE) | −0.03 | −0.10 | 0.05 | No Difference in Speed |
| Speed (log s) | Age (per year) | −0.05 | −0.06 | −0.03 | Developmental Speedup |
Results: Joint Analysis of Ability and Processing Speed
To investigate the relationship between spatial ability and processing speed, we fit a bivariate Bayesian regression model estimating latent spatial ability (\(\theta\)) and residualized reaction times (RT).
Cross-Cultural Differences
Results contradicted a “cultural caution” hypothesis. Relative to the German baseline, Colombian children showed significantly lower ability (\(\beta =\) -1.32 SD, 95% CI [-1.45, -1.19]) yet responded significantly faster (\(\beta =\) -0.14 log-seconds, 95% CI [-0.20, -0.08]; 13.0% faster). Canadian children displayed lower ability (\(\beta =\) -0.43 SD, 95% CI [-0.59, -0.28]), with response times that did not differ credibly from the German group (\(\beta =\) -0.03 log-seconds, 95% CI [-0.10, 0.05]; the interval spans zero). Thus, the lowest-performing group was also the fastest-responding group.
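The percentage figures for speed follow from exponentiating the log-second coefficients. The sketch below applies that conversion to the rounded estimates and interval bounds from the model summary; the helper name to_pct is introduced here for illustration. Because exp() is monotone, transforming the interval bounds directly is valid for quantile-based intervals.

```r
# Posterior means and 95% CI bounds for the rt_resid sub-model (log-seconds),
# copied from the rounded brms summary
rt_coefs <- data.frame(
  term  = c("Colombia vs. DE", "Canada vs. DE", "Age (per year)"),
  est   = c(-0.14, -0.03, -0.05),
  lower = c(-0.20, -0.10, -0.06),
  upper = c(-0.08,  0.05, -0.03)
)

# Percent change in RT implied by a difference on the log scale
to_pct <- function(x) 100 * (exp(x) - 1)

rt_pct <- transform(rt_coefs,
                    est   = to_pct(est),
                    lower = to_pct(lower),
                    upper = to_pct(upper))
print(rt_pct, digits = 3)
```

Starting from the rounded coefficient of -0.14, this gives a value just above 13%; small differences from a reported percentage would be expected if the original figure was computed from the unrounded posterior mean.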
Developmental Trajectories
Age robustly predicted simultaneous gains in accuracy and efficiency. Each additional year was associated with increased spatial ability (\(\beta =\) 0.20 SD, 95% CI [0.17, 0.22]) and faster response times (\(\beta =\) -0.05 log-seconds, 95% CI [-0.06, -0.03]).
The Speed-Accuracy Trade-off
We found a small positive residual correlation between ability and response time (\(r =\) 0.10, 95% CI [0.03, 0.17]). Since higher RT values indicate slower responses, this is consistent with a speed-accuracy trade-off: controlling for age and site, children who took more time on the task tended to achieve higher ability estimates.
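The residual correlation has a simple frequentist analogue: regress each outcome on site and age separately, then correlate the two sets of residuals. The sketch below demonstrates this with ordinary least squares on simulated data. It is a stand-in for the idea, not the brms model used in the report, and the sample size, age range, and the simulated residual correlation of 0.3 are hypothetical values chosen so the pattern is easy to see.

```r
set.seed(42)
n    <- 5000
site <- factor(sample(c("de", "co", "ca"), n, replace = TRUE),
               levels = c("de", "co", "ca"))
age  <- runif(n, 4, 12)   # hypothetical age range in years

# Two error terms with a known correlation rho
rho <- 0.3
e1  <- rnorm(n)
e2  <- rho * e1 + sqrt(1 - rho^2) * rnorm(n)

# Outcomes share predictors (site, age) and have correlated residuals
ability  <- -1.9 + 0.20 * age - 1.3 * (site == "co") - 0.4 * (site == "ca") +
  0.77 * e1
rt_resid <-  0.5 - 0.05 * age - 0.14 * (site == "co") + 0.38 * e2
sim <- data.frame(ability, rt_resid, site, age)

# Residualize each outcome on the same predictors, then correlate
res_ability <- resid(lm(ability ~ site + age, data = sim))
res_rt      <- resid(lm(rt_resid ~ site + age, data = sim))
r_resid     <- cor(res_ability, res_rt)
print(round(r_resid, 2))
```

The raw correlation between ability and rt_resid in such data also reflects the shared effects of age and site, which is why the residual correlation, rather than the raw one, is the quantity that speaks to a within-group trade-off.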
Developmental Trajectories and Strategic Trade-offs
Our analysis reveals that cognitive maturation drives simultaneous improvements in both the precision and efficiency of spatial reasoning. Age was a strong, positive predictor of spatial ability (\(\beta =\) 0.20 SD, 95% CI [0.17, 0.22]) and a significant negative predictor of response time (\(\beta =\) -0.05 log-seconds, 95% CI [-0.06, -0.03]). This indicates that as children grow older, they not only solve mental rotation tasks more accurately but also do so significantly faster.
Furthermore, we identified a small positive residual correlation between ability and response time (\(r =\) 0.10, 95% CI [0.03, 0.17]). Since higher RT values indicate slower responses, this finding is consistent with a functional speed-accuracy trade-off: independent of age and site, children who took more time on the task tended to achieve higher ability estimates. This suggests that successful performance in mental rotation relies partly on a deliberative strategy in which resisting the impulse to respond quickly allows for more successful spatial transformation.
Analysis 3: Strategy Effectiveness (Slope Differences)
To test whether the effectiveness of the “slow-down” strategy varies across cultures, we fit a Bayesian regression model predicting spatial ability from the interaction between site and processing speed (ability ~ age + site * rt_resid).
library(brms)

# Fit the interaction model.
# Question: does the slope of rt_resid -> ability differ by site?
m_strategy <- brm(
  ability ~ age + site * rt_resid,
  data = model_data_corrected,
  chains = 4,
  cores = 4,
  iter = 2000,
  seed = 123
)

# Summarize the fixed effects to inspect the site:rt_resid interaction terms
summary(m_strategy)

Developmental Trajectories and Strategic Trade-offs
Developmental Gains
Our analysis reveals that cognitive maturation drives simultaneous improvements in both precision and efficiency. Age was a robust predictor of spatial ability (\(\beta =\) 0.20 SD) and response time (\(\beta =\) -0.05 log-seconds). This confirms that as children grow older, they not only solve mental rotation tasks more accurately but also do so significantly faster.
Strategy Effectiveness
We identified a functional speed-accuracy trade-off that varies by context. While German children maintained high accuracy regardless of their speed, children in Colombia and Canada showed a significant “return on investment” for slowing down (Interaction \(\beta > 0.50\)). For these groups, resisting the impulse to respond quickly was strongly predictive of higher performance, suggesting that their lower average accuracy is partly attributable to a faster, less deliberative response style.
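With treatment coding and Germany as the reference level, the rt_resid coefficient in ability ~ age + site * rt_resid is the German slope, and each interaction term is the difference between another site's slope and the German one. The sketch below recovers per-site slopes from such a model. It uses lm on simulated data as a stand-in for the brms fit, and the site labels, sample size, and true slopes (0, 0.6, 0.5) are hypothetical.

```r
set.seed(7)
n    <- 3000
site <- factor(sample(c("de", "co", "ca"), n, replace = TRUE),
               levels = c("de", "co", "ca"))   # "de" is the reference level
age      <- runif(n, 4, 12)                    # hypothetical age range
rt_resid <- rnorm(n, sd = 0.4)

# Hypothetical site intercept shifts and site-specific RT slopes (de, co, ca)
site_shift <- c(0, -1.3, -0.4)[as.integer(site)]
true_slope <- c(0,  0.6,  0.5)[as.integer(site)]
ability <- -1.9 + 0.2 * age + site_shift + true_slope * rt_resid +
  rnorm(n, sd = 0.7)
sim <- data.frame(ability, age, site, rt_resid)

fit <- lm(ability ~ age + site * rt_resid, data = sim)
b   <- coef(fit)

# Simple slopes: reference slope plus the site-specific interaction term
slopes <- c(
  de = b[["rt_resid"]],
  co = b[["rt_resid"]] + b[["siteco:rt_resid"]],
  ca = b[["rt_resid"]] + b[["siteca:rt_resid"]]
)
print(round(slopes, 2))
```

With a Bayesian fit, the same sums would typically be formed draw by draw from the posterior so that each site's slope gets its own credible interval, rather than reading the interaction coefficients in isolation.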