Mental Rotation Analysis

Sample size by site

Mental Rotation: Counts by Site
site	n_trials	n_users	n_runs
pilot_mpieva_de	17635	245	374
pilot_uniandes_co	7931	248	248
pilot_western_ca	6453	129	133
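
As a quick consistency check, the counts above can be totaled and compared with the sample sizes that the model summary further down reports (707 observations, 574 users). The sketch below is illustrative only: it re-enters the table values by hand, assumes that each model observation corresponds to one run, and uses variable names that are not from the original analysis.

```python
# Counts re-entered from the table above: (n_trials, n_users, n_runs).
counts = {
    "pilot_mpieva_de": (17635, 245, 374),
    "pilot_uniandes_co": (7931, 248, 248),
    "pilot_western_ca": (6453, 129, 133),
}

total_trials = sum(c[0] for c in counts.values())
total_users = sum(c[1] for c in counts.values())
total_runs = sum(c[2] for c in counts.values())

# Sample sizes reported in the model summary (assumption: one observation = one run).
model_obs, model_users = 707, 574

runs_not_in_model = total_runs - model_obs
users_not_in_model = total_users - model_users

print("totals:", total_trials, total_users, total_runs)
print("runs not in model:", runs_not_in_model)
print("users not in model:", users_not_in_model)

# Per-site trials per run and runs per user.
for site, (trials, users, runs) in counts.items():
    print(site, round(trials / runs, 1), round(runs / users, 2))
```

Under that assumption, 48 runs and 48 users listed in the table do not enter the model. The report does not state the exclusion rule, so the reason is left open here.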

Raw data

(1) Ability - proportion correct

(2) Angle curves

IRT estimates

(3) Ability - thetas

(4) Difficulty by rotation angle (2PL scalar)

(5) Discrimination by rotation angle (2PL scalar)

Reaction time (RT)

(6) RT by age

(7) Accuracy by RT (raw scores) broken down by age group

(8) Accuracy by RT (IRT scores) broken down by age group

(9) Rotation angle by RT

(10) Rotation angle by RT and age

(11) Histogram - Reaction Time

Analysis 1: Accuracy-Time Bivariate Model

RQs

RQ1: Do children trade off speed and accuracy consistently?

RQ2: After controlling for RT, do site differences persist?

RQ3: How does the relationship change with age?

 Family: MV(gaussian, gaussian) 
  Links: mu = identity
         mu = identity 
Formula: ability ~ site + age + (1 | p | user_id) 
         logrt_mean ~ site + age + (1 | p | user_id) 
   Data: model_data (Number of observations: 707) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~user_id (Number of levels: 574) 
                                           Estimate Est.Error l-95% CI u-95% CI
sd(ability_Intercept)                          0.58      0.04     0.50     0.65
sd(logrtmean_Intercept)                        0.30      0.02     0.25     0.34
cor(ability_Intercept,logrtmean_Intercept)     0.40      0.08     0.23     0.56
                                           Rhat Bulk_ESS Tail_ESS
sd(ability_Intercept)                      1.01      361      789
sd(logrtmean_Intercept)                    1.01      302      752
cor(ability_Intercept,logrtmean_Intercept) 1.01      497     1189

Regression Coefficients:
                                Estimate Est.Error l-95% CI u-95% CI Rhat
ability_Intercept                  -1.91      0.14    -2.18    -1.65 1.00
logrtmean_Intercept                 1.40      0.07     1.26     1.55 1.00
ability_sitepilot_uniandes_co      -1.30      0.07    -1.44    -1.16 1.00
ability_sitepilot_western_ca       -0.42      0.08    -0.58    -0.26 1.00
ability_age                         0.20      0.01     0.18     0.23 1.00
logrtmean_sitepilot_uniandes_co    -0.37      0.04    -0.44    -0.30 1.00
logrtmean_sitepilot_western_ca     -0.17      0.04    -0.26    -0.09 1.00
logrtmean_age                      -0.01      0.01    -0.02     0.01 1.00
                                Bulk_ESS Tail_ESS
ability_Intercept                   1957     2686
logrtmean_Intercept                 1838     2524
ability_sitepilot_uniandes_co       1546     2230
ability_sitepilot_western_ca        1762     2551
ability_age                         2029     2734
logrtmean_sitepilot_uniandes_co     1785     2266
logrtmean_sitepilot_western_ca      1965     2324
logrtmean_age                       1752     2255

Further Distributional Parameters:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_ability       0.51      0.03     0.45     0.58 1.01      330      853
sigma_logrtmean     0.28      0.02     0.25     0.33 1.01      306      769

Residual Correlations: 
                          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
rescor(ability,logrtmean)     0.23      0.08     0.06     0.38 1.01      537
                          Tail_ESS
rescor(ability,logrtmean)     1383

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
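
To make the two correlation parameters in this summary concrete, the following is a minimal simulation sketch of the random part of the model: each child gets one correlated pair of intercepts (ability, mean log RT), and each run adds one correlated pair of residuals. It is written in Python with the standard library only, plugs in the posterior means as if they were known values, sets all fixed effects to zero, and uses hypothetical sample sizes. It is not the code that produced the fit above.

```python
import math
import random

random.seed(1)

# Posterior means copied from the summary above (point estimates only).
SD_PERSON = (0.58, 0.30)   # person-intercept SDs: ability, mean log RT
COR_PERSON = 0.40          # cor(ability_Intercept, logrtmean_Intercept)
SD_RESID = (0.51, 0.28)    # sigma_ability, sigma_logrtmean
COR_RESID = 0.23           # rescor(ability, logrtmean)


def draw_pair(sd_a, sd_b, rho):
    """Draw one zero-mean bivariate normal pair with the given SDs and correlation."""
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    return sd_a * z1, sd_b * (rho * z1 + math.sqrt(1 - rho**2) * z2)


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sizes, chosen large so the recovered correlations are stable.
N_USERS, RUNS_PER_USER = 20000, 2

person_a, person_rt, resid_a, resid_rt = [], [], [], []
for _user in range(N_USERS):
    # Person-level pair: shared by every run of the same child.
    u_a, u_rt = draw_pair(*SD_PERSON, COR_PERSON)
    person_a.append(u_a)
    person_rt.append(u_rt)
    for _run in range(RUNS_PER_USER):
        # Residual pair: specific to a single run.
        e_a, e_rt = draw_pair(*SD_RESID, COR_RESID)
        resid_a.append(e_a)
        resid_rt.append(e_rt)

person_cor = pearson(person_a, person_rt)
resid_cor = pearson(resid_a, resid_rt)
print(round(person_cor, 2), round(resid_cor, 2))
```

The two printed values should land close to 0.40 and 0.23. The point of the sketch is the distinction between the levels: the person-level correlation describes stable differences between children, and the residual correlation describes run-to-run deviations within a child.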

What the Results Tell Us:

RQ1: Do children trade off speed and accuracy consistently?

Yes

Person-level correlation: r = 0.40 (95% CI: 0.23–0.56)
- Children who are generally slower have higher ability
Residual correlation: r = 0.23 (95% CI: 0.06–0.38)
- Even within a child, runs with slower mean responses show higher ability
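
The two correlations can be combined with the variance components to see how much of each outcome varies between children and what overall correlation the model implies. The sketch below plugs the posterior means from the summary into the standard variance-decomposition formulas. It ignores posterior uncertainty, and the variable names are illustrative rather than taken from the analysis code.

```python
import math

# Posterior means from the model summary (point estimates; uncertainty is ignored).
sd_person_ability, sd_person_logrt, cor_person = 0.58, 0.30, 0.40
sigma_ability, sigma_logrt, cor_resid = 0.51, 0.28, 0.23

# Total variance of each outcome after removing site and age effects.
var_ability = sd_person_ability**2 + sigma_ability**2
var_logrt = sd_person_logrt**2 + sigma_logrt**2

# Share of that variance that lies between children.
icc_ability = sd_person_ability**2 / var_ability
icc_logrt = sd_person_logrt**2 / var_logrt

# The covariance adds up across the two levels.
cov_total = (cor_person * sd_person_ability * sd_person_logrt
             + cor_resid * sigma_ability * sigma_logrt)
cor_total = cov_total / math.sqrt(var_ability * var_logrt)

print(round(icc_ability, 2), round(icc_logrt, 2), round(cor_total, 2))
# prints: 0.56 0.53 0.32
```

On these point estimates, a little over half of the variance in each outcome is between children, and the implied overall correlation between ability and mean log RT, after removing site and age effects, is about 0.32.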

RQ2: After controlling for RT, do site differences persist?

Yes

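Colombia: 1.30 SD lower ability than Germany (95% CI: 1.16–1.44)
Canada: 0.42 SD lower ability than Germany (95% CI: 0.26–0.58)
Germany: Reference group (highest ability)

These coefficients are adjusted for age. RT is modeled as a second outcome rather than as a predictor of ability, so the site differences in ability sit alongside the site differences in RT rather than being conditional on them.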

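Colombia is both lower in ability and faster in RT (-0.37 on the log-seconds scale, 95% CI: -0.44 to -0.30). This pattern is consistent with children at this site trading accuracy for speed more than children at the other sites, although the model does not test that directly.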
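
Differences on the log-RT scale are easier to read as percentages. The sketch below converts the site coefficients and their 95% intervals from the summary, assuming that `logrt_mean` is on the natural-log scale (the report does not state the log base). The helper `pct_change` is a hypothetical name introduced for this example.

```python
import math

# Site coefficients for mean log RT from the summary: (estimate, lower, upper).
logrt_effects = {
    "pilot_uniandes_co": (-0.37, -0.44, -0.30),
    "pilot_western_ca": (-0.17, -0.26, -0.09),
}


def pct_change(beta):
    """Percent change in RT implied by a shift of beta on the natural-log scale."""
    return (math.exp(beta) - 1) * 100


for site, (est, lower, upper) in logrt_effects.items():
    print(site, round(pct_change(est)), round(pct_change(lower)), round(pct_change(upper)))
# prints:
# pilot_uniandes_co -31 -36 -26
# pilot_western_ca -16 -23 -9
```

Under that assumption, responses at the Colombian site are roughly 31% shorter than at the German site (interval about 26% to 36%), and responses at the Canadian site are roughly 16% shorter (interval about 9% to 23%).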

RQ3: How does the relationship change with age?

Ability improves but speed doesn’t change much

Ability increases: +0.20 SD per year (95% CI: 0.18–0.23; strong developmental effect)
RT barely changes: -0.01 log seconds per year (95% CI: -0.02 to 0.01, which includes 0)

Interpretation: Children get better at the task with age, but show no clear change in speed. This suggests they’re improving in mental rotation skill rather than just processing speed.
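
To put the age slope in context, the site gaps in ability can be expressed as the number of years of age-related gain they correspond to, and the RT age slope can be expressed as a percent change per year. This is a rough illustration only: it assumes the age effect is linear over the range of interest, that log RT uses the natural log, and it uses posterior means without their uncertainty.

```python
import math

# Coefficients copied from the model summary (posterior means).
age_slope_ability = 0.20   # ability change per year of age
site_gaps = {"pilot_uniandes_co": -1.30, "pilot_western_ca": -0.42}

# Years of age-related gain that would match each site gap (linear extrapolation).
years_equivalent = {
    site: abs(gap) / age_slope_ability for site, gap in site_gaps.items()
}

# Age slope for mean log RT: estimate and 95% interval bounds, as percent per year.
rt_age_betas = (-0.01, -0.02, 0.01)
rt_pct_per_year = [(math.exp(b) - 1) * 100 for b in rt_age_betas]

for site, years in years_equivalent.items():
    print(site, round(years, 1))
print([round(p, 1) for p in rt_pct_per_year])
```

On these numbers, the Colombia gap corresponds to about 6.5 years of age-related gain and the Canada gap to about 2.1 years, while the RT change per year is about 1% in either direction at most.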

Summary

Main Finding 1: A Speed-Accuracy Tradeoff in the Pooled Sample

Pooling across sites, we observed a speed-accuracy tradeoff at both levels of the model (person-level r = 0.40, residual r = 0.23), where slower response times were associated with higher mental rotation ability. The model estimates a single pair of correlations for all three sites, so it does not by itself show whether the pattern is equally strong within each cultural context.

Main Finding 2: Site Differences Reflect True Ability Gaps

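In the joint model of ability and response time, substantial site differences in ability were present alongside site differences in speed. Colombian children showed 1.30 SD lower ability than German children while also responding faster (-0.37 on the log-seconds scale). Because RT enters the model as a second outcome rather than as a covariate, this is indirect evidence, but it suggests that accuracy differences are unlikely to be explained by differential speed-accuracy strategies alone.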
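
One rough way to check this claim is to ask how large an ability gap the person-level speed-accuracy relationship would predict from the RT gap alone. The sketch below derives a slope of ability on log RT from the person-level SDs and correlation, then applies it to the Colombia RT difference. Applying a between-child relationship to a between-site difference is an assumption made for illustration; the fitted model does not establish it.

```python
# Posterior means from the model summary (point estimates only).
sd_person_ability, sd_person_logrt, cor_person = 0.58, 0.30, 0.40
rt_gap_colombia = -0.37        # mean log RT difference vs. Germany
ability_gap_colombia = -1.30   # ability difference vs. Germany

# Slope of ability on log RT implied by the person-level covariance structure.
slope = cor_person * sd_person_ability / sd_person_logrt

# Ability gap predicted from the RT gap alone (assumption: the slope transfers to sites).
expected_gap = slope * rt_gap_colombia
remaining_gap = ability_gap_colombia - expected_gap

print(round(slope, 2), round(expected_gap, 2), round(remaining_gap, 2))
# prints: 0.77 -0.29 -1.01
```

Under that assumption, the faster responding at the Colombian site would account for about 0.29 of the 1.30 gap, leaving roughly 1.0 unaccounted for.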

Main Finding 3: Development Improves Skill, Not Speed

Mental rotation ability improved by 0.20 SD per year, but response times showed no clear age trend (-0.01 log seconds per year, 95% CI: -0.02 to 0.01). This dissociation suggests that developmental gains reflect improved mental rotation skill rather than general processing speed.