ExGaussian Analyses

Author

Greg Cox

Published

June 3, 2025

Preamble

Load necessary packages and set universal options

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(brms)

Loading required package: Rcpp
Loading 'brms' package (version 2.22.0). Useful instructions
can be found by typing help('brms'). A more detailed introduction
to the package is available through vignette('brms_overview').

Attaching package: 'brms'

The following object is masked from 'package:stats':

    ar

library(rstan)

Loading required package: StanHeaders

rstan version 2.36.0.9000 (Stan version 2.36.0)

For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
For within-chain threading using `reduce_sum()` or `map_rect()` Stan functions,
change `threads_per_chain` option:
rstan_options(threads_per_chain = 1)


Attaching package: 'rstan'

The following object is masked from 'package:tidyr':

    extract

library(modelr)
library(readxl)
library(afex)

Loading required package: lme4
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'lme4'

The following object is masked from 'package:brms':

    ngrps

************
Welcome to afex. For support visit: http://afex.singmann.science/
- Functions for ANOVAs: aov_car(), aov_ez(), and aov_4()
- Methods for calculating p-values with mixed(): 'S', 'KR', 'LRT', and 'PB'
- 'afex_aov' and 'mixed' objects can be passed to emmeans() for follow-up tests
- Get and set global package options with: afex_options()
- Set sum-to-zero contrasts globally: set_sum_contrasts()
- For example analyses see: browseVignettes("afex")
************

Attaching package: 'afex'

The following object is masked from 'package:lme4':

    lmer

set_sum_contrasts()

setting contr.sum globally: options(contrasts=c('contr.sum', 'contr.poly'))

plot_theme <- theme_bw() +
    theme(panel.grid = element_blank(), strip.background = element_blank())

Load the data and recode variables as necessary

data <- rbind(
    cbind(group = "Expert", full_join(
        read_xlsx("expert_fixation_duration v3.xlsx") %>%
            rename(fixation_duration = duration),
        read_xlsx("expert_reaction_time v3.xlsx") %>%
            rename(rt = duration)
    )),
    cbind(group = "Nonmusician", full_join(
        read_xlsx("non_musician_fixation_duration v3.xlsx") %>%
            rename(fixation_duration = duration),
        read_xlsx("non_musician_reaction_time v3.xlsx") %>%
            rename(rt = duration)
    ))
) %>%
    mutate(fixation_duration = fixation_duration / 1000) %>%
    mutate(rt = rt / 1000) %>%
    mutate(group = factor(group, levels = c("Nonmusician", "Expert"))) %>%
    mutate(id = interaction(group, subject, sep = "", drop = TRUE)) %>%
    mutate(condition = factor(condition, levels = c("U", "R"), labels = c("Upright", "Rotated"))) %>%
    mutate(accuracy = factor(accuracy, levels = c(1, 0), labels = c("Correct", "Error")))

Joining with `by = join_by(subject, condition, image_name, accuracy)`
Joining with `by = join_by(subject, condition, image_name, accuracy)`

Warning in full_join(read_xlsx("non_musician_fixation_duration v3.xlsx") %>% : Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 1991 of `x` matches multiple rows in `y`.
ℹ Row 1991 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.

Filter trials that are too short and exclude errors

Since the images flickered every 800 ms (700 ms + 100 ms blank), any response with an RT under 800 ms must be a “premature” response, i.e., one in which the participant could not have made an informed response.

In addition, these analyses focus only on correct trials.

data <- data %>%
    filter(accuracy == "Correct", rt > 0.8) # Remember that times are now in seconds, not milliseconds

Specify the ExGaussian as a likelihood family

Unfortunately, while the BRMS package has a built-in version of the ExGaussian distribution, it is a non-standard parameterization of the model that requires us to specify it manually.

pexgauss <- function(x, mu, sigma, tau) {
    return(pnorm(x, mu, sigma) - exp(-(x - mu) / tau + sigma^2 / (2 * tau^2) + pnorm(x, mu + sigma^2 / tau, sigma, log=TRUE)))
}

exgauss_ll <- function(i, prep) {
    mu <- brms::get_dpar(prep, "mu", i = i)
    sigma <- brms::get_dpar(prep, "sigma", i = i)
    lambda <- brms::get_dpar(prep, "lambda", i = i)
    y <- prep$data$Y[i]
    return(sum(lambda * (2 * mu + sigma^2 * lambda - 2 * y) / 2 + pnorm(y, mu + sigma^2 * lambda, sigma, log=TRUE) + log(lambda)))
}

exgauss_epred <- function(prep) {
    mu <- brms::get_dpar(prep, "mu")
    lambda <- brms::get_dpar(prep, "lambda")
    return(mu + 1 / lambda)
}

exgaussian2 <- custom_family(
    "exgaussian2",
    dpars = c("mu", "sigma", "lambda"),
    links = rep("log", 3),
    lb = rep(0, 3), ub = c(NA, NA, NA),
    loop = FALSE,
    type = "real",
    log_lik = exgauss_ll,
    posterior_epred = exgauss_epred
)

stan_density <- "
    real exgaussian2_lpdf(vector y, vector mu, vector sigma, vector lambda) {
        return exp_mod_normal_lpdf(y | mu, sigma, lambda);
    }
"

stanvars <- stanvar(scode = stan_density, block = "functions")

Compile the model

This model uses minimally-informed prior distributions on model parameters to implement regularization and avoid convergence issues.

model <- stan_model(
    file = "exgauss_corr_model.stan",
    model_name = "exgauss_corr_model",
    save_dso = TRUE,
    auto_write = TRUE
)

Response time

Format data as necessary for estimation

rt_data <- make_standata(
    formula = brmsformula(
        rt ~ 1 + condition * group + (1 + condition | gr(id, by = group, id = "i")),
        sigma ~ 1 + condition * group + (1 + condition | gr(id, by = group, id = "i")),
        lambda ~ 1 + condition * group + (1 + condition | gr(id, by = group, id = "i"))
    ),
    data = data,
    family = exgaussian2,
    stanvars = stanvars
)

Run the model

rt_fit <- optimizing(model, data = rt_data, init = 0, verbose = TRUE, as_vector = FALSE, iter = 100000)

Chain 1: Initial log joint probability = -70359.3
Chain 1:     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
Chain 1: Error evaluating model log probability: Non-finite gradient.

Chain 1:      999      -18686.8    0.00464424       226.052           1           1     1069   
Chain 1:     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
Chain 1:     1441      -18685.1   1.19971e-05       10.8693      0.7033      0.7033     1541   
Chain 1: Optimization terminated normally: 
Chain 1:   Convergence detected: relative gradient magnitude is below tolerance

Plot predicted vs. observed

rt_pars <- data %>%
    mutate(
        mu = rt_fit$par$mu,
        sigma = rt_fit$par$sigma,
        tau = 1 / rt_fit$par$lambda
    ) %>%
    group_by(group, subject, condition, id) %>%
    summarize(mu = mean(mu), sigma = mean(sigma), tau = mean(tau))

`summarise()` has grouped output by 'group', 'subject', 'condition'. You can
override using the `.groups` argument.

time_p <- c(0.1, 0.3, 0.5, 0.7, 0.9)

obs_rt_quantiles <- data %>%
    group_by(group, subject, condition, id) %>%
    reframe(tibble(p = time_p, q = quantile(rt, probs = time_p)))

model_cdf <- rt_pars %>%
    group_by(group, subject, condition, id) %>%
    reframe(tibble(q = seq_range(data$rt, n = 51), p = pexgauss(q, mu, sigma, tau)))

model_rt_summary <- model_cdf %>%
    group_by(group, condition, q) %>%
    summarize(p = mean(p))

`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.

obs_rt_summary <- obs_rt_quantiles %>%
    group_by(subject, group, id) %>%
    mutate(defl = q - mean(q)) %>%
    group_by(group, condition, p) %>%
    summarize(q = mean(q), sd_q = sd(defl), n_q = n(), se_q = sd_q / sqrt(n_q), crit_q = qt(0.975, df = n_q - 1))

`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.

obs_rt_summary %>%
    ggplot(aes(x = q, y = p, color = group)) +
    geom_line(data = model_rt_summary) +
    geom_pointrange(aes(xmin = pmax(0, q - se_q * crit_q), xmax = q + se_q * crit_q)) +
    expand_limits(y = c(0, 1)) +
    labs(x = "RT Quantile", y = "Cumulative probability", color = NULL) +
    facet_wrap("condition", nrow = 1) +
    scale_color_brewer(palette = "Dark2") +
    plot_theme +
    theme(legend.position = "inside", legend.position.inside = c(1, 0), legend.justification = c(1, 0), legend.background = element_blank())

Plot model parameters

rt_pars %>%
    pivot_longer(c(mu, sigma, tau), names_to = "par", values_to = "val") %>%
    ggplot(aes(x = condition, y = val, color = group, shape = group)) +
    geom_point(position = position_jitterdodge(jitter.width = 0.1, dodge.width = 0.5), size = 0.5, alpha = 0.2) +
    stat_summary(geom = "line", fun = mean, position = position_dodge(width = 0.5), mapping = aes(group = group, linetype = group)) +
    stat_summary(geom = "pointrange", fun.data = mean_cl_boot, position = position_dodge(width = 0.5)) +
    expand_limits(y = c(0, 0.1)) +
    facet_wrap("par", scales = "free_y", nrow = 1, labeller = label_parsed) +
    scale_color_brewer(palette = "Dark2") +
    labs(x = "Condition", y = "Parameter value", color = NULL, shape = NULL, linetype = NULL) +
    plot_theme

ANOVA on model parameters

Mu

aov_ez(
    id = "id",
    dv = "mu",
    data = rt_pars,
    between = "group",
    within = "condition"
)

Anova Table (Type 3 tests)

Response: mu
           Effect    df  MSE       F  ges p.value
1           group 1, 58 0.48    1.57 .014    .216
2       condition 1, 58 0.42    0.26 .002    .610
3 group:condition 1, 58 0.42 8.03 ** .060    .006
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sigma

aov_ez(
    id = "id",
    dv = "sigma",
    data = rt_pars,
    between = "group",
    within = "condition"
)

Anova Table (Type 3 tests)

Response: sigma
           Effect    df  MSE           F  ges p.value
1           group 1, 58 0.00      3.26 + .033    .076
2       condition 1, 58 0.00  857.77 *** .852   <.001
3 group:condition 1, 58 0.00 1184.65 *** .889   <.001
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Tau

aov_ez(
    id = "id",
    dv = "tau",
    data = rt_pars,
    between = "group",
    within = "condition"
)

Anova Table (Type 3 tests)

Response: tau
           Effect    df   MSE         F  ges p.value
1           group 1, 58 23.94      2.38 .031    .128
2       condition 1, 58  6.29 84.78 *** .233   <.001
3 group:condition 1, 58  6.29   7.12 ** .025    .010
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Fixation duration

Format data as necessary for estimation

fix_data <- make_standata(
    formula = brmsformula(
        fixation_duration ~ 1 + condition * group + (1 + condition | gr(id, by = group, id = "i")),
        sigma ~ 1 + condition * group + (1 + condition | gr(id, by = group, id = "i")),
        lambda ~ 1 + condition * group + (1 + condition | gr(id, by = group, id = "i"))
    ),
    data = data,
    family = exgaussian2,
    stanvars = stanvars
)

Run the model

fix_fit <- optimizing(model, data = fix_data, init = 0, verbose = TRUE, as_vector = FALSE, iter = 100000)

Chain 1: Initial log joint probability = -9848.94
Chain 1:     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
Chain 1: Error evaluating model log probability: Non-finite gradient.

Chain 1:      999       5919.12   0.000489533       19.3046           1           1     1126   
Chain 1:     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
Chain 1:     1465       5919.31   5.85387e-05       1.31526           1           1     1653   
Chain 1: Optimization terminated normally: 
Chain 1:   Convergence detected: relative gradient magnitude is below tolerance

Plot predicted vs. observed

fix_pars <- data %>%
    mutate(
        mu = fix_fit$par$mu,
        sigma = fix_fit$par$sigma,
        tau = 1 / fix_fit$par$lambda
    ) %>%
    group_by(group, subject, condition, id) %>%
    summarize(mu = mean(mu), sigma = mean(sigma), tau = mean(tau))

`summarise()` has grouped output by 'group', 'subject', 'condition'. You can
override using the `.groups` argument.

time_p <- c(0.1, 0.3, 0.5, 0.7, 0.9)

obs_fix_quantiles <- data %>%
    group_by(group, subject, condition, id) %>%
    reframe(tibble(p = time_p, q = quantile(fixation_duration, probs = time_p)))

model_cdf <- fix_pars %>%
    group_by(group, subject, condition, id) %>%
    reframe(tibble(q = seq_range(data$fixation_duration, n = 51), p = pexgauss(q, mu, sigma, tau)))

model_fix_summary <- model_cdf %>%
    group_by(group, condition, q) %>%
    summarize(p = mean(p))

`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.

obs_fix_summary <- obs_fix_quantiles %>%
    group_by(subject, group, id) %>%
    mutate(defl = q - mean(q)) %>%
    group_by(group, condition, p) %>%
    summarize(q = mean(q), sd_q = sd(defl), n_q = n(), se_q = sd_q / sqrt(n_q), crit_q = qt(0.975, df = n_q - 1))

`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.

obs_fix_summary %>%
    ggplot(aes(x = q, y = p, color = group)) +
    geom_line(data = model_fix_summary) +
    geom_pointrange(aes(xmin = pmax(0, q - se_q * crit_q), xmax = q + se_q * crit_q)) +
    expand_limits(y = c(0, 1)) +
    labs(x = "Fixation Duration Quantile", y = "Cumulative probability", color = NULL) +
    facet_wrap("condition", nrow = 1) +
    scale_color_brewer(palette = "Dark2") +
    plot_theme +
    theme(legend.position = "inside", legend.position.inside = c(1, 0), legend.justification = c(1, 0), legend.background = element_blank())

Plot model parameters

fix_pars %>%
    pivot_longer(c(mu, sigma, tau), names_to = "par", values_to = "val") %>%
    ggplot(aes(x = condition, y = val, color = group, shape = group)) +
    geom_point(position = position_jitterdodge(jitter.width = 0.1, dodge.width = 0.5), size = 0.5, alpha = 0.2) +
    stat_summary(geom = "line", fun = mean, position = position_dodge(width = 0.5), mapping = aes(group = group, linetype = group)) +
    stat_summary(geom = "pointrange", fun.data = mean_cl_boot, position = position_dodge(width = 0.5)) +
    expand_limits(y = 0) +
    facet_wrap("par", scales = "free_y", nrow = 1, labeller = label_parsed) +
    scale_color_brewer(palette = "Dark2") +
    labs(x = "Condition", y = "Parameter value", color = NULL, shape = NULL, linetype = NULL) +
    plot_theme

ANOVA on model parameters

Mu

aov_ez(
    id = "id",
    dv = "mu",
    data = fix_pars,
    between = "group",
    within = "condition"
)

Anova Table (Type 3 tests)

Response: mu
           Effect    df  MSE         F  ges p.value
1           group 1, 58 0.00 20.30 *** .228   <.001
2       condition 1, 58 0.00 54.12 *** .126   <.001
3 group:condition 1, 58 0.00    6.61 * .017    .013
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sigma

aov_ez(
    id = "id",
    dv = "sigma",
    data = fix_pars,
    between = "group",
    within = "condition"
)

Anova Table (Type 3 tests)

Response: sigma
           Effect    df  MSE      F  ges p.value
1           group 1, 58 0.00 6.80 * .072    .012
2       condition 1, 58 0.00   1.55 .009    .218
3 group:condition 1, 58 0.00   0.17 .001    .681
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Tau

aov_ez(
    id = "id",
    dv = "tau",
    data = fix_pars,
    between = "group",
    within = "condition"
)

Anova Table (Type 3 tests)

Response: tau
           Effect    df  MSE       F  ges p.value
1           group 1, 58 0.00  3.07 + .040    .085
2       condition 1, 58 0.00 7.55 ** .026    .008
3 group:condition 1, 58 0.00    0.82 .003    .369
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Writeup

For each participant in each condition, we estimated parameters of an ExGaussian distribution describing the distributions of both their total response time (RT) and fixation duration in both correct and error trials. The estimated parameters are the maximum a posteriori estimates from a hierarchical model using minimally informative priors as a form of regularization to bound parameter estimates within a reasonable range. As shown in Figure XXX1, the ExGaussian distribution is a good fit to the observed distributions of both RT (Figure XXX1a) and fixation duration (Figure XXX1b).

Figure XXX2a shows the estimated ExGaussian parameters describing RT distributions on correct trials. For the \(\mu\) parameter, we find no main effects of group or condition, but an interaction between group and condition (\(F(1, 58) = 8.03\), \(p = 0.006\)) effect of group; experts have a lower \(\mu\) than nonmusicians in the Upright condition, but not in the Rotated condition. Given that none of the estimated \(\sigma\) parameters for any participant in any condition were greater than 0.001 seconds, we do not analyze this parameter. Finally, the \(\tau\) parameter shows a main effect of condition (\(F(1, 58) = 84.8\), \(p < 0.001\)) as well as an interaction between group and condition (\(F(1, 58) = 7.12\), \(p = 0.01\)). The Upright condition tends to have a lower \(\tau\) than the Rotated condition, but this is particularly so for experts; experts and nonmusicians have similar \(\tau\) parameters on rotated trials.

Figure XXX2b shows the estimated ExGaussian parameters describing distributions of fixation duration on correct trials. The \(\mu\) parameter shows a main effect of both group (\(F(1, 58) = 20.3\), \(p < 0.001\)) and condition (\(F(1, 58) = 54.12\), \(p < 0.001\)) as well as an interaction between group and condition (\(F(1, 58) = 6.61\), \(p = 0.013\)). Experts have a lower \(\mu\) than nonmusicians, and the Upright condition has a lower \(\mu\) than the Rotated condition but particularly for experts. The \(\sigma\) parameter shows a main effect of group (\(F(1, 58) = 6.8\), \(p = 0.012\)) with experts having smaller \(\sigma\) than nonmusicians, but otherwise no main effect of condition nor an interaction. Finally, the \(\tau\) parameter shows a main effect of condition (\(F(1, 58) = 7.55\), \(p = 0.008\)), with the Upright condition associated with smaller \(\tau\) than the Rotated condition. Although there is a trend for experts to have a lower \(\tau\) parameter than nonmusicians, this trend is not statistically significant (\(F(1, 58) = 3.07\), \(p = 0.085\)) nor is there statistical support for an interaction between group and condition.