ExGaussian Analyses

Author

Greg Cox

Published

September 13, 2025

Preamble

Load necessary packages and set universal options

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
library(modelr)
library(readxl)
library(afex)
Loading required package: lme4
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

Registered S3 methods overwritten by 'car':
  method                          from
  influence.merMod                lme4
  cooks.distance.influence.merMod lme4
  dfbeta.influence.merMod         lme4
  dfbetas.influence.merMod        lme4
************
Welcome to afex. For support visit: http://afex.singmann.science/
- Functions for ANOVAs: aov_car(), aov_ez(), and aov_4()
- Methods for calculating p-values with mixed(): 'KR', 'S', 'LRT', and 'PB'
- 'afex_aov' and 'mixed' objects can be passed to emmeans() for follow-up tests
- NEWS: library('emmeans') now needs to be called explicitly!
- Get and set global package options with: afex_options()
- Set orthogonal sum-to-zero contrasts globally: set_sum_contrasts()
- For example analyses see: browseVignettes("afex")
************

Attaching package: 'afex'

The following object is masked from 'package:lme4':

    lmer
set_sum_contrasts()
setting contr.sum globally: options(contrasts=c('contr.sum', 'contr.poly'))
plot_theme <- theme_bw() +
    theme(panel.grid = element_blank(), strip.background = element_blank())

Load the data and recode variables as necessary

data <- rbind(
    cbind(group = "Expert", full_join(
        read_xlsx("expert_fixation_duration v3.xlsx") %>%
            rename(fixation_duration = duration),
        read_xlsx("expert_reaction_time v3.xlsx") %>%
            rename(rt = duration)
    )),
    cbind(group = "Nonmusician", full_join(
        read_xlsx("non_musician_fixation_duration v3.xlsx") %>%
            rename(fixation_duration = duration),
        read_xlsx("non_musician_reaction_time v3.xlsx") %>%
            rename(rt = duration)
    ))
) %>%
    mutate(fixation_duration = fixation_duration / 1000) %>%
    mutate(rt = rt / 1000) %>%
    mutate(group = factor(group, levels = c("Nonmusician", "Expert"))) %>%
    mutate(id = interaction(group, subject, sep = "", drop = TRUE)) %>%
    mutate(condition = factor(condition, levels = c("U", "R"), labels = c("Upright", "Rotated"))) %>%
    mutate(accuracy = factor(accuracy, levels = c(1, 0), labels = c("Correct", "Error")))
Joining with `by = join_by(subject, condition, image_name, accuracy)`
Joining with `by = join_by(subject, condition, image_name, accuracy)`
Warning in full_join(read_xlsx("non_musician_fixation_duration v3.xlsx") %>% : Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 1991 of `x` matches multiple rows in `y`.
ℹ Row 1991 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.

Filter trials that are too short and exclude errors

Since the images flickered every 800 ms (700 ms + 100 ms blank), any response with an RT under 800 ms must be a “premature” response, i.e., one in which the participant could not have made an informed response.

In addition, these analyses focus only on correct trials.

data <- data %>%
    filter(accuracy == "Correct", rt > 0.8) # Remember that times are now in seconds, not milliseconds

Specify functions for fitting parameters of the ExGaussian

Below, I specify several functions that we will use to estimate the parameters of the ExGaussian distribution and to obtain predicted density and distribution functions so we can compare the fit of the model to data.

# The probability density function for the ExGaussian distribution.
# x: vector of observed values
# mu, sigma, tau: parameters of the ExGaussian distribution; each should either be a single value or a vector that has the same length as `x`
# log: If TRUE, return the natural logarithm of the density function (used for fitting), otherwise return the raw value of the density function
dexgauss <- function(x, mu, sigma, tau, log = FALSE) {
    if (log) {
        return((2 * mu + sigma^2 / tau - 2 * x) / (2 * tau) + pnorm((x - mu) / sigma - sigma / tau, log.p = TRUE) - log(tau))
    } else {
        return(exp((2 * mu + sigma^2 / tau - 2 * x) / (2 * tau)) * pnorm((x - mu) / sigma - sigma / tau) / tau)
    }
}

# The cumulative distribution function for the ExGaussian distribution.
# Parameters are the same as for the `dexgauss` function above.
pexgauss <- function(x, mu, sigma, tau) {
    return(pnorm(x, mu, sigma) - exp(-(x - mu) / tau + sigma^2 / (2 * tau^2) + pnorm((x - mu) / sigma - sigma / tau, log=TRUE)))
}

# The negative log-likelihood of a set of observed values (`x`) given a vector of ExGaussian parameters `par`.
# par: Should be a named vector, with elements named "mu", "sigma", and "tau" (though these can be in any order).
# x: A vector of observed values
exgauss_nll <- function(par, x) {
    nll <- try(-sum(dexgauss(x, par["mu"], par["sigma"], par["tau"], log = TRUE)))
    if (any(class(nll) == "try-error")) {
        nll <- Inf
    } else if (is.na(nll)) {
        nll <- Inf
    }
    return(nll)
}

# The gradient of the negative log-likelihood.  Useful, though not strictly necessary, for parameter fitting.  Here, "par" should be a named vector of ExGaussian parameters and "x" is a vector of observed values.
exgauss_nll_grad <- function(par, x) {
    norm_factor <- dnorm((x - par["mu"]) / par["sigma"] - par["sigma"] / par["tau"]) / pnorm((x - par["mu"]) / par["sigma"] - par["sigma"] / par["tau"])
    gr <- c(
        "mu" = -sum(1 / par["tau"] - norm_factor / par["sigma"]),
        "sigma" = -sum(par["sigma"] / (par["tau"]^2) + ((par["mu"] - x) / par["sigma"]^2 - 1 / par["tau"]) * norm_factor),
        "tau" = -sum((x - par["mu"] - par["tau"] - par["sigma"]^2 / par["tau"] + par["sigma"] * norm_factor) / (par["tau"]^2))
    )
    gr[is.na(gr)] <- -Inf
    return(gr)
}

# A function for finding the best-fitting ExGaussian parameters.
# x: vector of observed values
# max_x: maximum possible parameter value, by default set to the maximum observed value
fit_exgauss <- function(x, max_x = max(x), ...) {
    # A not-very-good initial guess for the values of the ExGaussian parameters
    mean_x_div2 <- mean(x) / 2
    sd_x <- sd(x)
    
    init_par <- c(
        "mu" = mean_x_div2,
        "sigma" = sd_x,
        "tau" = mean_x_div2
    )
    
    # Use gradient descent to find the parameters that minimize the negative log-likelihood (i.e., that maximize the likelihood) of the observed values
    fit <- try(nlminb(start = init_par, objective = exgauss_nll, gradient = exgauss_nll_grad, x = x, lower = 0, upper = max_x, ...))
    
    if (any(class(fit) == "try-error")) {
        # If for any reason, the initial attempt fails, try again using a finite-difference approximation to the gradient which is less accurate but can be more robust
        fit <- try(nlminb(start = init_par, objective = exgauss_nll, x = x, lower = 0, upper = max_x, ...))
        
        if (any(class(fit) == "try-error")) {
            # If the second attempt fails, give up and return "NA" for the parameter estimates
            est_par <- rep(NA, 3)
            names(est_par) <- names(init_par)
        } else {
            est_par <- fit$par
        }
    } else {
        est_par <- fit$par
    }
    
    # Estimates are returned as a "tibble" so that they can be estimated using dplyr's "reframe" function
    return(tibble(par = names(est_par), val = est_par))
}

Response time

Run the model

rt_pars <- data %>%
    group_by(group, subject, condition) %>%
    reframe(fit_exgauss(rt)) %>%
    pivot_wider(names_from = "par", values_from = "val") %>%
    mutate(id = interaction(group, subject))

Plot predicted vs. observed

rt_pred <- rt_pars %>%
    expand_grid(rt = seq(0, max(data$rt), length.out = 101)) %>%
    mutate(d = dexgauss(rt, mu, sigma, tau))

expert_rt_plot <- data %>%
    filter(group == "Expert") %>%
    ggplot(aes(x = rt)) +
    geom_histogram(aes(fill = condition, y = after_stat(density)), binwidth = 10, position = "identity", alpha = 0.4, boundary = 0, color = "white") +
    geom_line(aes(y = d, color = condition), data = rt_pred %>% filter(group == "Expert")) +
    scale_color_brewer(palette = "Set1", aesthetics = c('fill', 'color')) +
    facet_wrap("subject") +
    labs(x = "Response time (s)", y = "Density", color = NULL, fill = NULL, title = "Experts") +
    plot_theme +
    theme(strip.text = element_blank())

nonmusician_rt_plot <- data %>%
    filter(group == "Nonmusician") %>%
    ggplot(aes(x = rt)) +
    geom_histogram(aes(fill = condition, y = after_stat(density)), binwidth = 10, position = "identity", alpha = 0.4, boundary = 0, color = "white") +
    geom_line(aes(y = d, color = condition), data = rt_pred %>% filter(group == "Nonmusician")) +
    scale_color_brewer(palette = "Set1", aesthetics = c('fill', 'color')) +
    facet_wrap("subject") +
    labs(x = "Response time (s)", y = "Density", color = NULL, fill = NULL, title = "Nonmusicians") +
    plot_theme +
    theme(strip.text = element_blank())

expert_rt_plot + nonmusician_rt_plot + plot_layout(guides = "collect", ncol = 1) & theme(legend.position = 'bottom')

time_p <- c(0.1, 0.3, 0.5, 0.7, 0.9)

obs_rt_quantiles <- data %>%
    group_by(group, subject, condition) %>%
    reframe(tibble(p = time_p, q = quantile(rt, probs = time_p)))

model_cdf <- rt_pars %>%
    group_by(group, subject, condition) %>%
    reframe(tibble(q = seq_range(data$rt, n = 51), p = pexgauss(q, mu, sigma, tau)))

model_rt_summary <- model_cdf %>%
    group_by(group, condition, q) %>%
    summarize(p = mean(p))
`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.
obs_rt_summary <- obs_rt_quantiles %>%
    group_by(subject, group) %>%
    mutate(defl = q - mean(q)) %>%
    group_by(group, condition, p) %>%
    summarize(q = mean(q), sd_q = sd(defl), n_q = n(), se_q = sd_q / sqrt(n_q), crit_q = qt(0.975, df = n_q - 1))
`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.
obs_rt_summary %>%
    ggplot(aes(x = q, y = p, color = group)) +
    geom_line(data = model_rt_summary) +
    geom_pointrange(aes(xmin = pmax(0, q - se_q * crit_q), xmax = q + se_q * crit_q)) +
    expand_limits(y = c(0, 1)) +
    labs(x = "Response time (s)", y = "Cumulative probability", color = NULL) +
    facet_wrap("condition", nrow = 1) +
    scale_color_brewer(palette = "Dark2") +
    plot_theme +
    theme(legend.position = "inside", legend.position.inside = c(1, 0), legend.justification = c(1, 0), legend.background = element_blank())

Plot model parameters

rt_pars %>%
    pivot_longer(c(mu, sigma, tau), names_to = "par", values_to = "val") %>%
    ggplot(aes(x = condition, y = val, color = group, shape = group)) +
    geom_point(position = position_jitterdodge(jitter.width = 0.1, dodge.width = 0.5), size = 0.5, alpha = 0.2) +
    stat_summary(geom = "line", fun = mean, position = position_dodge(width = 0.5), mapping = aes(group = group, linetype = group)) +
    stat_summary(geom = "pointrange", fun.data = mean_cl_boot, position = position_dodge(width = 0.5)) +
    expand_limits(y = c(0, 0.1)) +
    facet_wrap("par", scales = "free_y", nrow = 1, labeller = label_parsed) +
    scale_color_brewer(palette = "Dark2") +
    labs(x = "Condition", y = "Parameter value", color = NULL, shape = NULL, linetype = NULL) +
    plot_theme

ANOVA on model parameters

Mu

aov_ez(
    id = "id",
    dv = "mu",
    data = rt_pars,
    between = "group",
    within = "condition"
)
Anova Table (Type 3 tests)

Response: mu
           Effect    df  MSE      F   ges p.value
1           group 1, 58 4.92   2.57  .026    .114
2       condition 1, 58 3.28 4.56 *  .031    .037
3 group:condition 1, 58 3.28   0.00 <.001    .998
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sigma

aov_ez(
    id = "id",
    dv = "sigma",
    data = rt_pars,
    between = "group",
    within = "condition"
)
Anova Table (Type 3 tests)

Response: sigma
           Effect    df  MSE       F  ges p.value
1           group 1, 58 2.34  3.64 + .036    .062
2       condition 1, 58 1.56 7.50 ** .049    .008
3 group:condition 1, 58 1.56    0.62 .004    .435
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Tau

aov_ez(
    id = "id",
    dv = "tau",
    data = rt_pars,
    between = "group",
    within = "condition"
)
Anova Table (Type 3 tests)

Response: tau
           Effect    df   MSE         F  ges p.value
1           group 1, 58 29.73      0.99 .014    .324
2       condition 1, 58  7.07 51.04 *** .145   <.001
3 group:condition 1, 58  7.07  11.15 ** .036    .001
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Fixation duration

Run the model

fix_pars <- data %>%
    group_by(group, subject, condition) %>%
    reframe(fit_exgauss(fixation_duration)) %>%
    pivot_wider(names_from = "par", values_from = "val") %>%
    mutate(id = interaction(group, subject))

Plot predicted vs. observed

fix_pred <- fix_pars %>%
    expand_grid(fixation_duration = seq(0, max(data$fixation_duration), length.out = 101)) %>%
    mutate(d = dexgauss(fixation_duration, mu, sigma, tau))

expert_fix_plot <- data %>%
    filter(group == "Expert") %>%
    ggplot(aes(x = fixation_duration)) +
    geom_histogram(aes(fill = condition, y = after_stat(density)), binwidth = 0.1, position = "identity", alpha = 0.4, boundary = 0, color = "white") +
    geom_line(aes(y = d, color = condition), data = fix_pred %>% filter(group == "Expert")) +
    scale_color_brewer(palette = "Set1", aesthetics = c('fill', 'color')) +
    facet_wrap("subject") +
    labs(x = "Fixation duration (s)", y = "Density", color = NULL, fill = NULL, title = "Experts") +
    plot_theme +
    theme(strip.text = element_blank())

nonmusician_fix_plot <- data %>%
    filter(group == "Nonmusician") %>%
    ggplot(aes(x = fixation_duration)) +
    geom_histogram(aes(fill = condition, y = after_stat(density)), binwidth = 0.1, position = "identity", alpha = 0.4, boundary = 0, color = "white") +
    geom_line(aes(y = d, color = condition), data = fix_pred %>% filter(group == "Nonmusician")) +
    scale_color_brewer(palette = "Set1", aesthetics = c('fill', 'color')) +
    facet_wrap("subject") +
    labs(x = "Fixation duration (s)", y = "Density", color = NULL, fill = NULL, title = "Nonmusicians") +
    plot_theme +
    theme(strip.text = element_blank())

expert_fix_plot + nonmusician_fix_plot + plot_layout(guides = "collect", ncol = 1) & theme(legend.position = 'bottom')

time_p <- c(0.1, 0.3, 0.5, 0.7, 0.9)

obs_fix_quantiles <- data %>%
    group_by(group, subject, condition, id) %>%
    reframe(tibble(p = time_p, q = quantile(fixation_duration, probs = time_p)))

model_cdf <- fix_pars %>%
    group_by(group, subject, condition, id) %>%
    reframe(tibble(q = seq_range(data$fixation_duration, n = 51), p = pexgauss(q, mu, sigma, tau)))

model_fix_summary <- model_cdf %>%
    group_by(group, condition, q) %>%
    summarize(p = mean(p))
`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.
obs_fix_summary <- obs_fix_quantiles %>%
    group_by(subject, group, id) %>%
    mutate(defl = q - mean(q)) %>%
    group_by(group, condition, p) %>%
    summarize(q = mean(q), sd_q = sd(defl), n_q = n(), se_q = sd_q / sqrt(n_q), crit_q = qt(0.975, df = n_q - 1))
`summarise()` has grouped output by 'group', 'condition'. You can override
using the `.groups` argument.
obs_fix_summary %>%
    ggplot(aes(x = q, y = p, color = group)) +
    geom_line(data = model_fix_summary) +
    geom_pointrange(aes(xmin = pmax(0, q - se_q * crit_q), xmax = q + se_q * crit_q)) +
    expand_limits(y = c(0, 1)) +
    labs(x = "Fixation Duration Quantile", y = "Cumulative probability", color = NULL) +
    facet_wrap("condition", nrow = 1) +
    scale_color_brewer(palette = "Dark2") +
    plot_theme +
    theme(legend.position = "inside", legend.position.inside = c(1, 0), legend.justification = c(1, 0), legend.background = element_blank())

Plot model parameters

fix_pars %>%
    pivot_longer(c(mu, sigma, tau), names_to = "par", values_to = "val") %>%
    ggplot(aes(x = condition, y = val, color = group, shape = group)) +
    geom_point(position = position_jitterdodge(jitter.width = 0.1, dodge.width = 0.5), size = 0.5, alpha = 0.2) +
    stat_summary(geom = "line", fun = mean, position = position_dodge(width = 0.5), mapping = aes(group = group, linetype = group)) +
    stat_summary(geom = "pointrange", fun.data = mean_cl_boot, position = position_dodge(width = 0.5)) +
    expand_limits(y = 0) +
    facet_wrap("par", scales = "free_y", nrow = 1, labeller = label_parsed) +
    scale_color_brewer(palette = "Dark2") +
    labs(x = "Condition", y = "Parameter value", color = NULL, shape = NULL, linetype = NULL) +
    plot_theme

ANOVA on model parameters

Mu

aov_ez(
    id = "id",
    dv = "mu",
    data = fix_pars,
    between = "group",
    within = "condition"
)
Anova Table (Type 3 tests)

Response: mu
           Effect    df  MSE         F  ges p.value
1           group 1, 58 0.00 18.45 *** .209   <.001
2       condition 1, 58 0.00 42.73 *** .112   <.001
3 group:condition 1, 58 0.00   7.78 ** .022    .007
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sigma

aov_ez(
    id = "id",
    dv = "sigma",
    data = fix_pars,
    between = "group",
    within = "condition"
)
Anova Table (Type 3 tests)

Response: sigma
           Effect    df  MSE      F  ges p.value
1           group 1, 58 0.00 5.85 * .063    .019
2       condition 1, 58 0.00   1.24 .007    .270
3 group:condition 1, 58 0.00   0.42 .002    .521
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Tau

aov_ez(
    id = "id",
    dv = "tau",
    data = fix_pars,
    between = "group",
    within = "condition"
)
Anova Table (Type 3 tests)

Response: tau
           Effect    df  MSE       F  ges p.value
1           group 1, 58 0.00  3.45 + .044    .068
2       condition 1, 58 0.00 8.64 ** .031    .005
3 group:condition 1, 58 0.00    2.15 .008    .148
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Writeup

For each participant in each condition, we estimated the maximum-likelihood parameters of an ExGaussian distribution describing the distributions of both their total response time (RT) and fixation duration in both correct and error trials. The ExGaussian distribution provides a good fit to the observed distributions of both RT (Figure XXX1a) and fixation duration (Figure XXX1b).

Figure XXX2a shows the best-fitting ExGaussian parameters for RT distributions on correct trials. For the \(\mu\) parameter, we find a main effect of condition (\(F(1, 58) = 4.56\), \(p = 0.037\)), but no support for an effect of group (\(F(1, 58) = 2.57\), \(p = 0.114\)) or for an interaction between group and condition (\(F(1, 58) = 0.00\), \(p = 0.998\)). Overall, there is evidence that the \(\mu\) parameter, which largely represents the peak response time, is greater for rotated displays than for upright displays. The \(\sigma\) parameter also exhibits a main effect of condition (\(F(1, 58) = 7.50\), \(p = 0.008\)) with no main effect of group (\(F(1, 58) = 3.64\), \(p = 0.062\)) nor an interaction (\(F(1, 58) = 0.62\), \(p = 0.435\)). The \(\sigma\) parameter, which primarily reflects variability in the earliest response times, appears to be higher for rotated displays than upright displays. Finally, the \(\tau\) parameter demonstrates a main effect of condition (\(F(1, 58) = 51.04\), \(p < 0.001\)) as well as an interaction between group and condition (\(F(1, 58) = 11.15\), \(p = 0.001\)) but no main effect of group (\(F(1, 58) = 0.99\), \(p = 0.324\)). Given that the \(\tau\) parameter primarily describes the tail of the RT distribution, the tail appear to be longer for rotated displays than upright, with the tail being particularly short for experts viewing upright displays. In summary, RT distributions for both experts and nonmusicians tended to have later peaks (higher \(\mu\)), greater variability (higher \(\sigma\)), and longer tails (higher \(\tau\)) in the rotated condition than in the upright condition. While experts and nonmusicians had similar RT distributions in the rotated condition, the RT distributions of experts in the upright condition had shorter tails (lower \(\tau\)) than those of nonmusicians.

Figure XXX2b shows the estimated ExGaussian parameters describing distributions of fixation duration on correct trials. The \(\mu\) parameter shows a main effect of both group (\(F(1, 58) = 18.45\), \(p < 0.001\)) and condition (\(F(1, 58) = 42.73\), \(p < 0.001\)) as well as an interaction between group and condition (\(F(1, 58) = 7.78\), \(p = 0.007\)). Experts have a lower \(\mu\) than nonmusicians, and the Upright condition has a lower \(\mu\) than the Rotated condition but particularly for experts. The \(\sigma\) parameter shows a main effect of group (\(F(1, 58) = 5.85\), \(p = 0.019\)) with experts having smaller \(\sigma\) than nonmusicians, but otherwise no main effect of condition (\(F(1, 58) = 1.24\), \(p = 0.270\)) nor an interaction (\(F(1, 58) = 0.42\), \(p = 0.521\)). Finally, the \(\tau\) parameter shows a main effect of condition (\(F(1, 58) = 8.64\), \(p = 0.005\)), with the Upright condition associated with smaller \(\tau\) than the Rotated condition. Although there is a trend for experts to have a lower \(\tau\) parameter than nonmusicians, this trend is not statistically significant (\(F(1, 58) = 3.45\), \(p = 0.068\)) nor is there statistical support for an interaction between group and condition (\(F(1, 58) = 2.15\), \(p = 0.148\)). In summary, distributions of experts’ fixation durations had earlier peaks (lower \(\mu\)) and less variability (lower \(\sigma\)) than those of nonmusicians, with particularly early peaks in the upright condition.