We’re going to run a simulation of the relative power of the hurdle model vs. the difference score model for detecting condition differences.
Dataset generation
We start by generating a dataset. We can do this from our own data, from a fitted model, or more artificially. I’m going to start with a simple synthetic dataset.
I’m doing this in the simplest possible way: each participant gets one match and one mismatch trial of a single trial type. We can generalize this later.
One big question is what the correlation is between search times and whether children search at all. We need to assume some random effects for this simulation. Here I’ve set their magnitudes arbitrarily, and I assume there is no correlation between them (which is largely true in the pilot) and also that there are no random slopes (e.g., no differences between kids in the match effect).
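To make these assumptions concrete, here’s a minimal sketch of what a generator like make_hurdle_data() might look like. The real function is defined in this project’s setup; apart from the mismatch_boost argument (which we vary below), all parameter names and values here are assumptions for illustration.

library(dplyr)

# hypothetical sketch -- everything except mismatch_boost is assumed
make_hurdle_data_sketch <- function(n_subs = 100,
                                    mismatch_boost = 0.5, # condition effect (both parts)
                                    sd_sub_zi = 0.8,      # subject SD: whether you search
                                    sd_sub_time = 0.2) {  # subject SD: how long you search
  # independent per-subject random intercepts (no correlation, no random slopes)
  subs <- tibble(subid = 1:n_subs,
                 re_zi = rnorm(n_subs, 0, sd_sub_zi),
                 re_time = rnorm(n_subs, 0, sd_sub_time))

  expand.grid(subid = 1:n_subs, trial_type = c("match", "mismatch"),
              stringsAsFactors = FALSE) |>
    left_join(subs, by = "subid") |>
    mutate(mm = as.numeric(trial_type == "mismatch"),
           # hurdle part: mismatch makes searching at all more likely
           searched = rbinom(n(), 1, plogis(-0.2 + mismatch_boost * mm + re_zi)),
           # continuous part: lognormal search time, zero when there's no search
           search_time = searched *
             rlnorm(n(), meanlog = 1 + mismatch_boost * mm + re_time, sdlog = 1)) |>
    select(subid, trial_type, searched, search_time)
}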
d <- make_hurdle_data()

ggplot(d, aes(x = trial_type, y = search_time)) +
  geom_jitter(alpha = .4) +
  stat_summary()
d |>
  group_by(trial_type) |>
  summarise(searched = sum(searched))
# A tibble: 2 × 2
  trial_type searched
  <chr>         <int>
1 match            46
2 mismatch         75
d <- make_hurdle_data(mismatch_boost = 0)

ggplot(d, aes(x = trial_type, y = search_time)) +
  geom_jitter(alpha = .4) +
  stat_summary()
d |>
  group_by(trial_type) |>
  summarise(searched = sum(searched))
# A tibble: 2 × 2
  trial_type searched
  <chr>         <int>
1 match            41
2 mismatch         48
Functions for simulation
Now we set up functions to run both analyses on these data and return the relevant p-values.
Here I’m fitting a model that DOES assume correlation between the zero and continuous random effects.
# runs the simple difference score
do_diff_analysis <- function(data) {
  tt <- data |>
    group_by(subid) |>
    summarise(diff_score = search_time[trial_type == "mismatch"] -
                search_time[trial_type == "match"]) |>
    pull(diff_score) |>
    t.test()

  return(tt$p.value)
}

# runs the model
do_hurdle_analysis <- function(data) {
  mod_hurdle <- mixed_model(search_time ~ trial_type,
                            random = ~ 1 | subid,
                            data = data,
                            family = hurdle.lognormal(),
                            n_phis = 1,
                            zi_fixed = ~ trial_type,
                            zi_random = ~ 1 | subid,
                            control = list(max_coef_value = 100)) # to avoid non-convergence errors

  hurdle_tibble <- tibble(search_time_p = summary(mod_hurdle)[[1]][, "p-value"][2],
                          searched_p = summary(mod_hurdle)[[2]][, "p-value"][2])
  return(hurdle_tibble)
}

# helper function to avoid problems when the GLMM crashes
safe_hurdle_analysis <- safely(do_hurdle_analysis, otherwise = NA)
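As a quick check that the pieces fit together: purrr’s safely() returns a list with result and error slots, so a single run of the functions above looks like this.

d <- make_hurdle_data()

do_diff_analysis(d)            # single p-value from the difference-score t-test

res <- safe_hurdle_analysis(d) # list(result = <tibble of p-values>, error = NULL) on success
if (is.null(res$error)) res$result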
Simulations
First test run
Here we run the simulations. They are slow, so we put them in a chunk with eval=FALSE, run them only once, and cache the results. We also parallelize all of this to speed things up.
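The cached chunk isn’t shown in the rendered output, but the loop has roughly this shape. This is a sketch under assumptions: the sample sizes, run counts, and the use of furrr for parallelization are mine, and n_subs is an assumed argument of make_hurdle_data().

library(dplyr)
library(purrr)
library(furrr)
plan(multisession)

sims <- expand.grid(n_subs = c(20, 40, 80, 160), run = 1:100) |>
  as_tibble() |>
  mutate(data = map(n_subs, \(n) make_hurdle_data(n_subs = n)), # generate datasets
         diff_p = map_dbl(data, do_diff_analysis),              # fast, run serially
         hurdle = future_map(data, safe_hurdle_analysis))       # slow, run in parallel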
Interesting. Under our default parameters, the binary (searched) variable generally has much lower power, while search time has higher power. In particular, search time has higher power at low N, though the difference score may be a hint better at higher N.
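(For reference, the power claims above are just rejection rates at alpha = .05 across runs; here is a sketch of that computation, assuming the column names from the loop sketched earlier.)

power_summary <- sims |>
  mutate(converged = map_lgl(hurdle, \(x) is.null(x$error)),
         search_time_p = map_dbl(hurdle,
                                 \(x) if (is.null(x$error)) x$result$search_time_p else NA_real_),
         searched_p = map_dbl(hurdle,
                              \(x) if (is.null(x$error)) x$result$searched_p else NA_real_)) |>
  group_by(n_subs) |>
  summarise(diff_power = mean(diff_p < .05),
            search_time_power = mean(search_time_p < .05, na.rm = TRUE),
            searched_power = mean(searched_p < .05, na.rm = TRUE),
            prop_converged = mean(converged))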
Full simulation
How does this depend on how big each effect is? Let’s do that simulation.
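A sketch of the crossed design (matching the plot description below): effect size on searching crossed with effect size on search time, crossed with high/low random-effect variance on each part. The argument names and values are assumptions, and the generator would need separate boost arguments for the two parts of the model.

full_params <- expand.grid(search_boost = c(0, 0.4, 0.8), # effect on whether you search (assumed)
                           time_boost = c(0, 0.4, 0.8),   # effect on search time (assumed)
                           sd_sub_zi = c(0.4, 0.8),       # low/high variance: searching
                           sd_sub_time = c(0.1, 0.3),     # low/high variance: search time
                           n_subs = c(20, 40, 80, 160),
                           run = 1:10)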
The quadrants show different strengths of the effect on not searching vs. on search time. Red is the difference scores. The different line types are different combos of random effects (high/low variance on not searching vs. high/low variance on search times).
What I see here is that if you have an effect, it can move between search time and searching. In some cases, difference scores can be as good as either measure, but in other cases they are substantially worse than one of them.
Single simulation result
For debugging purposes, let’s dig into what is going on.
data <- sims$data[[4800]]

ggplot(data, aes(x = trial_type, y = search_time)) +
  geom_jitter(alpha = .4) +
  stat_summary()
data |>
  group_by(trial_type) |>
  summarise(m_searched = mean(searched),
            m_search_time = mean(search_time),
            m_search_time_nozero = mean(search_time[search_time > 0]))
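Here is the difference-score t-test on this dataset; the call below is reconstructed from the data: line of the output that follows.

data |>
  group_by(subid) |>
  summarise(diff_score = search_time[trial_type == "mismatch"] -
              search_time[trial_type == "match"]) |>
  pull(diff_score) |>
  t.test()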
One Sample t-test
data: pull(summarise(group_by(data, subid), diff_score = search_time[trial_type == "mismatch"] - search_time[trial_type == "match"]), diff_score)
t = 2.7798, df = 99, p-value = 0.006511
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.9669979 5.7906872
sample estimates:
mean of x
3.378843
mod_hurdle <- mixed_model(search_time ~ trial_type,
                          random = ~ 1 | subid,
                          data = data,
                          family = hurdle.lognormal(),
                          n_phis = 1,
                          zi_fixed = ~ trial_type,
                          zi_random = ~ 1 | subid,
                          control = list(max_coef_value = 100)) # to avoid non-convergence errors

summary(mod_hurdle)
Call:
mixed_model(fixed = search_time ~ trial_type, random = ~1 | subid,
data = data, family = hurdle.lognormal(), zi_fixed = ~trial_type,
zi_random = ~1 | subid, n_phis = 1, control = list(max_coef_value = 100))
Data Descriptives:
Number of Observations: 200
Number of Groups: 100
Model:
family: hurdle log-normal
link: identity
Fit statistics:
log.Lik AIC BIC
-278.1749 572.3498 593.1911
Random effects covariance matrix:
StdDev Corr
(Intercept) 0.1794
zi_(Intercept) 0.7809 -0.1939
Fixed effects:
Estimate Std.Err z-value p-value
(Intercept) 1.1108 0.2347 4.7335 < 1e-04
trial_typemismatch 0.5727 0.2290 2.5012 0.012378
Zero-part coefficients:
Estimate Std.Err z-value p-value
(Intercept) 0.5061 0.2387 2.1208 0.0339381
trial_typemismatch -0.8276 0.3205 -2.5822 0.0098166
log(residual std. dev.):
Estimate Std.Err
0.0638 0.0877
Integration:
method: adaptive Gauss-Hermite quadrature rule
quadrature points: 11
Optimization:
method: hybrid EM and quasi-Newton
converged: TRUE