library(dplyr)
library(tidyr)
library(ggplot2)
library(langcog)
library(broom)
library(knitr)  # provides kable(), used for the tables below
theme_set(theme_mikabr() +
            theme(panel.grid = element_blank(),
                  strip.background = element_blank()))
font <- "Open Sans"

Simulations

Generate data with no true effect.

This is the complicated part because we need to generate data with successively longer look-aways. What’s the distribution for these? I don’t know.

We can still simulate minimum-looking criteria, though.

ns <- c(8, 16, 32, 64)
n_sims <- 1000

sims <- expand.grid(n = ns, 
                    sim = 1:n_sims) %>%
  group_by(n, sim) %>%
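  # tibble() replaces the deprecated data_frame(); both groups are drawn
  # from the same log-normal distribution, so there is no true effect
  do(tibble(n = .$n, 
            sim = .$sim,
            sub = 1:.$n,
            grp1 = rlnorm(.$n, meanlog = 2.5, sdlog = 1), 
            grp2 = rlnorm(.$n, meanlog = 2.5, sdlog = 1)))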

Do analysis.

standard <- sims %>%
  filter(grp1 > 0, grp2 > 0, grp1 < 60, grp2 < 60) %>%
  group_by(n, sim) %>%
  do(tidy(t.test(.$grp1, .$grp2)))

stats <- standard %>%
  group_by(n) %>%
  summarise(false_positive_rate = mean(p.value < .05))

kable(stats)
|  n| false_positive_rate|
|--:|-------------------:|
|  8|               0.037|
| 16|               0.049|
| 32|               0.041|
| 64|               0.039|
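
The upper bound of 60 in the filter is not negligible for this distribution. As a rough check that assumes only the `meanlog = 2.5`, `sdlog = 1` parameters used above, the log-normal CDF gives the share of draws the cap removes. The variable names are illustrative and not part of the original analysis.

```r
# Share of single draws from rlnorm(meanlog = 2.5, sdlog = 1) that exceed 60
p_over <- plnorm(60, meanlog = 2.5, sdlog = 1, lower.tail = FALSE)

# A row is dropped if either grp1 or grp2 exceeds 60 (the draws are independent)
p_row_dropped <- 1 - (1 - p_over)^2

round(c(p_over = p_over, p_row_dropped = p_row_dropped), 3)
```

Under these parameters the cap removes a bit more than 5% of individual draws, so roughly one row in ten is dropped before the t-test is run.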

P-hacked analysis. Compute p-values under six different minimum-looking criteria (0, 2, 4, 6, 8, and 10) and keep the smallest p-value of the six.

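# p-value of a t-test restricted to values above the minimum-looking
# criterion n; NA if either group has fewer than two values left
t.test.n <- function(x, y, n) {
  if (sum(x > n) > 1 && sum(y > n) > 1) {
    return(t.test(x[x > n], y[y > n])$p.value)
  } else {
    return(NA_real_)
  }
}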
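
A small usage sketch of the helper, with made-up looking times, shows the `NA` branch that the `na.rm=TRUE` in the next step relies on. The function is copied here only so the example runs on its own; `x` and `y` are hypothetical values.

```r
# Copy of the helper above so the example is self-contained
t.test.n <- function(x, y, n) {
  if (sum(x > n) > 1 && sum(y > n) > 1) {
    t.test(x[x > n], y[y > n])$p.value
  } else {
    NA_real_
  }
}

x <- c(1, 3, 5, 12, 20)  # made-up looking times, group 1
y <- c(2, 4, 7, 9, 30)   # made-up looking times, group 2

p_all  <- t.test.n(x, y, 0)   # every value passes, so a p-value is returned
p_high <- t.test.n(x, y, 10)  # only one y value exceeds 10, so this is NA

c(p_all = p_all, p_high = p_high)
```

If every criterion returned `NA` for some simulation, `min(..., na.rm=TRUE)` would return `Inf` with a warning, which counts as a non-significant result in the summary.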

ph_standard <- sims %>%
  filter(grp1 > 0, grp2 > 0, grp1 < 60, grp2 < 60) %>%
  group_by(n, sim) %>%
  summarise(p.value = min(c(t.test.n(grp1, grp2, 0),
                            t.test.n(grp1, grp2, 2),
                            t.test.n(grp1, grp2, 4),
                            t.test.n(grp1, grp2, 6), 
                            t.test.n(grp1, grp2, 8),
                            t.test.n(grp1, grp2, 10)),
                          na.rm=TRUE))

ph_stats <- ph_standard %>%
  group_by(n) %>%
  summarise(false_positive_rate = mean(p.value < .05))

kable(ph_stats)
|  n| false_positive_rate|
|--:|-------------------:|
|  8|               0.078|
| 16|               0.112|
| 32|               0.112|
| 64|               0.122|

Conclusions

There is a modest increase in the false positive rate (from roughly 4-5% to roughly 8-12%) when the minimum-looking criterion is selected post hoc from a range. The rate would rise further if you could also select a second criterion post hoc (e.g., lookaway duration).
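
For scale, the familywise error rate for six tests can be computed under the simplifying assumption that the tests are independent. That assumption does not hold here, since each criterion tests a subset of the same data, so the figure is an upper reference point, not a prediction. The variable names are illustrative.

```r
alpha <- 0.05  # per-test significance threshold
k <- 6         # number of minimum-looking criteria tried

# Chance of at least one p < alpha if the k tests were independent
fwer_independent <- 1 - (1 - alpha)^k
round(fwer_independent, 3)
```

This comes to about 0.265. The simulated rates of 8 to 12% fall well below it, which is consistent with the six tests being strongly correlated.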

I also learned that, with log-normal data, t-tests do not maintain a constant false-positive rate across N. In the p-hacked analysis, the bigger the N, the more likely a false positive (0.078 at N = 8 versus 0.122 at N = 64), presumably because of outliers. That’s pretty interesting.