Within-Subjects Attention Check Study: Student Sample

Author

Jamie C. Lee

Published

April 10, 2026

Experimental Design

Overview

We conducted a within-subjects study in which undergraduate students completed five “JDM-esque” paradigms, each followed by an attention check (AC). The order of paradigm–AC pairs was fully randomized across participants. Participants were randomly assigned to a low-interference or high-interference condition, which determined the version of each paradigm (i.e., which manipulation) they encountered. The median completion time was 12.1 minutes.

Because prior research suggests that student samples are among the most attentive participant pools (e.g., Krefeld-Schwalb et al., 2024), we embedded this study at the end of a required hour-long experimental session to increase the likelihood of fatigue-induced AC failures.

Study Goals

This study has three goals:

  1. AC sensitivity to attentional manipulations — Do ACs pick up on the effects of experimentally induced attentional interference (e.g., does cognitive load increase AC failure rates)?
  2. AC convergence — Do different ACs identify the same individuals as inattentive, and does any apparent (dis)agreement depend on which paradigm preceded each AC?
  3. Downstream consequences — Do estimated treatment effects differ depending on which AC is used to exclude participants?

Paradigms and Manipulations

Each participant completed five JDM paradigms. In the high-interference condition, manipulations were designed to impair AC performance; in the low-interference condition, control versions imposed minimal interference. The conditions for two paradigms (Snack Preferences and Time Perception) were not expected to affect AC performance.

Paradigm | Citation | Manipulation (High vs. Low) | Expected AC Impact?
Anchor-and-Adjustment | Epley & Gilovich (2006) | High vs. low cognitive load | Yes, particularly IMC and English Comprehension
Processing Fluency | Alter & Oppenheimer (2006) | Disfluent vs. fluent font | Yes, particularly English Comprehension
Word Search | Andre | Hard vs. easy search | Yes, all ACs
Snack Preferences | Winterich & Haws (2011) | Hope vs. pride induction | No
Time Perception | Siddiqui et al. (2017) | Expanded vs. contracted scale | No

Attention Checks

AC | Item | Variable
IMC | “…To show you are paying attention…please click Other and write: reading is good.” | ac_imc_failed
Straightline | “For this item, select ‘not at all’.” (embedded in PANAS) | ac_stline_failed
Obvious (T/F) | “The sky is blue.” (True/False) | ac_obvious_tf_failed
Eng. Comprehension | “The trophy doesn’t fit…because it’s too small. What is too small?” | ac_eng_failed
Obvious (5-pt) | “How often do you breathe oxygen?” (Never–Always) | ac_obvious_failed

All checks except the Obvious (T/F) check were taken from Perfecto and O’Donnell (2025).

Sample

N = 543 undergraduate students from a large Midwestern university.

Illustration of experiment flow and paradigms:


Data Dictionary

Participant Info

Variable_Name | Variable_Type | Variable_Description | Value_Coding
id | numeric | Unique ID for identifying each participant | 1 to 543
treatment | numeric | Binary indicator for condition assignment | 0 = Low-interference, 1 = High-interference
gender | numeric | Participant's self-reported gender | 1 = Female, 2 = Male, 3 = Non-Binary, 4 = Prefer not to say
birth_country | numeric | Participant's birth country | 1 = United States, 2 = Other, 4 = Prefer not to say
first_lang | numeric | Participant's first language | 1 = English, 2 = Other language, 4 = Prefer not to say

Attention Checks

Variable_Name | Variable_Type | Variable_Description | Value_Coding
ac_imc_failed | numeric | Imagine bringing home a new puppy to your family...To show you are paying attention to this one, please click Other and write: reading is good. | 0 = Passed, 1 = Failed
ac_stline_failed | numeric | For this item, select 'not at all'. (Item embedded in PANAS scale.) | 0 = Passed, 1 = Failed
ac_obvious_tf_failed | numeric | The sky is blue. (Options: True or False) | 0 = Passed, 1 = Failed
ac_eng_failed | numeric | The trophy doesn't fit into the brown suitcase because it's too small. What is too small? (Text input) | 0 = Passed, 1 = Failed
ac_obvious_failed | numeric | How often do you breathe oxygen? (5 options from 1: Never to 5: Always) | 0 = Passed, 1 = Failed
ac_count_failed | numeric | Total number of ACs failed (out of 5) |
exp_* | numeric | The extent to which the participant is certain that they have seen or have not seen a version of an AC question prior to taking this survey | 1 = Definitely have not seen before today, 2 = May not have seen before today, 3 = Unsure if seen before today, 4 = May have seen before today, 5 = Definitely have seen before today
ex_* | text | Participant's recalled example(s) of similar ACs seen prior to survey (shown only if exp_* response was 4 or 5) |

Task and AC Order Variables

Variable_Name | Variable_Type | Variable_Description | Value_Coding
task_[1-5] | character | JDM paradigm presented in that slot | a = Anchor-and-Adjustment, b = Processing Fluency, c = Word Search, d = Snack Preferences, e = Time Perception
ac_[1-5] | character | AC presented in that slot | a = IMC, b = Straightline, c = Obvious (T/F), d = English Comprehension, e = Obvious (5-pt)
task_order | text | Comma-separated order of JDM paradigms presented |
ac_order | text | Comma-separated order of ACs presented |
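
The analysis code later in this report refers to columns named `task_before_a` through `task_before_e`, which this dictionary does not list. The sketch below shows one way such columns could be derived from `task_[1-5]` and `ac_[1-5]`. It assumes that the paradigm in slot k is the one that immediately preceded the AC in slot k; the helper name `task_before()` and the toy data are hypothetical and not taken from the study's actual pipeline.

```r
# Toy data in the documented letter coding (two hypothetical participants).
toy <- data.frame(
  id     = 1:2,
  task_1 = c("a", "c"), task_2 = c("b", "a"), task_3 = c("c", "e"),
  task_4 = c("d", "b"), task_5 = c("e", "d"),
  ac_1   = c("e", "a"), ac_2   = c("d", "c"), ac_3   = c("c", "b"),
  ac_4   = c("b", "e"), ac_5   = c("a", "d"),
  stringsAsFactors = FALSE
)

# For each participant, find the slot in which the AC with code `ac_code`
# appeared and return the paradigm code shown in that same slot.
# ASSUMPTION: slot k of task_* and slot k of ac_* form one paradigm-AC pair.
task_before <- function(df, ac_code, n_slots = 5) {
  ac_mat   <- as.matrix(df[paste0("ac_", seq_len(n_slots))])
  task_mat <- as.matrix(df[paste0("task_", seq_len(n_slots))])
  slot <- apply(ac_mat == ac_code, 1, function(hit) {
    if (any(hit, na.rm = TRUE)) which(hit)[1] else NA_integer_
  })
  # Matrix indexing: one (row, slot) pair per participant; NA slots give NA.
  unname(task_mat[cbind(seq_len(nrow(df)), slot)])
}

toy$task_before_a <- task_before(toy, "a")  # paradigm preceding the IMC
toy$task_before_d <- task_before(toy, "d")  # paradigm preceding Eng. Comprehension
print(toy[c("id", "task_before_a", "task_before_d")])
```

If the real survey paired paradigms and ACs differently, only the slot-matching assumption inside `task_before()` would need to change.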

Anchor-and-Adjustment Variables

Variable_Name | Variable_Type | Variable_Description | Variable_Language_or_Value_Coding
estimate_[1-6] | numeric | Participant's estimate for each anchor-and-adjustment question | estimate_1 = Year Washington first elected; estimate_2 = Days for Mars orbit; estimate_3 = Months elephant pregnancy; estimate_4 = Boiling point of water at Everest summit (°F); estimate_5 = Freezing point of vodka (°F); estimate_6 = Number of U.S. states in 1880
anchor_[1-6] | numeric | Participant's (potential) anchor value relied on for each question | anchor_1 = Year U.S. declared independence; anchor_2 = Days for Earth orbit; anchor_3 = Months human pregnancy; anchor_4 = Boiling point of water at sea level (°F); anchor_5 = Freezing point of water (°F); anchor_6 = Current number of U.S. states
anchor_[1-6]_yn | numeric | Whether the participant reported relying on the anchor | 1 = Yes, 2 = No, 3 = Maybe
load_removal | character | Cognitive load removal string recalled by participant | Low-interference: DK. High-interference: DKOUFWLVJ.
t_load_* | numeric | Timer variables for cognitive load manipulation page |

Processing Fluency Variables

Variable_Name | Variable_Type | Variable_Description | Value_Coding
fluent_[1-10] | numeric | Participant's rating of how easy or difficult it would be to pronounce each hypothetical brand name | 1 = Very easy, 2 = Relatively easy, 3 = Relatively difficult, 4 = Very difficult

Word Search Variables

Variable_Name | Variable_Type | Variable_Description | Value_Coding_and_Notes
words_found | numeric | Total number of words found (out of 5) | 0–5
timing_words_found | text | Timing since start of task for each word found (in seconds) | Comma-separated; -1 = word not found
t_wordsearch_* | numeric | Timer variables for word search task |
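
Because `timing_words_found` is stored as comma-separated text with -1 as a sentinel, it has to be parsed before any timing analysis. The sketch below shows one way to do that in base R. The example strings and the helper name `parse_timings()` are hypothetical; only the format (comma-separated seconds, -1 = word not found) comes from the dictionary above.

```r
# Hypothetical raw values in the documented format.
timing_raw <- c("12.4,30.1,-1,55.0,-1", "8.2,15.9,21.3,40.7,58.8")

# Split each string on commas, convert to numeric seconds, and recode the
# -1 sentinel (word not found) as NA so it cannot distort means or medians.
parse_timings <- function(x) {
  lapply(strsplit(x, ",", fixed = TRUE), function(v) {
    secs <- as.numeric(trimws(v))
    secs[!is.na(secs) & secs == -1] <- NA_real_
    secs
  })
}

timings <- parse_timings(timing_raw)

# Words found per participant; this should agree with `words_found`.
n_found <- vapply(timings, function(s) sum(!is.na(s)), integer(1))
print(n_found)
```

Comparing `n_found` against the recorded `words_found` column would be a cheap consistency check on the export.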

Snack Preferences Variables

Variable_Name | Variable_Type | Variable_Description | Value_Coding
emo_check_[1-4] | numeric | Manipulation check items |
energy | numeric | Check for whether arousal was unintentionally manipulated by the scenarios | 1 = Not at all emotionally aroused to 7 = Extremely emotionally aroused
snack_preferences | text | Participant's ideal snacks to hypothetically receive as a thank-you |
t_emo_scenario_* | numeric | Timer variables for hope vs. pride (emotion) manipulation |

Time Perception Variables

Variable_Name | Variable_Type | Variable_Description | Value_Coding
e_days_hours | numeric | Participant's preference for expedited or standard shipping when time is presented in either days or hours | 1 = Expedited shipping, 2 = Standard shipping
time_perception | numeric | How long did the time period between now and standard shipping delivery feel? | 0 (very short) to 100 (very long) slider

Goal 1: Are ACs sensitive to the manipulations?

If ACs are valid measures of attentiveness, failure rates should be higher in the high-interference condition — especially for paradigms designed to impair attention (cognitive load, processing fluency, and word search). Failure rates for paradigms not designed to impair attention (snack preferences and time perception) are not expected to differ by condition.

Overall failure rates by condition

Failure rates are substantially lower for the Obvious and Straightline checks (all under 3%) than for the IMC and English Comprehension checks (roughly 16% to 21%). Given the near-ceiling pass rates for the former, further analysis of variation within those checks is unlikely to be informative.

When aggregated across paradigms, failure rates differ only modestly between conditions: the high-interference condition is about 3 to 4 percentage points higher on the IMC and English Comprehension checks, and the remaining checks show no consistent direction. However, this aggregation may obscure meaningful variation: failure rates may depend jointly on condition (high- vs. low-interference) and the specific paradigm preceding each AC. The next subsection presents these disaggregated results.

Code
treatment_summary <- ac_data |>
  left_join(students_data |> select(id, treatment), by = "id") |>
  pivot_longer(all_of(ac_vars), names_to = "ac", values_to = "failed") |>
  mutate(
    AC        = ac_labels[ac],
    Condition = ifelse(treatment == 1, "High-interference", "Low-interference")
  ) |>
  group_by(AC, Condition) |>
  summarise(N = n(), N_failed = sum(failed, na.rm = TRUE),
            Pct = N_failed / N, .groups = "drop")

treatment_summary |>
  mutate(`% failed` = percent(Pct, accuracy = 0.1)) |>
  select(AC, Condition, N, N_failed, `% failed`) |>
  arrange(AC, Condition) |>
  kbl(caption = "AC failure rates by experimental condition") |>
  kable_styling(full_width = FALSE) |>
  collapse_rows(columns = 1, valign = "middle")
AC failure rates by experimental condition
AC | Condition | N | N_failed | % failed
Eng. Comprehension | High-interference | 262 | 56 | 21.4%
Eng. Comprehension | Low-interference | 281 | 50 | 17.8%
IMC | High-interference | 262 | 50 | 19.1%
IMC | Low-interference | 281 | 44 | 15.7%
Obvious (5-pt) | High-interference | 262 | 3 | 1.1%
Obvious (5-pt) | Low-interference | 281 | 2 | 0.7%
Obvious (T/F) | High-interference | 262 | 3 | 1.1%
Obvious (T/F) | Low-interference | 281 | 7 | 2.5%
Straightline | High-interference | 262 | 0 | 0.0%
Straightline | Low-interference | 281 | 6 | 2.1%
Code
treatment_summary |>
  ggplot(aes(x = reorder(AC, -Pct), y = Pct, fill = Condition)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  geom_text(
    aes(label = paste0(N_failed, "\n(", percent(Pct, accuracy = 0.1), ")")),
    position = position_dodge(width = 0.7),
    vjust = -0.3, size = 3
  ) +
  scale_y_continuous(labels = percent_format(), expand = expansion(mult = c(0, .15))) +
  scale_fill_manual(values = c("Low-interference" = "#B4B2A9", "High-interference" = "#639922")) +
  labs(x = NULL, y = "% failed", fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "top")

AC failure rates by condition. If the manipulations work, high-interference bars should be taller.
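
To put a rough uncertainty range on the aggregate comparison, the counts from the table above can be passed to a two-sample test of proportions. This is a sketch rather than part of the original analysis: it uses only the two checks with enough failures to be informative, the helper name `compare_conditions()` is hypothetical, and `prop.test()` is one reasonable choice among several (Fisher's exact test or a logistic regression would also work).

```r
# Aggregate counts copied from the table of failure rates by condition.
agg <- data.frame(
  ac        = c("Eng. Comprehension", "IMC"),
  fail_high = c(56, 50), n_high = c(262, 262),
  fail_low  = c(50, 44), n_low  = c(281, 281)
)

# Difference in failure rates (high minus low, in percentage points) with a
# 95% confidence interval and p-value from a two-sample proportion test.
compare_conditions <- function(fail_high, n_high, fail_low, n_low) {
  test <- prop.test(x = c(fail_high, fail_low), n = c(n_high, n_low))
  data.frame(
    diff_pp    = 100 * (fail_high / n_high - fail_low / n_low),
    ci_low_pp  = 100 * test$conf.int[1],
    ci_high_pp = 100 * test$conf.int[2],
    p_value    = test$p.value
  )
}

results <- do.call(
  rbind,
  Map(compare_conditions, agg$fail_high, agg$n_high, agg$fail_low, agg$n_low)
)
results <- cbind(ac = agg$ac, results)
print(results)
```

An interval that comfortably includes zero would support reading the aggregate difference as inconclusive rather than as evidence of no effect.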

Failure rates by preceding task and condition

Because ACs were embedded within or immediately following paradigms, failure rates may vary depending on which task preceded the AC. The table below breaks down failure rates by both preceding task and condition, which lets us assess whether specific manipulations drove failures for specific ACs.

Although per-cell sample sizes are small (roughly 40 to 70 participants), the observed patterns are somewhat consistent with our predictions. The cognitive load manipulation (in the Anchor-and-Adjustment paradigm) and the word search difficulty manipulation appear to increase failure rates on both the IMC and English Comprehension checks. In contrast, the processing disfluency (vs. fluency) manipulation does not show a clear effect on either check. Notably, engaging with the fluency paradigm itself, regardless of condition, appears to reduce performance on the English Comprehension check more than on the IMC.

Code
ac_task_results <- bind_rows(
  students_data |> select(treatment, task_before = task_before_a, failure = ac_imc_failed)         |> mutate(ac = "IMC"),
  students_data |> select(treatment, task_before = task_before_b, failure = ac_stline_failed)      |> mutate(ac = "Straightline"),
  students_data |> select(treatment, task_before = task_before_c, failure = ac_obvious_tf_failed)  |> mutate(ac = "Obvious (T/F)"),
  students_data |> select(treatment, task_before = task_before_d, failure = ac_eng_failed)         |> mutate(ac = "Eng. Comprehension"),
  students_data |> select(treatment, task_before = task_before_e, failure = ac_obvious_failed)     |> mutate(ac = "Obvious (5-pt)")
) |>
  filter(!is.na(task_before))

ac_task_results |>
  group_by(ac, task_before, treatment) |>
  summarise(
    n         = n(),
    n_fail    = sum(failure == 1, na.rm = TRUE),
    fail_rate = round(mean(failure == 1, na.rm = TRUE) * 100, 1),
    .groups   = "drop"
  ) |>
  mutate(
    task_before = task_labels[task_before],
    treatment   = factor(treatment, labels = c("Low", "High"))
  ) |>
  pivot_wider(
    names_from  = treatment,
    values_from = c(n, n_fail, fail_rate),
    names_glue  = "{treatment}_{.value}"
  ) |>
  arrange(ac, task_before) |>
  select(ac, task_before,
         Low_n, Low_n_fail, Low_fail_rate,
         High_n, High_n_fail, High_fail_rate) |>
  rename(
    `Attention Check` = ac,
    `Preceding Task`  = task_before,
    `N`               = Low_n,   `N Failed`        = Low_n_fail,  `Failure Rate (%)` = Low_fail_rate,
    `N `              = High_n,  `N Failed `       = High_n_fail, `Failure Rate (%) ` = High_fail_rate
  ) |>
  kable(
    caption = "AC failure rates by preceding task and condition",
    align   = c("l", "l", "r", "r", "r", "r", "r", "r"),
    booktabs = TRUE
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, font_size = 12) |>
  collapse_rows(columns = 1, valign = "top") |>
  add_header_above(c(" " = 2, "Low-Interference" = 3, "High-Interference" = 3))
AC failure rates by preceding task and condition
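Attention Check | Preceding Task | Low-Interference N | Low-Interference N Failed | Low-Interference Failure Rate (%) | High-Interference N | High-Interference N Failed | High-Interference Failure Rate (%)
Eng. Comprehension | Anchor-and-Adjustment | 61 | 8 | 13.1 | 49 | 10 | 20.4
Eng. Comprehension | Processing Fluency | 59 | 14 | 23.7 | 60 | 13 | 21.7
Eng. Comprehension | Snack Preferences | 60 | 12 | 20.0 | 45 | 12 | 26.7
Eng. Comprehension | Time Perception | 47 | 7 | 14.9 | 54 | 9 | 16.7
Eng. Comprehension | Word Search | 54 | 9 | 16.7 | 54 | 12 | 22.2
IMC | Anchor-and-Adjustment | 50 | 7 | 14.0 | 57 | 12 | 21.1
IMC | Processing Fluency | 59 | 10 | 16.9 | 57 | 8 | 14.0
IMC | Snack Preferences | 50 | 9 | 18.0 | 41 | 6 | 14.6
IMC | Time Perception | 65 | 9 | 13.8 | 58 | 12 | 20.7
IMC | Word Search | 57 | 9 | 15.8 | 49 | 12 | 24.5
Obvious (5-pt) | Anchor-and-Adjustment | 53 | 0 | 0.0 | 57 | 1 | 1.8
Obvious (5-pt) | Processing Fluency | 54 | 0 | 0.0 | 48 | 0 | 0.0
Obvious (5-pt) | Snack Preferences | 54 | 0 | 0.0 | 62 | 2 | 3.2
Obvious (5-pt) | Time Perception | 62 | 0 | 0.0 | 47 | 0 | 0.0
Obvious (5-pt) | Word Search | 58 | 2 | 3.4 | 48 | 0 | 0.0
Obvious (T/F) | Anchor-and-Adjustment | 49 | 2 | 4.1 | 51 | 0 | 0.0
Obvious (T/F) | Processing Fluency | 56 | 1 | 1.8 | 49 | 1 | 2.0
Obvious (T/F) | Snack Preferences | 59 | 3 | 5.1 | 60 | 2 | 3.3
Obvious (T/F) | Time Perception | 56 | 0 | 0.0 | 45 | 0 | 0.0
Obvious (T/F) | Word Search | 61 | 1 | 1.6 | 57 | 0 | 0.0
Straightline | Anchor-and-Adjustment | 68 | 0 | 0.0 | 48 | 0 | 0.0
Straightline | Processing Fluency | 53 | 4 | 7.5 | 48 | 0 | 0.0
Straightline | Snack Preferences | 58 | 1 | 1.7 | 54 | 0 | 0.0
Straightline | Time Perception | 51 | 0 | 0.0 | 58 | 0 | 0.0
Straightline | Word Search | 51 | 1 | 2.0 | 54 | 0 | 0.0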

Goal 2: Do different ACs flag the same participants?

A key question is whether ACs converge on a shared set of inattentive participants or whether they each identify largely non-overlapping subgroups. Because ACs were randomized across positions and paradigms, we also examine whether apparent (dis)agreement between ACs is explained by which task preceded each AC.

Overall failure rates

Code
ac_data |>
  summarise(across(all_of(ac_vars), \(x) sum(x, na.rm = TRUE))) |>
  pivot_longer(everything(), names_to = "ac", values_to = "n_failed") |>
  mutate(
    AC         = ac_labels[ac],
    `N failed` = n_failed,
    `% failed` = percent(n_failed / n_total, accuracy = 0.1)
  ) |>
  arrange(desc(n_failed)) |>
  select(AC, `N failed`, `% failed`) |>
  kbl(caption = paste0("AC failure counts and rates (N = ", n_total, ")")) |>
  kable_styling(full_width = FALSE)
AC failure counts and rates (N = 543)
AC N failed % failed
Eng. Comprehension 106 19.5%
IMC 94 17.3%
Obvious (T/F) 10 1.8%
Straightline 6 1.1%
Obvious (5-pt) 5 0.9%

Pairwise overlap

For each pair of ACs, the cell shows: among participants who failed check A, how many (n, %) also failed check B? Low values indicate the checks are flagging largely non-overlapping subsets.

Code
overlap_mat_n   <- matrix(NA_real_, 5, 5, dimnames = list(ac_labels, ac_labels))
overlap_mat_pct <- matrix(NA_real_, 5, 5, dimnames = list(ac_labels, ac_labels))

for (i in seq_along(ac_vars)) {
  for (j in seq_along(ac_vars)) {
    if (i == j) next
    fail_i <- ac_data[[ac_vars[i]]] == 1 & !is.na(ac_data[[ac_vars[i]]])
    fail_j <- ac_data[[ac_vars[j]]] == 1 & !is.na(ac_data[[ac_vars[j]]])
    overlap_mat_n[i, j]   <- sum(fail_i & fail_j, na.rm = TRUE)
    overlap_mat_pct[i, j] <- mean(fail_j[fail_i], na.rm = TRUE)
  }
}

overlap_combined <- matrix(
  ifelse(is.na(overlap_mat_n), "—",
         paste0(overlap_mat_n, " (", round(overlap_mat_pct * 100, 1), "%)")),
  nrow = 5, dimnames = dimnames(overlap_mat_n)
)

as.data.frame(overlap_combined) |>
  kbl(caption = "Row: n (%) of that check's failers who *also* failed the column check") |>
  kable_styling(full_width = FALSE, font_size = 12)
Row: n (%) of that check's failers who *also* failed the column check
 | IMC | Straightline | Obvious (T/F) | Eng. Comprehension | Obvious (5-pt)
IMC | — | 4 (4.3%) | 2 (2.1%) | 15 (16%) | 3 (3.2%)
Straightline | 4 (66.7%) | — | 1 (16.7%) | 2 (33.3%) | 1 (16.7%)
Obvious (T/F) | 2 (20%) | 1 (10%) | — | 3 (30%) | 0 (0%)
Eng. Comprehension | 15 (14.2%) | 2 (1.9%) | 3 (2.8%) | — | 0 (0%)
Obvious (5-pt) | 3 (60%) | 1 (20%) | 0 (0%) | 0 (0%) | —
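
A useful benchmark for these overlap counts is the overlap expected if two checks flagged participants independently of each other. The sketch below works this out for the IMC and English Comprehension checks from the counts reported in this section (94 and 106 failers, 15 shared, N = 543). It assumes no missing responses on either check, and Fisher's exact test is offered only as one way to quantify the association.

```r
n_total <- 543   # participants
n_imc   <- 94    # failed the IMC
n_eng   <- 106   # failed English Comprehension
n_both  <- 15    # failed both

# Joint failures expected if the two checks were statistically independent.
expected_both <- n_imc * n_eng / n_total

# 2 x 2 cross-classification rebuilt from the marginal and joint counts.
tab <- matrix(
  c(n_both,         n_imc - n_both,
    n_eng - n_both, n_total - n_imc - n_eng + n_both),
  nrow = 2, byrow = TRUE,
  dimnames = list(IMC = c("Failed", "Passed"), Eng = c("Failed", "Passed"))
)

test <- fisher.test(tab)

cat("Observed joint failures:", n_both, "\n")
cat("Expected under independence:", round(expected_both, 1), "\n")
print(tab)
print(test)
```

Under these assumptions the expected overlap is about 18 participants, so the observed 15 is no larger than what two unrelated checks with these failure rates would produce.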

UpSet plot: Which combination of ACs flags each participant?

Each bar in the UpSet plot represents a unique combination of ACs failed. If single-check bars dominate, the checks are largely flagging different individuals.

Code
upset_data <- ac_data |>
  select(all_of(ac_vars)) |>
  rename_with(\(x) ac_labels[x]) |>
  as.data.frame()

upset(
  upset_data,
  sets            = rev(ac_labels),
  order.by        = "freq",
  decreasing      = TRUE,
  mb.ratio        = c(0.6, 0.4),
  text.scale      = c(1.3, 1.1, 1, 1, 1.1, 1),
  point.size      = 2.8,
  line.size       = 0.8,
  mainbar.y.label = "Intersection size",
  sets.x.label    = "N failed",
  keep.order      = FALSE
)

UpSet plot of AC failure combinations. Bars dominated by single-AC intersections indicate low convergence across checks.

Jaccard similarity heatmap

Jaccard similarity ranges from 0 (no shared failers) to 1 (identical failer sets). Values near 0 indicate that the checks are identifying mostly different participants.

Code
jaccard <- function(a, b) {
  a <- a == 1 & !is.na(a); b <- b == 1 & !is.na(b)
  sum(a & b) / sum(a | b)
}

jac_mat <- matrix(NA_real_, 5, 5, dimnames = list(ac_labels, ac_labels))
for (i in seq_along(ac_vars))
  for (j in seq_along(ac_vars))
    jac_mat[i, j] <- jaccard(ac_data[[ac_vars[i]]], ac_data[[ac_vars[j]]])

pheatmap(
  jac_mat,
  color           = colorRampPalette(c("#F1EFE8", "#639922"))(50),
  display_numbers = TRUE,
  number_format   = "%.2f",
  cluster_rows    = TRUE,
  cluster_cols    = TRUE,
  fontsize        = 10,
  main            = "Jaccard similarity between AC failer sets",
  angle_col       = 45
)

Pairwise Jaccard similarity between AC failer sets.
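
As a worked example, the Jaccard similarity for the IMC and English Comprehension checks can be recomputed directly from the counts reported earlier (94 and 106 failers, 15 shared). The helper below is a hypothetical count-based variant of the `jaccard()` function; it also returns `NA` when neither check has any failers, a case in which the ratio is undefined.

```r
# Jaccard similarity from counts: |A and B| / |A or B|,
# where |A or B| = |A| + |B| - |A and B|.
jaccard_counts <- function(n_a, n_b, n_both) {
  n_union <- n_a + n_b - n_both
  if (n_union == 0) return(NA_real_)  # no failers on either check
  n_both / n_union
}

# IMC (94 failers) vs. Eng. Comprehension (106 failers), 15 shared.
j_imc_eng <- jaccard_counts(94, 106, 15)
print(round(j_imc_eng, 3))
```

Here the union contains 185 participants, so the similarity is 15/185, or about 0.08.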

Does preceding paradigm explain the (dis)agreement between ACs?

Because ACs were randomized, each AC type was preceded by different paradigms for different participants. This means apparent non-overlap between ACs could partly reflect different task contexts rather than the ACs measuring different constructs. The analyses below assess the severity of this confound.

Step 1: Balance check — how evenly was each task distributed as the preceding task?

If the randomization worked, each AC should have been approximately equally preceded by each of the five paradigms. Imbalance here would mean some AC pairs are more confounded than others.

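Randomization looks adequate: each paradigm preceded each AC between 16.8% and 22.7% of the time, close to the 20% expected under perfect balance.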

Code
ac_task_map <- list(
  IMC          = "task_before_a",
  Straightline = "task_before_b",
  `Obvious (T/F)` = "task_before_c",
  `Eng. Comprehension` = "task_before_d",
  `Obvious (5-pt)` = "task_before_e"
)

balance_table <- map_dfr(names(ac_task_map), function(ac_name) {
  students_data |>
    count(.data[[ac_task_map[[ac_name]]]], name = "n") |>
    rename(task_before = 1) |>
    mutate(
      ac         = ac_name,
      pct        = round(n / sum(n) * 100, 1),
      task_before = task_labels[task_before]
    )
})

balance_table |>
  pivot_wider(names_from = ac, values_from = c(n, pct),
              names_glue = "{ac}_{.value}") |>
  rename(`Preceding Task` = task_before) |>
  kbl(caption = "Distribution of preceding tasks per AC (n and %)") |>
  kable_styling(full_width = FALSE, font_size = 11)
Distribution of preceding tasks per AC (n and %)
Preceding Task | IMC_n | Straightline_n | Obvious (T/F)_n | Eng. Comprehension_n | Obvious (5-pt)_n | IMC_pct | Straightline_pct | Obvious (T/F)_pct | Eng. Comprehension_pct | Obvious (5-pt)_pct
Anchor-and-Adjustment | 107 | 116 | 100 | 110 | 110 | 19.7 | 21.4 | 18.4 | 20.3 | 20.3
Processing Fluency | 116 | 101 | 105 | 119 | 102 | 21.4 | 18.6 | 19.3 | 21.9 | 18.8
Snack Preferences | 91 | 112 | 119 | 105 | 116 | 16.8 | 20.6 | 21.9 | 19.3 | 21.4
Word Search | 106 | 105 | 118 | 108 | 106 | 19.5 | 19.3 | 21.7 | 19.9 | 19.5
Time Perception | 123 | 109 | 101 | 101 | 109 | 22.7 | 20.1 | 18.6 | 18.6 | 20.1
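
The balance check can be made slightly more formal with a chi-square goodness-of-fit test per AC, comparing the observed counts of preceding paradigms with a uniform split. The sketch below uses the counts from the table above; it is an illustrative addition and not part of the original analysis, and the five p-values are not corrected for multiple testing.

```r
# Counts copied from the balance table: rows = preceding paradigm, cols = AC.
counts <- matrix(
  c(107, 116, 100, 110, 110,
    116, 101, 105, 119, 102,
     91, 112, 119, 105, 116,
    106, 105, 118, 108, 106,
    123, 109, 101, 101, 109),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    task = c("Anchor-and-Adjustment", "Processing Fluency",
             "Snack Preferences", "Word Search", "Time Perception"),
    ac   = c("IMC", "Straightline", "Obvious (T/F)",
             "Eng. Comprehension", "Obvious (5-pt)")
  )
)

# Goodness-of-fit test against equal probabilities (the chisq.test default
# for a vector), run separately for each AC column.
p_values <- apply(counts, 2, function(x) chisq.test(x)$p.value)
print(round(p_values, 3))
```

Large p-values would be consistent with the randomization having worked, though they cannot prove balance.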

Step 2: Joint overlap — does AC agreement vary by preceding paradigms?

Note

I’m not sure if this is the most appropriate analysis to conduct. But basically, I want to see if the low degree of overlap between who is flagged as inattentive by the IMC and English Comprehension check can be explained by the fact that people see different paradigms before each check, and some of the paradigms could affect their likelihood of passing/failing certain checks more or less than others.

To assess whether paradigm context influences which participants are flagged by different ACs, we compute agreement rates while conditioning on the paradigms that preceded both ACs in each pair. Specifically, for each AC pair, we calculate the proportion of participants for whom both ACs yield the same pass/fail classification within each combination of the paradigm preceding AC1 and the paradigm preceding AC2.

If agreement varies substantially across rows, this suggests that paradigm context affects which participants are jointly flagged. On the other hand, if agreement for an AC pair is relatively stable (and low) across paradigm pairs, this suggests that differences in preceding paradigms do not meaningfully explain the low overlap across ACs.
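
One caution when judging whether agreement "varies substantially": each stratum in the table below contains only a few dozen participants (about 16 to 45 in the rows shown), so some spread in agreement is expected from sampling noise alone. The sketch below attaches exact binomial confidence intervals to two IMC × Eng. Comprehension strata. The agreement counts are reconstructed from the rounded proportions in that table (0.500 × 16 = 8 and 0.857 × 35 ≈ 30), which is an assumption, and the data frame `cells` is hypothetical.

```r
# Two strata from the stratified agreement table; n_agree is reconstructed
# from the reported proportions (ASSUMPTION: rounding recovers the counts).
cells <- data.frame(
  before_ac1 = c("Word Search", "Processing Fluency"),
  before_ac2 = c("Snack Preferences", "Snack Preferences"),
  n          = c(16, 35),
  n_agree    = c(8, 30)
)

# Exact (Clopper-Pearson) 95% confidence interval for each agreement rate.
ci <- t(mapply(function(x, n) binom.test(x, n)$conf.int,
               cells$n_agree, cells$n))

cells$agree   <- cells$n_agree / cells$n
cells$ci_low  <- ci[, 1]
cells$ci_high <- ci[, 2]
print(cells)
```

If the intervals for most strata overlap, differences in raw agreement between rows should be read as suggestive at best.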

Code
joint_overlap <- map_dfr(ac_pairs, function(pair) {
  ac1 <- pair[1]; tb1 <- pair[2]; ac2 <- pair[3]; tb2 <- pair[4]
  
  d <- students_data |>
    select(ac1_val = all_of(ac1), ac2_val = all_of(ac2),
           tb1_val = all_of(tb1), tb2_val = all_of(tb2)) |>
    drop_na()
  
  d |>
    group_by(task_before_ac1 = tb1_val,
             task_before_ac2 = tb2_val) |>
    summarise(
      n         = n(),
      both_fail = round(mean(ac1_val == 1 & ac2_val == 1), 3),
      both_pass = round(mean(ac1_val == 0 & ac2_val == 0), 3),
      agree     = round(mean(ac1_val == ac2_val), 3),
      .groups   = "drop"
    ) |>
    mutate(
      ac_pair = paste(ac_labels[ac1], "×", ac_labels[ac2]),
      task_before_ac1 = task_labels[task_before_ac1],
      task_before_ac2 = task_labels[task_before_ac2]
    )
}) |>
  select(ac_pair, task_before_ac1, task_before_ac2, n, both_fail, both_pass, agree)

joint_overlap |>
  rename(
    `AC Pair` = ac_pair,
    `Preceding Paradigm (AC1)` = task_before_ac1,
    `Preceding Paradigm (AC2)` = task_before_ac2,
    N = n,
    `P(both fail)` = both_fail,
    `P(both pass)` = both_pass,
    `P(agree)` = agree
  ) |>
  kbl(caption = "Pairwise AC agreement stratified by preceding paradigms for both ACs") |>
  kable_styling(full_width = FALSE, font_size = 11)
Pairwise AC agreement stratified by preceding paradigms for both ACs
AC Pair Preceding Paradigm (AC1) Preceding Paradigm (AC2) N P(both fail) P(both pass) P(agree)
IMC × Eng. Comprehension Anchor-and-Adjustment Processing Fluency 27 0.037 0.630 0.667
IMC × Eng. Comprehension Anchor-and-Adjustment Snack Preferences 24 0.125 0.542 0.667
IMC × Eng. Comprehension Anchor-and-Adjustment Word Search 27 0.037 0.667 0.704
IMC × Eng. Comprehension Anchor-and-Adjustment Time Perception 29 0.034 0.552 0.586
IMC × Eng. Comprehension Processing Fluency Anchor-and-Adjustment 33 0.000 0.515 0.515
IMC × Eng. Comprehension Processing Fluency Snack Preferences 35 0.029 0.829 0.857
IMC × Eng. Comprehension Processing Fluency Word Search 26 0.000 0.769 0.769
IMC × Eng. Comprehension Processing Fluency Time Perception 22 0.000 0.727 0.727
IMC × Eng. Comprehension Snack Preferences Anchor-and-Adjustment 22 0.000 0.591 0.591
IMC × Eng. Comprehension Snack Preferences Processing Fluency 30 0.000 0.733 0.733
IMC × Eng. Comprehension Snack Preferences Word Search 20 0.000 0.500 0.500
IMC × Eng. Comprehension Snack Preferences Time Perception 19 0.000 0.842 0.842
IMC × Eng. Comprehension Word Search Anchor-and-Adjustment 32 0.062 0.750 0.812
IMC × Eng. Comprehension Word Search Processing Fluency 27 0.037 0.741 0.778
IMC × Eng. Comprehension Word Search Snack Preferences 16 0.000 0.500 0.500
IMC × Eng. Comprehension Word Search Time Perception 31 0.000 0.645 0.645
IMC × Eng. Comprehension Time Perception Anchor-and-Adjustment 23 0.000 0.696 0.696
IMC × Eng. Comprehension Time Perception Processing Fluency 35 0.057 0.600 0.657
IMC × Eng. Comprehension Time Perception Snack Preferences 30 0.033 0.667 0.700
IMC × Eng. Comprehension Time Perception Word Search 35 0.057 0.629 0.686
IMC × Obvious (5-pt) Anchor-and-Adjustment Processing Fluency 35 0.000 0.857 0.857
IMC × Obvious (5-pt) Anchor-and-Adjustment Snack Preferences 24 0.000 0.875 0.875
IMC × Obvious (5-pt) Anchor-and-Adjustment Word Search 25 0.040 0.800 0.840
IMC × Obvious (5-pt) Anchor-and-Adjustment Time Perception 23 0.000 0.739 0.739
IMC × Obvious (5-pt) Processing Fluency Anchor-and-Adjustment 24 0.000 0.875 0.875
IMC × Obvious (5-pt) Processing Fluency Snack Preferences 26 0.038 0.808 0.846
IMC × Obvious (5-pt) Processing Fluency Word Search 25 0.000 0.840 0.840
IMC × Obvious (5-pt) Processing Fluency Time Perception 41 0.000 0.829 0.829
IMC × Obvious (5-pt) Snack Preferences Anchor-and-Adjustment 27 0.000 0.852 0.852
IMC × Obvious (5-pt) Snack Preferences Processing Fluency 19 0.000 0.684 0.684
IMC × Obvious (5-pt) Snack Preferences Word Search 24 0.000 0.958 0.958
IMC × Obvious (5-pt) Snack Preferences Time Perception 21 0.000 0.810 0.810
IMC × Obvious (5-pt) Word Search Anchor-and-Adjustment 26 0.000 0.808 0.808
IMC × Obvious (5-pt) Word Search Processing Fluency 23 0.000 0.826 0.826
IMC × Obvious (5-pt) Word Search Snack Preferences 33 0.000 0.758 0.758
IMC × Obvious (5-pt) Word Search Time Perception 24 0.000 0.833 0.833
IMC × Obvious (5-pt) Time Perception Anchor-and-Adjustment 33 0.000 0.848 0.848
IMC × Obvious (5-pt) Time Perception Processing Fluency 25 0.000 0.880 0.880
IMC × Obvious (5-pt) Time Perception Snack Preferences 33 0.030 0.788 0.818
IMC × Obvious (5-pt) Time Perception Word Search 32 0.000 0.781 0.781
IMC × Obvious (T/F) Anchor-and-Adjustment Processing Fluency 22 0.000 0.773 0.773
IMC × Obvious (T/F) Anchor-and-Adjustment Snack Preferences 30 0.000 0.767 0.767
IMC × Obvious (T/F) Anchor-and-Adjustment Word Search 23 0.000 0.826 0.826
IMC × Obvious (T/F) Anchor-and-Adjustment Time Perception 32 0.000 0.906 0.906
IMC × Obvious (T/F) Processing Fluency Anchor-and-Adjustment 32 0.000 0.844 0.844
IMC × Obvious (T/F) Processing Fluency Snack Preferences 32 0.031 0.719 0.750
IMC × Obvious (T/F) Processing Fluency Word Search 33 0.000 0.848 0.848
IMC × Obvious (T/F) Processing Fluency Time Perception 19 0.000 0.842 0.842
IMC × Obvious (T/F) Snack Preferences Anchor-and-Adjustment 19 0.000 0.842 0.842
IMC × Obvious (T/F) Snack Preferences Processing Fluency 20 0.000 0.750 0.750
IMC × Obvious (T/F) Snack Preferences Word Search 26 0.000 0.846 0.846
IMC × Obvious (T/F) Snack Preferences Time Perception 26 0.000 0.846 0.846
IMC × Obvious (T/F) Word Search Anchor-and-Adjustment 27 0.000 0.852 0.852
IMC × Obvious (T/F) Word Search Processing Fluency 26 0.000 0.808 0.808
IMC × Obvious (T/F) Word Search Snack Preferences 29 0.000 0.690 0.690
IMC × Obvious (T/F) Word Search Time Perception 24 0.000 0.875 0.875
IMC × Obvious (T/F) Time Perception Anchor-and-Adjustment 22 0.000 0.818 0.818
IMC × Obvious (T/F) Time Perception Processing Fluency 37 0.027 0.811 0.838
IMC × Obvious (T/F) Time Perception Snack Preferences 28 0.000 0.786 0.786
IMC × Obvious (T/F) Time Perception Word Search 36 0.000 0.806 0.806
IMC × Straightline Anchor-and-Adjustment Processing Fluency 23 0.043 0.826 0.870
IMC × Straightline Anchor-and-Adjustment Snack Preferences 29 0.000 0.862 0.862
IMC × Straightline Anchor-and-Adjustment Word Search 32 0.000 0.844 0.844
IMC × Straightline Anchor-and-Adjustment Time Perception 23 0.000 0.739 0.739
IMC × Straightline Processing Fluency Anchor-and-Adjustment 27 0.000 0.852 0.852
IMC × Straightline Processing Fluency Snack Preferences 23 0.000 0.826 0.826
IMC × Straightline Processing Fluency Word Search 32 0.000 0.781 0.781
IMC × Straightline Processing Fluency Time Perception 34 0.000 0.882 0.882
IMC × Straightline Snack Preferences Anchor-and-Adjustment 23 0.000 0.870 0.870
IMC × Straightline Snack Preferences Processing Fluency 22 0.045 0.909 0.955
IMC × Straightline Snack Preferences Word Search 21 0.000 0.810 0.810
IMC × Straightline Snack Preferences Time Perception 25 0.000 0.760 0.760
IMC × Straightline Word Search Anchor-and-Adjustment 21 0.000 0.762 0.762
IMC × Straightline Word Search Processing Fluency 30 0.033 0.700 0.733
IMC × Straightline Word Search Snack Preferences 28 0.000 0.929 0.929
IMC × Straightline Word Search Time Perception 27 0.000 0.815 0.815
IMC × Straightline Time Perception Anchor-and-Adjustment 45 0.000 0.800 0.800
IMC × Straightline Time Perception Processing Fluency 26 0.038 0.769 0.808
IMC × Straightline Time Perception Snack Preferences 32 0.000 0.875 0.875
IMC × Straightline Time Perception Word Search 20 0.000 0.850 0.850
Eng. Comprehension × Obvious (5-pt) Anchor-and-Adjustment Processing Fluency 23 0.000 0.870 0.870
Eng. Comprehension × Obvious (5-pt) Anchor-and-Adjustment Snack Preferences 29 0.000 0.862 0.862
Eng. Comprehension × Obvious (5-pt) Anchor-and-Adjustment Word Search 30 0.000 0.767 0.767
Eng. Comprehension × Obvious (5-pt) Anchor-and-Adjustment Time Perception 28 0.000 0.786 0.786
Eng. Comprehension × Obvious (5-pt) Processing Fluency Anchor-and-Adjustment 33 0.000 0.909 0.909
Eng. Comprehension × Obvious (5-pt) Processing Fluency Snack Preferences 32 0.000 0.594 0.594
Eng. Comprehension × Obvious (5-pt) Processing Fluency Word Search 27 0.000 0.704 0.704
Eng. Comprehension × Obvious (5-pt) Processing Fluency Time Perception 27 0.000 0.852 0.852
Eng. Comprehension × Obvious (5-pt) Snack Preferences Anchor-and-Adjustment 23 0.000 0.696 0.696
Eng. Comprehension × Obvious (5-pt) Snack Preferences Processing Fluency 24 0.000 0.667 0.667
Eng. Comprehension × Obvious (5-pt) Snack Preferences Word Search 30 0.000 0.867 0.867
Eng. Comprehension × Obvious (5-pt) Snack Preferences Time Perception 28 0.000 0.821 0.821
Eng. Comprehension × Obvious (5-pt) Word Search Anchor-and-Adjustment 30 0.000 0.700 0.700
Eng. Comprehension × Obvious (5-pt) Word Search Processing Fluency 25 0.000 0.840 0.840
Eng. Comprehension × Obvious (5-pt) Word Search Snack Preferences 27 0.000 0.704 0.704
Eng. Comprehension × Obvious (5-pt) Word Search Time Perception 26 0.000 0.962 0.962
Eng. Comprehension × Obvious (5-pt) Time Perception Anchor-and-Adjustment 24 0.000 0.958 0.958
Eng. Comprehension × Obvious (5-pt) Time Perception Processing Fluency 30 0.000 0.833 0.833
Eng. Comprehension × Obvious (5-pt) Time Perception Snack Preferences 28 0.000 0.893 0.893
Eng. Comprehension × Obvious (5-pt) Time Perception Word Search 19 0.000 0.579 0.579
Eng. Comprehension × Obvious (T/F) Anchor-and-Adjustment Processing Fluency 28 0.000 0.857 0.857
Eng. Comprehension × Obvious (T/F) Anchor-and-Adjustment Snack Preferences 30 0.033 0.767 0.800
Eng. Comprehension × Obvious (T/F) Anchor-and-Adjustment Word Search 30 0.000 0.833 0.833
Eng. Comprehension × Obvious (T/F) Anchor-and-Adjustment Time Perception 22 0.000 0.864 0.864
Eng. Comprehension × Obvious (T/F) Processing Fluency Anchor-and-Adjustment 28 0.000 0.679 0.679
Eng. Comprehension × Obvious (T/F) Processing Fluency Snack Preferences 23 0.000 0.913 0.913
Eng. Comprehension × Obvious (T/F) Processing Fluency Word Search 38 0.026 0.763 0.789
Eng. Comprehension × Obvious (T/F) Processing Fluency Time Perception 30 0.000 0.733 0.733
Eng. Comprehension × Obvious (T/F) Snack Preferences Anchor-and-Adjustment 28 0.000 0.750 0.750
Eng. Comprehension × Obvious (T/F) Snack Preferences Processing Fluency 27 0.000 0.704 0.704
Eng. Comprehension × Obvious (T/F) Snack Preferences Word Search 25 0.000 0.840 0.840
Eng. Comprehension × Obvious (T/F) Snack Preferences Time Perception 25 0.000 0.720 0.720
Eng. Comprehension × Obvious (T/F) Word Search Anchor-and-Adjustment 21 0.000 0.905 0.905
Eng. Comprehension × Obvious (T/F) Word Search Processing Fluency 28 0.036 0.714 0.750
Eng. Comprehension × Obvious (T/F) Word Search Snack Preferences 35 0.000 0.829 0.829
Eng. Comprehension × Obvious (T/F) Word Search Time Perception 24 0.000 0.750 0.750
Eng. Comprehension × Obvious (T/F) Time Perception Anchor-and-Adjustment 23 0.000 0.913 0.913
Eng. Comprehension × Obvious (T/F) Time Perception Processing Fluency 22 0.000 0.864 0.864
Eng. Comprehension × Obvious (T/F) Time Perception Snack Preferences 31 0.000 0.710 0.710
Eng. Comprehension × Obvious (T/F) Time Perception Word Search 25 0.000 0.840 0.840
Eng. Comprehension × Straightline Anchor-and-Adjustment Processing Fluency 26 0.038 0.846 0.885
Eng. Comprehension × Straightline Anchor-and-Adjustment Snack Preferences 29 0.000 0.793 0.793
Eng. Comprehension × Straightline Anchor-and-Adjustment Word Search 18 0.056 0.778 0.833
Eng. Comprehension × Straightline Anchor-and-Adjustment Time Perception 37 0.000 0.865 0.865
Eng. Comprehension × Straightline Processing Fluency Anchor-and-Adjustment 31 0.000 0.677 0.677
Eng. Comprehension × Straightline Processing Fluency Snack Preferences 34 0.000 0.794 0.794
Eng. Comprehension × Straightline Processing Fluency Word Search 27 0.000 0.815 0.815
Eng. Comprehension × Straightline Processing Fluency Time Perception 27 0.000 0.815 0.815
Eng. Comprehension × Straightline Snack Preferences Anchor-and-Adjustment 30 0.000 0.933 0.933
Eng. Comprehension × Straightline Snack Preferences Processing Fluency 19 0.000 0.737 0.737
Eng. Comprehension × Straightline Snack Preferences Word Search 34 0.000 0.706 0.706
Eng. Comprehension × Straightline Snack Preferences Time Perception 22 0.000 0.682 0.682
Eng. Comprehension × Straightline Word Search Anchor-and-Adjustment 30 0.000 0.800 0.800
Eng. Comprehension × Straightline Word Search Processing Fluency 29 0.000 0.759 0.759
Eng. Comprehension × Straightline Word Search Snack Preferences 26 0.000 0.808 0.808
Eng. Comprehension × Straightline Word Search Time Perception 23 0.000 0.783 0.783
Eng. Comprehension × Straightline Time Perception Anchor-and-Adjustment 25 0.000 0.880 0.880
Eng. Comprehension × Straightline Time Perception Processing Fluency 27 0.000 0.741 0.741
Eng. Comprehension × Straightline Time Perception Snack Preferences 23 0.000 0.826 0.826
Eng. Comprehension × Straightline Time Perception Word Search 26 0.000 0.885 0.885
Obvious (5-pt) × Obvious (T/F) Anchor-and-Adjustment Processing Fluency 24 0.000 0.958 0.958
Obvious (5-pt) × Obvious (T/F) Anchor-and-Adjustment Snack Preferences 32 0.000 0.938 0.938
Obvious (5-pt) × Obvious (T/F) Anchor-and-Adjustment Word Search 27 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Anchor-and-Adjustment Time Perception 27 0.000 0.963 0.963
Obvious (5-pt) × Obvious (T/F) Processing Fluency Anchor-and-Adjustment 17 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Processing Fluency Snack Preferences 32 0.000 0.969 0.969
Obvious (5-pt) × Obvious (T/F) Processing Fluency Word Search 29 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Processing Fluency Time Perception 24 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Snack Preferences Anchor-and-Adjustment 27 0.000 0.963 0.963
Obvious (5-pt) × Obvious (T/F) Snack Preferences Processing Fluency 30 0.000 0.967 0.967
Obvious (5-pt) × Obvious (T/F) Snack Preferences Word Search 36 0.000 0.972 0.972
Obvious (5-pt) × Obvious (T/F) Snack Preferences Time Perception 23 0.000 0.957 0.957
Obvious (5-pt) × Obvious (T/F) Word Search Anchor-and-Adjustment 24 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Word Search Processing Fluency 30 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Word Search Snack Preferences 25 0.000 0.880 0.880
Obvious (5-pt) × Obvious (T/F) Word Search Time Perception 27 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Time Perception Anchor-and-Adjustment 32 0.000 0.938 0.938
Obvious (5-pt) × Obvious (T/F) Time Perception Processing Fluency 21 0.000 1.000 1.000
Obvious (5-pt) × Obvious (T/F) Time Perception Snack Preferences 30 0.000 0.967 0.967
Obvious (5-pt) × Obvious (T/F) Time Perception Word Search 26 0.000 1.000 1.000
Obvious (5-pt) × Straightline Anchor-and-Adjustment Processing Fluency 29 0.000 0.966 0.966
Obvious (5-pt) × Straightline Anchor-and-Adjustment Snack Preferences 28 0.000 0.929 0.929
Obvious (5-pt) × Straightline Anchor-and-Adjustment Word Search 27 0.000 1.000 1.000
Obvious (5-pt) × Straightline Anchor-and-Adjustment Time Perception 26 0.000 1.000 1.000
Obvious (5-pt) × Straightline Processing Fluency Anchor-and-Adjustment 27 0.000 1.000 1.000
Obvious (5-pt) × Straightline Processing Fluency Snack Preferences 27 0.000 1.000 1.000
Obvious (5-pt) × Straightline Processing Fluency Word Search 25 0.000 1.000 1.000
Obvious (5-pt) × Straightline Processing Fluency Time Perception 23 0.000 1.000 1.000
Obvious (5-pt) × Straightline Snack Preferences Anchor-and-Adjustment 36 0.000 1.000 1.000
Obvious (5-pt) × Straightline Snack Preferences Processing Fluency 28 0.000 0.964 0.964
Obvious (5-pt) × Straightline Snack Preferences Word Search 20 0.000 0.900 0.900
Obvious (5-pt) × Straightline Snack Preferences Time Perception 32 0.000 1.000 1.000
Obvious (5-pt) × Straightline Word Search Anchor-and-Adjustment 27 0.000 1.000 1.000
Obvious (5-pt) × Straightline Word Search Processing Fluency 24 0.042 0.875 0.917
Obvious (5-pt) × Straightline Word Search Snack Preferences 27 0.000 1.000 1.000
Obvious (5-pt) × Straightline Word Search Time Perception 28 0.000 1.000 1.000
Obvious (5-pt) × Straightline Time Perception Anchor-and-Adjustment 26 0.000 1.000 1.000
Obvious (5-pt) × Straightline Time Perception Processing Fluency 20 0.000 1.000 1.000
Obvious (5-pt) × Straightline Time Perception Snack Preferences 30 0.000 1.000 1.000
Obvious (5-pt) × Straightline Time Perception Word Search 33 0.000 0.970 0.970
Obvious (T/F) × Straightline Anchor-and-Adjustment Processing Fluency 23 0.000 1.000 1.000
Obvious (T/F) × Straightline Anchor-and-Adjustment Snack Preferences 26 0.000 1.000 1.000
Obvious (T/F) × Straightline Anchor-and-Adjustment Word Search 28 0.000 0.929 0.929
Obvious (T/F) × Straightline Anchor-and-Adjustment Time Perception 23 0.000 1.000 1.000
Obvious (T/F) × Straightline Processing Fluency Anchor-and-Adjustment 31 0.000 0.968 0.968
Obvious (T/F) × Straightline Processing Fluency Snack Preferences 28 0.000 0.964 0.964
Obvious (T/F) × Straightline Processing Fluency Word Search 21 0.000 0.952 0.952
Obvious (T/F) × Straightline Processing Fluency Time Perception 25 0.000 1.000 1.000
Obvious (T/F) × Straightline Snack Preferences Anchor-and-Adjustment 27 0.000 0.963 0.963
Obvious (T/F) × Straightline Snack Preferences Processing Fluency 32 0.000 0.938 0.938
Obvious (T/F) × Straightline Snack Preferences Word Search 30 0.033 0.900 0.933
Obvious (T/F) × Straightline Snack Preferences Time Perception 30 0.000 0.967 0.967
Obvious (T/F) × Straightline Word Search Anchor-and-Adjustment 38 0.000 0.974 0.974
Obvious (T/F) × Straightline Word Search Processing Fluency 18 0.000 1.000 1.000
Obvious (T/F) × Straightline Word Search Snack Preferences 31 0.000 1.000 1.000
Obvious (T/F) × Straightline Word Search Time Perception 31 0.000 1.000 1.000
Obvious (T/F) × Straightline Time Perception Anchor-and-Adjustment 20 0.000 1.000 1.000
Obvious (T/F) × Straightline Time Perception Processing Fluency 28 0.000 0.929 0.929
Obvious (T/F) × Straightline Time Perception Snack Preferences 27 0.000 1.000 1.000
Obvious (T/F) × Straightline Time Perception Word Search 26 0.000 1.000 1.000

Prior exposure to each AC type

We also tested whether participants who reported prior exposure to a given AC type were less likely to fail it. In principle, prior exposure should improve a participant’s chance of passing that AC. This expectation was largely unsupported, with the possible exception of the Straightline check, where failers reported lower mean exposure than passers (2.33 vs. 4.05, although only 6 participants failed).

Two plausible explanations: (1) open-ended responses suggest some participants misunderstood what counts as an AC, introducing measurement error in the prior exposure variable; and (2) students received course credit regardless of AC performance, so prior exposure may not have translated into greater effort to pass.

Code
exposure_pairs <- list(
  ac_obvious_failed    = "exp_obvious",
  ac_obvious_tf_failed = "exp_obvious",
  ac_stline_failed     = "exp_straightline",
  ac_imc_failed        = "exp_imc",
  ac_eng_failed        = "exp_english"
)

exposure_data <- students_data |>
  select(id, starts_with("exp_")) |>
  left_join(ac_data, by = "id")

exposure_summary <- imap_dfr(exposure_pairs, function(exp_var, ac_var) {
  if (!exp_var %in% names(exposure_data)) return(NULL)
  exposure_data |>
    filter(!is.na(.data[[ac_var]]), !is.na(.data[[exp_var]])) |>
    group_by(Status = ifelse(.data[[ac_var]] == 1, "Failed", "Passed")) |>
    summarise(
      AC           = ac_labels[ac_var],
      `Exp. var`   = exp_var,
      N            = n(),
      `M exposure` = round(mean(.data[[exp_var]], na.rm = TRUE), 2),
      SD           = round(sd(.data[[exp_var]],   na.rm = TRUE), 2),
      .groups      = "drop"
    )
})

exposure_summary |>
  select(AC, `Exp. var`, Status, N, `M exposure`, SD) |>
  arrange(AC, Status) |>
  kbl(caption = "Mean prior exposure rating (1–5) by AC pass/fail status") |>
  kable_styling(full_width = FALSE) |>
  collapse_rows(columns = 1:2, valign = "middle")
Mean prior exposure rating (1–5) by AC pass/fail status
AC Exp. var Status N M exposure SD
Eng. Comprehension exp_english Failed 106 2.16 1.65
Eng. Comprehension exp_english Passed 437 1.93 1.44
IMC exp_imc Failed 94 2.29 1.67
IMC exp_imc Passed 449 2.11 1.50
Obvious (5-pt) exp_obvious Failed 5 4.80 0.45
Obvious (5-pt) exp_obvious Passed 538 3.30 1.58
Obvious (T/F) exp_obvious Failed 10 3.00 1.94
Obvious (T/F) exp_obvious Passed 533 3.32 1.57
Straightline exp_straightline Failed 6 2.33 1.21
Straightline exp_straightline Passed 537 4.05 1.39
Code
imap_dfr(exposure_pairs, function(exp_var, ac_var) {
  if (!exp_var %in% names(exposure_data)) return(NULL)
  exposure_data |>
    filter(!is.na(.data[[ac_var]]), !is.na(.data[[exp_var]])) |>
    mutate(Status = ifelse(.data[[ac_var]] == 1, "Failed", "Passed"),
           AC     = ac_labels[ac_var],
           exp    = .data[[exp_var]])
}) |>
  ggplot(aes(x = factor(exp), fill = Status)) +
  geom_bar(position = "fill") +
  facet_wrap(~ AC, nrow = 2) +
  scale_y_continuous(labels = percent_format()) +
  scale_fill_manual(values = c("Passed" = "#3B8BD4", "Failed" = "#D85A30")) +
  labs(x = "Prior exposure (1 = definitely not seen → 5 = definitely seen)",
       y = "Proportion", fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "top")

Proportion failing each AC at each prior exposure level. If prior exposure helps, the red (Failed) proportion should decrease at higher exposure ratings.
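
The descriptive comparison above could be complemented by a model-based check. The following is a minimal sketch, not the analysis used in this report: it fits a logistic regression of AC failure on the 1–5 exposure rating using simulated data. The data frame `toy`, its columns, and the data-generating values are all hypothetical.

```r
# Simulated data only: none of these values come from the study.
set.seed(1)
n <- 500
toy <- data.frame(
  exposure = sample(1:5, n, replace = TRUE)  # hypothetical 1-5 prior exposure rating
)
# Assumed data-generating process: the odds of failing fall as exposure rises.
toy$failed <- rbinom(n, size = 1, prob = plogis(-1 - 0.4 * (toy$exposure - 3)))

# Logistic regression of AC failure (1 = failed) on prior exposure.
fit <- glm(failed ~ exposure, data = toy, family = binomial)

# Odds ratio per one-point increase in exposure, with a Wald 95% CI.
est <- coef(summary(fit))["exposure", ]
odds_ratio <- exp(est[["Estimate"]])
ci <- exp(est[["Estimate"]] + c(-1, 1) * qnorm(0.975) * est[["Std. Error"]])

print(round(c(OR = odds_ratio, lower = ci[1], upper = ci[2]), 2))
```

An odds ratio below 1 would indicate that higher reported exposure goes with a lower chance of failing. For the checks with only 5 to 10 failures in this sample, such a model would be estimated very imprecisely, so it is best read as a supplement to the table rather than a replacement.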

Behavioral correlates of AC failure

As an additional validity check, we compare log-transformed response times and open-ended response length between AC passers and failers within each condition. Failers might show shorter times (rushing), longer times (distraction), or both, as well as shorter or lower-effort text responses.

Note

The violin plots are unlikely to be informative for checks with very few failures (Obvious 5-pt: 5 failures; Obvious T/F: 10; Straightline: 6).

Code
behav_data <- students_data |>
  select(id, treatment, t_wordsearch_page_submit, t_load_page_submit,
         total_time_sec, snack_preferences) |>
  mutate(
    condition      = ifelse(treatment == 1, "High", "Low"),
    log_wordsearch = log1p(t_wordsearch_page_submit),
    log_load       = log1p(t_load_page_submit),
    log_total_time = log1p(total_time_sec),
    snack_chars    = nchar(snack_preferences)
  ) |>
  left_join(ac_data, by = "id")

behav_vars <- c(
  "log_wordsearch" = "Log word search time",
  "log_load"       = "Log cognitive load time",
  "log_total_time" = "Log total time",
  "snack_chars"    = "Snack response length (chars)"
)

behav_long <- behav_data |>
  pivot_longer(all_of(ac_vars), names_to = "ac", values_to = "failed") |>
  mutate(AC = ac_labels[ac], Status = ifelse(failed == 1, "Failed", "Passed"))
Code
behav_long |>
  pivot_longer(all_of(names(behav_vars)), names_to = "measure", values_to = "value") |>
  mutate(measure = behav_vars[measure]) |>
  ggplot(aes(x = Status, y = value, fill = Status, color = Status)) +
  geom_violin(alpha = 0.25, trim = TRUE) +
  geom_jitter(alpha = 0.08, width = 0.15, size = 0.6) +
  stat_summary(fun.data = mean_cl_normal, geom = "pointrange",
               size = 0.6, color = "black") +
  facet_grid(measure ~ AC + condition, scales = "free_y") +
  scale_fill_manual(values  = c("Passed" = "#3B8BD4", "Failed" = "#D85A30")) +
  scale_color_manual(values = c("Passed" = "#3B8BD4", "Failed" = "#D85A30")) +
  labs(x = NULL, y = NULL) +
  theme_minimal(base_size = 10) +
  theme(legend.position = "none", strip.text = element_text(size = 8),
        axis.text.x = element_text(size = 8))

Behavioral measures by AC pass/fail status and condition. Points show means ± 95% CIs.
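
For a numerical companion to the figure, one option is a Welch t-test on the log-transformed times. The sketch below uses simulated data and is not part of the reported analysis: the data frame `toy`, the column `time_sec`, and the lognormal parameters are assumptions chosen only for illustration.

```r
# Simulated data only: group sizes and timing distributions are assumptions.
set.seed(2)
toy <- data.frame(
  failed   = rep(c(0, 1), times = c(200, 40)),
  time_sec = c(rlnorm(200, meanlog = 4.0, sdlog = 0.5),   # hypothetical passers
               rlnorm(40,  meanlog = 3.7, sdlog = 0.8))   # hypothetical failers
)

# Same transform as in the report: log1p(x) = log(1 + x), which is defined at x = 0.
toy$log_time <- log1p(toy$time_sec)
toy$Status   <- factor(ifelse(toy$failed == 1, "Failed", "Passed"),
                       levels = c("Passed", "Failed"))

# t.test() uses the Welch (unequal-variance) version by default.
welch       <- t.test(log_time ~ Status, data = toy)
group_means <- tapply(toy$log_time, toy$Status, mean)

print(round(group_means, 2))
print(welch)
```

A difference in means only captures a shift in one direction. If failers are a mix of rushers and distracted participants, comparing variances or quantiles may be more revealing than comparing means.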

Goal 3: Do effect sizes change depending on which AC is used for exclusion?

The core question is whether excluding participants based on different ACs yields meaningfully different effect size estimates for each JDM paradigm. For each paradigm × AC combination, we will compare: (a) the effect size in the full sample, (b) the effect size after excluding participants who failed that AC, and (c) how much the estimate changes relative to the full-sample estimate.

Paradigms and their primary outcomes:

  • Anchor-and-Adjustment: mean distance from anchor value (among anchored participants)
  • Processing Fluency: mean fluency rating for disfluent vs. fluent brand names
  • Word Search: words found (hard vs. easy condition)
  • Snack Preferences: snack choice composition (hope vs. pride condition)
  • Time Perception: shipping preference / perceived time duration (days vs. hours condition)
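
The three-part comparison described at the start of this section (full-sample effect, effect after exclusion, and the change between them) can be sketched as follows. This is an illustration on simulated data, assuming a continuous outcome and Cohen's d as the effect size; `cohens_d()`, `compare_exclusion()`, `ac_a_failed`, and `ac_b_failed` are hypothetical names, and the paradigm-specific analyses below use their own outcome summaries.

```r
# Cohen's d for a 0/1 condition indicator g: (mean_1 - mean_0) / pooled SD.
cohens_d <- function(y, g) {
  y1 <- y[g == 1]
  y0 <- y[g == 0]
  n1 <- length(y1)
  n0 <- length(y0)
  pooled_sd <- sqrt(((n1 - 1) * var(y1) + (n0 - 1) * var(y0)) / (n1 + n0 - 2))
  (mean(y1) - mean(y0)) / pooled_sd
}

# (a) full-sample d, (b) d after dropping AC failers, (c) change from (a) to (b).
compare_exclusion <- function(data, outcome, failed_col) {
  kept   <- data[data[[failed_col]] == 0, ]
  d_full <- cohens_d(data[[outcome]], data$treatment)
  d_excl <- cohens_d(kept[[outcome]], kept$treatment)
  data.frame(
    rule       = failed_col,
    n_full     = nrow(data),
    n_kept     = nrow(kept),
    d_full     = d_full,
    d_excluded = d_excl,
    change     = d_excl - d_full,
    pct_change = 100 * (d_excl - d_full) / abs(d_full)
  )
}

# Simulated data only: two hypothetical ACs with different failure rates.
set.seed(3)
toy <- data.frame(treatment = rep(0:1, each = 150))
toy$ac_a_failed <- rbinom(300, 1, 0.15)
toy$ac_b_failed <- rbinom(300, 1, 0.05)
# Assumption: only participants who pass AC "a" respond to the treatment.
toy$y <- rnorm(300) + 0.5 * toy$treatment * (1 - toy$ac_a_failed)

out <- do.call(rbind, lapply(c("ac_a_failed", "ac_b_failed"),
                             compare_exclusion, data = toy, outcome = "y"))
print(out)
```

Reporting both the absolute change and the percentage change is deliberate: when the full-sample effect is close to zero, the percentage change becomes unstable and the absolute change is easier to interpret.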

Anchoring effects

Among participants who correctly identified the anchor and reported thinking of it (anchor_x_yn == 1), we examine how closely estimates clustered near the anchor value. Higher cognitive load is expected to increase anchoring (i.e., estimates closer to the anchor).

Here is a breakdown of the number of participants retained (and excluded) under this criterion:

Code
anchors <- c(estimate_1 = 1776, estimate_2 = 365, estimate_3 = 9,
             estimate_4 = 212,  estimate_5 = 32,   estimate_6 = 50)

students_data |>
  select(id,
         estimate_1, anchor_1_yn, anchor_1,
         estimate_2, anchor_2_yn, anchor_2,
         estimate_3, anchor_3_yn, anchor_3,
         estimate_4, anchor_4_yn, anchor_4,
         estimate_5, anchor_5_yn, anchor_5,
         estimate_6, anchor_6_yn, anchor_6) |>
  pivot_longer(cols = -id, names_to = "variable", values_to = "value") |>
  mutate(
    number = str_extract(variable, "\\d+"),
    type   = case_when(
      str_starts(variable, "estimate") ~ "rating",
      str_ends(variable,   "_yn")      ~ "exposed",
      str_starts(variable, "anchor")   ~ "anchor_reported"
    )
  ) |>
  select(-variable) |>
  pivot_wider(id_cols = c(id, number),
              names_from = type, values_from = value) |>
  mutate(
    anchor_value  = anchors[paste0("estimate_", number)],
    correct_anchor = anchor_reported == anchor_value,
    retained       = exposed == 1 & correct_anchor,
    status = case_when(
      retained                          ~ "Retained (used correct anchor)",
      exposed == 1 & !correct_anchor    ~ "Used anchor, wrong value",
      exposed == 2                      ~ "Did not use anchor",
      exposed == 3                      ~ "Maybe used anchor",
      TRUE                              ~ "Other / missing"
    )
  ) |>
  group_by(Item = paste0("Estimate ", number,
                         " (anchor = ", anchor_value, ")"), status) |>
  summarise(n = n(), .groups = "drop") |>
  group_by(Item) |>
  mutate(
    total = sum(n),
    pct   = round(n / total * 100, 1)
  ) |>
  ungroup() |>
  mutate(cell = paste0(n, " (", pct, "%)")) |>
  select(Item, status, cell) |>
  pivot_wider(names_from = status, values_from = cell, values_fill = "0 (0%)") |>
  kbl(caption = paste0(
        "Retention by estimate. 'Retained' = participant reported using the ",
        "correct anchor. ",
        "N total per item = ", n_total, "."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11)
Retention by estimate. 'Retained' = participant reported using the correct anchor. N total per item = 543.
Item Did not use anchor Maybe used anchor Retained (used correct anchor) Used anchor, wrong value
Estimate 1 (anchor = 1776) 153 (28.2%) 44 (8.1%) 255 (47%) 91 (16.8%)
Estimate 2 (anchor = 365) 140 (25.8%) 29 (5.3%) 320 (58.9%) 54 (9.9%)
Estimate 3 (anchor = 9) 129 (23.8%) 27 (5%) 319 (58.7%) 68 (12.5%)
Estimate 4 (anchor = 212) 181 (33.3%) 48 (8.8%) 65 (12%) 249 (45.9%)
Estimate 5 (anchor = 32) 140 (25.8%) 48 (8.8%) 226 (41.6%) 129 (23.8%)
Estimate 6 (anchor = 50) 133 (24.5%) 39 (7.2%) 350 (64.5%) 21 (3.9%)
Note

Without any AC exclusions (but with the correct-anchor exclusions applied), I’m not confident that we replicated the original anchoring-and-adjustment effect. As the table above shows, many participants were excluded from analysis, especially for estimate 4, where only 65 of 543 participants (12%) were retained. The cognitive load manipulation was also operationalized differently from the original paper: in Epley and Gilovich (2006), the control group was not required to remember any letters, whereas in this study the control group received a low cognitive load manipulation.

Code
anchors <- c(estimate_1 = 1776, estimate_2 = 365, estimate_3 = 9,
             estimate_4 = 212,  estimate_5 = 32,   estimate_6 = 50)

anchor_summary <- function(data, exclude_ids = NULL, label = "No exclusion",
                           usage_filter = c(1)) {
  d <- if (!is.null(exclude_ids)) filter(data, !id %in% exclude_ids) else data
  d |>
    select(id, treatment,
           estimate_1, anchor_1_yn, anchor_1,
           estimate_2, anchor_2_yn, anchor_2,
           estimate_3, anchor_3_yn, anchor_3,
           estimate_4, anchor_4_yn, anchor_4,
           estimate_5, anchor_5_yn, anchor_5,
           estimate_6, anchor_6_yn, anchor_6) |>
    pivot_longer(cols = -c(id, treatment), names_to = "variable", values_to = "value") |>
    mutate(
      number = str_extract(variable, "\\d+"),
      type   = case_when(
        str_starts(variable, "estimate") ~ "rating",
        str_ends(variable,   "_yn")      ~ "exposed",
        str_starts(variable, "anchor")   ~ "anchor_reported"
      )
    ) |>
    select(-variable) |>
    pivot_wider(id_cols = c(id, treatment, number),
                names_from = type, values_from = value) |>
    mutate(anchor_value = anchors[paste0("estimate_", number)]) |>
    filter(anchor_reported == anchor_value, exposed %in% usage_filter) |>
    mutate(distance = abs(rating - anchor_value)) |>
    group_by(number, treatment) |>
    summarise(
      n               = n(),
      mean_rating     = round(mean(rating,    na.rm = TRUE), 2),
      median_rating   = round(median(rating,  na.rm = TRUE), 2),
      mean_distance   = round(mean(distance,  na.rm = TRUE), 2),
      median_distance = round(median(distance, na.rm = TRUE), 2),
      .groups         = "drop"
    ) |>
    mutate(
      anchor_value = anchors[paste0("estimate_", number)],
      Condition    = factor(treatment, levels = c(0, 1),
                            labels = c("Low-interference", "High-interference")),
      exclusion    = label
    )
}

exclusion_sets <- list(
  "No exclusion"         = NULL,
  "Excl: IMC"            = students_data |> filter(ac_imc_failed        == 1) |> pull(id),
  "Excl: Straightline"   = students_data |> filter(ac_stline_failed     == 1) |> pull(id),
  "Excl: Obvious (T/F)"  = students_data |> filter(ac_obvious_tf_failed == 1) |> pull(id),
  "Excl: Eng. Comp."     = students_data |> filter(ac_eng_failed        == 1) |> pull(id),
  "Excl: Obvious (5-pt)" = students_data |> filter(ac_obvious_failed    == 1) |> pull(id),
  "Excl: Any AC failed"  = students_data |> filter(ac_count_failed      >= 1) |> pull(id)
)

all_results <- imap_dfr(exclusion_sets, function(ids, label) {
  anchor_summary(students_data, exclude_ids = ids, label = label,
                 usage_filter = c(1))
})

all_results_maybe <- imap_dfr(exclusion_sets, function(ids, label) {
  anchor_summary(students_data, exclude_ids = ids, label = label,
                 usage_filter = c(1, 3))
})

all_results_any <- imap_dfr(exclusion_sets, function(ids, label) {
  anchor_summary(students_data, exclude_ids = ids, label = label,
                 usage_filter = c(1, 2, 3))
})

results <- all_results |>
  filter(exclusion == "No exclusion") |>
  mutate(number = paste("Estimate", number))

anchor_long <- students_data |>
  select(id, treatment,
         estimate_1, anchor_1_yn, anchor_1,
         estimate_2, anchor_2_yn, anchor_2,
         estimate_3, anchor_3_yn, anchor_3,
         estimate_4, anchor_4_yn, anchor_4,
         estimate_5, anchor_5_yn, anchor_5,
         estimate_6, anchor_6_yn, anchor_6) |>
  pivot_longer(cols = -c(id, treatment), names_to = "variable", values_to = "value") |>
  mutate(
    number = str_extract(variable, "\\d+"),
    type   = case_when(
      str_starts(variable, "estimate") ~ "rating",
      str_ends(variable,   "_yn")      ~ "exposed",
      str_starts(variable, "anchor")   ~ "anchor_reported"
    )
  ) |>
  select(-variable) |>
  pivot_wider(id_cols = c(id, treatment, number),
              names_from = type, values_from = value) |>
  mutate(anchor_value = anchors[paste0("estimate_", number)]) |>
  filter(exposed == 1, anchor_reported == anchor_value)
Code
all_results |>
  filter(exclusion == "No exclusion") |>
  select(Item = number, `Anchor value` = anchor_value, Condition, N = n,
         `Mean estimate` = mean_rating, `Median estimate` = median_rating,
         `Mean distance` = mean_distance, `Median distance` = median_distance) |>
  mutate(Item = paste("Estimate", Item)) |>
  kbl(caption = "Estimates and distance from anchor value by condition (correctly anchored participants only)",
      align = c("l", "r", "l", "r", "r", "r", "r", "r"), booktabs = TRUE) |>
  kable_styling(full_width = FALSE) |>
  collapse_rows(columns = 1:2, valign = "top")
Estimates and distance from anchor value by condition (correctly anchored participants only)
Item Anchor value Condition N Mean estimate Median estimate Mean distance Median distance
Estimate 1 1776 Low-interference 135 1775.26 1776 9.36 0
Estimate 1 1776 High-interference 120 1779.35 1776 4.35 0
Estimate 2 365 Low-interference 163 390.61 365 176.75 115
Estimate 2 365 High-interference 157 370.57 365 138.98 115
Estimate 3 9 Low-interference 175 11.31 10 3.86 3
Estimate 3 9 High-interference 144 11.29 9 3.97 3
Estimate 4 212 Low-interference 33 202.79 212 25.70 0
Estimate 4 212 High-interference 32 202.78 212 28.97 0
Estimate 5 32 Low-interference 105 -8.96 0 46.07 32
Estimate 5 32 High-interference 121 -0.35 0 33.60 32
Estimate 6 50 Low-interference 181 35.07 38 14.93 12
Estimate 6 50 High-interference 169 32.20 35 17.80 15

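Across AC exclusion rules, the condition difference varies most for estimate 2: the difference in mean distance (High − Low) ranges from −38.89 to −30.51, and the difference in median distance ranges from −1.0 to 19.5.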

Code
all_results |>
  select(exclusion, number, Condition, n, mean_distance, median_distance, anchor_value) |>
  pivot_wider(
    names_from  = Condition,
    values_from = c(n, mean_distance, median_distance)
  ) |>
  mutate(
    `Δ mean (High − Low)`   = round(`mean_distance_High-interference`   - `mean_distance_Low-interference`,   2),
    `Δ median (High − Low)` = round(`median_distance_High-interference` - `median_distance_Low-interference`, 2),
    Item = paste0("Est. ", number, " (anchor = ", anchor_value, ")")
  ) |>
  select(
    `Exclusion rule`     = exclusion,
    Item,
    `N (Low)`            = `n_Low-interference`,
    `N (High)`           = `n_High-interference`,
    `M dist (Low)`       = `mean_distance_Low-interference`,
    `Mdn dist (Low)`     = `median_distance_Low-interference`,
    `M dist (High)`      = `mean_distance_High-interference`,
    `Mdn dist (High)`    = `median_distance_High-interference`,
    `Δ mean (High − Low)`,
    `Δ median (High − Low)`
  ) |>
  arrange(Item, `Exclusion rule`) |>
  kbl(caption = paste0(
        "Mean and median distance from anchor by condition under each exclusion rule. ",
        "Negative Δ = High-interference closer to anchor (stronger anchoring). ",
        "Restricted to confirmed-anchor participants."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11) |>
  collapse_rows(columns = 1, valign = "top")
Mean and median distance from anchor by condition under each exclusion rule. Negative Δ = High-interference closer to anchor (stronger anchoring). Restricted to confirmed-anchor participants.
Exclusion rule Item N (Low) N (High) M dist (Low) Mdn dist (Low) M dist (High) Mdn dist (High) Δ mean (High − Low) Δ median (High − Low)
Excl: Any AC failed Est. 1 (anchor = 1776) 96 81 9.55 0.0 3.84 0.0 -5.71 0.0
Excl: Eng. Comp. Est. 1 (anchor = 1776) 114 96 9.26 0.0 3.85 0.0 -5.41 0.0
Excl: IMC Est. 1 (anchor = 1776) 118 106 9.46 0.0 4.25 0.0 -5.21 0.0
Excl: Obvious (5-pt) Est. 1 (anchor = 1776) 134 119 9.42 0.0 4.39 0.0 -5.03 0.0
Excl: Obvious (T/F) Est. 1 (anchor = 1776) 132 119 9.56 0.0 4.29 0.0 -5.27 0.0
Excl: Straightline Est. 1 (anchor = 1776) 135 120 9.36 0.0 4.35 0.0 -5.01 0.0
No exclusion Est. 1 (anchor = 1776) 135 120 9.36 0.0 4.35 0.0 -5.01 0.0
Excl: Any AC failed Est. 2 (anchor = 365) 118 101 180.70 101.5 143.89 121.0 -36.81 19.5
Excl: Eng. Comp. Est. 2 (anchor = 365) 142 124 170.82 100.0 136.23 116.5 -34.59 16.5
Excl: IMC Est. 2 (anchor = 365) 140 131 177.26 109.0 146.75 117.0 -30.51 8.0
Excl: Obvious (5-pt) Est. 2 (anchor = 365) 162 155 177.84 116.0 138.95 115.0 -38.89 -1.0
Excl: Obvious (T/F) Est. 2 (anchor = 365) 160 157 175.22 115.0 138.98 115.0 -36.24 0.0
Excl: Straightline Est. 2 (anchor = 365) 161 157 176.82 115.0 138.98 115.0 -37.84 0.0
No exclusion Est. 2 (anchor = 365) 163 157 176.75 115.0 138.98 115.0 -37.77 0.0
Excl: Any AC failed Est. 3 (anchor = 9) 126 92 4.01 3.0 4.01 3.0 0.00 0.0
Excl: Eng. Comp. Est. 3 (anchor = 9) 148 112 3.77 3.0 3.99 3.0 0.22 0.0
Excl: IMC Est. 3 (anchor = 9) 154 118 3.96 3.0 3.92 3.0 -0.04 0.0
Excl: Obvious (5-pt) Est. 3 (anchor = 9) 174 143 3.89 3.0 3.93 3.0 0.04 0.0
Excl: Obvious (T/F) Est. 3 (anchor = 9) 170 144 3.81 3.0 3.97 3.0 0.16 0.0
Excl: Straightline Est. 3 (anchor = 9) 173 144 3.91 3.0 3.97 3.0 0.06 0.0
No exclusion Est. 3 (anchor = 9) 175 144 3.86 3.0 3.97 3.0 0.11 0.0
Excl: Any AC failed Est. 4 (anchor = 212) 24 22 19.42 0.0 27.50 0.0 8.08 0.0
Excl: Eng. Comp. Est. 4 (anchor = 212) 29 26 25.79 0.0 27.96 0.0 2.17 0.0
Excl: IMC Est. 4 (anchor = 212) 29 27 20.83 0.0 29.81 0.0 8.98 0.0
Excl: Obvious (5-pt) Est. 4 (anchor = 212) 32 31 25.88 0.0 29.90 0.0 4.02 0.0
Excl: Obvious (T/F) Est. 4 (anchor = 212) 32 32 25.31 0.0 28.97 0.0 3.66 0.0
Excl: Straightline Est. 4 (anchor = 212) 32 32 25.88 0.0 28.97 0.0 3.09 0.0
No exclusion Est. 4 (anchor = 212) 33 32 25.70 0.0 28.97 0.0 3.27 0.0
Excl: Any AC failed Est. 5 (anchor = 32) 68 79 44.24 32.0 32.96 32.0 -11.28 0.0
Excl: Eng. Comp. Est. 5 (anchor = 32) 86 92 42.59 32.0 33.53 32.0 -9.06 0.0
Excl: IMC Est. 5 (anchor = 32) 87 106 47.98 32.0 34.29 32.0 -13.69 0.0
Excl: Obvious (5-pt) Est. 5 (anchor = 32) 103 119 46.34 32.0 33.06 32.0 -13.28 0.0
Excl: Obvious (T/F) Est. 5 (anchor = 32) 101 120 45.36 32.0 33.88 32.0 -11.48 0.0
Excl: Straightline Est. 5 (anchor = 32) 102 121 47.11 32.0 33.60 32.0 -13.51 0.0
No exclusion Est. 5 (anchor = 32) 105 121 46.07 32.0 33.60 32.0 -12.47 0.0
Excl: Any AC failed Est. 6 (anchor = 50) 127 112 15.18 10.0 17.82 15.0 2.64 5.0
Excl: Eng. Comp. Est. 6 (anchor = 50) 151 136 14.86 10.0 17.15 14.0 2.29 4.0
Excl: IMC Est. 6 (anchor = 50) 158 142 15.24 12.0 17.99 15.0 2.75 3.0
Excl: Obvious (5-pt) Est. 6 (anchor = 50) 179 168 15.07 12.0 17.80 15.0 2.73 3.0
Excl: Obvious (T/F) Est. 6 (anchor = 50) 177 168 14.87 12.0 17.91 15.0 3.04 3.0
Excl: Straightline Est. 6 (anchor = 50) 179 169 14.89 12.0 17.80 15.0 2.91 3.0
No exclusion Est. 6 (anchor = 50) 181 169 14.93 12.0 17.80 15.0 2.87 3.0
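
To make the sensitivity to exclusion rules easier to compare across items, the sketch below computes the spread (maximum minus minimum) of Δ mean distance for each item. The values are typed in by hand from the table above, in the table's row order; `delta_mean` and `spread` are names introduced here for illustration.

```r
# Δ mean (High − Low) per item, copied from the table above.
# Within each item, order is: Any AC failed, Eng. Comp., IMC, Obvious (5-pt),
# Obvious (T/F), Straightline, No exclusion.
delta_mean <- data.frame(
  item  = rep(paste("Est.", 1:6), each = 7),
  delta = c(
     -5.71,  -5.41,  -5.21,  -5.03,  -5.27,  -5.01,  -5.01,
    -36.81, -34.59, -30.51, -38.89, -36.24, -37.84, -37.77,
      0.00,   0.22,  -0.04,   0.04,   0.16,   0.06,   0.11,
      8.08,   2.17,   8.98,   4.02,   3.66,   3.09,   3.27,
    -11.28,  -9.06, -13.69, -13.28, -11.48, -13.51, -12.47,
      2.64,   2.29,   2.75,   2.73,   3.04,   2.91,   2.87
  )
)

# Spread of Δ across the seven exclusion rules, largest first.
spread <- tapply(delta_mean$delta, delta_mean$item, function(x) diff(range(x)))
spread <- sort(round(spread, 2), decreasing = TRUE)
print(spread)
```

Estimate 2 has the largest spread (8.38), followed by estimate 4 (6.81). Because the items are on very different numeric scales, this ranking holds in raw units only; a standardized measure could order the items differently.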
Code
all_results |>
  select(exclusion, number, Condition, median_distance, anchor_value) |>
  pivot_wider(names_from = Condition, values_from = median_distance) |>
  mutate(
    delta     = `High-interference` - `Low-interference`,
    Item      = paste0("Est. ", number, " (anchor = ", anchor_value, ")"),
    exclusion = factor(exclusion, levels = names(exclusion_sets))
  ) |>
  ggplot(aes(x = exclusion, y = delta, group = Item, color = Item)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_line(linewidth = 0.7, alpha = 0.7) +
  geom_point(size = 2.5) +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(
    x       = "Exclusion rule",
    y       = "Δ median distance (High − Low)",
    color   = "Item",
    caption = "Note. Negative values = High-interference closer to anchor (stronger anchoring). Each line traces one estimate item across exclusion rules."
  ) +
  theme_minimal(base_size = 11) +
  theme(
    legend.position  = "right",
    plot.caption     = element_text(hjust = 0, size = 9, color = "grey40"),
    panel.grid.minor = element_blank()
  )

How the anchoring effect (High − Low median distance) shifts under different AC exclusion rules. Flat lines indicate the effect is robust to exclusion choice; variable lines indicate sensitivity.

Scale Effects

This section examines the time perception paradigm (Siddiqui et al., 2017). Participants were shown a hypothetical shipping scenario in which delivery time was presented either in days or in hours (i.e., on a contracted vs. an expanded scale). We focus on two outcomes: (1) shipping preference (e_days_hours), indicating whether participants chose expedited or standard shipping, and (2) perceived duration (time_perception), a 0–100 slider capturing how long the wait time felt. Consistent with prior work, we would expect participants in the days (contracted scale; low-interference) condition to be more likely to choose standard shipping and to perceive the duration as shorter.

Note

Interestingly, without any AC-based exclusions, we do not replicate the original scale expansion/contraction effect on shipping choice. When applying different AC exclusion criteria, the pattern appears to reverse slightly, with more participants in the expanded scale (hours) condition opting for standard shipping. At the same time, participants in the expanded scale condition report longer perceived durations than those in the contracted scale condition, which is consistent with prior findings. One possible explanation is that features of the pricing structure (e.g., relative cost differences between expedited and standard shipping) may have attenuated or overridden the expected effect of scale framing on choice. Experimental stimuli should be revised before running future iterations of this study.

Code
scale_data <- students_data |>
  select(id, treatment, e_days_hours, time_perception) |>
  mutate(
    Condition = factor(treatment, levels = c(0, 1),
                       labels = c("Contracted Scale", "Expanded Scale"))
  )

# ── Shipping preference: proportion choosing standard shipping (value == 2) ──
shipping_summary <- scale_data |>
  filter(!is.na(e_days_hours)) |>
  group_by(Condition) |>
  summarise(
    N                   = n(),
    N_standard          = sum(e_days_hours == 2),
    N_expedited         = sum(e_days_hours == 1),
    `% standard (2)`    = round(N_standard  / N * 100, 1),
    `% expedited (1)`   = round(N_expedited / N * 100, 1),
    .groups = "drop"
  )

shipping_summary |>
  kbl(caption = "Shipping preference by condition (1 = Expedited, 2 = Standard)",
      booktabs = TRUE) |>
  kable_styling(full_width = FALSE)
Shipping preference by condition (1 = Expedited, 2 = Standard)
| Condition | N | N_standard | N_expedited | % standard (2) | % expedited (1) |
|---|---|---|---|---|---|
| Contracted Scale | 281 | 228 | 53 | 81.1 | 18.9 |
| Expanded Scale | 262 | 220 | 42 | 84.0 | 16.0 |
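
The report does not state an inferential test for this comparison. As a rough illustration of how the two proportions could be compared, here is a minimal base-R sketch that assumes a two-sample test of equal proportions is appropriate. The object names (n_total, n_standard, shipping_test) and the choice to switch off the continuity correction are assumptions of this sketch, not part of the original analysis.

```r
# Counts copied from the shipping-preference table above (no exclusions).
n_total    <- c(contracted = 281, expanded = 262)
n_standard <- c(contracted = 228, expanded = 220)

# Observed proportion choosing standard shipping in each condition.
p_standard <- n_standard / n_total

# Two-sample test of equal proportions (stats::prop.test, base R).
# correct = FALSE disables the Yates continuity correction; this is an
# assumption of the sketch, not a choice documented in the study.
shipping_test <- prop.test(x = n_standard, n = n_total, correct = FALSE)

# Percentages should match the table; the p-value is only a rough guide.
round(100 * p_standard, 1)
shipping_test$p.value
```

Working from the aggregated counts keeps the example self-contained. With the participant-level data, the same comparison could be run directly on scale_data.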
Code
# ── Perceived duration: mean and SD of time_perception slider ────────────────
perception_summary <- scale_data |>
  filter(!is.na(time_perception)) |>
  group_by(Condition) |>
  summarise(
    N    = n(),
    M    = round(mean(time_perception, na.rm = TRUE), 2),
    SD   = round(sd(time_perception,   na.rm = TRUE), 2),
    Mdn  = round(median(time_perception, na.rm = TRUE), 2),
    .groups = "drop"
  )

perception_summary |>
  kbl(caption = "Perceived duration (0 = very short, 100 = very long) by condition",
      booktabs = TRUE) |>
  kable_styling(full_width = FALSE)
Perceived duration (0 = very short, 100 = very long) by condition
| Condition | N | M | SD | Mdn |
|---|---|---|---|---|
| Contracted Scale | 281 | 43.62 | 25.38 | 42.0 |
| Expanded Scale | 262 | 46.15 | 25.79 | 44.5 |
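
To put the mean difference on an interpretable scale, the summary statistics in the table above are enough to compute a Welch t statistic and a standardized difference. The sketch below is illustrative only: it assumes the slider ratings can be treated as roughly interval-scaled, and every object name in it is hypothetical rather than taken from the original analysis.

```r
# Summary statistics copied from the perceived-duration table above (no exclusions).
m <- c(contracted = 43.62, expanded = 46.15)   # means
s <- c(contracted = 25.38, expanded = 25.79)   # standard deviations
n <- c(contracted = 281,   expanded = 262)     # group sizes

# Difference in means (Expanded - Contracted) and its Welch standard error.
delta_mean <- m[["expanded"]] - m[["contracted"]]
se_delta   <- sqrt(sum(s^2 / n))

# Welch t statistic with Welch-Satterthwaite degrees of freedom.
t_welch  <- delta_mean / se_delta
df_welch <- sum(s^2 / n)^2 / sum((s^2 / n)^2 / (n - 1))
p_welch  <- 2 * pt(-abs(t_welch), df = df_welch)

# Standardized difference (Cohen's d using the pooled SD).
sd_pooled <- sqrt(sum((n - 1) * s^2) / (sum(n) - 2))
cohens_d  <- delta_mean / sd_pooled

c(delta = delta_mean, t = t_welch, df = df_welch, p = p_welch, d = cohens_d)
```

Because the inputs are rounded to two decimals, the results will differ slightly from a test run on the raw time_perception values.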
Code
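# Summarise both scale outcomes by condition, optionally dropping excluded ids.
# `label` names the exclusion rule so results can be stacked across rules.
scale_summary <- function(data, exclude_ids = NULL, label = "No exclusion") {
  d <- if (!is.null(exclude_ids)) filter(data, !id %in% exclude_ids) else data
  d |>
    select(id, treatment, e_days_hours, time_perception) |>
    # Keep participants with at least one of the two outcomes; n counts these rows.
    filter(!is.na(e_days_hours) | !is.na(time_perception)) |>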
    mutate(Condition = factor(treatment, levels = c(0, 1),
                              labels = c("Contracted Scale", "Expanded Scale"))) |>
    group_by(Condition) |>
    summarise(
      n               = n(),
      pct_standard    = round(sum(e_days_hours == 2, na.rm = TRUE) / sum(!is.na(e_days_hours)) * 100, 1),
      pct_expedited   = round(sum(e_days_hours == 1, na.rm = TRUE) / sum(!is.na(e_days_hours)) * 100, 1),
      mean_perception = round(mean(time_perception,   na.rm = TRUE), 2),
      sd_perception   = round(sd(time_perception,     na.rm = TRUE), 2),
      mdn_perception  = round(median(time_perception, na.rm = TRUE), 2),
      .groups         = "drop"
    ) |>
    mutate(exclusion = label)
}

all_scale_results <- imap_dfr(exclusion_sets, function(ids, label) {
  scale_summary(students_data, exclude_ids = ids, label = label)
}) |>
  mutate(exclusion = factor(exclusion, levels = names(exclusion_sets)))

# ── Table: shipping preference ────────────────────────────────────────────────
all_scale_results |>
  select(exclusion, Condition, n, pct_standard, pct_expedited) |>
  pivot_wider(
    names_from  = Condition,
    values_from = c(n, pct_standard, pct_expedited)
  ) |>
  mutate(
    `Δ % standard (Expanded − Contracted)` = round(
      `pct_standard_Expanded Scale` - `pct_standard_Contracted Scale`, 1)
  ) |>
  select(
    `Exclusion rule`                       = exclusion,
    `N (Contracted)`                       = `n_Contracted Scale`,
    `N (Expanded)`                         = `n_Expanded Scale`,
    `% standard (Contracted)`              = `pct_standard_Contracted Scale`,
    `% standard (Expanded)`                = `pct_standard_Expanded Scale`,
    `Δ % standard (Expanded − Contracted)`
  ) |>
  kbl(caption = paste0(
        "Shipping preference (% choosing standard) by scale condition under each exclusion rule. ",
        "Positive Δ = higher rate of standard shipping in expanded scale condition."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11)
Shipping preference (% choosing standard) by scale condition under each exclusion rule. Positive Δ = higher rate of standard shipping in expanded scale condition.
| Exclusion rule | N (Contracted) | N (Expanded) | % standard (Contracted) | % standard (Expanded) | Δ % standard (Expanded − Contracted) |
|---|---|---|---|---|---|
| No exclusion | 281 | 262 | 81.1 | 84.0 | 2.9 |
| Excl: IMC | 237 | 212 | 83.1 | 85.4 | 2.3 |
| Excl: Straightline | 275 | 262 | 82.2 | 84.0 | 1.8 |
| Excl: Obvious (T/F) | 274 | 259 | 81.4 | 83.8 | 2.4 |
| Excl: Eng. Comp. | 231 | 206 | 79.7 | 83.5 | 3.8 |
| Excl: Obvious (5-pt) | 279 | 259 | 81.7 | 83.8 | 2.1 |
| Excl: Any AC failed | 188 | 161 | 81.9 | 83.9 | 2.0 |
Code
# ── Table: perceived duration ─────────────────────────────────────────────────
all_scale_results |>
  select(exclusion, Condition, n, mean_perception, sd_perception, mdn_perception) |>
  pivot_wider(
    names_from  = Condition,
    values_from = c(n, mean_perception, sd_perception, mdn_perception)
  ) |>
  mutate(
    `Δ mean (Expanded − Contracted)` = round(
      `mean_perception_Expanded Scale` - `mean_perception_Contracted Scale`, 2),
    `Δ mdn (Expanded − Contracted)`  = round(
      `mdn_perception_Expanded Scale`  - `mdn_perception_Contracted Scale`,  2)
  ) |>
  select(
    `Exclusion rule`                 = exclusion,
    `N (Contracted)`                 = `n_Contracted Scale`,
    `N (Expanded)`                   = `n_Expanded Scale`,
    `M (Contracted)`                 = `mean_perception_Contracted Scale`,
    `SD (Contracted)`                = `sd_perception_Contracted Scale`,
    `Mdn (Contracted)`               = `mdn_perception_Contracted Scale`,
    `M (Expanded)`                   = `mean_perception_Expanded Scale`,
    `SD (Expanded)`                  = `sd_perception_Expanded Scale`,
    `Mdn (Expanded)`                 = `mdn_perception_Expanded Scale`,
    `Δ mean (Expanded − Contracted)`,
    `Δ mdn (Expanded − Contracted)`
  ) |>
  kbl(caption = paste0(
        "Perceived duration (0–100 slider) by scale condition under each exclusion rule. ",
        "Positive Δ = expanded scale condition perceived wait as longer."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11) |>
  scroll_box(width = "100%")
Perceived duration (0–100 slider) by scale condition under each exclusion rule. Positive Δ = expanded scale condition perceived wait as longer.
| Exclusion rule | N (Contracted) | N (Expanded) | M (Contracted) | SD (Contracted) | Mdn (Contracted) | M (Expanded) | SD (Expanded) | Mdn (Expanded) | Δ mean (Expanded − Contracted) | Δ mdn (Expanded − Contracted) |
|---|---|---|---|---|---|---|---|---|---|---|
| No exclusion | 281 | 262 | 43.62 | 25.38 | 42.0 | 46.15 | 25.79 | 44.5 | 2.53 | 2.5 |
| Excl: IMC | 237 | 212 | 43.62 | 24.34 | 41.0 | 46.84 | 25.54 | 44.0 | 3.22 | 3.0 |
| Excl: Straightline | 275 | 262 | 43.58 | 25.38 | 41.0 | 46.15 | 25.79 | 44.5 | 2.57 | 3.5 |
| Excl: Obvious (T/F) | 274 | 259 | 43.54 | 25.23 | 42.5 | 46.16 | 25.70 | 44.0 | 2.62 | 1.5 |
| Excl: Eng. Comp. | 231 | 206 | 43.59 | 25.59 | 43.0 | 46.22 | 25.52 | 44.5 | 2.63 | 1.5 |
| Excl: Obvious (5-pt) | 279 | 259 | 43.70 | 25.30 | 42.0 | 46.40 | 25.75 | 45.0 | 2.70 | 3.0 |
| Excl: Any AC failed | 188 | 161 | 43.39 | 24.70 | 42.0 | 46.97 | 24.95 | 45.0 | 3.58 | 3.0 |
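
One simple way to gauge how much the exclusion rule matters is to check whether each Δ keeps its sign across rules and how wide its range is. The sketch below re-enters the Δ columns from the two tables above by hand. The data frame and object names are hypothetical, and this is a descriptive check rather than an inferential one.

```r
# Δ values copied from the shipping-preference and perceived-duration tables above.
delta_by_rule <- data.frame(
  exclusion = c("No exclusion", "Excl: IMC", "Excl: Straightline",
                "Excl: Obvious (T/F)", "Excl: Eng. Comp.",
                "Excl: Obvious (5-pt)", "Excl: Any AC failed"),
  delta_pct_standard    = c(2.9, 2.3, 1.8, 2.4, 3.8, 2.1, 2.0),
  delta_mean_perception = c(2.53, 3.22, 2.57, 2.62, 2.63, 2.70, 3.58)
)

# Does each Δ keep the same sign under every exclusion rule?
same_sign <- sapply(delta_by_rule[-1], function(x) all(x > 0) || all(x < 0))

# Smallest and largest Δ per outcome (rows: min, max).
delta_range <- sapply(delta_by_rule[-1], range)

same_sign
delta_range
```

In the rendered analysis the same quantities could be derived from all_scale_results instead of being typed in by hand.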
Code
# ── Plot: Δ % standard across exclusion rules ─────────────────────────────────
all_scale_results |>
  select(exclusion, Condition, pct_standard) |>
  pivot_wider(names_from = Condition, values_from = pct_standard) |>
  mutate(delta = `Expanded Scale` - `Contracted Scale`) |>
  ggplot(aes(x = exclusion, y = delta, group = 1)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_line(linewidth = 0.7, color = "#2E5F8A") +
  geom_point(size = 2.5, color = "#2E5F8A") +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(
    x       = "Exclusion rule",
    y       = "Δ % standard shipping (Expanded − Contracted)",
    caption = "Note. Positive values = higher rate of standard shipping in expanded scale condition."
  ) +
  theme_minimal(base_size = 11) +
  theme(
    plot.caption     = element_text(hjust = 0, size = 9, color = "grey40"),
    panel.grid.minor = element_blank()
  )

Code
# ── Plot: Δ median perception across exclusion rules ─────────────────────────
all_scale_results |>
  select(exclusion, Condition, mdn_perception) |>
  pivot_wider(names_from = Condition, values_from = mdn_perception) |>
  mutate(delta = `Expanded Scale` - `Contracted Scale`) |>
  ggplot(aes(x = exclusion, y = delta, group = 1)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_line(linewidth = 0.7, color = "#E07B3F") +
  geom_point(size = 2.5, color = "#E07B3F") +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(
    x       = "Exclusion rule",
    y       = "Δ median perceived duration (Expanded − Contracted)",
    caption = "Note. Positive values = expanded scale condition perceived wait as longer."
  ) +
  theme_minimal(base_size = 11) +
  theme(
    plot.caption     = element_text(hjust = 0, size = 9, color = "grey40"),
    panel.grid.minor = element_blank()
  )


Summary

[Key findings to be filled in once all analyses are complete.]

| Goal | Key Finding |
|---|---|
| 1. AC sensitivity to manipulations | |
| 2. AC convergence | |
| 3. Effect size sensitivity to AC choice | |