Within-Subjects Attention Check Study: Student Sample

Author

Jamie C. Lee

Published

April 10, 2026

Experimental Design

Overview

We conducted a within-subjects study in which undergraduate students completed five “JDM-esque” paradigms, each followed by an attention check (AC). The order of paradigm–AC pairs was fully randomized across participants. Participants were randomly assigned to a low-interference or high-interference condition, which determined the version of each paradigm they encountered (i.e., which manipulation they received). The median completion time was 12.1 minutes.

Because prior research suggests that student samples are among the most attentive participant pools (e.g., Krefeld-Schwalb et al., 2024), we embedded this study at the end of a required hour-long experimental session to increase the likelihood of fatigue-induced AC failures.

Study Goals

This study has three goals:

AC sensitivity to attentional manipulations — Do ACs pick up on the effects of experimentally induced attentional interference (e.g., does cognitive load increase AC failure rates)?
AC convergence — Do different ACs identify the same individuals as inattentive, and does any apparent (dis)agreement depend on which paradigm preceded each AC?
Downstream consequences — Do estimated treatment effects differ depending on which AC is used to exclude participants?

Paradigms and Manipulations

Each participant completed five JDM paradigms. In the high-interference condition, manipulations were designed to impair AC performance; in the low-interference condition, control versions imposed minimal interference. The conditions for two paradigms (Snack Preferences and Time Perception) were not expected to affect AC performance.

Paradigm	Citation	Manipulation (High vs. Low)	Expected AC Impact?
Anchor-and-Adjustment	Epley & Gilovich (2006)	High vs. low cognitive load	Yes, particularly IMC and English Comprehension
Processing Fluency	Alter & Oppenheimer (2006)	Disfluent vs. fluent font	Yes, particularly English Comprehension
Word Search	Andre	Hard vs. easy search	Yes, all ACs
Snack Preferences	Winterich & Haws (2011)	Hope vs. pride induction	No
Time Perception	Siddiqui et al. (2017)	Expanded vs. contracted scale	No

Attention Checks

AC	Item	Variable
IMC	“…To show you are paying attention…please click Other and write: reading is good.”	`ac_imc_failed`
Straightline	“For this item, select ‘not at all’.” (embedded in PANAS)	`ac_stline_failed`
Obvious (T/F)	“The sky is blue.” (True/False)	`ac_obvious_tf_failed`
Eng. Comprehension	“The trophy doesn’t fit…because it’s too small. What is too small?”	`ac_eng_failed`
Obvious (5-pt)	“How often do you breathe oxygen?” (Never–Always)	`ac_obvious_failed`

All checks except the Obvious (T/F) check were taken from Perfecto and O’Donnell (2025).

Sample

N = 543 undergraduate students from a large Midwestern university.

Illustration of experiment flow and paradigms:

Data Dictionary

Participant Info

Variable_Name	Variable_Type	Variable_Description	Value_Coding
id	numeric	Unique ID for identifying each participant	1 to 543
treatment	numeric	Binary indicator for condition assignment	0 = Low-interference, 1 = High-interference
gender	numeric	Participant's self-reported gender	1 = Female, 2 = Male, 3 = Non-Binary, 4 = Prefer not to say
birth_country	numeric	Participant's birth country	1 = United States, 2 = Other, 4 = Prefer not to say
first_lang	numeric	Participant's first language	1 = English, 2 = Other language, 4 = Prefer not to say

Attention Checks

Variable_Name	Variable_Type	Variable_Description	Value_Coding
ac_imc_failed	numeric	Imagine bringing home a new puppy to your family...To show you are paying attention to this one, please click Other and write: reading is good.	0 = Passed, 1 = Failed
ac_stline_failed	numeric	For this item, select 'not at all'. (Item embedded in PANAS scale.)	0 = Passed, 1 = Failed
ac_obvious_tf_failed	numeric	The sky is blue. (Options: True or False)	0 = Passed, 1 = Failed
ac_eng_failed	numeric	The trophy doesn't fit into the brown suitcase because it's too small. What is too small? (Text input)	0 = Passed, 1 = Failed
ac_obvious_failed	numeric	How often do you breathe oxygen? (5 options from 1: Never to 5: Always)	0 = Passed, 1 = Failed
ac_count_failed	numeric	Total number of ACs failed (out of 5)	0–5
exp_*	numeric	The extent to which the participant is certain that they have seen or have not seen a version of an AC question prior to taking this survey	1 = Definitely have not seen before today, 2 = May not have seen before today, 3 = Unsure if seen before today, 4 = May have seen before today, 5 = Definitely have seen before today
ex_*	text	Participant's recalled example(s) of similar ACs seen prior to survey (shown only if exp_* response was 4 or 5)

Task and AC Order Variables

Variable_Name	Variable_Type	Variable_Description	Value_Coding
task_[1-5]	character	JDM paradigm presented in that slot	a = Anchor-and-Adjustment, b = Processing Fluency, c = Word Search, d = Snack Preferences, e = Time Perception
ac_[1-5]	character	AC presented in that slot	a = IMC, b = Straightline, c = Obvious (T/F), d = English Comprehension, e = Obvious (5-pt)
task_order	text	Comma-separated order of JDM paradigms presented
ac_order	text	Comma-separated order of ACs presented
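The analysis code later in this report uses `task_before_a` through `task_before_e`, which are not listed in this dictionary. The sketch below shows one way such variables could be derived from `task_order` and `ac_order`, assuming the AC in slot k always follows the paradigm in slot k. The helper name `task_before()` and the example strings are hypothetical, not taken from the dataset.

```r
# Hypothetical helper: map each AC code (a-e) to the task code that preceded it.
# Assumes slot k of ac_order follows slot k of task_order.
task_before <- function(task_order, ac_order) {
  tasks <- trimws(strsplit(task_order, ",", fixed = TRUE)[[1]])
  acs   <- trimws(strsplit(ac_order,   ",", fixed = TRUE)[[1]])
  stopifnot(length(tasks) == length(acs))
  by_ac <- setNames(tasks, paste0("task_before_", acs))
  # Return in a fixed a-e order so the result lines up across participants
  by_ac[paste0("task_before_", letters[1:5])]
}

# Made-up orders for one illustrative participant
example <- task_before(task_order = "c,a,e,b,d", ac_order = "b,d,a,e,c")
example
```

In this made-up example the IMC (code a) sits in slot 3, so `task_before_a` is the task in slot 3 (code e, Time Perception).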

Anchor-and-Adjustment Variables

Variable_Name	Variable_Type	Variable_Description	Variable_Language_or_Value_Coding
estimate_[1-6]	numeric	Participant's estimate for each anchor-and-adjustment question	estimate_1 = Year Washington first elected; estimate_2 = Days for Mars orbit; estimate_3 = Months elephant pregnancy; estimate_4 = Boiling point of water at Everest summit (°F); estimate_5 = Freezing point of vodka (°F); estimate_6 = Number of U.S. states in 1880
anchor_[1-6]	numeric	Participant's (potential) anchor value relied on for each question	anchor_1 = Year U.S. declared independence; anchor_2 = Days for Earth orbit; anchor_3 = Months human pregnancy; anchor_4 = Boiling point of water at sea level (°F); anchor_5 = Freezing point of water (°F); anchor_6 = Current number of U.S. states
anchor_[1-6]_yn	numeric	Whether the participant reported relying on the anchor	1 = Yes, 2 = No, 3 = Maybe
load_removal	character	Cognitive load removal string recalled by participant	Low-interference: DK. High-interference: DKOUFWLVJ.
t_load_*	numeric	Timer variables for cognitive load manipulation page

Processing Fluency Variables

Variable_Name	Variable_Type	Variable_Description	Value_Coding
fluent_[1-10]	numeric	Participant's rating of how easy or difficult it would be to pronounce each hypothetical brand name	1 = Very easy, 2 = Relatively easy, 3 = Relatively difficult, 4 = Very difficult

Word Search Variables

Variable_Name	Variable_Type	Variable_Description	Value_Coding_and_Notes
words_found	numeric	Total number of words found (out of 5)	0–5
timing_words_found	text	Timing since start of task for each word found (in seconds)	Comma-separated; -1 = word not found
t_wordsearch_*	numeric	Timer variables for word search task

Snack Preferences Variables

Variable_Name	Variable_Type	Variable_Description	Value_Coding
emo_check_[1-4]	numeric	Manipulation check items
energy	numeric	Check for whether arousal was unintentionally manipulated by the scenarios	1 = Not at all emotionally aroused to 7 = Extremely emotionally aroused
snack_preferences	text	Participant's ideal snacks to hypothetically receive as a thank-you
t_emo_scenario_*	numeric	Timer variables for hope vs. pride (emotion) manipulation

Time Perception Variables

Variable_Name	Variable_Type	Variable_Description	Value_Coding
e_days_hours	numeric	Participant's preference for expedited or standard shipping when time is presented in either days or hours	1 = Expedited shipping, 2 = Standard shipping
time_perception	numeric	How long did the time period between now and standard shipping delivery feel?	0 (very short) to 100 (very long) slider

Goal 1: Are ACs sensitive to the manipulations?

If ACs are valid measures of attentiveness, failure rates should be higher in the high-interference condition — especially for paradigms designed to impair attention (cognitive load, processing fluency, and word search). Failure rates for paradigms not designed to impair attention (snack preferences and time perception) are not expected to differ by condition.

Overall failure rates by condition

Failure rates are substantially lower for the Obvious and Straightline checks than for the IMC and English Comprehension checks. Because pass rates on the former are near ceiling, analyzing variation within those checks is unlikely to be informative.

When aggregated across paradigms, there does not appear to be an obvious difference in failure rates between conditions. However, this aggregation may obscure meaningful variation: failure rates may depend jointly on condition (high- vs. low-interference) and the specific paradigm preceding each AC. The next subsection presents these disaggregated results.

Code

treatment_summary <- ac_data |>
  left_join(students_data |> select(id, treatment), by = "id") |>
  pivot_longer(all_of(ac_vars), names_to = "ac", values_to = "failed") |>
  mutate(
    AC        = ac_labels[ac],
    Condition = ifelse(treatment == 1, "High-interference", "Low-interference")
  ) |>
  group_by(AC, Condition) |>
  summarise(N = n(), N_failed = sum(failed, na.rm = TRUE),
            Pct = N_failed / N, .groups = "drop")

treatment_summary |>
  mutate(`% failed` = percent(Pct, accuracy = 0.1)) |>
  select(AC, Condition, N, N_failed, `% failed`) |>
  arrange(AC, Condition) |>
  kbl(caption = "AC failure rates by experimental condition") |>
  kable_styling(full_width = FALSE) |>
  collapse_rows(columns = 1, valign = "middle")

AC failure rates by experimental condition
AC	Condition	N	N_failed	% failed
Eng. Comprehension	High-interference	262	56	21.4%
Eng. Comprehension	Low-interference	281	50	17.8%
IMC	High-interference	262	50	19.1%
IMC	Low-interference	281	44	15.7%
Obvious (5-pt)	High-interference	262	3	1.1%
Obvious (5-pt)	Low-interference	281	2	0.7%
Obvious (T/F)	High-interference	262	3	1.1%
Obvious (T/F)	Low-interference	281	7	2.5%
Straightline	High-interference	262	0	0.0%
Straightline	Low-interference	281	6	2.1%
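To put a rough number on the aggregated comparison, the sketch below computes the high-minus-low difference in failure rates and a two-proportion test for the two checks with enough failures to analyze. The counts are copied from the table above; using `prop.test()` here is an illustrative choice, not necessarily the analysis the study will report.

```r
# Failure counts and group sizes copied from the table above
ac_counts <- data.frame(
  ac        = c("Eng. Comprehension", "IMC"),
  fail_high = c(56, 50), n_high = c(262, 262),
  fail_low  = c(50, 44), n_low  = c(281, 281)
)

# Difference in failure rates, in percentage points (High minus Low)
ac_counts$diff_pp <- with(ac_counts,
                          100 * (fail_high / n_high - fail_low / n_low))

# Two-sided two-proportion test per AC (continuity-corrected by default)
ac_counts$p_value <- mapply(
  function(fh, nh, fl, nl) prop.test(x = c(fh, fl), n = c(nh, nl))$p.value,
  ac_counts$fail_high, ac_counts$n_high, ac_counts$fail_low, ac_counts$n_low
)

ac_counts[, c("ac", "diff_pp", "p_value")]
```

Both differences are about 3.5 percentage points in the predicted direction; the p-values indicate whether that gap is distinguishable from sampling noise at this sample size.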

Code

treatment_summary |>
  ggplot(aes(x = reorder(AC, -Pct), y = Pct, fill = Condition)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  geom_text(
    aes(label = paste0(N_failed, "\n(", percent(Pct, accuracy = 0.1), ")")),
    position = position_dodge(width = 0.7),
    vjust = -0.3, size = 3
  ) +
  scale_y_continuous(labels = percent_format(), expand = expansion(mult = c(0, .15))) +
  scale_fill_manual(values = c("Low-interference" = "#B4B2A9", "High-interference" = "#639922")) +
  labs(x = NULL, y = "% failed", fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "top")

AC failure rates by condition. If the manipulations work, high-interference bars should be taller.

Failure rates by preceding task and condition

Because ACs were embedded within or immediately following paradigms, failure rates may vary depending on which task preceded the AC. The table below breaks down failure rates by both preceding task and condition, which lets us assess whether specific manipulations drove failures for specific ACs.

Although the sample sizes are small, the observed patterns are somewhat consistent with our predictions. The cognitive load manipulation (in the Anchor-and-Adjustment paradigm) and the word search difficulty manipulation appear to increase failure rates on both the IMC and English Comprehension checks. In contrast, the processing disfluency (vs. fluency) manipulation does not show a clear effect on either check. Notably, engaging with the fluency paradigm itself, regardless of condition, appears to reduce performance on the English Comprehension check more than on the IMC.

Code

ac_task_results <- bind_rows(
  students_data |> select(treatment, task_before = task_before_a, failure = ac_imc_failed)         |> mutate(ac = "IMC"),
  students_data |> select(treatment, task_before = task_before_b, failure = ac_stline_failed)      |> mutate(ac = "Straightline"),
  students_data |> select(treatment, task_before = task_before_c, failure = ac_obvious_tf_failed)  |> mutate(ac = "Obvious (T/F)"),
  students_data |> select(treatment, task_before = task_before_d, failure = ac_eng_failed)         |> mutate(ac = "Eng. Comprehension"),
  students_data |> select(treatment, task_before = task_before_e, failure = ac_obvious_failed)     |> mutate(ac = "Obvious (5-pt)")
) |>
  filter(!is.na(task_before))

ac_task_results |>
  group_by(ac, task_before, treatment) |>
  summarise(
    n         = n(),
    n_fail    = sum(failure == 1, na.rm = TRUE),
    fail_rate = round(mean(failure == 1, na.rm = TRUE) * 100, 1),
    .groups   = "drop"
  ) |>
  mutate(
    task_before = task_labels[task_before],
    treatment   = factor(treatment, labels = c("Low", "High"))
  ) |>
  pivot_wider(
    names_from  = treatment,
    values_from = c(n, n_fail, fail_rate),
    names_glue  = "{treatment}_{.value}"
  ) |>
  arrange(ac, task_before) |>
  select(ac, task_before,
         Low_n, Low_n_fail, Low_fail_rate,
         High_n, High_n_fail, High_fail_rate) |>
  rename(
    `Attention Check` = ac,
    `Preceding Task`  = task_before,
    `N`               = Low_n,   `N Failed`        = Low_n_fail,  `Failure Rate (%)` = Low_fail_rate,
    `N `              = High_n,  `N Failed `       = High_n_fail, `Failure Rate (%) ` = High_fail_rate
  ) |>
  kable(
    caption = "AC failure rates by preceding task and condition",
    align   = c("l", "l", "r", "r", "r", "r", "r", "r"),
    booktabs = TRUE
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, font_size = 12) |>
  collapse_rows(columns = 1, valign = "top") |>
  add_header_above(c(" " = 2, "Low-Interference" = 3, "High-Interference" = 3))

AC failure rates by preceding task and condition
		Low-Interference			High-Interference
Attention Check	Preceding Task	N	N Failed	Failure Rate (%)	N	N Failed	Failure Rate (%)
Eng. Comprehension	Anchor-and-Adjustment	61	8	13.1	49	10	20.4
	Processing Fluency	59	14	23.7	60	13	21.7
	Snack Preferences	60	12	20.0	45	12	26.7
	Time Perception	47	7	14.9	54	9	16.7
	Word Search	54	9	16.7	54	12	22.2
IMC	Anchor-and-Adjustment	50	7	14.0	57	12	21.1
	Processing Fluency	59	10	16.9	57	8	14.0
	Snack Preferences	50	9	18.0	41	6	14.6
	Time Perception	65	9	13.8	58	12	20.7
	Word Search	57	9	15.8	49	12	24.5
Obvious (5-pt)	Anchor-and-Adjustment	53	0	0.0	57	1	1.8
	Processing Fluency	54	0	0.0	48	0	0.0
	Snack Preferences	54	0	0.0	62	2	3.2
	Time Perception	62	0	0.0	47	0	0.0
	Word Search	58	2	3.4	48	0	0.0
Obvious (T/F)	Anchor-and-Adjustment	49	2	4.1	51	0	0.0
	Processing Fluency	56	1	1.8	49	1	2.0
	Snack Preferences	59	3	5.1	60	2	3.3
	Time Perception	56	0	0.0	45	0	0.0
	Word Search	61	1	1.6	57	0	0.0
Straightline	Anchor-and-Adjustment	68	0	0.0	48	0	0.0
	Processing Fluency	53	4	7.5	48	0	0.0
	Snack Preferences	58	1	1.7	54	0	0.0
	Time Perception	51	0	0.0	58	0	0.0
	Word Search	51	1	2.0	54	0	0.0
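One way to ask whether the condition effect differs by preceding task is a logistic regression with a condition × task interaction. The sketch below fits this to the IMC rows of the table above, entered as cell counts. It is a minimal illustration with Snack Preferences chosen as the reference task (an assumption), and with only ten cells the interaction model is saturated, so it should be read as descriptive.

```r
# IMC cell counts copied from the table above (Low rows first, then High)
tasks <- c("Anchor-and-Adjustment", "Processing Fluency", "Snack Preferences",
           "Time Perception", "Word Search")
imc <- data.frame(
  task      = rep(tasks, times = 2),
  condition = rep(c("Low", "High"), each = 5),
  n         = c(50, 59, 50, 65, 57,   57, 57, 41, 58, 49),
  n_fail    = c( 7, 10,  9,  9,  9,   12,  8,  6, 12, 12)
)
imc$condition <- factor(imc$condition, levels = c("Low", "High"))
imc$task      <- relevel(factor(imc$task), ref = "Snack Preferences")

# Main-effects model vs. model that lets the condition effect vary by task
fit_main <- glm(cbind(n_fail, n - n_fail) ~ condition + task,
                family = binomial, data = imc)
fit_int  <- glm(cbind(n_fail, n - n_fail) ~ condition * task,
                family = binomial, data = imc)

# Likelihood-ratio test of the interaction terms
lr <- anova(fit_main, fit_int, test = "Chisq")
lr
```

With participant-level data, the same formula could be fit to `failure ~ treatment * task_before` directly.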

Goal 2: Do different ACs flag the same participants?

A key question is whether ACs converge on a shared set of inattentive participants or whether they each identify largely non-overlapping subgroups. Because ACs were randomized across positions and paradigms, we also examine whether apparent (dis)agreement between ACs is explained by which task preceded each AC.

Overall failure rates

Code

ac_data |>
  summarise(across(all_of(ac_vars), \(x) sum(x, na.rm = TRUE))) |>
  pivot_longer(everything(), names_to = "ac", values_to = "n_failed") |>
  mutate(
    AC         = ac_labels[ac],
    `N failed` = n_failed,
    `% failed` = percent(n_failed / n_total, accuracy = 0.1)
  ) |>
  arrange(desc(n_failed)) |>
  select(AC, `N failed`, `% failed`) |>
  kbl(caption = paste0("AC failure counts and rates (N = ", n_total, ")")) |>
  kable_styling(full_width = FALSE)

AC failure counts and rates (N = 543)
AC	N failed	% failed
Eng. Comprehension	106	19.5%
IMC	94	17.3%
Obvious (T/F)	10	1.8%
Straightline	6	1.1%
Obvious (5-pt)	5	0.9%

Pairwise overlap

For each pair of ACs, the cell shows how many participants (n, %) who failed the row check also failed the column check. The percentages are conditional on failing the row check, so the table is not symmetric. Low values indicate the checks are flagging largely non-overlapping subsets.

Code

overlap_mat_n   <- matrix(NA_real_, 5, 5, dimnames = list(ac_labels, ac_labels))
overlap_mat_pct <- matrix(NA_real_, 5, 5, dimnames = list(ac_labels, ac_labels))

for (i in seq_along(ac_vars)) {
  for (j in seq_along(ac_vars)) {
    if (i == j) next
    fail_i <- ac_data[[ac_vars[i]]] == 1 & !is.na(ac_data[[ac_vars[i]]])
    fail_j <- ac_data[[ac_vars[j]]] == 1 & !is.na(ac_data[[ac_vars[j]]])
    overlap_mat_n[i, j]   <- sum(fail_i & fail_j, na.rm = TRUE)
    overlap_mat_pct[i, j] <- mean(fail_j[fail_i], na.rm = TRUE)
  }
}

overlap_combined <- matrix(
  ifelse(is.na(overlap_mat_n), "—",
         paste0(overlap_mat_n, " (", round(overlap_mat_pct * 100, 1), "%)")),
  nrow = 5, dimnames = dimnames(overlap_mat_n)
)

as.data.frame(overlap_combined) |>
  kbl(caption = "Row: n (%) of that check's failers who *also* failed the column check") |>
  kable_styling(full_width = FALSE, font_size = 12)

Row: n (%) of that check's failers who *also* failed the column check
	IMC	Straightline	Obvious (T/F)	Eng. Comprehension	Obvious (5-pt)
IMC	—	4 (4.3%)	2 (2.1%)	15 (16%)	3 (3.2%)
Straightline	4 (66.7%)	—	1 (16.7%)	2 (33.3%)	1 (16.7%)
Obvious (T/F)	2 (20%)	1 (10%)	—	3 (30%)	0 (0%)
Eng. Comprehension	15 (14.2%)	2 (1.9%)	3 (2.8%)	—	0 (0%)
Obvious (5-pt)	3 (60%)	1 (20%)	0 (0%)	0 (0%)	—
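As a worked check on how to read the table, the sketch below reproduces the two IMC and English Comprehension cells from the failure counts reported above, then compares the observed overlap with the overlap expected if the two checks failed independently. The independence benchmark is an added illustration, not part of the original analysis.

```r
# Counts taken from the tables above
n_total <- 543   # participants
n_imc   <- 94    # failed IMC
n_eng   <- 106   # failed Eng. Comprehension
n_both  <- 15    # failed both

# Conditional overlap: the denominator is the row check's failers
pct_imc_also_eng <- round(100 * n_both / n_imc, 1)  # row IMC, column Eng.
pct_eng_also_imc <- round(100 * n_both / n_eng, 1)  # row Eng., column IMC

# Joint failers expected if the two checks were statistically independent
expected_both <- n_imc * n_eng / n_total

c(pct_imc_also_eng = pct_imc_also_eng,
  pct_eng_also_imc = pct_eng_also_imc,
  expected_both    = round(expected_both, 1))
```

The observed 15 joint failers is slightly below the roughly 18 expected under independence, which is consistent with the two checks flagging largely unrelated sets of participants.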

UpSet plot: Which combination of ACs flags each participant?

Each bar in the UpSet plot represents a unique combination of ACs failed. If single-check bars dominate, the checks are largely flagging different individuals.

Code

upset_data <- ac_data |>
  select(all_of(ac_vars)) |>
  rename_with(\(x) ac_labels[x]) |>
  as.data.frame()

upset(
  upset_data,
  sets            = rev(ac_labels),
  order.by        = "freq",
  decreasing      = TRUE,
  mb.ratio        = c(0.6, 0.4),
  text.scale      = c(1.3, 1.1, 1, 1, 1.1, 1),
  point.size      = 2.8,
  line.size       = 0.8,
  mainbar.y.label = "Intersection size",
  sets.x.label    = "N failed",
  keep.order      = FALSE
)

UpSet plot of AC failure combinations. Bars dominated by single-AC intersections indicate low convergence across checks.

Jaccard similarity heatmap

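Jaccard similarity is the number of participants who failed both checks divided by the number who failed at least one. It ranges from 0 (no shared failers) to 1 (identical failer sets). Values near 0 indicate the checks are identifying mostly different participants.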

Code

jaccard <- function(a, b) {
  a <- a == 1 & !is.na(a); b <- b == 1 & !is.na(b)
  sum(a & b) / sum(a | b)
}

jac_mat <- matrix(NA_real_, 5, 5, dimnames = list(ac_labels, ac_labels))
for (i in seq_along(ac_vars))
  for (j in seq_along(ac_vars))
    jac_mat[i, j] <- jaccard(ac_data[[ac_vars[i]]], ac_data[[ac_vars[j]]])

pheatmap(
  jac_mat,
  color           = colorRampPalette(c("#F1EFE8", "#639922"))(50),
  display_numbers = TRUE,
  number_format   = "%.2f",
  cluster_rows    = TRUE,
  cluster_cols    = TRUE,
  fontsize        = 10,
  main            = "Jaccard similarity between AC failer sets",
  angle_col       = 45
)

Pairwise Jaccard similarity between AC failer sets.
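The Jaccard index can also be computed directly from counts, since the size of the union equals the two failer counts minus the overlap. The sketch below applies this to counts from the failure and overlap tables above; `jaccard_from_counts()` is a hypothetical helper name.

```r
# Jaccard from counts: |A and B| / (|A| + |B| - |A and B|)
jaccard_from_counts <- function(n_a, n_b, n_both) {
  n_both / (n_a + n_b - n_both)
}

# IMC (94 failers) vs. Eng. Comprehension (106 failers), 15 shared
jac_imc_eng <- jaccard_from_counts(94, 106, 15)   # 15 / 185

# IMC (94 failers) vs. Straightline (6 failers), 4 shared
jac_imc_stl <- jaccard_from_counts(94, 6, 4)      # 4 / 96

round(c(imc_eng = jac_imc_eng, imc_straightline = jac_imc_stl), 3)
```

Note that Straightline failers mostly also failed the IMC (4 of 6), yet the Jaccard value stays small because the IMC failer set is much larger; the index penalizes unequal set sizes.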

Does preceding paradigm explain the (dis)agreement between ACs?

Because ACs were randomized, each AC type was preceded by different paradigms for different participants. This means apparent non-overlap between ACs could partly reflect different task contexts rather than the ACs measuring different constructs. The analyses below assess the severity of this confound.

Step 1: Balance check — how evenly was each task distributed as the preceding task?

If the randomization worked, each AC should have been approximately equally preceded by each of the five paradigms. Imbalance here would mean some AC pairs are more confounded than others.

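Randomization appears balanced: each paradigm preceded each AC for 16.8% to 22.7% of participants, close to the 20% expected under uniform assignment.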

Code

ac_task_map <- list(
  IMC          = "task_before_a",
  Straightline = "task_before_b",
  `Obvious (T/F)` = "task_before_c",
  `Eng. Comprehension` = "task_before_d",
  `Obvious (5-pt)` = "task_before_e"
)

balance_table <- map_dfr(names(ac_task_map), function(ac_name) {
  students_data |>
    count(.data[[ac_task_map[[ac_name]]]], name = "n") |>
    rename(task_before = 1) |>
    mutate(
      ac         = ac_name,
      pct        = round(n / sum(n) * 100, 1),
      task_before = task_labels[task_before]
    )
})

balance_table |>
  pivot_wider(names_from = ac, values_from = c(n, pct),
              names_glue = "{ac}_{.value}") |>
  rename(`Preceding Task` = task_before) |>
  kbl(caption = "Distribution of preceding tasks per AC (n and %)") |>
  kable_styling(full_width = FALSE, font_size = 11)

Distribution of preceding tasks per AC (n and %)
Preceding Task	IMC_n	Straightline_n	Obvious (T/F)_n	Eng. Comprehension_n	Obvious (5-pt)_n	IMC_pct	Straightline_pct	Obvious (T/F)_pct	Eng. Comprehension_pct	Obvious (5-pt)_pct
Anchor-and-Adjustment	107	116	100	110	110	19.7	21.4	18.4	20.3	20.3
Processing Fluency	116	101	105	119	102	21.4	18.6	19.3	21.9	18.8
Snack Preferences	91	112	119	105	116	16.8	20.6	21.9	19.3	21.4
Word Search	106	105	118	108	106	19.5	19.3	21.7	19.9	19.5
Time Perception	123	109	101	101	109	22.7	20.1	18.6	18.6	20.1
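A more formal balance check is a chi-square goodness-of-fit test of each column of counts against a uniform distribution over the five paradigms. The sketch below uses the counts from the table above; it is an illustrative addition and assumes each participant contributes one preceding task per AC.

```r
# Counts copied from the table above. Columns are preceding tasks in table order:
# Anchor-and-Adjustment, Processing Fluency, Snack Preferences,
# Word Search, Time Perception
balance <- rbind(
  IMC                  = c(107, 116,  91, 106, 123),
  Straightline         = c(116, 101, 112, 105, 109),
  `Obvious (T/F)`      = c(100, 105, 119, 118, 101),
  `Eng. Comprehension` = c(110, 119, 105, 108, 101),
  `Obvious (5-pt)`     = c(110, 102, 116, 106, 109)
)

# chisq.test() on a vector tests against equal probabilities by default
gof <- t(apply(balance, 1, function(x) {
  res <- chisq.test(x)
  c(statistic = unname(res$statistic), p_value = res$p.value)
}))

round(gof, 3)
```

Small p-values would flag an AC whose preceding tasks were unevenly distributed; five tests are run here, so a correction for multiple comparisons may be appropriate.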

Step 2: Joint overlap — does AC agreement vary by preceding paradigms?

Note

I am not sure this is the most appropriate analysis. The aim is to see whether the low overlap between participants flagged by the IMC and those flagged by the English Comprehension check can be explained by participants seeing different paradigms before each check, since some paradigms may affect the likelihood of failing particular checks more than others.

To assess whether paradigm context influences which participants are flagged by different ACs, we compute agreement rates while conditioning on the paradigms that preceded both ACs in each pair. Specifically, for each AC pair, we calculate the proportion of participants for whom both ACs yield the same pass/fail classification within each combination of the paradigm preceding AC1 and the paradigm preceding AC2.

If agreement varies substantially across rows, this suggests that paradigm context affects which participants are jointly flagged. On the other hand, if agreement for an AC pair is relatively stable (and low) across paradigm pairs, this suggests that differences in preceding paradigms do not meaningfully explain the low overlap across ACs.
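Because most participants pass both checks, raw agreement is dominated by joint passes and can look high even when joint failures are rare. A chance-corrected index such as Cohen's kappa may therefore be a useful complement. The sketch below computes it for the IMC and English Comprehension pair from the overall counts reported earlier, assuming no missing AC responses; this is an illustrative addition, not part of the original analysis.

```r
# Overall counts from the tables above (assumes no missing AC responses)
n_total     <- 543
n_imc       <- 94    # failed IMC
n_eng       <- 106   # failed Eng. Comprehension
n_both_fail <- 15    # failed both

# Passed both = total minus everyone who failed at least one check
n_both_pass <- n_total - (n_imc + n_eng - n_both_fail)

# Observed agreement: same pass/fail classification on both checks
p_agree <- (n_both_fail + n_both_pass) / n_total

# Agreement expected by chance, given each check's marginal failure rate
p_imc <- n_imc / n_total
p_eng <- n_eng / n_total
p_chance <- p_imc * p_eng + (1 - p_imc) * (1 - p_eng)

# Cohen's kappa: agreement beyond chance, scaled by the maximum possible
cohen_kappa <- (p_agree - p_chance) / (1 - p_chance)

round(c(p_agree = p_agree, p_chance = p_chance, kappa = cohen_kappa), 3)
```

Observed agreement of about 0.69 is almost exactly what chance alone would produce, so kappa is close to zero (slightly negative). The same calculation could be applied within each stratum of the table below, although the per-stratum sample sizes are small.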

Code

joint_overlap <- map_dfr(ac_pairs, function(pair) {
  ac1 <- pair[1]; tb1 <- pair[2]; ac2 <- pair[3]; tb2 <- pair[4]
  
  d <- students_data |>
    select(ac1_val = all_of(ac1), ac2_val = all_of(ac2),
           tb1_val = all_of(tb1), tb2_val = all_of(tb2)) |>
    drop_na()
  
  d |>
    group_by(task_before_ac1 = tb1_val,
             task_before_ac2 = tb2_val) |>
    summarise(
      n         = n(),
      both_fail = round(mean(ac1_val == 1 & ac2_val == 1), 3),
      both_pass = round(mean(ac1_val == 0 & ac2_val == 0), 3),
      agree     = round(mean(ac1_val == ac2_val), 3),
      .groups   = "drop"
    ) |>
    mutate(
      ac_pair = paste(ac_labels[ac1], "×", ac_labels[ac2]),
      task_before_ac1 = task_labels[task_before_ac1],
      task_before_ac2 = task_labels[task_before_ac2]
    )
}) |>
  select(ac_pair, task_before_ac1, task_before_ac2, n, both_fail, both_pass, agree)

joint_overlap |>
  rename(
    `AC Pair` = ac_pair,
    `Preceding Paradigm (AC1)` = task_before_ac1,
    `Preceding Paradigm (AC2)` = task_before_ac2,
    N = n,
    `P(both fail)` = both_fail,
    `P(both pass)` = both_pass,
    `P(agree)` = agree
  ) |>
  kbl(caption = "Pairwise AC agreement stratified by preceding paradigms for both ACs") |>
  kable_styling(full_width = FALSE, font_size = 11)

Pairwise AC agreement stratified by preceding paradigms for both ACs
AC Pair	Preceding Paradigm (AC1)	Preceding Paradigm (AC2)	N	P(both fail)	P(both pass)	P(agree)
IMC × Eng. Comprehension	Anchor-and-Adjustment	Processing Fluency	27	0.037	0.630	0.667
IMC × Eng. Comprehension	Anchor-and-Adjustment	Snack Preferences	24	0.125	0.542	0.667
IMC × Eng. Comprehension	Anchor-and-Adjustment	Word Search	27	0.037	0.667	0.704
IMC × Eng. Comprehension	Anchor-and-Adjustment	Time Perception	29	0.034	0.552	0.586
IMC × Eng. Comprehension	Processing Fluency	Anchor-and-Adjustment	33	0.000	0.515	0.515
IMC × Eng. Comprehension	Processing Fluency	Snack Preferences	35	0.029	0.829	0.857
IMC × Eng. Comprehension	Processing Fluency	Word Search	26	0.000	0.769	0.769
IMC × Eng. Comprehension	Processing Fluency	Time Perception	22	0.000	0.727	0.727
IMC × Eng. Comprehension	Snack Preferences	Anchor-and-Adjustment	22	0.000	0.591	0.591
IMC × Eng. Comprehension	Snack Preferences	Processing Fluency	30	0.000	0.733	0.733
IMC × Eng. Comprehension	Snack Preferences	Word Search	20	0.000	0.500	0.500
IMC × Eng. Comprehension	Snack Preferences	Time Perception	19	0.000	0.842	0.842
IMC × Eng. Comprehension	Word Search	Anchor-and-Adjustment	32	0.062	0.750	0.812
IMC × Eng. Comprehension	Word Search	Processing Fluency	27	0.037	0.741	0.778
IMC × Eng. Comprehension	Word Search	Snack Preferences	16	0.000	0.500	0.500
IMC × Eng. Comprehension	Word Search	Time Perception	31	0.000	0.645	0.645
IMC × Eng. Comprehension	Time Perception	Anchor-and-Adjustment	23	0.000	0.696	0.696
IMC × Eng. Comprehension	Time Perception	Processing Fluency	35	0.057	0.600	0.657
IMC × Eng. Comprehension	Time Perception	Snack Preferences	30	0.033	0.667	0.700
IMC × Eng. Comprehension	Time Perception	Word Search	35	0.057	0.629	0.686
IMC × Obvious (5-pt)	Anchor-and-Adjustment	Processing Fluency	35	0.000	0.857	0.857
IMC × Obvious (5-pt)	Anchor-and-Adjustment	Snack Preferences	24	0.000	0.875	0.875
IMC × Obvious (5-pt)	Anchor-and-Adjustment	Word Search	25	0.040	0.800	0.840
IMC × Obvious (5-pt)	Anchor-and-Adjustment	Time Perception	23	0.000	0.739	0.739
IMC × Obvious (5-pt)	Processing Fluency	Anchor-and-Adjustment	24	0.000	0.875	0.875
IMC × Obvious (5-pt)	Processing Fluency	Snack Preferences	26	0.038	0.808	0.846
IMC × Obvious (5-pt)	Processing Fluency	Word Search	25	0.000	0.840	0.840
IMC × Obvious (5-pt)	Processing Fluency	Time Perception	41	0.000	0.829	0.829
IMC × Obvious (5-pt)	Snack Preferences	Anchor-and-Adjustment	27	0.000	0.852	0.852
IMC × Obvious (5-pt)	Snack Preferences	Processing Fluency	19	0.000	0.684	0.684
IMC × Obvious (5-pt)	Snack Preferences	Word Search	24	0.000	0.958	0.958
IMC × Obvious (5-pt)	Snack Preferences	Time Perception	21	0.000	0.810	0.810
IMC × Obvious (5-pt)	Word Search	Anchor-and-Adjustment	26	0.000	0.808	0.808
IMC × Obvious (5-pt)	Word Search	Processing Fluency	23	0.000	0.826	0.826
IMC × Obvious (5-pt)	Word Search	Snack Preferences	33	0.000	0.758	0.758
IMC × Obvious (5-pt)	Word Search	Time Perception	24	0.000	0.833	0.833
IMC × Obvious (5-pt)	Time Perception	Anchor-and-Adjustment	33	0.000	0.848	0.848
IMC × Obvious (5-pt)	Time Perception	Processing Fluency	25	0.000	0.880	0.880
IMC × Obvious (5-pt)	Time Perception	Snack Preferences	33	0.030	0.788	0.818
IMC × Obvious (5-pt)	Time Perception	Word Search	32	0.000	0.781	0.781
IMC × Obvious (T/F)	Anchor-and-Adjustment	Processing Fluency	22	0.000	0.773	0.773
IMC × Obvious (T/F)	Anchor-and-Adjustment	Snack Preferences	30	0.000	0.767	0.767
IMC × Obvious (T/F)	Anchor-and-Adjustment	Word Search	23	0.000	0.826	0.826
IMC × Obvious (T/F)	Anchor-and-Adjustment	Time Perception	32	0.000	0.906	0.906
IMC × Obvious (T/F)	Processing Fluency	Anchor-and-Adjustment	32	0.000	0.844	0.844
IMC × Obvious (T/F)	Processing Fluency	Snack Preferences	32	0.031	0.719	0.750
IMC × Obvious (T/F)	Processing Fluency	Word Search	33	0.000	0.848	0.848
IMC × Obvious (T/F)	Processing Fluency	Time Perception	19	0.000	0.842	0.842
IMC × Obvious (T/F)	Snack Preferences	Anchor-and-Adjustment	19	0.000	0.842	0.842
IMC × Obvious (T/F)	Snack Preferences	Processing Fluency	20	0.000	0.750	0.750
IMC × Obvious (T/F)	Snack Preferences	Word Search	26	0.000	0.846	0.846
IMC × Obvious (T/F)	Snack Preferences	Time Perception	26	0.000	0.846	0.846
IMC × Obvious (T/F)	Word Search	Anchor-and-Adjustment	27	0.000	0.852	0.852
IMC × Obvious (T/F)	Word Search	Processing Fluency	26	0.000	0.808	0.808
IMC × Obvious (T/F)	Word Search	Snack Preferences	29	0.000	0.690	0.690
IMC × Obvious (T/F)	Word Search	Time Perception	24	0.000	0.875	0.875
IMC × Obvious (T/F)	Time Perception	Anchor-and-Adjustment	22	0.000	0.818	0.818
IMC × Obvious (T/F)	Time Perception	Processing Fluency	37	0.027	0.811	0.838
IMC × Obvious (T/F)	Time Perception	Snack Preferences	28	0.000	0.786	0.786
IMC × Obvious (T/F)	Time Perception	Word Search	36	0.000	0.806	0.806
IMC × Straightline	Anchor-and-Adjustment	Processing Fluency	23	0.043	0.826	0.870
IMC × Straightline	Anchor-and-Adjustment	Snack Preferences	29	0.000	0.862	0.862
IMC × Straightline	Anchor-and-Adjustment	Word Search	32	0.000	0.844	0.844
IMC × Straightline	Anchor-and-Adjustment	Time Perception	23	0.000	0.739	0.739
IMC × Straightline	Processing Fluency	Anchor-and-Adjustment	27	0.000	0.852	0.852
IMC × Straightline	Processing Fluency	Snack Preferences	23	0.000	0.826	0.826
IMC × Straightline	Processing Fluency	Word Search	32	0.000	0.781	0.781
IMC × Straightline	Processing Fluency	Time Perception	34	0.000	0.882	0.882
IMC × Straightline	Snack Preferences	Anchor-and-Adjustment	23	0.000	0.870	0.870
IMC × Straightline	Snack Preferences	Processing Fluency	22	0.045	0.909	0.955
IMC × Straightline	Snack Preferences	Word Search	21	0.000	0.810	0.810
IMC × Straightline	Snack Preferences	Time Perception	25	0.000	0.760	0.760
IMC × Straightline	Word Search	Anchor-and-Adjustment	21	0.000	0.762	0.762
IMC × Straightline	Word Search	Processing Fluency	30	0.033	0.700	0.733
IMC × Straightline	Word Search	Snack Preferences	28	0.000	0.929	0.929
IMC × Straightline	Word Search	Time Perception	27	0.000	0.815	0.815
IMC × Straightline	Time Perception	Anchor-and-Adjustment	45	0.000	0.800	0.800
IMC × Straightline	Time Perception	Processing Fluency	26	0.038	0.769	0.808
IMC × Straightline	Time Perception	Snack Preferences	32	0.000	0.875	0.875
IMC × Straightline	Time Perception	Word Search	20	0.000	0.850	0.850
Eng. Comprehension × Obvious (5-pt)	Anchor-and-Adjustment	Processing Fluency	23	0.000	0.870	0.870
Eng. Comprehension × Obvious (5-pt)	Anchor-and-Adjustment	Snack Preferences	29	0.000	0.862	0.862
Eng. Comprehension × Obvious (5-pt)	Anchor-and-Adjustment	Word Search	30	0.000	0.767	0.767
Eng. Comprehension × Obvious (5-pt)	Anchor-and-Adjustment	Time Perception	28	0.000	0.786	0.786
Eng. Comprehension × Obvious (5-pt)	Processing Fluency	Anchor-and-Adjustment	33	0.000	0.909	0.909
Eng. Comprehension × Obvious (5-pt)	Processing Fluency	Snack Preferences	32	0.000	0.594	0.594
Eng. Comprehension × Obvious (5-pt)	Processing Fluency	Word Search	27	0.000	0.704	0.704
Eng. Comprehension × Obvious (5-pt)	Processing Fluency	Time Perception	27	0.000	0.852	0.852
Eng. Comprehension × Obvious (5-pt)	Snack Preferences	Anchor-and-Adjustment	23	0.000	0.696	0.696
Eng. Comprehension × Obvious (5-pt)	Snack Preferences	Processing Fluency	24	0.000	0.667	0.667
Eng. Comprehension × Obvious (5-pt)	Snack Preferences	Word Search	30	0.000	0.867	0.867
Eng. Comprehension × Obvious (5-pt)	Snack Preferences	Time Perception	28	0.000	0.821	0.821
Eng. Comprehension × Obvious (5-pt)	Word Search	Anchor-and-Adjustment	30	0.000	0.700	0.700
Eng. Comprehension × Obvious (5-pt)	Word Search	Processing Fluency	25	0.000	0.840	0.840
Eng. Comprehension × Obvious (5-pt)	Word Search	Snack Preferences	27	0.000	0.704	0.704
Eng. Comprehension × Obvious (5-pt)	Word Search	Time Perception	26	0.000	0.962	0.962
Eng. Comprehension × Obvious (5-pt)	Time Perception	Anchor-and-Adjustment	24	0.000	0.958	0.958
Eng. Comprehension × Obvious (5-pt)	Time Perception	Processing Fluency	30	0.000	0.833	0.833
Eng. Comprehension × Obvious (5-pt)	Time Perception	Snack Preferences	28	0.000	0.893	0.893
Eng. Comprehension × Obvious (5-pt)	Time Perception	Word Search	19	0.000	0.579	0.579
Eng. Comprehension × Obvious (T/F)	Anchor-and-Adjustment	Processing Fluency	28	0.000	0.857	0.857
Eng. Comprehension × Obvious (T/F)	Anchor-and-Adjustment	Snack Preferences	30	0.033	0.767	0.800
Eng. Comprehension × Obvious (T/F)	Anchor-and-Adjustment	Word Search	30	0.000	0.833	0.833
Eng. Comprehension × Obvious (T/F)	Anchor-and-Adjustment	Time Perception	22	0.000	0.864	0.864
Eng. Comprehension × Obvious (T/F)	Processing Fluency	Anchor-and-Adjustment	28	0.000	0.679	0.679
Eng. Comprehension × Obvious (T/F)	Processing Fluency	Snack Preferences	23	0.000	0.913	0.913
Eng. Comprehension × Obvious (T/F)	Processing Fluency	Word Search	38	0.026	0.763	0.789
Eng. Comprehension × Obvious (T/F)	Processing Fluency	Time Perception	30	0.000	0.733	0.733
Eng. Comprehension × Obvious (T/F)	Snack Preferences	Anchor-and-Adjustment	28	0.000	0.750	0.750
Eng. Comprehension × Obvious (T/F)	Snack Preferences	Processing Fluency	27	0.000	0.704	0.704
Eng. Comprehension × Obvious (T/F)	Snack Preferences	Word Search	25	0.000	0.840	0.840
Eng. Comprehension × Obvious (T/F)	Snack Preferences	Time Perception	25	0.000	0.720	0.720
Eng. Comprehension × Obvious (T/F)	Word Search	Anchor-and-Adjustment	21	0.000	0.905	0.905
Eng. Comprehension × Obvious (T/F)	Word Search	Processing Fluency	28	0.036	0.714	0.750
Eng. Comprehension × Obvious (T/F)	Word Search	Snack Preferences	35	0.000	0.829	0.829
Eng. Comprehension × Obvious (T/F)	Word Search	Time Perception	24	0.000	0.750	0.750
Eng. Comprehension × Obvious (T/F)	Time Perception	Anchor-and-Adjustment	23	0.000	0.913	0.913
Eng. Comprehension × Obvious (T/F)	Time Perception	Processing Fluency	22	0.000	0.864	0.864
Eng. Comprehension × Obvious (T/F)	Time Perception	Snack Preferences	31	0.000	0.710	0.710
Eng. Comprehension × Obvious (T/F)	Time Perception	Word Search	25	0.000	0.840	0.840
Eng. Comprehension × Straightline	Anchor-and-Adjustment	Processing Fluency	26	0.038	0.846	0.885
Eng. Comprehension × Straightline	Anchor-and-Adjustment	Snack Preferences	29	0.000	0.793	0.793
Eng. Comprehension × Straightline	Anchor-and-Adjustment	Word Search	18	0.056	0.778	0.833
Eng. Comprehension × Straightline	Anchor-and-Adjustment	Time Perception	37	0.000	0.865	0.865
Eng. Comprehension × Straightline	Processing Fluency	Anchor-and-Adjustment	31	0.000	0.677	0.677
Eng. Comprehension × Straightline	Processing Fluency	Snack Preferences	34	0.000	0.794	0.794
Eng. Comprehension × Straightline	Processing Fluency	Word Search	27	0.000	0.815	0.815
Eng. Comprehension × Straightline	Processing Fluency	Time Perception	27	0.000	0.815	0.815
Eng. Comprehension × Straightline	Snack Preferences	Anchor-and-Adjustment	30	0.000	0.933	0.933
Eng. Comprehension × Straightline	Snack Preferences	Processing Fluency	19	0.000	0.737	0.737
Eng. Comprehension × Straightline	Snack Preferences	Word Search	34	0.000	0.706	0.706
Eng. Comprehension × Straightline	Snack Preferences	Time Perception	22	0.000	0.682	0.682
Eng. Comprehension × Straightline	Word Search	Anchor-and-Adjustment	30	0.000	0.800	0.800
Eng. Comprehension × Straightline	Word Search	Processing Fluency	29	0.000	0.759	0.759
Eng. Comprehension × Straightline	Word Search	Snack Preferences	26	0.000	0.808	0.808
Eng. Comprehension × Straightline	Word Search	Time Perception	23	0.000	0.783	0.783
Eng. Comprehension × Straightline	Time Perception	Anchor-and-Adjustment	25	0.000	0.880	0.880
Eng. Comprehension × Straightline	Time Perception	Processing Fluency	27	0.000	0.741	0.741
Eng. Comprehension × Straightline	Time Perception	Snack Preferences	23	0.000	0.826	0.826
Eng. Comprehension × Straightline	Time Perception	Word Search	26	0.000	0.885	0.885
Obvious (5-pt) × Obvious (T/F)	Anchor-and-Adjustment	Processing Fluency	24	0.000	0.958	0.958
Obvious (5-pt) × Obvious (T/F)	Anchor-and-Adjustment	Snack Preferences	32	0.000	0.938	0.938
Obvious (5-pt) × Obvious (T/F)	Anchor-and-Adjustment	Word Search	27	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Anchor-and-Adjustment	Time Perception	27	0.000	0.963	0.963
Obvious (5-pt) × Obvious (T/F)	Processing Fluency	Anchor-and-Adjustment	17	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Processing Fluency	Snack Preferences	32	0.000	0.969	0.969
Obvious (5-pt) × Obvious (T/F)	Processing Fluency	Word Search	29	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Processing Fluency	Time Perception	24	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Snack Preferences	Anchor-and-Adjustment	27	0.000	0.963	0.963
Obvious (5-pt) × Obvious (T/F)	Snack Preferences	Processing Fluency	30	0.000	0.967	0.967
Obvious (5-pt) × Obvious (T/F)	Snack Preferences	Word Search	36	0.000	0.972	0.972
Obvious (5-pt) × Obvious (T/F)	Snack Preferences	Time Perception	23	0.000	0.957	0.957
Obvious (5-pt) × Obvious (T/F)	Word Search	Anchor-and-Adjustment	24	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Word Search	Processing Fluency	30	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Word Search	Snack Preferences	25	0.000	0.880	0.880
Obvious (5-pt) × Obvious (T/F)	Word Search	Time Perception	27	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Time Perception	Anchor-and-Adjustment	32	0.000	0.938	0.938
Obvious (5-pt) × Obvious (T/F)	Time Perception	Processing Fluency	21	0.000	1.000	1.000
Obvious (5-pt) × Obvious (T/F)	Time Perception	Snack Preferences	30	0.000	0.967	0.967
Obvious (5-pt) × Obvious (T/F)	Time Perception	Word Search	26	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Anchor-and-Adjustment	Processing Fluency	29	0.000	0.966	0.966
Obvious (5-pt) × Straightline	Anchor-and-Adjustment	Snack Preferences	28	0.000	0.929	0.929
Obvious (5-pt) × Straightline	Anchor-and-Adjustment	Word Search	27	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Anchor-and-Adjustment	Time Perception	26	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Processing Fluency	Anchor-and-Adjustment	27	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Processing Fluency	Snack Preferences	27	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Processing Fluency	Word Search	25	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Processing Fluency	Time Perception	23	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Snack Preferences	Anchor-and-Adjustment	36	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Snack Preferences	Processing Fluency	28	0.000	0.964	0.964
Obvious (5-pt) × Straightline	Snack Preferences	Word Search	20	0.000	0.900	0.900
Obvious (5-pt) × Straightline	Snack Preferences	Time Perception	32	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Word Search	Anchor-and-Adjustment	27	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Word Search	Processing Fluency	24	0.042	0.875	0.917
Obvious (5-pt) × Straightline	Word Search	Snack Preferences	27	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Word Search	Time Perception	28	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Time Perception	Anchor-and-Adjustment	26	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Time Perception	Processing Fluency	20	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Time Perception	Snack Preferences	30	0.000	1.000	1.000
Obvious (5-pt) × Straightline	Time Perception	Word Search	33	0.000	0.970	0.970
Obvious (T/F) × Straightline	Anchor-and-Adjustment	Processing Fluency	23	0.000	1.000	1.000
Obvious (T/F) × Straightline	Anchor-and-Adjustment	Snack Preferences	26	0.000	1.000	1.000
Obvious (T/F) × Straightline	Anchor-and-Adjustment	Word Search	28	0.000	0.929	0.929
Obvious (T/F) × Straightline	Anchor-and-Adjustment	Time Perception	23	0.000	1.000	1.000
Obvious (T/F) × Straightline	Processing Fluency	Anchor-and-Adjustment	31	0.000	0.968	0.968
Obvious (T/F) × Straightline	Processing Fluency	Snack Preferences	28	0.000	0.964	0.964
Obvious (T/F) × Straightline	Processing Fluency	Word Search	21	0.000	0.952	0.952
Obvious (T/F) × Straightline	Processing Fluency	Time Perception	25	0.000	1.000	1.000
Obvious (T/F) × Straightline	Snack Preferences	Anchor-and-Adjustment	27	0.000	0.963	0.963
Obvious (T/F) × Straightline	Snack Preferences	Processing Fluency	32	0.000	0.938	0.938
Obvious (T/F) × Straightline	Snack Preferences	Word Search	30	0.033	0.900	0.933
Obvious (T/F) × Straightline	Snack Preferences	Time Perception	30	0.000	0.967	0.967
Obvious (T/F) × Straightline	Word Search	Anchor-and-Adjustment	38	0.000	0.974	0.974
Obvious (T/F) × Straightline	Word Search	Processing Fluency	18	0.000	1.000	1.000
Obvious (T/F) × Straightline	Word Search	Snack Preferences	31	0.000	1.000	1.000
Obvious (T/F) × Straightline	Word Search	Time Perception	31	0.000	1.000	1.000
Obvious (T/F) × Straightline	Time Perception	Anchor-and-Adjustment	20	0.000	1.000	1.000
Obvious (T/F) × Straightline	Time Perception	Processing Fluency	28	0.000	0.929	0.929
Obvious (T/F) × Straightline	Time Perception	Snack Preferences	27	0.000	1.000	1.000
Obvious (T/F) × Straightline	Time Perception	Word Search	26	0.000	1.000	1.000

Prior exposure to each AC type

We also tested whether participants who reported prior exposure to a given AC type were less likely to fail it. In principle, prior exposure should make a given AC easier to pass. This expectation was largely not supported. The one possible exception is the Straightline check, where passers reported higher mean exposure than failers (4.05 vs. 2.33), although only 6 participants failed it.

Two plausible explanations: (1) open-ended responses suggest some participants misunderstood what counts as an AC, introducing measurement error in the prior exposure variable; and (2) students received course credit regardless of AC performance, so prior exposure may not have translated into greater effort to pass.

Code

exposure_pairs <- list(
  ac_obvious_failed    = "exp_obvious",
  ac_obvious_tf_failed = "exp_obvious",
  ac_stline_failed     = "exp_straightline",
  ac_imc_failed        = "exp_imc",
  ac_eng_failed        = "exp_english"
)

exposure_data <- students_data |>
  select(id, starts_with("exp_")) |>
  left_join(ac_data, by = "id")

exposure_summary <- imap_dfr(exposure_pairs, function(exp_var, ac_var) {
  if (!exp_var %in% names(exposure_data)) return(NULL)
  exposure_data |>
    filter(!is.na(.data[[ac_var]]), !is.na(.data[[exp_var]])) |>
    group_by(Status = ifelse(.data[[ac_var]] == 1, "Failed", "Passed")) |>
    summarise(
      AC           = ac_labels[ac_var],
      `Exp. var`   = exp_var,
      N            = n(),
      `M exposure` = round(mean(.data[[exp_var]], na.rm = TRUE), 2),
      SD           = round(sd(.data[[exp_var]],   na.rm = TRUE), 2),
      .groups      = "drop"
    )
})

exposure_summary |>
  select(AC, `Exp. var`, Status, N, `M exposure`, SD) |>
  arrange(AC, Status) |>
  kbl(caption = "Mean prior exposure rating (1–5) by AC pass/fail status") |>
  kable_styling(full_width = FALSE) |>
  collapse_rows(columns = 1:2, valign = "middle")

Mean prior exposure rating (1–5) by AC pass/fail status
AC	Exp. var	Status	N	M exposure	SD
Eng. Comprehension	exp_english	Failed	106	2.16	1.65
Eng. Comprehension	exp_english	Passed	437	1.93	1.44
IMC	exp_imc	Failed	94	2.29	1.67
IMC	exp_imc	Passed	449	2.11	1.50
Obvious (5-pt)	exp_obvious	Failed	5	4.80	0.45
Obvious (5-pt)	exp_obvious	Passed	538	3.30	1.58
Obvious (T/F)	exp_obvious	Failed	10	3.00	1.94
Obvious (T/F)	exp_obvious	Passed	533	3.32	1.57
Straightline	exp_straightline	Failed	6	2.33	1.21
Straightline	exp_straightline	Passed	537	4.05	1.39

Code

imap_dfr(exposure_pairs, function(exp_var, ac_var) {
  if (!exp_var %in% names(exposure_data)) return(NULL)
  exposure_data |>
    filter(!is.na(.data[[ac_var]]), !is.na(.data[[exp_var]])) |>
    mutate(Status = ifelse(.data[[ac_var]] == 1, "Failed", "Passed"),
           AC     = ac_labels[ac_var],
           exp    = .data[[exp_var]])
}) |>
  ggplot(aes(x = factor(exp), fill = Status)) +
  geom_bar(position = "fill") +
  facet_wrap(~ AC, nrow = 2) +
  scale_y_continuous(labels = percent_format()) +
  scale_fill_manual(values = c("Passed" = "#3B8BD4", "Failed" = "#D85A30")) +
  labs(x = "Prior exposure (1 = definitely not seen → 5 = definitely seen)",
       y = "Proportion", fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "top")

Proportion failing each AC at each prior exposure level. If prior exposure helps, the red (Failed) proportion should decrease at higher exposure ratings.
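
The descriptive comparison above could be complemented by a model-based check. The following is a minimal sketch, not the analysis reported here: it fits a logistic regression of AC failure on the exposure rating using simulated stand-in data. The column names mirror the report, but the values and the assumed data-generating process are hypothetical.

```r
# Simulated stand-in data; names mirror the report, values are made up.
set.seed(1)
n <- 500
sim <- data.frame(
  exp_straightline = sample(1:5, n, replace = TRUE)  # 1-5 prior exposure rating
)

# Assumed data-generating process: failure probability falls with exposure.
p_fail <- plogis(-1 - 0.5 * (sim$exp_straightline - 3))
sim$ac_stline_failed <- rbinom(n, size = 1, prob = p_fail)

# Logistic regression: does the exposure rating predict failing the check?
fit <- glm(ac_stline_failed ~ exp_straightline, data = sim, family = binomial)

# A negative slope (odds ratio below 1) means higher exposure goes with
# lower odds of failure.
slope      <- coef(fit)[["exp_straightline"]]
odds_ratio <- exp(slope)
print(round(c(slope = slope, odds_ratio = odds_ratio), 3))
```

With only 5, 10, and 6 failures on the Obvious (5-pt), Obvious (T/F), and Straightline checks, a model like this would be imprecise in the actual sample, so any estimate for those checks should be read with caution.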

Behavioral correlates of AC failure

As an additional validity check, we compare log-transformed response times and open-ended response length between AC passers and failers within each condition. Failers might respond faster (rushing) or slower (distraction), and might give shorter or lower-effort text responses.

Note

The violin plots are hard to interpret for checks with very few failures: the Obvious (5-pt), Obvious (T/F), and Straightline checks had only 5, 10, and 6 failures, respectively.

Code

behav_data <- students_data |>
  select(id, treatment, t_wordsearch_page_submit, t_load_page_submit,
         total_time_sec, snack_preferences) |>
  mutate(
    condition      = ifelse(treatment == 1, "High", "Low"),
    log_wordsearch = log1p(t_wordsearch_page_submit),
    log_load       = log1p(t_load_page_submit),
    log_total_time = log1p(total_time_sec),
    snack_chars    = nchar(snack_preferences)
  ) |>
  left_join(ac_data, by = "id")

behav_vars <- c(
  "log_wordsearch" = "Log word search time",
  "log_load"       = "Log cognitive load time",
  "log_total_time" = "Log total time",
  "snack_chars"    = "Snack response length (chars)"
)

behav_long <- behav_data |>
  pivot_longer(all_of(ac_vars), names_to = "ac", values_to = "failed") |>
  mutate(AC = ac_labels[ac], Status = ifelse(failed == 1, "Failed", "Passed"))

Code

behav_long |>
  pivot_longer(names(behav_vars), names_to = "measure", values_to = "value") |>
  mutate(measure = behav_vars[measure]) |>
  ggplot(aes(x = Status, y = value, fill = Status, color = Status)) +
  geom_violin(alpha = 0.25, trim = TRUE) +
  geom_jitter(alpha = 0.08, width = 0.15, size = 0.6) +
  stat_summary(fun.data = mean_cl_normal, geom = "pointrange",
               size = 0.6, color = "black") +
  facet_grid(measure ~ AC + condition, scales = "free_y") +
  scale_fill_manual(values  = c("Passed" = "#3B8BD4", "Failed" = "#D85A30")) +
  scale_color_manual(values = c("Passed" = "#3B8BD4", "Failed" = "#D85A30")) +
  labs(x = NULL, y = NULL) +
  theme_minimal(base_size = 10) +
  theme(legend.position = "none", strip.text = element_text(size = 8),
        axis.text.x = element_text(size = 8))

Behavioral measures by AC pass/fail status and condition. Points show means ± 95% CIs.

Goal 3: Do effect sizes change depending on which AC is used for exclusion?

The core question is whether excluding participants based on different ACs yields meaningfully different effect size estimates for each JDM paradigm. For each paradigm × AC combination, we will compare: (a) the effect size in the full sample, (b) the effect size after excluding participants who failed that AC, and (c) how much the estimate changes relative to the full-sample estimate.

Paradigms and their primary outcomes:

Anchor-and-Adjustment: mean distance from anchor value (among anchored participants)
Processing Fluency: mean fluency rating for disfluent vs. fluent brand names
Word Search: words found (hard vs. easy condition)
Snack Preferences: snack choice composition (hope vs. pride condition)
Time Perception: shipping preference / perceived time duration (days vs. hours condition)
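
The three-way comparison described above, (a) full-sample effect, (b) effect after exclusion, and (c) change between them, can be sketched as follows. This is an illustration under stated assumptions: the data are simulated, `cohens_d()` is a hypothetical helper, and the column names `outcome` and `ac_failed` are placeholders rather than variables from the study data.

```r
# Cohen's d for two independent groups, using the pooled standard deviation.
# Hypothetical helper; groups are ordered by their sorted labels (0, then 1).
cohens_d <- function(y, group) {
  g <- split(y, group)
  stopifnot(length(g) == 2)
  n1 <- length(g[[1]])
  n2 <- length(g[[2]])
  pooled_sd <- sqrt(((n1 - 1) * var(g[[1]]) + (n2 - 1) * var(g[[2]])) /
                      (n1 + n2 - 2))
  (mean(g[[2]]) - mean(g[[1]])) / pooled_sd
}

# Simulated stand-in data (hypothetical values, not the study data).
set.seed(42)
n <- 400
toy <- data.frame(
  id        = seq_len(n),
  treatment = rep(c(0, 1), each = n / 2),
  ac_failed = rbinom(n, 1, 0.2)
)
# Assumption for illustration: attentive participants show a true effect,
# inattentive participants contribute noise only.
toy$outcome <- rnorm(n) + ifelse(toy$ac_failed == 0, 0.5 * toy$treatment, 0)

# (a) effect size in the full sample
d_full <- cohens_d(toy$outcome, toy$treatment)

# (b) effect size after excluding participants who failed the AC
kept       <- toy[toy$ac_failed == 0, ]
d_excluded <- cohens_d(kept$outcome, kept$treatment)

# (c) change relative to the full-sample estimate
d_change   <- d_excluded - d_full
pct_change <- 100 * d_change / d_full

print(round(c(full = d_full, excluded = d_excluded,
              change = d_change, pct_change = pct_change), 3))
```

The relative change in (c) becomes unstable when the full-sample estimate is close to zero, so the absolute change is worth reporting alongside it.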

Anchoring effects

Among participants who correctly identified the anchor and reported thinking of it (anchor_x_yn == 1), we examine how closely estimates clustered near the anchor value. Higher cognitive load is expected to increase anchoring (i.e., estimates closer to the anchor).

Here is a breakdown of the number of participants retained (and excluded) under this criterion:

Code

anchors <- c(estimate_1 = 1776, estimate_2 = 365, estimate_3 = 9,
             estimate_4 = 212,  estimate_5 = 32,   estimate_6 = 50)

students_data |>
  select(id,
         estimate_1, anchor_1_yn, anchor_1,
         estimate_2, anchor_2_yn, anchor_2,
         estimate_3, anchor_3_yn, anchor_3,
         estimate_4, anchor_4_yn, anchor_4,
         estimate_5, anchor_5_yn, anchor_5,
         estimate_6, anchor_6_yn, anchor_6) |>
  pivot_longer(cols = -id, names_to = "variable", values_to = "value") |>
  mutate(
    number = str_extract(variable, "\\d+"),
    type   = case_when(
      str_starts(variable, "estimate") ~ "rating",
      str_ends(variable,   "_yn")      ~ "exposed",
      str_starts(variable, "anchor")   ~ "anchor_reported"
    )
  ) |>
  select(-variable) |>
  pivot_wider(id_cols = c(id, number),
              names_from = type, values_from = value) |>
  mutate(
    anchor_value  = anchors[paste0("estimate_", number)],
    correct_anchor = anchor_reported == anchor_value,
    retained       = exposed == 1 & correct_anchor,
    status = case_when(
      retained                          ~ "Retained (used correct anchor)",
      exposed == 1 & !correct_anchor    ~ "Used anchor, wrong value",
      exposed == 2                      ~ "Did not use anchor",
      exposed == 3                      ~ "Maybe used anchor",
      TRUE                              ~ "Other / missing"
    )
  ) |>
  group_by(Item = paste0("Estimate ", number,
                         " (anchor = ", anchor_value, ")"), status) |>
  summarise(n = n(), .groups = "drop") |>
  group_by(Item) |>
  mutate(
    total = sum(n),
    pct   = round(n / total * 100, 1)
  ) |>
  ungroup() |>
  mutate(cell = paste0(n, " (", pct, "%)")) |>
  select(Item, status, cell) |>
  pivot_wider(names_from = status, values_from = cell, values_fill = "0 (0%)") |>
  kbl(caption = paste0(
        "Retention by estimate. 'Retained' = participant reported using the ",
        "correct anchor. ",
        "N total per item = ", n_total, "."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11)

Retention by estimate. 'Retained' = participant reported using the correct anchor. N total per item = 543.
Item	Did not use anchor	Maybe used anchor	Retained (used correct anchor)	Used anchor, wrong value
Estimate 1 (anchor = 1776)	153 (28.2%)	44 (8.1%)	255 (47%)	91 (16.8%)
Estimate 2 (anchor = 365)	140 (25.8%)	29 (5.3%)	320 (58.9%)	54 (9.9%)
Estimate 3 (anchor = 9)	129 (23.8%)	27 (5%)	319 (58.7%)	68 (12.5%)
Estimate 4 (anchor = 212)	181 (33.3%)	48 (8.8%)	65 (12%)	249 (45.9%)
Estimate 5 (anchor = 32)	140 (25.8%)	48 (8.8%)	226 (41.6%)	129 (23.8%)
Estimate 6 (anchor = 50)	133 (24.5%)	39 (7.2%)	350 (64.5%)	21 (3.9%)

Note

Interestingly, without any AC exclusions (but with the correct-anchor exclusions), I’m not confident that we replicated the original anchoring-and-adjustment effect. As the table above shows, many participants were excluded from the analysis, especially for Estimate 4, where only 65 (12%) were retained. The cognitive load manipulation was also operationalized differently from the original paper: in Epley and Gilovich (2006), the control group was not required to remember any letters, whereas in this study the control group received a low cognitive load manipulation.

Code

anchors <- c(estimate_1 = 1776, estimate_2 = 365, estimate_3 = 9,
             estimate_4 = 212,  estimate_5 = 32,   estimate_6 = 50)

anchor_summary <- function(data, exclude_ids = NULL, label = "No exclusion",
                           usage_filter = c(1)) {
  d <- if (!is.null(exclude_ids)) filter(data, !id %in% exclude_ids) else data
  d |>
    select(id, treatment,
           estimate_1, anchor_1_yn, anchor_1,
           estimate_2, anchor_2_yn, anchor_2,
           estimate_3, anchor_3_yn, anchor_3,
           estimate_4, anchor_4_yn, anchor_4,
           estimate_5, anchor_5_yn, anchor_5,
           estimate_6, anchor_6_yn, anchor_6) |>
    pivot_longer(cols = -c(id, treatment), names_to = "variable", values_to = "value") |>
    mutate(
      number = str_extract(variable, "\\d+"),
      type   = case_when(
        str_starts(variable, "estimate") ~ "rating",
        str_ends(variable,   "_yn")      ~ "exposed",
        str_starts(variable, "anchor")   ~ "anchor_reported"
      )
    ) |>
    select(-variable) |>
    pivot_wider(id_cols = c(id, treatment, number),
                names_from = type, values_from = value) |>
    mutate(anchor_value = anchors[paste0("estimate_", number)]) |>
    filter(anchor_reported == anchor_value, exposed %in% usage_filter) |>
    mutate(distance = abs(rating - anchor_value)) |>
    group_by(number, treatment) |>
    summarise(
      n               = n(),
      mean_rating     = round(mean(rating,    na.rm = TRUE), 2),
      median_rating   = round(median(rating,  na.rm = TRUE), 2),
      mean_distance   = round(mean(distance,  na.rm = TRUE), 2),
      median_distance = round(median(distance, na.rm = TRUE), 2),
      .groups         = "drop"
    ) |>
    mutate(
      anchor_value = anchors[paste0("estimate_", number)],
      Condition    = factor(treatment, levels = c(0, 1),
                            labels = c("Low-interference", "High-interference")),
      exclusion    = label
    )
}

exclusion_sets <- list(
  "No exclusion"         = NULL,
  "Excl: IMC"            = students_data |> filter(ac_imc_failed        == 1) |> pull(id),
  "Excl: Straightline"   = students_data |> filter(ac_stline_failed     == 1) |> pull(id),
  "Excl: Obvious (T/F)"  = students_data |> filter(ac_obvious_tf_failed == 1) |> pull(id),
  "Excl: Eng. Comp."     = students_data |> filter(ac_eng_failed        == 1) |> pull(id),
  "Excl: Obvious (5-pt)" = students_data |> filter(ac_obvious_failed    == 1) |> pull(id),
  "Excl: Any AC failed"  = students_data |> filter(ac_count_failed      >= 1) |> pull(id)
)

all_results <- imap_dfr(exclusion_sets, function(ids, label) {
  anchor_summary(students_data, exclude_ids = ids, label = label,
                 usage_filter = c(1))
})

all_results_maybe <- imap_dfr(exclusion_sets, function(ids, label) {
  anchor_summary(students_data, exclude_ids = ids, label = label,
                 usage_filter = c(1, 3))
})

all_results_any <- imap_dfr(exclusion_sets, function(ids, label) {
  anchor_summary(students_data, exclude_ids = ids, label = label,
                 usage_filter = c(1, 2, 3))
})

results <- all_results |>
  filter(exclusion == "No exclusion") |>
  mutate(number = paste("Estimate", number))

anchor_long <- students_data |>
  select(id, treatment,
         estimate_1, anchor_1_yn, anchor_1,
         estimate_2, anchor_2_yn, anchor_2,
         estimate_3, anchor_3_yn, anchor_3,
         estimate_4, anchor_4_yn, anchor_4,
         estimate_5, anchor_5_yn, anchor_5,
         estimate_6, anchor_6_yn, anchor_6) |>
  pivot_longer(cols = -c(id, treatment), names_to = "variable", values_to = "value") |>
  mutate(
    number = str_extract(variable, "\\d+"),
    type   = case_when(
      str_starts(variable, "estimate") ~ "rating",
      str_ends(variable,   "_yn")      ~ "exposed",
      str_starts(variable, "anchor")   ~ "anchor_reported"
    )
  ) |>
  select(-variable) |>
  pivot_wider(id_cols = c(id, treatment, number),
              names_from = type, values_from = value) |>
  mutate(anchor_value = anchors[paste0("estimate_", number)]) |>
  filter(exposed == 1, anchor_reported == anchor_value)

Code

all_results |>
  filter(exclusion == "No exclusion") |>
  select(Item = number, `Anchor value` = anchor_value, Condition, N = n,
         `Mean estimate` = mean_rating, `Median estimate` = median_rating,
         `Mean distance` = mean_distance, `Median distance` = median_distance) |>
  mutate(Item = paste("Estimate", Item)) |>
  kbl(caption = "Estimates and distance from anchor value by condition (correctly anchored participants only)",
      align = c("l", "r", "l", "r", "r", "r", "r", "r"), booktabs = TRUE) |>
  kable_styling(full_width = FALSE) |>
  collapse_rows(columns = 1:2, valign = "top")

Estimates and distance from anchor value by condition (correctly anchored participants only)
Item	Anchor value	Condition	N	Mean estimate	Median estimate	Mean distance	Median distance
Estimate 1	1776	Low-interference	135	1775.26	1776	9.36	0
Estimate 1	1776	High-interference	120	1779.35	1776	4.35	0
Estimate 2	365	Low-interference	163	390.61	365	176.75	115
Estimate 2	365	High-interference	157	370.57	365	138.98	115
Estimate 3	9	Low-interference	175	11.31	10	3.86	3
Estimate 3	9	High-interference	144	11.29	9	3.97	3
Estimate 4	212	Low-interference	33	202.79	212	25.70	0
Estimate 4	212	High-interference	32	202.78	212	28.97	0
Estimate 5	32	Low-interference	105	-8.96	0	46.07	32
Estimate 5	32	High-interference	121	-0.35	0	33.60	32
Estimate 6	50	Low-interference	181	35.07	38	14.93	12
Estimate 6	50	High-interference	169	32.20	35	17.80	15

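Across AC exclusion rules, the High − Low difference in distance from the anchor shifts most, in raw units, for Estimate 2: the mean difference ranges from −38.89 to −30.51 and the median difference from −1.0 to 19.5.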

Code

all_results |>
  select(exclusion, number, Condition, n, mean_distance, median_distance, anchor_value) |>
  pivot_wider(
    names_from  = Condition,
    values_from = c(n, mean_distance, median_distance)
  ) |>
  mutate(
    `Δ mean (High − Low)`   = round(`mean_distance_High-interference`   - `mean_distance_Low-interference`,   2),
    `Δ median (High − Low)` = round(`median_distance_High-interference` - `median_distance_Low-interference`, 2),
    Item = paste0("Est. ", number, " (anchor = ", anchor_value, ")")
  ) |>
  select(
    `Exclusion rule`     = exclusion,
    Item,
    `N (Low)`            = `n_Low-interference`,
    `N (High)`           = `n_High-interference`,
    `M dist (Low)`       = `mean_distance_Low-interference`,
    `Mdn dist (Low)`     = `median_distance_Low-interference`,
    `M dist (High)`      = `mean_distance_High-interference`,
    `Mdn dist (High)`    = `median_distance_High-interference`,
    `Δ mean (High − Low)`,
    `Δ median (High − Low)`
  ) |>
  arrange(Item, `Exclusion rule`) |>
  kbl(caption = paste0(
        "Mean and median distance from anchor by condition under each exclusion rule. ",
        "Negative Δ = High-interference closer to anchor (stronger anchoring). ",
        "Restricted to confirmed-anchor participants."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11) |>
  collapse_rows(columns = 1, valign = "top")

Mean and median distance from anchor by condition under each exclusion rule. Negative Δ = High-interference closer to anchor (stronger anchoring). Restricted to confirmed-anchor participants.
Exclusion rule	Item	N (Low)	N (High)	M dist (Low)	Mdn dist (Low)	M dist (High)	Mdn dist (High)	Δ mean (High − Low)	Δ median (High − Low)
Excl: Any AC failed	Est. 1 (anchor = 1776)	96	81	9.55	0.0	3.84	0.0	-5.71	0.0
Excl: Eng. Comp.	Est. 1 (anchor = 1776)	114	96	9.26	0.0	3.85	0.0	-5.41	0.0
Excl: IMC	Est. 1 (anchor = 1776)	118	106	9.46	0.0	4.25	0.0	-5.21	0.0
Excl: Obvious (5-pt)	Est. 1 (anchor = 1776)	134	119	9.42	0.0	4.39	0.0	-5.03	0.0
Excl: Obvious (T/F)	Est. 1 (anchor = 1776)	132	119	9.56	0.0	4.29	0.0	-5.27	0.0
Excl: Straightline	Est. 1 (anchor = 1776)	135	120	9.36	0.0	4.35	0.0	-5.01	0.0
No exclusion	Est. 1 (anchor = 1776)	135	120	9.36	0.0	4.35	0.0	-5.01	0.0
Excl: Any AC failed	Est. 2 (anchor = 365)	118	101	180.70	101.5	143.89	121.0	-36.81	19.5
Excl: Eng. Comp.	Est. 2 (anchor = 365)	142	124	170.82	100.0	136.23	116.5	-34.59	16.5
Excl: IMC	Est. 2 (anchor = 365)	140	131	177.26	109.0	146.75	117.0	-30.51	8.0
Excl: Obvious (5-pt)	Est. 2 (anchor = 365)	162	155	177.84	116.0	138.95	115.0	-38.89	-1.0
Excl: Obvious (T/F)	Est. 2 (anchor = 365)	160	157	175.22	115.0	138.98	115.0	-36.24	0.0
Excl: Straightline	Est. 2 (anchor = 365)	161	157	176.82	115.0	138.98	115.0	-37.84	0.0
No exclusion	Est. 2 (anchor = 365)	163	157	176.75	115.0	138.98	115.0	-37.77	0.0
Excl: Any AC failed	Est. 3 (anchor = 9)	126	92	4.01	3.0	4.01	3.0	0.00	0.0
Excl: Eng. Comp.	Est. 3 (anchor = 9)	148	112	3.77	3.0	3.99	3.0	0.22	0.0
Excl: IMC	Est. 3 (anchor = 9)	154	118	3.96	3.0	3.92	3.0	-0.04	0.0
Excl: Obvious (5-pt)	Est. 3 (anchor = 9)	174	143	3.89	3.0	3.93	3.0	0.04	0.0
Excl: Obvious (T/F)	Est. 3 (anchor = 9)	170	144	3.81	3.0	3.97	3.0	0.16	0.0
Excl: Straightline	Est. 3 (anchor = 9)	173	144	3.91	3.0	3.97	3.0	0.06	0.0
No exclusion	Est. 3 (anchor = 9)	175	144	3.86	3.0	3.97	3.0	0.11	0.0
Excl: Any AC failed	Est. 4 (anchor = 212)	24	22	19.42	0.0	27.50	0.0	8.08	0.0
Excl: Eng. Comp.	Est. 4 (anchor = 212)	29	26	25.79	0.0	27.96	0.0	2.17	0.0
Excl: IMC	Est. 4 (anchor = 212)	29	27	20.83	0.0	29.81	0.0	8.98	0.0
Excl: Obvious (5-pt)	Est. 4 (anchor = 212)	32	31	25.88	0.0	29.90	0.0	4.02	0.0
Excl: Obvious (T/F)	Est. 4 (anchor = 212)	32	32	25.31	0.0	28.97	0.0	3.66	0.0
Excl: Straightline	Est. 4 (anchor = 212)	32	32	25.88	0.0	28.97	0.0	3.09	0.0
No exclusion	Est. 4 (anchor = 212)	33	32	25.70	0.0	28.97	0.0	3.27	0.0
Excl: Any AC failed	Est. 5 (anchor = 32)	68	79	44.24	32.0	32.96	32.0	-11.28	0.0
Excl: Eng. Comp.	Est. 5 (anchor = 32)	86	92	42.59	32.0	33.53	32.0	-9.06	0.0
Excl: IMC	Est. 5 (anchor = 32)	87	106	47.98	32.0	34.29	32.0	-13.69	0.0
Excl: Obvious (5-pt)	Est. 5 (anchor = 32)	103	119	46.34	32.0	33.06	32.0	-13.28	0.0
Excl: Obvious (T/F)	Est. 5 (anchor = 32)	101	120	45.36	32.0	33.88	32.0	-11.48	0.0
Excl: Straightline	Est. 5 (anchor = 32)	102	121	47.11	32.0	33.60	32.0	-13.51	0.0
No exclusion	Est. 5 (anchor = 32)	105	121	46.07	32.0	33.60	32.0	-12.47	0.0
Excl: Any AC failed	Est. 6 (anchor = 50)	127	112	15.18	10.0	17.82	15.0	2.64	5.0
Excl: Eng. Comp.	Est. 6 (anchor = 50)	151	136	14.86	10.0	17.15	14.0	2.29	4.0
Excl: IMC	Est. 6 (anchor = 50)	158	142	15.24	12.0	17.99	15.0	2.75	3.0
Excl: Obvious (5-pt)	Est. 6 (anchor = 50)	179	168	15.07	12.0	17.80	15.0	2.73	3.0
Excl: Obvious (T/F)	Est. 6 (anchor = 50)	177	168	14.87	12.0	17.91	15.0	3.04	3.0
Excl: Straightline	Est. 6 (anchor = 50)	179	169	14.89	12.0	17.80	15.0	2.91	3.0
No exclusion	Est. 6 (anchor = 50)	181	169	14.93	12.0	17.80	15.0	2.87	3.0
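
One simple way to summarize how sensitive each item is to the choice of exclusion rule is the spread (maximum minus minimum) of Δ mean across the seven rules. The sketch below uses the Δ mean values copied from the table above; the spread metric itself is an illustrative choice, not part of the original analysis.

```r
# Δ mean (High − Low) per item, copied from the table above.
# Order within each vector: Any AC failed, Eng. Comp., IMC, Obvious (5-pt),
# Obvious (T/F), Straightline, No exclusion.
delta_mean <- list(
  "Est. 1" = c(-5.71, -5.41, -5.21, -5.03, -5.27, -5.01, -5.01),
  "Est. 2" = c(-36.81, -34.59, -30.51, -38.89, -36.24, -37.84, -37.77),
  "Est. 3" = c(0.00, 0.22, -0.04, 0.04, 0.16, 0.06, 0.11),
  "Est. 4" = c(8.08, 2.17, 8.98, 4.02, 3.66, 3.09, 3.27),
  "Est. 5" = c(-11.28, -9.06, -13.69, -13.28, -11.48, -13.51, -12.47),
  "Est. 6" = c(2.64, 2.29, 2.75, 2.73, 3.04, 2.91, 2.87)
)

# Spread of the estimated difference across exclusion rules, in raw units.
spread <- sapply(delta_mean, function(x) round(max(x) - min(x), 2))
print(spread)

# Item whose estimate moves the most across exclusion rules.
most_sensitive <- names(which.max(spread))
print(most_sensitive)  # "Est. 2"
```

Because the items are measured on different scales (for example, years vs. days vs. degrees), raw spreads are not directly comparable across items; a spread expressed relative to the no-exclusion estimate could rank the items differently.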

Code

all_results |>
  select(exclusion, number, Condition, median_distance, anchor_value) |>
  pivot_wider(names_from = Condition, values_from = median_distance) |>
  mutate(
    delta     = `High-interference` - `Low-interference`,
    Item      = paste0("Est. ", number, " (anchor = ", anchor_value, ")"),
    exclusion = factor(exclusion, levels = names(exclusion_sets))
  ) |>
  ggplot(aes(x = exclusion, y = delta, group = Item, color = Item)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_line(linewidth = 0.7, alpha = 0.7) +
  geom_point(size = 2.5) +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(
    x       = "Exclusion rule",
    y       = "Δ median distance (High − Low)",
    color   = "Item",
    caption = "Note. Negative values = High-interference closer to anchor (stronger anchoring). Each line traces one estimate item across exclusion rules."
  ) +
  theme_minimal(base_size = 11) +
  theme(
    legend.position  = "right",
    plot.caption     = element_text(hjust = 0, size = 9, color = "grey40"),
    panel.grid.minor = element_blank()
  )

How the anchoring effect (High − Low median distance) shifts under different AC exclusion rules. Flat lines indicate the effect is robust to exclusion choice; variable lines indicate sensitivity.

Scale Effects

This section examines the time perception paradigm (Siddiqui et al., 2017). Participants were shown a hypothetical shipping scenario in which delivery time was presented either in days or in hours (i.e., on a contracted vs. an expanded scale). We focus on two outcomes: (1) shipping preference (e_days_hours), indicating whether participants chose expedited or standard shipping, and (2) perceived duration (time_perception), a 0–100 slider capturing how long the wait time felt. Consistent with prior work, we would expect participants in the days (contracted scale; low-interference) condition to be more likely to choose standard shipping and to perceive the duration as shorter.

Note

Interestingly, without any AC-based exclusions, we do not replicate the original scale expansion/contraction effect on shipping choice. When applying different AC exclusion criteria, the pattern appears to reverse slightly, with more participants in the expanded scale (hours) condition opting for standard shipping. At the same time, participants in the expanded scale condition report longer perceived durations than those in the contracted scale condition, which is consistent with prior findings. One possible explanation is that features of the pricing structure (e.g., relative cost differences between expedited and standard shipping) may have attenuated or overridden the expected effect of scale framing on choice. Experimental stimuli should be revised before running future iterations of this study.

Code

# Keep the scale-paradigm outcomes; treatment 0 = contracted (days), 1 = expanded (hours)
scale_data <- students_data |>
  select(id, treatment, e_days_hours, time_perception) |>
  mutate(
    Condition = factor(treatment, levels = c(0, 1),
                       labels = c("Contracted Scale", "Expanded Scale"))
  )

# ── Shipping preference: proportion choosing standard shipping (value == 2) ──
shipping_summary <- scale_data |>
  filter(!is.na(e_days_hours)) |>
  group_by(Condition) |>
  summarise(
    N                   = n(),
    N_standard          = sum(e_days_hours == 2),
    N_expedited         = sum(e_days_hours == 1),
    `% standard (2)`    = round(N_standard  / N * 100, 1),
    `% expedited (1)`   = round(N_expedited / N * 100, 1),
    .groups = "drop"
  )

shipping_summary |>
  kbl(caption = "Shipping preference by condition (1 = Expedited, 2 = Standard)",
      booktabs = TRUE) |>
  kable_styling(full_width = FALSE)

Shipping preference by condition (1 = Expedited, 2 = Standard)
Condition	N	N_standard	N_expedited	% standard (2)	% expedited (1)
Contracted Scale	281	228	53	81.1	18.9
Expanded Scale	262	220	42	84.0	16.0
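
To gauge whether the gap in the table above exceeds what sampling noise alone could produce, the no-exclusion counts can be passed to a two-sample test of proportions. The following is a minimal sketch in base R, not part of the original analysis pipeline; the object names (`standard`, `total`, `shipping_test`, `delta_pct`) are introduced here for illustration, and the counts are copied by hand from the table.

```r
# Counts copied from the no-exclusion shipping table above
standard <- c(contracted = 228, expanded = 220)  # chose standard shipping
total    <- c(contracted = 281, expanded = 262)  # non-missing responses

# Two-sample test of equal proportions (base R; continuity correction on by default)
shipping_test <- prop.test(x = standard, n = total)

# Difference in percentage points (Expanded − Contracted), from unrounded proportions
delta_pct <- unname(diff(standard / total) * 100)

round(delta_pct, 1)
shipping_test$conf.int  # 95% CI for p(contracted) − p(expanded), on the proportion scale
shipping_test$p.value
```

Computed from the raw counts, the difference rounds to 2.8 percentage points, whereas the exclusion-rule table below reports 2.9 because it subtracts percentages that were already rounded to one decimal. At these sample sizes, a gap of roughly three points appears well within the range that sampling variability could produce, so the apparent reversal should be read cautiously.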

Code

# ── Perceived duration: mean and SD of time_perception slider ────────────────
perception_summary <- scale_data |>
  filter(!is.na(time_perception)) |>
  group_by(Condition) |>
  summarise(
    N    = n(),
    M    = round(mean(time_perception, na.rm = TRUE), 2),
    SD   = round(sd(time_perception,   na.rm = TRUE), 2),
    Mdn  = round(median(time_perception, na.rm = TRUE), 2),
    .groups = "drop"
  )

perception_summary |>
  kbl(caption = "Perceived duration (0 = very short, 100 = very long) by condition",
      booktabs = TRUE) |>
  kable_styling(full_width = FALSE)

Perceived duration (0 = very short, 100 = very long) by condition
Condition	N	M	SD	Mdn
Contracted Scale	281	43.62	25.38	42.0
Expanded Scale	262	46.15	25.79	44.5
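
The size of the perceived-duration difference can be approximated from the summary statistics in the table above. The sketch below computes a standardized mean difference and a Welch-style 95% interval in base R; it works from the rounded table values rather than the raw data, so it is an approximation, and the variable names (`n`, `m`, `s`, `cohens_d`, `ci_welch`) are illustrative rather than taken from the original code.

```r
# Summary statistics copied from the no-exclusion perceived-duration table above
n <- c(contracted = 281,   expanded = 262)
m <- c(contracted = 43.62, expanded = 46.15)
s <- c(contracted = 25.38, expanded = 25.79)

# Raw mean difference (Expanded − Contracted)
diff_m <- unname(m["expanded"] - m["contracted"])

# Cohen's d using the pooled standard deviation
sd_pooled <- sqrt(sum((n - 1) * s^2) / (sum(n) - 2))
cohens_d  <- diff_m / sd_pooled

# Welch standard error and Welch–Satterthwaite degrees of freedom
v        <- s^2 / n
se_welch <- sqrt(sum(v))
df_welch <- sum(v)^2 / sum(v^2 / (n - 1))

# Approximate 95% confidence interval for the mean difference
ci_welch <- diff_m + c(-1, 1) * qt(0.975, df_welch) * se_welch

round(cohens_d, 3)
round(ci_welch, 2)
```

With the raw data available, `t.test(time_perception ~ Condition, data = scale_data)` would give the same Welch test directly and avoid the rounding in the table.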

Code

# Summarise both scale-paradigm outcomes by condition after dropping `exclude_ids`
# (NULL = keep everyone); `label` names the exclusion rule in the output.
scale_summary <- function(data, exclude_ids = NULL, label = "No exclusion") {
  d <- if (!is.null(exclude_ids)) filter(data, !id %in% exclude_ids) else data
  d |>
    select(id, treatment, e_days_hours, time_perception) |>
    # Keep participants with at least one of the two outcomes observed
    filter(!is.na(e_days_hours) | !is.na(time_perception)) |>
    mutate(Condition = factor(treatment, levels = c(0, 1),
                              labels = c("Contracted Scale", "Expanded Scale"))) |>
    group_by(Condition) |>
    summarise(
      n               = n(),
      # Percentages use non-missing shipping choices as the denominator
      pct_standard    = round(sum(e_days_hours == 2, na.rm = TRUE) / sum(!is.na(e_days_hours)) * 100, 1),
      pct_expedited   = round(sum(e_days_hours == 1, na.rm = TRUE) / sum(!is.na(e_days_hours)) * 100, 1),
      mean_perception = round(mean(time_perception,   na.rm = TRUE), 2),
      sd_perception   = round(sd(time_perception,     na.rm = TRUE), 2),
      mdn_perception  = round(median(time_perception, na.rm = TRUE), 2),
      .groups         = "drop"
    ) |>
    mutate(exclusion = label)
}

all_scale_results <- imap_dfr(exclusion_sets, function(ids, label) {
  scale_summary(students_data, exclude_ids = ids, label = label)
}) |>
  mutate(exclusion = factor(exclusion, levels = names(exclusion_sets)))

# ── Table: shipping preference ────────────────────────────────────────────────
all_scale_results |>
  select(exclusion, Condition, n, pct_standard, pct_expedited) |>
  pivot_wider(
    names_from  = Condition,
    values_from = c(n, pct_standard, pct_expedited)
  ) |>
  mutate(
    `Δ % standard (Expanded − Contracted)` = round(
      `pct_standard_Expanded Scale` - `pct_standard_Contracted Scale`, 1)
  ) |>
  select(
    `Exclusion rule`                       = exclusion,
    `N (Contracted)`                       = `n_Contracted Scale`,
    `N (Expanded)`                         = `n_Expanded Scale`,
    `% standard (Contracted)`              = `pct_standard_Contracted Scale`,
    `% standard (Expanded)`                = `pct_standard_Expanded Scale`,
    `Δ % standard (Expanded − Contracted)`
  ) |>
  kbl(caption = paste0(
        "Shipping preference (% choosing standard) by scale condition under each exclusion rule. ",
        "Positive Δ = higher rate of standard shipping in expanded scale condition."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11)

Shipping preference (% choosing standard) by scale condition under each exclusion rule. Positive Δ = higher rate of standard shipping in expanded scale condition.
Exclusion rule	N (Contracted)	N (Expanded)	% standard (Contracted)	% standard (Expanded)	Δ % standard (Expanded − Contracted)
No exclusion	281	262	81.1	84.0	2.9
Excl: IMC	237	212	83.1	85.4	2.3
Excl: Straightline	275	262	82.2	84.0	1.8
Excl: Obvious (T/F)	274	259	81.4	83.8	2.4
Excl: Eng. Comp.	231	206	79.7	83.5	3.8
Excl: Obvious (5-pt)	279	259	81.7	83.8	2.1
Excl: Any AC failed	188	161	81.9	83.9	2.0

Code

# ── Table: perceived duration ─────────────────────────────────────────────────
all_scale_results |>
  select(exclusion, Condition, n, mean_perception, sd_perception, mdn_perception) |>
  pivot_wider(
    names_from  = Condition,
    values_from = c(n, mean_perception, sd_perception, mdn_perception)
  ) |>
  mutate(
    `Δ mean (Expanded − Contracted)` = round(
      `mean_perception_Expanded Scale` - `mean_perception_Contracted Scale`, 2),
    `Δ mdn (Expanded − Contracted)`  = round(
      `mdn_perception_Expanded Scale`  - `mdn_perception_Contracted Scale`,  2)
  ) |>
  select(
    `Exclusion rule`                 = exclusion,
    `N (Contracted)`                 = `n_Contracted Scale`,
    `N (Expanded)`                   = `n_Expanded Scale`,
    `M (Contracted)`                 = `mean_perception_Contracted Scale`,
    `SD (Contracted)`                = `sd_perception_Contracted Scale`,
    `Mdn (Contracted)`               = `mdn_perception_Contracted Scale`,
    `M (Expanded)`                   = `mean_perception_Expanded Scale`,
    `SD (Expanded)`                  = `sd_perception_Expanded Scale`,
    `Mdn (Expanded)`                 = `mdn_perception_Expanded Scale`,
    `Δ mean (Expanded − Contracted)`,
    `Δ mdn (Expanded − Contracted)`
  ) |>
  kbl(caption = paste0(
        "Perceived duration (0–100 slider) by scale condition under each exclusion rule. ",
        "Positive Δ = expanded scale condition perceived wait as longer."
      ), booktabs = TRUE) |>
  kable_styling(full_width = FALSE, font_size = 11) |>
  scroll_box(width = "100%")

Perceived duration (0–100 slider) by scale condition under each exclusion rule. Positive Δ = expanded scale condition perceived wait as longer.
Exclusion rule	N (Contracted)	N (Expanded)	M (Contracted)	SD (Contracted)	Mdn (Contracted)	M (Expanded)	SD (Expanded)	Mdn (Expanded)	Δ mean (Expanded − Contracted)	Δ mdn (Expanded − Contracted)
No exclusion	281	262	43.62	25.38	42.0	46.15	25.79	44.5	2.53	2.5
Excl: IMC	237	212	43.62	24.34	41.0	46.84	25.54	44.0	3.22	3.0
Excl: Straightline	275	262	43.58	25.38	41.0	46.15	25.79	44.5	2.57	3.5
Excl: Obvious (T/F)	274	259	43.54	25.23	42.5	46.16	25.70	44.0	2.62	1.5
Excl: Eng. Comp.	231	206	43.59	25.59	43.0	46.22	25.52	44.5	2.63	1.5
Excl: Obvious (5-pt)	279	259	43.70	25.30	42.0	46.40	25.75	45.0	2.70	3.0
Excl: Any AC failed	188	161	43.39	24.70	42.0	46.97	24.95	45.0	3.58	3.0
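
One simple way to express how sensitive the two effects are to the choice of AC is to compare each rule's Δ with the no-exclusion baseline. The sketch below does this in base R using the Δ columns of the two tables above; the values are transcribed by hand (and therefore carry the tables' rounding), and the data frame and column names are illustrative, not part of the original code.

```r
# Δ values transcribed from the shipping and perceived-duration tables above
delta_by_rule <- data.frame(
  rule = c("No exclusion", "Excl: IMC", "Excl: Straightline",
           "Excl: Obvious (T/F)", "Excl: Eng. Comp.",
           "Excl: Obvious (5-pt)", "Excl: Any AC failed"),
  delta_pct_standard    = c(2.9, 2.3, 1.8, 2.4, 3.8, 2.1, 2.0),
  delta_mean_perception = c(2.53, 3.22, 2.57, 2.62, 2.63, 2.70, 3.58)
)

# Shift of each rule's Δ relative to the no-exclusion baseline
baseline <- delta_by_rule[delta_by_rule$rule == "No exclusion", ]
delta_by_rule$shift_pct  <- delta_by_rule$delta_pct_standard    - baseline$delta_pct_standard
delta_by_rule$shift_mean <- delta_by_rule$delta_mean_perception - baseline$delta_mean_perception

# Rule producing the largest absolute shift for each outcome
delta_by_rule[which.max(abs(delta_by_rule$shift_pct)),  c("rule", "shift_pct")]
delta_by_rule[which.max(abs(delta_by_rule$shift_mean)), c("rule", "shift_mean")]

# Does the sign of either effect ever flip across rules?
all(delta_by_rule$delta_pct_standard > 0)
all(delta_by_rule$delta_mean_perception > 0)
```

In these tables, neither Δ changes sign under any exclusion rule, and the largest departure from the no-exclusion estimate is about one point on either outcome (the straightlining rule for shipping choice, the any-AC-failed rule for perceived duration).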

Code

# ── Plot: Δ % standard across exclusion rules ─────────────────────────────────
all_scale_results |>
  select(exclusion, Condition, pct_standard) |>
  pivot_wider(names_from = Condition, values_from = pct_standard) |>
  mutate(delta = `Expanded Scale` - `Contracted Scale`) |>
  ggplot(aes(x = exclusion, y = delta, group = 1)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_line(linewidth = 0.7, color = "#2E5F8A") +
  geom_point(size = 2.5, color = "#2E5F8A") +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(
    x       = "Exclusion rule",
    y       = "Δ % standard shipping (Expanded − Contracted)",
    caption = "Note. Positive values = higher rate of standard shipping in expanded scale condition."
  ) +
  theme_minimal(base_size = 11) +
  theme(
    plot.caption     = element_text(hjust = 0, size = 9, color = "grey40"),
    panel.grid.minor = element_blank()
  )

Code

# ── Plot: Δ median perception across exclusion rules ─────────────────────────
all_scale_results |>
  select(exclusion, Condition, mdn_perception) |>
  pivot_wider(names_from = Condition, values_from = mdn_perception) |>
  mutate(delta = `Expanded Scale` - `Contracted Scale`) |>
  ggplot(aes(x = exclusion, y = delta, group = 1)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_line(linewidth = 0.7, color = "#E07B3F") +
  geom_point(size = 2.5, color = "#E07B3F") +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(
    x       = "Exclusion rule",
    y       = "Δ median perceived duration (Expanded − Contracted)",
    caption = "Note. Positive values = expanded scale condition perceived wait as longer."
  ) +
  theme_minimal(base_size = 11) +
  theme(
    plot.caption     = element_text(hjust = 0, size = 9, color = "grey40"),
    panel.grid.minor = element_blank()
  )

Summary

[Key findings to be filled in once all analyses are complete.]

Goal	Key Finding
1. AC sensitivity to manipulations
2. AC convergence
3. Effect size sensitivity to AC choice