Replication of ‘Two faces of holistic face processing: Facilitation and interference underlying part-whole and composite effects’ by Jin, Hayward, & Cheung (2024, Journal of Vision)

Author

Seojin Lee (seojinl@stanford.edu)

Published

December 14, 2025

Introduction

Justification

My research interests lie in human visual perception, particularly in understanding how the brain integrates low-level information from complex visual input to achieve a higher-level understanding of an object. Faces are a good example, because recognizing one involves integrating smaller visual fragments such as the eyes, nose, and mouth into a percept of identity. The complete composite task from Jin, Hayward, & Cheung (2024) offers a well-defined behavioral paradigm for studying this integration process. Specifically, it separates two components of holistic processing – facilitation and interference – rather than treating holistic processing as a single phenomenon. Replicating these effects will help me better understand the mechanisms of holistic face perception and will inform my own research on visual integration.

Stimuli and Procedure

This project replicates the complete composite task online using jsPsych on Prolific. Each trial presents the following: fixation (500 ms), a study composite (500 ms), a mask (500 ms), then a test stimulus. Participants judge whether the cued half (top or bottom) of the test face matches the same half of the study face, while ignoring the other half. The design manipulates:

  • Congruency (congruent vs. incongruent)
  • Alignment (aligned vs. misaligned)
  • Correct response (same vs. different)
  • Cue location (top vs. bottom)

Isolated top/bottom halves provide a baseline for measuring facilitation (congruent - isolated) and interference (incongruent - isolated) effects. The target trial structure follows the original:

  • 400 composite trials (2 x 2 x 2 x 2)

For my replication, after piloting, I removed the isolated blocks because the composite congruency x alignment effects I aim to replicate do not depend on isolated trials.
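The 2 x 2 x 2 x 2 crossing can be enumerated as a quick sanity check. The sketch below uses base R's `expand.grid`; the column names and level labels are illustrative assumptions, and the even split of the 400 composite trials across cells is assumed rather than taken from the original paper.

```r
# Enumerate the 16 cells of the composite design (labels are illustrative).
design <- expand.grid(
  Cue           = c("top", "bot"),
  Congruency    = c("congruent", "incongruent"),
  Alignment     = c("aligned", "misaligned"),
  SameDifferent = c("same", "different"),
  stringsAsFactors = FALSE
)

n_cells <- nrow(design)            # number of unique conditions
trials_per_cell <- 400 / n_cells   # assumes trials are split evenly

print(design)
cat("cells:", n_cells, " trials per cell:", trials_per_cell, "\n")
```

Under that even-split assumption, 400 trials correspond to 25 repetitions per cell.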

Stimuli were grayscale composite faces from the Chicago Face Database. I generated aligned and misaligned versions and added cue brackets to indicate the relevant half.

Methods

Note on Pilot Studies

Two pilot studies were conducted prior to preregistered data collection.
Pilot A tested an early version of the task with a longer trial structure, while Pilot B implemented the finalized version used for the preregistered design.
The analysis pipeline used for the final sample follows the one established in Pilot B; Pilot A is included for completeness and design validation.

Power Analysis

The original paper reported large congruency and alignment effects in the complete composite task (Δd’ = +0.45 for facilitation; Δd’ = -0.66 for interference). These effects were estimated using 455 participants because the authors examined reliability and correlations across three holistic tasks, not because the composite task itself required such a large sample. In my replication, I focus on the complete composite congruency x alignment effect, which has been shown to be large and reliable. Using a conservative estimate of Cohen’s dz = 0.40-0.50 for within-subject congruency effects, 60-80 participants provide ~85-95% power.

Planned N = 72, which balances high power, online data quality, and feasibility.
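The power figures can be checked with base R's `power.t.test`, treating the congruency contrast as a paired (within-subject) t-test. This is a sketch under that simplifying assumption and not the GLMM-based analysis itself; the grid of sample sizes and dz values is taken from the text, and a two-sided alpha of .05 is assumed.

```r
# Approximate power for a paired t-test across the planned range.
# With type = "paired", `delta / sd` plays the role of Cohen's dz.
grid <- expand.grid(n = c(60, 72, 80), dz = c(0.40, 0.50))

grid$power <- mapply(
  function(n, dz) {
    power.t.test(
      n = n, delta = dz, sd = 1,
      sig.level = 0.05,
      type = "paired",
      alternative = "two.sided"
    )$power
  },
  grid$n, grid$dz
)

print(grid)
```

Under these assumptions every cell of the grid exceeds 80% power, and power rises with both n and dz.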

Planned Sample

  • Target N: 72 Prolific participants (English-speaking adults, normal/corrected vision)
  • Stopping rule: stop once 72 valid submissions are collected; accept 72-80 usable datasets after exclusions.
  • Exclusions:
    • RT < 200 ms or > 5000 ms
    • Incomplete responses
    • Participants who fail browser/attention checks

Materials

To match stimulus properties as closely as possible, I contacted the original author (Dr. Haiyang Jin, Oct 27, 2025). Because of copyright restrictions, the authors could not share their exact composite images, but they confirmed that stimuli were constructed from the Chicago Face Database. Following their guidance, I downloaded CFD images, generated aligned and misaligned composites, initially created isolated halves, and added cue brackets.

Procedure

The experiment followed the procedure described in Jin, Hayward, & Cheung (2024): “Each trial began with a fixation cross (500 ms), followed by a composite study face (500 ms), a mask (500 ms), and then a composite test face that remained onscreen until response.” On each trial, participants judged whether the cued half of the test face matched the same half of the study face. The factorial structure included:

  • Cue (top/bottom)
  • Congruency (congruent/incongruent)
  • Alignment (aligned/misaligned)
  • Same/different

Participants completed only composite trials in this replication (see Differences from Original Study below).

Analysis Plan

  • Outcomes: sensitivity d’ (primary) and RT on correct trials.
  • Primary tests:
    • Congruency effect (congruent > incongruent) on aligned composites, and the Congruency x Alignment interaction.
    • Facilitation: aligned-congruent vs isolated; Interference: aligned-incongruent vs isolated.

Models: GLMMs following the original strategy (logistic for accuracy/d’ indexing; gamma or log-normal for RT)

Differences from Original Study

Several differences between my replication and Jin, Hayward, & Cheung (2024) exist due to practical constraints of online testing.

  1. Task scope. The original study administered three holistic processing tasks (part-whole, standard composite, complete composite) within the same session, using a large sample to estimate correlations and reliability across tasks. In contrast, my replication tests only the complete composite task, because my goal is to replicate the within-task effects (facilitation and interference), not between-task relationships. Anticipated impact: The complete composite task is fully self-contained and does not rely on other tasks, so removing the part-whole and standard composite tasks should not significantly affect the replication of the key effects.

  2. Sample size. The original study recruited N = 455 online participants, which was driven by the goal of estimating cross-task reliability and between-task correlations. My replication aims to recruit N = 72 Prolific participants, which is sufficient to detect the large within-subject d’ effects reported in the original paper (facilitation: +0.45, interference: -0.66). Anticipated impact: Power analysis using conservative effect sizes (dz = 0.40-0.50, as in the Power Analysis section) indicates >80% power with N = 72. Since I am not estimating cross-task correlations, the reduced sample size is appropriate and should be enough to replicate the core effects of the complete composite task.

  3. Trial count and session duration. The original task contained 400 composite trials + 80 isolated trials (~480 trials), which took ~40 minutes. My pilot implementation (Pilot A) unintentionally produced a much longer structure (~704 trials). Based on pilot feedback and Prolific feasibility considerations, I reduced the number of trials per cell by approximately half and removed the isolated-half baseline conditions, since my project does not analyze part-based performance. The final design preserves the full 2 x 2 x 2 composite structure (cue x alignment x congruency) but with fewer repetitions per cell. Anticipated impact: The original session is quite long, which increases the risk of fatigue, dropout, and noisier responses in an online setting. Composite congruency and alignment effects are large and highly reliable, and prior studies indicate that these effects do not require very high trial counts to emerge. Therefore, even without isolated-half baselines and with fewer repetitions per condition, the reduced design should remain a valid and sensitive test of holistic face processing.

  4. Analysis plan. I follow the same analysis approach described in the paper: logistic GLMMs for accuracy-derived sensitivity (d’ indexed by fixed effects), with congruency, alignment, cue, and their interaction terms entered as predictors. I apply the same trial-level exclusion rules (extreme RTs, invalid responses), and compute facilitation and interference as differences from isolated baselines. Anticipated impact: My analysis plan matches the original closely, and differences in sample or trial count do not alter the modeling structure.

Methods Addendum (Post Data Collection)

Actual Sample

Fifty participants were recruited via Prolific. All participants were adults and reported normal or corrected-to-normal vision. Data were screened using the preregistered exclusion criteria: trials with reaction times shorter than 200 ms or longer than 3000 ms were excluded, as were incomplete or invalid responses. After applying these criteria, all 50 participants were retained for analysis.

Differences from pre-data collection methods plan

The preregistered target sample size was N = 72. Data collection was stopped at N = 50 due to time constraints. No other deviations from the preregistered methods or analysis plan were made.

Results

Pilot A (Preliminary Implementation)

Pilot A was conducted on October 26, 2025, using an early version of the jsPsych composite task.
This version unintentionally included a larger number of trials (~704 total), which made the session longer and helped identify design adjustments for the final study.
Two participants completed Pilot A.

Data Loading and Preprocessing

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.2
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data1 <- read_csv("~/Downloads/compositeface_20251026_213322.csv") %>%
  mutate(Participant = "P1")
Rows: 705 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
data2 <- read_csv("~/Downloads/compositeface_20251026_230456.csv") %>%
  mutate(Participant = "P2")
Rows: 705 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Combine into one dataset
data_all <- bind_rows(data1, data2)

# Preprocess
df <- data_all %>%
  select(-rt) %>%  # remove lowercase duplicate column
  filter(trial_frame == "test_face") %>%
  rename(
    is_aligned   = Alignment,
    is_congruent = Congruency,
    rt           = RT
  ) %>%
  select(Participant, is_aligned, is_congruent, Correct, rt)

# Summary statistics
summary_df <- df %>%
  group_by(Participant, is_aligned, is_congruent) %>%
  summarise(
    mean_acc = mean(Correct, na.rm = TRUE),
    mean_rt  = mean(rt, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  )

summary_df
# A tibble: 8 × 6
  Participant is_aligned is_congruent mean_acc mean_rt     n
  <chr>       <chr>      <chr>           <dbl>   <dbl> <int>
1 P1          aligned    congruent       0.807    963.   176
2 P1          aligned    incongruent     0.614    998.   176
3 P1          misaligned congruent       0.716   1049.   176
4 P1          misaligned incongruent     0.665    956.   176
5 P2          aligned    congruent       0.864    705.   176
6 P2          aligned    incongruent     0.597    778.   176
7 P2          misaligned congruent       0.778    770.   176
8 P2          misaligned incongruent     0.744    709.   176

Accuracy by Condition

ggplot(summary_df, aes(x = is_congruent, y = mean_acc, fill = is_aligned)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  facet_wrap(~ Participant) +
  labs(
    title = "Pilot A: Accuracy by Congruency and Alignment",
    x = "Congruency",
    y = "Mean Accuracy",
    fill = "Alignment"
  ) +
  theme_minimal()

Reaction Time by Condition

ggplot(summary_df, aes(x = is_congruent, y = mean_rt, fill = is_aligned)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  facet_wrap(~ Participant) +
  labs(
    title = "Pilot A: Reaction Time by Congruency and Alignment",
    x = "Congruency",
    y = "Mean RT (ms)",
    fill = "Alignment"
  ) +
  theme_minimal()

Notes on Pilot A

Pilot A revealed several issues that informed updates for Pilot B:

  • The session duration was too long (~704 trials) and risked fatigue. This motivated reducing the number of repetitions per cell.
  • The overall alignment × congruency pattern was visible but noisy due to only two participants.
  • Minor issues with file naming and duplicate RT columns were corrected before Pilot B.

Pilot B (Finalized Implementation)

Pilot B was collected on November 29, 2025, using the finalized version of the jsPsych composite task.
This version implemented the full 2 × 2 × 2 × 2 composite design while reducing repetitions per condition to keep the task feasible for online data collection.

Two participants completed Pilot B. The purpose of Pilot B was to (a) verify the corrected trial structure, (b) ensure timing and branching logic worked as intended, and (c) confirm that the expected Congruency × Alignment pattern emerged in a clean dataset.

Data Import and Initial Inspection

library(tidyverse)
library(lme4)
Loading required package: Matrix

Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':

    expand, pack, unpack
library(ggplot2)

d1 <- read_csv("~/replication_jin2024/data/compositeface_20251129_161517.csv")
Rows: 417 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
d2 <- read_csv("~/replication_jin2024/data/compositeface_20251129_162056.csv")
Rows: 417 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
raw <- bind_rows(d1, d2)

dim(raw)
[1] 834  36
head(raw)
# A tibble: 6 × 36
  trial_frame      item_width_mm item_height_mm item_width_px px2mm view_dist_mm
  <chr>                    <dbl>          <dbl>         <dbl> <dbl>        <dbl>
1 virtual_chinrest          85.6           54.0           422  4.93         720.
2 test_face                 NA             NA              NA NA             NA 
3 test_face                 NA             NA              NA NA             NA 
4 test_face                 NA             NA              NA NA             NA 
5 test_face                 NA             NA              NA NA             NA 
6 test_face                 NA             NA              NA NA             NA 
# ℹ 30 more variables: rt <dbl>, item_width_deg <dbl>, px2deg <dbl>,
#   win_width_deg <dbl>, win_height_deg <dbl>, trial_type <chr>,
#   trial_index <dbl>, plugin_version <chr>, time_elapsed <dbl>, Subject <chr>,
#   Exp_code <chr>, Exp_name <chr>, CFVersion <chr>, isPavlovia <lgl>,
#   Browser <chr>, Prolific_id <chr>, Trial_num <dbl>, Cue <chr>,
#   Congruency <chr>, Alignment <chr>, SameDifferent <chr>, StimGroup <chr>,
#   StudyFace <chr>, TestFace <chr>, Correct_response <dbl>, MaskFace <chr>, …

Filtering to Experimental Trials

df <- raw %>%
  filter(
    trial_type == "image-keyboard-response",
    !is.na(SameDifferent),
    !is.na(Congruency),
    !is.na(Alignment),
    !is.na(Cue)
  )

dim(df)
[1] 832  36
head(df)
# A tibble: 6 × 36
  trial_frame item_width_mm item_height_mm item_width_px px2mm view_dist_mm
  <chr>               <dbl>          <dbl>         <dbl> <dbl>        <dbl>
1 test_face              NA             NA            NA    NA           NA
2 test_face              NA             NA            NA    NA           NA
3 test_face              NA             NA            NA    NA           NA
4 test_face              NA             NA            NA    NA           NA
5 test_face              NA             NA            NA    NA           NA
6 test_face              NA             NA            NA    NA           NA
# ℹ 30 more variables: rt <dbl>, item_width_deg <dbl>, px2deg <dbl>,
#   win_width_deg <dbl>, win_height_deg <dbl>, trial_type <chr>,
#   trial_index <dbl>, plugin_version <chr>, time_elapsed <dbl>, Subject <chr>,
#   Exp_code <chr>, Exp_name <chr>, CFVersion <chr>, isPavlovia <lgl>,
#   Browser <chr>, Prolific_id <chr>, Trial_num <dbl>, Cue <chr>,
#   Congruency <chr>, Alignment <chr>, SameDifferent <chr>, StimGroup <chr>,
#   StudyFace <chr>, TestFace <chr>, Correct_response <dbl>, MaskFace <chr>, …

Cleaning and Variable Setup

df <- df %>%
  mutate(
    Subject = factor(Subject),
    Congruency = factor(Congruency, levels = c("congruent", "incongruent")),
    Alignment = factor(Alignment, levels = c("aligned", "misaligned")),
    Cue = factor(Cue, levels = c("top", "bot")),
    SameDifferent = factor(SameDifferent, levels = c("same", "different")),
    Correct = as.numeric(Correct),
    RT = as.numeric(RT)
  )

# RT Exclusions
df <- df %>% filter(RT > 200, RT < 3000)
nrow(df)
[1] 810

Confirmatory Model: Accuracy (GLMM)

Logistic mixed-effects model predicting accuracy from Congruency, Alignment, and their interaction, with a random intercept for Subject. This model tests whether the expected Congruency × Alignment interaction (i.e., strongest interference in aligned–incongruent condition) is present.

acc_model <- glmer(
  Correct ~ Congruency * Alignment + (1 | Subject),
  data = df,
  family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)
boundary (singular) fit: see help('isSingular')
summary(acc_model)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Correct ~ Congruency * Alignment + (1 | Subject)
   Data: df
Control: glmerControl(optimizer = "bobyqa")

      AIC       BIC    logLik -2*log(L)  df.resid 
   1029.2    1052.7    -509.6    1019.2       805 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.7377 -1.0557  0.5755  0.8098  0.9472 

Random effects:
 Groups  Name        Variance  Std.Dev. 
 Subject (Intercept) 2.904e-16 1.704e-08
Number of obs: 810, groups:  Subject, 2

Fixed effects:
                                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 1.1051     0.1616   6.841 7.89e-12
Congruencyincongruent                      -0.9966     0.2142  -4.654 3.26e-06
Alignmentmisaligned                        -0.1607     0.2256  -0.712    0.476
Congruencyincongruent:Alignmentmisaligned   0.4742     0.3023   1.569    0.117
                                             
(Intercept)                               ***
Congruencyincongruent                     ***
Alignmentmisaligned                          
Congruencyincongruent:Alignmentmisaligned    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn
Cngrncyncng -0.754              
Algnmntmslg -0.716  0.540       
Cngrncync:A  0.534 -0.709 -0.746
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

RT Analysis (LMER)

# Z-scoring RTs within Subject
df <- df %>%
  group_by(Subject) %>%
  mutate(RT_z = scale(RT)[, 1]) %>%
  ungroup()

# Linear Mixed-Effects Model on RT
rt_model <- lmer(
  RT_z ~ Congruency * Alignment + (1 | Subject),
  data = df %>% filter(Correct == 1)
)
boundary (singular) fit: see help('isSingular')
summary(rt_model)
Linear mixed model fit by REML ['lmerMod']
Formula: RT_z ~ Congruency * Alignment + (1 | Subject)
   Data: df %>% filter(Correct == 1)

REML criterion at convergence: 1421

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.4109 -0.6830 -0.2596  0.3532  4.7813 

Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 0.0000   0.000   
 Residual             0.8538   0.924   
Number of obs: 527, groups:  Subject, 2

Fixed effects:
                                          Estimate Std. Error t value
(Intercept)                               -0.28312    0.07446  -3.802
Congruencyincongruent                      0.08328    0.11629   0.716
Alignmentmisaligned                        0.37282    0.10711   3.481
Congruencyincongruent:Alignmentmisaligned -0.13382    0.16264  -0.823

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn
Cngrncyncng -0.640              
Algnmntmslg -0.695  0.445       
Cngrncync:A  0.458 -0.715 -0.659
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
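
The singular-fit message is plausibly a consequence of the within-subject z-scoring rather than a data problem: once RTs are standardized per subject, each subject's mean is 0 by construction, so little between-subject variance remains for the random intercept to absorb. A minimal base R sketch with simulated RTs (the subject labels, means, and trial counts are made up for illustration):

```r
set.seed(1)

# Simulated RTs for three hypothetical subjects with very different speeds.
sim <- data.frame(
  Subject = rep(c("s1", "s2", "s3"), each = 200),
  RT = c(rnorm(200, 700, 150), rnorm(200, 1000, 200), rnorm(200, 1300, 250))
)

# Z-score within subject (same idea as group_by(Subject) + scale(RT)).
sim$RT_z <- ave(sim$RT, sim$Subject, FUN = function(x) (x - mean(x)) / sd(x))

raw_means <- tapply(sim$RT, sim$Subject, mean)    # differ widely across subjects
z_means   <- tapply(sim$RT_z, sim$Subject, mean)  # all ~0 after z-scoring
z_sds     <- tapply(sim$RT_z, sim$Subject, sd)    # all 1 after z-scoring

print(round(raw_means))
print(round(z_means, 10))
```

Because the model is fit to correct trials only, subject means on that subset need not be exactly zero, but they stay close to it, which is consistent with the near-zero intercept variance reported above.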

Descriptive Plots

Accuracy Summary

# Accuracy Summary
acc_summary <- df %>%
  group_by(Congruency, Alignment) %>%
  summarise(acc = mean(Correct))
`summarise()` has grouped output by 'Congruency'. You can override using the
`.groups` argument.
ggplot(acc_summary, aes(Congruency, acc, fill = Alignment)) +
  geom_col(position = "dodge") +
  labs(title = "Pilot B: Accuracy by Condition",
       y = "Accuracy") +
  theme_minimal()

Reaction Time Summary

# Reaction Time Summary
rt_summary <- df %>%
  filter(Correct == 1) %>%
  group_by(Congruency, Alignment) %>%
  summarise(rt = mean(RT))
`summarise()` has grouped output by 'Congruency'. You can override using the
`.groups` argument.
ggplot(rt_summary, aes(Congruency, rt, fill = Alignment)) +
  geom_col(position = "dodge") +
  labs(title = "Pilot B: Reaction Time (Correct Trials)",
       y = "RT (ms)") +
  theme_minimal()

RT Plot (Paper-Style)

rt_summary2 <- df %>%
  filter(Correct == 1) %>%
  group_by(Alignment, Congruency) %>%
  summarise(
    mean_rt = mean(RT),
    se_rt = sd(RT) / sqrt(n()),
    .groups = "drop"
  )

ggplot(rt_summary2, aes(x = Alignment, y = mean_rt, color = Congruency)) +
  geom_point(size = 3, position = position_dodge(width = 0.4)) +
  geom_errorbar(
    aes(ymin = mean_rt - se_rt, ymax = mean_rt + se_rt),
    width = 0.1,
    position = position_dodge(width = 0.4)
  ) +
  labs(title = "Pilot B Reaction Time (Paper-Style Format)",
       y = "RT (ms)", x = "Alignment") +
  coord_cartesian(ylim = c(600, 1400)) +
  theme_minimal(base_size = 14)

d′ Analysis (Sensitivity)

# d' with 0.5 added to counts and 1 to trial totals, so that
# hit or false-alarm rates of exactly 0 or 1 stay finite
compute_dprime <- function(hits, fas, n_hit, n_fa) {
  hit_rate <- (hits + 0.5) / (n_hit + 1)
  fa_rate  <- (fas + 0.5) / (n_fa + 1)
  qnorm(hit_rate) - qnorm(fa_rate)
}

# Hit: correct response on a "same" trial
# False alarm: incorrect response on a "different" trial
dp <- df %>%
  group_by(Subject, Congruency, Alignment) %>%
  summarise(
    hits = sum(Correct == 1 & SameDifferent == "same"),
    fas  = sum(Correct == 0 & SameDifferent == "different"),
    n_hit = sum(SameDifferent == "same"),
    n_fa  = sum(SameDifferent == "different"),
    dprime = compute_dprime(hits, fas, n_hit, n_fa),
    .groups = "drop"
  )
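
As a worked illustration of the d′ computation, the sketch below applies the same function to hypothetical counts (20 hits and 5 false alarms out of 25 trials each, then a perfect score). The counts are invented for the example and do not come from the data.

```r
# Same correction as in the analysis code: add 0.5 to counts, 1 to totals.
compute_dprime <- function(hits, fas, n_hit, n_fa) {
  hit_rate <- (hits + 0.5) / (n_hit + 1)
  fa_rate  <- (fas + 0.5) / (n_fa + 1)
  qnorm(hit_rate) - qnorm(fa_rate)
}

# Hypothetical cell: 20/25 hits, 5/25 false alarms.
d_typical <- compute_dprime(hits = 20, fas = 5, n_hit = 25, n_fa = 25)

# Hypothetical perfect cell: 25/25 hits, 0/25 false alarms.
# Without the correction this would be qnorm(1) - qnorm(0) = Inf.
d_perfect   <- compute_dprime(hits = 25, fas = 0, n_hit = 25, n_fa = 25)
d_uncorrect <- qnorm(25 / 25) - qnorm(0 / 25)

print(c(typical = d_typical, perfect = d_perfect, uncorrected = d_uncorrect))
```

The correction matters most with few trials per cell, where perfect or zero rates are more likely.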

d′ Plot (Paper-Style)

dp_summary <- dp %>%
  group_by(Congruency, Alignment) %>%
  summarise(
    mean_dp = mean(dprime),
    se_dp = sd(dprime) / sqrt(n()),
    .groups = "drop"
  )

ggplot(dp_summary, aes(x = Alignment, y = mean_dp, color = Congruency)) +
  geom_point(size = 3, position = position_dodge(width = 0.4)) +
  geom_errorbar(
    aes(ymin = mean_dp - se_dp, ymax = mean_dp + se_dp),
    width = 0.1,
    position = position_dodge(width = 0.4)
  ) +
  labs(title = "Pilot B d′ by Congruency × Alignment",
       y = "d′ (Sensitivity)", x = "Alignment") +
  coord_cartesian(ylim = c(0, 2.5)) +
  theme_minimal(base_size = 14)

Pilot B confirmed that:

  • The reduced trial count produced cleaner, faster sessions.
  • All condition labels (Congruency, Alignment, Cue, SameDifferent) were correctly logged.
  • The expected aligned-incongruent cost appeared in accuracy and, numerically, in RT.
  • No structural or timing bugs remained after the adjustments from Pilot A.

Final Results

Data preparation

Analyses were conducted on data from N = 50 participants after applying the preregistered exclusion criteria (RT < 200 ms or > 3000 ms, incomplete trials). Only experimental composite trials were included. Accuracy, reaction time (RT), and sensitivity (d′) were analyzed with the same pipeline used in Pilot B.

Data exclusion / filtering

df <- raw %>%
  filter(
    trial_type == "image-keyboard-response",
    !is.na(SameDifferent),
    !is.na(Congruency),
    !is.na(Alignment),
    !is.na(Cue)
  )

dim(df)
[1] 22368    36

Prepare data for analysis - create columns etc.

df <- df %>%
  mutate(
    Subject = factor(Subject),
    Congruency = factor(Congruency, levels = c("congruent", "incongruent")),
    Alignment = factor(Alignment, levels = c("aligned", "misaligned")),
    Cue = factor(Cue, levels = c("top", "bot")),
    SameDifferent = factor(SameDifferent, levels = c("same", "different")),
    Correct = as.numeric(Correct),
    RT = as.numeric(RT)
  )

# RT exclusions
df <- df %>% filter(RT > 200, RT < 3000)

length(unique(df$Subject))  # should still be 50
[1] 50
nrow(df)
[1] 21641

Confirmatory analysis

Confirmatory analysis - Accuracy (GLMM)

acc_model <- glmer(
  Correct ~ Congruency * Alignment + (1 | Subject),
  data = df,
  family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)
summary(acc_model)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Correct ~ Congruency * Alignment + (1 | Subject)
   Data: df
Control: glmerControl(optimizer = "bobyqa")

      AIC       BIC    logLik -2*log(L)  df.resid 
  23473.5   23513.4  -11731.8   23463.5     21636 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.8618 -1.0435  0.4707  0.6156  1.0203 

Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 0.1654   0.4067  
Number of obs: 21641, groups:  Subject, 50

Fixed effects:
                                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                                1.80698    0.06937  26.048  < 2e-16
Congruencyincongruent                     -1.22826    0.04806 -25.559  < 2e-16
Alignmentmisaligned                       -0.41003    0.05126  -7.999 1.26e-15
Congruencyincongruent:Alignmentmisaligned  0.64429    0.06586   9.783  < 2e-16
                                             
(Intercept)                               ***
Congruencyincongruent                     ***
Alignmentmisaligned                       ***
Congruencyincongruent:Alignmentmisaligned ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn
Cngrncyncng -0.447              
Algnmntmslg -0.417  0.601       
Cngrncync:A  0.325 -0.729 -0.778
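
The fixed effects above are on the log-odds scale. As a reading aid, the sketch below converts them to predicted accuracy for a typical participant (random intercept of 0) using `plogis`. The coefficient values are copied from the summary output; the object names are illustrative.

```r
# Fixed-effect estimates copied from the accuracy GLMM summary (log-odds).
b0    <- 1.80698   # intercept: congruent, aligned
b_inc <- -1.22826  # incongruent vs congruent (aligned)
b_mis <- -0.41003  # misaligned vs aligned (congruent)
b_int <- 0.64429   # incongruent x misaligned interaction

# Predicted accuracy per cell for a participant with random intercept = 0.
pred <- c(
  congruent_aligned      = plogis(b0),
  incongruent_aligned    = plogis(b0 + b_inc),
  congruent_misaligned   = plogis(b0 + b_mis),
  incongruent_misaligned = plogis(b0 + b_inc + b_mis + b_int)
)

# Congruency effect (congruent - incongruent) within each alignment level.
effect_aligned    <- pred[["congruent_aligned"]] - pred[["incongruent_aligned"]]
effect_misaligned <- pred[["congruent_misaligned"]] - pred[["incongruent_misaligned"]]

print(round(pred, 3))
print(round(c(aligned = effect_aligned, misaligned = effect_misaligned), 3))
```

On this scale the predicted congruency effect is roughly 0.22 for aligned faces and 0.11 for misaligned faces, matching the direction of the interaction term.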

Confirmatory analysis - Reaction Time (LMER)

df <- df %>%
  group_by(Subject) %>%
  mutate(RT_z = scale(RT)[, 1]) %>%
  ungroup()
rt_model <- lmer(
  RT_z ~ Congruency * Alignment + (1 | Subject),
  data = df %>% filter(Correct == 1)
)
boundary (singular) fit: see help('isSingular')
summary(rt_model)
Linear mixed model fit by REML ['lmerMod']
Formula: RT_z ~ Congruency * Alignment + (1 | Subject)
   Data: df %>% filter(Correct == 1)

REML criterion at convergence: 44023.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.5437 -0.6928 -0.2363  0.4541  7.8123 

Random effects:
 Groups   Name        Variance  Std.Dev. 
 Subject  (Intercept) 1.508e-17 3.883e-09
 Residual             9.109e-01 9.544e-01
Number of obs: 16032, groups:  Subject, 50

Fixed effects:
                                          Estimate Std. Error t value
(Intercept)                               -0.19157    0.01404 -13.644
Congruencyincongruent                      0.15583    0.02153   7.238
Alignmentmisaligned                        0.15849    0.02023   7.835
Congruencyincongruent:Alignmentmisaligned -0.10200    0.03037  -3.359

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn
Cngrncyncng -0.652              
Algnmntmslg -0.694  0.453       
Cngrncync:A  0.462 -0.709 -0.666
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

d’ analysis (sensitivity)

compute_dprime <- function(hits, fas, n_hit, n_fa) {
  hit_rate <- (hits + 0.5) / (n_hit + 1)
  fa_rate  <- (fas + 0.5) / (n_fa + 1)
  qnorm(hit_rate) - qnorm(fa_rate)
}
dp <- df %>%
  group_by(Subject, Congruency, Alignment) %>%
  summarise(
    hits = sum(Correct == 1 & SameDifferent == "same"),
    fas  = sum(Correct == 0 & SameDifferent == "different"),
    n_hit = sum(SameDifferent == "same"),
    n_fa  = sum(SameDifferent == "different"),
    dprime = compute_dprime(hits, fas, n_hit, n_fa),
    .groups = "drop"
  )
dp_summary <- dp %>%
  group_by(Congruency, Alignment) %>%
  summarise(
    mean_dp = mean(dprime),
    se_dp = sd(dprime) / sqrt(n()),
    .groups = "drop"
  )

dp_summary
# A tibble: 4 × 4
  Congruency  Alignment  mean_dp  se_dp
  <fct>       <fct>        <dbl>  <dbl>
1 congruent   aligned      2.22  0.0832
2 congruent   misaligned   1.75  0.0898
3 incongruent aligned      0.727 0.0794
4 incongruent misaligned   1.05  0.104 
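
The congruency effect in d′ and its modulation by alignment can be read directly off the table above. The short sketch below does that arithmetic on the printed condition means; it is descriptive only and does not replace a test on the per-subject values.

```r
# Condition means copied from dp_summary.
dp_means <- c(
  congruent_aligned      = 2.22,
  congruent_misaligned   = 1.75,
  incongruent_aligned    = 0.727,
  incongruent_misaligned = 1.05
)

# Congruency effect (congruent - incongruent) at each alignment level.
effect_aligned    <- dp_means[["congruent_aligned"]] - dp_means[["incongruent_aligned"]]
effect_misaligned <- dp_means[["congruent_misaligned"]] - dp_means[["incongruent_misaligned"]]

# Interaction contrast: how much larger the effect is for aligned faces.
interaction_dp <- effect_aligned - effect_misaligned

print(c(aligned = effect_aligned,
        misaligned = effect_misaligned,
        interaction = interaction_dp))
```

The congruency effect is about 1.49 d′ units for aligned composites versus 0.70 for misaligned composites.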

Reaction Time plot

rt_summary2 <- df %>%
  filter(Correct == 1) %>%
  group_by(Alignment, Congruency) %>%
  summarise(
    mean_rt = mean(RT),
    se_rt = sd(RT) / sqrt(n()),
    .groups = "drop"
  )

rt_summary2
# A tibble: 4 × 4
  Alignment  Congruency  mean_rt se_rt
  <fct>      <fct>         <dbl> <dbl>
1 aligned    congruent     1037.  6.79
2 aligned    incongruent   1099.  8.35
3 misaligned congruent     1096.  7.31
4 misaligned incongruent   1124.  8.20
ggplot(rt_summary2,
       aes(x = Alignment,
           y = mean_rt,
           color = Congruency,
           group = Congruency)) +
  geom_point(size = 3, position = position_dodge(0.4)) +
  geom_errorbar(
    aes(ymin = mean_rt - se_rt,
        ymax = mean_rt + se_rt),
    width = 0.1,
    position = position_dodge(0.4)
  ) +
  labs(
    title = "Final Results: Reaction Time by Congruency × Alignment",
    y = "RT (ms)",
    x = "Alignment"
  ) +
  coord_cartesian(ylim = c(600, 1400)) +
  theme_minimal(base_size = 14)

d’ plot

ggplot(dp_summary, aes(x = Alignment, y = mean_dp, color = Congruency)) +
  geom_point(size = 3, position = position_dodge(0.4)) +
  geom_errorbar(
    aes(ymin = mean_dp - se_dp, ymax = mean_dp + se_dp),
    width = 0.1,
    position = position_dodge(0.4)
  ) +
  labs(
    title = "Final Results: d′ by Congruency × Alignment",
    x = "Alignment",
    y = "d′ (Sensitivity)"
  ) +
  coord_cartesian(ylim = c(0, 2.5)) +
  theme_minimal(base_size = 14)

Exploratory analyses

Cue location effect analysis

cue_model <- glmer(
  Correct ~ Congruency * Alignment * Cue + (1 | Subject),
  data = df,
  family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)

summary(cue_model)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Correct ~ Congruency * Alignment * Cue + (1 | Subject)
   Data: df
Control: glmerControl(optimizer = "bobyqa")

      AIC       BIC    logLik -2*log(L)  df.resid 
  23292.1   23363.9  -11637.0   23274.1     21632 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.1212 -0.9480  0.4532  0.6045  1.1655 

Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 0.1692   0.4113  
Number of obs: 21641, groups:  Subject, 50

Fixed effects:
                                                 Estimate Std. Error z value
(Intercept)                                       1.92650    0.08130  23.697
Congruencyincongruent                            -1.07708    0.07059 -15.258
Alignmentmisaligned                              -0.46204    0.07470  -6.186
Cuebot                                           -0.22780    0.07707  -2.956
Congruencyincongruent:Alignmentmisaligned         0.73117    0.09665   7.565
Congruencyincongruent:Cuebot                     -0.29715    0.09634  -3.084
Alignmentmisaligned:Cuebot                        0.09698    0.10262   0.945
Congruencyincongruent:Alignmentmisaligned:Cuebot -0.15406    0.13221  -1.165
                                                 Pr(>|z|)    
(Intercept)                                       < 2e-16 ***
Congruencyincongruent                             < 2e-16 ***
Alignmentmisaligned                              6.19e-10 ***
Cuebot                                            0.00312 ** 
Congruencyincongruent:Alignmentmisaligned        3.87e-14 ***
Congruencyincongruent:Cuebot                      0.00204 ** 
Alignmentmisaligned:Cuebot                        0.34466    
Congruencyincongruent:Alignmentmisaligned:Cuebot  0.24393    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn Cuebot Cngr:A Cngr:C Algn:C
Cngrncyncng -0.559                                          
Algnmntmslg -0.527  0.606                                   
Cuebot      -0.510  0.587  0.555                            
Cngrncync:A  0.407 -0.729 -0.772 -0.428                     
Cngrncync:C  0.407 -0.730 -0.443 -0.799  0.533              
Algnmntms:C  0.383 -0.440 -0.727 -0.750  0.561  0.599       
Cngrncy:A:C -0.296  0.532  0.563  0.581 -0.730 -0.727 -0.775
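
Because the coefficients above are on the log-odds scale, it can help to convert them to predicted accuracies. The sketch below does this by hand for a typical participant (random intercept of zero), using the fixed-effect estimates copied from the summary. The helper name predict_acc and the short coefficient labels are hypothetical, and these conditional predictions are not expected to match the subject-averaged accuracies exactly.

```r
# Fixed-effect estimates copied from summary(cue_model), log-odds scale
b <- c(
  intercept        =  1.92650,
  incong           = -1.07708,
  misal            = -0.46204,
  bot              = -0.22780,
  incong_misal     =  0.73117,
  incong_bot       = -0.29715,
  misal_bot        =  0.09698,
  incong_misal_bot = -0.15406
)

# Hypothetical helper: predicted accuracy for a design cell given 0/1 dummy
# codes (reference levels: congruent, aligned, top cue)
predict_acc <- function(incong, misal, bot) {
  eta <- b[["intercept"]] +
    b[["incong"]] * incong + b[["misal"]] * misal + b[["bot"]] * bot +
    b[["incong_misal"]] * incong * misal +
    b[["incong_bot"]] * incong * bot +
    b[["misal_bot"]] * misal * bot +
    b[["incong_misal_bot"]] * incong * misal * bot
  plogis(eta)  # inverse logit: log-odds -> probability
}

# All eight Congruency x Alignment x Cue cells
cells <- expand.grid(incong = 0:1, misal = 0:1, bot = 0:1)
cells$pred_acc <- with(cells, predict_acc(incong, misal, bot))
cells
```

Reading the cells side by side shows how the negative Congruencyincongruent:Cuebot term lowers predicted accuracy for incongruent trials when the bottom half is cued.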
# Accuracy per subject
cue_subj <- df %>%
  group_by(Subject, Cue, Congruency, Alignment) %>%
  summarise(acc = mean(Correct), .groups = "drop")

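# Mean + SE across subjects
cue_summary <- cue_subj %>%
  group_by(Cue, Congruency, Alignment) %>%
  summarise(
    # se must be computed before acc is overwritten by its mean;
    # otherwise sd() receives a single value and returns NA
    se  = sd(acc) / sqrt(n()),
    acc = mean(acc),
    .groups = "drop"
  ) %>%
  select(Cue, Congruency, Alignment, acc, se)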

cue_summary
# A tibble: 8 × 5
  Cue   Congruency  Alignment    acc    se
  <fct> <fct>       <fct>      <dbl> <dbl>
1 top   congruent   aligned    0.866    NA
2 top   congruent   misaligned 0.805    NA
3 top   incongruent aligned    0.693    NA
4 top   incongruent misaligned 0.747    NA
5 bot   congruent   aligned    0.836    NA
6 bot   congruent   misaligned 0.784    NA
7 bot   incongruent aligned    0.577    NA
8 bot   incongruent misaligned 0.625    NA
ggplot(
  cue_summary,
  aes(
    x = Alignment,
    y = acc,
    color = Congruency,
    group = Congruency
  )
) +
  geom_point(size = 3, position = position_dodge(0.4)) +
  geom_errorbar(
    aes(ymin = acc - se, ymax = acc + se),
    width = 0.15,
    linewidth = 0.9,
    position = position_dodge(0.4)
  ) +
  facet_wrap(~ Cue) +
  labs(
    title = "Accuracy by Congruency × Alignment × Cue",
    y = "Accuracy",
    x = "Alignment"
  ) +
  ylim(0.5, 1) +
  theme_minimal(base_size = 14)

Discussion

Summary of Replication Attempt

The present study successfully replicated the core behavioral findings of Jin, Hayward, and Cheung (2024) using an online implementation of the complete composite face task. Sensitivity (d’) was highest for aligned-congruent composites (M = 2.22, SE = 0.08) and lowest for aligned-incongruent composites (M = 0.73, SE = 0.08). Critically, the congruency effect – defined as the difference between congruent and incongruent trials – was substantially larger in the aligned condition (Δd’ = 1.49) than in the misaligned condition (Δd’ = 0.70), yielding a clear Congruency x Alignment interaction. This pattern closely mirrors the primary effect reported in the original study.
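
The arithmetic behind these contrasts can be reproduced from the condition means in the d′ summary table. This is a descriptive sketch using the group means as printed (rounded); it is not a substitute for an inferential test of the interaction, and the variable names are illustrative.

```r
# Group-mean d' values copied from the dp_summary table (as printed)
dp_means <- c(
  congruent_aligned      = 2.22,
  congruent_misaligned   = 1.75,
  incongruent_aligned    = 0.727,
  incongruent_misaligned = 1.05
)

# Congruency effect (congruent - incongruent) within each alignment level
effect_aligned <- dp_means[["congruent_aligned"]] -
  dp_means[["incongruent_aligned"]]
effect_misaligned <- dp_means[["congruent_misaligned"]] -
  dp_means[["incongruent_misaligned"]]

# Interaction contrast: how much larger the congruency effect is when aligned
interaction_contrast <- effect_aligned - effect_misaligned

round(c(aligned     = effect_aligned,
        misaligned  = effect_misaligned,
        interaction = interaction_contrast), 2)
```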

Commentary

Reaction time differences across conditions were modest, consistent with Jin et al. (2024). Mean correct-trial RTs ranged from approximately 1037 to 1124 ms. Aligned-incongruent trials were about 62 ms slower than aligned-congruent trials, and the congruency-related difference was smaller under misalignment (about 28 ms). This pattern suggests that the composite face effect was expressed primarily as a sensitivity cost rather than a pronounced slowing of responses.

Despite several methodological differences from the original study (smaller sample size, reduced trial counts, exclusion of isolated-part baselines), the qualitative and quantitative patterns of results closely matched those previously reported. Exploratory analyses indicated that holistic interference effects were evident for both cue locations, with a stronger congruency effect observed for bottom-cued trials than for top-cued trials. However, this asymmetry was not a focus of the preregistered analyses and should be interpreted cautiously. Together, these findings indicate that the complete composite task provides a robust measure of holistic face processing that generalizes across implementations.