Replication of ‘Two faces of holistic face processing: Facilitation and interference underlying part-whole and composite effects’ by Jin, Hayward, & Cheung (2024, Journal of Vision)

Author

Seojin Lee (seojinl@stanford.edu)

Published

November 30, 2025

Introduction

Justification

My research interests lie in human visual perception, particularly in how the brain integrates low-level information from complex visual input to reach a higher-level understanding of an object. Face perception is a good example: recognizing an identity requires integrating smaller visual fragments such as the eyes, nose, and mouth. The complete composite task from Jin, Hayward, & Cheung (2024) offers a well-defined behavioral paradigm for studying this integration process. Specifically, it separates two components of holistic processing, facilitation and interference, rather than treating holistic processing as a single phenomenon. Replicating these effects will help me better understand the mechanisms of holistic face perception and will inform my own research on visual integration.

Stimuli and Procedure

This project replicates the complete composite task online using jsPsych on Prolific. Each trial presents a fixation cross (500 ms), a study composite (500 ms), a mask (500 ms), and then a test stimulus. Participants judge whether the cued half (top or bottom) of the test face matches the same half of the study face while ignoring the other half. The design manipulates:
- Congruency (congruent vs. incongruent)
- Alignment (aligned vs. misaligned)
- Correct response (same vs. different)
- Cue location (top vs. bottom)

Isolated top/bottom halves provide a baseline to measure facilitation (congruent - isolated) and interference (incongruent - isolated) effects. The target trial structure replicates the original:
- 400 composite trials (2 x 2 x 2 x 2)

For my replication, after piloting, I removed the isolated blocks because the composite congruency x alignment effects I aim to replicate do not depend on isolated trials.
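
As a quick check on the factorial structure described above, the composite design can be enumerated in R. This is an illustrative sketch only: the factor labels are assumed names rather than the task's actual variable names, and the per-cell count simply assumes that the 400 composite trials are split evenly across cells.

```r
# Illustrative enumeration of the 2 x 2 x 2 x 2 composite design.
# Factor labels are assumed for illustration, not taken from the task code.
design <- expand.grid(
  Congruency      = c("congruent", "incongruent"),
  Alignment       = c("aligned", "misaligned"),
  CorrectResponse = c("same", "different"),
  Cue             = c("top", "bottom"),
  stringsAsFactors = FALSE
)

n_cells <- nrow(design)            # number of unique condition cells
trials_per_cell <- 400 / n_cells   # assumes an even split of 400 trials

n_cells
trials_per_cell
```

The same table could be used to check a trial list for balance, for example by comparing observed cell counts against `trials_per_cell`.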

Stimuli were grayscale composite faces from the Chicago Face Database. I generated aligned and misaligned versions and added cue brackets to indicate the relevant half.

Links to repos

Replication repository: https://github.com/psych251/jin2024
Original paper: https://jov.arvojournals.org/article.aspx?articleid=2802147

Methods

Power Analysis

The original paper reported large congruency and alignment effects in the complete composite task (Δd’ = +0.45 for facilitation; Δd’ = -0.66 for interference). These effects were estimated using 455 participants because the authors examined reliability and correlations across three holistic tasks, not because the composite task itself required such a large sample. In my replication, I focus on the complete composite congruency x alignment effect, which has been shown to be large and reliable. Using a conservative estimate of Cohen’s dz = 0.40-0.50 for within-subject congruency effects, 60-80 participants provide ~85-95% power.

Planned N = 72, which balances high power, online data quality, and feasibility.
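
The power figures above can be approximated with base R's `power.t.test()`. This is a rough sketch, not the analysis used in the original paper: it treats the congruency effect as a paired t-test on per-participant differences, whereas the planned analysis is a GLMM, and the dz values are the assumed range stated above.

```r
# Power of a paired (within-subject) t-test for assumed values of dz.
# With sd = 1, `delta` equals the standardized effect size dz.
planned_n <- 72
dz_values <- c(0.40, 0.50)   # assumed conservative range from the text

power_at_n <- sapply(dz_values, function(dz) {
  power.t.test(n = planned_n, delta = dz, sd = 1,
               sig.level = 0.05, type = "paired",
               alternative = "two.sided")$power
})
names(power_at_n) <- paste0("dz = ", dz_values)
round(power_at_n, 3)

# Smallest n reaching 80% power at the lower bound dz = 0.40
n_for_80 <- power.t.test(delta = 0.40, sd = 1, power = 0.80,
                         sig.level = 0.05, type = "paired")$n
ceiling(n_for_80)
```

Because the t-test ignores trial-level variability and the mixed-model structure, these numbers are best read as a ballpark check on the planned N rather than a formal power analysis.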

Planned Sample

Target N: 72 Prolific participants (English-speaking adults, normal/corrected vision)
Stopping rule: Stop once 72 valid submissions are collected; accept 72-80 usable datasets after exclusions.
Exclusions:
- RT < 200 ms or > 5000 ms
- Incomplete responses
- Participants who fail browser/attention checks
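
The trial-level RT rule above could be applied with a small helper like the one below. This is a minimal sketch under stated assumptions: `apply_rt_exclusions()` and the toy data frame are hypothetical, the `RT` column is assumed to be in milliseconds, and dropping missing RTs is my assumption about how incomplete responses would be handled.

```r
# Keep only trials inside the planned RT window (200-5000 ms).
# `trials` is a hypothetical data frame with an `RT` column in milliseconds.
apply_rt_exclusions <- function(trials, rt_min = 200, rt_max = 5000) {
  # Per the plan, drop RT < rt_min or RT > rt_max. Missing RTs are also
  # dropped here (an assumption: treated as incomplete responses).
  keep <- !is.na(trials$RT) & trials$RT >= rt_min & trials$RT <= rt_max
  trials[keep, , drop = FALSE]
}

# Toy data (not real): two out-of-range trials and one missing RT
toy <- data.frame(RT = c(150, 450, 980, 5200, NA, 3100))
cleaned <- apply_rt_exclusions(toy)
nrow(cleaned)
```

Keeping the thresholds as arguments makes it easy to report how many trials each cutoff removes.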

Materials

To match stimulus properties as closely as possible, I contacted the original author (Dr. Haiyang Jin, Oct 27, 2025). Because of copyright restrictions, the authors could not share their exact composite images, but they confirmed that stimuli were constructed from the Chicago Face Database. Following their guidance, I downloaded CFD images, generated aligned and misaligned composites, created isolated halves initially, and added cue brackets.

Procedure

The experiment followed the procedure described in Jin, Hayward, & Cheung (2024): “Each trial began with a fixation cross (500 ms), followed by a composite study face (500 ms), a mask (500 ms), and then a composite test face that remained onscreen until response.” On each trial, participants judged whether the cued half of a test face matched the study face. The factorial structure included:
- Cue (top/bottom)
- Congruency (congruent/incongruent)
- Alignment (aligned/misaligned)
- Same/different

Participants completed only composite trials in this replication (see Differences section below).

Analysis Plan

- Outcomes: sensitivity d’ (primary) and RT on correct trials.
- Primary tests: the congruency effect (congruent > incongruent) on aligned composites, and the Congruency x Alignment interaction.
- Facilitation: aligned-congruent vs. isolated. Interference: aligned-incongruent vs. isolated.

Models: GLMMs following the original strategy (logistic for accuracy/d’ indexing; gamma or log-normal for RT)
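
To make the relationship between the outcome measures concrete, the sketch below computes facilitation, interference, and the congruency effect from d’ values. The numbers are hypothetical placeholders, not data from the original paper or from my pilots.

```r
# Hypothetical d' values for one participant on aligned trials (not real data).
dprime_congruent   <- 2.0
dprime_incongruent <- 1.0
dprime_isolated    <- 1.5

# Differences from the isolated baseline
facilitation <- dprime_congruent   - dprime_isolated   # expected to be positive
interference <- dprime_incongruent - dprime_isolated   # expected to be negative

# The congruency effect compares the two composite conditions directly,
# so the isolated baseline cancels out of it.
congruency_effect <- dprime_congruent - dprime_incongruent

c(facilitation = facilitation,
  interference = interference,
  congruency_effect = congruency_effect)
```

Because the congruency effect equals facilitation minus interference, it can be estimated from composite trials alone, whereas facilitation and interference individually require the isolated baseline.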

Differences from Original Study

Several differences between my replication and Jin, Hayward, & Cheung (2024) exist due to practical constraints of online testing.

1. Task scope
The original study administered three holistic processing tasks (part-whole, standard composite, complete composite) within the same session, using a large sample to estimate correlations and reliability across tasks. In contrast, my replication tests only the complete composite task, because my goal is to replicate the within-task effects (facilitation and interference), not between-task relationships.
Anticipated impact: The complete composite task is fully self-contained and does not rely on other tasks, so removing the part-whole and standard composite tasks should not significantly affect the replication of the key effects.

2. Sample size
The original study recruited N = 455 online participants, which was driven by their goal of estimating cross-task reliability and between-task correlations. My replication aims to recruit N = 72 Prolific participants, which is sufficient to detect the large within-subject d’ effects reported in the original paper (facilitation: +0.45, interference: -0.66).
Anticipated impact: Power analysis using the conservative effect sizes described in the Power Analysis section (dz = 0.40-0.50) indicates >80% power with N = 72. Since I am not estimating cross-task correlations, the reduced sample size is appropriate and should be enough to replicate the core effects of the complete composite task.

3. Trial count and session duration
The original task contained 400 composite trials + 80 isolated trials (~480 trials), which took ~40 minutes. My pilot implementation (Pilot A) unintentionally produced a much longer structure (~704 trials). Based on pilot feedback and Prolific feasibility considerations, I reduced the number of trials per cell by approximately half and removed the isolated-half baseline conditions, since my project does not analyze part-based performance. The final design preserves the full 2 x 2 x 2 composite structure (cue x alignment x congruency) but with fewer repetitions per cell.
Anticipated impact: The original session is quite long, which increases the risk of fatigue, dropout, and noisier responses in an online setting. Composite congruency and alignment effects are large and highly reliable, and prior studies indicate that these effects do not require very high trial counts to emerge. Therefore, even without isolated-half baselines and with fewer repetitions per condition, the reduced design should remain a valid and sensitive test of holistic face processing.

4. Analysis plan
I follow the same analysis approach described in the paper: logistic GLMMs for accuracy-derived sensitivity (d’ indexed by fixed effects), with congruency, alignment, cue, and their interaction terms entered as predictors. I apply the same trial-level exclusion rules (extreme RTs, invalid responses), and compute facilitation and interference as differences from isolated baselines.
Anticipated impact: My analysis plan matches the original closely, and differences in sample or trial count do not alter the modeling structure.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, data exclusions based on rules spelled out in analysis plan

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Pilot A (Preliminary Implementation)

Pilot A was conducted on October 26, 2025, using an early version of the jsPsych composite task.
This version unintentionally included a larger number of trials (~704 total), which made the session longer and helped identify design adjustments for the final study.
Two participants completed Pilot A.

Data Loading and Preprocessing

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.2
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

data1 <- read_csv("~/Downloads/compositeface_20251026_213322.csv") %>%
  mutate(Participant = "P1")

Rows: 705 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

data2 <- read_csv("~/Downloads/compositeface_20251026_230456.csv") %>%
  mutate(Participant = "P2")

Rows: 705 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Combine into one dataset
data_all <- bind_rows(data1, data2)

# Preprocess
df <- data_all %>%
  select(-rt) %>%  # remove lowercase duplicate column
  filter(trial_frame == "test_face") %>%
  rename(
    is_aligned   = Alignment,
    is_congruent = Congruency,
    rt           = RT
  ) %>%
  select(Participant, is_aligned, is_congruent, Correct, rt)

# Summary statistics
summary_df <- df %>%
  group_by(Participant, is_aligned, is_congruent) %>%
  summarise(
    mean_acc = mean(Correct, na.rm = TRUE),
    mean_rt  = mean(rt, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  )

summary_df

# A tibble: 8 × 6
  Participant is_aligned is_congruent mean_acc mean_rt     n
  <chr>       <chr>      <chr>           <dbl>   <dbl> <int>
1 P1          aligned    congruent       0.807    963.   176
2 P1          aligned    incongruent     0.614    998.   176
3 P1          misaligned congruent       0.716   1049.   176
4 P1          misaligned incongruent     0.665    956.   176
5 P2          aligned    congruent       0.864    705.   176
6 P2          aligned    incongruent     0.597    778.   176
7 P2          misaligned congruent       0.778    770.   176
8 P2          misaligned incongruent     0.744    709.   176

Accuracy by Condition

ggplot(summary_df, aes(x = is_congruent, y = mean_acc, fill = is_aligned)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  facet_wrap(~ Participant) +
  labs(
    title = "Pilot A: Accuracy by Congruency and Alignment",
    x = "Congruency",
    y = "Mean Accuracy",
    fill = "Alignment"
  ) +
  theme_minimal()

Reaction Time by Condition

ggplot(summary_df, aes(x = is_congruent, y = mean_rt, fill = is_aligned)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  facet_wrap(~ Participant) +
  labs(
    title = "Pilot A: Reaction Time by Congruency and Alignment",
    x = "Congruency",
    y = "Mean RT (ms)",
    fill = "Alignment"
  ) +
  theme_minimal()

Notes on Pilot A

Pilot A revealed several issues that informed updates for Pilot B:
- The session was too long (~704 trials) and risked fatigue. This motivated reducing the number of repetitions per cell.
- The overall alignment × congruency pattern was visible but noisy because only two participants were tested.
- Minor issues with file naming and duplicate RT columns were corrected before Pilot B.

Pilot B (Finalized Implementation)

Pilot B was collected on November 29, 2025, using the finalized version of the jsPsych composite task.
This version implemented the full 2 × 2 × 2 × 2 composite design while reducing repetitions per condition to keep the task feasible for online data collection.

Two participants completed Pilot B. The purpose of Pilot B was to (a) verify the corrected trial structure, (b) ensure timing and branching logic worked as intended, and (c) confirm that the expected Congruency × Alignment pattern emerged in a clean dataset.

Data Import and Initial Inspection

library(tidyverse)
library(lme4)

Loading required package: Matrix


Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

library(ggplot2)

d1 <- read_csv("~/replication_jin2024/data/compositeface_20251129_161517.csv")

Rows: 417 Columns: 36

── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

d2 <- read_csv("~/replication_jin2024/data/compositeface_20251129_162056.csv")

Rows: 417 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (18): trial_frame, trial_type, plugin_version, Subject, Exp_code, Exp_na...
dbl (16): item_width_mm, item_height_mm, item_width_px, px2mm, view_dist_mm,...
lgl  (2): isPavlovia, Correct

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

raw <- bind_rows(d1, d2)

dim(raw)

[1] 834  36

head(raw)

# A tibble: 6 × 36
  trial_frame      item_width_mm item_height_mm item_width_px px2mm view_dist_mm
  <chr>                    <dbl>          <dbl>         <dbl> <dbl>        <dbl>
1 virtual_chinrest          85.6           54.0           422  4.93         720.
2 test_face                 NA             NA              NA NA             NA 
3 test_face                 NA             NA              NA NA             NA 
4 test_face                 NA             NA              NA NA             NA 
5 test_face                 NA             NA              NA NA             NA 
6 test_face                 NA             NA              NA NA             NA 
# ℹ 30 more variables: rt <dbl>, item_width_deg <dbl>, px2deg <dbl>,
#   win_width_deg <dbl>, win_height_deg <dbl>, trial_type <chr>,
#   trial_index <dbl>, plugin_version <chr>, time_elapsed <dbl>, Subject <chr>,
#   Exp_code <chr>, Exp_name <chr>, CFVersion <chr>, isPavlovia <lgl>,
#   Browser <chr>, Prolific_id <chr>, Trial_num <dbl>, Cue <chr>,
#   Congruency <chr>, Alignment <chr>, SameDifferent <chr>, StimGroup <chr>,
#   StudyFace <chr>, TestFace <chr>, Correct_response <dbl>, MaskFace <chr>, …

Filtering to Experimental Trials

df <- raw %>%
  filter(
    trial_type == "image-keyboard-response",
    !is.na(SameDifferent),
    !is.na(Congruency),
    !is.na(Alignment),
    !is.na(Cue)
  )

dim(df)

[1] 832  36

head(df)

# A tibble: 6 × 36
  trial_frame item_width_mm item_height_mm item_width_px px2mm view_dist_mm
  <chr>               <dbl>          <dbl>         <dbl> <dbl>        <dbl>
1 test_face              NA             NA            NA    NA           NA
2 test_face              NA             NA            NA    NA           NA
3 test_face              NA             NA            NA    NA           NA
4 test_face              NA             NA            NA    NA           NA
5 test_face              NA             NA            NA    NA           NA
6 test_face              NA             NA            NA    NA           NA
# ℹ 30 more variables: rt <dbl>, item_width_deg <dbl>, px2deg <dbl>,
#   win_width_deg <dbl>, win_height_deg <dbl>, trial_type <chr>,
#   trial_index <dbl>, plugin_version <chr>, time_elapsed <dbl>, Subject <chr>,
#   Exp_code <chr>, Exp_name <chr>, CFVersion <chr>, isPavlovia <lgl>,
#   Browser <chr>, Prolific_id <chr>, Trial_num <dbl>, Cue <chr>,
#   Congruency <chr>, Alignment <chr>, SameDifferent <chr>, StimGroup <chr>,
#   StudyFace <chr>, TestFace <chr>, Correct_response <dbl>, MaskFace <chr>, …

Cleaning and Variable Setup

df <- df %>%
  mutate(
    Subject = factor(Subject),
    Congruency = factor(Congruency, levels = c("congruent", "incongruent")),
    Alignment = factor(Alignment, levels = c("aligned", "misaligned")),
    Cue = factor(Cue, levels = c("top", "bot")),
    SameDifferent = factor(SameDifferent, levels = c("same", "different")),
    Correct = as.numeric(Correct),
    RT = as.numeric(RT)
  )

# RT Exclusions
df <- df %>% filter(RT > 200, RT < 3000)
nrow(df)

[1] 810

Confirmatory Model: Accuracy (GLMM)

Logistic mixed-effects model predicting accuracy from Congruency, Alignment, and their interaction, with a random intercept for Subject. This model tests whether the expected Congruency × Alignment interaction (i.e., strongest interference in aligned–incongruent condition) is present.

acc_model <- glmer(
  Correct ~ Congruency * Alignment + (1 | Subject),
  data = df,
  family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)

boundary (singular) fit: see help('isSingular')

summary(acc_model)

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Correct ~ Congruency * Alignment + (1 | Subject)
   Data: df
Control: glmerControl(optimizer = "bobyqa")

      AIC       BIC    logLik -2*log(L)  df.resid 
   1029.2    1052.7    -509.6    1019.2       805 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.7377 -1.0557  0.5755  0.8098  0.9472 

Random effects:
 Groups  Name        Variance  Std.Dev. 
 Subject (Intercept) 2.904e-16 1.704e-08
Number of obs: 810, groups:  Subject, 2

Fixed effects:
                                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 1.1051     0.1616   6.841 7.89e-12
Congruencyincongruent                      -0.9966     0.2142  -4.654 3.26e-06
Alignmentmisaligned                        -0.1607     0.2256  -0.712    0.476
Congruencyincongruent:Alignmentmisaligned   0.4742     0.3023   1.569    0.117
                                             
(Intercept)                               ***
Congruencyincongruent                     ***
Alignmentmisaligned                          
Congruencyincongruent:Alignmentmisaligned    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn
Cngrncyncng -0.754              
Algnmntmslg -0.716  0.540       
Cngrncync:A  0.534 -0.709 -0.746
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
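
The fixed-effect estimates above are on the log-odds scale, which can be hard to read directly. As a rough aid, the sketch below converts the four condition cells to predicted proportions correct with `plogis()`. The coefficients are copied from the summary above; the cell labels are my own, and with only two pilot participants the values are descriptive rather than inferential.

```r
# Fixed-effect estimates copied from the Pilot B accuracy model summary.
b <- c(intercept = 1.1051, incongruent = -0.9966,
       misaligned = -0.1607, interaction = 0.4742)

# Log-odds for each cell under treatment coding
# (reference levels: congruent, aligned).
logit <- c(
  aligned_congruent      = b[["intercept"]],
  aligned_incongruent    = b[["intercept"]] + b[["incongruent"]],
  misaligned_congruent   = b[["intercept"]] + b[["misaligned"]],
  misaligned_incongruent = sum(b)
)

p_correct <- plogis(logit)   # inverse logit: log-odds to probability
round(p_correct, 3)

# Congruency effect (congruent minus incongruent) within each alignment level
effect_aligned    <- p_correct[["aligned_congruent"]] -
                     p_correct[["aligned_incongruent"]]
effect_misaligned <- p_correct[["misaligned_congruent"]] -
                     p_correct[["misaligned_incongruent"]]
round(c(aligned = effect_aligned, misaligned = effect_misaligned), 3)
```

Reading the model this way shows the direction of the interaction: the congruency effect in predicted accuracy is larger for aligned than for misaligned composites, even though the interaction term itself is not significant in this two-participant pilot.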

RT Analysis (LMER)

# Z-scoring RTs within Subject
df <- df %>%
  group_by(Subject) %>%
  mutate(RT_z = scale(RT)[, 1]) %>%
  ungroup()

# Linear Mixed-Effects Model on RT
rt_model <- lmer(
  RT_z ~ Congruency * Alignment + (1 | Subject),
  data = df %>% filter(Correct == 1)
)

boundary (singular) fit: see help('isSingular')

summary(rt_model)

Linear mixed model fit by REML ['lmerMod']
Formula: RT_z ~ Congruency * Alignment + (1 | Subject)
   Data: df %>% filter(Correct == 1)

REML criterion at convergence: 1421

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.4109 -0.6830 -0.2596  0.3532  4.7813 

Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 0.0000   0.000   
 Residual             0.8538   0.924   
Number of obs: 527, groups:  Subject, 2

Fixed effects:
                                          Estimate Std. Error t value
(Intercept)                               -0.28312    0.07446  -3.802
Congruencyincongruent                      0.08328    0.11629   0.716
Alignmentmisaligned                        0.37282    0.10711   3.481
Congruencyincongruent:Alignmentmisaligned -0.13382    0.16264  -0.823

Correlation of Fixed Effects:
            (Intr) Cngrnc Algnmn
Cngrncyncng -0.640              
Algnmntmslg -0.695  0.445       
Cngrncync:A  0.458 -0.715 -0.659
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
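
One likely contributor to the singular fit in the RT model is the within-subject z-scoring itself: once each participant's RTs are standardized, between-subject differences in mean RT are removed, leaving little or nothing for the random intercept to capture (having only two participants is another). The toy example below uses simulated data, not the pilot data, and a base R equivalent of the z-scoring step.

```r
# Simulated RTs for two hypothetical subjects with different mean speeds.
set.seed(1)
toy <- data.frame(
  Subject = rep(c("S1", "S2"), each = 50),
  RT = c(rnorm(50, mean = 700, sd = 100),
         rnorm(50, mean = 1000, sd = 150))
)

# Base R equivalent of z-scoring RT within Subject
toy$RT_z <- ave(toy$RT, toy$Subject,
                FUN = function(x) (x - mean(x)) / sd(x))

# After z-scoring, every subject has mean 0 and SD 1 by construction
subject_means <- tapply(toy$RT_z, toy$Subject, mean)
subject_sds   <- tapply(toy$RT_z, toy$Subject, sd)
round(subject_means, 10)
round(subject_sds, 10)
```

In the pilot analysis the z-scores are computed over all trials but the model is fit to correct trials only, so the subject means are not exactly zero there, just close to it.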

Descriptive Plots

Accuracy Summary

# Accuracy Summary
acc_summary <- df %>%
  group_by(Congruency, Alignment) %>%
  summarise(acc = mean(Correct))

`summarise()` has grouped output by 'Congruency'. You can override using the
`.groups` argument.

ggplot(acc_summary, aes(Congruency, acc, fill = Alignment)) +
  geom_col(position = "dodge") +
  labs(title = "Pilot B: Accuracy by Condition",
       y = "Accuracy") +
  theme_minimal()

Reaction Time Summary

# Reaction Time Summary
rt_summary <- df %>%
  filter(Correct == 1) %>%
  group_by(Congruency, Alignment) %>%
  summarise(rt = mean(RT))

`summarise()` has grouped output by 'Congruency'. You can override using the
`.groups` argument.

ggplot(rt_summary, aes(Congruency, rt, fill = Alignment)) +
  geom_col(position = "dodge") +
  labs(title = "Pilot B: Reaction Time (Correct Trials)",
       y = "RT (ms)") +
  theme_minimal()

RT Plot (Paper-Style)

rt_summary2 <- df %>%
  filter(Correct == 1) %>%
  group_by(Alignment, Congruency) %>%
  summarise(
    mean_rt = mean(RT),
    se_rt = sd(RT) / sqrt(n()),
    .groups = "drop"
  )

ggplot(rt_summary2, aes(x = Alignment, y = mean_rt, color = Congruency)) +
  geom_point(size = 3, position = position_dodge(width = 0.4)) +
  geom_errorbar(
    aes(ymin = mean_rt - se_rt, ymax = mean_rt + se_rt),
    width = 0.1,
    position = position_dodge(width = 0.4)
  ) +
  labs(title = "Pilot B Reaction Time (Paper-Style Format)",
       y = "RT (ms)", x = "Alignment") +
  coord_cartesian(ylim = c(600, 1400)) +
  theme_minimal(base_size = 14)

d′ Analysis (Sensitivity)

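# d' with a log-linear correction: add 0.5 to each count and 1 to each
# denominator so that rates of exactly 0 or 1 do not give infinite z-scores.
compute_dprime <- function(hits, fas, n_hit, n_fa) {
  hit_rate <- (hits + 0.5) / (n_hit + 1)
  fa_rate  <- (fas + 0.5) / (n_fa + 1)
  qnorm(hit_rate) - qnorm(fa_rate)
}

# Hits: correct responses on "same" trials.
# False alarms: incorrect responses on "different" trials.
dp <- df %>%
  group_by(Subject, Congruency, Alignment) %>%
  summarise(
    hits = sum(Correct == 1 & SameDifferent == "same"),
    fas  = sum(Correct == 0 & SameDifferent == "different"),
    n_hit = sum(SameDifferent == "same"),
    n_fa  = sum(SameDifferent == "different"),
    dprime = compute_dprime(hits, fas, n_hit, n_fa),
    .groups = "drop"
  )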
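
The reason for adding 0.5 to the counts in `compute_dprime()` is easiest to see with an extreme case. In the sketch below the counts are hypothetical (not pilot data), and the function definition is repeated only so the example runs on its own.

```r
# Same definition as above, repeated so this example is self-contained.
compute_dprime <- function(hits, fas, n_hit, n_fa) {
  hit_rate <- (hits + 0.5) / (n_hit + 1)
  fa_rate  <- (fas + 0.5) / (n_fa + 1)
  qnorm(hit_rate) - qnorm(fa_rate)
}

# Hypothetical cell: 26 "same" trials, all hits; 26 "different" trials, 2 FAs.
n_same <- 26
n_diff <- 26
hits   <- 26
fas    <- 2

# Uncorrected d': a hit rate of exactly 1 maps to an infinite z-score.
uncorrected <- qnorm(hits / n_same) - qnorm(fas / n_diff)

# Corrected d': rates are pulled slightly away from 0 and 1, so it is finite.
corrected <- compute_dprime(hits, fas, n_same, n_diff)

c(uncorrected = uncorrected, corrected = corrected)
```

With a reduced number of trials per cell, perfect or near-perfect cells become more likely, so some correction of this kind is needed before averaging d’ across participants.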

d′ Plot (Paper-Style)

dp_summary <- dp %>%
  group_by(Congruency, Alignment) %>%
  summarise(
    mean_dp = mean(dprime),
    se_dp = sd(dprime) / sqrt(n()),
    .groups = "drop"
  )

ggplot(dp_summary, aes(x = Alignment, y = mean_dp, color = Congruency)) +
  geom_point(size = 3, position = position_dodge(width = 0.4)) +
  geom_errorbar(
    aes(ymin = mean_dp - se_dp, ymax = mean_dp + se_dp),
    width = 0.1,
    position = position_dodge(width = 0.4)
  ) +
  labs(title = "Pilot B d′ by Congruency × Alignment",
       y = "d′ (Sensitivity)", x = "Alignment") +
  coord_cartesian(ylim = c(0, 2.5)) +
  theme_minimal(base_size = 14)

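Pilot B confirmed that:
- The reduced trial count produced cleaner, faster sessions.
- All condition labels (Congruency, Alignment, Cue, SameDifferent) were correctly logged.
- The expected aligned-incongruent cost appeared in accuracy (congruency effect on aligned trials: b = -1.00, p < .001). RT on correct trials showed a small difference in the same direction (b = 0.08, t = 0.72), and with only two participants the Congruency × Alignment interaction on accuracy was not significant (b = 0.47, p = .117).
- No structural or timing bugs remained after adjustments from Pilot A.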

Data preparation

Data preparation following the analysis plan.

Confirmatory analysis

The analyses as specified in the analysis plan.

Side-by-side graph with original graph is ideal here

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.