Replication of Study 1 by Corps & Rabagliati (2020, Journal of Memory and Language)
Author
Xirong Hu (xirohu@stanford.edu)
Published
October 26, 2025
Introduction
Justification
The form in which predictions occur remains an open question in speech comprehension research. In Experiment 1, Corps and Rabagliati (2020) showed that predictions about semantic content, rather than word forms, facilitated the perception of distorted speech. This closely aligns with my research interest in how top-down information facilitates speech comprehension in challenging listening conditions. Another reason for choosing this study is that the context manipulation used in Experiment 1 is relatively new and was designed to better separate word-form prediction from semantic-content prediction, so the findings are worth replicating. In addition to these theoretical motivations, there are two practical considerations. First, the original study was conducted on Prolific Academic, which makes the project more feasible. Second, the procedure is relatively well documented in the paper, and the stimulus list and analysis scripts are shared on OSF, which makes it easier to follow the original procedure and analysis as closely as possible.
Stimuli and Procedures
The stimuli consist of 30 auditorily presented questions and 30 distorted answers. For the procedure, I will strictly follow the one described in the original paper (Experiment 1, Method, Procedure).
I anticipate two main challenges. The first concerns the stimuli. The authors have not made the audio stimuli publicly available, so I will need to record new auditory stimuli and apply noise-vocoding myself. However, the script for that procedure was not shared; one solution is to reach out to the original authors. Moreover, the original study was conducted in the United Kingdom, where the low-level acoustic features of speech differ, which might lead to perceptual differences after noise-vocoding. The second challenge concerns programming the experiment, since I have no experience with jsPsych.
Repo Link
(If you open this from RPubs, please right-click the hyperlink. That should work!)
Original effect size, power analysis for samples to achieve 80%, 90%, 95% power to detect that effect size. Considerations of feasibility for selecting planned sample size.
The paper’s central research question is whether high-level knowledge enhances comprehension through:
Semantic predictions, measured by the Answer Consistency effect.
Form predictions, measured by the Question Constraint effect and its interaction with Answer Consistency.
Since the paper found an Answer Consistency effect and no interaction between Question Constraint and Answer Consistency using generalized linear regression models, I will focus on the Answer Consistency effect for the power analysis and sample size planning.
Original effect size:
Answer Consistency: b = 4.29, SE = 0.42, p < .001
Original sample size: 80
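To give a sense of how large the reported effect is, the short sketch below converts the estimate and standard error quoted above into a Wald z statistic, an odds ratio, and an approximate 95% interval. It assumes that b is on the log-odds scale (as it would be for a logistic model) and uses a normal approximation; it is an illustration, not part of the original analysis.

```python
import math

b = 4.29   # Answer Consistency estimate, assumed to be in log-odds
se = 0.42  # standard error reported with the estimate

z = b / se                # Wald z statistic
odds_ratio = math.exp(b)  # the same effect expressed as an odds ratio
ci_low = b - 1.96 * se    # approximate 95% Wald interval (log-odds scale)
ci_high = b + 1.96 * se

print(f"z = {z:.2f}, odds ratio = {odds_ratio:.1f}")
print(f"95% CI (log-odds): [{ci_low:.2f}, {ci_high:.2f}]")
```

An estimate roughly ten standard errors from zero suggests that even samples smaller than the original are likely to be well powered for this effect, although the simulation is needed to confirm that.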
I used powersim to estimate power by simulation. The simulation is still running because it is computationally heavy; I will update this section once I have the results. The script can be found in the GitHub repository at ./data/scripts/power_analysis.r.
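The logic of a simulation-based power analysis can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the R script referenced above: instead of refitting a mixed-effects logistic model to each simulated data set, it tests per-participant differences in proportions with a one-sample test and a normal critical value. The function name `simulate_power`, the number of trials per condition, the intercept, and the by-participant standard deviation are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics

def simulate_power(n_participants, b, n_trials=7, intercept=-2.0,
                   subj_sd=1.0, n_sims=500, seed=1):
    """Monte Carlo power estimate for a within-participant binary effect.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    crit = 1.96  # two-sided 5% critical value (normal approximation)
    significant = 0
    for _ in range(n_sims):
        diffs = []
        for _ in range(n_participants):
            u = rng.gauss(0.0, subj_sd)  # by-participant random intercept
            p0 = 1 / (1 + math.exp(-(intercept + u)))      # baseline condition
            p1 = 1 / (1 + math.exp(-(intercept + u + b)))  # effect condition
            k0 = sum(rng.random() < p0 for _ in range(n_trials))
            k1 = sum(rng.random() < p1 for _ in range(n_trials))
            diffs.append((k1 - k0) / n_trials)
        sd = statistics.stdev(diffs)
        if sd == 0:
            # Identical differences for everyone: significant only if nonzero
            significant += int(diffs[0] != 0)
            continue
        t = statistics.mean(diffs) / (sd / math.sqrt(n_participants))
        significant += int(abs(t) > crit)
    return significant / n_sims

if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, simulate_power(n, b=4.29))
```

Because this sketch ignores by-item variability and the full random-effects structure, its output should be treated as a rough sanity check only; the planned sample size should be based on the mixed-model simulation.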
Power Analysis Results:
80% power:
90% power:
95% power:
Planned Sample
Planned sample size and/or termination rule, sampling frame, known demographics if any, preselection rules if any.
My planned sample size is
Materials
All materials - can quote directly from original article - just put the text in quotations and note that this was followed precisely. Or, quote directly and just point out exceptions to what was described in the original article.
To ensure my replication closely aligns with the original study, I reached out to the first author for materials and scripts. The author kindly shared all the stimuli. Some questions asked about the political leaders of the time, and their answers are no longer current, so I deleted those items. This leaves 14 items per list, whereas the original stimuli have 15 items per list. The items that I deleted are:
List_1a: (Question) Which female candidate recently ran for president of the United States? (Answer) Hillary Clinton
List_1b: (Question) What is the name of the British prime minister? (Answer) Theresa May
List_2a: (Question) Which female candidate recently ran for president of the United States? (Answer) The Northern Lights
List_2b: (Question) What is the name of the British prime minister? (Answer) New York
List_3a: (Question) Who did you see when you visited America? (Answer) Hillary Clinton
List_3b: (Question) Who did you see when you visited London? (Answer) Theresa May
List_4a: (Question) What did you buy from the shop? (Answer) Theresa May
List_4b: (Question) What is your least favorite method of transport? (Answer) Hillary Clinton
Procedure
Can quote directly from original article - just put the text in quotations and note that this was followed precisely. Or, quote directly and just point out exceptions to what was described in the original article.
“The experiment was administered online on Prolific Academic. Stimulus presentation was controlled using jsPsych (De Leeuw, 2015) and data was recorded using MySQL (version 5.7). Participants were warned that they would be listening to audio stimuli, and so were encouraged to complete the experiment in a quiet environment or to use headphones. Before the task, participants were instructed: “First you will hear a female speaker ask a question in a clear voice. You will then hear a male answer this question in a distorted voice. Your task is to listen carefully and type exactly what you think the male speaker said. If you do not know, then please guess”. To make stimulus onset salient, a fixation cross appeared 500 ms before question playback (see Fig. 1a). The fixation cross then turned red and answer playback began 500 ms later. After listening to the answer, participants were prompted to type their response and press a “submit answer” button.”
Because of implementation challenges, the data were recorded using an AJAX call instead of MySQL. Everything else remains the same.
The eight experiments corresponding to eight lists are listed below:
Following Corps and Rabagliati (2020), I will conduct a signal detection analysis to assess participants’ sensitivity to the words they actually heard while controlling for response bias. Participants’ responses will be manually coded for the number of words matching the semantically consistent answer. I will follow these rules, as specified by the original author:
- Correct obvious spelling mistakes.
- Do not correct morphological mismatches.
- Do not score words reported in the wrong order as matching.
- Exclude trials where participants typed the question rather than the answer.
After that, I will fit the same mixed-effects logistic regression model as the original authors.
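As a cross-check on the manual coding, the word-matching rules could be approximated in code. The sketch below is a hypothetical helper, not the original authors' procedure: it assumes spelling has already been corrected by hand, compares words exactly (so morphological mismatches such as "light" vs. "lights" do not match), and interprets the word-order rule as counting only words that appear in the same relative order as in the target, using a longest common subsequence.

```python
import string

def count_matching_words(response: str, target: str) -> int:
    """Count target words reproduced in the response in the same relative order."""
    strip = str.maketrans("", "", string.punctuation)
    resp = response.lower().translate(strip).split()
    targ = target.lower().translate(strip).split()
    # lcs[i][j] = length of the longest common subsequence of resp[:i], targ[:j]
    lcs = [[0] * (len(targ) + 1) for _ in range(len(resp) + 1)]
    for i, r in enumerate(resp, start=1):
        for j, t in enumerate(targ, start=1):
            if r == t:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    return lcs[len(resp)][len(targ)]

if __name__ == "__main__":
    print(count_matching_words("the northern lights", "The Northern Lights"))
    print(count_matching_words("northern light", "The Northern Lights"))
```

Under this reading, a fully reversed response such as "Clinton Hillary" still scores one word for the target "Hillary Clinton". Whether that agrees with the original coding scheme is an open assumption, so disagreements between this helper and the manual codes should be resolved by hand.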
Clarify key analysis of interest here. You can also pre-specify additional analyses you plan to do.
The model structure is as follows:
Explicitly describe known differences in sample, setting, procedure, and analysis plan from original study. The goal, of course, is to minimize those differences, but differences will inevitably occur. Also, note whether such differences are anticipated to make a difference based on claims in the original article or subsequent published research on the conditions for obtaining the effect.
Methods Addendum (Post Data Collection)
You can comment this section out prior to final report with data collection.
Actual Sample
Sample size, demographics, data exclusions based on rules spelled out in analysis plan
Differences from pre-data collection methods plan
Any differences from what was described as the original plan, or “none”.
Results
Data preparation
Data preparation following the analysis plan.
Confirmatory analysis
I have converted the stored data into a structure that matches the data structure in the original paper (see https://osf.io/kwa32) using a Python script. The converted file can be found here.
Once the manual coding is finished, I will use the same script provided by the author to fit the mixed-effects regression model.
The analyses as specified in the analysis plan.
Side-by-side graph with original graph is ideal here
Exploratory analyses
Any follow-up analyses desired (not required).
Discussion
Summary of Replication Attempt
Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.
Commentary
Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.