Replication of Experiment 1A by Edmiston & Lupyan (2015, Cognition)

Author

Reeka Estacio

Published

October 30, 2024

Introduction

This is a replication study of Experiment 1A of Edmiston & Lupyan (2015). The original study examines the difference in which verbal labels and environmental sounds activate categories in the mind. I believe this study has interesting insights into how human language is unique compared to other forms of informative inputs and theorizes how broad, verbal labels contributes to the efficiency in which humans process language. These findings may have potential implications in how natural language processing technologies, like large language models and speech recognition software, can be built to model more naturalistic language learning.

The stimuli presented in experiment 1A of this study involve six categories, such as bird, dog, and phone. For each category, two environmental sound cues were obtained, each corresponding to a more specific exemplar of that category. For instance, the guitar category includes the sound of an acoustic guitar strum and the sound of an electric guitar strum. Two images were obtained for each exemplar respectively for the task (e.g. two images of an acoustic guitar and two images of an electric guitar). Finally, verbal labels, spoken from both female and male speakers, were obtained for each category. The procedure is an image matching verification task. Participants are randomly presented with an auditory cue (either the verbal label or an environmental sound) and are tasked to respond “Yes” on a gaming controller if the image matches the auditory cue, or “No” if it does not. In total, participants completed 6 practice trials and 384 test trials. Verification speed was recorded.

Methods

Power Analysis

Original effect size, power analysis for samples to achieve 80%, 90%, 95% power to detect that effect size. Considerations of feasibility for selecting planned sample size.

Planned Sample

In the original study, the participants for Experiment 1A were described as the following:

“43 University of Wisconsin–Madison undergraduates participated in Experiment 1A in exchange for course credit.”

For the current study, we aim to replicate the sample as much as possible. Specifically, we will aim for a similar sample size (n=43) and similar demographics (undergraduate students at UCSD, most likely through SONA).

Materials

From the original study:

“The auditory cues comprised basic-level category labels and environmental sounds for six categories: bird, dog, drum, guitar, motorcycle, and phone. For each category, we obtained two distinct environmental sound cues, e.g., , , and two separate images for each subordinate category, e.g., two electric guitars for , two acoustic guitars for . To control for cue variability, we also used two versions of each spoken category label: one pronounced by a female speaker, one by a male speaker. All auditory cues were equated in duration (600 ms.) and normalized in volume. The images were color photographs (four images per category). The materials, obtained from online repositories, are available for download at http://sapir.psych.wisc.edu/stimuli/ MotivatedCuesExp1A-1B.zip.”

We will use the same materials for the replication study, as all materials were provided by the original authors.

Procedure

From the original study:

“On each trial participants heard a cue and saw a picture. We instructed participants to decide as quickly and accurately as possible if the picture they saw came from the same basic-level category as the word or sound they heard Participants were tested in individual rooms sitting approximately 2400 from a monitor such that images subtended 10 10°. Trials began with a 250 ms. fixation cross followed immediately by the auditory cue, delivered via headphones. The target image appeared centrally 1 s after the off- set of the auditory cue and remained visible until a response was made. Each participant completed 6 practice and 384 test trials. If the picture matched the auditory cue (50% of trials) participants were instructed to respond ‘Yes’ on a gaming controller (e.g., or ‘‘phone’’ followed by a picture of any phone). Otherwise, they were to press ‘No’ (e.g., or ‘‘phone’’ followed by a dog). All factors (cue type, congruence) var- ied randomly within subjects. Auditory feedback (buzz or bleep) was given after each trial.”

We aim to follow the original procedure of Experiment 1A as precisely as possible. However, instead of running the trials in-person, the experiment will be conducted online using jsPsych. For this reason, the task will be slightly different such that participants will respond to trials using their cursor instead of a gaming controller. We will also encourage participants to wear headphones and be in a quiet area for the auditory cues during the experiment.

Analysis Plan

From the original study:

“All participants performed very accurately on all items (M = 97%). Response times (RTs) shorter than 250 ms. or longer than 1500 ms. were removed (292 trials removed, 1.77% of total).

We fit RTs for correct responses on matching trials (‘Yes’ responses) with linear mixed regression using maximum likelihood estimation (Bates, Maechler, Bolker, & Walker, 2013), including random intercepts and random slopes for within-subject factors and random intercepts for repeated items (unique trial types) following the recommendations of Barr, Levy, Scheepers, and Tily (2013). Reported below are the parameter estimates (b) and confidence intervals for each contrast of interest. Significance tests were calculated using chi-square tests that compared nested models—models with and without the factor of interest—on improvement in log-likelihood.”

For our replication, we will also remove trials where response times are shorter than 250 ms or longer than 1500ms from the analysis. We will then run a similar linear mixed regression model, as was performed the original study. We will also use chi-square tests to assess significance of the results. We anticipate that a successful replication of the original study will yield similar effects. More specifically, we should find that verbal labels elicit the lowest overall reaction times, and that congruent sounds elicit lower reaction times than incongruent sounds.

Clarify key analysis of interest here: Linear mixed regression model, chi-square tests for significance

Differences from Original Study

This replication will differ from the original study in terms of setting and procedure. As previously mentioned, the experiments will be conducted online using jsPsych as opposed to in-person in a lab setting. Loading/internet speed of the participants’ personal devices may affect overall reaction times. Participants will also be using their cursor to respond to trials as opposed to the gaming controller used in the original study. However, we do not anticipate that these differences will greatly affect results or ability to obtain the same effect as described in the original study.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, data exclusions based on rules spelled out in analysis plan

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Data preparation

Data preparation following the analysis plan.

As previously mentioned, we will exclude trials where reaction times are less than 250 ms or exceed 1500 ms after stimulus presentation.

Confirmatory analysis

The analyses as specified in the analysis plan. This will include a linear mixed regression model and chi-square test for significance.

Side-by-side graph with original graph is ideal here

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.