Replication of Fluency and the detection of misleading questions: Low processing fluency attenuates the Moses Illusion (Experiment 1) by Song & Schwarz (2008, Social Cognition)

Authors

Katie Allen (k7allen@ucsd.edu)

Jae Kwon (jwkwon@ucsd.edu)

Ashley Monteiro (asmonteiro@ucsd.edu)

Kosta Boskovic (koboskovic@ucsd.edu)

Published

October 31, 2024

Introduction

The Moses Illusion is a classic example of the metacognitive effects of fluency. Fluency has been a strong interest of mine since undergrad, and its applications relate to my research interests in understanding how we remember and believe misinformation. Until now, I have only learned methods of inducing fluency through repetition. I chose this paper (specifically experiment 1) because it instead induces fluency (i.e., ease of processing) through visual representation (e.g., easy vs. hard to read text fonts), and it can be applied to online misinformation studies in my future research endeavors.

https://github.com/ucsd-psych201a/song2008.git

Methods

Power Analysis

Original effect size, power analysis for samples to achieve 80%, 90%, 95% power to detect that effect size. Considerations of feasibility for selecting planned sample size.

Planned Sample

Pilot: In line with the original study, the pilot sample will be 10 participants.

Main Experiment: In line with the original study, 32 Prolific users will participate for monetary compensation.

Materials

Stimuli: The stimuli set will consist of 2 sets each containing two questions written in two contrasting fonts. One question will contain a distortion while the other will not. “Which country is famous for cuckoo clocks, chocolate, banks, and pocket knives?” (A: Switzerland) is the undistorted version, while “How many animals of each kind did Moses take on the Ark?” (A: can’t say) is the distorted version.

Procedure

Pilot: We will test the readability of the different fonts as a part of the pilot for the main experiment. Within the main experiment pilot, we will have Five Prolific users will read the sentence “Switzerland is famous for cuckoo clocks, banks and pocket knives“ printed in grey Brush Script MT font with font size 12; another five will read the same sentence in black Arial font with font size 12. We will ask our pilot participants to report the ease they can read the text on a scale from 1 (very difficult) to 7 (very easy) as well as any errors they find in the experiment.

Main experiment: This experiment will be conducted online through a JSPsych survey experiment on Prolific. Participants will be randomly assigned to the easy vs hard-to-read conditions. They will either see the easy to read text font versions of the two questions or the hard to read versions of the trivia questions. They will also be informed that some of the questions they encounter might not have right answers and encouraged to answer by selecting ‘can’t say’ when this happens.

Specific instructions given to each participant: “You will a read couple of trivia questions and answer them. You can write the answer in the blank. In case you do not know the answer, please write ‘don’t know.’ You may or may not encounter ill-formed questions which do not have correct answers if taken literally. For instance, you might see the question ‘Why was President Gerald Ford forced to resign his office?’ In fact, Gerald Ford was not forced to resign. Please, write ‘can’t say’ for this type of question.“

Analysis Plan

Pilot: We will run a t-test to confirm the intended variation in ease of reading between the two fonts by comparing the ratings of the two fonts. This is important because fluency is the key manipulation.

Main Experiment: In line with the original paper, we will analyze the data using Z scores comparing the proportion of “2” and “can’t say” responses in the easy vs. hard-to-read distorted condition. We will also report the proportion of “Switzerland” and “can’t say” for the undistorted question in the easy vs hard-to-read conditions. This will allow us to see if low processing fluency helps us answer distorted questions correctly but perhaps cause us to respond more incorrectly to nondistorted questions.

The original researchers did not include information on cleaning or exclusion rules. We will exclude participants who do not complete the study.

Clarify key analysis of interest here The key analysis of interest here is the comparison of the proportions of “2” answers for the distorted question between the easy vs hard-to-read conditions (Z-scores). This will allow us to determine if low processing fluency helps us scrutinize information more and respond more correctly.

Differences from Original Study

Pilot differences: Our pilot sample will not consist of just undergraduates.

Main experiment differences: Instead of running the experiment with undergraduate participants in person, 32 Prolific users will complete this experiment online.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, data exclusions based on rules spelled out in analysis plan

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Data preparation

Data will be downloaded and imported into R, deidentified, and cleaned using Tidyverse prior to analysis.

Confirmatory analysis

We are interested in testing if font readability (fluency) alters frequency of the Moses illusion. So, using a z-score approach, we want to know if this rate of proportion differs based on processing fluency. The key analysis of interest here is the z-score comparison of the proportions of “2” answers for the distorted question between the two conditions (easy vs. hard to read fonts).

Side-by-side graph with original graph is ideal here

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.

Design Overview

One factor was manipulated (processing fluency via font readability).
One measure was taken (answer to trivia questions).
This study used a between-participants design.
Yes, their responses to the trivia questions were measured twice per participant. (They answered 2 questions, one distorted and one non distorted).
Being exposed to both font types could have revealed the purpose of the study and contaminate responses. I think it would be hard to accomplish this study within-participants without risking demand characteristics. However, I think adding more trials could help conceal the purpose if they chose to
It is unclear whether they made an effort to reduce demand characteristics, but likely not. Reading the trivia question in Brush Script could be enough to reveal the purpose because we don’t see that font used much.
One critique of this design is the limited trials. I think this study would be more interesting with the distorted Moses question embedded in a set of 20-30 trials. I also think this might be difficult to generalize to today’s world because cultural knowledge has changed so much. More distorted questions should be used mixed in with nondistorted questions in a larger trial set. A potential confound could be prior (biblical) knowledge. This could impact processing fluency and was not accounted for in the study.
This study has limited generalizability for a few reasons. One, the sample was 32 Psychology undergraduates, so we cannot generalize much beyond this population. Two, without testing a range of distorted questions, it might be hard to generalize beyond the Moses illusion itself.