Replication of Study 1 by Song & Schwartz (2008, Social Cognition)

Author

Kosta Boskovic (kboskovic@ucsd.edu)

Published

November 21, 2024

Project Paradigm Links (one for each condition):

https://ucsd-psych201a.github.io/song2008/Trivia_Q1.html (easy condition)

https://ucsd-psych201a.github.io/song2008/Trivia_Q2.html (difficult condition)

Introduction

My research interests are in cognitive development, particularly in how children acquire language and concepts. However, for this class and considering the scope of the project, I am aiming to conduct an online study with adult participants, which meant I was going to do something a little different from my primary research interests. More generally, I am interested in the relationship between language and cognition and how this interplay is seen in online language processing. I teamed up with several other first-year Psych PhDs to find an article that fit all of our interests and found this article on the Moses effect, which I find fascinating since a simple manipulation affecting the readibility of the sentence influences the result and leads people to a fallacy in processing and judgment. The stimuli and procedure for this study will be relatively simple. We will likely use Qualtrics to design the experiment. There will be an instruction slide telling participants that they will answer trivia questions and how to respond accordingly. Then, there will be one control question followed by the question of interest in which there is a distortion (asking how many of each animal Moses, instead of Noah, brought on the ark). We will make two conditions, one in which the font will be easy to read and one in which the font is harder to read. My experience with Qualtrics is limited, so getting the hang of it and implementing things like multiple conditions and randomization of participants to the two conditions could be challenging. Here is the link to the repo: https://github.com/kostaboskovic/song2008/ Here is the link to the original paper: https://github.com/kostaboskovic/song2008/blob/main/original_paper/SongSchwartz.pdf

Methods

Power Analysis

The original effect size was not reported. We computed the previous study’s achieved power to be 59%. We conducted a power analysis and determined that to achieve 80% power, we should have a sample size of 52.

Planned Sample

The sample size is of n = 52.

Materials

This study will be conducted online and there will be no physical materials necessary.

Procedure

At the start of the experiment, participants will be given instructions on how to respond to trivia questions that they will see. The instructions were as follows: “You will a read couple of trivia questions and answer them. You can write the answer in the blank. In case you do not know the answer, please write ‘don’t know.’ You may or may not encounter ill-formed questions which do not have correct answers if taken literally. For instance, you might see the question ‘Why was President Gerald Ford forced to resign his office?’ In fact, Gerald Ford was not forced to resign. Please, write ‘can’t say’ for this type of question.” The participants then answer two trivia questions, either in an easy-to-read font (Arial) or a hard-to-read font (Bush Script MT). The first question, a control question, asks: “Which country is famous for cuckoo clocks, chocolate, banks, and pocket knives?”. The second question, the distorted question of interest, asks: “How many animals of each kind did Moses take on the Ark?”.

Analysis Plan

Any participant that does not complete the study (providing answers to both trivia questions) will be excluded. For the distorted question, we will report the proportion of participants who responded “2” and the proportion of participants who responded “can’t say” in both conditions (easy-to-read and hard-to-read fonts). We will also conduct z-tests for proportions to compare these proportions between the two conditions. For the control question, we will report the proportions of participants who responded “Switzerland” correctly, the proportion who responded a different country’s name, the proportion who responded “don’t know”, and the proportion who responded “can’t say” in both conditions. For all of which are applicable (i.e., participants provided that response in either condition), we will conduct z-tests for proportions to compare these proportions between the two conditions.

Clarify key analysis of interest here You can also pre-specify additional analyses you plan to do. The z-test of proportions comparing the proportions of “2” answers for the distorted question between the two conditions is the key analysis of interest.

Differences from Original Study

The subject pool will be different. The original paper sampled undergraduate students who received course credit for their participation. Our replication will sample online users of the platform Prolific. We do not anticipate this difference to impact the results. Everything else in the experiment will be kept the same.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, data exclusions based on rules spelled out in analysis plan

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Data preparation

Data will be downloaded and imported into R, then cleaned using Tidyverse prior to analysis.

Confirmatory analysis

The z-test of proportions comparing the proportions of “2” answers for the distorted question between the two conditions is the key analysis of interest. This is because we would like to see if the rate of producing this incorrect response which indicates the Moses Illusion will differ based on the difference in processing fluency between the two fonts in the two conditions.

Side-by-side graph with original graph is ideal here

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.

Design Overview

One factor (difficulty of reading font) was manipulated. There was one measure (the answer to a trivia question) which was repeated once. Between-participants. Half get an easy to read font and half get a hard to read font. A within-subjects design could have impacted the results. Participants may have not been affected as much by low processing or high processing fluency when the two conditions were one after the other. I think that they could have reduced demand characteristics more. The hard to read font they chose is so unusual that I think participants could have figured out that that must be the manipulation, leading them to spend more time on the question and decreasing their rate of falling for the Moses illusion. I would have chosen a font that’s harder to read than Arial but not quite as drastic as Bush Script. I wonder if they were right to give an example of a distorted question that should be answered “can’t say”, rather than just telling participants to answer “can’t say” for a question where that would be applicable. This alerted them even more to that eventuality. Also, it would have been interesting to see more trials and see if the rate of falling for the illusion more or less with more trials. One significant confound is they assume that people know that the answer to the distorted question if it wasn’t distorted is 2. I don’t think Noah bringing two of each animal on the ark is THAT common knowledge to people though. The fact that they only posed two questions (one control, one distorted) limits generalizability.

Pilot A

library(readxl)
Warning: package 'readxl' was built under R version 4.4.2
pilota <- read_excel("C:\\Users\\Owner\\OneDrive\\Documents\\PilotA.xlsx")
brush_count <- sum(pilota$Distorted == 2 & pilota$Condition == "Brush")
arial_count <- sum(pilota$Distorted == 2 & pilota$Condition == "Arial")

brush_total <- sum(pilota$Condition == "Brush")
arial_total <- sum(pilota$Condition == "Arial")

p1 <- brush_count / brush_total
p2 <- arial_count / arial_total

p <- (brush_count + arial_count) / (brush_total + arial_total)

z <- (p1 - p2) / sqrt(p * (1 - p) * (1 / brush_total + 1 / arial_total))
z
[1] -0.58554
p_value <- 2 * (1 - pnorm(abs(z)))
p_value
[1] 0.5581846

Pilot B

setwd("C:\\Users\\Owner\\OneDrive\\Documents\\R\\Stat201A\\PilotB")

csv_files <- list.files(pattern = "\\.csv$", full.names = TRUE)

csv_list <- lapply(csv_files, function(file) read.csv(file, header = TRUE))

combined_data <- do.call(rbind, csv_list)
combined_data <- unique(combined_data)

write.csv(combined_data, "combined_data.csv", row.names = FALSE)

brush_total <- sum(grepl("Brush", combined_data$response))
arial_total <- sum(grepl("Arial", combined_data$response))

brush_count <- 2
arial_count <- 2

# please note that for brush_count and arial_count I could not figure out how to write code to calculate so I counted the 2 instances manually

p1 <- brush_count / brush_total
p2 <- arial_count / arial_total

p <- (brush_count + arial_count) / (brush_total + arial_total)

z <- (p1 - p2) / sqrt(p * (1 - p) * (1 / brush_total + 1 / arial_total))
z
[1] -0.9128709
p_value <- 2 * (1 - pnorm(abs(z)))
p_value
[1] 0.3613104
# the average completion time for the study on Prolific was 2:24 minutes