Replication of Study: How Quick Decisions Illuminate Moral Character

Author

Luna Bellitto (lbellitto@ucsd.edu)

Published

November 22, 2024

Introduction

This report presents our replication of the study “How Quick Decisions Illuminate Moral Character” conducted by Critcher, Inbar, and Pizarro (2013), which examined the effect of decision speed on judgments of moral character. The original study found that individuals who made quick moral decisions were evaluated more positively, whereas those who made quick immoral decisions were judged more harshly. In contrast, slower decisions, whether moral or immoral, led to more moderate evaluations of character. The authors proposed that quick decisions are seen as more certain and, therefore, more revealing of the decision maker’s underlying motives.

The original study employed two experiments. The first involved participants evaluating two individuals’ moral character, one who made a quick decision and another who took longer to decide, in both ethical and unethical contexts. The second experiment explored a similar structure but focused on more complex moral dilemmas. Both experiments consistently showed that quick decisions amplified moral evaluations, with quick moral decisions being judged more favorably and quick immoral decisions being judged more harshly.

Our replication aims to reproduce the findings of the first experiment by closely following the original methodology. Specifically, we focused on testing whether decision speed continues to amplify moral character judgments, with quick decisions leading to more polarized evaluations compared to slower ones. This replication project is part of a broader effort to assess the robustness of psychological research through reproducibility studies.

Methods

Power Analysis

Because we lacked prior data or effect size estimates suitable for a traditional power analysis, we followed a common rule of thumb for replications and multiplied the original study’s sample size (N = 119) by 2.5, which gives a target of 298 participants (119 × 2.5 = 297.5, rounded up). This approach is intended to provide adequate power to detect effects across conditions despite the missing effect size estimate.
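As a quick check of the arithmetic, the target can be computed directly. This is a minimal sketch; the variable names are illustrative and are not part of the analysis code used elsewhere in this report.

```r
# Sample size of the original study and the multiplier used for this replication
original_n <- 119
multiplier <- 2.5

# Round up so the target is never below the multiplied value
target_n <- ceiling(original_n * multiplier)
print(target_n)
```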

Planned Sample

For our replication experiment, we require a sample of at least 298 participants. In the original study, the participants were students from the University of California, Berkeley or members of the nearby community who were randomly assigned to different conditions to assess the influence of decision speed on moral evaluations. Participants for this replication study, however, will be recruited through the data collection platform Prolific and will be screened for residing in the US and being fluent in English.

Materials

All materials for this replication study come from the original study, “How Quick Decisions Illuminate Moral Character,” conducted by Critcher, Inbar, and Pizarro (2013). Our paradigm can be found here.

Scenario:

Participants will be assigned to one of two conditions: moral or immoral. Each condition features two agents (Justin and Nate), who differ in how quickly they make their decision.

Moral condition: On separate occasions, Justin and Nate are each in a grocery store’s parking lot, where each finds a cash-filled wallet. Justin quickly decides to return the wallet to customer service instead of stealing the money. Nate makes the same decision, but only after spending a long time thinking about it.

Immoral condition: On separate occasions, Justin and Nate are each in a grocery store’s parking lot, where each finds a cash-filled wallet. Justin quickly decides to take the money and drive away. Nate makes the same decision, but only after taking a long time to decide.

Questionnaire:

Following the scale below, answer the following questions based on how much you agree with them on a scale from 1 to 7.

1: Not at all
2: Not very
3: Slightly
4: Moderately
5: Quite
6: Very
7: Extremely

Quickness:

Justin made his decision quickly.
Nate made his decision quickly.

Moral character evaluation:

Justin has entirely good principles.
Nate has entirely good principles.

Justin has entirely good moral standards.
Nate has entirely good moral standards.

Deep down, Justin has the moral principles and knowledge to do the right thing.
Deep down, Nate has the moral principles and knowledge to do the right thing.

Certainty:

Justin was conflicted in his decision.
Nate was conflicted in his decision.

Justin had reservations about his decision.
Nate had reservations about his decision.

Justin was quite certain in his decision.
Nate was quite certain in his decision.

Justin was far from choosing the alternate course of action.
Nate was far from choosing the alternate course of action.

Emotional impulsivity:

Justin remained calm and emotionally contained.
Nate remained calm and emotionally contained.

Justin became upset and acted without thinking.
Nate became upset and acted without thinking.

Procedure

As in the original experiment (Critcher et al., 2013), the replication will follow this procedure: “Participants read about both Justin and Nate, two men who each independently came upon two separate cash-filled wallets in the parking lot of a local grocery store. Justin ‘was able to decide quickly’ what to do. Nate ‘was only able to decide after long and careful deliberation.’ Participants assigned to the moral condition learned both men ‘did not steal the money but instead left the wallet with customer service.’ Those in the immoral condition learned instead that both men ‘pocketed the money and drove off.’”

Following the procedures outlined by Critcher et al. (2013), after reading about Justin’s and Nate’s actions, participants complete four sets of Likert items, each rated from 1 to 7. The first concerns how quickly each actor made his decision (manipulation check). The second assesses Justin’s and Nate’s moral character with three items: “has entirely good (vs. entirely bad) moral principles,” “has good (vs. bad) moral standards,” and “deep down has the moral principles and knowledge to do the right thing.” The third comprises four items assessing how certain each actor was in making his decision (“how conflicted [each] felt when making his decision (…), how many reservations [each] had (…), was quite certain in his decision (…), how far [each] was from choosing the alternate course of action”). The fourth evaluates each actor’s perceived emotional impulsivity with two items: whether he remained “calm and emotionally contained” or became “upset and acted without thinking.”

Design Overview

Factors

This replication study manipulates two factors. Participants are randomly assigned to a decision type (moral vs. immoral). Within that condition, each participant reads about both actors: one who made the decision quickly and one who made it slowly.

Primary Measures

Four primary measures were taken from participants:

- Quickness of Decision: A manipulation check to assess participants’ perceptions of how quickly Justin and Nate made their decisions.
- Moral Character Evaluation: Participants rated each agent’s moral principles and standards, with items like “has entirely good (vs. entirely bad) moral principles.”
- Certainty: Four items assessed participants’ perceptions of each agent’s certainty in their decision (e.g., “how conflicted [each] felt when making his decision”).
- Emotional Impulsivity: Two items assessed perceptions of each agent’s emotional impulsivity, such as “calm and emotionally contained” vs. “became upset and acted without thinking.”

Mixed Design

The replication study will reflect the original study’s mixed design.

Between-Subjects Factor:
- Moral Condition (moral vs. immoral): Participants were randomly assigned to either the moral condition (where both characters return the wallet) or the immoral condition (where both characters keep the wallet). Each participant experiences only one of these conditions, making this factor between-subjects.

Within-Subjects Factor:
- Decision Speed (quick vs. deliberative): Each participant evaluated both Justin (quick decision) and Nate (deliberative decision). Since every participant observes and evaluates both types of decision speeds, this factor is within-subjects.
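One way to express this mixed design in R is to keep the data in long format (one row per participant and decision speed) and give `aov()` an `Error()` term for the repeated factor. The sketch below uses simulated ratings only. The `participant` column and all values are hypothetical, and this is not the model fitted in the Results section of this report.

```r
set.seed(1)

# Simulated long-format data: 10 participants per moral condition,
# each rating both the quick and the slow actor (values are made up).
n_per_group <- 10
design <- expand.grid(
  id_in_group = seq_len(n_per_group),
  condition   = c("immoral", "moral"),
  speed       = c("quick", "slow"),
  stringsAsFactors = TRUE
)

# Hypothetical participant identifier, unique within each condition
design$participant <- factor(paste(design$condition, design$id_in_group, sep = "_"))
design$score <- sample(1:7, nrow(design), replace = TRUE)

# condition is between-subjects; speed is within-subjects,
# so speed is nested inside the participant error stratum.
fit <- aov(score ~ condition * speed + Error(participant / speed), data = design)
summary(fit)
```

In this specification the main effect of condition is tested against between-participant variability, while speed and the condition-by-speed interaction are tested against within-participant variability.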

Impact of Switching Between- and Within-Subjects Designs

Although this replication follows the original study’s mixed design, both factors (decision type and decision speed) could instead be manipulated in a fully within-subjects or a fully between-subjects design.

A fully within-subjects design could increase statistical power and require fewer participants, as each person would experience all combinations of conditions (quick/slow and moral/immoral). This design reduces variability due to individual differences, making it easier to detect effects. However, it could also risk carryover effects, where experiencing one condition might influence responses in the next, potentially producing biased results. Additionally, participants may guess the study’s purpose and alter their responses to appear consistent or meet the experiment’s expectations.

In contrast, in a fully between-subjects design, each participant would experience only one combination of decision speed and moral condition, requiring a larger sample size to achieve reliable effects. With only one condition per participant, variability due to individual differences is higher, which reduces statistical power and sensitivity to detect effects. However, this design minimizes potential biases, such as carryover effects and demand characteristics, as participants encounter only one scenario.
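To illustrate the power trade-off described above, the sketch below compares the standard error of a quick-versus-slow mean difference under the two designs. The standard deviation, the correlation between a participant’s two ratings, and the sample size are made-up values chosen only for illustration; they are not estimates from this study.

```r
# Illustrative assumptions (not estimated from data)
sigma <- 1.5   # SD of ratings in each speed condition
rho   <- 0.5   # correlation between one participant's two ratings
n     <- 100   # participants per speed condition

# Within-subjects: each of n participants rates both actors,
# so the difference score has variance 2 * sigma^2 * (1 - rho).
se_within <- sqrt(2 * sigma^2 * (1 - rho) / n)

# Between-subjects: n different participants per speed condition (2n in total),
# so the two means are independent.
se_between <- sqrt(2 * sigma^2 / n)

print(c(within = se_within, between = se_between))
```

Under these assumptions the within-subjects comparison has the smaller standard error even though it uses half as many participants. The advantage shrinks as the correlation between the two ratings approaches zero.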

Potential Confounding Variables

Potential confounding variables in this study include but are not limited to participants’ own biases and preconceptions about the decision-making process, which can manifest as personal perceptions of quick versus slow decisions and overall moral judgments. Another important confound relates to preconceptions about men’s moral character, as both characters in the scenario are male, and participants may carry implicit gender biases that influence their evaluations. Additionally, cultural perspectives, participants’ emotional states and contexts, as well as the moral complexity of the scenarios, may further impact these judgments.

Analysis Plan

First, each participant will be randomly assigned to one moral condition (moral or immoral) and will read both decision speed scenarios. The random assignments will be recorded for group mean comparisons. Moral character evaluation, certainty, and emotional impulsivity will be rated on 1-to-7 Likert scales. Any reverse-scored items in the questionnaires will be recoded, and we will compute the mean score across items for each measure. A 2 (moral condition) × 2 (decision speed) ANOVA will be conducted to compare group means for the moral character evaluation, examining the main effects of both factors and their interaction. We will also analyze perceived certainty and emotional impulsivity as covariates, assessing how participants’ perceptions of these traits vary with decision speed.
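The recoding and averaging step can be sketched as follows. This is a minimal illustration with made-up ratings. The helper `reverse_score()` and the column names are hypothetical, and treating the “conflicted” and “reservations” items as the reverse-keyed ones is an assumption made for this example.

```r
# Hypothetical helper: reverse-score a Likert item on a scale_min..scale_max scale
reverse_score <- function(x, scale_min = 1, scale_max = 7) {
  (scale_min + scale_max) - x
}

# Made-up certainty ratings of one actor from two participants
certainty <- data.frame(
  conflicted   = c(6, 2),
  reservations = c(5, 1),
  certain      = c(2, 7),
  far_from_alt = c(3, 6)
)

# Assumption: higher "conflicted" and "reservations" ratings mean LESS certainty,
# so these two items are reversed before averaging.
certainty$conflicted   <- reverse_score(certainty$conflicted)
certainty$reservations <- reverse_score(certainty$reservations)

# Composite score: mean across the four items, one value per participant
items <- c("conflicted", "reservations", "certain", "far_from_alt")
certainty$certainty_score <- rowMeans(certainty[, items])
print(certainty)
```

On a 1-to-7 scale, reversing maps 1 to 7, 2 to 6, and so on, so that higher composite scores consistently indicate greater perceived certainty.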

Differences from Original Study

The main difference from the original study is the participant sample. Participants for this replication will be recruited through the data collection platform Prolific rather than from the local population. As such, this replication will be hosted online and will have a target sample of 298 participants. In addition, this replication covers only Experiment 1 of the original study.

Methods Addendum (Post Data Collection)

Actual Sample

For our replication experiment, we require a sample of at least 298 participants. In the original study, the 119 participants were students from the University of California, Berkeley or members of the nearby community who were randomly assigned to different conditions to assess the influence of decision speed on moral evaluations. Participants for this replication study, however, will be recruited through the data collection platform Prolific and screened only for residing in the US and being fluent in English.

Differences from pre-data collection methods plan

The methods did not deviate from the pre-data collection plan.

Results

Data preparation

Our replication of Experiment 1 will closely mirror the original study’s 2x2 factorial design, manipulating moral condition (stealing vs. not stealing) and decision speed (quick vs. slow). The dependent variable will be the moral character evaluation, assessed using a Likert scale. As in the original study, this replication will also examine perceived certainty and emotional impulsivity of the actor in the scenario as covariates. The questionnaire items for the covariates will also be assessed using a Likert scale.

#### Load necessary libraries
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

#### Define the directory containing your CSV files
directory_path <- "../data/pilotB"

#### Initialize an empty tibble to store all cleaned data
cleaned_data <- tibble()

#### Get a list of all CSV files in the directory
file_list <- list.files(path = directory_path, pattern = "\\.csv$", full.names = TRUE)

#### Loop through each file in the directory
for (file in file_list) {
  
  # Read the CSV file
  data <- read_csv(file)
  
  # Select and clean relevant columns
  temp_cleaned <- data %>%
    select(condition, scenario, starts_with("response_Q")) %>%
    drop_na() # Remove rows with NA values in selected columns
  
  # Determine who came first based on the first non-NA scenario entry
  temp_cleaned <- temp_cleaned %>%
    group_by(condition) %>%
    mutate(
      first_scenario = first(na.omit(scenario))  # Identify the first scenario for each participant
    ) %>%
    ungroup()
  
  # Convert condition to a factor for easier interpretation
  temp_cleaned <- temp_cleaned %>%
    mutate(
      condition = factor(condition, levels = c(0, 1), labels = c("immoral", "moral")),
      scenario = factor(scenario)
    ) %>%
    # Rearrange columns to place first_scenario after condition
    select(condition, first_scenario, scenario, starts_with("response_Q"))
  
  # Rename response columns to include scenario information
  temp_cleaned <- temp_cleaned %>%
    pivot_longer(cols = starts_with("response_Q"),
                 names_to = "question",
                 values_to = "response") %>%
    unite("question_scenario", scenario, question, sep = "_") %>%
    pivot_wider(names_from = question_scenario, values_from = response)
  
  # Append the cleaned data from this file to the main cleaned_data tibble
  cleaned_data <- bind_rows(cleaned_data, temp_cleaned)
}

Rows: 7 Columns: 21
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (5): stimulus, response, trial_type, plugin_version, scenario
dbl (15): rt, trial_index, time_elapsed, condition, question_order, response...
lgl  (1): participant_feedback

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 7 Columns: 21
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (5): stimulus, response, trial_type, plugin_version, scenario
dbl (15): rt, trial_index, time_elapsed, condition, question_order, response...
lgl  (1): participant_feedback

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 7 Columns: 21
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (5): stimulus, response, trial_type, plugin_version, scenario
dbl (15): rt, trial_index, time_elapsed, condition, question_order, response...
lgl  (1): participant_feedback

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 7 Columns: 21
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (5): stimulus, response, trial_type, plugin_version, scenario
dbl (15): rt, trial_index, time_elapsed, condition, question_order, response...
lgl  (1): participant_feedback

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 7 Columns: 21
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): stimulus, response, trial_type, plugin_version, scenario, particip...
dbl (15): rt, trial_index, time_elapsed, condition, question_order, response...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

#### View the final combined cleaned_data tibble
print(cleaned_data)

# A tibble: 5 × 22
  condition first_scenario Nate_response_Q1 Nate_response_Q2 Nate_response_Q3
  <fct>     <chr>                     <dbl>            <dbl>            <dbl>
1 moral     Nate                          2                6                6
2 moral     Justin                        1                3                3
3 immoral   Nate                          2                4                4
4 immoral   Justin                        6                4                2
5 immoral   Nate                          2                4                4
# ℹ 17 more variables: Nate_response_Q4 <dbl>, Nate_response_Q5 <dbl>,
#   Nate_response_Q6 <dbl>, Nate_response_Q7 <dbl>, Nate_response_Q8 <dbl>,
#   Nate_response_Q9 <dbl>, Nate_response_Q10 <dbl>, Justin_response_Q1 <dbl>,
#   Justin_response_Q2 <dbl>, Justin_response_Q3 <dbl>,
#   Justin_response_Q4 <dbl>, Justin_response_Q5 <dbl>,
#   Justin_response_Q6 <dbl>, Justin_response_Q7 <dbl>,
#   Justin_response_Q8 <dbl>, Justin_response_Q9 <dbl>, …

#### Define the output path for the final cleaned CSV file
output_path <- "../data/cleaned_data_pilotB.csv"

#### Save the final combined cleaned_data tibble as a CSV file
write_csv(cleaned_data, output_path)
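To make the reshaping step in the loop above easier to follow, here is a minimal sketch on a toy two-row file (one row per actor). The values are made up and only two of the ten questions are included; the column names mirror those in the cleaned data.

```r
library(tidyr)
library(tibble)

# Toy data for a single participant: one row per scenario (actor)
toy <- tibble(
  condition   = c(1, 1),
  scenario    = c("Justin", "Nate"),
  response_Q1 = c(6, 2),
  response_Q2 = c(5, 4)
)

# Long format -> combine actor and question names -> one wide row per participant
wide <- toy |>
  pivot_longer(cols = starts_with("response_Q"),
               names_to = "question",
               values_to = "response") |>
  unite("question_scenario", scenario, question, sep = "_") |>
  pivot_wider(names_from = question_scenario, values_from = response)

print(wide)
```

The result has a single row with one column per actor and question (for example `Justin_response_Q1` and `Nate_response_Q2`), which is the layout the confirmatory analysis reads back in.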

Confirmatory analysis

Since the objective of this replication study is to closely mirror the original methodology, the statistical analysis will replicate the 2x2 ANOVA test used in the original study. This approach allows for direct assessment of the main effects of decision speed and moral condition on moral character evaluations, as well as their interaction, within a consistent analytical framework.

The 2x2 ANOVA is particularly appropriate here because it provides a straightforward method for comparing group means across these two factors, making it ideal for testing the original hypotheses about how decision speed and moral outcome influence moral judgments. Additionally, using the same statistical test enhances the comparability of our findings with those of the original study, ensuring any differences in results can be attributed more confidently to sample or context rather than methodological inconsistencies.

While other methods, such as a linear mixed-effects model, could offer flexibility, the 2x2 ANOVA preserves the simplicity and interpretability central to the original study’s analysis. By using this approach, we stay true to the original design and ensure consistency in our replication study.

library(tidyverse)
library(dplyr)
library(car)

Loading required package: carData


Attaching package: 'car'

The following object is masked from 'package:dplyr':

    recode

The following object is masked from 'package:purrr':

    some

library(emmeans)

Warning: package 'emmeans' was built under R version 4.4.2

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

# Load the dataset
dat <- read.csv("../data/cleaned_data_pilotB.csv")  # Read the dataset into a data frame

# Count how many participants had missing data
participants_with_missing <- sum(rowSums(is.na(dat)) > 0)  # Count rows with at least one NA value

# Drop rows with missing values
cleaned_data <- dat %>%
  filter(complete.cases(.))  # Retain only rows without missing values

# Calculate average scores for Justin and Nate's responses (complete cases only)
avg_score <- cleaned_data %>% 
  mutate(
    # Calculate the average score for Justin across selected questions
    j_avg_score = rowMeans(select(., Justin_response_Q2, Justin_response_Q3, Justin_response_Q4)),
    # Calculate the average score for Nate across selected questions
    n_avg_score = rowMeans(select(., Nate_response_Q2, Nate_response_Q3, Nate_response_Q4))
  )

# Group data by condition and calculate the mean scores for Justin and Nate
grouped_data <- avg_score %>%
  group_by(condition) %>%  # Group data by the 'condition' column
  summarize(
    mean_justin = mean(j_avg_score, na.rm = TRUE),  # Mean of Justin's average scores
    mean_nate = mean(n_avg_score, na.rm = TRUE)     # Mean of Nate's average scores
  )

# Prepare data for ANOVA analysis
anova_data <- avg_score %>%
  pivot_longer(
    cols = c(j_avg_score, n_avg_score),  # Pivot Justin's and Nate's average scores into long format
    names_to = "justin_fast_nate_slow", # New column to indicate score type (Justin or Nate)
    values_to = "score"                 # New column for the score values
  )

# Perform a two-way ANOVA
anova_result <- aov(score ~ condition * justin_fast_nate_slow, data = anova_data)
summary(anova_result)  # Display ANOVA results

                                Df Sum Sq Mean Sq F value Pr(>F)  
condition                        1  8.313   8.313   7.587 0.0331 *
justin_fast_nate_slow            1  0.100   0.100   0.091 0.7728  
condition:justin_fast_nate_slow  1  3.113   3.113   2.841 0.1429  
Residuals                        6  6.574   1.096                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Conduct post hoc pairwise comparisons using estimated marginal means
emmeans(anova_result, pairwise ~ condition * justin_fast_nate_slow)

$emmeans
 condition justin_fast_nate_slow emmean    SE df lower.CL upper.CL
 immoral   j_avg_score             3.00 0.604  6     1.52     4.48
 moral     j_avg_score             6.00 0.740  6     4.19     7.81
 immoral   n_avg_score             4.11 0.604  6     2.63     5.59
 moral     n_avg_score             4.83 0.740  6     3.02     6.64

Confidence level used: 0.95 

$contrasts
 contrast                                  estimate    SE df t.ratio p.value
 immoral j_avg_score - moral j_avg_score     -3.000 0.956  6  -3.140  0.0729
 immoral j_avg_score - immoral n_avg_score   -1.111 0.855  6  -1.300  0.5949
 immoral j_avg_score - moral n_avg_score     -1.833 0.956  6  -1.919  0.3130
 moral j_avg_score - immoral n_avg_score      1.889 0.956  6   1.977  0.2927
 moral j_avg_score - moral n_avg_score        1.167 1.050  6   1.115  0.6948
 immoral n_avg_score - moral n_avg_score     -0.722 0.956  6  -0.756  0.8712

P value adjustment: tukey method for comparing a family of 4 estimates

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.