## Introduction

Justification for Experiment Choice: I chose to replicate Young and Saxe’s study, “When Ignorance Is No Excuse: Different Roles for Intent Across Moral Domains,” as it closely aligns with my research interests in criminology and the psychological factors underpinning moral judgment. The study’s investigation into the varied impacts of intent on moral assessments across domains—such as harm versus purity—relates directly to understanding societal responses to legal and moral transgressions. In particular, this work provides a framework for examining how intent influences perceptions of responsibility and moral wrongness, which is relevant to my research on the social and psychological dynamics of criminal behavior.

Young and Saxe’s experiment involved participants evaluating moral scenarios where intent and outcome varied, providing a robust model for analyzing how judgments are shaped by the interaction of these factors. This experimental setup offers a unique lens to examine how individuals assess responsibility in morally ambiguous situations, where a perceived lack of intent might mitigate or aggravate judgments. Replicating this study will deepen insights into the cognitive processes involved in moral judgment, helping to address complex issues related to public perceptions of justice and accountability, especially in the context of accidental or ambiguous actions that often arise in real-world criminal cases.

Original Paper Link: https://github.com/ucsd-psych201a/young2011/tree/main/original_paper

## Methods

### Power Analysis

This study employs Partial eta-squared (η²p) to quantify the effect size in our Analysis of Variance (ANOVA). Partial η²p represents the proportion of variance explained by a given factor when excluding other factors’ effects. We will compute this measure to assess the practical significance of our findings.

Planned Sample

To ensure methodological rigor, we will aim to recruit approximately 262 participants, aligning with the sample size utilized in the original study. Consistent with the established protocol, we will implement stringent exclusion criteria to maintain data quality. Specifically, we will exclude participants who report prior exposure to similar experimental paradigms and remove duplicate responses as identified through redundant Prolific IDs.

Materials

Each participant made a moral judgment for a single scenario. Participants were assigned randomly to one of six conditions in a 2 (intentional versus accidental) × 3 (harm versus incest versus ingestion) between-subjects experimental design. Here are the materials to present to participants.

Harm

Incest

Ingestion

The experimental stimuli comprised moral scenarios spanning three domains: harm, incest, and ingestion. Each domain was represented by two distinct scenarios: harm (allergy and poison), incest (parent and sibling), and ingestion (dog meat consumption and urine ingestion). Participants evaluated the moral wrongness of actions depicted in these scenarios using a 7-point Likert scale, where 1 represented “not at all morally wrong” and 7 indicated “very morally wrong.” To examine the role of intent, we manipulated scenario type between-subjects, with each participant randomly assigned to evaluate either intentional or accidental versions of the actions.

Analysis Plan

We will replicate Experiment 1A, excluding any repeat participants before analysis. Participants will rate moral wrongness on a 7-point scale. Our analysis will have two steps:

First, we’ll check if participants judged the two scenarios within each domain (harm, incest, and ingestion) similarly using separate 2x2 ANOVAs. Then, if the scenarios within each domain don’t differ significantly, we’ll run our main analysis: three 2x2 ANOVAs comparing how intent affects judgments across different domain pairs:

Harm vs. incest Harm vs. ingestion Incest vs. ingestion

For any significant interactions, we’ll use t-tests to examine specific patterns, following the original paper’s approach.

Differences from Original Study

The only difference may arise from the different participant samples. Additionally, we collect data from two platforms (Prolific vs. MTurk), and participants on these platforms may have different characteristics.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, data exclusions based on rules spelled out in analysis plan

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Error: subject Df Sum Sq Mean Sq F value Pr(>F)
intent 1 45.00 45.00 31.15 0.0114 * Residuals 3 4.33 1.44
— Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subject:domain Df Sum Sq Mean Sq F value Pr(>F) domain 1 0.833 0.8333 0.459 0.547 intent:domain 1 0.556 0.5556 0.306 0.619 Residuals 3 5.444 1.8148

Error: Within Df Sum Sq Mean Sq F value Pr(>F) Residuals 20 19.33 0.9667

Data preparation

Data preparation following the analysis plan.

### Data Preparation

#### Load Relevant Libraries and Functions
#library(tidyr)
#library(dplyr)
#library(ggplot2)


#### Import data
#raw_data = read.csv("/data/data.csv")

#### Data exclusion / filtering

# exclude subs that took similar survey before
# filetered_data = raw_data %>% 
#  filter(is.na(TakeSurveyBefore))

# exclude duplicated prolfic ids
# cleaned_data = filetered_data %>% 
#  distinct(prolific_id, .keep_all = TRUE)

#### Prepare data for analysis - create columns etc.

# create columns for all variables in a dataframe from cleaned_data
# convert wide data into long data for analysis and visualization

Confirmatory analysis

The present study aims to replicate Experiment 1A from the original investigation. Prior to statistical analyses, we will implement participant screening procedures to eliminate duplicate responses. The dependent measure consists of participants’ moral judgments assessed on a 7-point Likert scale. Our analytical approach encompasses multiple stages. Initially, we will conduct three separate 2 (story) × 2 (intent) analyses of variance (ANOVAs) - one for each violation type (harm, incest, and ingestion) - to establish equivalence between the paired scenarios within each domain. Contingent upon replicating the original findings of no significant main effect of story and no story-by-intent interaction, we will proceed with the primary analyses: three 2 (intent) × 2 (domain) ANOVAs. These analyses will systematically examine the role of intent across domain pairs: (1) harm versus incest, (2) harm versus ingestion, and (3) incest versus ingestion. For cases yielding significant interaction effects, we will employ independent-samples t-tests to delineate specific patterns of differences, adhering to the analytical framework established in the original study. Based on the original findings, we hypothesize significant main effects for both intent and domain across all three ANOVAs. Moreover, we anticipate significant intent-by-domain interactions specifically in the comparisons involving harm (i.e., harm versus incest and harm versus ingestion), but not in the incest versus ingestion comparison. This pattern would support our central hypothesis that intent plays a more substantial role in moral judgments of harm compared to purity violations. Specifically, we predict that accidental harms will be judged less severely than accidental purity violations, despite comparable severity ratings for intentional violations across domains. Side-by-side graph with original graph is ideal here

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.