Experimental paradigm:
https://ucsd-psych201a.github.io/young2011/
Introduction
Moral judgments are often shaped by the perceived intentions behind actions. Prior research, including a study by Young and Saxe (2011) (https://doi.org/10.1016/j.cognition.2011.04.005), suggests that the importance of intent varies across moral domains. For instance, judgments of harmful actions depend heavily on the actor’s intentions, while purity violations, such as incest or consuming taboo substances, are often viewed as wrong regardless of intent. This difference aligns with theories of moral foundations, which associate purity violations with strong disgust reactions and harm violations with greater attention to the actor’s mental state.
The key question in the original study was whether intent plays a greater role in moral judgments of harm compared to purity. The results revealed a significant interaction between moral domain and intent: harm judgments were highly sensitive to intent, while purity violations (involving incest and ingestion) were less so. For example, accidental harm was judged as less wrong than accidental purity violations, whereas intentional harm was seen as more wrong than intentional purity violations.
In this replication study, we revisited the role of intent in moral judgments across three domains: harm, incest, and ingestion. We closely followed the original study’s design, using second-person scenarios and a 7-point moral wrongness scale. Our goal was to test the robustness of the original findings in a new sample and recruitment platform while exploring whether cultural or methodological differences might influence the results.
Design Overview
We used a 2 (intent: intentional vs. accidental) × 3 (domain: harm, incest, ingestion) between-subjects design. Participants were randomly assigned to read one scenario that varied by both intent and domain. After reading the scenario, they rated the moral wrongness of the action on a 7-point scale, ranging from “not at all morally wrong” to “very morally wrong.” Our study closely followed the original design, with the primary difference being the use of Prolific for participant recruitment instead of Amazon MTurk.
To preserve the integrity of the study, we maintained the original between-subjects design. A within-subjects design could have led participants to guess the study’s purpose, potentially influencing their responses. The experiment was conducted double-blind. Although factors such as cultural background, religious beliefs, education level, and gender could influence moral judgments, these variables were not explicitly manipulated in our study.
Methods
Power Analysis
Based on a conventional large effect size (Cohen’s d ≈ 0.8) and aiming for 80% power at α = .05, a sample size of approximately 351 participants was deemed sufficient. Feasibility considerations led us to target a similar sample size. Our final analyzed sample included 343 participants, close to this target.
Planned Sample
We recruited English-speaking adults living in the United States through Prolific. We included participants aged 18-100, aiming for a diverse demographic. Given our power analysis, we planned to collect around 351 participants; the final sample included 343 usable responses. Participants who indicated that they had completed a similar task before were excluded to prevent familiarity biases.
Materials
We used the same moral judgment scenarios as in the original study, adapted to the second-person perspective. Scenarios fell into three domains:
- Harm (e.g., poisoning or unknowingly causing an allergic reaction)
- Incest (e.g., a sexual relationship between siblings)
- Ingestion (e.g., consuming taboo substances, like dog meat or urine)
Each scenario had intentional and accidental versions. For instance, in an accidental harm scenario, the participant (“you”) inadvertently causes harm due to ignorance of a critical detail (e.g., a peanut allergy).
Procedure
Participants were randomly assigned to one of the six conditions (2 intent levels × 3 domains) and presented with a single scenario. After reading it, they provided a moral wrongness rating on a scale from 1 (“not at all morally wrong”) to 7 (“very morally wrong”). The survey was administered online, and participants were compensated for their time.
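The assignment step described above can be sketched in a few lines. This is a minimal illustration in Python, assuming simple (unrestricted) randomization in which each participant independently draws one of the six cells; the survey platform's actual randomizer is not described in the text, and the names `assign_conditions`, `CONDITIONS`, and the seed are hypothetical.

```python
import random
from collections import Counter
from itertools import product

INTENTS = ("intentional", "accidental")
DOMAINS = ("harm", "incest", "ingestion")
# The six cells of the 2 (intent) x 3 (domain) between-subjects design.
CONDITIONS = list(product(INTENTS, DOMAINS))


def assign_conditions(participant_ids, seed=None):
    """Assign each participant to one randomly chosen cell.

    Assumption: simple randomization, so cell sizes are not forced to be
    equal and will vary by chance.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(CONDITIONS) for pid in participant_ids}


# Hypothetical participant IDs 0..342 (343 matches the reported sample size).
assignment = assign_conditions(range(343), seed=1)
cell_sizes = Counter(assignment.values())
for condition, size in sorted(cell_sizes.items()):
    print(condition, size)
```

Because each participant sees exactly one scenario, every rating belongs to a single cell, which is what makes the later comparisons between-subjects.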
Analysis Plan
We planned to conduct a series of ANOVAs to examine the intent × domain interaction. First, we would test whether different story exemplars within the same domain differed significantly. If no differences emerged, we would collapse across exemplars. Then, we would conduct 2 (intent) × 2 (domain) comparisons to replicate the original analyses, focusing on the role of intent in harm vs. purity (incest, ingestion) judgments.
We expected to replicate the original findings: intent would have a stronger effect on harm judgments compared to purity judgments. Specifically, accidental harm should be judged less harshly than accidental purity violations, while intentional harm should be judged more harshly than intentional purity violations.
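To make the planned 2 (intent) × 2 (domain) comparison concrete, the sketch below computes the sums of squares and F ratios for a balanced between-subjects design. It is written in Python for illustration and is not the analysis code used for this report; it assumes equal cell sizes (the real data need not be balanced), and the function name `two_by_two_anova` and the toy ratings are hypothetical.

```python
from statistics import mean


def two_by_two_anova(cells):
    """Balanced 2 x 2 between-subjects ANOVA.

    `cells` maps (domain, intent) -> list of ratings. Every cell must hold
    the same number of ratings (an assumption of this sketch).
    Returns {term: (sum_sq, df, F)}; F is None for the residual row.
    """
    domains = sorted({d for d, _ in cells})
    intents = sorted({i for _, i in cells})
    if len(domains) != 2 or len(intents) != 2:
        raise ValueError("expected exactly 2 domains and 2 intents")
    n = len(cells[(domains[0], intents[0])])
    if n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("sketch requires equal cell sizes of at least 2")

    cell_mean = {k: mean(v) for k, v in cells.items()}
    grand = mean(cell_mean.values())  # valid because cells are equal-sized
    dom_mean = {d: mean(cell_mean[(d, i)] for i in intents) for d in domains}
    int_mean = {i: mean(cell_mean[(d, i)] for d in domains) for i in intents}

    # Each marginal mean is based on 2 * n observations.
    ss_domain = 2 * n * sum((dom_mean[d] - grand) ** 2 for d in domains)
    ss_intent = 2 * n * sum((int_mean[i] - grand) ** 2 for i in intents)
    ss_inter = n * sum(
        (cell_mean[(d, i)] - dom_mean[d] - int_mean[i] + grand) ** 2
        for d in domains
        for i in intents
    )
    ss_resid = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    df_resid = 4 * (n - 1)
    ms_resid = ss_resid / df_resid

    return {
        "domain": (ss_domain, 1, ss_domain / ms_resid),
        "intent": (ss_intent, 1, ss_intent / ms_resid),
        "domain:intent": (ss_inter, 1, ss_inter / ms_resid),
        "residuals": (ss_resid, df_resid, None),
    }


# Hypothetical toy ratings, NOT data from the study.
toy = {
    ("harm", "accidental"): [1, 2, 1, 2],
    ("harm", "intentional"): [6, 7, 6, 7],
    ("incest", "accidental"): [4, 5, 4, 5],
    ("incest", "intentional"): [5, 6, 5, 6],
}
result = two_by_two_anova(toy)
for term, (ss, df, f) in result.items():
    print(term, round(ss, 3), df, None if f is None else round(f, 3))
```

The `domain:intent` row is the term of interest: a large F there means the gap between intentional and accidental ratings differs between the two domains, which is the pattern predicted for harm versus purity.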
Differences from Original Study
The original study recruited participants via Amazon Mechanical Turk, while we used Prolific. Although both platforms host English-speaking U.S. participants, minor demographic differences might exist. We adhered to similar exclusion criteria and closely matched the original methods otherwise. We believe these minor methodological shifts are unlikely to significantly alter the core patterns of results.
Methods Addendum (Post Data Collection)
Actual Sample
We collected data from 343 participants, all English-speaking adults residing in the United States. The sample size was slightly below the targeted 351 but still provided sufficient power.
Differences from pre-data collection methods plan
No major deviations from our preregistered plan occurred.
Results
Data preparation
Data were prepared following the analysis plan.
AOV comparing harm vs. incest
                  Df Sum Sq Mean Sq F value   Pr(>F)
domain             1   10.2    10.2   2.371 0.125055
intention          1  394.1   394.1  91.238  < 2e-16 ***
domain:intention   1   54.6    54.6  12.648 0.000462 ***
Residuals        216  933.1     4.3
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
AOV comparing harm vs. ingestion
                  Df Sum Sq Mean Sq F value   Pr(>F)
domain             1    4.5    4.50   1.105    0.294
intention          1  230.5  230.48  56.638 8.92e-13 ***
domain:intention   1  169.0  169.04  41.539 5.70e-10 ***
Residuals        256 1041.7    4.07
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
AOV comparing incest vs. ingestion
                  Df Sum Sq Mean Sq F value   Pr(>F)
domain             1   26.8   26.81   5.983 0.015225 *
intention          1   59.5   59.46  13.269 0.000336 ***
domain:intention   1   17.7   17.71   3.953 0.048022 *
Residuals        222  994.8    4.48
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
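As a reading aid, each F value in the three tables above is the effect's mean square divided by the residual mean square, and a partial eta squared can be derived from the same sums of squares. The Python sketch below recomputes both from the printed values; the effect sizes are not reported in the original output and are derived here only as an illustration. Because the printed sums of squares are rounded, the recomputed F values match the tables only approximately. The names `TABLES` and `effect_summary` are hypothetical.

```python
# Sum Sq for each term, plus (Sum Sq, Df) for residuals, copied from the
# three ANOVA tables. Every effect has 1 degree of freedom.
TABLES = {
    "harm vs. incest": {
        "domain": 10.2,
        "intention": 394.1,
        "domain:intention": 54.6,
        "residuals": (933.1, 216),
    },
    "harm vs. ingestion": {
        "domain": 4.50,
        "intention": 230.48,
        "domain:intention": 169.04,
        "residuals": (1041.7, 256),
    },
    "incest vs. ingestion": {
        "domain": 26.81,
        "intention": 59.46,
        "domain:intention": 17.71,
        "residuals": (994.8, 222),
    },
}


def effect_summary(ss_effect, ss_resid, df_resid, df_effect=1):
    """Return (F, partial eta squared) for one effect."""
    ms_resid = ss_resid / df_resid
    f_value = (ss_effect / df_effect) / ms_resid
    partial_eta_sq = ss_effect / (ss_effect + ss_resid)
    return f_value, partial_eta_sq


summaries = {}
for comparison, table in TABLES.items():
    ss_resid, df_resid = table["residuals"]
    for term in ("domain", "intention", "domain:intention"):
        summaries[(comparison, term)] = effect_summary(
            table[term], ss_resid, df_resid
        )

for (comparison, term), (f_value, eta) in summaries.items():
    print(f"{comparison:22s} {term:18s} F = {f_value:7.3f}  partial eta^2 = {eta:.3f}")
```

Comparing the `domain:intention` rows across the three comparisons gives a rough sense of how much larger the interaction is when harm is involved than when the two purity domains are compared with each other.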
Mean Ratings and Standard Errors:
# A tibble: 6 × 4
  domain    intention   mean_rating se_rating
  <fct>     <fct>             <dbl>     <dbl>
1 harm      accidental        0.481     0.124
2 harm      intentional       4.74      0.220
3 incest    accidental        2         0.413
4 incest    intentional       3.88      0.273
5 ingestion accidental        3.78      0.265
6 ingestion intentional       4.42      0.237
Confirmatory analysis
The original study by Young and Saxe (2011) found that:
There were no significant differences between stories within each domain. Each of the three 2 (story) × 2 (intent) ANOVAs (for harm, incest, and ingestion) revealed main effects of intent but not story, and no story × intent interaction. This allowed the original authors to collapse across stories within each domain in subsequent analyses.
When comparing harm to incest and harm to ingestion using 2 (intent) × 2 (domain) ANOVAs, the original results showed a significant intent × domain interaction. This indicated that intent mattered more for harm judgments than for purity (incest, ingestion). Specifically, accidental purity violations were judged more morally wrong than accidental harm, and intentional harm was judged more morally wrong than intentional purity violations, aligning with the predicted difference in how intent influences moral judgments of harm versus purity.
However, when comparing the two purity domains (incest vs. ingestion), the original study found no intent × domain interaction. Both forms of purity violations were judged similarly in terms of intent-sensitivity, suggesting that purity as a category was less intent-dependent than harm.
In contrast, our replication found:
Similar to the original, we found main effects of intent in all comparisons, indicating that intentional violations are judged more harshly than accidental ones in general.
In comparisons of harm versus incest and harm versus ingestion, we also replicated the pattern of significant domain × intent interactions. This suggests that, as originally reported, harm judgments remain more intent-sensitive than purity judgments.
Critically, our replication diverged from the original findings when comparing incest versus ingestion. While the original study reported no domain × intent interaction for the two purity domains, our results indicated a significant interaction (F(1, 222) = 3.95, p = 0.048). In other words, unlike the original study, we found that the two purity domains (incest and ingestion) were not equally insensitive to intent; instead, we observed subtle differences in how intent influenced judgments within these purity scenarios.
In terms of mean ratings, our results also showed a somewhat different pattern from the original. The original study highlighted that accidental purity violations (incest and ingestion) were judged more harshly than accidental harm, while intentional harm was judged more harshly than purity violations. Our mean ratings suggest a similar pattern for harm vs. incest and harm vs. ingestion (with accidental purity > accidental harm and intentional harm > intentional purity). However, the difference between the two purity domains themselves was more pronounced in our data: accidental incest (M = 2.00) and accidental ingestion (M = 3.78) differed notably, indicating that not all purity violations are judged alike when they are accidental.
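The pattern in the cell means can be made explicit by computing, for each domain, the intentional minus accidental difference, and then comparing those differences across domains. The Python sketch below does this from the reported means and standard errors. It assumes the two cells within a domain are independent groups (true for a between-subjects design), so the standard error of a difference is the square root of the sum of squared standard errors. The names `MEANS`, `intent_effect`, and `interaction_contrast` are hypothetical, and the contrasts use unweighted cell means, so they illustrate the interaction rather than reproduce the ANOVA.

```python
from math import sqrt

# (mean_rating, se_rating) copied from the "Mean Ratings and Standard Errors" table.
MEANS = {
    ("harm", "accidental"): (0.481, 0.124),
    ("harm", "intentional"): (4.74, 0.220),
    ("incest", "accidental"): (2.0, 0.413),
    ("incest", "intentional"): (3.88, 0.273),
    ("ingestion", "accidental"): (3.78, 0.265),
    ("ingestion", "intentional"): (4.42, 0.237),
}


def intent_effect(domain):
    """Intentional minus accidental mean for one domain, with its SE.

    The SE formula assumes the two cells are independent samples.
    """
    m_int, se_int = MEANS[(domain, "intentional")]
    m_acc, se_acc = MEANS[(domain, "accidental")]
    return m_int - m_acc, sqrt(se_int ** 2 + se_acc ** 2)


effects = {d: intent_effect(d) for d in ("harm", "incest", "ingestion")}


def interaction_contrast(domain_a, domain_b):
    """Difference between the intent effects of two domains."""
    return effects[domain_a][0] - effects[domain_b][0]


for domain, (diff, se) in effects.items():
    print(f"{domain:10s} intent effect = {diff:.3f} (SE = {se:.3f})")
print("incest - ingestion contrast:", round(interaction_contrast("incest", "ingestion"), 3))
```

With the reported means, the intent effect is about 4.26 for harm, 1.88 for incest, and 0.64 for ingestion, so the gap between intentional and accidental ratings is largest for harm and also differs between the two purity domains.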
Discussion
Summary of Replication Attempt
Our replication partially diverged from the original findings. We replicated the main effect of intent and the intent × domain interactions for harm versus each purity domain, so harm judgments were again more sensitive to intent than purity judgments. However, we did not confirm that the two purity domains are equally insensitive to intent: the intent × domain interaction was also significant for incest versus ingestion, suggesting that differences in how intent affects judgments extend to comparisons within the purity domain as well.
Contribution
Conceptualization: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang. Data curation: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang. Formal analysis: Coxi Jiang and Seyi Lawal. Funding acquisition: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang. Investigation: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang. Methodology: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang. Software: Coxi Jiang and Seyi Lawal. Visualization: Coxi Jiang and Seyi Lawal. Writing - original draft: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang. Writing - review & editing: Coxi Jiang, Belynda Herrera, Seyi Lawal, and Cassie Wang.