Replication of Study affective congruence & donation by Genevsky et al. (2025, Social Cognitive and Affective Neuroscience)

Author

Cynthia Wu

Published

December 8, 2025

Introduction

I chose to replicate Experiment 1a from Genevsky et al. (2025) because it directly connects with my research interests in affective decision-making and prosocial behavior, particularly how emotional cues and affective congruence shape value-based choices. Genevsky and colleagues proposed that the congruence between emotional features in charitable appeals—such as facial expressions and message framing—can enhance donation behavior by eliciting positive aroused affect, which in turn engages reward-related neural systems like the nucleus accumbens (NAcc). This framework integrates affective science and decision neuroscience, offering a mechanistic account of how emotion and valuation processes jointly drive prosocial giving. Replicating Experiment 1a, the foundational behavioral study in their series, provides an opportunity to test the robustness and generalizability of the affective congruence effect across participant samples. It also offers a strong behavioral foundation for my broader research program, which examines how affective and neural signals forecast real-world decisions, such as consumer or charitable choices. By reproducing this experiment, I aim to evaluate whether affective congruence consistently amplifies giving through emotional engagement, and to better understand the affective mechanisms that support prosocial decision-making.

Methods

Power Analysis

Original effect size and power considerations were estimated from the results reported for Experiment 1a. Because the original data are not publicly available, we used the reported paired mean difference ($M_{congruent} - M_{incongruent} = 0.425$), sample size ($n = 56$), and t-values to estimate Cohen’s d, which ranged from approximately 0.47 to 0.52. Using this estimated effect size, we conducted a power analysis to determine the sample sizes required to achieve $80\%$, $90\%$, and $95\%$ power for detecting the effect. The resulting sample size requirements are illustrated in Figure 1. For conservative planning, at least 110 participants would be needed to achieve $80\%$ power at the smaller effect size reported for the online sample in Study 1b ($d = 0.27$), which falls below the range estimated for Experiment 1a. The original experiment, with $56$ participants, would achieve $80\%$ power only if the effect size exceeded roughly $d = 0.37$.
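The two planning figures quoted above (about 110 participants at $d = 0.27$, and a minimum detectable effect of roughly $d = 0.37$ at $n = 56$) can be cross-checked with base R's `power.t.test`. This is a minimal sketch rather than the code behind Figure 1; the variable names `n_needed` and `d_min` are illustrative. It assumes a two-sided paired test at $\alpha = 0.05$, and with `sd = 1` the `delta` argument plays the role of the standardized paired effect size.

```r
# Cross-check of the quoted power figures with stats::power.t.test (base R).
# Assumption: two-sided paired t-test, alpha = 0.05; sd = 1 so delta = Cohen's d_z.

# Participants needed for 80% power at d = 0.27
n_needed <- ceiling(power.t.test(delta = 0.27, sd = 1, power = 0.80,
                                 sig.level = 0.05, type = "paired",
                                 alternative = "two.sided")$n)

# Smallest effect detectable with 80% power at the original n = 56
d_min <- power.t.test(n = 56, sd = 1, power = 0.80,
                      sig.level = 0.05, type = "paired",
                      alternative = "two.sided")$delta

n_needed
round(d_min, 2)
```

Because `power.t.test` solves for whichever argument is left unspecified, the same call pattern answers both "how many participants?" and "how large an effect?".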

library(pwr)
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

dz_values <- seq(0.47, 0.52, by = 0.01)

n_80 <- sapply(dz_values, function(d) ceiling(pwr.t.test(d=d, power=0.8, type="paired", alternative="two.sided")$n))
n_90 <- sapply(dz_values, function(d) ceiling(pwr.t.test(d=d, power=0.9, type="paired", alternative="two.sided")$n))
n_95 <- sapply(dz_values, function(d) ceiling(pwr.t.test(d=d, power=0.95, type="paired", alternative="two.sided")$n))

df <- data.frame(
  dz = dz_values,
  n_80 = n_80,
  n_90 = n_90,
  n_95 = n_95
)

df_long <- pivot_longer(df, cols = c(n_80, n_90, n_95),
                        names_to = "Power", values_to = "SampleSize")

p1 <- ggplot(df_long, aes(x=dz, y=SampleSize, color=Power, linetype=Power)) +
  geom_line(size=1.2) +
  geom_hline(yintercept=56, color="black", linetype="dashed", size=1) +
  annotate("text", x=0.47, y=57, label="original n=56", color="black", hjust=0, vjust = -1.2) +
  scale_color_manual(values=c("n_80"="#F6A6B2", 
                              "n_90"="#FFE066",  
                              "n_95"="#6CA0DC"), 
                     labels=c("80% power","90% power","95% power")) +
  scale_linetype_manual(values=c("solid","dashed","dotted"),
                        labels=c("80% power","90% power","95% power")) +
  labs(title="Required Sample Size vs Cohen's d",
       x="Cohen's d",
       y="Required Sample Size",
       color="Power Level",
       linetype="Power Level") +
  theme_minimal(base_size = 14)

Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.

ggsave(filename = "fig1.png", plot = p1, width = 9, height = 5)

Fig 1. Power analysis with estimated effect size given statistics from original study

Planned Sample

The planned sample size was determined from a power analysis using the estimated effect size from Experiment 1a ($d \approx 0.47$ to $0.52$). To achieve $90\%$ power at the lower bound of this range ($d = 0.47$), at least 50 participants are required; this target may be revised depending on funding constraints. Participants will be recruited from Prolific and will be adults aged 18–60 years who are fluent in English. No additional preselection criteria will be applied based on prior experience or other characteristics, and no demographic information will be collected or used for inclusion or exclusion.

Materials

The experiment will use the original image and message stimuli from Genevsky et al. (2025) and will be conducted through Pavlovia (https://run.pavlovia.org/cwu0701/genevsky2025_psych251). The image set consists of photographs depicting individuals in need who display either happy or sad facial expressions, representing the manipulation of image valence. Each image is paired with a brief aid message that is either positively or negatively framed, creating a 2 × 2 within-subjects factorial design that varies affective congruence between the two affective features. Stimuli will be presented on a computer screen using a program that allows participants to indicate their donation decision with a slider bar ranging from $0.00 to $10.00. Additional materials include task instructions, comprehension checks, demographic questions, and post-task self-reports of emotional responses.

Procedure

Participants will receive a $6 endowment at the beginning of the session and complete 48 trials organized in a counterbalanced block design. On each trial, participants will view one image–message pair and decide how much of their endowment to donate using the slider. Each decision will be recorded as the dollar amount donated, along with response time and stimulus identifiers. Trial order within blocks will be randomized to minimize order and carryover effects. At the end of the session, one trial will be randomly selected to determine the participant’s real payout: if a donation was made on that trial, the chosen amount will be deducted from their endowment and donated to the corresponding cause; otherwise, the participant will retain the full $6. Finally, participants will complete brief demographic and manipulation-check questionnaires assessing perceived emotional tone of the stimuli.
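The payout rule described above can be sketched in a few lines of R. This is an illustration only, not the task code: `compute_payout` and its arguments are hypothetical names, and the sketch assumes that a donation on the selected trial cannot exceed the endowment.

```r
# Hypothetical helper illustrating the payout rule; not the actual task code.
compute_payout <- function(donations, endowment = 6,
                           selected = sample.int(length(donations), 1)) {
  # Assumption: donations are non-negative and cannot exceed the endowment
  stopifnot(all(donations >= 0), all(donations <= endowment))
  donated <- donations[selected]  # amount given on the randomly drawn trial
  list(selected_trial = selected,
       donated = donated,
       kept = endowment - donated)  # full endowment is kept if donated == 0
}

# Example: three trials, with trial 2 drawn for payment
result <- compute_payout(c(0, 2.5, 1), selected = 2)
result$kept
```

Passing `selected` explicitly makes the example reproducible; leaving it out draws one trial at random, as in the procedure.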

Analysis Plan

The analysis strategy closely follows the original study. Data cleaning, exclusion rules, and modeling choices follow the original as faithfully as possible. Trials with missing responses will be removed according to the original criteria. The main analysis focuses on the interactive effect of affective congruence on donation amount, using hierarchical mixed-effects regression models with subject-level random intercepts. Fixed effects include image valence, message valence, and their interaction. While the primary statistical inference is conducted using hierarchical mixed-effects regression, group-level averages for congruent versus incongruent trials were visualized in the original study. For descriptive purposes, an independent-samples t-test will be conducted comparing congruent and incongruent affective trials.
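Written out, the planned model corresponds to the specification below. The notation is introduced here for illustration and is not taken from the original paper: $i$ indexes trials, $j$ indexes participants, and $\text{Img}$ and $\text{Msg}$ are indicator variables for image and message valence.

$$
\text{donation}_{ij} = \beta_0 + \beta_1\,\text{Img}_{ij} + \beta_2\,\text{Msg}_{ij} + \beta_3\,(\text{Img}_{ij} \times \text{Msg}_{ij}) + u_j + \varepsilon_{ij}, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
$$

Here $u_j$ is the participant-level random intercept, and the interaction coefficient $\beta_3$ carries the test of affective congruence.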

Differences from Original Study

While the current study closely follows the original experiment, several differences are worth noting. First, the sample was recruited from Prolific, whereas the original study recruited university students in a laboratory setting. Second, the study was conducted online rather than in person, which may affect participants’ engagement and reduces control over the testing environment. Third, the response window for each choice was increased from 1 second to 2 seconds to account for potentially lower attention and slower responses among online participants. Fourth, the endowment provided to participants was smaller than in the original study ($6 in the current study versus $10 in the original) due to budget constraints. These differences are not expected to substantially affect the primary effect of affective congruence on donation amount, as prior research indicates that the effect is robust across different adult populations and experimental settings. The increased response window is anticipated to facilitate task performance without altering the underlying effect.

Methods Addendum (Post Data Collection)

Due to the way the online platform handled participant assignment, we unexpectedly collected data from one additional participant, resulting in a final sample of 51 subjects included in the analysis.

Actual Sample

A total of 51 participants took part in the current replication study. After applying the filtering criteria, all included subjects were between 18 and 55 years old. Participants who did not complete all 48 trials were excluded from the analysis.

Differences from pre-data collection methods plan

None.

Results

Data preparation

Data preparation following the analysis plan.
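The preparation steps implied by the analysis plan can be sketched as follows. This is a hedged illustration on made-up toy data, not the actual cleaning script: the column names (`participant`, `img_aff`, `mes_aff`, `donation`, `congruency`) and the level codes (`"p"`, `"n"`, `"Congruent"`, `"Incongruent"`) follow the analysis code in this report, while `prepare_data` and `df_toy` are hypothetical names. It also assumes that "completing all trials" means one recorded row per trial.

```r
# Toy stand-in for the raw trial-level data; all values are invented.
raw <- data.frame(
  participant = rep(c("s1", "s2", "s3"), times = c(4, 4, 2)),
  img_aff     = c("p", "p", "n", "n",  "p", "p", "n", "n",  "p", "n"),
  mes_aff     = c("p", "n", "p", "n",  "p", "n", "p", "n",  "p", "n"),
  donation    = c(3, 2, NA, 4,  1, 0, 2, 1,  5, 5)
)

prepare_data <- function(raw, n_trials) {
  # Keep only participants with a recorded row for every trial
  # (assumption about what "completed all trials" means)
  trials_seen  <- table(raw$participant)
  complete_ids <- names(trials_seen)[trials_seen == n_trials]
  out <- raw[raw$participant %in% complete_ids, ]

  # Drop trials with no donation response
  out <- out[!is.na(out$donation), ]

  # Congruent = image valence and message valence match
  out$congruency <- ifelse(out$img_aff == out$mes_aff,
                           "Congruent", "Incongruent")
  out
}

df_toy <- prepare_data(raw, n_trials = 4)  # the study itself used 48 trials
table(df_toy$congruency)
```

In this toy example participant `s3` is dropped for having too few trials, and one trial from `s1` is dropped for a missing response.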

Confirmatory analysis

library(ggplot2)
library(ggsignif)
library(lmerTest)  # provides lmer() with Satterthwaite t-tests, matching the output shown
summary(lmer(donation ~ img_aff * mes_aff + (1|participant), df_filtered))

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: donation ~ img_aff * mes_aff + (1 | participant)
   Data: df_filtered

REML criterion at convergence: 5529.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-7.2005 -0.1228 -0.0237  0.1179  6.9966 

Random effects:
 Groups      Name        Variance Std.Dev.
 participant (Intercept) 2.3678   1.5388  
 Residual                0.6649   0.8154  
Number of obs: 2166, groups:  participant, 51

Fixed effects:
                    Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)          4.25708    0.21834   52.02594  19.498   <2e-16 ***
img_affp             0.09587    0.04974 2112.11152   1.927   0.0541 .  
mes_affp            -0.01017    0.04994 2112.08863  -0.204   0.8387    
img_affp:mes_affp    0.04159    0.07018 2112.09118   0.593   0.5535    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) img_ff ms_ffp
img_affp    -0.114              
mes_affp    -0.114  0.499       
img_ffp:ms_  0.081 -0.709 -0.712

# pull() extracts plain numeric vectors, which is what t.test() expects
t_res <- t.test(df_filtered %>% filter(congruency == "Congruent") %>% pull(donation),
                df_filtered %>% filter(congruency != "Congruent") %>% pull(donation))

df_summary <- df_filtered %>%
  group_by(congruency) %>%
  summarise(
    mean_donation = mean(donation, na.rm = TRUE),
    se_donation = sd(donation, na.rm = TRUE)/sqrt(n())
  )

p2 <- ggplot(df_summary, aes(x=congruency, y=mean_donation, fill=congruency)) +
  geom_bar(stat="identity", width=0.7, color="black") +
  geom_errorbar(aes(ymin=mean_donation - se_donation,
                    ymax=mean_donation + se_donation), 
                width=0.2, size=0.8) +
  scale_fill_manual(values=c("Congruent"="#F6A6B2", "Incongruent"="gray80")) +
  geom_signif(
    comparisons = list(c("Congruent", "Incongruent")),
    annotations = ifelse(t_res$p.value < 0.001, "***",
                         ifelse(t_res$p.value < 0.01, "**",
                                ifelse(t_res$p.value < 0.05, "*", "ns"))),
    y_position = max(df_summary$mean_donation + df_summary$se_donation) + 0.5,
    size = 1.5) +
  labs(title="Average Donation by Affective Congruency",
       x="Congruency",
       y="Mean Donation") +
  theme_classic(base_size = 22) +
  theme(
    legend.position = "none",
    plot.title = element_text(size = 28, face = "bold"),
    axis.title = element_text(size = 24),
    axis.text = element_text(size = 20)
  )

ggsave(filename = "fig2.png", plot = p2, width = 6, height = 5)

df_summary <- df_filtered %>%
  group_by(img_aff, mes_aff) %>%
  summarise(
    mean_donation = mean(donation, na.rm = TRUE),
    se_donation = sd(donation, na.rm = TRUE)/sqrt(n())
  ) %>%
  ungroup()

`summarise()` has grouped output by 'img_aff'. You can override using the
`.groups` argument.

p3 <- ggplot(df_summary, aes(x = mes_aff, y = mean_donation, fill = img_aff)) +
  geom_bar(stat = "identity", color = "black", width = 0.6, position = position_dodge(width = 0.7)) +
  geom_errorbar(aes(ymin = mean_donation - se_donation,
                    ymax = mean_donation + se_donation),
                width = 0.2, size = 0.8, position = position_dodge(width = 0.7)) +
  scale_fill_manual(values = c("p" = "#F6A6B2", "n" = "#6CA0DC")) +
  labs(title = "Average Donation by Image and Message Valence",
       x = "Message Valence",
       y = "Mean Donation",
       fill = "Image Valence") +
  theme_minimal(base_size = 14)

ggsave(filename = "fig3.png", plot = p3, width = 8, height = 5)

Fig 2. Average donation amount of each condition in the replication study

Fig 3. Average donation amount of affective congruency group in the original study (left) and replication study (right)

Exploratory analyses

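As a follow-up, the t statistic for the image-by-message interaction was converted to an approximate standardized effect size (Cohen's d) using $d \approx 2t/\sqrt{df}$.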

model <- summary(lmer(donation ~ img_aff * mes_aff + (1|participant), df_filtered))

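# Row 4 of the coefficient table is the img_aff:mes_aff interaction
t_val  <- model$coefficients[4, "t value"]
df_int <- model$coefficients[4, "df"]  # renamed so the data frame `df` defined earlier is not overwritten

d_val <- 2 * t_val / sqrt(df_int)
d_val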

[1] 0.0257906
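Plugging the interaction row of the model output ($t = 0.593$, $df = 2112.09$) into the conversion used in the code gives the same value by hand. Note that this conversion is a common approximation for between-subjects t statistics, so applying it to a mixed-model coefficient should be read as a rough indication of magnitude only.

$$
d \approx \frac{2t}{\sqrt{df}} = \frac{2 \times 0.593}{\sqrt{2112.09}} \approx \frac{1.186}{45.96} \approx 0.026
$$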

Discussion

Summary of Replication Attempt

Image valence showed a marginal but non-significant association with donation amount ($b = 0.096$, $p = .054$). Message valence showed no relationship with donation amount ($b = -0.010$, $p = .839$), and the interaction between image and message valence, which reflects affective congruency, was also not significantly associated with donation behavior ($b = 0.042$, $p = .554$).

Overall, the replication did not reproduce the key findings of the original study. Whereas the original research reported a positive effect of image valence on donations and a significant interaction indicating affective congruency, neither effect emerged in the current replication.

Commentary

Several considerations emerged during follow-up analyses that offer possible explanations for the failure to replicate. First, differences in subject pools may have meaningfully influenced the outcome. The original work reported a stronger effect in the lab-based sample ($d = 0.47$ in Study 1a) than in the online sample ($d = 0.27$ in Study 1b). Whether recruitment method moderated the original effect remains uncertain. The effect in the current online sample ($d \approx 0.03$ in the exploratory analysis) is smaller than either estimate, so recruitment method could have played a non-trivial role, though it may not account for the full difference.

Fig 4. Power analysis with estimated effect size given statistics from original study

Second, the smaller endowment in the present study ($6 rather than $10) may also have contributed to the weaker effects. Lower or less salient incentives tend to reduce participant engagement, which in turn can shrink the observed effects on choice. Although it is unclear whether this factor fully accounts for the replication divergence, it remains a plausible moderator worth further investigation.

Third, debriefing responses from pilot participants suggested that individuals value different attributes of the information presented in the donation appeal, particularly the content of the message itself. This heterogeneity in what participants notice or find persuasive may have diluted the aggregate effect observed in the original study.