Replication of Study 2 in ‘Effects of donation collection methods on donation amount: Nudging donation for the cause and overhead’ by Suk & Mudita (2023, Psychology & Marketing)

Author

Emma Gu (Emma.Gu@rady.ucsd.edu)

Published

December 10, 2024

Introduction

I have chosen to replicate Study 2 in the paper “Effects of donation collection methods on donation amount: Nudging donation for the cause and overhead” because it aligns closely with my research interests in charitable giving and prosocial behavior, particularly in addressing the challenge of overhead aversion. Overhead aversion refers to the tendency of donors to be reluctant to contribute to a charity’s operational costs, often preferring their donations to go directly to the cause. This aversion can negatively impact charities by limiting their ability to cover essential costs such as staffing and infrastructure, which are critical to achieving long-term goals. The paper I’ve chosen examines how different donation collection methods, such as varying the sequence in which donors are asked to contribute to the cause and overhead, can mitigate overhead aversion and increase the total donation amount without reducing donor satisfaction. The paper’s findings are intriguing and I’m curious to see whether the results can be replicated. Its findings have strong implications for helping nonprofit organizations overcome overhead aversion and increase contributions sustainably, which aligns with my research focus on effective charitable giving strategies.

To replicate the experiment, the main stimuli will include a charitable donation scenario, where participants are asked to make donation decisions using different collection methods. Participants will be asked to imagine they have $5 to donate, and then go through different donation processes. The key manipulation involves varying how the donation options are presented: participants will first decide how much to give to the cause, followed by how much to contribute to overhead (cause first), or vice versa (overhead first).

One challenge will be designing the donation collection screens in a way that mimics real-world charity websites, making the stimuli both realistic and easy for participants to understand. Another potential difficulty is ensuring that participants take the scenario seriously, especially in an online setting where hypothetical donations might not fully capture real-world behaviors. Additionally, recruiting a sufficient and diverse sample of participants with varying levels of donation experience could be challenging, as previous donation experience has been shown to influence outcomes.

Methods

Power Analysis

To determine the required sample size for adequate statistical power in our study, we conducted an a priori power analysis using G*Power software. Our goal was to detect an effect size (Cohen’s d) of 0.4126397, which was calculated based on the results of total donation amount in the original study. This effect size reflects a moderate difference between two independent groups.

We set the power at 80% (0.8), meaning there is an 80% probability of correctly rejecting the null hypothesis if a true effect exists. Additionally, we used a standard alpha level of 0.05 to control the Type I error rate. The analysis indicated that, to achieve 80% power, we would need a total sample size of 188 participants, with 94 participants per group.
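
The G*Power result can be cross-checked in base R. The sketch below uses `power.t.test()` from the `stats` package and assumes a two-sided, independent-samples t-test with equal group sizes; the test family is our assumption, since the text above specifies only the effect size, alpha, and power. Variable names are illustrative.

```r
# Cross-check of the a priori power analysis.
# Assumption: two-sided, independent-samples t-test with equal group sizes.
d_original <- 0.4126397   # effect size from the original study (total donation)

pa <- power.t.test(
  delta = d_original,     # with sd = 1, delta is expressed in units of Cohen's d
  sd = 1,
  sig.level = 0.05,
  power = 0.80,
  type = "two.sample",
  alternative = "two.sided"
)

n_per_group <- ceiling(pa$n)   # round up to whole participants
n_total <- 2 * n_per_group
print(c(per_group = n_per_group, total = n_total))
```

Rounding the per-group estimate up to a whole participant should agree with the 94 per group (188 total) reported above.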

Planned Sample

The target sample size for this study is approximately 188 participants, with around 94 individuals assigned to each condition. Participants have to be at least 18 years old, and there will be no other restrictions or controls placed on age, gender, or other demographic characteristics. Recruitment will continue until the target sample size is reached.

Materials

  • Donation Input Form: Participants will be asked to indicate how much they would like to donate if they win a prize. Participants will have the option to write down “$0” if they wish to keep all prize money without making a donation.

  • Donation Satisfaction Scale: After making their donation decision, participants will rate their satisfaction on a Likert scale (1 = not at all, 7 = very much).

  • Donation Information: All participants will receive an overview of how charitable organizations use donations. They will be informed that donations may be allocated to (1) helping the cause and (2) covering the organization’s operational costs, such as overhead. Additionally, participants will learn that some charities accept separate donations for the cause and the overhead. As in the original article, “This information will be presented in both conditions to ensure that the participants will expect that they could donate separately for the cause and the overhead.”

Procedure

  • Participant Recruitment: Participants will be recruited from the Prolific Academic online panel. They will be informed that a randomly selected 20% of participants will receive an extra $5.

  • Donation Decision: Participants will be informed that they can choose to donate some or all of the prize money, if won, to charity. The donation will be completely voluntary, and participants will be asked to write down $0 if they want to keep all of the bonus prize money without donating.

  • Information on Charitable Giving: All participants will receive an overview of how charitable organizations collect and spend donation money. They will be informed that donations can be used for (1) helping the cause and (2) covering the organization’s operations, or overhead costs. Participants will also be told that some organizations accept separate donations for cause and overhead expenses.

  • Donation Campaign Overview: Participants will be presented with information on a donation campaign to support children born with physical disabilities in an Asian country. The campaign will be described as being run by a trustworthy charity, but no specific charity name will be mentioned. This will be done to avoid any influence of participants’ attitudes toward existing charities.

  • Condition Assignment: Participants will be randomly assigned to one of two conditions:

    • Cause-First Condition: Participants will first indicate the amount they want to donate to the cause. On the next page, they will indicate the amount for overhead. “On this page, the amount donated for the cause will be shown in the box, and the participants will be allowed to change the amount if they want to.” The total will then be displayed on the screen.

    • Overhead-First Condition: In this condition, participants will specify their donation amount for overhead first, followed by the cause. The procedure will be identical to the cause-first condition except for the order.

  • Total Display and Reminder: The total donation amount will be calculated automatically and displayed on-screen. Participants will be reminded that they can donate any amount up to $5 and that they will keep any remaining funds as prize money.

  • Prize Allocation and Feedback: After completing the study, “20% of the participants will be randomly selected as prize winners.” The total donation from winners will be donated to a charity supporting children overseas, and each winner will receive their non-donated prize amount.
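
The two random draws in this procedure (condition assignment and prize-winner selection) can be sketched in base R. This is a hypothetical illustration, not the study's actual jsPsych implementation: the seed, the balanced 94/94 assignment, the rounding of the winner count, and the placeholder donation amounts are all assumptions made for the example.

```r
# Hypothetical sketch of the two random draws in the procedure.
set.seed(2024)                      # arbitrary seed, for reproducibility only
n <- 188                            # planned sample size

participants <- data.frame(
  id = seq_len(n),
  # Assumed balanced random assignment: 94 per condition, in shuffled order
  condition = sample(rep(c("causefirst", "overfirst"), length.out = n)),
  # Placeholder donations in dollars; real values come from the donation task
  total = round(runif(n, min = 0, max = 5), 2)
)

# Randomly select 20% of participants as prize winners
n_winners <- round(0.20 * n)
participants$winner <- participants$id %in% sample(participants$id, n_winners)

# Winners keep whatever part of the $5 prize they did not donate
prize <- 5
participants$payout <- ifelse(participants$winner, prize - participants$total, 0)

# Amount passed on to the charity on behalf of the winners
charity_total <- sum(participants$total[participants$winner])
```

Simple (unbalanced) randomization would instead draw each participant's condition independently, which produces unequal group sizes like those in the final sample.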

Analysis Plan

  • Total Donation Amount Analysis:

    • A regression analysis will be conducted to examine the effect of donation collection method (cause-first vs. overhead-first) on the total donation amount. The primary hypothesis is that donation amounts will differ based on whether the overhead or the cause donation is collected first. As in the original study, we expect participants in the overhead-first condition to donate more overall than those in the cause-first condition.

    • Gender and age will be included as covariates in the model to test for their potential influence on donation amounts, with expectations that these demographic variables will not significantly impact the outcome, based on the original results.

  • Donation for Overhead and Cause Analysis:

    • Separate regression analyses will be conducted to examine the amounts donated to overhead and to the cause. The main predictor in each model will be the donation collection method (cause-first vs. overhead-first), with the hypothesis that participants in the overhead-first condition will donate more to overhead than those in the cause-first condition. In addition, the original study reported an unexpected but noteworthy finding, a higher cause donation in the overhead-first condition, which will also be tested for replication.

    • Gender and age will again be included as covariates to control for any potential demographic effects on donation distribution.

  • Donation Satisfaction Analysis:

    • Satisfaction with donation decisions will be tested using a series of regression analyses in three steps:

      • Step 1: A simple regression with donation collection method as the independent variable, to test if satisfaction levels differ between the cause-first and overhead-first conditions.

      • Step 2: Adding demographic covariates (age and gender) to the model, allowing us to assess any effects of these variables on satisfaction levels, with an expected significant influence of age as observed in the original study.

      • Step 3: Adding donation amounts for the cause and overhead as covariates to assess whether the donation amounts influence satisfaction levels. Based on the original study, we expect higher donation amounts to be associated with greater satisfaction. At this step, we will also re-evaluate the effect of donation collection method to see if it becomes significant after controlling for donation amounts.
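
The three-step satisfaction analysis amounts to fitting nested linear models and comparing them. The sketch below illustrates the idea on simulated placeholder data; the data frame `sim`, its values, and the seed are hypothetical, and only the column names mirror the planned dataset.

```r
# Simulated placeholder data; only the column names mirror the planned dataset.
set.seed(1)
n <- 188
sim <- data.frame(
  condition = factor(sample(rep(c("causefirst", "overfirst"), length.out = n))),
  age = sample(18:70, n, replace = TRUE),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  cause = round(runif(n, 0, 3), 2),
  overhead = round(runif(n, 0, 2), 2)
)
sim$satisfaction <- sample(1:7, n, replace = TRUE)  # 7-point rating

# Step 1: condition only
step1 <- lm(satisfaction ~ condition, data = sim)
# Step 2: add demographic covariates
step2 <- update(step1, . ~ . + age + gender)
# Step 3: add donation amounts for the cause and overhead
step3 <- update(step2, . ~ . + cause + overhead)

# Condition coefficient at each step
sapply(list(step1, step2, step3), function(m) coef(m)["conditionoverfirst"])

# F-tests comparing each model with the previous, smaller one
anova(step1, step2, step3)
```

Tracking the condition coefficient across the three fits shows whether the effect of collection method changes once demographics and donation amounts are controlled for.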

Differences from Original Study

  • Sample and Setting:

    The sample will be recruited through an online panel, following the same method as the original study. However, the unrelated survey administered prior to the donation study will differ from the original because the original materials were not available from the authors. Additionally, the sample will be recruited from the USA rather than the UK. We will exclude participants who do not pass both comprehension checks from the analysis, whereas the original study does not mention comprehension checks.

  • Instructions:

    The original study’s instructions will be used verbatim where possible. Otherwise, a closely aligned set of instructions will be developed that preserves the intent and detail of the original materials. Due to our limited funding, we will reduce the individual bonus amount to $5 from £6.

Methods Addendum (Post Data Collection)

  • Actual Sample

    We recruited 188 participants on Prolific. However, only 136 participants (49% female, mean age = 38) passed both comprehension checks. As a result, our final sample was smaller than that of the original study, which had 143 participants. Of the 136 participants, 65 were in the overhead-first condition and 71 were in the cause-first condition.

  • Differences from pre-data collection methods plan

    Due to time and budget constraints, we proceeded with the analysis using only the 136 responses, rather than recruiting more participants to reach the full 188.
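
The consequence of analyzing 136 rather than 188 responses can be gauged with a rough power calculation for the achieved group sizes. The sketch below assumes a two-sided, two-sample t-test and treats the original effect size as the true effect; both are assumptions, and the variable names are illustrative.

```r
# Power of a two-sided, two-sample t-test with unequal group sizes, computed
# from the noncentral t distribution.
# Assumption: the original study's d is the true effect size.
d  <- 0.4126397
n1 <- 71   # cause-first
n2 <- 65   # overhead-first
alpha <- 0.05

df     <- n1 + n2 - 2
ncp    <- d * sqrt(n1 * n2 / (n1 + n2))   # noncentrality parameter
t_crit <- qt(1 - alpha / 2, df)

# Probability that |t| exceeds the critical value under the alternative
achieved_power <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE) +
  pt(-t_crit, df, ncp = ncp)
round(achieved_power, 2)
```

Under these assumptions the achieved power falls below the planned 80%, which is worth keeping in mind when interpreting non-significant results.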

Design Overview

The original study utilized a between-participants design, with one manipulated factor and three distinct measures. Measures were not repeated, and each participant was exposed to only one experimental condition. A between-participants design was likely chosen to avoid any potential awkwardness or perceived redundancy that could arise from donating to the same charity multiple times, which might also reveal the experimental purpose to participants. This decision aligns well with the study’s objective, as a within-participants design may have increased the risk of participants discerning the treatment and thus responding differently.

To minimize demand characteristics, an unrelated survey was administered prior to the donation task, which helped to distract participants and potentially reduced any preconceived expectations about the study’s purpose. This approach likely maintained the integrity of responses by focusing participants’ attention elsewhere before the key donation ask.

One critique of the study’s design is that including an unrelated survey could introduce unintended influences on participant behavior, potentially acting as a confound. For instance, the content or tone of the survey might subtly impact their mindset when making donation decisions. Additionally, the order in which each type of donation (i.e., cause vs. overhead costs) was explained could also act as a confounding factor, possibly affecting participants’ understanding or perception of each donation type (e.g., through recency effect).

A few factors limit the generalizability of the study’s findings. Notably, only 20% of participants received actual payment, which may have influenced how participants perceived and engaged with the donation task, knowing they were not guaranteed real funds to donate. Moreover, donations were made using money provided by the experimenters rather than participants’ own money. This setup may not accurately capture genuine donation behavior, as participants may behave differently when donating from personal funds. These factors suggest that the results might be specific to the experimental context and may not fully generalize to real-world donation scenarios.

Results

Data preparation

Data will be collected through jsPsych. Various packages will be needed for reading and analyzing .csv data, as well as conducting regression analyses and t-tests. Based on participants’ comprehension check responses, we will filter results to include only those who passed. The dataset will include the following columns: Participant ID, Condition, Demographics (Age, Gender, Nationality), Amount Donated to Overhead, Amount Donated to Cause, Total Donation Amount, Donation Satisfaction, and Donation Status (indicating if the participant was among the 20% selected to win and/or donate additional compensation).

## Necessary Packages
library(jsonlite)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter()  masks stats::filter()
✖ purrr::flatten() masks jsonlite::flatten()
✖ dplyr::lag()     masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)

# ## Data Processing Function
# # Larger function to concatenate individual processed files
# process_csv_files <- function(file_paths, output_file) {
# 
#   # Nested function to process a single file
#   process_file <- function(file_path) {
# 
#     # Reads file
#     testdata <- read_csv(file_path)
# 
#     # Shortens file to specified columns
#     tdcols <- subset(testdata, select = -c(rt, stimulus, trial_type, plugin_version, question_order))
# 
#     # Shortens file to specified rows
#     tdrows <- tdcols[-c(1:13, 16:17), , drop = FALSE]
# 
#     # Rearranges elements of cleaned data into one cohesive row
#     tdarrange <- data.frame(
#       check1 = tdrows$response[1],
#       check2 = tdrows$response[2],
#       cause = tdrows$causeDonation[3],
#       overhead = tdrows$expenseDonation[3],
#       total = tdrows$totalDonation[3],
#       condition = tdrows$questionOrder[3],
#       satis_usual = tdrows$response[4],
#       age = tdrows$age[5],
#       nationality = tdrows$nationality[5],
#       gender = tdrows$response[6],
#       comment = tdrows$response[7]
#     )
# 
#     # Parses the satisfaction responses, unnests JSON formatting
#     tdparsed <- tdarrange %>%
#       mutate(satis_usual = lapply(satis_usual, function(x) {
#         tryCatch(fromJSON(x), error = function(e) NA)
#         }))
# 
#     # Unnests the satisfaction/frequency column into two separate columns
#     tdsep <- tdparsed %>%
#       unnest_wider(col = satis_usual, names_sep = "_")
# 
#     # Renames columns
#     tdclean <- tdsep %>%
#       rename(satisfaction = satis_usual_Q0,
#              frequency = satis_usual_Q1)
# 
#     # Returns cleaned data
#     return(tdclean)
#   }
# 
#   # Apply nested function to each file
#   processed_files <- lapply(file_paths, process_file)
# 
#   # Combines files, adds identifying number
#   all_data <- bind_rows(processed_files, .id = "file_id")
# 
#   # Saves the combined data
#   write_csv(all_data, output_file)
# 
#   # Returns combined data
#   return(all_data)
# }
# 
# # Vector of filepaths for data pipe outputs to be strung together
# file_paths <- c(
#                 "~/Downloads/wke8v3rwk2.csv",
#                 "~/Downloads/vk7098a6qc.csv",
#                 "~/Downloads/c5gpk5juov.csv",
#                 "~/Downloads/6hf9baq07x.csv",
#                 "~/Downloads/33lelrenhh.csv"
#                 )
# 
# # Defines filepath for .csv file
# output_file <- "/Users/willdemelo/Desktop/donation_testdata/testpilotdata.csv"
# 
# # Runs function and produces .csv file
# result <- process_csv_files(file_paths, output_file)
# 
# ## Cleans JSON formatting
# # Function removes all of the special characters (and, as used here, spaces) from the data
# remove_special_characters <- function(entry) {
#   gsub("[^a-zA-Z0-9\\s]", "", entry)
# }

# # Applies function to all relevant columns
# cleanresult <- result %>%
#   mutate(check1 = sapply(check1, remove_special_characters)) %>%
#   mutate(check2 = sapply(check2, remove_special_characters)) %>%
#   mutate(condition = sapply(condition, remove_special_characters)) %>%
#   mutate(gender = sapply(gender, remove_special_characters)) %>%
#   mutate(comment = sapply(comment, remove_special_characters))
# 
# # Edits data for legibility
# cleanresult <- cleanresult %>%
#   mutate(
#     condition = recode(
#       condition,
#       ForthecauseForcoveringcharitableorganizationsoperatingexpense = 'causefirst',
#       ForcoveringcharitableorganizationsoperatingexpenseForthecause = 'overfirst'
#     ),
#     check1 = gsub("chanceresponse", "", check1),
#     check2 = gsub("donationuseresponse", "", check2),
#     gender = gsub("gender", "", gender),
#     comment = gsub("Q0", "", comment)
#   ) %>% 
#   # Leaves only participants who answered correctly
#   filter(check1 == 20,
#          check2 == "Alloftheabove")
# 
# # Produces .csv file
# write_csv(cleanresult, "./cleanresult.csv")

## Read .csv file (start here if data cleaning is already finished)
cleanresult <- read.csv("filteredsampledata.csv")

Descriptive Statistics

cleanresult %>%
  group_by(condition) %>% 
  summarise(
    n = n(),
    mean_total = mean(total, na.rm = TRUE),
    sd_total = sd(total, na.rm = TRUE),
    mean_cause = mean(cause, na.rm = TRUE),
    sd_cause = sd(cause, na.rm = TRUE),
    mean_overhead = mean(overhead, na.rm = TRUE),
    sd_overhead = sd(overhead, na.rm = TRUE)
    ) %>% 
  kable()
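
| condition  |  n | mean_total | sd_total | mean_cause | sd_cause | mean_overhead | sd_overhead |
|:-----------|---:|-----------:|---------:|-----------:|---------:|--------------:|------------:|
| causefirst | 71 |   1.472535 | 1.654858 |  0.9288732 | 1.044327 |     0.5436620 |   0.7148891 |
| overfirst  | 65 |   1.753846 | 1.926629 |  1.1115385 | 1.329526 |     0.6423077 |   0.9343376 |
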
cleanresult %>%
  group_by(condition) %>% 
  summarise(
    mean_satisfaction = mean(satisfaction, na.rm = TRUE),
    sd_satisfaction = sd(satisfaction, na.rm = TRUE),
    mean_frequency = mean(frequency, na.rm = TRUE),
    sd_frequency = sd(frequency, na.rm = TRUE),
    mean_age = mean(age, na.rm = TRUE),
    sd_age = sd(age, na.rm = TRUE)
  ) %>% 
  kable()
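
| condition  | mean_satisfaction | sd_satisfaction | mean_frequency | sd_frequency | mean_age |   sd_age |
|:-----------|------------------:|----------------:|---------------:|-------------:|---------:|---------:|
| causefirst |          4.619718 |        1.417908 |       2.873239 |     1.723199 | 37.81690 | 13.03546 |
| overfirst  |          4.523077 |        1.370535 |       2.876923 |     1.866712 | 37.35385 | 10.85189 |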

Confirmatory analysis

  • Total Donation Amount

    A regression analysis was conducted to examine the effects of condition, age, and gender on total donations. The significant effect of condition observed in the original study did not replicate (β = 0.344, SE = 0.305, p = 0.261). However, the results remained in the same direction as the original study, with participants in the overhead-first condition (μo = 1.75, σo = 1.93) donating more than those in the cause-first condition (μc = 1.47, σc = 1.65). This difference had an effect size (d = 0.157, 95% CI: -0.183, 0.497) smaller than the original study’s effect (d = 0.41).

    Significant effects were observed for age (βa = -0.027, SEa = 0.013, p = 0.035) and gender (βg = -0.685, SEg = 0.310, p = 0.029), indicating that older and male participants donated less than younger and female participants. However, the overall model was only marginally significant (p = .097), and the adjusted R-squared (R2adj = 0.032) suggested low explanatory power.

  • Donation Allocation

    Separate regression analyses were conducted to evaluate differences in amounts allocated to the charity’s cause and overhead.

    • Overhead Donations

      The original study’s significant effect of condition on overhead donations also failed to replicate (β = 0.135, SE = 0.138, p = 0.332). Consistent with the direction of the original effect, participants in the overhead-first condition (μo = 0.64, σo = 0.93) donated more than those in the cause-first condition (μc = 0.54, σc = 0.71). This difference had an effect size (d = 0.119, 95% CI: -0.221, 0.459) smaller than the original study’s (d = 0.35).

      Significant effects were observed for age (βa = -0.015, SEa = 0.006, p = 0.012) and gender (βg = -0.406, SEg = 0.141, p = 0.005) in the same directions as the total donation regression. The model was significant (p = .021) with a slightly higher adjusted R-squared (R2adj = 0.062).

    • Cause Donations

      The regression analysis for donations to the cause also failed to replicate the original study’s significant effect of condition (β = 0.209, SE = 0.206, p = 0.311). However, the original direction was maintained, with participants in the overhead-first condition (μo = 1.11, σo = 1.33) donating more than those in the cause-first condition (μc = 0.93, σc = 1.04). This difference had an effect size (d = 0.154, 95% CI: -0.186, 0.494) smaller than the original study’s (d = 0.33).

      No significant explanatory variables were identified in this model. The overall model was not significant (p = 0.457) and had a negative adjusted R-squared (R2adj = -0.002).
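
The report does not show how the confidence intervals for Cohen's d were obtained. As a sketch under stated assumptions, the code below recomputes d for total donations from the descriptive statistics and builds an approximate 95% interval from a common large-sample standard error formula and a t quantile; both choices are assumptions, not necessarily the method actually used. Here d is signed as overhead-first minus cause-first.

```r
# Cohen's d and an approximate 95% CI for total donations, computed from the
# descriptive statistics (overhead-first minus cause-first).
m_over  <- 1.753846; sd_over  <- 1.926629; n_over  <- 65
m_cause <- 1.472535; sd_cause <- 1.654858; n_cause <- 71

# Pooled standard deviation across the two conditions
pooled_sd <- sqrt(((n_over - 1) * sd_over^2 + (n_cause - 1) * sd_cause^2) /
                    (n_over + n_cause - 2))
d <- (m_over - m_cause) / pooled_sd

# Assumed large-sample standard error of d
se_d <- sqrt((n_over + n_cause) / (n_over * n_cause) +
               d^2 / (2 * (n_over + n_cause)))

# Assumed critical value: t quantile with n1 + n2 - 2 degrees of freedom
t_crit <- qt(0.975, df = n_over + n_cause - 2)
ci <- d + c(-1, 1) * t_crit * se_d

round(c(d = d, lower = ci[1], upper = ci[2]), 3)
```

Under these assumptions the result is close to the interval reported for total donations; swapping in the overhead or cause statistics gives the corresponding intervals for the other two outcomes.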

## Analysis

totaloutput <- lm('total ~ condition + age + gender', data = cleanresult)
summary(totaloutput)

Call:
lm(formula = "total ~ condition + age + gender", data = cleanresult)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4897 -1.3151 -0.5092  0.8497  3.8677 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           2.80183    0.56851   4.928 2.48e-06 ***
conditionoverfirst    0.34397    0.30452   1.130   0.2608    
age                  -0.02734    0.01284  -2.129   0.0351 *  
genderMale           -0.68530    0.30992  -2.211   0.0288 *  
genderNonbinary      -0.11264    1.26692  -0.089   0.9293    
genderPrefernottosay  0.38164    1.78738   0.214   0.8313    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.759 on 130 degrees of freedom
Multiple R-squared:  0.06833,   Adjusted R-squared:  0.03249 
F-statistic: 1.907 on 5 and 130 DF,  p-value: 0.09747
overoutput <- lm('overhead ~ condition + age + gender', data = cleanresult)
summary(overoutput)

Call:
lm(formula = "overhead ~ condition + age + gender", data = cleanresult)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0663 -0.5410 -0.2396  0.3893  4.0831 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)           1.290311   0.258375   4.994 1.87e-06 ***
conditionoverfirst    0.134722   0.138400   0.973  0.33215    
age                  -0.014946   0.005836  -2.561  0.01157 *  
genderMale           -0.405731   0.140854  -2.881  0.00465 ** 
genderNonbinary      -0.386863   0.575788  -0.672  0.50285    
genderPrefernottosay  0.083347   0.812332   0.103  0.91844    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7996 on 130 degrees of freedom
Multiple R-squared:  0.09625,   Adjusted R-squared:  0.06149 
F-statistic: 2.769 on 5 and 130 DF,  p-value: 0.02067
causeoutput <- lm('cause ~ condition + age + gender', data = cleanresult)
summary(causeoutput)

Call:
lm(formula = "cause ~ condition + age + gender", data = cleanresult)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.4233 -0.8757 -0.2950  0.4362  3.9802 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)           1.511520   0.384364   3.933 0.000136 ***
conditionoverfirst    0.209244   0.205886   1.016 0.311369    
age                  -0.012392   0.008681  -1.427 0.155843    
genderMale           -0.279574   0.209536  -1.334 0.184455    
genderNonbinary       0.274219   0.856553   0.320 0.749374    
genderPrefernottosay  0.298290   1.208439   0.247 0.805422    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.19 on 130 degrees of freedom
Multiple R-squared:  0.0349,    Adjusted R-squared:  -0.00222 
F-statistic: 0.9402 on 5 and 130 DF,  p-value: 0.4574
## Cohen's d Calculation
# Function to calculate Cohen's d
calculate_cohens_d <- function(data, group_var, outcome_var) {
  # Separate the data by groups
  group1 <- data %>% filter(!!sym(group_var) == "causefirst") %>% pull(!!sym(outcome_var))
  group2 <- data %>% filter(!!sym(group_var) == "overfirst") %>% pull(!!sym(outcome_var))
  
  # Calculate means and standard deviations for each group
  mean1 <- mean(group1, na.rm = TRUE)
  mean2 <- mean(group2, na.rm = TRUE)
  sd1 <- sd(group1, na.rm = TRUE)
  sd2 <- sd(group2, na.rm = TRUE)
  
  # Calculate pooled standard deviation
  n1 <- length(group1)
  n2 <- length(group2)
  pooled_sd <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  
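  # Calculate Cohen's d as cause-first minus overhead-first, so a negative
  # value means the overhead-first group donated more (the text reports |d|)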
  cohens_d <- (mean1 - mean2) / pooled_sd
  return(cohens_d)
}

# Calculate Cohen's d for the effect of condition on total donations
cohens_d <- calculate_cohens_d(cleanresult, "condition", "total")
print(paste("Cohen's d for the effect of condition on total donations:", round(cohens_d, 3)))
[1] "Cohen's d for the effect of condition on total donations: -0.157"
cohens_d <- calculate_cohens_d(cleanresult, "condition", "overhead")
print(paste("Cohen's d for the effect of condition on overhead", round(cohens_d, 3)))
[1] "Cohen's d for the effect of condition on overhead -0.119"
cohens_d <- calculate_cohens_d(cleanresult, "condition", "cause")
print(paste("Cohen's d for the effect of condition on cause", round(cohens_d, 3)))
[1] "Cohen's d for the effect of condition on cause -0.154"
# Calculates means and standard errors for donation amounts among both conditions
summary_data <- cleanresult %>%
  group_by(condition) %>%
  summarise(
    total_mean = mean(total),
    total_se = sd(total) / sqrt(n()),
    cause_mean = mean(cause),
    cause_se = sd(cause) / sqrt(n()),
    overhead_mean = mean(overhead),
    overhead_se = sd(overhead) / sqrt(n())
  ) %>%
  
  # Shapes data for ease of use in ggplot
  pivot_longer(
    cols = -condition,
    names_to = c("category", ".value"),
    names_sep = "_"
  )
summary_data$category <- factor(summary_data$category, 
                                levels = c("total", "cause", "overhead"))

# Creates base for plot
ggplot(summary_data, aes(x = category, y = mean, fill = condition)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.4), 
           width = 0.4, color = "gray20") +
  
  # Adds error bars
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                position = position_dodge(width = 0.4), 
                width = 0.2, color = "gray20") +
  
  # Specifies aesthetics for color and label
  scale_fill_manual(
    values = c("causefirst" = "white", "overfirst" = "gray"),
    labels = c("Cause First", "Overhead First")) +
  labs(
    title = "Results of Replication",
    x = "Donation collection method",
    y = "Amount ($)",
    fill = "Condition"
  ) +
  
  # Specifies axes
  scale_y_continuous(
    expand = c(0, 0),
    limits = c(0, 2.25),
    breaks = seq(0, 2.25, .5)
  ) +
  
  # Specifies style
  theme_classic()

Results of Original Study

Exploratory analyses

Three regression analyses will test if satisfaction varies by donation order, adding age, gender, and donation amounts in steps to control for their effects.

As in the original study, in the model with condition as the single independent variable, condition had no significant effect on donation satisfaction (β = -0.097, SE = 0.240, p = 0.687). The model was not significant (p = 0.687) and had a negative adjusted R-squared (R2adj = -0.006).

When age and gender were added to the model, only gender became significant (β = 0.680, SE = 0.239, p = 0.005). This differs from the original study where only age had a significant effect. The model was significant (p = 0.035) and had a low adjusted R-squared (R2adj = 0.052).

In the model that included age, gender, and the donation amounts for the cause and for overhead, the original study found gender to be the only significant variable. In the replication, gender was likewise the only significant predictor (β = 0.697, SE = 0.244, p = 0.005), although the donation amount for the cause was marginally significant (β = 0.232, SE = 0.118, p = 0.053). The model was significant (p = 0.025) and had a low adjusted R-squared (R2adj = 0.067).

donation1 <- lm('satisfaction ~ condition', data = cleanresult)
summary(donation1)

Call:
lm(formula = "satisfaction ~ condition", data = cleanresult)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.6197 -1.5231  0.4286  1.3803  1.4769 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)         4.61972    0.16561  27.895   <2e-16 ***
conditionoverfirst -0.09664    0.23956  -0.403    0.687    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.395 on 134 degrees of freedom
Multiple R-squared:  0.001213,  Adjusted R-squared:  -0.006241 
F-statistic: 0.1627 on 1 and 134 DF,  p-value: 0.6873
donation2 <- lm('satisfaction ~ condition + age + gender', data = cleanresult)
summary(donation2)

Call:
lm(formula = "satisfaction ~ condition + age + gender", data = cleanresult)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.6192 -1.0820  0.2557  1.0503  1.8313 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.975362   0.437604   9.084 1.48e-15 ***
conditionoverfirst   -0.192983   0.234404  -0.823   0.4118    
age                   0.009905   0.009884   1.002   0.3182    
genderMale            0.679652   0.238561   2.849   0.0051 ** 
genderNonbinary       0.309134   0.975199   0.317   0.7518    
genderPrefernottosay -2.222978   1.375827  -1.616   0.1086    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.354 on 130 degrees of freedom
Multiple R-squared:  0.08735,   Adjusted R-squared:  0.05225 
F-statistic: 2.489 on 5 and 130 DF,  p-value: 0.03454
donation3 <- lm('satisfaction ~ condition + age + gender + cause + overhead', data = cleanresult)
summary(donation3)

Call:
lm(formula = "satisfaction ~ condition + age + gender + cause + overhead", 
    data = cleanresult)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.4931 -1.0278  0.1928  1.1649  2.0190 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.77690    0.47713   7.916 1.02e-12 ***
conditionoverfirst   -0.22562    0.23371  -0.965  0.33619    
age                   0.01102    0.01005   1.096  0.27502    
genderMale            0.69673    0.24420   2.853  0.00505 ** 
genderNonbinary       0.20016    0.97175   0.206  0.83713    
genderPrefernottosay -2.28227    1.36538  -1.672  0.09706 .  
cause                 0.23161    0.11838   1.957  0.05258 .  
overhead             -0.11751    0.17610  -0.667  0.50579    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.344 on 128 degrees of freedom
Multiple R-squared:  0.1154,    Adjusted R-squared:  0.06704 
F-statistic: 2.386 on 7 and 128 DF,  p-value: 0.02512
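
Whether adding cause and overhead in donation3 improves on donation2 can be checked from the reported R-squared values alone. The sketch below computes the standard F statistic for nested linear models. It uses only numbers printed in the two summaries above (both models are fit to the same 136 observations), the variable names are illustrative, and the result carries some rounding error. With the data at hand, anova(donation2, donation3) would run the same comparison directly.

```r
# Nested-model F test reconstructed from the printed summaries (rounded inputs).
r2_reduced <- 0.08735  # Multiple R-squared, donation2
r2_full    <- 0.1154   # Multiple R-squared, donation3
q          <- 2        # predictors added in donation3: cause, overhead
df_resid   <- 128      # residual degrees of freedom, donation3

# F = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / df_resid)
f_change <- ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_resid)

# Upper-tail p-value from the F distribution with (q, df_resid) degrees of freedom
p_change <- pf(f_change, df1 = q, df2 = df_resid, lower.tail = FALSE)

print(c(F = f_change, p = p_change))
```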

Discussion

Summary of Replication Attempt

Our replication of Study 2 from Suk & Mudita (2023) evaluated the effects of donation collection method on total donation amount, donation allocation to cause and overhead, and donation satisfaction. The original study found significant effects of condition (cause-first vs. overhead-first) on total donation amount and allocation; our replication did not reproduce these effects at conventional significance levels. The observed trends were nonetheless in the same direction as the original findings: participants in the overhead-first condition donated slightly more overall and allocated more to overhead than those in the cause-first condition, but these differences were not statistically significant. Demographic variables (age and gender) showed significant effects in both the original study and our replication, although with some variations.

Commentary

The exploratory analysis revealed trends in the same direction as the original effects, but they did not reach statistical significance. One possible explanation is the smaller sample size in our replication (136 participants compared to the original 143), which resulted from excluding participants who failed comprehension checks. This difference may have reduced statistical power, making it harder to detect significant effects. Furthermore, although including demographic variables such as age and gender added explanatory power to the models, the adjusted R-squared values remained low, indicating that the predictors explained little of the overall variance.
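
The power concern can be made concrete with a rough sensitivity calculation. The sketch below uses power.t.test from base R's stats package and assumes two equal groups of 68 (136 split evenly; the actual group sizes may differ), a two-sided test at alpha = 0.05, and 80% power. None of these settings come from the original analysis plan. It solves for the smallest standardized difference, in SD units because sd is left at its default of 1, that such a design would detect with the stated power.

```r
# Sensitivity analysis: smallest standardized effect detectable with the replication sample.
# Assumption: 136 participants split evenly into two conditions (68 each).
n_per_group <- 136 / 2

sens <- power.t.test(
  n = n_per_group,          # participants per condition (assumed equal)
  power = 0.80,             # conventional target power (assumption)
  sig.level = 0.05,         # two-sided alpha (assumption)
  type = "two.sample",
  alternative = "two.sided"
)

# delta is expressed in SD units because sd defaults to 1
min_detectable_d <- sens$delta
print(round(min_detectable_d, 2))
```

If the true effect were smaller than this value, a non-significant result at this sample size would not be surprising.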

The failure to replicate the original study’s significant effects could also stem from contextual or methodological differences. Our replication recruited participants from the U.S., while the original study used a U.K. sample, which could have introduced cultural differences in donation behavior. Additionally, the reduced bonus amount ($5 vs. £6 in the original study) might have influenced participants’ engagement or perceived stakes in the task. These factors may plausibly have moderated the effects observed in the original study.

One challenge in replicating the original study was ensuring that participants understood the donation task and conditions as intended. Comprehension checks helped ensure data quality, but they also reduced the final sample size, which may have lowered the study’s power. Additionally, some materials from the original authors were unavailable (e.g., the unrelated survey), which required adaptations that could have inadvertently influenced participants’ responses.

Conclusion

While our replication did not achieve statistical significance for the main effects observed in the original study, the trends were consistent, suggesting that the direction of the effects may hold under different contexts. This replication highlights the importance of further research to understand the robustness and generalizability of findings related to donation collection methods. Future studies could address limitations such as sample differences and the size of bonuses to improve comparability and ensure more robust conclusions.

Statement of Contributions

Emma Gu: Conceptualization, Methodology, Software, Writing - original draft, and Writing - review & editing.
Fan Yang: Methodology, Software, Writing - original draft, and Writing - review & editing.
Nina Rice: Formal analysis, Software, Validation, Visualization, Writing - original draft, and Writing - review & editing.
William S. de Melo: Data curation, Formal analysis, Investigation, Software, Validation, Writing - original draft, and Writing - review & editing.