Set up

Loading R libraries

packages = c(
  "tidyverse", "data.table", "dtplyr", "rlang", "kableExtra", "haven", "ggcorrplot", "visdat", "VIM", "corrplot", "fastDummies", "tm", "wordcloud", "tidytext"
)

# Load the packages, install if necessary
new_packages = packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages, dependencies = TRUE)
lapply(packages, require, character.only = TRUE) |> invisible()

Goal

  • This script starts by loading the cleaned data set generated by the main survey cleaning script.

  • The current working directory is ~FB_Charitable_Giving/Code/SurveyDataAnalysis/

  • We then filter out participants who did not consent and generate a series of summary statistics for all categorical and quantitative variables.

  • For categorical variables, we report the frequency and proportion for each choice of each variable. We then show these statistics by treatment arm.

  • For continuous variables, we show the total number of missing values, the proportion of missing values, the mean, standard deviation, standard error, min, max, and decile values. We then report these statistics for each variable by treatment arm.

  • Next, we look at the number of people who entered the chatbot, the charity matching results, and donation intentions over time.

Summary

  • About 90% of people spend 5 minutes or less in the chatbot to get their match.

  • Our entries into the chatbot started increasing dramatically in the last 2 weeks of December. That means we will have a difficult time separating out selection over time and selection due to expanded recruiting efforts (i.e., spending more ad money).

  • 78% of people chose the U.S. over Global, although users were less likely to choose the U.S. over time (see prior point for difficulty interpreting).

  • The amount of time spent on different charity matches varies. This suggests people were actually reading: the delay between messages (chatbot “pauses”) did not vary across charities, but the length of the text for each charity did vary somewhat.

  • 15% of people who answered the donation question reported wanting to make a donation immediately, and this proportion was fairly constant over time, aside from a few unusually small or large days.

  • People who did not want to donate were split roughly 40/40/20 across Explore Other Charities / Learn More about Matched Charity / Donate Later, and these proportions were fairly constant over time (a sketch reproducing these splits follows this list).

  • About 20% of users who were asked the question wanted to share the chatbot quiz with their friends.

  • About 35% of users who were asked the question wanted to be recontacted, and this proportion may have dipped some in the last few days.
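
  • Two of the splits above can be reproduced directly from the consent sample (df_wide, defined in the Sample definition section below); a minimal sketch, assuming missing responses are coded as NA:

sum(df_wide$country_charity_coded == "US", na.rm = TRUE) / nrow(df_wide)  # ~0.78, the U.S. share in the bullet above
df_wide %>%
  filter(!is.na(donate_later_coded)) %>%
  count(donate_later_coded) %>%
  mutate(prop = n / sum(n))  # split among people who did not want to donate today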

Data

  • The data used in this script was processed by the main survey cleaning script.

  • The data we are using is located at ~FB_Charitable_Giving/Data/Processed.

  • The output tables, figures, and spreadsheet generated by this script can be found at ~FB_Charitable_Giving/Data/Processed.

Loading the Data

data <- readRDS("Data/Processed/charitable_clean_wide.rds")

a <- as.character(nrow(data))  # total observations in the cleaned data, quoted inline below

Sample definition

df_wide <- data |> filter(consent_coded_num == 1)
n_consent <- as.character(nrow(df_wide))
  • The dataset contains all participants who passed the high-quality data checks in the cleaning script. The data currently has 46419 observations.

  • In this script, we restrict the sample to participants who consented to start the chatbot. The number of observations in this sample is 18529. This is the main sample we will use for the analysis.

  • During the analysis, we’ll limit participants to specific subgroups determined by certain conditions. For instance, we’ll include participants who were assigned a treatment, those who chose a charity origin, and those who selected a main charity cause (see the preview below). We’ll provide the count of observations from our main sample for each subsample, with the exact number specified within each section of this script.
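
  • As a preview, the subsample filters used later in this script look like this (counts are reported in each section):

df_arm       <- df_wide %>% filter(arm_coded != "")              # received a treatment assignment
df_country   <- df_wide %>% filter(country_charity_coded != "")  # chose a charity origin
df_maincause <- df_wide %>% filter(main_cause_coded != "")       # selected a main charity cause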

Key Decisions Made During the Cleaning

  • Key decisions regarding filtering and record keeping made during the cleaning include:
    1. We identified participants who chose to repeat the charity matching quiz and manually recorded the Chatfuel responses for both the first and second attempts. For the analysis, we use the responses from the first attempt.
    2. We pinpointed participants displaying atypical response behavior, such as skipping entire text blocks yet proceeding to subsequent sections (additional specifics forthcoming).
    3. We documented discrepancies between the expected and revealed charity names disclosed to participants. We noticed that cases with missing entries in the actual charity name field, but not in the expected charity name field, predominantly occur on December 28, 2023. This follows an update on December 27, 2023, to the spreadsheet with charity details, during which a new column was added without a name. Consequently, when App Engine updated its instances on December 28, 2023, it led to a disruption in data availability. This problem was identified and resolved by 10 AM on the same day (December 28). More detail on this issue can be found here.
    4. We also checked whether participants who were able to proceed to the treatment assignment phase had any missing values in the country_charity_coded, main_cause_coded, and sub_cause_coded variables. We found that some participants had missing values in these variables. We generated a series of intermediate csv files to record abnormal entries and will check these responses manually (a sketch of this check follows this list).
    5. WORK IN PROGRESS
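
  • A minimal sketch of the check in point 4, assuming missing entries in the coded variables appear as NA or empty strings:

# Count missing coded values among participants who reached treatment assignment
df_wide %>%
  filter(arm_coded != "") %>%
  summarise(
    missing_country    = sum(is.na(country_charity_coded) | country_charity_coded == ""),
    missing_main_cause = sum(is.na(main_cause_coded) | main_cause_coded == ""),
    missing_sub_cause  = sum(is.na(sub_cause_coded) | sub_cause_coded == "")
  )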

Structure of Charity Matching Quiz

  • For each main cause, participants were asked to select a sub-cause. The sub-causes are described as follows:

  • If country_choice_num is US:

    • The sub-causes for Transform education are:
      1. help all children learn to read
      2. send a low income student to college
      3. support girls in STEM
      4. help teachers fund classroom projects
      5. provide tutoring and support to underserved children
    • The sub-causes for Eradicate hunger and homelessness are:
      1. a roof over their head
      2. hot food in their belly
      3. employment training and opportunities
    • The sub-causes for Defend the oppressed and marginalized are:
      1. immigrants and refugees fleeing violence
      2. LGBTQ+ communities
      3. racial or ethnic minorities
      4. women
      5. people with disabilities
    • The sub-causes for Rescue the environment are:
      1. innovations that further clean energy technology
      2. massive ocean clean up
      3. conserving natural spaces and habitats like old rainforests and national parks
      4. new policies to regulate carbon emissions
    • The sub-causes for Heal the sick are:
      1. people suffering from preventable diseases because of inadequate healthcare
      2. rare diseases that need more research
      3. cancer: The Big C
      4. people with heart disease
    • The sub-causes for Protect the animals are:
      1. dogs & cats humans’ best friends
      2. all the WILD furry animals like lions, tigers and bears (oh and koalas)
      3. any species on the brink of extinction
      4. animals in factory farms
      5. whales and ocean animals (Baby Beluga! Nemo!!)
  • If country_choice_num is Global:

    • The sub-causes for Eradicate poverty worldwide are:
      1. the opportunity to go to school
      2. food in their belly
      3. the skills to grow food sustainably
      4. access to clean water
    • The sub-causes for Defend the oppressed and marginalized are:
      1. refugees fleeing violence or other disasters
      2. people that are victims of human trafficking
      3. women
      4. groups targeted by their government
    • The sub-causes for Rescue the environment are:
      1. innovations that further clean energy technology
      2. massive ocean clean up
      3. conserving natural spaces and habitats like old rainforests and national parks
      4. new policies to regulate carbon emissions
    • The sub-causes for Heal the sick are:
      1. protecting against preventable childhood diseases
      2. treating the big three: HIV, AIDS, Tuberculosis and Malaria
      3. eradicating polio
      4. preventing and curing blindness
    • The sub-causes for Protect the animals are:
      1. all the WILD furry animals like lions, tigers and bears (oh and koalas)
      2. any species on the brink of extinction
      3. whales and ocean animals (Baby Beluga! Nemo!!)

Data Dictionary

  • The variable names and their descriptions are shown in the table below. Variable names are sorted in alphabetical order.
library(Hmisc)
# Build the dictionary from the variable labels attached to df_wide
dictionary <- label(df_wide) %>% data.frame()
dictionary <- dictionary %>% 
  mutate(variable_names = rownames(dictionary)) %>%
  select(variable_names, '.')

rownames(dictionary) <- NULL
dictionary %>% arrange(dictionary[,1]) %>% kable(digits = 3, col.names = c("Variable Name", "Description")) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "500px")
Variable Name Description
analytic_id Unique identifier for each participant
arm_coded Treatment assignment of the participant
charitable_affirm_end Time stamp when participants finished answering the affirmation questions
charitable_affirm_start Time stamp when participants started answering the affirmation questions
charitable_intro_end Time stamp after participant saw the statement ‘Awesome! Let’s find your charity soulmate’
charitable_intro_start_time Time stamp when participant entered the chatbot
charitable_match_end Time stamp when participants finished providing their responses for the charity matching quiz
charitable_match_start Time stamp when participants began the charity matching quiz
charitable_reveal_end Time stamp at the end of charity reveal
charitable_reveal_start Time stamp when participants started seeing the charity reveal
charitable_treatment_end Time stamp at the end of the intervention
charitable_treatment_start_time Time stamp when participants got assigned a treatment group
charity_mismatch_type_1 1 if the expected charity name is not missing but the revealed charity name is missing
charity_mismatch_type_2 1 if the expected charity name does not match the revealed charity name
charity_name_coded Charity name revealed to the participant
consent_coded_num Binary flag for whether the participant gave consent to the study
country_charity_coded Country of charity (US or Global) selected by the participant
donate_later_coded Variable to indicate the donation intention of the participant to donate later
donate_today_coded Binary variable to indicate the donation intention of the participant to donate today
donor_type_coded Donor type assigned to the participant
duration_affirm Duration of the affirmation questions in minutes
duration_intro Duration of the introduction stage in minutes
duration_match Duration of the charity matching quiz in minutes
duration_reveal Duration of the charity reveal in minutes
duration_treatment Duration of the intervention in minutes
feedback_has_link Binary variable to indicate if the feedback contains a hyperlink
feedback_match_coded Feedback to charity match
greeting_coded Binary flag for whether the participant passed the greeting stage
has_logo_coded Binary flag for whether the charity has a logo
important_forward_looking_coded Score for the importance of forward-looking giving
important_responsive_coded Score for the importance of responsive giving
important_smart_coded Score for the importance of smart giving
main_cause_coded Main cause of charity selected by the participant
manipulation_order_coded Coded version of the manipulation order variable
manipulation_value_coded Coded version of the manipulation question value variable
not_affil_coded Binary flag for whether the participant passed the not affiliated stage
pre_consent_coded Binary flag for whether the participant passed the pre-consent stage
proper_order Flag for participants who have a time sequence that is in the correct order
repeat_quiz_coded Binary flag for whether the participant repeated the charity matching quiz
share_with_friend_coded Variable to indicate if the participant would like to share the link with friends
source_coded 1 if entered via JSON AD, 0 otherwise
stay_connected_coded Variable to indicate if the participant would like to stay connected with the chatbot
sub_cause_coded Sub cause selected by the participant
text_has_link Binary variable to indicate if the response contains a hyperlink
time_since_consent Days since the experiment began when the participant consented
time_since_first_start Days since the experiment began when the participant first started the chatbot
time_since_intervention Days since the experiment began when the participant started the intervention
time_since_match_start Days since the experiment began when the participant started the charity matching quiz
time_since_reveal Days since the experiment began when the participant started the charity reveal
treatment_completed Binary variable to indicate if participants have completed the treatment intervention
treatment_text Free text responses from participants in the opportunity and obligation arm

Functions

Functions for continuous variables

  • We created a function that takes each variable and, optionally, a grouping categorical variable, and calculates the summary statistics for each group.

  • The summary statistics include the number of missing values, the mean, the standard deviation, the standard error, the total number of observations, the min, the max, and decile values.

  • The function combines the summary statistics for each selected variable and creates an HTML table.

  • The function also generates a csv file, which is saved in the GitHub directory ./Data/Processed/.

summary_stats_continous <- function(dt, variable, group = NULL){
  # Get descriptive statistics for a single continuous variable
  variable_name = variable 
  variable = sym(variable)

  if(!is.null(group)){
    group_name = group
    group = sym(group)
    dt <- dt |> group_by({{group}})
  } # grouping if group variable is selected

  tab = dt |> lazy_dt() |>
    mutate(
      `Number_Missing` = sum(is.na({{variable}})),  # calculate missing values
    ) |>
    filter(!is.na({{variable}})) |>  # filtering out the missing values for the variable
    summarise(
        Total_Observations = n(),
        Number_Missing = first(`Number_Missing`),
        Mean = as.numeric(mean({{variable}})),
        SD = as.numeric(sd({{variable}})),
        SE = as.numeric(sd({{variable}}) / sqrt(n())),
        Min = as.numeric(min({{variable}})),
        Max =  as.numeric(max({{variable}})),
        Decile_10 = as.numeric(quantile({{variable}}, probs = 0.1)),
        Decile_20 = as.numeric(quantile({{variable}}, probs = 0.2)),
        Decile_30 = as.numeric(quantile({{variable}}, probs = 0.3)),
        Decile_40 = as.numeric(quantile({{variable}}, probs = 0.4)),
        Decile_50 = as.numeric(quantile({{variable}}, probs = 0.5)),
        Decile_60 = as.numeric(quantile({{variable}}, probs = 0.6)),
        Decile_70 = as.numeric(quantile({{variable}}, probs = 0.7)),
        Decile_80 = as.numeric(quantile({{variable}}, probs = 0.8)),
        Decile_90 = as.numeric(quantile({{variable}}, probs = 0.9)),
        Decile_95 = as.numeric(quantile({{variable}}, probs = 0.95)),
        Decile_98 = as.numeric(quantile({{variable}}, probs = 0.98)),
        Decile_99 = as.numeric(quantile({{variable}}, probs = 0.99)),
        Decile_100 = as.numeric(quantile({{variable}}, probs = 1))
      ) |> # compute summary statistics 
      mutate(
        Proportion_Missing = Number_Missing / (Number_Missing + Total_Observations),
        Variable = variable_name
      ) |> # proportion of missing values relative to the full (sub)sample, so it stays in [0, 1]
      collect()

  if (is.null(group)){
    tab <- tab |> select(-Total_Observations) |> select(Variable, Number_Missing, Proportion_Missing, everything())
  } else {
    tab <- tab |> mutate(Group = paste(group_name, ": ", {{group}})) |> select(-Total_Observations) |> select(Variable, Group, Number_Missing, Proportion_Missing, everything()) |> select(-{{group}})
  }

  return(tab)
}

summary_stats_table_continous <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  # Get descriptive statistics for a list of continuous variables and return a combined table
  results <- lapply(variables, function(x) summary_stats_continous(dt, x, group)) # apply the summary_stats_continous function to each variable in the list
  results <- bind_rows(results) # bind the list of dataframes into a single dataframe

  if(!is.null(filename)){
    write.csv(results, paste0("./Data/Processed/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}
summary_stats_continous_outcomes <- function(dt, variable, group = NULL){
  # Same as summary_stats_continous, but reports only the mean, SD, and SE (no min/max/deciles)
  variable_name = variable 
  variable = sym(variable)

  if(!is.null(group)){
    group_name = group
    group = sym(group)
    dt <- dt |> group_by({{group}})
  } # grouping if group variable is selected

  tab = dt |> lazy_dt() |>
    mutate(
      `Number_Missing` = sum(is.na({{variable}})),  # calculate missing values
    ) |>
    filter(!is.na({{variable}})) |>  # filtering out the missing values for the variable
    summarise(
        Total_Observations = n(),
        Number_Missing = first(`Number_Missing`),
        Mean = as.numeric(mean({{variable}})),
        SD = as.numeric(sd({{variable}})),
        SE = as.numeric(sd({{variable}}) / sqrt(n()))
      ) |> # compute summary statistics 
      mutate(
        Proportion_Missing = Number_Missing / (Number_Missing + Total_Observations),
        Variable = variable_name
      ) |> # proportion of missing values relative to the full (sub)sample, so it stays in [0, 1]
      collect()

  if (is.null(group)){
    tab <- tab |> select(Variable, everything())
  } else {
    tab <- tab |> mutate(Group = paste(group_name, ": ", {{group}})) |> select(Variable, Group, everything()) |> select(-{{group}})
  }

  return(tab)
}

summary_stats_table_continous_outcomes <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  # Get descriptive statistics for a list of continuous variables and return a combined table
  results <- lapply(variables, function(x) summary_stats_continous_outcomes(dt, x, group)) # apply the summary_stats_continous function to each variable in the list
  results <- bind_rows(results) # bind the list of dataframes into a single dataframe

  if(!is.null(filename)){
    write.csv(results, paste0("./Data/Processed/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}

Function for unordered categorical variables

  • We created a function that takes each variable and, optionally, a grouping categorical variable, and calculates the summary statistics for each group.

  • The summary statistics include the frequency count and the proportion for each value of each variable.

  • The function combines the summary statistics for each selected variable and creates an HTML table.

  • The function also generates a csv file, which is saved in the GitHub directory ./Data/Processed/.

# Define the function to create summary statistics for unordered categorical variable
summary_stats_categorical <- function(data, variables, group = NULL) {
  combined_results <- data.frame(
    Variable = character(),
    Group = character(),
    Category = character(),
    Frequency = integer(),
    Proportion = numeric(),
    stringsAsFactors = FALSE
  ) # preset a dataframe
  
  for (variable in variables) {
    if (!is.null(group)) {
      # Grouped calculations
      summary_df <- data %>%
        group_by(across(all_of(c(group, variable)))) %>%  # group_by_at() is superseded
        summarise(Frequency = n(), .groups = 'drop') %>%
        mutate(Proportion = Frequency / sum(Frequency)) %>%
        ungroup() %>%
        mutate(Variable = variable,
               Group = ifelse(is.na(.data[[group]]), "Missing", as.character(.data[[group]])),
               Category = ifelse(is.na(.data[[variable]]), "Missing", as.character(.data[[variable]]))) %>%
        select(Variable, Group, Category, Frequency, Proportion)
    } else {
      # Ungrouped calculations
     summary_df <- data %>%
      dplyr::select(!!sym(variable)) %>%
      mutate(Category = factor(!!sym(variable), exclude = NULL)) %>% # Include missing values as a factor level
      group_by(Category) %>%
      summarise(
        Frequency = n(),
        Proportion = n() / nrow(data),
        .groups = 'drop' # Avoid regrouping warning
      ) %>%
      ungroup() %>%
      mutate(Variable = variable) %>% # Add variable name
      dplyr::select(Variable, everything())
    }
    
    # Combine the result with the overall results
    combined_results <- bind_rows(combined_results, summary_df)
  }
  
  return(combined_results)
}

summary_stats_tab_categorical <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  
  results <- lapply(variables, function(x) summary_stats_categorical(dt, x, group))
  results <- bind_rows(results) 

  if(!is.null(filename)){
    write.csv(results, paste0("./Data/Processed/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}

Functions for generating probability distributions

Density plot

create_density_plot <- function(data, x_var, group_var, x_min, x_max, title, x_label, y_label, file_basename) {
  # aes_string() is soft-deprecated in ggplot2 but conveniently accepts group_var = NULL
  plt <-
  ggplot(data, aes_string(x = x_var, group = group_var, fill = group_var)) +
    geom_density(alpha = 0.5, color = "black") +
    labs(title = title,
         x = x_label,
         y = y_label) +
    scale_x_continuous(limits = c(x_min, x_max)) +
    theme_minimal()+
    guides(fill = guide_legend(title = NULL))

  print(plt)

  # Export the data used for the graph to CSV
  write.csv(data, paste0("./Data/Processed/", file_basename, ".csv"))

  # Save the plot to JPG
  ggsave(paste0("./Data/Processed/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)

  # Save the plot to PDF
  ggsave(paste0("./Data/Processed/", file_basename, ".pdf"), plt, device = "pdf", width = 10, height = 8)
}  

Histogram

create_histogram_grouped <- function(data, x_var, group_var, bin_width, title, x_label, y_label, line_color = "black", file_basename) {
  # Generate the histogram; aes_string() is soft-deprecated in ggplot2 but accepts group_var = NULL
  plt <- ggplot(data, aes_string(x = x_var, fill = group_var)) +
    geom_histogram(binwidth = bin_width, color = line_color, position = "stack", alpha = 0.5) +
    labs(title = title, x = x_label, y = y_label) +
    theme_minimal() +
    guides(fill = guide_legend(title = NULL))

  # Print the plot
  print(plt)
  
  # Export the data used for the graph to CSV
  write.csv(data, paste0("./Data/Processed/", file_basename, ".csv"))

  # Save the plot to JPG
  ggsave(paste0("./Data/Processed/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)

  # Save the plot to PDF
  ggsave(paste0("./Data/Processed/", file_basename, ".pdf"), plt, device = "pdf", width = 10, height = 8)
}

Summary Statistics of Continuous Variables

Main Sample

  • For the summary statistics of continuous variables, we generate the statistics using the consent sample: the 18529 participants who consented to the study.

  • The table below shows the number of missing values, the proportion of missing values (out of the sample of 18529 participants who consented to the study), the mean, standard deviation, standard error, min, max, and decile values for each variable in the sample.

  • We also report the extreme values of these variables by showing the 90th, 95th, 98th, 99th, and 100th percentiles.

  • We noticed very large outliers in the duration_intro, duration_match, duration_affirm, and duration_reveal variables (in minutes).

  • We noticed that 18 participants are missing the source_coded variable. These participants retried the charity matching quiz, and in the process of manually recording their data we did not capture the source of entry. WE ARE WORKING ON THIS, which should be an easy task. A sketch for inspecting these cases follows.
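
  • A minimal sketch for inspecting these cases, assuming the missing entries are coded as NA and that repeat_quiz_coded flags the retries:

df_wide %>%
  filter(is.na(source_coded)) %>%
  select(analytic_id, repeat_quiz_coded, source_coded)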

df_wide$manipulation_value_coded <- as.numeric(df_wide$manipulation_value_coded)  # ensure numeric for the summary statistics

continous_var <- c("duration_intro", "duration_match", "duration_affirm", "duration_reveal", "duration_treatment", "time_since_first_start", "important_smart_coded", "important_responsive_coded","important_forward_looking_coded", "source_coded", "manipulation_value_coded")

summary_stats_table_continous(
  df_wide,  
  variables = continous_var,
  group = NULL,
  filename = "continuous",
  caption = "Summary Statistics for Continuous Variables",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables
Variable Number_Missing Proportion_Missing Mean SD SE Min Max Decile_10 Decile_20 Decile_30 Decile_40 Decile_50 Decile_60 Decile_70 Decile_80 Decile_90 Decile_95 Decile_98 Decile_99 Decile_100
duration_intro 2572 0.161 127.725 1744.339 13.809 0.683 81298.02 0.933 1.017 1.067 1.150 1.233 1.350 1.500 1.783 2.483 4.133 74.737 1544.770 81298.02
duration_match 2045 0.124 30.518 816.431 6.359 0.400 44308.42 0.733 0.833 0.917 1.000 1.100 1.217 1.350 1.567 2.017 2.583 4.017 8.253 44308.42
duration_affirm 2634 0.166 11.028 407.984 3.236 0.383 38597.92 0.533 0.567 0.617 0.667 0.717 0.783 0.883 1.033 1.333 1.717 2.517 4.635 38597.92
duration_reveal 7814 0.729 77.690 1896.311 18.319 0.233 120738.32 0.650 0.717 0.767 0.817 0.883 0.967 1.083 1.250 1.667 2.367 7.967 82.480 120738.32
duration_treatment 13123 2.427 95.262 935.402 12.722 0.067 32165.07 1.233 1.700 2.017 2.350 2.800 3.417 4.450 7.600 24.692 135.896 1382.812 1599.863 32165.07
time_since_first_start 2027 0.123 22.630 9.105 0.071 0.000 33.00 8.000 14.000 19.000 23.000 26.000 28.000 29.000 31.000 32.000 33.000 33.000 33.000 33.00
important_smart_coded 2303 0.142 2.759 0.534 0.004 1.000 3.00 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_responsive_coded 2471 0.154 2.614 0.590 0.005 1.000 3.00 2.000 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_forward_looking_coded 2627 0.165 2.685 0.527 0.004 1.000 3.00 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
source_coded 18 0.001 0.983 0.128 0.001 0.000 1.00 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00
manipulation_value_coded 12123 1.892 4.701 0.717 0.009 1.000 5.00 4.000 4.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.00

Participants who received a charity match

df_charity <- df_wide %>% filter(charity_name_coded != "")

n_charity <- as.character(nrow(df_charity))
  • In this section, we will generate the summary statistics for continuous variables for the sample of 16487 participants who received a charity match.
summary_stats_table_continous(
  df_charity,  
  variables = continous_var,
  group = NULL,
  filename = "continuous_charity_match",
  caption = "Summary Statistics for Continuous Variables Among Participants Received a Charity Match",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables Among Participants Who Received a Charity Match
Variable Number_Missing Proportion_Missing Mean SD SE Min Max Decile_10 Decile_20 Decile_30 Decile_40 Decile_50 Decile_60 Decile_70 Decile_80 Decile_90 Decile_95 Decile_98 Decile_99 Decile_100
duration_intro 1786 0.121 89.978 1198.293 9.883 0.683 31112.45 0.933 1.000 1.067 1.133 1.217 1.333 1.483 1.733 2.367 3.700 31.017 774.050 31112.45
duration_match 5 0.000 30.521 816.481 6.360 0.400 44308.42 0.733 0.833 0.917 1.000 1.100 1.217 1.350 1.567 2.017 2.583 4.017 8.253 44308.42
duration_affirm 594 0.037 11.029 408.010 3.236 0.383 38597.92 0.533 0.567 0.617 0.667 0.717 0.783 0.883 1.033 1.333 1.717 2.517 4.636 38597.92
duration_reveal 5772 0.539 77.690 1896.311 18.319 0.233 120738.32 0.650 0.717 0.767 0.817 0.883 0.967 1.083 1.250 1.667 2.367 7.967 82.480 120738.32
duration_treatment 11081 2.050 95.262 935.402 12.722 0.067 32165.07 1.233 1.700 2.017 2.350 2.800 3.417 4.450 7.600 24.692 135.896 1382.812 1599.863 32165.07
time_since_first_start 1781 0.121 22.795 9.070 0.075 0.000 33.00 8.000 14.000 19.000 23.000 26.000 28.000 29.000 31.000 32.000 33.000 33.000 33.000 33.00
important_smart_coded 265 0.016 2.759 0.534 0.004 1.000 3.00 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_responsive_coded 433 0.027 2.614 0.590 0.005 1.000 3.00 2.000 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_forward_looking_coded 590 0.037 2.685 0.527 0.004 1.000 3.00 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
source_coded 0 0.000 0.983 0.131 0.001 0.000 1.00 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00
manipulation_value_coded 10081 1.574 4.701 0.717 0.009 1.000 5.00 4.000 4.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.00

Participants who completed the treatment intervention

df_treatment_completed <- df_wide %>% filter(treatment_completed ==1)

n_treatment_completed <- as.character(nrow(df_treatment_completed))
  • In this section, we will generate the summary statistics for continuous variables for the sample of 5406 participants who completed the treatment intervention.
summary_stats_table_continous(
  df_treatment_completed,  
  variables = continous_var,
  group = NULL,
  filename = "continuous_treatment_completed",
  caption = "Summary Statistics for Continuous Variables Among Participants Completed the Treatment Intervention",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables Among Participants Who Completed the Treatment Intervention
Variable Number_Missing Proportion_Missing Mean SD SE Min Max Decile_10 Decile_20 Decile_30 Decile_40 Decile_50 Decile_60 Decile_70 Decile_80 Decile_90 Decile_95 Decile_98 Decile_99 Decile_100
duration_intro 628 0.131 63.381 870.664 12.596 0.717 31112.45 0.933 1.017 1.083 1.150 1.242 1.350 1.517 1.817 2.533 3.883 24.762 770.198 31112.45
duration_match 1 0.000 23.758 671.671 9.136 0.400 35154.63 0.750 0.850 0.933 1.017 1.133 1.250 1.400 1.633 2.077 2.633 3.865 5.548 35154.63
duration_affirm 0 0.000 6.983 257.466 3.502 0.383 15252.20 0.533 0.583 0.617 0.667 0.717 0.800 0.900 1.050 1.367 1.733 2.515 3.993 15252.20
duration_reveal 0 0.000 16.641 394.072 5.360 0.267 20079.72 0.650 0.700 0.767 0.817 0.883 0.967 1.067 1.217 1.583 2.163 4.398 13.817 20079.72
duration_treatment 0 0.000 95.262 935.402 12.722 0.067 32165.07 1.233 1.700 2.017 2.350 2.800 3.417 4.450 7.600 24.692 135.896 1382.812 1599.863 32165.07
time_since_first_start 627 0.131 22.402 9.137 0.132 0.000 33.00 8.000 14.000 19.000 23.000 25.000 27.000 29.000 31.000 32.000 33.000 33.000 33.000 33.00
important_smart_coded 0 0.000 2.764 0.527 0.007 1.000 3.00 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_responsive_coded 0 0.000 2.629 0.571 0.008 1.000 3.00 2.000 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_forward_looking_coded 0 0.000 2.691 0.524 0.007 1.000 3.00 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
source_coded 0 0.000 0.965 0.184 0.002 0.000 1.00 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00
manipulation_value_coded 3 0.001 4.707 0.693 0.009 1.000 5.00 4.000 4.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.00

Summary Statistics of Categorical Variables

  • For the summary statistics of categorical variables, we generate the statistics using the consent sample: the 18529 participants who consented to the study.

  • The table below shows the frequency and proportion of each category of each variable (out of the sample of 18529 participants who consented to the study).

  • Empty cells represent missing values.

cat_variable <- c("country_charity_coded", "main_cause_coded", "sub_cause_coded", "charity_name_coded", "donor_type_coded", "arm_coded", "manipulation_order_coded", "donate_today_coded", "donate_later_coded", "stay_connected_coded", "share_with_friend_coded")
summary_stats_tab_categorical(
  df_wide,  
  variables = cat_variable,
  group = NULL,
  filename = "donortype",
  caption = "Summary Statistics for Unordered Categorical Variables",
  print_html = TRUE
  )
Summary Statistics for Unordered Categorical Variables
Variable Group Category Frequency Proportion
country_charity_coded NA Global 3062 0.165
country_charity_coded NA US 14409 0.778
country_charity_coded NA NA 1058 0.057
main_cause_coded NA Defend the oppressed and marginalized 2364 0.128
main_cause_coded NA Eradicate hunger and homelessness 5303 0.286
main_cause_coded NA Eradicate poverty worldwide 1000 0.054
main_cause_coded NA Heal the sick 2468 0.133
main_cause_coded NA Protect the animals 2697 0.146
main_cause_coded NA Rescue the environment 1637 0.088
main_cause_coded NA Transform education 1498 0.081
main_cause_coded NA NA 1562 0.084
sub_cause_coded NA a roof over their head 1835 0.099
sub_cause_coded NA access to clean water 220 0.012
sub_cause_coded NA all the WILD furry animals like lions, tigers and bears (oh and koalas) 336 0.018
sub_cause_coded NA animals in factory farms 215 0.012
sub_cause_coded NA any species on the brink of extinction 365 0.020
sub_cause_coded NA cancer: The Big C 947 0.051
sub_cause_coded NA conserving natural spaces and habitats like old rainforests and national parks 618 0.033
sub_cause_coded NA dogs & cats humans’ best friends 1635 0.088
sub_cause_coded NA employment training and opportunities 1658 0.089
sub_cause_coded NA eradicating polio 12 0.001
sub_cause_coded NA food in their belly 238 0.013
sub_cause_coded NA groups targeted by their government 155 0.008
sub_cause_coded NA help all children learn to read 606 0.033
sub_cause_coded NA help teachers fund classroom projects 151 0.008
sub_cause_coded NA hot food in their belly 1740 0.094
sub_cause_coded NA immigrants and refugees fleeing violence 330 0.018
sub_cause_coded NA innovations that further clean energy technology 358 0.019
sub_cause_coded NA LGBTQ+ communities 288 0.016
sub_cause_coded NA massive ocean clean up 288 0.016
sub_cause_coded NA new policies to regulate carbon emissions 318 0.017
sub_cause_coded NA people suffering from preventable diseases because of inadequate healthcare 565 0.030
sub_cause_coded NA people that are victims of human trafficking 110 0.006
sub_cause_coded NA people with disabilities 338 0.018
sub_cause_coded NA people with heart disease 216 0.012
sub_cause_coded NA preventing and curing blindness 46 0.002
sub_cause_coded NA protecting against preventable childhood diseases 184 0.010
sub_cause_coded NA provide tutoring and support to underserved children 391 0.021
sub_cause_coded NA racial or ethnic minorities 320 0.017
sub_cause_coded NA rare diseases that need more research 324 0.017
sub_cause_coded NA refugees fleeing violence or other disasters 367 0.020
sub_cause_coded NA send a low income student to college 132 0.007
sub_cause_coded NA support girls in STEM 134 0.007
sub_cause_coded NA the opportunity to go to school 237 0.013
sub_cause_coded NA the skills to grow food sustainably 292 0.016
sub_cause_coded NA treating the big three: HIV, AIDS, Tuburculosis and Malaria 95 0.005
sub_cause_coded NA whales and ocean animals (Baby Beluga! Nemo!!) 78 0.004
sub_cause_coded NA women 362 0.020
sub_cause_coded NA NA 2025 0.109
charity_name_coded NA American Heart Association 216 0.012
charity_name_coded NA American Society for the Prevention of Cruelty to Animals 1635 0.088
charity_name_coded NA Amnesty International 154 0.008
charity_name_coded NA BRAC 291 0.016
charity_name_coded NA Clean Air Task Force 318 0.017
charity_name_coded NA Concern Worldwide 95 0.005
charity_name_coded NA Dana-Farber Cancer Institute 947 0.051
charity_name_coded NA DonorsChoose 151 0.008
charity_name_coded NA Evidence Action Inc.  183 0.010
charity_name_coded NA Feeding America 1740 0.094
charity_name_coded NA Freedom Network 109 0.006
charity_name_coded NA Girls Inc.  134 0.007
charity_name_coded NA Global Fund for Women 99 0.005
charity_name_coded NA HealthWell Foundation 565 0.030
charity_name_coded NA HIAS 697 0.038
charity_name_coded NA Housing Matters 1834 0.099
charity_name_coded NA Mercy For Animals 214 0.012
charity_name_coded NA National Women’s Law Center 262 0.014
charity_name_coded NA Natural Resources Defense Council 358 0.019
charity_name_coded NA Rainforest Trust 618 0.033
charity_name_coded NA Reading is Fundamental 606 0.033
charity_name_coded NA Rocketship Education 389 0.021
charity_name_coded NA Room to Read 237 0.013
charity_name_coded NA Sightsavers Inc.  46 0.002
charity_name_coded NA Southern Poverty Law Center 320 0.017
charity_name_coded NA Special Olympics 337 0.018
charity_name_coded NA Surfrider Foundation 288 0.016
charity_name_coded NA The Leukemia & Lymphoma Society 323 0.017
charity_name_coded NA The Ocean Foundation 78 0.004
charity_name_coded NA The Rotary Foundation 12 0.001
charity_name_coded NA The Trevor Project 288 0.016
charity_name_coded NA Thurgood Marshall College Fund 131 0.007
charity_name_coded NA WaterAid 219 0.012
charity_name_coded NA Wild Earth Allies 336 0.018
charity_name_coded NA Wildlife SOS 363 0.020
charity_name_coded NA World Food Program 238 0.013
charity_name_coded NA Year Up 1656 0.089
charity_name_coded NA NA 2042 0.110
donor_type_coded NA forward_looking 5125 0.277
donor_type_coded NA responsive 4599 0.248
donor_type_coded NA smart 6105 0.329
donor_type_coded NA unsure 85 0.005
donor_type_coded NA NA 2615 0.141
arm_coded NA control 2161 0.117
arm_coded NA obligation 4344 0.234
arm_coded NA opportunity 4210 0.227
arm_coded NA NA 7814 0.422
manipulation_order_coded NA 1 4063 0.219
manipulation_order_coded NA 2 4167 0.225
manipulation_order_coded NA NA 10299 0.556
donate_today_coded NA 0 6520 0.352
donate_today_coded NA 1 991 0.053
donate_today_coded NA NA 11018 0.595
donate_later_coded NA Do it later 1155 0.062
donate_later_coded NA Explore others 2101 0.113
donate_later_coded NA Learn more 1859 0.100
donate_later_coded NA NA 13414 0.724
stay_connected_coded NA No 3639 0.196
stay_connected_coded NA Yes 2190 0.118
stay_connected_coded NA NA 12700 0.685
share_with_friend_coded NA No 2092 0.113
share_with_friend_coded NA Yes 464 0.025
share_with_friend_coded NA NA 15973 0.862

Charity Match by Treatment Arms

  • This table shows the frequency and proportion of each category of each variable in the consent sample by treatment arm.

  • For instances where the group variable is Missing but the charity category is not, the participant did not receive a treatment assignment. In other words, these are participants who dropped off after the charity match was revealed; a quick count follows below.
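
  • A quick count of these cases (a sketch, assuming missing assignments appear as NA or empty strings):

df_wide %>%
  filter(charity_name_coded != "", is.na(arm_coded) | arm_coded == "") %>%
  nrow()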

cat_variable_arm = c("charity_name_coded")
summary_stats_tab_categorical(
  df_wide,  
  variables = cat_variable_arm,
  group = "arm_coded",
  filename = "donortype",
  caption = "Summary Statistics for Unordered Categorical Variables",
  print_html = TRUE
  )
Summary Statistics for Unordered Categorical Variables by Treatment Arm
Variable Group Category Frequency Proportion
charity_name_coded control American Heart Association 32 0.002
charity_name_coded control American Society for the Prevention of Cruelty to Animals 228 0.012
charity_name_coded control Amnesty International 25 0.001
charity_name_coded control BRAC 42 0.002
charity_name_coded control Clean Air Task Force 45 0.002
charity_name_coded control Concern Worldwide 8 0.000
charity_name_coded control Dana-Farber Cancer Institute 105 0.006
charity_name_coded control DonorsChoose 15 0.001
charity_name_coded control Evidence Action Inc.  25 0.001
charity_name_coded control Feeding America 241 0.013
charity_name_coded control Freedom Network 11 0.001
charity_name_coded control Girls Inc.  16 0.001
charity_name_coded control Global Fund for Women 15 0.001
charity_name_coded control HIAS 82 0.004
charity_name_coded control HealthWell Foundation 68 0.004
charity_name_coded control Housing Matters 255 0.014
charity_name_coded control Mercy For Animals 28 0.002
charity_name_coded control National Women’s Law Center 35 0.002
charity_name_coded control Natural Resources Defense Council 48 0.003
charity_name_coded control Rainforest Trust 93 0.005
charity_name_coded control Reading is Fundamental 66 0.004
charity_name_coded control Rocketship Education 49 0.003
charity_name_coded control Room to Read 41 0.002
charity_name_coded control Sightsavers Inc.  5 0.000
charity_name_coded control Southern Poverty Law Center 37 0.002
charity_name_coded control Special Olympics 47 0.003
charity_name_coded control Surfrider Foundation 38 0.002
charity_name_coded control The Leukemia & Lymphoma Society 35 0.002
charity_name_coded control The Ocean Foundation 10 0.001
charity_name_coded control The Rotary Foundation 3 0.000
charity_name_coded control The Trevor Project 35 0.002
charity_name_coded control Thurgood Marshall College Fund 18 0.001
charity_name_coded control WaterAid 37 0.002
charity_name_coded control Wild Earth Allies 51 0.003
charity_name_coded control Wildlife SOS 49 0.003
charity_name_coded control World Food Program 38 0.002
charity_name_coded control Year Up 185 0.010
charity_name_coded obligation American Heart Association 52 0.003
charity_name_coded obligation American Society for the Prevention of Cruelty to Animals 441 0.024
charity_name_coded obligation Amnesty International 36 0.002
charity_name_coded obligation BRAC 100 0.005
charity_name_coded obligation Clean Air Task Force 76 0.004
charity_name_coded obligation Concern Worldwide 28 0.002
charity_name_coded obligation Dana-Farber Cancer Institute 221 0.012
charity_name_coded obligation DonorsChoose 34 0.002
charity_name_coded obligation Evidence Action Inc.  49 0.003
charity_name_coded obligation Feeding America 433 0.023
charity_name_coded obligation Freedom Network 33 0.002
charity_name_coded obligation Girls Inc.  31 0.002
charity_name_coded obligation Global Fund for Women 22 0.001
charity_name_coded obligation HIAS 185 0.010
charity_name_coded obligation HealthWell Foundation 147 0.008
charity_name_coded obligation Housing Matters 532 0.029
charity_name_coded obligation Mercy For Animals 59 0.003
charity_name_coded obligation National Women’s Law Center 82 0.004
charity_name_coded obligation Natural Resources Defense Council 95 0.005
charity_name_coded obligation Rainforest Trust 172 0.009
charity_name_coded obligation Reading is Fundamental 160 0.009
charity_name_coded obligation Rocketship Education 103 0.006
charity_name_coded obligation Room to Read 75 0.004
charity_name_coded obligation Sightsavers Inc.  14 0.001
charity_name_coded obligation Southern Poverty Law Center 80 0.004
charity_name_coded obligation Special Olympics 85 0.005
charity_name_coded obligation Surfrider Foundation 74 0.004
charity_name_coded obligation The Leukemia & Lymphoma Society 79 0.004
charity_name_coded obligation The Ocean Foundation 23 0.001
charity_name_coded obligation The Rotary Foundation 1 0.000
charity_name_coded obligation The Trevor Project 63 0.003
charity_name_coded obligation Thurgood Marshall College Fund 30 0.002
charity_name_coded obligation WaterAid 62 0.003
charity_name_coded obligation Wild Earth Allies 89 0.005
charity_name_coded obligation Wildlife SOS 101 0.005
charity_name_coded obligation World Food Program 58 0.003
charity_name_coded obligation Year Up 419 0.023
charity_name_coded opportunity American Heart Association 56 0.003
charity_name_coded opportunity American Society for the Prevention of Cruelty to Animals 406 0.022
charity_name_coded opportunity Amnesty International 40 0.002
charity_name_coded opportunity BRAC 69 0.004
charity_name_coded opportunity Clean Air Task Force 88 0.005
charity_name_coded opportunity Concern Worldwide 23 0.001
charity_name_coded opportunity Dana-Farber Cancer Institute 214 0.012
charity_name_coded opportunity DonorsChoose 32 0.002
charity_name_coded opportunity Evidence Action Inc.  53 0.003
charity_name_coded opportunity Feeding America 439 0.024
charity_name_coded opportunity Freedom Network 30 0.002
charity_name_coded opportunity Girls Inc.  25 0.001
charity_name_coded opportunity Global Fund for Women 28 0.002
charity_name_coded opportunity HIAS 183 0.010
charity_name_coded opportunity HealthWell Foundation 157 0.008
charity_name_coded opportunity Housing Matters 545 0.029
charity_name_coded opportunity Mercy For Animals 49 0.003
charity_name_coded opportunity National Women’s Law Center 60 0.003
charity_name_coded opportunity Natural Resources Defense Council 91 0.005
charity_name_coded opportunity Rainforest Trust 149 0.008
charity_name_coded opportunity Reading is Fundamental 151 0.008
charity_name_coded opportunity Rocketship Education 118 0.006
charity_name_coded opportunity Room to Read 48 0.003
charity_name_coded opportunity Sightsavers Inc.  9 0.000
charity_name_coded opportunity Southern Poverty Law Center 94 0.005
charity_name_coded opportunity Special Olympics 69 0.004
charity_name_coded opportunity Surfrider Foundation 80 0.004
charity_name_coded opportunity The Leukemia & Lymphoma Society 87 0.005
charity_name_coded opportunity The Ocean Foundation 18 0.001
charity_name_coded opportunity The Rotary Foundation 3 0.000
charity_name_coded opportunity The Trevor Project 65 0.004
charity_name_coded opportunity Thurgood Marshall College Fund 18 0.001
charity_name_coded opportunity WaterAid 57 0.003
charity_name_coded opportunity Wild Earth Allies 88 0.005
charity_name_coded opportunity Wildlife SOS 98 0.005
charity_name_coded opportunity World Food Program 64 0.003
charity_name_coded opportunity Year Up 406 0.022
charity_name_coded Missing American Heart Association 76 0.004
charity_name_coded Missing American Society for the Prevention of Cruelty to Animals 560 0.030
charity_name_coded Missing Amnesty International 53 0.003
charity_name_coded Missing BRAC 80 0.004
charity_name_coded Missing Clean Air Task Force 109 0.006
charity_name_coded Missing Concern Worldwide 36 0.002
charity_name_coded Missing Dana-Farber Cancer Institute 407 0.022
charity_name_coded Missing DonorsChoose 70 0.004
charity_name_coded Missing Evidence Action Inc.  56 0.003
charity_name_coded Missing Feeding America 627 0.034
charity_name_coded Missing Freedom Network 35 0.002
charity_name_coded Missing Girls Inc.  62 0.003
charity_name_coded Missing Global Fund for Women 34 0.002
charity_name_coded Missing HIAS 247 0.013
charity_name_coded Missing HealthWell Foundation 193 0.010
charity_name_coded Missing Housing Matters 502 0.027
charity_name_coded Missing Mercy For Animals 78 0.004
charity_name_coded Missing National Women’s Law Center 85 0.005
charity_name_coded Missing Natural Resources Defense Council 124 0.007
charity_name_coded Missing Rainforest Trust 204 0.011
charity_name_coded Missing Reading is Fundamental 229 0.012
charity_name_coded Missing Rocketship Education 119 0.006
charity_name_coded Missing Room to Read 73 0.004
charity_name_coded Missing Sightsavers Inc.  18 0.001
charity_name_coded Missing Southern Poverty Law Center 109 0.006
charity_name_coded Missing Special Olympics 136 0.007
charity_name_coded Missing Surfrider Foundation 96 0.005
charity_name_coded Missing The Leukemia & Lymphoma Society 122 0.007
charity_name_coded Missing The Ocean Foundation 27 0.001
charity_name_coded Missing The Rotary Foundation 5 0.000
charity_name_coded Missing The Trevor Project 125 0.007
charity_name_coded Missing Thurgood Marshall College Fund 65 0.004
charity_name_coded Missing WaterAid 63 0.003
charity_name_coded Missing Wild Earth Allies 108 0.006
charity_name_coded Missing Wildlife SOS 115 0.006
charity_name_coded Missing World Food Program 78 0.004
charity_name_coded Missing Year Up 646 0.035
charity_name_coded Missing Missing 2042 0.110

Looking at the number of participants who consented to the chatbot

  • In this section, we analyze the number of people who consented to start the chatbot since the experiment started. We are using the consent sample for this analysis, which has 18529 participants.

  • For each section, we provide two graphs: one for the total count of each response over time, and one for the proportion of each response over time.

Among those who consented

create_histogram_grouped(
  df_wide, 
  x_var = "time_since_first_start", 
  group_var = NULL, 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start"
  )

By source

df_source <- df_wide %>% filter(!is.na(source_coded))  # drop the 18 participants with missing source
n_source <- as.character(nrow(df_source))
  • The graph below shows the count of participants who consented to start the chatbot since the experiment started, by source of entry (the 18 participants with a missing source are excluded).

  • It is noticeable that the main source of entry into the chatbot is the JSON ad.
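
  • A one-line check of this claim (source_coded is 1 for JSON-ad entries):

df_source %>% count(source_coded) %>% mutate(prop = n / sum(n))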

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by source of entry.
create_histogram_grouped(
  df_source, 
  x_var = "time_since_first_start", 
  group_var = "source_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Entered Via JSON Ad", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_source"
  )

By treatment arm

df_arm <-df_wide %>% filter(arm_coded != "") 

n_arm <- as.character(nrow(df_arm))
  • For the two graphs below, we excluded participants who did not receive a treatment assignment from the main sample of 18529 participants. The number of observations in this sample (participants who received a treatment assignment) is 10715.

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by treatment arm.
create_histogram_grouped(
  df_arm, 
  x_var = "time_since_first_start", 
  group_var = "arm_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Treatment Arm", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_arm"
  )

Proportion over time

  • This graph represents the proportion of participants who consented to start the chatbot since the experiment started, by treatment arm.
# group by day and create proportion

df_arm_1 <- df_arm %>% group_by(time_since_first_start, arm_coded) %>% summarise(n = n(), .groups = "drop_last") %>% mutate(arm_coded_proportion = n / sum(n))  # proportions within each day

# Create bar chart
ggplot(df_arm_1, aes(x = time_since_first_start, y = arm_coded_proportion, fill = arm_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Treatment Arm Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

By charity origin

df_country <- df_wide %>% filter(country_charity_coded != "")

n_country <- as.character(nrow(df_country))
  • The two graphs below show the count and proportion of participants who consented to start the chatbot since the experiment started, by charity origin.

  • For these two graphs, we excluded participants who did not select a charity origin from the main sample of 18529 participants. The number of observations in this sample (participants who selected a charity origin) is 17471.

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by charity origin.
create_histogram_grouped(
  df_country, 
  x_var = "time_since_first_start", 
  group_var = "country_charity_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Charity Origin", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_country"
  )

Proportion over time

  • This graph represents the proportion of participants who consented to start the chatbot since the experiment started, by charity origin.
# group by day and create proportion
df_country_1 <- df_country %>% group_by(time_since_first_start, country_charity_coded) %>% summarise(n = n(), .groups = "drop_last") %>% mutate(country_charity_coded_proportion = n / sum(n))  # proportions within each day

# Create bar chart
ggplot(df_country_1, aes(x = time_since_first_start, y = country_charity_coded_proportion, fill = country_charity_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Charity Origin Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

By main charity cause

df_maincause <- df_wide %>% filter(main_cause_coded != "")

n_maincause <- as.character(nrow(df_maincause))
  • For the following graphs, we excluded participants who did not select a main charity cause from the main sample of 18529 participants. The number of observations in this sample (participants who selected a main charity cause) is 16967.

US Choice

df_us <- df_maincause %>% filter(country_charity_coded == "US")

n_us <- as.character(nrow(df_us))
  • In this section, we analyze the number of people who consented to start the chatbot since the experiment started, by main charity cause, for the sample of 13986 participants who selected US as the charity origin.

  • The two graphs below show the count and proportion of participants who consented to start the chatbot since the experiment started, by main charity cause.

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by main charity cause for the US choice.
create_histogram_grouped(
  df_us, 
  x_var = "time_since_first_start", 
  group_var = "main_cause_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Main Charity Cause in US", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_main_cause_us"
  )

Proportion over time

  • This graph represents the proportion of participants who consented to start the chatbot since the experiment started, by main charity cause for the US choice.
# group by day and create proportion
df_us_1 <- df_us %>% group_by(time_since_first_start, main_cause_coded) %>% summarise(n = n(), .groups = "drop_last") %>% mutate(main_cause_coded_proportion = n / sum(n))  # proportions within each day

# Create bar chart
ggplot(df_us_1, aes(x = time_since_first_start, y = main_cause_coded_proportion, fill = main_cause_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Main Charity Cause in US Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Global Choice

df_global <- df_maincause %>% filter(country_charity_coded == "Global")

n_global <- as.character(nrow(df_global))
  • In this section, we analyze the number of people who consented to start the chatbot since the experiment started, by main charity cause, for the sample of 2981 participants who selected Global as the charity origin.

  • The two graphs below show the count and proportion of participants who consented to start the chatbot since the experiment started, by main charity cause.

Total count over time

  • This graph shows the number of participants who consented to start the chatbot since the experiment started, by main charity cause, for the Global sample.
create_histogram_grouped(
  df_global, 
  x_var = "time_since_first_start", 
  group_var = "main_cause_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Main Charity Cause in Global", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_main_cause_global"
  )

Proportion over time

  • This graph shows the proportion of participants who consented to start the chatbot since the experiment started, by main charity cause, for the Global sample.
# group by day and compute within-day proportions
df_global_1 <- df_global %>%
  group_by(time_since_first_start, main_cause_coded) %>%
  summarise(n = n()) %>%
  mutate(main_cause_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_global_1, aes(x = time_since_first_start, y = main_cause_coded_proportion, fill = main_cause_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Main Charity Cause in Global Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Looking at time spent on charity match

  • In this section, we analyze the time participants spent on the charity matching quiz, in minutes. We use the consented sample of 18529 participants for this analysis.

  • We determined the time spent on the charity matching quiz by computing the difference (in minutes) between the start time of the quiz (charitable_match_start) and the end time of the quiz (charitable_match_end). During the cleaning process, we identified instances where this value is negative and removed these observations from the dataset.
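
A minimal sketch of that computation, assuming the two timestamp columns were parsed as POSIXct (the actual code lives in the cleaning script):

# Illustration only - duration_match is created in the cleaning script.
# difftime() on the POSIXct start/end stamps gives the duration in minutes.
duration_check <- df_wide %>%
  mutate(duration_check = as.numeric(
    difftime(charitable_match_end, charitable_match_start, units = "mins"))) %>%
  filter(duration_check >= 0)  # negative durations were dropped during cleaning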

Among those who consented

  • For this plot, we are using a sample of 18529 participants who consented to the study.

  • We noticed extreme right outliers that produce a thick right tail, so the plots below are truncated at 5 minutes.

create_density_plot(
  df_wide, 
  x_var = "duration_match", 
  group_var = NULL, 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Treatment Arm", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

By treatment arm

  • For this plot, we are using a sample of 10715 participants who received a treatment assignment.

  • The density plot below shows the time spent on the charity match by treatment arm. The distribution of time spent is similar across treatment arms.

create_density_plot(
  df_arm, 
  x_var = "duration_match", 
  group_var = "arm_coded", 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Treatment Arm", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

By charity origin

  • For this plot, we are using a sample of 17471 participants who selected a charity origin.
create_density_plot(
  df_country, 
  x_var = "duration_match", 
  group_var = "country_charity_coded", 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Charity Origin", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

By main charity cause

  • For this plot, we are using a sample of 16967 participants who selected a main charity cause.
create_density_plot(
  df_maincause, 
  x_var = "duration_match", 
  group_var = "main_cause_coded", 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Main Charity Cause", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

Looking at Charity Match

  • In this section, we will look at the top 5 and bottom 5 charities that consenting participants matched with over time.

  • We will use the sample of 16487 participants who received a charity match.

Both Global and US

Top 5

# select the top five charities matched each day
df_charity_top <- df_charity %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(5, n) %>%
  ungroup()

## plot
ggplot(df_charity_top, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Top 5 Charities Over Time, US and Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))
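
Note that top_n() is superseded in current dplyr; slice_max() makes the per-day selection and the tie handling explicit. A sketch of the equivalent call:

# Equivalent selection using slice_max(); top_n() is superseded.
# with_ties = FALSE caps each day at exactly five charities (top_n keeps ties).
df_charity_top <- df_charity %>%
  count(time_since_first_start, charity_name_coded) %>%
  group_by(time_since_first_start) %>%
  slice_max(n, n = 5, with_ties = FALSE) %>%
  ungroup()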

Bottom 5

# select the bottom five charities matched each day
df_charity_bottom <- df_charity %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(-5, n) %>%
  ungroup()

## plot
ggplot(df_charity_bottom, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Bottom 5 Charities Over Time, US and Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

US Charities

df_charity_us <- df_charity %>% filter(country_charity_coded == "US")

n_charity_us <- as.character(nrow(df_charity_us))
  • In this section, we will look at the top 5 and bottom 5 charities that consenting participants matched with over time, for the sample of 13596 participants who selected US as the charity origin.

Top 5

df_charity_top <- df_charity_us %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(5, n) %>%
  ungroup()

## plot
ggplot(df_charity_top, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Top 5 Charities Over Time, US", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Bottom 5

df_charity_bottom <- df_charity_us %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(-5, n) %>%
  ungroup()

## plot
ggplot(df_charity_bottom, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Bottom 5 Charities Over Time, US", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Global Charities

df_charity_global <- df_charity %>% filter(country_charity_coded == "Global")

n_charity_global <- as.character(nrow(df_charity_global))
  • In this section, we will look at the top 5 and bottom 5 charities that consenting participants matched with over time, for the sample of 2891 participants who selected Global as the charity origin.

Top 5

df_charity_top <- df_charity_global %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(5, n) %>%
  ungroup()

## plot
ggplot(df_charity_top, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Top 5 Charities Over Time, Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Bottom 5

df_charity_bottom <- df_charity_global %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(-5, n) %>%
  ungroup()

## plot
ggplot(df_charity_bottom, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Bottom 5 Charities Over Time, Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Looking at donation intention over time

  • In this section, we will look at the donation intention over time.

Donation Intention Today

df_donate_today <- df_wide %>%
  filter(donate_today_coded != "") %>%
  mutate(donate_today_coded_factor = as.factor(donate_today_coded)) %>%
  mutate(donate_today_coded_factor = fct_recode(donate_today_coded_factor, "Yes" = "1", "No" = "0"))

n_donate_today <- as.character(nrow(df_donate_today))
  • In this section, we looked at participants who intended to donate today over time.

  • We are using a sample of 7511 participants who answered whether they intended to donate today, from the main sample of 18529 participants who consented to the study.

  • The next two graphs show the count and proportion of participants who intended to donate today over time.

Total count over time

  • This graph represents the number of participants who intended to donate today over time.
create_histogram_grouped(
  df_donate_today, 
  x_var = "time_since_first_start", 
  group_var = "donate_today_coded_factor", 
  bin_width = 1, 
  title = "Donation Today by Time Since the Experiment Started", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_donate_today"
  )

Proportion over time

  • This graph represents the proportion of participants who intended to donate today over time.
# group by day and compute within-day proportions
df_donate_today_1 <- df_donate_today %>%
  group_by(time_since_first_start, donate_today_coded_factor) %>%
  summarise(n = n()) %>%
  mutate(donate_today_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_donate_today_1, aes(x = time_since_first_start, y = donate_today_coded_proportion, fill = donate_today_coded_factor)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Donation Today Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

Choose to donate today, by treatment arm over time

df_donate_today_yes <- df_wide %>% filter(donate_today_coded == 1)
n_donate_today_yes <- as.character(nrow(df_donate_today_yes))
  • In this section, we looked at participants who intended to donate today by treatment arm over time.

  • We are using a sample of 991 participants who intended to donate today from the main sample of 18529 participants who consented to the study.

create_histogram_grouped(
  df_donate_today_yes, 
  x_var = "time_since_first_start", 
  group_var = "arm_coded", 
  bin_width = 1, 
  title = "Donation Today by Time Since the Experiment Started and Treatment Arm", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_yesdonate_today_arm"
  )

Choose to donate today, by charity origin over time

  • In this section, we looked at participants who intended to donate today by charity origin over time.

  • We are using a sample of 991 participants who intended to donate today from the main sample of 18529 participants who consented to the study.

create_histogram_grouped(
  df_donate_today_yes, 
  x_var = "time_since_first_start", 
  group_var = "country_charity_coded", 
  bin_width = 1, 
  title = "Donate Today by Time Since the Experiment Started and Charity Origin", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_yesdonate_today_country"
  )

Choose to donate today, by Charity Cause over time

  • In this section, we looked at participants who intended to donate today by main charity cause over time.

  • We are using a sample of 991 participants who intended to donate today from the main sample of 18529 participants who consented to the study.

create_histogram_grouped(
  df_donate_today_yes, 
  x_var = "time_since_first_start", 
  group_var = "main_cause_coded", 
  bin_width = 1, 
  title = "Donation Today by Time Since the Experiment Started and Main Charity Cause", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_yesdonate_today_main_cause"
  )

Not donate today

  • If participants declined to donate today, they were asked to choose one of three options: donate later, explore other charities, or learn more about the matched charity.

  • We will show the distribution of these choices over time since the experiment started.

Choose other options over time

df_donate_later <- df_wide %>% filter(donate_later_coded != "")

n_donate_later <- as.character(nrow(df_donate_later))
  • In this section, we looked at participants who chose one of the three options (donate later, explore others, or learn more about the charity) over time. This sample contains 5115 participants from the main sample of 18529 participants who consented to the study.

  • The next two graphs below show the count and proportion of participants who responded to one of the three options over time.

Total count over time

  • This graph represents the number of participants who responded to one of the three options over time.
create_histogram_grouped(
  df_donate_later, 
  x_var = "time_since_first_start", 
  group_var = "donate_later_coded", 
  bin_width = 1, 
  title = "Donate Later Choices by Time Since the Experiment Started", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_donate_later"
  )

Proportion over time

  • This graph represents the proportion of participants who responded to one of the three options over time.
# group by day and compute within-day proportions
df_donate_later_1 <- df_donate_later %>%
  group_by(time_since_first_start, donate_later_coded) %>%
  summarise(n = n()) %>%
  mutate(donate_later_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_donate_later_1, aes(x = time_since_first_start, y = donate_later_coded_proportion, fill = donate_later_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Donate Later Choices Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

Looking at staying connected over time

df_share <- df_wide %>% filter(stay_connected_coded != "")

n_share <- as.character(nrow(df_share))
  • In this section, we will look at the proportion of participants who stayed connected over time.

  • We are using a sample of 5829 participants who answered whether they were willing to stay connected, from the main sample of 18529 participants who consented to the study.

  • The next two graphs below show the count and proportion of participants who stayed connected over time.

Total count over time

  • This graph represents the number of participants who stayed connected over time.
create_histogram_grouped(
  df_share, 
  x_var = "time_since_first_start", 
  group_var = "stay_connected_coded", 
  bin_width = 1, 
  title = "Staying connected over time", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "share_with_friend"
  )

Proportion over time

  • This graph represents the proportion of participants who stayed connected over time.
# group by day and compute within-day proportions
df_share_1 <- df_share %>%
  group_by(time_since_first_start, stay_connected_coded) %>%
  summarise(n = n()) %>%
  mutate(stay_connected_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_share_1, aes(x = time_since_first_start, y = stay_connected_coded_proportion, fill = stay_connected_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Staying Connected Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 10),
        legend.key.size = unit(0.2, 'cm'))

Correlation Matrix

  • In this section, we will look at the correlation matrix of the following variables for the main sample of 18529 participants who consented to the study: source entered, donor type, country (U.S.), main cause (all), subcause (all).
df_correlation <- df_wide %>%
  select(source_coded, donor_type_coded, country_charity_coded, main_cause_coded, sub_cause_coded) %>%
  filter(source_coded != "", donor_type_coded != "", country_charity_coded != "",
         main_cause_coded != "", sub_cause_coded != "")

df_correlation <- dummy_cols(df_correlation,
                             select_columns = c("source_coded", "donor_type_coded", "country_charity_coded",
                                                "main_cause_coded", "sub_cause_coded")) %>%
  select(-source_coded, -donor_type_coded, -country_charity_coded, -main_cause_coded, -sub_cause_coded)
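
As an aside, fastDummies::dummy_cols() can drop the source columns itself via remove_selected_columns, collapsing the two steps above into one call (a sketch, not run in addition to the code above):

# Alternative to the two steps above: dummy-code and drop the
# original character columns in a single call.
df_correlation <- dummy_cols(
  df_correlation,
  select_columns = c("source_coded", "donor_type_coded", "country_charity_coded",
                     "main_cause_coded", "sub_cause_coded"),
  remove_selected_columns = TRUE
)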

# strip the variable-name prefixes from the dummy column names

names(df_correlation) <- gsub("source_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("donor_type_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("country_charity_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("main_cause_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("sub_cause_coded_", "", names(df_correlation))
names(df_correlation)[names(df_correlation) == "1"] <- "JSON Ad"
names(df_correlation)[names(df_correlation) == "0"] <- "Other sources of entrance"




correlation_matrix <- cor(df_correlation, use = "pairwise.complete.obs") 

col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
colors <- col(200)
annotation_colors <- c("white", "black")

# Set up a larger plotting window
options(repr.plot.width=50, repr.plot.height=50)

# Generate the correlation plot with adjusted parameters
corrplot(correlation_matrix,
         method = "color", 
         col = colors,
         type = "upper",
         addCoef.col = "black",
         tl.col = "black",
         tl.srt = 90,
         diag = FALSE,
         bg = "white",
         addrect = 2,
         rect.col = "gray",
         cl.pos = "n",
         number.cex  = 0.15,  # Adjust the font size for correlation values
         tl.cex = 0.3,      # Adjust the font size for variable names
         addCoefasPercent = TRUE,  # show coefficients as percentages
         p.mat = NULL,
         sig.level = 0.01,
         insig = "blank",
         pch.col = "black",
         pch.cex = 0.8,
         col.cor = "black",
         mar = c(0,0,1,0))
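
Because every column in df_correlation is a 0/1 dummy, these Pearson correlations are phi coefficients, and mutually exclusive dummies (e.g., US and Global) are mechanically negatively correlated. A quick self-contained check of the equivalence (toy vectors, not study data):

# Toy check: for two 0/1 dummies, Pearson correlation equals the phi
# coefficient of their 2x2 contingency table.
x <- c(1, 0, 1, 1, 0)
y <- c(1, 0, 0, 1, 0)
tab <- table(x, y)  # rows = x (0, 1), columns = y (0, 1)
phi <- (tab[2, 2] * tab[1, 1] - tab[2, 1] * tab[1, 2]) /
  sqrt(sum(tab[2, ]) * sum(tab[1, ]) * sum(tab[, 2]) * sum(tab[, 1]))
all.equal(unname(phi), cor(x, y))  # TRUE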

Correlation table

  • The table below shows the correlation matrix of the following variables: source entered, donor type, country (U.S.), main cause (all), subcause (all)
correlation_matrix %>% kable(digits = 3) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "1000px")
[Correlation table: a 52 × 52 matrix of pairwise correlations among the dummy variables (entrance source, donor type, US/Global, main causes, and subcauses), rendered as a scrollable kable. Mechanically paired dummies (US/Global; JSON Ad/Other sources of entrance) correlate at -1, and each subcause correlates most strongly with its parent main cause, e.g., "dogs & cats humans' best friends" with Protect the animals at 0.764.]

Analyzing treatment text responses

df_treatment_text <- df_wide %>% select(treatment_text, text_has_link, arm_coded) %>% filter(treatment_text != "") %>% filter(text_has_link == 0)

n_treatment_text <- as.character(nrow(df_treatment_text))
  • In this section, we will do some analysis with the text responses from the participants.

  • First, as mentioned in the cleaning script, we detected some abnormal text strings that contain a link. We suspect these participants responded with interactive images or emojis. We created a variable, text_has_link, to flag whether a text response contains a link (a sketch of how such a flag can be constructed follows this list). For this analysis, we drop missing text responses and responses that contain a link.

  • We have a total of 5854 participants who provided text responses without a link from the main sample of 18529 participants who consented to the study.

  • These responses are exported to a spreadsheet, named treatment_text.csv, in the Data/Processed folder.
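
A minimal sketch of the flag construction; the actual variable comes from the cleaning script, and the URL pattern here is an assumption:

# Sketch: flag responses containing a URL (pattern is an assumption;
# the real text_has_link variable is created in the cleaning script).
df_wide <- df_wide %>%
  mutate(text_has_link = as.integer(grepl("https?://|www\\.", treatment_text)))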

df_text <- df_treatment_text %>% select(treatment_text) 

write.csv(df_text, "./Data/Processed/treatment_text.csv")
  • The table below shows 10 randomly sampled responses from each treatment arm.
df_treatment_text %>% select(arm_coded, treatment_text) %>% group_by(arm_coded) %>% sample_n(10) %>% kable(digits = 3, col.names = c("Treatment Group", "Text Response")) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "500px")
Treatment Group Text Response
obligation I would tell them where to go get help
obligation Great
obligation Bad
obligation Frustrated and sad.
obligation Sad, horrible, and sad
obligation Good.
obligation no
obligation Stop
obligation It would make me feel like, this isn’t what God would want from me
obligation Ineffective
opportunity VERY PROUD
opportunity Wat response
opportunity Exit
opportunity Glad for their safety out of the elements
opportunity well, I’m 70 years old and have no grandchildren
opportunity happy
opportunity pleased
opportunity I would be filled with joy
opportunity I’d be happy and touched
opportunity It would feel great

Word cloud

  • In this section, we will create a word cloud for the text responses from the participants.

Both Obligation and Opportunity Groups

  • The top 30 words in the text responses from both the obligation and opportunity groups are shown in the word cloud below.
# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_treatment_text$treatment_text))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
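
The same pipeline is repeated for each arm below; a hypothetical helper (not part of the original script) could wrap it, taking just the character vector of responses:

# Hypothetical helper wrapping the pipeline above; `texts` is a character vector.
plot_top_words <- function(texts, n_words = 30) {
  corpus <- Corpus(VectorSource(texts))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeNumbers)
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  text <- paste(sapply(corpus, as.character), collapse = " ")
  word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
  word_freqs <- word_freqs[names(word_freqs) != ""]
  wordcloud(words = names(word_freqs)[1:n_words], freq = word_freqs[1:n_words],
            min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
}

# e.g. plot_top_words(df_opportunity_text$treatment_text)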

Opportunity

  • The top 30 words in the text responses from opportunity arm are shown in the word cloud below.
df_opportunity_text <- df_treatment_text %>% filter(arm_coded == "opportunity")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_opportunity_text$treatment_text))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Obligation

  • The top 30 words in the text responses from obligation arm are shown in the word cloud below.
df_obligation_text <- df_treatment_text %>% filter(arm_coded == "obligation")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_obligation_text$treatment_text))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Sentiment analysis

  • In this section, we will conduct a sentiment analysis on the text responses from the participants.

  • The AFINN lexicon assigns each word an integer score from -5 (most negative) to +5 (most positive). Negative scores indicate negative sentiment, positive scores indicate positive sentiment, and scores close to zero suggest neutral sentiment.

  • In our analysis, we observe that the average sentiment score for the obligation arm is negative, while the average sentiment score for the opportunity arm is positive.

# Tokenize the text
tokens <- df_treatment_text %>% select(arm_coded, treatment_text) %>%
  unnest_tokens(word, treatment_text)

# Get sentiment lexicon (afinn)
sentiments <- get_sentiments("afinn")

# Join the tokens with sentiment scores
sentiment_scores <- tokens %>%
  inner_join(sentiments, by = "word")

# Aggregate word-level sentiment scores by treatment arm
sentiment_scores <- sentiment_scores %>%
  group_by(arm_coded) %>%
  summarise(sentiment_score = mean(value, na.rm = TRUE),
            sum_score = sum(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            n = n())

# Display sentiment scores for each group
sentiment_scores %>% kable(digits = 3, col.names = c("Treatment Group", "Average Sentiment Score", "Sum Sentiment Score", "SD", "N")) |>
  kable_styling(bootstrap_options = c("striped", "hover"))
Treatment Group Average Sentiment Score Sum Sentiment Score SD N
obligation -0.159 -614 2.212 3870
opportunity 1.762 6691 1.672 3797
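
Note that these averages are taken over words, so longer responses carry more weight. A sketch of a per-response alternative, where each response is scored first and responses are then averaged within arm (response_id is constructed here and is not a column in the data):

# Per-response sentiment: score each response first, then average within arm.
response_scores <- df_treatment_text %>%
  mutate(response_id = row_number()) %>%   # constructed id; not in the data
  unnest_tokens(word, treatment_text) %>%
  inner_join(get_sentiments("afinn"), by = "word") %>%
  group_by(arm_coded, response_id) %>%
  summarise(response_score = mean(value), .groups = "drop") %>%
  group_by(arm_coded) %>%
  summarise(mean_response_score = mean(response_score), n_responses = n())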

Feedback text response

df_feedback_text <- df_wide %>% select(feedback_match_coded, feedback_has_link, arm_coded) %>% filter(feedback_match_coded != "") %>% filter(feedback_has_link == 0)

n_feedback_text <- as.character(nrow(df_feedback_text))
  • In this section, we will do some analysis with the feedback responses from the participants.

  • First, as mentioned in the cleaning script, we detected some abnormal text strings that contain a link. We suspect these participants responded with interactive images or emojis. We created a variable, feedback_has_link, to flag whether a feedback response contains a link. For this analysis, we drop missing feedback responses and responses that contain a link.

  • We have a total of 2510 participants who provided feedback responses without a link from the main sample of 18529 participants who consented to the study.

  • These responses are exported to a spreadsheet, named feedback_text.csv, in the Data/Processed folder.

df_feedback <- df_feedback_text  %>% select(feedback_match_coded) 

write.csv(df_feedback, "./Data/Processed/feedback_text.csv")
  • The table below shows 10 randomly sampled responses from each treatment arm.
df_feedback_text %>% select(arm_coded, feedback_match_coded) %>% group_by(arm_coded) %>% sample_n(10) %>% kable(digits = 3, col.names = c("Treatment Group", "Feedback")) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "500px")
Treatment Group Feedback
control I don’t think it’s for me
control I like it and agree with you
control its great
control I’ll think about it
control I’m sold on St Jude Children Research Hospital
control I’m still researching
control You did good! I’m very unsure these days as to what is real and not real
control not what I wanted
control Actually, I’d like to consider domestic charities as well.
control THOUGHT IT WAS GREAT
obligation I think it is interesting and accurate. However, I do not want to be inundated with requests for more donations; I do not want “gifts”.
obligation Good Match
obligation Yes
obligation I need more alternatives, specifically locally. Most of the suggestions were west coast.
obligation meh
obligation Not a match.
obligation It was not my choice
obligation I want a redo
obligation I prefer bunny charities.
obligation It was perfect
opportunity It’s not for me. Rotary International is my chosen charity.
opportunity Very nice.
opportunity Nothing!!
opportunity ok
opportunity I liked if so I donated to it.
opportunity Don’t know yet , I feel pretty much like you do
opportunity It was fun chatting with you this was my first time doing it
opportunity A good organization, but not a no kill facility as far as I know. Those I donate to are.
opportunity I think it is a good match - thank you!
opportunity It was good

Word cloud

  • In this section, we will create a word cloud for the feedback responses from the participants.

All Groups

  • The top 30 words in the feedback responses from all three arms (control, obligation, and opportunity) are shown in the word cloud below.
# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_feedback_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Control

  • The top 30 words in the feedback responses from the control arm are shown in the word cloud below.
df_control_text <- df_feedback_text %>% filter(arm_coded == "control")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_control_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Opportunity

  • The top 30 words in the feedback responses from the opportunity arm are shown in the word cloud below.
df_opportunity_text <- df_feedback_text %>% filter(arm_coded == "opportunity")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_opportunity_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Obligation

  • The top 30 words in the feedback responses from the obligation arm are shown in the word cloud below.
df_obligation_text <- df_feedback_text %>% filter(arm_coded == "obligation")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_obligation_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Sentiment analysis

  • In this section, we will conduct a sentiment analysis on the feedback responses from the participants.

  • The AFINN lexicon assigns each word an integer score from -5 (most negative) to +5 (most positive). Negative scores indicate negative sentiment, positive scores indicate positive sentiment, and scores close to zero suggest neutral sentiment.

  • In our analysis, we observe that the average sentiment score is highest for the opportunity arm and lowest for the obligation arm.

# Tokenize the text
tokens <- df_feedback_text %>% select(arm_coded, feedback_match_coded) %>%
  unnest_tokens(word, feedback_match_coded)

# Get sentiment lexicon (afinn)
sentiments <- get_sentiments("afinn")

# Join the tokens with sentiment scores
sentiment_scores <- tokens %>%
  inner_join(sentiments, by = "word")

# Aggregate word-level sentiment scores by treatment arm
sentiment_scores <- sentiment_scores %>%
  group_by(arm_coded) %>%
  summarise(sentiment_score = mean(value, na.rm = TRUE),
            sum_score = sum(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            n = n())

# Display sentiment scores for each group
sentiment_scores %>% kable(digits = 3, col.names = c("Treatment Group", "Average Sentiment Score", "Sum Sentiment Score", "SD", "N")) |>
  kable_styling(bootstrap_options = c("striped", "hover"))
Treatment Group Average Sentiment Score Sum Sentiment Score SD N
control 1.613 1210 1.654 750
obligation 1.599 1861 1.691 1164
opportunity 1.670 2123 1.684 1271