Set up

Loading R libraries

packages = c(
  "tidyverse", "data.table", "dtplyr", "rlang", "kableExtra", "haven", "ggcorrplot", "visdat", "VIM", "corrplot", "fastDummies", "tm", "wordcloud", "tidytext"
)

# Load the packages, install if necessary
new_packages = packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages, dependencies = TRUE)
lapply(packages, require, character.only = TRUE) |> invisible()

Goal

  • This script starts by loading the cleaned data set generated by the main survey cleaning script.

  • The current working directory is ~FB_Charitable_Giving/Code/SurveyDataAnalysis/

  • We then filter out participants who did not consent and generate a series of summary statistics for all categorical and quantitative variables.

  • For categorical variables, we report the frequency and proportion for each choice of each variable. We then show these statistics by treatment arm.

  • For continuous variables, we show the total number of missing values, the proportion of missing values, the mean, standard deviation, standard error, min, max, and decile values. We then report these statistics for each variable by treatment arm.

  • Next, we look at the number of people who entered the chatbot, the charity matching results, and donation intentions over time.

Summary

  • About 90% of people spend 5 minutes or less in the chatbot to get their match.

  • Our entries into the chatbot started increasing dramatically in the last 2 weeks of December. That means we will have a difficult time separating out selection over time and selection due to expanded recruiting efforts (i.e., spending more ad money).

  • 78% of people chose the U.S. over Global, although users were less likely to choose the U.S. over time (see prior point for difficulty interpreting).

  • The amount of time spent on different charity matches varies. This suggests people were actually reading: the delay between messages (chatbot “pauses”) did not vary across charities, but the length of the text for each charity did vary somewhat.

  • 15% of people who answered the donation question reported wanting to make a donation immediately, and this proportion was fairly constant over time, aside from a few unusually small or large days.

  • People who did not want to donate were split roughly 40/40/20 across Explore Other Charities / Learn More about Matched Charity / Donate Later, and these proportions were fairly constant over time (a sketch reproducing these splits follows this list).

  • About 20% of users who were asked the question wanted to share the chatbot quiz with their friends.

  • About 35% of users who were asked the question wanted to be recontacted, and this proportion may have dipped some in the last few days.
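
  • Two of the splits above can be reproduced directly from the consent sample (df_wide, defined in the Sample definition section below); a minimal sketch, assuming missing responses are coded as NA:

sum(df_wide$country_charity_coded == "US", na.rm = TRUE) / nrow(df_wide)  # ~0.78, the U.S. share in the bullet above
df_wide %>%
  filter(!is.na(donate_later_coded)) %>%
  count(donate_later_coded) %>%
  mutate(prop = n / sum(n))  # split among people who did not want to donate today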

Data

  • The data used in this script was processed by the main survey cleaning script.

  • The data we are using is located at ~FB_Charitable_Giving/Data/Processed.

  • The output tables, figures, and spreadsheet generated by this script can be found at ~FB_Charitable_Giving/Data/Processed.

Loading the Data

data <- readRDS("Data/Processed/charitable_clean_wide.rds")

a <- as.character(nrow(data))  # total observations in the cleaned data, quoted inline below

Sample definition

df_wide <- data |> filter(consent_coded_num == 1)
n_consent <- as.character(nrow(df_wide))
  • The dataset contains all participants who passed the high-quality data checks in the cleaning script. The data currently has 46419 observations.

  • In this script, we restrict the sample to participants who consented to start the chatbot. The number of observations in this sample is 18529. This is the main sample we will use for the analysis.

  • During the analysis, we’ll limit participants to specific subgroups determined by certain conditions. For instance, we’ll include participants who were assigned a treatment, those who chose a charity origin, and those who selected a main charity cause (see the preview below). We’ll provide the count of observations from our main sample for each subsample, with the exact number specified within each section of this script.
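
  • As a preview, the subsample filters used later in this script look like this (counts are reported in each section):

df_arm       <- df_wide %>% filter(arm_coded != "")              # received a treatment assignment
df_country   <- df_wide %>% filter(country_charity_coded != "")  # chose a charity origin
df_maincause <- df_wide %>% filter(main_cause_coded != "")       # selected a main charity cause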

Key Decisions Made During the Cleaning

  • Key decisions regarding filtering and record keeping made during the cleaning include:
    1. We identified participants who chose to repeat the charity matching quiz and manually recorded the Chatfuel responses for both the first and second attempts. For the analysis, we use the responses from the first attempt.
    2. We pinpointed participants displaying atypical response behavior, such as skipping entire text blocks yet proceeding to subsequent sections (additional specifics forthcoming).
    3. We documented discrepancies between the expected and revealed charity names disclosed to participants. We noticed that cases with missing entries in the actual charity name field, but not in the expected charity name field, predominantly occur on December 28, 2023. This follows an update on December 27, 2023, to the spreadsheet with charity details, during which a new column was added without a name. Consequently, when App Engine updated its instances on December 28, 2023, it led to a disruption in data availability. This problem was identified and resolved by 10 AM on the same day (December 28). More detail on this issue can be found here.
    4. We also checked whether participants who were able to proceed to the treatment assignment phase had any missing values in the country_charity_coded, main_cause_coded, and sub_cause_coded variables. We found that some participants had missing values in these variables. We generated a series of intermediate csv files to record abnormal entries and will check these responses manually (a sketch of this check follows this list).
    5. WORK IN PROGRESS
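
  • A minimal sketch of the check in point 4, assuming missing entries in the coded variables appear as NA or empty strings:

# Count missing coded values among participants who reached treatment assignment
df_wide %>%
  filter(arm_coded != "") %>%
  summarise(
    missing_country    = sum(is.na(country_charity_coded) | country_charity_coded == ""),
    missing_main_cause = sum(is.na(main_cause_coded) | main_cause_coded == ""),
    missing_sub_cause  = sum(is.na(sub_cause_coded) | sub_cause_coded == "")
  )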

Structure of Charity Matching Quiz

  • For each main cause, participants were asked to select a sub-cause. The sub-causes are described as follows:

  • If country_choice_num is US:

    • The sub-causes for Transform education are:
      1. help all children learn to read
      2. send a low income student to college
      3. support girls in STEM
      4. help teachers fund classroom projects
      5. provide tutoring and support to underserved children
    • The sub-causes for Eradicate hunger and homelessness are:
      1. a roof over their head
      2. hot food in their belly
      3. employment training and opportunities
    • The sub-causes for Defend the oppressed and marginalized are:
      1. immigrants and refugees fleeing violence
      2. LGBTQ+ communities
      3. racial or ethnic minorities
      4. women
      5. people with disabilities
    • The sub-causes for Rescue the environment are:
      1. innovations that further clean energy technology
      2. massive ocean clean up
      3. conserving natural spaces and habitats like old rainforests and national parks
      4. new policies to regulate carbon emissions
    • The sub-causes for Heal the sick are:
      1. people suffering from preventable diseases because of inadequate healthcare
      2. rare diseases that need more research
      3. cancer: The Big C
      4. people with heart disease
    • The sub-causes for Protect the animals are:
      1. dogs & cats humans’ best friends
      2. all the WILD furry animals like lions, tigers and bears (oh and koalas)
      3. any species on the brink of extinction
      4. animals in factory farms
      5. whales and ocean animals (Baby Beluga! Nemo!!)
  • If country_choice_num is Global:

    • The sub-causes for Eradicate poverty worldwide are:
      1. the opportunity to go to school
      2. food in their belly
      3. the skills to grow food sustainably
      4. access to clean water
    • The sub-causes for Defend the oppressed and marginalized are:
      1. refugees fleeing violence or other disasters
      2. people that are victims of human trafficking
      3. women
      4. groups targeted by their government
    • The sub-causes for Rescue the environment are:
      1. innovations that further clean energy technology
      2. massive ocean clean up
      3. conserving natural spaces and habitats like old rainforests and national parks
      4. new policies to regulate carbon emissions
    • The sub-causes for Heal the sick are:
      1. protecting against preventable childhood diseases
      2. treating the big three: HIV, AIDS, Tuberculosis and Malaria
      3. eradicating polio
      4. preventing and curing blindness
    • The sub-causes for Protect the animals are:
      1. all the WILD furry animals like lions, tigers and bears (oh and koalas)
      2. any species on the brink of extinction
      3. whales and ocean animals (Baby Beluga! Nemo!!)

Data Dictionary

  • The variable names and their descriptions are shown in the table below. Variable names are sorted in alphabetical order.
library(Hmisc)
# Build the dictionary from the variable labels attached to df_wide
dictionary <- label(df_wide) %>% data.frame()
dictionary <- dictionary %>% 
  mutate(variable_names = rownames(dictionary)) %>%
  select(variable_names, '.')

rownames(dictionary) <- NULL
dictionary %>% arrange(dictionary[,1]) %>% kable(digits = 3, col.names = c("Variable Name", "Description")) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "500px")
Variable Name Description
analytic_id Unique identifier for each participant
arm_coded Treatment assignment of the participant
charitable_affirm_end Time stamp when participants finished answering the affirmation questions
charitable_affirm_start Time stamp when participants started answering the affirmation questions
charitable_intro_end Time stamp after participant saw the statement ‘Awesome! Let’s find your charity soulmate’
charitable_intro_start_time Time stamp when participant entered the chatbot
charitable_match_end Time stamp when participants finished providing their responses for the charity matching quiz
charitable_match_start Time stamp when participants began the charity matching quiz
charitable_reveal_end Time stamp at the end of charity reveal
charitable_reveal_start Time stamp when participants started seeing the charity reveal
charitable_treatment_end Time stamp at the end of the intervention
charitable_treatment_start_time Time stamp when participants got assigned a treatment group
charity_mismatch_type_1 1 if the expected charity name is not missing but the revealed charity name is missing
charity_mismatch_type_2 1 if the expected charity name does not match the revealed charity name
charity_name_coded Charity name revealed to the participant
consent_coded_num Binary flag for whether the participant gave consent to the study
country_charity_coded Country of charity (US or Global) selected by the participant
donate_later_coded Variable to indicate the donation intention of the participant to donate later
donate_today_coded Binary variable to indicate the donation intention of the participant to donate today
donor_type_coded Donor type assigned to the participant
duration_affirm Duration of the affirmation questions in minutes
duration_intro Duration of the introduction stage in minutes
duration_match Duration of the charity matching quiz in minutes
duration_reveal Duration of the charity reveal in minutes
duration_treatment Duration of the intervention in minutes
feedback_has_link Binary variable to indicate if the feedback contains a hyperlink
feedback_match_coded Feedback to charity match
greeting_coded Binary flag for whether the participant passed the greeting stage
has_logo_coded Binary flag for whether the charity has a logo
important_forward_looking_coded Score for the importance of forward-looking giving
important_responsive_coded Score for the importance of responsive giving
important_smart_coded Score for the importance of smart giving
main_cause_coded Main cause of charity selected by the participant
manipulation_order_coded Coded version of the manipulation order variable
manipulation_value_coded Coded version of the manipulation question value variable
not_affil_coded Binary flag for whether the participant passed the not affiliated stage
pre_consent_coded Binary flag for whether the participant passed the pre-consent stage
proper_order Flag for participants who have a time sequence that is in the correct order
repeat_quiz_coded Binary flag for whether the participant repeated the charity matching quiz
share_with_friend_coded Variable to indicate if the participant would like to share the link with friends
source_coded 1 if entered via JSON AD, 0 otherwise
stay_connected_coded Variable to indicate if the participant would like to stay connected with the chatbot
sub_cause_coded Sub cause selected by the participant
text_has_link Binary variable to indicate if the response contains a hyperlink
time_since_consent Days since the experiment began when the participant consented
time_since_first_start Days since the experiment began when the participant first started the chatbot
time_since_intervention Days since the experiment began when the participant started the intervention
time_since_match_start Days since the experiment began when the participant started the charity matching quiz
time_since_reveal Days since the experiment began when the participant started the charity reveal
treatment_completed Binary variable to indicate if participants have completed the treatment intervention
treatment_text Free text responses from participants in the opportunity and obligation arm

Functions

Functions for continuous variables

  • We created a function that takes each variable and, optionally, a grouping categorical variable, and calculates the summary statistics for each group.

  • The summary statistics include the number of missing values, the mean, the standard deviation, the standard error, the total number of observations, the min, the max, and decile values.

  • The function combines the summary statistics for each selected variable and creates an HTML table.

  • The function also generates a csv file, which is saved in the GitHub directory ./Data/Processed/.

summary_stats_continous <- function(dt, variable, group = NULL){
  # Get descriptive statistics for a single continuous variable
  variable_name = variable 
  variable = sym(variable)

  if(!is.null(group)){
    group_name = group
    group = sym(group)
    dt <- dt |> group_by({{group}})
  } # grouping if group variable is selected

  tab = dt |> lazy_dt() |>
    mutate(
      `Number_Missing` = sum(is.na({{variable}})),  # calculate missing values
    ) |>
    filter(!is.na({{variable}})) |>  # filtering out the missing values for the variable
    summarise(
        Total_Observations = n(),
        Number_Missing = first(`Number_Missing`),
        Mean = as.numeric(mean({{variable}})),
        SD = as.numeric(sd({{variable}})),
        SE = as.numeric(sd({{variable}}) / sqrt(n())),
        Min = as.numeric(min({{variable}})),
        Max =  as.numeric(max({{variable}})),
        Decile_10 = as.numeric(quantile({{variable}}, probs = 0.1)),
        Decile_20 = as.numeric(quantile({{variable}}, probs = 0.2)),
        Decile_30 = as.numeric(quantile({{variable}}, probs = 0.3)),
        Decile_40 = as.numeric(quantile({{variable}}, probs = 0.4)),
        Decile_50 = as.numeric(quantile({{variable}}, probs = 0.5)),
        Decile_60 = as.numeric(quantile({{variable}}, probs = 0.6)),
        Decile_70 = as.numeric(quantile({{variable}}, probs = 0.7)),
        Decile_80 = as.numeric(quantile({{variable}}, probs = 0.8)),
        Decile_90 = as.numeric(quantile({{variable}}, probs = 0.9)),
        Decile_95 = as.numeric(quantile({{variable}}, probs = 0.95)),
        Decile_98 = as.numeric(quantile({{variable}}, probs = 0.98)),
        Decile_99 = as.numeric(quantile({{variable}}, probs = 0.99)),
        Decile_100 = as.numeric(quantile({{variable}}, probs = 1))
      ) |> # compute summary statistics 
      mutate(
        Proportion_Missing = Number_Missing / (Number_Missing + Total_Observations),
        Variable = variable_name
      ) |> # proportion of missing values relative to the full (sub)sample, so it stays in [0, 1]
      collect()

  if (is.null(group)){
    tab <- tab |> select(-Total_Observations) |> select(Variable, Number_Missing, Proportion_Missing, everything())
  } else {
    tab <- tab |> mutate(Group = paste(group_name, ": ", {{group}})) |> select(-Total_Observations) |> select(Variable, Group, Number_Missing, Proportion_Missing, everything()) |> select(-{{group}})
  }

  return(tab)
}

summary_stats_table_continous <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  # Get descriptive statistics for a list of continuous variables and return a combined table
  results <- lapply(variables, function(x) summary_stats_continous(dt, x, group)) # apply the summary_stats_continous function to each variable in the list
  results <- bind_rows(results) # bind the list of dataframes into a single dataframe

  if(!is.null(filename)){
    write.csv(results, paste0("./Data/Processed/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}
summary_stats_continous_outcomes <- function(dt, variable, group = NULL){
  # Same as summary_stats_continous, but reports only the mean, SD, and SE (no min/max/deciles)
  variable_name = variable 
  variable = sym(variable)

  if(!is.null(group)){
    group_name = group
    group = sym(group)
    dt <- dt |> group_by({{group}})
  } # grouping if group variable is selected

  tab = dt |> lazy_dt() |>
    mutate(
      `Number_Missing` = sum(is.na({{variable}})),  # calculate missing values
    ) |>
    filter(!is.na({{variable}})) |>  # filtering out the missing values for the variable
    summarise(
        Total_Observations = n(),
        Number_Missing = first(`Number_Missing`),
        Mean = as.numeric(mean({{variable}})),
        SD = as.numeric(sd({{variable}})),
        SE = as.numeric(sd({{variable}}) / sqrt(n()))
      ) |> # compute summary statistics 
      mutate(
        Proportion_Missing = Number_Missing / (Number_Missing + Total_Observations),
        Variable = variable_name
      ) |> # proportion of missing values relative to the full (sub)sample, so it stays in [0, 1]
      collect()

  if (is.null(group)){
    tab <- tab |> select(Variable, everything())
  } else {
    tab <- tab |> mutate(Group = paste(group_name, ": ", {{group}})) |> select(Variable, Group, everything()) |> select(-{{group}})
  }

  return(tab)
}

summary_stats_table_continous_outcomes <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  # Get descriptive statistics for a list of continuous variables and return a combined table
  results <- lapply(variables, function(x) summary_stats_continous_outcomes(dt, x, group)) # apply the summary_stats_continous function to each variable in the list
  results <- bind_rows(results) # bind the list of dataframes into a single dataframe

  if(!is.null(filename)){
    write.csv(results, paste0("./Data/Processed/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}

Function for unordered categorical variables

  • We created a function that takes each variable and, optionally, a grouping categorical variable, and calculates the summary statistics for each group.

  • The summary statistics include the frequency count and the proportion for each value of each variable.

  • The function combines the summary statistics for each selected variable and creates an HTML table.

  • The function also generates a csv file, which is saved in the GitHub directory ./Data/Processed/.

# Define the function to create summary statistics for unordered categorical variable
summary_stats_categorical <- function(data, variables, group = NULL) {
  combined_results <- data.frame(
    Variable = character(),
    Group = character(),
    Category = character(),
    Frequency = integer(),
    Proportion = numeric(),
    stringsAsFactors = FALSE
  ) # preset a dataframe
  
  for (variable in variables) {
    if (!is.null(group)) {
      # Grouped calculations
      summary_df <- data %>%
        group_by(across(all_of(c(group, variable)))) %>%  # group_by_at() is superseded
        summarise(Frequency = n(), .groups = 'drop') %>%
        mutate(Proportion = Frequency / sum(Frequency)) %>%
        ungroup() %>%
        mutate(Variable = variable,
               Group = ifelse(is.na(.data[[group]]), "Missing", as.character(.data[[group]])),
               Category = ifelse(is.na(.data[[variable]]), "Missing", as.character(.data[[variable]]))) %>%
        select(Variable, Group, Category, Frequency, Proportion)
    } else {
      # Ungrouped calculations
     summary_df <- data %>%
      dplyr::select(!!sym(variable)) %>%
      mutate(Category = factor(!!sym(variable), exclude = NULL)) %>% # Include missing values as a factor level
      group_by(Category) %>%
      summarise(
        Frequency = n(),
        Proportion = n() / nrow(data),
        .groups = 'drop' # Avoid regrouping warning
      ) %>%
      ungroup() %>%
      mutate(Variable = variable) %>% # Add variable name
      dplyr::select(Variable, everything())
    }
    
    # Combine the result with the overall results
    combined_results <- bind_rows(combined_results, summary_df)
  }
  
  return(combined_results)
}

summary_stats_tab_categorical <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  
  results <- lapply(variables, function(x) summary_stats_categorical(dt, x, group))
  results <- bind_rows(results) 

  if(!is.null(filename)){
    write.csv(results, paste0("./Data/Processed/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}

Functions for generating probability distributions

Density plot

create_density_plot <- function(data, x_var, group_var, x_min, x_max, title, x_label, y_label, file_basename) {
  # aes_string() is soft-deprecated in ggplot2 but conveniently accepts group_var = NULL
  plt <-
  ggplot(data, aes_string(x = x_var, group = group_var, fill = group_var)) +
    geom_density(alpha = 0.5, color = "black") +
    labs(title = title,
         x = x_label,
         y = y_label) +
    scale_x_continuous(limits = c(x_min, x_max)) +
    theme_minimal()+
    guides(fill = guide_legend(title = NULL))

  print(plt)

  # Export the data used for the graph to CSV
  write.csv(data, paste0("./Data/Processed/", file_basename, ".csv"))

  # Save the plot to JPG
  ggsave(paste0("./Data/Processed/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)

  # Save the plot to PDF
  ggsave(paste0("./Data/Processed/", file_basename, ".pdf"), plt, device = "pdf", width = 10, height = 8)
}  

Histogram

create_histogram_grouped <- function(data, x_var, group_var, bin_width, title, x_label, y_label, line_color = "black", file_basename) {
  # Generate the histogram; aes_string() is soft-deprecated in ggplot2 but accepts group_var = NULL
  plt <- ggplot(data, aes_string(x = x_var, fill = group_var)) +
    geom_histogram(binwidth = bin_width, color = line_color, position = "stack", alpha = 0.5) +
    labs(title = title, x = x_label, y = y_label) +
    theme_minimal() +
    guides(fill = guide_legend(title = NULL))

  # Print the plot
  print(plt)
  
  # Export the data used for the graph to CSV
  write.csv(data, paste0("./Data/Processed/", file_basename, ".csv"))

  # Save the plot to JPG
  ggsave(paste0("./Data/Processed/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)

  # Save the plot to PDF
  ggsave(paste0("./Data/Processed/", file_basename, ".pdf"), plt, device = "pdf", width = 10, height = 8)
}

Summary Statistics of Continuous Variables

Main Sample

  • For the summary statistics of continuous variables, we generate the statistics using the consent sample: the 18529 participants who consented to the study.

  • The table below shows the number of missing values, the proportion of missing values (out of the sample of 18529 participants who consented to the study), the mean, standard deviation, standard error, min, max, and decile values for each variable in the sample.

  • We also report the extreme values of these variables by showing the 90th, 95th, 98th, 99th, and 100th percentiles.

  • We noticed very large outliers in the duration_intro, duration_match, duration_affirm, and duration_reveal variables (in minutes).

  • We noticed that 18 participants are missing the source_coded variable. These participants retried the charity matching quiz, and in the process of manually recording their data we did not capture the source of entry. WE ARE WORKING ON THIS, which should be an easy task. A sketch for inspecting these cases follows.
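
  • A minimal sketch for inspecting these cases, assuming the missing entries are coded as NA and that repeat_quiz_coded flags the retries:

df_wide %>%
  filter(is.na(source_coded)) %>%
  select(analytic_id, repeat_quiz_coded, source_coded)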

df_wide$manipulation_value_coded <- as.numeric(df_wide$manipulation_value_coded)  # ensure numeric for the summary statistics

continous_var <- c("duration_intro", "duration_match", "duration_affirm", "duration_reveal", "duration_treatment", "time_since_first_start", "important_smart_coded", "important_responsive_coded","important_forward_looking_coded", "source_coded", "manipulation_value_coded")

summary_stats_table_continous(
  df_wide,  
  variables = continous_var,
  group = NULL,
  filename = "continuous",
  caption = "Summary Statistics for Continuous Variables",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables
Variable Number_Missing Proportion_Missing Mean SD SE Min Max Decile_10 Decile_20 Decile_30 Decile_40 Decile_50 Decile_60 Decile_70 Decile_80 Decile_90 Decile_95 Decile_98 Decile_99 Decile_100
duration_intro 2572 0.161 127.725 1744.339 13.809 0.683 81298.02 0.933 1.017 1.067 1.150 1.233 1.350 1.500 1.783 2.483 4.133 74.737 1544.770 81298.02
duration_match 2045 0.124 30.518 816.431 6.359 0.400 44308.42 0.733 0.833 0.917 1.000 1.100 1.217 1.350 1.567 2.017 2.583 4.017 8.253 44308.42
duration_affirm 2634 0.166 11.028 407.984 3.236 0.383 38597.92 0.533 0.567 0.617 0.667 0.717 0.783 0.883 1.033 1.333 1.717 2.517 4.635 38597.92
duration_reveal 7814 0.729 77.690 1896.311 18.319 0.233 120738.32 0.650 0.717 0.767 0.817 0.883 0.967 1.083 1.250 1.667 2.367 7.967 82.480 120738.32
duration_treatment 13123 2.427 95.262 935.402 12.722 0.067 32165.07 1.233 1.700 2.017 2.350 2.800 3.417 4.450 7.600 24.692 135.896 1382.812 1599.863 32165.07
time_since_first_start 2027 0.123 22.630 9.105 0.071 0.000 33.00 8.000 14.000 19.000 23.000 26.000 28.000 29.000 31.000 32.000 33.000 33.000 33.000 33.00
important_smart_coded 2303 0.142 2.759 0.534 0.004 1.000 3.00 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_responsive_coded 2471 0.154 2.614 0.590 0.005 1.000 3.00 2.000 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_forward_looking_coded 2627 0.165 2.685 0.527 0.004 1.000 3.00 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
source_coded 18 0.001 0.983 0.128 0.001 0.000 1.00 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00
manipulation_value_coded 12123 1.892 4.701 0.717 0.009 1.000 5.00 4.000 4.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.00

Participants who received a charity match

df_charity <- df_wide %>% filter(charity_name_coded != "")

n_charity <- as.character(nrow(df_charity))
  • In this section, we will generate the summary statistics for continuous variables for the sample of 16487 participants who received a charity match.
summary_stats_table_continous(
  df_charity,  
  variables = continous_var,
  group = NULL,
  filename = "continuous_charity_match",
  caption = "Summary Statistics for Continuous Variables Among Participants Received a Charity Match",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables Among Participants Who Received a Charity Match
Variable Number_Missing Proportion_Missing Mean SD SE Min Max Decile_10 Decile_20 Decile_30 Decile_40 Decile_50 Decile_60 Decile_70 Decile_80 Decile_90 Decile_95 Decile_98 Decile_99 Decile_100
duration_intro 1786 0.121 89.978 1198.293 9.883 0.683 31112.45 0.933 1.000 1.067 1.133 1.217 1.333 1.483 1.733 2.367 3.700 31.017 774.050 31112.45
duration_match 5 0.000 30.521 816.481 6.360 0.400 44308.42 0.733 0.833 0.917 1.000 1.100 1.217 1.350 1.567 2.017 2.583 4.017 8.253 44308.42
duration_affirm 594 0.037 11.029 408.010 3.236 0.383 38597.92 0.533 0.567 0.617 0.667 0.717 0.783 0.883 1.033 1.333 1.717 2.517 4.636 38597.92
duration_reveal 5772 0.539 77.690 1896.311 18.319 0.233 120738.32 0.650 0.717 0.767 0.817 0.883 0.967 1.083 1.250 1.667 2.367 7.967 82.480 120738.32
duration_treatment 11081 2.050 95.262 935.402 12.722 0.067 32165.07 1.233 1.700 2.017 2.350 2.800 3.417 4.450 7.600 24.692 135.896 1382.812 1599.863 32165.07
time_since_first_start 1781 0.121 22.795 9.070 0.075 0.000 33.00 8.000 14.000 19.000 23.000 26.000 28.000 29.000 31.000 32.000 33.000 33.000 33.000 33.00
important_smart_coded 265 0.016 2.759 0.534 0.004 1.000 3.00 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_responsive_coded 433 0.027 2.614 0.590 0.005 1.000 3.00 2.000 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_forward_looking_coded 590 0.037 2.685 0.527 0.004 1.000 3.00 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
source_coded 0 0.000 0.983 0.131 0.001 0.000 1.00 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00
manipulation_value_coded 10081 1.574 4.701 0.717 0.009 1.000 5.00 4.000 4.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.00

Participants who completed the treatment intervention

df_treatment_completed <- df_wide %>% filter(treatment_completed ==1)

n_treatment_completed <- as.character(nrow(df_treatment_completed))
  • In this section, we will generate the summary statistics for continuous variables for the sample of 5406 participants who completed the treatment intervention.
summary_stats_table_continous(
  df_treatment_completed,  
  variables = continous_var,
  group = NULL,
  filename = "continuous_treatment_completed",
  caption = "Summary Statistics for Continuous Variables Among Participants Completed the Treatment Intervention",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables Among Participants Who Completed the Treatment Intervention
Variable Number_Missing Proportion_Missing Mean SD SE Min Max Decile_10 Decile_20 Decile_30 Decile_40 Decile_50 Decile_60 Decile_70 Decile_80 Decile_90 Decile_95 Decile_98 Decile_99 Decile_100
duration_intro 628 0.131 63.381 870.664 12.596 0.717 31112.45 0.933 1.017 1.083 1.150 1.242 1.350 1.517 1.817 2.533 3.883 24.762 770.198 31112.45
duration_match 1 0.000 23.758 671.671 9.136 0.400 35154.63 0.750 0.850 0.933 1.017 1.133 1.250 1.400 1.633 2.077 2.633 3.865 5.548 35154.63
duration_affirm 0 0.000 6.983 257.466 3.502 0.383 15252.20 0.533 0.583 0.617 0.667 0.717 0.800 0.900 1.050 1.367 1.733 2.515 3.993 15252.20
duration_reveal 0 0.000 16.641 394.072 5.360 0.267 20079.72 0.650 0.700 0.767 0.817 0.883 0.967 1.067 1.217 1.583 2.163 4.398 13.817 20079.72
duration_treatment 0 0.000 95.262 935.402 12.722 0.067 32165.07 1.233 1.700 2.017 2.350 2.800 3.417 4.450 7.600 24.692 135.896 1382.812 1599.863 32165.07
time_since_first_start 627 0.131 22.402 9.137 0.132 0.000 33.00 8.000 14.000 19.000 23.000 25.000 27.000 29.000 31.000 32.000 33.000 33.000 33.000 33.00
important_smart_coded 0 0.000 2.764 0.527 0.007 1.000 3.00 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_responsive_coded 0 0.000 2.629 0.571 0.008 1.000 3.00 2.000 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
important_forward_looking_coded 0 0.000 2.691 0.524 0.007 1.000 3.00 2.000 2.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.000 3.00
source_coded 0 0.000 0.965 0.184 0.002 0.000 1.00 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00
manipulation_value_coded 3 0.001 4.707 0.693 0.009 1.000 5.00 4.000 4.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.000 5.00

Summary Statistics of Categorical Variables

  • For the summary statistics of categorical variables, we generate the statistics using the consent sample: the 18529 participants who consented to the study.

  • The table below shows the frequency and proportion of each category of each variable (out of the sample of 18529 participants who consented to the study).

  • Empty cells represent missing values.

cat_variable <- c("country_charity_coded", "main_cause_coded", "sub_cause_coded", "charity_name_coded", "donor_type_coded", "arm_coded", "manipulation_order_coded", "donate_today_coded", "donate_later_coded", "stay_connected_coded", "share_with_friend_coded")
summary_stats_tab_categorical(
  df_wide,  
  variables = cat_variable,
  group = NULL,
  filename = "donortype",
  caption = "Summary Statistics for Unordered Categorical Variables",
  print_html = TRUE
  )
Summary Statistics for Unordered Categorical Variables
Variable Group Category Frequency Proportion
country_charity_coded NA Global 3062 0.165
country_charity_coded NA US 14409 0.778
country_charity_coded NA NA 1058 0.057
main_cause_coded NA Defend the oppressed and marginalized 2364 0.128
main_cause_coded NA Eradicate hunger and homelessness 5303 0.286
main_cause_coded NA Eradicate poverty worldwide 1000 0.054
main_cause_coded NA Heal the sick 2468 0.133
main_cause_coded NA Protect the animals 2697 0.146
main_cause_coded NA Rescue the environment 1637 0.088
main_cause_coded NA Transform education 1498 0.081
main_cause_coded NA NA 1562 0.084
sub_cause_coded NA a roof over their head 1835 0.099
sub_cause_coded NA access to clean water 220 0.012
sub_cause_coded NA all the WILD furry animals like lions, tigers and bears (oh and koalas) 336 0.018
sub_cause_coded NA animals in factory farms 215 0.012
sub_cause_coded NA any species on the brink of extinction 365 0.020
sub_cause_coded NA cancer: The Big C 947 0.051
sub_cause_coded NA conserving natural spaces and habitats like old rainforests and national parks 618 0.033
sub_cause_coded NA dogs & cats humans’ best friends 1635 0.088
sub_cause_coded NA employment training and opportunities 1658 0.089
sub_cause_coded NA eradicating polio 12 0.001
sub_cause_coded NA food in their belly 238 0.013
sub_cause_coded NA groups targeted by their government 155 0.008
sub_cause_coded NA help all children learn to read 606 0.033
sub_cause_coded NA help teachers fund classroom projects 151 0.008
sub_cause_coded NA hot food in their belly 1740 0.094
sub_cause_coded NA immigrants and refugees fleeing violence 330 0.018
sub_cause_coded NA innovations that further clean energy technology 358 0.019
sub_cause_coded NA LGBTQ+ communities 288 0.016
sub_cause_coded NA massive ocean clean up 288 0.016
sub_cause_coded NA new policies to regulate carbon emissions 318 0.017
sub_cause_coded NA people suffering from preventable diseases because of inadequate healthcare 565 0.030
sub_cause_coded NA people that are victims of human trafficking 110 0.006
sub_cause_coded NA people with disabilities 338 0.018
sub_cause_coded NA people with heart disease 216 0.012
sub_cause_coded NA preventing and curing blindness 46 0.002
sub_cause_coded NA protecting against preventable childhood diseases 184 0.010
sub_cause_coded NA provide tutoring and support to underserved children 391 0.021
sub_cause_coded NA racial or ethnic minorities 320 0.017
sub_cause_coded NA rare diseases that need more research 324 0.017
sub_cause_coded NA refugees fleeing violence or other disasters 367 0.020
sub_cause_coded NA send a low income student to college 132 0.007
sub_cause_coded NA support girls in STEM 134 0.007
sub_cause_coded NA the opportunity to go to school 237 0.013
sub_cause_coded NA the skills to grow food sustainably 292 0.016
sub_cause_coded NA treating the big three: HIV, AIDS, Tuburculosis and Malaria 95 0.005
sub_cause_coded NA whales and ocean animals (Baby Beluga! Nemo!!) 78 0.004
sub_cause_coded NA women 362 0.020
sub_cause_coded NA NA 2025 0.109
charity_name_coded NA American Heart Association 216 0.012
charity_name_coded NA American Society for the Prevention of Cruelty to Animals 1635 0.088
charity_name_coded NA Amnesty International 154 0.008
charity_name_coded NA BRAC 291 0.016
charity_name_coded NA Clean Air Task Force 318 0.017
charity_name_coded NA Concern Worldwide 95 0.005
charity_name_coded NA Dana-Farber Cancer Institute 947 0.051
charity_name_coded NA DonorsChoose 151 0.008
charity_name_coded NA Evidence Action Inc.  183 0.010
charity_name_coded NA Feeding America 1740 0.094
charity_name_coded NA Freedom Network 109 0.006
charity_name_coded NA Girls Inc.  134 0.007
charity_name_coded NA Global Fund for Women 99 0.005
charity_name_coded NA HealthWell Foundation 565 0.030
charity_name_coded NA HIAS 697 0.038
charity_name_coded NA Housing Matters 1834 0.099
charity_name_coded NA Mercy For Animals 214 0.012
charity_name_coded NA National Women’s Law Center 262 0.014
charity_name_coded NA Natural Resources Defense Council 358 0.019
charity_name_coded NA Rainforest Trust 618 0.033
charity_name_coded NA Reading is Fundamental 606 0.033
charity_name_coded NA Rocketship Education 389 0.021
charity_name_coded NA Room to Read 237 0.013
charity_name_coded NA Sightsavers Inc.  46 0.002
charity_name_coded NA Southern Poverty Law Center 320 0.017
charity_name_coded NA Special Olympics 337 0.018
charity_name_coded NA Surfrider Foundation 288 0.016
charity_name_coded NA The Leukemia & Lymphoma Society 323 0.017
charity_name_coded NA The Ocean Foundation 78 0.004
charity_name_coded NA The Rotary Foundation 12 0.001
charity_name_coded NA The Trevor Project 288 0.016
charity_name_coded NA Thurgood Marshall College Fund 131 0.007
charity_name_coded NA WaterAid 219 0.012
charity_name_coded NA Wild Earth Allies 336 0.018
charity_name_coded NA Wildlife SOS 363 0.020
charity_name_coded NA World Food Program 238 0.013
charity_name_coded NA Year Up 1656 0.089
charity_name_coded NA NA 2042 0.110
donor_type_coded NA forward_looking 5125 0.277
donor_type_coded NA responsive 4599 0.248
donor_type_coded NA smart 6105 0.329
donor_type_coded NA unsure 85 0.005
donor_type_coded NA NA 2615 0.141
arm_coded NA control 2161 0.117
arm_coded NA obligation 4344 0.234
arm_coded NA opportunity 4210 0.227
arm_coded NA NA 7814 0.422
manipulation_order_coded NA 1 4063 0.219
manipulation_order_coded NA 2 4167 0.225
manipulation_order_coded NA NA 10299 0.556
donate_today_coded NA 0 6520 0.352
donate_today_coded NA 1 991 0.053
donate_today_coded NA NA 11018 0.595
donate_later_coded NA Do it later 1155 0.062
donate_later_coded NA Explore others 2101 0.113
donate_later_coded NA Learn more 1859 0.100
donate_later_coded NA NA 13414 0.724
stay_connected_coded NA No 3639 0.196
stay_connected_coded NA Yes 2190 0.118
stay_connected_coded NA NA 12700 0.685
share_with_friend_coded NA No 2092 0.113
share_with_friend_coded NA Yes 464 0.025
share_with_friend_coded NA NA 15973 0.862

Charity Match by Treatment Arms

  • This table shows the frequency and proportion of each category of each variable in the consent sample by treatment arm.

  • For instances where the group variable is Missing but the charity category is not, the participant did not receive a treatment assignment. In other words, these are participants who dropped off after the charity match was revealed; a quick count follows below.
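
  • A quick count of these cases (a sketch, assuming missing assignments appear as NA or empty strings):

df_wide %>%
  filter(charity_name_coded != "", is.na(arm_coded) | arm_coded == "") %>%
  nrow()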

cat_variable_arm = c("charity_name_coded")
summary_stats_tab_categorical(
  df_wide,  
  variables = cat_variable_arm,
  group = "arm_coded",
  filename = "donortype",
  caption = "Summary Statistics for Unordered Categorical Variables",
  print_html = TRUE
  )
Summary Statistics for Unordered Categorical Variables by Treatment Arm
Variable Group Category Frequency Proportion
charity_name_coded control American Heart Association 32 0.002
charity_name_coded control American Society for the Prevention of Cruelty to Animals 228 0.012
charity_name_coded control Amnesty International 25 0.001
charity_name_coded control BRAC 42 0.002
charity_name_coded control Clean Air Task Force 45 0.002
charity_name_coded control Concern Worldwide 8 0.000
charity_name_coded control Dana-Farber Cancer Institute 105 0.006
charity_name_coded control DonorsChoose 15 0.001
charity_name_coded control Evidence Action Inc.  25 0.001
charity_name_coded control Feeding America 241 0.013
charity_name_coded control Freedom Network 11 0.001
charity_name_coded control Girls Inc.  16 0.001
charity_name_coded control Global Fund for Women 15 0.001
charity_name_coded control HIAS 82 0.004
charity_name_coded control HealthWell Foundation 68 0.004
charity_name_coded control Housing Matters 255 0.014
charity_name_coded control Mercy For Animals 28 0.002
charity_name_coded control National Women’s Law Center 35 0.002
charity_name_coded control Natural Resources Defense Council 48 0.003
charity_name_coded control Rainforest Trust 93 0.005
charity_name_coded control Reading is Fundamental 66 0.004
charity_name_coded control Rocketship Education 49 0.003
charity_name_coded control Room to Read 41 0.002
charity_name_coded control Sightsavers Inc.  5 0.000
charity_name_coded control Southern Poverty Law Center 37 0.002
charity_name_coded control Special Olympics 47 0.003
charity_name_coded control Surfrider Foundation 38 0.002
charity_name_coded control The Leukemia & Lymphoma Society 35 0.002
charity_name_coded control The Ocean Foundation 10 0.001
charity_name_coded control The Rotary Foundation 3 0.000
charity_name_coded control The Trevor Project 35 0.002
charity_name_coded control Thurgood Marshall College Fund 18 0.001
charity_name_coded control WaterAid 37 0.002
charity_name_coded control Wild Earth Allies 51 0.003
charity_name_coded control Wildlife SOS 49 0.003
charity_name_coded control World Food Program 38 0.002
charity_name_coded control Year Up 185 0.010
charity_name_coded obligation American Heart Association 52 0.003
charity_name_coded obligation American Society for the Prevention of Cruelty to Animals 441 0.024
charity_name_coded obligation Amnesty International 36 0.002
charity_name_coded obligation BRAC 100 0.005
charity_name_coded obligation Clean Air Task Force 76 0.004
charity_name_coded obligation Concern Worldwide 28 0.002
charity_name_coded obligation Dana-Farber Cancer Institute 221 0.012
charity_name_coded obligation DonorsChoose 34 0.002
charity_name_coded obligation Evidence Action Inc.  49 0.003
charity_name_coded obligation Feeding America 433 0.023
charity_name_coded obligation Freedom Network 33 0.002
charity_name_coded obligation Girls Inc.  31 0.002
charity_name_coded obligation Global Fund for Women 22 0.001
charity_name_coded obligation HIAS 185 0.010
charity_name_coded obligation HealthWell Foundation 147 0.008
charity_name_coded obligation Housing Matters 532 0.029
charity_name_coded obligation Mercy For Animals 59 0.003
charity_name_coded obligation National Women’s Law Center 82 0.004
charity_name_coded obligation Natural Resources Defense Council 95 0.005
charity_name_coded obligation Rainforest Trust 172 0.009
charity_name_coded obligation Reading is Fundamental 160 0.009
charity_name_coded obligation Rocketship Education 103 0.006
charity_name_coded obligation Room to Read 75 0.004
charity_name_coded obligation Sightsavers Inc.  14 0.001
charity_name_coded obligation Southern Poverty Law Center 80 0.004
charity_name_coded obligation Special Olympics 85 0.005
charity_name_coded obligation Surfrider Foundation 74 0.004
charity_name_coded obligation The Leukemia & Lymphoma Society 79 0.004
charity_name_coded obligation The Ocean Foundation 23 0.001
charity_name_coded obligation The Rotary Foundation 1 0.000
charity_name_coded obligation The Trevor Project 63 0.003
charity_name_coded obligation Thurgood Marshall College Fund 30 0.002
charity_name_coded obligation WaterAid 62 0.003
charity_name_coded obligation Wild Earth Allies 89 0.005
charity_name_coded obligation Wildlife SOS 101 0.005
charity_name_coded obligation World Food Program 58 0.003
charity_name_coded obligation Year Up 419 0.023
charity_name_coded opportunity American Heart Association 56 0.003
charity_name_coded opportunity American Society for the Prevention of Cruelty to Animals 406 0.022
charity_name_coded opportunity Amnesty International 40 0.002
charity_name_coded opportunity BRAC 69 0.004
charity_name_coded opportunity Clean Air Task Force 88 0.005
charity_name_coded opportunity Concern Worldwide 23 0.001
charity_name_coded opportunity Dana-Farber Cancer Institute 214 0.012
charity_name_coded opportunity DonorsChoose 32 0.002
charity_name_coded opportunity Evidence Action Inc.  53 0.003
charity_name_coded opportunity Feeding America 439 0.024
charity_name_coded opportunity Freedom Network 30 0.002
charity_name_coded opportunity Girls Inc.  25 0.001
charity_name_coded opportunity Global Fund for Women 28 0.002
charity_name_coded opportunity HIAS 183 0.010
charity_name_coded opportunity HealthWell Foundation 157 0.008
charity_name_coded opportunity Housing Matters 545 0.029
charity_name_coded opportunity Mercy For Animals 49 0.003
charity_name_coded opportunity National Women’s Law Center 60 0.003
charity_name_coded opportunity Natural Resources Defense Council 91 0.005
charity_name_coded opportunity Rainforest Trust 149 0.008
charity_name_coded opportunity Reading is Fundamental 151 0.008
charity_name_coded opportunity Rocketship Education 118 0.006
charity_name_coded opportunity Room to Read 48 0.003
charity_name_coded opportunity Sightsavers Inc.  9 0.000
charity_name_coded opportunity Southern Poverty Law Center 94 0.005
charity_name_coded opportunity Special Olympics 69 0.004
charity_name_coded opportunity Surfrider Foundation 80 0.004
charity_name_coded opportunity The Leukemia & Lymphoma Society 87 0.005
charity_name_coded opportunity The Ocean Foundation 18 0.001
charity_name_coded opportunity The Rotary Foundation 3 0.000
charity_name_coded opportunity The Trevor Project 65 0.004
charity_name_coded opportunity Thurgood Marshall College Fund 18 0.001
charity_name_coded opportunity WaterAid 57 0.003
charity_name_coded opportunity Wild Earth Allies 88 0.005
charity_name_coded opportunity Wildlife SOS 98 0.005
charity_name_coded opportunity World Food Program 64 0.003
charity_name_coded opportunity Year Up 406 0.022
charity_name_coded Missing American Heart Association 76 0.004
charity_name_coded Missing American Society for the Prevention of Cruelty to Animals 560 0.030
charity_name_coded Missing Amnesty International 53 0.003
charity_name_coded Missing BRAC 80 0.004
charity_name_coded Missing Clean Air Task Force 109 0.006
charity_name_coded Missing Concern Worldwide 36 0.002
charity_name_coded Missing Dana-Farber Cancer Institute 407 0.022
charity_name_coded Missing DonorsChoose 70 0.004
charity_name_coded Missing Evidence Action Inc.  56 0.003
charity_name_coded Missing Feeding America 627 0.034
charity_name_coded Missing Freedom Network 35 0.002
charity_name_coded Missing Girls Inc.  62 0.003
charity_name_coded Missing Global Fund for Women 34 0.002
charity_name_coded Missing HIAS 247 0.013
charity_name_coded Missing HealthWell Foundation 193 0.010
charity_name_coded Missing Housing Matters 502 0.027
charity_name_coded Missing Mercy For Animals 78 0.004
charity_name_coded Missing National Women’s Law Center 85 0.005
charity_name_coded Missing Natural Resources Defense Council 124 0.007
charity_name_coded Missing Rainforest Trust 204 0.011
charity_name_coded Missing Reading is Fundamental 229 0.012
charity_name_coded Missing Rocketship Education 119 0.006
charity_name_coded Missing Room to Read 73 0.004
charity_name_coded Missing Sightsavers Inc.  18 0.001
charity_name_coded Missing Southern Poverty Law Center 109 0.006
charity_name_coded Missing Special Olympics 136 0.007
charity_name_coded Missing Surfrider Foundation 96 0.005
charity_name_coded Missing The Leukemia & Lymphoma Society 122 0.007
charity_name_coded Missing The Ocean Foundation 27 0.001
charity_name_coded Missing The Rotary Foundation 5 0.000
charity_name_coded Missing The Trevor Project 125 0.007
charity_name_coded Missing Thurgood Marshall College Fund 65 0.004
charity_name_coded Missing WaterAid 63 0.003
charity_name_coded Missing Wild Earth Allies 108 0.006
charity_name_coded Missing Wildlife SOS 115 0.006
charity_name_coded Missing World Food Program 78 0.004
charity_name_coded Missing Year Up 646 0.035
charity_name_coded Missing Missing 2042 0.110

Looking at the number of participants who consented to the chatbot

  • In this section, we analyze the number of people who consented to start the chatbot since the experiment started. We are using the consent sample for this analysis, which has 18529 participants.

  • For each section, we provide two graphs: one for the total count of each response over time, and one for the proportion of each response over time.

Among those who consented

create_histogram_grouped(
  df_wide, 
  x_var = "time_since_first_start", 
  group_var = NULL, 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start"
  )

By source

df_source <- df_wide %>% filter(!is.na(source_coded))  # drop the 18 participants with missing source
n_source <- as.character(nrow(df_source))
  • The graph below shows the count of participants who consented to start the chatbot since the experiment started, by source of entry (the 18 participants with a missing source are excluded).

  • It is noticeable that the main source of entry into the chatbot is the JSON ad.
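
  • A one-line check of this claim (source_coded is 1 for JSON-ad entries):

df_source %>% count(source_coded) %>% mutate(prop = n / sum(n))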

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by source of entry.
create_histogram_grouped(
  df_source, 
  x_var = "time_since_first_start", 
  group_var = "source_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Entered Via JSON Ad", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_source"
  )

By treatment arm

df_arm <-df_wide %>% filter(arm_coded != "") 

n_arm <- as.character(nrow(df_arm))
  • For the two graphs below, we excluded participants who did not receive a treatment assignment from the main sample of 18529 participants. The number of observations in this sample (participants who received a treatment assignment) is 10715.

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by treatment arm.
create_histogram_grouped(
  df_arm, 
  x_var = "time_since_first_start", 
  group_var = "arm_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Treatment Arm", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_arm"
  )

Proportion over time

  • This graph represents the proportion of participants who consented to start the chatbot since the experiment started, by treatment arm.
# group by day and create proportion

df_arm_1 <- df_arm %>% group_by(time_since_first_start, arm_coded) %>% summarise(n = n(), .groups = "drop_last") %>% mutate(arm_coded_proportion = n / sum(n))  # proportions within each day

# Create bar chart
ggplot(df_arm_1, aes(x = time_since_first_start, y = arm_coded_proportion, fill = arm_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Treatment Arm Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

By charity origin

df_country <- df_wide %>% filter(country_charity_coded != "")

n_country <- as.character(nrow(df_country))
  • The two graphs below show the count and proportion of participants who consented to start the chatbot since the experiment started, by charity origin.

  • For these two graphs, we excluded participants who did not select a charity origin from the main sample of 18529 participants. The number of observations in this sample (participants who selected a charity origin) is 17471.

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by charity origin.
create_histogram_grouped(
  df_country, 
  x_var = "time_since_first_start", 
  group_var = "country_charity_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Charity Origin", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_country"
  )

Proportion over time

  • This graph represents the proportion of participants who consented to start the chatbot since the experiment started, by charity origin.
# group by day and create proportion
df_country_1 <- df_country %>% group_by(time_since_first_start, country_charity_coded) %>% summarise(n = n(), .groups = "drop_last") %>% mutate(country_charity_coded_proportion = n / sum(n))  # proportions within each day

# Create bar chart
ggplot(df_country_1, aes(x = time_since_first_start, y = country_charity_coded_proportion, fill = country_charity_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Charity Origin Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

By main charity cause

df_maincause <- df_wide %>% filter(main_cause_coded != "")

n_maincause <- as.character(nrow(df_maincause))
  • For the following graphs, we excluded participants who did not select a main charity cause from the main sample of 18529 participants. The number of observations in this sample (participants who selected a main charity cause) is 16967.

US Choice

df_us <- df_maincause %>% filter(country_charity_coded == "US")

n_us <- as.character(nrow(df_us))
  • In this section, we analyze the number of people who consented to start the chatbot since the experiment started, by main charity cause, for the sample of 13986 participants who selected US as the charity origin.

  • The two graphs below show the count and proportion of participants who consented to start the chatbot since the experiment started, by main charity cause.

Total count over time

  • This graph represents the number of participants who consented to start the chatbot since the experiment started, by main charity cause for the US choice.
create_histogram_grouped(
  df_us, 
  x_var = "time_since_first_start", 
  group_var = "main_cause_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Main Charity Cause in US", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_main_cause_us"
  )

Proportion over time

  • This graph represents the proportion of participants who consented to start the chatbot since the experiment started, by main charity cause for the US choice.
# group by day and create proportion
df_us_1 <- df_us %>% group_by(time_since_first_start, main_cause_coded) %>% summarise(n = n(), .groups = "drop_last") %>% mutate(main_cause_coded_proportion = n / sum(n))  # proportions within each day

# Create bar chart
ggplot(df_us_1, aes(x = time_since_first_start, y = main_cause_coded_proportion, fill = main_cause_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Main Charity Cause in US Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Global Choice

df_global <- df_maincause %>% filter(country_charity_coded == "Global")

n_global <- as.character(nrow(df_global))
  • In this section, we analyze the number of people who consented to start the chatbot since the experiment started, by main charity cause, for the sample of 2981 participants who selected Global as the charity origin.

  • The two graphs below show the count and proportion of participants who consented to start the chatbot since the experiment started, by main charity cause.

Total count over time

  • This graph shows the number of participants who consented to start the chatbot since the experiment started, by main charity cause, for the Global sample.
create_histogram_grouped(
  df_global, 
  x_var = "time_since_first_start", 
  group_var = "main_cause_coded", 
  bin_width = 1, 
  title = "Number of People Consented to Start Chatbot Since the Experiment by Main Charity Cause in Global", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_main_cause_global"
  )

Proportion over time

  • This graph shows the proportion of participants who consented to start the chatbot since the experiment started, by main charity cause, for the Global sample.
# group by day and compute within-day proportions
df_global_1 <- df_global %>%
  group_by(time_since_first_start, main_cause_coded) %>%
  summarise(n = n()) %>%
  mutate(main_cause_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_global_1, aes(x = time_since_first_start, y = main_cause_coded_proportion, fill = main_cause_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Main Charity Cause in Global Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Looking at time spent on charity match

  • In this section, we analyze the time participants spent on the charity matching quiz, in minutes. We use the consented sample of 18529 participants for this analysis.

  • We determined the time spent on the charity matching quiz by computing the difference (in minutes) between the start time of the quiz (charitable_match_start) and the end time of the quiz (charitable_match_end). During the cleaning process, we identified instances where this value is negative and removed these observations from the dataset.
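
A minimal sketch of that computation, assuming the two timestamp columns were parsed as POSIXct (the actual code lives in the cleaning script):

# Illustration only - duration_match is created in the cleaning script.
# difftime() on the POSIXct start/end stamps gives the duration in minutes.
duration_check <- df_wide %>%
  mutate(duration_check = as.numeric(
    difftime(charitable_match_end, charitable_match_start, units = "mins"))) %>%
  filter(duration_check >= 0)  # negative durations were dropped during cleaning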

Among those who consented

  • For this plot, we are using a sample of 18529 participants who consented to the study.

  • We noticed extreme right outliers that produce a thick right tail, so the plots below are truncated at 5 minutes.

create_density_plot(
  df_wide, 
  x_var = "duration_match", 
  group_var = NULL, 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Treatment Arm", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

By treatment arm

  • For this plot, we are using a sample of 10715 participants who received a treatment assignment.

  • The density plot below shows the time spent on the charity match by treatment arm. The distribution of time spent is similar across treatment arms.

create_density_plot(
  df_arm, 
  x_var = "duration_match", 
  group_var = "arm_coded", 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Treatment Arm", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

By charity origin

  • For this plot, we are using a sample of 17471 participants who selected a charity origin.
create_density_plot(
  df_country, 
  x_var = "duration_match", 
  group_var = "country_charity_coded", 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Charity Origin", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

By main charity cause

  • For this plot, we are using a sample of 16967 participants who selected a main charity cause.
create_density_plot(
  df_maincause, 
  x_var = "duration_match", 
  group_var = "main_cause_coded", 
  x_min = 0, 
  x_max = 5, 
  title = "Density Plot of Time Spent on Charity Match by Main Charity Cause", 
  x_label = "Time Spent on Charity Match (mins)", 
  y_label = "Density",
  file_basename = "duration_match"
  )

Looking at Charity Match

  • In this section, we will look at the top 5 and bottom 5 charities that consenting participants matched with over time.

  • We will use the sample of 16487 participants who received a charity match.

Both Global and US

Top 5

# select the top five charities matched each day
df_charity_top <- df_charity %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(5, n) %>%
  ungroup()

## plot
ggplot(df_charity_top, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Top 5 Charities Over Time, US and Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))
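
Note that top_n() is superseded in current dplyr; slice_max() makes the per-day selection and the tie handling explicit. A sketch of the equivalent call:

# Equivalent selection using slice_max(); top_n() is superseded.
# with_ties = FALSE caps each day at exactly five charities (top_n keeps ties).
df_charity_top <- df_charity %>%
  count(time_since_first_start, charity_name_coded) %>%
  group_by(time_since_first_start) %>%
  slice_max(n, n = 5, with_ties = FALSE) %>%
  ungroup()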

Bottom 5

# select the bottom five charities matched each day
df_charity_bottom <- df_charity %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(-5, n) %>%
  ungroup()

## plot
ggplot(df_charity_bottom, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Bottom 5 Charities Over Time, US and Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

US Charities

df_charity_us <- df_charity %>% filter(country_charity_coded == "US")

n_charity_us <- as.character(nrow(df_charity_us))
  • In this section, we will look at the top 5 and bottom 5 charities that consenting participants matched with over time, for the sample of 13596 participants who selected US as the charity origin.

Top 5

df_charity_top <- df_charity_us %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(5, n) %>%
  ungroup()

## plot
ggplot(df_charity_top, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Top 5 Charities Over Time, US", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Bottom 5

df_charity_bottom <- df_charity_us %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(-5, n) %>%
  ungroup()

## plot
ggplot(df_charity_bottom, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Bottom 5 Charities Over Time, US", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Global Charities

df_charity_global <- df_charity %>% filter(country_charity_coded == "Global")

n_charity_global <- as.character(nrow(df_charity_global))
  • In this section, we will look at the top 5 and bottom 5 charities that consenting participants matched with over time, for the sample of 2891 participants who selected Global as the charity origin.

Top 5

df_charity_top <- df_charity_global %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(5, n) %>%
  ungroup()

## plot
ggplot(df_charity_top, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Top 5 Charities Over Time, Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Bottom 5

df_charity_bottom <- df_charity_global %>%
  group_by(time_since_first_start, charity_name_coded) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  group_by(time_since_first_start) %>%
  top_n(-5, n) %>%
  ungroup()

## plot
ggplot(df_charity_bottom, aes(x = time_since_first_start, y = n, fill = charity_name_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Bottom 5 Charities Over Time, Global", x = "Time Since the Experiment Started (days)", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.2, 'cm'))

Looking at donation intention over time

  • In this section, we will look at the donation intention over time.

Donation Intention Today

df_donate_today <- df_wide %>%
  filter(donate_today_coded != "") %>%
  mutate(donate_today_coded_factor = as.factor(donate_today_coded)) %>%
  mutate(donate_today_coded_factor = fct_recode(donate_today_coded_factor, "Yes" = "1", "No" = "0"))

n_donate_today <- as.character(nrow(df_donate_today))
  • In this section, we looked at participants who intended to donate today over time.

  • We are using a sample of 7511 participants who answered whether they intended to donate today, from the main sample of 18529 participants who consented to the study.

  • The next two graphs show the count and proportion of participants who intended to donate today over time.

Total count over time

  • This graph represents the number of participants who intended to donate today over time.
create_histogram_grouped(
  df_donate_today, 
  x_var = "time_since_first_start", 
  group_var = "donate_today_coded_factor", 
  bin_width = 1, 
  title = "Donation Today by Time Since the Experiment Started", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_donate_today"
  )

Proportion over time

  • This graph represents the proportion of participants who intended to donate today over time.
# group by day and compute within-day proportions
df_donate_today_1 <- df_donate_today %>%
  group_by(time_since_first_start, donate_today_coded_factor) %>%
  summarise(n = n()) %>%
  mutate(donate_today_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_donate_today_1, aes(x = time_since_first_start, y = donate_today_coded_proportion, fill = donate_today_coded_factor)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Donation Today Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

Choose to donate today, by treatment arm over time

df_donate_today_yes <- df_wide %>% filter(donate_today_coded == 1)
n_donate_today_yes <- as.character(nrow(df_donate_today_yes))
  • In this section, we looked at participants who intended to donate today by treatment arm over time.

  • We are using a sample of 991 participants who intended to donate today from the main sample of 18529 participants who consented to the study.

create_histogram_grouped(
  df_donate_today_yes, 
  x_var = "time_since_first_start", 
  group_var = "arm_coded", 
  bin_width = 1, 
  title = "Donation Today by Time Since the Experiment Started and Treatment Arm", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_yesdonate_today_arm"
  )

Choose to donate today, by charity origin over time

  • In this section, we looked at participants who intended to donate today by charity origin over time.

  • We are using a sample of 991 participants who intended to donate today from the main sample of 18529 participants who consented to the study.

create_histogram_grouped(
  df_donate_today_yes, 
  x_var = "time_since_first_start", 
  group_var = "country_charity_coded", 
  bin_width = 1, 
  title = "Donate Today by Time Since the Experiment Started and Charity Origin", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_yesdonate_today_country"
  )

Choose to donate today, by Charity Cause over time

  • In this section, we looked at participants who intended to donate today by main charity cause over time.

  • We are using a sample of 991 participants who intended to donate today from the main sample of 18529 participants who consented to the study.

create_histogram_grouped(
  df_donate_today_yes, 
  x_var = "time_since_first_start", 
  group_var = "main_cause_coded", 
  bin_width = 1, 
  title = "Donation Today by Time Since the Experiment Started and Main Charity Cause", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_yesdonate_today_main_cause"
  )

Not donate today

  • If participants declined to donate today, they were asked to choose one of three options: donate later, explore other charities, or learn more about the matched charity.

  • We will show the distribution of these choices over time since the experiment started.

Choose other options over time

df_donate_later <- df_wide %>% filter(donate_later_coded != "")

n_donate_later <- as.character(nrow(df_donate_later))
  • In this section, we looked at participants who chose one of the three options (donate later, explore others, or learn more about the charity) over time. This sample contains 5115 participants from the main sample of 18529 participants who consented to the study.

  • The next two graphs below show the count and proportion of participants who responded to one of the three options over time.

Total count over time

  • This graph represents the number of participants who responded to one of the three options over time.
create_histogram_grouped(
  df_donate_later, 
  x_var = "time_since_first_start", 
  group_var = "donate_later_coded", 
  bin_width = 1, 
  title = "Donate Later Choices by Time Since the Experiment Started", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "time_since_first_start_donate_later"
  )

Proportion over time

  • This graph represents the proportion of participants who responded to one of the three options over time.
# group by day and compute within-day proportions
df_donate_later_1 <- df_donate_later %>%
  group_by(time_since_first_start, donate_later_coded) %>%
  summarise(n = n()) %>%
  mutate(donate_later_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_donate_later_1, aes(x = time_since_first_start, y = donate_later_coded_proportion, fill = donate_later_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Donate Later Choices Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.2, 'cm'))

Looking at staying connected over time

df_share <- df_wide %>% filter(stay_connected_coded != "")

n_share <- as.character(nrow(df_share))
  • In this section, we will look at the proportion of participants who stayed connected over time.

  • We are using a sample of 5829 participants who answered whether they were willing to stay connected, from the main sample of 18529 participants who consented to the study.

  • The next two graphs below show the count and proportion of participants who stayed connected over time.

Total count over time

  • This graph represents the number of participants who stayed connected over time.
create_histogram_grouped(
  df_share, 
  x_var = "time_since_first_start", 
  group_var = "stay_connected_coded", 
  bin_width = 1, 
  title = "Staying connected over time", 
  x_label = "Time Since the Experiment Started (days)", 
  y_label = "Frequency",
  file_basename = "share_with_friend"
  )

Proportion over time

  • This graph represents the proportion of participants who stayed connected over time.
# group by day and compute within-day proportions
df_share_1 <- df_share %>%
  group_by(time_since_first_start, stay_connected_coded) %>%
  summarise(n = n()) %>%
  mutate(stay_connected_coded_proportion = n / sum(n))

# Create stacked bar chart
ggplot(df_share_1, aes(x = time_since_first_start, y = stay_connected_coded_proportion, fill = stay_connected_coded)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Proportion of Staying Connected Over Time", x = "Time Since the Experiment Started (days)", y = "Proportion") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.title = element_blank(),
        legend.text = element_text(size = 10),
        legend.key.size = unit(0.2, 'cm'))

Correlation Matrix

  • In this section, we will look at the correlation matrix of the following variables for the main sample of 18529 participants who consented to the study: source entered, donor type, country (U.S.), main cause (all), subcause (all).
df_correlation <- df_wide %>%
  select(source_coded, donor_type_coded, country_charity_coded, main_cause_coded, sub_cause_coded) %>%
  filter(source_coded != "", donor_type_coded != "", country_charity_coded != "",
         main_cause_coded != "", sub_cause_coded != "")

df_correlation <- dummy_cols(df_correlation,
                             select_columns = c("source_coded", "donor_type_coded", "country_charity_coded",
                                                "main_cause_coded", "sub_cause_coded")) %>%
  select(-source_coded, -donor_type_coded, -country_charity_coded, -main_cause_coded, -sub_cause_coded)
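
As an aside, fastDummies::dummy_cols() can drop the source columns itself via remove_selected_columns, collapsing the two steps above into one call (a sketch, not run in addition to the code above):

# Alternative to the two steps above: dummy-code and drop the
# original character columns in a single call.
df_correlation <- dummy_cols(
  df_correlation,
  select_columns = c("source_coded", "donor_type_coded", "country_charity_coded",
                     "main_cause_coded", "sub_cause_coded"),
  remove_selected_columns = TRUE
)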

# strip the variable-name prefixes from the dummy column names

names(df_correlation) <- gsub("source_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("donor_type_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("country_charity_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("main_cause_coded_", "", names(df_correlation))
names(df_correlation) <- gsub("sub_cause_coded_", "", names(df_correlation))
names(df_correlation)[names(df_correlation) == "1"] <- "JSON Ad"
names(df_correlation)[names(df_correlation) == "0"] <- "Other sources of entrance"




correlation_matrix <- cor(df_correlation, use = "pairwise.complete.obs") 

col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
colors <- col(200)
annotation_colors <- c("white", "black")

# Set up a larger plotting window
options(repr.plot.width=50, repr.plot.height=50)

# Generate the correlation plot with adjusted parameters
corrplot(correlation_matrix,
         method = "color", 
         col = colors,
         type = "upper",
         addCoef.col = "black",
         tl.col = "black",
         tl.srt = 90,
         diag = FALSE,
         bg = "white",
         addrect = 2,
         rect.col = "gray",
         cl.pos = "n",
         number.cex  = 0.15,  # Adjust the font size for correlation values
         tl.cex = 0.3,      # Adjust the font size for variable names
         addCoefasPercent = TRUE,  # show coefficients as percentages
         p.mat = NULL,
         sig.level = 0.01,
         insig = "blank",
         pch.col = "black",
         pch.cex = 0.8,
         col.cor = "black",
         mar = c(0,0,1,0))
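
Because every column in df_correlation is a 0/1 dummy, these Pearson correlations are phi coefficients, and mutually exclusive dummies (e.g., US and Global) are mechanically negatively correlated. A quick self-contained check of the equivalence (toy vectors, not study data):

# Toy check: for two 0/1 dummies, Pearson correlation equals the phi
# coefficient of their 2x2 contingency table.
x <- c(1, 0, 1, 1, 0)
y <- c(1, 0, 0, 1, 0)
tab <- table(x, y)  # rows = x (0, 1), columns = y (0, 1)
phi <- (tab[2, 2] * tab[1, 1] - tab[2, 1] * tab[1, 2]) /
  sqrt(sum(tab[2, ]) * sum(tab[1, ]) * sum(tab[, 2]) * sum(tab[, 1]))
all.equal(unname(phi), cor(x, y))  # TRUE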

Correlation table

  • The table below shows the correlation matrix of the following variables: source entered, donor type, country (U.S.), main cause (all), subcause (all)
correlation_matrix %>% kable(digits = 3) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "1000px")
[Correlation table: a 52 × 52 matrix of pairwise correlations among the dummy variables (entrance source, donor type, US/Global, main causes, and subcauses), rendered as a scrollable kable. Mechanically paired dummies (US/Global; JSON Ad/Other sources of entrance) correlate at -1, and each subcause correlates most strongly with its parent main cause, e.g., "dogs & cats humans' best friends" with Protect the animals at 0.764.]

Analyzing treatment text responses

df_treatment_text <- df_wide %>% select(treatment_text, text_has_link, arm_coded) %>% filter(treatment_text != "") %>% filter(text_has_link == 0)

n_treatment_text <- as.character(nrow(df_treatment_text))
  • In this section, we will do some analysis with the text responses from the participants.

  • First, as mentioned in the cleaning script, we detected some abnormal text strings that contain a link. We suspect these participants responded with interactive images or emojis. We created a variable, text_has_link, to flag whether a text response contains a link (a sketch of how such a flag can be constructed follows this list). For this analysis, we drop missing text responses and responses that contain a link.

  • We have a total of 5854 participants who provided text responses without a link from the main sample of 18529 participants who consented to the study.

  • These responses are exported to a spreadsheet, named treatment_text.csv, in the Data/Processed folder.
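
A minimal sketch of the flag construction; the actual variable comes from the cleaning script, and the URL pattern here is an assumption:

# Sketch: flag responses containing a URL (pattern is an assumption;
# the real text_has_link variable is created in the cleaning script).
df_wide <- df_wide %>%
  mutate(text_has_link = as.integer(grepl("https?://|www\\.", treatment_text)))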

df_text <- df_treatment_text %>% select(treatment_text) 

write.csv(df_text, "./Data/Processed/treatment_text.csv")
  • The table below shows 10 randomly sampled responses from each treatment arm.
df_treatment_text %>% select(arm_coded, treatment_text) %>% group_by(arm_coded) %>% sample_n(10) %>% kable(digits = 3, col.names = c("Treatment Group", "Text Response")) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "500px")
Treatment Group Text Response
obligation I would tell them where to go get help
obligation Great
obligation Bad
obligation Frustrated and sad.
obligation Sad, horrible, and sad
obligation Good.
obligation no
obligation Stop
obligation It would make me feel like, this isn’t what God would want from me
obligation Ineffective
opportunity VERY PROUD
opportunity Wat response
opportunity Exit
opportunity Glad for their safety out of the elements
opportunity well, I’m 70 years old and have no grandchildren
opportunity happy
opportunity pleased
opportunity I would be filled with joy
opportunity I’d be happy and touched
opportunity It would feel great

Word cloud

  • In this section, we will create a word cloud for the text responses from the participants.

Both Obligation and Opportunity Groups

  • The top 30 words in the text responses from both the obligation and opportunity groups are shown in the word cloud below.
# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_treatment_text$treatment_text))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
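
The same pipeline is repeated for each arm below; a hypothetical helper (not part of the original script) could wrap it, taking just the character vector of responses:

# Hypothetical helper wrapping the pipeline above; `texts` is a character vector.
plot_top_words <- function(texts, n_words = 30) {
  corpus <- Corpus(VectorSource(texts))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeNumbers)
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  text <- paste(sapply(corpus, as.character), collapse = " ")
  word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
  word_freqs <- word_freqs[names(word_freqs) != ""]
  wordcloud(words = names(word_freqs)[1:n_words], freq = word_freqs[1:n_words],
            min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
}

# e.g. plot_top_words(df_opportunity_text$treatment_text)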

Opportunity

  • The top 30 words in the text responses from opportunity arm are shown in the word cloud below.
df_opportunity_text <- df_treatment_text %>% filter(arm_coded == "opportunity")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_opportunity_text$treatment_text))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Obligation

  • The top 30 words in the text responses from obligation arm are shown in the word cloud below.
df_obligation_text <- df_treatment_text %>% filter(arm_coded == "obligation")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_obligation_text$treatment_text))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Sentiment analysis

  • In this section, we will conduct a sentiment analysis on the text responses from the participants.

  • The AFINN lexicon assigns each word an integer score from -5 (most negative) to +5 (most positive). Negative scores indicate negative sentiment, positive scores indicate positive sentiment, and scores close to zero suggest neutral sentiment.

  • In our analysis, we observe that the average sentiment score for the obligation arm is negative, while the average sentiment score for the opportunity arm is positive.

# Tokenize the text
tokens <- df_treatment_text %>% select(arm_coded, treatment_text) %>%
  unnest_tokens(word, treatment_text)

# Get sentiment lexicon (afinn)
sentiments <- get_sentiments("afinn")

# Join the tokens with sentiment scores
sentiment_scores <- tokens %>%
  inner_join(sentiments, by = "word")

# Aggregate word-level sentiment scores by treatment arm
sentiment_scores <- sentiment_scores %>%
  group_by(arm_coded) %>%
  summarise(sentiment_score = mean(value, na.rm = TRUE),
            sum_score = sum(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            n = n())

# Display sentiment scores for each group
sentiment_scores %>% kable(digits = 3, col.names = c("Treatment Group", "Average Sentiment Score", "Sum Sentiment Score", "SD", "N")) |>
  kable_styling(bootstrap_options = c("striped", "hover"))
Treatment Group Average Sentiment Score Sum Sentiment Score SD N
obligation -0.159 -614 2.212 3870
opportunity 1.762 6691 1.672 3797
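
Note that these averages are taken over words, so longer responses carry more weight. A sketch of a per-response alternative, where each response is scored first and responses are then averaged within arm (response_id is constructed here and is not a column in the data):

# Per-response sentiment: score each response first, then average within arm.
response_scores <- df_treatment_text %>%
  mutate(response_id = row_number()) %>%   # constructed id; not in the data
  unnest_tokens(word, treatment_text) %>%
  inner_join(get_sentiments("afinn"), by = "word") %>%
  group_by(arm_coded, response_id) %>%
  summarise(response_score = mean(value), .groups = "drop") %>%
  group_by(arm_coded) %>%
  summarise(mean_response_score = mean(response_score), n_responses = n())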

Feedback text response

df_feedback_text <- df_wide %>% select(feedback_match_coded, feedback_has_link, arm_coded) %>% filter(feedback_match_coded != "") %>% filter(feedback_has_link == 0)

n_feedback_text <- as.character(nrow(df_feedback_text))
  • In this section, we will do some analysis with the feedback responses from the participants.

  • First, as mentioned in the cleaning script, we detected some abnormal text strings that contain a link. We suspect these participants responded with interactive images or emojis. We created a variable, feedback_has_link, to flag whether a feedback response contains a link. For this analysis, we drop missing feedback responses and responses that contain a link.

  • We have a total of 2510 participants who provided feedback responses without a link from the main sample of 18529 participants who consented to the study.

  • These responses are exported to a spreadsheet, named feedback_text.csv, in the Data/Processed folder.

df_feedback <- df_feedback_text  %>% select(feedback_match_coded) 

write.csv(df_feedback, "./Data/Processed/feedback_text.csv")
  • The table below shows 10 randomly sampled responses from each treatment arm.
df_feedback_text %>% select(arm_coded, feedback_match_coded) %>% group_by(arm_coded) %>% sample_n(10) %>% kable(digits = 3, col.names = c("Treatment Group", "Feedback")) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      kableExtra::scroll_box( height = "500px")
Treatment Group Feedback
control I don’t think it’s for me
control I like it and agree with you
control its great
control I’ll think about it
control I’m sold on St Jude Children Research Hospital
control I’m still researching
control You did good! I’m very unsure these days as to what is real and not real
control not what I wanted
control Actually, I’d like to consider domestic charities as well.
control THOUGHT IT WAS GREAT
obligation I think it is interesting and accurate. However, I do not want to be inundated with requests for more donations; I do not want “gifts”.
obligation Good Match
obligation Yes
obligation I need more alternatives, specifically locally. Most of the suggestions were west coast.
obligation meh
obligation Not a match.
obligation It was not my choice
obligation I want a redo
obligation I prefer bunny charities.
obligation It was perfect
opportunity It’s not for me. Rotary International is my chosen charity.
opportunity Very nice.
opportunity Nothing!!
opportunity ok
opportunity I liked if so I donated to it.
opportunity Don’t know yet , I feel pretty much like you do
opportunity It was fun chatting with you this was my first time doing it
opportunity A good organization, but not a no kill facility as far as I know. Those I donate to are.
opportunity I think it is a good match - thank you!
opportunity It was good

Word cloud

  • In this section, we will create a word cloud for the feedback responses from the participants.

All Groups

  • The top 30 words in the feedback responses from all three arms (control, obligation, and opportunity) are shown in the word cloud below.
# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_feedback_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Control

  • The top 30 words in the feedback responses from the control arm are shown in the word cloud below.
df_control_text <- df_feedback_text %>% filter(arm_coded == "control")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_control_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Opportunity

  • The top 30 words in the feedback responses from the opportunity arm are shown in the word cloud below.
df_opportunity_text <- df_feedback_text %>% filter(arm_coded == "opportunity")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_opportunity_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Obligation

  • The top 30 words in the feedback responses from the obligation arm are shown in the word cloud below.
df_obligation_text <- df_feedback_text %>% filter(arm_coded == "obligation")

# Create a corpus from the text data
corpus <- Corpus(VectorSource(df_obligation_text$feedback_match_coded))

# Preprocess the text (convert to lowercase, remove punctuation, numbers, stopwords)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Collapse the corpus into a single string
text <- paste(sapply(corpus, as.character), collapse = " ")

# Tabulate word frequencies, dropping empty tokens left behind by the removals
word_freqs <- sort(table(unlist(strsplit(text, "\\s+"))), decreasing = TRUE)
word_freqs <- word_freqs[names(word_freqs) != ""]

# Generate word cloud for the top 30 words
wordcloud(words = names(word_freqs)[1:30], freq = word_freqs[1:30],
          min.freq = 1, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

Sentiment analysis

  • In this section, we will conduct a sentiment analysis on the feedback responses from the participants.

  • The AFINN lexicon assigns each word an integer score from -5 (most negative) to +5 (most positive). Negative scores indicate negative sentiment, positive scores indicate positive sentiment, and scores close to zero suggest neutral sentiment.

  • In our analysis, we observe that the average sentiment score is highest for the opportunity arm and lowest for the obligation arm.

# Tokenize the text
tokens <- df_feedback_text %>% select(arm_coded, feedback_match_coded) %>%
  unnest_tokens(word, feedback_match_coded)

# Get sentiment lexicon (afinn)
sentiments <- get_sentiments("afinn")

# Join the tokens with sentiment scores
sentiment_scores <- tokens %>%
  inner_join(sentiments, by = "word")

# Aggregate word-level sentiment scores by treatment arm
sentiment_scores <- sentiment_scores %>%
  group_by(arm_coded) %>%
  summarise(sentiment_score = mean(value, na.rm = TRUE),
            sum_score = sum(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            n = n())

# Display sentiment scores for each group
sentiment_scores %>% kable(digits = 3, col.names = c("Treatment Group", "Average Sentiment Score", "Sum Sentiment Score", "SD", "N")) |>
  kable_styling(bootstrap_options = c("striped", "hover"))
Treatment Group Average Sentiment Score Sum Sentiment Score SD N
control 1.613 1210 1.654 750
obligation 1.599 1861 1.691 1164
opportunity 1.670 2123 1.684 1271