Goal

The purpose of this script is to calculate and report the funnel statistics for the Facebook Charitable Giving cleaned data.

Data

The dataset is the Facebook Charitable Giving wide cleaned, described and created in Issue #99.

setwd("/Users/ethanbernheim/Documents/GitHub/FB_Charitable_Giving/")
data <- readRDS("/Users/ethanbernheim/Desktop/FB_Charitable_Giving/Data/Processed/charitable_clean_wide.rds")

Setting up the data

To conduct funnel analysis, we must preprocess all the variables listed above into binary representations. For each variable, TRUE indicates that the participant successfully completed that stage of the quiz, while FALSE signifies they did not.

Click to see complete data preprocessing

Summary table helper function

This function creates a table displaying the frequencies of all values of a variable.

summary_table <- function(variable, title, var1, var2){
    bool_df <- as.data.frame(table(variable, useNA = 'ifany'))
    colnames(bool_df) <- c(var1, var2)
    kable(bool_df, caption = title, align = "c") |>
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) 
}

Data preprocessing

We split the variables up into sections defined by type. The sections are:

Variables where the string “Dropped off” represents missing
Varaibles where NA represents missing
Variables where an empty string represents missing

The treatment and manipulation order variables are singled out at the end regardless of type.

We create three new variables which are binary versions of variables already existing in the dataset. Particpants passed each stage if the respective value is “Proceeded” and did not pass if the value is “Dropped off”. The variables created in this section are

greeting_coded_funnel
pre_consent_coded_funnel
not_affil_coded_funnel

data$greeting_coded_funnel <- data$greeting_coded == "Proceeded"
data$pre_consent_coded_funnel <- data$pre_consent_coded == "Proceeded"
data$not_affil_coded_funnel <- data$not_affil_coded == "Proceeded"
data$for_me_coded_funnel <- data$for_me_coded == "Proceeded"
summary_table(data$greeting_coded_funnel, "Greeting Funnel", "Greeting", "Count")

Greeting Funnel
Greeting	Count
FALSE	4725
TRUE	41675
NA	17

summary_table(data$pre_consent_coded_funnel, "Pre-consent Funnel", "Pre-consent", "Count")

Pre-consent Funnel
Pre-consent	Count
FALSE	17514
TRUE	28886
NA	17

summary_table(data$not_affil_coded_funnel, "Not Affil Funnel", "Not Affil", "Count")

Not Affil Funnel
Not Affil	Count
FALSE	28495
TRUE	17905
NA	17

summary_table(data$for_me_coded_funnel, "For Me Funnel", "For Me", "Count")

For Me Funnel
For Me	Count
FALSE	33964
TRUE	12436
NA	17

We create a new variable which is a binary version of consent_coded. Participants only pass this stage if consent_coded_num is 1. This means the participants stated exactly “I consent, start now.” The new variable created in this section is

consent_coded_funnel

data$consent_coded_funnel <- data$consent_coded_num == 1
summary_table(data$consent_coded_funnel, "Consent Funnel", "Consent", "Count")

Consent Funnel
Consent	Count
FALSE	3993
TRUE	18528
NA	23896

The series of funnel variables below are set to TRUE if their respective cleaned variables have any value and FALSE if they are NA. The variables created in this section are

country_charity_coded_funnel
main_cause_coded_funnel
sub_cause_coded_funnel
charity_name_coded_funnel
important_smart_funnel
important_responsive_funnel
important_forwardlooking_funnel
manipulation_order_coded_funnel
share_with_friend_coded_funnel

data$country_charity_coded_funnel <- !is.na(data$country_charity_coded)
data$main_cause_coded_funnel <- !is.na(data$main_cause_coded)
data$sub_cause_coded_funnel <- !is.na(data$sub_cause_coded)
data$charity_name_coded_funnel <- !is.na(data$charity_name_coded)
data$important_smart_funnel <- !is.na(data$important_smart_coded)
data$important_responsive_funnel <- !is.na(data$important_responsive_coded)
data$important_forwardlooking_funnel <- !is.na(data$important_forward_looking_coded)
data$manipulation_order_coded_funnel <- !is.na(data$manipulation_order_coded)
data$share_with_friend_coded_funnel <- !is.na(data$share_with_friend_coded)
data$donate_today_coded_both_funnel <- !is.na(data$donate_today_coded)
summary_table(data$country_charity_coded_funnel, "Country Charity Funnel", "Country Charity", "Count")

Country Charity Funnel
Country Charity	Count
FALSE	28934
TRUE	17483

summary_table(data$main_cause_coded_funnel, "Main Cause Funnel", "Main Cause", "Count")

Main Cause Funnel
Main Cause	Count
FALSE	29438
TRUE	16979

summary_table(data$sub_cause_coded_funnel, "Sub Cause Funnel", "Sub Cause", "Count")

Sub Cause Funnel
Sub Cause	Count
FALSE	29902
TRUE	16515

summary_table(data$charity_name_coded_funnel, "Charity Name Funnel", "Charity Name", "Count")

Charity Name Funnel
Charity Name	Count
FALSE	29919
TRUE	16498

summary_table(data$important_smart_funnel, "Important Smart Funnel", "Important Smart", "Count")

Important Smart Funnel
Important Smart	Count
FALSE	30180
TRUE	16237

summary_table(data$important_responsive_funnel, "Important Responsive Funnel", "Important Responsive", "Count")

Important Responsive Funnel
Important Responsive	Count
FALSE	30348
TRUE	16069

summary_table(data$important_forwardlooking_funnel, "Important Forwardlooking Funnel", "Important Forwardlooking", "Count")

Important Forwardlooking Funnel
Important Forwardlooking	Count
FALSE	30504
TRUE	15913

summary_table(data$manipulation_order_coded_funnel, "Manipulation Order Funnel", "Manipulation Order", "Count")

Manipulation Order Funnel
Manipulation Order	Count
FALSE	38179
TRUE	8238

summary_table(data$share_with_friend_coded_funnel, "Share With Friend Funnel", "Share With Friend", "Count")

Share With Friend Funnel
Share With Friend	Count
FALSE	43858
TRUE	2559

summary_table(data$donate_today_coded_both_funnel, "Donate Today Funnel", "Donate Today", "Count")

Donate Today Funnel
Donate Today	Count
FALSE	38901
TRUE	7516

The funnel variables below are set to TRUE if their respective cleaned variables have any value other than an empty string. The variables created in this section are

donor_type_coded_funnel
arm_coded_funnel
feedback_match_coded_funnel

data$donor_type_coded_funnel <- !(data$donor_type_coded == "")
data$arm_coded_funnel <- !(data$arm_coded == "")
data$feedback_match_coded_funnel <- !(data$feedback_match_coded == "")
summary_table(data$donor_type_coded_funnel, "Donor Type Funnel", "Donor Type", "Count")

Donor Type Funnel
Donor Type	Count
TRUE	15925
NA	30492

summary_table(data$arm_coded_funnel, "Treatment Arm Funnel", "Treatment Arm", "Count")

Treatment Arm Funnel
Treatment Arm	Count
TRUE	10723
NA	35694

summary_table(data$feedback_match_coded_funnel, "Feedback Match Funnel", "Feedback Match", "Count")

Feedback Match Funnel
Feedback Match	Count
FALSE	43600
TRUE	2800
NA	17

We now create a funnel variable for entering the quiz. This variable is set to TRUE for every participant.

data$entry_funnel <- TRUE
summary_table(data$entry_funnel, "Entry Funnel", "Entry", "Count")

Entry Funnel
Entry	Count
TRUE	46417

Treatment variables

We now create funnel variables for the treatment variables. The two treatment questions are only asked to participants assigned to the opportunity and obligation groups. The responses to the two questions across both treatment arms are already combined and stored in temp_a and treatment_text respectively. NOTE: TEMP_A CURRENTLY MISSING FROM THE DATASET.

data$treatmeant_text_funnel <- !is.na(data$treatment_text)
summary_table(data$treatmeant_text_funnel, "Treatment Text Funnel", "Treatment Text", "Count")

Treatment Text Funnel
Treatment Text	Count
FALSE	37857
TRUE	8560

Manipulation variables

We now create funnel variables for the manipulation variables. Since the manipulation questions are asked in one of two orders, we create two funnel variables for each manipulation value. The variables created in this section are

manipulation_value_coded_1_funnel
donate_today_coded_1_funnel
manipulation_value_coded_2_funnel
donate_today_coded_2_funnel

data$manipulation_value_coded_1_funnel <- !is.na(data$manipulation_value_coded) & data$manipulation_order_coded == 1
data$donate_today_coded_1_funnel <- !is.na(data$donate_today_coded) & data$manipulation_order_coded == 1
data$manipulation_value_coded_2_funnel <- !is.na(data$manipulation_value_coded) & data$manipulation_order_coded == 2
data$donate_today_coded_2_funnel <- !is.na(data$donate_today_coded) & data$manipulation_order_coded == 2
summary_table(data$manipulation_value_coded_1_funnel, "Manipulation Value 1 Funnel", "Manipulation Value 1", "Count")

Manipulation Value 1 Funnel
Manipulation Value 1	Count
FALSE	42644
TRUE	3773

summary_table(data$manipulation_value_coded_2_funnel, "Manipulation Value 2 Funnel", "Manipulation Value 2", "Count")

Manipulation Value 2 Funnel
Manipulation Value 2	Count
FALSE	43779
TRUE	2638

summary_table(data$donate_today_coded_1_funnel, "Donate Today 1 Funnel", "Donate Today 1", "Count")

Donate Today 1 Funnel
Donate Today 1	Count
FALSE	42804
TRUE	3611
NA	2

summary_table(data$donate_today_coded_2_funnel, "Donate Today 2 Funnel", "Donate Today 2", "Count")

Donate Today 2 Funnel
Donate Today 2	Count
FALSE	42512
TRUE	3903
NA	2

Data description

We use the following variables in our funnel dropoff statistics (variables with *s next to them not currently in the dataset) They are listed in order of appearence in the quiz:

greeting_coded_funnel: binary flag for whether the participant passed the greeting stage
pre_consent_coded_funnel: binary flag for whether the participant passed the pre-consent stage
consent_coded_funnel: binary flag for whether the participant gave consent to the study
not_affil_coded_funnel: binary flag for whether the participant passed the not affiliated stage
country_charity_coded_funnel: binary flag for whether participant picked a country of charity (US or Global)
main_cause_coded_funnel: binary flag for whether participant selected a main cause
sub_cause_coded_funnel: binary flag for whether participant selected a sub cause
charity_name_coded_funnel: binary flag for whether participant was assigned a charity
important_smart_funnel: binary flag for whether participant answered the important smart question
important_responsive_funnel: binary flag for whether participant answered the important responsive question
important_forwardlooking_funnel: binary flag for whether participant answered the important forwardlooking question
for_me_coded_funnel: binary flag for whether participant answered ‘but is it for me’ question
end_affil*
arm_coded_funnel: binary flag for whether participant was assigned a treatment group
temp_a*
treatment_text_funnel: binary flag for whether participant answered antything in the free text treatment question
manipulation_order_coded_funnel: binary flag for whether participant was assigned a manipulation order
manipulation_value_coded_funnel: binary flag for whether participant answered the manipulation value question
donate_today_coded_funnel: binary flag for whether participant answered the donate today question
feedback_match_coded_funnel: binary flag for wether participant answered the feedback match question
share_with_friend_coded_funnel: binary flag for wether participant answered the share with friend question

Funnel statistics

Funnel functions

This function accepts a list of binary variables in sequential order and associated variable names, representing stages of the quiz. It then computes and returns:

The count of participants who passed each stage
The percentage of participants who passed each stage relative to total participants
The percentage of participants who passed each stage relative to the previous stage

count_non_na_list <- function(var_list, var_names){
  counts <- lapply(var_list, function(x) sum(x == TRUE, na.rm = TRUE))
  percent_total <- lapply(counts, function(x) x / nrow(data))
  percent_previous <- c(NA, lapply(2:length(counts), function(i) counts[[i]] / counts[[i - 1]]))
  funnel_df <- data.frame(stage = var_names)
  funnel_df$counts <- lapply(counts, function(x) round(x, 3))
  funnel_df$percent_total <- lapply(percent_total, function(x) round(x, 3))
  funnel_df$percent_previous <- lapply(percent_previous, function(x) round(x, 3))
  return(funnel_df)
}

The next function accepts a list of binary variables in sequential order and associated variable names, representing stages of the quiz. It also accepts a variable and a value. It then computes the funnel statistics for observations where the variable equals the value. As with the previous function, it returns:

The count of participants who passed each stage
The percentage of participants who passed each stage relative to total participants
The percentage of participants who passed each stage relative to the previous stage

by_var <- function(var, var_val, var_list, var_names){
  counts <- lapply(var_list, function(x) sum(x == TRUE & var == var_val, na.rm = TRUE))
  percent_total <- lapply(counts, function(x) x / counts[[1]])
  percent_previous <- c(NA, lapply(2:length(counts), function(i) counts[[i]] / counts[[i - 1]]))
  funnel_df <- data.frame(stage = var_names)
  funnel_df$counts <- lapply(counts, function(x) round(x, 3))
  funnel_df$percent_total <- lapply(percent_total, function(x) round(x, 3))
  funnel_df$percent_previous <- lapply(percent_previous, function(x) round(x, 3))
  return(funnel_df)
}

This function creates a graph of the funnel statistics. It accepts a dataframe with the funnel statistics and returns a funnel graph showing how participants drop off over time.

funnel_graph <- function(funnel_df, var_names){
  percent_total <- funnel_df$percent_total
  var_names <- rev(var_names)

  test <- data.frame(
    Category = factor(var_names, levels = var_names),
    Value = rev(unlist(percent_total))
  )

  ggplot(test, aes(x = Category, y = Value)) +
    geom_point(color = "darkred", size = 3) +
    geom_line(aes(group = 1), color = "darkred", size = 1) + 
    labs(x = "Stages", y = "Percent Total", title = "Percent Total Dropoff Rates") +
    theme_minimal() +
    coord_flip()
}

Full funnel statistics

The funnel statistics are reported in the table below. Note that the treatment and manipulation order variables are missing, as there is not a clear way to represent them in the same table (not every participant is asked the treatment questions, and the manipulation questions change order). The following variables are missing from the table and graph below:

temp_a
treatment_text
manipulation_value_coded

donate_today is included in both the table and the graph, even though participants were not always asked the question in the same order. Since the other question that vary order are not included in the table or graph, this is not an issue.

full_var_list <- list(data$entry_funnel, data$greeting_coded_funnel, data$pre_consent_coded_funnel, data$consent_coded_funnel, data$not_affil_coded_funnel, data$country_charity_coded_funnel, data$main_cause_coded_funnel, 
data$sub_cause_coded_funnel, data$charity_name_coded_funnel, data$important_smart_funnel, data$important_responsive_funnel, data$important_forwardlooking_funnel,
data$for_me_coded_funnel, data$arm_coded_funnel, data$manipulation_order_coded_funnel, data$donate_today_coded_both_funnel, data$feedback_match_coded_funnel, data$share_with_friend_coded_funnel)
full_var_names <- c("Entry", "Greeting", "Pre-consent", "Consent", "Not Affil", "Country Charity", "Main Cause", "Sub Cause", "Charity Name", "Important Smart", "Important Responsive", "Important Forwardlooking", 
"Is It For Me", "Treatment Arm", "Manipulation Order", "Donate Today", "Feedback Match", "Share With Friend")
funnel_df <- count_non_na_list(full_var_list, full_var_names)
kable(funnel_df, caption = "Basic Funnel Statistics", align = "c", col.names = c("stage", "count", "% total", "% previous")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Basic Funnel Statistics
stage	count	% total	% previous
Entry	46417	1	NA
Greeting	41675	0.898	0.898
Pre-consent	28886	0.622	0.693
Consent	18528	0.399	0.641
Not Affil	17905	0.386	0.966
Country Charity	17483	0.377	0.976
Main Cause	16979	0.366	0.971
Sub Cause	16515	0.356	0.973
Charity Name	16498	0.355	0.999
Important Smart	16237	0.35	0.984
Important Responsive	16069	0.346	0.99
Important Forwardlooking	15913	0.343	0.99
Is It For Me	12436	0.268	0.781
Treatment Arm	10723	0.231	0.862
Manipulation Order	8238	0.177	0.768
Donate Today	7516	0.162	0.912
Feedback Match	2800	0.06	0.373
Share With Friend	2559	0.055	0.914

Now we’ll visualize these statistics.

funnel_graph(funnel_df, full_var_names)

Manipulation order

Here we report the two sub-funnels for the groups 1 and 2 of the manipulation order. Group 1 was asked the manipulation value question first, while group 2 was asked the donate today question first.

var_list <- list(data$manipulation_value_coded_1_funnel, data$donate_today_coded_1_funnel)
var_names <- c("Manipulation Value 1", "Donate Today 1")
funnel_df <- count_non_na_list(var_list, var_names)
kable(funnel_df, caption = "Manipulation Order 1 Funnel Statistics", align = "c", digits = 3) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Manipulation Order 1 Funnel Statistics
stage	counts	percent_total	percent_previous
Manipulation Value 1	3773	0.081	NA
Donate Today 1	3611	0.078	0.957

var_list <- list(data$donate_today_coded_2_funnel, data$manipulation_value_coded_2_funnel)
var_names <- c("Donate Today 2", "Manipulation Value 2")
funnel_df <- count_non_na_list(var_list, var_names)
funnel_df <- funnel_df |>
  mutate_if(is.numeric, ~round(., 3))
kable(funnel_df, caption = "Manipulation Order 2 Funnel Statistics", align = "c", digits = 3) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Manipulation Order 2 Funnel Statistics
stage	counts	percent_total	percent_previous
Donate Today 2	3903	0.084	NA
Manipulation Value 2	2638	0.057	0.676

Treatment group

We are interested in the number of participants assigned to each treatment arm who answered the donate_today question. Those statistics are reported in the table below.

arm_coded_char <- as.character(data$arm_coded)
num_obligation <- sum(data$arm_coded == "obligation", na.rm = TRUE)
num_opportunity <- sum(data$arm_coded == "opportunity", na.rm = TRUE)
num_control <- sum(data$arm_coded == "control", na.rm = TRUE)
num_obligation_donate_today_funnel <- sum((data$donate_today_coded_1_funnel | data$donate_today_coded_2_funnel) & data$arm_coded == "obligation", na.rm = TRUE)
num_opportinuty_donate_today_funnel <- sum((data$donate_today_coded_1_funnel | data$donate_today_coded_2_funnel) & data$arm_coded == "opportunity", na.rm = TRUE)
num_control_donate_today_funnel <- sum((data$donate_today_coded_1_funnel | data$donate_today_coded_2_funnel) & data$arm_coded == "control", na.rm = TRUE)
treatment_table_df <- data.frame(
  Treatment = c("Obligation", "Opportunity", "Control"),
  Total = c(num_obligation, num_opportunity, num_control),
  Answered_Donate_Today = c(num_obligation_donate_today_funnel, num_opportinuty_donate_today_funnel, num_control_donate_today_funnel),
  Percent_Completed = c(num_obligation_donate_today_funnel / num_obligation, num_opportinuty_donate_today_funnel / num_opportunity, num_control_donate_today_funnel / num_control)
)
kable(treatment_table_df, caption = "Donate Today Responses by Treatment Group", align = "c") |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Donate Today Responses by Treatment Group
Treatment	Total	Answered_Donate_Today	Percent_Completed
Obligation	4348	2769	0.6368445
Opportunity	4212	2757	0.6545584
Control	2163	1988	0.9190939

Funnel by treatment group

Below are the funnel statistics for each of the three treatment groups.

full_var_list <- list(data$entry_funnel, data$greeting_coded_funnel, data$pre_consent_coded_funnel, data$consent_coded_funnel, data$not_affil_coded_funnel, data$country_charity_coded_funnel, data$main_cause_coded_funnel, 
data$sub_cause_coded_funnel, data$charity_name_coded_funnel, data$important_smart_funnel, data$important_responsive_funnel, data$important_forwardlooking_funnel,
data$for_me_coded_funnel, data$arm_coded_funnel, data$manipulation_order_coded_funnel, data$donate_today_coded_both_funnel, data$feedback_match_coded_funnel, data$share_with_friend_coded_funnel)
full_var_names <- c("Entry", "Greeting", "Pre-consent", "Consent", "Not Affil", "Country Charity", "Main Cause", "Sub Cause", "Charity Name", "Important Smart", "Important Responsive", "Important Forwardlooking", 
"Is It For Me", "Treatment Arm", "Manipulation Order", "Donate Today", "Feedback Match", "Share With Friend")
obligation <- by_var(data$arm_coded, "obligation", full_var_list, full_var_names)
opportunity <- by_var(data$arm_coded, "opportunity", full_var_list, full_var_names)
control <- by_var(data$arm_coded, "control", full_var_list, full_var_names)

combined_df <- bind_cols(obligation, opportunity[, -1], control[, -1])
kable(combined_df, caption = "Funnel Statistics by Treatment Arm", align = "c", col.names = c("stage", "count", "% total", "% previous", "count", "% total", "% previous", "count", "% total", "% previous")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) |>
  add_header_above(header = c(" " = 1, "Obligation" = 3, "Opportunity" = 3, "Control" = 3))

Funnel Statistics by Treatment Arm
	Obligation			Opportunity			Control
stage	count	% total	% previous	count	% total	% previous	count	% total	% previous
Entry	4348	1	NA	4212	1	NA	2163	1	NA
Greeting	4348	1	1	4212	1	1	2163	1	1
Pre-consent	4347	1	1	4210	1	1	2162	1	1
Consent	4344	0.999	0.999	4210	1	1	2161	0.999	1
Not Affil	4347	1	1.001	4210	1	1	2162	1	1
Country Charity	4348	1	1	4212	1	1	2163	1	1
Main Cause	4348	1	1	4212	1	1	2163	1	1
Sub Cause	4348	1	1	4212	1	1	2163	1	1
Charity Name	4348	1	1	4212	1	1	2163	1	1
Important Smart	4348	1	1	4212	1	1	2163	1	1
Important Responsive	4348	1	1	4212	1	1	2163	1	1
Important Forwardlooking	4348	1	1	4212	1	1	2163	1	1
Is It For Me	4347	1	1	4211	1	1	2163	1	1
Treatment Arm	4348	1	1	4212	1	1	2163	1	1
Manipulation Order	3067	0.705	0.705	3008	0.714	0.714	2163	1	1
Donate Today	2769	0.637	0.903	2758	0.655	0.917	1988	0.919	0.919
Feedback Match	1059	0.244	0.382	1083	0.257	0.393	658	0.304	0.331
Share With Friend	964	0.222	0.91	993	0.236	0.917	602	0.278	0.915

Now we visualize these statistics.

percent_total <- obligation$percent_total
percent_total_no_ad <- opportunity$percent_total
percent_total_control <- control$percent_total
var_names <- rev(full_var_names)

test <- data.frame(
  Category = factor(var_names, levels = var_names),
  Obligation = rev(unlist(percent_total)),
  Opportunity = rev(unlist(percent_total_no_ad)),
  Control = rev(unlist(percent_total_control))
)

ggplot(test, aes(x = Category)) +
  geom_point(aes(y = Obligation, color = "Obligation"), size = 3) +
  geom_line(aes(y = Obligation, group = 1, color = "Obligation"), size = 1) + 
  geom_point(aes(y = Opportunity, color = "Opportunity"), size = 3) +
  geom_line(aes(y = Opportunity, group = 1, color = "Opportunity"), size = 1) +
  geom_point(aes(y = Control, color = "Control"), size = 3) +
  geom_line(aes(y = Control, group = 1, color = "Control"), size = 1) +
  labs(x = "Stages", y = "Percent Total", title = "Percent Total Dropoff Rates") +
  theme_minimal() +
  coord_flip() +
  scale_color_manual(name = "Legend", 
                     values = c("Obligation" = "darkred", "Opportunity" = "blue", "Control" = "green"))

Funnel by US vs Global

Below are the funnel statistics for the US and Global groups.

us <- by_var(data$country_charity_coded, "US", full_var_list, full_var_names)
global <- by_var(data$country_charity_coded, "Global", full_var_list, full_var_names)

combined_df <- bind_cols(us, global[, -1])
kable(combined_df, caption = "Funnel Statistics by Country Charity", align = "c", col.names = c("stage", "count", "% total", "% previous", "count", "% total", "% previous")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) |>
  add_header_above(header = c(" " = 1, "US" = 3, "Global" = 3))

Funnel Statistics by Country Charity
	US			Global
stage	count	% total	% previous	count	% total	% previous
Entry	14418	1	NA	3065	1	NA
Greeting	14408	0.999	0.999	3058	0.998	0.998
Pre-consent	14397	0.999	0.999	3058	0.998	1
Consent	14409	0.999	1.001	3062	0.999	1.001
Not Affil	14404	0.999	1	3057	0.997	0.998
Country Charity	14418	1	1.001	3065	1	1.003
Main Cause	13995	0.971	0.971	2984	0.974	0.974
Sub Cause	13614	0.944	0.973	2901	0.946	0.972
Charity Name	13604	0.944	0.999	2894	0.944	0.998
Important Smart	13377	0.928	0.983	2858	0.932	0.988
Important Responsive	13227	0.917	0.989	2840	0.927	0.994
Important Forwardlooking	13094	0.908	0.99	2817	0.919	0.992
Is It For Me	10156	0.704	0.776	2280	0.744	0.809
Treatment Arm	8743	0.606	0.861	1980	0.646	0.868
Manipulation Order	6735	0.467	0.77	1503	0.49	0.759
Donate Today	6133	0.425	0.911	1383	0.451	0.92
Feedback Match	2190	0.152	0.357	610	0.199	0.441
Share With Friend	2009	0.139	0.917	550	0.179	0.902

Now we visualize these statistics.

percent_total <- us$percent_total
percent_total_global <- global$percent_total
var_names <- rev(full_var_names)

test <- data.frame(
  Category = factor(var_names, levels = var_names),
  Value = rev(unlist(percent_total)),
  Value_Global = rev(unlist(percent_total_global))
)

ggplot(test, aes(x = Category)) +
  geom_point(aes(y = Value, color = "US"), size = 3) +
  geom_line(aes(y = Value, group = 1, color = "US"), size = 1) + 
  geom_point(aes(y = Value_Global, color = "Global"), size = 3) +
  geom_line(aes(y = Value_Global, group = 1, color = "Global"), size = 1) +
  labs(x = "Stages", y = "Percent Total", title = "Percent Total Dropoff Rates") +
  theme_minimal() +
  coord_flip() +
  scale_color_manual(name = "Legend", values = c("US" = "darkred", "Global" = "blue"))

Funnel by source

Below are the funnel statistics for the source (ad or not ad).

ad <- by_var(data$source_coded, 1, full_var_list, full_var_names)
no_ad <- by_var(data$source_coded, 0, full_var_list, full_var_names)

combined_df <- bind_cols(ad, no_ad[, -1])
kable(combined_df, caption = "Funnel Statistics by Source", align = "c", col.names = c("stage", "count", "% total", "% previous", "count", "% total", "% previous")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) |>
  add_header_above(header = c(" " = 1, "Ad" = 3, "No Ad" = 3))

Funnel Statistics by Source
	Ad			No Ad
stage	count	% total	% previous	count	% total	% previous
Entry	45593	1	NA	807	1	NA
Greeting	41268	0.905	0.905	407	0.504	0.504
Pre-consent	28512	0.625	0.691	374	0.463	0.919
Consent	18204	0.399	0.638	307	0.38	0.821
Not Affil	17596	0.386	0.967	309	0.383	1.007
Country Charity	17160	0.376	0.975	306	0.379	0.99
Main Cause	16664	0.365	0.971	298	0.369	0.974
Sub Cause	16208	0.355	0.973	290	0.359	0.973
Charity Name	16208	0.355	1	290	0.359	1
Important Smart	15948	0.35	0.984	289	0.358	0.997
Important Responsive	15782	0.346	0.99	287	0.356	0.993
Important Forwardlooking	15626	0.343	0.99	287	0.356	1
Is It For Me	12173	0.267	0.779	263	0.326	0.916
Treatment Arm	10472	0.23	0.86	251	0.311	0.954
Manipulation Order	8008	0.176	0.765	230	0.285	0.916
Donate Today	7296	0.16	0.911	220	0.273	0.957
Feedback Match	2643	0.058	0.362	157	0.195	0.714
Share With Friend	2414	0.053	0.913	145	0.18	0.924

Now we visualize these statistics.

percent_total <- ad$percent_total
percent_total_no_ad <- no_ad$percent_total
var_names <- rev(full_var_names)

test <- data.frame(
  Category = factor(var_names, levels = var_names),
  Value = rev(unlist(percent_total)),
  Value_No_Ad = rev(unlist(percent_total_no_ad))
)

ggplot(test, aes(x = Category)) +
  geom_point(aes(y = Value, color = "Ad"), size = 3) +
  geom_line(aes(y = Value, group = 1, color = "Ad"), size = 1) + 
  geom_point(aes(y = Value_No_Ad, color = "No Ad"), size = 3) +
  geom_line(aes(y = Value_No_Ad, group = 1, color = "No Ad"), size = 1) +
  labs(x = "Stages", y = "Percent Total", title = "Percent Total Dropoff Rates") +
  theme_minimal() +
  coord_flip() +
  scale_color_manual(name = "Legend", values = c("Ad" = "darkred", "No Ad" = "blue"))

Funnel by week

Below are the funnel statistics for each week of the study. The participants are grouped into one of five weeks. The weeks are defined as follows (inclusive of both ends):

Week 1: 2023-11-28 to 2023-12-04
Week 2: 2023-12-05 to 2023-12-11
Week 3: 2023-12-12 to 2023-12-18
Week 4: 2023-12-19 to 2023-12-25
Week 5: 2023-12-26 to 2023-01-01

data$start_time <- as.POSIXct(data$charitable_intro_start_time, format = "%Y-%m-%d %H:%M:%S")
earliest_date <- as.POSIXct(trunc(min(data$start_time, na.rm = TRUE), "day"))
data$week_number <- as.integer(difftime(data$start_time, earliest_date, units = "weeks")) + 1

week1 <- by_var(data$week_number, 1, full_var_list, full_var_names)
week2 <- by_var(data$week_number, 2, full_var_list, full_var_names)
week3 <- by_var(data$week_number, 3, full_var_list, full_var_names)
week4 <- by_var(data$week_number, 4, full_var_list, full_var_names)
week5 <- by_var(data$week_number, 5, full_var_list, full_var_names)

combined_df <- bind_cols(week1, week2[, -1], week3[, -1], week4[, -1], week5[, -1])
kable(combined_df, caption = "Funnel Statistics by Week", align = "c", col.names = c("stage", "count", "% total", "% previous", "count", "% total", "% previous",  "count", "% total", "% previous", "count", "% total", "% previous",  "count", "% total", "% previous")) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) |>
  add_header_above(header = c(" " = 1, "Week 1" = 3, "Week 2" = 3, "Week 3" = 3, "Week 4" = 3, "Week 5" = 3))

Funnel Statistics by Week
	Week 1			Week 2			Week 3			Week 4			Week 5
stage	count	% total	% previous	count	% total	% previous	count	% total	% previous	count	% total	% previous	count	% total	% previous
Entry	4084	1	NA	5004	1	NA	5554	1	NA	8918	1	NA	13609	1	NA
Greeting	4084	1	1	4988	0.997	0.997	5553	1	1	8917	1	1	13609	1	1
Pre-consent	2487	0.609	0.609	3244	0.648	0.65	3712	0.668	0.668	6113	0.685	0.686	9751	0.717	0.717
Consent	1353	0.331	0.544	1835	0.367	0.566	2303	0.415	0.62	4072	0.457	0.666	6939	0.51	0.712
Not Affil	1289	0.316	0.953	1749	0.35	0.953	2197	0.396	0.954	3931	0.441	0.965	6803	0.5	0.98
Country Charity	1248	0.306	0.968	1693	0.338	0.968	2126	0.383	0.968	3852	0.432	0.98	6672	0.49	0.981
Main Cause	1216	0.298	0.974	1621	0.324	0.957	2053	0.37	0.966	3740	0.419	0.971	6504	0.478	0.975
Sub Cause	1184	0.29	0.974	1584	0.317	0.977	2004	0.361	0.976	3638	0.408	0.973	6307	0.463	0.97
Charity Name	1184	0.29	1	1584	0.317	1	2004	0.361	1	3638	0.408	1	6307	0.463	1
Important Smart	1154	0.283	0.975	1548	0.309	0.977	1960	0.353	0.978	3584	0.402	0.985	6227	0.458	0.987
Important Responsive	1135	0.278	0.984	1529	0.306	0.988	1936	0.349	0.988	3546	0.398	0.989	6179	0.454	0.992
Important Forwardlooking	1120	0.274	0.987	1513	0.302	0.99	1907	0.343	0.985	3514	0.394	0.991	6131	0.451	0.992
Is It For Me	904	0.221	0.807	1204	0.241	0.796	1506	0.271	0.79	2796	0.314	0.796	4614	0.339	0.753
Treatment Arm	810	0.198	0.896	1073	0.214	0.891	1311	0.236	0.871	2456	0.275	0.878	3849	0.283	0.834
Manipulation Order	668	0.164	0.825	873	0.174	0.814	1047	0.189	0.799	1876	0.21	0.764	2785	0.205	0.724
Donate Today	614	0.15	0.919	802	0.16	0.919	966	0.174	0.923	1703	0.191	0.908	2522	0.185	0.906
Feedback Match	178	0.044	0.29	234	0.047	0.292	337	0.061	0.349	639	0.072	0.375	1053	0.077	0.418
Share With Friend	161	0.039	0.904	214	0.043	0.915	307	0.055	0.911	585	0.066	0.915	969	0.071	0.92

Now we visualize these statistics.

percent_total_week1 <- week1$percent_total
percent_total_week2 <- week2$percent_total
percent_total_week3 <- week3$percent_total
percent_total_week4 <- week4$percent_total
percent_total_week5 <- week5$percent_total
var_names <- rev(full_var_names)

test <- data.frame(
  Category = factor(var_names, levels = var_names),
  Week1 = rev(unlist(percent_total_week1)),
  Week2 = rev(unlist(percent_total_week2)),
  Week3 = rev(unlist(percent_total_week3)),
  Week4 = rev(unlist(percent_total_week4)),
  Week5 = rev(unlist(percent_total_week5))
)

ggplot(test, aes(x = Category)) +
  geom_point(aes(y = Week1, color = "Week 1"), size = 3) +
  geom_line(aes(y = Week1, group = 1, color = "Week 1"), size = 1) + 
  geom_point(aes(y = Week2, color = "Week 2"), size = 3) +
  geom_line(aes(y = Week2, group = 1, color = "Week 2"), size = 1) +
  geom_point(aes(y = Week3, color = "Week 3"), size = 3) +
  geom_line(aes(y = Week3, group = 1, color = "Week 3"), size = 1) +
  geom_point(aes(y = Week4, color = "Week 4"), size = 3) +
  geom_line(aes(y = Week4, group = 1, color = "Week 4"), size = 1) +
  geom_point(aes(y = Week5, color = "Week 5"), size = 3) +
  geom_line(aes(y = Week5, group = 1, color = "Week 5"), size = 1) +
  labs(x = "Stages", y = "Percent Total", title = "Percent Total Dropoff Rates") +
  theme_minimal() +
  coord_flip() +
  scale_color_manual(name = "Legend", 
                     values = c("Week 1" = "darkred", "Week 2" = "blue", "Week 3" = "green", "Week 4" = "purple", "Week 5" = "orange"))

FB-Charitable Givings Funnel Statistics

Ethan Bernheim

2024-05-05

Goal

Data

Setting up the data

Summary table helper function

Data preprocessing

Treatment variables

Manipulation variables

Data description

Funnel statistics

Funnel functions

Full funnel statistics

Manipulation order

Treatment group

Funnel by treatment group

Funnel by US vs Global

Funnel by source

Funnel by week