1 Introduction

What’s in this tutorial

In this tutorial, we will walk you through the analysis of pilot data from a Facebook ad campaign to recruit participants to a chatbot survey about vaccine hesitancy. The current goal of the chatbot is to learn about the reasons unvaccinated people have not been vaccinated (i.e., their impediments) and other information that would allow us to segment users for personalized interventions. To do this, we need to be able to cheaply recruit many vaccine hesitant people. It is also important that we recruit different types of vaccine hesitant people, where type is defined by the person’s impediment. From this analysis, you will identify which ads from the pilot experiment are effective for recruiting different types of vaccine hesitant people to complete the chatbot survey.

While this project, like others in the class, will study subgroups (in particular, people with different impediments to being vaccinated), there is an important conceptual difference in what you will be doing in this project. To explain, let’s start with an example using two ads. We want to learn whether one ad, ad \(A\), is better at recruiting than another ad, ad \(B\), for types of vaccine hesitant people. So, we want to learn the treatment effect \(\tau = Y(A) - Y(B)\), where \(Y(\cdot)\) is the number of completed surveys for a given ad. In a typical randomized control trial, we would draw a sample \(N\) from the population of Facebook users, randomly divide the sample in two subsamples of size \(N/2\), and assign one subsample to see ad \(A\) and the other to see ad \(B\). We could then estimate \(\hat{\tau}\) by taking the difference of the sample averages \(\hat{\tau} = \overline{Y}(A) - \overline{Y}(B)\). To understand which ads are best for different types of vaccine hesitant people, we would estimate \(\hat{\tau}\) separately for each type.

Problem statement

Here’s the problem with the above approach in this project: we cannot randomly assign Facebook users to see each ad. Rather, Facebook’s algorithm determines who is shown each ad and one of the factors that determines whether or not a person is shown an ad is how likely Facebook thinks it is that the person will click on the ad. So, the sample of people shown ad \(A\) may have a different composition compared to the sample shown ad \(B\), where composition here is based on impediment.

Think about an example. Let’s say ad \(A\) recruits 50 people who say they haven’t been vaccinated because they think vaccines don’t work and 1 person who says they haven’t been vaccinated because it is unavailable near them. Then, let’s say the opposite for ad \(B\): it recruits 50 people who say the vaccine is unavailable near them and 1 person who says they think vaccines don’t work. If we thought that the sample was randomly assigned, then we would likely conclude that ad \(A\) is more effective at recruiting people who think vaccines don’t work and ad \(B\) is more effective at recruiting people who say the vaccine is unavailable near them. What if you knew, however, that ad \(A\) was shown to 1000 people who think vaccines don’t work and only 5 people who say the vaccine is unavailable near them, and vice versa for ad \(B\). Then, we would say that ad \(A\) recruits \(50/1000 = 5%\) of the “vaccines don’t work” population and \(⅕ = 20%\) of the “vaccines unavailable” population, while ad \(B\) recruits 20% of the “vaccines don’t work” population and 5% of the “vaccines unavailable” population – the exact opposite conclusion.

The core of the problem is that we can’t observe an ad’s rate of recruitment in a subpopulation because we do not observe the composition of the population to which the ad is shown. Instead, we observe an arrival rate for each subpopulation. That means the arrival rate of each subpopulation of interest is an outcome of interest. Other projects in this class are looking for heterogeneous treatment effects, meaning that the difference in one outcome between two treatment groups differs between two subpopulations. In this project, we will look for differences between two treatment groups for multiple outcomes, defined as arrival rates for each subpopulation of interest.

2 Data Structure

Observations: (Facebook) user-level
Funnel outcomes: impressions, clicks, completes, completes for unvaccinated
Vaccine-hesitancy: motivation (1-6), ability (1-3), sub-motivation, sub-ability
Covariates: age, income, education, religion, politics
Ads 0-16, where 0 is did not enter through ad

Load packages:

## Load packages
pacman::p_load(DT, estimatr, kableExtra, readr, reshape2, tidyverse, xtable, dataMaid, ggcorrplot, pollster, tm, RColorBrewer, hrbrthemes, janitor, purrr, gridExtra, cowplot, rcompanion)

Load data:

## V6 data
df_full_v6 <- read_csv("data/Share_Your_Voice_v6_ALP.csv")

# filtering for current wave (main data for this wave)
df_v6 <-
  df_full_v6 %>%
  filter(version == "pilot_6", full_complete == "yes") %>% 
  drop_na(vax_status)

## V5 data
df_full_v5 <- read_csv("data/Share_Your_Voice_v5_ALP.csv")

# filtering for current wave (main data for this wave)
df_v5 <-
  df_full_v5 %>%
  filter(version == "pilot_5", full_complete == "yes") %>% 
  drop_na(vax_status)

## read in facebook ads data
ads_v6 <- 
  read.csv("data/ads_data_v6.csv") %>% 
  rename(original_ref = Ad.ID, ad_name = Ad.name) %>% 
  remove_empty() %>% 
  inner_join(
    read_csv(here::here("data/Pilot_creative_text.csv")) %>% rename(ad_name = Ad),
    by = "ad_name"
  ) %>% 
  relocate(original_ref, ad_name)

ads_v5 <- 
  read.csv("data/ads_data_v5.csv") %>% 
  rename(original_ref = Ad.ID, ad_name = Ad.name) %>% 
  remove_empty() %>% 
  inner_join(
    read_csv(here::here("data/Pilot_creative_text.csv")) %>% rename(ad_name = Ad),
    by = "ad_name"
  ) %>% 
  relocate(original_ref, ad_name)

# combining chatfuel data with ads data. we use `df` in most analyses below. 
df <-
  bind_rows(
    df_v6 %>% left_join(ads_v6, by = "original_ref"),
    df_v5 %>% left_join(ads_v5, by = "original_ref"),
  ) %>% 
  remove_empty()

# combining full datatsets
df_full <- 
  bind_rows(
    df_full_v6 %>% left_join(ads_v6, by = "original_ref"), 
    df_full_v5 %>% left_join(ads_v5, by = "original_ref")
  ) %>% 
  remove_empty()

ads <- bind_rows(ads_v5, ads_v6) %>% remove_empty()

# data formatting functions
form_percent <- function(dec){
  return(paste0(round(dec* 100, 1), "%"))
}

form_cost <- function(num){
  if (num == 0) {
    return(NA_character_)
  } else {
    return(paste0("$", round(cost / num, 3)))
  }
}

rm(ads_v5, ads_v6, df_v5, df_v6, df_full_v5, df_full_v6)

3 Understanding the data

3.1 General cross-tabs

Clean up some variables:

clean_up_demog <- function(df){
  df$gender[!df$gender %in% c("male", "female")] <- "other"
  df$ethnicity[!df$ethnicity %in% c("black or african", "coloured", "white or caucasian", "prefer not to say", "asian or indian")] <- "other"
  df$income[!df$income %in% c("< R5,000", "R5,000 – R9,999", "R10,000 – R29,999", "R30,000 – R49,999", "R50,000 – R99,999", "> R100,000", "prefer not to say")] <- "other"
  df$education[!df$education %in% c("< high school", "high school", "some college", "2-year degree", "4-year degree", "graduate degree", "prefer not to say")] <- "other"
  df$politics[!df$politics %in% c("conservative", "moderate", "liberal", "prefer not to say")] <- "other"
  df$location[!df$location %in% c("urban", "suburban", "rural", "prefer not to say")] <- "other"
  df$religion[grep("christ", tolower(df$religion))] <- "christian"
  df$religion[!df$religion %in% c("christian", "african traditional", "islam", "hinduism", "no religion", "prefer not to say")] <- "other"
  return (df)
}

df <- clean_up_demog(df)

Here’s a summary of relevant demographic variables:

df %>%
  mutate_if(is_character, ~ as_factor(.)) %>%
  select(contains(c("gender", "income", "ethnicity", "education", "religion", "politics", "location"))) %>% 
  clean_names(case = "title") %>%
  papeR::summarize_factor() %>% 
  datatable(options = list(pageLength = 50, columnDefs = list(list(orderable = TRUE, targets = 0))))

Here’s a distribution of respondent age:

df %>%
  mutate(
    age = parse_integer(cv_age),
    age = if_else(age < 0 | age > 120, NA_integer_, age)
  ) %>%
  ggplot(aes(age)) +
  geom_histogram(alpha = 0.9) +
  theme_bw() +
  labs(
    x = "Age", 
    y = "Count",
    subtitle = "Respondent age distribution"
  )

df %>%
  transmute(
    age = parse_integer(cv_age),
    age = if_else(age < 0 | age > 120, NA_integer_, age)
  ) %>% 
  summary()

##       age       
##  Min.   : 0.00  
##  1st Qu.:20.00  
##  Median :25.00  
##  Mean   :27.52  
##  3rd Qu.:32.00  
##  Max.   :79.00  
##  NA's   :43

Let’s move ahead to look at vaccination impediment related variables.

3.2 Vaccination status

We start off with a plot demonstrating the split among vaccinated and unvaccinated participants in this pilot:

df %>% 
  count(vax_status) %>% 
  mutate(percentage = n / sum(n, na.rm = T)) %>% 
  mutate(vax_status_plot = ifelse(vax_status == "vax", "Vaccinated", "Unvaccinated")) %>%
  ggplot(aes(x = vax_status_plot, y = n, fill = vax_status_plot)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  labs(title = "Vaccination Status in pilot", fill = "vaccination status") +
  xlab("Vaccination status") +
  ylab("Count") +
  labs(fill = "Vaccination status")

3.3 Impediments

3.3.1 Overview

We asked participants about whether they have the motivation to get the COVID-19 vaccine and whether they have the ability to get the vaccine for both vaccinated and unvaccinated people. We then fork them into 8 different segments based on the vaccination status, motivation to get the vaccine, and ability to get the vaccine. We obtain the distribution below:

df %>%
  drop_na(vax_status, ability, motive) %>% 
  group_by(vax_status, ability, motive) %>%
  dplyr::summarise(count = n()) %>%
  mutate(`% of total` = paste0(round(count / nrow(df), 2) * 100, "%")) %>%
  arrange(vax_status, desc(count)) %>%
  kable(col.names = 
          c("Vaccination status", "Able to get vaccine", "Have motivation to get vaccine", "Count", "Percentage of total participants"), 
        align = "c",
        caption = "Distribution of forking segments of participants' impediments") %>%
  kable_styling()

Distribution of forking segments of participants’ impediments
Vaccination status	Able to get vaccine	Have motivation to get vaccine	Count	Percentage of total participants
unvax	yes	no	875	24%
unvax	no	no	252	7%
unvax	yes	yes	242	7%
unvax	no	yes	45	1%
vax	yes	yes	1205	34%
vax	yes	no	713	20%
vax	no	yes	139	4%
vax	no	no	121	3%

Takeaways

39% of survey respondents are unvaccinated. Among the unvaccinated, roughly:

62% lack motivation, but not ability to get vaccine
18% lack both motivation and ability to get vaccine
3% lack ability, but don’t lack motivation to get vaccine
17% lack neither

3.3.2 Analysis for each impediment

Let’s investigate each impediment (motivation & ability) in detail, and see the distributions of the reasons why they have such impediments.

Main Level: demonstrates the distribution of the reasons that chosen from the options we provided to them to explain why participants have motivation/ability impediments
Sub-level: analyze the explanations why participants choose the specific reason in the Main level

3.3.2.1 Motivational (Main Level)

We asked: What’s the main reason you don’t want to be vaccinated?

Provided options:

there’s no clear benefit (benefit)
it’s too risky (risk)
against my beliefs (belief)

The distribution of the answers demonstrates below:

motive <- df %>% filter(!is.na(motive_main))

motive_plot <- motive %>% count(motive_main) %>% mutate(percentage = n / nrow(motive)) %>% arrange(-n)

ggplot(motive_plot, aes(x = fct_inorder(motive_main), y = n, fill = motive_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  ggtitle(paste0("Motivational Impediments (n = ", sum(motive_plot$n), ")"),
          subtitle = "What’s the main reason you don’t want to be vaccinated?") +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  labs(x = "Motivational Impediment", y = "count")

3.3.2.2 Motivational (Sub-Level)

motive_sub <- df %>% filter(!is.na(motive_main)) %>%
  select(motive_main, benefit_main, risk_main, belief_main)

# clean up answers to standard responses
motive_sub$benefit_main[!motive_sub$benefit_main %in% c("covid not dangerous", "unlikely to get sick", "i can recover", "had covid already", NA)] <- "other reason(s)"
motive_sub$risk_main[!motive_sub$risk_main %in% c("don’t trust pharma", "bad side effects", "needles/pain", "vaccines don’t work", NA)] <- "other reason(s)"
motive_sub$belief_main[!motive_sub$belief_main %in% c("don’t trust gov", "religious reasons", "freedom to choose", NA)] <- "other reason(s)"


motive_sub_benefit <- motive_sub %>% filter(!is.na(benefit_main)) %>%
  count(benefit_main) %>% mutate(percentage = n / sum(n)) %>% 
  arrange(-n) %>% mutate(benefit_main = fct_inorder(benefit_main))

motive_sub_risk <- motive_sub %>% filter(!is.na(risk_main)) %>%
  count(risk_main) %>% mutate(percentage = n / sum(n)) %>% 
  arrange(-n) %>% mutate(risk_main = fct_inorder(risk_main))

motive_sub_belief <- motive_sub %>% filter(!is.na(belief_main)) %>%
  count(belief_main) %>% mutate(percentage = n / sum(n)) %>% 
  arrange(-n) %>% mutate(belief_main = fct_inorder(belief_main))

benefit_plot <- ggplot(motive_sub_benefit, aes(x = benefit_main, y = n, fill = benefit_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  ggtitle(paste0("Motivational Impediments - Benefit (n = ", sum(motive_sub_benefit$n), ")"),
          subtitle = "Is there a main reason why (there isn't a benefit)?") +
  ylim(0, max(motive_sub_benefit$n) + 5) +
  labs(x = "reasons", y = "count")

risk_plot <- ggplot(motive_sub_risk, aes(x = risk_main, y = n, fill = risk_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  ggtitle(paste0("Motivational Impediments - Risk (n = ", sum(motive_sub_risk$n), ")"),
          subtitle = "Is there a main reason why (there is risk)?") +
  ylim(0, max(motive_sub_risk$n) + 5) +
  labs(x = "reasons", y = "count")

belief_plot <- ggplot(motive_sub_belief, aes(x = belief_main, y = n, fill = belief_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  ggtitle(paste0("Motivational Impediments - Belief (n = ", sum(motive_sub_belief$n), ")"),
          subtitle = "Is there a main reason why (against your belief)?") +
  ylim(0, max(motive_sub_belief$n) + 5) +
  labs(x = "reasons", y = "count")

# plot_grid(benefit_plot, risk_plot, belief_plot, nrow = 3)

3.3.2.2.1 Benefit

We asked: is there a main reason why you think there isn’t a benefit?

Provided options:

covid not dangerous
unlikely to get sick
had covid already
i can recover
other reason(s)

The distribution of the answers demonstrates below:

benefit_plot

3.3.2.2.2 Risk

We asked: is there a main reason why you think there is risk?

Provided options:

don’t trust pharma
vaccines don’t work
bad side effects
needles/pain
other reason(s)

The distribution of the answers demonstrates below:

risk_plot

3.3.2.2.3 Belief

We asked: is there a main reason why against your belief?

Provided options:

don’t trust gov
religious reasons
freedom to choose
other reason(s)

The distribution of the answers demonstrates below:

belief_plot

3.3.2.3 Ability (Main Level)

We asked: What’s the main difficulty of getting vaccinated?

Provided options:

no vax available (availability)
lack of time (time)
lack of money (money)

The distribution of the answers demonstrates below:

ability <- df %>% filter(!is.na(ability_main))

ability_plot <- ability %>% count(ability_main) %>% mutate(percentage = n / sum(n, na.rm = T)) %>% 
  arrange(-n) %>% mutate(ability_main = fct_inorder(ability_main))

ggplot(ability_plot, aes(x = ability_main, y = n, fill = ability_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  ggtitle(paste0("Ability Impediments (n = ", sum(ability_plot$n), ")"),
          subtitle = "What’s the main difficulty of getting vaccinated?") +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  labs(x = "Ability Impediment", y = "count")

3.3.2.4 Ability (Sub-Level)

ability_sub <- df %>% filter(!is.na(ability_main)) %>%
  select(ability_main, availability_main, time_main, money_main)

# clean up answers to standard responses
ability_sub$availability_main[!ability_sub$availability_main %in% c("too far away", "no vaccines left", NA)] <- "other reason(s)"
ability_sub$time_main[!ability_sub$time_main %in% c("no time to research", "getting off work", "childcare", NA)] <- "other reason(s)"
ability_sub$money_main[!ability_sub$money_main %in% c("travel costs", "no insurance", "no cash", NA)] <- "other reason(s)"


ability_sub_availability <- ability_sub %>% filter(!is.na(availability_main)) %>%
  count(availability_main) %>% mutate(percentage = n / sum(n)) %>% 
  arrange(-n) %>% mutate(availability_main = fct_inorder(availability_main))

ability_sub_time <- ability_sub %>% filter(!is.na(time_main)) %>%
  count(time_main) %>% mutate(percentage = n / sum(n)) %>% 
  arrange(-n) %>% mutate(time_main = fct_inorder(time_main))

ability_sub_money <- ability_sub %>% filter(!is.na(money_main)) %>%
  count(money_main) %>% mutate(percentage = n / sum(n)) %>% 
  arrange(-n) %>% mutate(money_main = fct_inorder(money_main))

availability_plot <- ggplot(ability_sub_availability, aes(x = availability_main, y = n, fill = availability_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  ggtitle(paste0("Ability Impediments - Availability (n = ", sum(ability_sub_availability$n), ")"),
          subtitle = "Is there a main reason why (there isn't availability)?") +
  ylim(0, max(ability_sub_availability$n) + 5) +
  labs(x = "reasons", y = "count")

time_plot <- ggplot(ability_sub_time, aes(x = time_main, y = n, fill = time_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  ggtitle(paste0("Ability Impediments - Time (n = ", sum(ability_sub_time$n), ")"),
          subtitle = "Is there a main reason why (there isn't time)?") +
  ylim(0, max(ability_sub_time$n) + 5) +
  labs(x = "reasons", y = "count")


money_plot <- ggplot(ability_sub_money, aes(x = money_main, y = n, fill = money_main)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = paste0(n, " (", round(percentage * 100,2), "%)")), vjust = -0.25) +
  theme_bw() +
  theme(plot.title = element_text(face = "bold"), legend.position = "none") +
  ggtitle(paste0("Ability Impediments - Money (n = ", sum(ability_sub_money$n), ")"),
          subtitle = "Is there a main reason why (there isn't money)?") +
  ylim(0, max(ability_sub_money$n) + 5) +
  labs(x = "reasons", y = "count")


#plot_grid(availability_plot, time_plot, money_plot, nrow = 3)

3.3.2.4.1 Availability

We asked: is there a main reason why there isn’t availability?

Provided options:

too far away
no vaccines left
other reason(s)

availability_plot

3.3.2.4.2 Time

We asked: is there a main reason why there isn’t time?

Provided options:

no time to research
getting off work
childcare
other reason(s)

time_plot

3.3.2.4.3 Money

We asked: is there a main reason why there isn’t money?

Provided options:

travel costs
no insurance
no cost

money_plot

3.4 Demographic Association

The correlation plots here aim to understand what demographic variables and ability impediments are directly correlated with each other. We want to filter out the demographic variables that are highly correlated with other demographic variables and the demographic variables that are not related to the motivation/ability impediment to minimize the number of questions respondents have to answer.

Since the correlation matrices provided by ggcorrplot() shows the correlation coefficients between continuous variables, we mapped binary and ordinal variables to continuous variables. Details on this mapping are provided below:

female: 1 if female, 0 if male
income: 0 if the participant is unemployed, 1 if household income < R5,000, 2 if household income in R5,000 – R9,999, …, 6 if household income > R100,000
education: 1 if the participant’s education < high school, 2 if education is high school, …, 6 if education is a graduate degree
religiosity: 1 if the participant is not very religious, 2 if somewhat religious, 3 if very religious
politics: 1 if the participant is conservative, 2 if moderate, 3 if liberal
location: 1 if the participant lives in rural, 2 if suburban, 3 if urban,
white: 1 if the participant is a white or caucasian, 0 if not
no_motive: 1 if the participant does not have the motive to get vax, 0 otherwise
- no_benefits: 1 if participant believes vaccine has no benefits, 0 otherwise
- against_beliefs:: 1 if participant believes vaccine is against their beliefs, 0 otherwise
- risky:: 1 if participant believes vaccine is risky, 0 otherwise
no_ability: 1 if the participant does not have the ability to get vax, 0 otherwise
- no_time: 1 if participant has no time to get vaxxed, 0 otherwise
- no_money: 1 if participant has no money to get vaxxed, 0 otherwise
- no_availability: 1 if participant faces vax availability issues, 0 otherwise

3.4.1 Correlation Plot: Demographics only

The correlation plot among just demographic variables can be seen below [crossed-out correlations are not statistically significant]:

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    ability = case_when(
      ability == "no" ~ 0,
      ability == "yes" ~ 1,
    ),
    motive = case_when(
      motive == "no" ~ 0,
      motive == "yes" ~ 1,
    ),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("female", "income", "education", "religiosity", "politics", "location", "white", "unvax")

ggcorrplot(cor(df_mc_numeric, use = "pairwise.complete.obs"), type = "lower", lab = TRUE, lab_size = 3, tl.cex = 10, p.mat = cor_pmat(df_mc_numeric)) + 
  ggtitle("Correlation Plot (demographic variables only)")

3.4.2 Correlation Plot: Motivation

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("no_motive", "female", "income", "education", "religiosity", "politics", "location", "white")


mat1 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat1_p <- cor_pmat(df_mc_numeric)[1, 2:8]

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("no_benefits", "female", "income", "education", "religiosity", "politics", "location", "white")


mat2 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat2_p <- cor_pmat(df_mc_numeric)[1, 2:8]

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("against_beliefs", "female", "income", "education", "religiosity", "politics", "location", "white")

mat3 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat3_p <- cor_pmat(df_mc_numeric)[1, 2:8]

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("risky", "female", "income", "education", "religiosity", "politics", "location", "white")

mat4 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat4_p <- cor_pmat(df_mc_numeric)[1, 2:8]

final_mat <-
  tibble(
    mat4, mat3, mat2, mat1
  ) %>% 
  as.matrix()

final_mat_p <-
  tibble(
    mat4_p, mat3_p, mat2_p, mat1_p
  ) %>% 
  as.matrix() 

rownames(final_mat) <- c("female", "income", "education", "religiosity", "politics", "location", "white")
colnames(final_mat) <- c("risky", "against_beliefs", "no_benefits", "no_motive")
colnames(final_mat_p) <- c("risky", "against_beliefs", "no_benefits", "no_motive")

ggcorrplot(final_mat, p.mat = final_mat_p, lab_size = 3, lab = T, tl.cex = 10)

3.4.3 Correlation Plot: Ability

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("no_ability", "female", "income", "education", "religiosity", "politics", "location", "white")


mat5 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat5_p <- cor_pmat(df_mc_numeric)[1, 2:8]

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("no_time", "female", "income", "education", "religiosity", "politics", "location", "white")

mat6 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat6_p <- cor_pmat(df_mc_numeric)[1, 2:8]

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("no_money", "female", "income", "education", "religiosity", "politics", "location", "white")

mat7 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat7_p <- cor_pmat(df_mc_numeric)[1, 2:8]

df_mc_numeric <-
  df %>%
  mutate(
    female = case_when(
      gender == "female" ~ 1,
      gender == "male" ~ 0,
    ),
    income = case_when(
      income == "Unemployed" ~ 0,
      income == "< R5,000" ~ 1,
      income == "R5,000 – R9,999" ~ 2,
      income == "R10,000 – R29,999" ~ 3,
      income == "R30,000 – R49,999" ~ 4,
      income == "R50,000 – R99,999" ~ 5,
      income == "> R100,000" ~ 6,
    ),
    education = case_when(
      education == "< high school" ~ 1,
      education == "high school" ~ 2,
      education == "some college" ~ 3,
      education == "2-year degree" ~ 4,
      education == "4-year degree" ~ 5,
      education == "graduate degree" ~ 6,
    ),
    religiosity = case_when(
      religiosity == "not very religious" ~ 1,
      religiosity == "somewhat religious" ~ 2,
      religiosity == "very religious" ~ 3,
    ),
    politics = case_when(
      politics == "conservative" ~ 1,
      politics == "moderate" ~ 2,
      politics == "liberal" ~ 3,
    ),
    location = case_when(
      location == "rural" ~ 1,
      location == "suburban" ~ 2,
      location == "urban" ~ 3,
    ),
    white = case_when(
      ethnicity == "white or caucasian" ~ 1,
      ethnicity != "white or caucasian" ~ 0
    ),
    no_ability = case_when(
      ability == "no" ~ 1,
      ability == "yes" ~ 0,
    ),
    no_motive = case_when(
      motive == "no" ~ 1,
      motive == "yes" ~ 0,
    ),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
    treat_nchar = nchar(best_treatment_explain),
    unvax = if_else(vax_status == "unvax", 1L, 0L)
  ) %>%
  select("no_availability", "female", "income", "education", "religiosity", "politics", "location", "white")

mat8 <- cor(df_mc_numeric, use = "pairwise.complete.obs")[1, 2:8]
mat8_p <- cor_pmat(df_mc_numeric)[1, 2:8]

final_mat <-
  tibble(
    mat8, mat7, mat6, mat5
  ) %>% 
  as.matrix()

final_mat_p <-
  tibble(
    mat8_p, mat7_p, mat6_p, mat5_p
  ) %>% 
  as.matrix() 

rownames(final_mat) <- c("female", "income", "education", "religiosity", "politics", "location", "white")
colnames(final_mat) <- c("no_availability", "no_money", "no_time", "no_ability")
colnames(final_mat_p) <- c("no_availability", "no_money", "no_time", "no_ability")

ggcorrplot(final_mat, p.mat = final_mat_p, lab_size = 3, lab = T, tl.cex = 10)

4 Funnel outcomes

Details on each of the 15 ads used can be found here.

Setup:

1 campaign
5 ad sets, 3 ads in each ad set
15 ads split into:
- 3 impediment themes (3 inaccessible, 6 risky, 6 unnecessary)
- 2 different prompts (6 control and 9 airtime)
- 9 different images (Images 1-6 used twice, images 7-9 used once)

Metrics explanation:

Impressions (Total Count) = the total number of times our ad has been viewed
Clickthrough (%) = #clicks / #impressions
Messages Sent (%) = #conversations / #clicks
Consent Obtained (%) = #consents / #conversations
Core Survey Complete (%) = #forking section completed / #consents
Treatment Complete (%) = #treatment section completed / #forking section completed
Demo Questions Complete (%) = #demog section completed / #treatment section completed
Full Survey Complete (%) = #full chat completed / #demog section completed
Total characters elicited per completed survey (treatment) = average #character in best treatment explanation per full chat completed
Avg characters elicited per completed survey (impediment explanations) = average #character in impediment explanations per full chat completed
Cost per Impression = amount spent / #impressions (in USD)
Cost per Link Click = amount spent / #clicks (in USD)
Cost per Survey Complete (All participants) = amount spent / #full chat completed (in USD)
Cost per Survey Complete (Unvax) = amount spent / #full chat completed with unvaccinated participants (in USD)
Cost per Survey Complete (Unvax, Open to Treatment) = amount spent / #full chat completed with unvaccinated and open to treatment participants (in USD)

Let’s first look at overall costs.

4.1 Costs - Overall

# overall ad cost for full pilot
df_temp <- df_full
ads_latest <- ads

df_ads_unvax_current <- df_temp %>% filter(vax_status == "unvax")
df_ads_unvax_open_current <- df_ads_unvax_current %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_latest$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_latest$Impressions, na.rm = T)
clicks <- sum(ads_latest$Link.clicks, na.rm = T)
conversations <- sum(ads_latest$Results, na.rm = T)

consents <- sum(df_temp$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_temp$main_complete), na.rm = T)
treatment_complete <- sum(df_temp$treatment_complete == "yes", na.rm = T)
demog_complete <- sum(!is.na(df_temp$demog_complete), na.rm = T)
full_complete <- sum(df_temp$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax_current$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open_current$full_complete == "yes", na.rm = T)


dropoff_current <- data.frame(
  metric = c("Impressions", "Link Clicks", "Messages Sent", 
             "Consent Obtained", "Core Survey Complete", "Treatment complete",
             "Demographic Questions Complete", "Full Survey Complete", 
             "Full Survey Complete (Unvax)", "Full Survey Complete (Unvax, Open to Treatment)"), 
  
  total = c(impressions, clicks, conversations, consents, 
            core_complete, treatment_complete, demog_complete, full_complete,
            full_complete_unvax, full_complete_unvax_open),
  
  perc_of_prev = c("-", form_percent(clicks / impressions),
                   form_percent(conversations / clicks),
                   form_percent(consents / conversations),
                   form_percent(core_complete / consents),
                   form_percent(treatment_complete/core_complete),
                   form_percent(demog_complete / treatment_complete),
                   form_percent(full_complete / demog_complete),
                   form_percent(full_complete_unvax / full_complete),
                   form_percent(full_complete_unvax_open / full_complete_unvax)),
  
  cost_per = c(form_cost(impressions), form_cost(clicks),
               form_cost(conversations), form_cost(consents),
               form_cost(core_complete), form_cost(treatment_complete),
               form_cost(demog_complete), form_cost(full_complete),
               form_cost(full_complete_unvax), form_cost(full_complete_unvax_open)))


dropoff_current %>% 
  rename(
    Metric = metric,
    `Total` = total,
    `% of Previous Funnel` = perc_of_prev,
    `$ Cost per` = cost_per
  ) %>%
  datatable()

Let’s now break things down by ad type. The following analyses show 3 tables using different combinations of 15 distinct ads:

split by 3 ad impediment themes
split by 2 ad body text approaches
split by 9 ad images

4.2 Costs: By Ad-type

4.2.1 Unnecessary vs Risky vs Inaccessible ads

This table compared three Ad impediment sources (vaccine is unnecessary vs vaccine is risky vs vaccine is inaccessible) in terms of the metrics described above.

ads_specific <- ads %>% filter(`Analysis 3 - impediment theme` == "unnecessary")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_1 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_1) <- c("Metric of Interest", "Unnecessary (6)")

ads_specific <- ads %>% filter(`Analysis 3 - impediment theme` == "risky")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_2 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_2) <- c("Metric of Interest", "Risky (6)")

ads_specific <- ads %>% filter(`Analysis 3 - impediment theme` == "inaccessible")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_3 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_3) <- c("Metric of Interest", "Inaccessible (3)")

## MERGE dfs ##
dropoff_1 %>% 
  full_join(dropoff_2, by = "Metric of Interest") %>%
  full_join(dropoff_3, by = "Metric of Interest") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.2.2 Control vs Airtime

This table compared two Ad body text approaches - control (share your opinion) vs (take a short survey and earn airtime) - in terms of the metrics described above.

ads_specific <- ads %>% filter(`Analysis 4 - body text` == "control")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_1 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_1) <- c("Metric of Interest", "Control (6)")

ads_specific <- ads %>% filter(`Analysis 4 - body text` == "airtime")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_2 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_2) <- c("Metric of Interest", "Airtime (9)")

## MERGE dfs ##
dropoff_1 %>% 
  full_join(dropoff_2, by = "Metric of Interest") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.2.3 Image-wise Split

This table compared nine images in terms of the metrics described above.

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image1")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_1 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_1) <- c("Metric of Interest", "Image 1 (2)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image2")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_2 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_2) <- c("Metric of Interest", "Image 2 (2)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image3")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_3 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_3) <- c("Metric of Interest", "Image 3 (2)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image4")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_4 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_4) <- c("Metric of Interest", "Image 4 (2)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image5")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_5 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_5) <- c("Metric of Interest", "Image 5 (2)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image6")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_6 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_6) <- c("Metric of Interest", "Image 6 (2)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image7")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_7 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_7) <- c("Metric of Interest", "Image 7 (1)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image8")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_8 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_8) <- c("Metric of Interest", "Image 8 (1)")

ads_specific <- ads %>% filter(`Analysis 2 - image` == "image9")

df_ads <- df_full %>% filter(original_ref %in% (ads_specific %>% pull(original_ref)))
df_ads_unvax <- df_ads %>% filter(vax_status == "unvax")
df_ads_unvax_open <- df_ads_unvax %>% filter(vax_future %in% c("maybe", "of course!"))

cost <- sum(ads_specific$Amount.spent..USD., na.rm = T)
impressions <- sum(ads_specific$Impressions, na.rm = T)
clicks <- sum(ads_specific$Link.clicks, na.rm = T)
conversations <- sum(ads_specific$Results, na.rm = T)

consents <- sum(df_ads$consent == "yes", na.rm = T)
core_complete <- sum(!is.na(df_ads$main_complete), na.rm = T)
treat_complete <- sum(!is.na(df_ads$treatment_complete), na.rm = T)
demog_complete <- sum(!is.na(df_ads$demog_complete), na.rm = T)
full_complete <- sum(df_ads$full_complete == "yes", na.rm = T)
full_complete_unvax <- sum(df_ads_unvax$full_complete == "yes", na.rm = T)
full_complete_unvax_open <- sum(df_ads_unvax_open$full_complete == "yes", na.rm = T)

treat_nchar <-
  df_ads %>% 
  transmute(
    treat_nchar = nchar(best_treatment_explain)
  ) %>% 
  summarize_all(sum, na.rm = T) %>% 
  pull(treat_nchar)

free_text_vars <- df_ads %>% select(ends_with("_explain")) %>% colnames()
free_text_vars <- free_text_vars[!free_text_vars %in% c('best_treatment_explain')]
mean_nchar <-
  df_ads %>%
  transmute_at(
    vars(free_text_vars),
    ~ nchar(.)
  ) %>%
  summarize_all(sum, na.rm = T) %>% 
  mutate(
    mean_nchar =
      pmap_dbl(
        select(., ends_with("_explain")),
        ~ mean(c(...), na.rm = TRUE)
      )
  ) %>% 
  pull(mean_nchar) %>% 
  round(2)

dropoff_9 <- tibble(
  metric = c(
    "Impressions (Total Count)", 
    "Clickthrough (%)",
    "Messages Sent (%)",
    "Consent Obtained (%)", 
    "Core Survey Complete (%)", 
    "Treatment Complete (%)",
    "Demo Questions Complete (%)",
    "Full Survey Complete (%)",
    "Total characters elicited per completed survey (treatment)",
    "Avg characters elicited per completed survey (impediment explanations)",
    "Cost per Impression",
    "Cost per Link Click",
    "Cost per Survey Complete (All participants)",
    "Cost per Survey Complete (Unvax)",
    "Cost per Survey Complete (Unvax, Open to Treatment)"
  ),
  ad = c(
    impressions, 
    form_percent(clicks / impressions),
    form_percent(conversations / clicks),
    form_percent(consents / conversations),
    form_percent(core_complete / consents),
    form_percent(treat_complete / core_complete),
    form_percent(demog_complete / treat_complete),
    form_percent(full_complete / demog_complete),
    round(treat_nchar/full_complete, 2),
    round(mean_nchar/full_complete, 2),
    form_cost(impressions), 
    form_cost(clicks),
    form_cost(full_complete),
    form_cost(full_complete_unvax),
    form_cost(full_complete_unvax_open)
  )
)

colnames(dropoff_9) <- c("Metric of Interest", "Image 9 (1)")

## MERGE dfs ##
dropoff_1 %>% 
  full_join(dropoff_2, by = "Metric of Interest") %>%
  full_join(dropoff_3, by = "Metric of Interest") %>%
  full_join(dropoff_4, by = "Metric of Interest") %>%
  full_join(dropoff_5, by = "Metric of Interest") %>%
  full_join(dropoff_6, by = "Metric of Interest") %>%
  full_join(dropoff_7, by = "Metric of Interest") %>%
  full_join(dropoff_8, by = "Metric of Interest") %>%
  full_join(dropoff_9, by = "Metric of Interest") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

Let’s narrow our focus and now look at the costs per completed survey for every ad for only unvaccinated respondents.

4.3 Costs per completion: By Ad and Impediment

4.3.1 By overall impediments

First, let’s break these down for those with:

motivation impediments
ability impediments

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    no_motive = case_when(
      motive == "no" ~ 1L,
      motive == "yes" ~ 0L,
    ),
    no_ability = case_when(
      ability == "no" ~ 1L,
      ability == "yes" ~ 0L,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    no_motive = round(cost/sum(no_motive, na.rm = T), 3) %>% str_c("$", .),
    no_ability = round(cost/sum(no_ability, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.3.2 By motivation sub-impediments

Costs per completed survey broken down by ad for those with motivation sub-impediments:

No benefits
Risky
Against beliefs

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    no_motive = case_when(
      motive == "no" ~ 1L,
      motive == "yes" ~ 0L,
    ),
    no_ability = case_when(
      ability == "no" ~ 1L,
      ability == "yes" ~ 0L,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    no_benefits = round(cost/sum(no_benefits, na.rm = T), 3) %>% str_c("$", .),
    risky = round(cost/sum(risky, na.rm = T), 3) %>% str_c("$", .),
    against_beliefs = round(cost/sum(against_beliefs, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.3.3 By ability sub-impediments

Costs per completed survey broken down by ad for those with ability sub-impediments:

No time
No money
No availability

Note: Blank cells represent no respondents for that ad-impediment combination.

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    no_motive = case_when(
      motive == "no" ~ 1L,
      motive == "yes" ~ 0L,
    ),
    no_ability = case_when(
      ability == "no" ~ 1L,
      ability == "yes" ~ 0L,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    no_time = round(cost/sum(no_time, na.rm = T), 3) %>% str_c("$", .),
    no_money = round(cost/sum(no_money, na.rm = T), 3) %>% str_c("$", .),
    no_availability = round(cost/sum(no_availability, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.4 Costs per completion: By ad and demographic

4.4.1 Gender

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    female = (gender == "female"),
    male = (gender == "male"),
  ) %>% 
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    females = round(cost/sum(female, na.rm = T), 3) %>% str_c("$", .),
    males = round(cost/sum(male, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.4.2 Education

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    high_school_or_less = (education == "< high school" | education == "high school"),
    more_than_high_school = !(education == "< high school" | education == "high school" | education == "prefer not to say" | education == "other"),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    high_school_or_less = round(cost/sum(high_school_or_less, na.rm = T), 3) %>% str_c("$", .),
    more_than_high_school = round(cost/sum(more_than_high_school, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.4.3 Income

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    less_than_5000_rand = (income == "< R5,000"),
    more_than_5000_rand = !(income == "< R5,000" | income == "prefer not to say" | income == "other"),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    less_than_5000_rand = round(cost/sum(less_than_5000_rand, na.rm = T), 3) %>% str_c("$", .),
    more_than_5000_rand = round(cost/sum(more_than_5000_rand, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.4.4 Religion

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    christian = str_detect(religion, "christ"),
    not_christian = !(str_detect(religion, "christ")),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    christian = round(cost/sum(christian, na.rm = T), 3) %>% str_c("$", .),
    not_christian = round(cost/sum(not_christian, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.4.5 Location

df %>%
  filter(vax_status == "unvax") %>%
  mutate(
    rural = (location == "rural"),
    urban = (location == "urban"),
    suburban = (location == "suburban"),
  ) %>%
  inner_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(cost = sum(Amount.spent..USD., na.rm = T)),
    by = "ad_name"
  ) %>% 
  group_by(ad_name) %>%
  summarise(
    cost = mean(cost, na.rm = T),
    rural = round(cost/sum(rural, na.rm = T), 3) %>% str_c("$", .),
    suburban = round(cost/sum(suburban, na.rm = T), 3) %>% str_c("$", .),
    urban = round(cost/sum(urban, na.rm = T), 3) %>% str_c("$", .),
  ) %>%
  mutate_all(~ na_if(., "$Inf")) %>%
  select(-cost) %>%
  clean_names(case = "title") %>%
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

4.5 Impediments by ad category clicked

4.5.1 Motivational

df %>%
  drop_na(original_ref, motive_main, `Analysis 3 - impediment theme`) %>%
  count(`Analysis 3 - impediment theme`, motive_main) %>%
  mutate(
    motive_main = case_when(
      motive_main == "belief" ~ "Against beliefs",
      motive_main == "benefit" ~ "No benefits",
      motive_main == "risk" ~ "Risky",
      motive_main == "other" ~ "Other",
    ),
    `Analysis 3 - impediment theme` = `Analysis 3 - impediment theme` %>% str_to_sentence()
  ) %>%
  arrange(`Analysis 3 - impediment theme`, n) %>%
  mutate(motive_main = fct_inorder(motive_main)) %>%
  ggplot(aes(`Analysis 3 - impediment theme`, n, fill = motive_main)) +
  geom_col(position = "dodge", alpha = 0.8, color="black") +
  theme_bw() +
  theme(legend.position = "bottom") +
  labs(
    x = "Category of ad clicked",
    y = "Count",
    fill = "Motivational impediment",
    title = "Distribution of motivational impediments by ad category"
  )

4.5.2 Ability

df %>%
  drop_na(original_ref, ability_main, `Analysis 3 - impediment theme`) %>%
  count(`Analysis 3 - impediment theme`, ability_main) %>%
  mutate(
    ability_main = case_when(
      ability_main == "time" ~ "No time",
      ability_main == "money" ~ "No money",
      ability_main == "availability" ~ "No availability",
      ability_main == "other" ~ "Other",
    ),
    `Analysis 3 - impediment theme` = `Analysis 3 - impediment theme` %>% str_to_sentence()
  ) %>%
  arrange(`Analysis 3 - impediment theme`, n) %>%
  mutate(ability_main = fct_inorder(ability_main)) %>%
  ggplot(aes(`Analysis 3 - impediment theme`, n, fill = ability_main)) +
  geom_col(position = "dodge", alpha = 0.8, color="black") +
  theme_bw() +
  theme(legend.position = "bottom") +
  labs(
    x = "Category of ad clicked",
    y = "Count",
    fill = "Ability impediment",
    title = "Distribution of ability impediments by ad category"
  )

5 ATE

We consider 3 proxy measures to assess the average treatment effect of each ad (as well as an option for respondents who did not reach the survey through ads):

P(click|impression): Probability of clicking on ad conditional on viewing it.
P(consent|click): Probability of consenting to the survey conditional on clicking the ad.
P(complete|consent): Probability of completing the survey conditional on consenting to it.

The ad-wise performance on these metrics can be seen below:

df_full %>% 
  filter(vax_status == "unvax") %>%
  group_by(ad_name) %>%
  summarise(
    consents = sum(consent == "yes", na.rm = T),
    full_complete = sum(full_complete == "yes", na.rm = T)
  ) %>%
  mutate(ad_name = if_else(is.na(ad_name), "No ad", ad_name)) %>% 
  left_join(
    ads %>% 
      group_by(ad_name) %>% 
      summarise(
        cost = sum(Amount.spent..USD., na.rm = T),
        impressions = sum(Impressions, na.rm = T),
        clicks = sum(Link.clicks, na.rm = T)
      ),
    by = "ad_name"
  ) %>% 
  transmute(
    `Ad Name` = ad_name,
    `P(click|impression)` = round(clicks/impressions, digits = 3),
    `P(consent|click)` = round(consents/clicks, digits = 3),
    `P(complete|consent)` = round(full_complete/consents, digits = 3),
  ) %>% 
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

We see that:

pilot_v5_ad5_image7 does best if our metric is P(click|impression)
pilot_v5_ad1_image1 does best if our metric is P(consent|click)
pilot_v5_ad3_image4 does best if our metric is P(complete|consent), excluding respondents not reaching the survey through our ads.

6 HTE subgroup analysis

We consider P(complete|consent) for subgroup analyses because we lack respondent-level information for ad impressions and ad clicks.

6.1 HTE by Vaccination Status

clean_up_demog(df_full) %>%
  drop_na(vax_status) %>% 
  group_by(ad_name, vax_status) %>%
  summarise(
    consents = sum(consent == "yes", na.rm = T),
    full_complete = sum(full_complete == "yes", na.rm = T)
  ) %>% 
  mutate(ad_name = if_else(is.na(ad_name), "No ad", ad_name)) %>%
  pivot_wider(names_from = vax_status, values_from = c(consents, full_complete)) %>% 
  ungroup() %>% 
  transmute(
    `Ad Name` = ad_name,
    `P(complete|consent) - Unvax` = round(full_complete_unvax/consents_vax, digits = 3),
    `P(complete|consent) - Vax` = round(full_complete_vax/consents_vax, digits = 3),
  ) %>% 
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

6.2 HTE by Impediments [Unvaccinated respondents]

df_full %>%
  filter(vax_status == "unvax") %>%
  mutate(
    no_motive = case_when(
      motive == "no" ~ 1L,
      motive == "yes" ~ 0L,
    ),
    no_ability = case_when(
      ability == "no" ~ 1L,
      ability == "yes" ~ 0L,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
  ) %>%
  drop_na(no_motive, no_ability) %>% 
  group_by(ad_name, no_motive, no_ability) %>%
  summarise(
    consents = sum(consent == "yes", na.rm = T),
    full_complete = sum(full_complete == "yes", na.rm = T)
  ) %>% 
  ungroup() %>% 
  mutate(
    ad_name = if_else(is.na(ad_name), "No ad", ad_name),
    prob = round(full_complete/consents, 3)
  ) %>%
  select(-c(consents, full_complete)) %>% 
  pivot_wider(names_from = c(no_motive, no_ability), values_from = prob) %>% 
  select(
    `Ad Name` = ad_name,
    `P(complete|consent) - No impediment` = `0_0`,
    `P(complete|consent) - No motive` = `1_0`,
    `P(complete|consent) - No ability` = `0_1`,
    `P(complete|consent) - Both impediments` = `1_1`,
  ) %>% 
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

6.3 HTE by Sub-motivation impediments [Unvaccinated respondents]

df_full %>%
  filter(vax_status == "unvax") %>%
  mutate(
    no_motive = case_when(
      motive == "no" ~ 1L,
      motive == "yes" ~ 0L,
    ),
    no_ability = case_when(
      ability == "no" ~ 1L,
      ability == "yes" ~ 0L,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
  ) %>%
  drop_na(against_beliefs, no_benefits, risky) %>% 
  group_by(ad_name, against_beliefs, no_benefits, risky) %>%
  summarise(
    consents = sum(consent == "yes", na.rm = T),
    full_complete = sum(full_complete == "yes", na.rm = T)
  ) %>% 
  ungroup() %>% 
  mutate(
    ad_name = if_else(is.na(ad_name), "No ad", ad_name),
    prob = round(full_complete/consents, 3)
  ) %>%
  select(-c(consents, full_complete)) %>% 
  pivot_wider(names_from = c(against_beliefs, no_benefits, risky), values_from = prob) %>% 
  select(
    `Ad Name` = ad_name,
    `P(complete|consent) - Against beliefs` = `1_0_0`,
    `P(complete|consent) - No benefits` = `0_1_0`,
    `P(complete|consent) - Risky` = `0_0_1`,
  ) %>% 
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

6.4 HTE by Sub-ability impediments [Unvaccinated respondents]

df_full %>%
  filter(vax_status == "unvax") %>%
  mutate(
    no_motive = case_when(
      motive == "no" ~ 1L,
      motive == "yes" ~ 0L,
    ),
    no_ability = case_when(
      ability == "no" ~ 1L,
      ability == "yes" ~ 0L,
    ),
    against_beliefs = if_else(motive_main == "belief", 1L, 0L),
    no_benefits = if_else(motive_main == "benefit", 1L, 0L),
    risky = if_else(motive_main == "risk", 1L, 0L),
    no_time = if_else(ability_main == "time", 1L, 0L),
    no_money = if_else(ability_main == "money", 1L, 0L),
    no_availability = if_else(ability_main == "availability", 1L, 0L),
  ) %>%
  drop_na(no_time, no_money, no_availability) %>% 
  group_by(ad_name, no_time, no_money, no_availability) %>%
  summarise(
    consents = sum(consent == "yes", na.rm = T),
    full_complete = sum(full_complete == "yes", na.rm = T)
  ) %>% 
  ungroup() %>% 
  mutate(
    ad_name = if_else(is.na(ad_name), "No ad", ad_name),
    prob = round(full_complete/consents, 3)
  ) %>%
  select(-c(consents, full_complete)) %>% 
  pivot_wider(names_from = c(no_time, no_money, no_availability), values_from = prob) %>% 
  select(
    `Ad Name` = ad_name,
    `P(complete|consent) - No time` = `1_0_0`,
    `P(complete|consent) - No money` = `0_1_0`,
    `P(complete|consent) - No availability` = `0_0_1`,
  ) %>% 
  datatable(options = list(pageLength = 25, columnDefs = list(list(orderable = TRUE, targets = 0))))

7 HTE Optimal Policy

Comparing best overall ad to best ad by subgroup

8 Power Calculations

Consider different outcomes (CTR, clicks/impression, completes/impressions)

9 Student To-dos

Students will:

run power calculations to determine total number of ads to run.
identify ads that are effectively attracting particular subgroups to reuse those ads.
identify subgroups that ads are not effectively attracting.
develop text for new ads to attract those subgroups.
ALP team then executes experiment with 6 chosen ads.
conduct analyses again to see if new ads effective and old ads continue to be effective.

ALP 308: Vaccine Hesitancy Tutorial

Updated: April 10, 2022

1 Introduction

2 Data Structure

3 Understanding the data

3.1 General cross-tabs

3.2 Vaccination status

3.3 Impediments

3.3.1 Overview

3.3.2 Analysis for each impediment

3.3.2.1 Motivational (Main Level)

3.3.2.2 Motivational (Sub-Level)

3.3.2.2.1 Benefit

3.3.2.2.2 Risk

3.3.2.2.3 Belief

3.3.2.3 Ability (Main Level)

3.3.2.4 Ability (Sub-Level)

3.3.2.4.1 Availability

3.3.2.4.2 Time

3.3.2.4.3 Money

3.4 Demographic Association

3.4.1 Correlation Plot: Demographics only

3.4.2 Correlation Plot: Motivation

3.4.3 Correlation Plot: Ability

4 Funnel outcomes

4.1 Costs - Overall

4.2 Costs: By Ad-type

4.2.1 Unnecessary vs Risky vs Inaccessible ads

4.2.2 Control vs Airtime

4.2.3 Image-wise Split

4.3 Costs per completion: By Ad and Impediment

4.3.1 By overall impediments

4.3.2 By motivation sub-impediments

4.3.3 By ability sub-impediments

4.4 Costs per completion: By ad and demographic

4.4.1 Gender

4.4.2 Education

4.4.3 Income

4.4.4 Religion

4.4.5 Location

4.5 Impediments by ad category clicked

4.5.1 Motivational

4.5.2 Ability

5 ATE

6 HTE subgroup analysis

6.1 HTE by Vaccination Status

6.2 HTE by Impediments [Unvaccinated respondents]

6.3 HTE by Sub-motivation impediments [Unvaccinated respondents]

6.4 HTE by Sub-ability impediments [Unvaccinated respondents]

7 HTE Optimal Policy

8 Power Calculations

9 Student To-dos