Goal

This document provides comprehensive summary statistics for the inoculation-against-misinformation project, superseding analyses conducted before April 2024.

Experimental Design

The goal of this experiment is to study the effect of different text message courses on misinformation sharing and discernment. The figure below diagrams the experimental design. We recruited participants through Facebook ads to complete a five-day text message course, plus a pre- and post-survey, in exchange for a mobile airtime payment. Participants who clicked an ad were directed to the pre-survey on Qualtrics, where they were first randomized on whether they would see an accuracy nudge in the pre- and post-survey. Conditional on completing the pre-survey, participants were randomized into one of the text message course interventions and enrolled in the course. Participants in all course interventions except the No-course baseline received one text message a day for five days, starting on the day they completed the pre-survey. Participants in the No-course baseline received the Combo course after the post-survey, so that we still fulfilled our recruitment promise of a text message course. On the last day of the course, participants received a link to the post-survey on Qualtrics; those who completed it were paid KSH 500 (about USD 4) in mobile airtime. Seven to eleven weeks later, post-survey completers were randomized on whether the text message recruiting them to a follow-up survey contained a prime. Participants who completed the follow-up survey were paid KSH 350 (about USD 3).

knitr::include_graphics("./figures/Experimental Design.png")
Experimental Design

Experimental Design
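
As a compact reference for the cross-randomization in the figure, the sketch below enumerates the design cells. The course and accuracy-nudge labels match the output tables later in this document; the prime arm labels are placeholders, since this section never names them.

# a minimal sketch of the cross-randomized cells; prime labels are assumed
library(tidyr)

arms <- crossing(
  accuracy  = c("Accuracy After", "Accuracy Inter"),      # accuracy nudge placement
  treatment = c("Combo", "Emotions", "Reasoning",
                "Facts Baseline", "No-course Baseline"),  # course interventions
  prime     = c("Prime", "No Prime")                      # follow-up recruiting text (assumed labels)
)
nrow(arms) # 2 x 5 x 2 = 20 cells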

The pre- and post-surveys each contained 10 posts: 6 misinfo posts, 3 non-misinfo posts, and 1 attention check post.

These posts were created by taking 15 facts/domains and writing, for each, a non-misinfo version and 3 misinfo versions (emotions, reasoning, combo), for a total of 60 posts (plus the 2 attention check posts). Participants saw each of the 15 facts at least once.

Whichever 3 facts/domains a participant saw in non-misinfo form in the pre-survey, they saw in misinfo form in the post-survey, so these 3 facts/domains were repeated. Which facts/domains were repeated varied by participant.

For each post, participants were asked 2 questions: whether they would share the post, and how accurate they judged it to be.
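
The counting above can be verified with a few lines of arithmetic (a quick sketch, not project code):

# consistency check of the post counts described above
n_facts <- 15
n_versions <- 4                       # non-misinfo + emotions + reasoning + combo
n_facts * n_versions                  # 60 posts, plus 2 attention check posts
posts_per_survey <- 6 + 3             # 6 misinfo + 3 non-misinfo (attention check excluded)
2 * posts_per_survey                  # 18 scored user-post observations per participant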

Data

In this script, we use two datasets.

intermediate_data_wide.csv: all data at the user level. Includes treatment conditions, covariates, and individual-aggregated outcomes. Sample of 8,684 participants who completed the pre- and post-survey.

intermediate_data_long.csv: all data at the user-post level. Includes treatment conditions, covariates, and post-level outcomes. Same sample of 8,684 participants who completed the pre- and post-survey, with 18 user-post rows per participant (follow-up rows are dropped in this script).

For more information about how these datasets are generated, and for a data dictionary, see the memo.

Samples

This script primarily uses the sample of 8,684 participants who completed the post-survey.

For analyses involving the outcome referred to as the “Primary Misinfo Sharing Rate,” we drop participants who did not share any non-misinfo posts in the pre-survey. This leaves a sample of 7,688 post-survey completers who intended to share at least 1 non-misinfo post in the pre-survey.
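
A minimal sketch of this restriction, assuming base_rate_pre (defined in the variable lists below) holds the pre-survey non-misinfo sharing rate:

# drop participants who shared no non-misinfo posts in the pre-survey
# (a sketch; assumes base_rate_pre is the pre non-misinfo sharing rate)
df_primary <- df_wide |> filter(base_rate_pre > 0)
nrow(df_primary) # should equal 7,688 if that assumption holds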

Outcomes and Covariates

All Outcomes

| Research Question: Do the treatment interventions… | Outcome Measures | Estimands | Notes |
|---|---|---|---|
| Sharing Outcomes | | | |
| change misinfo sharing? | primary misinfo sharing rate | See equation below. | Defined only in post. |
| | count of misinfo posts shared \(Y^M\) | \(\frac{E[Y^{M, post}(1)]}{6} - \frac{E[Y^{M, post}(0)]}{6}\), \(\left(\frac{E[Y^{M, post}(1)]}{6} - \frac{E[Y^{M, pre}(1)]}{6}\right) - \left(\frac{E[Y^{M, post}(0)]}{6} - \frac{E[Y^{M, pre}(0)]}{6}\right)\) | post and post-pre misinfo sharing rate |
| change non-misinfo sharing? | count of non-misinfo posts shared \(Y^N\) | \(\frac{E[Y^{N, post}(1)]}{3} - \frac{E[Y^{N, post}(0)]}{3}\), \(\left(\frac{E[Y^{N, post}(1)]}{3} - \frac{E[Y^{N, pre}(1)]}{3}\right) - \left(\frac{E[Y^{N, post}(0)]}{3} - \frac{E[Y^{N, pre}(0)]}{3}\right)\) | post and post-pre non-misinfo sharing rate |
| change sharing discernment? | count of misinfo posts shared \(Y^M\), count of non-misinfo posts shared \(Y^N\) | \(\left(\frac{E[Y^{N, post}(1)]}{3} - \frac{E[Y^{M, post}(1)]}{6}\right) - \left(\frac{E[Y^{N, post}(0)]}{3} - \frac{E[Y^{M, post}(0)]}{6}\right)\) | post sharing discernment |
| Accuracy Outcomes | | | |
| change accuracy scoring of misinfo posts? | total score on misinfo posts \(A^{M_1}\) | \(\frac{E[A^{M_1, post}(1)]}{6} - \frac{E[A^{M_1, post}(0)]}{6}\), \(\left(\frac{E[A^{M_1, post}(1)]}{6} - \frac{E[A^{M_1, pre}(1)]}{6}\right) - \left(\frac{E[A^{M_1, post}(0)]}{6} - \frac{E[A^{M_1, pre}(0)]}{6}\right)\) | post and post-pre misinfo accuracy score |
| | count of misinfo posts with accuracy score > 0 \(A^{M_2}\) | \(\frac{E[A^{M_2, post}(1)]}{6} - \frac{E[A^{M_2, post}(0)]}{6}\), \(\left(\frac{E[A^{M_2, post}(1)]}{6} - \frac{E[A^{M_2, pre}(1)]}{6}\right) - \left(\frac{E[A^{M_2, post}(0)]}{6} - \frac{E[A^{M_2, pre}(0)]}{6}\right)\) | post and post-pre misinfo binarized accuracy score |
| change accuracy scoring of non-misinfo posts? | total score on non-misinfo posts \(A^{N_1}\) | \(\frac{E[A^{N_1, post}(1)]}{3} - \frac{E[A^{N_1, post}(0)]}{3}\), \(\left(\frac{E[A^{N_1, post}(1)]}{3} - \frac{E[A^{N_1, pre}(1)]}{3}\right) - \left(\frac{E[A^{N_1, post}(0)]}{3} - \frac{E[A^{N_1, pre}(0)]}{3}\right)\) | post and post-pre non-misinfo accuracy score |
| | count of non-misinfo posts with accuracy score > 0 \(A^{N_2}\) | \(\frac{E[A^{N_2, post}(1)]}{3} - \frac{E[A^{N_2, post}(0)]}{3}\), \(\left(\frac{E[A^{N_2, post}(1)]}{3} - \frac{E[A^{N_2, pre}(1)]}{3}\right) - \left(\frac{E[A^{N_2, post}(0)]}{3} - \frac{E[A^{N_2, pre}(0)]}{3}\right)\) | post and post-pre non-misinfo binarized accuracy score |
| change accuracy discernment? | total score on misinfo posts \(A^{M_1}\), total score on non-misinfo posts \(A^{N_1}\) | \(\left(\frac{E[A^{N_1, post}(1)]}{3} - \frac{E[A^{M_1, post}(1)]}{6}\right) - \left(\frac{E[A^{N_1, post}(0)]}{3} - \frac{E[A^{M_1, post}(0)]}{6}\right)\) | post accuracy discernment |
knitr::include_graphics("./figures/primary_sharing_equation.png")

All Covariates

Demographics

knitr::include_graphics("./figures/demographics_table.png")
Demographics Definitions

Demographics Definitions

Pre-treatment variables

| Description | Variable Construction |
|---|---|
| Pre misinfo sharing rate | \(\frac{Y^{M, pre}}{6}\) |
| Pre non-misinfo sharing rate | \(\frac{Y^{N, pre}}{3}\) |
| Pre sharing discernment | \(\frac{Y^{N, pre}}{3} - \frac{Y^{M, pre}}{6}\) |
| Pre misinfo accuracy score | \(\frac{A^{M_1, pre}}{6}\) |
| Pre non-misinfo accuracy score | \(\frac{A^{N_1, pre}}{3}\) |
| Pre accuracy discernment | \(\frac{A^{N_1, pre}}{3} - \frac{A^{M_1, pre}}{6}\) |
| Pre attention check passing | \(1[\text{passed attention check}]\) |
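
These constructions imply an identity worth checking: each discernment variable should equal the non-misinfo rate minus the misinfo rate. A sketch, assuming the base_rate_*, misinfo_*, and new_disc_* columns used below hold those quantities:

# sanity check (a sketch): discernment = non-misinfo rate - misinfo rate
stopifnot(all.equal(df_wide$new_disc_pre,  df_wide$base_rate_pre  - df_wide$misinfo_pre))
stopifnot(all.equal(df_wide$new_disc_post, df_wide$base_rate_post - df_wide$misinfo_post))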

Setup

Load packages

packages = c(
  "tidyverse", "data.table", "dtplyr", "rlang", "kableExtra", "haven", "ggcorrplot", "visdat", "VIM", "corrplot", "kableExtra", "fastDummies"
)

# Load the packages, install if necessary
new_packages = packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages, dependencies = TRUE)
lapply(packages, require, character.only = TRUE) |> invisible()

Load data

df_wide <- read.csv("../intermediate_outcomes/intermediate_data_wide.csv")
df_long <- read.csv("../intermediate_outcomes/intermediate_data_long.csv") %>% filter(pre_post != "followup") # keep pre and post rows only

# compute the number of rows in each dataset
nrow_wide <- nrow(df_wide)
nrow_long <- nrow(df_long)

# report the row counts and check the long data against the expected 18 user-post rows per participant
paste("There are", nrow_wide, "rows in the wide dataset. There are", nrow_long, "rows in the long dataset, which is equal to the", nrow_wide*18, "expected for 18 user-post observations for each", nrow_wide, "user in the wide dataset.")
## [1] "There are 8684 rows in the wide dataset. There are 156312 rows in the long dataset, which is equal to the 156312 expected for 18 user-post observations for each 8684 user in the wide dataset."

Create functions

Standard error functions

# function to compute se for continuous variables
se_cont = function(x, na.rm=FALSE) {
  if (na.rm) x = na.omit(x)
  sqrt(var(x)/length(x))}

# function to compute se for binary variables
se_binary = function(x, na.rm=FALSE) {
  if (na.rm) x = na.omit(x)
  sqrt(mean(x)*(1-mean(x))/length(x))}
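
Note the two formulas differ slightly: se_cont uses the unbiased sample variance (n − 1 denominator via var()), while se_binary uses the plug-in \(\sqrt{\hat{p}(1-\hat{p})/n}\). Illustrative usage:

# illustrative calls: any numeric column works for se_cont, any 0/1 column for se_binary
se_cont(df_wide$age, na.rm = TRUE)
se_binary(df_wide$gender_Man, na.rm = TRUE)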

Summary statistics for continuous variables function

# function to generate summary statistics for continuous variable
summary_stats_continous <- function(dt, variable, group = NULL){
  # Get descriptive statistics for a single continuous variable
  variable_name = variable 
  variable = sym(variable)

  if(!is.null(group)){
    group_name = group
    group = sym(group)
    dt <- dt |> group_by({{group}})
  } # grouping if group variable is selected

  tab = dt |> lazy_dt() |>
    mutate(
      `Number_Missing` = sum(is.na({{variable}})),  # calculate missing values
    ) |>
    filter(!is.na({{variable}})) |>  # filtering out the missing values for the variable
    summarise(
        Total_Observations = n(),
        Number_Missing = first(`Number_Missing`),
        Mean = as.numeric(mean({{variable}})),
        SD = as.numeric(sd({{variable}})),
        SE = as.numeric(sd({{variable}}) / sqrt(n())),
        Min = as.numeric(min({{variable}})),
        Max =  as.numeric(max({{variable}})),
        Decile_10 = as.numeric(quantile({{variable}}, probs = 0.1)),
        Decile_20 = as.numeric(quantile({{variable}}, probs = 0.2)),
        Decile_30 = as.numeric(quantile({{variable}}, probs = 0.3)),
        Decile_40 = as.numeric(quantile({{variable}}, probs = 0.4)),
        Decile_50 = as.numeric(quantile({{variable}}, probs = 0.5)),
        Decile_60 = as.numeric(quantile({{variable}}, probs = 0.6)),
        Decile_70 = as.numeric(quantile({{variable}}, probs = 0.7)),
        Decile_80 = as.numeric(quantile({{variable}}, probs = 0.8)),
        Decile_90 = as.numeric(quantile({{variable}}, probs = 0.9))
      ) |> # compute summary statistics 
      mutate(
        Proportion_Missing = Number_Missing / Total_Observations,
        Variable = variable_name
      ) |> # adds a column indicating the proportion of missing values relative to the total observations.
      collect()

  if (is.null(group)){
    tab <- tab |> select(Variable, everything())
  } else {
    tab <- tab |> mutate(Group = paste(group_name, ":", {{group}})) |> select(Variable, Group, everything()) |> select(-{{group}})
  }

  return(tab)
}

summary_stats_table_continous <- function(dt, variables, group = NULL,
  filename = NULL, print_html = TRUE, caption = NULL
  ){
  # Get descriptive statistics for a list of continuous variables and return a combined table
  results <- lapply(variables, function(x) summary_stats_continous(dt, x, group)) # apply the summary_stats_continous function to each variable in the list
  results <- bind_rows(results) # bind the list of dataframes into a single dataframe

  if(!is.null(filename)){
    write.csv(results, paste0("./tables/", filename)) # save as csv file
  }

  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}

Summary statistics for binary variables function

# Define the function to create summary statistics for binary variables
summary_stats_binary <- function(data, variables, group = NULL) {
  combined_results <- data.frame(
    Variable = character(),
    Group = character(),
    Frequency = integer(),
    Proportion = numeric(),
    stringsAsFactors = FALSE
  ) # preset a dataframe
  
  for (variable in variables) {
    if (!is.null(group)) {
      # Grouped calculations
      summary_df <- data %>%
        group_by_at(vars(c(group, variable))) %>%
        summarise(Frequency = sum(!!sym(variable)), .groups = 'drop') %>%
        mutate(Proportion = Frequency / sum(Frequency)) %>%
        ungroup() %>%
        mutate(Variable = variable,
               Group = ifelse(is.na(.data[[group]]), "Missing", as.character(.data[[group]]))) %>% 
        select(Variable, Group, Frequency, Proportion)
    } else {
      # Ungrouped calculations
      summary_df <- data %>%
        summarise(
          Frequency = sum(!!sym(variable)),
          Proportion = sum(!!sym(variable)) / nrow(data),
          .groups = 'drop' # Avoid regrouping warning
        ) %>%
        ungroup() %>%
        mutate(Variable = variable) %>% # Add variable name
        select(Variable, everything())
    }
    
    # Combine the result with the overall results
    combined_results <- bind_rows(combined_results, summary_df)
  }
  
  return(combined_results)
}

summary_stats_tab_binary <- function(dt, variables, group = NULL,
                                    filename = NULL, print_html = TRUE, caption = NULL
) {
  
  results <- lapply(variables, function(x) summary_stats_binary(dt, x, group))
  results <- bind_rows(results) 
  
  if(!is.null(filename)){
    write.csv(results, paste0("./tables/", filename)) # save as csv file
  }
  
  if (print_html){
    results = results |> 
      kable(digits = 3,caption = caption) |>
      kable_styling(bootstrap_options = c("striped", "hover")) |>
      scroll_box( height = "500px") # generate html table 
  }
  
  return(results)
}

Function for generating probability distribution

Density plot

create_density_plot <- function(data, x_var, group_var = NULL, x_min = NULL, x_max = NULL,
                                title, x_label, y_label, file_basename) {
  # aes_string() is deprecated in current ggplot2; map columns via the .data
  # pronoun instead, handling the ungrouped case (group_var = NULL) explicitly
  mapping <- if (is.null(group_var)) {
    aes(x = .data[[x_var]])
  } else {
    aes(x = .data[[x_var]], group = .data[[group_var]], fill = .data[[group_var]])
  }

  plt <- ggplot(data, mapping) +
    geom_density(alpha = 0.7, color = "black") +
    labs(title = title, x = x_label, y = y_label) +
    scale_x_continuous(limits = c(x_min, x_max)) +
    theme_minimal() +
    guides(fill = guide_legend(title = NULL))

  print(plt)

  # save the plot to JPG
  ggsave(paste0("./generated_figures/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)
}

Histogram

create_histogram_grouped <- function(data, x_var, group_var = NULL, bin_width = NULL,
                                     title, x_label, y_label, line_color = "black", file_basename) {
  # aes_string() is deprecated in current ggplot2; use the .data pronoun,
  # handling the ungrouped case (group_var = NULL) explicitly
  mapping <- if (is.null(group_var)) {
    aes(x = .data[[x_var]])
  } else {
    aes(x = .data[[x_var]], fill = .data[[group_var]])
  }

  # generate the histogram
  plt <- ggplot(data, mapping) +
    geom_histogram(binwidth = bin_width, color = line_color, position = "dodge", alpha = 0.5) +
    labs(title = title, x = x_label, y = y_label) +
    theme_minimal() +
    guides(fill = guide_legend(title = NULL))

  # print the plot
  print(plt)

  # save the plot to JPG
  ggsave(paste0("./generated_figures/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)
}

Create variables lists

# list of continuous covariates and outcomes
# (base_rate_* = non-misinfo sharing rate, misinfo_* = misinfo sharing rate,
#  new_disc_* = sharing discernment, *_diff = post minus pre)
continuous_var_list <- c(
  "age", "social_media_hours", "base_rate_pre", "misinfo_pre", "new_disc_pre", "base_rate_post", "base_rate_diff", "misinfo_post", "misinfo_diff", "new_disc_post"
)

# list of binary variables
binary_var_list <- c(
  "att_check_pre", "att_check_post", "gender_Man", "education_High_school_or_less", "education_Some_college", "education_Bachelor_degree", "education_Graduate_degree", "marital_Married_or_in_a_domestic_partnership", "employment_Employed", "employment_Unemployed", "employment_Student", "location_Mostly_urban", "location_Suburban", "location_Mostly_rural", "religion_Christian", "religiosity_Attends", "social_media_bin_Yes", "social_media_share_80_100", "social_media_share_60_80", "social_media_share_40_60", "social_media_share_20_40", "social_media_share_0_20"
)
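
The binary demographic columns follow the variable_Value naming that fastDummies (loaded above) produces. A sketch of how such dummies are typically generated upstream, using a hypothetical toy education column whose categories are already snake_cased:

# illustrative only: expand a categorical column into 0/1 dummy columns
toy <- data.frame(education = c("High_school_or_less", "Some_college", "Bachelor_degree"))
fastDummies::dummy_cols(toy, select_columns = "education")
# yields education_High_school_or_less, education_Some_college, education_Bachelor_degree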

# list of variables that need histograms
histogram_var_list <- c(
  "base_rate_pre", "misinfo_pre", "new_disc_pre", "base_rate_post", "base_rate_diff", "misinfo_post", "misinfo_diff", "new_disc_post"
)

# list of variables that need density plots
pdf_var_list <- c(
  "age", "social_media_hours"
)

# list of demographic variables
demographic_var_list <- c(
  "gender_Man", "education_High_school_or_less", "education_Some_college", "education_Bachelor_degree", "education_Graduate_degree", "marital_Married_or_in_a_domestic_partnership", "employment_Employed", "employment_Unemployed", "employment_Student", "location_Mostly_urban", "location_Surburban", "location_Mostly_rural", "religion_Christian", "religiosity_Attends", "social_media_bin_Yes", "social_media_share_80_100", "social_media_share_60_80", "social_media_share_40_60", "social_media_share_20_40", "social_media_share_0_20"
)

#list of pre-treatment variables
pre_treat_var_list <- c(
  "base_rate_pre", "misinfo_pre", "new_disc_pre", "att_check_pre"
)


# list of outcomes
outcome_list <- c(
  "base_rate_post", "base_rate_diff", "misinfo_post", "misinfo_diff", "new_disc_post"
)

Full Sample Summary Statistics

Continuous variables

summary_stats_table_continous(df_wide,
  variables = continuous_var_list,
  caption = "Summary Statistics for Continuous Variables",
  print_html = TRUE
  )
Summary Statistics for Continuous Variables
| Variable | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| age | 8684 | 0 | 26.397 | 7.592 | 0.081 | 1 | 150 | 20.000 | 21.000 | 22.000 | 23.000 | 25.000 | 26.000 | 28.000 | 30.000 | 35.000 | 0 |
| social_media_hours | 8684 | 0 | 5.779 | 3.713 | 0.040 | 0 | 24 | 2.000 | 3.000 | 4.000 | 4.000 | 5.000 | 6.000 | 6.000 | 8.000 | 10.000 | 0 |
| base_rate_pre | 8684 | 0 | 0.693 | 0.343 | 0.004 | 0 | 1 | 0.000 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
| misinfo_pre | 8684 | 0 | 0.546 | 0.331 | 0.004 | 0 | 1 | 0.000 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 0.833 | 1.000 | 0 |
| new_disc_pre | 8684 | 0 | 0.147 | 0.300 | 0.003 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| base_rate_post | 8684 | 0 | 0.574 | 0.377 | 0.004 | 0 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_diff | 8684 | 0 | -0.119 | 0.394 | 0.004 | -1 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| misinfo_post | 8684 | 0 | 0.426 | 0.342 | 0.004 | 0 | 1 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
| misinfo_diff | 8684 | 0 | -0.120 | 0.332 | 0.004 | -1 | 1 | -0.500 | -0.333 | -0.333 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
| new_disc_post | 8684 | 0 | 0.148 | 0.302 | 0.003 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over a set of variables to generate density plots
for (var in pdf_var_list) {
  create_density_plot(df_wide, var, NULL, NULL, NULL, paste("Density Plot of", var), var, "Density", paste0("density_plot_", var))
}

# loop over a set of variables to generate histograms
for (var in histogram_var_list) {
  create_histogram_grouped(df_wide, var, NULL, NULL, paste("Histogram of", var), var, "Frequency", line_color = "black", paste0("histogram_", var))
}

Binary variables

summary_stats_tab_binary(df_wide,  
  variables = binary_var_list,
  caption = "Summary Statistics for Binary Variables",
  print_html = TRUE
  )
Summary Statistics for Binary Variables

| Variable | Group | Frequency | Proportion |
|---|---|---|---|
| att_check_pre | NA | 4187 | 0.482 |
| att_check_post | NA | 4809 | 0.554 |
| gender_Man | NA | 5679 | 0.654 |
| education_High_school_or_less | NA | 2356 | 0.271 |
| education_Some_college | NA | 3087 | 0.355 |
| education_Bachelor_degree | NA | 3131 | 0.361 |
| education_Graduate_degree | NA | 110 | 0.013 |
| marital_Married_or_in_a_domestic_partnership | NA | 2827 | 0.326 |
| employment_Employed | NA | 2823 | 0.325 |
| employment_Unemployed | NA | 3483 | 0.401 |
| employment_Student | NA | 2378 | 0.274 |
| location_Mostly_urban | NA | 2475 | 0.285 |
| location_Suburban | NA | 3346 | 0.385 |
| location_Mostly_rural | NA | 2863 | 0.330 |
| religion_Christian | NA | 8179 | 0.942 |
| religiosity_Attends | NA | 8217 | 0.946 |
| social_media_bin_Yes | NA | 8626 | 0.993 |
| social_media_share_80_100 | NA | 733 | 0.084 |
| social_media_share_60_80 | NA | 2023 | 0.233 |
| social_media_share_40_60 | NA | 2660 | 0.306 |
| social_media_share_20_40 | NA | 1737 | 0.200 |
| social_media_share_0_20 | NA | 1531 | 0.176 |

Outcomes by Groups & Time

By Treatment Group

summary_stats_table_continous(df_wide,
  variables = outcome_list,
  group = "treatment",
  caption = "Summary Statistics for Outcomes by Treatment Group",
  print_html = TRUE)
Summary Statistics for Outcomes by Treatment Group
| Variable | Group | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| base_rate_post | treatment : Combo | 1740 | 0 | 0.522 | 0.379 | 0.009 | 0.000 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_post | treatment : Emotions | 1836 | 0 | 0.519 | 0.375 | 0.009 | 0.000 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_post | treatment : Facts Baseline | 1790 | 0 | 0.607 | 0.379 | 0.009 | 0.000 | 1 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treatment : No-course Baseline | 1628 | 0 | 0.680 | 0.354 | 0.009 | 0.000 | 1 | 0.000 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treatment : Reasoning | 1690 | 0 | 0.550 | 0.374 | 0.009 | 0.000 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_diff | treatment : Combo | 1740 | 0 | -0.164 | 0.398 | 0.010 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treatment : Emotions | 1836 | 0 | -0.170 | 0.418 | 0.010 | -1.000 | 1 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treatment : Facts Baseline | 1790 | 0 | -0.076 | 0.378 | 0.009 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
| base_rate_diff | treatment : No-course Baseline | 1628 | 0 | -0.024 | 0.359 | 0.009 | -1.000 | 1 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
| base_rate_diff | treatment : Reasoning | 1690 | 0 | -0.153 | 0.390 | 0.009 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| misinfo_post | treatment : Combo | 1740 | 0 | 0.371 | 0.336 | 0.008 | 0.000 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 1.000 | 0 |
| misinfo_post | treatment : Emotions | 1836 | 0 | 0.353 | 0.318 | 0.007 | 0.000 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 0 |
| misinfo_post | treatment : Facts Baseline | 1790 | 0 | 0.465 | 0.352 | 0.008 | 0.000 | 1 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
| misinfo_post | treatment : No-course Baseline | 1628 | 0 | 0.546 | 0.334 | 0.008 | 0.000 | 1 | 0.000 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 0.833 | 1.000 | 0 |
| misinfo_post | treatment : Reasoning | 1690 | 0 | 0.404 | 0.335 | 0.008 | 0.000 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0.667 | 1.000 | 0 |
| misinfo_diff | treatment : Combo | 1740 | 0 | -0.175 | 0.335 | 0.008 | -1.000 | 1 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| misinfo_diff | treatment : Emotions | 1836 | 0 | -0.188 | 0.345 | 0.008 | -1.000 | 1 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| misinfo_diff | treatment : Facts Baseline | 1790 | 0 | -0.074 | 0.316 | 0.007 | -1.000 | 1 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
| misinfo_diff | treatment : No-course Baseline | 1628 | 0 | -0.019 | 0.301 | 0.007 | -1.000 | 1 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0 |
| misinfo_diff | treatment : Reasoning | 1690 | 0 | -0.136 | 0.328 | 0.008 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.167 | 0.167 | 0 |
| new_disc_post | treatment : Combo | 1740 | 0 | 0.151 | 0.304 | 0.007 | -1.000 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treatment : Emotions | 1836 | 0 | 0.166 | 0.304 | 0.007 | -0.833 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
| new_disc_post | treatment : Facts Baseline | 1790 | 0 | 0.142 | 0.297 | 0.007 | -1.000 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treatment : No-course Baseline | 1628 | 0 | 0.134 | 0.291 | 0.007 | -0.833 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treatment : Reasoning | 1690 | 0 | 0.146 | 0.310 | 0.008 | -0.667 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.083 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over the outcomes to generate histograms by treatment group;
# the filename suffix keeps these from overwriting the full-sample histograms

for (var in outcome_list) {
    create_histogram_grouped(df_wide, var, "treatment", NULL, paste("Histogram of", var, "by Treatment Group"), var, "Frequency", line_color = "black", paste0("histogram_", var, "_by_treatment"))
}

By Accuracy Group

summary_stats_table_continous(df_wide,
  variables = outcome_list,
  group = "accuracy",
  caption = "Summary Statistics for Outcomes by Accuracy Group",
  print_html = TRUE)
Summary Statistics for Outcomes by Accuracy Group
| Variable | Group | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| base_rate_post | accuracy : Accuracy After | 4322 | 0 | 0.615 | 0.368 | 0.006 | 0 | 1 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | accuracy : Accuracy Inter | 4362 | 0 | 0.533 | 0.383 | 0.006 | 0 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_diff | accuracy : Accuracy After | 4322 | 0 | -0.127 | 0.393 | 0.006 | -1 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | accuracy : Accuracy Inter | 4362 | 0 | -0.111 | 0.395 | 0.006 | -1 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| misinfo_post | accuracy : Accuracy After | 4322 | 0 | 0.454 | 0.337 | 0.005 | 0 | 1 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
| misinfo_post | accuracy : Accuracy Inter | 4362 | 0 | 0.398 | 0.344 | 0.005 | 0 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.833 | 1.000 | 0 |
| misinfo_diff | accuracy : Accuracy After | 4322 | 0 | -0.125 | 0.328 | 0.005 | -1 | 1 | -0.500 | -0.333 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
| misinfo_diff | accuracy : Accuracy Inter | 4362 | 0 | -0.115 | 0.335 | 0.005 | -1 | 1 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
| new_disc_post | accuracy : Accuracy After | 4322 | 0 | 0.161 | 0.307 | 0.005 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0 |
| new_disc_post | accuracy : Accuracy Inter | 4362 | 0 | 0.135 | 0.296 | 0.004 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over the outcomes to generate histograms by accuracy group;
# the filename suffix keeps these from overwriting earlier histograms

for (var in outcome_list) {
    create_histogram_grouped(df_wide, var, "accuracy", NULL, paste("Histogram of", var, "by Accuracy Group"), var, "Frequency", line_color = "black", paste0("histogram_", var, "_by_accuracy"))
}

By Accuracy Group and Treatment Group

## Create a treatment accuracy group variable

df_wide <- df_wide |> mutate(treat_accuracy = paste(treatment, accuracy, sep = "_"))
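
A quick look at the resulting cell sizes (illustrative):

# cross-tab of course arm by accuracy-nudge arm
table(df_wide$treatment, df_wide$accuracy)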

summary_stats_table_continous(df_wide,
  variables = outcome_list,
  group = "treat_accuracy",
  caption = "Summary Statistics for Outcomes by Accuracy Group and Treatment Group",
  print_html = TRUE)
Summary Statistics for Outcomes by Accuracy Group and Treatment Group
| Variable | Group | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| base_rate_post | treat_accuracy : Combo_Accuracy After | 859 | 0 | 0.559 | 0.377 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | 0.486 | 0.379 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Emotions_Accuracy After | 894 | 0 | 0.563 | 0.369 | 0.012 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | 0.477 | 0.375 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | 0.657 | 0.364 | 0.012 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | 0.558 | 0.387 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | 0.719 | 0.332 | 0.012 | 0.000 | 1.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | 0.639 | 0.371 | 0.013 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | 0.582 | 0.367 | 0.013 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
| base_rate_post | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | 0.518 | 0.379 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| base_rate_diff | treat_accuracy : Combo_Accuracy After | 859 | 0 | -0.166 | 0.401 | 0.014 | -1.000 | 1.000 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | -0.162 | 0.395 | 0.013 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Emotions_Accuracy After | 894 | 0 | -0.175 | 0.423 | 0.014 | -1.000 | 1.000 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | -0.165 | 0.413 | 0.013 | -1.000 | 1.000 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | -0.082 | 0.364 | 0.012 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | -0.070 | 0.391 | 0.013 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | -0.037 | 0.354 | 0.012 | -1.000 | 1.000 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | -0.012 | 0.364 | 0.013 | -1.000 | 1.000 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | -0.170 | 0.396 | 0.014 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| base_rate_diff | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | -0.136 | 0.384 | 0.013 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
| misinfo_post | treat_accuracy : Combo_Accuracy After | 859 | 0 | 0.387 | 0.327 | 0.011 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.867 | 0 |
| misinfo_post | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | 0.356 | 0.344 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 1.000 | 0 |
| misinfo_post | treat_accuracy : Emotions_Accuracy After | 894 | 0 | 0.381 | 0.318 | 0.011 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 0 |
| misinfo_post | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | 0.326 | 0.316 | 0.010 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 0 |
| misinfo_post | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | 0.504 | 0.345 | 0.012 | 0.000 | 1.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.667 | 0.833 | 1.000 | 0 |
| misinfo_post | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | 0.427 | 0.354 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
| misinfo_post | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | 0.578 | 0.320 | 0.011 | 0.000 | 1.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.667 | 0.833 | 0.833 | 1.000 | 0 |
| misinfo_post | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | 0.513 | 0.345 | 0.012 | 0.000 | 1.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
| misinfo_post | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | 0.427 | 0.334 | 0.011 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
| misinfo_post | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | 0.380 | 0.335 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 1.000 | 0 |
| misinfo_diff | treat_accuracy : Combo_Accuracy After | 859 | 0 | -0.187 | 0.331 | 0.011 | -1.000 | 0.833 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| misinfo_diff | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | -0.163 | 0.339 | 0.011 | -1.000 | 1.000 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| misinfo_diff | treat_accuracy : Emotions_Accuracy After | 894 | 0 | -0.182 | 0.344 | 0.011 | -1.000 | 0.833 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| misinfo_diff | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | -0.193 | 0.347 | 0.011 | -1.000 | 1.000 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| misinfo_diff | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | -0.078 | 0.309 | 0.010 | -1.000 | 1.000 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
| misinfo_diff | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | -0.070 | 0.322 | 0.011 | -1.000 | 1.000 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
| misinfo_diff | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | -0.026 | 0.298 | 0.010 | -1.000 | 1.000 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0 |
| misinfo_diff | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | -0.011 | 0.303 | 0.011 | -1.000 | 1.000 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0 |
| misinfo_diff | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | -0.149 | 0.328 | 0.011 | -1.000 | 0.833 | -0.667 | -0.333 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.167 | 0.167 | 0 |
| misinfo_diff | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | -0.123 | 0.327 | 0.011 | -1.000 | 1.000 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
| new_disc_post | treat_accuracy : Combo_Accuracy After | 859 | 0 | 0.172 | 0.315 | 0.011 | -1.000 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
| new_disc_post | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | 0.130 | 0.292 | 0.010 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treat_accuracy : Emotions_Accuracy After | 894 | 0 | 0.182 | 0.315 | 0.011 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
| new_disc_post | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | 0.151 | 0.293 | 0.010 | -0.667 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | 0.154 | 0.293 | 0.010 | -1.000 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | 0.131 | 0.300 | 0.010 | -1.000 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | 0.142 | 0.287 | 0.010 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
| new_disc_post | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | 0.125 | 0.295 | 0.010 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0 |
| new_disc_post | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | 0.155 | 0.321 | 0.011 | -0.667 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
| new_disc_post | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | 0.137 | 0.298 | 0.010 | -0.667 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over the outcomes to generate histograms by accuracy x treatment group;
# the filename suffix keeps these from overwriting earlier histograms

for (var in outcome_list) {
    create_histogram_grouped(df_wide, var, "treat_accuracy", NULL, paste("Histogram of", var, "by Accuracy and Treatment Group"), var, "Frequency", line_color = "black", paste0("histogram_", var, "_by_treat_accuracy"))
}

By Time