The goal of this document is to provide comprehensive summary statistics for the inoculation against misinformation project that replaces analyses before April 2024.
The goal of this experiment is to study the effect of different text message courses on misinformation sharing and discernment. The figure below diagrams the experimental design. We recruited participants through Facebook ads to complete a five-day text message course, plus a pre- and post-survey, for a mobile airtime payment. Participants who clicked an ad were directed to the pre-survey on Qualtrics, where they were first randomized on whether they would see an accuracy nudge in the pre- and post-survey. Conditional on completing the pre-survey, participants were randomized into one of the text message course interventions and enrolled in the course. Participants in all course interventions, except the No-course baseline, received one text message a day for five days starting on the day they completed the pre-survey. Participants in the No-course baseline received the Combo course after the post-survey to ensure we fulfilled our recruitment promise of a text message course. On the last day of the course, participants received a link to complete the post-survey on Qualtrics. Participants who completed the post-survey were paid KSH 500 (about $4 in U.S. dollars) in mobile airtime. Seven to eleven weeks later, participants who completed the post-survey were randomized on whether they would see a prime in the text message recruiting them to a follow-up survey. Participants who completed the follow-up survey were paid KSH 350 (about $3 in U.S. dollars).
knitr::include_graphics("./figures/Experimental Design.png")
Experimental Design
The pre- and post-surveys each contained 10 posts:
These posts were created by taking 15 facts/domains, then creating a non-misinfo version and 3 misinfo versions (emotions, reasoning, combo), for a total of 60 posts (plus 2 attention check posts). Participants saw each of the 15 facts at least once.
Whichever 3 facts/domains participants saw in the form of non-misinfo in the pre-survey, they saw in the form of misinfo in the post-survey, so these 3 facts/domains were repeated. Which facts/domains were repeated vary by participant.
For each posts, participants were asked 2 questions:
In this script, we use two datasets.
intermediate_data_wide.csv
: all data at the user level.
Includes treatment conditions, covariates, and individual-aggregated
outcomes. Sample of 8,684 participants who completed the pre- and
post-survey.
intermediate_data_long_treatments.csv
: all data at the
user-post level. Includes treatment conditions, covariates, and
individual-aggregated outcomes. Sample of 8,684 participants who
completed the pre- and post-survey.
For more information about how these datasets are generated and a data dictionary, see memo.
This script primarily uses the sample of 8,684 participants who completed the post-survey.
For analyses involving the outcome referred to as the “Primary Misinfo Sharing Rate,” we drop participants who did not share any non-misinfo posts in the pre-survey, leaving a sample of 7,688 participants who completed the post-survey and intended to share at least 1 non-misinfo post in the pre-survey.
Research Question: Do the treatment interventions… | Outcome Measures | Estimands | Notes |
---|---|---|---|
Sharing Outcomes | |||
change misinfo sharing? | primary misinfo sharing rate | See equation below. | Defined only in post. |
count of misinfo posts shared \(Y^M\) | \(\frac{E[Y^{M, post}(1)]}{6} - \frac{E[Y^{M, post}(0)]}{6}\), \((\frac{E[Y^{M, post}(1)]}{6} - \frac{E[Y^{M, pre}(1)]}{6}) - (\frac{E[Y^{M, post}(0)]}{6} - \frac{E[Y^{M, pre}(0)]}{6})\) | post and post-pre misinfo sharing rate | |
change non-misinfo sharing? | count of non-misinfo posts shared \(Y^N\) | \(\frac{E[Y^{N, post}(1)]}{3} - \frac{E[Y^{N, post}(0)]}{3}\), \((\frac{E[Y^{N, post}(1)]}{6} - \frac{E[Y^{N, pre}(1)]}{6}) - (\frac{E[Y^{N, post}(0)]}{6} - \frac{E[Y^{N, pre}(0)]}{6})\) | post and post-pre non-misinfo sharing rate |
change sharing discernment? | count of misinfo posts shared \(Y^M\), count of non-misinfo posts shared \(Y^N\) | \((\frac{E[Y^{N, post}(1)]}{3} - \frac{E[Y^{M, post}(1)]}{6} ) - (\frac{E[Y^{N, post}(0)]}{3} - \frac{E[Y^{M, post}(0)]}{6})\) | post sharing discernment |
Accuracy Outcomes | |||
change accuracy scoring of misinfo posts? | total score on misinfo posts \(A^{M_1}\) | \(\frac{E[A^{M_1, post}(1)]}{6} - \frac{E[A^{M_1, post}(0)]}{6}\), \((\frac{E[A^{M_1, post}(1)]}{6} - \frac{E[A^{M_1, pre}(1)]}{6}) - (\frac{E[A^{M_1, post}(0)]}{6} - \frac{E[A^{M_1, pre}(0)]}{6})\) | post and post-pre misinfo accuracy score |
count of misinfo posts with accuracy score > 0 \(A^{M_2}\) | \(\frac{E[A^{M_2, post}(1)]}{6} - \frac{E[A^{M_2, post}(0)]}{6}\), \((\frac{E[A^{M_2, post}(1)]}{6} - \frac{E[A^{M_2, pre}(1)]}{6}) - (\frac{E[A^{M_2, post}(0)]}{6} - \frac{E[A^{M_2, pre}(0)]}{6})\) | post and post-pre misinfo binarized accuracy score | |
change accuracy scoring of non-misinfo posts? | total score on non-misinfo posts \(A^{N_1}\) | \(\frac{E[A^{N_1, post}(1)]}{3} - \frac{E[A^{N_1, post}(0)]}{3}\), \((\frac{E[A^{N_1, post}(1)]}{6} - \frac{E[A^{N_1, pre}(1)]}{6}) - (\frac{E[A^{N_1, post}(0)]}{6} - \frac{E[A^{N_1, pre}(0)]}{6})\) | post and post-pre non-misinfo accuracy score |
count of non-misinfo posts with accuracy score > 0 \(A^{N_2}\) | \(\frac{E[A^{N_2, post}(1)]}{3} - \frac{E[A^{N_2, post}(0)]}{3}\), \((\frac{E[A^{N_2, post}(1)]}{6} - \frac{E[A^{N_2, pre}(1)]}{6}) - (\frac{E[A^{N_2, post}(0)]}{6} - \frac{E[A^{N_2, pre}(0)]}{6})\) | post and post-pre non-misinfo accuracy score | |
change accuracy discernment? | total score on misinfo posts \(A^M\), total score on non-misinfo posts \(A^N\) | \((\frac{E[A^{N, post}(1)]}{3} - \frac{E[A^{M, post}(1)]}{6} ) - (\frac{E[A^{N, post}(0)]}{3} - \frac{E[A^{M, post}(0)]}{6})\) | post accuracy discernment |
knitr::include_graphics("./figures/primary_sharing_equation.png")
knitr::include_graphics("./figures/demographics_table.png")
Demographics Definitions
Description | Variable Construction |
---|---|
Pre misinfo sharing rate | \(\frac{E[Y^{M, pre}]}{6}\) |
Pre non-misinfo sharing rate | \(\frac{E[Y^{N, pre}]}{3}\) |
Pre sharing discernment | \(\frac{E[Y^{N, pre}(1)]}{3} - \frac{E[Y^{M, pre}(1)]}{6}\) |
Pre misinfo accuracy score | \(\frac{E[A^{M, pre}]}{6}\) |
Pre non-misinfo accuracy score | \(\frac{E[A^{N, pre}]}{3}\) |
Pre accuracy discernment | \(\frac{E[A^{N, pre}(1)]}{3} - \frac{E[A^{M, pre}(1)]}{6}\) |
Pre attention check passing | \(1[\text{passed attention check}]\) |
packages = c(
"tidyverse", "data.table", "dtplyr", "rlang", "kableExtra", "haven", "ggcorrplot", "visdat", "VIM", "corrplot", "kableExtra", "fastDummies"
)
# Load the packages, install if necessary
new_packages = packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages, dependencies = TRUE)
lapply(packages, require, character.only = TRUE) |> invisible()
df_wide <- read.csv("../intermediate_outcomes/intermediate_data_wide.csv")
df_long <- read.csv("../intermediate_outcomes/intermediate_data_long.csv") %>% filter(pre_post != "followup")
# compute the number of rows in each dataset
nrow_wide <- nrow(df_wide)
nrow_long <- nrow(df_long)
# print the number of rows in each dataset in a statement that says "There are x rows in the wide dataset and y rows in the long dataset."
paste("There are", nrow_wide, "rows in the wide dataset. There are", nrow_long, "rows in the long dataset, which is equal to the", nrow_wide*18, "expected for 18 user-post observations for each", nrow_wide, "user in the wide dataset.")
## [1] "There are 8684 rows in the wide dataset. There are 156312 rows in the long dataset, which is equal to the 156312 expected for 18 user-post observations for each 8684 user in the wide dataset."
# function to compute se for continuous variables
se_cont = function(x, na.rm=FALSE) {
if (na.rm) x = na.omit(x)
sqrt(var(x)/length(x))}
# function to compute se for binary variables
se_binary = function(x, na.rm=FALSE) {
if (na.rm) x = na.omit(x)
sqrt(mean(x)*(1-mean(x))/length(x))}
# function to generate summary statistics for continuous variable
summary_stats_continous <- function(dt, variable, group = NULL){
# Get descriptive statistics for a single continuous variable
variable_name = variable
variable = sym(variable)
if(!is.null(group)){
group_name = group
group = sym(group)
dt <- dt |> group_by({{group}})
} # grouping if group variable is selected
tab = dt |> lazy_dt() |>
mutate(
`Number_Missing` = sum(is.na({{variable}})), # calculate missing values
) |>
filter(!is.na({{variable}})) |> # filtering out the missing values for the variable
summarise(
Total_Observations = n(),
Number_Missing = first(`Number_Missing`),
Mean = as.numeric(mean({{variable}})),
SD = as.numeric(sd({{variable}})),
SE = as.numeric(sd({{variable}}) / sqrt(n())),
Min = as.numeric(min({{variable}})),
Max = as.numeric(max({{variable}})),
Decile_10 = as.numeric(quantile({{variable}}, probs = 0.1)),
Decile_20 = as.numeric(quantile({{variable}}, probs = 0.2)),
Decile_30 = as.numeric(quantile({{variable}}, probs = 0.3)),
Decile_40 = as.numeric(quantile({{variable}}, probs = 0.4)),
Decile_50 = as.numeric(quantile({{variable}}, probs = 0.5)),
Decile_60 = as.numeric(quantile({{variable}}, probs = 0.6)),
Decile_70 = as.numeric(quantile({{variable}}, probs = 0.7)),
Decile_80 = as.numeric(quantile({{variable}}, probs = 0.8)),
Decile_90 = as.numeric(quantile({{variable}}, probs = 0.9))
) |> # compute summary statistics
mutate(
Proportion_Missing = Number_Missing / Total_Observations,
Variable = variable_name
) |> # adds a column indicating the proportion of missing values relative to the total observations.
collect()
if (is.null(group)){
tab <- tab |> select(Variable, everything())
} else {
tab <- tab |> mutate(Group = paste(group_name, ": ", {{group}})) |> select(Variable, Group, everything()) |> select(-{{group}})
}
return(tab)
}
summary_stats_table_continous <- function(dt, variables, group = NULL,
filename = NULL, print_html = TRUE, caption = NULL
){
# Get descriptive statistics for a list of continuous variables and return a combined table
results <- lapply(variables, function(x) summary_stats_continous(dt, x, group)) # apply the summary_stats_continous function to each variable in the list
results <- bind_rows(results) # bind the list of dataframes into a single dataframe
if(!is.null(filename)){
write.csv(results, paste0("./tables/", filename)) # save as csv file
}
if (print_html){
results = results |>
kable(digits = 3,caption = caption) |>
kable_styling(bootstrap_options = c("striped", "hover")) |>
scroll_box( height = "500px") # generate html table
}
return(results)
}
# Define the function to create summary statistics for binary variables
summary_stats_binary <- function(data, variables, group = NULL) {
combined_results <- data.frame(
Variable = character(),
Group = character(),
Frequency = integer(),
Proportion = numeric(),
stringsAsFactors = FALSE
) # preset a dataframe
for (variable in variables) {
if (!is.null(group)) {
# Grouped calculations
summary_df <- data %>%
group_by_at(vars(c(group, variable))) %>%
summarise(Frequency = sum(!!sym(variable)), .groups = 'drop') %>%
mutate(Proportion = Frequency / sum(Frequency)) %>%
ungroup() %>%
mutate(Variable = variable,
Group = ifelse(is.na(.data[[group]]), "Missing", as.character(.data[[group]]))) %>%
select(Variable, Group, Frequency, Proportion)
} else {
# Ungrouped calculations
summary_df <- data %>%
summarise(
Frequency = sum(!!sym(variable)),
Proportion = sum(!!sym(variable)) / nrow(data),
.groups = 'drop' # Avoid regrouping warning
) %>%
ungroup() %>%
mutate(Variable = variable) %>% # Add variable name
select(Variable, everything())
}
# Combine the result with the overall results
combined_results <- bind_rows(combined_results, summary_df)
}
return(combined_results)
}
summary_stats_tab_binary <- function(dt, variables, group = NULL,
filename = NULL, print_html = TRUE, caption = NULL
) {
results <- lapply(variables, function(x) summary_stats_binary(dt, x, group))
results <- bind_rows(results)
if(!is.null(filename)){
write.csv(results, paste0("./tables/", filename)) # save as csv file
}
if (print_html){
results = results |>
kable(digits = 3,caption = caption) |>
kable_styling(bootstrap_options = c("striped", "hover")) |>
scroll_box( height = "500px") # generate html table
}
return(results)
}
create_density_plot <- function(data, x_var, group_var, x_min, x_max, title, x_label, y_label, file_basename) {
plt <-
ggplot(data, aes_string(x = x_var, group = group_var, fill = group_var)) +
geom_density(alpha = 0.7, color = "black") +
labs(title = title,
x = x_label,
y = y_label) +
scale_x_continuous(limits = c(x_min, x_max)) +
theme_minimal()+
guides(fill = guide_legend(title = NULL))
print(plt)
# Save the plot to JPG
ggsave(paste0("./generated_figures/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)
}
create_histogram_grouped <- function(data, x_var, group_var, bin_width, title, x_label, y_label, line_color = "black", file_basename) {
# Generate the histogram
plt <- ggplot(data, aes_string(x = x_var, fill = group_var)) +
geom_histogram(binwidth = bin_width, color = line_color, position = "dodge", alpha = 0.5) +
labs(title = title, x = x_label, y = y_label) +
theme_minimal() +
guides(fill = guide_legend(title = NULL))
# Print the plot
print(plt)
# Save the plot to JPG
ggsave(paste0("./generated_figures/", file_basename, ".jpg"), plt, device = "jpg", width = 10, height = 8)
}
# list of continuous covariates
continuous_var_list <- c(
"age", "social_media_hours", "base_rate_pre", "misinfo_pre", "new_disc_pre", "base_rate_post", "base_rate_diff", "misinfo_post", "misinfo_diff", "new_disc_post"
)
# list of binary variables
binary_var_list <- c(
"att_check_pre", "att_check_post", "gender_Man", "education_High_school_or_less", "education_Some_college", "education_Bachelor_degree", "education_Graduate_degree", "marital_Married_or_in_a_domestic_partnership", "employment_Employed", "employment_Unemployed", "employment_Student", "location_Mostly_urban", "location_Suburban", "location_Mostly_rural", "religion_Christian", "religiosity_Attends", "social_media_bin_Yes", "social_media_share_80_100", "social_media_share_60_80", "social_media_share_40_60", "social_media_share_20_40", "social_media_share_0_20"
)
# list of variables that need histograms
histogram_var_list <- c(
"base_rate_pre", "misinfo_pre", "new_disc_pre", "base_rate_post", "base_rate_diff", "misinfo_post", "misinfo_diff", "new_disc_post"
)
# list of variables that need density plots
pdf_var_list <- c(
"age", "social_media_hours"
)
# list of demographic variables
demographic_var_list <- c(
"gender_Man", "education_High_school_or_less", "education_Some_college", "education_Bachelor_degree", "education_Graduate_degree", "marital_Married_or_in_a_domestic_partnership", "employment_Employed", "employment_Unemployed", "employment_Student", "location_Mostly_urban", "location_Surburban", "location_Mostly_rural", "religion_Christian", "religiosity_Attends", "social_media_bin_Yes", "social_media_share_80_100", "social_media_share_60_80", "social_media_share_40_60", "social_media_share_20_40", "social_media_share_0_20"
)
#list of pre-treatment variables
pre_treat_var_list <- c(
"base_rate_pre", "misinfo_pre", "new_disc_pre", "att_check_pre"
)
# list of outcomes
outcome_list <- c(
"base_rate_post", "base_rate_diff", "misinfo_post", "misinfo_diff", "new_disc_post"
)
summary_stats_table_continous(df_wide,
variables = continuous_var_list,
caption = "Summary Statistics for Continous Variables",
print_html = TRUE
)
Variable | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
age | 8684 | 0 | 26.397 | 7.592 | 0.081 | 1 | 150 | 20.000 | 21.000 | 22.000 | 23.000 | 25.000 | 26.000 | 28.000 | 30.000 | 35.000 | 0 |
social_media_hours | 8684 | 0 | 5.779 | 3.713 | 0.040 | 0 | 24 | 2.000 | 3.000 | 4.000 | 4.000 | 5.000 | 6.000 | 6.000 | 8.000 | 10.000 | 0 |
base_rate_pre | 8684 | 0 | 0.693 | 0.343 | 0.004 | 0 | 1 | 0.000 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
misinfo_pre | 8684 | 0 | 0.546 | 0.331 | 0.004 | 0 | 1 | 0.000 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 0.833 | 1.000 | 0 |
new_disc_pre | 8684 | 0 | 0.147 | 0.300 | 0.003 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
base_rate_post | 8684 | 0 | 0.574 | 0.377 | 0.004 | 0 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_diff | 8684 | 0 | -0.119 | 0.394 | 0.004 | -1 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
misinfo_post | 8684 | 0 | 0.426 | 0.342 | 0.004 | 0 | 1 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
misinfo_diff | 8684 | 0 | -0.120 | 0.332 | 0.004 | -1 | 1 | -0.500 | -0.333 | -0.333 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
new_disc_post | 8684 | 0 | 0.148 | 0.302 | 0.003 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over a set of variables to generate pdfs
for (var in pdf_var_list) {
create_density_plot(df_wide, var, NULL, NULL, NULL, paste("Density Plot of", var), var, "Density", paste("density_plot_", var))
}
# loop over a set of variables to generate histograms
for (var in histogram_var_list) {
create_histogram_grouped(df_wide, var, NULL, NULL, paste("Histogram of", var), var, "Frequency", line_color = "black", paste("histogram_", var))
}
summary_stats_tab_binary(df_wide,
variables = binary_var_list,
caption = "Summary Statistics for Binary Variables",
print_html = TRUE
)
Variable | Group | Frequency | Proportion |
---|---|---|---|
att_check_pre | NA | 4187 | 0.482 |
att_check_post | NA | 4809 | 0.554 |
gender_Man | NA | 5679 | 0.654 |
education_High_school_or_less | NA | 2356 | 0.271 |
education_Some_college | NA | 3087 | 0.355 |
education_Bachelor_degree | NA | 3131 | 0.361 |
education_Graduate_degree | NA | 110 | 0.013 |
marital_Married_or_in_a_domestic_partnership | NA | 2827 | 0.326 |
employment_Employed | NA | 2823 | 0.325 |
employment_Unemployed | NA | 3483 | 0.401 |
employment_Student | NA | 2378 | 0.274 |
location_Mostly_urban | NA | 2475 | 0.285 |
location_Suburban | NA | 3346 | 0.385 |
location_Mostly_rural | NA | 2863 | 0.330 |
religion_Christian | NA | 8179 | 0.942 |
religiosity_Attends | NA | 8217 | 0.946 |
social_media_bin_Yes | NA | 8626 | 0.993 |
social_media_share_80_100 | NA | 733 | 0.084 |
social_media_share_60_80 | NA | 2023 | 0.233 |
social_media_share_40_60 | NA | 2660 | 0.306 |
social_media_share_20_40 | NA | 1737 | 0.200 |
social_media_share_0_20 | NA | 1531 | 0.176 |
summary_stats_table_continous(df_wide,
variables = outcome_list,
group = "treatment",
caption = "Summary Statistics for Outcomes by Treatment Group",
print_html = TRUE)
Variable | Group | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
base_rate_post | treatment : Combo | 1740 | 0 | 0.522 | 0.379 | 0.009 | 0.000 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_post | treatment : Emotions | 1836 | 0 | 0.519 | 0.375 | 0.009 | 0.000 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_post | treatment : Facts Baseline | 1790 | 0 | 0.607 | 0.379 | 0.009 | 0.000 | 1 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treatment : No-course Baseline | 1628 | 0 | 0.680 | 0.354 | 0.009 | 0.000 | 1 | 0.000 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treatment : Reasoning | 1690 | 0 | 0.550 | 0.374 | 0.009 | 0.000 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_diff | treatment : Combo | 1740 | 0 | -0.164 | 0.398 | 0.010 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treatment : Emotions | 1836 | 0 | -0.170 | 0.418 | 0.010 | -1.000 | 1 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treatment : Facts Baseline | 1790 | 0 | -0.076 | 0.378 | 0.009 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
base_rate_diff | treatment : No-course Baseline | 1628 | 0 | -0.024 | 0.359 | 0.009 | -1.000 | 1 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
base_rate_diff | treatment : Reasoning | 1690 | 0 | -0.153 | 0.390 | 0.009 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
misinfo_post | treatment : Combo | 1740 | 0 | 0.371 | 0.336 | 0.008 | 0.000 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 1.000 | 0 |
misinfo_post | treatment : Emotions | 1836 | 0 | 0.353 | 0.318 | 0.007 | 0.000 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 0 |
misinfo_post | treatment : Facts Baseline | 1790 | 0 | 0.465 | 0.352 | 0.008 | 0.000 | 1 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
misinfo_post | treatment : No-course Baseline | 1628 | 0 | 0.546 | 0.334 | 0.008 | 0.000 | 1 | 0.000 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 0.833 | 1.000 | 0 |
misinfo_post | treatment : Reasoning | 1690 | 0 | 0.404 | 0.335 | 0.008 | 0.000 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0.667 | 1.000 | 0 |
misinfo_diff | treatment : Combo | 1740 | 0 | -0.175 | 0.335 | 0.008 | -1.000 | 1 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
misinfo_diff | treatment : Emotions | 1836 | 0 | -0.188 | 0.345 | 0.008 | -1.000 | 1 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
misinfo_diff | treatment : Facts Baseline | 1790 | 0 | -0.074 | 0.316 | 0.007 | -1.000 | 1 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
misinfo_diff | treatment : No-course Baseline | 1628 | 0 | -0.019 | 0.301 | 0.007 | -1.000 | 1 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0 |
misinfo_diff | treatment : Reasoning | 1690 | 0 | -0.136 | 0.328 | 0.008 | -1.000 | 1 | -0.667 | -0.333 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.167 | 0.167 | 0 |
new_disc_post | treatment : Combo | 1740 | 0 | 0.151 | 0.304 | 0.007 | -1.000 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treatment : Emotions | 1836 | 0 | 0.166 | 0.304 | 0.007 | -0.833 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
new_disc_post | treatment : Facts Baseline | 1790 | 0 | 0.142 | 0.297 | 0.007 | -1.000 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treatment : No-course Baseline | 1628 | 0 | 0.134 | 0.291 | 0.007 | -0.833 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treatment : Reasoning | 1690 | 0 | 0.146 | 0.310 | 0.008 | -0.667 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.083 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over a set of variables and each treatment group to generate histograms
for (var in outcome_list) {
create_histogram_grouped(df_wide, var, "treatment", NULL, paste("Histogram of", var, "by Treatment Group"), var, "Frequency", line_color = "black", paste("histogram_", var))
}
summary_stats_table_continous(df_wide,
variables = outcome_list,
group = "accuracy",
caption = "Summary Statistics for Outcomes by Accuracy Group",
print_html = TRUE)
Variable | Group | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
base_rate_post | accuracy : Accuracy After | 4322 | 0 | 0.615 | 0.368 | 0.006 | 0 | 1 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | accuracy : Accuracy Inter | 4362 | 0 | 0.533 | 0.383 | 0.006 | 0 | 1 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_diff | accuracy : Accuracy After | 4322 | 0 | -0.127 | 0.393 | 0.006 | -1 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | accuracy : Accuracy Inter | 4362 | 0 | -0.111 | 0.395 | 0.006 | -1 | 1 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
misinfo_post | accuracy : Accuracy After | 4322 | 0 | 0.454 | 0.337 | 0.005 | 0 | 1 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
misinfo_post | accuracy : Accuracy Inter | 4362 | 0 | 0.398 | 0.344 | 0.005 | 0 | 1 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.833 | 1.000 | 0 |
misinfo_diff | accuracy : Accuracy After | 4322 | 0 | -0.125 | 0.328 | 0.005 | -1 | 1 | -0.500 | -0.333 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
misinfo_diff | accuracy : Accuracy Inter | 4362 | 0 | -0.115 | 0.335 | 0.005 | -1 | 1 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
new_disc_post | accuracy : Accuracy After | 4322 | 0 | 0.161 | 0.307 | 0.005 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0 |
new_disc_post | accuracy : Accuracy Inter | 4362 | 0 | 0.135 | 0.296 | 0.004 | -1 | 1 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over a set of variables and each accuracy group to generate histograms
for (var in outcome_list) {
create_histogram_grouped(df_wide, var, "accuracy", NULL, paste("Histogram of", var, "by Accuracy Group"), var, "Frequency", line_color = "black", paste("histogram_", var))
}
## Create a treatment accuracy group variable
df_wide <- df_wide |> mutate(treat_accuracy = paste(treatment, accuracy, sep = "_"))
summary_stats_table_continous(df_wide,
variables = outcome_list,
group = "treat_accuracy",
caption = "Summary Statistics for Outcomes by Accuracy Group and Treatment Group",
print_html = TRUE)
Variable | Group | Total_Observations | Number_Missing | Mean | SD | SE | Min | Max | Decile_10 | Decile_20 | Decile_30 | Decile_40 | Decile_50 | Decile_60 | Decile_70 | Decile_80 | Decile_90 | Proportion_Missing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
base_rate_post | treat_accuracy : Combo_Accuracy After | 859 | 0 | 0.559 | 0.377 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | 0.486 | 0.379 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Emotions_Accuracy After | 894 | 0 | 0.563 | 0.369 | 0.012 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | 0.477 | 0.375 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | 0.657 | 0.364 | 0.012 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | 0.558 | 0.387 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | 0.719 | 0.332 | 0.012 | 0.000 | 1.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | 0.639 | 0.371 | 0.013 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | 0.582 | 0.367 | 0.013 | 0.000 | 1.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 1.000 | 0 |
base_rate_post | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | 0.518 | 0.379 | 0.013 | 0.000 | 1.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0.667 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
base_rate_diff | treat_accuracy : Combo_Accuracy After | 859 | 0 | -0.166 | 0.401 | 0.014 | -1.000 | 1.000 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | -0.162 | 0.395 | 0.013 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Emotions_Accuracy After | 894 | 0 | -0.175 | 0.423 | 0.014 | -1.000 | 1.000 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | -0.165 | 0.413 | 0.013 | -1.000 | 1.000 | -0.667 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | -0.082 | 0.364 | 0.012 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | -0.070 | 0.391 | 0.013 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
base_rate_diff | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | -0.037 | 0.354 | 0.012 | -1.000 | 1.000 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
base_rate_diff | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | -0.012 | 0.364 | 0.013 | -1.000 | 1.000 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | -0.170 | 0.396 | 0.014 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
base_rate_diff | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | -0.136 | 0.384 | 0.013 | -1.000 | 1.000 | -0.667 | -0.333 | -0.333 | -0.333 | 0.000 | 0.000 | 0.000 | 0.000 | 0.333 | 0 |
misinfo_post | treat_accuracy : Combo_Accuracy After | 859 | 0 | 0.387 | 0.327 | 0.011 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.867 | 0 |
misinfo_post | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | 0.356 | 0.344 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 1.000 | 0 |
misinfo_post | treat_accuracy : Emotions_Accuracy After | 894 | 0 | 0.381 | 0.318 | 0.011 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.500 | 0.667 | 0.833 | 0 |
misinfo_post | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | 0.326 | 0.316 | 0.010 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 0 |
misinfo_post | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | 0.504 | 0.345 | 0.012 | 0.000 | 1.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.667 | 0.833 | 1.000 | 0 |
misinfo_post | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | 0.427 | 0.354 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
misinfo_post | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | 0.578 | 0.320 | 0.011 | 0.000 | 1.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.667 | 0.833 | 0.833 | 1.000 | 0 |
misinfo_post | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | 0.513 | 0.345 | 0.012 | 0.000 | 1.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.667 | 1.000 | 1.000 | 0 |
misinfo_post | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | 0.427 | 0.334 | 0.011 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 0.833 | 1.000 | 0 |
misinfo_post | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | 0.380 | 0.335 | 0.012 | 0.000 | 1.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0.667 | 1.000 | 0 |
misinfo_diff | treat_accuracy : Combo_Accuracy After | 859 | 0 | -0.187 | 0.331 | 0.011 | -1.000 | 0.833 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
misinfo_diff | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | -0.163 | 0.339 | 0.011 | -1.000 | 1.000 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
misinfo_diff | treat_accuracy : Emotions_Accuracy After | 894 | 0 | -0.182 | 0.344 | 0.011 | -1.000 | 0.833 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
misinfo_diff | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | -0.193 | 0.347 | 0.011 | -1.000 | 1.000 | -0.667 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
misinfo_diff | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | -0.078 | 0.309 | 0.010 | -1.000 | 1.000 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
misinfo_diff | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | -0.070 | 0.322 | 0.011 | -1.000 | 1.000 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0 |
misinfo_diff | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | -0.026 | 0.298 | 0.010 | -1.000 | 1.000 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0 |
misinfo_diff | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | -0.011 | 0.303 | 0.011 | -1.000 | 1.000 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0 |
misinfo_diff | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | -0.149 | 0.328 | 0.011 | -1.000 | 0.833 | -0.667 | -0.333 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.167 | 0.167 | 0 |
misinfo_diff | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | -0.123 | 0.327 | 0.011 | -1.000 | 1.000 | -0.500 | -0.333 | -0.167 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0 |
new_disc_post | treat_accuracy : Combo_Accuracy After | 859 | 0 | 0.172 | 0.315 | 0.011 | -1.000 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
new_disc_post | treat_accuracy : Combo_Accuracy Inter | 881 | 0 | 0.130 | 0.292 | 0.010 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treat_accuracy : Emotions_Accuracy After | 894 | 0 | 0.182 | 0.315 | 0.011 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
new_disc_post | treat_accuracy : Emotions_Accuracy Inter | 942 | 0 | 0.151 | 0.293 | 0.010 | -0.667 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treat_accuracy : Facts Baseline_Accuracy After | 886 | 0 | 0.154 | 0.293 | 0.010 | -1.000 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treat_accuracy : Facts Baseline_Accuracy Inter | 904 | 0 | 0.131 | 0.300 | 0.010 | -1.000 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treat_accuracy : No-course Baseline_Accuracy After | 827 | 0 | 0.142 | 0.287 | 0.010 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
new_disc_post | treat_accuracy : No-course Baseline_Accuracy Inter | 801 | 0 | 0.125 | 0.295 | 0.010 | -0.833 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0 |
new_disc_post | treat_accuracy : Reasoning_Accuracy After | 856 | 0 | 0.155 | 0.321 | 0.011 | -0.667 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.167 | 0.167 | 0.333 | 0.500 | 0.667 | 0 |
new_disc_post | treat_accuracy : Reasoning_Accuracy Inter | 834 | 0 | 0.137 | 0.298 | 0.010 | -0.667 | 1.000 | -0.167 | 0.000 | 0.000 | 0.000 | 0.000 | 0.167 | 0.333 | 0.333 | 0.500 | 0 |
# loop over a set of variables and each accuracy x treatment group to generate histograms
for (var in outcome_list) {
create_histogram_grouped(df_wide, var, "treat_accuracy", NULL, paste("Histogram of", var, "by Accuracy and Treatment Group"), var, "Frequency", line_color = "black", paste("histogram_", var))
}