Brennan Center for Justice: Data Analysis

Author

Gauri Adarsh

Published

March 8, 2026

Overview

When writing my internship application for the Brennan Center for Justice, I became intrigued by the small donor public financing system and how effective it seemed at increasing civic engagement among underrepresented communities. As someone who just ran a presidential election at Northwestern, I found the problem of low voter turnout especially frustrating (even though we had the highest voter turnout in two decades, it was still only around 22%).

I kept thinking to myself during the election: how do we get people to care, especially those who have found little success in institutions and feel disenfranchised? So I decided to analyze the impact of New York State’s voluntary small donor public financing program on getting underrepresented and new groups involved in donating to campaigns, in the hope that if it works, we can advocate for its expansion using data-backed reasoning.

Data

I used New York City Campaign Finance Board’s data to do a preliminary investigation into whether small donor public financing had achieved its goals of civic engagement. Contributions data from 2013 and 2017 served as the pre-public financing program data, and data from 2021 and 2023 served as post-public financing program data.

Load packages and clean data

There’s a lot of data cleaning involved here - open the code if you’re interested (but also it’s not all that interesting) :).

Code
library(tidyverse)
library(naniar)
library(lubridate)
library(scales)
library(knitr)
library(here)

don_2013 <- read_csv("data/2013_contribution.csv")
don_2017 <- read_csv("data/2017_Contributions.csv")
don_2021 <- read_csv("data/2021_Contributions.csv")
don_2023 <- read_csv("data/2023_Contributions (2).csv")

# Tag each file with its election year and whether it falls before or after the reform
don_2013 <- don_2013 |> mutate(year = 2013, period = "pre")
don_2017 <- don_2017 |> mutate(year = 2017, period = "pre")
don_2021 <- don_2021 |> mutate(year = 2021, period = "post")
don_2023 <- don_2023 |> mutate(year = 2023, period = "post")

# Keep the same columns from every file so the years can be stacked
don_2013 <- don_2013 |> 
  select(year, period, DATE, NAME, C_CODE, ZIP, BOROUGHCD, AMNT, MATCHAMNT, RECIPNAME, OFFICECD)
don_2017 <- don_2017 |> 
  select(year, period, DATE, NAME, C_CODE, ZIP, BOROUGHCD, AMNT, MATCHAMNT, RECIPNAME, OFFICECD)
don_2021 <- don_2021 |> 
  select(year, period, DATE, NAME, C_CODE, ZIP, BOROUGHCD, AMNT, MATCHAMNT, RECIPNAME, OFFICECD)
don_2023 <- don_2023 |> 
  select(year, period, DATE, NAME, C_CODE, ZIP, BOROUGHCD, AMNT, MATCHAMNT, RECIPNAME, OFFICECD)

# Coerce ZIP to character so its type matches across files before binding
don_2013 <- don_2013 |> mutate(ZIP = as.character(ZIP))
don_2017 <- don_2017 |> mutate(ZIP = as.character(ZIP))
don_2021 <- don_2021 |> mutate(ZIP = as.character(ZIP))
don_2023 <- don_2023 |> mutate(ZIP = as.character(ZIP))

# Same for OFFICECD
don_2013 <- don_2013 |> mutate(OFFICECD = as.character(OFFICECD))
don_2017 <- don_2017 |> mutate(OFFICECD = as.character(OFFICECD))
don_2021 <- don_2021 |> mutate(OFFICECD = as.character(OFFICECD))
don_2023 <- don_2023 |> mutate(OFFICECD = as.character(OFFICECD))

donations <- bind_rows(don_2013, don_2017, don_2021, don_2023)

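donations <- donations |> 
  mutate(period = factor(period, levels = c("pre", "post")),
         C_CODE = factor(C_CODE), 
         BOROUGHCD = factor(BOROUGHCD), 
         # Flag contributions in the $5-$250 range treated here as matchable
         matchable_donation = AMNT >= 5 & AMNT <= 250) |> 
  # Keep positive contributions with contributor code IND only
  filter(C_CODE == "IND", AMNT > 0)

# Cache the cleaned data, then reload it for the analysis below
write_csv(donations, file = here("data/donations.csv"))

# Note: read_csv() re-guesses column types, so the factor columns above come back as character
donations <- read_csv(here("data/donations.csv"))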
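
The per-year steps above repeat the same three operations (tag, select, coerce), so they could also be written once as a small helper. This is only a sketch: `standardise_year()`, its argument names, and the toy rows are hypothetical stand-ins, not part of the original cleaning script or the CFB files.

```r
library(dplyr)

keep_cols <- c("DATE", "NAME", "C_CODE", "ZIP", "BOROUGHCD",
               "AMNT", "MATCHAMNT", "RECIPNAME", "OFFICECD")

# Hypothetical helper: tag one year's contributions, keep the shared columns,
# and coerce the two columns whose types can differ between files.
standardise_year <- function(df, yr, per) {
  df |>
    mutate(year = yr, period = per) |>
    select(year, period, all_of(keep_cols)) |>
    mutate(across(c(ZIP, OFFICECD), as.character))
}

# Toy stand-ins for two raw files (made-up rows, not real contributions)
toy_2017 <- tibble(DATE = "1/1/2017", NAME = "A", C_CODE = "IND", ZIP = 10001,
                   BOROUGHCD = "M", AMNT = 100, MATCHAMNT = 100,
                   RECIPNAME = "X", OFFICECD = 5, EXTRA = "dropped")
toy_2021 <- tibble(DATE = "1/1/2021", NAME = "B", C_CODE = "IND", ZIP = "11201",
                   BOROUGHCD = "K", AMNT = 25, MATCHAMNT = 25,
                   RECIPNAME = "Y", OFFICECD = "5", EXTRA = "dropped")

# ZIP is numeric in one toy file and character in the other;
# the helper makes them compatible before binding.
toy_donations <- bind_rows(
  standardise_year(toy_2017, 2017, "pre"),
  standardise_year(toy_2021, 2021, "post"))
```

With the real files, the same helper would be called once per `read_csv()` result, which keeps the column list in a single place.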

Results

I first looked at the distribution of donations from $5-$250, comparing pre-2020 to post-2020. There is a clear shift toward the smaller end of that range, which is promising. With small donor public financing, we want to strengthen the power of smaller donors so the 1% does not have such a hold on campaign finance.

Code
ggplot(
  donations |> filter(matchable_donation),
  aes(x = AMNT, fill = period, color = period)) +
  geom_density(alpha = 0.5, linewidth = .5) +
  labs(
    title = "Distribution of Matchable Contribution Sizes",
    x = "Contribution Amount ($)",
    y = "Density") +
  theme_minimal() +
  scale_fill_manual(values = c(
    "pre" = "#363737",
    "post" = "#FF474C")) +
  scale_color_manual(values = c(
    "pre" = "#363737",
    "post" = "#FF474C")) +
  theme(
    text = element_text(family = "Helvetica"),
    plot.title = element_text(face = "bold", size = 16),
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 12),
    legend.title = element_text(size = 13),
    legend.text = element_text(size = 12),
    panel.grid.minor = element_blank())

Next, I compared summary statistics before and after the reform. Table 1 shows that before small donor campaign finance reform in 2020, there were around 300K total donations, 81% of which were small donations. The median and mean donations were noticeably higher ($100 and $337.43, versus $50 and $215.22 afterward), and there were fewer unique donors (counted by distinct name). Post-reform, the share of small donations rises to 91%, the median and mean donation both fall, and the number of unique donors grows from about 187K to about 258K.

Code
# Let's make some tables!!! 

summary_donations <- donations |> 
  group_by(period) |> 
  summarise(
    total_donations = n(),
    small_donations = sum(matchable_donation),
    pct_small = mean(matchable_donation),
    median_donation = median(AMNT),
    mean_donation = mean(AMNT),
    unique_donors = n_distinct(NAME))

summary_donations |>
  mutate(
    pct_small = scales::percent(pct_small, accuracy = 0.1),
    median_donation = scales::dollar(median_donation),
    mean_donation = scales::dollar(mean_donation)) |>
  rename("Total donations" = total_donations,
  "Small donations" = small_donations,
  "% small" = pct_small, 
  "Median donation" = median_donation, 
  "Mean donation"= mean_donation, 
  "Unique donors" = unique_donors) |> 
  kable(caption = "Donation Summary by Period")
Table 1: Donation Summary by Period

period  Total donations  Small donations  % small  Median donation  Mean donation  Unique donors
post    489638           446259           91.1%    $50              $215.22        258380
pre     300702           243836           81.1%    $100             $337.43        187028
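
The comparison above uses "significantly" informally. As a rough check on the small-donation share, here is a sketch of a two-sample test of proportions using only the counts printed in Table 1. It assumes every donation is an independent observation, which is unlikely to hold exactly because the same donors give repeatedly, so the p-value should be read as indicative at best.

```r
# Counts copied from Table 1
small <- c(pre = 243836, post = 446259)
total <- c(pre = 300702, post = 489638)

# Share of donations in the $5-$250 range, by period
share <- small / total
round(share, 3)

# Two-sample test of equal proportions (base R, stats package).
# Assumption: donations are treated as independent draws.
test <- prop.test(x = small, n = total)
test$p.value
test$conf.int   # interval for (pre share - post share)
```

With samples this large, almost any difference will come out as statistically significant, so the size of the gap (about 81% versus 91%) is the more informative number.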

Table 1 is hugely exciting. We can SEE the change in the empowerment of small donors. But is that simply because there were more donations made? Perhaps there were just more contentious races in 2021 and 2023 and a lot more money donated overall?

So let us look at Table 2. The total amount of money pre- and post-reform is NOT very different: only about a 4% increase. Yet the share of total money that comes from small donors jumps from 21.2% to 29.7%. For a campaign finance environment dictated by Citizens United v. FEC post-2010, this is incredibly important: candidates are reaching out more to small donors because their money is important too.

Code
money_summary <- donations |>
  group_by(period) |>
  summarise(
    total_money = sum(AMNT),
    small_money = sum(AMNT[matchable_donation]),
    pct_small_money = small_money / total_money)

money_summary |>
  mutate(
    total_money = scales::comma(total_money),
    small_money = scales::comma(small_money),
    pct_small_money = scales::percent(pct_small_money, accuracy = 0.1)) |>
  rename("Total money" = total_money,
  "Small-donor money" = small_money,
  "% from small donors" = pct_small_money) |> 
  kable(caption = "Share of Campaign Money from Matchable Donations")
Table 2: Share of Campaign Money from Matchable Donations

period  Total money   Small-donor money  % from small donors
post    105,380,764   31,282,866         29.7%
pre     101,466,587   21,547,218         21.2%
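
The figures quoted from Table 2 can be reproduced with a few lines of arithmetic. This sketch only re-uses the totals printed in the table; the variable names are mine.

```r
# Totals copied from Table 2 (dollars)
total_money <- c(pre = 101466587, post = 105380764)
small_money <- c(pre = 21547218,  post = 31282866)

# Relative change in total money raised: about 3.9%
pct_change_total <-
  (total_money[["post"]] - total_money[["pre"]]) / total_money[["pre"]]

# Share of money from $5-$250 donations in each period
share_small <- small_money / total_money

# Change in that share, in percentage points: about 8.4
pp_change <- (share_small[["post"]] - share_small[["pre"]]) * 100

# Relative growth in small-donor dollars: about 45%
pct_change_small <-
  (small_money[["post"]] - small_money[["pre"]]) / small_money[["pre"]]
```

So total money grew by roughly 4% while small-donor money grew by roughly 45%, which is why the small-donor share rises even though the overall pot barely changes.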

Alright, so how does this matching system work in practice? Table 3 shows, for contributions with a match amount above zero, the average private donation and the average match attached to it. The average multiplier (the mean of each contribution's match amount divided by its private amount) is almost 1.

Code
match_summary <- donations |>
  filter(MATCHAMNT > 0) |>
  summarise(
    avg_private = mean(AMNT),
    avg_match = mean(MATCHAMNT),
    avg_multiplier = mean(MATCHAMNT / AMNT))

match_summary |>
  mutate(
    avg_private = scales::dollar(avg_private),
    avg_match = scales::dollar(avg_match),
    avg_multiplier = round(avg_multiplier, 3)) |>
  rename("Avg private donation" = avg_private,
  "Avg public match" = avg_match,
  "Avg match multiplier" = avg_multiplier) |> 
  kable(caption = "Summary of Matched Contributions")
Table 3: Summary of Matched Contributions

Avg private donation  Avg public match  Avg match multiplier
$153.99               $75.76            0.93
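
One detail in Table 3 is worth spelling out: the multiplier of 0.93 is a mean of per-contribution ratios, whereas dividing the two averages in the table gives roughly 0.49. The two need not agree, because large contributions pull the averages of `AMNT` and `MATCHAMNT` much more than they pull the average ratio. The toy numbers below are invented purely to show the effect and are not CFB data.

```r
# Made-up contributions: two small ones matched in full,
# one large one where only part of the amount is matched
toy <- data.frame(
  AMNT      = c(10, 100, 1000),
  MATCHAMNT = c(10, 100, 250))

# What the summarise() call above computes: mean of per-row ratios
mean_of_ratios <- mean(toy$MATCHAMNT / toy$AMNT)        # (1 + 1 + 0.25) / 3

# What dividing the two table averages computes: ratio of means
ratio_of_means <- mean(toy$MATCHAMNT) / mean(toy$AMNT)  # 120 / 370

# The same ratio of means using the averages printed in Table 3
table3_ratio_of_means <- 75.76 / 153.99
```

In the toy data the mean of ratios is 0.75 while the ratio of means is about 0.32, the same direction of gap as between 0.93 and 0.49 in Table 3.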

This got me thinking about donations by size, both WITHIN the small donation range (which in NYC is $5-$250) and above it. We see sizable increases in the under $25 and $25-$99 categories, and a decrease in the over $250 category. Once again, we see the demonstrated benefits of the small donor public financing system.

Code
# Number of donations by size

donations |>
  mutate(
    size_group = case_when(
      AMNT < 25 ~ "Under $25",
      AMNT < 100 ~ "$25–$99",
      AMNT <= 250 ~ "$100–$250",
      TRUE ~ "Over $250"),
    size_group = factor(size_group, levels = c("Under $25", "$25–$99", "$100–$250", "Over $250")),  
    period = factor(period, levels = c("pre", "post"))) |> 
  count(period, size_group) |>
  ggplot(aes(x = size_group, y = n, fill = period)) +
  geom_col(position = "dodge", color = "black", linewidth = 0.4) +
  labs(
    title = "Number of Donations by Size Category",
    x = "Donation Size",
    y = "Number of Donations") +
  theme_minimal() +
  scale_fill_manual(values = c(
    "post" = "red",
    "pre" = "black")) +
  theme(
    text = element_text(family = "Helvetica"),
    plot.title = element_text(face = "bold", size = 16),
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 12),
    legend.position = "right",
    panel.grid.minor = element_blank())
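
Because the post period has many more donations overall, raw counts per size category can rise simply because everything rose. A complementary view is each category's share within its own period. The sketch below uses the same `case_when()` cut-offs as the plot code, applied to a handful of invented amounts; the labels use a plain hyphen and the data are not from the CFB files.

```r
library(dplyr)

# Made-up amounts: 4 "pre" donations and 6 "post" donations
toy <- tibble(
  period = c(rep("pre", 4), rep("post", 6)),
  AMNT   = c(10, 50, 150, 500,
             10, 10, 20, 50, 150, 500))

size_share <- toy |>
  mutate(
    size_group = case_when(
      AMNT < 25   ~ "Under $25",
      AMNT < 100  ~ "$25-$99",
      AMNT <= 250 ~ "$100-$250",
      TRUE        ~ "Over $250")) |>
  count(period, size_group) |>
  group_by(period) |>
  mutate(share = n / sum(n)) |>   # share within each period
  ungroup()

size_share
```

In this toy example the "Over $250" count is identical in both periods, yet its share falls from 1/4 to 1/6, which is the kind of shift a count-only chart can hide.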

Finally, I took a look at the claim that the small donor public financing system encourages new and more diverse donors, specifically donors from “historically underrepresented communities”, to donate. I’d want to do a more thorough analysis of this using zip codes and demographic data, but for now, I limited it to an analysis of the share of donations by borough. As we can see, Manhattan’s share of donations decreased noticeably, while Brooklyn’s and out-of-NYC’s increased by a good amount.

Code
borough_share <- donations |>
  count(period, BOROUGHCD) |> 
  group_by(period) |>
  mutate(
    total = sum(n),
    pct = n / total)

borough_change <- borough_share |>
  select(period, BOROUGHCD, pct) |>
  pivot_wider(names_from = period, values_from = pct) |>
  mutate(change = post - pre) 

ggplot(borough_change |> filter(!is.na(BOROUGHCD)),
       aes(x = BOROUGHCD, y = change, fill = change > 0)) +
  geom_col(width = 0.7, color = "black", linewidth = 0.4) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Change in Share of Donations by Borough",
    x = "Borough",
    y = "Change in Share of Donations") +
  theme_minimal() +
  scale_x_discrete(labels = c(
    "K" = "Brooklyn",
    "M" = "Manhattan",
    "Q" = "Queens",
    "S" = "Staten Island",
    "X" = "Bronx",
    "Z" = "Outside NYC")) + 
  scale_fill_manual(values = c(
    "TRUE" = "red",   
    "FALSE" = "black")) +
  theme(text = element_text(family = "Helvetica"), 
    plot.title = element_text(face = "bold", size = 16),
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 12),
    legend.position = "none",
    panel.grid.minor = element_blank())
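
The zip-code analysis mentioned above could follow the same share-by-period pattern as the borough chart, with one extra join. This is a sketch under loud assumptions: `zip_demographics` and its `income_group` column are hypothetical (no such table is part of this project yet), and the ZIP values are placeholders rather than real zip codes.

```r
library(dplyr)

# Made-up contributions with placeholder ZIP codes
toy_donations <- tibble(
  period = c("pre", "pre", "post", "post", "post"),
  ZIP    = c("99991", "99991", "99991", "99992", "99993"),
  AMNT   = c(250, 100, 100, 20, 10))

# Hypothetical lookup table: one row per ZIP with a demographic label.
# In practice this would have to come from an external source.
zip_demographics <- tibble(
  ZIP          = c("99991", "99992", "99993"),
  income_group = c("higher", "lower", "lower"))

# ZIP must be character on both sides for the join keys to match
share_by_group <- toy_donations |>
  left_join(zip_demographics, by = "ZIP") |>
  count(period, income_group) |>
  group_by(period) |>
  mutate(share = n / sum(n)) |>
  ungroup()

share_by_group
```

One practical caveat: if ZIP is read back from a CSV without an explicit column type, it may be parsed as a number, so it is worth checking that both tables store it as text before joining.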

Conclusion

Working on this analysis made me even more curious about the mechanics of small donor public financing and how policy design can shape civic participation. While this was a preliminary exploration, it was exciting to see patterns that suggest small donors may be playing a larger role in campaign finance after the reform. If nothing else, it reassured me that staring at large campaign finance datasets for fun might actually be a productive use of time :)

Given more time, I would love to expand this analysis further by connecting contribution data with demographic and geographic information to better understand whether participation increased among historically underrepresented communities. There is a lot more to explore here, and this project only scratches the surface of the questions I find most interesting.

More broadly, working on this project reinforced why I’m so drawn to the Brennan Center’s work. I’m fascinated by the ways institutional design can shape political participation, and campaign finance reform feels like a particularly powerful lever for strengthening democratic engagement. If I were lucky enough to work with the Brennan Center’s legal department this summer, I would be thrilled to keep asking questions like these!