We will assess provided data on the present allocation of the Infrastructure Investment and Jobs Act funding by State and Territory to address the following questions:
Is the allocation equitable based on the population of each of the States and Territories, or is bias apparent?
Does the allocation favor the political interests of the Biden administration?
To do so we will source data on the current estimated populations and official 2020 Presidential election results of each State and Territory.
Subsections:
- Loading Libraries
- Loading Data
- Description of Variables
We’re using the tidyverse to have a cohesive and consistent ecosystem of libraries for data science projects. The maps and mapproj libraries are used to plot the USA in the second half of the report.
# Libraries used
library(tidyverse)
library(maps)
library(mapproj)
We took the provided IIJA funding data (as of March 2023) and combined it in a single spreadsheet with estimated population data (as of July 2023) and the electoral and popular vote totals from the 2020 presidential election before loading it.
While the following locations all received IIJA funding, none of them
cast electoral votes in the 2020 presidential election, so they have
been excluded:
- American Samoa
- Guam
- Northern Mariana Islands
- Puerto Rico
- the Tribal Communities
- the US Virgin Islands
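In this report the exclusion was done by hand in the spreadsheet. As a rough sketch of how the same step could be done in R instead, the example below filters a small stand-in table. The `raw` data frame and the exact label strings in `excluded` are assumptions made for illustration, not the contents of the provided funding file.

```r
library(dplyr)

# Locations excluded in this report (label spellings are assumed)
excluded <- c("American Samoa", "Guam", "Northern Mariana Islands",
              "Puerto Rico", "Tribal Communities", "US Virgin Islands")

# Hypothetical stand-in for the raw funding table: names only, no figures
raw <- data.frame(
  Area = c("Alabama", "Guam", "Alaska", "Puerto Rico"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose Area is not in the exclusion list
kept <- raw %>% filter(!Area %in% excluded)
kept$Area
```

Doing the filter in code rather than in the spreadsheet would leave a visible record of which rows were dropped.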
Compiling the data in Google Sheets and downloading it as a .csv file allowed us to make adjustments quickly and to minimize manipulation after loading. These adjustments included: correcting misspelled State names (e.g. "Deleware"), pairing each State with its two-letter abbreviation, collating the three sets of data by State, and creating convenient column names. For the purposes of this assignment we refer to the District of Columbia as a State.
The population data came from the U.S. Census Bureau’s Annual Estimates of the Resident Population for the United States, located here: https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html
The federal election results came from the FEC here: https://www.fec.gov/documents/4228/federalelections2020.xlsx
# Loading Data
FPV <- read_csv("IIJA Funding Population Votes.csv",show_col_types = FALSE)
Below is a glimpse of the data and a table of the variables with short descriptions:
Variable | Definition |
---|---|
Area | Name of State (or District) |
Code | Two-Letter State Code |
Funding | IIJA Funding in Billions of Dollars |
Population | U.S. Census Bureau Population Estimate as of 7/1/2023 |
ElecVoteD | Electoral Votes for Biden |
ElecVoteR | Electoral Votes for Trump |
PopVoteD | Popular Votes for Biden |
PopVoteR | Popular Votes for Trump |
# Quick look at the data
glimpse(FPV)
## Rows: 51
## Columns: 8
## $ Area <chr> "Alabama", "Alaska", "Arizona", "Arkansas", "California", "…
## $ Code <chr> "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL",…
## $ Funding <dbl> 3.000, 3.700, 3.500, 2.800, 18.400, 3.200, 2.500, 0.792, 1.…
## $ Population <dbl> 5108468, 733406, 7431344, 3067732, 38965193, 5877610, 36171…
## $ ElecVoteD <dbl> NA, NA, 11, NA, 55, 9, 7, 3, 3, NA, 16, 4, NA, 20, NA, NA, …
## $ ElecVoteR <dbl> 9, 3, NA, 6, NA, NA, NA, NA, NA, 29, NA, NA, 4, NA, 11, 6, …
## $ PopVoteD <dbl> 849624, 153778, 1672143, 423932, 11110639, 1804352, 1080831…
## $ PopVoteR <dbl> 1441170, 189951, 1661686, 760647, 6006518, 1364607, 714717,…
To assess whether the funding is equitably allocated based on population we check if the funding per person is roughly equal for each location. For this we’ll need a variable for Funding per person.
To assess whether the allocation, assuming there is a bias, favors the
political interests of the Biden administration, we could compare:
- Funding per person vs Winner of States’ electoral votes
- Funding per person vs Percent of popular vote that went to Biden
Subsections:
- Funding per person
- Winner of States’ electoral votes
- Percent of popular vote that went to Biden
We’re creating a funding-per-person variable, FundingPP, by dividing Funding by the estimated population. Because Funding is recorded in billions of dollars, we multiply it by one billion so that the result is in dollars per person.
# Create Funding per person
FPV$FundingPP <- FPV$Funding * 1000000000 / FPV$Population
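As a quick sanity check on the unit conversion, the sketch below repeats the calculation on the first three rows shown in the `glimpse()` output. The `check` data frame is an illustrative copy of those rows, not a replacement for `FPV`.

```r
# First three rows of the glimpse() output in this report
check <- data.frame(
  Code       = c("AL", "AK", "AZ"),
  Funding    = c(3.0, 3.7, 3.5),             # billions of dollars
  Population = c(5108468, 733406, 7431344),  # 7/1/2023 estimates
  stringsAsFactors = FALSE
)

# Convert billions to dollars, then divide by population
check$FundingPP <- check$Funding * 1e9 / check$Population

# Whole-dollar values; Alaska rounds to 5045
round(check$FundingPP)
```

Alaska's value agrees with the $5,045/person figure quoted later in the report.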
Our new variable, BidenWon, will be 1 if Biden won the State or 0 if Trump won. Note that Maine and Nebraska can split their electoral votes between candidates, so we compare the two electoral vote counts rather than checking which one is present. If the counts were ever equal, this comparison would incorrectly assign the State to Trump.
# Set missing values to zero before creating the next variable
FPV$ElecVoteD <- replace_na(FPV$ElecVoteD,0)
FPV$ElecVoteR <- replace_na(FPV$ElecVoteR,0)
# Create variable determining who each state voted for
FPV <- FPV %>%
  mutate(BidenWon = ifelse(ElecVoteD > ElecVoteR, 1, 0))
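The comparison above would label a tie as a Trump win. One possible way to guard against that is sketched below with `dplyr::case_when()`, which marks a tie as missing. The Maine (3 to 1) and Nebraska (1 to 4) rows reflect the 2020 splits, while the "Tied" row is a hypothetical added only to show the edge case. This is an alternative sketch, not the method used in the report.

```r
library(dplyr)

# Maine and Nebraska are the 2020 splits; "Tied" is a hypothetical row
ev <- data.frame(
  Area      = c("Maine", "Nebraska", "Tied"),
  ElecVoteD = c(3, 1, 2),
  ElecVoteR = c(1, 4, 2),
  stringsAsFactors = FALSE
)

# 1 = Biden won more electoral votes, 0 = Trump won more, NA = tie
ev <- ev %>%
  mutate(BidenWon = case_when(
    ElecVoteD > ElecVoteR ~ 1,
    ElecVoteD < ElecVoteR ~ 0,
    TRUE                  ~ NA_real_
  ))
ev$BidenWon
```

No State tied in 2020, so both approaches give the same labels for this dataset.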
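Our new variable, PctBiden, will be Biden’s share of the two-party popular vote in each State: Biden’s votes divided by the combined Biden and Trump votes. It is stored as a proportion between 0 and 1 rather than as a percentage.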
# Create variable for percent of popular vote that went to Biden
FPV$PctBiden <- FPV$PopVoteD / (FPV$PopVoteD + FPV$PopVoteR)
If IIJA funding were allocated on population alone, we would expect funding per person to be roughly the same in every State. However, the following chart shows that funding per person ranges from $363/person in Florida to $5,045/person in Alaska.
The chart shows that there is bias in the allocation of IIJA funding relative to population. States in green receive more than $1,000/person, States in red receive less than $500/person, and States in blue fall in between.
If the IIJA funding were allocated to reward States that voted for Biden, then we would expect to see higher funding per person in the States that voted for Biden.
We’ve created two maps, one of the States that voted for Biden and one of the States that voted for Trump, with lighter shades indicating higher funding. If IIJA funding were allocated in favor of the States that voted for Biden, we would expect the Biden map to be lighter, but it is not discernibly lighter than the Trump map.
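The visual comparison could be backed by a grouped summary. The sketch below shows the mechanics using only the first six rows visible in the `glimpse()` output, with `BidenWon` filled in from the electoral vote columns of those rows. Six rows are far too few to draw any conclusion; with the full data, `FPV` would take the place of the illustrative `sample_fpv`.

```r
library(dplyr)

# First six rows of the glimpse() output; NOT the full 51-row dataset
sample_fpv <- data.frame(
  Code       = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  Funding    = c(3.0, 3.7, 3.5, 2.8, 18.4, 3.2),  # billions of dollars
  Population = c(5108468, 733406, 7431344, 3067732, 38965193, 5877610),
  BidenWon   = c(0, 0, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Median funding per person for Trump (0) and Biden (1) States
summary_tbl <- sample_fpv %>%
  mutate(FundingPP = Funding * 1e9 / Population) %>%
  group_by(BidenWon) %>%
  summarise(States = n(), MedianFundingPP = median(FundingPP))
summary_tbl
```

The median is used here rather than the mean because a single outlier such as Alaska would otherwise dominate its group.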
We show that funding per person was not even across States; however, the differences were not explained by whether a State voted for Biden.
Thematic Consistency - it would have been nice if we could have married the look of the charts in the two sections so they looked as if they were part of a cohesive story.
People aren’t used to seeing the USA in parts. Maybe I could have shaded the excluded States gray. I also wasn’t able to capture Alaska or Hawaii. The color scales of the two maps are slightly different; it would be better to use the same scale in both. Finally, I could have renamed the legend so it was clearer, maybe “Funding Per Person”.
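One possible way to address the scale and legend points is to build a single fill scale and add it to both maps. The sketch below assumes ggplot2's `scale_fill_gradient()` with explicit `limits` and `name` arguments. The helper `shared_fill()` is a hypothetical name introduced here, and the two input values are illustrative only.

```r
library(ggplot2)

# Hypothetical helper: one fill scale to be reused by both maps.
# `values` would be the column used for shading (FundingPPlog in this report).
shared_fill <- function(values) {
  scale_fill_gradient(
    name   = "Funding Per Person (log $)",    # clearer legend title
    limits = range(values, na.rm = TRUE)      # same limits on every map
  )
}

# Illustrative input only: logs of two funding-per-person values
fill_scale <- shared_fill(log(c(5045, 471)))
```

In the report's code, adding `shared_fill(FPV$FundingPPlog)` to each of the two map plots should give both maps identical limits and the same legend title, since the limits would come from all States rather than from each subset.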
Because I went with maps in the second part of the report, I gave up some specificity. Had I gone with charts I could have left the reader with a more definitive sense of an answer.
Ultimately I didn’t use the Percent of the Popular vote that went to Biden so I could have trimmed down the file after it was completed.
To get started on making the map I relied on Dr. Lyndon Walker’s YouTube video: “How to plot a color coded map of USA in R”
https://www.youtube.com/watch?v=BwRwjAbXSSU
# Libraries used
library(tidyverse)
library(maps)
library(mapproj)
# Loading Data
FPV <- read_csv("IIJA Funding Population Votes.csv",show_col_types = FALSE)
# Quick look at the data
glimpse(FPV)
# Create Funding per person
FPV$FundingPP <- FPV$Funding * 1000000000 / FPV$Population
# Set missing values to zero before creating the next variable
FPV$ElecVoteD <- replace_na(FPV$ElecVoteD,0)
FPV$ElecVoteR <- replace_na(FPV$ElecVoteR,0)
# Create variable determining who each state voted for
FPV <- FPV %>%
  mutate(BidenWon = ifelse(ElecVoteD > ElecVoteR, 1, 0))
# Create variable for percent of popular vote that went to Biden
FPV$PctBiden <- FPV$PopVoteD / (FPV$PopVoteD + FPV$PopVoteR)
# Plot to show funding per person not equal between States
plt <- ggplot(FPV, aes(
  y = reorder(Code, +FundingPP),
  x = FundingPP,
  fill = cut(FundingPP, breaks = c(-Inf, 500, 1000, Inf), labels = c("red", "blue", "green"))
)) +
  geom_col(width = 0.8) +
  labs(y = "States", x = "Dollars of Funding Per Person", title = "Dollars of Funding Per Person Are Not Equal Across States") +
  scale_fill_identity() +
  scale_x_continuous(expand = c(0, 0)) +
  theme(axis.text.y = element_text(angle = 0, hjust = 0))
plt
# Preparation to create maps
states <- map_data("state")
FPV <- FPV %>%
  rename(region = Area)
FPV$FundingPPlog <- log(FPV$FundingPP)
FPV$region <- tolower(FPV$region)
# Sort maps for display and create one for Biden or Trump
FPV.geo <- merge(states, FPV, sort = FALSE, by = "region")
FPV.geo <- FPV.geo[order(FPV.geo$order), ]
FPV.geo_biden <- filter(FPV.geo, BidenWon == 1)
FPV.geo_trump <- filter(FPV.geo, BidenWon == 0)
# Plot Biden's Map
ggplot(FPV.geo_biden, aes(long, lat)) +
  geom_polygon(aes(group = group, fill = FundingPPlog)) +
  coord_map() +
  theme_void() +
  labs(title = "States that Voted for Biden Shaded by Funding Per Person")
# Plot Trump's Map
ggplot(FPV.geo_trump, aes(long, lat)) +
  geom_polygon(aes(group = group, fill = FundingPPlog)) +
  coord_map() +
  theme_void() +
  labs(title = "States that Voted for Trump Shaded by Funding Per Person")