DATA_608_Story_01_New

Author

Henock Montcho

Published

February 9, 2026

Story - 1 : Infrastructure Investment & Jobs Act Funding Allocation

Instructions

The attached Excel file contains data on the present allocation of the Infrastructure Investment and Jobs Act funding by State and Territory.

library(forcats)
library(ggplot2)
library(tidyr)
library(dplyr)
library(readxl)

Your story (Data Visualization(s) ) should address the following questions:

Is the allocation equitable based on the population of each of the States and Territories, or is bias apparent?

Let’s plot the funding allocation by State and Territory, and sort in ascending order

setwd("C:/Users/month/OneDrive/Documents/Cuny MSDS/Data 608_Knowledge And Visual Analytics")
allocation <- read_excel("IIJAFUNDINGASOFMARCH2023.xlsx")
head(allocation)

# A tibble: 6 × 2
  `State, Teritory or Tribal Nation` `Total (Billions)`
  <chr>                                           <dbl>
1 ALABAMA                                        3     
2 ALASKA                                         3.7   
3 AMERICAN SAMOA                                 0.0686
4 ARIZONA                                        3.5   
5 ARKANSAS                                       2.8   
6 CALIFORNIA                                    18.4

allocation |>
  ggplot(aes(
    y = fct_reorder(`State, Teritory or Tribal Nation`, `Total (Billions)`, .desc = TRUE),
    x = `Total (Billions)`)) +
  geom_bar(stat = "identity") +
  theme(
    axis.text.y = element_text(size = 5),
    plot.margin = margin(10, 10, 10, 20)) +
  labs(title = "IIJA Funding as of March 2023", 
       x = "Total Funding (Billions)", 
       y = "State, Territory, or Tribal Nation")

Let’s plot the population by State and Territory, and sort in ascending order

library(forcats)
library(ggplot2)
library(tidyr)
library(dplyr)
library(readxl)
setwd("C:/Users/month/OneDrive/Documents/Cuny MSDS/Data 608_Knowledge And Visual Analytics")
population <- read_excel("EST2025POP.xlsx")
head(population)

# A tibble: 6 × 2
  `State, Teritory or Tribal Nation` Population
  <chr>                                   <dbl>
1 Alabama                               5193088
2 Alaska                                 737270
3 AMERICAN SAMOA                          46400
4 Arizona                               7623818
5 Arkansas                              3114791
6 California                           39355309

population |>
  ggplot(aes(
    y = fct_reorder(`State, Teritory or Tribal Nation`, `Population`, .desc = TRUE),
    x = `Population`)) +
  geom_bar(stat = "identity") +
  theme(
    axis.text.y = element_text(size = 5),
    plot.margin = margin(10, 10, 10, 20)) +
  labs(title = "US Population as of 2025", 
       x = "Total Population", 
       y = "State, Territory, or Tribal Nation")

Comment:

The allocation could potentially be equitable based on the population of each of the States and Territories, if a random State or Territory has the same position on both graphs. From the two visualizations above, we realize that there are some instances where the funding allocation by State or Territory does not match the population by State or Territory. For example, Wyoming is the 5th smallest State or Territory in terms of population, but it has the 21st smallest funding by State or Territory.

Let’s estimate the Funding per Capita to assess the impact from the change in the populations

library(stringr)  

allocation_norm <- allocation |>
  mutate(state_key = str_to_upper(str_squish(str_trim(`State, Teritory or Tribal Nation`))))

population_norm <- population |>
  mutate(state_key = str_to_upper(str_squish(str_trim(`State, Teritory or Tribal Nation`))))

joined <- allocation_norm |>
  left_join(population_norm, by = "state_key") |>
  mutate(
    funding_per_capita = (`Total (Billions)` * 1e9) / Population,
    `State, Teritory or Tribal Nation`= `State, Teritory or Tribal Nation.x`
  ) |>
  select(-c(1,3,4)) |>
  select(`State, Teritory or Tribal Nation`, everything())
head(joined)

# A tibble: 6 × 4
  State, Teritory or Tribal N…¹ `Total (Billions)` Population funding_per_capita
  <chr>                                      <dbl>      <dbl>              <dbl>
1 ALABAMA                                   3         5193088               578.
2 ALASKA                                    3.7        737270              5019.
3 AMERICAN SAMOA                            0.0686      46400              1478.
4 ARIZONA                                   3.5       7623818               459.
5 ARKANSAS                                  2.8       3114791               899.
6 CALIFORNIA                               18.4      39355309               468.
# ℹ abbreviated name: ¹`State, Teritory or Tribal Nation`

Let’s plot the Funding per Capita

stats <- summary(joined$funding_per_capita)

joined |>
  ggplot(aes(y = `State, Teritory or Tribal Nation`, 
             x = funding_per_capita)) +
  geom_bar(stat = "identity") +
  theme(
    axis.text.y = element_text(size = 5),
    plot.margin = margin(10, 10, 10, 20)) +
  labs(title = "USA Per Capita Funding as of 2025", 
       x = "Per Capita Funding (Dollars)", 
       y = "State, Territory, or Tribal Nation") +
  geom_vline(xintercept = stats["Median"], color = "blue", linetype = "dashed") +
  geom_text(aes(x = stats["Median"], y = -1, label = paste("Median =", round(stats["Median"], 2))),
            vjust = -0.5, hjust = 0.5, size = 4, color = "blue")

Comment:

The visualization shows that several states receive significantly higher per‑capita funding than the median. Alaska stands out with more than five thousand dollars per resident; over eight times the median allocation. Other states and territories, including Wyoming, West Virginia, Vermont, the U.S. Virgin Islands, South Dakota, Rhode Island, the Northern Mariana Islands, North Dakota, New Mexico, Montana, the District of Columbia, and American Samoa, also exhibit disproportionately high per‑capita funding relative to their populations.

Conversely, some jurisdictions; most notably Puerto Rico and Florida, fall well below the median.

Taken together, these disparities suggest that IIJA funding is not distributed proportionately to population size, indicating that the allocation may not be equitable if evaluated strictly on a per‑capita basis.

Does the allocation favor the political interests of the Biden administration?

Let’s recreate a join to only take into consideration the States

setwd("C:/Users/month/OneDrive/Documents/Cuny MSDS/Data 608_Knowledge And Visual Analytics")
results2020 <- read.csv("1976_2020president.csv", na.strings = c("")) |>
 filter(c(year == 2020), (party_detailed %in% c("DEMOCRAT","REPUBLICAN"))) |>
  select(c(1,2,8,9,11))
head(results2020)

  year   state           candidate party_detailed candidatevotes
1 2020 ALABAMA BIDEN, JOSEPH R. JR       DEMOCRAT         849624
2 2020 ALABAMA    TRUMP, DONALD J.     REPUBLICAN        1441170
3 2020  ALASKA BIDEN, JOSEPH R. JR       DEMOCRAT         153778
4 2020  ALASKA    TRUMP, DONALD J.     REPUBLICAN         189951
5 2020 ARIZONA BIDEN, JOSEPH R. JR       DEMOCRAT        1672143
6 2020 ARIZONA    TRUMP, DONALD J.     REPUBLICAN        1661686

joined_new <- joined  |>
  left_join(results2020, by = c(`State, Teritory or Tribal Nation` = "state"))

#Dropping NA's to only keep States (no Territories)
states_only <- drop_na(joined_new)
head(states_only)

# A tibble: 6 × 8
  State, Teritory or Tr…¹ `Total (Billions)` Population funding_per_capita  year
  <chr>                                <dbl>      <dbl>              <dbl> <int>
1 ALABAMA                                3      5193088               578.  2020
2 ALABAMA                                3      5193088               578.  2020
3 ALASKA                                 3.7     737270              5019.  2020
4 ALASKA                                 3.7     737270              5019.  2020
5 ARIZONA                                3.5    7623818               459.  2020
6 ARIZONA                                3.5    7623818               459.  2020
# ℹ abbreviated name: ¹`State, Teritory or Tribal Nation`
# ℹ 3 more variables: candidate <chr>, party_detailed <chr>,
#   candidatevotes <int>

Let’s explore the funding from the political parties standpoint

joined_new |>
  group_by(party_detailed) |>
  summarize(total = sum(`Total (Billions)`)) |>
  mutate(funding_ratio = total / sum(total)) |>
  ggplot(aes(x = party_detailed, y = funding_ratio * 100, fill = party_detailed)) +
  geom_bar(stat = "identity") +
  labs(title = "Funding Per Party", 
       x = "Political Party", 
       y = "Percentage of Total Allocations") +
  scale_fill_manual(values = c("blue", "red", "grey")) +
  geom_text(aes(label = paste0(round(funding_ratio*100, 0),'%'), vjust = -0.2))

Comment:

The funding per political parties is approximately the same between Democrats and Republicans.

Let’s explore whether the winner of the 2020 presidential elections (Biden administration) favored Democrat states

states_only |>
  group_by(`State, Teritory or Tribal Nation`)  |>
  slice_max(order_by = candidatevotes, n = 1, with_ties = FALSE)  |>
  ungroup()  |>

  ggplot(aes(y = reorder(`State, Teritory or Tribal Nation`, funding_per_capita), x = funding_per_capita, fill = candidate)) +
  geom_bar(stat = "identity")+
  theme(
    axis.text.y = element_text(size = 5),
    plot.margin = margin(10, 10, 10, 20)) +
  labs(title = "USA Per Capita Funding as of 2025", 
       x = "Funding Per Capita (Dollars)", 
       y = "State",
       fill = "Elected Candidate 2020") +
  scale_fill_manual(values = c("blue", "red"))

Comment:

States receiving higher levels of Funding per Capita, are predominantly those that supported the Republican candidate in the 2020 presidential election. This pattern suggests that the distribution of funds does not simply reflect partisan alignment with the current administration. However, many factors could influence how these allocations were determined. Additional information such as: demographic characteristics, economic conditions, and infrastructure needs, would be necessary to better understand the underlying drivers of these funding decisions.

Conclusion

The visualizations suggest that funding levels are not distributed evenly across states, as there are substantial differences in per‑capita allocations. Although large states like California and Texas received high total amounts due to their populations, their per‑person funding was relatively low compared with states that have much smaller populations, such as Alaska and Wyoming.

At the same time, the distribution does not appear to align with President Biden’s partisan interests. The states receiving the highest per‑capita funding are predominantly Republican‑leaning. To understand what factors may be driving these patterns, further analysis would be needed; particularly incorporating demographic, economic, and infrastructure‑related variables.

Sources of the data used:

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/42MVDX

(For the 2020 Presidential Election results).