#' nice_table
#' 
#' @param df
#' @param fw
nice_table <- function(df, cap=NULL, cols=NULL, dig=3, fw=F){
  if (is.null(cols)) {c <- colnames(df)} else {c <- cols}
  table <- df %>% 
    kable(caption=cap, col.names=c, digits=dig) %>% 
    kable_styling(
      bootstrap_options = c("striped", "hover", "condensed"),
      html_font = 'monospace',
      full_width = fw)
  return(table)
}

Overview

The Infrastructure Investment and Jobs Act is a significant commitment by the federal government to improving the nation’s infrastructure. This analysis seeks to assess the equity of the Act’s funding allocation across states and territories, as well as whether there is any evidence of political bias in the distribution of funds. Understanding how these funds are allocated is critical for ensuring that the Act achieves its stated goal of equitable infrastructure development.

1. Data Preparation

1.1 Summary Statistics

Funding Data

The funding data contained 57 observations of 2 variables. It included columns for each state, territory, or tribal nation, as well as total funding in billions. The data source: Github repo

Population Data

The population data had a complex structure, with row headers and numerous unnamed columns. The relevant data appeared to be organized into columns with headers such as “April 1, 2020 Population Estimates Base” and “Population Estimate (as of July 1)” for various years. As a results of data preparation step, this dataset should have contained 57 observations of 2 variables. It included columns for each state, territory, or tribal nation, as well as total population for 2020 year. The data source: United States Census Bureau, List of U.S. states and territories by population, Native Americans in the United States

Election Results Data

The election results data included columns for the state, final call (indicating which party won), Democratic and Republican votes, other votes, percentages for each, and margin of victory. There were additional columns for state IDs, electoral votes, and other metadata. As a results of data preparation step, this dataset should have contained 51 observations of 4 variables. The data source: The Cook Political Report with Amy Walter

All three datasets had no missing values for their key columns, indicating that the data was complete in terms of the presence of values.

# Load data from Github
df1 <- read.csv("https://raw.githubusercontent.com/ex-pr/DATA_608/main/Story_1/Popular%20vote%20backend%20-%20Sheet1.csv")

df2 <- read.xlsx("https://github.com/ex-pr/DATA_608/blob/main/Story_1/NST-EST2023-CHG.xlsx?raw=true", 1)

df3 <- read.xlsx("https://github.com/ex-pr/DATA_608/blob/main/Story_1/IIJA%20FUNDING%20AS%20OF%20MARCH%202023(1).xlsx?raw=true", 1)
elections <- df1
population <- df2
funding <- df3
#show the data in tables
elections %>% 
  nice_table(cap='The 2020 Presidential election') %>% 
  scroll_box(width='100%', height = "400px")
The 2020 Presidential election
state called final dem_votes rep_votes other_votes dem_percent rep_percent other_percent dem_this_margin margin_shift vote_change stateid EV X Y State_num Center_X Center_Y X.1 X2016.Margin Total.2016.Votes
U.S. Total D Yes 81,282,916 74,223,369 2,891,441 51.3% 46.9% 1.8% 4.5% 2.4% 15.9% NA NA NA NA NA NA NA 0.021 136,639,848
15 Key Battlegrounds 31,908,248 33,002,287 950,354 48.4% 50.1% 1.4% -1.7% 2.0% 17.2% NA NA NA NA NA NA NA -0.036 56,209,173
Non-Battlegrounds 49,374,668 41,221,082 1,941,087 53.4% 44.5% 2.1% 8.8% 2.7% 15.1% NA NA NA NA NA NA NA 0.061 80,430,331
15 Key Battlegrounds NA NA NA NA NA NA NA NA
Arizona D Yes 1,672,143 1,661,686 53,497 49.4% 49.1% 1.6% 0.3% 3.9% 31.6% AZ 11 2 2 4 205 374 NA -0.035 2,573,165
Florida R Yes 5,297,045 5,668,731 101,680 47.9% 51.2% 0.9% -3.4% -2.2% 17.5% FL 29 8 0 12 787 520 NA -0.012 9,420,039
Georgia D Yes 2,473,633 2,461,854 62,229 49.5% 49.3% 1.2% 0.2% 5.4% 22.1% GA 16 7 1 13 734 419 NA -0.051 4,092,373
Iowa R Yes 759,061 897,672 34,138 44.9% 53.1% 2.0% -8.2% 1.2% 8.0% IA 6 5 4 19 536 223 NA -0.094 1,566,031
Maine 2nd District R Yes 168,696 196,692 10,961 44.8% 52.3% 2.9% -7.4% 2.8% 6.5% ME2 1 12 8 102 919 92 NA -0.103 353,416
Michigan D Yes 2,804,040 2,649,852 85,410 50.6% 47.8% 1.5% 2.8% 3.0% 15.4% MI 16 7 5 26 681 190 NA -0.002 4,799,284
Minnesota D Yes 1,717,077 1,484,065 76,029 52.4% 45.3% 2.3% 7.1% 5.6% 11.3% MN 10 5 5 27 519 120 NA 0.015 2,944,813
Nebraska 2nd District D Yes 176,468 154,377 6,909 52.2% 45.7% 2.0% 6.5% 8.8% 15.8% NE2 1 12 6 104 438 238 NA -0.022 291,680
Nevada D Yes 703,486 669,890 32,000 50.1% 47.7% 2.3% 2.4% 0.0% 24.9% NV 6 2 4 32 136 240 NA 0.024 1,125,385
New Hampshire D Yes 424,921 365,654 15,607 52.7% 45.4% 1.9% 7.4% 7.0% 8.3% NH 4 11 6 33 895 146 NA 0.004 744,158
North Carolina R Yes 2,684,292 2,758,775 81,737 48.6% 49.9% 1.5% -1.3% 2.3% 16.5% NC 15 7 2 37 808 349 NA -0.037 4,741,564
Ohio R Yes 2,679,165 3,154,834 88,203 45.2% 53.3% 1.5% -8.0% 0.1% 7.7% OH 18 7 4 39 720 250 NA -0.081 5,496,487
Pennsylvania D Yes 3,458,229 3,377,674 79,380 50.0% 48.8% 1.1% 1.2% 1.9% 13.1% PA 20 8 4 42 806 221 NA -0.007 6,115,402
Texas R Yes 5,259,126 5,890,347 165,583 46.5% 52.1% 1.5% -5.6% 3.4% 26.2% TX 38 3 0 48 444 471 NA -0.090 8,969,226
Wisconsin D Yes 1,630,866 1,610,184 56,991 49.4% 48.8% 1.7% 0.6% 1.4% 10.8% WI 10 6 6 55 601 169 NA -0.008 2,976,150
Non-Battlegrounds NA NA NA NA NA NA NA NA
Alabama R Yes 849,624 1,441,170 32,488 36.6% 62.0% 1.4% -25.5% 2.3% 9.4% AL 9 6 1 1 674 424 NA -0.277 2,123,372
Alaska R Yes 153,778 189,951 15,801 42.8% 52.8% 4.4% -10.1% 4.7% 12.8% AK 3 0 6 2 105 512 NA -0.147 318,608
Arkansas R Yes 423,932 760,647 34,490 34.8% 62.4% 2.8% -27.6% -0.7% 7.8% AR 6 5 2 5 564 386 NA -0.269 1,130,635
California D Yes 11,110,250 6,006,429 384,192 63.5% 34.3% 2.2% 29.2% -0.9% 23.4% CA 55 1 3 6 75 283 NA 0.301 14,181,595
Colorado D Yes 1,804,352 1,364,607 87,993 55.4% 41.9% 2.7% 13.5% 8.6% 17.1% CO 9 3 3 8 330 284 NA 0.049 2,780,220
Connecticut D Yes 1,080,831 714,717 28,309 59.3% 39.2% 1.6% 20.1% 6.4% 10.9% CT 7 10 4 9 886 190 NA 0.136 1,644,920
Delaware D Yes 296,268 200,603 7,475 58.7% 39.8% 1.5% 19.0% 7.6% 14.2% DE 3 11 3 10 853 258 NA 0.114 441,590
District of Columbia D Yes 317,323 18,586 8,447 92.1% 5.4% 2.5% 86.8% 0.0% 10.6% DC 3 9 3 11 829 264 NA 0.868 311,268
Hawaii D Yes 366,130 196,864 11,475 63.7% 34.3% 2.0% 29.5% -2.7% 33.9% HI 4 0 0 15 272 570 NA 0.322 428,937
Idaho R Yes 287,021 554,119 26,091 33.1% 63.9% 3.0% -30.8% 1.0% 25.6% ID 4 2 5 16 195 148 NA -0.318 690,255
Illinois D Yes 3,471,915 2,446,891 114,938 57.5% 40.6% 1.9% 17.0% -0.1% 9.0% IL 20 6 5 17 614 264 NA 0.171 5,536,424
Indiana R Yes 1,242,413 1,729,516 61,183 41.0% 57.0% 2.0% -16.1% 3.1% 10.9% IN 11 6 4 18 669 264 NA -0.192 2,734,958
Kansas R Yes 570,323 771,406 30,574 41.6% 56.2% 2.2% -14.7% 5.9% 15.9% KS 6 4 2 20 457 307 NA -0.206 1,184,402
Kentucky R Yes 772,474 1,326,646 37,648 36.2% 62.1% 1.8% -25.9% 3.9% 11.1% KY 8 6 3 21 706 314 NA -0.298 1,924,149
Louisiana R Yes 856,034 1,255,776 36,252 39.9% 58.5% 1.7% -18.6% 1.0% 5.9% LA 8 4 1 22 560 456 NA -0.196 2,029,032
Maine D Yes 435,072 360,737 23,652 53.1% 44.0% 2.9% 9.1% 6.1% 9.6% ME 2 11 7 23 919 92 NA 0.030 747,927
Maine 1st District D Yes 266,376 164,045 12,691 60.1% 37.0% 2.9% 23.1% 8.3% 12.4% ME1 1 12 9 101 919 92 NA 0.148 394,329
Maryland D Yes 1,985,023 976,414 75,593 65.4% 32.2% 2.5% 33.2% 6.8% 9.2% MD 10 10 3 24 825 254 NA 0.264 2,781,446
Massachusetts D Yes 2,382,202 1,167,202 81,998 65.6% 32.1% 2.3% 33.5% 6.3% 9.2% MA 11 10 5 25 895 169 NA 0.272 3,325,046
Mississippi R Yes 539,508 756,789 17,597 41.1% 57.6% 1.3% -16.5% 1.3% 8.6% MS 6 5 1 28 617 427 NA -0.178 1,209,357
Missouri R Yes 1,253,014 1,718,736 54,212 41.4% 56.8% 1.8% -15.4% 3.2% 7.7% MO 10 5 3 29 557 305 NA -0.186 2,808,605
Montana R Yes 244,786 343,602 15,286 40.5% 56.9% 2.5% -16.4% 4.1% 21.4% MT 3 3 5 30 289 95 NA -0.204 497,147
Nebraska R Yes 374,583 556,846 20,283 39.4% 58.5% 2.1% -19.2% 5.9% 12.7% NE 2 4 3 31 438 238 NA -0.250 844,289
Nebraska 1st District R Yes 132,261 180,290 7,495 41.3% 56.3% 2.3% -15.0% 5.7% 13.4% NE1 1 12 7 103 438 238 NA -0.207 282,338
Nebraska 3rd District R Yes 65,854 222,179 5,879 22.4% 75.6% 2.0% -53.2% 1.0% 8.8% NE3 1 12 5 105 438 238 NA -0.542 270,109
New Jersey D Yes 2,608,335 1,883,274 57,744 57.3% 41.4% 1.3% 15.9% 1.8% 17.4% NJ 14 9 4 34 865 231 NA 0.141 3,874,046
New Mexico D Yes 501,614 401,894 20,457 54.3% 43.5% 2.2% 10.8% 2.6% 15.7% NM 5 3 2 35 309 385 NA 0.082 798,318
New York D Yes 5,244,886 3,251,997 119,978 60.9% 37.7% 1.4% 23.1% 0.6% 11.6% NY 29 9 5 36 838 168 NA 0.225 7,721,453
North Dakota R Yes 114,902 235,595 11,322 31.8% 65.1% 3.1% -33.4% 2.4% 5.1% ND 3 4 5 38 426 97 NA -0.357 344,360
Oklahoma R Yes 503,890 1,020,280 36,529 32.3% 65.4% 2.3% -33.1% 3.3% 7.4% OK 7 3 1 40 477 377 NA -0.364 1,452,992
Oregon D Yes 1,340,383 958,448 75,490 56.5% 40.4% 3.2% 16.1% 5.1% 18.6% OR 7 1 4 41 97 134 NA 0.110 2,001,336
Rhode Island D Yes 307,486 199,922 10,349 59.4% 38.6% 2.0% 20.8% 5.3% 11.6% RI 4 11 4 44 904 183 NA 0.155 464,144
South Carolina R Yes 1,091,541 1,385,103 36,685 43.4% 55.1% 1.5% -11.7% 2.6% 19.5% SC 9 8 2 45 783 394 NA -0.143 2,103,027
South Dakota R Yes 150,471 261,043 11,095 35.6% 61.8% 2.6% -26.2% 3.6% 14.2% SD 3 4 4 46 430 168 NA -0.298 370,093
Tennessee R Yes 1,143,711 1,852,475 57,665 37.5% 60.7% 1.9% -23.2% 2.8% 21.8% TN 11 6 2 47 670 360 NA -0.260 2,508,027
Utah R Yes 560,282 865,140 62,867 37.6% 58.1% 4.2% -20.5% -2.4% 31.5% UT 6 2 3 49 225 266 NA -0.181 1,131,430
Vermont D Yes 242,820 112,704 11,904 66.1% 30.7% 3.2% 35.4% 9.0% 16.6% VT 3 10 6 50 873 134 NA 0.264 315,067
Virginia D Yes 2,413,568 1,962,430 84,526 54.1% 44.0% 1.9% 10.1% 4.8% 12.0% VA 13 8 3 51 809 299 NA 0.053 3,982,752
Washington D Yes 2,369,612 1,584,651 133,368 58.0% 38.8% 3.3% 19.2% 3.5% 21.5% WA 12 1 5 53 128 58 NA 0.157 3,365,644
West Virginia R Yes 235,984 545,382 13,286 29.7% 68.6% 1.7% -38.9% 3.1% 11.4% WV 5 7 3 54 765 284 NA -0.421 713,051
Wyoming R Yes 73,491 193,559 9,715 26.6% 69.9% 3.5% -43.4% 2.9% 8.2% WY 3 3 4 56 306 187 NA -0.463 255,849
population %>% 
  nice_table(cap='State Population Totals and Components of Change: 2020-2023') %>% 
  scroll_box(width='100%',  height = "400px")
State Population Totals and Components of Change: 2020-2023
table.with.row.headers.in.column.A.and.column.headers.in.rows.3.through.5..(leading.dots.indicate.sub-parts) X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19
Annual and Cumulative Estimates of Resident Population Change for the United States, Regions, States, District of Columbia, and Puerto Rico and Region and State Rankings: April 1, 2020 to July 1, 2023 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
.Geographic Area April 1, 2020 Population Estimates Base Population Estimate (as of July 1) NA NA NA National Ranking of Regions and States NA NA NA NA Change, July 1, 2022 to July 1,2023 NA Change, April 1, 2020 to July 1,2023 NA National Ranking of Regions and States NA NA NA
NA NA NA NA NA NA April 1, 2020 Population Estimates Base Population Estimate (as of July 1) NA NA NA NA NA NA NA Change, July 1, 2022 to July 1,2023 NA Change, April 1, 2020 to July 1,2023 NA
NA NA 2020 2021 2022 2023 NA 2020 2021 2022 2023 Number Percent Number Percent Number Percent Number Percent
United States 331464948 331526933 332048977 333271411 334914895
1643484 0.49313680854551312 3449947 1.0408180475239872
Northeast 57614141 57430477 57243423 57026847 56983517 4 4 4 4 4 -43330 -7.5981756452360064E-2 -630624 -1.0945646139200444 4 4 4 4
Midwest 68987296 68969794 68850246 68783028 68909283 3 3 3 3 3 126255 0.18355545498811132 -78013 -0.11308313924929018 3 2 3 3
South 126268529 126465281 127353282 128702030 130125290 1 1 1 1 1 1423260 1.1058566830686354 3856761 3.0544119192201884 1 1 1 1
West 78594982 78661381 78602026 78759506 78896805 2 2 2 2 2 137299 0.17432689331494791 301823 0.38402324463920612 2 3 2 2
.Alabama 5024294 5031864 5050380 5073903 5108468 24 24 24 24 24 34565 0.68123099712391033 84174 1.6753398586945749 11 16 15 19
.Alaska 733374 732964 734923 733276 733406 48 48 48 48 48 130
32
43 43 36 36
.Arizona 7157902 7186683 7272487 7365684 7431344 14 14 14 14 14 65660 0.8914311284600317 273442 3.8201417119150274 7 12 5 8
.Arkansas 3011490 3014348 3028443 3046404 3067732 33 33 33 33 33 21328 0.70010412276244383 56242 1.8675805000182635 19 15 17 16
.California 39538212 39503200 39145060 39040616 38965193 1 1 1 1 1 -75423 -0.19319111153369095 -573019 -1.4492789911693529 50 46 50 47
.Colorado 5773707 5785219 5811596 5841039 5877610 21 21 21 21 21 36571 0.62610436259713387 103903 1.7995890681671238 9 18 11 17
.Connecticut 3605912 3577586 3603691 3608706 3617176 29 29 29 29 29 8470 0.23471017034915009 11264 0.31237589824710088 30 33 29 31
.Delaware 989946 991862 1004881 1019459 1031890 45 45 45 45 45 12431 1.219372235666172 41944 4.2369987857923563 26 6 19 7
.District of Columbia 689548 670839 669037 670949 678972 49 49 49 49 49 8023 1.1957689779700096 -10576 -1.5337583460469757 31 7 40 48
.Florida 21538216 21591299 21830708 22245521 22610726 3 3 3 3 3 365205 1.6417012665156279 1072510 4.9795674813550015 2 2 2 3
.Georgia 10713771 10732390 10790385 10913150 11029227 8 8 8 8 8 116077 1.0636434026839181 315456 2.9443974488534428 4 10 4 12
.Hawaii 1455274 1451181 1446745 1439399 1435138 40 40 40 40 40 -4261 -0.29602632765480597 -20136 -1.3836569608197493 45 49 42 46
.Idaho 1839117 1849339 1904537 1938996 1964726 38 38 38 38 38 25730 1.3269754037656603 125609 6.8298536743448075 17 4 9 1
.Illinois 12813469 12790357 12690341 12582515 12549689 6 6 6 6 6 -32826 -0.26088584039041479 -263780 -2.0586150401581333 49 48 49 50
.Indiana 6785442 6789098 6813798 6832274 6862199 17 17 17 17 17 29925 0.4379947291341067 76757 1.1312011804094708 14 22 16 22
.Iowa 3190427 3190904 3197944 3199693 3207004 31 31 31 31 31 7311 0.22849067082373214 16577 0.51958562286490173 32 34 27 29
.Kansas 2937835 2938124 2937946 2936716 2940546 35 35 35 35 34 3830 0.13041778639813995 2711 9.2278837987837981E-2 36 38 34 33
.Kentucky 4506297 4508155 4507600 4511563 4526154 26 26 26 26 26 14591 0.32341341570537752 19857 0.44065005036285892 25 29 25 30
.Louisiana 4657785 4652022 4627047 4588023 4573749 25 25 25 25 25 -14274 -0.311114395023739 -84036 -1.8042052177161463 48 50 48 49
.Maine 1363177 1364517 1378787 1389338 1395722 42 42 42 42 42 6384 0.45949941626875535 32545 2.3874375814732787 33 21 22 14
.Maryland 6177253 6173689 6175045 6163981 6180253 18 18 18 19 19 16272 0.26398523940940116 3000
24 32 33 34
.Massachusetts 7032933 6997713 6991951 6982740 7001399 15 15 15 16 16 18659 0.2672160212180319 -31534 -0.44837623222061124 22 31 45 43
.Michigan 10077674 10070627 10038117 10033281 10037261 10 10 10 10 10 3980
-40413 -0.40101515488593892 35 41 46 42
.Minnesota 5706804 5710578 5717968 5714300 5737915 22 22 22 22 22 23615 0.41326146684633286 31111 0.54515627310838077 18 25 23 28
.Mississippi 2961306 2958409 2949582 2938928 2939690 34 34 34 34 35 762
-21616 -0.72994820528510052 41 42 43 44
.Missouri 6154889 6154426 6170393 6177168 6196156 19 19 19 18 18 18988 0.30739005317647183 41267 0.67047512960834876 21 30 20 26
.Montana 1084244 1087211 1106366 1122878 1132812 44 44 43 43 43 9934 0.88469094594426112 48568 4.4794345184294304 28 13 18 5
.Nebraska 1961965 1963273 1964253 1968060 1978379 37 37 37 37 37 10319 0.5243234454234118 16414 0.83661023514690624 27 20 28 24
.Nevada 3104617 3115840 3146632 3177421 3194176 32 32 32 32 32 16755 0.52731444778642811 89559 2.8847036526566723 23 19 13 13
.New Hampshire 1377524 1378702 1387494 1399003 1402054 41 41 41 41 41 3051 0.21808387830476419 24530 1.78073122500951 37 36 24 18
.New Jersey 9289039 9272392 9269175 9260817 9290841 11 11 11 11 11 30024 0.32420465710530721 1802
13 28 35 35
.New Mexico 2117525 2118488 2116950 2113476 2114371 36 36 36 36 36 895
-3154 -0.14894747405579628 40 40 38 40
.New York 20202320 20104710 19854526 19673200 19571216 4 4 4 4 4 -101984 -0.51839050078279081 -631104 -3.1239184410503351 51 51 51 51
.North Carolina 10439459 10453812 10567100 10695965 10835491 9 9 9 9 9 139526 1.3044732289232435 396032 3.7936065460863442 3 5 3 9
.North Dakota 779079 779563 777982 778912 783926 47 47 47 47 47 5014 0.64371841748490199 4847 0.62214486592502172 34 17 31 27
.Ohio 11799331 11798292 11765227 11759697 11785935 7 7 7 7 7 26238 0.22311799360136575 -13396 -0.11353186040801806 16 35 41 38
.Oklahoma 3959411 3965234 3991634 4019271 4053824 28 28 28 28 28 34553 0.85968326096946446 94413 2.3845213341075224 12 14 12 15
.Oregon 4237279 4245044 4256465 4239379 4233358 27 27 27 27 27 -6021 -0.14202551836011831 -3921 -9.2535799507183741E-2 46 45 39 37
.Pennsylvania 13002788 12995477 13013614 12972091 12961683 5 5 5 5 5 -10408 -8.0233788060845398E-2 -41105 -0.31612451114330253 47 44 47 41
.Rhode Island 1097371 1096444 1097092 1093842 1095962 43 43 44 44 44 2120 0.19381226904799778 -1409 -0.12839777978459427 39 37 37 39
.South Carolina 5118422 5132151 5193848 5282955 5373555 23 23 23 23 23 90600 1.7149493039406924 255133 4.9846026763717415 5 1 6 2
.South Dakota 886668 887852 896299 909869 919318 46 46 46 46 46 9449 1.0385011468683953 32650 3.6823252897364065 29 11 21 10
.Tennessee 6910786 6926091 6963709 7048976 7126489 16 16 16 15 15 77513 1.0996348973240935 215703 3.121251330890582 6 8 7 11
.Texas 29145459 29234361 29561286 30029848 30503301 2 2 2 2 2 473453 1.5766080467673365 1357842 4.6588458256910625 1 3 1 4
.Utah 3271614 3283982 3339284 3381236 3417734 30 30 30 30 30 36498 1.0794277595530155 146120 4.4662970631620968 10 9 8 6
.Vermont 643077 642936 647093 647110 647464 50 50 50 50 50 354 5.470476425955402E-2 4387 0.68218891361376632 42 39 32 25
.Virginia 8631373 8637193 8657348 8679099 8715698 12 12 12 12 12 36599 0.4216912377655791 84325 0.97695928562002832 8 23 14 23
.Washington 7705267 7724566 7741433 7784477 7812880 13 13 13 13 13 28403 0.36486715806341258 107613 1.3966161068785807 15 26 10 20
.West Virginia 1793713 1791562 1785249 1774035 1770071 39 39 39 39 39 -3964 -0.22344542244093268 -23642 -1.3180480935355878 44 47 44 45
.Wisconsin 5893713 5896700 5879978 5890543 5910955 20 20 20 20 20 20412 0.34652153460215807 17242 0.29254902639473618 20 27 26 32
.Wyoming 576850 577664 579548 581629 584057 51 51 51 51 51 2428 0.41744823590295527 7207 1.2493715870676954 38 24 30 21
Puerto Rico 3285874 3281557 3262693 3220113 3205691
-14422 -0.44787248149366188 -80183 -2.4402335573427343
Dash (-) represents zero or rounds to zero. NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
  1. Not applicable.
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Vintage 2023 data products are associated with Data Management System projects P-6000042, P-7501659, and P-7527355. The U.S. Census Bureau reviewed these data products for unauthorized disclosure of confidential information and approved the disclosure avoidance practices applied to this release (CBDRB-FY24-0085). NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Note: The estimates are developed from a base that integrates the 2020 Census, Vintage 2020 estimates, and (for the U.S. only) 2020 Demographic Analysis estimates. For population estimates methodology statements, see https://www.census.gov/programs-surveys/popest/technical-documentation/methodology.html. See Geographic Terms and Definitions at https://www.census.gov/programs-surveys/popest/guidance-geographies/terms-and-definitions.html for a list of the states that are included in each region. All geographic boundaries for the 2023 population estimates series are as of January 1, 2023. NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Suggested Citation: NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Annual and Cumulative Estimates of Resident Population Change for the United States, Regions, States, District of Columbia, and Puerto Rico and Region and State Rankings: April 1, 2020 to July 1, 2023 (NST-EST2023-CHG) NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Source: U.S. Census Bureau, Population Division NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Release Date: December 2023 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
funding %>% 
  nice_table(cap='Allocation of the Infrastructure Investment and Jobs Act, 2023') %>% 
  scroll_box(width='100%',  height = "400px")
Allocation of the Infrastructure Investment and Jobs Act, 2023
State,.Teritory.or.Tribal.Nation Total.(Billions)
ALABAMA 3.000
ALASKA 3.700
AMERICAN SAMOA 0.069
ARIZONA 3.500
ARKANSAS 2.800
CALIFORNIA 18.400
COLORADO 3.200
CONNECTICUT 2.500
DELEWARE 0.792
DISTRICT OF COLUMBIA 1.100
FLORIDA 8.200
GEORGIA 5.000
GUAM 0.143
HAWAII 1.000
IDAHO 1.200
ILLINOIS 8.400
INDIANA 3.400
IOWA 2.400
KANSAS 1.500
KENTUCKY 3.900
LOUISIANA 4.300
MAINE 1.100
MARYLAND 2.700
MASSACHUSETTS 3.600
MICHIGAN 5.200
MINNESOTA 2.700
MISSISSIPPI 2.300
MISSOURI 3.800
MONTANA 3.300
NEBRASKA 1.300
NEVADA 1.700
NEW HAMPSHIRE 0.752
NEW JERSEY 5.100
NEW MEXICO 2.600
NEW YORK 10.100
NORTH CAROLINA 4.500
NORTH DAKOTA 1.800
NORTHERN MARIANA ISLANDS 0.083
OHIO 6.600
OKLAHOMA 2.900
OREGON 2.300
PENNSYLVANIA 8.100
PUERTO RICO 0.994
RHODE ISLAND 1.100
SOUTH CAROLINA 2.300
SOUTH DAKOTA 1.300
TENNESSEE 3.700
TEXAS 14.200
TRIBAL COMMUNITIES 3.000
US VIRGIN ISLANDS 0.148
UTAH 1.800
VERMONT 0.852
VIRGINIA 4.500
WASHINGTON 4.000
WEST VIRGINIA 2.000
WISCONSIN 2.800
WYOMING 2.300

1.2 Column change

The funding dataset’s column names have been changed to more descriptive and consistent titles, making the data easier to access and understand.To ensure consistency and accuracy in state naming, “DELEWARE” was corrected to “Delaware”, as were “District Of Columbia” to “District of Columbia” and “Us Virgin Islands” to “US Virgin Islands”. State names were converted from uppercase to title case (first letter uppercase, rest lowercase) to ensure consistency across datasets.

#change column names
funding <- funding %>% 
  rename("state" = "State,.Teritory.or.Tribal.Nation", "total_billions" = "Total.(Billions)")

#change DELEWARE to Delaware
funding[funding=="DELEWARE"] <- "Delaware"

#change state names from uppercase to lower
funding$state <- str_to_title(funding$state)

#change DELEWARE to Delaware
funding[funding=="District Of Columbia"] <- "District of Columbia"
funding[funding=="Us Virgin Islands"] <- "US Virgin Islands"

For the population dataset, only relevant columns, specifically those containing state names and population figures, were kept and renamed for clarity. The index column was reset to ensure that rows were numbered correctly and sequentially after selecting a specific range of rows.Dots in state names were removed to ensure consistency with the naming convention used in other datasets. A new dataframe was created to include U.S. territories and tribal communities that were not previously included in the population dataset, ensuring complete coverage of all regions impacted by the funding. The population figures were converted to numeric data types, allowing for mathematical operations and analyses.

#choose columns with states only
population <- population[10:61,1:2] %>% 
  rename("state" = 1, "total_population" = 2)

#reset index column
rownames(population) <- NULL 

#remove dots from the state names
population$state <- gsub("\\.", "", population$state) 

#new df with us territories and population
df <- data.frame("state" = c("Guam", "US Virgin Islands", "American Samoa", "Northern Mariana Islands", "Tribal Communities"),
                  "total_population" = c(153836, 87146, 49710, 47329, 9666058))

#add us territories to the data
population <- rbind(population,df)

#transform total_population column to numeric
population$total_population <- as.numeric(population$total_population)

For the elections dataset, to ensure consistency, it was filtered to remove unnecessary rows and columns, focusing only on relevant states and removing districts within states except for the District of Columbia. Results for Democratic or Republican wins were simplified to “Biden” or “Trump” to facilitate analysis. Vote counts were originally in string format with commas but were converted to numeric values to allow for quantitative comparisons.

#choose columns with states only
elections <- elections[5:61,] %>%
  filter(!row_number() %in% c(16))

#remove districts for Nebraska, Maine but keep District of Columbia
columbia <- elections %>% 
  filter(str_detect(state, "District of Columbia"))

elections <- elections[!grepl("District", elections$state), ]

elections <- rbind(elections, columbia)

#choose columns with results for democrats or republicans only
elections <- elections[,c(1, 2, 4, 5)]

#substitute D, R for results with democrats win (1) or not (0)
elections <- elections %>% mutate(win = factor(ifelse(called == "D","Biden","Trump"))) %>%
  select(-called)

#transform to numeric values columns with votes
elections$dem_votes <- as.numeric(gsub(",", "", elections$dem_votes))
elections$rep_votes <- as.numeric(gsub(",", "", elections$rep_votes))

1.3 Merge data. New features

We also calculated the funding per capita for each state and territory by dividing the total funding by the population. This helped us assess the equity of funding allocation relative to population size. Furthermore, to improve clarity in subsequent visualizations, state names were substituted with their respective two-letter abbreviations. The datasets had been successfully merged, and the funding per capita for each state and territory had been calculated. The resulting dataset contained information such as state names, total funding in billions of dollars, population figures, an indicator of whether Biden or Trump won each state in the 2020 election, and calculated funding per capita. This extensive dataset served as the foundation for subsequent analyses, visualizations, and interpretations aimed at gaining insight into the Infrastructure Investment and Jobs Act’s funding allocation.

#merge funding and population data
first_df <- merge(x = funding, y = population, by = "state", all = TRUE)

#add new feature
first_df <- first_df %>%
  mutate(funding_per_capita = total_billions*1000000000 / total_population)

#state names to abbriviations
first_df$abbriv <- state.abb[match(first_df$state,state.name)]
first_df <- first_df %>%
     mutate(abbriv=replace(abbriv, state=="American Samoa", "AS")) %>%
     mutate(abbriv=replace(abbriv, state=="District of Columbia", "DC")) %>%
     mutate(abbriv=replace(abbriv, state=="Guam", "GU")) %>% 
    mutate(abbriv=replace(abbriv, state=="Northern Mariana Islands", "MP")) %>% 
     mutate(abbriv=replace(abbriv, state=="Puerto Rico", "PR")) %>%
     mutate(abbriv=replace(abbriv, state=="Tribal Communities", "TT")) %>%     
     mutate(abbriv=replace(abbriv, state=="US Virgin Islands", "VI")) 

#merge funding, population and election data
second_df <- merge(x = first_df, y = elections, by = "state",  all.y = TRUE)

second_df %>% 
  nice_table(cap='Merged dataset') %>% 
  scroll_box(width='100%', height = "400px")
Merged dataset
state total_billions total_population funding_per_capita abbriv dem_votes rep_votes win
Alabama 3.000 5024294 597.099 AL 849624 1441170 Trump
Alaska 3.700 733374 5045.175 AK 153778 189951 Trump
Arizona 3.500 7157902 488.970 AZ 1672143 1661686 Biden
Arkansas 2.800 3011490 929.772 AR 423932 760647 Trump
California 18.400 39538212 465.373 CA 11110250 6006429 Biden
Colorado 3.200 5773707 554.237 CO 1804352 1364607 Biden
Connecticut 2.500 3605912 693.306 CT 1080831 714717 Biden
Delaware 0.792 989946 800.044 DE 296268 200603 Biden
District of Columbia 1.100 689548 1595.248 DC 317323 18586 Biden
Florida 8.200 21538216 380.719 FL 5297045 5668731 Trump
Georgia 5.000 10713771 466.689 GA 2473633 2461854 Biden
Hawaii 1.000 1455274 687.156 HI 366130 196864 Biden
Idaho 1.200 1839117 652.487 ID 287021 554119 Trump
Illinois 8.400 12813469 655.560 IL 3471915 2446891 Biden
Indiana 3.400 6785442 501.073 IN 1242413 1729516 Trump
Iowa 2.400 3190427 752.250 IA 759061 897672 Trump
Kansas 1.500 2937835 510.580 KS 570323 771406 Trump
Kentucky 3.900 4506297 865.456 KY 772474 1326646 Trump
Louisiana 4.300 4657785 923.186 LA 856034 1255776 Trump
Maine 1.100 1363177 806.938 ME 435072 360737 Biden
Maryland 2.700 6177253 437.087 MD 1985023 976414 Biden
Massachusetts 3.600 7032933 511.877 MA 2382202 1167202 Biden
Michigan 5.200 10077674 515.992 MI 2804040 2649852 Biden
Minnesota 2.700 5706804 473.119 MN 1717077 1484065 Biden
Mississippi 2.300 2961306 776.684 MS 539508 756789 Trump
Missouri 3.800 6154889 617.395 MO 1253014 1718736 Trump
Montana 3.300 1084244 3043.595 MT 244786 343602 Trump
Nebraska 1.300 1961965 662.601 NE 374583 556846 Trump
Nevada 1.700 3104617 547.572 NV 703486 669890 Biden
New Hampshire 0.752 1377524 545.762 NH 424921 365654 Biden
New Jersey 5.100 9289039 549.034 NJ 2608335 1883274 Biden
New Mexico 2.600 2117525 1227.849 NM 501614 401894 Biden
New York 10.100 20202320 499.943 NY 5244886 3251997 Biden
North Carolina 4.500 10439459 431.057 NC 2684292 2758775 Trump
North Dakota 1.800 779079 2310.420 ND 114902 235595 Trump
Ohio 6.600 11799331 559.354 OH 2679165 3154834 Trump
Oklahoma 2.900 3959411 732.432 OK 503890 1020280 Trump
Oregon 2.300 4237279 542.801 OR 1340383 958448 Biden
Pennsylvania 8.100 13002788 622.943 PA 3458229 3377674 Biden
Rhode Island 1.100 1097371 1002.396 RI 307486 199922 Biden
South Carolina 2.300 5118422 449.357 SC 1091541 1385103 Trump
South Dakota 1.300 886668 1466.163 SD 150471 261043 Trump
Tennessee 3.700 6910786 535.395 TN 1143711 1852475 Trump
Texas 14.200 29145459 487.211 TX 5259126 5890347 Trump
Utah 1.800 3271614 550.187 UT 560282 865140 Trump
Vermont 0.852 643077 1325.036 VT 242820 112704 Biden
Virginia 4.500 8631373 521.354 VA 2413568 1962430 Biden
Washington 4.000 7705267 519.125 WA 2369612 1584651 Biden
West Virginia 2.000 1793713 1115.006 WV 235984 545382 Trump
Wisconsin 2.800 5893713 475.083 WI 1630866 1610184 Biden
Wyoming 2.300 576850 3987.172 WY 73491 193559 Trump

2. Data analysis

2.1 Question 1: Equitable allocation based on population

The detailed examination of each state’s funding per capita in relation to population size revealed the uneven distribution of funds, challenging the notion that allocation was solely determined by population.

We saw that funding per capita varied by state, with Alaska, Wyoming, and Montana receiving the most though these states didn’t have the smallest or even large population size. Highly populous states, such as California and Texas, which stand out due to their large populations, did not rank first for funding per capita. This could imply that, while these states received a significant total allocation due to their size, the per capita distribution was lower than in less populous states, indicating an attempt to distribute funds more evenly across the country. As a result, allocation was influenced by other factors such as geographical location, infrastructure requirements, or even the cost of living and construction in these areas but not solely by population size.

# To add detailed info for each dot
df <- first_df %>%
  mutate(text = paste("State:", abbriv, "<br>Population:", formatC(total_population, format="d", big.mark=","), "<br>Funding Per Capita, $:", round(funding_per_capita, 2)))

# Create the scatter plot
fig <- plot_ly(data = df, x = ~total_population, y = ~funding_per_capita, text = ~text, type = 'scatter', mode = 'markers', marker = list(size = 10, opacity = 0.6), hoverinfo = 'text')  %>% 
  layout(
    title = list(text = 'Population vs. Funding Per Capita by State',
    font = list(
      size = 18,
      color = "black",
      weight = "bold"  # This will make the title bold
    )
  ),
    annotations = list(list(
      text = 'Each point represents a state with its funding per capita to check if the allocation equitable based on the population',
      x = 0.5,
      y = 1.01, # Position just below the title
      xref = 'paper',
      yref = 'paper',
      showarrow = FALSE,
      font = list(size = 12)
    )
  ),
     xaxis = list(title = 'Population', size=8),
     yaxis = list(title = 'Funding Per Capita, $', size=8),
     hovermode = 'closest') %>% layout(width = 800)

# Show the plot
fig

2.2 Question 2: Political Bias in Funding

This scatter plot depicted the relationship between total funding allocation and population size while also reflecting the political landscape as determined by the 2020 Presidential Election results. Points were colored to indicate whether a state was won by Biden (blue) or Trump (red). Notably, the plot did not reveal a clear pattern in which one political party’s winning states consistently receive more or less funding, implying that political victory did not always translate into financial favoritism. The states with the largest populations did not always receive the most funding, and vice versa, which may indicate that factors other than population and political leanings influenced funding distribution.

# Separate the data by Biden and Trump
trump_df <- subset(second_df, win == 'Trump')
biden_df <- subset(second_df, win == 'Biden')

# Plot for Biden
fig <- plot_ly(data = biden_df, x = ~total_population, y = ~total_billions,
               type = 'scatter', mode = 'markers',
               marker = list(size = 10, opacity = 0.6, color = 'blue'),
               text = ~paste("State: ", abbriv, "<br>Population size:", formatC(total_population, format="d", big.mark=","), "<br>Funding (billions): ", total_billions),
               hoverinfo = 'text', name = 'Biden')

# Add Trump data
fig <- fig %>% add_trace(data = trump_df, x = ~total_population, y = ~total_billions,
                         type = 'scatter', mode = 'markers',
                         marker = list(size = 12, opacity = 0.6, color = 'red', symbol = 'triangle-up'),
                         text = ~paste("State: ", state, "<br>Population: ", total_population, "<br>Funding (billions): ", total_billions),
                         hoverinfo = 'text', name = 'Trump')

# Add axis and plot names
fig <- fig %>% layout(
   title = list(text = 'Total Funding vs. Population Size by 2020 Presidential Election Results',
    font = list(
      size = 18,
      color = "black",
      weight = "bold"  # This will make the title bold
    )
  ),
  annotations = list(
    list(text = 'State funding compared to population, colored by 2020 election results to check the political bias in funding',
      x = 0.5,
      y = 1.01, # Position just below the title
      xref = 'paper',
      yref = 'paper',
      showarrow = FALSE,
      font = list(size = 12)
    )
  ),
  xaxis = list(title = 'Population Size', size=8),
  yaxis = list(title = 'Total Funding (Billions)', size=8),
  showlegend = TRUE,
  legend = list(
    title = list(text = "Win"))) %>% layout(width = 800)

# Show the plot
fig
# ggplot(second_df, aes(x=total_population, y=total_billions, color=factor(win))) +
#   geom_point(aes(shape = factor(win)), size=4, alpha=0.7) +
#   scale_color_manual(values=c('Trump'='red', 'Biden'='blue')) +
#   scale_x_log10() +
#   scale_y_log10() +
#   labs(title='Total funding vs. Population Size by 2020 Presidential Election Results',
#        x='Population Size',
#        y='Total funding, billions',
#        color='Win',
#        shape='Win') +
#   theme_minimal() +
#   theme(legend.position="right") +
#   geom_hline(yintercept=0, linetype="dashed") +
#   geom_vline(xintercept=0, linetype="dashed")

# From the original submission

# sample_states <- first_df %>%
#   arrange(desc(funding_per_capita)) %>% 
#   #slice_head(n = 20) %>% 
#   slice_head(n = 57) %>% 
#   pull(abbriv)
# 
# sample_data <- first_df %>% 
#   filter(abbriv %in% sample_states)
# 
#   
# a <- ggplot(sample_data) + geom_col(aes(x=funding_per_capita, y=reorder(abbriv, funding_per_capita), fill=funding_per_capita)) +
#   scale_fill_gradient(low="blue", high="red")+
#   labs(title="Funding Per Capita by State/Territory", 
#        x="Funding Per Capita, $", 
#        y="State",
#        fill="Funding Per Capita") + 
#   theme_minimal() + 
#   theme(axis.text.y  = element_text(size=6), legend.position="bottom", legend.text = element_text(size=6)) +
#   geom_hline(yintercept=0, linetype="dashed") +
#   geom_vline(xintercept=0, linetype="dashed")

# sample_population <- first_df %>%
#   arrange(desc(total_population)) %>% 
#  # slice_head(n = 20) %>% 
#   slice_head(n = 57) %>% 
#   pull(abbriv)
# 
# sample_data <- first_df %>% 
#   filter(abbriv %in% sample_population)
# 
# b <- ggplot(sample_data) +
#   geom_col(aes(x=total_population, y=reorder(abbriv, total_population), fill=total_population)) +
#   scale_fill_gradient(low="blue", high="red")+
#   labs(title="Population by State/Territory", 
#        x="Population", 
#        y="State",
#        fill="Population") + 
#   theme_minimal() +
#   theme(axis.text.y  = element_text(size=6), legend.position="bottom", legend.text = element_text(size=6)) +
#   geom_hline(yintercept=0, linetype="dashed") +
#   geom_vline(xintercept=0, linetype="dashed")
# 
# ggarrange(a, b, 
#           ncol = 2)

3. Conclusions

While some states with larger populations received more funding, there were notable exceptions in which smaller states received disproportionately high or low funding in relation to their population size. This pattern suggested that factors other than population size had a significant influence on funding allocations. Such disparities highlight the need for a closer look at the criteria used to make funding decisions, which should ideally be guided by a combination of need, capacity, and equitable growth goals.

The second visualization, which compared per capita funding to population sizes while highlighting the winning party in the 2020 elections, sheded light on whether political considerations influenced funding distributions. According to the analysis, there was no clear national pattern of preferential funding based on party affiliation. However, individual cases in which states deviated from the general funding pattern necessitate further investigation to rule out or confirm political favoritism.

Based on these findings, policymakers and stakeholders involved in the distribution of IIJA funds should consider using more transparent and formula-based allocation methods that can be publicly verified. Furthermore, more research should be conducted into the specific outliers identified in this analysis to ensure that all states and territories receive fair funding that reflects their infrastructure needs and promotes equitable development. Continuing this research could include more detailed statistical analysis to quantify the impact of political bias, as well as a closer look at the socioeconomic factors that influence funding decisions. Such studies would provide a more complete picture of how effectively federal funds are used to promote balanced economic growth across the country.

References

  1. Knaflic, C. N. (2015). Storytelling with Data. https://doi.org/10.1002/9781119055259

  2. US Census Bureau. (2023, December 18). State Population Totals and Components of Change: 2020-2023. Census.gov. https://www.census.gov/data/datasets/time-series/demo/popest/2020s-state-total.html

  3. 2020 | The American Presidency Project. (n.d.). https://www.presidency.ucsb.edu/statistics/elections/2020

  4. Cook Political Report. (n.d.). 2020 Popular Vote Tracker | Cook Political Report. https://www.cookpolitical.com/2020-national-popular-vote-tracker

  5. Wikipedia contributors. (2024, February 1). List of U.S. states and territories by population. Wikipedia. https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population

  6. Wikipedia contributors. (2024a, January 27). Native Americans in the United States. Wikipedia. https://en.wikipedia.org/wiki/Native_Americans_in_the_United_States

  7. Ex-Pr. (n.d.). GitHub - ex-pr/DATA_608. GitHub. https://github.com/ex-pr/DATA_608/tree/main

  8. The Data Visualisation catalogue. (n.d.). https://datavizcatalogue.com/index.html