The following outlines the exodus of wealth from Onondaga County during the 2018 Hancock Airport renovations, when nearly 75 cents on the dollar left Onondaga County along with itinerant workers residing in 24 other counties.
These findings are completely reproducible from the raw data, provided by Syracuse Regional Airport Authority, and may be found in my GitHub repository: Wealth.
While machine-readable code to reproduce the analysis is provided, human-readable text walks the reader through each step in preprocessing, analysis, visualization, and the staggering conclusions.
See Part 2: Executive Summary for chief conclusions.
Onondaga County earns the greatest total gross at $288K (26.2%) and had the most workers at 67 (35.4%) of any county, but 122 itinerant workers from 24 other counties, including 2 counties in Pennsylvania, left with the remaining 73.8% of gross income, or $812K.
Oswego County workers were a close second with total county earnings of $262K (23.8%) but had 23.8% less workers at 51 total workers (27%). Oswego workers earned an average gross of $5.1K, 19.2% higher than the average Onondaga worker, earning $4.3K.
Tioga County, Pennsylvania earned the third highest gross at $70K (6.4%), which is 75.4% less than Onondaga County. However, it had 92.5% fewer workers (5, or 2.6% of the total workforce). 4 of these 5 workers were Journeymen Carpenters working for Quality Structures, Inc. and individually earned total gross incomes of $18.7K, $15.7K, $15.5K, and $14.3K, while 1 Apprentice Carpenter earned $6.4K. On average, each Tioga County worker earned a whopping 228.6% more than the average Onondaga County worker, at $14.1K and $4.3K, respectively.
Lastly, one Foreman from Summit, NY (pop. 1,123) single-handedly carried the entirety of Schoharie County to 10th highest grossing counties, among 25 counties total, earning $27.8K alone.
The 2018 Hancock Airport renovations illustrate a staggering exodus of wealth by itinerant workers in public construction projects.
The following R packages and datasets are required for this analysis and loaded with functions library() and data().
library(sf)
library(tmap)
library(readr)
library(dplyr)
library(tigris)
library(scales)
library(stringr)
library(ggplot2)
library(leaflet)
library(noncensus)
library(kableExtra)
data(zip_codes)
data(counties)
Below, we import the CSV (Comma-Separated Value) file containing all scraped records for both Lakeview Amphitheater constructions and Hancock Airport renovations: hancock_lakeview_tidy.csv. This file is available in the Wealth Repository, which also includes the Raw Data Folder containing the original records.
url <- paste0("https://raw.githubusercontent.com/jamisoncrawford/",
"wealth/master/Tidy%20Data/hancock_lakeview_tidy.csv")
lvhc <- read_csv(file = url,
col_types = "ccDcccddddccliii")
rm(url)
By using R package noncensus, we can convert worker ZIP (Zone Improvement Plan) codes to counties based on their individual payment records. The preprocessing occurs as follows:
fips in dataset counties to determine FIPS county codelvhc for ZIP codes, gross income, and project namezip_codes dataset with Lakeview & Hancock data lvhc by FIPS county codescounties dataset with Lakeview & Hancock data lvhc by county nameshc, and Lakeview records, lvcounties <- counties %>%
mutate(fips = paste0(state_fips,
county_fips), # Paste FIPS IDs (State + County)
fips = as.character(fips)) %>%
select(county_name, fips)
lvhc <- lvhc %>%
filter(!is.na(zip),
!is.na(gross),
!is.na(project)) %>% # Remove NA values
left_join(zip_codes,
by = "zip") %>%
mutate(fips = as.character(fips)) %>% # Join FIPS IDs
left_join(counties,
by = "fips") %>% # Join counties by FIPS
mutate(county = str_replace_all(string = county_name,
pattern = " County$",
replacement = "")) %>% # Clean county names
select(project:hours, gross, sex:race, city:state,
county, fips, longitude:latitude) %>% # Select variables
filter(!is.na(state)) # Remove 8 unidentified records
rm(zip_codes, counties)
lv <- lvhc %>%
filter(project == "Lakeview")
hc <- lvhc %>%
filter(project == "Hancock")
Here, we perform a series of grouping and summarizing operations. In effect, we produce the following summary data:
hc_grosshc_indshc_allhc_gross <- hc %>%
group_by(county, state) %>%
summarize(gross = sum(gross)) %>% # Total gross by county
ungroup() %>%
arrange(desc(gross)) %>%
mutate(perc_gross = gross / sum(gross)) # Percentage of total gross
hc_inds <- hc %>%
group_by(name, zip, ssn, class, sex, race) %>%
summarize(gross = sum(gross),
county = unique(county),
state = unique(state)) %>% # Total gross by individual
ungroup() %>%
arrange(county, desc(gross))
hc_wrks <- hc_inds %>%
group_by(county, state) %>%
summarize(workers = n(),
gross = sum(gross)) %>%
ungroup() %>%
arrange(desc(gross)) # Total workers by county
hc_all <- hc_gross %>%
left_join(hc_wrks,
by = c("county", "state")) %>%
select(-gross.y) %>%
rename(gross = gross.x) %>% # Merge workers and gross
mutate(perc_workers = workers / sum(workers))
rm(hc_gross, hc_wrks)
The below code visualizes and stores the above summary data, hc_all, in graphical object county_gross using the Tidyverse graphics package ggplot2. Labels, style, and other “non-data ink” are modified for clarity to produce Total gross income by county, state.
options(scipen = 999) # Disable scientific notation
county_gross <- ggplot(hc_all,
aes(x = reorder(county, gross), gross,
y = gross,
fill = state)) +
geom_bar(stat = "identity") +
coord_flip() +
labs(title = "Total gross income by county, state",
subtitle = "Hancock Airport Renovations",
caption = "Source: Syracuse Regional Airport Authority",
x = "County",
y = "Total Gross (USD)",
fill = "State") +
scale_y_continuous(labels = comma) +
theme_minimal()
The resulting visual is disheartening. Although 35.4% of all workers reside in Onondaga County, only 26.2% of gross income is earned by Onondaga County residents. Indeed, nearly three quarters of every dollar earned in the 2018 Hancock Airport renovations left Onondaga County along with itinerant workers residing in 24 other counties.
The context is staggering. While the above visualization, Total gross income by county, state, is indeed disheartening, the numbers behind the graphic are overwhelming. The following code modifies table labels and produces an interactive table of key summary data in HTML, using package kableExtra. Notable conclusions follow, which underscore this exodus of wealth by itinerant workers.
hc_all_tbl <- hc_all %>%
mutate(gross = dollar(gross),
perc_gross = percent(perc_gross),
perc_workers = percent(perc_workers)) %>% # Modify labels, e.g. "%", "$"
rename(County = county,
State = state,
"Total Gross" = gross,
"Gross (%)" = perc_gross,
"Total Workers" = workers,
"Workforce (%)" = perc_workers) # Rename variables
county_tbl <- kable(hc_all_tbl) %>%
kable_styling(bootstrap_options = c("striped", # Stripe table rows
"hover", # Interactivity
"responsive"), # Automate fit adjustment
full_width = TRUE)
| County | State | Total Gross | Gross (%) | Total Workers | Workforce (%) |
|---|---|---|---|---|---|
| Onondaga | NY | $288,849 | 26.2% | 67 | 35.4% |
| Oswego | NY | $262,136 | 23.8% | 51 | 27.0% |
| Tioga | PA | $70,843 | 6.4% | 5 | 2.6% |
| Cortland | NY | $63,370 | 5.8% | 4 | 2.1% |
| Cayuga | NY | $54,589 | 5.0% | 7 | 3.7% |
| Oneida | NY | $45,475 | 4.1% | 9 | 4.8% |
| Broome | NY | $41,179 | 3.7% | 2 | 1.1% |
| Madison | NY | $39,965 | 3.6% | 9 | 4.8% |
| Steuben | NY | $27,872 | 2.5% | 3 | 1.6% |
| Schoharie | NY | $27,697 | 2.5% | 1 | 0.5% |
| Chemung | NY | $27,558 | 2.5% | 2 | 1.1% |
| Ontario | NY | $23,060 | 2.1% | 4 | 2.1% |
| Genesee | NY | $21,290 | 1.9% | 2 | 1.1% |
| St. Lawrence | NY | $17,895 | 1.6% | 1 | 0.5% |
| Monroe | NY | $16,846 | 1.5% | 7 | 3.7% |
| Livingston | NY | $16,759 | 1.5% | 1 | 0.5% |
| Lycoming | PA | $15,570 | 1.4% | 1 | 0.5% |
| Wayne | NY | $12,883 | 1.2% | 2 | 1.1% |
| Cattaraugus | NY | $6,286 | 0.6% | 2 | 1.1% |
| Seneca | NY | $6,278 | 0.6% | 2 | 1.1% |
| Schuyler | NY | $4,691 | 0.4% | 1 | 0.5% |
| Jefferson | NY | $4,624 | 0.4% | 1 | 0.5% |
| Erie | NY | $2,586 | 0.2% | 2 | 1.1% |
| Tompkins | NY | $1,625 | 0.1% | 2 | 1.1% |
| Herkimer | NY | $1,160 | 0.1% | 1 | 0.5% |
In light of total workers per county vis-à-vis total gross per county, the notable conclusions follow:
Onondaga County had the greatest total gross ($288K, 26.2%) and workers (67, 35.4%)
Oswego County was a close second in total gross ($262K, 23.8%) but 23.8% fewer workers (51, 27.0%)
Tioga County, PA earned 75.4% less gross than Onondaga ($70K, 6.4%) but with 92.5% fewer workers (5, 2.6%)
Notably, 4 Journeymen Carpenters earned $18.7K, $15.7K, $15.5K, and $14.3K, and 1 Apprentice earned $6.4K
Tioga workers earned an average gross 228.6% more than Onondaga workers ($14.1K & $4.3K, respectively)
One Foreman from Summit, NY (pop. 1,123) carried Schoharie County to 10th place with total gross of $27.8K
The following uses package tigris to import shapefiles, or tables of longitude-latitude coordinates for meaningful geometric shapes, from the U.S. Census Bureau (2016). Package sf, or “simple features”, allows one to easily merge tables like hc_all, containing our totals and proportions of gross earnings and workers by county, into the appropriate county shapefile.
shapes <- counties(state = c("NY", "PA"), # Import NY & PA counties
resolution = "20m", # Maximum resolution
year = 2016, # Census data year
class = "sf") # Specify class "simple features"
One can then use these easily-modifiable shapefiles in dataset shapes to merge them with the gross and worker summaries by county visualized in Part 7: Visualizing an Exodus (hc_all). By iterating over each shapefile, the following code removes irrelevant county shapefiles where no itinerant Hancock renovation workers reside.
shapes <- fortify(shapes) # Preserve class "simple features"
hcsf <- shapes %>%
rename(county = NAME) %>%
left_join(hc_all) %>% # Merge summary on variable "county"
mutate(gross_usd = dollar(gross),
perc_gross = percent(perc_gross),
perc_workers = percent(perc_workers)) # Modify formatting, e.g. "%", "$"
for (i in 1:nrow(hcsf)){
if (is.na(hcsf$gross)[i]){
hcsf$gross[i] <- 0 # Replace missing gross with "$0"
hcsf$county[i] <- "" # Remove irrelevant county labels
}
}
Lastly, the below code uses packages tmap and leaflet to create a choropleth map, or a geographic map of geometric shapes - in this case, counties - where a gradient color scale fills each shape and is more or less saturated (darker or lighter) based on a particular data value - in this case, total gross income.
hc_chor <- tm_shape(hcsf) +
tm_fill(col = "gross",
title = "Gross (USD)",
style = "cont",
alpha = 0.75,
id = "NAMELSAD",
popup.vars = c("Gross (USD):" = "gross",
"Gross (%):" = "perc_gross",
"Workers:" = "workers",
"Workforce (%):" = "perc_workers")) +
tm_borders(col = "White",
lwd = 1.6) +
tm_text(text = "county", size = .75) +
tm_layout(main.title.position = c("LEFT", "TOP"),
legend.title.size = 1,
legend.position = c("RIGHT", "BOTTOM"),
frame = FALSE, saturation = -3)
tmap_mode(mode = "view")
The following visualization, made in package ggplot2, illustrates total gross earnings by individual workers per each of the 25 counties, including a boxplot that indicates the first quartile (25th percentile), third quartile (75th percentile), and median (50th percentile) for total individual gross by county.
These distributions emphasize just how paltry gross earnings were among Onondaga County workers compared to itinerant workers involved in the 2018 Hancock Airport renovations, highlighting the massive exodus of wealth by itinerant workers from counties such as Schoharie, Broom, and Cortland.
dist_inds <- ggplot(hc_inds,
aes(x = reorder(county, gross),
y = gross)) +
geom_boxplot(alpha = 0.85,
color = "gray80",
outlier.shape = 15,
outlier.size = 2) +
geom_jitter(alpha = 0.5,
height = 0,
width = .15,
size = 1,
color = "tomato") +
labs(title = "Distribution of total individual gross earnings by county",
subtitle = "Hancock Airport Renovations",
caption = "Source: Syracuse Regional Airport Authority",
x = "County",
y = "Total Individual Gross Earnings (USD)") +
coord_flip() +
scale_y_continuous(labels = scales::comma) +
theme_minimal()