1 Overview

The following outlines the exodus of wealth from Onondaga County during the 2018 Hancock Airport renovations, when nearly 75 cents on the dollar left Onondaga County along with itinerant workers residing in 24 other counties.

These findings are completely reproducible from the raw data, provided by Syracuse Regional Airport Authority, and may be found in my GitHub repository: Wealth.

While machine-readable code to reproduce the analysis is provided, human-readable text walks the reader through each step in preprocessing, analysis, visualization, and the staggering conclusions.

See Part 2: Executive Summary for chief conclusions.


2 Executive Summary

Onondaga County earns the greatest total gross at $288K (26.2%) and had the most workers at 67 (35.4%) of any county, but 122 itinerant workers from 24 other counties, including 2 counties in Pennsylvania, left with the remaining 73.8% of gross income, or $812K.

Oswego County workers were a close second with total county earnings of $262K (23.8%) but had 23.8% less workers at 51 total workers (27%). Oswego workers earned an average gross of $5.1K, 19.2% higher than the average Onondaga worker, earning $4.3K.

Tioga County, Pennsylvania earned the third highest gross at $70K (6.4%), which is 75.4% less than Onondaga County. However, it had 92.5% fewer workers (5, or 2.6% of the total workforce). 4 of these 5 workers were Journeymen Carpenters working for Quality Structures, Inc. and individually earned total gross incomes of $18.7K, $15.7K, $15.5K, and $14.3K, while 1 Apprentice Carpenter earned $6.4K. On average, each Tioga County worker earned a whopping 228.6% more than the average Onondaga County worker, at $14.1K and $4.3K, respectively.

Lastly, one Foreman from Summit, NY (pop. 1,123) single-handedly carried the entirety of Schoharie County to 10th highest grossing counties, among 25 counties total, earning $27.8K alone.

The 2018 Hancock Airport renovations illustrate a staggering exodus of wealth by itinerant workers in public construction projects.


3 Required Packages & Data

The following R packages and datasets are required for this analysis and loaded with functions library() and data().


library(sf)
library(tmap)
library(readr)
library(dplyr)
library(tigris)
library(scales)
library(stringr)
library(ggplot2)
library(leaflet)
library(noncensus)
library(kableExtra)

data(zip_codes)
data(counties)


4 Importing External Data

Below, we import the CSV (Comma-Separated Value) file containing all scraped records for both Lakeview Amphitheater constructions and Hancock Airport renovations: hancock_lakeview_tidy.csv. This file is available in the Wealth Repository, which also includes the Raw Data Folder containing the original records.


url <- paste0("https://raw.githubusercontent.com/jamisoncrawford/",
              "wealth/master/Tidy%20Data/hancock_lakeview_tidy.csv")
lvhc <- read_csv(file = url, 
                 col_types = "ccDcccddddccliii")
rm(url)


5 Convert ZIP to Counties

By using R package noncensus, we can convert worker ZIP (Zone Improvement Plan) codes to counties based on their individual payment records. The preprocessing occurs as follows:

  1. Create new variable fips in dataset counties to determine FIPS county code
  2. Filter missing values from Lakeview & Hancock data lvhc for ZIP codes, gross income, and project name
  3. Merge zip_codes dataset with Lakeview & Hancock data lvhc by FIPS county codes
  4. Merge counties dataset with Lakeview & Hancock data lvhc by county names
  5. Filter ~8 records failing to identify county (0.2% of total records)
  6. Separate data by project, viz. Hancock records, hc, and Lakeview records, lv


counties <- counties %>%
  mutate(fips = paste0(state_fips, 
                       county_fips),                        # Paste FIPS IDs (State + County)
         fips = as.character(fips)) %>%
  select(county_name, fips) 

lvhc <- lvhc %>%
  filter(!is.na(zip),
         !is.na(gross),
         !is.na(project)) %>%                               # Remove NA values
  left_join(zip_codes, 
            by = "zip") %>%
  mutate(fips = as.character(fips)) %>%                     # Join FIPS IDs
  left_join(counties,
            by = "fips") %>%                                # Join counties by FIPS
  mutate(county = str_replace_all(string = county_name, 
                                  pattern = " County$", 
                                  replacement = "")) %>%    # Clean county names
  select(project:hours, gross, sex:race, city:state, 
         county, fips, longitude:latitude) %>%              # Select variables
  filter(!is.na(state))                                     # Remove 8 unidentified records
  
rm(zip_codes, counties)

lv <- lvhc %>%
  filter(project == "Lakeview")

hc <- lvhc %>%
  filter(project == "Hancock")


6 County Summaries

Here, we perform a series of grouping and summarizing operations. In effect, we produce the following summary data:

  • Total gross per county and state, as well as proportion of total gross: hc_gross
  • Total gross per individual worker, identified by company, ZIP, SSN, class, gender, and race: hc_inds
  • Total workers and proportion of workforce by county and state, combined with gross income: hc_all


hc_gross <- hc %>%
  group_by(county, state) %>%
  summarize(gross = sum(gross)) %>%               # Total gross by county
  ungroup() %>%
  arrange(desc(gross)) %>%
  mutate(perc_gross = gross / sum(gross))         # Percentage of total gross

hc_inds <- hc %>%
  group_by(name, zip, ssn, class, sex, race) %>%
  summarize(gross = sum(gross),
            county = unique(county),
            state = unique(state)) %>%            # Total gross by individual
  ungroup() %>%
  arrange(county, desc(gross))

hc_wrks <- hc_inds %>%
  group_by(county, state) %>%
  summarize(workers = n(),
            gross = sum(gross)) %>%
  ungroup() %>%
  arrange(desc(gross))                            # Total workers by county

hc_all <- hc_gross %>%
  left_join(hc_wrks,
            by = c("county", "state")) %>%
  select(-gross.y) %>%
  rename(gross = gross.x) %>%                     # Merge workers and gross
  mutate(perc_workers = workers / sum(workers))

rm(hc_gross, hc_wrks)


7 Visualizing an Exodus

The below code visualizes and stores the above summary data, hc_all, in graphical object county_gross using the Tidyverse graphics package ggplot2. Labels, style, and other “non-data ink” are modified for clarity to produce Total gross income by county, state.


options(scipen = 999)                     # Disable scientific notation

county_gross <- ggplot(hc_all, 
                       aes(x = reorder(county, gross), gross, 
                           y = gross, 
                           fill = state)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Total gross income by county, state",
       subtitle = "Hancock Airport Renovations",
       caption = "Source: Syracuse Regional Airport Authority",
       x = "County",
       y = "Total Gross (USD)",
       fill = "State") +
  scale_y_continuous(labels = comma) +
  theme_minimal()


The resulting visual is disheartening. Although 35.4% of all workers reside in Onondaga County, only 26.2% of gross income is earned by Onondaga County residents. Indeed, nearly three quarters of every dollar earned in the 2018 Hancock Airport renovations left Onondaga County along with itinerant workers residing in 24 other counties.



8 Context Is Everything

The context is staggering. While the above visualization, Total gross income by county, state, is indeed disheartening, the numbers behind the graphic are overwhelming. The following code modifies table labels and produces an interactive table of key summary data in HTML, using package kableExtra. Notable conclusions follow, which underscore this exodus of wealth by itinerant workers.


hc_all_tbl <- hc_all %>%
  mutate(gross = dollar(gross),
         perc_gross = percent(perc_gross),
         perc_workers = percent(perc_workers)) %>%      # Modify labels, e.g. "%", "$"
  rename(County = county,
         State = state,
         "Total Gross" = gross,
         "Gross (%)" = perc_gross,
         "Total Workers" = workers,
         "Workforce (%)" = perc_workers)                # Rename variables

county_tbl <- kable(hc_all_tbl) %>%
  kable_styling(bootstrap_options = c("striped",        # Stripe table rows
                                      "hover",          # Interactivity
                                      "responsive"),    # Automate fit adjustment
                full_width = TRUE)


County State Total Gross Gross (%) Total Workers Workforce (%)
Onondaga NY $288,849 26.2% 67 35.4%
Oswego NY $262,136 23.8% 51 27.0%
Tioga PA $70,843 6.4% 5 2.6%
Cortland NY $63,370 5.8% 4 2.1%
Cayuga NY $54,589 5.0% 7 3.7%
Oneida NY $45,475 4.1% 9 4.8%
Broome NY $41,179 3.7% 2 1.1%
Madison NY $39,965 3.6% 9 4.8%
Steuben NY $27,872 2.5% 3 1.6%
Schoharie NY $27,697 2.5% 1 0.5%
Chemung NY $27,558 2.5% 2 1.1%
Ontario NY $23,060 2.1% 4 2.1%
Genesee NY $21,290 1.9% 2 1.1%
St. Lawrence NY $17,895 1.6% 1 0.5%
Monroe NY $16,846 1.5% 7 3.7%
Livingston NY $16,759 1.5% 1 0.5%
Lycoming PA $15,570 1.4% 1 0.5%
Wayne NY $12,883 1.2% 2 1.1%
Cattaraugus NY $6,286 0.6% 2 1.1%
Seneca NY $6,278 0.6% 2 1.1%
Schuyler NY $4,691 0.4% 1 0.5%
Jefferson NY $4,624 0.4% 1 0.5%
Erie NY $2,586 0.2% 2 1.1%
Tompkins NY $1,625 0.1% 2 1.1%
Herkimer NY $1,160 0.1% 1 0.5%


9 Conclusions

In light of total workers per county vis-à-vis total gross per county, the notable conclusions follow:

  1. Onondaga County had the greatest total gross ($288K, 26.2%) and workers (67, 35.4%)

  2. Oswego County was a close second in total gross ($262K, 23.8%) but 23.8% fewer workers (51, 27.0%)

    • Oswego workers earned an average gross 19.2% more than Onondaga workers ($5.1K & $4.3K, respectively)
  3. Tioga County, PA earned 75.4% less gross than Onondaga ($70K, 6.4%) but with 92.5% fewer workers (5, 2.6%)

    • Notably, 4 Journeymen Carpenters earned $18.7K, $15.7K, $15.5K, and $14.3K, and 1 Apprentice earned $6.4K

    • Tioga workers earned an average gross 228.6% more than Onondaga workers ($14.1K & $4.3K, respectively)

  4. One Foreman from Summit, NY (pop. 1,123) carried Schoharie County to 10th place with total gross of $27.8K


10 Index

10.1 Map: Gross by County

The following uses package tigris to import shapefiles, or tables of longitude-latitude coordinates for meaningful geometric shapes, from the U.S. Census Bureau (2016). Package sf, or “simple features”, allows one to easily merge tables like hc_all, containing our totals and proportions of gross earnings and workers by county, into the appropriate county shapefile.


shapes <- counties(state = c("NY", "PA"),     # Import NY & PA counties
                   resolution = "20m",        # Maximum resolution
                   year = 2016,               # Census data year
                   class = "sf")              # Specify class "simple features"


One can then use these easily-modifiable shapefiles in dataset shapes to merge them with the gross and worker summaries by county visualized in Part 7: Visualizing an Exodus (hc_all). By iterating over each shapefile, the following code removes irrelevant county shapefiles where no itinerant Hancock renovation workers reside.


shapes <- fortify(shapes)                       # Preserve class "simple features"

hcsf <- shapes %>%
  rename(county = NAME) %>%
  left_join(hc_all) %>%                         # Merge summary on variable "county"
  mutate(gross_usd = dollar(gross),
         perc_gross = percent(perc_gross),
         perc_workers = percent(perc_workers))  # Modify formatting, e.g. "%", "$"

for (i in 1:nrow(hcsf)){
  if (is.na(hcsf$gross)[i]){
    hcsf$gross[i] <- 0                          # Replace missing gross with "$0"
    hcsf$county[i] <- ""                        # Remove irrelevant county labels
  }
}


Lastly, the below code uses packages tmap and leaflet to create a choropleth map, or a geographic map of geometric shapes - in this case, counties - where a gradient color scale fills each shape and is more or less saturated (darker or lighter) based on a particular data value - in this case, total gross income.


hc_chor <- tm_shape(hcsf) +
  tm_fill(col = "gross",
          title = "Gross (USD)",
          style = "cont",
          alpha = 0.75, 
          id = "NAMELSAD",
          popup.vars = c("Gross (USD):" = "gross",
                         "Gross (%):" = "perc_gross",
                         "Workers:" = "workers",
                         "Workforce (%):" = "perc_workers")) +
  tm_borders(col = "White",
             lwd = 1.6) +
  tm_text(text = "county", size = .75) +
  tm_layout(main.title.position = c("LEFT", "TOP"),
            legend.title.size = 1,
            legend.position = c("RIGHT", "BOTTOM"),
            frame = FALSE, saturation = -3)

tmap_mode(mode = "view")



10.2 Distribution: Gross by Worker

The following visualization, made in package ggplot2, illustrates total gross earnings by individual workers per each of the 25 counties, including a boxplot that indicates the first quartile (25th percentile), third quartile (75th percentile), and median (50th percentile) for total individual gross by county.

These distributions emphasize just how paltry gross earnings were among Onondaga County workers compared to itinerant workers involved in the 2018 Hancock Airport renovations, highlighting the massive exodus of wealth by itinerant workers from counties such as Schoharie, Broom, and Cortland.


dist_inds <- ggplot(hc_inds, 
                    aes(x = reorder(county, gross), 
                        y = gross)) +
  geom_boxplot(alpha = 0.85, 
               color = "gray80", 
               outlier.shape = 15, 
               outlier.size = 2) +
  geom_jitter(alpha = 0.5, 
              height = 0, 
              width = .15, 
              size = 1, 
              color = "tomato") +
  labs(title = "Distribution of total individual gross earnings by county",
       subtitle = "Hancock Airport Renovations",
       caption = "Source: Syracuse Regional Airport Authority",
       x = "County",
       y = "Total Individual Gross Earnings (USD)") +
  coord_flip() +
  scale_y_continuous(labels = scales::comma) +
  theme_minimal()