In this analysis, I use American Community Survey (ACS) 2016 5-year data, pulled through the ACS API, along with geoprocessing to map the mean median household income and mean per capita income of Chicago community areas. Since the ACS does not report data by Chicago community area, I aggregate census tracts up to these areas to match the spatial scale of my overall project. Evaluating the spatial patterns of these income variables will allow me to assess economic disadvantage throughout Chicago as it relates to the context of IDUs’ HIV risk environment in 2016. Scholarship has found that poverty is linked not only to increased HIV transmission and prevalence but also to increased drug use (Rhodes et al. 2005). Together, these factors can heighten the risk that Chicago IDUs contract HIV.
As previously mentioned, the income data come from the 2016 5-year ACS API. Specifically, I use the ACS variables “B19013_001” (median household income) and “B19301_001” (per capita income) to identify community areas with economic disadvantage. I also use three other datasets: “Boundaries - City,” a shapefile of the outer boundary of Chicago from the Chicago Data Portal; “Boundaries - Community Areas (current),” spatial data of the current community area boundaries in Chicago; and the “Census Tract to Community Area Equivalency File,” which I downloaded as a csv from the “Chicago GIS Base Layers for Chicago” section of the UChicago Library. This csv file lists each Chicago census tract and the community area it falls within, which is helpful for grouping tracts together in my analysis.
To generate my new variables, mean median household income and mean per capita income by Chicago community area in 2016, I EXTRACT 1) ACS data through my API key, 2) the Chicago city boundary shapefile, 3) the census tracts in Chicago community areas csv, and 4) the Chicago community area boundary shapefile. In my workflow, I TRANSFORM these datasets through geoprocessing and then LOAD a cleaned shapefile of the mean income variables by Chicago community area, along with a final map of my results. A more detailed account of this process can be seen in my ETL diagram below.

## Extract (ACS 2016 IL Income Variables by Census Tract)

### Load libraries and set up R Session

To begin my analysis, I first load all the libraries that will be used throughout this spatial analysis.
library(sf)
library(tidycensus)
library(tidyverse)
library(tmap)
library(dplyr)
library(leaflet)
library(data.table)
library(tidyr)
library(tigris)
Next, I enable my ACS API key.
## [1] ""
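The chunk that enables the key is hidden in the rendered output. One way to do this without exposing the key is sketched below, assuming the key is stored in an environment variable named `CENSUS_API_KEY` (the variable tidycensus itself reads). The helper name `get_census_key` is hypothetical and not part of the original workflow.

```r
# Hypothetical helper: read the key from the CENSUS_API_KEY environment
# variable so that it never appears in the script or its rendered output.
get_census_key <- function() {
  key <- Sys.getenv("CENSUS_API_KEY")
  if (!nzchar(key)) {
    stop("CENSUS_API_KEY is not set; add it to ~/.Renviron or call Sys.setenv().")
  }
  key
}

# Usage (assumes the tidycensus package is installed):
# tidycensus::census_api_key(get_census_key())
```

Failing early when the variable is empty gives a clearer error than a rejected API request later in the workflow.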
Once my ACS API key is enabled, I can view and inspect the ACS 2016 variable list to identify the variable IDs for median household income and per capita income.
ACS16var <- load_variables(2016, "acs5", cache = TRUE)
view(ACS16var)
After inspecting the data, I identified median household income as “B19013_001” and per capita income as “B19301_001.” Now I can use the get_acs function to pull these variables for IL census tracts, with geometry. In this code chunk, I also clean the data using select and spread.
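# Pull both income variables for every IL census tract, with geometry
tractsShp <- get_acs(geography = "tract",
                     variables = c(medinc = "B19013_001", percap = "B19301_001"),
                     year = 2016, state = "IL", geometry = TRUE) %>%
  select(GEOID, NAME, variable, estimate) %>%  # keep estimates, drop margins of error
  spread(variable, estimate)                   # one column per income variable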
## Getting data from the 2012-2016 5-year ACS
## Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
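The reshaping step above turns one row per tract and variable into one row per tract. In current versions of tidyr, `spread()` is superseded by `pivot_wider()`. A minimal sketch of the equivalent call on toy data follows; the GEOIDs and values are made up for illustration and the object names are hypothetical.

```r
library(tidyr)

# Toy long-format data shaped like get_acs() output after select()
# (values are made up for illustration)
acs_long <- data.frame(
  GEOID = c("A", "A", "B", "B"),
  variable = c("medinc", "percap", "medinc", "percap"),
  estimate = c(50000, 25000, 40000, 20000)
)

# One row per GEOID, one column per income variable
acs_wide <- pivot_wider(acs_long, names_from = variable, values_from = estimate)
```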
Inspection of results:
glimpse(tractsShp)
## Rows: 3,123
## Columns: 5
## $ GEOID <chr> "17001000100", "17001000201", "17001000202", "17001000400", …
## $ NAME <chr> "Census Tract 1, Adams County, Illinois", "Census Tract 2.01…
## $ medinc <dbl> 54550, 41538, 40018, 28819, 32313, 44324, 17850, 26012, 4047…
## $ percap <dbl> 30465, 22267, 21367, 18268, 14527, 29072, 17410, 16130, 1833…
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-91.37766 3..., MULTIPOLYGON ((…
Next, I want to read in the Chicago City Boundary shapefile and take a look at this data.
chicago_bound <- st_read("Boundaries - City")
## Reading layer `geo_export_ba34cc3b-294c-40fc-9cf1-630e836447c2' from data source `/Users/brifadden/Desktop/IDU_project/Boundaries - City' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 4 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## geographic CRS: WGS84(DD)
glimpse(chicago_bound)
## Rows: 1
## Columns: 5
## $ name <chr> "CHICAGO"
## $ objectid <dbl> 1
## $ shape_area <dbl> 6450276623
## $ shape_len <dbl> 845282.9
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-87.93514 4...
Since I will be intersecting this city boundary with my ACS spatial data, I first need to make sure both are in the same CRS. Here, I transform each to EPSG:4326 (WGS 84).
chicago_bound <- st_transform(chicago_bound, 4326)
tractsShp <- st_transform(tractsShp, 4326)
This intersection will keep only the tracts within the Chicago city boundary, since I am not interested in the rest of IL in my analysis.
tract_acs_shp <- st_intersection(tractsShp, chicago_bound)
## although coordinates are longitude/latitude, st_intersection assumes that they are planar
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries
head(tract_acs_shp)
glimpse(tract_acs_shp)
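The first warning above appears because the intersection is computed on longitude/latitude coordinates as if they were planar. One possible alternative, not the approach used in this analysis, is to project both layers to a planar CRS before intersecting. The sketch below uses toy squares and assumes EPSG:26971 (NAD83 / Illinois East, in meters) is a suitable projection for Chicago; the `square` helper and all object names are hypothetical.

```r
library(sf)

# Hypothetical helper: a square polygon from its lower-left corner and side length
square <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
    c(xmin, ymin + size), c(xmin, ymin)
  )))
}

# Toy "tract" and "city boundary" in lon/lat (EPSG:4326) that partly overlap
tract <- st_sf(id = "tract_a",
               geometry = st_sfc(square(-87.70, 41.80, 0.10), crs = 4326))
city <- st_sf(name = "city",
              geometry = st_sfc(square(-87.65, 41.85, 0.10), crs = 4326))

# Project both layers to the same planar CRS before intersecting
tract_proj <- st_transform(tract, 26971)
city_proj <- st_transform(city, 26971)
stopifnot(st_crs(tract_proj) == st_crs(city_proj))

# Keep only the part of the tract that falls inside the boundary
clipped <- suppressWarnings(st_intersection(tract_proj, city_proj))
```

The `suppressWarnings()` call only silences the note that attributes are assumed constant across each geometry, which is the second warning shown above.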
Here, I read in the census tract by community area file from UChicago GIS Library Resources. This file lists which census tracts are within each community area, allowing me to use the group_by function at a later stage.
chi_tracts_comm <- read.csv(file = "Census_Tracts_in_Chicago_Community_Areas.csv")
glimpse(chi_tracts_comm)
## Rows: 794
## Columns: 4
## $ Tract <dbl> 17031010100, 17031010201, 17031010202, 1703101030…
## $ Label <chr> "Census Tract 101, Cook County, Illinois", "Censu…
## $ CommunityAreaNumber <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2…
## $ CommunityAreaName <chr> "Rogers Park", "Rogers Park", "Rogers Park", "Rog…
This merge connects the community area identifiers for each census tract to the ACS tract data that I want to aggregate up to the community area level. I join on the keys “GEOID” and “Tract,” both of which hold the census tract code.
chi_tracts_merge <- merge(tract_acs_shp, chi_tracts_comm, by.x = "GEOID", by.y = "Tract")
glimpse(chi_tracts_merge)
## Rows: 794
## Columns: 12
## $ GEOID <chr> "17031010100", "17031010201", "17031010202", "170…
## $ NAME <chr> "Census Tract 101, Cook County, Illinois", "Censu…
## $ medinc <dbl> 29861, 38861, 29432, 37515, 37228, 26750, 29870, …
## $ percap <dbl> 24933, 23919, 20438, 32129, 20880, 25217, 23685, …
## $ name <chr> "CHICAGO", "CHICAGO", "CHICAGO", "CHICAGO", "CHIC…
## $ objectid <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ shape_area <dbl> 6450276623, 6450276623, 6450276623, 6450276623, 6…
## $ shape_len <dbl> 845282.9, 845282.9, 845282.9, 845282.9, 845282.9,…
## $ Label <chr> "Census Tract 101, Cook County, Illinois", "Censu…
## $ CommunityAreaNumber <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2…
## $ CommunityAreaName <chr> "Rogers Park", "Rogers Park", "Rogers Park", "Rog…
## $ geometry <POLYGON [°]> POLYGON ((-87.67699 42.0229..., POLYGON (…
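Note that the two keys have different types in the output above: “GEOID” is character, while “Tract” was read from the csv as a number. The merge succeeded here, but making the key types explicit and checking for unmatched tracts is a cheap safeguard. A minimal sketch on toy data follows; the income values and the unmatched GEOID are made up, and the object names are hypothetical.

```r
# Toy ACS table with character keys (values are made up for illustration)
acs_toy <- data.frame(
  GEOID = c("17031010100", "17031010201", "17031999999"),
  medinc = c(29861, 38861, 50000),
  stringsAsFactors = FALSE
)

# Toy crosswalk whose key was read as a number, as read.csv() does by default
xwalk_toy <- data.frame(
  Tract = c(17031010100, 17031010201),
  CommunityAreaNumber = c(1L, 1L)
)

# Convert the numeric key to text without scientific notation
xwalk_toy$Tract <- sprintf("%.0f", xwalk_toy$Tract)

merged_toy <- merge(acs_toy, xwalk_toy, by.x = "GEOID", by.y = "Tract")

# Tracts in the ACS table with no community area in the crosswalk
unmatched <- setdiff(acs_toy$GEOID, xwalk_toy$Tract)
```

Comparing `nrow(merged_toy)` with the number of rows in the crosswalk, as the 794-row output above allows, is another quick check that no tracts were lost.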
Here, I group the census tracts by community area so that I can take the mean of the tract-level median household incomes and per capita incomes within each community area. This aggregates the ACS data up to my final project scale of the community area level.
acs_means <- chi_tracts_merge %>%
group_by(CommunityAreaNumber) %>%
summarize(mean_medinc = mean(medinc, na.rm = TRUE), mean_percap = mean(percap, na.rm = TRUE))
Now, I want to inspect the result.
glimpse(acs_means)
## Rows: 77
## Columns: 4
## $ CommunityAreaNumber <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
## $ mean_medinc <dbl> 35616.92, 53301.29, 45986.08, 70449.73, 101092.29…
## $ mean_percap <dbl> 24487.92, 25162.71, 37941.33, 45083.82, 60947.93,…
## $ geometry <GEOMETRY [°]> POLYGON ((-87.66055 41.9981..., POLYGON …
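Because `na.rm = TRUE` silently drops tracts with missing estimates, it can help to see how many tracts feed each mean. The sketch below repeats the same aggregation on a toy table with extra count columns; the values are made up, and the columns `n_tracts` and `n_missing` are hypothetical additions that are not part of the original output.

```r
library(dplyr)

# Toy tract-level data (values are made up for illustration)
toy_tracts <- tibble(
  CommunityAreaNumber = c(1L, 1L, 1L, 2L, 2L),
  medinc = c(30000, 40000, NA, 60000, 80000)
)

toy_means <- toy_tracts %>%
  group_by(CommunityAreaNumber) %>%
  summarize(
    n_tracts = n(),                           # tracts in the community area
    n_missing = sum(is.na(medinc)),           # tracts with no estimate
    mean_medinc = mean(medinc, na.rm = TRUE)  # unweighted mean of tract medians
  )
```

Each tract counts equally in this mean regardless of its population, so the result is an unweighted average of tract medians rather than a community-area median.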
Here, I write the final spatial dataset of mean median household income and mean per capita income by community area to a shapefile, which completes the Load step.
st_write(acs_means, "incomeACSchi_2016.shp")
## Warning in abbreviate_shapefile_names(obj): Field names abbreviated for ESRI
## Shapefile driver
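The warning above arises because the shapefile format stores attributes in a dBase table whose field names are limited to 10 characters, so longer names are abbreviated automatically. A minimal sketch of checking for this and choosing short names in advance follows; the short names are hypothetical and chosen only for illustration.

```r
# Column names from acs_means (geometry column excluded)
field_names <- c("CommunityAreaNumber", "mean_medinc", "mean_percap")

# Shapefile (dBase) field names are limited to 10 characters
too_long <- field_names[nchar(field_names) > 10]

# Hypothetical short names, chosen here for illustration only
short_names <- c(
  CommunityAreaNumber = "comm_area",
  mean_medinc = "mn_medinc",
  mean_percap = "mn_percap"
)

# Swap in the short name wherever one is defined
renamed <- unname(ifelse(field_names %in% names(short_names),
                         short_names[field_names], field_names))
```

With the real data, names like these could be applied with dplyr's `rename()` before calling `st_write()`, so that the field names in the output file are predictable rather than auto-abbreviated.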
To plot this data, I used tmap’s interactive viewing mode. From the “Plots” R environment, I can then export this map as an image, PDF, or link.
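The plotting code is hidden in the rendered output. A minimal sketch of an interactive tmap choropleth follows, using a toy stand-in for `acs_means`; the geometries and income values are made up, the `square` helper and object names are hypothetical, and the styling of the original maps is not reproduced.

```r
library(sf)
library(tmap)

# Hypothetical helper: a square polygon from its lower-left corner and side length
square <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
    c(xmin, ymin + size), c(xmin, ymin)
  )))
}

# Toy stand-in for acs_means: two "community areas" with made-up incomes
toy_areas <- st_sf(
  CommunityAreaNumber = 1:2,
  mean_medinc = c(35000, 70000),
  geometry = st_sfc(square(-87.70, 41.80, 0.05),
                    square(-87.65, 41.80, 0.05), crs = 4326)
)

tmap_mode("view")  # switch tmap to interactive viewing

# Shade each polygon by its mean median household income
income_map <- tm_shape(toy_areas) +
  tm_polygons("mean_medinc")

# Printing income_map in an interactive session displays the map
```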
Mean Median Household Income by Community Area 2016:
## tmap mode set to interactive viewing
Mean Per Capita Income by Community Area 2016:
Rhodes, Tim, Merrill Singer, Philippe Bourgois, Samuel R. Friedman, and Steffanie A. Strathdee. “The Social Structural Production of HIV Risk among Injecting Drug Users.” Social Science & Medicine 61, no. 5 (September 1, 2005): 1026–44. https://doi.org/10.1016/j.socscimed.2004.12.024.