This will be a walk-through of the spatial joins and areal calculations needed to do a spatial overlay of modern census boundaries and historic redlining data. However, this process can apply to more general aggregation practices.
Redlining was the systematic practice of outlining, often racial or ethnic, neighborhoods or communities often by financial service industries such as banking and insurance for the denial of various services or the selective raising of prices. Historic redlining data can help characterize or contextualize modern census block data. However, redlining boundaries were are heterogeneous as the practice of redlining itself, and was practiced by various individuals and industries on both official and “unofficial” basis. Additionally, census boundaries are constantly in flux, growing and shrinking to accommodating moving and dynamic populations. Also, when comparing data historically, like modern data census boundaries and boundaries drawn approximately 80 years ago, appropriate aggregation becomes important. Therefore, it is important to learn how to do spatial joins and areal calculations that can identify what percent of a block is covered by a redlined area category.
The redlining boundaries and category are taken from the United State’s federal government’s Howe Owner’s Loan Corporation between 1935 and 1940. This is by no means the only redlining maps created, as various industries participated in redlining from health care to supermarkets, nor is it definitive for the banking or insurance industries where different companies could have thier own maps. Redlining is also not only a historic practice, with insurance companies in Ohio, for example, allowing the continued use of demographic data at the ZIP code level to determine insurance rate.
Pre-steps are to download the data from the links provided in the data section. Then, library the necessary packages
library(sf)
library(tidyverse)
library(tigris)
library(tmap)
library(ggplot2)
library(leaflet)
Read in the shapefile and then subset by desired state and city and select desired columms.
nation_redlining <- st_read("/Users/elizabethzazycki/redlining shapefile/shapefile/holc_ad_data.shp")
## Reading layer `holc_ad_data' from data source `/Users/elizabethzazycki/redlining shapefile/shapefile/holc_ad_data.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8878 features and 7 fields (with 3 geometries empty)
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -122.7675 ymin: 25.70537 xmax: -70.9492 ymax: 47.72251
## CRS: 4326
philly_redlining <- subset(nation_redlining, state == "PA" & city == "Philadelphia") %>%
select(holc_grade, geometry)
view(philly_redlining)
Visulize:
plot(philly_redlining)
Load in shapefile of Philadelphia’s Census blocks.
philly_blocks <- st_read("/Users/elizabethzazycki/Philly Project/Census_Block_Groups_2010-shp/Census_Block_Groups_2010.shp") %>%
select(NAMELSAD10) #name of census blocks
## Reading layer `Census_Block_Groups_2010' from data source `/Users/elizabethzazycki/Philly Project/Census_Block_Groups_2010-shp/Census_Block_Groups_2010.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1336 features and 15 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -75.28031 ymin: 39.86747 xmax: -74.95575 ymax: 40.13793
## CRS: 4326
view(philly_blocks)
Visualize:
Since we are going to be doing calculations it is important to first check that all our data is in the same CRS and that we have a CRS with appropriate units.
st_crs(philly_blocks)
## Coordinate Reference System:
## User input: 4326
## wkt:
## GEOGCS["WGS 84",
## DATUM["WGS_1984",
## SPHEROID["WGS 84",6378137,298.257223563,
## AUTHORITY["EPSG","7030"]],
## AUTHORITY["EPSG","6326"]],
## PRIMEM["Greenwich",0,
## AUTHORITY["EPSG","8901"]],
## UNIT["degree",0.0174532925199433,
## AUTHORITY["EPSG","9122"]],
## AUTHORITY["EPSG","4326"]]
st_crs(philly_redlining)
## Coordinate Reference System:
## User input: 4326
## wkt:
## GEOGCS["WGS 84",
## DATUM["WGS_1984",
## SPHEROID["WGS 84",6378137,298.257223563,
## AUTHORITY["EPSG","7030"]],
## AUTHORITY["EPSG","6326"]],
## PRIMEM["Greenwich",0,
## AUTHORITY["EPSG","8901"]],
## UNIT["degree",0.0174532925199433,
## AUTHORITY["EPSG","9122"]],
## AUTHORITY["EPSG","4326"]]
While both datasets are in the same CRS, the units are degree and for our calculations we need U.S. feet.
Therefore, we will transform the data into a appropriate CRS (for our purposes one calculated for Pennsylvania) whose units are U.S. feet:
philly_blocks_trans <- st_transform(philly_blocks, 2272)
st_crs(philly_blocks_trans)
## Coordinate Reference System:
## User input: EPSG:2272
## wkt:
## PROJCS["NAD83 / Pennsylvania South (ftUS)",
## GEOGCS["NAD83",
## DATUM["North_American_Datum_1983",
## SPHEROID["GRS 1980",6378137,298.257222101,
## AUTHORITY["EPSG","7019"]],
## TOWGS84[0,0,0,0,0,0,0],
## AUTHORITY["EPSG","6269"]],
## PRIMEM["Greenwich",0,
## AUTHORITY["EPSG","8901"]],
## UNIT["degree",0.0174532925199433,
## AUTHORITY["EPSG","9122"]],
## AUTHORITY["EPSG","4269"]],
## PROJECTION["Lambert_Conformal_Conic_2SP"],
## PARAMETER["standard_parallel_1",40.96666666666667],
## PARAMETER["standard_parallel_2",39.93333333333333],
## PARAMETER["latitude_of_origin",39.33333333333334],
## PARAMETER["central_meridian",-77.75],
## PARAMETER["false_easting",1968500],
## PARAMETER["false_northing",0],
## UNIT["US survey foot",0.3048006096012192,
## AUTHORITY["EPSG","9003"]],
## AXIS["X",EAST],
## AXIS["Y",NORTH],
## AUTHORITY["EPSG","2272"]]
philly_redlining_trans <- st_transform(philly_redlining, 2272)
st_crs(philly_redlining_trans)
## Coordinate Reference System:
## User input: EPSG:2272
## wkt:
## PROJCS["NAD83 / Pennsylvania South (ftUS)",
## GEOGCS["NAD83",
## DATUM["North_American_Datum_1983",
## SPHEROID["GRS 1980",6378137,298.257222101,
## AUTHORITY["EPSG","7019"]],
## TOWGS84[0,0,0,0,0,0,0],
## AUTHORITY["EPSG","6269"]],
## PRIMEM["Greenwich",0,
## AUTHORITY["EPSG","8901"]],
## UNIT["degree",0.0174532925199433,
## AUTHORITY["EPSG","9122"]],
## AUTHORITY["EPSG","4269"]],
## PROJECTION["Lambert_Conformal_Conic_2SP"],
## PARAMETER["standard_parallel_1",40.96666666666667],
## PARAMETER["standard_parallel_2",39.93333333333333],
## PARAMETER["latitude_of_origin",39.33333333333334],
## PARAMETER["central_meridian",-77.75],
## PARAMETER["false_easting",1968500],
## PARAMETER["false_northing",0],
## UNIT["US survey foot",0.3048006096012192,
## AUTHORITY["EPSG","9003"]],
## AXIS["X",EAST],
## AXIS["Y",NORTH],
## AUTHORITY["EPSG","2272"]]
Now that the data is loaded, cleaned, and in the right CRS we can begin spatial overlay and calculation process.
In one really compact coding chunck we can perform a spatial join and an areal calculation. In the next steps we will break down the process
overlap <- st_intersection(
philly_redlining_trans,
philly_blocks_trans ) %>% # spatial joining
mutate(area = st_area(geometry)) %>% # calculate block area and add new column
as.data.frame() %>%
select(NAMELSAD10, area, holc_grade, geometry) %>%
group_by(NAMELSAD10) %>%
mutate(coverage = as.numeric(area / sum(area)) # Calculate coverage
)
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries
view(overlap)
head(overlap)
## # A tibble: 6 x 5
## # Groups: NAMELSAD10 [3]
## NAMELSAD10 area holc_grade geometry coverage
## <fct> [US_survey <fct> <POLYGON [US_survey_foot]> <dbl>
## 1 Block Grou… 1154366.2 D ((2682148 242123.8, 2682453 242169… 0.00150
## 2 Block Grou… 1117026.0 D ((2683777 240700.8, 2683155 240593… 0.00201
## 3 Block Grou… 470634.9 D ((2686641 240342.8, 2686676 240141… 0.000845
## 4 Block Grou… 1172871.7 D ((2680332 243447, 2679669 243332, … 0.00211
## 5 Block Grou… 1107834.1 D ((2683049 243015.5, 2683007 242950… 0.00144
## 6 Block Grou… 1198951.8 D ((2681213 244107.2, 2681856 243823… 0.00316
In the first part of the code chunk we create an object called overlap that is the inersection of our redlining data and census block data.
overlap <- st_intersection(
philly_redlining_trans,
philly_blocks_trans) %>% # spatial joining
Then, in the next chunk we calculate the area of the intersection of redlining boundaries and the modern day census blocks and select the columns we are interested in.
mutate(area = st_area(geometry)) %>% # calculate block area and add new column
as.data.frame() %>%
select(NAMELSAD10, area, holc_grade, geometry) %>%
group_by(NAMELSAD10) %>%
Finally, we calculate the coverage of each modern day census block by each redlining boundary and its grade.
mutate(coverage = as.numeric(area / sum(area)) # Calculate coverage
)
view(overlap)
## tmap mode set to plotting
And, just for fun, let’s look at blocks that overlap with Holc Grade A
From here, with your data all at the same scale, you can go forth and do more complex analysis!