Welcome to our mapping unit! Mapping in R is easy if you know how to use ggplot2, as most maps will just build on our ggplots by adding an additional geom layer that is specific to spatial data.

In R, most spatial analysis uses the SF package, which stands for Simple Features. This is the type of data that we will store spatial information in - if you’ve worked with GIS, this is what R calls a shapefile. The main difference between an SF object and a dataframe is that an SF object has a column at the end called “Geometry” that specifies the spatial information for each row of the object.

We’ll explore mapping in R using historical redlining data in US cities from the Mapping Inequality project at the University of Richmond. See below for a description of the Redlining grades made by the Home Owners’ Loan Corporation:

“HOLC staff members, using data and evaluations organized by local real estate professionals—lenders, developers, and real estate appraisers—in each city, assigned grades to residential neighborhoods that reflected their”mortgage security” that would then be visualized on color-coded maps. Neighborhoods receiving the highest grade of “A”—colored green on the maps—were deemed minimal risks for banks and other mortgage lenders when they were determining who should received loans and which areas in the city were safe investments. Those receiving the lowest grade of “D,” colored red, were considered “hazardous.””

Redlining in urban areas across the US is responsible for systematically excluding the poor, families of color, immigrants, and other minoritized groups from accessing homeownership. In some places, this caused widespread housing insecurity for these households, as they had no choice but to rent low-quality housing that was available in their neighborhoods. Excluding entire groups from homeownership has also limited their access to building intergenerational wealth. A significant racial wealth gap persists today, even as families of color have experienced increased incomes and access to opportunity - the legacies of redlining are widespread.

library(sf)     
library(dplyr)   
library(spData) 
library(urbnmapr)
library(ggplot2)
library(sp)
library(ggmap)
library(ggthemes)
library(stringr)

setwd("~/Binghamton/harp325/shapefile")
Warning: The working directory was changed to C:/Users/mhaller/Documents/Binghamton/harp325/shapefile inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
map <- st_read("holc_ad_data.shp")
Reading layer `holc_ad_data' from data source 
  `C:\Users\mhaller\Documents\Binghamton\Harp325\shapefile\holc_ad_data.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 8878 features and 7 fields (with 3 geometries empty)
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -122.7675 ymin: 25.70537 xmax: -70.9492 ymax: 47.72251
Geodetic CRS:  WGS 84
#load the county base map
counties <- get_urbn_map("counties", sf = TRUE)

#Here, I'm changing the CRS of both layers so that they match
#EPSG codes specify the projection of the map
#You can find some state-level epsg codes here: https://github.com/veltman/d3-stateplane
#The espg code I've chosen is for the entire US, and is fine for this map
#I'd definitely pick a different projection if I were using this in a paper or assignment

map <- map %>% st_transform("EPSG:4269")
counties <- counties %>% st_transform("EPSG:4269")
old-style crs object detected; please recreate object with a recent sf::st_crs()
ggplot()+
  geom_sf(counties, mapping = aes(), fill = "gray", color = "white")+
  geom_sf(map, mapping = aes(), fill = "red", color = "red")+
  theme_minimal()

Ok, clearly we are zoomed out way too far! Let’s pick a city to plot. I’ll start with my hometown, Rochester, NY. We’ll do an activity after this so that you can plot a city of your choice as well.

#I need to include this option to turn off spherical geometry
#don't worry about this too much - R won't let us focus on one city unless we do
sf_use_s2(FALSE)

#I'll grab rochester's county, Monroe, for plotting
rochester <- counties %>% filter(county_name == "Monroe County" & state_name == "New York")

#we need to also filter the redlining layer by Rochester - the simplest way to do this is with brackets, like this:
#this only works because we turned off spherical geometry
roc_redline <- map[rochester, ]
although coordinates are longitude/latitude, st_intersects assumes that they are planar
#Before I map it, I want to check the dimensions of the redlining layer, so I can zoom in my map 

roc_redline
Simple feature collection with 61 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -77.6922 ymin: 43.10451 xmax: -77.51284 ymax: 43.27137
Geodetic CRS:  NAD83
First 10 features:
     state      city                                  name holc_id
5961    NY Rochester      Shore Front Creece and Rochester      A1
5962    NY Rochester                              Brighton     A10
5963    NY Rochester                 Meadow Brook Brighton     A11
5964    NY Rochester                                  <NA>     A12
5965    NY Rochester Boulevard Manor- Maywood, Irondequoit      A2
5966    NY Rochester          Winona Boulevard Irondequoit      A3
5967    NY Rochester                     Koda Vista Greece      A4
5968    NY Rochester              Latona Homesteads Greece      A5
5969    NY Rochester                 East Avenue Rochester      A6
5970    NY Rochester                Brightondale Rochester      A7
     holc_grade neighborho
5961          A       2367
5962          A       2422
5963          A       2407
5964          A       2425
5965          A       2415
5966          A       2414
5967          A       2410
5968          A       2409
5969          A       2399
5970          A       2405
                                                                                                                                                                                                                                                             area_descr
5961 { "72" : "$19.54 Greece, $41.73 Roch. No sales reported in \\"A\\" areas in Greece. Effective Tax Rate for all \\"A\\" areas in Rochester based on 3 sales $50.49.", "8" : "A small area advantageously located on the lake front at some distance from the center
5962     { "2j" : " Fair up to $10,000 on 2nd hand", "2l" : " *60-75", "3" : "3 On order 1-family at $20,000+", "9" : "11/1/39 A-10 High 1st Brighton", "2m" : "(+)11   *65-85", "1a" : " Slowly ", "5" : "3", "8" : "There are actually two parts to this area. The fi
5963     { "2o" : "Good up to $60 ", "2i" : " 0 4500-12,000 1939 ", "2n" : "0 1939 *50-75 (*Small homes only)  ", "2m" : "- No rentals  ", "2j" : "Fair up to $9,000 on 2nd hand, Good up to $9,000 on new ", "2k" : "Stable ", "2l" : "None in existence ", "0" : "Roc
5964     { "3" : " 0 ", "71" : "9", "4a" : "0", "10" : "", "2" : "100 ", "2a" : "Medium sized 1-family 2-2 1/2 stories, 7-10 rooms ", "0" : "Rochester, New York", "9" : "11/1/39 A-12 Clover Hills, Brighton & Pittsford 1st", "6" : "Ample", "1d" : "0", "1c" : "0 ",
5965     { "2c" : "Up to 20 years ", "2j" : " Fair up to $8,000", "71" : "9", "8" : "Although this area dates back over a considerable period, more recent years have witnessed a very substantial amount of speculative development work which still continues. The ol
5966   { "1d" : "0", "4b" : "24", "2o" : " Not a rental area", "2a" : " Moderate size, one-family 1 1/2-2 1/2 stories, 5-7 rooms", "2h" : "5500-7500   -", "2k" : " Stable", "72" : "16.93, Effective Tax Rate for all \\"A\\" areas in Irondequoit based on 8 sales $1
5967     { "5" : "3", "2c" : "Up to 10 years ", "71" : "9", "9" : "Low 1st Koda Vista, Greece 11/1/39 A-4", "5b" : "11", "2f" : " 96%", "4b" : "19", "4a" : "0", "1c" : "0 ", "10" : "", "2a" : " Moderate sized, 1-family 2-2 1/2 stories, 5-7 rooms", "1b" : "Foremen
5968     { "1b" : "Better paid employees of Eastman Kodak Co. with incomes from $2500 to $4000", "4b" : "0", "4a" : "0", "5b" : "0", "10" : "", "1d" : "0", "2g" : " None in existence", "2a" : " Small 1-family 1 1/2-2-story, 5-6 rooms", "2d" : " Excellent", "2b" :
5969   { "2i" : " 1939 (-)19  *12,500-40,000, *Nominal", "4a" : "0", "72" : "41.73, Effective Tax Rate for all \\"A\\" areas in ROchester based on 3 sales $50.49.", "5" : "3", "2m" : "  - No rentals", "2e" : "98% ", "2d" : "Excellent ", "1a" : "  Yes", "2p" : " "
5970     { "1d" : "0", "1b" : "Executives & white-collar whose incomes range from $3000 to $10,000", "2k" : " Stable in lower bracket & downward in upper", "2d" : " Excellent", "9" : "11/1/39 1st A-7 Brightondale, Rochester", "71" : "9", "2n" : "  65-125 0 1939",
                           geometry
5961 MULTIPOLYGON (((-77.61829 4...
5962 MULTIPOLYGON (((-77.53257 4...
5963 MULTIPOLYGON (((-77.57404 4...
5964 MULTIPOLYGON (((-77.54934 4...
5965 MULTIPOLYGON (((-77.59907 4...
5966 MULTIPOLYGON (((-77.60402 4...
5967 MULTIPOLYGON (((-77.66021 4...
5968 MULTIPOLYGON (((-77.68379 4...
5969 MULTIPOLYGON (((-77.56185 4...
5970 MULTIPOLYGON (((-77.57222 4...
#let's map it using those coordinates!
ggplot()+
  geom_sf(rochester, mapping = aes(), fill = "gray80")+
  geom_sf(roc_redline, mapping = aes(fill = holc_grade), color = "white")+
  theme_minimal()+
  labs(fill = "Redline Score", title = "Redlining in Rochester, NY")+
  theme(plot.title = element_text(hjust = 0.5))+
  ylim(43.1, 43.3)+
  xlim(-77.7, -77.5)

We can do much better than this in R! One issue is that there ins’t a base map here - if you’re familiar with the area, it would be helpful to see the locations of streets and major highways. We can do this using the ggmap package.

#set lat/long of base map
LAT1 = 43.1
LAT2 = 43.3
LON1 = -77.7
LON2 = -77.5

#create basemap - I like the terrain-lines map in ggmap
#you can play around with it on your own. Other basemaps can be found in ?get_stamenmap
#first, set lat/long coordinates of base map
base_map <- get_stamenmap(bbox = c(left = LON1, right = LON2, 
                                   bottom = LAT1, top = LAT2),
                          #this zoom works well for a smaller city
                          #I recommend starting at 10 and increasing by 1 to find                             the best zoom option
                          zoom = 12, 
                          #pick the map type you want
                          maptype = "terrain-lines")
i Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.
#let's make a nice color palette for the legend
colors <- c("seagreen", "seagreen1", "yellow", "tomato")

#use the ggmap function instead of ggplot2 to map the base map
ggmap(base_map)+
  geom_sf(roc_redline, mapping = aes(fill = holc_grade), color = "white",
  #we need to include inheret.aes=F here or our geom_sf layer won't work
  inherit.aes = FALSE, alpha = 0.6)+
  theme_minimal()+
  labs(fill = "Redline Score", title = "Redlining in Rochester, NY")+
  theme(plot.title = element_text(hjust = 0.5))+
  scale_fill_manual(values = colors)
Coordinate system already present. Adding new coordinate system, which will replace the existing one.

Before we move on to the activity, I want to also show you how to map data at the national level. I’ll illustrate this using the inequality data that we’ve used previously from Opportunity Insights.

setwd("~/binghamton/geog380")
Warning: The working directory was changed to C:/Users/mhaller/Documents/binghamton/geog380 inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
data <- read.csv("county_data.csv")

#We'll need to merge this dataset with the county base map
#Because it doesn't have any geographic info of its own to use to map it
#To do this, we need 1-2 columns that identify each unique county in each 
#We'll use county_name and state_name
#The county base map has some words (e.g. County, Parish etc.) and symbols (the . after St.) that our data doesn't have - we have to clean it up, or we can't merge the dataframes
counties1 <- counties %>% 
  #remove unecessary words and the extra space 
  mutate(county_name = str_remove_all(county_name, " County| Parish| Borough| Census Area| Municipality")) %>% 
  #get rid of the period using gsub
  mutate(county_name = gsub("[.]", "", county_name)) 

#the column names need to match for this method - only statename and state_name don't match in the two dataframes
colnames(data)[3] <- "state_name"

#merge the data! Always put the sf object first, or you will lose the geogrpahic information in the merge
data_merged <- merge(counties1, data, by = c("county_name", "state_name"))

#map it!
ggplot()+
  geom_sf(counties, mapping = aes(), fill = "gray", color = "white")+
  geom_sf(data_merged, mapping = aes(fill = poverty_rate), color = "white")+
  theme_map()+
  scale_fill_gradient(low = "white", high = "tomato3")+
  labs(title = "Poverty Rate in US Counties, 2000",
       fill = "Poverty Rate")+
  theme(plot.title = element_text(hjust = 0.5))

We’re still missing a few counties! We’d need to go in manually to check which counties aren’t merging properly. This is a fairly tedious process. Why don’t we try mapping this at the state level instead?

states <- get_urbn_map("states", sf = TRUE)

#aggregate poverty data to the state level
pov_data <- data %>% select(state_name, poverty_rate) %>% 
  group_by(state_name) %>% 
  summarise(mean_poverty = mean(poverty_rate))

#oops, the District of Columbia doesn't match up in both dataframes - the Of is capitalized in pov_data but not in states, let's fix it
pov_data[9,1] <- "District of Columbia"

#now merge it
pov_merged <- merge(states, pov_data, by = "state_name")
old-style crs object detected; please recreate object with a recent sf::st_crs()
#Make the final map
ggplot()+
  geom_sf(states, mapping = aes(), fill = "gray", color = "white")+
  geom_sf(pov_merged, mapping = aes(fill = mean_poverty), color = "white")+
  theme_map()+
  scale_fill_gradient(low = "white", high = "tomato3")+
  labs(title = "Poverty Rate in US States, 2000",
       fill = "Mean Poverty Rate")+
  theme(plot.title = element_text(hjust = 0.5))

states
Simple feature collection with 51 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -2600000 ymin: -2363000 xmax: 2516374 ymax: 732352.2
old-style crs object detected; please recreate object with a recent sf::st_crs()
Projected CRS: NAD27 / US National Atlas Equal Area
First 10 features:
old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()
   state_fips state_abbv  state_name                       geometry
1          01         AL     Alabama MULTIPOLYGON (((1150023 -15...
2          04         AZ     Arizona MULTIPOLYGON (((-1386136 -1...
3          08         CO    Colorado MULTIPOLYGON (((-786661.9 -...
4          09         CT Connecticut MULTIPOLYGON (((2156197 -83...
5          12         FL     Florida MULTIPOLYGON (((1953691 -20...
6          13         GA     Georgia MULTIPOLYGON (((1308636 -10...
7          16         ID       Idaho MULTIPOLYGON (((-1357097 78...
8          18         IN     Indiana MULTIPOLYGON (((1042064 -71...
9          20         KS      Kansas MULTIPOLYGON (((-174904.2 -...
10         22         LA   Louisiana MULTIPOLYGON (((1075669 -15...

Resources

Data: Mapping Inequality, Redlining New Deal in America. https://dsl.richmond.edu/panorama/redlining/

Data: Opportunity Insights. https://opportunityinsights.org/

Urban Institute (2022). Introduction to Mapping. Retrieved from: https://urbaninstitute.github.io/r-at-urban/mapping.html.

Kahle, D., Wickham, H., Jackson, S. & Korpela, M. (2023). Ggmap. Retrieved from: https://cran.r-project.org/web/packages/ggmap/ggmap.pdf. Read More: https://github.com/dkahle/ggmap

---
title: "Topic 5: Mapping"
output: html_notebook
---
Welcome to our mapping unit! Mapping in R is easy if you know how to use ggplot2, as most maps will just build on our ggplots by adding an additional geom layer that is specific to spatial data. 

In R, most spatial analysis uses the SF package, which stands for Simple Features. This is the type of data that we will store spatial information in - if you've worked with GIS, this is what R calls a shapefile. The main difference between an SF object and a dataframe is that an SF object has a column at the end called "Geometry" that specifies the spatial information for each row of the object.

We'll explore mapping in R using historical redlining data in US cities from the Mapping Inequality project at the University of Richmond. See below for a description of the Redlining grades made by the Home Owners' Loan Corporation:

"HOLC staff members, using data and evaluations organized by local real estate professionals—lenders, developers, and real estate appraisers—in each city, assigned grades to residential neighborhoods that reflected their "mortgage security" that would then be visualized on color-coded maps. Neighborhoods receiving the highest grade of "A"—colored green on the maps—were deemed minimal risks for banks and other mortgage lenders when they were determining who should received loans and which areas in the city were safe investments. Those receiving the lowest grade of "D," colored red, were considered "hazardous.""

Redlining in urban areas across the US is responsible for systematically excluding the poor, families of color, immigrants, and other minoritized groups from accessing homeownership. In some places, this caused widespread housing insecurity for these households, as they had no choice but to rent low-quality housing that was available in their neighborhoods. Excluding entire groups from homeownership has also limited their access to building intergenerational wealth. A significant racial wealth gap persists today, even as families of color have experienced increased incomes and access to opportunity - the legacies of redlining are widespread. 
```{r}
library(sf)     
library(dplyr)   
library(spData) 
library(urbnmapr)
library(ggplot2)
library(sp)
library(ggmap)
library(ggthemes)
library(stringr)

setwd("~/Binghamton/harp325/shapefile")
map <- st_read("holc_ad_data.shp")

#load the county base map
counties <- get_urbn_map("counties", sf = TRUE)

#Here, I'm changing the CRS of both layers so that they match
#EPSG codes specify the projection of the map
#You can find some state-level epsg codes here: https://github.com/veltman/d3-stateplane
#The espg code I've chosen is for the entire US, and is fine for this map
#I'd definitely pick a different projection if I were using this in a paper or assignment

map <- map %>% st_transform("EPSG:4269")
counties <- counties %>% st_transform("EPSG:4269")


ggplot()+
  geom_sf(counties, mapping = aes(), fill = "gray", color = "white")+
  geom_sf(map, mapping = aes(), fill = "red", color = "red")+
  theme_minimal()
```

Ok, clearly we are zoomed out way too far! Let's pick a city to plot. I'll start with my hometown, Rochester, NY. We'll do an activity after this so that you can plot a city of your choice as well. 
```{r}
#I need to include this option to turn off spherical geometry
#don't worry about this too much - R won't let us focus on one city unless we do
sf_use_s2(FALSE)

#I'll grab rochester's county, Monroe, for plotting
rochester <- counties %>% filter(county_name == "Monroe County" & state_name == "New York")

#we need to also filter the redlining layer by Rochester - the simplest way to do this is with brackets, like this:
#this only works because we turned off spherical geometry
roc_redline <- map[rochester, ]

#Before I map it, I want to check the dimensions of the redlining layer, so I can zoom in my map 

roc_redline

```

```{r}
#let's map it using those coordinates!
ggplot()+
  geom_sf(rochester, mapping = aes(), fill = "gray80")+
  geom_sf(roc_redline, mapping = aes(fill = holc_grade), color = "white")+
  theme_minimal()+
  labs(fill = "Redline Score", title = "Redlining in Rochester, NY")+
  theme(plot.title = element_text(hjust = 0.5))+
  ylim(43.1, 43.3)+
  xlim(-77.7, -77.5)

```

We can do much better than this in R! One issue is that there ins't a base map here - if you're familiar with the area, it would be helpful to see the locations of streets and major highways. We can do this using the ggmap package.

```{r}
#set lat/long of base map
LAT1 = 43.1
LAT2 = 43.3
LON1 = -77.7
LON2 = -77.5

#create basemap - I like the terrain-lines map in ggmap
#you can play around with it on your own. Other basemaps can be found in ?get_stamenmap
#first, set lat/long coordinates of base map
base_map <- get_stamenmap(bbox = c(left = LON1, right = LON2, 
                                   bottom = LAT1, top = LAT2),
                          #this zoom works well for a smaller city
                          #I recommend starting at 10 and increasing by 1 to find                             the best zoom option
                          zoom = 12, 
                          #pick the map type you want
                          maptype = "terrain-lines")

#let's make a nice color palette for the legend
colors <- c("seagreen", "seagreen1", "yellow", "tomato")

#use the ggmap function instead of ggplot2 to map the base map
ggmap(base_map)+
  geom_sf(roc_redline, mapping = aes(fill = holc_grade), color = "white",
  #we need to include inheret.aes=F here or our geom_sf layer won't work
  inherit.aes = FALSE, alpha = 0.6)+
  theme_minimal()+
  labs(fill = "Redline Score", title = "Redlining in Rochester, NY")+
  theme(plot.title = element_text(hjust = 0.5))+
  scale_fill_manual(values = colors)

```
Before we move on to the activity, I want to also show you how to map data at the national level. I'll illustrate this using the inequality data that we've used previously from Opportunity Insights.

```{r}
setwd("~/binghamton/geog380")
data <- read.csv("county_data.csv")

#We'll need to merge this dataset with the county base map
#Because it doesn't have any geographic info of its own to use to map it
#To do this, we need 1-2 columns that identify each unique county in each 
#We'll use county_name and state_name
#The county base map has some words (e.g. County, Parish etc.) and symbols (the . after St.) that our data doesn't have - we have to clean it up, or we can't merge the dataframes
counties1 <- counties %>% 
  #remove unecessary words and the extra space 
  mutate(county_name = str_remove_all(county_name, " County| Parish| Borough| Census Area| Municipality")) %>% 
  #get rid of the period using gsub
  mutate(county_name = gsub("[.]", "", county_name)) 

#the column names need to match for this method - only statename and state_name don't match in the two dataframes
colnames(data)[3] <- "state_name"

#merge the data! Always put the sf object first, or you will lose the geogrpahic information in the merge
data_merged <- merge(counties1, data, by = c("county_name", "state_name"))

#map it!
ggplot()+
  geom_sf(counties, mapping = aes(), fill = "gray", color = "white")+
  geom_sf(data_merged, mapping = aes(fill = poverty_rate), color = "white")+
  theme_map()+
  scale_fill_gradient(low = "white", high = "tomato3")+
  labs(title = "Poverty Rate in US Counties, 2000",
       fill = "Poverty Rate")+
  theme(plot.title = element_text(hjust = 0.5))

```

We're still missing a few counties! We'd need to go in manually to check which counties aren't merging properly. This is a fairly tedious process. Why don't we try mapping this at the state level instead?

```{r}
states <- get_urbn_map("states", sf = TRUE)

#aggregate poverty data to the state level
pov_data <- data %>% select(state_name, poverty_rate) %>% 
  group_by(state_name) %>% 
  summarise(mean_poverty = mean(poverty_rate))

#oops, the District of Columbia doesn't match up in both dataframes - the Of is capitalized in pov_data but not in states, let's fix it
pov_data[9,1] <- "District of Columbia"

#now merge it
pov_merged <- merge(states, pov_data, by = "state_name")

#Make the final map
ggplot()+
  geom_sf(states, mapping = aes(), fill = "gray", color = "white")+
  geom_sf(pov_merged, mapping = aes(fill = mean_poverty), color = "white")+
  theme_map()+
  scale_fill_gradient(low = "white", high = "tomato3")+
  labs(title = "Poverty Rate in US States, 2000",
       fill = "Mean Poverty Rate")+
  theme(plot.title = element_text(hjust = 0.5))
```
```{r}
#This projection looks nice! What is it?

states
```

Resources

Data: Mapping Inequality, Redlining New Deal in America. https://dsl.richmond.edu/panorama/redlining/

Data: Opportunity Insights. https://opportunityinsights.org/ 

Urban Institute (2022). Introduction to Mapping. Retrieved from: https://urbaninstitute.github.io/r-at-urban/mapping.html. 

Kahle, D., Wickham, H., Jackson, S. & Korpela, M. (2023). Ggmap. Retrieved from: https://cran.r-project.org/web/packages/ggmap/ggmap.pdf. 
Read More: https://github.com/dkahle/ggmap 


