Lecture Content

Note: Document prepared for Spatial Socioeconometric Modeling and materials build on Lovelace, Nowosad, and Muenchow (2019) and Tennekes (2018)

1 Visualizing geocoded data

1.1 Approaches: Dynamic and Static approaches

  • The choice depends on your audience and objectives.
    • For papers and publication purposes, static may be better
    • For presentations, much better to rely on dynamic or interactive visuals
    • Big geodata may not be supported with dynamic procedures, in these cases you need to rely on static tools
  • The key element is that map making allows us to communicate the results
    • Some patterns may emerge
    • Intuition behind processes and structures may be explained more easily
    • One should be creative in deciding what to plot
  • In previous sessions we have seen how to visualize geocoded elements
    • In today’s session we will conduct dedicated mapping a more sophisticated approach to map making.
    • It can generate static and interactive maps using the same code via tmap_mode()

1.1.1 Static maps for polygons

  • Initially, static maps were the only type of maps that R could produce. This changed with tmap

  • tmap is a powerful and flexible map-making package with amazing defaults

    • It has a concise syntax that allows for the creation of attractive maps with minimal code
  • tmap is based on the idea of a ‘grammar of graphics’ (see Wilkinson (2012))

    • A separation between the input data and the aesthetics
    • Each input dataset can be ‘mapped’ in a range of different ways including location on the map
  • tm_shape()constitutes the basic building block that is followed by one or more layer elements such as tm_fill() and tm_dots()

#library(knitr)
knitr::include_graphics("https://geocompr.robinlovelace.net/08-mapping_files/figure-html/tmshape-1.png")

Spatial weights matrixs from Lovelace, Nowosad, and Muenchow (2019) * We will rely on data gathered at the ZCTA level using similar procedures in previous classes

  • The spatial object or shell is obtained from Tigris (Walker (2020)) and geolocated data from IRS and ACS

Specifically, we will rely on the following packages

install.packages("tigris")
install.pacakges("tmap")
library(spdep)
library(tigris)
options(tigris_use_cache = TRUE)
library(acs)
library(stringr)
library(tmap)
library(tmaptools)
library(plyr)

1.1.1.1 The following lines download the zip codes from PA

# For NY state you can use the function starts with
zip_PA<-zctas(starts_with = c("15", "16", "17", "18", "19"), class="sp")
## Warning in proj4string(obj): CRS object has comment, which is lost in output
  • To plot PA as a whole we can use the following
tm_shape(zip_PA) +
  tm_fill()
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Filled map. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Filled map. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

  • To plot borders of PA ZCTAs we now use the following
tm_shape(zip_PA) +
  tm_borders()
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Border map. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Border map. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

  • We can bring everything together by adding another layer now use the following
tm_shape(zip_PA) +
 tm_fill() + #base layer 
  tm_borders() # border layer
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Filled and Border map. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Filled and Border map. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

  • tm_polygons() condenses tm_fill() + tm_borders() into a single function
tm_shape(zip_PA) +
 tm_polygons()
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Filled and Border map. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Filled and Border map. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

1.1.1.2 Adding data

a <- read.csv("G:\\My Drive\\Phudcfily\\Syllabus PHUDCFILY\\GitHub PHUDCFILY\\SSEM\\Mapping and visualization\\Materials\\concentrated_disadvantage.csv")
head(a)

Do you need to dead with zip length?

names(zip_PA@data)
## [1] "ZCTA5CE10"  "GEOID10"    "CLASSFP10"  "MTFCC10"    "FUNCSTAT10"
## [6] "ALAND10"    "AWATER10"   "INTPTLAT10" "INTPTLON10"
zip_PA<-geo_join(zip_PA, a, by_sp="ZCTA5CE10", by_df="zip", how = "left")
summary(zip_PA@data[,c("pct_single_mother", "pct_african_american", "pct_eitc", "pct10kbelow")])
##  pct_single_mother pct_african_american    pct_eitc       pct10kbelow    
##  Min.   :0.00000   Min.   :0.000000     Min.   :0.0000   Min.   :0.0833  
##  1st Qu.:0.05636   1st Qu.:0.000000     1st Qu.:0.0932   1st Qu.:0.2981  
##  Median :0.08845   Median :0.005461     Median :0.1298   Median :0.3348  
##  Mean   :0.10025   Mean   :0.045204     Mean   :0.1337   Mean   :0.3361  
##  3rd Qu.:0.12449   3rd Qu.:0.035636     3rd Qu.:0.1626   3rd Qu.:0.3688  
##  Max.   :1.00000   Max.   :0.940823     Max.   :0.5250   Max.   :0.7667  
##  NA's   :29        NA's   :16           NA's   :446      NA's   :446
  • tmap easily maps missing values
    • Good practice to add legend and a contrasting color
tmap_mode("plot") #for static maps, default when loading package
## tmap mode set to plotting
map1<-
tm_shape(zip_PA) +
  tm_fill("pct_single_mother", style = "quantile", 
  n = 9, palette = "inferno", 
  title = "Pct single_mother\nper Zip Code*", 
  colorNA = "grey81") +
  tm_layout(bg.color = "grey21",
  title = "Zip Code distribution of single mother households",
  title.position = c("right", "top"), title.size = 1.1, title.color = "white",
  legend.position = c("left", "bottom"), legend.text.size = 0.85,
  legend.width = 0.25, legend.text.color = "white", legend.title.color="white") +
  tm_credits("Data source: ACS\n*Missing values contain ZCTAs with no information",
  position = c(0.25, 0.02), size = .75, col="white")+
  tm_borders(col=rgb(31,31,31,max=250,250/3))
map1 #print
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Filled and Border map. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Filled and Border map. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

tmap_mode("plot") #for static maps, default when loading package
## tmap mode set to plotting
map2<-
tm_shape(zip_PA) +
  tm_fill("pct10kbelow", style = "quantile", 
  n = 9, palette = "inferno", 
  title = "Pct below poverty line\nper Zip Code*", 
  colorNA = "grey81") +
  tm_layout(bg.color = "grey21",
  title = "Zip Code distribution of ZCTAs taxpayers below poverty line",
  title.position = c("right", "top"), title.size = 1.1, title.color = "white",
  legend.position = c("left", "bottom"), legend.text.size = 0.85,
  legend.width = 0.25, legend.text.color = "white", legend.title.color="white") +
  tm_credits("Data source: IRS\n*Missing values contain ZCTAs with no information",
  position = c(0.25, 0.02), size = .75, col="white")+
  tm_borders(col=rgb(31,31,31,max=250,250/3))
map2 #print
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Filled and Border map. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Filled and Border map. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

tmap_arrange(map1, map2)
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
## Some legend labels were too wide. These labels have been resized to 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44. Increase legend.width (argument of tm_layout) to make the legend wider and therefore the labels larger.
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
## Some legend labels were too wide. These labels have been resized to 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44. Increase legend.width (argument of tm_layout) to make the legend wider and therefore the labels larger.
Arranging maps. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Arranging maps. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

1.1.2 Dynamic or interactive procedures for polygons

To add multiple layers in the same zip codes, we need to create copies

zip_PA2<-zip_PA
summary(as.numeric(zip_PA@data$INTPTLAT10))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   38.47   40.12   40.50   40.59   41.01   42.17
summary(as.numeric(zip_PA@data$INTPTLON10)) 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -80.51  -79.44  -77.39  -77.58  -75.78  -74.74
tmap_mode("view")
html_maps<-tm_basemap("CartoDB.DarkMatter") +
tm_shape(zip_PA) +
  tm_fill("pct_single_mother", 
          style = "quantile", 
          n = 9, 
          palette = rev(inferno(256)), 
          title = "ZCTA distribution<br>of single mother households<br>in Pennsylvania", 
          colorNA = "grey81", 
          id="zip", 
          textNA = "Information not available") +
tm_shape(zip_PA2) +
  tm_fill("pct10kbelow", 
          style = "quantile", 
          n = 9, 
          palette = rev(inferno(256)), 
          title = "ZCTA distribution<br>of taxpayers below poverty line<br>in Pennsylvania", 
          colorNA = "grey81", 
          id="zip", 
          textNA = "Information not available")

html_maps+tm_view(alpha = 1, 
                  view.legend.position = c("right", "top"), 
                  set.bounds=c(-74.74, 42.17, -80.51, 38.47), 
                  set.view=(zoom=6))
Plotting polygons

Plotting polygons

2 Point mapping

This version also deals with static and dynamic mapping ## Static version for points First let’s gather point level information

a<-getwd()
url18 <- paste('http://nces.ed.gov/ipeds/datacenter/data/HD2018.zip')
download.file(url18, destfile = paste(a,"HD2018.zip",sep="/"))
d18<- read.csv(unz("HD2018.zip", "hd2018.csv"))

#Subsetting the file with institutions located in PA
d18 <-d18[d18$STABBR=="PA",] 
#Looking at the dimensions of the subsetted dataset
dim(d18)
## [1] 359  73
table(d18$SECTOR)
## 
##   0   1   2   3   4   5   6   7   8   9 
##   1  45 110   9  18  26  53  31   5  61
d18<-d18[d18$SECTOR<5,c("UNITID", "INSTNM", "SECTOR", "LONGITUD", "LATITUDE")]
d18#Print Table 
#This will create the object coords as shown in equation (17)
coords = cbind(d18$LONGITUD, d18$LATITUDE)
url18 <- paste('https://nces.ed.gov/ipeds/datacenter/data/SFA1718.zip')
download.file(url18, destfile = paste(a,"SFA1718.zip",sep="/"))
financial_aid18<- read.csv(unz("SFA1718.zip", "sfa1718.csv"))

d18$pct_pell <- financial_aid18$UPGRNTP[match(d18$UNITID,financial_aid18$UNITID)]
head(d18)
summary(d18)
##      UNITID          INSTNM              SECTOR         LONGITUD     
##  Min.   :210492   Length:183         Min.   :0.000   Min.   :-80.51  
##  1st Qu.:212788   Class :character   1st Qu.:1.500   1st Qu.:-78.84  
##  Median :214634   Mode  :character   Median :2.000   Median :-76.17  
##  Mean   :234969                      Mean   :1.989   Mean   :-76.97  
##  3rd Qu.:215806                      3rd Qu.:2.000   3rd Qu.:-75.32  
##  Max.   :484783                      Max.   :4.000   Max.   :-74.91  
##                                                                      
##     LATITUDE        pct_pell    
##  Min.   :39.81   Min.   : 0.00  
##  1st Qu.:40.04   1st Qu.:26.00  
##  Median :40.29   Median :34.00  
##  Mean   :40.46   Mean   :34.48  
##  3rd Qu.:40.66   3rd Qu.:42.00  
##  Max.   :42.13   Max.   :87.00  
##                  NA's   :24
projcrs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
public4yr <- st_as_sf(x = d18[d18$SECTOR==1,],
           coords = c("LONGITUD", "LATITUDE"),
           crs = projcrs)
privateNOPROF4yr <- st_as_sf(x = d18[d18$SECTOR==2,],
           coords = c("LONGITUD", "LATITUDE"),
           crs = projcrs)
privatePROFIT4yr <- st_as_sf(x = d18[d18$SECTOR==3,],
           coords = c("LONGITUD", "LATITUDE"),
           crs = projcrs)
public2yr <- st_as_sf(x = d18[d18$SECTOR==4,],
           coords = c("LONGITUD", "LATITUDE"),
           crs = projcrs)

Now, we are ready to use zip_PA

tmap_mode("plot")
## tmap mode set to plotting
map_pub4yr<-
 tm_shape(zip_PA) +
  tm_polygons() +
 tm_shape(public4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "red", 
             id="INSTNM", 
             alpha=.5)  
map_pub4yr
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Mapping public four-year colleges in PA

Mapping public four-year colleges in PA

tmap_mode("plot")
## tmap mode set to plotting
map_pub2yr<-
 tm_shape(zip_PA) +
  tm_polygons() +
 tm_shape(public2yr) + 
  tm_symbols(size = "pct_pell", 
             col = "pink", 
             id="INSTNM", 
             alpha=.5)  
map_pub2yr
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Mapping public two-year colleges in PA

Mapping public two-year colleges in PA

tmap_mode("plot")
## tmap mode set to plotting
map_priNoPro4yr<-
 tm_shape(zip_PA) +
  tm_polygons() +
 tm_shape(privateNOPROF4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "green", 
             id="INSTNM", 
             alpha=.5)
map_priNoPro4yr
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Mapping private non-profit four-year colleges in PA

Mapping private non-profit four-year colleges in PA

tmap_mode("plot")
## tmap mode set to plotting
map_priProFit4yr<-
 tm_shape(zip_PA) +
  tm_polygons() +
 tm_shape(privatePROFIT4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "yellow", 
             id="INSTNM", 
             alpha=.5)  
map_priProFit4yr
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Mapping private for-profit four-year colleges in PA

Mapping private for-profit four-year colleges in PA

tmap_arrange(map_pub2yr, map_pub4yr, map_priNoPro4yr, map_priProFit4yr)
## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output

## Warning in sp::proj4string(obj): CRS object has comment, which is lost in output
Arranging maps. Geographic data obtained from @tigris plotted with tmap (@Tennekes2018)

Arranging maps. Geographic data obtained from Walker (2020) plotted with tmap (Tennekes (2018))

2.1 Dynamic HTML version for points

tmap_mode("view")
html_maps<-tm_basemap("CartoDB.DarkMatter") +
tm_shape(zip_PA) +
  tm_polygons() +
tm_shape(public4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "red", 
             id="INSTNM", 
             alpha=.5) +
tm_shape(public2yr) + 
  tm_symbols(size = "pct_pell", 
             col = "pink", id="INSTNM", 
             alpha=.5) +
tm_shape(privateNOPROF4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "green", id="INSTNM", 
             alpha=.5) +
tm_shape(privatePROFIT4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "yellow", id="INSTNM", 
             alpha=.5)


html_maps+tm_view(alpha = 1, 
                  view.legend.position = c("right", "top"), 
                  set.bounds=c(-74.74, 42.17, -80.51, 38.47), 
                  set.view=(zoom=6))
Plotting points

Plotting points

2.2 Dynamic HTML version for points and polygons

html_maps<-tm_basemap("CartoDB.DarkMatter") +
tm_shape(zip_PA) +
  tm_fill("pct_single_mother", 
          style = "quantile", 
          n = 9, 
          palette = rev(inferno(256)), 
          title = "ZCTA distribution<br>of single mother households<br>in Pennsylvania", 
          colorNA = "grey81", 
          id="zip", 
          textNA = "Information not available") +
          tm_style("cobalt") +
tm_shape(zip_PA2) +
  tm_fill("pct10kbelow", 
          style = "quantile", 
          n = 9, 
          palette = rev(inferno(256)), 
          title = "ZCTA distribution<br>of taxpayers below poverty line<br>in Pennsylvania", 
          colorNA = "grey81", 
          id="zip", 
          textNA = "Information not available") + 
tm_shape(public4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "red", 
             id="INSTNM", 
             alpha=.5) +
tm_shape(public2yr) + 
  tm_symbols(size = "pct_pell", 
             col = "black", id="INSTNM", 
             alpha=.5) +
tm_shape(privateNOPROF4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "green", id="INSTNM", 
             alpha=.5) +
tm_shape(privatePROFIT4yr) + 
  tm_symbols(size = "pct_pell", 
             col = "yellow", id="INSTNM", 
             alpha=.5)

html_maps+tm_view(alpha = 1, 
                  view.legend.position = c("right", "top"), 
                  set.bounds=c(-74.74, 42.17, -80.51, 38.47), 
                  set.view=(zoom=6))
Plotting points and polygons

Plotting points and polygons

References

Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. Geocomputation with R. CRC Press. https://geocompr.robinlovelace.net/.

Tennekes, Martijn. 2018. “Tmap: Thematic Maps in R.” Journal of Statistical Software, Articles 84 (6): 1–39. https://doi.org/10.18637/jss.v084.i06.

Walker, Kyle. 2020. Tigris: Load Census Tiger/Line Shapefiles. https://CRAN.R-project.org/package=tigris.

Wilkinson, Leland. 2012. “The Grammar of Graphics.” In Handbook of Computational Statistics, 375–414. Springer.