You learned in an earlier lesson how to produce the table of fair market rent data, shown below, for each major ZIP code in Rutherford County, Tennessee. The table is designed to help would-be renters get an idea of where they might find a rental home they can afford.
Rutherford FMR, by size and ZIP
ZIP | Studio | BR1 | BR2 | BR3 | BR4 | ZIP_Average | Rent_Category |
---|---|---|---|---|---|---|---|
37037 | 1660 | 1710 | 1920 | 2410 | 2940 | 2128 | Above average |
37086 | 1580 | 1620 | 1820 | 2290 | 2790 | 2020 | Above average |
37128 | 1510 | 1550 | 1740 | 2190 | 2670 | 1932 | Above average |
37129 | 1420 | 1460 | 1640 | 2060 | 2510 | 1818 | Above average |
37153 | 1410 | 1450 | 1630 | 2040 | 2490 | 1804 | Above average |
37167 | 1290 | 1330 | 1490 | 1870 | 2280 | 1652 | Below average |
37085 | 1260 | 1290 | 1450 | 1820 | 2210 | 1606 | Below average |
37127 | 1240 | 1270 | 1430 | 1800 | 2190 | 1586 | Below average |
37130 | 1180 | 1210 | 1360 | 1710 | 2080 | 1508 | Below average |
37132 | 1180 | 1210 | 1360 | 1710 | 2080 | 1508 | Below average |
37118 | 1100 | 1130 | 1270 | 1590 | 1960 | 1410 | Below average |
37149 | 1100 | 1130 | 1270 | 1590 | 1960 | 1410 | Below average |
But even long-time residents of Rutherford County would probably have a tough time describing where these ZIP codes are. I couldn’t do it, and I’ve lived here for nearly three decades.
What would-be renters really need is a map like the one below. The map shows each ZIP code, with each one shaded to indicate whether its average rent is above or below the overall average. Even better, if you click on a ZIP code, you can see the fair market rent for each rental home size within the ZIP code. You also can move the map around and zoom it in or out. And if you click the three stacked squares on the left side of the map, you can choose different base maps offering everything from details about place and street names to satellite imagery.
The map reveals a pattern that isn’t necessarily evident in the table: Rents are cheapest in the northeast quadrant of Rutherford County, around MTSU’s campus, and in the Smyrna area near the border with Metro Nashville / Davidson County. Later in this series, we’ll explore some possible explanations for this pattern.
For now, let’s step through the process of making this map using R.
Equipping R to work with maps will require installing and loading four packages you haven’t seen until now. Chief among these is the mapview package. Other mapping packages for R offer a wider range of capabilities, but mapview is one of the easiest to use.
The other three new packages are sf, leafpop, and RColorBrewer. The sf package, short for Simple Features, lets R read and work with geospatial datasets like the one we’ll use to make the map. The leafpop package lets R improve the looks of mapview map pop-up windows. Finally, the RColorBrewer package lets you use any of about three dozen color palettes for your map.
This code will install (if needed) and load the mapview, sf, leafpop, and RColorBrewer packages, along with the packages you have already encountered in previous lessons: tidyverse, for wrangling data, and gtExtras, for making tables.
if (!require("tidyverse"))
install.packages("tidyverse")
if (!require("gtExtras"))
install.packages("gtExtras")
if (!require("leafpop"))
install.packages("leafpop")
if (!require("sf"))
install.packages("sf")
if (!require("mapview"))
install.packages("mapview")
if (!require("RColorBrewer"))
install.packages("RColorBrewer")
library(tidyverse)
library(gtExtras)
library(sf)
library(mapview)
library(leafpop)
library(RColorBrewer)
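If you would rather see what is missing before installing anything, a short sketch like this one reports which of the six packages R can't find on your computer. The needed and missing_pkgs names are my own illustrative choices, not part of the lesson's script.

```r
# Packages this lesson relies on
needed <- c("tidyverse", "gtExtras", "leafpop", "sf", "mapview", "RColorBrewer")

# requireNamespace() returns TRUE if a package is installed,
# without attaching it the way library() does
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that would still need install.packages()
missing_pkgs <- needed[!installed]
missing_pkgs
```

An empty result, character(0), means everything is already installed.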
The end of the last lesson involved saving some analyzed small-area fair market rent data as a comma-separated value file in your R project folder on your computer. If you have reopened the project, so that you are working in the same folder you saved the data file to, this code will use the read_csv() function to retrieve the file into an R data frame called FMR_RuCo:
FMR_RuCo <- read_csv("FMR_RuCo.csv")
If the file is unavailable for some reason, you can use this code to retrieve a copy of it from my GitHub page:
FMR_RuCo <- read_csv("https://raw.githubusercontent.com/drkblake/Data/refs/heads/main/FMR_RuCo.csv")
This code, from the earlier lesson, will make and display the “Rutherford FMR, by size and ZIP” data table shown above. Alternatively, you could simply click on the FMR_RuCo data frame in RStudio’s “Environment” tab and look at the data there.
FMR_RuCo_table <- gt(FMR_RuCo) %>%
tab_header("Rutherford FMR, by size and ZIP") %>%
cols_align(align = "left") %>%
gt_theme_538
FMR_RuCo_table
The U.S. Census Bureau’s Cartographic Boundary Files page offers downloadable map files for a range of geographic areas. The page provides the files in both shapefile and kml formats. The kml format is used mainly by Google geospatial tools, like Google Earth and Google Maps. We’ll be working with the shapefile format, which is the more widely used format of the two. In both cases, the files have been compressed into a .zip file for easy handling and downloading.
Some details. The most recent ZIP code map is available under the page’s 2020 tab, all the way at the bottom of the page, under the heading “ZIP Code Tabulation Areas (ZCTAs).” There’s no need to try to download it from there, though. The code below will do that for you. It also will decompress the file. A shapefile is made up of about half a dozen component files. Usually, all begin the same way. In this case, they all start with cb_2020_us_zcta520_500k. But they end with different extensions like .cpg, .dbf, .prj, .xml, .shp, and so on. Don’t bother trying to click on and open any of them. Your computer probably doesn’t have the requisite software, and there’s not much to see in any case. Shortly, you’ll run some R code that will take care of everything.
Some caveats. The Census Bureau’s ZCTA boundaries don’t always match ZIP code boundaries perfectly. They are usually more like close approximations. Furthermore, some ZIP codes do not have a corresponding ZCTA. For example, the 37131 ZIP code is reserved for the former State Farm Insurance regional office complex just north of the Memorial Boulevard and Dejarnette Lane intersection. The complex is presently vacant. But when State Farm ran its regional headquarters out of the complex, the U.S. Postal Service found it handy to assign the complex its own ZIP code. Because nobody lives in the ZIP code, though, there is no way to estimate its typical rent, or much of anything else about it. In the Census map, it is simply a part of the 37130 ZCTA.
The code’s functions. The code below uses the download.file()
function to retrieve
the .zip file and store it in your computer’s current R project folder
as a file called “ZCTAs2020.zip.” Next, it uses the unzip()
function to extract and store
the seven files inside. Both download.file() and unzip() are in Base R.
Finally, the code loads the map file into R as a data frame called
“ZCTAMap” by applying the sf package’s read_sf()
function to
cb_2020_us_zcta520_500k.shp, one of the seven extracted files.
# Downloading the file
download.file("https://www2.census.gov/geo/tiger/GENZ2020/shp/cb_2020_us_zcta520_500k.zip","ZCTAs2020.zip")
# Unzipping the file
unzip("ZCTAs2020.zip")
# Loading the file into R as "ZCTAMap"
ZCTAMap <- read_sf("cb_2020_us_zcta520_500k.shp")
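The ZCTA file is large, so you may not want to download it again every time you rerun the script. Here is a minimal sketch of one way to skip the download when a copy is already in the folder. The fetch_once() helper is hypothetical (my own name, not a function from any package), and the demonstration uses an empty stand-in file and a placeholder URL so that it runs without touching the network.

```r
# Hypothetical helper: download a file only if it isn't already on disk
fetch_once <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("Using existing copy of ", destfile)
    return(invisible(FALSE))   # FALSE = nothing was downloaded
  }
  # mode = "wb" writes the file in binary mode, which .zip files need
  download.file(url, destfile, mode = "wb")
  invisible(TRUE)              # TRUE = a fresh copy was downloaded
}

# Demonstration with a stand-in file, so no download is attempted
demo_file <- tempfile(fileext = ".zip")
file.create(demo_file)
downloaded <- fetch_once("https://example.com/not-used.zip", demo_file)
downloaded
```

In the lesson's script, the equivalent call would pass the Census URL and "ZCTAs2020.zip" to fetch_once() before running unzip().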
The “Environment” tab in RStudio’s upper-right window now contains both the FMR_RuCo and the ZCTAMap data frames.
Click the ZCTAMap data frame, and you’ll see that the “geometry” column, all the way to the right, contains “MULTIPOLYGON (((” followed by what looks like a negative longitude coordinate, because that’s exactly what it is. It’s the first in a series of longitude and latitude coordinates for points that, when strung together, make up the boundaries for the row’s ZIP code.
The file contains a whopping 33,791 rows, one for each ZCTA in the United States. The first column, ZCTA5CE20, contains the ZIP code’s five digits. The first is 15301, which happens to be a ZIP code in Southwestern Pennsylvania.
All we have to do is put the two files together. But not just in any old way. We need each ZIP code’s rent information to line up with the geometry column data that defines its boundaries. We also need to get rid of ZCTAMap rows that don’t match one of the dozen Rutherford County ZIP codes we want to map.
R can do all of this with only the following three lines of code. Scroll down for an explanation of what each line does.
# Making ZIP a character variable
FMR_RuCo$ZIP <- as.character(FMR_RuCo$ZIP)
# Joining the files
FMR_RuCo_Map <- left_join(FMR_RuCo, ZCTAMap, by = c("ZIP" = "ZCTA5CE20"))
# Dropping unneeded columns
FMR_RuCo_Map <- FMR_RuCo_Map %>%
select(-c(AFFGEOID20, GEOID20, NAME20, LSAD20, ALAND20, AWATER20))
Making ZIP a character variable. We’re going to join
the files by matching the ZIP codes in the FMR_RuCo data frame’s “ZIP”
column with the ZIP codes in the ZCTAMap’s “ZCTA5CE20” column. But
there’s a problem. R thinks the ZIP codes in the “ZIP” column are
numbers but thinks the ZIP codes in the “ZCTA5CE20” column are
characters. To R, that means the two columns don’t match. The code fixes
the problem by applying the as.character()
function from base R to
the ZIP column in FMR_RuCo. Doing so changes the ZIP column’s contents
from numbers to characters.
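To see the mismatch for yourself, here is a small sketch using two made-up data frames (fmr and zcta are illustrative stand-ins, not the lesson's real files). It uses only base R.

```r
# Stand-in data: ZIP stored as numbers in one place, as text in the other
fmr  <- data.frame(ZIP = c(37037, 37086))
zcta <- data.frame(ZCTA5CE20 = c("37037", "37086"))

class(fmr$ZIP)          # "numeric"
class(zcta$ZCTA5CE20)   # "character"

# Same digits, different types, so R does not consider them identical
before <- identical(fmr$ZIP, zcta$ZCTA5CE20)

# Convert the numeric column to character, as the lesson's code does
fmr$ZIP <- as.character(fmr$ZIP)
after <- identical(fmr$ZIP, zcta$ZCTA5CE20)

c(before = before, after = after)   # FALSE, then TRUE
```

One caution: this conversion is safe for Rutherford County's ZIP codes, but ZIP codes that begin with zero lose the leading zero as soon as they are read in as numbers, and as.character() will not put it back.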
Joining the files. Here, the left_join()
function from the dplyr package
(which is included in the tidyverse package) looks at each ZIP code in the
“ZIP” column in FMR_RuCo and searches the ZCTA5CE20 column in ZCTAMap
for a match. When it finds one, it adds the matching row from ZCTAMap to
the corresponding FMR_RuCo row. The order is important. When you use the
left_join()
function, R keeps all rows in the data frame on
the left but discards all unmatched rows in the data frame on the right.
The code stores the merged data in a new data frame called
FMR_RuCo_Map.
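A toy example may make the join's behavior easier to picture. The rents and shapes data frames below are made up for illustration, and the ALAND20 values are placeholders rather than real land areas.

```r
library(dplyr)

# Left side: the two ZIP codes we care about
rents <- tibble(ZIP = c("37037", "37086"), BR2 = c(1920, 1820))

# Right side: includes an extra ZCTA (15301) that has no match on the left
shapes <- tibble(ZCTA5CE20 = c("37037", "37086", "15301"),
                 ALAND20   = c(1, 2, 3))   # placeholder values

joined <- left_join(rents, shapes, by = c("ZIP" = "ZCTA5CE20"))
joined
```

The result has two rows, one per ZIP code in rents. The 15301 row from shapes is gone because nothing on the left matched it.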
Dropping unneeded columns. The extra columns from ZCTAMap that the left_join() operation put in FMR_RuCo_Map could stay right where they are. The worst thing they can do is take up more disk space than the file truly needs. But it’s easy to delete them. This line of code does so using the dplyr package’s select() function. Note the use of c() to specify the list of variable names to be deleted and the -, which tells R to delete the variables listed. If you omitted the -, R would keep the variables listed and delete all the rest.
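Here is the difference the minus sign makes, sketched on a one-row stand-in data frame (toy is made up for illustration).

```r
library(dplyr)

toy <- tibble(ZIP = "37037", BR2 = 1920, GEOID20 = "37037", NAME20 = "37037")

# With the minus sign: drop the listed columns, keep everything else
dropped <- toy %>% select(-c(GEOID20, NAME20))

# Without the minus sign: keep only the listed columns
kept <- toy %>% select(c(GEOID20, NAME20))

names(dropped)   # "ZIP" "BR2"
names(kept)      # "GEOID20" "NAME20"
```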
Here’s the part I assume you’ve been waiting for: Actually making and displaying the map. We’ve been making the map all along, of course. Had we not completed each of the steps described above, this code would fail to work. But this is the part that brings it all together.
# Converting FMR_RuCo_Map
FMR_RuCo_Map <- st_as_sf(FMR_RuCo_Map)
# Making the map
Rent_Category_Map <- mapview(
FMR_RuCo_Map,
zcol = "Rent_Category",
layer.name = "Rent category",
popup = popupTable(
FMR_RuCo_Map,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")))
# Showing the map
Rent_Category_Map
Converting FMR_RuCo_Map. The first line, FMR_RuCo_Map <- st_as_sf(FMR_RuCo_Map), uses the sf package’s st_as_sf() function to turn FMR_RuCo_Map into a file that mapview can understand as a “geospatial” file capable of being mapped. It applies the st_as_sf() function to the current version of FMR_RuCo_Map, then overwrites FMR_RuCo_Map with the results. If you open FMR_RuCo_Map in RStudio, it will look like nothing has changed. But R sees a difference, and R has to see that difference before it can use FMR_RuCo_Map in the upcoming mapview() function.
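You can ask R directly whether it sees that difference. This sketch builds a one-row stand-in data frame with a geometry column (the point's coordinates are arbitrary, chosen only for illustration) and checks its class before and after conversion.

```r
library(sf)

# An ordinary data frame that happens to carry a geometry column
toy <- data.frame(ZIP = "37130")
toy$geometry <- st_sfc(st_point(c(-86.36, 35.85)), crs = 4326)

is_sf_before <- inherits(toy, "sf")      # FALSE: just a data frame so far

# st_as_sf() finds the geometry column and marks the object as sf
toy_sf <- st_as_sf(toy)
is_sf_after <- inherits(toy_sf, "sf")    # TRUE: now mapview can use it

c(before = is_sf_before, after = is_sf_after)
```

Running inherits(FMR_RuCo_Map, "sf") before and after the lesson's conversion line should show the same change.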
Making the map. This code creates an object - a map called Rent_Category_Map - by applying the mapview package’s mapview() function to the FMR_RuCo_Map sf object we just created. Key arguments for the function include:
Naming the sf object (here, it’s specified as FMR_RuCo_Map).
Using zcol = to indicate which column in the sf object should be used to shade the map’s regions (here, it’s the Rent_Category column in FMR_RuCo_Map).
Specifying “Rent category” as the title of the map key by setting layer.name = to “Rent category.”
Specifying (once again) FMR_RuCo_Map as the sf object in the popup = popupTable() argument. If the sf object specified here doesn’t match the one specified earlier in the code, the code will fail. The popupTable() function is from the leafpop package that you installed and loaded way back at the beginning of the script.
Listing, in the zcol = c() argument, the names of the columns we want to appear in each popup window. Note that each column name has to be enclosed in quotes and separated with a comma. Make sure you spell and capitalize the column names correctly. If there’s a mismatch, the code will fail.
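Because a misspelled or miscapitalized column name is such an easy mistake, it can help to check the names before building the map. This is a minimal sketch using base R's setdiff() on a one-row stand-in data frame. The toy, popup_cols, and typo_cols names are my own.

```r
# Columns we intend to show in the popup
popup_cols <- c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")

# Stand-in for FMR_RuCo_Map, using the first row of the lesson's table
toy <- data.frame(ZIP = "37037", Studio = 1660, BR1 = 1710,
                  BR2 = 1920, BR3 = 2410, BR4 = 2940)

# setdiff() returns the requested names that are NOT in the data frame
not_found <- setdiff(popup_cols, names(toy))
not_found          # character(0): every name matches

# A capitalization slip shows up immediately
typo_cols <- c("ZIP", "br3")
not_found_typo <- setdiff(typo_cols, names(toy))
not_found_typo     # "br3"
```

Swapping FMR_RuCo_Map in for toy would run the same check against the real data.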
Showing the map. Finally, putting the name of the map in a line of code all by itself, Rent_Category_Map, tells R to display the map. If you omitted Rent_Category_Map <- from the start of the preceding block of code and went straight into the mapview() function, R would show the map without pausing to give it a name. I like giving the map a name, though, in case I want to display it again somewhere later in whatever script I’m working on.
Running the code above will produce the map shown at the top of the page. But you can easily make other versions of the map by tweaking the zcol = and layer.name = lines.
Mapping by ZIP code. Suppose, for example, you wanted the map key to identify each ZIP code:
# Mapping by ZIP code
ZIP_Map <- mapview(
FMR_RuCo_Map,
zcol = "ZIP",
layer.name = "ZIP code",
popup = popupTable(
FMR_RuCo_Map,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")))
# Showing the map
ZIP_Map
Mapping by average rent. Or, you could map by each ZIP code’s average rent. Like this:
# Mapping by average rent
Avg_Rent_Map <- mapview(
FMR_RuCo_Map,
zcol = "ZIP_Average",
layer.name = "Average rent",
popup = popupTable(
FMR_RuCo_Map,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")))
# Showing the map
Avg_Rent_Map
Mapping by three-bedroom rent. Or, you could map by the fair market rent for a particular size of rental unit. Let’s go with the fair market rent for a three-bedroom unit:
# Mapping by three-bedroom rent
BR3_Map <- mapview(
FMR_RuCo_Map,
zcol = "BR3",
layer.name = "Three-bedroom rent",
popup = popupTable(
FMR_RuCo_Map,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")))
# Showing the map
BR3_Map
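The three variations above differ only in their zcol = and layer.name = values, so the shared code could be wrapped in a small function. What follows is a sketch, not part of the lesson's script: make_rent_map() and its argument names are hypothetical, and it assumes the data you pass in is an sf object containing the popup columns listed.

```r
library(sf)
library(mapview)
library(leafpop)

# Hypothetical helper: build a rent map shaded by any one column
make_rent_map <- function(map_data, shade_col, key_title,
                          popup_cols = c("ZIP", "Studio", "BR1",
                                         "BR2", "BR3", "BR4")) {
  mapview(
    map_data,
    zcol = shade_col,           # column used to shade the regions
    layer.name = key_title,     # title shown on the map key
    popup = popupTable(
      map_data,
      feature.id = FALSE,
      row.numbers = FALSE,
      zcol = popup_cols))       # columns listed in each popup
}
```

With that in place, a call like make_rent_map(FMR_RuCo_Map, "BR3", "Three-bedroom rent") would stand in for the longer block above.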
If you don’t care for the mapview package’s default colors, you can go with other colors. First, you have to install (if needed) and load (every time) the RColorBrewer package. If you’ve been using the code on this page, you installed and loaded RColorBrewer back at the beginning of the script.
With RColorBrewer loaded, all you have to do is add a line to the mapview() code. For example, adding col.regions = brewer.pal(9, "Blues"), to the three-bedroom rent map’s code gives you a blue-shaded map:
# Mapping by three-bedroom rent
BR3_Map <- mapview(
FMR_RuCo_Map,
zcol = "BR3",
col.regions = brewer.pal(9, "Blues"),
layer.name = "Three-bedroom rent",
popup = popupTable(
FMR_RuCo_Map,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")))
# Showing the map
BR3_Map
The col.regions = part is a mapview() argument. The brewer.pal() part is an RColorBrewer function. Inside the function’s parentheses, you put the number of shades you want from the palette you have chosen (the “Blues” palette offers up to nine), followed by a comma and the name of the palette in quotes (like "Blues").
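If you are curious what brewer.pal() actually hands to mapview, this short sketch prints the colors and looks up how many shades each palette offers. The brewer.pal.info table is part of RColorBrewer. The blues and palette_facts names are my own.

```r
library(RColorBrewer)

# brewer.pal() returns a character vector of hex color codes
blues <- brewer.pal(9, "Blues")
blues
length(blues)   # 9

# brewer.pal.info lists each palette's maximum number of shades
# and whether it is color-blind friendly
palette_facts <- brewer.pal.info[c("Blues", "RdYlBu"), ]
palette_facts
```

Asking for more shades than a palette's maxcolors value produces a warning, so it is worth checking the table first.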
Punching display.brewer.all()
into R will show you all
of the available palettes and the number of shades in each one:
display.brewer.all()
Meanwhile, this modified version of the same function will show you the subset of RColorBrewer palettes that are accessible for people with color blindness:
display.brewer.all(colorblindFriendly = TRUE)
Here’s the three-bedroom map again, this time with the 11-shade, color-blind-friendly “RdYlBu” palette:
# Mapping by three-bedroom rent
BR3_Map <- mapview(
FMR_RuCo_Map,
zcol = "BR3",
col.regions = brewer.pal(11, "RdYlBu"),
layer.name = "Three-bedroom rent",
popup = popupTable(
FMR_RuCo_Map,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")))
# Showing the map
BR3_Map
More can be done with the FMR_RuCo_Map data frame, so let’s save it to your computer’s hard drive for use in later lessons. This code will save the FMR_RuCo_Map data frame as a shapefile called FMR_RuCo_Map.shp. The delete_layer=TRUE argument tells R to overwrite the FMR_RuCo_Map.shp file if one happens to exist already, like from a previous save operation. Without that argument, a pre-existing version of the file would cause an error.
st_write(FMR_RuCo_Map,"FMR_RuCo_Map.shp", delete_layer=TRUE)
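One thing to expect when you read the file back in: the shapefile format limits column names to 10 characters, so sf shortens longer names such as ZIP_Average and Rent_Category when it writes the file. This sketch shows the effect on a one-row stand-in object saved to a temporary folder. The toy object, its arbitrary point location, and the toy_map.shp file name are made up for illustration.

```r
library(sf)

# Stand-in for FMR_RuCo_Map, with two column names longer than 10 characters
toy <- st_sf(
  ZIP = "37130",
  ZIP_Average = 1508,
  Rent_Category = "Below average",
  geometry = st_sfc(st_point(c(-86.36, 35.85)), crs = 4326))

# Write to a temporary folder; suppressWarnings() hides sf's notice
# that it abbreviated the field names
shp_path <- file.path(tempdir(), "toy_map.shp")
suppressWarnings(st_write(toy, shp_path, delete_layer = TRUE, quiet = TRUE))

# Read it back and compare the column names
toy_back <- read_sf(shp_path)
names(toy)
names(toy_back)
```

If the shortened names turn out to be a nuisance in later lessons, rename() from dplyr can restore the originals after reading the file.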