This week, Emily Badger and Darla Cameron published a post at The Washington Post’s Wonkblog based on data from the Federal Housing Finance Agency that suggests that housing prices have increased more in central urban areas than in suburban areas. Kyle Walker, one of my very favorite R/visualization people, published a post on his blog showing how to make the same kind of maps from the original WashPo post using R and ggplot2. Kyle made a map in his post for Dallas-Fort Worth, his metro area (and my hometown!) and that made me want to reproduce the map for the city where I now call home, Salt Lake City.
Kyle provided his entire script so there is not much to do other than follow along. The data itself is available from the Federal Housing Finance Agency as an Excel file here.
library(readxl)
library(dplyr)
orig <- read_excel('HPI_AT_ZIP5.xlsx', skip = 6)
averages <- orig %>%
select(zip = `Five-Digit ZIP Code`, year = Year, change = `Annual Change (%)`) %>%
filter(year > 1989, change != '.') %>%
mutate(change = as.numeric(change)) %>%
group_by(zip) %>%
summarize(avg = mean(change, na.rm = TRUE)) %>%
mutate(avgf = cut(avg, breaks = c(-100, 0:5, 100), right = FALSE,
labels = c('Decrease', '0', '1', '2', '3', '4', '+5%')))
We have the data now, and the next step is to find the zip codes, roads, city, etc. that make up where I live. I am using Kyle’s wonderful tigris package for this, just as he shows.
library(tigris)
library(sp)
library(maptools)
zips <- zctas(cb = TRUE)
ctys <- counties('UT', cb = TRUE)
slc_metro <- ctys[ctys$NAME == 'Salt Lake', ]
over_zips <- bind_rows(over(slc_metro, zips, returnList = TRUE))
slc_zips <- spTransform(zips[zips$ZCTA5CE10 %in% over_zips$ZCTA5CE10, ],
CRS("+init=epsg:26912"))
pri <- spTransform(primary_roads(), CRS("+init=epsg:26912"))
cities <- places('UT', cb = TRUE)
slc_cities <- spTransform(cities[cities$NAME %in% c('Salt Lake City', 'Draper', 'Holladay',
'South Jordan', 'West Valley City'), ],
CRS("+init=epsg:26912"))
slc_cities$long <- coordinates(slc_cities)[,1]
slc_cities$lat <- coordinates(slc_cities)[,2]
slc_averages <- averages[averages$zip %in% slc_zips$ZCTA5CE10, ]
Now we can work on making the cool circular map, like Kyle shows us based on the Wonkblog originals.
library(rgeos)
library(raster)
library(rgdal)
radius <- ( (bbox(slc_zips)[3] - bbox(slc_zips)[1]) / 2 )
circle <- gBuffer(gCentroid(slc_zips), width = radius, quadsegs = 500)
slc_clipped <- gIntersection(slc_zips, circle, byid = TRUE, id = slc_zips$ZCTA5CE10)
slc_clipped$id <- row.names(slc_clipped)
gClip <- function(shp, bb){
if(class(bb) == "matrix") b_poly <- as(extent(as.vector(t(bb))), "SpatialPolygons")
else b_poly <- as(extent(bb), "SpatialPolygons")
proj4string(b_poly) <- proj4string(shp)
gIntersection(shp, b_poly, byid = T)
}
pri_clipped <- gClip(pri, circle)
pri_clipped$id <- 1:length(pri_clipped)
Let’s check out what this looks like, before we use the FHFA data.
plot(circle)
plot(slc_clipped, add = TRUE)
plot(pri_clipped, add = TRUE, col = 'red')
Yep, that looks like home. Now, on to the real map!
library(ggplot2)
library(viridis)
library(extrafont)
library(ggthemes)
library(ggsn)
slc <- fortify(slc_clipped, region = 'id')
circlef <- fortify(circle)
roadsf <- fortify(pri_clipped)
ggplot() +
geom_polygon(data = circlef, aes(x = long, y = lat, group = group),
fill = '#eaeaea') +
geom_map(data = slc_averages, map = slc,
aes(fill = avgf, map_id = zip)) +
geom_path(data = roadsf, aes(x = long, y = lat, group = group),
color = 'white', size = 0.5) +
geom_text(data = slc_cities@data, aes(x = long, y = lat, label = NAME),
color = 'black', fontface = 'bold', family = 'Tahoma') +
theme_map(base_family = 'Tahoma', base_size = 14) +
coord_equal() +
theme(legend.position = 'top',
legend.key = element_blank()) +
scale_fill_manual(values = viridis(7)[2:7],
guide = guide_legend(nrow = 1, direction = 'horizontal',
label.hjust = 0, label.position = 'bottom',
keywidth = 5.51, keyheight = 0.75, title = "")) +
labs(title = "Home prices in Salt Lake City, UT",
subtitle = "Annual average change in home prices from 1990 to 2015",
caption = "Source: Federal Housing Finance Agency") +
scalebar(data = circlef, dist = 10, location = "bottomleft")
We are missing some data in a few central zip codes, just like Kyle found in making his maps. The funny shaped gray area in the northwest is actually the Great Salt Lake, so no homes there to go up or down in price! There is less of a dramatic difference in Salt Lake City between the outer suburbs and the inner urban core compared to the Dallas-Fort Worth area or the other metropolitan areas shown in the Wonkblog post. More of the SLC area has increased ~5% or so annually over the past 25 years than the other cities, and no zip code has shown no increase (~0%).