Primary Source
Blog: USING TIDYCENSUS AND LEAFLET TO MAP CENSUS DATA
Author: Julia Silge
Original Publish Date: Jun 24, 2017
https://juliasilge.com/blog/using-tidycensus/
tidycensus source
tidycensus: https://github.com/walkerke/tidycensus
library(tidyverse)
library(tidycensus)
library(stringr)
library(leaflet)
library(sf)
Getting variables from the Census or ACS requires knowing the variable ID - and there are thousands of these IDs across the different Census files. To rapidly search for variables, use the load_variables function. The function takes two required arguments: the year of the Census or endyear of the ACS sample, and the dataset - one of “sf1”, “sf3”, or “acs5”. For ideal functionality, I recommend assigning the result of this function to a variable, setting cache = TRUE to store the result on your computer for future access, and using the View function in RStudio to interactively browse for variables.
Description
Load variables from a decennial Census or American Community Survey dataset to search in R
Usage
load_variables(year, dataset, cache = FALSE)
v15 <- load_variables(2015, "acs5", cache = TRUE)
glimpse(v15)
## Observations: 45,503
## Variables: 3
## $ name <chr> "AIANHH", "AIHHTLI", "AITS", "AITSCE", "ANRC", "B00001...
## $ label <chr> "FIPS AIANHH code", "American Indian Trust Land/Hawaii...
## $ concept <chr> "Selectable Geographies", "Selectable Geographies", "S...
Wow, 45,503 different variables are available, with no obvious grouping variable.
Let’s just follow along with Julia’s example…
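With this many rows, one way to narrow things down without scrolling through View() is to filter the table on its label or concept columns. The sketch below is only an illustration: the small vars table is a hypothetical stand-in with the same three columns as the load_variables() result, and its rows are made up for the example rather than copied from the Census API.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the table returned by load_variables();
# same column names, but the rows are illustrative only.
vars <- tibble::tibble(
  name    = c("B01003_001", "B25077_001", "B19013_001"),
  label   = c("Total", "Median value (dollars)", "Median household income"),
  concept = c("Total Population", "Median Value (Dollars)", "Median Household Income")
)

# Case-insensitive search of the concept column
home_value_vars <- vars %>%
  filter(str_detect(concept, regex("median value", ignore_case = TRUE)))

home_value_vars$name
```

The same filter() call should work on the real v15 object, since it has the same name, label, and concept columns.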
I live in Arkansas so I’m going to change her code when it refers to Texas or Utah and use Arkansas instead.
This example uses the B01003_001 variable (total population).
Description
Obtain data and feature geometry for the five-year American Community Survey
Usage
get_acs(geography, variables, endyear = 2015, output = "tidy", state = NULL, county = NULL, geometry = FALSE, keep_geo_vars = FALSE, summary_var = NULL, key = NULL, moe_level = 90, ...)
Setting geometry = TRUE loads the feature geometry needed for mapping, so it is essential to what we’re doing here, even though the default is FALSE.
arkansas_pop <- get_acs(geography = "county",
variables = "B01003_001",
state = "AR",
geometry = TRUE)
glimpse(arkansas_pop)
## Observations: 75
## Variables: 6
## $ GEOID <chr> "05011", "05015", "05019", "05025", "05063", "05067",...
## $ NAME <chr> "Bradley County, Arkansas", "Carroll County, Arkansas...
## $ variable <chr> "B01003_001", "B01003_001", "B01003_001", "B01003_001...
## $ estimate <dbl> 11206, 27635, 22751, 8510, 36952, 17597, 43652, 40633...
## $ moe <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ geometry <simple_feature> MULTIPOLYGON(((-92.381616 3..., MULTIPOLYG...
What is the moe variable? It is the margin of error, which is useful for statistics but not needed for what we’re doing.
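For anyone who does want to use moe later: the get_acs usage above shows moe_level = 90, so estimate plus or minus moe can be read as a 90 percent confidence interval. Here is a minimal sketch with made-up numbers (the estimate and moe values are hypothetical, not ACS output). The 1.645 divisor is the factor the Census Bureau documents for converting a 90 percent margin of error to a standard error.

```r
# Hypothetical estimate and 90% margin of error (illustrative, not real ACS output)
estimate <- 150000
moe      <- 12000

# Bounds of the 90% confidence interval
lower <- estimate - moe
upper <- estimate + moe

# Approximate standard error implied by a 90% margin of error
se <- moe / 1.645

c(lower = lower, upper = upper)
```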
color_pal <- colorQuantile(palette = "viridis", domain = arkansas_pop$estimate, n = 10)
arkansas_pop %>%
st_transform(crs = "+init=epsg:4326") %>%
leaflet(width = "100%") %>%
addProviderTiles(provider = "CartoDB.Positron") %>%
addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
stroke = FALSE,
smoothFactor = 0,
fillOpacity = 0.7,
color = ~ color_pal(estimate)) %>%
addLegend("bottomright",
pal = color_pal,
values = ~ estimate,
title = "Population percentiles",
opacity = 1)
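The popup formula in the map code uses str_extract() with the pattern "^([^,]*)", which keeps everything before the first comma. A quick check on two of the NAME values from the glimpse() output above shows what ends up in the popup:

```r
library(stringr)

# Two NAME values taken from the glimpse() output above
county_names <- c("Bradley County, Arkansas", "Carroll County, Arkansas")

# Keep only the text before the first comma
popup_text <- str_extract(county_names, "^([^,]*)")
popup_text
# "Bradley County" "Carroll County"
```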
What is st_transform() doing? Julia says, “Well, I am no cartographer and I am still fuzzy on these issues, but it is doing a projection onto a certain reference system of the spatial information contained in the sf column. The specific choice of an EPSG code of 4326 is for a given projection.”
crs = Coordinate Reference System
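To make this a little more concrete: EPSG:4326 is the WGS 84 longitude/latitude system, which is what leaflet expects for the data it draws. The sketch below transforms a single hypothetical point (rough coordinates somewhere in central Arkansas, chosen only for illustration). It assumes the starting CRS is NAD83 (EPSG:4269), which, as far as I know, is what the Census shapefiles use. Passing the numeric code crs = 4326 should be equivalent to the "+init=epsg:4326" string in current versions of sf.

```r
library(sf)

# Hypothetical point in NAD83 (EPSG:4269); the coordinates are illustrative only
pt_nad83 <- st_sfc(st_point(c(-92.29, 34.75)), crs = 4269)

# Re-express the point in WGS 84 longitude/latitude (EPSG:4326)
pt_wgs84 <- st_transform(pt_nad83, crs = 4326)

# Confirm the new coordinate reference system
st_crs(pt_wgs84)$epsg
```

For this pair of systems the coordinates barely move, so the step is mostly about labeling the data with the reference system leaflet works in.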
Instead of quantiles, use a continuous numeric scale, just for something different.
plasma palette
color_pal <- colorNumeric(palette = "plasma",
domain = arkansas_pop$estimate)
#color_pal <- colorNumeric(palette = "viridis",
# domain = arkansas_pop$estimate)
arkansas_pop %>%
st_transform(crs = "+init=epsg:4326") %>%
leaflet(width = "100%") %>%
addProviderTiles(provider = "CartoDB.Positron") %>%
addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
stroke = FALSE,
smoothFactor = 0,
fillOpacity = 0.7,
color = ~ color_pal(estimate)) %>%
addLegend("bottomright",
pal = color_pal,
values = ~ estimate,
title = "County Populations",
opacity = 1)
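The difference between the two maps comes down to how the palette function assigns colors. Here is a small sketch with hypothetical, deliberately skewed population values (made up for illustration, not the Arkansas data). colorQuantile() colors by rank, so equal-sized groups of counties share a color, while colorNumeric() colors by the value itself, so one very large county stands apart from the rest.

```r
library(leaflet)

# Hypothetical, deliberately skewed populations (not real data)
pop <- c(5000, 8000, 12000, 20000, 35000, 400000)

pal_quantile <- colorQuantile(palette = "viridis", domain = pop, n = 3)
pal_numeric  <- colorNumeric(palette = "viridis", domain = pop)

# Quantile palette: three bins, each holding a third of the values
quantile_cols <- pal_quantile(pop)

# Numeric palette: color follows the value on a continuous scale
numeric_cols <- pal_numeric(pop)

data.frame(pop, quantile_cols, numeric_cols)
```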
Benton County is the northwesternmost county in Arkansas, bordering Missouri to the north and Oklahoma to the west.
benton_co_home_value <- get_acs(geography = "tract",
variables = "B25077_001",
state = "AR",
county = "Benton County",
geometry = TRUE)
pal <- colorNumeric(palette = "viridis",
domain = benton_co_home_value$estimate)
benton_co_home_value %>%
st_transform(crs = "+init=epsg:4326") %>%
leaflet(width = "100%") %>%
addProviderTiles(provider = "CartoDB.Positron") %>%
addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
stroke = FALSE,
smoothFactor = 0,
fillOpacity = 0.7,
color = ~ pal(estimate)) %>%
addLegend("bottomright",
pal = pal,
values = ~ estimate,
title = "Median Home Value",
labFormat = labelFormat(prefix = "$"),
opacity = 1)
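The labFormat argument is what puts the dollar sign on the legend. labelFormat() returns a formatting function that addLegend() calls internally. Calling it directly, as below, is only for illustration and relies on my understanding of how that returned function is invoked. The cut values are hypothetical.

```r
library(leaflet)

# Build the same formatter that the legend uses
fmt <- labelFormat(prefix = "$")

# Hypothetical legend break points, formatted the way a numeric legend would be
labels <- fmt(type = "numeric", cuts = c(100000, 250000))
labels
```

Each label should come back prefixed with "$", which is what shows up next to the color bar.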
tidycensus is probably going to change the way that I work with US demographic data. The datasets are very rich, and the only limitation is your ability to efficiently explore them and extract the data that you are looking for.
leaflet is my favorite package for mapping, and I can do so much with so little code that it is really incredible.
Big thanks to Julia Silge for her blog that inspired this post.