‘Choropleth’ is an obscure-sounding term, but it names a common type of map in which regions are colored according to some statistic. An example we have all seen is the US electoral map, with red and blue states indicating which candidate or party won each state.
library(tidyverse)
── Attaching packages ────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ ggplot2 3.2.1 ✓ purrr 0.3.3
✓ tibble 2.1.3 ✓ dplyr 0.8.3
✓ tidyr 1.0.0 ✓ stringr 1.4.0
✓ readr 1.3.1 ✓ forcats 0.4.0
── Conflicts ───────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
library(tidycensus) # gets census data that we can use to create maps
library(leaflet) # interactive mapping package
library(trendyy) # gets Google Trends search data
We’ll start out with census data because it has both the maps and the data (population, income, etc.). Later, we’ll separate these steps so we can learn to map any data.
You’ll need a free API key to access the census data. Sign up for one at the address below, but be aware that it might take a couple of days to arrive:
http://api.census.gov/data/key_signup.html
Then when you get the API key, put it into the quotes in census_api_key("").
census_api_key("YOUR_API_KEY", install = TRUE, overwrite = TRUE)
Your original .Renviron will be backed up and stored in your R HOME directory if needed.
Your API key has been stored in your .Renviron and can be accessed by Sys.getenv("CENSUS_API_KEY").
To use now, restart R or run `readRenviron("~/.Renviron")`
[1] "YOUR_API_KEY"
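Before requesting data, it can be worth confirming that R can actually see the stored key. This is a minimal sketch using only base R; the name `census_key` is just for illustration.

```r
# Read the key that census_api_key(..., install = TRUE) stored in .Renviron
census_key <- Sys.getenv("CENSUS_API_KEY")

# Sys.getenv() returns "" when the variable is not set
if (nzchar(census_key)) {
  message("Census API key found.")
} else {
  message("No key found yet; restart R or run readRenviron(\"~/.Renviron\").")
}
```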
This gets maps of the states along with population data from the census bureau:
readRenviron("~/.Renviron")
states <- get_acs(geography = "state",      # gets state-by-state data
                  variables = "B01003_001", # this is state population
                  geometry = TRUE,          # gets geometry (the maps)
                  shift_geo = TRUE)         # shifts Hawaii and Alaska
Getting data from the 2014-2018 5-year ACS
Using feature geometry obtained from the albersusa package
Please note: Alaska and Hawaii are being shifted and are not to scale.
Let’s take a look at the data we just got. Create a new chunk below and type in states:
states
Simple feature collection with 51 features and 5 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -2100000 ymin: -2500000 xmax: 2516374 ymax: 732103.3
epsg (SRID): NA
proj4string: +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs
First 10 features:
GEOID NAME variable estimate moe geometry
1 04 Arizona B01003_001 6946685 NA MULTIPOLYGON (((-1111066 -8...
2 05 Arkansas B01003_001 2990671 NA MULTIPOLYGON (((557903.1 -1...
3 06 California B01003_001 39148760 NA MULTIPOLYGON (((-1853480 -9...
4 08 Colorado B01003_001 5531141 NA MULTIPOLYGON (((-613452.9 -...
5 09 Connecticut B01003_001 3581504 NA MULTIPOLYGON (((2226838 519...
6 11 District of Columbia B01003_001 684498 NA MULTIPOLYGON (((1960720 -41...
7 13 Georgia B01003_001 10297484 NA MULTIPOLYGON (((1379893 -98...
8 17 Illinois B01003_001 12821497 NA MULTIPOLYGON (((868942.5 -2...
9 18 Indiana B01003_001 6637426 NA MULTIPOLYGON (((1279733 -39...
10 22 Louisiana B01003_001 4663616 NA MULTIPOLYGON (((1080885 -16...
The data look a little different from our usual data. But note that it has state names under NAME, it has the population of each state under estimate, and geometry provides the data to create the map. So it includes both the map information and the data (population) that we’re going to map.
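Because the map data behave much like an ordinary data frame, the usual dplyr verbs can be used to explore them. The sketch below assumes nothing beyond dplyr: it rebuilds a few rows from the printout above by hand (without the geometry column) under the illustrative name `states_sample`, then sorts them by population.

```r
library(dplyr)

# A few rows copied from the printout above (geometry column left out)
states_sample <- tibble(
  NAME     = c("Arizona", "Arkansas", "California", "Colorado"),
  estimate = c(6946685, 2990671, 39148760, 5531141)
)

# Sort from most to least populous
states_sorted <- states_sample %>%
  arrange(desc(estimate))

states_sorted
```

The same `arrange()` call should work on the full `states` object, which carries its geometry column along with the sorted rows.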
Here’s a really simple choropleth map using the data. All it does is take the state data, send it to ggplot, and fill (or color) the states with their population (called estimate).
states %>%
  ggplot() +
  geom_sf(aes(fill = estimate))
Let’s improve it a little bit. Copy-paste the above chunk, and add the following lines at the end:
states %>%
  ggplot() +
  geom_sf(aes(fill = estimate)) +
  coord_sf(datum = NA) +
  scale_fill_viridis_c() +
  theme_minimal() +
  labs(title = "US population by state")
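If you are still waiting for an API key, the same plotting recipe can be tried on a tiny hand-made map. This is only a sketch: the two square "regions", their values, and the helper `make_square()` are invented for illustration and assume the sf package is installed.

```r
library(sf)
library(ggplot2)

# Hypothetical helper: a 1 x 1 square with its lower-left corner at (x, y)
make_square <- function(x, y) {
  st_polygon(list(rbind(c(x, y), c(x + 1, y), c(x + 1, y + 1),
                        c(x, y + 1), c(x, y))))
}

# Two made-up regions with made-up values
toy_regions <- st_sf(
  NAME     = c("Region A", "Region B"),
  estimate = c(10, 25),
  geometry = st_sfc(make_square(0, 0), make_square(1, 0))
)

# Same layers as the state map: fill by value, hide the grid, viridis colors
toy_map <- ggplot(toy_regions) +
  geom_sf(aes(fill = estimate)) +
  coord_sf(datum = NA) +
  scale_fill_viridis_c() +
  theme_minimal() +
  labs(title = "Toy choropleth", fill = "Value")

toy_map
```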
Here’s another one. The only thing I changed here is variables =, which I changed to get median income.
state_income <- get_acs(geography = "state", variables = "B19013_001", shift_geo = TRUE, geometry = TRUE)
Getting data from the 2014-2018 5-year ACS
Using feature geometry obtained from the albersusa package
Please note: Alaska and Hawaii are being shifted and are not to scale.
state_income %>%
  ggplot() +
  geom_sf(aes(fill = estimate)) +
  coord_sf(datum = NA) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(title = "Median state income")
So far we have mapped states across the whole country, but we can also pick one state and map its counties. The following creates a map of median income by county in Montana. (The object is named MT_population, but it holds median income, not population.)
MT_population <- get_acs(geography = "county", state = "MT", variables = "B19013_001", geometry = TRUE)
Getting data from the 2014-2018 5-year ACS
Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
Using FIPS code '30' for state 'MT'
MT_population %>%
  ggplot() +
  geom_sf(aes(fill = estimate), color = NA) +
  coord_sf(datum = NA) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(title = "Median income of Montana counties")
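The download message printed above suggests caching the shapefiles so they are not fetched again in later sessions. A minimal sketch of setting that option, using exactly the option name given in the message:

```r
# Ask for downloaded shapefiles to be cached, as the download message suggests
options(tigris_use_cache = TRUE)

# Confirm the option is set for this session
getOption("tigris_use_cache")
```

Placing the `options()` call near the top of the document, before the first `get_acs()` call, is probably the most convenient spot.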
The census data include state maps that we can reuse to map other data: we simply join the new data to the map data from the census. Data from the Google search package trendyy include a breakdown by region which, if we look just at the US, gives us states.
The following code gets the Google data on searches for naloxone (a drug that treats opioid overdose), takes the state-by-state data and puts it into a new data frame called naloxone_states, and then shows it.
naloxone <- trendy("naloxone", geo = "US", from = "2019-01-01", to = "2020-01-01")
naloxone_states <- naloxone %>%
get_interest_region()
naloxone_states
To map these Google search data, we need to join them to the map information (called states).
Notice that in the naloxone data, the states column is called ‘location,’ but in the census data, the states column is called ‘NAME.’ To join the naloxone search data to the state map data, we have to rename the states column so the two match. Here’s how we could do that:
states %>%
  rename(location = NAME) %>%
  inner_join(naloxone_states)
Joining, by = "location"
Simple feature collection with 51 features and 9 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -2100000 ymin: -2500000 xmax: 2516374 ymax: 732103.3
epsg (SRID): NA
proj4string: +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs
First 10 features:
GEOID location variable estimate moe hits keyword geo gprop geometry
1 04 Arizona B01003_001 6946685 NA 46 naloxone US web MULTIPOLYGON (((-1111066 -8...
2 05 Arkansas B01003_001 2990671 NA 44 naloxone US web MULTIPOLYGON (((557903.1 -1...
3 06 California B01003_001 39148760 NA 36 naloxone US web MULTIPOLYGON (((-1853480 -9...
4 08 Colorado B01003_001 5531141 NA 48 naloxone US web MULTIPOLYGON (((-613452.9 -...
5 09 Connecticut B01003_001 3581504 NA 34 naloxone US web MULTIPOLYGON (((2226838 519...
6 11 District of Columbia B01003_001 684498 NA 100 naloxone US web MULTIPOLYGON (((1960720 -41...
7 13 Georgia B01003_001 10297484 NA 28 naloxone US web MULTIPOLYGON (((1379893 -98...
8 17 Illinois B01003_001 12821497 NA 41 naloxone US web MULTIPOLYGON (((868942.5 -2...
9 18 Indiana B01003_001 6637426 NA 37 naloxone US web MULTIPOLYGON (((1279733 -39...
10 22 Louisiana B01003_001 4663616 NA 41 naloxone US web MULTIPOLYGON (((1080885 -16...
Notice that this creates data with a hits column, which is search volume for the term naloxone, along with the geometry that we can use to make a map.
Pipe all that into the following to make the map:
states %>%
  rename(location = NAME) %>%
  inner_join(naloxone_states) %>%
  ggplot() +                       # create graph
  geom_sf(aes(fill = hits)) +      # color states with hits
  scale_fill_viridis_c() +         # use the viridis colors
  coord_sf(datum = NA) +           # remove coordinates
  theme_minimal() +                # remove background
  labs(title = "State google searches for 'naloxone'", fill = "Search volume")
Joining, by = "location"
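When joining your own data to a map, it helps to check which names fail to match, because inner_join() silently drops them. The sketch below uses small hand-made tables: the hits are copied from the joined output above, while the "Calif." row and the names `map_side` and `search_side` are made up for illustration. It also shows that naming both key columns in `by` is an alternative to renaming.

```r
library(dplyr)

# Map-side table: a few names from the census printout (geometry left out)
map_side <- tibble(NAME = c("Arizona", "Arkansas", "California"))

# Search-side table: hits copied from the joined output above,
# plus one made-up row ("Calif.") that will not match any map name
search_side <- tibble(
  location = c("Arizona", "Arkansas", "California", "Calif."),
  hits     = c(46, 44, 36, 1)
)

# Join without renaming by naming both key columns explicitly
joined <- map_side %>%
  inner_join(search_side, by = c("NAME" = "location"))

# Rows of the search data with no matching region on the map
unmatched <- search_side %>%
  anti_join(map_side, by = c("location" = "NAME"))

joined
unmatched
```

Anything that shows up in `unmatched` would be missing from the finished map, so it is worth fixing the spelling before plotting.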
We can also use the interactive mapping package leaflet to create these maps. Choropleth maps are harder to create in leaflet, but I think it’s worth it.
Make sure you don’t shift Hawaii and Alaska. I’m going to call this states_leaflet just to keep it straight. We’re not going to use the income data, but get_acs() requires you to get some data.
states_leaflet <- get_acs(geography = "state",      # gets state-by-state data
                          variables = "B19013_001", # this is state income
                          geometry = TRUE)          # gets geometry (the maps)
Getting data from the 2014-2018 5-year ACS
Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
# shift_geo = TRUE is deliberately left out here, so Hawaii and Alaska are not shifted
To begin, pipe the states_leaflet data into leaflet(), then use addTiles() to put the base map down, and then addPolygons(), which shows the outlines of the states from our map data:
states_leaflet %>%
  leaflet() %>%
  addTiles() %>%
  addPolygons()
sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
It works, but it looks pretty ugly. Let’s improve this.
state_colors <- colorNumeric(palette = "viridis", domain = states_leaflet$estimate)
states_leaflet %>%
  leaflet() %>%
  addTiles() %>%
  addPolygons(weight = 1,
              fillColor = ~state_colors(estimate)) %>%
  setView(-95, 40, zoom = 4) %>%
  addLegend(pal = state_colors, values = ~estimate)
sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
That’s pretty complicated, but I think it looks really nice.
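The least obvious piece is colorNumeric(): it does not return colors, but a function that turns numbers into colors. A small sketch of that idea, using a made-up range of 0 to 100 and the illustrative names `toy_colors` and `example_fills`:

```r
library(leaflet)

# Build a palette function for values between 0 and 100 (a made-up range)
toy_colors <- colorNumeric(palette = "viridis", domain = c(0, 100))

# Calling the palette function on numbers returns hex color strings
example_fills <- toy_colors(c(0, 50, 100))
example_fills
```

This is why the map code writes `fillColor = ~state_colors(estimate)`: each state's estimate is passed through the palette function to get its fill color, and addLegend() reuses the same function so the legend matches.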
To finish, I’m going to add a few lines inside addPolygons() so that you can hover over the states and get information.
state_colors <- colorNumeric(palette = "viridis", domain = states_leaflet$estimate)
states_leaflet %>%
  leaflet() %>%
  addTiles() %>%
  addPolygons(weight = 1,
              fillColor = ~state_colors(estimate),
              label = ~paste0(NAME, ", income = ", estimate),
              highlightOptions = highlightOptions(weight = 2)) %>%
  setView(-95, 40, zoom = 4) %>%
  addLegend(pal = state_colors, values = ~estimate)
sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
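The hover labels show incomes as bare numbers. One possible refinement, sketched here with made-up region names and incomes, is to add a dollar sign and thousands separators with base R's formatC():

```r
# Made-up names and incomes, for illustration only
region_names   <- c("Region A", "Region B")
region_incomes <- c(52559, 101234)

# formatC() adds thousands separators without padding the strings
hover_labels <- paste0(region_names, ", income = $",
                       formatC(region_incomes, format = "d", big.mark = ","))
hover_labels
```

Inside addPolygons(), the same idea would presumably read `label = ~paste0(NAME, ", income = $", formatC(estimate, format = "d", big.mark = ","))`.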
Let’s take this to the next level, joining data and then mapping it with leaflet.
Remember to join with states where Alaska and Hawaii have not been shifted!
naloxone_colors <- colorNumeric(palette = "viridis", domain = naloxone_states$hits)
states_leaflet %>%
  rename(location = NAME) %>%
  inner_join(naloxone_states) %>%
  leaflet() %>%
  addTiles() %>%
  addPolygons(weight = 1,
              fillColor = ~naloxone_colors(hits),
              label = ~paste0(location, ", Search volume = ", hits),
              highlightOptions = highlightOptions(weight = 2)) %>%
  setView(-95, 40, zoom = 4) %>%
  addLegend(pal = naloxone_colors, values = ~hits)
Joining, by = "location"
sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
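The datum warning printed under each leaflet map says the census geometry uses NAD83 while leaflet wants WGS84. The maps still draw, but the warning can likely be silenced by re-projecting with sf's st_transform() first. A minimal sketch on a single made-up point (the coordinates and the names `toy_nad83` and `toy_wgs84` are for illustration only):

```r
library(sf)

# A single made-up point stored with the NAD83 datum (EPSG:4269)
toy_nad83 <- st_sfc(st_point(c(-110, 47)), crs = 4269)

# Re-project to WGS84 longitude/latitude (EPSG:4326), which leaflet expects
toy_wgs84 <- st_transform(toy_nad83, crs = 4326)

# Check the result
st_crs(toy_wgs84)$epsg
```

Applied to the maps here, that would mean adding `st_transform(crs = 4326) %>%` just before `leaflet()` in the pipe.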