For this lab assignment, your task is to replicate the map shown below, which visualizes median income by county in North Carolina using census data. We’ll use several packages to accomplish this:
Retrieve North Carolina Map Data: First, use the osmdata and ggmap packages to get the map of North Carolina, as demonstrated in the lab. You’ll need to access map tiles and apply the correct zoom level to capture all county boundaries.
Get Census Data: Using the tidycensus package, import median household income by county in North Carolina. The code below sets geometry = TRUE so that each county’s boundary polygon is returned alongside its income estimate, and then derives a centroid longitude and latitude for each county. The polygons are what the map itself needs.
JO Note: Used https://rpubs.com/kquimz/1057059 as a reference and template for initial coding.
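The basemap step described above is not shown in the code that follows, so here is a minimal sketch of one way it might look. It assumes ggmap 4.x, where tiles come from Stadia Maps and require an API key. The bounding box values are approximate, and both the `STADIA_API_KEY` environment variable name and the `get_nc_basemap()` helper are hypothetical names chosen for illustration.

```r
# Approximate bounding box for North Carolina in degrees (assumed values)
nc_bbox <- c(left = -84.4, bottom = 33.8, right = -75.4, top = 36.6)

# Hypothetical helper: download a basemap only when a Stadia Maps key is available
get_nc_basemap <- function(bbox, zoom = 7) {
  key <- Sys.getenv("STADIA_API_KEY")  # assumed environment variable name
  if (!nzchar(key)) {
    message("No Stadia Maps key found; skipping basemap download.")
    return(NULL)
  }
  ggmap::register_stadiamaps(key)
  ggmap::get_stadiamap(bbox = bbox, zoom = zoom, maptype = "stamen_toner_lite")
}

nc_basemap <- get_nc_basemap(nc_bbox)
if (!is.null(nc_basemap)) print(ggmap::ggmap(nc_basemap))
```

A zoom of 7 is a guess at a level that covers the whole state in a handful of tiles; higher values fetch more detail at the cost of many more tile requests.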
# Load necessary libraries
library(osmdata) # for fetching OpenStreetMap data
## Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(ggmap) # for retrieving and displaying static maps
## Loading required package: ggplot2
## ℹ Google's Terms of Service: <https://mapsplatform.google.com>
## Stadia Maps' Terms of Service: <https://stadiamaps.com/terms-of-service/>
## OpenStreetMap's Tile Usage Policy: <https://operations.osmfoundation.org/policies/tiles/>
## ℹ Please cite ggmap if you use it! Use `citation("ggmap")` for details.
library(dplyr) # for data manipulation
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse) # for a collection of data science packages
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.3 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.1
## ✔ readr 2.1.5
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidycensus) # for accessing U.S. Census Bureau data
library(plotly) # for interactive plots
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggmap':
##
## wind
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
library(sf) # for spatial data manipulation
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
# Cache the county shapefiles that tidycensus downloads through tigris, to avoid repeated downloads
options(tigris_use_cache = TRUE)
# Retrieve median household income by county in North Carolina (2016-2020 5-year ACS)
nc_income <- get_acs(
  geography = "county",        # county-level data
  variables = "B19013_001",    # median household income variable
  state = "NC",                # state abbreviation for North Carolina
  year = 2020,                 # end year of the 5-year ACS sample
  geometry = TRUE              # include county boundary polygons for mapping
) %>%
  # Add centroid coordinates (longitude, latitude) for each county
  mutate(
    centroid = st_centroid(geometry),           # centroid point of each county polygon
    longitude = st_coordinates(centroid)[, 1],  # X coordinate (longitude)
    latitude = st_coordinates(centroid)[, 2]    # Y coordinate (latitude)
  )
## Getting data from the 2016-2020 5-year ACS
## Warning: • You have not set a Census API key. Users without a key are limited to 500
## queries per day and may experience performance limitations.
## ℹ For best results, get a Census API key at
## http://api.census.gov/data/key_signup.html and then supply the key to the
## `census_api_key()` function to use it throughout your tidycensus session.
## This warning is displayed once per session.
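The centroid columns above are computed directly on longitude/latitude geometry. A common alternative is to project to a planar coordinate system first, which some sf versions recommend through a warning. The sketch below illustrates that variant offline, using the North Carolina county shapefile bundled with sf rather than the ACS result. The choice of EPSG:32119 (NAD83 / North Carolina, metres) is an assumption, not something the lab specifies.

```r
library(sf)

# North Carolina county polygons shipped with the sf package (not the ACS data)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Project to a planar CRS, compute centroids there, then return to lon/lat
nc_projected <- st_transform(st_geometry(nc), 32119)
centroids <- st_transform(st_centroid(nc_projected), 4326)

# st_coordinates() returns a matrix with columns X (longitude) and Y (latitude)
coords <- st_coordinates(centroids)
nc$longitude <- coords[, "X"]
nc$latitude <- coords[, "Y"]

head(st_drop_geometry(nc)[, c("NAME", "longitude", "latitude")])
```

The same pattern should carry over to `nc_income`, since `get_acs(geometry = TRUE)` also returns an sf object.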
Plot the Interactive Map: We’ll use ggplot, ggmap, and plotly to create the interactive map. Instead of the geom_point() function used in the lab, use geom_sf() to color each county based on income levels, ensuring that each region is accurately represented by the income variable.
# Step 1: Build a static choropleth, filling each county by its income estimate
nc_map <- ggplot(nc_income, aes(fill = estimate, text = NAME)) +
  geom_sf() +                 # draw county polygons from the geometry column
  theme_void() +              # remove axes, gridlines, and background
  scale_fill_viridis_c() +    # continuous viridis color scale
  labs(title = "North Carolina Income by County")
# Step 2: Convert to Plotly for interactivity (requires the ggplot object from Step 1)
interactive_map <- ggplotly(nc_map, tooltip = "text") # show the text aesthetic (county NAME) on hover
interactive_map
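As a possible refinement, the hover text can carry a formatted value as well as the county name. The sketch below shows the idea offline with the shapefile bundled with sf. The `BIR74` column (births in 1974) is only a stand-in for the income estimate, and the `hover` column name is a hypothetical choice.

```r
library(sf)
library(ggplot2)

# County polygons shipped with sf; BIR74 stands in for the income estimate
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Build one hover label per county: "Name: value" with thousands separators
nc$hover <- paste0(nc$NAME, ": ", formatC(nc$BIR74, format = "d", big.mark = ","))

p <- ggplot(nc, aes(fill = BIR74, text = hover)) +
  geom_sf() +
  theme_void() +
  scale_fill_viridis_c(name = "Births, 1974") +
  labs(title = "Stand-in choropleth with custom hover text")

# Convert to an interactive widget only if plotly is installed
if (requireNamespace("plotly", quietly = TRUE)) {
  interactive <- plotly::ggplotly(p, tooltip = "text")
}
```

With the ACS data, the same approach would paste `NAME` together with a formatted `estimate` before calling `ggplot()`.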