This R Markdown file fulfills the Practical R Part 3 assignment. It produces a region‑level summary table for the 14 counties that make up the Nashville metropolitan statistical area and a statewide map showing the percentage of adults with at least a bachelor’s degree. The document follows the numbered sections of the assignment and includes an explanation of how the code works. To ensure the map renders reliably even when the Census geometry API is unavailable, the geographic shapes are retrieved using the tigris package rather than via geometry = TRUE in tidycensus::get_acs().

2. Load ACS codebooks

The ACS codebooks provide metadata for variables. Loading them caches the metadata for later use. We won’t display their contents here because they are large.

DetailedTables <- load_variables(2023, "acs5", cache = TRUE)
SubjectTables  <- load_variables(2023, "acs5/subject", cache = TRUE)
ProfileTables  <- load_variables(2023, "acs5/profile", cache = TRUE)

3. Define variables of interest

The assignment calls for two measures from the ACS: the count and percentage of adults age 25 or older with a bachelor’s degree or higher. These correspond to DP02_0059 (estimate) and DP02_0068P (percentage).

VariableList <- c(
  Count_   = "DP02_0059",
  Percent_ = "DP02_0068P"
)
VariableList
##       Count_     Percent_ 
##  "DP02_0059" "DP02_0068P"

4. Fetch and prepare county subdivision data

We first download ACS estimates for all minor civil divisions (county subdivisions) in Tennessee using get_acs(). We set geometry = FALSE here because we will acquire the shapes separately using tigris. After downloading, we split the NAME field into Division, County and State and strip the “County” suffix for readability.

mydata <- get_acs(
  geography      = "county subdivision",
  state          = "TN",
  variables      = VariableList,
  year           = 2023,
  survey         = "acs5",
  output         = "wide",
  geometry       = FALSE,
  show_progress  = FALSE
)

mydata <- mydata %>%
  separate_wider_delim(
    NAME,
    delim = ", ",
    names = c("Division", "County", "State")
  ) %>%
  mutate(County = str_remove(County, " County"))

head(mydata)
## # A tibble: 6 × 8
##   GEOID      Division   County   State     Count_E Count_M Percent_E Percent_M
##   <chr>      <chr>      <chr>    <chr>       <dbl>   <dbl>     <dbl>     <dbl>
## 1 4700190002 District 1 Anderson Tennessee    6994     682      24         4.1
## 2 4700190192 District 2 Anderson Tennessee    7777     404      18.4       4.8
## 3 4700190382 District 3 Anderson Tennessee    7917     706      30.9       4.4
## 4 4700190572 District 4 Anderson Tennessee    6449     589      10.9       3  
## 5 4700190762 District 5 Anderson Tennessee    6925     746      16.8       4.3
## 6 4700190952 District 6 Anderson Tennessee    6197     436      32.5       4.3

12. Region‑level aggregation and table

This step reproduces the “Estimates by Region” table from the assignment. We first fetch county‑level ACS estimates (without geometry), then filter for the 14 counties that make up the Nashville metropolitan statistical area. Using case_when(), we classify each county as either Davidson, Doughnut (the six counties surrounding Davidson) or Non‑doughnut. Finally, we compute the average percentage of college‑educated adults, the average margin of error, and the total count for each region.

CountyLevelData <- get_acs(
  geography     = "county",
  state         = "TN",
  variables     = VariableList,
  year          = 2023,
  survey        = "acs5",
  output        = "wide",
  geometry      = FALSE,
  show_progress = FALSE
)

SelectedCounties <- CountyLevelData %>%
  filter(NAME %in% c(
    "Cannon County, Tennessee",  "Cheatham County, Tennessee",
    "Davidson County, Tennessee", "Dickson County, Tennessee",
    "Hickman County, Tennessee",  "Macon County, Tennessee",
    "Maury County, Tennessee",    "Robertson County, Tennessee",
    "Rutherford County, Tennessee", "Smith County, Tennessee",
    "Sumner County, Tennessee",   "Trousdale County, Tennessee",
    "Williamson County, Tennessee","Wilson County, Tennessee"
  )) %>%
  select(NAME, Percent_E, Percent_M, Count_E) %>%
  rename(
    County  = NAME,
    Percent = Percent_E,
    Range   = Percent_M,
    Count   = Count_E
  ) %>%
  arrange(desc(Percent))

SelectedCounties <- SelectedCounties %>%
  mutate(
    CountyClean = str_remove(County, ", Tennessee"),
    Region = case_when(
      CountyClean == "Davidson County" ~ "Davidson",
      CountyClean %in% c(
        "Rutherford County", "Williamson County",
        "Cheatham County",    "Robertson County",
        "Sumner County",      "Wilson County"
      ) ~ "Doughnut",
      TRUE ~ "Non-doughnut"
    )
  )

RegionTableData <- SelectedCounties %>%
  group_by(Region) %>%
  summarise(
    Percent = mean(Percent, na.rm = TRUE),
    Range   = mean(Range,   na.rm = TRUE),
    Count   = sum(Count,    na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(Percent))

RegionTable <- RegionTableData %>%
  gt() %>%
  gt_theme_espn() %>%
  cols_label(
    Percent = "Percent (%)",
    Range   = "Margin of Error",
    Count   = "Count"
  ) %>%
  fmt_number(columns = c(Percent, Range), decimals = 1) %>%
  fmt_number(columns = c(Count), decimals = 0, use_seps = TRUE) %>%
  tab_header(
    title    = "Estimates by Region",
    subtitle = "(2019–2023 ACS 5‑Year ACS)"
  )

RegionTable
Estimates by Region
(2019–2023 ACS 5‑Year ACS)
Region Percent (%) Margin of Error Count
Davidson 47.3 0.7 493,272
Doughnut 35.4 1.4 715,926
Non-doughnut 16.9 2.3 179,896

13. Statewide map of Tennessee

To build the statewide map, we first rename the ACS columns to shorter names and convert the estimates and margins of error to numeric. We then obtain cartographic boundary files for minor civil divisions and counties using tigris. After joining the estimates onto the subdivision shapes, we build an interactive Leaflet map.

# Rename and convert variables so that Percent and PercentEM are numeric
mapdata3 <- mydata %>%
  rename(
    Percent   = Percent_E,
    PercentEM = Percent_M,
    Count     = Count_E,
    CountEM   = Count_M
  ) %>%
  mutate(
    Percent   = as.numeric(Percent),
    PercentEM = as.numeric(PercentEM)
  )

# Retrieve subdivision and county shapes via tigris
tn_subdivs <- county_subdivisions(state = "TN", cb = TRUE, year = 2023) %>%
  st_transform(4326)

county_borders <- counties(state = "TN", cb = TRUE, year = 2023) %>%
  st_transform(4326)

# Join the estimates onto the subdivision shapes using GEOID
map_sf <- tn_subdivs %>%
  left_join(mapdata3, by = "GEOID")

# Colour palette for the map
pal <- colorNumeric(
  palette = "viridis",
  domain  = map_sf$Percent
)

DivisionMap3 <- leaflet(map_sf) %>%
  addProviderTiles("CartoDB.Positron", group = "Positron (Light)") %>%
  addProviderTiles("Esri.WorldStreetMap", group = "ESRI Streets") %>%
  addProviderTiles("Esri.WorldImagery", group = "ESRI Satellite") %>%
  addPolygons(
    fillColor   = ~pal(Percent),
    fillOpacity = 0.7,
    color       = "#333333",
    weight      = 1,
    group       = "Data Layer",
    popup       = ~paste0(
      "<b>", NAME, "</b><br>",
      "County: ", County, "<br>",
      "Percent: ", Percent, "%<br>",
      "Margin of Error: ±", PercentEM, "<br>",
      "Count: ", Count, "<br>",
      "Count MOE: ±", CountEM
    )
  ) %>%
  addPolylines(
    data    = county_borders,
    color   = "black",
    weight  = 1,
    opacity = 0.8,
    group   = "County Borders"
  ) %>%
  addLegend(
    pal      = pal,
    values   = map_sf$Percent,
    position = "topright",
    title    = "Percent"
  ) %>%
  addLayersControl(
    baseGroups    = c(
      "Positron (Light)",
      "ESRI Streets",
      "ESRI Satellite"
    ),
    overlayGroups = c("Data Layer", "County Borders"),
    options       = layersControlOptions(
      collapsed = TRUE,
      position  = "bottomleft"
    )
  )

DivisionMap3

About the code

Several key functions make this script work: