Introduction

This report focuses on techniques for presenting American Community Survey (ACS) data.

We will walk through:

  • Retrieving ACS data using ‘tidycensus’
  • Cleaning and preparing the data for visualization
  • Generating an interactive margin of error plot
  • Creating an interactive map using ‘mapview’
  • Designing a static choropleth map using ‘ggplot2’

This report is divided into two sections:

  1. Part A: Graduate Degree Holders by County in New Jersey
  2. Part B: Mapping Median Household Income in North Carolina

Data Preparation

Loading Required Libraries

Load the R packages needed to retrieve, reshape, and visualize the ACS data.
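
The package-loading code is not shown in this report. A minimal sketch, inferred from the functions called later (get_acs(), str_remove(), girafe(), mapview(), geom_sf()), might look like the following. The exact package list is an assumption, not the author's confirmed setup.

```r
# Assumed package list, inferred from the functions used in this report
library(tidycensus)  # get_acs()
library(tidyverse)   # dplyr verbs, stringr::str_remove(), ggplot2
library(ggiraph)     # geom_point_interactive(), geom_errorbar_interactive(), girafe()
library(mapview)     # mapview()
library(sf)          # spatial data frames returned when geometry = TRUE
```

tidycensus generally also needs a Census API key, which can be registered with census_api_key().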

Part A: Graduate Degree Holders by County in New Jersey

To illustrate the margin of error visualization, we retrieve the percentage of each county's population with a graduate degree. County-level estimates are a good case for error bars, which show the uncertainty around each estimate.

Retrieve ACS Data

# Get data on graduate degree holders in NJ
nj_grad_degree <- get_acs(
  geography = "county",
  state = "New Jersey",
  variables = "DP02_0066P",  # ACS Data Profile variable: % of population 25+ with a graduate or professional degree
  year = 2021,
  survey = "acs5"
)

# Rename columns for clarity
colnames(nj_grad_degree) <- c("GEOID", "County", "Variable", "Estimate", "MOE")

# Preview the dataset
head(nj_grad_degree)
## # A tibble: 6 × 5
##   GEOID County                        Variable   Estimate   MOE
##   <chr> <chr>                         <chr>         <dbl> <dbl>
## 1 34001 Atlantic County, New Jersey   DP02_0066P     10.1   0.6
## 2 34003 Bergen County, New Jersey     DP02_0066P     20.2   0.5
## 3 34005 Burlington County, New Jersey DP02_0066P     14.2   0.5
## 4 34007 Camden County, New Jersey     DP02_0066P     13.1   0.4
## 5 34009 Cape May County, New Jersey   DP02_0066P     13     0.9
## 6 34011 Cumberland County, New Jersey DP02_0066P      5.9   0.6

Identify the Top and Bottom 5 Counties

# Filter data to the top 5 and bottom 5 counties
top5 <- nj_grad_degree %>% arrange(desc(Estimate)) %>% head(5)
bottom5 <- nj_grad_degree %>% arrange(Estimate) %>% head(5)

# Print results 
print(top5)
## # A tibble: 5 × 5
##   GEOID County                       Variable   Estimate   MOE
##   <chr> <chr>                        <chr>         <dbl> <dbl>
## 1 34035 Somerset County, New Jersey  DP02_0066P     25.7   0.8
## 2 34027 Morris County, New Jersey    DP02_0066P     23.5   0.6
## 3 34019 Hunterdon County, New Jersey DP02_0066P     22.6   1.1
## 4 34003 Bergen County, New Jersey    DP02_0066P     20.2   0.5
## 5 34021 Mercer County, New Jersey    DP02_0066P     20.2   0.7
print(bottom5)
## # A tibble: 5 × 5
##   GEOID County                        Variable   Estimate   MOE
##   <chr> <chr>                         <chr>         <dbl> <dbl>
## 1 34011 Cumberland County, New Jersey DP02_0066P      5.9   0.6
## 2 34033 Salem County, New Jersey      DP02_0066P      7.6   0.9
## 3 34031 Passaic County, New Jersey    DP02_0066P      9.6   0.5
## 4 34001 Atlantic County, New Jersey   DP02_0066P     10.1   0.6
## 5 34029 Ocean County, New Jersey      DP02_0066P     11.1   0.4
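
Bergen and Mercer counties share the same estimate (20.2) in the output above, which is worth keeping in mind when picking a fixed number of rows. As an alternative sketch, dplyr's slice_max() and slice_min() make the tie handling explicit. The toy table below reuses a few values from the printed output with shortened county names; grad_demo and the result names are hypothetical.

```r
library(dplyr)

# Toy subset copied from the printed output above (county names shortened)
grad_demo <- tibble(
  County   = c("Somerset", "Morris", "Bergen", "Mercer", "Cumberland", "Salem"),
  Estimate = c(25.7, 23.5, 20.2, 20.2, 5.9, 7.6)
)

# Default with_ties = TRUE keeps both tied counties, so n = 3 can return 4 rows
top3_with_ties <- grad_demo %>% slice_max(Estimate, n = 3)

# with_ties = FALSE caps the result at exactly n rows
top3_exact <- grad_demo %>% slice_max(Estimate, n = 3, with_ties = FALSE)

# Lowest two estimates, sorted from smallest upward
bottom2 <- grad_demo %>% slice_min(Estimate, n = 2, with_ties = FALSE)
```

The arrange() plus head() approach used above always returns exactly five rows, which is the same behavior as with_ties = FALSE.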

Interactive Margin of Error Plot

The interactive plot visualizes graduate degree percentages along with their margin of error, enabling users to hover over counties for details.

# Remove "County, New Jersey" from county names and order counties by estimate
nj_grad_degree <- nj_grad_degree %>%
  mutate(County = str_remove(County, " County, New Jersey|, New Jersey")) %>%
  arrange(Estimate) %>%
  mutate(County = factor(County, levels = County))  

# Create interactive ggplot
p <- ggplot(nj_grad_degree, aes(x = County, y = Estimate)) +
  geom_point_interactive(aes(
    tooltip = paste0(County, ": ", round(Estimate, 1), "%"),  
    data_id = County
  ), color = "blue", size = 3) +
  
  geom_errorbar_interactive(aes(
    ymin = Estimate - MOE, 
    ymax = Estimate + MOE, 
    tooltip = paste0("MOE: ±", round(MOE, 1), "%"),
    data_id = County
  ), width = 0.4, color = "red") +
  
  coord_flip() +  
  labs(
    title = "Graduate Degree Holders (%) by County in New Jersey",
    subtitle = "Including margin of error from ACS 5-Year Estimates (2021)",
    x = "County",
    y = "Percentage (%)",
    caption = "Data Source: U.S. Census Bureau, American Community Survey 5-Year Estimates (2021)"
  ) +
  theme_minimal() +
  theme(axis.text.y = element_text(size = 10))

# Render interactive plot
girafe(ggobj = p, width_svg = 8, height_svg = 6)
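
ACS margins of error are published at the 90 percent confidence level, and each error bar in the plot spans Estimate - MOE to Estimate + MOE. As a rough reading aid, the following sketch recomputes those bounds for the bottom five counties, using values copied from the printed output above, and checks whether two intervals overlap. The helper intervals_overlap() and the object moe_demo are hypothetical names introduced for illustration, and overlap is only an informal visual check, not a formal significance test.

```r
# Values copied from the bottom-five output above (county names shortened)
moe_demo <- data.frame(
  County   = c("Cumberland", "Salem", "Passaic", "Atlantic", "Ocean"),
  Estimate = c(5.9, 7.6, 9.6, 10.1, 11.1),
  MOE      = c(0.6, 0.9, 0.5, 0.6, 0.4)
)

# Bounds of the error bars drawn in the plot
moe_demo$Lower <- moe_demo$Estimate - moe_demo$MOE
moe_demo$Upper <- moe_demo$Estimate + moe_demo$MOE

# Hypothetical helper: TRUE when the intervals of two counties share any values
intervals_overlap <- function(df, a, b) {
  x <- df[df$County == a, ]
  y <- df[df$County == b, ]
  x$Lower <= y$Upper && y$Lower <= x$Upper
}

passaic_atlantic <- intervals_overlap(moe_demo, "Passaic", "Atlantic")
cumberland_salem <- intervals_overlap(moe_demo, "Cumberland", "Salem")
```

Here the Passaic and Atlantic intervals overlap, so their ranking is less certain than the point estimates suggest, while the Cumberland and Salem intervals do not overlap.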

Part B: Mapping Median Household Income in North Carolina

This section focuses on spatial visualization, using mapview and ggplot2 to map median household income in North Carolina.

Retrieve Spatial ACS Data

Use get_acs() with geometry = TRUE to include spatial boundaries for the counties.
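
The retrieval code is not shown here. A minimal sketch, assuming the ACS variable B19013_001 (median household income) and the same 2021 5-year survey used in Part A, might look like this. The object name nc_income and the estimate column match the map code in the following sections; the variable choice and year are assumptions.

```r
library(tidycensus)

# Assumes a Census API key is already registered, e.g. via census_api_key()
nc_income <- get_acs(
  geography = "county",
  state = "North Carolina",
  variables = "B19013_001",  # assumed variable: median household income
  year = 2021,
  survey = "acs5",
  geometry = TRUE            # return an sf object with county boundaries
)

# Preview the dataset; estimate and moe keep their default tidycensus names
head(nc_income)
```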

Interactive Map

This interactive map allows users to explore median household income across North Carolina's counties.

# Display interactive map of median household income
mapview(nc_income, zcol = "estimate")

Static Choropleth Map

A choropleth map was chosen to visualize the income distribution, with color intensity representing each county's median household income.

# Display choropleth map
ggplot(nc_income) +
  geom_sf(aes(fill = estimate), color = "white", lwd = 0.2) +
  scale_fill_viridis_c(option = "magma", name = "Median Income ($)") +
  labs(
    title = "Median Household Income by County in North Carolina",
    caption = "Data Source: ACS 5-Year Estimates (2021)"
  ) +
  theme_minimal()
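
To try the same geom_sf() and scale_fill_viridis_c() pattern without a Census API key, one option is the North Carolina county shapefile bundled with the sf package. In this sketch the BIR74 column (a birth count included in that file) stands in for income purely to provide a numeric fill variable; nc_demo and p_demo are hypothetical names.

```r
library(sf)
library(ggplot2)

# North Carolina counties shipped with sf; no download or API key required
nc_demo <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Same choropleth pattern as above, with BIR74 as a stand-in fill variable
p_demo <- ggplot(nc_demo) +
  geom_sf(aes(fill = BIR74), color = "white") +
  scale_fill_viridis_c(option = "magma", name = "BIR74 (births)") +
  labs(title = "Choropleth pattern on the sf demo shapefile") +
  theme_minimal()

p_demo
```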

Conclusion

This report highlights several visualization techniques for presenting ACS data. By combining interactive and static visualizations, it shows how raw demographic and economic estimates, along with their margins of error, can be turned into clear and informative graphics.