Introduction: Analyzing Graduate Degree Attainment and Median Household Income Across Wisconsin Counties

This project examines the socio-economic landscape of Wisconsin through two key metrics across its seventy-two counties. Part A looks at the percentage of residents holding graduate degrees. Part B analyzes median household income and explores county-level trends relating graduate degrees to income. The analysis draws on data from the American Community Survey (ACS) and uses visualization tools in R, with the results presented as visualizations in each part.

Part A: Analyzing Graduate Degree Percentages in Wisconsin Counties: Visualization and Margin of Error Using R

The goal of Part A of this project is to analyze the percentage of people holding graduate degrees across all seventy-two counties in Wisconsin. Using RStudio, we will leverage data from the American Community Survey (ACS) to conduct this analysis. We will examine the distribution of graduate degree holders by county, identify the counties with the highest and lowest percentages of graduate degree attainment, and create visualizations to highlight these trends. Additionally, we will incorporate margins of error to assess the reliability of the estimates provided by the ACS.

Project Set Up, Data Acquisition, Data Cleaning

library(tidycensus)
library(tidyverse)
library(mapview)
library(plotly)
library(ggiraph)
library(survey)
library(srvyr)


# Clone the repository
system("git clone https://github.com/walkerke/umich-workshop-2023.git")

# Percentage of people with graduate degrees in Wisconsin by county
WI_Grad_Degrees <- get_acs(
  geography = "county",
  state = "WI",
  variables = c(percent_gradute = "DP02_0066P"),
  year = 2021)
glimpse(WI_Grad_Degrees)
## Rows: 72
## Columns: 5
## $ GEOID    <chr> "55001", "55003", "55005", "55007", "55009", "55011", "55013"…
## $ NAME     <chr> "Adams County, Wisconsin", "Ashland County, Wisconsin", "Barr…
## $ variable <chr> "percent_gradute", "percent_gradute", "percent_gradute", "per…
## $ estimate <dbl> 3.7, 6.2, 6.6, 12.4, 9.6, 5.4, 6.9, 8.5, 6.6, 3.5, 7.6, 5.5, …
## $ moe      <dbl> 0.6, 1.1, 0.8, 1.2, 0.5, 0.7, 0.7, 0.8, 0.7, 0.4, 0.7, 0.8, 0…

Top Ten Counties with the Highest Percentage of Individuals with Graduate Degrees

This section identifies the 10 Wisconsin counties with the highest percentage of individuals who have earned a graduate degree. The data is arranged in descending order based on the percentage of graduates (estimate).

# Process the data to get the top 10 counties with the highest percentage of graduates
Cty_Higest_Pct <- WI_Grad_Degrees %>%
  arrange(desc(estimate)) %>%  # Arrange by descending order of percentage of graduates
  slice_head(n = 10)  # Select the top 10 rows
# Display results
Cty_Higest_Pct
## # A tibble: 10 × 5
##    GEOID NAME                         variable        estimate   moe
##    <chr> <chr>                        <chr>              <dbl> <dbl>
##  1 55025 Dane County, Wisconsin       percent_gradute     21.4   0.6
##  2 55089 Ozaukee County, Wisconsin    percent_gradute     19.6   1.3
##  3 55133 Waukesha County, Wisconsin   percent_gradute     15.6   0.4
##  4 55063 La Crosse County, Wisconsin  percent_gradute     13.3   1.1
##  5 55007 Bayfield County, Wisconsin   percent_gradute     12.4   1.2
##  6 55029 Door County, Wisconsin       percent_gradute     11.9   1  
##  7 55079 Milwaukee County, Wisconsin  percent_gradute     11.8   0.3
##  8 55109 St. Croix County, Wisconsin  percent_gradute     11.3   1  
##  9 55035 Eau Claire County, Wisconsin percent_gradute     11.2   0.9
## 10 55059 Kenosha County, Wisconsin    percent_gradute     10.9   0.8

Ten Counties with the Lowest Percentage of Individuals with Graduate Degrees

This section identifies the 10 Wisconsin counties with the lowest percentage of individuals who have earned a graduate degree. The data is arranged in ascending order based on the percentage of graduates (estimate).

# Process the data to get the 10 counties with the lowest percentage of graduates
Cty_lowest_Pct <- WI_Grad_Degrees %>%
  arrange(estimate) %>%  # Arrange by ascending order of percentage of graduates
  slice_head(n = 10)  # Select the first 10 rows
# Display results
Cty_lowest_Pct
## # A tibble: 10 × 5
##    GEOID NAME                        variable        estimate   moe
##    <chr> <chr>                       <chr>              <dbl> <dbl>
##  1 55019 Clark County, Wisconsin     percent_gradute      3.5   0.4
##  2 55001 Adams County, Wisconsin     percent_gradute      3.7   0.6
##  3 55041 Forest County, Wisconsin    percent_gradute      3.7   0.6
##  4 55077 Marquette County, Wisconsin percent_gradute      3.7   0.5
##  5 55083 Oconto County, Wisconsin    percent_gradute      3.9   0.6
##  6 55057 Juneau County, Wisconsin    percent_gradute      4.6   0.7
##  7 55099 Price County, Wisconsin     percent_gradute      4.7   0.9
##  8 55115 Shawano County, Wisconsin   percent_gradute      5     0.5
##  9 55067 Langlade County, Wisconsin  percent_gradute      5.1   1  
## 10 55107 Rusk County, Wisconsin      percent_gradute      5.1   1
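
Several counties share the same estimate (Langlade and Rusk both report 5.1, for example), so a fixed ten-row cut with slice_head() can drop a county that ties with the last row kept. The sketch below assumes dplyr 1.0.0 or later and uses a small hypothetical tibble, toy_counties, rather than the ACS download. "County E" is invented purely to create a tie at the cut-off. It shows how slice_min() keeps ties by default.

```r
library(dplyr)

# Hypothetical toy data: the first four rows echo values from the table above,
# "County E" is invented to create a tie at the cut-off.
toy_counties <- tibble(
  NAME     = c("Clark", "Adams", "Langlade", "Rusk", "County E"),
  estimate = c(3.5, 3.7, 5.1, 5.1, 5.1)
)

# slice_head() after arrange() returns exactly n rows and silently drops ties
fixed_cut <- toy_counties %>%
  arrange(estimate) %>%
  slice_head(n = 3)

# slice_min() keeps every row tied with the n-th smallest value (with_ties = TRUE)
tie_aware <- toy_counties %>%
  slice_min(estimate, n = 3)

nrow(fixed_cut)
nrow(tie_aware)
```

slice_max(estimate, n = 10) is the counterpart for the highest percentages.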

Margin of Error Plot

RStudio and ACS data were used to create a plot of the percentage of graduate degree holders across Wisconsin counties, incorporating margins of error (MOE). The plot displays each county’s estimated percentage of graduates along with error bars representing the MOE, giving a visual indication of the reliability and precision of the ACS estimates. Additionally, an interactive plot was created to allow for enhanced exploration of the data.

# Margin of Error Plot
wi_plot_errorbar <- ggplot(WI_Grad_Degrees, aes(x = estimate, 
                                                y = reorder(NAME, estimate))) + 
  geom_errorbar(aes(xmin = estimate - moe, xmax = estimate + moe),
                width = 1.1, linewidth = 1.1) +
  geom_point(color = "darkred", size = 2) + 
  scale_x_continuous(labels = scales::label_percent(scale = 1)) + 
  scale_y_discrete(labels = function(x) str_remove(x, " County, Wisconsin|, Wisconsin")) + 
  labs(title = "Percentage of Graduates, Wisconsin Counties",
       subtitle = "Data from the American Community Survey",
       caption = "Data acquired with R and tidycensus. Error bars represent margin of error around estimates.",
       x = "ACS estimate",
       y = "") + 
  theme_minimal(base_size = 5) + 
  theme(axis.text.y = element_text(size = 5), 
        axis.text.x = element_text(size = 10, face = "bold"))

# Save the plot as a .jpg file
ggsave("C:/PENNSTATE/GEOG588_Analytical Approaches_/R_Projects/R_Lab5/Outputs/wi_plot_errorbar.jpg", 
       plot = wi_plot_errorbar, 
       width = 14, height = 10, 
       units = "in", dpi = 300)

Interactive Error Plot

# Convert the ggplot to an interactive plotly plot
interactive_plot <- ggplotly(wi_plot_errorbar, tooltip = "x")
interactive_plot

FIGURE 1. Interactive Chart showing percentage of graduate degree holders in various Wisconsin counties, with error bars representing the margin of error around the ACS estimates. This visualization highlights the distribution and reliability of educational attainment data across the state.

Static Error Plot

FIGURE 2. Chart showing percentage of graduate degree holders in various Wisconsin counties, with error bars representing the margin of error around the ACS estimates. This visualization highlights the distribution and reliability of educational attainment data across the state.
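
Overlapping error bars suggest, but do not prove, that two counties are statistically indistinguishable. A rough numerical check converts each MOE to a standard error and compares the difference between two estimates against its combined error. The sketch below assumes the MOEs are published at the ACS default 90 percent confidence level (critical value 1.645) and treats the two estimates as independent. The helper name is_significant_diff is hypothetical; the inputs are taken from the tables above.

```r
# Hypothetical helper: TRUE when two ACS estimates differ at the 90% level.
# Assumes the MOEs are published at 90% confidence (z = 1.645).
is_significant_diff <- function(est1, moe1, est2, moe2, z = 1.645) {
  se1 <- moe1 / z                      # standard error of estimate 1
  se2 <- moe2 / z                      # standard error of estimate 2
  z_score <- abs(est1 - est2) / sqrt(se1^2 + se2^2)
  z_score > z
}

# Dane (21.4 +/- 0.6) vs. Ozaukee (19.6 +/- 1.3)
dane_vs_ozaukee <- is_significant_diff(21.4, 0.6, 19.6, 1.3)

# Clark (3.5 +/- 0.4) vs. Adams (3.7 +/- 0.6)
clark_vs_adams <- is_significant_diff(3.5, 0.4, 3.7, 0.6)
```

Under these assumptions the Dane and Ozaukee estimates differ significantly, while the Clark and Adams estimates do not, so the ordering among the lowest-ranked counties should be read with caution.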

Part B: Analysis of Wisconsin Income and Graduate Degrees by County

Part B explores the relationship between median household income and educational attainment. By merging the income data with the percentage of graduate degree holders, the analysis may reveal potential correlations and geographic disparities. The visualizations provide a glimpse of how household income and graduate degree attainment may interact across the state’s counties.

Project Set Up, Data Acquisition, Data Cleaning

# Load necessary libraries 
library(tidyverse)
library(sf)  # For handling the geometry
library(tidycensus)
library(ggplot2)

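# Clone the workshop repository only if Part A has not already done so
if (!dir.exists("umich-workshop-2023")) {
  system("git clone https://github.com/walkerke/umich-workshop-2023.git")
}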
# Retrieve median household income data with geometry
Wisconsin_income_by_County <- get_acs(
  geography = "county",
  variables = "B19013_001E",
  state = "wi",
  year = 2021,
  geometry = TRUE)
glimpse(Wisconsin_income_by_County)
## Rows: 72
## Columns: 6
## $ GEOID    <chr> "55011", "55111", "55019", "55057", "55131", "55075", "55037"…
## $ NAME     <chr> "Buffalo County, Wisconsin", "Sauk County, Wisconsin", "Clark…
## $ variable <chr> "B19013_001", "B19013_001", "B19013_001", "B19013_001", "B190…
## $ estimate <dbl> 61167, 67702, 57547, 58561, 85574, 55694, 52143, 48908, 58289…
## $ moe      <dbl> 2352, 1885, 1507, 2370, 2146, 1927, 4584, 2251, 2069, 3486, 3…
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-92.08384 4..., MULTIPOLYGON (((…

Median Household Income for Wisconsin Counties

This section presents a visualization of the median household income by county in Wisconsin for the year 2021.

# Load necessary libraries
library(tidyverse)
library(sf)
library(tidycensus)
library(viridis)
library(scales)
# Plot the median household income data
ggplot(Wisconsin_income_by_County) + 
  geom_sf(aes(fill = estimate)) + 
  scale_fill_viridis_c(option = "plasma", labels = dollar) + 
  theme_minimal() + 
  labs(title = "Median Household Income by County in Wisconsin (2021)",
       fill = "Income",
       caption = "Source: American Community Survey 2021")

Figure 1: This choropleth map displays the median household income by county in Wisconsin for the year 2021. The income levels are represented using a color gradient, with the highest incomes shown in yellow and the lowest in dark purple.

Combining Data: Linking Economic and Educational Metrics

In this section, we integrate two datasets to provide a more comprehensive view of Wisconsin’s socio-economic landscape. By joining the county-level data on median household income and graduate degrees, we can look for relationships between income levels and educational attainment across the state. The merged dataset shows how the distribution of graduate degrees corresponds with income levels, providing insights into potential correlations and disparities.

# Join the two dataframes on the GEOID field
merged_data <- WI_Grad_Degrees %>%
  left_join(Wisconsin_income_by_County, by = "GEOID")

# View first few rows of the merged dataframe
head(merged_data)
## # A tibble: 6 × 10
##   GEOID NAME.x    variable.x estimate.x moe.x NAME.y variable.y estimate.y moe.y
##   <chr> <chr>     <chr>           <dbl> <dbl> <chr>  <chr>           <dbl> <dbl>
## 1 55001 Adams Co… percent_g…        3.7   0.6 Adams… B19013_001      51878  2534
## 2 55003 Ashland … percent_g…        6.2   1.1 Ashla… B19013_001      55070  3527
## 3 55005 Barron C… percent_g…        6.6   0.8 Barro… B19013_001      55256  1873
## 4 55007 Bayfield… percent_g…       12.4   1.2 Bayfi… B19013_001      62859  2368
## 5 55009 Brown Co… percent_g…        9.6   0.5 Brown… B19013_001      68799  1469
## 6 55011 Buffalo … percent_g…        5.4   0.7 Buffa… B19013_001      61167  2352
## # ℹ 1 more variable: geometry <MULTIPOLYGON [°]>
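
The default .x and .y suffixes make the joined columns hard to read. The sketch below shows one way to give the columns descriptive names before joining, check that every county found a match, and then put a number on the relationship. It uses only the six counties printed above, retyped into the hypothetical tibbles grad_toy and income_toy, so the resulting correlation is illustrative and says nothing definitive about all seventy-two counties.

```r
library(dplyr)

# Six rows copied from the head() output above (hypothetical toy tibbles)
grad_toy <- tibble(
  GEOID    = c("55001", "55003", "55005", "55007", "55009", "55011"),
  pct_grad = c(3.7, 6.2, 6.6, 12.4, 9.6, 5.4)
)
income_toy <- tibble(
  GEOID         = c("55001", "55003", "55005", "55007", "55009", "55011"),
  median_income = c(51878, 55070, 55256, 62859, 68799, 61167)
)

# Descriptive column names avoid the estimate.x / estimate.y ambiguity
joined_toy <- grad_toy %>%
  left_join(income_toy, by = "GEOID")

# Counties present in the first table but missing from the second (should be none)
unmatched <- anti_join(grad_toy, income_toy, by = "GEOID")

# Pearson correlation between graduate share and median household income
r <- cor(joined_toy$pct_grad, joined_toy$median_income)
```

The same pattern could be applied to the full data by renaming estimate in each table before the join. Whether the correlation holds statewide would need to be checked on all seventy-two rows.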

Visualizing Graduate Degrees by County

This section visualizes educational attainment across Wisconsin by mapping the percentage of residents with graduate degrees in each county. Using data from the American Community Survey 2021, a choropleth map was created that highlights the distribution of higher education throughout the state.

library(tidyverse)
library(sf)
library(ggplot2)

# Convert merged_data to an sf object if it isn't already
merged_data_sf <- st_as_sf(merged_data)

# Transform the CRS to WGS 84 (EPSG: 4326)
merged_data_sf <- st_transform(merged_data_sf, crs = 4326)

# Plot the percentage of graduates by county 
ggplot(merged_data_sf) + 
  geom_sf(aes(fill = estimate.x, geometry = geometry)) + 
  scale_fill_viridis_c(option = "plasma", labels = scales::percent_format(scale = 1)) + 
  theme_minimal() + 
  labs(title = "Percentage of Graduates by County in Wisconsin (2021)",
       fill = "Percentage of Graduates",
       caption = "Source: American Community Survey 2021")

Figure 2: The visualization reveals significant variations in the percentage of graduates across Wisconsin counties.

Visualizing Graduate Degrees by County Income Levels

This section contributes to the analysis by integrating median household income data with the percentage of graduates for each county in Wisconsin. The merged data is converted into an sf object, and centroids are calculated to effectively visualize both data sets on a single map.

library(tidyverse)  # provides %>% and mutate() used below
library(sf)
library(ggplot2)
# Convert to sf object
merged_data_sf <- st_as_sf(merged_data)

# Calculate the centroids of each county
centroids <- st_centroid(merged_data_sf)

# Extract coordinates from centroids
centroids_coords <- st_coordinates(centroids)

# Add coordinates to the data
merged_data_sf <- merged_data_sf %>%
  mutate(longitude = centroids_coords[, 1], latitude = centroids_coords[, 2])

# Plot median household income data and add points for percentage of graduate
library(tidyverse)
library(sf)
library(ggplot2)
library(tidycensus)
ggplot(merged_data_sf) + 
  geom_sf(aes(fill = estimate.y, geometry = geometry)) + 
  geom_point(aes(x = longitude, y = latitude, size = estimate.x), 
             color = "green", alpha = 0.6, shape = 21) +
  scale_fill_viridis_c(option = "plasma", labels = scales::dollar) + 
  scale_size_continuous(labels = scales::percent_format(scale = 1)) +
  theme_minimal() + 
  labs(title = "Median Household Income by County in Wisconsin (2021)",
       subtitle = "Overlaid with Percentage of Graduates",
       fill = "Median Household Income",
       size = "Percentage of Graduates",
       caption = "Source: American Community Survey 2021")

Figure 3: This map depicts the median household income by county in Wisconsin for the year 2021, using a color gradient to represent income levels. Superimposed on this map are green points indicating the percentage of residents with graduate degrees, with the size of each point proportional to the percentage.
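
The overlay relies on one representative point per county. As a toy illustration of that step, the sketch below assumes the sf package and uses a hypothetical 2-by-2 square in place of a county boundary. st_centroid() returns the geometric centre and st_coordinates() extracts it as a numeric matrix with X and Y columns. For irregular shapes the centroid can fall outside the polygon; sf's st_point_on_surface() is an alternative in that case.

```r
library(sf)

# Hypothetical square "county" with corners (0,0) and (2,2); planar coordinates, no CRS
square <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
toy_sf <- st_sf(name = "Toy County", geometry = st_sfc(square))

# Centroid of the polygon geometry, then its x/y coordinates as a one-row matrix
toy_centroid <- st_centroid(st_geometry(toy_sf))
toy_coords   <- st_coordinates(toy_centroid)

toy_coords[, "X"]
toy_coords[, "Y"]
```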

Visualizing Graduate Degrees by County Income Levels with Interactive Map

This section leverages the mapview package to create an interactive map of median household income by county in Wisconsin. By using custom colors from the rocket palette in the viridisLite package, the map provides a dynamic way to explore income data across the state. Clicking on a county reveals a popup that shows the county name, the percentage of individuals with graduate degrees, and the county’s median household income (estimate.y).

#Customizing Interactive Map with Mapview
library(tidyverse)
library(sf)
library(mapview)
library(viridisLite)

# Generate colors using the 'rocket' palette
colors <- rocket(n = 100)

# Ensure the data is in the correct CRS for mapview
merged_data_sf <- st_transform(merged_data_sf, 4326)

# Create interactive map with custom colors
Interactive_map <- mapview(merged_data_sf, zcol = "estimate.y", 
               layer.name = "Median Household Income",
               col.regions = colors)

# Display the interactive map
Interactive_map

Figure 4: This interactive map displays the median household income by county in Wisconsin for the year 2021, using a custom color palette from the viridisLite package. Data source: American Community Survey 2021.

Conclusions

This project provided an insightful look into Wisconsin’s socio-economic landscape, focusing on the percentage of residents with graduate degrees and median household income across the state’s seventy-two counties. By analyzing and visualizing these datasets, several significant patterns and relationships emerged.

Part A: Graduate Degree Percentages. The analysis of graduate degree percentages revealed notable disparities in educational attainment across Wisconsin counties. Certain regions demonstrated a higher concentration of residents with graduate degrees, while others lagged behind. I was happy to see that Bayfield County has the fifth-highest percentage of residents with graduate degrees in the state. That was a surprising result, as the county is very rural and there is only one small college within sixty miles of it.

Part B: Income and Education Relationship. By merging the income data with the educational attainment data, the analysis uncovered potential correlations between median household income and the percentage of graduate degree holders. Visualizations illustrated how these socio-economic factors intersect, highlighting areas of economic prosperity and graduate degree attainment. The interactive maps provided a dynamic tool for exploring these relationships, offering insights for policymakers, community organizations, and citizens who enjoy living in an area where education is valued.