This project examines the socio-economic landscape of Wisconsin through two key metrics across its seventy-two counties. Part A looks at the percentage of residents holding graduate degrees. Part B analyzes median household income and explores county-level trends relating graduate degrees to median household income. The analysis uses data from the American Community Survey (ACS) and visualization tools in R, and the results of each part are presented as visualizations.
The goal of Part A of this project is to analyze the percentage of people holding graduate degrees across all seventy-two counties in Wisconsin. Using RStudio, we will leverage data from the American Community Survey (ACS) to conduct this analysis. We will examine the distribution of graduate degree holders by county, identify the counties with the highest and lowest percentages of graduate degree attainment, and create visualizations to highlight these trends. Additionally, we will incorporate margins of error to assess the reliability of the estimates provided by the ACS.
library(tidycensus)
library(tidyverse)
library(mapview)
library(plotly)
library(ggiraph)
library(survey)
library(srvyr)
# Clone the repository
system("git clone https://github.com/walkerke/umich-workshop-2023.git")
# Percentage of people with graduate degrees in Wisconsin by county
WI_Grad_Degrees <- get_acs(
geography = "county",
state = "WI",
variables = c(percent_gradute = "DP02_0066P"),
year = 2021)
glimpse(WI_Grad_Degrees)
## Rows: 72
## Columns: 5
## $ GEOID <chr> "55001", "55003", "55005", "55007", "55009", "55011", "55013"…
## $ NAME <chr> "Adams County, Wisconsin", "Ashland County, Wisconsin", "Barr…
## $ variable <chr> "percent_gradute", "percent_gradute", "percent_gradute", "per…
## $ estimate <dbl> 3.7, 6.2, 6.6, 12.4, 9.6, 5.4, 6.9, 8.5, 6.6, 3.5, 7.6, 5.5, …
## $ moe <dbl> 0.6, 1.1, 0.8, 1.2, 0.5, 0.7, 0.7, 0.8, 0.7, 0.4, 0.7, 0.8, 0…
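The margins of error in the `moe` column can be turned into a rough reliability measure. The sketch below is an illustration rather than part of the original analysis. It assumes the ACS default 90 percent confidence level (so the standard error is `moe / 1.645`) and computes a coefficient of variation for the first four counties in the output above. The object name `acs_sample` and the columns `se` and `cv_pct` are made up for this example, and it uses base R only.

```r
# Hypothetical sample: the first four rows of the glimpse() output above
acs_sample <- data.frame(
  NAME     = c("Adams County, Wisconsin", "Ashland County, Wisconsin",
               "Barron County, Wisconsin", "Bayfield County, Wisconsin"),
  estimate = c(3.7, 6.2, 6.6, 12.4),
  moe      = c(0.6, 1.1, 0.8, 1.2)
)

# ACS margins of error are published at the 90 percent confidence level,
# so dividing by 1.645 recovers the standard error
acs_sample$se <- acs_sample$moe / 1.645

# Coefficient of variation: standard error as a percentage of the estimate
acs_sample$cv_pct <- 100 * acs_sample$se / acs_sample$estimate

# Sort so the least precise estimate (largest CV) comes first
acs_sample[order(-acs_sample$cv_pct), c("NAME", "estimate", "moe", "cv_pct")]
```

A larger `cv_pct` means the estimate is less precise relative to its own size, which is the same information the error bars convey visually.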
This section identifies the 10 Wisconsin counties with the highest percentage of individuals who have earned a graduate degree. The data is arranged in descending order based on the percentage of graduates (estimate).
# Process the data to get the top 10 counties with the highest percentage of graduates
Cty_Higest_Pct <- WI_Grad_Degrees %>%
arrange(desc(estimate)) %>% # Arrange by descending order of percentage of graduates
slice_head(n = 10) # Select the top 10 rows
# Display results
Cty_Higest_Pct
## # A tibble: 10 × 5
## GEOID NAME variable estimate moe
## <chr> <chr> <chr> <dbl> <dbl>
## 1 55025 Dane County, Wisconsin percent_gradute 21.4 0.6
## 2 55089 Ozaukee County, Wisconsin percent_gradute 19.6 1.3
## 3 55133 Waukesha County, Wisconsin percent_gradute 15.6 0.4
## 4 55063 La Crosse County, Wisconsin percent_gradute 13.3 1.1
## 5 55007 Bayfield County, Wisconsin percent_gradute 12.4 1.2
## 6 55029 Door County, Wisconsin percent_gradute 11.9 1
## 7 55079 Milwaukee County, Wisconsin percent_gradute 11.8 0.3
## 8 55109 St. Croix County, Wisconsin percent_gradute 11.3 1
## 9 55035 Eau Claire County, Wisconsin percent_gradute 11.2 0.9
## 10 55059 Kenosha County, Wisconsin percent_gradute 10.9 0.8
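Because each estimate carries a margin of error, adjacent ranks in this table are not always distinguishable. As a rough check, and assuming the usual ACS approach of comparing two independent estimates with a z-score built from their standard errors, the sketch below tests two pairs from the table. The helper name `differs_at_90` is invented for this illustration.

```r
# Hypothetical helper: TRUE when two ACS estimates differ at the
# 90 percent confidence level, given their published margins of error
differs_at_90 <- function(est1, moe1, est2, moe2) {
  se1 <- moe1 / 1.645                    # standard error of estimate 1
  se2 <- moe2 / 1.645                    # standard error of estimate 2
  z   <- (est1 - est2) / sqrt(se1^2 + se2^2)
  abs(z) > 1.645
}

# Dane (21.4 +/- 0.6) vs. Ozaukee (19.6 +/- 1.3), ranks 1 and 2
dane_vs_ozaukee <- differs_at_90(21.4, 0.6, 19.6, 1.3)

# Door (11.9 +/- 1.0) vs. Milwaukee (11.8 +/- 0.3), ranks 6 and 7
door_vs_milwaukee <- differs_at_90(11.9, 1.0, 11.8, 0.3)

dane_vs_ozaukee    # TRUE
door_vs_milwaukee  # FALSE
```

Under these assumptions, Dane and Ozaukee are distinguishable, while the gap between Door and Milwaukee is within sampling error, so their order in the table should not be over-interpreted.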
This section identifies the 10 Wisconsin counties with the lowest percentage of individuals who have earned a graduate degree. The data is arranged in ascending order based on the percentage of graduates (estimate).
# Process the data to get the 10 counties with the lowest percentage of graduates
Cty_lowest_Pct <- WI_Grad_Degrees %>%
arrange(estimate) %>% # Arrange by ascending order of percentage of graduates
slice_head(n = 10) # Select the first 10 rows
# Display results
Cty_lowest_Pct
## # A tibble: 10 × 5
## GEOID NAME variable estimate moe
## <chr> <chr> <chr> <dbl> <dbl>
## 1 55019 Clark County, Wisconsin percent_gradute 3.5 0.4
## 2 55001 Adams County, Wisconsin percent_gradute 3.7 0.6
## 3 55041 Forest County, Wisconsin percent_gradute 3.7 0.6
## 4 55077 Marquette County, Wisconsin percent_gradute 3.7 0.5
## 5 55083 Oconto County, Wisconsin percent_gradute 3.9 0.6
## 6 55057 Juneau County, Wisconsin percent_gradute 4.6 0.7
## 7 55099 Price County, Wisconsin percent_gradute 4.7 0.9
## 8 55115 Shawano County, Wisconsin percent_gradute 5 0.5
## 9 55067 Langlade County, Wisconsin percent_gradute 5.1 1
## 10 55107 Rusk County, Wisconsin percent_gradute 5.1 1
RStudio and ACS data were used to create a plot visualizing the percentage of graduate degree holders across Wisconsin counties, incorporating margins of error (MOE). The plot displays each county’s estimated percentage of graduates along with error bars representing the MOE. This is a visual representation of the reliability and precision of the estimates provided by the ACS. Additionally, an interactive plot was created to allow for enhanced exploration of the data.
# Margin of Error Plot
wi_plot_errorbar <- ggplot(WI_Grad_Degrees, aes(x = estimate,
y = reorder(NAME, estimate))) +
geom_errorbar(aes(xmin = estimate - moe, xmax = estimate + moe),
width = 1.1, linewidth = 1.1) +
geom_point(color = "darkred", size = 2) +
scale_x_continuous(labels = scales::label_percent(scale = 1)) +
scale_y_discrete(labels = function(x) str_remove(x, " County, Wisconsin|, Wisconsin")) +
labs(title = "Percentage of Graduates, Wisconsin Counties",
subtitle = "Data from the American Community Survey",
caption = "Data acquired with R and tidycensus. Error bars represent margin of error around estimates.",
x = "ACS estimate",
y = "") +
theme_minimal(base_size = 5) +
theme(axis.text.y = element_text(size = 5),
axis.text.x = element_text(size = 10, face = "bold"))
# Save the plot as a .jpg file
ggsave("C:/PENNSTATE/GEOG588_Analytical Approaches_/R_Projects/R_Lab5/Outputs/wi_plot_errorbar.jpg",
plot = wi_plot_errorbar,
width = 14, height = 10,
units = "in", dpi = 300)
# Convert the ggplot to an interactive plotly plot
interactive_plot <- ggplotly(wi_plot_errorbar, tooltip = "x")
interactive_plot
FIGURE 1. Interactive Chart showing percentage of graduate degree holders in various Wisconsin counties, with error bars representing the margin of error around the ACS estimates. This visualization highlights the distribution and reliability of educational attainment data across the state.
FIGURE 2. Chart showing percentage of graduate degree holders in various Wisconsin counties, with error bars representing the margin of error around the ACS estimates. This visualization highlights the distribution and reliability of educational attainment data across the state.
Part B explores the relationship between median household income and educational attainment. By merging the income data with the percentage of graduate degree holders, the analysis may reveal potential correlations and geographic disparities. The visualizations will provide a glimpse of how median household income and graduate degree attainment may interact across the state’s counties.
# Load necessary libraries
library(tidyverse)
library(sf) # For handling the geometry
library(tidycensus)
library(ggplot2)
system("git clone https://github.com/walkerke/umich-workshop-2023.git")
# Retrieve median household income data with geometry
Wisconsin_income_by_County <- get_acs(
geography = "county",
variables = "B19013_001E",
state = "wi",
year = 2021,
geometry = TRUE)
glimpse(Wisconsin_income_by_County)
## Rows: 72
## Columns: 6
## $ GEOID <chr> "55011", "55111", "55019", "55057", "55131", "55075", "55037"…
## $ NAME <chr> "Buffalo County, Wisconsin", "Sauk County, Wisconsin", "Clark…
## $ variable <chr> "B19013_001", "B19013_001", "B19013_001", "B19013_001", "B190…
## $ estimate <dbl> 61167, 67702, 57547, 58561, 85574, 55694, 52143, 48908, 58289…
## $ moe <dbl> 2352, 1885, 1507, 2370, 2146, 1927, 4584, 2251, 2069, 3486, 3…
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-92.08384 4..., MULTIPOLYGON (((…
This section presents a visualization of the median household income by county in Wisconsin for the year 2021.
# Load necessary libraries
library(tidyverse)
library(sf)
library(tidycensus)
library(viridis)
library(scales)
# Plot the median household income data
ggplot(Wisconsin_income_by_County) +
geom_sf(aes(fill = estimate)) +
scale_fill_viridis_c(option = "plasma", labels = dollar) +
theme_minimal() +
labs(title = "Median Household Income by County in Wisconsin (2021)",
fill = "Income",
caption = "Source: American Community Survey 2021")
Figure 1: This choropleth map displays the median household income by county in Wisconsin for the year 2021. The income levels are represented using a color gradient, with the highest incomes shown in yellow and the lowest in dark purple.
In this section, we integrate the two datasets to provide a more comprehensive view of Wisconsin’s socio-economic landscape. By joining the data on median household income and graduate degrees by county, we can look for relationships between income levels and educational attainment across the state. The merged dataset places each county’s graduate degree percentage (estimate.x) next to its median household income (estimate.y), which makes potential correlations and disparities easier to examine.
# Join the two dataframes on the GEOID field
merged_data <- WI_Grad_Degrees %>%
left_join(Wisconsin_income_by_County, by = "GEOID")
# View first few rows of the merged dataframe
head(merged_data)
## # A tibble: 6 × 10
## GEOID NAME.x variable.x estimate.x moe.x NAME.y variable.y estimate.y moe.y
## <chr> <chr> <chr> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
## 1 55001 Adams Co… percent_g… 3.7 0.6 Adams… B19013_001 51878 2534
## 2 55003 Ashland … percent_g… 6.2 1.1 Ashla… B19013_001 55070 3527
## 3 55005 Barron C… percent_g… 6.6 0.8 Barro… B19013_001 55256 1873
## 4 55007 Bayfield… percent_g… 12.4 1.2 Bayfi… B19013_001 62859 2368
## 5 55009 Brown Co… percent_g… 9.6 0.5 Brown… B19013_001 68799 1469
## 6 55011 Buffalo … percent_g… 5.4 0.7 Buffa… B19013_001 61167 2352
## # ℹ 1 more variable: geometry <MULTIPOLYGON [°]>
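The `.x` and `.y` suffixes in the joined output are easy to mix up later (`estimate.x` is the graduate percentage and `estimate.y` is income). One option, shown here as a small base R sketch on three rows copied from the output above rather than as the method used in this project, is to rename the columns before joining. All object and column names in the sketch are illustrative.

```r
# Hypothetical samples built from the first three rows of the output above
grad <- data.frame(
  GEOID    = c("55001", "55003", "55005"),
  estimate = c(3.7, 6.2, 6.6),
  moe      = c(0.6, 1.1, 0.8)
)
income <- data.frame(
  GEOID    = c("55001", "55003", "55005"),
  estimate = c(51878, 55070, 55256),
  moe      = c(2534, 3527, 1873)
)

# Give each table descriptive column names before joining
names(grad)   <- c("GEOID", "pct_grad", "pct_grad_moe")
names(income) <- c("GEOID", "median_income", "median_income_moe")

# all.x = TRUE keeps every row of the left table, like left_join()
merged_sample <- merge(grad, income, by = "GEOID", all.x = TRUE)
merged_sample
```

With descriptive names, later plotting code can refer to `pct_grad` and `median_income` directly instead of relying on suffix order.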
This section visualizes educational attainment across Wisconsin by mapping the percentage of residents with graduate degrees in each county. Using data from the American Community Survey 2021, a choropleth map was created that highlights the distribution of graduate degree attainment throughout the state.
library(tidyverse)
library(sf)
library(ggplot2)
# Convert merged_data to an sf object if it isn't already
merged_data_sf <- st_as_sf(merged_data)
# Transform the CRS to WGS 84 (EPSG: 4326)
merged_data_sf <- st_transform(merged_data_sf, crs = 4326)
# Plot the percentage of graduates by county
ggplot(merged_data_sf) +
geom_sf(aes(fill = estimate.x, geometry = geometry)) +
scale_fill_viridis_c(option = "plasma", labels = scales::percent_format(scale = 1)) +
theme_minimal() +
labs(title = "Percentage of Graduates by County in Wisconsin (2021)",
fill = "Percentage of Graduates",
caption = "Source: American Community Survey 2021")
Figure 2: This choropleth map shows the percentage of residents with graduate degrees by county in Wisconsin for 2021. The percentage varies widely across counties, from 3.5% in Clark County to 21.4% in Dane County.
This section contributes to the analysis by integrating median household income data with the percentage of graduates for each county in Wisconsin. The merged data is converted into an sf object, and centroids are calculated to effectively visualize both data sets on a single map.
#library(tidyverse)
library(sf)
library(ggplot2)
# Convert to sf object
merged_data_sf <- st_as_sf(merged_data)
# Calculate the centroids of each county
centroids <- st_centroid(merged_data_sf)
# Extract coordinates from centroids
centroids_coords <- st_coordinates(centroids)
# Add coordinates to the data
merged_data_sf <- merged_data_sf %>%
mutate(longitude = centroids_coords[, 1], latitude = centroids_coords[, 2])
# Plot median household income data and add points for percentage of graduate
library(tidyverse)
library(sf)
library(ggplot2)
library(tidycensus)
ggplot(merged_data_sf) +
geom_sf(aes(fill = estimate.y, geometry = geometry)) +
geom_point(aes(x = longitude, y = latitude, size = estimate.x),
color = "green", alpha = 0.6, shape = 21) +
scale_fill_viridis_c(option = "plasma", labels = scales::dollar) +
scale_size_continuous(labels = scales::percent_format(scale = 1)) +
theme_minimal() +
labs(title = "Median Household Income by County in Wisconsin (2021)",
subtitle = "Overlaid with Percentage of Graduates",
fill = "Median Household Income",
size = "Percentage of Graduates",
caption = "Source: American Community Survey 2021")
Figure 3: This map depicts the median household income by county in Wisconsin for the year 2021, using a color gradient to represent income levels. Superimposed on this map are green points indicating the percentage of residents with graduate degrees, with the size of each point proportional to the percentage.
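A map overlay shows the relationship visually, while a correlation coefficient summarizes it in one number. The sketch below is an illustration only: it uses the six counties printed in the `head(merged_data)` output earlier, so its result says nothing definitive about all seventy-two counties. With the full data, the same calls would be applied to `estimate.x` and `estimate.y`. The object and column names are made up for this example.

```r
# Hypothetical sample: the six counties shown in head(merged_data)
sample_counties <- data.frame(
  county        = c("Adams", "Ashland", "Barron", "Bayfield", "Brown", "Buffalo"),
  pct_grad      = c(3.7, 6.2, 6.6, 12.4, 9.6, 5.4),
  median_income = c(51878, 55070, 55256, 62859, 68799, 61167)
)

# Pearson correlation between graduate-degree share and median income
r <- cor(sample_counties$pct_grad, sample_counties$median_income)
r

# Least-squares line: estimated change in income per percentage point
fit <- lm(median_income ~ pct_grad, data = sample_counties)
coef(fit)
```

A positive `r` and a positive slope on `pct_grad` would be consistent with the pattern suggested by the map, though six counties are far too few to draw conclusions.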
This section leverages the mapview package to create an interactive map of median household income by county in Wisconsin. By using custom colors from the rocket palette in the viridisLite package, the map provides a dynamic way to explore income data across the state. Clicking on a county reveals a popup that shows the county name, the percentage of individuals with graduate degrees (estimate.x), and the county’s median household income (estimate.y).
#Customizing Interactive Map with Mapview
library(tidyverse)
library(sf)
library(mapview)
library(viridisLite)
# Generate colors using the 'rocket' palette
colors <- rocket(n = 100)
# Ensure the data is in the correct CRS for mapview
merged_data_sf <- st_transform(merged_data_sf, 4326)
# Create interactive map with custom colors
Interactive_map <- mapview(merged_data_sf, zcol = "estimate.y",
layer.name = "Median Household Income",
col.regions = colors)
# Display the interactive map
Interactive_map
Figure 4: This interactive map displays the median household income by county in Wisconsin for the year 2021, using a custom color palette from the viridisLite package. Data source: American Community Survey 2021.
This project provided an insightful look into Wisconsin’s socio-economic landscape, focusing on the percentage of residents with graduate degrees and median household income across the state’s seventy-two counties. By analyzing and visualizing these datasets, several significant patterns and relationships emerged.
Part A: Graduate Degree Percentages. The analysis of graduate degree percentages revealed notable disparities in educational attainment across Wisconsin counties. Certain regions demonstrated a higher concentration of residents with graduate degrees, while others lagged behind. I was happy to see that Bayfield County has the fifth-highest percentage of residents with graduate degrees in the state. That was a surprising result, as it is very rural and has only one small college within sixty miles of the county.
Part B: Income and Education Relationship. After merging the income data with educational attainment, the analysis uncovered potential correlations between median household income and the percentage of graduate degree holders. Visualizations illustrated how these socio-economic factors intersect, highlighting areas of economic prosperity and graduate degree attainment. The interactive maps provided a dynamic tool for exploring these relationships, offering insights for policymakers, community organizations, and citizens who enjoy living in an area where education is valued.