Introduction

In this lab, I explore Louisiana’s socio-economic landscape by looking at two important factors: education levels and income. By analyzing data at the parish level, I aim to understand how these factors vary across the state.

Using data from the American Community Survey (ACS) and visualization tools in R, I examine parish-level trends (parishes are Louisiana's county equivalents) in graduate degree attainment and median household income.


Part A: Analyzing Graduate Degree Percentages in Louisiana Parishes

Project Setup & Data Acquisition

We use the tidycensus package to retrieve ACS data for Louisiana.

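# Load the packages used throughout the lab
library(tidycensus)  # get_acs(); typically needs a Census API key set via census_api_key()
library(tidyverse)   # dplyr, tidyr, ggplot2
library(plotly)      # ggplotly()
library(tigris)      # counties()
library(sf)          # spatial data frames
library(mapview)     # mapview()
library(leaflet)     # leaflet maps

# Pull education data from ACS, focusing on graduate degrees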
grad_data <- get_acs(
  geography = "county",
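  # In table B15003, _022 is bachelor's degree; master's, professional, and
  # doctorate are _023 to _025. Printed output should be regenerated with these codes.
  variables = c(
    masters = "B15003_023",
    professional = "B15003_024",
    doctorate = "B15003_025"
  ),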
  state = "LA",
  survey = "acs5",
  year = 2021,
  quiet = TRUE
)

# Aggregate total graduate degrees and compute MOE correctly
grad_data <- grad_data %>%
  group_by(GEOID, NAME) %>%
  summarise(
    grad_degree_count = sum(estimate, na.rm = TRUE),
    moe = sqrt(sum(moe^2, na.rm = TRUE))  # Correct MOE calculation
  )

# Get total population aged 25 and over, the universe of table B15003 (for percentage calculation)
total_pop <- get_acs(
  geography = "county",
  variables = "B15003_001",
  state = "LA",
  survey = "acs5",
  year = 2021,
  quiet = TRUE
)

# Merge and compute graduate degree percentage
grad_data <- grad_data %>%
  left_join(total_pop %>% select(GEOID, total_pop = estimate), by = "GEOID") %>%
  mutate(
    grad_degree_pct = (grad_degree_count / total_pop) * 100,
    moe_pct = (moe / total_pop) * 100  # Approximate MOE in percentage points (ignores the MOE of total_pop)
  )

# Inspect first few rows
head(grad_data)
## # A tibble: 6 × 7
## # Groups:   GEOID [6]
##   GEOID NAME           grad_degree_count   moe total_pop grad_degree_pct moe_pct
##   <chr> <chr>                      <dbl> <dbl>     <dbl>           <dbl>   <dbl>
## 1 22001 Acadia Parish…              4840  539.     38314            12.6    1.41
## 2 22003 Allen Parish,…              1969  338.     16245            12.1    2.08
## 3 22005 Ascension Par…             22791 1543.     81640            27.9    1.89
## 4 22007 Assumption Pa…              1567  329.     14934            10.5    2.20
## 5 22009 Avoyelles Par…              3200  530.     27356            11.7    1.94
## 6 22011 Beauregard Pa…              4370  527.     24337            18.0    2.17
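The root-sum-of-squares step above matches the Census Bureau's approximation for the MOE of a sum. The percentage MOE, however, only rescales the numerator's MOE and ignores uncertainty in the denominator. Below is a minimal base-R sketch of the Bureau's approximation for a derived proportion. The component MOEs and the denominator MOE are hypothetical placeholders, since the analysis above does not keep them, and the helper names `moe_of_sum` and `moe_of_prop` are my own. tidycensus also provides `moe_sum()` and `moe_prop()`, which could be used instead.

```r
# MOE of a sum of estimates: square root of the summed squared MOEs
moe_of_sum <- function(moes) sqrt(sum(moes^2, na.rm = TRUE))

# MOE of a proportion p = num / denom (Census Bureau approximation).
# If the term under the root is negative, fall back to the ratio
# formula, which adds the second term instead of subtracting it.
moe_of_prop <- function(num, denom, moe_num, moe_denom) {
  p <- num / denom
  under_root <- moe_num^2 - p^2 * moe_denom^2
  if (under_root < 0) under_root <- moe_num^2 + p^2 * moe_denom^2
  sqrt(under_root) / denom
}

# Hypothetical component MOEs for one parish (not taken from the ACS)
component_moes <- c(400, 300, 200)
moe_count <- moe_of_sum(component_moes)

# Count and population are the Acadia Parish values printed above;
# the denominator MOE of 150 is a made-up placeholder
moe_pct <- moe_of_prop(4840, 38314, moe_count, 150) * 100

round(c(moe_count = moe_count, moe_pct = moe_pct), 2)
```

With a denominator MOE that is small relative to the population, the two approaches give nearly the same answer, so the simpler version above is a reasonable shortcut for large parishes.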

Top 5 and Bottom 5 Parishes by Graduate Degree Percentage

highest <- grad_data %>%
  arrange(desc(grad_degree_pct)) %>%
  drop_na(grad_degree_pct) %>%
  head(5)

lowest <- grad_data %>%
  arrange(grad_degree_pct) %>%
  drop_na(grad_degree_pct) %>%
  head(5)

filtered_grad_data <- bind_rows(highest, lowest)

# Display results
filtered_grad_data
## # A tibble: 10 × 7
## # Groups:   GEOID [10]
##    GEOID NAME          grad_degree_count   moe total_pop grad_degree_pct moe_pct
##    <chr> <chr>                     <dbl> <dbl>     <dbl>           <dbl>   <dbl>
##  1 22071 Orleans Pari…            101316 2324.    274825            36.9   0.846
##  2 22061 Lincoln Pari…              9562  780.     26493            36.1   2.95 
##  3 22033 East Baton R…            102287 2878.    289379            35.3   0.995
##  4 22103 St. Tammany …             63254 2104.    180157            35.1   1.17 
##  5 22055 Lafayette Pa…             53111 2005.    162259            32.7   1.24 
##  6 22035 East Carroll…               501  159.      4960            10.1   3.20 
##  7 22007 Assumption P…              1567  329.     14934            10.5   2.20 
##  8 22117 Washington P…              3307  470.     31218            10.6   1.51 
##  9 22091 St. Helena P…               819  251.      7523            10.9   3.34 
## 10 22123 West Carroll…               777  154.      6913            11.2   2.22
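The same top/bottom selection can also be written with dplyr's `slice_max()` and `slice_min()`. This is only a sketch on four of the parishes printed above, with shortened names and a smaller `n`. One assumption to flag: because `grad_data` is still grouped by GEOID after `summarise()`, these functions would operate within each group, so an `ungroup()` is needed first on the real data.

```r
library(dplyr)

# Four of the parishes printed above, with their percentages
parishes <- tibble(
  NAME = c("Orleans", "Lincoln", "East Carroll", "Assumption"),
  grad_degree_pct = c(36.9, 36.1, 10.1, 10.5)
)

# ungroup() is a no-op here, but required when the input is grouped
ranked <- parishes %>% ungroup()

top2 <- ranked %>% slice_max(grad_degree_pct, n = 2)     # highest first
bottom2 <- ranked %>% slice_min(grad_degree_pct, n = 2)  # lowest first

bind_rows(top2, bottom2)
```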

Margin of Error Plot

The following plot visualizes the percentage of graduate degree holders across Louisiana parishes, incorporating margins of error.

grad_plot <- ggplot(filtered_grad_data, aes(x = grad_degree_pct, y = reorder(NAME, grad_degree_pct))) +
  geom_errorbar(aes(xmin = grad_degree_pct - moe_pct, xmax = grad_degree_pct + moe_pct), width = 0.4, color = "black") +
  geom_point(color = "darkred", size = 2) +
  scale_x_continuous(labels = scales::percent_format(scale = 1)) +  # Format as percentage
  labs(
    title = "Graduate Degree Holders in Louisiana (ACS 2021)",
    subtitle = "Top 5 and Bottom 5 Parishes by Graduate Degree Percentage",
    x = "Share of Residents Aged 25 and Over",
    y = "Parish",
    caption = "Source: U.S. Census Bureau (ACS 5-Year 2021)"
  ) +
  theme_minimal()

# Display plot
print(grad_plot)

Figure 1: This chart highlights the parishes in Louisiana with the highest and lowest rates of graduate degree holders. The error bars show ACS margins of error, which are larger in rural areas due to smaller sample sizes.
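Overlapping error bars suggest that some neighboring ranks may not be distinguishable. A rough way to check is the standard test for the difference between two ACS estimates, sketched below in base R. It assumes the published MOEs are at the 90% level (so the standard error is MOE / 1.645) and reuses the approximate percentage MOEs from the table above. The helper name `acs_z` is my own.

```r
# Z statistic for the difference between two ACS estimates.
# ACS MOEs are published at the 90% level, so SE = MOE / 1.645.
acs_z <- function(est1, moe1, est2, moe2) {
  se1 <- moe1 / 1.645
  se2 <- moe2 / 1.645
  (est1 - est2) / sqrt(se1^2 + se2^2)
}

# Orleans vs. Lincoln: the two highest-ranked parishes in the table
z_top <- acs_z(36.9, 0.846, 36.1, 2.95)

# Orleans vs. East Carroll: the highest vs. the lowest in the table
z_ends <- acs_z(36.9, 0.846, 10.1, 3.20)

# |Z| > 1.645 indicates a difference at the 90% confidence level
abs(c(top = z_top, ends = z_ends)) > 1.645
```

With these inputs the Orleans and Lincoln estimates do not differ at the 90% level, while the gap between the top and bottom of the table does, so the exact order among the top parishes should be read loosely.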

Interactive Version

grad_plot_interactive <- ggplot(filtered_grad_data, aes(y = reorder(NAME, grad_degree_pct), x = grad_degree_pct)) +
  geom_col(fill = "darkred") +
  labs(
    title = "Graduate Degree Holders in Louisiana (ACS 2021)",
    subtitle = "Top 5 and Bottom 5 Parishes by Graduate Degree Percentage",
    x = "ACS Estimate (%)",
    y = "Parish"
  ) +
  theme_minimal()

# Convert to interactive
ggplotly(grad_plot_interactive)

Figure 2: Interactive bar chart displaying graduate degree percentages for selected Louisiana parishes.


Part B: Analyzing Median Household Income in Louisiana Parishes

Project Setup & Data Acquisition

We now retrieve and analyze median household income data.

income_data <- get_acs(
  geography = "county",
  variables = "B19013_001",
  state = "LA",
  year = 2021,
  quiet = TRUE
) %>%
  rename(median_income = estimate)  # Rename for clarity

# Inspect first few rows
head(income_data)
## # A tibble: 6 × 5
##   GEOID NAME                         variable   median_income   moe
##   <chr> <chr>                        <chr>              <dbl> <dbl>
## 1 22001 Acadia Parish, Louisiana     B19013_001         42368  3789
## 2 22003 Allen Parish, Louisiana      B19013_001         47660  4627
## 3 22005 Ascension Parish, Louisiana  B19013_001         86256  3532
## 4 22007 Assumption Parish, Louisiana B19013_001         42831  3982
## 5 22009 Avoyelles Parish, Louisiana  B19013_001         37903  4966
## 6 22011 Beauregard Parish, Louisiana B19013_001         57130  4205

Static Map of Median Household Income

Rather than requesting geometry through get_acs(), we load parish boundaries separately with the tigris package and join them to the income data by GEOID. Keeping the download as its own step makes it easy to rerun when Census Bureau servers experience occasional downtime.

# Load Louisiana parish (county-equivalent) boundaries for the same vintage as the ACS data
la_counties <- counties(state = "LA", cb = TRUE, year = 2021)

# GEOID needs to be a character for a successful merge
income_data <- income_data %>% mutate(GEOID = as.character(GEOID))
la_counties <- la_counties %>% mutate(GEOID = as.character(GEOID))

# Merge income data with spatial data. Only GEOID (plus the sticky sf geometry)
# is kept from the boundaries: they also carry a NAME column, which would
# otherwise collide with the ACS NAME and be split into NAME.x / NAME.y
income_data_sf <- la_counties %>%
  select(GEOID) %>%
  left_join(income_data, by = "GEOID")

# Ensure NAME column exists
if (!"NAME" %in% colnames(income_data_sf)) {
  income_data_sf <- income_data_sf %>% mutate(NAME = "Unknown Parish")
}

# Missing income values stay NA (not 0) so the map shades them as "no data"

# Create static map
income_map <- ggplot(income_data_sf) +
  geom_sf(aes(fill = median_income), color = "white", linewidth = 0.2) +  # linewidth replaces size for lines in ggplot2 >= 3.4
  scale_fill_viridis_c(name = "Median Income ($)", option = "turbo") +
  labs(
    title = "Median Household Income by Parish in Louisiana (ACS 2021)",
    subtitle = "Data from the U.S. Census Bureau (ACS 5-Year 2021)",
    caption = "Source: tidycensus"
  ) +
  theme_minimal()

# Display static plot
print(income_map)

Figure 3: Median household income by parish in Louisiana based on ACS 2021 data.
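Before trusting the map, it helps to confirm that every boundary found a matching ACS row in the merge above. The base-R sketch below shows the idea on a few GEOIDs. The first three are parishes printed earlier, while "22999" is a made-up code added purely to illustrate a mismatch.

```r
# GEOIDs from the ACS pull (first three parishes printed above)
acs_ids <- c("22001", "22003", "22005")

# GEOIDs from the boundary file; "22999" is a made-up extra for illustration
boundary_ids <- c("22001", "22003", "22005", "22999")

# Boundaries with no matching ACS row would be drawn with an NA fill
unmatched_boundaries <- setdiff(boundary_ids, acs_ids)

# ACS rows with no matching boundary would be missing from the map
unmatched_acs <- setdiff(acs_ids, boundary_ids)

# Both keys should be character so codes compare as text, not numbers
keys_are_character <- is.character(acs_ids) && is.character(boundary_ids)

list(
  unmatched_boundaries = unmatched_boundaries,
  unmatched_acs = unmatched_acs,
  keys_are_character = keys_are_character
)
```

On the real objects, the same check would compare `la_counties$GEOID` with `income_data$GEOID`.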

Interactive Choropleth Map

# Leave missing incomes as NA; mapview draws them in its na.color
# instead of stretching the color scale down to $0
mapview(income_data_sf, zcol = "median_income", legend = TRUE)

Figure 4: Interactive map of Louisiana parishes displaying median household income levels.

Combining Graduate Degree and Income Data

We now join the two datasets to explore potential relationships.

merged_data <- grad_data %>%
  left_join(income_data %>% select(GEOID, median_income), by = "GEOID") %>%
  drop_na(median_income, grad_degree_pct)  # Ensure no missing data

# Inspect first few rows
head(merged_data)
## # A tibble: 6 × 8
## # Groups:   GEOID [6]
##   GEOID NAME           grad_degree_count   moe total_pop grad_degree_pct moe_pct
##   <chr> <chr>                      <dbl> <dbl>     <dbl>           <dbl>   <dbl>
## 1 22001 Acadia Parish…              4840  539.     38314            12.6    1.41
## 2 22003 Allen Parish,…              1969  338.     16245            12.1    2.08
## 3 22005 Ascension Par…             22791 1543.     81640            27.9    1.89
## 4 22007 Assumption Pa…              1567  329.     14934            10.5    2.20
## 5 22009 Avoyelles Par…              3200  530.     27356            11.7    1.94
## 6 22011 Beauregard Pa…              4370  527.     24337            18.0    2.17
## # ℹ 1 more variable: median_income <dbl>
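With both measures in one table, a first look at their relationship is a correlation and a simple linear fit. The sketch below uses only the six parishes shown in the two `head()` outputs above, with shortened names, so it illustrates the calculation rather than giving a statewide result. On the full data the same calls would take `merged_data` as input.

```r
# First six parishes as printed in the two head() outputs above
six <- data.frame(
  parish = c("Acadia", "Allen", "Ascension",
             "Assumption", "Avoyelles", "Beauregard"),
  grad_degree_pct = c(12.6, 12.1, 27.9, 10.5, 11.7, 18.0),
  median_income = c(42368, 47660, 86256, 42831, 37903, 57130)
)

# Pearson correlation between the two measures
r <- cor(six$grad_degree_pct, six$median_income)

# Simple linear fit: change in median income per percentage point
fit <- lm(median_income ~ grad_degree_pct, data = six)
slope <- unname(coef(fit)["grad_degree_pct"])

round(c(r = r, slope = slope), 2)
```

Six rows are far too few to draw conclusions from, and one high-income parish (Ascension) dominates the fit, so a scatterplot of all parishes would be the natural next step.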

Interactive Map of Median Income

# Ensure NAME column exists and handle missing values
if (!"NAME" %in% colnames(income_data_sf)) {
  income_data_sf <- income_data_sf %>% mutate(NAME = "Unknown Parish")
}

# Keep missing incomes as NA (drawn in gray); only fill in missing names
income_data_sf <- income_data_sf %>%
  mutate(NAME = ifelse(is.na(NAME), "Unknown Parish", NAME))

# Build the palette once so the polygons and the legend share one scale
income_pal <- colorNumeric("YlOrRd", income_data_sf$median_income, na.color = "gray")

# leaflet expects WGS84 longitude/latitude; tigris boundaries are in NAD83
leaflet(data = sf::st_transform(income_data_sf, 4326)) %>%
  addTiles() %>%
  addPolygons(
    fillColor = ~income_pal(median_income),
    color = "black",
    weight = 1,
    opacity = 1,
    fillOpacity = 0.7,
    label = ~ifelse(
      is.na(median_income),
      paste0(NAME, ": no data"),
      paste0(NAME, ": $", format(median_income, big.mark = ",", trim = TRUE))
    )
  ) %>%
  addLegend(
    pal = income_pal,
    values = income_data_sf$median_income,
    title = "Median Income ($)",
    position = "bottomright"
  )

Figure 5: Interactive choropleth map showing the distribution of median household income across Louisiana parishes.


Conclusion

The maps and rankings are consistent with what I expected: education and income appear to be closely linked, especially in urban areas like Baton Rouge and New Orleans. Rural parishes tend to lag behind on both measures, highlighting the educational and economic divide within the state. If I had more time, I’d explore whether cost-of-living adjustments change these findings, as some lower-income areas might still offer decent purchasing power.

