Project 2

Author

Daniel Johnson

This dataset is from an organization called “United for ALICE,” which advocates for lower-income families using data science. The federal poverty level is a metric created by the US Census Bureau to measure how many people and households are in poverty. The people behind ALICE believe that this metric doesn’t go far enough and that there should be something more granular, so they created their own metric: ALICE. Asset Limited, Income Constrained, Employed (ALICE) takes into account how much it costs to live in every county in the US, rather than applying one figure to the whole country like the federal poverty level does. ALICE tries to show how many households have enough money to get by, but not enough for everything they might need. ALICE therefore encompasses more households in a given area than the federal metric, and it is calculated at a local level. I first became aware of the metric via the Maryland Food Bank, which also has a lot of great data. They use ALICE extensively in their metrics to show which people and areas are in the most need. This was all relevant to me because of the recent government shutdown and millions of Americans losing their SNAP benefits.

They have data for every state, as well as the DC metro area, so I decided to use DC since it’s locally relevant and it encompasses multiple states, which I thought would be interesting. The dataset covers several kinds of geographic subdivision and gives, for each one, the total number of households, the number of households below the federal poverty level, and the number at the ALICE level. I intend to use the three household variables to calculate two percentage variables, one for federal poverty and one for ALICE, and to create a map showing the differences in ALICE levels by region around the DC area.

The dataset has three main geographic categories: cities, zip codes, and county sub-districts. I decided to use county sub-districts, a geography created by the Census Bureau that breaks counties into a few districts so you can be more granular than the county level. I chose this over zip codes so that there wouldn’t be so many shapes on the map that you couldn’t tell where you were or see the general outline of the counties. Sub-districts are a nice intermediate.

First, we load our libraries. Plotly is for interactivity on the exploratory plot. Leaflet is our mapping package, and sf will help leaflet understand our data. Tidycensus allows me to access census data so that I can attach actual mapping data to my chosen dataset.

library(tidyverse)
library(leaflet)
library(plotly)
library(tidycensus)
library(sf)

data <- read_csv("2025_ALICE_DC_METRO_2.csv")

The dataset provides several ways to “slice up” each county, and I chose the sub-county districts because they are more granular than counties, but there aren’t so many data points (as there would be with zip codes) that it becomes overwhelming. Therefore, we can filter for just sub-counties.

sub_county <- data |>
  filter(Type == "Sub_County")

Mutate sub_county so that we can have a separate state column for the exploratory plot later.

sub_county <- sub_county |>
  mutate(State = case_when(
    str_detect(GEO.display_label, "Maryland") ~ "MD",
    str_detect(GEO.display_label, "West Virginia") ~ "WV",
    str_detect(GEO.display_label, "Virginia") ~ "VA",
    str_detect(GEO.display_label, "District of Columbia") ~ "DC"
  )) |>
  relocate(State, .before = Households)
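The order of the conditions above matters: `case_when()` returns the first condition that matches, and every label containing "West Virginia" also contains "Virginia". Here is a minimal sketch of that behavior; the label strings are made up for illustration and are not taken from the dataset.

```r
library(dplyr)
library(stringr)

# Hypothetical labels, shaped roughly like GEO.display_label
labels <- tibble(label = c(
  "District 1, Jefferson County, West Virginia",
  "Dulles district, Loudoun County, Virginia"
))

# "West Virginia" is tested first, so the first row is tagged "WV"
correct <- labels |>
  mutate(State = case_when(
    str_detect(label, "West Virginia") ~ "WV",
    str_detect(label, "Virginia") ~ "VA"
  ))

# With the order reversed, "Virginia" matches both rows first
reversed <- labels |>
  mutate(State = case_when(
    str_detect(label, "Virginia") ~ "VA",
    str_detect(label, "West Virginia") ~ "WV"
  ))
```

In the reversed version, the West Virginia row would be mislabeled as "VA", which is why the more specific pattern goes first.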

Add column for percentage of households that are below the poverty line.

sub_county <- sub_county |>
  mutate(percent_poverty = (`Poverty Households` / Households) * 100) |>
  relocate(percent_poverty, .before = `ALICE Households`)

Add column for percentage of households below the ALICE level.

sub_county <- sub_county |>
  # Households below the ALICE level = poverty households + ALICE households
  mutate(percent_alice = ((`Poverty Households` + `ALICE Households`) / Households) * 100) |>
  relocate(percent_alice, .before = `Above ALICE Households`)
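To make the arithmetic concrete, here is a small sketch with made-up household counts for two hypothetical districts. It assumes the `ALICE Households` column counts only households between the poverty line and the ALICE threshold, so the share below the ALICE level is the sum of the two groups.

```r
library(dplyr)

# Made-up counts for two hypothetical districts
toy <- tibble(
  Households = c(1000, 2000),
  `Poverty Households` = c(100, 400),
  `ALICE Households` = c(200, 600)
)

toy <- toy |>
  mutate(
    # Share of households below the federal poverty level
    percent_poverty = `Poverty Households` / Households * 100,
    # Share below the ALICE level: poverty households plus ALICE households
    percent_alice = (`Poverty Households` + `ALICE Households`) / Households * 100
  )
```

For the first toy district, 100 of 1,000 households are in poverty (10%) and 100 + 200 = 300 are below the ALICE level (30%).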

Filter out Washington DC, because it isn’t broken up into sub county districts like everything else, so it makes our plot hard to read since DC has so many more households. I will include DC in the map later.

sub_county2 <- sub_county |>
  filter(State != "DC")

Scatterplot with percent of households below the federal poverty level on the X, and percent below the ALICE level on the Y. Points are colored by state, and sized to show the number of households.

p1 <- ggplot(sub_county2, aes(x = percent_poverty, y = percent_alice)) +
  geom_point(size = (sub_county2$Households / 5000), alpha = 0.8, aes(color = State)) +
  geom_smooth(method = 'lm', formula = y ~ x, color = "black", se = FALSE) +
  labs(title = "Percentage of Households under Federal Poverty Level vs Under ALICE Level",
       x = "Percentage of Households Below Federal Poverty Level",
       y = "Percentage of Households Below ALICE Level",
       color = "State",
       caption = "Size of points is number of households in county district \n DC removed because it isn't broken down by district \n Source: United for ALICE, US Census bureau") +
  #scale_color_brewer(palette = "Dark2") +
  scale_color_manual(values = c("MD" = "dodgerblue2", "VA" = "darkolivegreen", "WV" = "red")) +
  theme_bw()
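One possible alternative to dividing `Households` by 5000 by hand is to map it to the size aesthetic and let ggplot2 scale the points and draw a size legend. This is only a sketch on made-up data, and the `max_size` value is an arbitrary choice rather than something from the original plot.

```r
library(ggplot2)

# Made-up values standing in for sub_county2
toy <- data.frame(
  percent_poverty = c(5, 10, 15),
  percent_alice = c(20, 35, 45),
  Households = c(2000, 15000, 40000),
  State = c("MD", "VA", "WV")
)

p_alt <- ggplot(toy, aes(x = percent_poverty, y = percent_alice)) +
  # Mapping size inside aes() lets ggplot2 build a legend for it
  geom_point(aes(color = State, size = Households), alpha = 0.8) +
  # scale_size_area() makes point area proportional to the value
  scale_size_area(max_size = 10, name = "Households") +
  theme_bw()
```

With this approach, the note about point size could move from the caption into the legend.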

Feed the plot into plotly for interactivity

ggplotly(p1)

I chose this graph because I wanted to see whether there was any relationship between the percentage of households below the federal poverty level and the percentage below the ALICE level. I also wanted to see if there were any major outliers. The first thing I noticed is that there is a bit of a trend, but probably not enough to draw any conclusions or inform what I wanted to do later. In terms of outliers, the main thing to notice is how much the number of households differs across the metro area. I had to leave off DC because it isn’t subdivided, which made its point so much bigger than the others that the plot was hard to read. I would say the main outliers are the four bigger Maryland points in the top right of the graph, but even those aren’t that far from everything else, and there are other smaller points around them.
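One way to put a number on "a bit of a trend" would be to compute the correlation and the R-squared of the fitted line. The sketch below uses made-up percentages in place of `sub_county2`, so the values it produces say nothing about the real data.

```r
# Made-up percentages standing in for sub_county2
toy <- data.frame(
  percent_poverty = c(4, 6, 9, 12, 15),
  percent_alice = c(22, 30, 28, 41, 44)
)

# Pearson correlation between the two percentages
r <- cor(toy$percent_poverty, toy$percent_alice, use = "complete.obs")

# Same linear model that geom_smooth(method = "lm") draws
fit <- lm(percent_alice ~ percent_poverty, data = toy)
r_squared <- summary(fit)$r.squared
```

For a straight-line fit with one predictor, `r_squared` equals `r^2`, so either number describes how tightly the points follow the line.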

Here I created a variable called “variable_list” to tell the function get_acs() what data I wanted from the census. get_acs() is from tidycensus and downloads data from the Census Bureau’s American Community Survey, a yearly household survey. I had to load the ACS data separately for each state; I tried doing it all at once and it didn’t work. This gives us every sub-county district in each state, a couple of population variables that we don’t need and, most crucially, the geometric data for each sub-county district.

variable_list = c(Population_ = "B01001_001")

md_data <-get_acs(
  geography = "county subdivision",
  state = "MD",
  variables = variable_list,
  year = 2023,
  survey = "acs5",
  output = "wide",
  geometry = TRUE
)

va_data <-get_acs(
  geography = "county subdivision",
  state = "VA",
  variables = variable_list,
  year = 2023,
  survey = "acs5",
  output = "wide",
  geometry = TRUE
)

wv_data <-get_acs(
  geography = "county subdivision",
  state = "WV",
  variables = variable_list,
  year = 2023,
  survey = "acs5",
  output = "wide",
  geometry = TRUE
)

dc_data <-get_acs(
  geography = "county subdivision",
  state = "DC",
  variables = variable_list,
  year = 2023,
  survey = "acs5",
  output = "wide",
  geometry = TRUE
)

Since I had to download each state individually, here I combined them all into 1 big dataset.

acs_data <- bind_rows(md_data, va_data, wv_data, dc_data)
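Since the four downloads differ only in the state, the same steps could be wrapped in a small function and run once per state. This is a sketch, not the code used for this project: `fetch_state()` and `fetch_all()` are hypothetical helper names, and the real download needs a Census API key and a network connection.

```r
library(dplyr)

# Hypothetical helper: one ACS request for one state,
# with the same arguments as the calls above
fetch_state <- function(st) {
  tidycensus::get_acs(
    geography = "county subdivision",
    state = st,
    variables = c(Population_ = "B01001_001"),
    year = 2023,
    survey = "acs5",
    output = "wide",
    geometry = TRUE
  )
}

# Hypothetical helper: run the request for each state and stack the results
fetch_all <- function(states, fetch = fetch_state) {
  bind_rows(lapply(states, fetch))
}

# Not run here: needs a Census API key and a network connection
# acs_data <- fetch_all(c("MD", "VA", "WV", "DC"))
```

The `fetch` argument is only there so the stacking step can be tried out with a stand-in function before making any real requests.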

Here I joined the ACS data with our ALICE dataset so that we can have the geometric data in the same dataset as the poverty and ALICE levels.

merged_data <- acs_data |>
  left_join(sub_county, by = join_by(GEOID == GEO.id2))
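Because this is a left join from the ACS shapes, any ALICE row whose ID has no matching shape would quietly drop off the map, and any shape without ALICE data would keep NA values. A quick way to check both, sketched here with made-up IDs standing in for `acs_data` and `sub_county`:

```r
library(dplyr)

# Hypothetical stand-ins for acs_data and sub_county (IDs are made up)
acs_toy <- tibble(GEOID = c("0000000001", "0000000002", "0000000003"))
alice_toy <- tibble(
  GEO.id2 = c("0000000001", "0000000002", "0000000009"),
  percent_alice = c(30, 45, 50)
)

# ALICE rows with no matching shape: these would be missing from the map
unmatched <- alice_toy |>
  anti_join(acs_toy, by = join_by(GEO.id2 == GEOID))

# Shapes with no ALICE data stay in the result with NA values
merged_toy <- acs_toy |>
  left_join(alice_toy, by = join_by(GEOID == GEO.id2))
```

In the toy data, one ALICE row is unmatched and one shape ends up with `NA` for `percent_alice`. Both ID columns are stored as text here so that they have the same type for the join.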

Here I created the color palette for my map using the colorNumeric() function from leaflet and a ColorBrewer palette.

pal <- colorNumeric(
  palette = "YlOrRd",   # Yellow → Red
  domain = merged_data$percent_alice
)
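Districts outside the metro area have no ALICE percentage, so the palette has to return some color for `NA`. The sketch below sets that color explicitly through the `na.color` argument of `colorNumeric()`. The light gray and the fixed 0 to 100 domain are arbitrary choices for illustration, not what the map above uses.

```r
library(leaflet)

# Same palette, with an explicit color for districts that have no data
pal_na <- colorNumeric(
  palette = "YlOrRd",
  domain = c(0, 100),      # assumed fixed domain for this toy example
  na.color = "#D9D9D9"     # arbitrary light gray for missing values
)

# Two made-up percentages and one missing value
cols <- pal_na(c(10, 90, NA))
```

A lighter `na.color` can help the districts with data stand out from the ones without.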

Here I created the text for my map’s tooltip. I have the name of the district, the percent below the ALICE level, the percent below the poverty line and the total households.

tool_tip <- paste0(
  "<b>", merged_data$GEO.display_label, "</b><br>",
  "<b>% ALICE: </b>", merged_data$percent_alice, "<br>",
  "<b>% Poverty: </b>", merged_data$percent_poverty, "<br>",
  "<b>Households: </b>", merged_data$Households, "<br>"
)
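The percentages are pasted in unrounded, so the popup can show many decimal places. One option is to round them and add a thousands separator to the household count. This is a sketch with made-up values for a single hypothetical district:

```r
# Made-up values for one hypothetical district
percent_alice <- 41.23456
percent_poverty <- 9.87654
households <- 12345

tip <- paste0(
  "<b>% ALICE: </b>", round(percent_alice, 1), "<br>",
  "<b>% Poverty: </b>", round(percent_poverty, 1), "<br>",
  "<b>Households: </b>", format(households, big.mark = ","), "<br>"
)
```

The same `round()` and `format()` calls could be applied to the `merged_data` columns inside the `paste0()` call above.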

Here is my map, which shows all the states in the DC metro area (DC, MD, VA, WV) broken down by census sub-district. Only districts in counties in the DC metro area have data. The map uses a yellow to orange to red color palette showing the percentage of households that are below the ALICE level. The tooltip, as seen above, also shows the name, total number of households and percentage of households below the federal poverty level.

leaflet(merged_data) |>
  addProviderTiles("Esri.WorldStreetMap") |>
  addPolygons(
    fillColor = ~pal(percent_alice),
    fillOpacity = 0.8,
    color = "black",
    weight = 1,
    popup = tool_tip
  ) |>
  addLegend(
    pal = pal,
    values = merged_data$percent_alice,
    title = "% ALICE"
  )
Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
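The warning says the shapes are in NAD83 while leaflet expects WGS84. A likely fix is to reproject with `st_transform()` from sf before passing the data to `leaflet()`, for example `merged_data <- st_transform(merged_data, 4326)`. The sketch below shows the same step on a single made-up point so it can run without the census download.

```r
library(sf)

# A single made-up point in NAD83 (EPSG:4269), the datum named in the warning
pt_nad83 <- st_sfc(st_point(c(-77.03, 38.90)), crs = 4269)

# Reproject to WGS84 (EPSG:4326), which leaflet expects
pt_wgs84 <- st_transform(pt_nad83, 4326)
```

For a map at this scale the two datums are nearly identical, so the polygons should look the same either way; the transform mainly silences the warning.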

I really like my map and am very proud of it, but I don’t think it reveals anything I didn’t already know. Generally, as you get more rural, areas tend to get poorer, even if the cost of living tends to decline. The map also shows some quite urban areas that are on the poorer side, particularly in Prince George’s County along the DC border. It also shows some of the richest places in our area, which aren’t just among the richest in the DC metro area but in the whole country. These are Bethesda and Potomac in Montgomery County, as well as parts of Northern Virginia, particularly in Fairfax and Loudoun counties.

Source(s): https://rpubs.com/drkblake/ALICEmap, used as a tutorial for mapping with something other than latitude and longitude, as in the 500 cities assignment and Japan earthquake tutorial.