Washington, D.C. is known for many things. It is also the 6th biggest metro area in the country and ranks #4 for median household income among the 25 most populated cities in the U.S. However, as of 2020, D.C. had a homelessness rate of about 90.4 people per 10,000, which is significantly higher than the rates seen in the states. For my project, I wanted to see whether areas with a higher median household income had fewer metro stations and homeless shelters.
To do this, I first went to the DC Open Data site and pulled three data sets. The first contained the median household income for 2018, the second contained the locations of homeless shelters in D.C., and the third showed where metro entrances were located in the city.
library(sf)
## Linking to GEOS 3.8.1, GDAL 3.2.1, PROJ 7.2.1
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.4 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(mapview)
library(leaflet)
library(leafpop)
hhi <- read_sf("~/Downloads/median.geojson")
homeless <- read_sf("~/Downloads/homeless.geojson")
metro <- read_sf("~/Downloads/metro.geojson")
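Before overlaying layers that come from separate files, it can help to confirm that they share a coordinate reference system (CRS). The sketch below uses a made-up point rather than the downloaded files, so the object names `pt_wgs84`, `pt_mercator`, and `pt_fixed` are illustrative only.

```r
library(sf)

# A single illustrative point in WGS84 (EPSG:4326), roughly in central D.C.
pt_wgs84 <- st_sfc(st_point(c(-77.03, 38.90)), crs = 4326)

# The same point reprojected to Web Mercator (EPSG:3857)
pt_mercator <- st_transform(pt_wgs84, 3857)

# The two layers no longer share a CRS
same_before <- st_crs(pt_wgs84) == st_crs(pt_mercator)

# Reproject one layer to match the other before any spatial comparison
pt_fixed <- st_transform(pt_mercator, st_crs(pt_wgs84))
same_after <- st_crs(pt_wgs84) == st_crs(pt_fixed)
```

With the real layers, the same comparison would be `st_crs(hhi) == st_crs(metro)`, and likewise for `homeless`.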
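After that, I trimmed each data set down to the columns that matter for the map. The code below produces the cleaned data.
# Rename column 8 (median household income) to a shorter label
colnames(hhi)[8] <- "HHI"
# Keep only the columns needed for the map; sf carries the geometry along
cleaned_hhi <- select(hhi, c(5, 8, 44))
cleaned_homeless <- select(homeless, c(3, 4, 10, 11))
cleaned_metro <- select(metro, c(3, 5, 6, 7))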
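Selecting by position works, but it breaks silently if the source file ever reorders its columns. A minimal sketch of selecting by name instead is shown here on a toy layer. The `ADDRESS`, `PROVIDER`, and `SUBTYPE` names match the popup fields used in this post, while `OBJECTID`, the coordinates, and the row values are invented for illustration.

```r
library(sf)
library(dplyr)

# Toy stand-in for the shelter layer (values are made up)
toy_homeless <- st_as_sf(
  data.frame(
    OBJECTID = 1:2,
    ADDRESS  = c("100 Example St NW", "200 Sample Ave SE"),
    PROVIDER = c("Provider A", "Provider B"),
    SUBTYPE  = c("Emergency", "Transitional"),
    lon = c(-77.03, -76.99),
    lat = c(38.90, 38.87)
  ),
  coords = c("lon", "lat"), crs = 4326
)

# Select by name; the geometry column is kept automatically for sf objects
toy_cleaned <- select(toy_homeless, ADDRESS, PROVIDER, SUBTYPE)
names(toy_cleaned)
```

Named selection also makes it easier to see at a glance which fields end up in the cleaned data.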
Then I combined the cleaned data sets on one map, coloring the homeless shelters green and the metro entrances orange to make them easier to tell apart.
# Income areas in red, semi-transparent so the base map shows through
mapview(cleaned_hhi, color = "red", alpha = .3, alpha.regions = .3, legend = FALSE,
        popup = popupTable(cleaned_hhi, zcol = c("HHI"))) +
  # Homeless shelters in green
  mapview(cleaned_homeless, color = "green", legend = FALSE, burst = FALSE, hide = TRUE,
          popup = popupTable(homeless, zcol = c("ADDRESS", "PROVIDER", "SUBTYPE"))) +
  # Metro entrances in orange
  mapview(cleaned_metro, color = "orange", legend = FALSE, burst = FALSE, hide = TRUE,
          popup = popupTable(metro, zcol = c("NAME", "ADDRESS", "EXIT_TO_ST")))
From the map above, you can see that the areas where the median household income tends to be highest are in the southeast corner of the city. There also happen to be no metro stations or homeless shelters in this area. Toward the center of the city, where there are metro stations and homeless shelters, the median household income is significantly lower. One pattern linking the metro entrances and homeless shelters is that most shelters located near a metro station tend to be near stations on the Green or Yellow lines.
Overall, it appears that in areas where there are both metro stations and homeless shelters, the median household income tends to be lower.
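The conclusion above rests on reading the map by eye. One way it could be checked numerically is to count how many shelters and metro entrances fall inside each income area. The sketch below does this on toy data: two invented square areas and a handful of invented points on an arbitrary planar grid. The names `square`, `toy_hhi`, `toy_shelters`, and `toy_metro` are hypothetical, and this is not the analysis used for the map.

```r
library(sf)

# Helper that builds a 1 x 1 square polygon with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Two toy income areas with made-up HHI values
toy_hhi <- st_sf(
  HHI = c(50000, 150000),
  geometry = st_sfc(square(0, 0), square(1, 0))
)

# Toy shelter and metro entrance locations
toy_shelters <- st_sfc(st_point(c(0.2, 0.2)), st_point(c(0.7, 0.4)))
toy_metro <- st_sfc(st_point(c(0.5, 0.5)), st_point(c(1.5, 0.5)))

# st_intersects() returns, for each area, the indices of the points inside it;
# lengths() turns that list into a count per area
toy_hhi$n_shelters <- lengths(st_intersects(toy_hhi, toy_shelters))
toy_hhi$n_metro <- lengths(st_intersects(toy_hhi, toy_metro))

st_drop_geometry(toy_hhi)
```

With the real layers, the same two `lengths(st_intersects(...))` lines would be applied to `cleaned_hhi`, `cleaned_homeless`, and `cleaned_metro`, provided all three share a CRS. The resulting counts could then be compared against `HHI`.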