Image by David Kovaluk for St. Louis Public Radio. Source: https://www.stlpr.org/government-politics-issues/2021-09-07/st-louis-county-council-reinstates-eviction-moratorium
U.S. Eviction Rates from the Princeton Eviction Lab, Pre-Visualization Work
About the Data
The Eviction Lab at Princeton University works to make eviction data publicly available and accessible nationwide. Their dataset tracks how eviction filing counts have trended compared to pre-pandemic averages for 34 cities and 10 states across the U.S. There is no government requirement to track this data, so the Eviction Lab only covers cities or states that keep these records and choose to share them publicly. The data can be found on their site at the following link: https://evictionlab.org/eviction-tracking/
The variables in the dataset are as follows:
“site” and “site_id” track the name of a city/state and its given ID code. State codes are two digits, and city codes are 5 digits in length.
“month” is the month in which that row’s filings were recorded.
“month_filings” is the number of filings for that site in the given month.
“pct_of_historical” is the “month_filings” figure expressed as a percentage of the “avg_filings” figure.
“avg_filings” is the average filings for a given site pre-pandemic.
“eviction_filing_rate” is the overall proportion of evictions compared to the number of rental properties in the site.
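As a rough illustration of how these columns relate to one another, the sketch below recomputes a "pct_of_historical"-style value from made-up numbers. The site names and counts are invented for the example, and whether the Eviction Lab stores this figure as a ratio (0.5) or as a percentage (50) is an assumption that the variable descriptions above do not settle.

```r
# Toy values for illustration only; these are NOT rows from the Eviction Lab data
toy <- data.frame(
  site          = c("State A", "State B"),
  month_filings = c(50, 300),   # filings in the given month
  avg_filings   = c(100, 200)   # pre-pandemic average for the same site
)

# Assumption: the comparison is a simple ratio of actual to average filings
toy$pct_of_historical <- toy$month_filings / toy$avg_filings

toy
```

Under this assumption, a value below 1 means fewer filings than the pre-pandemic average and a value above 1 means more.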
Loading Necessary Libraries
# devtools::install_github("hrbrmstr/streamgraph") # install "streamgraph" as a package
library(streamgraph)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(RColorBrewer)
Import Data
setwd("~/24X Course Work/DATA110") # sets where our dataset is stored and will be pulled from
evictions <- read_csv("main_landing_page_data.csv") # import dataset to our workspace
Rows: 2279 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): site_id, site
dbl (4): month_filings, pct_of_historical, avg_filings, eviction_filing_rate
date (1): month
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# running the Saidi cleaning just to be safe!
names(evictions) <- tolower(names(evictions)) # lowercase all column names
names(evictions) <- gsub(" ", "_", names(evictions)) # remove any spaces, replace with underscores
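To see what this cleaning step does, here is a minimal sketch on a toy data frame. The column names "Site ID" and "Month Filings" are hypothetical; the real file already uses lowercase, underscore-separated names, so on the actual dataset this step leaves the names unchanged.

```r
# Hypothetical messy column names, for illustration only
toy <- data.frame(
  "Site ID"       = c("01", "02"),
  "Month Filings" = c(10, 20),
  check.names = FALSE  # keep the spaces so there is something to clean
)

names(toy) <- tolower(names(toy))         # lowercase all column names
names(toy) <- gsub(" ", "_", names(toy))  # replace spaces with underscores

names(toy)
```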
Creating Tiers for Eviction Filing Rate
# efr_tier splits Eviction Filing Rate into three respective groups: low, medium, and high
evictionStates <- evictions |>
  mutate(efr_tier = case_when(
    eviction_filing_rate <= 0.111 ~ "low",
    eviction_filing_rate > 0.111 & eviction_filing_rate < 0.222 ~ "med",
    eviction_filing_rate >= 0.222 ~ "high"
  ))
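A quick way to check how the cutoffs behave at their boundaries is to run the same conditions on a handful of hand-picked rates. This is only a sketch: the rates below are invented to land on and around 0.111 and 0.222, and a plain data frame stands in for the real dataset.

```r
library(dplyr)

# Invented rates chosen to sit on and around the two cutoffs
toy <- data.frame(eviction_filing_rate = c(0.05, 0.111, 0.15, 0.222, 0.30))

toy_tiers <- toy |>
  mutate(efr_tier = case_when(
    eviction_filing_rate <= 0.111 ~ "low",
    eviction_filing_rate > 0.111 & eviction_filing_rate < 0.222 ~ "med",
    eviction_filing_rate >= 0.222 ~ "high"
  ))

# Tally how many toy rows fall into each tier
count(toy_tiers, efr_tier)
```

A rate of exactly 0.111 is tiered "low" and a rate of exactly 0.222 is tiered "high", so no value falls between the groups. A missing rate matches none of the conditions and stays NA.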
Removing Unneeded Data from Working Dataset
# set site_id to numeric to properly remove unneeded data as follows
evictionStates$site_id <- as.numeric(as.character(evictionStates$site_id))
evictionStates <- filter(evictionStates, site_id < 100) # ensure only states are kept
evictionStates <- filter(evictionStates, site_id != 0) # remove all_sites
head(evictionStates)
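One side effect of converting "site_id" to numeric is that any leading zeros in the codes are dropped. As an alternative sketch (not the approach used above), the same state-only subset can be taken by keeping the codes as text and filtering on their length. The IDs and site names below are hypothetical, and treating the aggregate row as having the code "0" is an assumption inferred from the filter above.

```r
# Hypothetical site codes: one aggregate row, two states, one city
toy <- data.frame(
  site_id = c("0", "09", "29", "48201"),
  site    = c("All sites", "State A", "State B", "City C"),
  stringsAsFactors = FALSE
)

# Keep only rows whose code is exactly two characters long (states)
states_only <- toy[nchar(toy$site_id) == 2, ]

states_only$site_id
```

This keeps "09" intact as a two-character code rather than turning it into the number 9.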
evic_bp <- boxplot(evictionStates$pct_of_historical ~ evictionStates$site,
                   main = "",
                   xlab = "States",
                   ylab = "Deviations from Averaged Filing Rate for Each Month")
Scatter Plot
evictionStates |>
  ggplot(aes(x = avg_filings, y = eviction_filing_rate, group = site)) + # setting axes
  geom_point(aes(color = site)) +
  scale_color_brewer(palette = "Paired") +
  labs(title = "Eviction Filing Rates v. Average Filing Rates by State",
       x = "Average Monthly Filings (Pre-Pandemic)",
       y = "Actual Filing Rate by Month (By Percentage of All Rentals)")
Streamgraph
streamgraph(evictionStates, key = "site", value = "month_filings", date = "month") |>
  sg_axis_x(1, "year", "%Y") |>
  sg_fill_brewer("Paired") |>
  sg_legend(TRUE, "State: ") |>
  sg_title(title = "Eviction Rates around the United States")
Warning in widget_html(name, package, id = x$id, style = css(width =
validateCssUnit(sizeInfo$width), : streamgraph_html returned an object of class
`list` instead of a `shiny.tag`.
Warning: `bindFillRole()` only works on htmltools::tag() objects (e.g., div(),
p(), etc.), not objects of type 'list'.
[Interactive streamgraph: Eviction Rates around the United States]
Essay Portion
In preparing my data for visualization, I took several precautions to keep the work as seamless as possible. I began by loading the libraries needed to create my plots (streamgraph, ggplot2), supply color palettes (RColorBrewer), and run general-purpose code (tidyverse). After some housekeeping to make sure the column names were uniformly lowercase and any spaces were replaced with underscores, I created a new column. Entitled “efr_tier”, short for Eviction Filing Rate Tier, this new categorical variable groups similar filing rates into low, med, or high. Next, I took advantage of the included “site_id” variable to easily remove all non-state sites, giving me a smaller, more concise dataset to work with and draw conclusions from when creating visualizations.
My primary visualization is the streamgraph, which examines the monthly filings of each state over time. In it, we can clearly see how sharply the COVID-19 pandemic affected this data, as the stream pinches abruptly around when lockdown went into place. There’s also a noticeable bump later that same year, perhaps reflecting the delayed effect of the lockdown on those who became unemployed or were otherwise no longer able to pay rent. Although there are no labels for the axes or a title, as I was unable to find such a feature for streamgraphs, the interactive State key allows users to highlight a state of their choice by name without needing to hover. Similarly, hovering lets the reader see the exact filing count for the time and state under the cursor, allowing a deeper experience with the data than a static version might allow.
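The pinch described above can also be checked numerically by summing filings across sites for each month and finding the lowest total. The sketch below does this on invented counts using base R's aggregate(); the site names, dates, and numbers are placeholders and not figures from the Eviction Lab data.

```r
# Invented monthly counts for two placeholder states
toy <- data.frame(
  site          = rep(c("State A", "State B"), each = 3),
  month         = rep(as.Date(c("2020-03-01", "2020-04-01", "2020-05-01")), times = 2),
  month_filings = c(120, 10, 40, 200, 15, 60)
)

# Total filings across all sites for each month
monthly_totals <- aggregate(month_filings ~ month, data = toy, FUN = sum)

# The month with the fewest total filings, i.e. the narrowest point of the stream
monthly_totals[which.min(monthly_totals$month_filings), ]
```

Running the same two steps on "evictionStates" in place of the toy data frame would show which month in the real data has the lowest combined filings.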
Something I wish I could have done was include the data I initially excluded without overwhelming my visualizations. For example, the dataset includes multiple cities in Texas, but Texas itself is not represented since the state does not collect this data across all cities. Including this data would give a broader look at eviction rates across the US, and a deeper investigation could compare how this data is gathered from city to city or state to state, maybe eventually leading to some standardization of the practice. I imagine this is the goal of the Princeton University Eviction Lab, as an encompassing look at data will