What our data set looks like:

Animated Map of PA Counties

Stacked Bar Graph

Scatterplot animated

Shiny Interactive Scatterplot

App included as .R file (ScatterplotEDA.R)

Shiny Eviction Rate by County

App included as .R file (SearchbyCounty.R)

Final Paper

Introduction

For this data visualization project, we chose to investigate the eviction crisis in the United States, an issue that is both widespread and frightening for millions of people across the nation. The majority of poor renting families in America spend over half of their income on housing costs, and even the fear of eviction itself becomes a significant stressor for all involved. Statistics regarding evictions are jarring, to say the least: in 2016, over 2 million eviction filings were made across the United States, which is equivalent to a rate of about four every minute. Additionally, one in 50 renters was evicted from their home. This number is far too high, and given the lasting negative impacts eviction can have on families for generations, the American eviction crisis is a vital issue for all of our communities and our policymakers to address.

Not only does eviction make families susceptible to falling into a long-term cycle of poverty, but it also has severe lasting effects on the mental and physical health of individuals. Evictions have historically made it difficult to find new housing and have thus led to homelessness; one study found that 47 percent of all families in New York City homeless shelters were there as a result of eviction. Furthermore, families who are evicted regularly lose their possessions, lose their jobs, and experience higher rates of depression. For children, the instability caused by eviction can result in worse outcomes in education, health, and future earnings. All in all, evictions place an extremely heavy burden on individuals and families. Combined with the housing affordability crisis and consistent increases in housing prices in major metropolitan areas, eviction and all of its negative effects become an increasingly alarming threat to our nation and its citizens.

Notably, COVID-19 has intensified and drawn attention to the U.S. eviction crisis. Tens of millions of Americans are potentially at risk of eviction. Many property owners, who lack the credit or financial ability to cover unpaid rent, will struggle to pay their mortgages and property taxes and to maintain their properties. COVID-19 has sharply increased the risk of foreclosure and bankruptcy, disrupted the affordable housing market, and destabilized communities across the United States. Although rent and eviction freezes are a temporary solution to this problem, the threat of eviction will not go away once these moratoriums lift. The eviction crisis must be addressed. As a result, we believe that it is important to closely examine the housing and eviction data sets that the Eviction Lab has made readily available to the public and to policymakers. Looking for trends within these data sets can help us identify key factors and notable patterns in this crisis, which may give us insight into what next steps can be taken to address it.

Our primary research questions are the following: What clear trends can we identify in the eviction crisis? What patterns do we see that we can begin to address through housing policy and by directly helping communities? The data set we are using comes from the Eviction Lab, which was formed to make nationwide eviction data publicly available, with the goal to help “document the prevalence, causes, and consequences of eviction and to evaluate laws and policies designed to promote residential security and reduce poverty.” We focused specifically on the state of Pennsylvania, and obtained a data set that organizes all available eviction data by county for the years 2000 through 2016. With variables such as rent burden rates, poverty rates, demographic information, and more, our exploratory data analysis will hopefully shed light on the causes and consequences of eviction. Although there is no data from the most recent years, it is still valuable to evaluate trends prior to COVID-19, as this can help describe the trajectory of eviction data, even in “normal” circumstances.



Methods

In order to investigate how the variables we have at hand might affect the eviction rate in communities across Pennsylvania, we made a variety of graphs. We began broadly, simply looking at the main variable: eviction rate across counties. We created a choropleth map visualizing how eviction rates differ across areas of Pennsylvania, and we animated the map to show how these rates change over time for all of the counties. To do this, we had to merge our data set with the Pennsylvania county data that ships with the maps package. This gave us the longitude and latitude of each county's outline, which we drew with geom_polygon().

In order to interpret the animation, we’ll look at basic demographics (such as population density) to see how geography plays a role in eviction rates. Additionally, this graph will help readers quickly identify which areas of the state have lower and higher rates. Some counties appear gray at times due to missing data. Unfortunately, there was not much we could do to fix this problem. We searched for the eviction rates for certain counties in specific years, but in most cases we were unable to find them. When we did find them, we were not certain how accurate or reliable the data was. We did not want to add data points that might not be consistent with our data from the Eviction Lab.
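The merge and animation described above can be sketched roughly as follows. This is a minimal illustration, not our actual script: the small `evictions` data frame and its column names (`name`, `year`, `eviction_rate`) are hypothetical stand-ins for the Eviction Lab file, and `transition_manual()` is one of several gganimate transitions that would work here.

```r
library(ggplot2)
library(maps)       # supplies the county outlines read by map_data()
library(gganimate)

# Hypothetical stand-in for the Eviction Lab county file.
# Column names and values are assumptions for illustration only.
evictions <- data.frame(
  name = rep(c("Adams County", "Philadelphia County"), each = 2),
  year = rep(c(2000, 2001), times = 2),
  eviction_rate = c(1.2, 1.4, 3.9, NA)   # NA mimics a missing county-year
)

# County outlines: one row per polygon vertex (long, lat, group, order, subregion).
pa_outline <- map_data("county", region = "pennsylvania")

# The maps package stores county names in lower case without the "County" suffix.
evictions$subregion <- tolower(sub(" County$", "", evictions$name))

# Join outlines to rates; each vertex is repeated once per year.
pa_map <- merge(pa_outline, evictions, by = "subregion")

# merge() does not preserve row order, so restore the vertex order
# before drawing polygons.
pa_map <- pa_map[order(pa_map$year, pa_map$group, pa_map$order), ]

p <- ggplot(pa_map, aes(long, lat, group = group, fill = eviction_rate)) +
  geom_polygon(color = "white") +
  scale_fill_gradient(low = "lightyellow", high = "darkred",
                      na.value = "grey70") +   # missing rates are drawn gray
  coord_quickmap() +
  labs(title = "Eviction rate by county", subtitle = "Year: {current_frame}") +
  transition_manual(year)                       # one frame per year

# Render interactively with: animate(p)
```

Because this sketch uses an inner join, counties with no rows at all are dropped; keeping every outline (so that entirely missing counties still appear gray) would need a join that retains all county-year combinations.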

The choropleth map was helpful for seeing the overall trends in eviction across the entire state, but we felt that more specific insights could be drawn by focusing on and comparing individual counties. In order to do this, we chose to visualize the same information as in the previous graph, but as a Shiny + Plotly interactive line graph. In this visualization, users can select the counties they want to explore and compare, and a line graph displays the eviction rates for those counties over time. This is a useful tool for examining county-level trends that might be difficult to pinpoint through the map animation in graph 1.
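A minimal sketch of such an app is shown below. It is not the contents of SearchbyCounty.R; the data frame, its column names, and the input and output IDs are all hypothetical placeholders chosen for illustration.

```r
library(shiny)
library(ggplot2)
library(plotly)

# Hypothetical county-year data; values are placeholders, not Eviction Lab figures.
evictions <- data.frame(
  name = rep(c("Adams County", "Allegheny County", "Philadelphia County"), each = 3),
  year = rep(2000:2002, times = 3),
  eviction_rate = c(1.1, 1.3, 1.2, 2.8, NA, 3.0, 3.6, 3.5, 3.7)
)

ui <- fluidPage(
  selectInput("counties", "Counties to compare",
              choices = unique(evictions$name),
              selected = unique(evictions$name)[1],
              multiple = TRUE),
  plotlyOutput("rate_plot")
)

server <- function(input, output) {
  output$rate_plot <- renderPlotly({
    req(input$counties)   # wait until at least one county is selected
    selected <- evictions[evictions$name %in% input$counties, ]
    p <- ggplot(selected, aes(year, eviction_rate, color = name)) +
      geom_line() +
      geom_point() +
      labs(x = "Year", y = "Eviction rate", color = "County")
    ggplotly(p)           # adds hover tooltips with the exact values
  })
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

Missing county-years (the `NA` above) show up as breaks in the line, which is the same gap behavior visible in the real data.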

For our third graph, we decided to create a Shiny app that allows users to explore how different variables are correlated. On the x-axis, the user can select median gross rent, percent renter occupied, median household income, or median property value. We chose these variables because they are relevant to the economics of renting a home. We wanted to investigate how these economic factors might be correlated with variables directly related to evictions in Pennsylvania: eviction rate, eviction filing rate, poverty rate, and rent burden. This Shiny app allows us to explore correlations between the economics of housing and renting and the statistics of eviction, and to address questions such as how income and property value might be related to the rates of eviction and poverty. We may even be able to draw some conclusions about how the pricing of homes relates to a community’s poverty rate, which raises interesting policy questions.
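The variable-picker idea can be sketched as follows. This is an illustrative outline rather than the contents of ScatterplotEDA.R: the column names are assumed, the data are random placeholders, and `scatter_plot()` is a hypothetical helper introduced only for this example.

```r
library(shiny)
library(ggplot2)

# Display labels mapped to (assumed) column names.
x_choices <- c("Median gross rent"       = "median_gross_rent",
               "Percent renter occupied" = "pct_renter_occupied",
               "Median household income" = "median_household_income",
               "Median property value"   = "median_property_value")
y_choices <- c("Eviction rate"        = "eviction_rate",
               "Eviction filing rate" = "eviction_filing_rate",
               "Poverty rate"         = "poverty_rate",
               "Rent burden"          = "rent_burden")

# Random placeholder data, one row per hypothetical county.
set.seed(1)
n <- 20
counties <- data.frame(
  median_gross_rent       = runif(n, 500, 1100),
  pct_renter_occupied     = runif(n, 15, 50),
  median_household_income = runif(n, 35000, 90000),
  median_property_value   = runif(n, 80000, 350000),
  eviction_rate           = runif(n, 0, 5),
  eviction_filing_rate    = runif(n, 0, 10),
  poverty_rate            = runif(n, 3, 25),
  rent_burden             = runif(n, 20, 35)
)

# Hypothetical helper: build a scatter plot from two column names.
# .data[[...]] lets aes() look up a column whose name is held in a string.
scatter_plot <- function(df, xvar, yvar) {
  ggplot(df, aes(.data[[xvar]], .data[[yvar]])) +
    geom_point() +
    labs(x = xvar, y = yvar)
}

ui <- fluidPage(
  selectInput("xvar", "X axis", choices = x_choices),
  selectInput("yvar", "Y axis", choices = y_choices),
  plotOutput("scatter")
)

server <- function(input, output) {
  output$scatter <- renderPlot(scatter_plot(counties, input$xvar, input$yvar))
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```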

For the fourth graph, we decided to narrow our focus from all of the variables in the previous interactive Shiny app to specifically examine how median gross rent, rent burden, and eviction rate correlate over time. This gives us a chance to take a deeper dive into how rent affects the share of income a household spends on rent, and how this ties to the rate of eviction. The animation shows the data points for all of the counties in Pennsylvania over time. We predict that higher rent will increase the rent burden and will also correlate with a higher eviction rate in the county.
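One way to build this kind of animated scatterplot is sketched below. The data frame is a hypothetical placeholder with assumed column names, and `transition_time()` is used here as a reasonable choice for yearly data; our actual script may differ in its details.

```r
library(ggplot2)
library(gganimate)

# Hypothetical county-year values; not Eviction Lab figures.
counties <- data.frame(
  name = rep(c("County A", "County B", "County C"), each = 3),
  year = rep(c(2000L, 2008L, 2016L), times = 3),
  median_gross_rent = c(450, 520, 610, 600, 720, 850, 700, 830, 960),
  rent_burden       = c(22, 24, 26, 25, 28, 30, 29, 31, 34),
  eviction_rate     = c(0.8, 0.9, 0.9, 1.9, 2.2, 2.4, 3.3, 3.6, 3.5)
)

p <- ggplot(counties,
            aes(median_gross_rent, rent_burden,
                size = eviction_rate,   # larger point = higher eviction rate
                group = name)) +        # keeps each county as one moving point
  geom_point(alpha = 0.6) +
  labs(x = "Median gross rent", y = "Rent burden (% of income)",
       size = "Eviction rate",
       subtitle = "Year: {round(frame_time)}") +
  transition_time(year)                 # interpolates positions between years

# Render interactively with: animate(p)
```

When consecutive years hold identical values for a county, the interpolated frames are identical too, so the point simply sits still for that stretch of the animation.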

Finally, one important section of the dataset we had not yet picked apart is the racial and ethnic breakdown of counties in Pennsylvania. It was important for us to create a graph for this, because poverty, eviction, and general socio-economic status are often correlated with racial background. Systemic racism cuts deeply into essential parts of our lives, and we thought it would be essential to visualize how changes in the racial make-up of a region might be correlated with changes in poverty and eviction rates. Before creating our final graph, we had originally made a scatterplot illustrating the correlation between rent burden and poverty rate across counties, and found that Philadelphia County was a massive outlier in the data set, with a much higher poverty rate and rent burden than any other county for the year 2016. Additionally, as Philadelphia is a major metropolitan area with a high population density, ranking as the 6th most populous city in the United States, we made the decision to focus specifically on the changes in racial breakdown for Philadelphia County. Thus, our final graph was a stacked bar graph, with one bar showing the racial percentages for each year from 2000 to 2016 for Philadelphia County. We hope to identify how the racial makeup has changed, which may provide valuable insight into the changing poverty rates and eviction rates in Philadelphia.
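A stacked bar graph of this kind needs the data in long format, with one row per year and group. The sketch below shows one way to do the reshaping in base R before plotting. The percentages are made-up placeholders that merely sum to 100, and the column names are assumptions, not the Eviction Lab schema.

```r
library(ggplot2)

# Made-up percentages for illustration only; each row sums to 100.
philly <- data.frame(
  year         = c(2000, 2008, 2016),
  pct_white    = c(43, 39, 35),
  pct_af_am    = c(42, 42, 41),
  pct_hispanic = c(8, 11, 14),
  pct_asian    = c(4, 5, 7),
  pct_other    = c(3, 3, 3)
)

race_cols <- c("pct_white", "pct_af_am", "pct_hispanic", "pct_asian", "pct_other")

# Wide -> long: unlist() concatenates the columns one after another, so the
# group label repeats once per year and the years repeat once per group.
long <- data.frame(
  year  = rep(philly$year, times = length(race_cols)),
  group = rep(race_cols, each = nrow(philly)),
  pct   = unlist(philly[race_cols], use.names = FALSE)
)

p <- ggplot(long, aes(factor(year), pct, fill = group)) +
  geom_col() +   # geom_col() stacks the groups within each year by default
  labs(x = "Year", y = "Percent of population", fill = "Group")
```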



Results: Graphs and interpretations

Graph 1:

In Graph 1, we created an animated map of the Pennsylvania counties depicting the eviction rate over time from 2000 to 2016. This data was from the Eviction Lab, and our main variable was the eviction rate. We can see that over time, the eviction rate has slowly decreased across the counties overall.

More specifically, the counties in the middle of Pennsylvania have a relatively low eviction rate throughout the time period. The counties on the eastern and western sides of Pennsylvania fluctuate more over time. We can also notice that in 2016, the most recent year in our dataset, the eviction rate is relatively low overall. In contrast, around 2006 the eviction rates for Pennsylvania counties in general spiked sharply, as the map became darker and darker. We can also see that in Philadelphia County specifically, the eviction rate continues to increase and the county darkens in color. This is quite important because our interest in this topic was sparked by the fact that Swarthmore College is located just outside Philadelphia County. We also displayed the changing year in the subtitle of our graph to make the animation clearer to viewers.

Overall, this might be one of our favorite graphs we have from our final project. This graph is visually appealing, straightforward, and easy to interpret. This is also a key component for insight on our topic.

Graph 2:

As an extension of the exploration from graph 1, this Shiny app comparing eviction rates by county allowed us to explore the rates and changes in rates of specific counties, as well as to compare counties with each other. You can also hover over specific years to see the exact rates. One very interesting pattern that we observed while looking through different counties was that many counties seemed to remain stagnant, then rise quickly from around 2005 until 2006, and then drop drastically. Yet Philadelphia County, home to the 6th largest city by population in the US and the county with the highest poverty rate in PA in 2016, did not follow this trend. Its eviction rates remained relatively constant (and higher than most of the other counties we looked at) throughout the entire time frame of the dataset (see below).

Graph 3:

For our third graph, we created a Shiny interactive graph that allows users to pick which variables they would like to see plotted against each other in a scatter plot. This might be the most helpful or insightful graph we have included in our final project. For the x- and y-axes, we provided the options that make the most sense to plot against each other.

When we plot eviction rate against percent renter occupied, we can see that there is no overall linear relationship between the two variables. The data points, which represent the counties, form a single cluster with no visible pattern.

When comparing eviction rate with median household income, or eviction rate with median property value, we again see no clear linear relationship. There may be a slight negative relationship, as the points converge somewhat toward lower eviction rates as median household income increases. Yet, looking at the scatter plot as a whole, we would say that there is no meaningful correlation or linear relationship.

Additionally, exploring the relationship between eviction rate and median gross rent, we see an overall trend that is positive but not strongly linear. When median gross rent is low, eviction rate tends to rise with rent. However, across the full range of the x-axis, for instance when median gross rent is over 900, the correlation with eviction rate is nonexistent. Thus, there does not appear to be a consistent linear relationship between median gross rent and eviction rate.

After exploring eviction rate, we wanted to dive into the correlations of other variables with poverty rate. Plotting poverty rate against median gross rent showed a negative, moderately strong linear relationship: as median gross rent increases, the poverty rate decreases. However, there is one prominent outlier that does not fit this pattern: Philadelphia County. With a poverty rate of over 20 in 2016 and a median gross rent of over 900 in 2016, it is an extreme outlier in our dataset.

Some variables did not show any clear indication of a pattern. For example, there is no clear linear relationship between poverty rate and percent renter occupied over the years. There is a large cluster of counties that shows little association between the two, along with one extreme outlier far from the cluster. We can reasonably conclude that there is no linear relationship. Finally, there is a clear, strong, negative linear relationship when we compare poverty rate with median household income. As median household income increases, the poverty rate decreases, which makes intuitive sense: as individuals' incomes rise, fewer of them should be struggling financially. Again, there is an outlier that makes the relationship between the two look almost like an exponential, concave-up curve. With the outlier removed, the strength of the correlation would very likely increase. The same trends apply when we plot poverty rate against median property value.
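The effect of a single outlier on a correlation can be checked directly by computing the coefficient with and without that point. The sketch below uses a handful of made-up counties (not Eviction Lab figures) in which one county combines low income with a very high poverty rate.

```r
# Made-up values for illustration; "Outlier" plays the role of the extreme county.
counties <- data.frame(
  name = c("A", "B", "C", "D", "E", "Outlier"),
  median_household_income = c(40000, 45000, 50000, 55000, 60000, 38000),
  poverty_rate = c(12, 10, 9, 7, 6, 25)
)

# Pearson correlation using every county.
r_all <- cor(counties$median_household_income, counties$poverty_rate)

# The same correlation after dropping the outlying county.
trimmed <- counties[counties$name != "Outlier", ]
r_trimmed <- cor(trimmed$median_household_income, trimmed$poverty_rate)

# Both coefficients are negative; the trimmed one is closer to -1.
c(all = r_all, trimmed = r_trimmed)
```

In this toy example the relationship stays negative either way, but the coefficient moves closer to -1 once the outlier is removed, which is the behavior we would expect to see in the real scatterplot.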

Graph 3 supports many interesting interpretations of whichever pair of variables the user is most interested in. We did not analyze in depth how rent burden and eviction filing rate relate to median gross rent, percent renter occupied, median household income, and median property value, but we hope to have shown the usefulness of graph 3. This is by far the most helpful graph for users to compare how variables correlate, or do not correlate, with each other.

Graph 4:

This scatter plot of the median gross rent, rent burden, and eviction rates of counties in Pennsylvania over time was particularly interesting, because there were some very strong trends that we noticed. Before interpreting it, it is important to note that the original dataset from the Eviction Lab had some odd data points: certain variables, including poverty rate and the variables we are graphing, remained unchanged for four- or five-year stretches. For example, the poverty rate for Adams County remained at 7.12% between 2000 and 2004. It is unclear whether these statistics are only gathered every few years, or whether the unchanging values reflect a genuinely constant poverty rate over those years. Regardless, the unchanging values explain why the animation sometimes does not show a smooth change over time; the points do not move for several frames because the data values are equal across consecutive years.
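Plateaus like this can be located programmatically with run-length encoding. In the sketch below, the 7.12 value for 2000 through 2004 is the Adams County figure quoted above; the 2005 through 2009 value is a hypothetical placeholder added only so that there is a second run to detect.

```r
# 2000-2004 uses the Adams County value from the text;
# the 2005-2009 value (8.30) is a made-up placeholder.
adams <- data.frame(
  year = 2000:2009,
  poverty_rate = c(rep(7.12, 5), rep(8.30, 5))
)

# rle() collapses consecutive equal values into (value, length) pairs.
runs <- rle(adams$poverty_rate)

# Convert run lengths into the first and last row index of each run.
run_end   <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

plateaus <- data.frame(
  value      = runs$values,
  first_year = adams$year[run_start],
  last_year  = adams$year[run_end],
  n_years    = runs$lengths
)

plateaus
```

Runs with `n_years` greater than 1 mark the stretches where the animation holds still.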

This graph clearly illustrates that a higher median gross rent is fairly strongly correlated with a higher rent burden. The size of each point indicates the relative eviction rate for that county, and over time, the counties with higher eviction rates continue to rise in terms of median gross rent and rent burden. On the other hand, counties with relatively low eviction rates seem to remain stagnant over time. This suggests that in areas less affected by evictions, rent burden and average rent do not change much throughout the years, whereas in areas with high poverty rates, rent burden and average rent fluctuate enormously over time. Poorer areas show high levels of rent burden as rent continues to rise, and thus, without resources to cope with rising rent, poverty remains high in these areas: an unsettling cycle of generational poverty.

Graph 5:

In this bar graph outlining the racial breakdown of Philadelphia County between 2000 and 2016, there are a few immediate observations. For one, the percentage of white residents in Philadelphia County steadily decreases throughout the years. This trend is often referred to as white flight, and it may be that white residents are moving out of the city and into more suburban areas. While the African American share of the population has not significantly changed, we observed a steady increase in the Hispanic share, as well as in the percentage of Asian residents. Over these years, the county also experienced a large increase in rent burden, which may shed light on how the movement of white residents away from the county and the growth of minority groups such as Hispanic and Asian residents correlate with the percent of income spent on rent.



Discussion

Through this exploratory data analysis and data visualization project, we were able to draw a lot of interesting conclusions (see Results Section), many of which sparked new ideas for further research and data analysis. Firstly, areas with high poverty and eviction rates tended to be very large metropolitan areas with high population density and high racial heterogeneity, or very rural areas with low access to resources.

Both of our scatterplots, in particular, were powerful tools for showcasing the impact that variables related to housing and rent can have on eviction and poverty rates. The variables we investigated show notable correlations with poverty rate, rent burden, eviction rate, and eviction filing rate. Many of these scatterplots confirmed our hypotheses and existing knowledge about eviction factors. For example, median household income and poverty rate are very strongly negatively correlated, aside from the single outlier noted in the Results section. One especially interesting point of discussion is our visualization in Graph 4. The unchanging data points for counties with low poverty and the highly fluctuating data points (with increasing rent burden and average rent) for areas with high poverty are a strong indicator of a cycle of poverty that is very difficult to break. When families fall into poverty, it becomes extremely difficult for family members to expand their focus beyond basic survival; individuals become backed into a corner, often living paycheck to paycheck, and are unable to focus on thriving economically or on things such as education, high-quality food, mental health, and other important aspects of life. Those in poverty experience high, constant stress that inevitably leads to long-term health impacts, in addition to the economic impacts and the inability to pursue opportunities that would break this cycle of poverty.

Eviction plays a huge role in this. When families are worried about eviction, or have faced eviction, it presents severe challenges to their daily lives that keep them focused on survival and thus unable to focus on the aspects of life that help break a cycle of poverty. Our visualizations were compelling illustrations of this, and they give us, and those in policy positions, very strong reasons to address this issue.

COVID-19, specifically, has had a catastrophic impact on the existing eviction crisis. With record-breaking unemployment filings and unprecedented layoffs occurring nationwide, rent burden and fear of eviction have reached an all-time high. The short-term impact of the pandemic is clear: a dangerous economic burden for families, individuals, and children, as well as loss of home and basic security. The long-term impacts are equally startling: how will people's finances, health, and overall well-being fare through this crisis? What will happen once the eviction moratorium is lifted? The rent freeze only acts as a bandaid, a delay of the inevitable chaos. We originally chose this topic because of how salient the eviction crisis had become. For future investigation, once eviction data from this year is made publicly available by either the Eviction Lab or another source, it would be a good idea to come back to this project, conduct similar analyses, and see how much variables such as rent burden and poverty levels have changed since 2000-2016. The ultimate goal would be to inform the public about this issue and to show policymakers what might be a good starting point for addressing the eviction crisis.

One limitation is that our dataset is missing eviction rate values for certain counties. This is a big drawback because graph 1, graph 2, and graph 3 might be somewhat skewed without those few, yet important, eviction rate data points. It is especially noticeable in our Shiny graph of specific counties and their eviction rates, where there are big gaps between certain years for some counties, which makes the data look incomplete.

Another drawback is that our dataset only provides data between 2000 and 2016. This is unfortunate, as it would be interesting to see how the eviction rate has changed in recent years, especially with the pandemic. With COVID-19, the job market has weakened significantly, which affects almost all of the variables in our data set. We would love to have access to a similar dataset covering these recent years.

Along with investigating data for 2019-2021 and beyond, we noticed two interesting trends in our data visualizations that deserve follow-up. First, there was an abrupt rise and drop in eviction rates between 2005 and 2006. In the future, we plan to research what exactly happened in these years to cause this strange phenomenon, and which counties in particular were affected. Examining the variables at hand for these counties might uncover similarities and common characteristics among the affected counties. Secondly, we noticed a decrease in the white population and an increase in the Asian and Hispanic populations over the years covered by our data. Learning more about why these numbers have changed, as well as exploring the counties with noticeable changes in racial demographics, might help us see how these shifts correlate with the trends we see in eviction.

Future analysis can use the Eviction Lab data set to investigate how eviction trends have fluctuated over time across the United States. One interesting idea might be predicting eviction rates for future years, such as 2025, 2026, and 2027. We would potentially start by looking at which variables best fit a linear regression model so we can project future trends. It would also be helpful to further investigate the racial breakdown of major metropolitan areas and how it has affected, or has been affected by, the eviction crisis. Overall, we enjoyed this final project because it meant more than looking at simple datasets. This is a topic that is of serious interest to both of us, and hopefully we can explore and expand on it when ample data is available.
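As a rough sketch of what that first step could look like, the code below fits a simple linear trend to yearly eviction rates and extrapolates it. The rates are made-up placeholders rather than Eviction Lab values, and a straight-line fit on year alone is only the simplest possible starting model.

```r
# Made-up yearly eviction rates for one hypothetical county.
rates <- data.frame(
  year = 2010:2016,
  eviction_rate = c(3.4, 3.3, 3.3, 3.1, 3.0, 2.9, 2.8)
)

# Simple linear regression of eviction rate on year.
fit <- lm(eviction_rate ~ year, data = rates)

# Extrapolate to the future years mentioned above.
future <- data.frame(year = 2025:2027)
future$predicted <- predict(fit, newdata = future)

# Prediction intervals show how uncertain the extrapolation is.
intervals <- predict(fit, newdata = future, interval = "prediction")

future
```

Because 2025 lies well beyond the last observed year, the prediction intervals widen considerably, so any such forecast should be read as a rough trajectory rather than an estimate.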



Reference List:

Stat41 Lab 9, Dynamic Graphics

Stat41 Lab 10, Introduction to Shiny

Stat41 Lab 11, gganimate

HTML customization

Eviction Lab

Shiny Tools and Tips