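# Load the packages used below: Amelia (missmap), dplyr (pipes and verbs), ggplot2 (plots)
library(Amelia)
library(dplyr)
library(ggplot2)
EvictionData <- read.csv("C:/Users/schic/OneDrive/Documents/Predictive Analytics and Forecasting/Monthly Eviction Filings by Location.csv")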
This data set, from data.world, provides monthly eviction filings for 10 states and 34 cities from 2020 to 2023. It includes the city, the data collection type, the GEOID, the racial majority for the area, the month-year of occurrence, the number of filings, the average filings, and the date of the last update.
#check for missing values
missmap(EvictionData)
There is no missing data, so we will not need to impute any data points.
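As a numeric cross-check on the missingness map, the number of missing values per column can also be counted in base R. This is only a minimal sketch: the `toy` data frame is a hypothetical stand-in for `EvictionData`, and `count_missing` is an illustrative helper name that is not part of the original analysis.

```r
# Count missing values per column using base R only.
count_missing <- function(df) {
  colSums(is.na(df))
}

# Hypothetical toy data standing in for EvictionData (not real records)
toy <- data.frame(
  city    = c("Albuquerque, NM", "Albuquerque, NM", NA),
  filings = c(12, NA, 7)
)

missing_counts <- count_missing(toy)
print(missing_counts)
```

Applied to the real data, `count_missing(EvictionData)` should return zero for every column if the map shows no missing values.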
#selecting columns of interest from original dataset
Evictions <- EvictionData %>%
  select('city', 'racial_majority', 'month', 'filings', 'GEOID')
#filter for a specific location
#target <- c("35001000107", "35001002100", "35001002300")
Evictions <- Evictions %>%
  filter(city == 'Albuquerque, NM') #GEOID %in% target #racial_majority == "Latinx"
#convert month-year to date
Evictions$date <- as.Date(paste0("01-", Evictions$month), format = "%d-%b-%y")
#create new data frame to sum each instance of eviction filing to homogenize city communities
EvictionsNew <- Evictions %>%
  select('date', 'filings')
EvictionsFinal <- EvictionsNew %>%
  group_by(date) %>%
  summarize(total_filings = sum(filings))
#Historic record of monthly evictions in Albuquerque, NM from 2020 to 2023
ggplot(EvictionsFinal, aes(x = date, y = total_filings)) +
  geom_line() +
  labs(x = 'Date', y = "Evictions", title = "Total Number of Monthly Evictions in Albuquerque, NM from 2020 to 2023") +
  theme_minimal() +
  scale_x_date(date_breaks = "3 months", date_labels = "%b-%y")
#theme(legend.position = 'none')
The graph above shows the observed monthly eviction filings for the city of Albuquerque, NM.
# Create a time series object
EvictionsTS <- ts(EvictionsFinal$total_filings, start = c(2020, 1), frequency = 12)
# Perform decomposition
additive <- decompose(EvictionsTS, type = 'additive')
multiplicative <- decompose(EvictionsTS, type = 'multiplicative')
plot(additive)
plot(multiplicative)
Based on the additive and multiplicative decomposition plots above, both methods produce visually similar trend and seasonal components. The remainder components differ, however: the multiplicative decomposition shows larger fluctuations from peak to trough. Because the seasonal pattern appears to keep the same shape across months from year to year, which is the situation additive decomposition is designed for, the additive method is better suited to predicting monthly eviction filings in Albuquerque, NM from the data presented (which is highly impacted by COVID). Additionally, the trend appears to fall somewhere between exponential and linear. I would argue that it is closer to linear than exponential, which further points toward additive decomposition being the stronger method in this instance.
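The two remainder panels are on different scales: the additive remainder is in units of filings, while the multiplicative remainder is a ratio centered on 1. One way to compare them more directly is to express both in the original units and summarize each with a root mean squared error. The sketch below is an illustration under stated assumptions, not the method used in this analysis: `compare_decompositions` is a hypothetical helper, and the built-in `AirPassengers` series is used only to show the call.

```r
# Compare additive and multiplicative decompositions on a common scale
# by expressing both remainders in the original units of the series.
compare_decompositions <- function(x) {
  add  <- decompose(x, type = "additive")
  mult <- decompose(x, type = "multiplicative")

  # The additive remainder is already in original units
  add_resid <- add$random
  # The multiplicative remainder is a ratio; the equivalent residual in
  # original units is the series minus the fitted trend * seasonal component
  mult_resid <- x - mult$trend * mult$seasonal

  # Root mean squared error, ignoring the NA values at both ends of the trend
  rmse <- function(r) sqrt(mean(r^2, na.rm = TRUE))
  c(additive = rmse(add_resid), multiplicative = rmse(mult_resid))
}

# Built-in example series, used only to demonstrate the call (not eviction data)
result <- compare_decompositions(AirPassengers)
print(result)
```

With the series from this analysis, the call would be `compare_decompositions(EvictionsTS)`; the smaller value indicates the decomposition that leaves less unexplained variation in the observed filings.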
Since this data is confined to the COVID period, it likely has little predictive power for the present. It could have been useful, however, in a scenario where the effects of COVID were still an issue today.