Abstract

This analysis examines how domestic violence (DV)-related 911 calls in Douglas County, Nebraska behaved before and during the onset of COVID-19, and to what extent the onset of COVID-19 may have affected them. We employ a quasi-experimental approach. First, we use a seasonal autoregressive integrated moving average (SARIMA) model to predict how DV-related 911 calls would have been expected to behave in the absence of COVID-19 (i.e., a quasi-control group). We then compare the projected rates of DV-related 911 calls in the quasi-control group to the actual rates of calls (i.e., the treatment group). Comparisons between the quasi-control and treatment groups indicate [summary of findings].

This analysis includes the following sections:

1. Initial Data Exploration
2. Selecting the Model & Tuning the Model's Hyperparameters
3. Validating the Model Through Train & Test Splits
4. Modeling Anticipated Number of Calls Absent COVID-19 (i.e., creating the quasi-control group)
5. Inferring Impact of COVID-19 on DV-Related 911 Calls

1. Initial Data Exploration


As a first step, we explored the data. The monthly number of calls ranged from 881 to 1616, with a mean of 1325. The data exhibited strong seasonality, with the monthly number of calls peaking in the summer months and subsiding in the winter months. No overall upward or downward trend in the number of calls over time was apparent.
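
The summary described above can be reproduced along the following lines. This is a minimal sketch on simulated data, since the call counts themselves are not included here; the series `calls`, its start date, its length, and the simulation parameters are all hypothetical stand-ins rather than the values used in this analysis.

```r
# Hypothetical stand-in for the monthly call counts (NOT the real data)
set.seed(1)
n <- 74
month_idx <- (seq_len(n) - 1) %% 12   # 0 = January, ..., 11 = December
calls <- ts(round(1300 + 250 * sin(2 * pi * (month_idx - 3) / 12) + rnorm(n, sd = 60)),
            start = c(2014, 1), frequency = 12)

# Range and mean of the monthly counts
summary_stats <- c(min = min(calls), mean = mean(calls), max = max(calls))
print(round(summary_stats))

# Average count per calendar month, to see where the seasonal peak falls
month <- factor(month.abb[as.integer(cycle(calls))], levels = month.abb)
monthly_means <- tapply(as.numeric(calls), month, mean)
print(round(monthly_means))

# Line plot of the series and a month-by-month boxplot
plot(calls, ylab = "Calls per month", main = "Simulated monthly calls")
boxplot(as.numeric(calls) ~ month, xlab = "Month", ylab = "Calls per month")
```

The boxplot by calendar month is one convenient way to see a summer peak and winter trough; a flat line plot over the years suggests the absence of a long-run trend.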

2. Selecting the Model & Tuning the Model's Hyperparameters


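Next, we applied an Augmented Dickey-Fuller (ADF) test to the data in order to assess its stationarity, that is, whether its mean, variance, and autocorrelation are constant over time. The null hypothesis of the ADF test is that the series has a unit root (i.e., is non-stationary); the alternative hypothesis is that it is stationary. The test statistic of -5.2432 (p-value = 0.01) allows us to reject the null hypothesis at the 5% level. In other words, the test does not indicate a unit root. Note, however, that the ADF test targets unit-root behavior and does not rule out the seasonal pattern observed during data exploration.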

## 
##  Augmented Dickey-Fuller Test
## 
## data:  ts_call_data_control_14
## Dickey-Fuller = -5.2432, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
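
A test of this kind can be run as sketched below. The output format above is consistent with `adf.test()` from the tseries package, though the package is not named in this analysis, so its use here is an assumption. The series `calls` is simulated and hypothetical.

```r
# Assumes the tseries package is installed: install.packages("tseries")
library(tseries)

# Hypothetical stand-in for the monthly call counts (NOT the real data)
set.seed(1)
n <- 74
month_idx <- (seq_len(n) - 1) %% 12
calls <- ts(round(1300 + 250 * sin(2 * pi * (month_idx - 3) / 12) + rnorm(n, sd = 60)),
            start = c(2014, 1), frequency = 12)

# Null hypothesis: unit root (non-stationary); alternative: stationary.
# The default lag order is trunc((length(x) - 1)^(1/3)).
adf_result <- adf.test(calls, alternative = "stationary")
print(adf_result)

# A p-value below the chosen significance level rejects the unit-root null
reject_unit_root <- adf_result$p.value < 0.05
print(reject_unit_root)
```

Note that `adf.test()` reports p-values only within the range 0.01 to 0.1 and issues a warning when the true value falls outside it, so a printed p-value of 0.01 should be read as "0.01 or smaller".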


Next, we applied a seasonal decomposition to the data in an effort to better understand its components, specifically its seasonality. Plotting the trend-cycle and seasonal components computed by R's stl() function shows that the data have strong seasonal fluctuations and an overall trend that varies over time.
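
A decomposition of this kind might look like the following sketch. It uses simulated, hypothetical data, and the choice of `s.window = "periodic"` (a seasonal pattern held fixed across years) is an assumption; the setting actually used is not stated here.

```r
# Hypothetical stand-in for the monthly call counts (NOT the real data)
set.seed(1)
n <- 74
month_idx <- (seq_len(n) - 1) %% 12
calls <- ts(round(1300 + 250 * sin(2 * pi * (month_idx - 3) / 12) + rnorm(n, sd = 60)),
            start = c(2014, 1), frequency = 12)

# STL splits the series into seasonal, trend, and remainder components
decomp <- stl(calls, s.window = "periodic")
plot(decomp)

# Size of the seasonal swing relative to the spread of the remainder
components <- decomp$time.series
seasonal_swing <- diff(range(components[, "seasonal"]))
remainder_sd <- sd(components[, "remainder"])
print(c(seasonal_swing = seasonal_swing, remainder_sd = remainder_sd))
```

Comparing the seasonal swing to the spread of the remainder gives a rough, informal sense of how strong the seasonality is.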


Given the strong seasonal fluctuations exhibited by the data, which are themselves a form of non-stationarity, we sought to apply a seasonal autoregressive integrated moving average (SARIMA) model. SARIMA models extend ARIMA models with seasonal terms and use regular and seasonal differencing to handle trend and seasonal non-stationarity when forecasting a time series.
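
For reference, a SARIMA(p,d,q)(P,D,Q)[s] model is commonly written in backshift notation as shown below. This is the standard textbook form rather than anything specific to this analysis; the symbols are introduced here for illustration, and sign conventions for the moving average terms vary between texts and software.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1 - B)^{d}\,(1 - B^{s})^{D}\, y_t
  \;=\; \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
% B: backshift operator, B y_t = y_{t-1}
% s: seasonal period (12 for monthly data)
% \phi_p, \Phi_P: non-seasonal and seasonal AR polynomials of orders p and P
% \theta_q, \Theta_Q: non-seasonal and seasonal MA polynomials of orders q and Q
% d, D: numbers of regular and seasonal differences
% \varepsilon_t: white-noise error term
```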

We relied on R's auto.arima() function (from the forecast package) to get an initial sense of suitable model orders. The auto.arima() function uses a variation of the Hyndman-Khandakar algorithm to select the "best" fitting SARIMA model, defined as the candidate with the smallest AICc value.

## Series: ts_call_data_control_14 
## ARIMA(0,1,1)(0,1,1)[12] 
## 
## Coefficients:
##           ma1     sma1
##       -0.5744  -0.7322
## s.e.   0.1069   0.2303
## 
## sigma^2 estimated as 5155:  log likelihood=-350.91
## AIC=707.82   AICc=708.24   BIC=714.15
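
The selection step can be sketched as follows, assuming the forecast package is installed. The data are simulated and hypothetical, so the orders chosen here need not match the model reported above. The hand calculation at the end illustrates how AICc relates to AIC.

```r
# Assumes the forecast package is installed: install.packages("forecast")
library(forecast)

# Hypothetical stand-in for the monthly call counts (NOT the real data)
set.seed(1)
n <- 74
month_idx <- (seq_len(n) - 1) %% 12
calls <- ts(round(1300 + 250 * sin(2 * pi * (month_idx - 3) / 12) + rnorm(n, sd = 60)),
            start = c(2014, 1), frequency = 12)

# Search over candidate SARIMA orders; the default criterion is AICc
fit <- auto.arima(calls)
print(fit)

# AICc = AIC + 2k(k + 1) / (n - k - 1), where k counts the estimated
# coefficients plus the error variance and n is the number of
# observations remaining after differencing
k <- length(coef(fit)) + 1
n_eff <- fit$nobs
aicc_by_hand <- fit$aic + 2 * k * (k + 1) / (n_eff - k - 1)
print(c(aicc_by_hand = aicc_by_hand, aicc_reported = fit$aicc))
```

Under R's sign convention for moving average terms, the ARIMA(0,1,1)(0,1,1)[12] model reported above corresponds to (1 - B)(1 - B^12) y_t = (1 - 0.5744 B)(1 - 0.7322 B^12) e_t, where B is the backshift operator and e_t is the error term.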


In an effort to vet the parameters suggested by R's auto.arima() function, we plotted the data's autocorrelation function (ACF) and partial autocorrelation function (PACF). ACF plots display the correlation between a series and its lagged values. PACF plots display the correlation between a series and a given lag after removing the effects of the shorter, intervening lags.
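
Plots of this kind can be produced with base R, as in the sketch below. The data are simulated and hypothetical. Whether the plots in this analysis were drawn for the raw or the differenced series is not stated, so both are shown here as an assumption.

```r
# Hypothetical stand-in for the monthly call counts (NOT the real data)
set.seed(1)
n <- 74
month_idx <- (seq_len(n) - 1) %% 12
calls <- ts(round(1300 + 250 * sin(2 * pi * (month_idx - 3) / 12) + rnorm(n, sd = 60)),
            start = c(2014, 1), frequency = 12)

# ACF and PACF of the raw series, out to three years of lags
acf_raw <- acf(calls, lag.max = 36, main = "ACF, raw series")
pacf_raw <- pacf(calls, lag.max = 36, main = "PACF, raw series")

# The same plots after one seasonal and one regular difference,
# matching the d = 1, D = 1 structure of an ARIMA(0,1,1)(0,1,1)[12] model
calls_diff <- diff(diff(calls, lag = 12))
acf_diff <- acf(calls_diff, lag.max = 36, main = "ACF, differenced series")
pacf_diff <- pacf(calls_diff, lag.max = 36, main = "PACF, differenced series")
```

As a general rule of thumb, a sharp cutoff in the ACF after lag q alongside a decaying PACF points toward an MA(q) term, and spikes at multiples of the seasonal period point toward seasonal terms.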

Within the context of ARIMA modeling, ACF and PACF plots are used to tentatively identify the orders of the AR and/or MA components of the model. Results of our ACF and PACF plots indicate...

[summative text to go here]

3. Validating the Model Through Train & Test Splits


[text to go here explaining the train and test split process]


[text to go here explaining the accuracy of the model by relying on MAPE (mean absolute percentage error)]

##                    ME     RMSE      MAE       MPE      MAPE      MASE
## Training set 6.274432 32.15437 10.31893 0.5481795 0.8837064 0.1910913
##                    ACF1
## Training set -0.1877914
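
For illustration, a holdout validation with MAPE can be sketched as follows. Everything here is an assumption made for the example: the data are simulated, the 12-month test window is hypothetical, and the model orders are simply copied from the ARIMA(0,1,1)(0,1,1)[12] specification reported earlier. It is not a description of the split actually used in this analysis.

```r
# Hypothetical stand-in for the monthly call counts (NOT the real data)
set.seed(1)
n <- 74
month_idx <- (seq_len(n) - 1) %% 12
calls <- ts(round(1300 + 250 * sin(2 * pi * (month_idx - 3) / 12) + rnorm(n, sd = 60)),
            start = c(2014, 1), frequency = 12)

# Hypothetical split: hold out the final 12 months as the test set
train <- window(calls, end = c(2019, 2))
test <- window(calls, start = c(2019, 3))

# Fit the seasonal model on the training window only
fit <- arima(train, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast over the test window and compare with the held-out values
fc <- predict(fit, n.ahead = length(test))$pred

# MAPE: mean of |actual - forecast| / |actual|, expressed in percent
mape <- mean(abs((test - fc) / test)) * 100
print(mape)
```

Because MAPE is computed on months the model never saw, it gives a more honest picture of forecast accuracy than error measures computed on the training set alone.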