library(forecast)
library(tidyverse)
library(kableExtra)
library(tsoutliers)
library(lubridate)
library(stringr)
library(gridExtra)
library(imputeTS)
library(urca)

What will be shown

Many models were evaluated for each time series, but only the results of only the best few will be shown. If a notable model was excluded, an explanation will be given.

Evaluation Process

There are 3 main components of model evaluation

  1. RMSE on time series cross validation
  2. Evaluation of residuals and model summary
  3. Human evaluation of how well the model should work in theory

If one model has a considerably better RMSE than the rest then criteria 1 will mostly trump 2 & 3. Prediction is the ultimate goal of forecasting and as long as the strong predictions can be explained, the more predictive model should be the choice. This, however, is somewhat rare and 2 & 3 will generally factor into the decision.

We use RMSE as the metric because in real world applications large errors should be given more weight. In electricity generation, blackouts are much worse than brownouts. In ATM service, running out of cash mid-day will cause customer complaints, while running out at midnight could be somewhat expected.

Handling Missing Values

Based on sources, interpolation is the best method for filling missing values. The observation after the missing value adds a considerable amount of information to the imputation, making this method better than a naive or ARIMA method. We will use seasonal interpolation when the data is seasonal.

(source: https://towardsdatascience.com/how-to-handle-missing-data-8646b18db0d4)

Reproducibility

Because Cross-Validation is so computationally expensive, these blocks were evaluated offline and the results loaded into CSVs. You can turn eval on for these blocks to compile the code.

ATM Data

This dataset contains daily data for 365 days. (below are some helper functions for dealing with this data)

atm <- read_csv('C:\\Users\\pgood\\OneDrive\\Documents\\DATA624\\ATM624data.csv')

replace_missing <- function (vec, diff = 7){
  for (i in 1:length(vec)){
    if (is.na(vec[i])){
      vec[i] <- vec[i-diff]
    }
  }
  return(vec)
}

clean_atm <- function(name){

  atm_s <- atm %>%
    filter(ATM == name) %>%
    select(Cash)
    
  atm_s <- atm_s[[1]] %>%
    ts(frequency = 7, start = c(1,6)) %>%
    na.seadec(algorithm = "interpolation")
  return(atm_s)
} 

time_plot <- function (name){ 
  autoplot(atm1) + labs(x = 'Week', y = toupper(name), title = 'Withdrawals by Day (100s)')
}

splot <- function (name) ggsubseriesplot(atm1) + labs (Title = 'Subseries Plot', y = toupper(name))
rmse <- function(e) sqrt(mean(e**2, na.rm=TRUE))

get_order <- function(model){ 
  order <- c(model$arma[1], model$arma[6], model$arma[2])
  seasonal <- c(model$arma[3], model$arma[7], model$arma[4])
  return(list(order, seasonal))
}

atm1 <- clean_atm('ATM1')
atm2 <- clean_atm('ATM2')
atm3 <- clean_atm('ATM3')
atm4 <- clean_atm('ATM4')

Seasonality

Our only viable option for frequency is to keep the data at a daily granularity and use a frequency of 7. Using this format, we will be looking for a weekly seasonal pattern, which can be seen to some degree, just by looking at total withdrawals broken out by day.

atm <- read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/ATM624Data.csv')
atmtest <- atm %>% mutate(DATE = mdy_hms(str_replace_all(DATE, '[A-Z]', '')), 
                          wkday = weekdays(DATE)) %>%
  filter(ATM != '')

ggplot(atmtest) + geom_bar(aes(x = wkday, y = Cash), stat = 'sum') + facet_wrap(~ATM) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1),
    legend.position = 'none'
    )

The raw sums show that the lowest total withdrawals occur on Thursdays. From a business perspective, this is likely because pay day is on Friday so Thursday is when bank accounts are at their lowest.

p1 <- ggsubseriesplot(atm1) + labs(title = 'ATM1', y = 'ATM1') +   theme(
    axis.text.x = element_text(angle = 90, hjust = 1))
p2 <- ggsubseriesplot(atm2) + labs(title = 'ATM2', y = 'ATM2') +   theme(
    axis.text.x = element_text(angle = 90, hjust = 1))
p3 <- ggsubseriesplot(atm4) + labs(title = 'ATM4', y = 'ATM4') +   theme(
    axis.text.x = element_text(angle = 90, hjust = 1))
grid.arrange(p1, p2, p3, nrow = 2, ncol = 2)

In each of the ATMs there is a stark change in the level of withdrawals on Thursday. Our models will have to be able to account for abrupt changes in seasonality. Using STL to model the seasonally component will not be viable. ETS will need to have a high gamma, and ARIMA models probably won’t have AR terms in the seasonal component.

In addition to abrupt changes in the seasonal level, there are several additional types of time series outliers:

An Additive Outlier (AO) represents an isolated spike.
A Level Shift (LS) represents an abrupt change in the mean level and it may be seasonal (Seasonal Level Shift, SLS) or not.
A Transient Change (TC) represents a spike that takes a few periods to disappear.
An Intervention Outlier (IO) represents a shock in the innovations of the model

(source: https://datascienceplus.com/outliers-detection-and-intervention-analysis/)

The strategy for dealing with these outliers will be discuss in more detail below.

6 types of models will be tested on each ATM:

  1. Standard ARIMA
  2. ARIMA with Box-Cox transform
  3. ARIMA with outlier intervention
  4. Box-Cox ARIMA with outlier intervention
  5. ETS
  6. ETS + BC

Bias adjustment will be applied in the cross validation step because we want the forecasts to be additive. For the 4 ATMS, we want to be able to determine how much money needs to be loaded onto trucks for delivery.

We will go into more detail for each of the 4 ATMs, then wrap up with conclusions and a final model for each.

ATM1

atm1 <- clean_atm('ATM1')
time_plot(atm1)

Based on this plot we notice:

  • Extreme seasonality
  • Cyclical Patterns
  • Some outliers
  • No trend pattern
splot('atm1')

Here we notice some interesting features. There are seasonal level shifts on Thursday and Tuesday. If we add a seasonal difference term, this will create large outliers and could skew our parameters if not dealt with. This is our chief motivation for outlier intervention.

ggtsdisplay(atm1)

It is pretty obvious that a seasonal difference term will be required. It is unclear if a standard difference term will be necessary. We’ll confirm with a KPSS test.

ur.kpss(atm1) %>% summary ()
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.5104 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

The test statistic is somewhat borderline, but we’ll try a seasonal difference term.

diffed_atm1 <- diff(atm1, lag = 7)
ggtsdisplay(diffed_atm1)

It appears that the order of the model will be somewhere around (1,0,1)(1,1,1). If auto.arima yields something drastically difference, that will raise red flags.

We confirm the KPSS statistic:

ur.kpss(diffed_atm1) %>% summary()
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.0219 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

A Box-Cox tranformation might help with stabalizing the variance. We’ll check for the appropriate lambda

BoxCox.lambda(atm1)
## [1] 0.2645209
atm1_bc <- BoxCox(atm1, 0)

This is close enough to zero to use the log transform.

Now we build and save the ARIMA models. Using auto.arima on each cross validation step is too computationally expensive so we need to build each model with the full dataset and store the order and seasonal terms. This will be passed to the Arima function so that parameter values are updated on each cross validation step.

mod1 <- auto.arima(atm1, stepwise = FALSE, allowmean = FALSE)
mod2 <- auto.arima(atm1, stepwise = FALSE, allowmean = FALSE, lambda = 0)

Now we’ll try to deal with the outliers in our first model. We’ll identify several types of outliers using the tsoutliers package. With these outliers we’ll create a one-hot encoded matrix for the index of each outlier, and then pass that to arima as an exogenous variable. The Arima function will perform a regression using this matrix and ARIMA on the residuals from that regression. This will effectively smooth over the outliers.

atm_tso <- tso(atm1,types=c( "AO", "LS", "TC", 'SLS'))
create_exos <- function (model, rows = 365){
  olers <- model$outliers[[2]]
  
  exos <- matrix(nrow = rows, ncol = length(olers), data = 0)
  
  for (i in 1:length(olers)){
    exos[olers[i], i] <- 1
  }
  
  return(exos)
}

create_zeros <- function (model, h){
  olers <- model$outliers[[2]]
  
  exos <- matrix(nrow = h, ncol = length(olers), data = 0)
  
  return(exos)
}

atm1_exos <- create_exos(atm_tso, 365)
atm1_zeros <- create_zeros(atm_tso, h = 31)

We’ll choose to use time auto.arima function to over the ARIMA function from tsoutliers because it is the best ARIMA algorithm available. In preliminary testing, this mehtod greatly outperformed using the ARIMA modle straight from the tso function.

mod3 <- auto.arima(atm1, include.mean = F, xreg = atm1_exos, stepwise = F)
mod4 <- auto.arima(atm1, include.mean = F, xreg = atm1_exos, stepwise = F, lambda = 0)

Test Models

atm1_olers <- atm_tso$outliers

compare_models <- function (base_data, lambda, exos, olers, model1, model2, model3, model4, ets_mod = 'ANA') {

  
  
  farima1 <- function(x, h) {
    forecast(Arima(base_data, order = get_order(model1)[[1]], seasonal = get_order(model1)[[2]]), h=h)
  }
  
  farima2 <- function(x, h) {
    forecast(Arima(base_data, order = get_order(model2)[[1]], seasonal = get_order(model2)[[2]], lambda = lambda, biasadj = T), h=h, lambda = lambda)
  }
  
  farima3 <- function(x, h) {
    dim <- sum(olers$ind <= length(x))
    if (dim == 0){
      forecast(Arima(base_data,  order = get_order(model3)[[1]], seasonal = get_order(model3)[[2]]), h=h)
    }else{
      forecast(Arima(base_data, xreg = exos[, dim], order = get_order(model3)[[1]], seasonal =       get_order(model3)[[2]]), h=h)
    }
  }
  
  farima4 <- function(x, h) {
    dim <- sum(olers$ind <= length(x))
    if (dim == 0){
      forecast(Arima(base_data,  order = get_order(model4)[[1]], seasonal = get_order(model4)[[2]], lambda = lambda, biasadj = T), h=h, lambda = lambda)
    }else{
      forecast(Arima(base_data, xreg = exos[, dim], order = get_order(model3)[[1]], seasonal =   get_order(model3)[[2]], lambda = lambda, biasadj = T), h=h, lambda = lambda)
    }
    
  }
  
  fets1 <- function(x,h){
    forecast(ets(base_data, ets_mod), h = h)
  }
  
  fets2 <- function(x,h){
    forecast(ets(base_data, ets_mod, lambda = lambda, biasadj = T), h = h, lambda = lambda, biasadj = T)
  }
  

  mean1 <- tsCV(base_data, farima1, h=7) %>% rmse
  mean2 <- tsCV(base_data, farima2, h=7) %>% rmse
  mean3 <- tsCV(base_data, farima3, h=7) %>% rmse
  mean4 <- tsCV(base_data, farima4, h=7) %>% rmse
  mean5 <- tsCV(base_data, fets1, h=7) %>% rmse
  mean6 <- tsCV(base_data, fets2, h=7) %>% rmse

  
  mse_df <- data.frame("Base Model" = round(mean1,2), "BC" = round(mean2, 2),"TSO" = round(mean3,2), "TSO and BC" = round(mean4,2),
                       'ETS' = round(mean5, 2), 'ETS and BC' = round(mean6,2))
  return (mse_df)
}
mse_df1 <- compare_models(atm1, 0, atm1_exos, atm1_olers, mod1, mod2, mod3, mod4)
mse_df1 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/atm1mse.csv') %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Base.Model BC TSO TSO.and.BC ETS ETS.and.BC
47.87 48.6 44.68 44.98 48.3 51.79

The ARIMA models appear best based on interpretability (over the NN) and prediction (over ETS). Of the ARIMA models, the Box-Cox transformation didn’t seem to help predictions so we eliminate those given they add complexity. We’re left with the Base Model and the TSO model.

Forecast Comparison

We now compare a forecast with a holdout set using the best two models, the TSO and and Base Models. The cross validated RMSE comparison should be more reliable, but this will give us some idea of whether one model is better at forecasting certain days. We’ll also have an idea if one model is underpredicting, which could be more of a probablem than overpredicting. We wouldn’t want one of the ATMs to run out of cash.

mod1_temp <- auto.arima(head(atm1, 325), allowmean = FALSE, num.cores = 4)
mod3_temp <- auto.arima(head(atm1, 325), allowmean = FALSE, num.cores = 4, xreg = head(atm1_exos, 325))

test1 <- forecast(mod1_temp, h = 40)
test2 <- forecast(mod3_temp, h = 40, xreg = tail(atm1_exos, 40))

mean1 <- mean((atm1[326:365] - test1$mean) ** 2)
mean2 <- mean((atm1[326:365] - test2$mean) ** 2)

mse_df2 <- data.frame(`Model1 MSE` = round(mean1, 2), `Model3 MSE` = round(mean2, 2))
mse_df2 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Model1.MSE Model3.MSE
268.86 171.17

The MSE is reamins somewhat lower with the TSO model when using holdout data, but his shouldn’t hold too much weight, as mentioned.

We now evaluate how each model handled the seasonality in the holdout set.

test_errors1 <- atm1[326:365] - test1$mean
test_errors2 <- atm1[326:365] - test2$mean

plot1 <- ggsubseriesplot(test_errors1) + labs(y = 'Test Error', title = 'Base Model')

  
plot2 <- ggsubseriesplot(test_errors2) + labs(y = 'Test Error', title = 'No Outliers Model')


grid.arrange(plot1, plot2, ncol = 2, nrow = 1) 

Both models handle Tuesday poorly, but the second model does a somewhat better job. Given that Tuesday was one of the days with a seasonal level shift and the base model underpredicts, this adds credence to using this method. The TSO model could be somewhat better at preventing the ATM from running out of cash.

Validate Models

summary(mod1)
## Series: atm1 
## ARIMA(0,0,1)(0,1,2)[7] 
## 
## Coefficients:
##          ma1     sma1     sma2
##       0.1976  -0.5770  -0.1077
## s.e.  0.0545   0.0506   0.0494
## 
## sigma^2 estimated as 554.4:  log likelihood=-1639.36
## AIC=3286.73   AICc=3286.84   BIC=3302.25
## 
## Training set error measures:
##                       ME     RMSE     MAE       MPE     MAPE      MASE
## Training set -0.07373961 23.22041 14.4814 -102.2182 117.0791 0.8210126
##                      ACF1
## Training set -0.008464868
checkresiduals(mod1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(0,1,2)[7]
## Q* = 14.85, df = 11, p-value = 0.1895
## 
## Model df: 3.   Total lags used: 14
summary(mod3)
## Series: atm1 
## Regression with ARIMA(0,0,4)(0,1,1)[7] errors 
## 
## Coefficients:
##          ma1     ma2     ma3     ma4     sma1    drift           
##       0.2395  0.1193  0.1002  0.1974  -0.4472  -0.0084  -117.4285
## s.e.  0.0581  0.0560  0.0572  0.0536   0.0533   0.1174    13.8109
##                                                                         
##       -95.2432  -121.6092  60.9133  -99.3000  -47.1889  42.5813  52.8908
## s.e.   13.7755    13.7352  13.8093   14.6328   13.9785  13.9895  14.4393
##                                           
##       39.9552  -84.6918  80.0213  -52.7787
## s.e.  14.7506   14.0960  14.7773   14.0744
## 
## sigma^2 estimated as 295.6:  log likelihood=-1517.99
## AIC=3073.97   AICc=3076.22   BIC=3147.7
## 
## Training set error measures:
##                       ME     RMSE      MAE       MPE     MAPE     MASE
## Training set 0.006559564 16.59303 11.75095 -36.49817 56.42534 0.666212
##                      ACF1
## Training set 0.0005893254
checkresiduals(mod3)

## 
##  Ljung-Box test
## 
## data:  Residuals from Regression with ARIMA(0,0,4)(0,1,1)[7] errors
## Q* = 17.031, df = 3, p-value = 0.0006965
## 
## Model df: 18.   Total lags used: 21

Model 1 Base ARIMA

The diagnostics look great, though the model isn’t very complex.

Model 3 TSO

There are 4 MA terms, but the coefficients are small, meaning several recent observations have some, but not a large effect on predictions. There is a drift term, but its value is not material. In 30 days, it would ammount to a less than 100 dollars.

The residuals show a signficant autocorrelation at lag 12 and it fails the Ljung-Box test.

Conclusion

Betweeen the drift term and the autocorrelated residuals, I would not feel comfortable deploying the TSO model, even though it predicts slightly better. Until these irregularities can be explained, the “base” ARIMA model will be best.

fcastatm1 <- forecast(mod1, h = 30)
fcastatm1 %>% autoplot()

ATM2

atm2 <- clean_atm('ATM2')

ap <- autoplot(atm2) + labs(y = 'Withdrawals (100s)', x = 'Week', title = '') 
ag <- ggsubseriesplot(atm2) + labs(title = '' , y = '') +   theme(
    axis.text.x = element_text(angle = 90, hjust = 1))
grid.arrange(ap, ag, ncol = 2, nrow = 1)

This time around we have more days showing a seasonal level shift, making outliers more of a problem.

This series follows a similar pattern to the last, but we still confirm it is not stationary with a unit root test.

ur.kpss(atm2)
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 2.0275

We apply the difference and test again.

atm2_diff <- diff(atm2, 7)

ggtsdisplay(atm2_diff)

ur.kpss(atm2_diff)
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0163

This looks like the series follows somewhere around a (0,0,0)(0,1,1) pattern.

Box-Cox

BoxCox.lambda(atm2)
## [1] 0.6737294

Though we get a parameter around1, we’ll round down to .5 in an attempt to stabalize variance. The Box-Cox lambda selection routine does not account for heteroskedacity; it only seeks to make the data as close as possible to a normal distribution. This is like how correlation is a guide for the strength of relationship between variables, but is not an absolute rule since it doesn’t account for non-linear relations.

Build Models

matm22 <- auto.arima(atm2, stepwise = FALSE, lambda = .5)
matm21 <- auto.arima(atm2, stepwise = FALSE, allowmean = FALSE)
atm_tso2 <- tso(atm2,types=c( "AO", "LS", "TC", 'SLS'))
atm2_exos <- create_exos(atm_tso2)
atm2_olers <- atm_tso2$outliers
matm23 <- auto.arima(atm2, xreg = atm2_exos, stepwise = F)
matm24 <- auto.arima(atm2, xreg = atm2_exos, stepwise = F, lambda = .5)

Test Models

mse_df2 <- compare_models(atm2, .5, atm2_exos, atm2_olers, matm21, matm22, matm23, matm24)
mse_df2 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/atm2mse.csv') %>%  kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Base.Model BC TSO TSO.and.BC ETS ETS.and.BC
51.19 52.06 57.76 58.74 52.8 52.96

Just as with ATM1, the Box-Cox transformation doesn’t help predictions, but this time neither does outlier removal. The base ETS and ARIMA models are the two candidates given their prediction strength and simplicity.

Forecast comparison

matm21_temp <- auto.arima(head(atm2, 325), stepwise = FALSE, allowmean = FALSE, num.cores = 4)
matam22_temp <- ets(head(atm2, 325))

test1 <- forecast(matm21_temp, h = 40)
test2 <- forecast(matam22_temp, h = 40)

mean1 <- mean((atm2[326:365] - test1$mean) ** 2)
mean2 <- mean((atm2[326:365] - test2$mean) ** 2)

mse_df2t <- data.frame(`ARIMA` = round(mean1, 2), `ETS` = round(mean2, 2))
mse_df2t %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
ARIMA ETS
661.88 412.63

ETS outperforms ARIMA in this case with the longer prediction horizon. As mentioned before, cross validation is more reliable, but the difference in cross validation was not material so this prediction should not be ignored.

test_errors1 <- atm2[326:365] - test1$mean
test_errors2 <- atm2[326:365] - test2$mean

plot1 <- ggsubseriesplot(test_errors1) + labs(y = 'Test Error', title = 'ARIMA')

  
plot2 <- ggsubseriesplot(test_errors2) + labs(y = 'Test Error', title = 'ETS')


grid.arrange(plot1, plot2, ncol = 2, nrow = 1) 

Validate Models

summary(matm21)
## Series: atm2 
## ARIMA(2,0,2)(0,1,1)[7] 
## 
## Coefficients:
##           ar1      ar2     ma1     ma2     sma1
##       -0.4241  -0.9128  0.4598  0.8005  -0.7487
## s.e.   0.0547   0.0420  0.0848  0.0572   0.0388
## 
## sigma^2 estimated as 586:  log likelihood=-1648.64
## AIC=3309.28   AICc=3309.52   BIC=3332.56
## 
## Training set error measures:
##                      ME     RMSE      MAE  MPE MAPE      MASE         ACF1
## Training set -0.8529648 23.80665 16.73152 -Inf  Inf 0.8239432 -0.002639386
checkresiduals(matm21)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,0,2)(0,1,1)[7]
## Q* = 9.0594, df = 9, p-value = 0.4318
## 
## Model df: 5.   Total lags used: 14
mets2 <- ets(atm2)
summary(mets2)
## ETS(A,N,A) 
## 
## Call:
##  ets(y = atm2) 
## 
##   Smoothing parameters:
##     alpha = 5e-04 
##     gamma = 0.364 
## 
##   Initial states:
##     l = 72.4338 
##     s = -58.4896 -38.503 16.367 12.0238 24.4642 15.5669
##            28.5707
## 
##   sigma:  24.969
## 
##      AIC     AICc      BIC 
## 4513.224 4513.845 4552.223 
## 
## Training set error measures:
##                     ME     RMSE      MAE  MPE MAPE      MASE       ACF1
## Training set -0.656803 24.65925 17.20917 -Inf  Inf 0.8474651 0.01150378
checkresiduals(mets2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 35.7, df = 5, p-value = 1.09e-06
## 
## Model df: 9.   Total lags used: 14

Model 1 ARIMA

This model has more complexity than the model for ATM1, but nothing looks out of place. -.9 would be normally be an extreme AR term, but it is offset by a .8 MA term. There is no pattern remaining in the residuals

Model 2 ETS

The gamma parameter of .364 is unusually high for ets models and is why the model is able to adapt to changing seasonality. There is some pattern remaining in the residuals, as seen in the ACF plot. This is possibly around the seasonal level shifts since we see a tapering wave-like pattern around week 43 when level shifts occurred. Given that there were no level shifts at the very end of the dataset, so this shouldn’t be a problem for forecasting the next month.

Conclusion

Given the virtual tie in cross validation and ETS model outperforming ARIMA better on the holdout set, the ETS model is slightly preferred.

fcastatm2 <- forecast(mets2, 31)
autoplot(fcastatm2) + labs(title = 'ATM2 Forecast', y = 'withdrawals (100s)', x = 'Week')

ATM3

With ATM3, we do not have enough data to construct an intricate model like ARIMA. ARIMA is based on patterns and with just 3 observations, there are not enough patterns that can be identified. In more technical terms, only AR1 and MA1 terms have 2 observations and the parameter estimate for these terms would have extreme variance.

We could use seasonality from the other models to estimate a seasonal pattern for this ATM, but due to the level shifts seen in the other ATMs, this would not be advisable.

This leaves us with a mean forecast and a naive forecast for options. We choose the mean forecast for two reasons:

  1. There is seasonality that is unaccounted for. If we carry Friday’s observation forward, we’re assuming every day will be like Friday, but that isn’t the case. By taking the average of the 3, we smooth over the potential seasonality to some degree. We’re less right on Friday, but more right every other day.

  2. For other ATMs, the non-seasonal pattern doesn’t follow a random walk. There is some mean reversion. If the observations are noise around a mean then then a mean forecast would be best.

filter(atm, ATM == 'ATM3') %>% tail(7) %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)
DATE ATM Cash
4/24/2010 12:00:00 AM ATM3 0
4/25/2010 12:00:00 AM ATM3 0
4/26/2010 12:00:00 AM ATM3 0
4/27/2010 12:00:00 AM ATM3 0
4/28/2010 12:00:00 AM ATM3 96
4/29/2010 12:00:00 AM ATM3 82
4/30/2010 12:00:00 AM ATM3 85

Conclusion

Forecasts, using the mean, can be seen below.

fcastatm3 <- meanf(tail(atm3,3), 31)
autoplot(fcastatm3) + labs(title = 'ATM3 Forecast', y = 'Withdrawals (100s)', x = 'week')

ATM4

atm4 <- clean_atm('ATM4')

autoplot(atm4) + labs(y = 'Withdrawals (100s)', x = 'Week', title = '', y = 'Withdrawls (100s)')

An observation that is more than an order of magnitude greater than every other observation can be pretty safely removed. This is especially true when considering the logistics of the situation; an ATM can not hold that much cash. We use the same method as before to fill missing values (seasonal interpolation).

atm4[which.max(atm4)] <- NA

atm4 <- na.seadec(atm4, algorithm = "interpolation")
autoplot(atm4) + labs(title = 'ATM4', y = 'Withdrawals (100s)', x ='Week')

ggsubseriesplot(atm4) + labs(title = 'ATM4', y = 'Withdrawals (100s)', x ='Week')

We see a change on Thursday at around the same time index as the previous two ATMs, but this is more of a seasonal pulse because Thursday returns to the orginal level by the end of the series.

Stationarity

ur.kpss(atm4) %>% summary()
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.1229 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

The series appears to be stationary, so a difference is not needed.

Box-Cox

BoxCox.lambda(atm4)
## [1] 0.4489024

We will try a square root transform, but there is no theoretical reason for the Box-Cox transformation to work, other than possibly variance stabalization.

Build Models

matm41 <- auto.arima(atm4, stepwise = FALSE)
matm42 <- auto.arima(atm4, stepwise = FALSE, lambda = .5)
matm43 <- ets(atm4)
matm44 <- ets(atm4, lambda = .5)

TSO

atm_tso4 <- tso(atm4, types=c( "AO", "LS", "TC"))
atm4_olers <- atm_tso4$outliers
atm4_olers
## [1] type    ind     coefhat tstat  
## <0 rows> (or 0-length row.names)

There are no outliers so this step is not necesary.

Test Models

compare_models2 <- function (base_data, lambda, model1, model2) {

  
  
  farima1 <- function(x, h) {
    forecast(Arima(base_data, order = get_order(model1)[[1]], seasonal = get_order(model1)[[2]]), h=h)
  }
  
  farima2 <- function(x, h) {
    forecast(Arima(base_data, order = get_order(model2)[[1]], seasonal = get_order(model2)[[2]], lambda = lambda, biasadj = T), h=h, lambda = lambda, biasadj = T)
  }
  
  
  fets1 <- function(x,h){
    forecast(ets(base_data, 'MNA'), h = h)
  }
  
  fets2 <- function(x,h){
    forecast(ets(base_data, 'ANA', lambda = 0, biasadj = T), h = h, lambda = 0, biasadj = T)
  }

  mean1 <- tsCV(base_data, farima1, h=7) %>% rmse
  mean2 <- tsCV(base_data, farima2, h=7) %>% rmse
  mean5 <- tsCV(base_data, fets1, h=7) %>% rmse
  mean6 <- tsCV(base_data, fets2, h=7) %>% rmse

  
  mse_df <- data.frame("Base Model" = round(mean1,2), "BC" = round(mean2, 2),
                       'ETS' = round(mean5, 2), 'ETS+BC' = round(mean6,2))
  return (mse_df)
}

mse_df4 <- compare_models2(atm4, 0, matm41, matm42)
mse_df4 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Base.Model BC ETS ETS.BC
355.36 394.97 391.43 428.52
read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/atm4mse.csv') %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Base.Model BC ETS ETS.BC
355.36 394.97 391.43 428.52

The two simplest models are the top candiates. They will evaluated in more detail.

Forecast Comparison

matm41_temp <- auto.arima(head(atm4, 325), stepwise = FALSE, num.cores = 4)
matm43_temp <- ets(head(atm4, 325))

test1 <- forecast(matm41_temp, h = 40)
test2 <- forecast(matm43_temp, h = 40)

mean1 <- sqrt(mean((atm4[326:365] - test1$mean) ** 2))
mean2 <- sqrt(mean((atm4[326:365] - test2$mean) ** 2))

mse_df4t <- data.frame(`ARIMA RMSE` = round(mean1, 2), `ETS RMSE` = round(mean2, 2))
mse_df4t %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
ARIMA.RMSE ETS.RMSE
282.79 326.35
test_errors1 <- atm4[326:365] - test1$mean
test_errors2 <- atm4[326:365] - test2$mean

plot1 <- ggsubseriesplot(test_errors1) + labs(y = 'Test Error', title = 'ARIMA')

  
plot2 <- ggsubseriesplot(test_errors2) + labs(y = 'Test Error', title = 'ETS')


grid.arrange(plot1, plot2, ncol = 2, nrow = 1) 

The difference is so stark that these graphs use a different scale on the y axis. We will only check that the ARIMA model is valid before stamping approval. The Tuesday predictions are quite poor, but that was probably due to a mini level shift.

summary(matm41)
## Series: atm4 
## ARIMA(1,0,0)(2,0,0)[7] with non-zero mean 
## 
## Coefficients:
##          ar1    sar1    sar2      mean
##       0.0838  0.1467  0.1147  443.3059
## s.e.  0.0523  0.0521  0.0525   26.2340
## 
## sigma^2 estimated as 118621:  log likelihood=-2648.37
## AIC=5306.74   AICc=5306.9   BIC=5326.24
## 
## Training set error measures:
##                      ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -0.2066266 342.5219 282.6796 -501.7264 533.9613 0.8160772
##                       ACF1
## Training set -0.0008499504
checkresiduals(matm41)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,0)(2,0,0)[7] with non-zero mean
## Q* = 15.965, df = 10, p-value = 0.1006
## 
## Model df: 4.   Total lags used: 14
summary(matm43)
## ETS(M,N,M) 
## 
## Call:
##  ets(y = atm4) 
## 
##   Smoothing parameters:
##     alpha = 0.0557 
##     gamma = 0.2341 
## 
##   Initial states:
##     l = 417.5298 
##     s = 0.0814 0.5882 1.2746 1.2056 1.3884 1.1836
##            1.2782
## 
##   sigma:  0.854
## 
##      AIC     AICc      BIC 
## 6404.106 6404.728 6443.105 
## 
## Training set error measures:
##                     ME     RMSE     MAE       MPE     MAPE      MASE
## Training set -30.41475 354.1178 276.572 -433.2215 465.2009 0.7984449
##                    ACF1
## Training set 0.05751325
checkresiduals(matm43)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,M)
## Q* = 13.242, df = 5, p-value = 0.02121
## 
## Model df: 9.   Total lags used: 14

Conclusion

Residuals show a bit of a pattern and they both fail the Ljung-Box Test. There is also noticeable skew in the distribution of residuals. However, residual diagnostics don’t always determine our course of action. The Box-Cox transformed models had slightly better residual diagnostics, but predicted worse:

checkresiduals(matm42)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,0)(2,0,0)[7] with non-zero mean
## Q* = 16.717, df = 10, p-value = 0.08087
## 
## Model df: 4.   Total lags used: 14

The skew means we can’t necessarily trust our prediction intervals and parameter significance. Neither of these affects RMSE.
The pattern means that we could eek out some more information from the model. This does not affect the relative RMSE with the Box-Cox model; untransformed still performs better. In the ATM1 evaluation we paid more attention to the diagnostics because the TSO model is not as well documented as the standard ARIMA. It might fail us in an unexpected manner. Standard ARIMA will not.

fcastatm4 <- forecast(matm41, 31)
autoplot(fcastatm4) + labs(title = 'ATM4 Forecast', y = 'withdrawals (100s)', x = 'Week')

Power

The electricity data uses just one time series so we will not worry about making comparisons across series. It shows monthly residential electricity demand from 1998 until 2013. In this case, under-predictions are only slightly worse than over-predictions. Under-predicting could cause brown outs, but over-predicting will cost money. We will consider than when making our final decision.

For models we will consider the same as the ATM data:

  • Standard ARIMA
  • ARIMA with Box-Cox
  • TSO
  • TSO with Box-Cox
  • ETS
  • ETS with Box-Cox
electric <- read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/Electricdata.csv')


electric_ts <- ts(electric$KWH, frequency = 12, start = c(1998,1))

autoplot(electric_ts) + labs(y = 'KWH', title = 'Electricity Demand') 

There are some temporary shocks in electricity demand. These are legitimate, but unpredictable. Our method of dealing with time series outliers will also be used for this data.

Missing Data

The first order of business will be imputing the missing value. We again use seasonal interpolation.

missing_val <- which(is.na(electric_ts))
mod_el_imp <- auto.arima(head(electric_ts, missing_val - 1))
electric_ts[missing_val] <- as.vector(forecast(mod_el_imp, 1)$mean)

Seasonality

ggsubseriesplot(electric_ts) + labs(y = 'KWH', title = 'Electricity Demand')

There appears to be a trend pattern in the second half of each seasonal series. Both ETS and ARIMA should be able to handle that. The one advantage ARIMA has is being able to take exogenous variables and use our TSO strategy to handle the shock in July.

Stationarity and ACF

ggtsdisplay(electric_ts)

This series will definitely need to be differenced in the ARIMA model. So we apply a seasonal difference.

elec_diff <- diff(electric_ts, 12)
ggtsdisplay(elec_diff)

Based on the ACF plots, we should see somewhere around a (1,0,1)(2,1,1) model. Anything considerable different should be re-evaluated.

Box-Cox

A transformation could help with two features we see in the plot:

  • Stabilizing Variance
  • Seasonality not depending on level of the series

Based on the optimal lambda returned from the function, we will use a log transform.

BoxCox.lambda(electric_ts)
## [1] 0.1227932

Build Models

melec1 <- auto.arima(electric_ts)
melec2 <- auto.arima(electric_ts, lambda = 0)
electric_tso <- tso(electric_ts,types=c( "AO", "LS", "TC", 'SLS'))
ols <- electric_tso$outliers
exos_elec <- create_exos(electric_tso, 192)
melec3 <- auto.arima(electric_ts, xreg = exos_elec)
melec4 <- auto.arima(electric_ts, xreg = exos_elec, lambda = 0)

Test Models

We use the almost the same process as the ATM models because the model types to be evaluated are the same.

lambda <- 0
  
  farima1 <- function(x, h) {
    forecast(Arima(electric_ts, order = get_order(melec1)[[1]], seasonal = get_order(melec1)[[2]]), h=h)
  }
  
  farima2 <- function(x, h) {
    forecast(Arima(electric_ts, order = get_order(melec2)[[1]], seasonal = get_order(melec2)[[2]], lambda = lambda), h=h, lambda = lambda)
  }
  
  farima3 <- function(x, h) {
    dim <- sum(ols$ind <= length(x))
    if (dim == 0){
      forecast(Arima(electric_ts,  order = get_order(melec3)[[1]], seasonal = get_order(melec3)[[2]]), h=h)
    }else{
      forecast(Arima(electric_ts, xreg = exos_elec[, dim], order = get_order(melec3)[[1]], seasonal = get_order(melec3)[[2]]), h=h)
    }
  }
  
  farima4 <- function(x, h) {
    dim <- sum(ols$ind <= length(x))
    if (dim == 0){
      forecast(Arima(electric_ts,  order = get_order(melec4)[[1]], seasonal = get_order(melec4)[[2]], lambda = lambda), h=h, lambda = lambda)
    }else{
      forecast(Arima(electric_ts, xreg = exos_elec[, dim], order = get_order(melec4)[[1]], seasonal = get_order(melec4)[[2]], lambda = lambda), h=h, lambda = lambda)
    }
    
  }
  
  fets1 <- function(x,h){
    forecast(ets(electric_ts), h = h)
  }
  
  fets2 <- function(x,h){
    forecast(ets(electric_ts, lambda = lambda), h = h, lambda = lambda)
  }
  

  mean1 <- tsCV(electric_ts, farima1, h=12) %>% rmse
  mean2 <- tsCV(electric_ts, farima2, h=12) %>% rmse
  mean3 <- tsCV(electric_ts, farima3, h=12) %>% rmse
  mean4 <- tsCV(electric_ts, farima4, h=12) %>% rmse
  mean5 <- tsCV(electric_ts, fets1, h=12) %>% rmse
  mean6 <- tsCV(electric_ts, fets2, h=12) %>% rmse

  
mse_df5 <- data.frame("Base Model" = round(mean1,2), "BC" = round(mean2, 2),"TSO" = round(mean3,2), "TSO and BC" = round(mean4,2),
                       'ETS' = round(mean5, 2), 'ETS and BC' = round(mean6,2))

mse_df5 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/elecmse.csv') %>%  kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Base.Model BC TSO TSO.and.BC ETS ETS.and.BC
2111292 1601145 2398081 2301340 2256008 2003663

This data has best theoretical reason for the Box-Cox transformation to help, seasonality depends on level, which is likely why we see the BC models outperform all of their conterparts. We’ll choose the ETS and ARIMA models to further test, both with log transformations applied.

melec1_temp <- auto.arima(head(electric_ts, 168), stepwise = FALSE, num.cores = 4, lambda = 0)
melec2_temp <- ets(head(electric_ts, 168), lambda = 0)

test1 <- forecast(melec1_temp, h = 24)
test2 <- forecast(melec2_temp, h = 24)

mean1 <- sqrt(mean((electric_ts[169:192] - test1$mean) ** 2))
mean2 <- sqrt(mean((electric_ts[169:192] - test2$mean) ** 2))

mse_temp3 <- data.frame(`ARIMA RMSE` = round(mean1, 2), `ETS RMSE` = round(mean2, 2))
mse_temp3 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
ARIMA.RMSE ETS.RMSE
2011325 1300648

This is a big enough difference to take note.

test_errors1 <- electric_ts[169:192]- test1$mean
test_errors2 <- electric_ts[169:192] - test2$mean

plot1 <- ggsubseriesplot(test_errors1) + labs(y = 'Test Error', title = 'ARIMA')

  
plot2 <- ggsubseriesplot(test_errors2) + labs(y = 'Test Error', title = 'ETS')


grid.arrange(plot1, plot2, ncol = 2, nrow = 1) 

It should be noted that the ARIMA model underpredicts much more often than the ETS model. We had mentioned earlier that underpredictions could be devestating, given they could cause blackouts.

summary(melec2)
## Series: electric_ts 
## ARIMA(0,1,2)(0,0,2)[12] 
## Box Cox transformation: lambda= 0 
## 
## Coefficients:
##           ma1      ma2    sma1    sma2
##       -0.7551  -0.2225  0.2146  0.1962
## s.e.   0.0719   0.0708  0.0773  0.0660
## 
## sigma^2 estimated as 0.05525:  log likelihood=5.67
## AIC=-1.33   AICc=-1.01   BIC=14.93
## 
## Training set error measures:
##                    ME    RMSE      MAE       MPE     MAPE     MASE
## Training set 259952.2 1220740 874916.7 -2.615399 16.59105 1.275419
##                   ACF1
## Training set 0.1614835
checkresiduals(melec2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,2)(0,0,2)[12]
## Q* = 48.067, df = 20, p-value = 0.0004163
## 
## Model df: 4.   Total lags used: 24
melec6 <- ets(electric_ts, lambda = 0)
summary(melec6)
## ETS(A,N,A) 
## 
## Call:
##  ets(y = electric_ts, lambda = 0) 
## 
##   Box-Cox transformation: lambda= 0 
## 
##   Smoothing parameters:
##     alpha = 0.0442 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 15.5848 
##     s = -0.0286 -0.2531 -0.1097 0.1793 0.2659 0.0773
##            0.0164 -0.2288 -0.1927 -0.0551 0.0914 0.2377
## 
##   sigma:  0.1997
## 
##      AIC     AICc      BIC 
## 406.2832 409.0105 455.1457 
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 225357.2 883406.9 565558.4 -1.432896 11.95219 0.8244489
##                   ACF1
## Training set 0.2465017
checkresiduals(melec6)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 9.8757, df = 10, p-value = 0.4515
## 
## Model df: 14.   Total lags used: 24

Model 1 ARIMA + BC

It is clear based on the pattern in residuals that a seasonal difference is needed; though none of these values actually exceeds the critical value, the pattern repeats over the lags tested. However, in offline, testing the cross validation RMSE for a model with a seasonal difference term and no standard difference term was about 15% higher. Again, we are choosing prediction over theory.

Model 2 ETS + BC

The ETS model’s residuals look like white noise. The gamma parameter is quite low, indicating the ETS model is not very adaptable to changes in seasonality.

Conclusion

There were somewhat conflicting indicators between the ETS and ARIMA models. In the end we will go with ARIMA because:

  • The ETS defeated ARIMA over a 24 month horizon, but we are predicting 12 months. Cross validation was done on 12 month predictions and ARIMA outperformed by around 20%
fcast_elec <- forecast(melec2, 12)
autoplot(fcast_elec) + labs(title = 'Electricty Forecast ARIMA + BC', y = 'KWH', x = 'Year')

Pipes Data

The pipes dataset is less straightforward than our other time series.

  • The data must be aggregated by hour before input into a model
  • Seasonality is more difficult to establish and estimate
  • There are more observations making CV more difficult

Data Cleaning

standardize <- function(x) (x)/sd(x)
pipe1_raw <- read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/Waterflow_Pipe1.csv')
pipe2_raw <- read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/Waterflow_Pipe2.csv')

pipe1_test <- ts(pipe1_raw$WaterFlow, frequency = 24)
pipe2_test <- ts(pipe2_raw$WaterFlow, frequency = 24)

pipe2_sd <- sd(pipe2_raw$WaterFlow)

pipe1_temp <- pipe1_raw %>%
  mutate(dt = floor_date(mdy_hm(`Date Time`), unit = 'hour'), WaterFlow = standardize(WaterFlow))

pipe2_temp <- pipe2_raw %>%
  mutate(dt = floor_date(mdy_hm(`Date Time`), unit = 'hour'), WaterFlow = standardize(WaterFlow))

all_pipes <- union_all(pipe1_temp, pipe2_temp) %>%
  group_by(dt) %>%
  summarize(mean_flow = mean(WaterFlow))

water_ts1 <- ts(all_pipes$mean_flow, frequency = 24)

p1 <- autoplot(pipe1_test) + labs(title = 'Pipe1 Raw', y = '')
p2 <- autoplot(pipe2_test) + labs(title = 'Pipe2 Raw', y = '')
p3 <- autoplot(water_ts1) + labs(title = 'Aggregated Pipes', y = '')
grid.arrange(p1, p2,p3, nrow = 2, ncol = 2)

The flow from the two pipes has different levels. If we aggregate without a transformation there is an apparent upward trend until the data from pipe1 ends. What we want to assume is that waterflow from the two pipes follows the same pattern of change, just at different levels. To make this work, we scale the data from each pipe by is standard deviation. Now that the two pipes are on the same scale, they can be treated as one.

We confirm no values are missing by testing whether all days have 24 hours. Only this final day should be below 24 hours.

test_df <- all_pipes %>% group_by(date(dt)) %>%
  summarize(hrs = n())

sum(test_df$hrs != 24)
## [1] 1

Seasonality Evaluation

ggsubseriesplot(water_ts1) + labs(title = 'Pipes Data', y = 'Waterflow', x ='Day')

ggtsdisplay(water_ts1)+ labs(title = 'Pipes Data', y = 'Waterflow', x ='Day')

It is difficult to determine if this seasonal pattern is white noise or not. From the subseries plot, the within-hour pattern looks like white noise, but the ACF plot has spikes at lag24, giving us conflicting stories. Indstead of seasonal level shifts in the seasonal pattern, there is a more whitenoise-like pattern with mean reversion. It is possible the smoothing of stl will help model seasonality. We will, therefore, add stl + ARIMA to our list of models.

When we plot the stl decomposition of the data, we see the seasonal portion gradually, rather than abrubtly, change over time. ETS and ARIMA also can model gradual changes, but we have a little more control with stl by setting the seasonal window.

plot(stl(water_ts1, s.window = 13))

Stationarity

ggtsdisplay(water_ts1)

ur.kpss(water_ts1) %>% summary()
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 7 lags. 
## 
## Value of test-statistic is: 0.075 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

The series passes the stationarity test, and given the level of mean reversion, this is not a surprise; the series is extremely choppy.

Box-Cox

BoxCox.lambda(water_ts1)
## [1] 0.7574478

We use a log transform despite the results of the BoxCox.lambda function because the heteroskedacity. During the time when we had obvervations from both pipes, the variance is naturally lower (by a factor of $). We can see that visually in the timeplot. We choose the log transform because it will cause the highest degree of stabalization.

Build Models

mwater11 <- auto.arima(water_ts1)

mwater12 <- auto.arima(water_ts1, lambda = 0)
mwater13 <- stlm(water_ts1, s.window = 13)

Outliers

water1_tso <- tso(water_ts1, types=c( "AO", "LS", "TC"))
water1_tso$outliers
## [1] type    ind     coefhat tstat  
## <0 rows> (or 0-length row.names)

The TSO package is not able to identify any outliers. The TSO model will not be needed for this evaluation

Test Models

ETS isn’t a good model to evaluate longer term seasonality. We will replace ETS with an STL + ARIMA model. The TSO model isn’t of use, given the lack of outliers, so we will create the following models:

  • ARIMA
  • ARIMA + BC
  • STL + ARIMA
  • STL + ARIMA + BC
water_sub <- window(water_ts1, start = 4, end = 24)


farima_w1 <- function(x, h){
  forecast(auto.arima(water_sub), h = h)
}

farima_w2 <- function(x, h){
  forecast(auto.arima(water_sub, lambda = 0), h = h, lambda = 0)
}


farima_w3 <- function(x, h) {
    forecast(stlm(water_sub, method = 'arima', s.window = 13), h=h)
}

farima_w4 <- function(x, h) {
    forecast(stlm(water_sub, method = 'arima', lambda = 0, s.window = 13), h=h, lambda = 0)
}


mean1 <- tsCV(water_sub, farima_w1, h=24, window = 300) %>% rmse
mean2 <- tsCV(water_sub, farima_w2, h=24, window = 300)%>% rmse
mean3 <- tsCV(water_sub, farima_w3, h=24, window = 300)%>% rmse
mean4 <- tsCV(water_sub, farima_w4, h=24, window = 300)%>% rmse


mse_df6 <- data.frame('Base ARIMA' = round(mean1,2), 'BC+ARIMA' = round(mean2, 2), 'STL' = round(mean3,2),
                      'STL+BC' = round(mean4,2))
mse_df6 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
read_csv('https://raw.githubusercontent.com/TheFedExpress/Data/master/watermse.csv') %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
Base.ARIMA BC.ARIMA STL STL.BC
0.99 0.99 1.03 1.05

The simpler model outperforms the more complex model in this case. We’ll also compare the STL model to the standard ARIMA on a holdout set, given the concessions we had to make to get tsCV to work.

Forecast Comparison

We choose 1 week as the length of our holdout set for this data. This is the length of time we will ultimately need to predict on and is a good percentage of the dataset length for a time series.

mwater_temp <- auto.arima(head(water_ts1, 833), num.cores = 4)
mwater_temp2 <- stlm(head(water_ts1, 833), s.window = 13)

test1 <- forecast(mwater_temp, h = 168)
test2 <- forecast(mwater_temp2, h = 168)

mean1 <- sqrt(mean((water_ts1[834:1001] - test1$mean) ** 2))
mean2 <- sqrt(mean((water_ts1[834:1001] - test2$mean) ** 2))

mse_temp5 <- data.frame('ARIMA' = round(mean1, 2), 'ARIMA+STL' = round(mean2, 2))
mse_temp5 %>% kable () %>% kable_styling(bootstrap_options = "striped", full_width = F)
ARIMA ARIMA.STL
1.04 1.1

The simpler model outperforming has been the pattern. Standard ARIMA outperforms on both tests.

Model Evaluation

pipe_data_new <- water_ts1 * pipe2_sd

mwater_new <- auto.arima(pipe_data_new )
summary(mwater11)
## Series: water_ts1 
## ARIMA(0,0,0)(0,0,1)[24] with non-zero mean 
## 
## Coefficients:
##         sma1    mean
##       0.0737  2.4578
## s.e.  0.0316  0.0304
## 
## sigma^2 estimated as 0.807:  log likelihood=-1312.07
## AIC=2630.14   AICc=2630.16   BIC=2644.86
## 
## Training set error measures:
##                         ME      RMSE      MAE       MPE     MAPE      MASE
## Training set -8.787421e-05 0.8974075 0.706295 -27.36252 47.38482 0.7347257
##                     ACF1
## Training set -0.01807167
checkresiduals(mwater11)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,0)(0,0,1)[24] with non-zero mean
## Q* = 64.039, df = 46, p-value = 0.04034
## 
## Model df: 2.   Total lags used: 48
checkresiduals(mwater13)

Model 1 Standard ARIMA

We have an extremely simple model with just one coefficient, but that shouldn’t be too much of a surprise given the look of the data. There is stil some pattern in the residuals, indicating some informtation could still be uncovered.

Model 2 ARIMA + STL

Especially notable is the spike at 24 hours. Seasonality was hard for us to deal with in this data, but it appears we did the correct thing; nothing. The STL decomposition seems to have overfit the seasonal adjustment and induced some sort of autocorrelation.

Conclusion

We will forecast with our extremly simple SMA model. It had the best results and the best diagnostics. Since we scaled the data, it would need to be rescaled to form true forecasts. We’re assuming these forecasts are for pipe2, so we’ll multiply by the standard deviation of pipe2 waaterflow. To forecast for pipe1, the data could be multiplied by pipe1’s standard deviation.

fcast_pipes <- forecast(mwater_new, 168)
autoplot(fcast_pipes) + labs(title = 'WaterFlow Forecast ARIMA', y = 'WaterFlow', x = 'Day')