Forecasting

Python is great, but when it comes to forecasting, R still has the upper hand. The Forecast package is the most complete forecasting package available on R or Python, and it’s worth knowing about it.

Here is what we will see in this article: 1. Naive methods 2. Exponential Smoothing (State-space models and DSHW) 3. BATS and TBATS 4. ARIMA/SARIMA models

How to set up a one-step-ahead forecast

For every method, we will build a model on a validation set, forecast with it for the duration of the validation set and compare the forecast with the real observations to obtain a Mean Absolute Percentage Error (MAPE).

library(forecast)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(MLmetrics)
## 
## Attaching package: 'MLmetrics'
## The following object is masked from 'package:base':
## 
##     Recall
data=AirPassengers

View(data)
str(data)
##  Time-Series [1:144] from 1949 to 1961: 112 118 132 129 121 135 148 148 136 119 ...
#Create samples

training=window(data, start = c(1949,1), end = c(1955,12))
validation=window(data, start = c(1956,1))

1. Naive Methods

Any forecasting method should be evaluated by being compared to a naive method. This helps ensure that the efforts put in having a more complex model are worth it in terms of performance. The simplest of all methods is called simple naive. Extremely simple: the forecast for tomorrow is what we are observing today. Another approach, seasonal naive, is a little more “complex”: the forecast for tomorrow is what we observed the week/month/year (depending what horizon we are working with) before. Here is how to do a seasonal naive forecast:

naive = snaive(training, h=length(validation))
MAPE(naive$mean, validation) * 100
## [1] 27.04689

That gives us an MAPE of 27.04%. That’s the score to beat. By the way, remove the s from “snaive” and you have the code for simple naive. Here is how to plot the forecast:

plot(data, col="blue", xlab="Year", ylab="Passengers", main="Seasonal Naive Forecast", type='l')
lines(naive$mean, col="red", lwd=2)

2. Exponential Smoothing

There are many ways to do exponential smoothing. The idea is always to have a declining weight given to observations. The more recent an observation, the more importance it will have in our forecast. Parameters can also be added. You can for instance add a trend paramenter (Holt method) or add a seasonality (Holt-Winters).

2.1 State Space Models

With the Forecast Package, smoothing methods can be placed within the structure of state space models. By using this structure, we can find the optimal exponential smoothing model, using the ets function.

ets_model = ets(training, allow.multiplicative.trend = TRUE)
summary(ets_model)
## ETS(M,Md,M) 
## 
## Call:
##  ets(y = training, allow.multiplicative.trend = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.6749 
##     beta  = 0.0082 
##     gamma = 1e-04 
##     phi   = 0.98 
## 
##   Initial states:
##     l = 119.9101 
##     b = 1.0177 
##     s = 0.9065 0.7973 0.9184 1.0524 1.1863 1.1987
##            1.0896 0.9798 0.9957 1.046 0.9167 0.9126
## 
##   sigma:  0.0395
## 
##      AIC     AICc      BIC 
## 726.4475 736.9706 770.2022 
## 
## Training set error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE      ACF1
## Training set 0.9845273 7.241276 5.726611 0.2712157 2.907857 0.2145244 0.0453492

We see ETS (M, Md, M). This means we have an ets model with multiplicative errors, a multiplicative trend and a multiplicative seasonality. Basically, multiplicative means that the parameter is “amplified” over time. For instance the trend is getting bigger and bigger as time goes by (our case here). Here is how to forecast using the estimated optimal smoothing model:

ets_forecast = forecast(ets_model, h=length(validation))
MAPE(ets_forecast$mean, validation) *100
## [1] 12.59147

We see that the upward trend in demand is being capture a little bit (far from perfect, better than naive). It gives an MAPE of 12.6%.

2.2 Double Seasonal Holt-Winters

The ets function is good, but it only allows for one seasonality. Sometimes, the data we have can be composed of multiple seasonalities (monthly and yearly for instance). Double Seasonal Holt-Winters (DSHW) allows for two seasonalities: a smaller one repeated often and a bigger one repeated less often. For the method to work however, the seasonalities need to be nested, meaning one must be an integer multiple of the other (2 and 4, 24 and 168, etc.). The code here is a bit different since we need to specify the lenghts of our two seasonalities (which is not always something we know) and the forecast is computed directly when creating the model with the dshw function.

dshw_model = dshw(training, period1=4, period2 = 12, h=length(validation))
MAPE(dshw_model$mean, validation)*100
## [1] 3.27219

We get a MAPE of 3.7% with this method!

3. BATS and TBATS

DSHW is good, but some processes have more complex seasonalities, which our previous functions cannot handle. Indeed, you could have both a weekly and yearly seasonality. You could even have more than 2! BATS and TBATS allow for multiple seasonalities. TBATS is a modification (an improvement really) of BATS that allows for multiple non-integer seasonality cycles. Here is how to build a TBATS model and forecast with it:

tbats_model = tbats(training)
tbats_forecast = forecast(tbats_model, h=length(validation))
MAPE(tbats_forecast$mean, validation) * 100
## [1] 12.84821

We get a MAPE of 12.9% for this method.

4. ARIMA/SARIMA models

ARIMA models contain three things: AR(p): autoregressive part of the model. Means that we use p past observations from the timeseries as predictors. Differencing (d): Used to transform the timeseries into a stationary one by taking the differences between successive observations at appropriate lags d. MA(q): uses q past forecast errors as predictors. That’s it for ARIMA but if you know the data you have is seasonal, then you need more. That’s where SARIMA comes into play. SARIMA adds a seasonal part to the model.

The auto.arima function can be used to return the best estimated model. Here is the code:

arima_optimal = auto.arima(training)

The function returned the following model: ARIMA(0,1,1)(1,1,0)[12]. To forecast a SARIMA model (which is what we have here since we have a seasonal part), we can use the sarima.for function from the astsa package.

library(astsa)
## 
## Attaching package: 'astsa'
## The following object is masked from 'package:forecast':
## 
##     gas
sarima_f <- sarima.for(training, n.ahead=length(validation),
                              p=0,d=1,q=1,P=1,D=1,Q=0,S=12)

MAPE(sarima_f$pred, validation) * 100
## [1] 6.544624

We get a MAPE of 6.5% with this SARIMA model.

Just so you know, we could in theory complexify things even more by adding exogenous variables (explanatory variables) to an ARIMA/SARIMA model, which would make it SARIMAX. For this data, DSHW gave the best results. Keep in mind however that no model does best all the time.