Introduction

Forecasting is the first important stage of workforce management planning. WFM forecasters predict, among other things, the volume of conversations to be expected in some future period. Forecasting future contacts is a difficult task because of the uncertainty in the many factors that drive contacts, as well as the way contacts are distributed over time.

Time series data is a sequence of observations recorded over consecutive units of time in chronological order. There are three broad approaches to time series forecasting.

Time series models - Time series models define the current value of a variable as a function of its own history. Thus, no other variables are involved.

Econometric models - Econometric models use additional explanatory variables to model the variation in the target variable.

Hybrid models - Hybrid models combine the two, enjoying the best of both worlds.

While there is no single best method for forecasting, there are widely used methods and best practices. In this blog, we will discuss some of the most commonly used time series models and hybrid approaches. We will also cover metrics that indicate the accuracy of a forecast.

The accuracy() function from the forecast package reports mean error (ME), root mean squared error (RMSE), mean absolute error (MAE), mean percentage error (MPE), mean absolute percentage error (MAPE), mean absolute scaled error (MASE), the first-order autocorrelation of the residuals (ACF1), and Theil’s U.

MAPE is a popular accuracy measure since it is insensitive to the sign of the error and independent of the scale of the data. We will use MAPE to compare the accuracy of the methods discussed.
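For reference, MAPE is the mean of the absolute percentage errors. A minimal sketch (mape here is an illustrative helper, not a forecast-package function):

# Illustrative: MAPE = mean(|actual - forecast| / actual) * 100
mape <- function(actual, forecast) {
  mean(abs((actual - forecast) / actual)) * 100
}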

EDA

Loading the data

HistoricalData <- read.csv("historicalData.csv")

Data Description

The data we will use is historical monthly volume data from Jan 2019 to Dec 2022 from a hypothetical contact center for a single channel.

  • Period - first day of the month representing the particular month
  • Email - the number of email contacts
  • Phone - the number of phone call contacts
HistoricalData$Period <- as.Date(HistoricalData$Period,format = "%d/%m/%Y")
HistoricalData$Email <- as.numeric(HistoricalData$Email)
HistoricalData$Phone <- as.numeric(HistoricalData$Phone)
str(HistoricalData)
## 'data.frame':    48 obs. of  3 variables:
##  $ Period: Date, format: "2019-01-01" "2019-01-02" ...
##  $ Email : num  12322 10160 9827 8986 9798 ...
##  $ Phone : num  5544 4368 4422 3863 4507 ...

Our goal is to forecast the email contacts for the next 12 months. The phone contacts will be used as an additional variable when we discuss hybrid models.

Let’s define the Email column as a time series object. The ts() function from the stats package does just that.

data_ts <- ts(HistoricalData[,2], frequency = 12) # 12 since monthly data
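Note that we did not supply a start time, so the series is indexed from year 1; plots and forecast tables below will show labels like “Jan 4” (January of the fourth year, i.e., January 2022) rather than calendar dates. Optionally:

# data_ts <- ts(HistoricalData[,2], start = c(2019, 1), frequency = 12) # calendar-aware labels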

Time Series Plots

Time series plots help us identify outliers in the data; outliers need to be handled before modeling.
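If outliers are present, one option (a hedged sketch; whether to use it depends on whether the spikes reflect genuine demand) is tsclean() from the forecast package, which replaces identified outliers and fills missing values by interpolation:

# data_ts_clean <- forecast::tsclean(data_ts) # replace outliers via interpolation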

To plot the time series data, we will use the fpp2 package.

# install.packages("fpp2")
library(fpp2) 
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## -- Attaching packages ---------------------------------------------- fpp2 2.4 --
## v ggplot2   3.3.5     v fma       2.4  
## v forecast  8.15      v expsmooth 2.3
## 
autoplot(data_ts)  # time series plot

ggseasonplot(data_ts,main = "Season Plot") # time series plot for each season

ggseasonplot(data_ts, polar = T,main = "Polar Season Plot") # polar season plot

Data partitioning

It is important to test your models on historical data. We will therefore divide our historical data into train and test samples. The true test data is, of course, unknown and lies in the future; however, testing on out-of-sample data gives you confidence that the model will be robust against the future reality as well.

train <- subset(data_ts, end = length(data_ts) - 12) # 2019 through 2021
test <- subset(data_ts, start = length(data_ts) - 12 + 1) # 2022 data

Methods

1. Naive Method / Random Walk without Drift

The naive() function from the forecast package can be used for naive forecasts. A naive forecast simply extends the last observed value over the selected forecast horizon.

fc_naive <- naive(train, 12) # forecast 12 periods ahead

autoplot(fc_naive) # train, forecast together. 

We can check the error band and the residuals.

summary(fc_naive)
## 
## Forecast method: Naive method
## 
## Model Information:
## Call: naive(y = train, h = 12) 
## 
## Residual sd: 1772.4178 
## 
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 6.457143 1772.418 1343.314 -1.167537 12.18943 0.8223115
##                     ACF1
## Training set -0.08304126
## 
## Forecasts:
##       Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## Jan 4          12548 10276.555 14819.44 9074.1250 16021.87
## Feb 4          12548  9335.692 15760.31 7635.1989 17460.80
## Mar 4          12548  8613.742 16482.26 6531.0721 18564.93
## Apr 4          12548  8005.111 17090.89 5600.2501 19495.75
## May 4          12548  7468.895 17627.10 4780.1794 20315.82
## Jun 4          12548  6984.119 18111.88 4038.7789 21057.22
## Jul 4          12548  6538.322 18557.68 3356.9908 21739.01
## Aug 4          12548  6123.384 18972.62 2722.3978 22373.60
## Sep 4          12548  5733.666 19362.33 2126.3751 22969.62
## Oct 4          12548  5365.061 19730.94 1562.6428 23533.36
## Nov 4          12548  5014.470 20081.53 1026.4602 24069.54
## Dec 4          12548  4679.485 20416.52  514.1441 24581.86
checkresiduals(fc_naive) # Do they look like white noise? 

## 
##  Ljung-Box test
## 
## data:  Residuals from Naive method
## Q* = 8.5062, df = 7, p-value = 0.2901
## 
## Model df: 0.   Total lags used: 7

2. Seasonal Naive Method (Seasonal Random Walk)

Instead of extending the very last observation of the training data, the seasonal naive method extends the last observed value from the same season, i.e., each month is forecast with its value from the previous year.

fc_snaive <- snaive(train, 12)
autoplot(fc_snaive) # captures some seasonality in the data

# error band and the residuals
checkresiduals(fc_snaive) # does the residual look like white noise?

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 19.273, df = 7, p-value = 0.007373
## 
## Model df: 0.   Total lags used: 7
summary(fc_snaive)
## 
## Forecast method: Seasonal naive method
## 
## Model Information:
## Call: snaive(y = train, h = 12) 
## 
## Residual sd: 2008.8533 
## 
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE MASE      ACF1
## Training set 861.4167 2008.853 1633.583 6.104395 13.31924    1 0.6081251
## 
## Forecasts:
##       Point Forecast     Lo 80    Hi 80    Lo 95    Hi 95
## Jan 4          12770 10195.551 15344.45  8832.72 16707.28
## Feb 4          10433  7858.551 13007.45  6495.72 14370.28
## Mar 4          10823  8248.551 13397.45  6885.72 14760.28
## Apr 4          10110  7535.551 12684.45  6172.72 14047.28
## May 4          12970 10395.551 15544.45  9032.72 16907.28
## Jun 4          15306 12731.551 17880.45 11368.72 19243.28
## Jul 4          15637 13062.551 18211.45 11699.72 19574.28
## Aug 4          15923 13348.551 18497.45 11985.72 19860.28
## Sep 4          12102  9527.551 14676.45  8164.72 16039.28
## Oct 4          11747  9172.551 14321.45  7809.72 15684.28
## Nov 4          11358  8783.551 13932.45  7420.72 15295.28
## Dec 4          12548  9973.551 15122.45  8610.72 16485.28
# accuracy on the test set
accuracy(fc_snaive, data_ts)['Test set','MAPE']
## [1] 0

3. Holt-Winters exponential smoothing method (triple exponential smoothing)

The ses(), holt(), and hw() functions from the forecast package can be used for exponential smoothing. We need to check whether our data has trend and/or seasonality to choose among them. Or we can be lazy and use the ets() function from the same package, which works well in all of these cases.

  1. Simple exponential smoothing - no trend, no seasonality
fc_ses <- ses(train, h = 12) # only for a stationary ts
autoplot(fc_ses) # forecast plot

  2. Exponential smoothing with trend (Holt’s method)
fc_ses_t <- holt(train,h = 12)
autoplot(fc_ses_t) 

  3. Exponential smoothing with trend and seasonality (Holt-Winters)
fc_ses_ts_m <- hw(train, seasonal = "multiplicative", h = 12) # multiplicative seasonality
fc_ses_ts_a <- hw(train, seasonal = "additive", h = 12) # additive seasonality

autoplot(fc_ses_ts_m) 

autoplot(fc_ses_ts_a)

Accuracy

accuracy(fc_ses_ts_m, data_ts)['Test set','MAPE']
## [1] 8.407766

4. ETS - Error, Trend, Seasonality

The ets() function performs automatic forecasting with exponential smoothing and works for a wide range of time series.

fitets <- ets(train) # automatically selected exponential smoothing model

fc_ETS <- forecast(fitets, h = 12) # forecast 12 periods ahead
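As with the previous methods, we can plot the forecast:

autoplot(fc_ETS)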

Accuracy

accuracy(fc_ETS, data_ts)['Test set','MAPE']
## [1] 11.98624

5. Box-Jenkins - ARIMA/SARIMA/SARIMAX

Box-Jenkins methods are popular choices for short-term forecasts. They should not be used beyond a few periods, since the point forecasts eventually settle to a constant value.

Box-Jenkins models require a stationary time series. A stationary series has no trend or seasonality, and its mean and variance are constant over time. Therefore, it is important to remove those components from the time series, in other words to stationarize the data.

A signal can be decomposed into a DC and an AC component. (The DC component is also called the level, and the AC components can further be decomposed into trend, seasonality, and noise.)

Trend - the general long-term change in direction (up/down)

Seasonality - regular increases and decreases tied to the calendar (time of day, day of week, month of year, etc.)
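We can inspect these components directly with a seasonal decomposition; a quick sketch using stl() from the stats package:

train %>% stl(s.window = "periodic") %>% autoplot() # trend, seasonal, and remainder panels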

Stationarity Tests

The Dickey-Fuller test and the augmented Dickey-Fuller (ADF) test are the commonly used tests for stationarity. The KPSS test, the Phillips-Perron (PP) test, and the white noise test (Ljung-Box test) can also be used.

library(tseries)

adf.test(train) # augmented Dickey-Fuller test
## 
##  Augmented Dickey-Fuller Test
## 
## data:  train
## Dickey-Fuller = -2.8081, Lag order = 3, p-value = 0.2587
## alternative hypothesis: stationary

Interpretation: since the p-value is not less than 0.05, we fail to reject the null hypothesis of non-stationarity. This means the time series is non-stationary.

Data stationarization - stabilize the variance

  1. Box-Cox transformations

Try various lambda values until the variance of the plotted series looks stable.

# Box-Cox transformations
train %>% BoxCox(lambda = .3) %>% autoplot()

train %>% BoxCox(lambda = -0.7) %>% autoplot()

BoxCox.lambda(train) # the best lambda
## [1] -0.6863084
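Rather than transforming the series manually, the estimated lambda can be passed to most forecast-package modeling functions through their lambda argument, in which case forecasts are automatically back-transformed to the original scale. A minimal sketch:

# e.g., fit a model on the Box-Cox transformed scale:
# fit <- auto.arima(train, lambda = BoxCox.lambda(train))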
  2. Differencing
diff(train)
##     Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1       -2162  -333  -841   812   444  1663  2637     0 -4994   246  -407
## 2  1394 -2237  2531 -1791   709  1950   -48   415 -1178 -1789   487   401
## 3  2539 -2337   390  -713  2860  2336   331   286 -3821  -355  -389  1190
autoplot(diff(train)) # Plot of differenced data

The autocorrelation function plots can tell if there is remaining trend/seasonality.

ggAcf(diff(train)) 

  3. Seasonal differencing
difftrain <- diff(train, lag = 12)
  4. Logarithmic damping - used to reduce variance (a Box-Cox transformation with lambda = 0)
log_train <- log(train)

Automatic ARIMA models

auto.arima() is a robust function that can handle non-stationary time series. The algorithm finds the types and number of operations needed to stationarize the data and applies them accordingly.

fit <- auto.arima(train, seasonal = TRUE) # seasonal = FALSE for non-seasonal data
fc_arima<- fit %>% forecast(h = 12)
autoplot(fc_arima)

Accuracy

accuracy(fc_arima, data_ts)['Test set','MAPE']
## [1] 18.73199

ARIMA models

We can also define customized ARIMA models by specifying the parameters order = (p, d, q) and seasonal = (P, D, Q). The parameters are found in the process of stationarizing the series; d and D can also be estimated automatically, as shown in the sketch after this list.

  • p - number of autoregressive lags (AR part)

  • d - number of differences needed

  • q - number of lagged forecast errors (MA part)

  • P - number of seasonal autoregressive lags

  • D - number of seasonal differences needed

  • Q - number of seasonal moving-average lags
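The forecast package can estimate the differencing orders directly:

ndiffs(train) # estimated number of first differences (d)
nsdiffs(train) # estimated number of seasonal differences (D)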

train %>% Arima(order = c(0, 1, 0), seasonal = c(0, 1, 0), include.constant = FALSE) %>% forecast() %>% autoplot()

Hybrid Method

In this section, we will see how we can combine econometric approaches and time series approaches. We will need a second variable at this stage.

Dynamic regression with auto.arima

data_ts_all <- ts(HistoricalData[,c(2,3)], frequency = 12)

data_ts_phoneTest <- data_ts_all[,"Phone"][37:48] # last 12 months of phone data, to be used later

Let’s plot them together

autoplot(data_ts_all, facets = TRUE) # each series in its own panel

Fit the model

fit <- auto.arima(data_ts_all[,'Email'], xreg = data_ts_all[,'Phone'], stationary = TRUE)

Checking the regression coefficient - a positive coefficient indicates a direct relationship.

# check the regression coefficient
coefficients(fit) 
##      sma1      xreg 
## 0.5753787 2.2429109

Forecast

fc_dynRegre <- forecast(fit, xreg = data_ts_phoneTest)

# Plot fc with x and y labels
autoplot(fc_dynRegre) + xlab("Month") + ylab("Email")

We can also allow for nonlinear relationships between the variables.

xreg <- cbind(Phone = data_ts_all[,'Phone'], 
              PhoneSquared = data_ts_all[,'Phone']^2, 
              PhoneCubed = data_ts_all[,'Phone']^3)

# Fit model
fit <- auto.arima(data_ts_all[, "Email"], xreg = xreg)


# Forecast fit 12 month ahead
fc_dynRegre_NL<- forecast(fit, 
                          xreg = cbind(data_ts_phoneTest, data_ts_phoneTest^2, data_ts_phoneTest^3))
## Warning in forecast.forecast_ARIMA(fit, xreg = cbind(data_ts_phoneTest, : xreg
## contains different column names from the xreg used in training. Please check
## that the regressors are in the same order.
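The warning arises because the new regressor matrix lacks the column names used in training. Naming the columns to match (newxreg is an illustrative name) avoids it:

newxreg <- cbind(Phone = data_ts_phoneTest,
                 PhoneSquared = data_ts_phoneTest^2,
                 PhoneCubed = data_ts_phoneTest^3)
fc_dynRegre_NL <- forecast(fit, xreg = newxreg) # same forecast, no warning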

Dynamic Harmonic Regression - using Fourier terms

# Set up harmonic regressors of order K
# K must not be greater than period/2

harmonics <- fourier(train, K = 1)

# Fit regression model with ARIMA errors
fit <- auto.arima(train, xreg = harmonics, seasonal = FALSE)

# Forecast the next 12 months
newharmonics <- fourier(train, K = 1, h = 12)
fc_fourier <- forecast(fit, xreg = newharmonics)

# Plot forecasts fc_fourier
autoplot(fc_fourier)
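Here we used K = 1 for simplicity. A common refinement, sketched below on the assumption that AICc is an acceptable selection criterion, is to try each admissible K and keep the one with the lowest AICc:

# Sketch: choose K (at most frequency/2 = 6 for monthly data) by minimizing AICc
aiccs <- sapply(1:6, function(K) {
  auto.arima(train, xreg = fourier(train, K = K), seasonal = FALSE)$aicc
})
which.min(aiccs) # the K with the lowest AICc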

Accuracy

accuracy(fc_fourier, data_ts)['Test set','MAPE']
## [1] 9.635415

TBATS model

TBATS combines several of the components seen so far: Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonality. It is great for time series with multiple seasonal patterns and strong trend.

fit <- tbats(train)

# Forecast the series for the next 12 months
fc_TBAT <- forecast(fit, h = 12)

# Plot the forecasts
autoplot(fc_TBAT)

Accuracy

accuracy(fc_TBAT, data_ts)['Test set','MAPE']
## [1] 6.212802

Conclusion

As we have seen above, complex approaches do not always give the best results. It is therefore necessary to try multiple methods and choose the one that works best for your problem. Forecasting is a process where models are continually fine-tuned for better accuracy. R can definitely make some of the hard work that comes with this easier.
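To make that comparison concrete, here are the test-set MAPE values reported above, collected in one place:

# Test-set MAPEs reported above (smaller is better)
results <- data.frame(
  Method = c("Seasonal naive", "Holt-Winters (multiplicative)", "ETS",
             "auto.arima", "Dynamic harmonic regression", "TBATS"),
  MAPE = c(0, 8.41, 11.99, 18.73, 9.64, 6.21)
)
results[order(results$MAPE), ]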