ARIMA?!

When a forecasting project is not one

Reza Hosseini

Outline

  • The problem

  • Why typical forecast methods won't work

  • Multiseasonal time series

  • What to do

The problem

Is it a forecast problem?

  • It seems so, but looking at the lag plots

plot of chunk unnamed-chunk-4

Is it a forecast problem?

  • It seems so, but looking at the lag plots

plot of chunk unnamed-chunk-5

Is it a forecast problem?

  • It seems so, but looking at the lag plots

plot of chunk unnamed-chunk-6

Is it a forecast problem?

  • It seems so, but looking at the lag plots

plot of chunk unnamed-chunk-7

Is it a forecast problem?

  • And the autocorrelation plot

plot of chunk unnamed-chunk-8
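
As a rough illustration of what an autocorrelation plot measures, the sketch below computes the sample autocorrelation at several lags and picks the strongest one. The toy series and the function name `autocorrelation` are assumptions made for illustration; they are not the data or code behind the slides.

```python
from statistics import fmean

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Toy daily series with a 7-day pattern (illustrative values only).
weekly_pattern = [10, 12, 11, 13, 15, 30, 25]
toy = weekly_pattern * 8

# Autocorrelation at lags 1..14; a peak at lag 7 points to weekly seasonality.
acf = {lag: autocorrelation(toy, lag) for lag in range(1, 15)}
best_lag = max(acf, key=acf.get)
```

On real data the peaks are noisier, so it is usually safer to look at the whole plot than at a single maximum.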

Weekly seasonality

  • The autocorrelation plot suggests weekly seasonality in the data
  • The missing days were imputed and the time series was decomposed

plot of chunk unnamed-chunk-9
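
The slides do not say how the imputation or the decomposition was done. The sketch below shows one simple possibility, assuming missing days are filled with the mean of the same weekday and the series is split into a constant level, a weekly seasonal profile, and a remainder. The function names and toy values are hypothetical; a real analysis would more likely use a trend-aware method such as STL.

```python
from statistics import fmean

def impute_by_weekday(values, period=7):
    """Fill None entries with the mean of observed values at the same cycle position."""
    filled = list(values)
    for pos in range(period):
        observed = [v for v in values[pos::period] if v is not None]
        if not observed:
            raise ValueError(f"no observations for cycle position {pos}")
        fill = fmean(observed)
        for i in range(pos, len(values), period):
            if filled[i] is None:
                filled[i] = fill
    return filled

def weekly_decompose(values, period=7):
    """Split a series into constant level + seasonal profile + remainder."""
    level = fmean(values)
    profile = [fmean(values[pos::period]) - level for pos in range(period)]
    seasonal = [profile[i % period] for i in range(len(values))]
    remainder = [v - level - s for v, s in zip(values, seasonal)]
    return level, seasonal, remainder

# Three toy weeks with two missing days (illustrative values only).
toy = [10, 12, 11, 13, 15, 30, 25,
       11, None, 12, 14, 16, 31, 26,
       9, 12, 10, None, 14, 29, 24]
filled = impute_by_weekday(toy)
level, seasonal, remainder = weekly_decompose(filled)
```

By construction, level plus seasonal plus remainder reproduces every imputed observation.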

Accuracy of the seasonality

plot of chunk unnamed-chunk-10

Forecasting via decomposition

plot of chunk unnamed-chunk-11

plot of chunk unnamed-chunk-12

Last three years

plot of chunk unnamed-chunk-13

Last three years

plot of chunk unnamed-chunk-14

  • One could remove the seasonality, forecast the remainder, then add the seasonality back

  • The best RMSE I could get with this approach was 330.15
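
A minimal sketch of the decompose, forecast, and recombine idea, together with the RMSE used to score it. The remainder model here is just the mean of the deseasonalized series, which is an assumption for illustration; the slides do not state which model was fitted to the remainder, and the toy numbers are unrelated to the reported 330.15.

```python
import math
from statistics import fmean

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(fmean([(a - p) ** 2 for a, p in zip(actual, predicted)]))

def seasonal_forecast(train, horizon, period=7):
    """Remove a seasonal profile, forecast the rest, then add the profile back."""
    level = fmean(train)
    profile = [fmean(train[pos::period]) - level for pos in range(period)]
    deseasonalized = [v - profile[i % period] for i, v in enumerate(train)]
    base = fmean(deseasonalized)  # placeholder model for the remainder
    start = len(train)
    return [base + profile[(start + h) % period] for h in range(horizon)]

# Toy data: four training weeks and one held-out week (illustrative values only).
weekly_pattern = [10, 12, 11, 13, 15, 30, 25]
train = weekly_pattern * 4
test = [11, 12, 11, 13, 15, 30, 25]

forecast = seasonal_forecast(train, horizon=len(test))
score = rmse(test, forecast)
```

Swapping the placeholder `base` for a fitted model of the deseasonalized series (for example ARIMA) keeps the rest of the pipeline unchanged.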

Multiseasonality approach

  • As there are two kinds of seasonality, one can treat the data as a multiseasonal time series and fit TBATS

plot of chunk unnamed-chunk-16

Forecast by TBATS

  • Reached an RMSE of 376

plot of chunk unnamed-chunk-17

Machine learning approach

  • School and bank holidays
  • Weekday and month name to capture seasonality
  • Weather data (temperature, wind, precipitation, ...)
  • New features
    • Monthly average temperature
    • Warmer than monthly average
    • Warmer than the previous day
    • Heat index
  • Adjust prices by consumer price index (CPI)
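
Some of the features listed above can be derived as in the sketch below. The field names (`date`, `temperature`, `price`), the CPI table, and the choice to average temperature per calendar month within each year are all assumptions for illustration; the slides do not specify the schema. The heat index is left out because it needs humidity and a specific formula.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def build_features(rows, cpi_by_year, base_year):
    """rows: dicts with 'date', 'temperature', 'price' (hypothetical schema), sorted by date."""
    temps_by_month = defaultdict(list)
    for r in rows:
        temps_by_month[(r["date"].year, r["date"].month)].append(r["temperature"])
    monthly_avg = {key: fmean(temps) for key, temps in temps_by_month.items()}

    features = []
    prev = None
    for r in rows:
        d = r["date"]
        avg = monthly_avg[(d.year, d.month)]
        # Only compare with the previous row when it really is the previous day.
        has_prev_day = prev is not None and (d - prev["date"]).days == 1
        features.append({
            "weekday": WEEKDAYS[d.weekday()],
            "month": MONTHS[d.month - 1],
            "monthly_avg_temp": avg,
            "warmer_than_monthly_avg": r["temperature"] > avg,
            "warmer_than_prev_day": has_prev_day and r["temperature"] > prev["temperature"],
            # Express the price in base-year money using the CPI ratio.
            "real_price": r["price"] * cpi_by_year[base_year] / cpi_by_year[d.year],
        })
        prev = r
    return features

# Toy rows and CPI values (illustrative only).
rows = [
    {"date": date(2020, 1, 1), "temperature": 2.0, "price": 100.0},
    {"date": date(2020, 1, 2), "temperature": 5.0, "price": 100.0},
    {"date": date(2020, 1, 3), "temperature": 2.0, "price": 100.0},
    {"date": date(2021, 1, 1), "temperature": 4.0, "price": 110.0},
]
features = build_features(rows, cpi_by_year={2020: 100.0, 2021: 110.0}, base_year=2020)
```

Note that a monthly average computed over the whole month uses days that lie in the future relative to each row, so for genuine forecasting it would need to come from history or from a weather forecast.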

Machine learning approach

  • Train a random forest to get feature importance
  • Use the most important features (99% cumulative importance)
  • Use gradient boosting (XGBoost) for the final prediction
  • Adjust Christmas and New Year's manually to the previous year's value

  • The final RMSE is 247.33 (269.89 without manual adjustment)
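
Two of the steps above can be sketched without any modelling library: keeping the top-ranked features up to 99% cumulative importance, and overriding holiday predictions with the previous year's value. The importance scores, the choice of 25 December and 1 January as the adjusted days, and the function names are assumptions for illustration; in practice the scores would come from the fitted random forest and the predictions from the boosted model.

```python
from datetime import date

def select_by_cumulative_importance(importances, threshold=0.99):
    """Keep the highest-ranked features until their cumulative share reaches `threshold`."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    selected, cumulative = [], 0.0
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= threshold:
            break
    return selected

def override_holidays(predictions, actuals_by_date, holidays=((12, 25), (1, 1))):
    """Replace predictions on the given (month, day) pairs with last year's actual value."""
    adjusted = dict(predictions)
    for d in predictions:
        if (d.month, d.day) in holidays:
            previous = date(d.year - 1, d.month, d.day)
            if previous in actuals_by_date:
                adjusted[d] = actuals_by_date[previous]
    return adjusted

# Hypothetical importance scores (not the values from the talk).
importances = {"weekday": 0.50, "temperature": 0.30, "school_holiday": 0.15,
               "month": 0.045, "wind": 0.005}
kept = select_by_cumulative_importance(importances)

# Hypothetical model output and history.
predictions = {date(2016, 12, 24): 900.0, date(2016, 12, 25): 850.0}
history = {date(2015, 12, 25): 120.0}
adjusted = override_holidays(predictions, history)
```

The override is deliberately blunt: it trusts last year's observation over the model on days the model sees too rarely to learn well.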

Machine learning approach

plot of chunk unnamed-chunk-18

Machine learning approach

plot of chunk unnamed-chunk-19

Thank you for your patience

With confidence interval

plot of chunk unnamed-chunk-20