Time Series Forecasting

Author

Karim Abdul Aziz Chatab

I. Time series analysis of monthly turnover

I.I Training set time-series plot

show code
#Create a training set
train <- my_series %>% filter(Month < yearmonth("2019 Jan"))

#plot the time series with appropriate labels
plot = train %>% autoplot() + 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(1982 - 2020)") + ylab("Turnover ($ Millions)") + 
  xlab("Month")  + theme_minimal()
Plot variable not specified, automatically selected `.vars = y`
show code
#Original plot
plot

Very strong seasonality and trend component. handling this with seasonal differencing is required due to seasonality.First order differencing is required as well due to the linear trend. However, transformation is necessary to stabilize the variance, in order to create a simpler and consistent pattern for more accurate forecast.

I.II Initial Data Transformation Selection

show code
#Logarithmic Transformation
Log = train %>% autoplot(log(y)) + xlab("Month") + 
  ylab("Logarithmic Turnover") + ggtitle("Logarithmic Transformation") + theme_minimal()

#Squared-Root Transformation
SQR = train %>% autoplot(sqrt(y)) + xlab("Month") +
  ylab("Squared-Root of Turnover ") + ggtitle("Squared-Root Transformation") + theme_minimal()

#Inference Transformation
INF = train %>% autoplot(-1 / y) + xlab("Month") + 
  ylab("Inference of Turnover ") + ggtitle("Inference Transformation") + theme_minimal()

#calculate the lambda value through Guerrero method
lambda <- train %>%
  features(y, features = guerrero) %>%
  pull(lambda_guerrero)

print(paste("Guerrero's value is", lambda))  
[1] "Guerrero's value is 0.0228463936805025"
show code
#Plot the box cox transformation using guerrero as the lambda value
box_cox = train %>%
  autoplot(box_cox(y, lambda))  + xlab("Month") +
  ylab("Box Cox of Turnover ") +  ggtitle("Box Cox Transformation") + theme_minimal()

#see the plot
(box_cox + SQR ) / (Log + INF)

The lambda value from the Guerrero method suggest 0.0228 value, indicates 0 value would be the best in a box cox transformation. 0 lambda value would compute a natural logarithm instead. Furthermore, squared root still has sharp variances and the inference is inconsistent. Therefore, Log transformation would be better.

I.III Initial Data Differencing

show code
#transformation
Log

show code
#does it need differencing?
train %>% features(log(y), feat_stl)
# A tibble: 1 × 9
  trend_strength seasonal_strength_year seasonal_peak_year seasonal_trough_year
           <dbl>                  <dbl>              <dbl>                <dbl>
1          0.996                  0.917                  9                    1
# ℹ 5 more variables: spikiness <dbl>, linearity <dbl>, curvature <dbl>,
#   stl_e_acf1 <dbl>, stl_e_acf10 <dbl>
show code
#0.917 seasonal strength, yes it does. 

#How much differencing? 
train %>% features(log(y), unitroot_nsdiffs)
# A tibble: 1 × 1
  nsdiffs
    <int>
1       1
show code
#Seasoned differencing required (n = 1)

#First order as well?
train %>% features(log(y), unitroot_ndiffs)
# A tibble: 1 × 1
  ndiffs
   <int>
1      1
show code
#First order required (n = 1)

#seasoned differencing + logarithmic transformation
train %>% autoplot(difference(log(y),12)) + theme_minimal()
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_line()`).

The seasoned differencing did well by handling the seasonality component from the data. It looks fairly stationary, but indicates a few trending behaviour on the line graph. That is, there are still visible trend patterns or movement in the data.Therefore, following with first order difference would be sufficient.

show code
#first order differencing + logarithmic transformation
train %>% autoplot(difference(log(y))) + theme_minimal()
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

The first order differencing sufficiently removes visible trending movement in the dataset. Looks somewhat stationary but the spikes from the plot indicates a strong seasonality component. Therefore, combining first order and seasoned differencing would be ideal to handle both time series components.

show code
#does it still need differencing?
train %>% 
  mutate(y = difference(difference(log(y),12),1)) %>%
  features(y, feat_stl)
# A tibble: 1 × 9
  trend_strength seasonal_strength_year seasonal_peak_year seasonal_trough_year
           <dbl>                  <dbl>              <dbl>                <dbl>
1         0.0491                 0.0160                  9                    0
# ℹ 5 more variables: spikiness <dbl>, linearity <dbl>, curvature <dbl>,
#   stl_e_acf1 <dbl>, stl_e_acf10 <dbl>
show code
#0.0160 seasonal strength, below 0.64, no more. 

#check again.
train %>%  mutate(y = difference(difference(log(y),12),1)) %>%
  features(y, unitroot_nsdiffs)
# A tibble: 1 × 1
  nsdiffs
    <int>
1       0
show code
#seasoned difference not required (n=0)

train %>%  mutate(y = difference(difference(log(y),12),1)) %>% 
  features(y, unitroot_ndiffs)
# A tibble: 1 × 1
  ndiffs
   <int>
1      0
show code
#first order difference not required (n=0)
#that's enough.

#plot the difference combination
train %>% autoplot(difference(difference(log(y),12),1)) + theme_minimal()
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_line()`).

Most stationary, first order and seasoned differencing has compliment both limitations well. Thus, successfully reducing the trending and seasonality component. No predictable patterns detected. However, there are few long spikes in the beginning but they appear aperiodic. Differencing stabilizes the mean and transformation stabilizes the variances. Select this.

show code
tibble = train %>% 
  mutate(y = difference(difference(log(y),12),1)) %>%
  features(y, feat_stl)

II. Autocorrelation and Residual Analysis

II.I Auto Correlation Function (ACF) and Partial Auto Correlation Function (PACF)

show code
#first order differencing + transformation ACF and PACF
train %>% 
  mutate(y = difference(difference(log(y),12),1)) %>%
  gg_tsdisplay(plot_type ="partial", lag = 24) +
  labs(title = "ACF plot with log transformation and first order + seasoned differencing", y= "")
Plot variable not specified, automatically selected `y = y`
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_point()`).

show code
#lag = 24, instead of lag = 36. 36 lags might add more parameters

Both plots has significant spikes at lag 12 and PACF lags on 12, 24, and 36 are decaying in significance indicating an MA(1) on seasonal ARIMA part. Before lag 12, ACF has significant lags on 1, 3, 7 while PACF are on lag 1, 2 and 7. AR(2) would be more parsimonious for the model than MA(3) while putting AR(7) or MA(7) is too complex. Therefore, ARIMA(2,1,0)(0,1,1) is sufficient.

show code
#fit the model
fit = train %>% model(arima = ARIMA(log(y) ~ pdq(2, 1, 0) + PDQ(0, 1, 1)))
report(fit) 
Series: y 
Model: ARIMA(2,1,0)(0,1,1)[12] 
Transformation: log(y) 

Coefficients:
          ar1      ar2     sma1
      -0.3894  -0.2255  -0.8547
s.e.   0.0472   0.0471   0.0290

sigma^2 estimated as 0.00285:  log likelihood=640.41
AIC=-1272.81   AICc=-1272.72   BIC=-1256.58

II.II Residual Diagnostic of the fitted ARIMA

  • Check the whiteness of the residuals from the fitted ARIMA model. Based on these evaluate and if necessary review the ARIMA model .
show code
fit %>% gg_tsresiduals()

show code
#Generate the Ljung Box's Value
augment(fit)%>%features(.innov, ljung_box, lag = 24, dof =4)
# A tibble: 1 × 3
  .model lb_stat lb_pvalue
  <chr>    <dbl>     <dbl>
1 arima     45.8  0.000854
show code
#residuals is not white noise, sufficient evidence to reject at 95% confidence level

#Generate the Box Pierce Value
augment(fit)%>%features(.innov, box_pierce,lag = 24, dof =4) 
# A tibble: 1 × 3
  .model bp_stat bp_pvalue
  <chr>    <dbl>     <dbl>
1 arima     44.2   0.00143

Residuals are approximately Gaussian, centered around 0. Innovation residuals seems homoskedastic, with constant variances that are centered around 0 as well. ACF shows significant spikes, thus, autocorrelation is present, not white noise. Both hypothesis test reject the null at 95% confidence with 4 degrees of freedom at lag 24.

III. Alternative ARIMA combinations

III.I Manual Selection

  • Consider four alternative ARIMA models. Use information criteria to choose the best model considered so far.
show code
#use up all possible alternatives
fit_4= train %>% model(arima210_011 = ARIMA(log(y) ~ pdq(2, 1, 0) + PDQ(0, 1, 1)),
                           arima013_011 = ARIMA(log(y) ~ pdq(0, 1, 3) + PDQ(0, 1, 1)),
                           arima210_210 = ARIMA(log(y) ~ pdq(2, 1, 0) + PDQ(2, 1, 0)),
                           arima013_210 = ARIMA(log(y) ~ pdq(0, 1, 3) + PDQ(2, 1, 0)))
#View the Information Criterion
glance(fit_4) %>% arrange(AICc) %>% select(.model:BIC)%>%gt() %>%
  # Add a title and subtitle
  tab_header(
    title = md("**Information Criterion of ARIMA**"),
    subtitle = md("Using three different models")
  ) %>%
  # Format number columns with commas
  fmt_number(
    decimals = 4
  )  %>%
  # Add borders and style
  tab_style(
    style = cell_borders(
      sides = "all",
      color = "grey",
      weight = px(1)
    ),
    locations = cells_body()
  ) %>%
  # Apply bold style to headers
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_labels(everything())
  ) 
Error in tab_style(., style = cell_text(weight = "bold"), locations = cells_column_labels(everything())): could not find function "tab_style"

The Arima that was chosen at question 2 has the lowest AICc value, the ARIMA(2,1,0)(0,1,1)[12]. The rest have more parameters, which could suggest lower parameters might be more parsimonious and provides a better fit to the data relative to the complexity of the model.

III.II Automatic ARIMA() Selection

show code
#let R work harder to find all possible options.
fit_r = train %>% model(arima_r = 
                          ARIMA(log(y), 
                                stepwise = FALSE, 
                                approx = FALSE, 
                                order_constraint = p + q + P + Q <= 9 ))
Warning: Model specification induces a quadratic or higher order polynomial trend. 
This is generally discouraged, consider removing the constant or reducing the number of differences.
show code
#make it extra hard for the ARIMA() function to explore 

#check the model selection
fit_r$arima_r
<lst_mdl[1]>
[1] <ARIMA(4,1,2)(1,1,2)[12]>
show code
#ARIMA(4,1,2)(1,1,2)[12]

#combined to see it better
combined_fit = cross_join(fit_r, fit_4)

#select Q4 chosen model and ARIMA() model
combined_fit[c("arima210_011","arima_r")] %>% 
  glance() %>% 
  arrange(AICc) %>% select(.model:BIC)
# A tibble: 2 × 6
  .model        sigma2 log_lik    AIC   AICc    BIC
  <chr>          <dbl>   <dbl>  <dbl>  <dbl>  <dbl>
1 arima_r      0.00279    648. -1276. -1275. -1235.
2 arima210_011 0.00285    640. -1273. -1273. -1257.
show code
#view the Information Criterion for both models

#automatic ARIMA() function is better due to lower AICc
fit_r %>% gg_tsresiduals()

show code
##Generate the Ljung Box's Value
augment(fit_r)%>%features(.innov, ljung_box, lag = 24, dof =6)
# A tibble: 1 × 3
  .model  lb_stat lb_pvalue
  <chr>     <dbl>     <dbl>
1 arima_r    25.6     0.109
show code
#residuals is not white noise, sufficient evidence to reject at 95% confidence level

#Generate the Box Pierce Value
augment(fit_r)%>%features(.innov, box_pierce,lag = 24, dof =6) 
# A tibble: 1 × 3
  .model  bp_stat bp_pvalue
  <chr>     <dbl>     <dbl>
1 arima_r    24.7     0.135

R-select model is ordered ARIMA(4,1,2)(1,1,2)[12]. R-select is different by combining both seasonal and non seasonal AR and MA. Both models uses the same amount of differencing. Choose the R-select model due to lower AICc value.

Residuals are approximately Gaussian, centered around 0. Innovation residuals seems homoskedastic, with constant variances that are centered around 0 as well. ACF plot has no significant spikes at all, no autocorrelation present. The portmanteu test accepts the null hypothesis at 95% confidence level. It is white noise. This indicate a good model that sufficiently captured all variations in the data and its not biased.

III.III ARIMA Champion Model Selection

show code
#Comparison of their accuracy measures
combined_fit %>% forecast(h = 24)%>% accuracy(my_series) %>%
  arrange(RMSE)
# A tibble: 5 × 10
  .model       .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>        <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 arima013_210 Test   17.6  38.0  24.3  3.64  5.66  1.51  1.68 0.709
2 arima210_210 Test   18.5  38.8  24.9  3.88  5.79  1.54  1.72 0.714
3 arima013_011 Test   20.9  40.9  27.5  4.41  6.38  1.70  1.82 0.718
4 arima210_011 Test   21.9  41.7  27.9  4.65  6.47  1.73  1.85 0.720
5 arima_r      Test   26.3  45.4  31.3  5.71  7.23  1.94  2.02 0.733
show code
#Comparison of AICC AIC and BIC
combined_fit %>% 
  glance() %>% 
  arrange(AICc) %>% 
  select(.model:AICc) %>% 
  gt() %>% 
  tab_header(title = "ARIMA model comparison",
             subtitle = "Sorted by Akaike Infomation Criterion (AIC)") %>%
  gt::tab_style(style = cell_text(weight = "bold"), 
                locations = cells_column_labels(everything())) 
ARIMA model comparison
Sorted by Akaike Infomation Criterion (AIC)
.model sigma2 log_lik AIC AICc
arima_r 0.002789766 647.7865 -1275.573 -1275.045
arima210_011 0.002849818 640.4067 -1272.813 -1272.719
arima013_011 0.002860393 640.1332 -1270.266 -1270.124
arima210_210 0.003461398 604.7224 -1199.445 -1199.303
arima013_210 0.003480290 604.1067 -1196.213 -1196.014
show code
#All arima model have same amount of differencing 

Interestingly, ARIMA(4,1,2)(1,1,2) performed the worse in the test set but has the lowest AICc. ARIMA(0,1,3)(2,1,0) has the lowest error values in all error measure but has the highest AICc. Nevertheless, ARIMA(4,1,2)(1,1,2) is chosen due to lower AICc as it reflects this model fit the best to the entire dataset.

IV. Forecast the next two years

show code
#whole dataset
fit_r %>% forecast(h=24) %>% autoplot(my_series) + ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(2016 - 2022) (Forecast to 2022)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") + theme_minimal()

Zoom in

show code
#zoom in 
fit_r %>% forecast(h=24) %>% autoplot(my_series[ 420:465,]) + ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(2016 - 2022) (Forecast to 2022)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") + theme_minimal()

It shows reasonable mean over the next two years because the movement of the prediction intervals suggest it capture trend and seasonal. Prediction intervals are quite wide around the point forecast and suggest high degree of uncertainty especially for periods further in the future.

V. Whole data forecast

V.I Full Dataset Plot

show code
library(readxl)
#read the data and perform data wrangling
excel = read_xlsx("Datasets/8501011.xlsx", sheet = 2, skip = 9)

#create a tsibble dataset
df = excel %>% select("Series ID", A3349637X) %>% 
  rename(
  Month = `Series ID`,
  y = A3349637X) %>% mutate(Month = yearmonth(Month)) %>%
  as_tsibble(index = Month)

#plot the data
df %>% autoplot()+ 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(1982 Apr - 2023 March)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()
Plot variable not specified, automatically selected `.vars = y`

This dataset is a continuation of the previous monthly turnover dataset. The plot suggests that it might benefit during covid-19 as there is a visible jump on the spikes from 2020 and a slight increase overall onwards.

V.II Forecast using three different methods

show code
#Train 
train_df = df %>% slice(1:(n()-27)) 

#best benchmark, Best ETS and ARIMA)
best_model = train_df %>% model(ETS = ETS(log(y) ~ error("M") + trend("Ad") + season("M")),
             ARIMA = ARIMA(log(y) ~ 0 + pdq(4,1,2) + PDQ(1,1,2)),
             SNAIVE = SNAIVE(log(y))) 
#plot on the whole data set
best_model %>% forecast(h = 27) %>% autoplot(df) + 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(1982 Apr - 2023 Mar) (with forecast)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()

show code
#zoom in
best_model %>% forecast(h = 27) %>% autoplot(df[430:492,], alpha = 0.6)+ 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(2018 Jan - 2023 Mar) (with forecast)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()

show code
#seperate
best_model %>% forecast(h = 27) %>% autoplot(df[430:492,]) + 
  facet_wrap(~.model)+ 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(2018 Jan - 2023 Mar) (with forecast)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()

Out of all three models, it seems that the model for ETS is the closest from the actual data to the point forecast overall. However, The ETS Model has the widest confidence interval overall. Thus, ETS method has higher level of uncertainty in predicting the future values of the time series compared to the other methods. ETS and ARIMA models has capture the seasonality component and trend component. SNAIVE models use the seasonal pattern from historical data to directly forecast future values.

As SNAIVE is more sensitive to the past data, the forecast seem to do better due to Covid-19 increasing the turnover, allowing the actual data to meet a few of the forecast. For ETS model, it seems that the weight given to the past observations when forecasting future values align perfectly with the positive effects of Covid-19. The ARIMA method seems to perform the worse as the point forecast seem to slowly get farther away from the actual data over time.

V.III Accuracy Evaluation

show code
acc_table = best_model%>% forecast(h =27) %>% 
  accuracy(df) %>% arrange(RMSE)

#evaluate each model's accuracy
acc_table %>% 
  gt() %>%
  # Add a title and subtitle
  tab_header(
    title = md("**Accuracy Metrics of Forecast**"),
    subtitle = md("Using three different models")
  ) %>%
  # Format number columns with commas
  fmt_number(
    decimals = 4
  )  %>%
  # Add borders and style
  gt::tab_style(
    style = cell_borders(
      sides = "all",
      color = "grey",
      weight = px(1)
    ),
    locations = cells_body()
  ) %>%
  # Apply bold style to headers
  gt::tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_labels(everything())
  ) 
Accuracy Metrics of Forecast
Using three different models
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
ETS Test −13.9926 27.8895 21.1148 −3.2708 4.8783 1.2123 1.1129 0.2490
SNAIVE Test 0.5025 31.3555 26.0714 0.3471 6.3103 1.4968 1.2513 0.5060
ARIMA Test −35.3641 44.1171 35.3641 −8.3509 8.3509 2.0304 1.7605 0.3795

ETS performed the best, it has the lowest RMSE, MAE, MPE and MAPE from all 3 models. Suggesting that the accuracy of the point forecast for ETS is the most accurate forecast as compare to Seasonal Naive method and ARIMA method for this dataset.

V.IV Two years ahead forecast evaluation

show code
#Create your final fit 
final_fit =df %>% model(ETS = ETS(log(y) ~ error("M") + trend("Ad") + season("M")),
             ARIMA = ARIMA(log(y) ~ 0 + pdq(4,1,2) + PDQ(1,1,2)),
             SNAIVE = SNAIVE(log(y))) 

#plot the forecast on the whole data set
final_fit %>% forecast(h = 24) %>% autoplot(df) + 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(1982 Apr - 2023 Mar) (forecast to 2025 Mar)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()

show code
#zoom in
final_fit %>% forecast(h = 24) %>% autoplot(df[454:492,], alpha = 0.6)+ 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(2020 Jan - 2023 Mar) (forecast to 2025 Mar)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()

show code
#seperate
final_fit %>% forecast(h = 24) %>% autoplot(df[454:492,]) + facet_wrap(~.model)+ 
  ggtitle("Monthly turnover of electrical and electronic goods retailing in Queensland", 
          subtitle = "(2020 Jan - 2023 Mar) (forecast to 2025 Mar)") + ylab("Turnover ($ Millions)") + 
  xlab("Month") +theme_minimal()

All three model attempted to forecast on the recovery period of Covid-19.

These three model Forecast shows reasonable mean over the next two years because the movement of the prediction intervals suggest it capture trend and seasonal variation except SNAIVE model. The three model’s prediction intervals are quite wide around the point forecast and suggest high degree of uncertainty especially for periods further in the future. Implying that the actual values are expected to vary widely around the mean forecasts over time.

SNAIVE model rely on the seasonal patterns observed in historical data to directly forecast future values. Therefore, it does not consider the trend component. In this case, it has given a wider prediction interval compared to the rest. It gives a similair movement as to post 2020 or during Covid-19 times.

The ETS model breaks down the time series into error, damped trend, and seasonality components to capture various patterns and dynamics in the forecast.ARIMA models, on the other hand, account for autoregressive, moving average, and differencing components to model linear dependencies and make predictions. The movement caused by Covid-19 seems to affect the movement of both forecasts as the models detect the cyclical pattern of Covid-19 combined with the previous cyclical pattern from 2005 onwards.