Direct vs Iterative Multi-Step Forecasts

Author

William McWilliams

In this exercise, we will compare direct and iterative forecasting methods using a simple scenario and explore both concepts using R code. We will forecast the values of the All Food CPI series over a three-month horizon with known observations from 1992 to September 2023. Our forecasts will be made using univariate ARIMA(1,1,0) models, with Equation 1 for the iterative approach and Equation 2 for the direct approach.

\[ \Delta Y_t = \phi_1 \Delta Y_{t-1} + \epsilon_t \tag{1}\]

Where t represents time in months, Y represents the CPI series, $\Delta$ represents the change in CPI series $Y$ calculated by difference the series once ($Y_t-Y_{t-1}$), and $\phi$ is the parameter for the lagged change in CPI ($\Delta Y_{t-1}$)

\[ \Delta Y_{t+h} = \phi_1 \Delta Y_{t-1} + \epsilon_t \tag{2}\]

Where t represents time in months,h represents the forecast horizon, Y represents the CPI series, $\Delta$ represents the change in CPI series $Y$ calculated by difference the series once ($Y_t-Y_{t-1}$), and $\phi$ is the parameter for the lagged change in CPI ($\Delta Y_{t-1}$)

In iterative multi-step forecasting, predicted values from previous time steps are used as input to forecast the values for the subsequent time steps. The process begins by predicting one time step ahead and then using that forecast as an input for the next time step. This recursive process continues until the desired forecast horizon is reached.

When using direct multi-step forecasting, a separate model is trained to predict each step in the forecast horizon. This is different from recursive multi-step forecasting, where a single model is used to make predictions for all future time steps by using its own output as input repeatedly. Direct multi-step forecasting may be more computationally expensive as it requires training multiple models. However, it can often achieve better accuracy in certain scenarios, especially when dealing with complex patterns and dependencies that are difficult to capture with a single model.

To begin the example we will load the CPI All food series using the BLS API

key <- 'da1ff5fcbb8f4da7a4376608698a927b'# must define


key <- 'da1ff5fcbb8f4da7a4376608698a927b'# must define


#note that CUUR0000SEFM was removed because it is missing important years

series_data <- data.frame(
  seriesID = c("CUUR0000SAF1", "CUUR0000SEFV", "CUUR0000SAF11",
               "CUUR0000SAF1121", "CUUR0000SAF11211", "CUUR0000SEFC",
               "CUUR0000SEFD", "CUUR0000SEFE", "CUUR0000SEFF",
               "CUUR0000SEFG", "CUUR0000SEFH", "CUUR0000SEFJ",
               "CUUR0000SEFS", "CUUR0000SAF113", "CUUR0000SAF1131",
               "CUUR0000SEFK", "CUUR0000SEFL", #"CUUR0000SEFM",
               "CUUR0000SEFR", "CUUR0000SAF111", "CUUR0000SAF114",
               "CUUR0000SAF115"),
  category = c("All food", "Food away from home", "Food at home",
               "Meats, poultry, and fish", "Meats", "Beef and veal",
               "Pork", "Other meats", "Poultry", "Fish and seafood",
               "Eggs", "Dairy products", "Fats and oils",
               "Fruits and vegetables", "Fresh fruits and vegetables",
               "Fresh fruits", "Fresh vegetables", #"Processed fruits and vegetables",
               "Sugar and sweets", "Cereals and bakery products",
               "Nonalcoholic beverages", "Other foods")
)




## -------------------------------------------------
#this requires 3 different requests since there is a limit of 10 years at a time
X1 <- blsAPI(list('registrationkey'=key, 'startyear'=1991, 'endyear'=2001,'seriesid'=series_data$seriesID), 
              2, return_data_frame=T)
X2 <- blsAPI(list('registrationkey'=key, 'startyear'=2002, 'endyear'=2021,'seriesid'=series_data$seriesID), 
              2, return_data_frame=T)
X3 <- blsAPI(list('registrationkey'=key, 'startyear'=2022, 'endyear'=2023,'seriesid'=series_data$seriesID), 
              2, return_data_frame=T)

#combine the data frames
CPI_df <- rbind(X3,X2,X1) %>% 
  mutate(
    period = sub("M", "",period),
    date = yearmonth(paste(year,period)),
    value = as.numeric(value))%>% 
  dplyr::select(-year, -period,-periodName) %>%
  pivot_wider(names_from = seriesID, values_from = value) %>% 
  as_tsibble(index = date)

We can take our first look at the All food series by plotting the original series (denoted by $Y$ in the above equations) and the single difference ($Y $) that will be what the we fit the AR(1) model to.

p1 <- CPI_df %>% autoplot(CUUR0000SAF1, col = "cornflowerblue")+theme_bw()

p2 <- CPI_df %>% autoplot(difference(CUUR0000SAF1), col = "coral")+theme_bw()


p1 / p2

The data needs to be partitioned into a training and test set before moving forward. The following accomplishes that by establishing a training set from January 2020 to September 2023. These partitions were chosen somewhat arbitrarily as this is not necessarily meant to be a crash course on forecasting but instead an instructional exercise for comparing direct and iterative approaches. It is best practice to visualize the partitioned data to ensure you did what you set out to so lets take a look before moving on.

train <- CPI_df %>% 
  filter_index("2020 Jan"~"2023 Sep")

test <- CPI_df %>% 
  filter_index("2023 Oct"~.)

p3 <- train %>% autoplot(CUUR0000SAF1, col="black")+
  autolayer(test,CUUR0000SAF1, col = "red")+
  theme_bw() 

p3

Iterative Forecasting

Now we are all set to move forward with the two forecasting methods. We will begin with the iterative approach. Recall the iterative equation:

\[ \Delta Y_t = \phi_1 \Delta Y_{t-1} + \epsilon_t \] Which creates a predictive model that solves for $\Delta Y _t$ given $\Delta Y_{t-1}$.

Where t represents time in months, Y represents the CPI series, $\Delta$ represents the change in CPI series $Y$ calculated by difference the series once ($Y_t-Y_{t-1}$), and $\phi$ is the parameter for the lagged change in CPI ($\Delta Y_{t-1}$)

Lets fit and ARIMA(1,1,0) model to the training set and return the corresponding model parameters.

model1 <- train %>% 
  model(
    AR1 = ARIMA(CUUR0000SAF1~pdq(1,1,0)+PDQ(0,0,0))
  )
model1 %>% report()

Series: CUUR0000SAF1 
Model: ARIMA(1,1,0) w/ drift 

Coefficients:
         ar1  constant
      0.5846    0.5824
s.e.  0.1200    0.1334

sigma^2 estimated as 0.8703:  log likelihood=-58.52
AIC=123.04   AICc=123.64   BIC=128.4

This gives us the parameters for our model. We can plug the $\phi$ and constant term back into the iterative equation like so: \[\Delta Y_t = .5824+ .5846\hat{\Delta Y_{t-1}} + \epsilon_t\] Now Using that equation lets manually forecast the next 3 months using the parameters from the model and an iterative method that I employ using a for-loop.

h <- 3 #forecast horizon

#values from the equation
constant <- .5824
ar1 <- .5846 

# Create an empty numeric vector of length n
Y_vars <- numeric(4)
Y_vars[1] <- train$CUUR0000SAF1[nrow(train)]-train$CUUR0000SAF1[nrow(train)-1]

cpi_val <- numeric(4)
cpi_val[1]<-train$CUUR0000SAF1[nrow(train)]


#predict using the for loop

for (i in 1:h) {
  # Calculate y using the predefined equation
  Y_vars[i+1] <- ar1 * Y_vars[i] + constant
  
  cpi_val[i+1] <- cpi_val[i]+ Y_vars[i+1]
 
}

itterative_predictions <- cpi_val[2:4]

Apologies if the code annotations are not as clear as they could be. I will try to walk through the forecasting method with some equations to help shed some light on the process. But first here is a useful visualization of the process that I borrowed.

Because the model is fit such that it predicts only a single period ahead, we must first predict a value for one period ahead as in the first line of the image. This prediction in our case will be for October 2023 and we will make this prediction using our last observed value in our data which is the change in the All food series from August to September ($\Delta Y_{Sep}$). The equations for this can be written as:

\[\hat{\Delta Y_{Oct}} = .5824+ .5846\Delta Y_{Sep} + \epsilon_t\] And results in an estimated value for the change in CPI for October ($\hat{\Delta Y_{Oct}}$).We then plug this estimated value back into the original equation like so.

\[\hat{\Delta Y_{Nov}} = .5824+ .5846\hat{\Delta Y_{Oct}} + \epsilon_t\] and lastly we estimate the final month of December with the following:

\[\hat{\Delta Y_{Dec}} = .5824+ .5846\hat{\Delta Y_{Nov}} + \epsilon_t\]

We can the sum all of the resulting estimate $\{\hat{\Delta Y_{Oct}},\hat{\Delta Y_{Nov}},\hat{\Delta Y_{Dec}}\}$ and add them to the Septemeber All food series to get a predictions for the All food series value in December 2023 ($\hat{Y_{Dec}}$). In an equation that looks like the following:

\[ \hat{Y_{Dec}} = Y_{Sep}+ \hat{\Delta Y_{Oct}}+\hat{\Delta Y_{Nov}}+\hat{\Delta Y_{Dec}}\] And like wise we could recover the estimates for the All food series along the horizon of the forecast: \[ \hat{Y_{Oct}} = Y_{Sep}+ \hat{\Delta Y_{Oct}}\] \[ \hat{Y_{Nov}} = Y_{Sep}+ \hat{\Delta Y_{Oct}}+\hat{\Delta Y_{Nov}}\]

Direct Forecasting

Now that we have a decent understanding of the iterative approach, lets take a look at the direct and compare the two. To begin I will proved another visual similar to the one in the iterative section.

Recall the direct forecast equation provided at the beginning of this tutorial:

\[ \Delta Y_{t+h} = \phi_1 \Delta Y_{t-1} + \epsilon_t \]

Where t represents time in months,h represents the forecast horizon, Y represents the CPI series, $\Delta$ represents the change in CPI series $Y$ calculated by difference the series once ($Y_t-Y_{t-1}$), and $\phi$ is the parameter for the lagged change in CPI ($\Delta Y_{t-1}$)

There is a subtle difference between the direct forecast equation and that of the iterative method. Notice that the dependent variable located on the left hand side of the equation is not longer $\Delta Y_t$ but instead $\Delta Y_{t+h}$. In order to fit a model according to these specifications we must create these variables within our data set by adding the appropriate value for h. The following code creates three new columns in our data frame one for each value of h. It does this by using the lead() function. In order to understand what it is this function does we should take a look at the data frame after the new columns have been added.

leads <- train %>% 
  mutate(
    lead1 = lead(CUUR0000SAF1, n = 1),
    lead2 = lead(CUUR0000SAF1, n = 2),
    lead3 = lead(CUUR0000SAF1, n = 3)
  ) 

#Dont just take my word for it, look at the df
leads %>% 
  select(date,CUUR0000SAF1,lead1,lead2, lead3) %>% 
  tail()

# A tsibble: 6 x 5 [1M]
      date CUUR0000SAF1 lead1 lead2 lead3
     <mth>        <dbl> <dbl> <dbl> <dbl>
1 2023 Apr         322.  322.  323.  324.
2 2023 May         322.  323.  324.  324.
3 2023 Jun         323.  324.  324.  325.
4 2023 Jul         324.  324.  325.   NA 
5 2023 Aug         324.  325.   NA    NA 
6 2023 Sep         325.   NA    NA    NA

As you can see from the new data frame, the lead function takes the value of $Y_t$ and makes it the value of $Y_{t+h}$. Notice how for certain dates we will not have observations for these new data sets. Now that we have the created the three columns for $Y_{t+1}$, $Y_{t+2}$, and $Y_{t+3}$ we can begin fit three individual models for each horizon.

Horizon = 1

We fit the following model to forecast the October 2023 change in All food ($\hat{\Delta Y_{Oct}}$):

\[\Delta Y_{t+1} = \phi_1 \Delta Y_{t-1} + \epsilon_t\]

oct_frcst_model <- leads %>% 
  model(
    AR1 = ARIMA(lead1~pdq(1,1,0)+PDQ(0,0,0))
  )
oct_frcst_model %>% report()

Series: lead1 
Model: ARIMA(1,1,0) w/ drift 

Coefficients:
         ar1  constant
      0.5815    0.5926
s.e.  0.1217    0.1363

sigma^2 estimated as 0.8689:  log likelihood=-57.65
AIC=121.31   AICc=121.91   BIC=126.66

Using the parameters given by the models output lets rewrite the equations to solve for the the change in All food from October to September:

\[\hat{\Delta Y_{Oct}} = .5926+ .5815 \Delta Y_{t-1} + \epsilon_t\]

Horizon = 2

We fit the following model to forecast the November 2023 change in All food ($\hat{\Delta Y_{Nov}}$):

\[\Delta Y_{t+2} = \phi_1 \Delta Y_{t-1} + \epsilon_t\]

nov_frcst_model <- leads %>% 
  model(
    AR1 = ARIMA(lead2~pdq(1,1,0)+PDQ(0,0,0))
  )
nov_frcst_model %>% report()

Series: lead2 
Model: ARIMA(1,1,0) w/ drift 

Coefficients:
         ar1  constant
      0.6918    0.4803
s.e.  0.1192    0.1268

sigma^2 estimated as 0.7409:  log likelihood=-53.58
AIC=113.16   AICc=113.76   BIC=118.51

Using the parameters given by the models output lets rewrite the equations to solve for the the change in All food from October to September:

\[\hat{\Delta Y_{Nov}} = .4803+ .6918 \Delta Y_{t-1} + \epsilon_t\]

Horizon = 3

We fit the following model to forecast the December 2023 change in All food ($\hat{\Delta Y_{Dec}\hat{$):

\[\Delta Y_{t+3} = \phi_1 \Delta Y_{t-1} + \epsilon_t\]

dec_frcst_model <- leads %>% 
  model(
    AR1 = ARIMA(lead3~pdq(1,1,0)+PDQ(0,0,0))
  )
dec_frcst_model %>% report()

Series: lead3 
Model: ARIMA(1,1,0) w/ drift 

Coefficients:
         ar1  constant
      0.6833    0.4394
s.e.  0.1096    0.1164

sigma^2 estimated as 0.6005:  log likelihood=-48.47
AIC=102.94   AICc=103.54   BIC=108.3

Using the parameters given by the models output lets rewrite the equations to solve for the change in All food from October to September:

\[\hat{\Delta Y_{Dec}} = .4394+ .6833 \Delta Y_{t-1} + \epsilon_t\]

Now, with all three model specifications and $Y_{t-1}$ we ca predict the change in All food for all three months. Before doing so lets create a place to store all three estimates of $\hat{Y_{t+h}}$ (a vector we will call “direct_pred_val”), a place to store the predicted All food series ($\hat{Y_{t+h}}$) (a vector called “direct_predictions”), and finally the value for $\Delta Y_{t-1}$.

#empty vectors to store results
#this will story the estimated change in All food values
direct_pred_val <- numeric(3)

#This will store the estimated all food values
direct_predictions <- numeric(3)

#here we store the value of change in All food at point t-1
t_minus_one <- leads$CUUR0000SAF1[nrow(leads)-1]-leads$CUUR0000SAF1[nrow(leads)-2]

The following will take the estimated models for each horizon and the value of $\Delta Y_{t-1}=.577$ to first predict ($\hat{\Delta Y_{t+h}}$).

#change in all food for October
direct_pred_val[1] <- .5815*t_minus_one+.5926
#change in all food for November
direct_pred_val[2] <- .6918*t_minus_one+.4803
#change in all food for December
direct_pred_val[3] <- .6833*t_minus_one+.4394

And then we will transform it from $\hat{\Delta Y_{t}}$ values to $\hat{Y_t}$ values by following Equation 3

\[ \hat{Y_{Oct}}= Y_{Sep}+\hat{\Delta Y_{Oct}}\\ \hat{Y_{Nov}}= Y_{Sep}+\hat{\Delta Y_{Oct}}+\hat{\Delta Y_{Nov}}\\ \hat{Y_{Dec}}= Y_{Sep}+\hat{\Delta Y_{Oct}}+\hat{\Delta Y_{Nov}}+\hat{\Delta Y_{Dec}} \tag{3}\]

Where $Y_{Sep}$ is the observed value of the All food series in September

direct_predictions[1] <- leads$CUUR0000SAF1[nrow(leads)]+direct_pred_val[1]
direct_predictions[2] <- direct_predictions[1]+direct_pred_val[2] 
direct_predictions[3] <- direct_predictions[2]+direct_pred_val[3]

Now we can put the predictions from both approaches into a single data frame.

# Create a sequence of monthly dates from October 2023 to December 2023
date_sequence <- seq(ymd("2023-10-01"), ymd("2023-12-01"), by = "1 month")

# Combine the vectors into a data frame
predictions_df <- data.frame(
  Date = date_sequence,
  Direct_Predictions = direct_predictions,
  Iterative_Predictions = itterative_predictions
)

# Convert the data frame to a tsibble
predictions_tsbl <- as_tsibble(predictions_df, index = Date)

# Print the tsibble
print(predictions_tsbl)

# A tsibble: 3 x 3 [1D]
  Date       Direct_Predictions Iterative_Predictions
  <date>                  <dbl>                 <dbl>
1 2023-10-01               326.                  326.
2 2023-11-01               327.                  327.
3 2023-12-01               327.                  328.

And finally we can observe how they performed against the actual values the occurred in Oct to Dec 2023

CPI_df %>%
  filter_index("2023 Jan" ~ .) %>% 
  autoplot(CUUR0000SAF1, col = "black", size = 1) +
  autolayer(predictions_tsbl, Direct_Predictions, col = "blue", series = "Direct Predictions") +
  autolayer(predictions_tsbl, Iterative_Predictions, col = "green", series = "Iterative Predictions") +
  labs(title = "Prediction Comparison", 
       y = "CPI: All Food", 
       x = "Date") +
  theme_bw() +
  guides(colour = guide_legend(title = "Legend")) +
  theme(legend.position = "top")

Direct vs Iterative Multi-Step Forecasts

Where t represents time in months, Y represents the CPI series, \(\Delta\) represents the change in CPI series \(Y\) calculated by difference the series once (\(Y_t-Y_{t-1}\)), and \(\phi\) is the parameter for the lagged change in CPI (\(\Delta Y_{t-1}\))

Where t represents time in months,h represents the forecast horizon, Y represents the CPI series, \(\Delta\) represents the change in CPI series \(Y\) calculated by difference the series once (\(Y_t-Y_{t-1}\)), and \(\phi\) is the parameter for the lagged change in CPI (\(\Delta Y_{t-1}\))

Iterative Forecasting

Where t represents time in months, Y represents the CPI series, \(\Delta\) represents the change in CPI series \(Y\) calculated by difference the series once (\(Y_t-Y_{t-1}\)), and \(\phi\) is the parameter for the lagged change in CPI (\(\Delta Y_{t-1}\))

Direct Forecasting

Where t represents time in months,h represents the forecast horizon, Y represents the CPI series, \(\Delta\) represents the change in CPI series \(Y\) calculated by difference the series once (\(Y_t-Y_{t-1}\)), and \(\phi\) is the parameter for the lagged change in CPI (\(\Delta Y_{t-1}\))

Where \(Y_{Sep}\) is the observed value of the All food series in September