Module 9 Discussion

Author

Teddy Kelly

Part I

For this discussion, I will explore the US CPI time series from FRED to determine whether fitting an ETS or ARIMA model will produce more accurate forecasts.

Below, I load the necessary packages, set my FRED API key, and load the US CPI time series into my environment.

rm(list=ls())
library(fredr)
library(fpp3)
library(kableExtra)
library(patchwork)

# Set Fred API key
fredr_set_key(Sys.getenv('fred_api_key'))

# Loading in the US GDP
gdp <- fredr(series_id = 'GDP',
              observation_start = as.Date('2005-01-01'),
              observation_end = as.Date('2025-10-01'))

# Convert time series into a tsibble
gdp <- gdp |>
  mutate(quarter = yearquarter(date)) |>
  select(quarter, value) |>
  as_tsibble(index = quarter)

# loading in the CPI Data
cpi <- fredr(series_id = 'CPIAUCSL',
             observation_start = as.Date('2010-01-01'),
             observation_end = as.Date('2025-12-01'))

cpi <- cpi |>
  mutate(month = yearmonth(date)) |>
  select(month, value) |>
  as_tsibble(index = month)

# Imputing Missing value for Oct 2025
which(is.na(cpi$value))

[1] 190

# Replace the missing value with the average of the two neighbouring months
cpi$value[190] <- (cpi$value[189] + cpi$value[191]) / 2

cpi |> autoplot(value) +
  labs(title = 'US CPI since 2010',
       x = 'Time (Months)',
       y = 'CPI (Index 1982-1984 = 100)')
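The midpoint imputation for the missing October 2025 value can also be written without a hard-coded row number. The following is a minimal sketch on a made-up vector (the numbers are hypothetical, not the CPI data), using base R's `approx()` for linear interpolation. For a single interior gap, linear interpolation gives the same result as averaging the two neighbouring observations.

```r
# Hypothetical monthly values with one gap (not the actual CPI data)
value <- c(100, 102, NA, 106, 108)

# Locate missing observations instead of hard-coding a row number
missing_idx <- which(is.na(value))

# Linear interpolation across the gap; approx() ignores the NA pair
# and interpolates between the surrounding observed points
filled <- approx(x = seq_along(value), y = value,
                 xout = seq_along(value))$y

# The imputed value equals the average of its two neighbours
filled[missing_idx]
```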

The US CPI time series above displays a clear upward trend, with a marked acceleration after the onset of Covid. This is consistent with the stimulus checks increasing the money supply and, in turn, the price level in the economy.
Before comparing the ETS and ARIMA forecasts for this time series, I expect ETS to perform better because the series exhibits a clear upward trend, which makes it non-stationary.

Below, I have split the CPI time series into 80% for fitting the ETS and ARIMA models and 20% for comparing the forecasting accuracy of the models.
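As a rough check of where an 80% cutoff falls, the sketch below counts months directly. It assumes a complete monthly series from January 2010 through December 2025, which is the range requested from FRED; if fewer observations are returned, the cutoff may shift by a month.

```r
# Monthly dates for the requested range (assumed complete: 192 months)
months <- seq(as.Date("2010-01-01"), as.Date("2025-12-01"), by = "month")

# Number of training observations under an 80/20 split
n_train <- ceiling(0.8 * length(months))

train_end  <- months[n_train]      # last month in the training set
test_start <- months[n_train + 1]  # first month in the testing set

format(train_end, "%Y-%m")
format(test_start, "%Y-%m")
```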

# Splitting the data into 80% training and 20% testing
cpi_train <- cpi |> filter_index(~'2022 Oct')
cpi_test <- cpi |> filter_index('2022 Nov'~.)

# Fitting the ETS and ARIMA models
cpi_fit <- cpi_train |> model(
  ETS = ETS(value),
  ARIMA = ARIMA(value)
)

# Generating forecasts for both models on the testing set
cpi_fc <- cpi_fit |>
  forecast(h = nrow(cpi_test))

cpi_fc |> autoplot(cpi_train) +
  geom_line(data = cpi_test,
            mapping = aes(y = value), linetype = 'dashed')

cpi_metrics <- accuracy(cpi_fc, cpi) |>
  rename('Model' = '.model') |>
  select(Model, ME, RMSE, MAE, MPE, MAPE)
cpi_metrics |> kable()

Model	ME	RMSE	MAE	MPE	MAPE
ARIMA	-15.65016	18.07687	15.65016	-4.933444	4.933444
ETS	-13.51086	15.64838	13.51086	-4.258456	4.258456

Looking at the visual comparison of the forecasts and the metric table, the ETS and ARIMA models produce very similar, but not identical, forecasts.
The training set runs through October of 2022, when the CPI was increasing at a rapid rate, which causes both the ETS and ARIMA models to systematically overestimate the actual CPI in the testing set. This is confirmed in the metric table, where both models have a negative Mean Error (ME) and Mean Percentage Error (MPE).
Overall, the ETS model's error metrics are lower in absolute value than those of the ARIMA model (RMSE of 15.65 versus 18.08 and MAPE of 4.26% versus 4.93%). Therefore, the \(ETS(M,A,N)\) model provides more accurate forecasts than the \(ARIMA(0,2,2)\) model, as anticipated.
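The sign convention behind these metrics can be illustrated with a small hand calculation. The sketch below uses made-up actual and forecast values (not the CPI results) and defines the error as actual minus forecast, so forecasts that sit above the actual values produce negative errors. When every error has the same sign, MAE equals the absolute value of ME, which is the pattern in the table above.

```r
# Hypothetical actual and forecast values (not taken from the CPI data)
actual    <- c(300, 302, 304)
predicted <- c(305, 308, 311)

# Error = actual - forecast, so overestimates give negative errors
e <- actual - predicted

ME   <- mean(e)                        # mean error
RMSE <- sqrt(mean(e^2))                # root mean squared error
MAE  <- mean(abs(e))                   # mean absolute error
MPE  <- mean(e / actual) * 100         # mean percentage error
MAPE <- mean(abs(e / actual)) * 100    # mean absolute percentage error

round(c(ME = ME, RMSE = RMSE, MAE = MAE, MPE = MPE, MAPE = MAPE), 3)
```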

Below, I have also compared the residuals for both the ETS and ARIMA models. As we can see, the ACF of the residuals and the distribution of the residuals are slightly different which further shows that the ETS and ARIMA models are not identical.

ets_residuals <- cpi_fit |> select(ETS) |> 
  gg_tsresiduals() +
  labs(title = 'ETS(M,A,N) Residuals')
ets_residuals

arima_residuals <- cpi_fit |> select(ARIMA) |> 
  gg_tsresiduals() +
  labs(title = 'ARIMA(0,2,2) Residuals')

arima_residuals
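The visual residual check can be complemented with a Ljung-Box test. The following is a minimal sketch in base R on a simulated series (not the CPI data), using `stats::arima()` and `Box.test()`. The lag of 10 and the simulated MA coefficients are assumptions chosen for illustration, and `fitdf = 2` reflects the two estimated MA parameters. In the fable workflow, the `ljung_box` feature from feasts plays a similar role.

```r
set.seed(1)

# Simulated ARIMA(0,2,2) series as a stand-in for the real data
y <- arima.sim(model = list(order = c(0, 2, 2), ma = c(0.3, 0.2)), n = 150)

# Fit an ARIMA(0,2,2) model with base R
fit <- arima(y, order = c(0, 2, 2))

# Ljung-Box test on the residuals; fitdf = number of estimated
# ARMA parameters (two MA terms here)
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 2)

# A large p-value means no evidence against white-noise residuals
lb$p.value
```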

Part II

Now, I will build a dynamic regression model to forecast US CPI by using the M2 time series from FRED. The M2 time series measures the US money supply, which is a key determinant of the price level, and the CPI is one measure of that price level. All else equal, when the money supply increases, the price level is expected to increase, and vice versa.

The goal here is to see whether using M2 as an external regressor in a dynamic regression with ARIMA errors will improve the forecasting accuracy of the CPI.

I have loaded in the M2 time series from FRED below and combined the M2 and CPI time series so that M2 can be used as an external regressor for forecasting the CPI.

# Loading in the M2 money supply data
m2 <- fredr(series_id = 'M2SL',
            observation_start = as.Date('2010-01-01'),
            observation_end = as.Date('2025-12-01'))

m2 <- m2 |> 
  mutate(month = yearmonth(date)) |>
  select(month, value) |>
  as_tsibble(index = month)

# Combining cpi and m2 time series into one
money <- cpi |>
  left_join(m2, by = 'month') |>
  rename('CPI' = 'value.x',
         'M2' = 'value.y')

Now that I have joined the CPI and M2 time series, I will split the money tsibble into 80% for training and 20% for testing and then build the dynamic regression model using ARIMA.

# Split into 80% for training and 20% for testing
money_train <- money |> filter_index(~'2022 Oct')
money_test <- money |> filter_index('2022 Nov'~.)

# Fitting the dynamic regression model
money_fit <- money_train |>
  model(ETS = ETS(CPI),
        ARIMA = ARIMA(CPI),
        ARIMAX = ARIMA(CPI ~ M2 + pdq(0, 2, 2)))

# glance() returns one row of fit statistics per model
money_fit |> glance() |>
  rename('Model' = '.model') |>
  select(Model, sigma2, log_lik, AIC, AICc, BIC) |>
  kable(digits = 2)

Model	sigma2	log_lik	AIC	AICc	BIC
ETS	0.00	-306.76	623.52	623.93	638.71
ARIMA	0.34	-132.50	271.00	271.17	280.08
ARIMAX	0.31	-124.38	256.76	257.04	268.86

According to the model report above, including the money supply as an external regressor in the ARIMA model improves the in-sample fit.
Specifically, the ARIMAX model has a higher log likelihood and lower AIC, AICc, and BIC values than the original ARIMA(0,2,2) model. The ETS row is shown for completeness only, since likelihood-based criteria are not comparable between the ETS and ARIMA model classes.
This suggests that including the money supply as an external variable may help improve forecasting accuracy of the CPI, which makes sense because changes in the money supply are expected to affect the CPI.
The residuals of the ARIMAX model appear to be white noise, which further suggests that the dynamic regression model may capture the CPI behavior better than the regular ARIMA model.

# ARIMAX Residuals
money_fit |> select(ARIMAX) |> gg_tsresiduals() +
  labs(title = 'ARIMAX Residuals')
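The comparison of information criteria between an ARIMA model and the same model with an external regressor can be sketched in base R. The example below uses simulated data (the series `x` and `y` and the coefficient 0.5 are hypothetical stand-ins, not M2 or CPI) and a simpler ARIMA(0,1,1) specification than the one fitted above. Both models use the same order of differencing, which is what makes their AIC values comparable.

```r
set.seed(42)
n <- 150

# Simulated regressor with drift and a response that depends on it
x <- cumsum(rnorm(n, mean = 1))      # stand-in for the external regressor
y <- 0.5 * x + cumsum(rnorm(n))      # response with integrated noise

# Same ARIMA order with and without the regressor
fit_plain <- arima(y, order = c(0, 1, 1))
fit_xreg  <- arima(y, order = c(0, 1, 1), xreg = x)

# Lower AIC indicates the better in-sample trade-off of fit and complexity
c(plain = AIC(fit_plain), with_xreg = AIC(fit_xreg))
```

As in the report above, a lower AIC here only describes in-sample fit and does not guarantee better out-of-sample forecasts.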

Now, I will compare the forecasting accuracy of the models on the testing set. Because the ARIMAX model needs values of the external regressor, M2, over the forecast horizon, I use the observed values of M2 from the testing set. This makes the ARIMAX forecasts ex-post forecasts, since they rely on regressor values that would not have been known at the end of the training period.

This requires us to feed in the future values of M2 as a separate tsibble when generating the forecasts for CPI which I outline in the code below.

# Forecasting with the models
m2_future <- new_data(money_train, nrow(money_test)) |>
  mutate(M2 = money_test$M2)

money_fc <- money_fit |> 
  forecast(new_data = m2_future)

# Plotting the forecasts
money_fc |> autoplot(money_train) +
  geom_line(data = money_test,
            aes(y = CPI), linetype = 'dashed') +
  labs(title = 'Comparing Forecasting Accuracy of CPI',
       subtitle = 'ETS, ARIMA, and ARIMAX',
       x = 'Time (Months)',
       y = 'CPI')

# Accuracy metrics
money_metrics <- accuracy(money_fc, money) |>
  rename('Model' = '.model') |>
  select(Model, ME, RMSE, MAE, MPE, MAPE)
money_metrics |> kable()

Model	ME	RMSE	MAE	MPE	MAPE
ARIMA	-15.65016	18.07687	15.65016	-4.933444	4.933444
ARIMAX	-16.66698	18.77084	16.66698	-5.261911	5.261911
ETS	-13.51086	15.64838	13.51086	-4.258456	4.258456

Despite the improved in-sample fit of the ARIMAX model on the training set, the graph and metric table above show that the ARIMAX model produces slightly less accurate forecasts than the regular ARIMA and ETS models.
All three models overestimate the CPI, but using dynamic regression with M2 as an external regressor makes the overestimation larger.
ARIMAX has a more negative Mean Error (-16.67) and Mean Percentage Error (-5.26%) than the other two models, which reflects this increased overestimation of the CPI.
One potential reason that ARIMAX does not improve forecasting accuracy is that changes to the money supply do not have an immediate effect on the price level, but rather influence the price level in the long run.
Therefore, over the relatively short testing window, using contemporaneous observed values of the money supply did not necessarily improve the forecasting accuracy of CPI levels.
This highlights the importance of considering the time it takes for an external regressor to affect the variable of interest. In this case, the effects of monetary policy do not act right away, but instead take time.
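One way to explore this timing issue is to enter the regressor with a lag. The following is a minimal sketch in base R on simulated data, not an analysis of M2 and CPI: the 12-month lag `k`, the coefficient 0.5, and the ARIMA(0,1,1) order are all assumptions chosen for illustration. A side benefit of a lag of `k` periods is that the regressor values needed for the first `k` forecast steps are already observed at the forecast origin.

```r
set.seed(7)
n <- 160
k <- 12   # hypothetical lag in months (an assumption, not an estimate)

# Simulated regressor and a response that reacts to it k periods later
x     <- cumsum(rnorm(n, mean = 1))
x_lag <- c(rep(NA, k), x[seq_len(n - k)])   # x shifted back by k periods
y     <- 0.5 * x_lag + cumsum(rnorm(n))

# Drop the first k rows, where the lagged regressor is undefined
keep <- (k + 1):n
fit  <- arima(y[keep], order = c(0, 1, 1), xreg = x_lag[keep])

# For horizons 1..k the lagged regressor equals the last k observed x values
future_x_lag <- x[(n - k + 1):n]
fc <- predict(fit, n.ahead = k, newxreg = future_x_lag)

fc$pred
```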