remove(list = ls())
library(fpp3) # tsibble, fable, feasts — main forecasting toolkit
library(fredr) # pull FRED data
library(vars) # base VAR / VECM / IRF
library(urca) # unit-root and cointegration tests
library(lpirfs) # local projections
library(patchwork) # combine ggplots with `/` and `+`
library(tidyverse)
library(conflicted)
# Resolve common name conflicts in favor of dplyr / fable
conflict_prefer("select", "dplyr")
conflict_prefer("filter", "dplyr")
conflict_prefer("lag", "dplyr")
conflict_prefer("summarise", "dplyr")
conflict_prefer("rename", "dplyr")
conflict_prefer("VAR", "fable")
# Get a free key at https://fred.stlouisfed.org/docs/api/api_key.html
fredr_set_key("8a9ec1330374c1696f05cc8e526233b5")VAR, VECM, and Local Projections
A Teaching Note on Multivariate Time Series for Macro Forecasting
Why This Note Exists
Most introductory time-series teaching stops at univariate models — ARIMA, ETS, exponential smoothing. That is fine for one series at a time. But macro variables move together. Inflation, unemployment, interest rates, output — none of these is a closed system. The whole point of macro forecasting is letting variables talk to each other.
This note covers four models, in order of how much “talking” they allow:
- AR(p) — each variable on its own. Univariate baseline.
- VAR(p) — every variable predicts every other variable, lag for lag.
- VECM — a VAR for cointegrated series, with a long-run equilibrium constraint.
- Local Projections (LP) — not for forecasting, but for causal impulse response estimation. The modern alternative to VAR-based IRFs.
Running example throughout: monthly U.S. CPI inflation (year-over-year) and the unemployment rate, pulled from FRED.
Setup
Data from FRED
We pull monthly CPI (CPIAUCSL) and unemployment rate (UNRATE), then construct year-over-year inflation as
\[ \pi_t \;=\; 100 \times \left( \ln P_t - \ln P_{t-12} \right) \]
Sample period: 1985–2023 (avoids the early Volcker disinflation; ends before COVID-era distortions can leak into the holdout).
start_date <- as.Date("1985-01-01")
end_date <- as.Date("2023-12-01")
cpi_raw <- fredr("CPIAUCSL", observation_start = start_date, observation_end = end_date)
unemp_raw <- fredr("UNRATE", observation_start = start_date, observation_end = end_date)
macro <- cpi_raw |>
select(date, cpi = value) |>
mutate(inflation = 100 * (log(cpi) - log(lag(cpi, 12)))) |>
inner_join(unemp_raw |> select(date, unemployment = value), by = "date") |>
drop_na() |>
mutate(month = yearmonth(date)) |>
as_tsibble(index = month) |>
select(month, inflation, unemployment)
macroEyeball the series
macro |>
pivot_longer(c(inflation, unemployment)) |>
mutate(name = recode(name,
inflation = "Inflation (YoY %)",
unemployment = "Unemployment Rate (%)"
)) |>
autoplot(value) +
facet_wrap(~ name, ncol = 1, scales = "free_y") +
labs(title = "U.S. Inflation and Unemployment, Monthly", x = NULL, y = NULL) +
theme_minimal(base_size = 13) +
theme(legend.position = "none")Both series are persistent and show clear negative co-movement — when unemployment rises, inflation tends to fall. That is the empirical Phillips curve. Whether this co-movement helps us forecast is the question the next few sections answer.
Train / Test Split
Train on 1986–2018; forecast all twelve months of 2019. Stopping at 2019 keeps the COVID structural break out of our holdout.
1. AR(2) — the Univariate Baseline
Why we start here
Before claiming that “letting inflation and unemployment talk to each other helps”, we need a baseline that doesn’t let them talk. AR(2) is that baseline: each series uses only its own past.
\[ \pi_t \;=\; c_1 + \phi_{11}\,\pi_{t-1} + \phi_{12}\,\pi_{t-2} + \varepsilon_{1t} \]
\[ u_t \;=\; c_2 + \phi_{21}\,u_{t-1} + \phi_{22}\,u_{t-2} + \varepsilon_{2t} \]
Two separate equations. No cross-talk. Anything VAR adds beyond this is the value of the multivariate structure.
Estimate
Forecast 2019
fc_ar <- list(
inf = forecast(fit_ar$inf, h = 12),
unemp = forecast(fit_ar$unemp, h = 12)
)
p1 <- fc_ar$inf |>
autoplot(macro |> filter(year(month) >= 2016), level = c(80, 95)) +
geom_point(data = test, aes(x = month, y = inflation),
color = "red", size = 1.5) +
labs(title = "AR(2) Forecast: Inflation",
subtitle = "Red dots = actual 2019",
y = "% YoY", x = NULL) +
theme_minimal(base_size = 12)
p2 <- fc_ar$unemp |>
autoplot(macro |> filter(year(month) >= 2016), level = c(80, 95)) +
geom_point(data = test, aes(x = month, y = unemployment),
color = "red", size = 1.5) +
labs(title = "AR(2) Forecast: Unemployment",
subtitle = "Red dots = actual 2019",
y = "%", x = NULL) +
theme_minimal(base_size = 12)
p1 / p22. VAR(2) — Letting the Two Series Talk
Intuition
The Phillips curve says past unemployment should help predict inflation. AR(2) cannot use that. A vector autoregression can: every variable in the system depends on lags of all variables.
In matrix form for two variables and two lags:
\[ \begin{pmatrix} \pi_t \\ u_t \end{pmatrix} \;=\; \mathbf{c} \;+\; \mathbf{A}_1 \begin{pmatrix} \pi_{t-1} \\ u_{t-1} \end{pmatrix} \;+\; \mathbf{A}_2 \begin{pmatrix} \pi_{t-2} \\ u_{t-2} \end{pmatrix} \;+\; \boldsymbol{\varepsilon}_t \]
The two off-diagonal entries of \(\mathbf{A}_1\) and \(\mathbf{A}_2\) are where the cross-talk lives:
- \(A_{12}\): how lagged unemployment moves inflation — the Phillips curve channel.
- \(A_{21}\): how lagged inflation moves unemployment — the policy reaction / cost-push channel.
If both are statistically zero, VAR(2) collapses to two AR(2)s and offers nothing extra. If either is non-zero, the multivariate model carries information AR(2) cannot.
Estimate VAR(2)
Forecast 2019
fc_var_fbl <- fit_var_fbl |> forecast(h = 12)
var_fc_tbl <- as_tibble(fc_var_fbl)
p3 <- ggplot(macro |> filter(year(month) >= 2016),
aes(x = month, y = inflation)) +
geom_line(color = "grey40") +
geom_line(
data = tibble(month = var_fc_tbl$month,
forecast = var_fc_tbl$.mean[, "inflation"]),
aes(y = forecast), color = "steelblue", linewidth = 1
) +
geom_point(data = test, aes(y = inflation), color = "red", size = 1.5) +
labs(title = "VAR(2) Forecast: Inflation",
subtitle = "Blue = forecast, red dots = actual 2019",
y = "% YoY", x = NULL) +
theme_minimal(base_size = 12)
p4 <- ggplot(macro |> filter(year(month) >= 2016),
aes(x = month, y = unemployment)) +
geom_line(color = "grey40") +
geom_line(
data = tibble(month = var_fc_tbl$month,
forecast = var_fc_tbl$.mean[, "unemployment"]),
aes(y = forecast), color = "steelblue", linewidth = 1
) +
geom_point(data = test, aes(y = unemployment), color = "red", size = 1.5) +
labs(title = "VAR(2) Forecast: Unemployment",
subtitle = "Blue = forecast, red dots = actual 2019",
y = "%", x = NULL) +
theme_minimal(base_size = 12)
p3 / p4AR(2) vs. VAR(2): does Phillips help?
ar_pts <- fc_ar$inf |>
as_tibble() |>
select(month, ar_mean = .mean)
var_pts <- tibble(
month = var_fc_tbl$month,
var_mean = var_fc_tbl$.mean[, "inflation"]
)
comparison <- test |>
as_tibble() |>
select(month, actual = inflation) |>
left_join(ar_pts, by = "month") |>
left_join(var_pts, by = "month")
comparison |>
pivot_longer(c(ar_mean, var_mean),
names_to = "Model", values_to = "forecast") |>
group_by(Model) |>
summarise(
RMSE = sqrt(mean((forecast - actual)^2)),
MAE = mean(abs(forecast - actual)),
MAPE = mean(abs((forecast - actual) / actual)) * 100,
.groups = "drop"
) |>
mutate(
Model = recode(Model, ar_mean = "AR(2)", var_mean = "VAR(2)"),
across(where(is.numeric), \(x) round(x, 3))
)comparison |>
pivot_longer(c(actual, ar_mean, var_mean),
names_to = "series", values_to = "value") |>
mutate(series = recode(series,
actual = "Actual",
ar_mean = "AR(2) forecast",
var_mean = "VAR(2) forecast"
)) |>
ggplot(aes(x = month, y = value, color = series, linetype = series)) +
geom_line(linewidth = 1) +
scale_color_manual(values = c("black", "steelblue", "firebrick")) +
scale_linetype_manual(values = c("solid", "dashed", "dashed")) +
labs(title = "AR(2) vs. VAR(2) — 2019 Inflation Forecasts",
y = "% YoY", x = NULL, color = NULL, linetype = NULL) +
theme_minimal(base_size = 12) +
theme(legend.position = "bottom")Key result. When the VAR(2) RMSE comes in below the AR(2) RMSE, the empirical Phillips curve is doing useful forecasting work — past unemployment is carrying information about future inflation that the univariate model misses.
If the gap is small (or AR(2) wins narrowly), the honest reading is that over this particular holdout, the cross-variable signal was weak. That is itself a useful classroom result — multivariate is not magic.
3. Vector Error Correction Model (VECM)
The problem VECM solves
VAR assumes the series are stationary. What if both inflation and unemployment have a unit root — i.e., are I(1)? Then we have three options, only one of which is right:
- VAR in levels → may produce spurious results.
- VAR in first differences → loses long-run information.
- VECM → correct if the series are cointegrated (share a common stochastic trend, cannot drift apart forever).
The VECM augments the differenced VAR with an error-correction term that pulls the system back toward its long-run equilibrium:
\[ \Delta \mathbf{y}_t \;=\; \boldsymbol{\alpha}\, \underbrace{(\boldsymbol{\beta}'\, \mathbf{y}_{t-1})}_{\text{disequilibrium}} \;+\; \sum_{j=1}^{p-1} \boldsymbol{\Gamma}_j\, \Delta \mathbf{y}_{t-j} \;+\; \boldsymbol{\varepsilon}_t \]
- \(\boldsymbol{\beta}\) — the cointegrating vector, encoding the long-run relationship.
- \(\boldsymbol{\alpha}\) — the speed of adjustment: how fast each series corrects back toward equilibrium when it strays.
Step 3a — Unit-root tests
infl_ts <- ts(train$inflation, start = c(1986, 1), frequency = 12)
unemp_ts <- ts(train$unemployment, start = c(1986, 1), frequency = 12)
# ADF test — null is unit root
adf_infl <- ur.df(infl_ts, type = "drift", selectlags = "AIC")
adf_unemp <- ur.df(unemp_ts, type = "drift", selectlags = "AIC")
tibble(
Series = c("Inflation", "Unemployment"),
ADF_stat = c(adf_infl@teststat[1], adf_unemp@teststat[1]),
CV_5pct = c(adf_infl@cval[1, 2], adf_unemp@cval[1, 2]),
Conclusion = c(
ifelse(adf_infl@teststat[1] < adf_infl@cval[1, 2],
"Reject H0 (stationary)", "Fail to reject (unit root)"),
ifelse(adf_unemp@teststat[1] < adf_unemp@cval[1, 2],
"Reject H0 (stationary)", "Fail to reject (unit root)")
)
) |> mutate(across(where(is.numeric), round, 3))If both series have unit roots, move on to cointegration testing. If either is stationary (I(0)), a standard VAR in levels is fine — and VECM does not apply.
Step 3b — Johansen cointegration test
######################
# Johansen-Procedure #
######################
Test type: trace statistic , without linear trend and constant in cointegration
Eigenvalues (lambda):
[1] 5.339340e-02 4.785024e-03 5.536769e-18
Values of teststatistic and critical values of test:
test 10pct 5pct 1pct
r <= 1 | 1.89 7.52 9.24 12.97
r = 0 | 23.51 17.85 19.96 24.60
Eigenvectors, normalised to first column:
(These are the cointegration relations)
inflation.l2 unemployment.l2 constant
inflation.l2 1.0000000 1.000000 1.00000
unemployment.l2 0.1389269 8.851291 23.68385
constant -3.4322515 -39.512669 -172.14074
Weights W:
(This is the loading matrix)
inflation.l2 unemployment.l2 constant
inflation.d -0.062524078 -0.0002269016 -7.182036e-20
unemployment.d 0.006861133 -0.0004988239 3.883569e-19
Test the hypotheses in order, starting with \(r = 0\) (no cointegration):
- Reject \(r = 0\) but fail to reject \(r = 1\) → one cointegrating vector — VECM with \(r = 1\) is appropriate.
- Fail to reject \(r = 0\) → no cointegration — use VAR in differences, not VECM.
- The long-run Phillips curve predicts a negative entry in \(\boldsymbol{\beta}\): high unemployment ↔︎ low inflation.
Step 3c — Estimate the VECM
Response inflation.d :
Call:
lm(formula = inflation.d ~ ect1 + inflation.dl1 + unemployment.dl1 -
1, data = data.mat)
Residuals:
Min 1Q Median 3Q Max
-2.00677 -0.16594 -0.00136 0.17513 1.47660
Coefficients:
Estimate Std. Error t value Pr(>|t|)
ect1 -0.06252 0.01358 -4.604 5.62e-06 ***
inflation.dl1 0.36522 0.04578 7.978 1.67e-14 ***
unemployment.dl1 0.01510 0.11082 0.136 0.892
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3371 on 391 degrees of freedom
Multiple R-squared: 0.2022, Adjusted R-squared: 0.1961
F-statistic: 33.03 on 3 and 391 DF, p-value: < 2.2e-16
Response unemployment.d :
Call:
lm(formula = unemployment.d ~ ect1 + inflation.dl1 + unemployment.dl1 -
1, data = data.mat)
Residuals:
Min 1Q Median 3Q Max
-0.52900 -0.09638 -0.00881 0.09506 0.47268
Coefficients:
Estimate Std. Error t value Pr(>|t|)
ect1 0.006861 0.006078 1.129 0.2596
inflation.dl1 -0.046648 0.020487 -2.277 0.0233 *
unemployment.dl1 0.079603 0.049593 1.605 0.1093
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1508 on 391 degrees of freedom
Multiple R-squared: 0.02781, Adjusted R-squared: 0.02035
F-statistic: 3.728 on 3 and 391 DF, p-value: 0.01151
Step 3d — Forecast with the VECM
We convert the VECM back to its VAR representation for forecasting:
vecm_var <- vec2var(jo_test, r = 1)
fc_vecm <- predict(vecm_var, n.ahead = 12, ci = 0.90)
vecm_infl <- tibble(
month = test$month,
mean = fc_vecm$fcst$inflation[, "fcst"],
lower = fc_vecm$fcst$inflation[, "lower"],
upper = fc_vecm$fcst$inflation[, "upper"]
)
ggplot() +
geom_ribbon(data = vecm_infl,
aes(x = month, ymin = lower, ymax = upper),
fill = "darkorange", alpha = 0.2) +
geom_line(data = vecm_infl,
aes(x = month, y = mean, color = "VECM"), linewidth = 1) +
geom_line(data = test,
aes(x = month, y = inflation, color = "Actual"), linewidth = 1) +
scale_color_manual(values = c("Actual" = "black", "VECM" = "darkorange")) +
labs(title = "VECM Forecast: 2019 Inflation",
subtitle = "Shaded band = 90% CI",
y = "% YoY", x = NULL, color = NULL) +
theme_minimal(base_size = 12) +
theme(legend.position = "bottom")Three-way comparison: AR(2) vs. VAR(2) vs. VECM
vecm_pts <- tibble(
month = test$month,
vecm_mean = fc_vecm$fcst$inflation[, "fcst"]
)
comparison2 <- comparison |>
left_join(vecm_pts, by = "month")
comparison2 |>
pivot_longer(c(ar_mean, var_mean, vecm_mean),
names_to = "Model", values_to = "forecast") |>
group_by(Model) |>
summarise(
RMSE = sqrt(mean((forecast - actual)^2)),
MAE = mean(abs(forecast - actual)),
MAPE = mean(abs((forecast - actual) / actual)) * 100,
.groups = "drop"
) |>
mutate(
Model = recode(Model,
ar_mean = "AR(2)",
var_mean = "VAR(2)",
vecm_mean = "VECM"),
across(where(is.numeric), \(x) round(x, 3))
) |>
arrange(RMSE)When does VECM actually help?
| Condition | Short horizon (1–4 steps) | Long horizon (12+ steps) |
|---|---|---|
| Series are truly cointegrated | VECM ≈ VAR | VECM wins |
| Series are NOT cointegrated | VAR wins | VAR wins |
| Unit roots — but VECM imposed anyway | VECM may hurt | VECM may hurt |
VECM’s edge comes from the error-correction mechanism pulling the system back toward long-run equilibrium. At 1–4 step horizons, this pull is small — VAR is competitive. Over longer horizons, the equilibrium force matters more, and VECM should win — if cointegration is real. If you impose cointegration where it doesn’t exist, you bias the forecast.
4. Local Projections — the Modern Way to Estimate IRFs
What LP does, in one sentence
Instead of estimating a VAR and then computing impulse responses analytically, Local Projections (Jordà 2005) run a separate regression for each forecast horizon:
\[ y_{t+h} \;=\; \alpha^{h} + \beta^{h}_{1}\, x_{t} + \beta^{h}_{2}\, z_{t} + \cdots + \varepsilon_{t+h}, \qquad h = 0, 1, 2, \ldots, H \]
The sequence \(\{\hat{\beta}^{h}_{1}\}_{h=0}^{H}\) traces out the impulse response function directly, horizon by horizon. No cumulating powers of a VAR companion matrix; no implicit assumption that the IRF follows a VAR’s parametric path.
VAR-IRFs vs. LP-IRFs
| Feature | VAR IRF | LP IRF |
|---|---|---|
| Efficiency | More efficient if correctly specified | Less efficient |
| Misspecification robustness | Sensitive to lag length, linearity | Robust — each horizon estimated independently |
| Standard errors | Analytic or bootstrap | Newey-West (built in) |
| Adding controls | Awkward | Natural — just add regressors |
| Nonlinearities (state-dependent IRFs) | Needs TVAR / regime-switching | Easy — interact with regime dummy |
| Trend in top journals | Established | Growing fast |
LP is now the default choice for structural IRF estimation in top empirical macro journals (Ramey 2016 review; Stock & Watson 2018).
Estimate LP: response of inflation to an unemployment shock
lp_data <- train |>
as_tibble() |>
select(inflation, unemployment) # variable order matters
lp_fit <- lp_lin(
endog_data = lp_data,
lags_endog_lin = 2, # 2 lags of both variables as controls
trend = 0, # no deterministic trend
shock_type = 0, # reduced-form one-unit shock
confint = 1.65, # 90% confidence bands
hor = 24, # trace 24 months out
num_cores = 1
)# IRF: response of inflation (var 1) to an unemployment shock (var 2)
# lpirfs arrays are indexed [response, horizon, shock].
# With hor = 24, the horizon dimension has length 25 (h = 0, 1, ..., 24).
irf_mean <- lp_fit$irf_lin_mean[1, , 2]
H_lp <- length(irf_mean) - 1
irf_df <- tibble(
horizon = 0:H_lp,
irf = irf_mean,
lower = lp_fit$irf_lin_low[1, , 2],
upper = lp_fit$irf_lin_up[1, , 2]
)
ggplot(irf_df, aes(x = horizon)) +
geom_ribbon(aes(ymin = lower, ymax = upper),
fill = "steelblue", alpha = 0.25) +
geom_line(aes(y = irf), color = "steelblue", linewidth = 1.1) +
geom_hline(yintercept = 0, linetype = "dashed", color = "grey40") +
labs(title = "LP Impulse Response: Inflation Response to Unemployment Shock",
subtitle = "Shaded band = 90% CI (Newey-West SEs)",
x = "Months after shock", y = "Response of inflation (pp)") +
theme_minimal(base_size = 12)Side-by-side: LP vs. VAR-based IRF
var_for_irf <- vars::VAR(lp_data, p = 2, type = "const")
irf_var <- vars::irf(var_for_irf,
impulse = "unemployment",
response = "inflation",
n.ahead = H_lp, # match LP horizon
ci = 0.90,
boot = TRUE)
irf_var_df <- tibble(
horizon = 0:H_lp,
irf = as.numeric(irf_var$irf$unemployment),
lower = as.numeric(irf_var$Lower$unemployment),
upper = as.numeric(irf_var$Upper$unemployment)
)
bind_rows(
irf_df |> mutate(method = "Local Projection (LP)"),
irf_var_df |> mutate(method = "VAR(2)")
) |>
ggplot(aes(x = horizon, fill = method, color = method)) +
geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.15) +
geom_line(aes(y = irf), linewidth = 1.1) +
geom_hline(yintercept = 0, linetype = "dashed", color = "grey40") +
scale_fill_manual(values = c("steelblue", "firebrick")) +
scale_color_manual(values = c("steelblue", "firebrick")) +
facet_wrap(~ method) +
labs(title = "LP vs. VAR — Inflation Response to Unemployment Shock",
subtitle = "Both at 90%. LP bands tend to be wider — more honest uncertainty.",
x = "Months after shock", y = "Response of inflation (pp)",
fill = NULL, color = NULL) +
theme_minimal(base_size = 12) +
theme(legend.position = "none")- Both IRFs should agree on sign and rough shape if the VAR is well specified.
- LP bands are typically wider — the VAR imposes the restriction that responses follow a specific parametric path; LP does not, so it shows you more of the true uncertainty.
- If the two point estimates diverge a lot at long horizons, that is a red flag for VAR misspecification (wrong lag length, omitted variables, nonlinearity).
- When you want causal credibility without betting on a perfectly specified VAR, prefer LP.
Putting It All Together
Key Takeaways
- AR(p) is the baseline. Anything multivariate has to beat it to justify the extra complexity.
- VAR is the right call when series are stationary and cross-variable feedback matters. Granger causality tests tell you whether the cross-talk is real.
- Test for unit roots before reaching for VECM. If both series are I(1) and cointegrated, VECM is theoretically correct. Its edge mostly shows up at long horizons.
- Use LP when the goal is causal IRFs, not point forecasts. It is more robust to misspecification and naturally extends with controls, interactions, and state-dependence.
- Forecasting and structural inference are different jobs. A VAR can forecast well even when its IRFs are biased; LP exists to get the IRFs right, not to win on RMSE.
References
- Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts. https://otexts.com/fpp3
- Jordà, Ò. (2005). Estimation and inference of impulse responses by local projections. American Economic Review, 95(1), 161–182.
- Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.
- Ramey, V. A. (2016). Macroeconomic shocks and their propagation. Handbook of Macroeconomics, vol. 2.
- Stock, J. H., & Watson, M. W. (2018). Identification and estimation of dynamic causal effects in macroeconomics using external instruments. Economic Journal, 128(610), 917–948.