This report pulls quarterly U.S. Real GDP data from the Federal Reserve Economic Data (FRED) database, converts it to an annualized growth rate, tests for stationarity and fits an Autoregressive Model AR(2) time series model. Forecasts for the next eight quarters are produced from the fitted model.
library(fredr)
library(tseries)
library(forecast)
library(ggplot2)
library(dplyr)
library(zoo)
fred_key <- Sys.getenv("FRED_API_KEY")
if (nchar(fred_key) == 0)
stop("FRED_API_KEY not found. Set it with Sys.setenv(FRED_API_KEY = 'your_key')")
fredr_set_key(fred_key)
Real GDP is the inflation adjusted value of the goods and services
produced by labor and property located in the United States. Real GDP is
measured in billions of chained 2017 dollars, seasonally adjusted at
annual rates. This measurement functions like an index anchored to 2017,
where changes in the series reflect only changes in the volume of output
rather than changes in prices. The chained method links successive year
to year price comparisons rather than fixing a single base year basket
permanently. This process allows the index to reveals changes in the
composition of the economy. Sourced from the U.S. Bureau of Economic
Analysis, Real Gross Domestic Product FRED series GDPC1
April 28, 2026.
local_file <- "gdp_raw.csv"
if (file.exists(local_file)) {
gdp_raw <- read.csv(local_file, stringsAsFactors = FALSE)
gdp_raw$date <- as.Date(gdp_raw$date)
gdp_raw$realtime_start <- as.Date(gdp_raw$realtime_start)
gdp_raw$realtime_end <- as.Date(gdp_raw$realtime_end)
} else {
httr::set_config(httr::timeout(30))
gdp_raw <- fredr(
series_id = "GDPC1",
observation_start = as.Date("1947-01-01"),
observation_end = Sys.Date(),
frequency = "q"
)
write.csv(gdp_raw, local_file, row.names = FALSE)
}
cat(sprintf("Observations: %d | Period: %s to %s",
nrow(gdp_raw), min(gdp_raw$date), max(gdp_raw$date)))
## Observations: 316 | Period: 1947-01-01 to 2025-10-01
The annualized quarter over quarter growth rate is computed as:
\[g_t = \left[\left(\frac{GDP_t}{GDP_{t-1}}\right)^4 - 1\right] \times 100\]
gdp <- gdp_raw %>%
select(date, gdp = value) %>%
arrange(date) %>%
filter(!is.na(gdp)) %>%
mutate(
gdp_growth_qoq = (gdp / lag(gdp) - 1) * 100,
gdp_growth_ann = ((gdp / lag(gdp))^4 - 1) * 100
) %>%
filter(!is.na(gdp_growth_ann))
ggplot(gdp, aes(x = date, y = gdp_growth_ann)) +
geom_line(colour = "#2C7BB6", linewidth = 0.6) +
geom_hline(yintercept = 0, linetype = "dashed", colour = "grey50") +
labs(
title = "U.S. Real GDP Growth Rate (Annualized, QoQ)",
subtitle = "Source: FRED - GDPC1",
x = NULL,
y = "Growth Rate (%)"
) +
theme_minimal(base_size = 13)
start_yr <- as.numeric(format(gdp$date[1], "%Y"))
start_qtr <- as.numeric(format(gdp$date[1], "%m")) %/% 3 + 1
growth_ts <- ts(gdp$gdp_growth_ann, start = c(start_yr, start_qtr), frequency = 4)
Three complementary tests are applied. The ADF and Phillips-Perron tests have a null hypothesis of a unit root (non-stationary); the KPSS test has a null of level stationarity.
adf_result <- adf.test(growth_ts, alternative = "stationary")
data.frame(
Statistic = round(adf_result$statistic, 4),
p.value = round(adf_result$p.value, 4),
Lags = adf_result$parameter,
Conclusion = ifelse(adf_result$p.value < 0.05,
"Reject H0 — Stationary",
"Fail to Reject H0 — Non-Stationary")
)
## Statistic p.value Lags Conclusion
## Dickey-Fuller -7.4624 0.01 6 Reject H0 — Stationary
kpss_result <- kpss.test(growth_ts, null = "Level")
data.frame(
Statistic = round(kpss_result$statistic, 4),
p.value = round(kpss_result$p.value, 4),
Conclusion = ifelse(kpss_result$p.value > 0.05,
"Fail to Reject H0 — Stationary",
"Reject H0 — Non-Stationary")
)
## Statistic p.value Conclusion
## KPSS Level 0.4616 0.0506 Fail to Reject H0 — Stationary
pp_result <- pp.test(growth_ts, alternative = "stationary")
data.frame(
Statistic = round(pp_result$statistic, 4),
p.value = round(pp_result$p.value, 4),
Conclusion = ifelse(pp_result$p.value < 0.05,
"Reject H0 — Stationary",
"Fail to Reject H0 — Non-Stationary")
)
## Statistic p.value Conclusion
## Dickey-Fuller Z(alpha) -279.2955 0.01 Reject H0 — Stationary
All three tests confirm the GDP growth rate series is stationary at the 5% significance level. No differencing is required before fitting the AR model.
The Autocorrelation Function, ACF and the Partial Autocorrelation Function, PACF are examined to motivate the AR(2) specification. A significant PACF at lags 1 and 2 with a gradual decay in the ACF is consistent with an AR(2) process.
par(mfrow = c(1, 2))
acf(growth_ts, lag.max = 20, main = "ACF - GDP Growth Rate", col = "#2C7BB6")
pacf(growth_ts, lag.max = 20, main = "PACF - GDP Growth Rate", col = "#D7191C")
par(mfrow = c(1, 1))
The AR(2) model is estimated by maximum likelihood:
\[g_t = \phi_0 + \phi_1 g_{t-1} + \phi_2 g_{t-2} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)\]
ar2_model <- Arima(growth_ts, order = c(2, 0, 0))
se_vals <- sqrt(diag(vcov(ar2_model)))
data.frame(
Parameter = names(coef(ar2_model)),
Estimate = round(coef(ar2_model), 5),
Std.Error = round(se_vals, 5),
t.value = round(coef(ar2_model) / se_vals, 3)
)
## Parameter Estimate Std.Error t.value
## ar1 ar1 0.12051 0.05604 2.150
## ar2 ar2 0.09933 0.05605 1.772
## intercept intercept 3.18665 0.32049 9.943
cat(sprintf("AIC: %.2f BIC: %.2f Sigma^2: %.4f",
AIC(ar2_model), BIC(ar2_model), ar2_model$sigma2))
## AIC: 1841.52 BIC: 1856.53 Sigma^2: 19.9311
For stationarity, all roots of the AR characteristic polynomial must lie outside the unit circle (modulus > 1).
ar_roots <- polyroot(c(1, -coef(ar2_model)[c("ar1", "ar2")]))
data.frame(
Root = paste0("Root ", seq_along(ar_roots)),
Modulus = round(Mod(ar_roots), 4),
Status = ifelse(Mod(ar_roots) > 1, "Stationary [OK]", "Non-Stationary [FAIL]")
)
## Root Modulus Status
## 1 Root 1 2.6237 Stationary [OK]
## 2 Root 2 3.8369 Stationary [OK]
checkresiduals(ar2_model)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,0,0) with non-zero mean
## Q* = 2.9609, df = 6, p-value = 0.8137
##
## Model df: 2. Total lags used: 8
lb_test <- Box.test(residuals(ar2_model), lag = 10, type = "Ljung-Box", fitdf = 2)
data.frame(
Chi.squared = round(lb_test$statistic, 4),
p.value = round(lb_test$p.value, 4),
Conclusion = ifelse(lb_test$p.value > 0.05,
"Residuals are white noise [OK]",
"Remaining autocorrelation [FAIL]")
)
## Chi.squared p.value Conclusion
## X-squared 4.8059 0.7781 Residuals are white noise [OK]
Eight quarter ahead forecasts with 80% and 95% prediction intervals.
fc <- forecast(ar2_model, h = 8, level = c(80, 95))
print(fc)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2026 Q1 2.978818 -2.742579 8.700214 -5.771303 11.72894
## 2026 Q2 2.892965 -2.869828 8.655757 -5.920466 11.70640
## 2026 Q3 3.130610 -2.668883 8.930102 -5.738949 12.00017
## 2026 Q4 3.150720 -2.650635 8.952076 -5.721687 12.02313
## 2027 Q1 3.176750 -2.625191 8.978691 -5.696553 12.05005
## 2027 Q2 3.181884 -2.620108 8.983877 -5.691498 12.05527
## 2027 Q3 3.185089 -2.616915 8.987092 -5.688310 12.05849
## 2027 Q4 3.185985 -2.616020 8.987990 -5.687415 12.05939
autoplot(fc) +
labs(
title = "AR(2) Forecast - U.S. Real GDP Growth Rate",
subtitle = "80% and 95% prediction intervals",
x = NULL,
y = "Growth Rate (%)"
) +
theme_minimal(base_size = 13)
The annualized quarterly GDP growth rate is stationary, confirmed by three independent tests. The AR(2) model produces stable estimates with clean white noise residuals (Ljung-Box p = 0.778). Forecasts converge quickly to the long-run mean of approximately 3.19%, with wide prediction intervals reflecting the inherent volatility of quarterly GDP growth (\(\hat{\sigma} \approx\) 4.46%).