ADEC7406 Week 14 Discussion

library(fpp3)
library(tseries)
Part I
ARIMA model
An ARIMA model is primarily a model for the conditional mean of a single time series. Let \(B\) denote the backshift operator, so that \(B y_t = y_{t-1}\). An ARIMA\((p,d,q)\) model can be written as
\[\begin{equation} \phi(B)(1-B)^d y_t = c + \theta(B)\varepsilon_t, \end{equation}\]
where
\[\begin{equation} \phi(B)=1-\phi_1B-\phi_2B^2-\cdots-\phi_pB^p \end{equation}\]
and
\[\begin{equation} \theta(B)=1+\theta_1B+\theta_2B^2+\cdots+\theta_qB^q. \end{equation}\]
Equivalently,
\[\begin{equation} (1-\phi_1B-\cdots-\phi_pB^p)(1-B)^d y_t = c+(1+\theta_1B+\cdots+\theta_qB^q)\varepsilon_t, \quad \varepsilon_t \sim WN(0,\sigma^2). \end{equation}\]
The autoregressive coefficients \(\phi_1,\ldots,\phi_p\) capture persistence in the level or differenced series, while the moving-average coefficients \(\theta_1,\ldots,\theta_q\) capture persistence in past shocks. The differencing order \(d\) removes stochastic trends so that the differenced series is approximately stationary.
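As a toy illustration (simulated data, not the Part II series), the fable ARIMA() function loaded via fpp3 can fit a specific ARIMA\((1,1,0)\); the pdq(1, 1, 0) choice here is for the example only, and omitting it would let ARIMA() search over \(p\), \(d\), and \(q\):

set.seed(1)
sim <- tsibble(
  t = 1:200,
  y = as.numeric(cumsum(arima.sim(model = list(ar = 0.5), n = 200))),  # unit root + AR(1)
  index = t
)
sim |>
  model(arima = ARIMA(y ~ pdq(1, 1, 0))) |>
  report()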
ADL model
An autoregressive distributed lag model includes both lagged values of the dependent variable and current or lagged values of explanatory variables. A simple ADL\((p,q)\) equation is
\[\begin{equation} y_t = \alpha + \sum_{i=1}^{p}\phi_i y_{t-i} + \sum_{j=0}^{q}\beta_j x_{t-j} + u_t, \quad u_t \sim WN(0,\sigma_u^2). \end{equation}\]
The ADL model is usually estimated by ordinary least squares when the regressors are predetermined or exogenous enough for the application:
\[\begin{equation} \min_{\alpha,\phi_i,\beta_j} \sum_{t=1}^{T} \left( y_t-\alpha-\sum_{i=1}^{p}\phi_i y_{t-i} -\sum_{j=0}^{q}\beta_j x_{t-j} \right)^2. \end{equation}\]
The coefficients \(\beta_j\) describe how \(x_t\) affects \(y_t\) over time, while the coefficients \(\phi_i\) describe dynamic persistence in \(y_t\). Compared with ARIMA, the ADL model is more explicitly regression-based because it includes outside predictors.
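As a minimal sketch of this OLS estimation on simulated data (all numbers and variable names here are illustrative):

set.seed(2)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.2 + 0.5 * y[t - 1] + 0.8 * x[t] + 0.3 * x[t - 1] + rnorm(1)  # ADL(1,1)
}
adl_df <- data.frame(
  y = y[-1],
  y_lag1 = y[-n],
  x = x[-1],
  x_lag1 = x[-n]
)
summary(lm(y ~ y_lag1 + x + x_lag1, data = adl_df))  # recovers alpha, phi_1, beta_0, beta_1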
ARCH model
ARCH models are different from ARIMA and ADL because they focus on the conditional variance rather than only the conditional mean. A basic mean equation can be written as
\[\begin{equation} y_t = \mu + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t, \quad z_t \overset{iid}{\sim} (0,1). \end{equation}\]
An ARCH\((q)\) variance equation is
\[\begin{equation} \sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i \varepsilon_{t-i}^2, \quad \omega>0, \quad \alpha_i \geq 0. \end{equation}\]
Here, large past shocks increase the current conditional variance. This feature makes ARCH models useful for financial data, where large price movements are often followed by more large movements.
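A short simulation sketch makes the mechanism concrete. One caveat on tseries::garch(), which is also used in Part II: its order argument is c(GARCH lags, ARCH lags), so c(0, 1) fits a pure ARCH(1):

set.seed(3)
n <- 2000
z <- rnorm(n)
eps <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- 0.5 / (1 - 0.4)  # start at the unconditional variance
eps[1] <- sqrt(sigma2[1]) * z[1]
for (t in 2:n) {
  sigma2[t] <- 0.5 + 0.4 * eps[t - 1]^2  # omega = 0.5, alpha1 = 0.4
  eps[t] <- sqrt(sigma2[t]) * z[t]
}
coef(tseries::garch(eps, order = c(0, 1), trace = FALSE))  # a0 near 0.5, a1 near 0.4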
GARCH model
GARCH generalizes ARCH by adding lagged conditional variances to the variance equation. A GARCH\((p,q)\) model can be written as
\[\begin{equation} y_t = \mu + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t, \quad z_t \overset{iid}{\sim} (0,1), \end{equation}\]
with
\[\begin{equation} \sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j \sigma_{t-j}^2, \quad \omega>0, \quad \alpha_i \geq 0, \quad \beta_j \geq 0. \end{equation}\]
Under a Gaussian quasi-likelihood, the parameters are commonly estimated by maximizing
\[\begin{equation} \ell(\Theta) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \log(\sigma_t^2) + \frac{\varepsilon_t^2}{\sigma_t^2} \right]. \end{equation}\]
where \(\Theta\) contains the mean and variance parameters. In the GARCH\((1,1)\) case, covariance stationarity of the conditional variance generally requires \(\alpha_1+\beta_1<1\).
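A direct transcription of this objective is a useful sketch; the recursion below initializes \(\sigma_1^2\) at the sample variance, which is one common convention rather than the only one:

garch11_loglik <- function(theta, eps) {
  # theta = c(omega, alpha1, beta1); eps = demeaned returns
  omega <- theta[1]; alpha1 <- theta[2]; beta1 <- theta[3]
  n <- length(eps)
  sigma2 <- numeric(n)
  sigma2[1] <- var(eps)  # initialization convention (an assumption)
  for (t in 2:n) {
    sigma2[t] <- omega + alpha1 * eps[t - 1]^2 + beta1 * sigma2[t - 1]
  }
  -0.5 * sum(log(sigma2) + eps^2 / sigma2)
}

Maximizing this function subject to the positivity constraints (for example with optim()) is essentially what dedicated GARCH routines do.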
Connections among the models
ARIMA and ADL are mainly conditional-mean models. ARIMA uses the history of the series and its shocks, while ADL adds outside explanatory variables and their lags. If an ADL model has lagged \(y_t\) but no external \(x_t\), it becomes close to an autoregressive mean model. If an ARIMA model includes external regressors, it becomes closely related to an ARIMAX or dynamic regression model.
ARCH and GARCH are mainly conditional-variance models. They are not substitutes for the mean equation; rather, they can be combined with ARIMA, ARMA, or ADL mean equations when the residual variance changes over time. ARCH is nested inside GARCH because a GARCH model with \(\beta_j=0\) for all \(j\) reduces to ARCH. GARCH is usually more parsimonious because one lag of \(\sigma_t^2\) can summarize a long memory of past squared shocks.
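The "combined with a mean equation" point can be made concrete with a two-step sketch: fit an AR(1) mean first, then fit GARCH(1,1) to its residuals. (Joint one-step maximum likelihood, as in dedicated GARCH packages, is the standard and more efficient approach; this is only an illustration.)

set.seed(4)
n <- 1500
z <- rnorm(n)
eps <- numeric(n)
sig2 <- numeric(n)
sig2[1] <- 0.1 / (1 - 0.1 - 0.8)  # unconditional variance
eps[1] <- sqrt(sig2[1]) * z[1]
for (t in 2:n) {
  sig2[t] <- 0.1 + 0.1 * eps[t - 1]^2 + 0.8 * sig2[t - 1]  # GARCH(1,1) errors
  eps[t] <- sqrt(sig2[t]) * z[t]
}
# stats:: prefix needed because dplyr (via fpp3) masks filter()
y <- as.numeric(stats::filter(eps, filter = 0.3, method = "recursive"))  # AR(1) mean
mean_fit <- arima(y, order = c(1, 0, 0))  # step 1: conditional mean
var_fit <- tseries::garch(as.numeric(residuals(mean_fit)), order = c(1, 1), trace = FALSE)  # step 2: variance
coef(var_fit)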
Part II
For the empirical example, I use NVIDIA Corporation adjusted daily closing prices from Yahoo Finance for 2020-01-02 through 2025-12-30. The model is fitted to daily continuously compounded returns,
\[\begin{equation} r_t = 100\left[\log(P_t)-\log(P_{t-1})\right], \end{equation}\]
where \(P_t\) is the adjusted closing price. I model returns rather than price levels because stock prices are usually nonstationary, while daily returns are more appropriate for a volatility model.
data_path <- "data/nvda_2020_2025.csv"
if (file.exists(data_path)) {
  nvda_raw <- read.csv(data_path)
} else {
  if (!requireNamespace("quantmod", quietly = TRUE)) {
    stop("quantmod is not installed and ", data_path,
         " does not exist; install quantmod or supply the CSV.")
  }
  # Download daily OHLCV data from Yahoo Finance and cache it locally.
  nvda_xts <- quantmod::getSymbols(
    "NVDA",
    src = "yahoo",
    from = "2020-01-01",
    to = "2025-12-31",
    auto.assign = FALSE
  )
  nvda_raw <- data.frame(
    date = as.Date(zoo::index(nvda_xts)),
    open = as.numeric(quantmod::Op(nvda_xts)),
    high = as.numeric(quantmod::Hi(nvda_xts)),
    low = as.numeric(quantmod::Lo(nvda_xts)),
    close = as.numeric(quantmod::Cl(nvda_xts)),
    volume = as.numeric(quantmod::Vo(nvda_xts)),
    adjusted = as.numeric(quantmod::Ad(nvda_xts))
  )
  dir.create("data", showWarnings = FALSE)
  write.csv(nvda_raw, data_path, row.names = FALSE)
}
nvda <- nvda_raw |>
  mutate(date = as.Date(date)) |>
  as_tsibble(index = date, regular = FALSE)

nvda_returns <- nvda |>
  mutate(
    log_price = log(adjusted),
    r = 100 * (log_price - lag(log_price))
  ) |>
  filter(!is.na(r))
glimpse(nvda_returns)

Rows: 1,506
Columns: 9
$ date <date> 2020-01-03, 2020-01-06, 2020-01-07, 2020-01-08, 2020-01-09,…
$ open <dbl> 5.87750, 5.80800, 5.95500, 5.99400, 6.09625, 6.18325, 6.1915…
$ high <dbl> 5.94575, 5.93175, 6.04425, 6.05100, 6.14825, 6.21375, 6.3247…
$ low <dbl> 5.85250, 5.78175, 5.90975, 5.95375, 6.02150, 6.09375, 6.1687…
$ close <dbl> 5.90175, 5.92650, 5.99825, 6.00950, 6.07550, 6.10800, 6.2995…
$ volume <dbl> 205384000, 262636000, 314856000, 277108000, 255112000, 31629…
$ adjusted <dbl> 5.875186, 5.899824, 5.971252, 5.982451, 6.048155, 6.080508, …
$ log_price <dbl> 1.770738, 1.774923, 1.786957, 1.788830, 1.799753, 1.805088, …
$ r <dbl> -1.6135368, 0.4184750, 1.2034150, 0.1873649, 1.0922895, 0.53…
Original time series
nvda |>
  ggplot(aes(x = date, y = adjusted)) +
  geom_line(color = "#2563eb", linewidth = 0.45) +
  labs(
    title = "NVIDIA adjusted closing price",
    x = "Date",
    y = "Adjusted close"
  ) +
  theme_minimal(base_size = 11)

nvda_returns |>
  ggplot(aes(x = date, y = r)) +
  geom_hline(yintercept = 0, color = "gray55", linewidth = 0.3) +
  geom_line(color = "#b91c1c", linewidth = 0.35) +
  labs(
    title = "NVIDIA daily log returns",
    x = "Date",
    y = "Daily log return (%)"
  ) +
  theme_minimal(base_size = 11)

Mean and variance equations before estimation
I estimate a GARCH\((1,1)\) model, so both \(p\) and \(q\) are greater than zero. The mean equation is
\[\begin{equation} r_t = \mu + \varepsilon_t. \end{equation}\]
The innovation is written as
\[\begin{equation} \varepsilon_t = \sigma_t z_t, \quad z_t \overset{iid}{\sim} (0,1). \end{equation}\]
The conditional variance equation is
\[\begin{equation} \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2. \end{equation}\]
Because tseries::garch() estimates the variance equation for a zero-mean input, I first estimate the constant mean \(\mu\) as the sample mean of \(r_t\), and then fit the GARCH model to the demeaned return series \(r_t-\hat{\mu}\).
mu_hat <- mean(nvda_returns$r)

garch_fit <- tseries::garch(
  x = nvda_returns$r - mu_hat,
  order = c(1, 1),
  trace = FALSE
)

summary(garch_fit)
Call:
tseries::garch(x = nvda_returns$r - mu_hat, order = c(1, 1), trace = FALSE)
Model:
GARCH(1,1)
Residuals:
Min 1Q Median 3Q Max
-6.11358 -0.56898 0.02905 0.61816 8.26032
Coefficient(s):
Estimate Std. Error t value Pr(>|t|)
a0 0.66014 0.15163 4.354 1.34e-05 ***
a1 0.10729 0.01706 6.289 3.19e-10 ***
b1 0.83750 0.02757 30.379 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Diagnostic Tests:
Jarque Bera Test
data: Residuals
X-squared = 1475.9, df = 2, p-value < 2.2e-16
Box-Ljung test
data: Squared.Residuals
X-squared = 0.34883, df = 1, p-value = 0.5548
garch_summary <- summary(garch_fit)
garch_coef <- as.data.frame(garch_summary$coef)
garch_coef <- tibble::rownames_to_column(garch_coef, "Parameter")
names(garch_coef) <- c("Parameter", "Estimate", "Std. Error", "t value", "Pr(>|t|)")

knitr::kable(
  garch_coef,
  digits = 4,
  caption = "GARCH$(1,1)$ parameter estimates from R."
)

| Parameter | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| a0 | 0.6601 | 0.1516 | 4.3535 | 0 |
| a1 | 0.1073 | 0.0171 | 6.2893 | 0 |
| b1 | 0.8375 | 0.0276 | 30.3791 | 0 |
omega_hat <- unname(coef(garch_fit)["a0"])
alpha1_hat <- unname(coef(garch_fit)["a1"])
beta1_hat <- unname(coef(garch_fit)["b1"])

persistence <- alpha1_hat + beta1_hat
long_run_variance <- omega_hat / (1 - persistence)
long_run_sd <- sqrt(long_run_variance)
vol_half_life <- log(0.5) / log(persistence)

parameter_table <- tibble(
  Quantity = c(
    "$\\hat{\\mu}$",
    "$\\hat{\\omega}$",
    "$\\hat{\\alpha}_1$",
    "$\\hat{\\beta}_1$",
    "$\\hat{\\alpha}_1+\\hat{\\beta}_1$",
    "Long-run variance ($\\%^2$)",
    "Long-run daily volatility (%)",
    "Volatility half-life (trading days)"
  ),
  Estimate = c(
    mu_hat,
    omega_hat,
    alpha1_hat,
    beta1_hat,
    persistence,
    long_run_variance,
    long_run_sd,
    vol_half_life
  )
)

knitr::kable(
  parameter_table,
  digits = 4,
  escape = FALSE,
  caption = "Key estimated quantities."
)

| Quantity | Estimate |
|---|---|
| \(\hat{\mu}\) | 0.2289 |
| \(\hat{\omega}\) | 0.6601 |
| \(\hat{\alpha}_1\) | 0.1073 |
| \(\hat{\beta}_1\) | 0.8375 |
| \(\hat{\alpha}_1+\hat{\beta}_1\) | 0.9448 |
| Long-run variance (\(\%^2\)) | 11.9575 |
| Long-run daily volatility (%) | 3.4580 |
| Volatility half-life (trading days) | 12.2056 |
Updated equations with estimated coefficients
The estimated mean equation is
\[\begin{equation} r_t = 0.2289 + \hat{\varepsilon}_t. \end{equation}\]
The estimated variance equation is
\[\begin{equation} \hat{\sigma}_t^2 = 0.6601 + 0.1073\hat{\varepsilon}_{t-1}^2 + 0.8375\hat{\sigma}_{t-1}^2. \end{equation}\]
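As a sketch of how the estimated equation is used (reusing objects defined above, and the same "sigt" fitted-values column as the plot code below), the one-step-ahead variance forecast plugs the last demeaned return and the last fitted variance into the equation:

eps_T <- tail(nvda_returns$r - mu_hat, 1)
sigma_T <- tail(na.omit(as.numeric(garch_fit$fitted.values[, "sigt"])), 1)
sigma2_next <- omega_hat + alpha1_hat * eps_T^2 + beta1_hat * sigma_T^2
sqrt(sigma2_next)  # next-day conditional volatility forecast, in percent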
nvda_volatility <- nvda_returns |>
  mutate(sigma_hat = as.numeric(garch_fit$fitted.values[, "sigt"]))

nvda_volatility |>
  ggplot(aes(x = date, y = sigma_hat)) +
  geom_line(color = "#047857", linewidth = 0.45) +
  labs(
    title = "Estimated conditional volatility",
    x = "Date",
    y = "Conditional volatility (%)"
  ) +
  theme_minimal(base_size = 11)

Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

The warning refers only to the first observation, whose fitted conditional variance is undefined because the recursion has no prior value to start from.
Interpretation
The estimated mean coefficient is \(\hat{\mu}=0.2289\), meaning that the average daily log return over this sample is about 0.229 percent. This should be interpreted as a historical average for this period, not as a guaranteed daily return.
The constant term in the variance equation is \(\hat{\omega}=0.6601\). It is the baseline component of conditional variance, although in a GARCH model the current variance is mostly determined by past shocks and past variance.
The ARCH coefficient is \(\hat{\alpha}_1=0.1073\): yesterday's squared return shock has a positive and statistically significant effect on today's conditional variance. The GARCH coefficient is \(\hat{\beta}_1=0.8375\), much larger than the ARCH coefficient, so volatility is highly persistent: once conditional volatility becomes high, it tends to remain high for several trading days.
The persistence measure is
\[\begin{equation} \hat{\alpha}_1+\hat{\beta}_1 = 0.9448. \end{equation}\]
Since this number is below 1, the conditional variance is mean reverting. However, it is close to 1, so the mean reversion is slow. The estimated long-run variance is
\[\begin{equation} \frac{\hat{\omega}}{1-\hat{\alpha}_1-\hat{\beta}_1} = 11.9575, \end{equation}\]
which corresponds to a long-run daily volatility of about 3.458 percent. The volatility half-life is approximately 12.21 trading days, so a volatility shock loses half of its impact after roughly two and a half trading weeks.
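For reference, the half-life reported in the table follows from the geometric decay of variance deviations at rate \(\hat{\alpha}_1+\hat{\beta}_1\): solving \((\hat{\alpha}_1+\hat{\beta}_1)^h = 0.5\) gives
\[\begin{equation} h = \frac{\log(0.5)}{\log(\hat{\alpha}_1+\hat{\beta}_1)} = \frac{\log(0.5)}{\log(0.9448)} \approx 12.21. \end{equation}\]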
The return plot shows volatility clustering: large positive or negative returns tend to occur in groups rather than being evenly spread across the sample. The GARCH estimates are consistent with that visual evidence, because \(\hat{\beta}_1\) is large and \(\hat{\alpha}_1+\hat{\beta}_1\) is close to 1.