Week 14 Discussion

Author

Ryan Bean

Part I: Comparing ARIMA, ADL, ARCH, and GARCH Models

ARIMA, ADL, ARCH, and GARCH models are all time series models, but they are used for somewhat different purposes. ARIMA and ADL models are mainly used to model and forecast the conditional mean of a time series, while ARCH and GARCH models are mainly used to model changing volatility over time. In other words, ARIMA and ADL focus on predicting the level of a variable, while ARCH and GARCH focus on predicting the variance or uncertainty around that variable.

ARIMA Model

An ARIMA model stands for autoregressive integrated moving average. It is useful when a time series depends on its own past values and past forecast errors. The general ARIMA\((p,d,q)\) model can be written as:

\[\phi(B)(1-B)^d y_t = c + \theta(B)\varepsilon_t\]

where \(y_t\) is the time series, \(B\) is the backshift operator, \(d\) is the number of differences needed to make the series stationary, \(\phi(B)\) represents the autoregressive terms, \(\theta(B)\) represents the moving average terms, and \(\varepsilon_t\) is the error term.
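
As a worked illustration (the choice \(p = d = q = 1\) is only an example), take \(\phi(B) = 1 - \phi_1 B\) and \(\theta(B) = 1 + \theta_1 B\), which is the sign convention used in the expanded ARMA equation of this section. The operator form is then:

\[(1 - \phi_1 B)(1 - B) y_t = c + (1 + \theta_1 B)\varepsilon_t\]

Multiplying out \((1 - \phi_1 B)(1 - B) = 1 - (1 + \phi_1)B + \phi_1 B^2\) and using \(B^k y_t = y_{t-k}\) gives:

\[y_t = c + (1 + \phi_1) y_{t-1} - \phi_1 y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1}\]

So one round of differencing shows up as an extra lag of \(y\) in levels, with coefficients that sum to one.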

More explicitly, when \(d = 0\) (or when \(y_t\) denotes the series after differencing), the ARMA\((p,q)\) form can be written as:

\[y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}\]

The autoregressive part uses past values of the dependent variable, while the moving average part uses past errors. If the model uses only lagged values of the dependent variable (\(q = 0\)), it is a pure AR\((p)\) model. The example from the link makes this distinction clear: predicting unemployment using only past unemployment is an AR or ARIMA model, not an ADL model, because there is no outside explanatory variable involved.
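
As a minimal sketch of the ARIMA idea, the base R code below simulates an ARIMA(1,1,1) series with known coefficients and then re-estimates them with `arima()`. The coefficient values (0.6 and 0.3), the seed, and the object names are illustrative assumptions and have nothing to do with the data used in Part II.

```r
# Simulate an ARIMA(1,1,1) process with known (assumed) coefficients
set.seed(123)
true_ar <- 0.6
true_ma <- 0.3
y_sim <- arima.sim(
  model = list(order = c(1, 1, 1), ar = true_ar, ma = true_ma),
  n = 2000
)

# Re-estimate the model; with d = 1, arima() works on the differenced
# series and does not include an intercept
fit_arima <- arima(y_sim, order = c(1, 1, 1))
arima_coefs <- coef(fit_arima)
print(round(arima_coefs, 3))
```

With a sample this long, the estimates labelled `ar1` and `ma1` should land close to the assumed values, which is the sense in which the model "uses" past values and past errors.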

ADL Model

An ADL model stands for autoregressive distributed lag model. It extends the basic autoregressive idea by including both lags of the dependent variable and lags of one or more external explanatory variables. A simple ADL\((p,q)\) model can be written as:

\[y_t = \alpha + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + \beta_q x_{t-q} + \varepsilon_t\]

Here, \(y_t\) is the dependent variable, \(x_t\) is an external predictor, \(\phi_i\) measures the effect of past values of \(y\), and \(\beta_j\) measures the effect of current or lagged values of \(x\).

The key difference between ARIMA and ADL is that ARIMA is usually univariate, while ADL brings in another variable. For example, if unemployment is predicted using its own lags plus lagged inflation, that is an ADL model. The linked example emphasizes that ADL is still a single-equation model: unemployment is on the left-hand side, and lags of unemployment and inflation appear on the right-hand side.
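
Because an ADL model is a single-equation regression, it can be sketched with ordinary least squares. The example below simulates an ADL(1,1) process and recovers its coefficients with `lm()`. All numbers (intercept 0.5, coefficients 0.5, 0.8, 0.3), the white-noise predictor, and the variable names are assumptions chosen for illustration, not the unemployment and inflation data from the linked example.

```r
# Simulate an ADL(1,1) process with assumed coefficients
set.seed(42)
n <- 1000
x <- rnorm(n)              # external predictor (white noise for simplicity)
e <- rnorm(n, sd = 0.5)    # regression error
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.5 + 0.5 * y[t - 1] + 0.8 * x[t] + 0.3 * x[t - 1] + e[t]
}

# Build the lagged regressors by hand; the first observation is lost to lagging
adl_data <- data.frame(
  y      = y[-1],
  y_lag1 = y[-n],
  x      = x[-1],
  x_lag1 = x[-n]
)

# One equation: y on its own lag plus current and lagged x
fit_adl <- lm(y ~ y_lag1 + x + x_lag1, data = adl_data)
adl_coefs <- coef(fit_adl)
print(round(adl_coefs, 3))
```

Dropping the `x` and `x_lag1` terms from the formula would turn this back into a plain AR(1) regression, which is the ARIMA-versus-ADL distinction described above.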

ARCH Model

ARCH stands for autoregressive conditional heteroskedasticity. Unlike ARIMA and ADL, ARCH does not mainly model the mean of the series. Instead, it models the variance of the error term. This is useful when volatility changes over time, such as in financial returns where calm periods and volatile periods tend to cluster.

A basic mean equation could be written as:

\[y_t = \mu + \varepsilon_t\]

where:

\[\varepsilon_t = z_t \sigma_t\]

and:

\[z_t \sim \text{i.i.d.}(0,1)\]

The ARCH\((q)\) variance equation is:

\[\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2\]

where \(\sigma_t^2\) is the conditional variance at time \(t\), the restrictions \(\omega > 0\) and \(\alpha_i \ge 0\) keep the variance positive, and the \(\alpha_i\) coefficients measure how past squared shocks affect current volatility. The important idea is that a large past shock of either sign raises the predicted variance today. ARCH therefore captures volatility clustering by letting the error variance change over time.
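
The clustering idea can be seen in a small simulation. The sketch below generates an ARCH(1) series in base R and compares the lag-1 autocorrelation of the shocks with that of the squared shocks. The parameter values (\(\omega = 1\), \(\alpha_1 = 0.4\)), the normal distribution for \(z_t\), and the seed are assumptions made for the illustration.

```r
# Simulate an ARCH(1) process: sigma_t^2 = omega + alpha1 * eps_{t-1}^2
set.seed(2024)
n <- 5000
omega <- 1.0     # assumed variance intercept
alpha1 <- 0.4    # assumed ARCH coefficient
eps <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- omega / (1 - alpha1)      # start at the unconditional variance
eps[1] <- sqrt(sigma2[1]) * rnorm(1)
for (t in 2:n) {
  sigma2[t] <- omega + alpha1 * eps[t - 1]^2
  eps[t] <- sqrt(sigma2[t]) * rnorm(1)
}

# Lag-1 sample autocorrelation of the shocks and of the squared shocks
acf_level <- acf(eps, lag.max = 1, plot = FALSE)$acf[2]
acf_squared <- acf(eps^2, lag.max = 1, plot = FALSE)$acf[2]
print(round(c(level = acf_level, squared = acf_squared), 3))
```

The shocks themselves should look roughly uncorrelated, while the squared shocks should be clearly positively correlated. That pattern is what an ARCH variance equation is designed to pick up.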

GARCH Model

GARCH stands for generalized autoregressive conditional heteroskedasticity. It extends ARCH by including lagged conditional variances as well as lagged squared errors. A GARCH\((p,q)\) model, with \(q\) lagged squared errors and \(p\) lagged conditional variances, can be written as:

\[\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2 + \gamma_1 \sigma_{t-1}^2 + \gamma_2 \sigma_{t-2}^2 + \cdots + \gamma_p \sigma_{t-p}^2\]

A common version is the GARCH\((1,1)\) model:

\[\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \gamma_1 \sigma_{t-1}^2\]

The ARCH part captures the effect of recent shocks, while the GARCH part captures persistence in volatility itself. Compared with ARCH, GARCH is usually more parsimonious: a single lagged variance term carries the influence of all earlier shocks, so it can model long-lasting volatility with fewer parameters than a high-order ARCH model.
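
To see why one lagged variance term can stand in for many ARCH lags, substitute the GARCH\((1,1)\) equation into itself. This is a sketch that assumes \(0 \le \gamma_1 < 1\) so the infinite sum converges. One substitution gives:

\[\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \gamma_1\left(\omega + \alpha_1 \varepsilon_{t-2}^2 + \gamma_1 \sigma_{t-2}^2\right)\]

Repeating the substitution indefinitely gives:

\[\sigma_t^2 = \frac{\omega}{1 - \gamma_1} + \alpha_1 \sum_{j=0}^{\infty} \gamma_1^{\,j}\, \varepsilon_{t-1-j}^2\]

So a GARCH\((1,1)\) model behaves like an ARCH model of infinite order whose weights decline geometrically at rate \(\gamma_1\), while using only three variance parameters.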

Connections Between the Models

These models are connected because they all use lagged information to explain time series behavior. ARIMA uses lags of the dependent variable and lags of the error term to model the conditional mean. ADL builds on the autoregressive idea by adding lags of external predictors. Therefore, an ADL model can be thought of as a broader single-equation forecasting model when outside explanatory variables are available.

ARCH and GARCH are different because they focus on the conditional variance rather than just the conditional mean. However, they can be combined with ARIMA or ADL models. For example, an analyst might first model the mean of a series using ARIMA or ADL and then model the remaining volatility using ARCH or GARCH if the residuals show changing variance over time.

Overall, ARIMA is useful when the goal is to forecast one variable using its own history. ADL is useful when another variable is expected to help predict the dependent variable. ARCH is useful when volatility depends on past shocks, and GARCH is useful when volatility is persistent over time. The main distinction is that ARIMA and ADL model the expected value of the series, while ARCH and GARCH model the uncertainty or variance around that expected value.

Part II: GARCH(3,2) Model of Microsoft Monthly Returns

library(tidyquant)
library(tidyverse)
library(lubridate)
library(rugarch)
library(knitr)
library(kableExtra)

# Pull Microsoft daily stock prices
msft_daily <- tq_get(
  "MSFT",
  get = "stock.prices",
  from = "1990-01-01"
)

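# Convert daily prices to monthly closing prices and monthly percent returns.
# Note: `close` (as reported by Yahoo Finance) is not adjusted for dividends,
# so these are price returns; the `adjusted` column would give total returns.
# The final month may be incomplete, depending on the date of the data pull.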
msft_monthly <- msft_daily %>%
  group_by(month = floor_date(date, unit = "month")) %>%
  summarise(close = last(close), .groups = "drop") %>%
  mutate(
    return = 100 * (close - lag(close)) / lag(close)
  ) %>%
  filter(!is.na(return))

head(msft_monthly)

# A tibble: 6 × 3
  month      close return
  <date>     <dbl>  <dbl>
1 1990-02-01 0.686   6.76
2 1990-03-01 0.769  12.2 
3 1990-04-01 0.806   4.74
4 1990-05-01 1.01   25.9 
5 1990-06-01 1.06    4.11
6 1990-07-01 0.924 -12.5 

ggplot(msft_monthly, aes(x = month, y = return)) +
  geom_line(color = "steelblue", linewidth = 0.4) +
  labs(
    title = "Microsoft Monthly Percent Returns",
    subtitle = "Monthly returns computed from closing prices",
    x = "Date",
    y = "Monthly Return (%)"
  ) +
  theme_minimal()

The plot shows Microsoft monthly percent returns over time. The series appears to fluctuate around a relatively stable average return, but the size of the fluctuations changes noticeably over time. This suggests volatility clustering, where large movements tend to occur near other large movements and calmer periods tend to occur near other calm periods. Because the mean may be relatively stable while the variance changes over time, this is a reasonable setting for a GARCH model.
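
A visual impression of volatility clustering can be backed up with a formal check before fitting a GARCH model. One common option is a Ljung-Box test on the squared, demeaned returns. The sketch below runs that test on a simulated toy series, because the downloaded data are not reproduced here. The GARCH(1,1) parameters, the seed, and the name `toy_returns` are assumptions for the illustration.

```r
# Toy return series with GARCH(1,1)-type volatility clustering (assumed parameters)
set.seed(7)
n <- 2000
omega <- 2
alpha <- 0.15
beta <- 0.80
sigma2 <- numeric(n)
toy_returns <- numeric(n)
sigma2[1] <- omega / (1 - alpha - beta)   # start at the unconditional variance
toy_returns[1] <- sqrt(sigma2[1]) * rnorm(1)
for (t in 2:n) {
  sigma2[t] <- omega + alpha * toy_returns[t - 1]^2 + beta * sigma2[t - 1]
  toy_returns[t] <- sqrt(sigma2[t]) * rnorm(1)
}

# Ljung-Box test on squared, demeaned returns
# H0: no autocorrelation in the squared series (no ARCH effects)
demeaned <- toy_returns - mean(toy_returns)
lb_squared <- Box.test(demeaned^2, lag = 12, type = "Ljung-Box")
print(lb_squared)
```

A small p-value rejects the hypothesis of no ARCH effects. The same call could be applied to `msft_monthly$return` in place of `toy_returns`; that result is not shown here.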

GARCH(3,2) Model Specification

For this analysis, I estimate a GARCH\((3,2)\) model on Microsoft monthly percent returns. Let \(r_t\) represent the monthly return at time \(t\). The mean equation is:

\[r_t = \mu + u_t\]

where:

\[u_t = \sigma_t z_t, \quad z_t \sim N(0,1)\]

The conditional variance equation for the GARCH\((3,2)\) model is:

\[\sigma_t^2 = \omega + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \alpha_3 u_{t-3}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2\]

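In this model, \(\mu\) is the mean return, \(\omega\) is the variance intercept, the \(\alpha\) terms are ARCH effects that measure the effect of past squared shocks, and the \(\beta\) terms are GARCH effects that measure persistence in past conditional variances. Two notation points matter here. First, the order follows the rugarch convention used in the code below, where the first number is the ARCH order and the second is the GARCH order, so GARCH\((3,2)\) means three lagged squared shocks and two lagged variances. This is the reverse of the GARCH\((p,q)\) ordering in Part I, where \(p\) counts lagged variances. Second, \(u_t\) and \(\beta_j\) here play the roles of \(\varepsilon_t\) and \(\gamma_j\) in Part I. This matches the structure described in the GARCH notes, where volatility depends on both recent shocks and its own past values.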
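
The variance equation is a recursion, and it may help to see it written out as code. The function below is a minimal base R sketch, not the rugarch implementation. The name `garch_variance`, the argument names, and the choice to fill pre-sample values with a single `init` level are assumptions, and rugarch may initialize the recursion differently.

```r
# Conditional variance recursion for a GARCH model with
# length(alpha) ARCH lags and length(beta) GARCH lags.
# Pre-sample squared shocks and variances are set to `init` (an assumption).
garch_variance <- function(u, omega, alpha, beta, init = mean(u^2)) {
  n <- length(u)
  sigma2 <- numeric(n)
  for (t in seq_len(n)) {
    arch_part <- 0
    for (i in seq_along(alpha)) {
      u2_lag <- if (t - i >= 1) u[t - i]^2 else init
      arch_part <- arch_part + alpha[i] * u2_lag
    }
    garch_part <- 0
    for (j in seq_along(beta)) {
      s2_lag <- if (t - j >= 1) sigma2[t - j] else init
      garch_part <- garch_part + beta[j] * s2_lag
    }
    sigma2[t] <- omega + arch_part + garch_part
  }
  sigma2
}

# Hand-checkable toy example with three ARCH lags and two GARCH lags
toy_u <- c(1, 2, 3, 4)
toy_sigma2 <- garch_variance(
  toy_u,
  omega = 1,
  alpha = c(0.1, 0.2, 0.3),
  beta = c(0.2, 0.1),
  init = 1
)
print(toy_sigma2)  # values: 1.9, 2.08, 2.506, 3.7092
```

For instance, the third value is \(1 + 0.1(4) + 0.2(1) + 0.3(1) + 0.2(2.08) + 0.1(1.9) = 2.506\). The jump in the last value comes from the larger recent shocks feeding through the ARCH terms.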

# Specify a standard GARCH(3,2) model
spec_garch32 <- ugarchspec(
  variance.model = list(
    model = "sGARCH",
    garchOrder = c(3, 2)
  ),
  mean.model = list(
    armaOrder = c(0, 0),
    include.mean = TRUE
  ),
  distribution.model = "norm"
)

# Fit the model
fit_garch32 <- ugarchfit(
  spec = spec_garch32,
  data = msft_monthly$return
)

show(fit_garch32)

*---------------------------------*
*          GARCH Model Fit        *
*---------------------------------*

Conditional Variance Dynamics   
-----------------------------------
GARCH Model : sGARCH(3,2)
Mean Model  : ARFIMA(0,0,0)
Distribution    : norm 

Optimal Parameters
------------------------------------
        Estimate  Std. Error  t value Pr(>|t|)
mu      1.854038    0.354842  5.22496 0.000000
omega   4.814033    2.553099  1.88556 0.059354
alpha1  0.023437    0.043759  0.53559 0.592240
alpha2  0.062018    0.057736  1.07417 0.282748
alpha3  0.068959    0.067659  1.01921 0.308103
beta1   0.463705    0.467537  0.99180 0.321294
beta2   0.310689    0.400226  0.77628 0.437581

Robust Standard Errors:
        Estimate  Std. Error  t value Pr(>|t|)
mu      1.854038    0.323842  5.72513 0.000000
omega   4.814033    2.687872  1.79102 0.073290
alpha1  0.023437    0.036545  0.64133 0.521311
alpha2  0.062018    0.056963  1.08873 0.276272
alpha3  0.068959    0.056241  1.22613 0.220151
beta1   0.463705    0.226863  2.04399 0.040955
beta2   0.310689    0.178003  1.74542 0.080912

LogLikelihood : -1516.17 

Information Criteria
------------------------------------
                   
Akaike       7.0031
Bayes        7.0687
Shibata      7.0026
Hannan-Quinn 7.0290

Weighted Ljung-Box Test on Standardized Residuals
------------------------------------
                        statistic p-value
Lag[1]                      2.163  0.1414
Lag[2*(p+q)+(p+q)-1][2]     2.187  0.2335
Lag[4*(p+q)+(p+q)-1][5]     3.527  0.3192
d.o.f=0
H0 : No serial correlation

Weighted Ljung-Box Test on Standardized Squared Residuals
------------------------------------
                         statistic p-value
Lag[1]                   0.0008253  0.9771
Lag[2*(p+q)+(p+q)-1][14] 5.1202841  0.7545
Lag[4*(p+q)+(p+q)-1][24] 8.9758406  0.7990
d.o.f=5

Weighted ARCH LM Tests
------------------------------------
             Statistic Shape Scale P-Value
ARCH Lag[6]      0.374 0.500 2.000  0.5408
ARCH Lag[8]      1.354 1.480 1.774  0.6688
ARCH Lag[10]     1.964 2.424 1.650  0.7777

Nyblom stability test
------------------------------------
Joint Statistic:  1.0654
Individual Statistics:             
mu     0.2581
omega  0.1836
alpha1 0.2794
alpha2 0.3706
alpha3 0.3135
beta1  0.3126
beta2  0.3148

Asymptotic Critical Values (10% 5% 1%)
Joint Statistic:         1.69 1.9 2.35
Individual Statistic:    0.35 0.47 0.75

Sign Bias Test
------------------------------------
                   t-value   prob sig
Sign Bias           0.8667 0.3866    
Negative Sign Bias  1.4514 0.1474    
Positive Sign Bias  0.4515 0.6518    
Joint Effect        2.3813 0.4971    


Adjusted Pearson Goodness-of-Fit Test:
------------------------------------
  group statistic p-value(g-1)
1    20     19.11       0.4495
2    30     22.59       0.7951
3    40     42.70       0.3151
4    50     47.41       0.5376


Elapsed time : 0.08404398 

params <- coef(fit_garch32)

param_table <- tibble(
  Software_Name = names(params),
  Estimate = as.numeric(params),
  Textbook_Role = c(
    "Mean intercept, usually written as mu or beta_0",
    "Variance intercept, usually written as omega or alpha_0",
    "ARCH(1): effect of u_{t-1}^2",
    "ARCH(2): effect of u_{t-2}^2",
    "ARCH(3): effect of u_{t-3}^2",
    "GARCH(1): effect of sigma_{t-1}^2",
    "GARCH(2): effect of sigma_{t-2}^2"
  )
)

kable(
  param_table,
  digits = 6,
  caption = "Estimated GARCH(3,2) Parameters for MSFT Monthly Returns"
) %>%
  kable_styling(full_width = FALSE)

Estimated GARCH(3,2) Parameters for MSFT Monthly Returns

Software_Name  Estimate  Textbook_Role
mu             1.854038  Mean intercept, usually written as mu or beta_0
omega          4.814033  Variance intercept, usually written as omega or alpha_0
alpha1         0.023437  ARCH(1): effect of u_{t-1}^2
alpha2         0.062018  ARCH(2): effect of u_{t-2}^2
alpha3         0.068959  ARCH(3): effect of u_{t-3}^2
beta1          0.463705  GARCH(1): effect of sigma_{t-1}^2
beta2          0.310689  GARCH(2): effect of sigma_{t-2}^2

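# Pull each estimate out of the named coefficient vector
mu_hat     <- params["mu"]
omega_hat  <- params["omega"]
alpha1_hat <- params["alpha1"]
alpha2_hat <- params["alpha2"]
alpha3_hat <- params["alpha3"]
beta1_hat  <- params["beta1"]
beta2_hat  <- params["beta2"]

# Persistence: sum of all ARCH and GARCH coefficients
persistence <- alpha1_hat + alpha2_hat + alpha3_hat + beta1_hat + beta2_hat

# Long-run (unconditional) variance and volatility; valid only if persistence < 1
long_run_variance <- omega_hat / (1 - persistence)
long_run_volatility <- sqrt(long_run_variance)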

summary_table <- tibble(
  Quantity = c("Persistence", "Long-run variance", "Long-run volatility"),
  Formula = c(
    "alpha1 + alpha2 + alpha3 + beta1 + beta2",
    "omega / (1 - persistence)",
    "sqrt(long-run variance)"
  ),
  Value = c(persistence, long_run_variance, long_run_volatility)
)

kable(
  summary_table,
  digits = 6,
  caption = "Volatility Persistence and Long-Run Variance"
) %>%
  kable_styling(full_width = FALSE)

Volatility Persistence and Long-Run Variance

Quantity             Formula                                   Value
Persistence          alpha1 + alpha2 + alpha3 + beta1 + beta2   0.928808
Long-run variance    omega / (1 - persistence)                 67.620493
Long-run volatility  sqrt(long-run variance)                    8.223168

msft_monthly <- msft_monthly %>%
  mutate(
    conditional_sigma = as.numeric(sigma(fit_garch32))
  )

ggplot(msft_monthly, aes(x = month, y = conditional_sigma)) +
  geom_line(color = "darkred", linewidth = 0.5) +
  labs(
    title = "Estimated Conditional Volatility from GARCH(3,2)",
    subtitle = "Time-varying volatility of MSFT monthly returns",
    x = "Date",
    y = "Conditional Standard Deviation"
  ) +
  theme_minimal()

The conditional volatility plot shows that the estimated volatility is not constant over time. Instead, volatility rises during turbulent periods and falls during calmer periods. This supports the use of a GARCH model because the model allows the variance of returns to change over time rather than assuming one constant variance for the entire sample.

Estimated GARCH(3,2) Equations

Using the estimated coefficients from the rugarch output, the fitted mean equation is:

\[r_t = 1.854038 + u_t\]

where:

\[u_t = \sigma_t z_t, \quad z_t \sim N(0,1)\]

The fitted conditional variance equation is:

\[\begin{aligned} \sigma_t^2 &= 4.814033 + 0.023437u_{t-1}^2 + 0.062018u_{t-2}^2 + 0.068959u_{t-3}^2 \\ &\quad + 0.463705\sigma_{t-1}^2 + 0.310689\sigma_{t-2}^2 \end{aligned}\]

The software notation must be interpreted carefully. In rugarch, mu is the intercept in the mean equation, while omega is the intercept in the variance equation. This matches the notation warning in the reconciliation example, where software labels may differ from textbook labels even though the same underlying equations are being estimated.

Interpretation of the GARCH(3,2) Model

The estimated mean equation gives the average monthly return for Microsoft over the sample period: \(\hat{\mu} \approx 1.85\) percent per month, with t-values above 5 under both the standard and the robust standard errors. Since the mean model is specified as ARMA\((0,0)\), the return equation only includes a constant mean term and an innovation term, \(u_t\). This means the main focus of the model is not predicting the return level from past returns, but modeling how the volatility of returns changes over time.

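The ARCH coefficients, \(\alpha_1\), \(\alpha_2\), and \(\alpha_3\), measure how recent shocks affect current volatility. A larger squared residual, such as \(u_{t-1}^2\), represents a larger surprise in returns. Positive ARCH coefficients mean that large shocks in recent months increase the conditional variance in the current month. This is how the model captures volatility clustering: large movements in returns tend to be followed by periods of higher volatility. In this fit, however, the three ARCH estimates are small (0.023, 0.062, and 0.069), and none is individually significant at the 10% level under either set of standard errors, so the separate effect of each lag is not precisely estimated.

The GARCH coefficients, \(\beta_1\) and \(\beta_2\), measure how much past conditional volatility carries forward into the current period. Larger GARCH coefficients imply more persistence, meaning that once volatility rises, it does not immediately disappear. Instead, volatility gradually decays over time. Here most of the persistence comes from these terms (0.464 and 0.311). With robust standard errors, \(\beta_1\) is significant at the 5% level and \(\beta_2\) at the 10% level, while neither is significant with the standard errors, which suggests that a lower-order specification might describe the data about as well.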

The total volatility persistence is:

\[\alpha_1 + \alpha_2 + \alpha_3 + \beta_1 + \beta_2 = 0.928808\]

Because this value is below 1, the estimated process is covariance stationary and volatility is mean-reverting. In other words, shocks to volatility are persistent, but they eventually fade back toward a long-run average level. At about 0.93 the persistence is fairly high, so shocks decay slowly. A value even closer to 1 would mean that volatility shocks last longer still, and a much smaller value would mean that volatility returns to normal more quickly.

The long-run variance is calculated as:

\[\frac{\omega}{1 - \alpha_1 - \alpha_2 - \alpha_3 - \beta_1 - \beta_2} = 67.620493\]

The long-run volatility is therefore:

\[\sqrt{67.620493} = 8.223168\]

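This long-run variance, measured in squared percent, is the unconditional level that the conditional variance returns to after shocks fade; the corresponding long-run volatility is about 8.2 percent per month. The diagnostics in the rugarch output are consistent with an adequate variance specification: the weighted Ljung-Box tests on the standardized squared residuals and the weighted ARCH LM tests all have p-values above 0.5, so there is no evidence of remaining ARCH effects. Overall, the model indicates that Microsoft monthly returns exhibit time-varying volatility, volatility clustering, and mean reversion. The GARCH framework improves on a constant-variance model because it allows risk to rise during turbulent periods and fall during calmer periods.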
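
Mean reversion can be made concrete by iterating the variance equation forward in expectation. The sketch below uses the point estimates reported in the rugarch output and a hypothetical starting state in which the recent squared shocks and conditional variances all equal 150. That starting level, the 500-month horizon, and the object names are assumptions chosen only to show the path back toward the long-run variance.

```r
# Point estimates copied from the rugarch output reported above
omega <- 4.814033
alpha <- c(0.023437, 0.062018, 0.068959)   # ARCH terms
beta  <- c(0.463705, 0.310689)             # GARCH terms

persistence <- sum(alpha) + sum(beta)
long_run_variance <- omega / (1 - persistence)

# Hypothetical starting state: recent squared shocks and variances all at 150
start_level <- 150
horizon <- 500

# `path` holds 3 pre-forecast values followed by the variance forecasts
path <- c(rep(start_level, 3), numeric(horizon))
for (h in 4:length(path)) {
  # For future months the expected squared shock equals the forecast variance,
  # so the ARCH and GARCH terms act on the same sequence
  path[h] <- omega +
    sum(alpha * path[h - 1:3]) +
    sum(beta * path[h - 1:2])
}
forecast_variance <- path[-(1:3)]

print(round(forecast_variance[c(1, 6, 12, 24, 60)], 2))
print(round(long_run_variance, 2))
```

The forecasts start below the assumed level of 150 and drift toward roughly 67.62. The speed of that drift is governed by the persistence of about 0.93.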