Part I: Comparing ARIMA, ADL, ARCH, and GARCH Models
ARIMA, ADL, ARCH, and GARCH models are all time series models, but they are used for somewhat different purposes. ARIMA and ADL models are mainly used to model and forecast the conditional mean of a time series, while ARCH and GARCH models are mainly used to model changing volatility over time. In other words, ARIMA and ADL focus on predicting the level of a variable, while ARCH and GARCH focus on predicting the variance or uncertainty around that variable.
ARIMA Model
An ARIMA model stands for autoregressive integrated moving average. It is useful when a time series depends on its own past values and past forecast errors. The general ARIMA\((p,d,q)\) model can be written as:
\[\phi(B)(1-B)^d y_t = c + \theta(B)\varepsilon_t\]
where \(y_t\) is the time series, \(B\) is the backshift operator, \(d\) is the number of differences needed to make the series stationary, \(\phi(B)\) represents the autoregressive terms, \(\theta(B)\) represents the moving average terms, and \(\varepsilon_t\) is the error term.
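More explicitly, writing \(w_t = (1-B)^d y_t\) for the series after any differencing, the ARMA\((p,q)\) version can be written as:
\[w_t = c + \phi_1 w_{t-1} + \cdots + \phi_p w_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}\]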
The autoregressive part uses past values of the dependent variable, while the moving average part uses past errors. If the model only uses lagged values of the dependent variable, it is basically an AR model. The example from the link makes this distinction clear: predicting unemployment using only past unemployment is an AR or ARIMA model, not an ADL model, because there is no outside explanatory variable involved.
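As a rough illustration of the AR and MA parts, the sketch below simulates an ARMA\((1,1)\) series and re-estimates it with base R's arima(). The coefficient values (0.6 and 0.3), the seed, and the sample size are assumptions chosen only for the example; they do not come from any data in this report.

```r
# Simulate an ARMA(1,1) series with illustrative (assumed) coefficients
set.seed(123)
y_sim <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 500)

# order = c(p, d, q): one AR lag, no differencing, one MA lag
fit_arma <- arima(y_sim, order = c(1, 0, 1))

# Coefficients are labelled ar1, ma1, and intercept
print(coef(fit_arma))
```

Note that the term arima() labels "intercept" is the estimated mean of the series rather than the constant \(c\) in the equation above, so software and textbook labels need to be matched with care.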
ADL Model
An ADL model stands for autoregressive distributed lag model. It extends the basic autoregressive idea by including both lags of the dependent variable and lags of one or more external explanatory variables. A simple ADL\((p,q)\) model can be written as:
\[y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{j=0}^{q} \beta_j x_{t-j} + \varepsilon_t\]
Here, \(y_t\) is the dependent variable, \(x_t\) is an external predictor, \(\phi_i\) measures the effect of past values of \(y\), and \(\beta_j\) measures the effect of current or lagged values of \(x\).
The key difference between ARIMA and ADL is that ARIMA is usually univariate, while ADL brings in another variable. For example, if unemployment is predicted using its own lags plus lagged inflation, that is an ADL model. The linked example emphasizes that ADL is still a single-equation model: unemployment is on the left-hand side, and lags of unemployment and inflation appear on the right-hand side.
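One way to see the single-equation structure is to build the lagged columns by hand and estimate the model with lm(). The sketch below does this for an ADL\((1,1)\) on simulated data. The series x and y, all coefficient values, and the data frame name adl_data are hypothetical and serve only to show the mechanics.

```r
# Simulated data: x is a hypothetical external predictor, y depends on
# its own first lag and on current and lagged x (assumed coefficients)
set.seed(42)
n <- 300
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.5 + 0.4 * y[t - 1] + 0.3 * x[t] + 0.2 * x[t - 1] + rnorm(1, sd = 0.5)
}

# Align each observation with its lags; the first observation is lost
adl_data <- data.frame(
  y      = y[2:n],
  y_lag1 = y[1:(n - 1)],
  x      = x[2:n],
  x_lag1 = x[1:(n - 1)]
)

# ADL(1,1): one equation, y on the left, lags of y and x on the right
fit_adl <- lm(y ~ y_lag1 + x + x_lag1, data = adl_data)
print(coef(fit_adl))
```

Dropping x and x_lag1 from the formula would turn this back into a plain AR(1) regression, which is the ARIMA versus ADL distinction described above.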
ARCH Model
ARCH stands for autoregressive conditional heteroskedasticity. Unlike ARIMA and ADL, ARCH does not mainly model the mean of the series. Instead, it models the variance of the error term. This is useful when volatility changes over time, such as in financial returns where calm periods and volatile periods tend to cluster.
An ARCH\((p)\) model can be written as:
\[\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2\]
where \(\sigma_t^2\) is the conditional variance at time \(t\), \(\omega > 0\), and the \(\alpha_i\) coefficients measure how past squared shocks affect current volatility. The important idea is that large past errors increase the predicted variance today. So ARCH models volatility clustering by allowing the error variance to change over time.
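To make the recursion concrete, the following sketch simulates an ARCH(1) process in base R, where each period's variance is \(\omega\) plus \(\alpha_1\) times the previous squared shock. The parameter values, seed, and sample size are illustrative assumptions.

```r
# ARCH(1) simulation with assumed parameters
set.seed(1)
n      <- 1000
omega  <- 0.5
alpha1 <- 0.6

eps    <- numeric(n)   # shocks
sigma2 <- numeric(n)   # conditional variances

# Start the recursion at the unconditional variance omega / (1 - alpha1)
sigma2[1] <- omega / (1 - alpha1)
eps[1]    <- sqrt(sigma2[1]) * rnorm(1)

for (t in 2:n) {
  sigma2[t] <- omega + alpha1 * eps[t - 1]^2   # large past shock -> higher variance
  eps[t]    <- sqrt(sigma2[t]) * rnorm(1)
}

# Lag-1 autocorrelation of the squared shocks; clustering shows up here
print(acf(eps^2, plot = FALSE)$acf[2])
```

The shocks themselves are serially uncorrelated, so any dependence appears in the squared series rather than in the levels.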
GARCH Model
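GARCH stands for generalized autoregressive conditional heteroskedasticity. It extends ARCH by including lagged conditional variances as well as lagged squared errors. A GARCH\((p,q)\) model can be written as:
\[\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2\]
Here \(p\) is the number of ARCH lags and \(q\) is the number of GARCH lags, which is the ordering used for the GARCH\((3,2)\) model later in this report. Some textbooks reverse the two indices.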
The ARCH part captures the effect of recent shocks, while the GARCH part captures persistence in volatility itself. Compared with ARCH, GARCH is usually more flexible because it can model long-lasting volatility with fewer parameters.
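The "fewer parameters" point can be checked numerically. Substituting the GARCH(1,1) equation into itself repeatedly gives an ARCH model with infinitely many lags whose weights decay geometrically, \(\alpha_1 \beta_1^{k-1}\) on the shock from \(k\) periods ago. The sketch below computes those implied weights; the values of alpha1 and beta1 are assumptions for illustration only.

```r
# Assumed GARCH(1,1) coefficients (illustrative, not estimates)
alpha1 <- 0.10
beta1  <- 0.85

# Implied weight on the squared shock from k periods ago
k <- 1:24
arch_weights <- alpha1 * beta1^(k - 1)
print(round(arch_weights[1:6], 4))

# Total weight over all lags, and the number of lags needed
# to account for 95% of it
total_weight <- alpha1 / (1 - beta1)
lags_95 <- which(cumsum(arch_weights) / total_weight >= 0.95)[1]
print(lags_95)
```

With these assumed values, a pure ARCH model would need 19 lags to reproduce 95% of the weight that the two-parameter GARCH(1,1) places on past shocks.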
Connections Between the Models
These models are connected because they all use lagged information to explain time series behavior. ARIMA uses lags of the dependent variable and lags of the error term to model the conditional mean. ADL builds on the autoregressive idea by adding lags of external predictors. Therefore, an ADL model can be thought of as a broader single-equation forecasting model when outside explanatory variables are available.
ARCH and GARCH are different because they focus on the conditional variance rather than just the conditional mean. However, they can be combined with ARIMA or ADL models. For example, an analyst might first model the mean of a series using ARIMA or ADL and then model the remaining volatility using ARCH or GARCH if the residuals show changing variance over time.
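A minimal sketch of that two-step workflow, using only base R, is shown below: fit a mean model, then test the squared residuals for remaining autocorrelation with a Ljung-Box test. The simulated series, its AR(1) mean, its ARCH(1) shocks, and the choice of 12 lags are all assumptions made for the example.

```r
# Step 0: simulate an AR(1) series whose shocks follow an ARCH(1) process
set.seed(2024)
n  <- 600
u  <- numeric(n)
s2 <- numeric(n)
s2[1] <- 1
u[1]  <- rnorm(1)
for (t in 2:n) {
  s2[t] <- 0.4 + 0.5 * u[t - 1]^2
  u[t]  <- sqrt(s2[t]) * rnorm(1)
}
y <- numeric(n)
y[1] <- u[1]
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + u[t]

# Step 1: model the conditional mean
mean_fit <- arima(y, order = c(1, 0, 0))
res <- residuals(mean_fit)

# Step 2: check the squared residuals for leftover dependence;
# a small p-value points toward ARCH/GARCH effects
lb_sq <- Box.test(res^2, lag = 12, type = "Ljung-Box")
print(lb_sq$p.value)
```

If the test rejects, a variance equation can be added on top of the mean model rather than replacing it.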
Overall, ARIMA is useful when the goal is to forecast one variable using its own history. ADL is useful when another variable is expected to help predict the dependent variable. ARCH is useful when volatility depends on past shocks, and GARCH is useful when volatility is persistent over time. The main distinction is that ARIMA and ADL model the expected value of the series, while ARCH and GARCH model the uncertainty or variance around that expected value.
library(ggplot2)

# Line plot of monthly percent returns
ggplot(msft_monthly, aes(x = month, y = return)) +
  geom_line(color = "steelblue", linewidth = 0.4) +
  labs(
    title = "Microsoft Monthly Percent Returns",
    subtitle = "Monthly returns computed from closing prices",
    x = "Date",
    y = "Monthly Return (%)"
  ) +
  theme_minimal()
The plot shows Microsoft monthly percent returns over time. The series appears to fluctuate around a relatively stable average return, but the size of the fluctuations changes noticeably over time. This suggests volatility clustering, where large movements tend to occur near other large movements and calmer periods tend to occur near other calm periods. Because the mean may be relatively stable while the variance changes over time, this is a reasonable setting for a GARCH model.
GARCH(3,2) Model Specification
For this analysis, I estimate a GARCH\((3,2)\) model on Microsoft monthly percent returns. Let \(r_t\) represent the monthly return at time \(t\). The mean equation is:
\[r_t = \mu + u_t\]
where:
\[u_t = \sigma_t z_t, \quad z_t \sim N(0,1)\]
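The conditional variance equation for the GARCH\((3,2)\) model is:
\[\sigma_t^2 = \omega + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \alpha_3 u_{t-3}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2\]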
In this model, \(\mu\) is the mean return, \(\omega\) is the variance intercept, the \(\alpha\) terms are ARCH effects that measure the effect of past squared shocks, and the \(\beta\) terms are GARCH effects that measure persistence in past conditional variances. This matches the structure described in the GARCH notes, where volatility depends on both recent shocks and its own past values.
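The variance recursion can also be written out by hand, which helps show what each coefficient does. The helper below, garch_variance(), is a hypothetical function written for this illustration and is not part of rugarch. It assumes the pre-sample variances are set to the sample variance of the shocks, which may differ from how the software initializes the recursion, and the parameter values in the usage lines are placeholders rather than estimates.

```r
# Hypothetical helper: conditional variances from a GARCH recursion.
# alpha holds the ARCH coefficients, beta the GARCH coefficients.
garch_variance <- function(u, omega, alpha, beta) {
  n <- length(u)
  p <- length(alpha)
  q <- length(beta)
  m <- max(p, q)

  # Assumption: initialize every period at the sample variance of u
  sigma2 <- rep(var(u), n)

  for (t in (m + 1):n) {
    arch_part  <- sum(alpha * u[t - seq_len(p)]^2)      # recent squared shocks
    garch_part <- sum(beta * sigma2[t - seq_len(q)])    # past variances
    sigma2[t]  <- omega + arch_part + garch_part
  }
  sigma2
}

# Usage with placeholder shocks and placeholder GARCH(3,2) parameters
set.seed(7)
u_demo <- rnorm(200)
sigma2_demo <- garch_variance(
  u_demo,
  omega = 0.2,
  alpha = c(0.10, 0.05, 0.05),
  beta  = c(0.40, 0.30)
)
print(head(round(sqrt(sigma2_demo), 3)))
```

Setting every alpha and beta to zero collapses the recursion to a constant variance equal to omega, which is the constant-variance benchmark that GARCH generalizes.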
library(rugarch)

# Specify a standard GARCH(3,2) model
spec_garch32 <- ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(3, 2)),
  mean.model = list(armaOrder = c(0, 0), include.mean = TRUE),
  distribution.model = "norm"
)

# Fit the model
fit_garch32 <- ugarchfit(spec = spec_garch32, data = msft_monthly$return)
show(fit_garch32)
library(tibble)
library(knitr)
library(kableExtra)

# Pair each software coefficient name with its textbook role
params <- coef(fit_garch32)
param_table <- tibble(
  Software_Name = names(params),
  Estimate = as.numeric(params),
  Textbook_Role = c(
    "Mean intercept, usually written as mu or beta_0",
    "Variance intercept, usually written as omega or alpha_0",
    "ARCH(1): effect of u_{t-1}^2",
    "ARCH(2): effect of u_{t-2}^2",
    "ARCH(3): effect of u_{t-3}^2",
    "GARCH(1): effect of sigma_{t-1}^2",
    "GARCH(2): effect of sigma_{t-2}^2"
  )
)

kable(
  param_table,
  digits = 6,
  caption = "Estimated GARCH(3,2) Parameters for MSFT Monthly Returns"
) %>%
  kable_styling(full_width = FALSE)
Estimated GARCH(3,2) Parameters for MSFT Monthly Returns

| Software_Name | Estimate | Textbook_Role |
|---|---:|---|
| mu | 1.854038 | Mean intercept, usually written as mu or beta_0 |
| omega | 4.814033 | Variance intercept, usually written as omega or alpha_0 |
library(dplyr)
library(ggplot2)

# Attach the fitted conditional standard deviation to the data
msft_monthly <- msft_monthly %>%
  mutate(conditional_sigma = as.numeric(sigma(fit_garch32)))

# Plot the estimated conditional volatility over time
ggplot(msft_monthly, aes(x = month, y = conditional_sigma)) +
  geom_line(color = "darkred", linewidth = 0.5) +
  labs(
    title = "Estimated Conditional Volatility from GARCH(3,2)",
    subtitle = "Time-varying volatility of MSFT monthly returns",
    x = "Date",
    y = "Conditional Standard Deviation"
  ) +
  theme_minimal()
The conditional volatility plot shows that the estimated volatility is not constant over time. Instead, volatility rises during turbulent periods and falls during calmer periods. This supports the use of a GARCH model because the model allows the variance of returns to change over time rather than assuming one constant variance for the entire sample.
Estimated GARCH(3,2) Equations
Using the estimated coefficients from the rugarch output, the fitted mean equation is:
\[r_t = 1.854038 + u_t\]
The software notation must be interpreted carefully. In rugarch, mu is the intercept in the mean equation, while omega is the intercept in the variance equation. This matches the notation warning in the reconciliation example, where software labels may differ from textbook labels even though the same underlying equations are being estimated.
Interpretation of the GARCH(3,2) Model
The estimated mean equation gives the average monthly return for Microsoft over the sample period. Since the mean model is specified as ARMA\((0,0)\), the return equation only includes a constant mean term and an innovation term, \(u_t\). This means the main focus of the model is not predicting the return level from past returns, but modeling how the volatility of returns changes over time.
The ARCH coefficients, \(\alpha_1\), \(\alpha_2\), and \(\alpha_3\), measure how recent shocks affect current volatility. A larger squared residual, such as \(u_{t-1}^2\), represents a larger surprise in returns. Positive ARCH coefficients mean that large shocks in recent months increase the conditional variance in the current month. This is how the model captures volatility clustering: large movements in returns tend to be followed by periods of higher volatility.
The GARCH coefficients, \(\beta_1\) and \(\beta_2\), measure how much past conditional volatility carries forward into the current period. Larger GARCH coefficients imply more persistence, meaning that once volatility rises, it does not immediately disappear. Instead, volatility gradually decays over time.
Volatility persistence is measured by the sum of the ARCH and GARCH coefficients:
\[\alpha_1 + \alpha_2 + \alpha_3 + \beta_1 + \beta_2\]
Because this value is below 1, the model suggests that volatility is mean-reverting. In other words, shocks to volatility are persistent, but they eventually fade back toward a long-run average level. If this value were very close to 1, it would suggest that volatility shocks last for a long time. If it were much smaller, volatility would return to normal more quickly.
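The persistence calculation can be sketched as follows: sum every ARCH and GARCH coefficient, then convert the sum into an approximate half-life, the number of months it takes for half of a volatility shock to fade. The vector coefs_demo below uses placeholder values, not the estimates from fit_garch32, and the half-life formula \(\ln(0.5)/\ln(\text{persistence})\) is the usual approximation rather than an exact result for higher-order models.

```r
# Placeholder coefficients, named the way rugarch labels them.
# These are NOT the fitted values; substitute coef(fit_garch32) in practice.
coefs_demo <- c(
  alpha1 = 0.10, alpha2 = 0.05, alpha3 = 0.05,
  beta1  = 0.40, beta2  = 0.30
)

# Persistence: sum of all ARCH (alpha) and GARCH (beta) coefficients
is_vol_coef <- grepl("^(alpha|beta)", names(coefs_demo))
persistence_demo <- sum(coefs_demo[is_vol_coef])

# Approximate half-life of a volatility shock, in periods (months here)
half_life_demo <- log(0.5) / log(persistence_demo)

print(persistence_demo)
print(half_life_demo)
```

The rugarch package also provides persistence() and halflife() functions that can be applied to a fitted model object, which avoids picking out the coefficients by name.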
The implied long-run (unconditional) variance is:
\[\bar{\sigma}^2 = \frac{\omega}{1 - (\alpha_1 + \alpha_2 + \alpha_3 + \beta_1 + \beta_2)}\]
This long-run variance represents the average level that the conditional variance returns to after shocks fade.
Overall, the model indicates that Microsoft monthly returns exhibit time-varying volatility, volatility clustering, and mean reversion. The GARCH framework improves on a constant-variance model because it allows risk to rise during turbulent periods and fall during calmer periods.
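As a small worked example of the long-run variance, the sketch below divides the variance intercept by one minus the sum of the ARCH and GARCH coefficients. The function long_run_variance() is a hypothetical helper written for this illustration. The omega value is the one reported in the parameter table, but the alpha and beta values are placeholders, so the printed number is not the fitted model's long-run variance.

```r
# Hypothetical helper: unconditional variance of a GARCH process,
# omega / (1 - sum(alpha) - sum(beta)), defined only when persistence < 1
long_run_variance <- function(omega, alpha, beta) {
  persistence <- sum(alpha) + sum(beta)
  if (persistence >= 1) {
    stop("Persistence is 1 or more, so the long-run variance is not finite.")
  }
  omega / (1 - persistence)
}

# omega from the parameter table; alpha and beta are placeholder values
lr_var_demo <- long_run_variance(
  omega = 4.814033,
  alpha = c(0.10, 0.05, 0.05),
  beta  = c(0.40, 0.30)
)

# Square root puts the result in the same units as the returns (percent)
print(lr_var_demo)
print(sqrt(lr_var_demo))
```

The guard against persistence of 1 or more reflects the mean-reversion condition: without it, the formula would return a negative or infinite value that has no interpretation as a variance.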