Module 14 Discussion

Author

Teddy Kelly

Part I

ARIMA

ARIMA models forecast future values using only lags of the dependent variable and lags of the forecast errors. Unlike ADL, these forecasts do not use any other variables.
ARIMA predicts the expected value of the dependent variable from its own past observations and past errors, so it captures persistence in the mean. This differs from the ARCH and GARCH models, which focus on persistence in the volatility of a time series.

ARIMA(p,d,q) estimating equation:

\[y_t=c+\phi_1\cdot y_{t-1}+...+\phi_p\cdot y_{t-p}+\theta_1\cdot\varepsilon_{t-1}+...+\theta_q\cdot \varepsilon_{t-q}+\varepsilon_t\]

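$y_{t-1},...,y_{t-p}$ represent the lagged values of the dependent variable and $\varepsilon_{t-1},...,\varepsilon_{t-q}$ represent the lagged forecast errors of the model. When $d>0$, $y_t$ here denotes the series after it has been differenced $d$ times.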
$c$, $\phi_1,...,\phi_p$, and $\theta_1,...,\theta_q$ are unknown parameters that must be estimated.
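
As a minimal sketch of the estimating equation above, the following base R code simulates a series with known parameters and then recovers them with arima(). The parameter values (0.6 and 0.3), the sample size, and the seed are illustrative assumptions and do not come from any dataset in this discussion.

```r
# Simulate an ARMA(1,1) series with known (assumed) parameters:
# phi_1 = 0.6 and theta_1 = 0.3.
set.seed(123)
y <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 2000)

# Fit ARIMA(1,0,1): p = 1 lag of y, d = 0 differences, q = 1 lagged error.
fit <- arima(y, order = c(1, 0, 1))

# ar1 estimates phi_1 and ma1 estimates theta_1. Note that the term
# arima() labels "intercept" is the estimated mean of the series.
phi_hat <- unname(coef(fit)["ar1"])
theta_hat <- unname(coef(fit)["ma1"])
print(coef(fit))

# One-step-ahead forecast of the expected value of y.
print(predict(fit, n.ahead = 1)$pred)
```

With a sample this long, the estimates should land close to the values used in the simulation, which is a quick check that the fitted model matches the equation.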

ADL

ADL stands for Auto-regressive Distributed Lag and is similar to ARIMA in that it also uses lagged values of the dependent variable to make forecasts.
However, unlike ARIMA, ADL also uses current and lagged values of an external variable to generate forecasts, so its focus is on capturing persistence in the mean across multiple variables.
ADL also forecasts the expected value of the dependent variable in the future just like ARIMA and does not forecast the conditional variance like ARCH or GARCH.

ADL estimating equation:

\[y_t=c+\phi_1\cdot y_{t-1}+...+\phi_p\cdot y_{t-p}+\beta_0\cdot x_{t}+...+\beta_q\cdot x_{t-q}+\varepsilon_t\]

The ADL estimating equation includes the same lagged dependent variable terms, $y_{t-1},...,y_{t-p}$, as in the ARIMA estimating equation.
The $x_t,...,x_{t-q}$ terms represent the current and lagged values of the external regressor that are used to forecast the expected value of $y_t$.
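
The ADL equation can be estimated by ordinary least squares once the lags are lined up. The sketch below uses simulated data and base R only. The coefficient values, the variable names (adl_data, y_lag1, x_lag1), and the choice of one lag of each variable are assumptions made for illustration.

```r
# Simulate an ADL(1,1) process with assumed coefficients:
# y_t = 1 + 0.5*y_{t-1} + 0.8*x_t + 0.3*x_{t-1} + e_t
set.seed(42)
n <- 500
x <- rnorm(n)
y <- numeric(n)
for (i in 2:n) {
  y[i] <- 1 + 0.5 * y[i - 1] + 0.8 * x[i] + 0.3 * x[i - 1] + rnorm(1, sd = 0.5)
}

# Align y_t with y_{t-1}, x_t and x_{t-1}; the first observation is
# dropped because it has no lagged values.
adl_data <- data.frame(
  y      = y[2:n],
  y_lag1 = y[1:(n - 1)],
  x      = x[2:n],
  x_lag1 = x[1:(n - 1)]
)

# Estimate the ADL equation by OLS.
adl_fit <- lm(y ~ y_lag1 + x + x_lag1, data = adl_data)
print(round(coef(adl_fit), 2))
```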

ARCH

ARCH stands for Auto-regressive Conditional Heteroskedasticity and it forecasts the expected variability of a time series using past values of squared errors.
Its focus, like GARCH, is on explaining the persistence in the volatility of a time series.
The limitation of ARCH is that if volatility persists, then ARCH must use many parameters to capture that persistence, making it inefficient.

ARCH estimating equations

Mean estimating equation:

\[y_t=\beta_0+u_t\]

where $u_t \sim \mathcal{N}(0,\sigma^2_t)$

$\beta_0$ represents the average value of the dependent variable $y$. This is essentially the mean of the time series.
$u_t$ represents the error (shock) in period $t$; it is normally distributed around zero with conditional variance $\sigma^2_t$.

Variance Estimating Equation

\[\sigma^2_t=\alpha_0+\alpha_1\cdot u^2_{t-1}+...+\alpha_p\cdot u^2_{t-p}\]

$u^2_{t-1},...,u^2_{t-p}$ represent the lagged squared errors (shocks). Squaring the errors lets us gauge the magnitude of recent volatility regardless of the sign of the shock.
$\alpha_0$, $\alpha_1,...,\alpha_p$ are unknown parameters: $\alpha_0$ is the constant (baseline) term of the variance equation, and $\alpha_1,...,\alpha_p$ measure how strongly volatility reacts to past shocks. The long-run average variance is $\alpha_0/(1-\alpha_1-...-\alpha_p)$, not $\alpha_0$ itself.
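
To see what the variance equation implies, the sketch below simulates an ARCH(1) process in base R. The parameter values ($\alpha_0=1$, $\alpha_1=0.3$), the sample size, and the seed are assumptions chosen for illustration. The errors themselves should look uncorrelated, while the squared errors should be positively autocorrelated, which is the persistence in volatility that ARCH is built to capture.

```r
# Simulate an ARCH(1) process with assumed parameters.
set.seed(1)
n <- 5000
alpha0 <- 1
alpha1 <- 0.3

u <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- alpha0 / (1 - alpha1)  # start at the long-run variance
u[1] <- rnorm(1, sd = sqrt(sigma2[1]))

for (i in 2:n) {
  # Variance equation: today's variance depends on yesterday's squared shock.
  sigma2[i] <- alpha0 + alpha1 * u[i - 1]^2
  u[i] <- rnorm(1, sd = sqrt(sigma2[i]))
}

# Lag-1 autocorrelation of the shocks and of the squared shocks.
rho_level <- acf(u, lag.max = 1, plot = FALSE)$acf[2]
rho_sq <- acf(u^2, lag.max = 1, plot = FALSE)$acf[2]
print(c(level = rho_level, squared = rho_sq))
```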

GARCH

GARCH stands for Generalized Auto-regressive Conditional Heteroskedasticity and it also models the conditional variance of a time series based on past errors.
However, in addition to using squared errors like ARCH, GARCH solves the problem ARCH faces of needing many lags by also including lagged variances to model the conditional variance.

GARCH estimating equations

Mean estimating equation:

\[y_t=\beta_0+u_t\]

where $u_t \sim \mathcal{N}(0,\sigma^2_t)$

Variance estimating equation:

\[\sigma^2_t=\alpha_0+\alpha_1\cdot u^2_{t-1}+...+\alpha_p\cdot u^2_{t-p} +\phi_1\cdot \sigma^2_{t-1}+...+\phi_q\cdot \sigma^2_{t-q}\]

In addition to the variables in the ARCH model, the GARCH model includes $\sigma^2_{t-1},...,\sigma^2_{t-q}$ which represent the lagged conditional variances.
$\phi_1,...,\phi_q$ are unknown parameters that we must estimate; they measure how much of the conditional variance from previous periods persists into the current period.
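
A short derivation shows why the lagged variance removes the need for many ARCH lags. Taking the GARCH(1,1) case, with one lag of each term, and assuming $0\le\phi_1<1$, substitute the variance equation into itself once:

\[\sigma^2_t=\alpha_0(1+\phi_1)+\alpha_1\cdot u^2_{t-1}+\alpha_1\cdot\phi_1\cdot u^2_{t-2}+\phi_1^2\cdot\sigma^2_{t-2}\]

Repeating the substitution indefinitely gives:

\[\sigma^2_t=\frac{\alpha_0}{1-\phi_1}+\alpha_1\sum_{j=1}^{\infty}\phi_1^{\,j-1}\cdot u^2_{t-j}\]

So a GARCH(1,1) model behaves like an ARCH model with infinitely many lags whose weights decay geometrically, yet it only requires three variance parameters.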

Part II

I have decided to build a GARCH model on Apple (AAPL) stock price data retrieved with the tidyquant package.

rm(list=ls())
library(fpp3)
library(tidyquant)
library(fGarch)

apple <- tq_get('AAPL',
                get = 'stock.prices',
                from = '1995-01-01')

# Converting the Apple dataset into a monthly time series and calculating 
# the monthly percentage change in Apple's closing stock price

apple_monthly <- apple |>
  group_by(year_month = floor_date(x = date, unit = "month")) |>
  summarise(close = last(close), .groups = "drop") |>
  mutate(
    pct_change = (close - lag(close)) / lag(close) * 100
  ) |> filter(!is.na(pct_change))

# Plotting Apple's monthly change time series
ggplot(data = apple_monthly,
       mapping = aes(x = year_month, y = pct_change)) +
  geom_line() +
  labs(title = "Monthly Percentage Change in Apple's Closing Stock Price",
       x = "Time (Months)",
       y = "Monthly Percentage Price Change")

Before fitting the GARCH(1,1) model to the monthly Apple time series, I have included the mean and variance estimating equations below:

Mean estimating equation:

\[y_t=\beta_0+u_t\]

where $u_t\sim \mathcal{N}(0,\sigma^2_t)$

$\beta_0$ represents the mean of the monthly percentage change in Apple’s closing price. This is the value that the percentage change generally hovers around. Looking at the graph above, it appears this value is about zero and we should expect to obtain a similar estimate from running the garchFit model.
$u_t$ is the error term and it is normally distributed around zero with conditional variance of $\sigma^2_t$. We will try to model this conditional variance term using the estimating equation seen below.

Variance estimating equation:

\[\sigma^2_t=\alpha_0+\alpha_1\cdot u^2_{t-1}+\phi_1\cdot \sigma^2_{t-1}\]

$\alpha_0$ is the constant term of the variance equation. It sets the baseline level of the conditional variance of the monthly change in Apple’s closing stock price; the long-run average variance is $\alpha_0/(1-\alpha_1-\phi_1)$.
$\alpha_1$ measures how strongly the conditional variance reacts to the squared shock from the most recent period.
$\phi_1$ represents the persistence of volatility from the most recent period.

# Fitting a GARCH(1,1) model (garchFit() uses formula = ~ garch(1, 1) by default)

garch_apple <- garchFit(
  data = apple_monthly$pct_change,
  trace = FALSE
)

summary(garch_apple)


Title:
 GARCH Modelling 

Call:
 garchFit(data = apple_monthly$pct_change, trace = FALSE) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x13d5cc0a8>
 [data = apple_monthly$pct_change]

Conditional Distribution:
 norm 

Coefficient(s):
     mu    omega   alpha1    beta1  
2.17211  2.06940  0.11036  0.87488  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu       2.17211     0.47250    4.597 4.28e-06 ***
omega    2.06940     1.96464    1.053  0.29219    
alpha1   0.11036     0.04126    2.675  0.00748 ** 
beta1    0.87488     0.04716   18.551  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 -1408.266    normalized:  -3.755375 

Description:
 Wed Apr 29 16:36:18 2026 by user:  



Standardised Residuals Tests:
                                 Statistic   p-Value
 Jarque-Bera Test   R    Chi^2   4.5709983 0.1017233
 Shapiro-Wilk Test  R    W       0.9955703 0.3692608
 Ljung-Box Test     R    Q(10)   6.0391195 0.8119660
 Ljung-Box Test     R    Q(15)  10.7265704 0.7717255
 Ljung-Box Test     R    Q(20)  13.0892501 0.8735220
 Ljung-Box Test     R^2  Q(10)  12.3658664 0.2613163
 Ljung-Box Test     R^2  Q(15)  15.4981373 0.4161656
 Ljung-Box Test     R^2  Q(20)  21.1570884 0.3879389
 LM Arch Test       R    TR^2   13.5734983 0.3287684

Information Criterion Statistics:
     AIC      BIC      SIC     HQIC 
7.532084 7.573971 7.531859 7.548713

Coefficient Interpretations

We have trained the GARCH(1,1) model and we can see the model coefficients above.

mu: This corresponds to the $\beta_0$ parameter from the mean equation, so $\beta_0=2.17$.
- Interpretation: As mentioned earlier, $\beta_0$ represents the mean of the time series. Therefore, the mean monthly percentage change in Apple’s closing price is about 2.17%.
omega: This corresponds to the $\alpha_0$ parameter from the variance equation, so $\alpha_0=2.07$.
- $\alpha_0$ is the constant term of the variance equation, the baseline that enters the conditional variance every month. It is not itself the long-run average variance, which also depends on $\alpha_1$ and $\phi_1$. This estimate is not statistically significant (p-value of 0.29).
alpha1: As expected, this corresponds to the $\alpha_1$ parameter in the variance equation, so $\alpha_1=0.11$.
- This represents the effect of the most recent shock on the conditional variance of the current period.
- A value of $\alpha_1=0.11$ means that about 11% of the previous month’s squared shock carries into this month’s conditional variance.
beta1: This corresponds to the $\phi_1$ parameter from the variance equation, so $\phi_1=0.87$.
- $\phi_1$ represents the persistence of volatility from the previous period. The value of 0.87 means that 87% of last month’s conditional variance persists into the current month, which is fairly high.
- A high value means that volatility decays slowly after a shock occurs.
- This can be seen in the graph from earlier, where the dot-com bubble caused high volatility that persisted for some time.

Notation Differences and Estimating Equations

Notice the difference between the notation used by the software and the notation that I have used in my estimating equations: mu, omega, alpha1, and beta1 in the garchFit output correspond to $\beta_0$, $\alpha_0$, $\alpha_1$, and $\phi_1$, respectively. The notation in the estimating equations is consistent with the Introduction to Econometrics with R textbook.

We can now rewrite the mean and variance estimating equations for the GARCH(1,1) model as follows:

Mean estimating equation:

\[y_t=2.17+u_t\]

where $u_t\sim \mathcal{N}(0,\sigma^2_t)$

Variance estimating equation:

\[\sigma^2_t=2.07+0.11\cdot u^2_{t-1}+0.87\cdot \sigma^2_{t-1}\]
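
To make the fitted variance equation concrete, the sketch below evaluates it for two hypothetical months. The function name garch11_next_var and the input values (a previous conditional variance of 100, and shocks of 0 and 20 percentage points) are assumptions chosen for illustration; only the three coefficients come from the model output.

```r
# One-step update of the conditional variance using the rounded
# GARCH(1,1) estimates reported above.
garch11_next_var <- function(u_prev, sigma2_prev,
                             alpha0 = 2.07, alpha1 = 0.11, phi1 = 0.87) {
  alpha0 + alpha1 * u_prev^2 + phi1 * sigma2_prev
}

# Hypothetical calm month: no shock, previous variance of 100.
calm <- garch11_next_var(u_prev = 0, sigma2_prev = 100)
# 2.07 + 0.11 * 0 + 0.87 * 100 = 89.07

# Hypothetical turbulent month: a 20-point shock, same previous variance.
shock <- garch11_next_var(u_prev = 20, sigma2_prev = 100)
# 2.07 + 0.11 * 400 + 0.87 * 100 = 133.07

print(c(calm = calm, shock = shock))
```

In this example the large shock raises next month's conditional variance by about half relative to the calm case, while most of the level in both cases is inherited from the previous month's variance.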

Comments on Volatility clustering, Persistence, Long-Run Variance, and Mean Reversion

Persistence

Adding together the parameters $\alpha_1=0.11$ and $\phi_1=0.87$ gives us the persistence of shocks. $\alpha_1+\phi_1=0.98$, which is extremely high, indicating that shocks have very long-lasting effects on the volatility of Apple’s monthly percentage change in closing price.

Volatility Clustering

Volatility clustering refers to the tendency for periods of high volatility to be clustered together and for periods of low volatility to be clustered together.
In this case, volatility clustering is certainly present, as seen in the high value of 0.98 for the persistence in volatility. Months with high volatility will be followed by subsequent months of elevated volatility.

Long-Run Variance

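Calculating the following expression gives us the long-run (unconditional) variance of the time series:

\[\frac{\alpha_0}{1-\alpha_1-\phi_1}\]

From the GARCH(1,1) model, we found that $\alpha_0=2.07$, $\alpha_1=0.11$, and $\phi_1=0.87$, giving us the following output:

\[\frac{2.07}{1-0.11-0.87}=\frac{2.07}{0.02}=103.5\]

Therefore, the long-run variance in the time series is about 103.5, which corresponds to a standard deviation of roughly 10.2 percentage points per month. Because the denominator is so close to zero, this figure is sensitive to rounding of the coefficients.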

Mean Reversion

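Because $\alpha_1+\phi_1=0.98$ is less than one, the conditional variance of the monthly percentage change in Apple’s closing price is mean-reverting: after a shock, it drifts back toward its long-run level. That reversion will occur at a slow rate because of the high persistence in volatility that we have found from running the GARCH model.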
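
The quantities discussed in this section can be computed directly from the coefficients in the garchFit output. The sketch below uses the unrounded estimates, so its results differ from calculations based on two-decimal values. The half-life formula, $\ln(0.5)/\ln(\alpha_1+\phi_1)$, is a standard way to summarize the speed of mean reversion in a GARCH(1,1) model and is an addition here rather than something taken from the output.

```r
# Unrounded coefficient estimates copied from the garchFit output.
omega  <- 2.06940   # alpha_0 in the estimating equation
alpha1 <- 0.11036   # alpha_1
beta1  <- 0.87488   # phi_1

# Persistence of shocks to the conditional variance.
persistence <- alpha1 + beta1

# Long-run (unconditional) variance and standard deviation.
long_run_var <- omega / (1 - persistence)
long_run_sd <- sqrt(long_run_var)

# Number of months for a variance shock to decay by half.
half_life <- log(0.5) / log(persistence)

print(round(c(persistence = persistence,
              long_run_var = long_run_var,
              long_run_sd = long_run_sd,
              half_life = half_life), 2))
```

With these inputs the persistence is about 0.985, the long-run variance is about 140, and the half-life is roughly 47 months, which quantifies how slowly volatility returns to its long-run level.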