Introduction

Securities do not move like a random walk. We have seen that there is serial correlation (positive in the short-run and negative in the long-run). It is also commonly perceived that the variance of financial asset returns is not constant.

Use the package quantmod and get data for Bank of America.

require(quantmod)
getSymbols('BAC')

## [1] "BAC"

plot(ROC(Cl(BAC)))

As we know from experience and as can be seen from plot of BAC returns, in financial markets, there are periods of calm and there are periods of volatility.

We can compare this to the returns of a real random walk that we created last week.

ry = rnorm(10000)
plot(ry, type = 'l')

There are two ways to deal with changing volatility: attempt to model a variety of regimes; attempt to capture the autoregressive nature of the volatility.

Modelling regimes

In this case we categorise periods into different regimes: often calm and crisis. During the period of calm we may have small steady returns with a small standard deviation while the periods of crisis will be lower (possibly negative) returns on average with a much larger standard deviation. The crisis may also be more prone to non-symmetric distribution.

A skewed return is associated with crash risk and lottery tickets. Crash risk comes with negative skew and it is the risk that extremely negative events such as total wipe out can happen more frequently than would be expected with a normal distribution. Positive skew means that there is a small chance of very good things happening. Risk averse investors would avoid assets with negative skew while risk-loving should aim for positive skew assets.

Example

This is an example that is adapted from Eran Raviv. You can find his example on his very useful blog

# using quantmod library
library(quantmod)
sym <- c("SPY")
l <- length(sym)
end <- format(Sys.Date(), "%Y-%m-%d")
start <- format(as.Date("1990-01-01"), "%Y-%m-%d")
dat0 <- getSymbols(sym[1], src = 'yahoo', from = start, to = end, auto.assign = FALSE)
w0 <- dailyReturn(dat0)
colnames(w0) <- "SPYR"
head(w0)

##                     SPYR
## 1993-01-29 -0.0007107321
## 1993-02-01  0.0071123755
## 1993-02-02  0.0021186441
## 1993-02-03  0.0105708245
## 1993-02-04  0.0041841004
## 1993-02-05 -0.0006944444

tail(w0)

##                     SPYR
## 2025-04-21 -0.0238026802
## 2025-04-22  0.0260177376
## 2025-04-23  0.0154954631
## 2025-04-24  0.0210489333
## 2025-04-25  0.0072253237
## 2025-04-28  0.0003813035

Now we will use the mixtools package to estimate the regimes. You can get more details here about mixtools. This will use the Expectation-Maximisation model to find the most likely regimes from the data. It does this by iterating back and forth between random estimates for the parameters for the number of distributions that have been specified.

#install.packages('mixtools')
library(mixtools)

## mixtools package, version 2.0.0, Released 2022-12-04
## This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).

set.seed(123)
mix_mod <- normalmixEM(w0[, "SPYR"], k = 2, lambda = c(0.2, 0.8))

## number of iterations= 110

summary(mix_mod)

## summary of normalmixEM object:
##             comp 1     comp 2
## lambda 0.792600265  0.2073997
## mu     0.000942401 -0.0017617
## sigma  0.007079912  0.0218634
## loglik at estimate:  25513.95

This shows that the two regimes (let’s name them as Crisis and Calm) comprise 21% and 79% of the recorded returns. The expected return of the calm period is estimated at about 28% per year and a volatility of i11%. If you look back at other examples of expected return and volatility, you will see that this is extremely impressive. The expected return of the crisis regime is -34% with a volatility of 34%. However, when you look at the distribution, you will see that this does not tell the whole story.

Turn the estimated probability that each return is in either regime into a binary variable based on the most probable regime and plot the distribution of returns in each regime.

regime <- apply(mix_mod$posterior, 2, round)
plot(density(w0[regime[, 1] == 0]), col = rgb(0, 1, 0, 0.5), 
        xlim = c(min(w0), max(w0)), ylim = c(0, 70), 
     main = "Probability Distribution",  xlab = "Returns")
polygon(density(w0[regime[, 1] == 0]), col = rgb(0, 1, 0.8, 0.5))
lines(density(w0[regime[, 1] == 1]))
polygon(density(w0[regime[, 1] == 1]), col = rgb(.7, 0, 0, 0.5))     
legend('topleft', inset = 0.05, legend = c('Crisis', 'Calm'), 
       col = c(rgb(0, 1, 0, 0.5), rgb(0.7, 0, 0, 0.5)), cex = 0.8, 
       fill = c(rgb(0, 1, 0.8), rgb(1, 0, 0)))

You can read about using regimes to understand carry-trade risk in Hayward and Hölscher (2014) and Hayward and Hölscher (2017). Also available on My Studies.

Practice 10.1

Choose a firm, download the the daily data from 2000 to the present
Compare the results for two and three regimes
Which do you prefer? Why?

GARCH

GARCH will model the general autoregressive conditional hetroskedasticity. The two main authors here are Engle (1982) and Bollerslev (1986). GARCH is a more general form of the ARCH (Autoregressive hetroskedasticity) of the form:

\[\sigma_t^2 = \mu + \sum_{i=1}^q \alpha_i \varepsilon_{t - i}^2 + \sum_{i = 1}^p \beta_i \sigma_{t-i}^2\]

where \(\mu\) includes external regressors and auto-regressive elements to remove serial correlation and \(\varepsilon_t\) are the residuals once target series have been filtered (usually this will be the squared returns). The GARCH order is determined by \((q,p)\). \(P\) is the persistence calculated as:

\[P = \sum_{i = 1}^q \alpha_i + \sum_{i=1}^p \beta_i\]

This must be less than 1 outside the IGARCH model (where it is equal to one) and measures the persistence of volatility shocks. The other model constraints are:

\[1 > \alpha > 0\] \[1 > \beta > 0 \] The unconditional or long-run variance is given by:

\[ \sigma^2 = \frac{\mu}{1 - \alpha - \beta}\]

We will use the rugarch model. This is quite complex and sophisticated but it can be simplified as two processes. There is a more up-to-date package that has been developed by the authors of rugarch. This is called tsgarch. There is an overview of the package here:

Create the specification with the ugarchspec function. This allows you to model different types of GARCH, specify the underlying ARIMA model that will remove serial correlation and any seasonal effects and provide a lot of other customisation such as. You can mostly use the defaults to start.
Fit the chosen GARCH model with ugarchfit. This will use maximum-likelihood to estimate the chosen model.

Main GARCH models

There are four main GARCH models (though many other variations and subtle adjustments that can be made):

The basic GARCH (1,1) model has one autoregresssive (ARCH) term and one moving average (GARCH) component. This means that p and q equal one in Equation 1.
EGARCH is exponential GARCH and allows for asymmetric effects between positive and negative effective. This is also called the leverage effect as a result of the way that leverage positions have to be unwound in a downturn. Nelson (1990)
IGARCH is integrated GARCH where the autoregressive and moving average components sum to unity and any innovation in variance is persistent. This is where the \(\alpha\) and \(\beta\) coefficients sum to one.
GJR GARCH Glosten, Jagannathan, and Runkle (1993) also models asymmetric effects with \(\gamma\) representing an leverage term that takes 1 for positive values and zero for negative values.
Asymmetric power ARCH Ding, Granger, and Engle (1993) allows for both leverage and power effects.

Variance targetting

Variance targetting can simplify the estimation process. It is particularly useful with complex models. This will fix the model’s long-run (unconditional) variance to the sample variance of the data, reducing the number of parameters to be estimated. This will help with model stability.

Example

A plan to deal with the many possibilities that are available is to use the basic GARCH model, carry out the diagnostic tests and then see if they suggest an adjustment to this standard way of addressing conditional hetroskedasticity. This may show that you need additional lags, asymmetric effects or non-normal distributions.

From the rugarch package.

require(rugarch)
# start with the standard specification
spec <- ugarchspec()
show(spec)

## 
## *---------------------------------*
## *       GARCH Model Spec          *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## ------------------------------------
## GARCH Model      : sGARCH(1,1)
## Variance Targeting   : FALSE 
## 
## Conditional Mean Dynamics
## ------------------------------------
## Mean Model       : ARFIMA(1,0,1)
## Include Mean     : TRUE 
## GARCH-in-Mean        : FALSE 
## 
## Conditional Distribution
## ------------------------------------
## Distribution :  norm 
## Includes Skew    :  FALSE 
## Includes Shape   :  FALSE 
## Includes Lambda  :  FALSE

# use S&P 500 return data that comes with the package
data(sp500ret)
# fit a standard model with the default specifications
fit <- ugarchfit(spec = spec, data = sp500ret)
show(fit)

## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,1)
## Mean Model   : ARFIMA(1,0,1)
## Distribution : norm 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.000523    0.000087   5.9926  0.00000
## ar1     0.870077    0.072183  12.0538  0.00000
## ma1    -0.897302    0.064617 -13.8865  0.00000
## omega   0.000001    0.000001   1.3971  0.16239
## alpha1  0.087711    0.013658   6.4221  0.00000
## beta1   0.904940    0.013704  66.0350  0.00000
## 
## Robust Standard Errors:
##         Estimate  Std. Error    t value Pr(>|t|)
## mu      0.000523    0.000129   4.038545 0.000054
## ar1     0.870077    0.088201   9.864737 0.000000
## ma1    -0.897302    0.080362 -11.165757 0.000000
## omega   0.000001    0.000014   0.094161 0.924982
## alpha1  0.087711    0.185043   0.474004 0.635497
## beta1   0.904940    0.190544   4.749240 0.000002
## 
## LogLikelihood : 17902.41 
## 
## Information Criteria
## ------------------------------------
##                     
## Akaike       -6.4807
## Bayes        -6.4735
## Shibata      -6.4807
## Hannan-Quinn -6.4782
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                         statistic   p-value
## Lag[1]                      5.553 1.845e-02
## Lag[2*(p+q)+(p+q)-1][5]     6.444 1.225e-05
## Lag[4*(p+q)+(p+q)-1][9]     7.200 1.100e-01
## d.o.f=2
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                      1.105  0.2933
## Lag[2*(p+q)+(p+q)-1][5]     1.499  0.7401
## Lag[4*(p+q)+(p+q)-1][9]     1.958  0.9100
## d.o.f=2
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale P-Value
## ARCH Lag[3]   0.01981 0.500 2.000  0.8881
## ARCH Lag[5]   0.17496 1.440 1.667  0.9713
## ARCH Lag[7]   0.53652 2.315 1.543  0.9750
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  174.2382
## Individual Statistics:              
## mu      0.2054
## ar1     0.1482
## ma1     0.1051
## omega  21.3506
## alpha1  0.1346
## beta1   0.1134
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          1.49 1.68 2.12
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value      prob sig
## Sign Bias           0.4294 6.676e-01    
## Negative Sign Bias  2.9494 3.198e-03 ***
## Positive Sign Bias  2.3922 1.678e-02  **
## Joint Effect       28.9851 2.256e-06 ***
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20     178.6    5.738e-28
## 2    30     188.3    2.924e-25
## 3    40     217.8    1.084e-26
## 4    50     227.7    7.646e-25
## 
## 
## Elapsed time : 1.128303

Diagnostic tests

In addition to the estimates of the coefficient estimates, the ugarchfit function will return a number of diagnostics. As usual, these are based on the requirement that the remaining error represented by residuals from the estimated model have a constant mean and variance and no serial correlation.

Log-Likelihood ratios
AIC, BIC, HQIC and SIC tests of fit (these will deflate the Log-Likelihood for additional variables).
Autoregressive and Arch tests of residuals. Null hypothesis is that there is no serial correlation or hetroskedasticity in the residuals.
Nybold stability tests the stability of the estimate coefficients. Null is that the parameters are constant.
The signbias tests will assess asymmetric GARCH effects. Null is that there are no effects.
Goodness of fit (gof) will compare the density of the residuals with those of the specification.
Adjusted Pearson Goodness of Fit compares the distribution of the residuals with the theoretical distribution. The null is that the theoretical and null are identical.

Update the GARCH model

Now respond to the diagnostic statistics by:

adding more autoregressive/moving average elements
allowing for asymmetric effects
adjusting the expected distribution

In this case of the S&P 500 it appears that more lags are required, asymmetric responses are needed and fatter tails are required. You can see below that the specification is now apARCH, the distribution is std which is student-t-distribution (fatter tails) and the armaOrder is 2,2, with two auto-regressive and two moving average components.

spec <- ugarchspec(variance.model = list('apARCH'),
                                      distribution.model = 'std',
                   mean.model = list(armaOrder = c(2,2)))
fit <- ugarchfit(spec = spec, sp500ret)
show(fit)

## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,1)
## Mean Model   : ARFIMA(2,0,2)
## Distribution : std 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error     t value Pr(>|t|)
## mu      0.000605    0.000089      6.7876 0.000000
## ar1     1.734997    0.006153    281.9765 0.000000
## ar2    -0.877288    0.006214   -141.1870 0.000000
## ma1    -1.755679    0.000043 -41170.7160 0.000000
## ma2     0.882688    0.000042  21107.5077 0.000000
## omega   0.000001    0.000000      2.3452 0.019015
## alpha1  0.063080    0.004490     14.0485 0.000000
## beta1   0.934264    0.004206    222.1121 0.000000
## shape   5.933294    0.453517     13.0828 0.000000
## 
## Robust Standard Errors:
##         Estimate  Std. Error     t value Pr(>|t|)
## mu      0.000605    0.000089  6.8053e+00 0.000000
## ar1     1.734997    0.005505  3.1517e+02 0.000000
## ar2    -0.877288    0.005363 -1.6358e+02 0.000000
## ma1    -1.755679    0.000038 -4.5728e+04 0.000000
## ma2     0.882688    0.000038  2.3457e+04 0.000000
## omega   0.000001    0.000001  8.7254e-01 0.382914
## alpha1  0.063080    0.020232  3.1179e+00 0.001821
## beta1   0.934264    0.018761  4.9797e+01 0.000000
## shape   5.933294    0.625160  9.4908e+00 0.000000
## 
## LogLikelihood : 18119.93 
## 
## Information Criteria
## ------------------------------------
##                     
## Akaike       -6.5584
## Bayes        -6.5476
## Shibata      -6.5584
## Hannan-Quinn -6.5546
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                          statistic p-value
## Lag[1]                       2.951 0.08582
## Lag[2*(p+q)+(p+q)-1][11]     5.496 0.79562
## Lag[4*(p+q)+(p+q)-1][19]     9.475 0.55734
## d.o.f=4
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                      4.400 0.03595
## Lag[2*(p+q)+(p+q)-1][5]     6.288 0.07688
## Lag[4*(p+q)+(p+q)-1][9]     6.890 0.20858
## d.o.f=2
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale P-Value
## ARCH Lag[3]    0.8179 0.500 2.000  0.3658
## ARCH Lag[5]    0.8270 1.440 1.667  0.7849
## ARCH Lag[7]    1.0441 2.315 1.543  0.9063
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  672.6332
## Individual Statistics:                
## mu       0.13580
## ar1      0.04676
## ar2      0.11618
## ma1      0.04294
## ma2      0.07678
## omega  172.33133
## alpha1   1.11176
## beta1    0.74772
## shape    2.01813
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          2.1 2.32 2.82
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value      prob sig
## Sign Bias           0.2926 7.699e-01    
## Negative Sign Bias  3.8169 1.366e-04 ***
## Positive Sign Bias  2.0077 4.472e-02  **
## Joint Effect       33.5364 2.482e-07 ***
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20     57.78    8.633e-06
## 2    30     73.53    9.826e-06
## 3    40     91.49    4.187e-06
## 4    50    107.57    2.824e-06
## 
## 
## Elapsed time : 2.243246

The estimated volatility

plot(fit@fit$sigma, type = 'l', main = "Estimated volatility for S&P 500")

Forecasting

it is also possible to use the preferred model to forecast using the ugarchforecast function.

forecast <- ugarchforecast(fit, n.head = 5)
plot(forecast, which = 1)

plot(forecast, which = 3)

Practice 10.2

Get the returns for an asset of your choice
Fit a GARCH model, starting with the basic and then adjust to try to ensure that model is well-behaved.
What range is forecast for the next 5 days?

GARCH

Rob Hayward

Introduction

Modelling regimes

Example

Practice 10.1

GARCH

Main GARCH models

Variance targetting

Example

Diagnostic tests

Update the GARCH model

Forecasting

Practice 10.2

Strategy

Bibliography