Introduction
Securities do not move like a random walk. We have seen that there is
serial correlation (positive in the short-run and
negative in the long-run). It is also commonly perceived that the
variance of financial asset returns is not constant.
Use the package quantmod and get data for Bank of
America.
require(quantmod)
getSymbols('BAC')
## [1] "BAC"
plot(ROC(Cl(BAC)))

As we know from experience and as can be seen from plot of BAC
returns, in financial markets, there are periods of calm and there are
periods of volatility.
We can compare this to the returns of a real random walk that we
created last week.
ry = rnorm(10000)
plot(ry, type = 'l')

There are two ways to deal with changing volatility: attempt to model
a variety of regimes; attempt to capture the autoregressive nature of
the volatility.
Modelling regimes
In this case we categorise periods into different regimes: often calm
and crisis. During the period of calm we may have small steady returns
with a small standard deviation while the periods of crisis will be
lower (possibly negative) returns on average with a much larger standard
deviation. The crisis may also be more prone to non-symmetric
distribution.
A skewed return is associated with crash risk and
lottery tickets. Crash risk comes with negative skew
and it is the risk that extremely negative events such as total wipe
out can happen more frequently than would be expected with a normal
distribution. Positive skew means that there is a small chance of very
good things happening. Risk averse investors would avoid assets with
negative skew while risk-loving should aim for positive skew assets.
Example
This is an example that is adapted from Eran Raviv. You can find his
example on his
very useful blog
# using quantmod library
library(quantmod)
sym <- c("SPY")
l <- length(sym)
end <- format(Sys.Date(), "%Y-%m-%d")
start <- format(as.Date("1990-01-01"), "%Y-%m-%d")
dat0 <- getSymbols(sym[1], src = 'yahoo', from = start, to = end, auto.assign = FALSE)
w0 <- dailyReturn(dat0)
colnames(w0) <- "SPYR"
head(w0)
## SPYR
## 1993-01-29 -0.0007107321
## 1993-02-01 0.0071123755
## 1993-02-02 0.0021186441
## 1993-02-03 0.0105708245
## 1993-02-04 0.0041841004
## 1993-02-05 -0.0006944444
tail(w0)
## SPYR
## 2025-04-21 -0.0238026802
## 2025-04-22 0.0260177376
## 2025-04-23 0.0154954631
## 2025-04-24 0.0210489333
## 2025-04-25 0.0072253237
## 2025-04-28 0.0003813035
Now we will use the mixtools package to estimate the
regimes. You can get more details here about mixtools.
This will use the Expectation-Maximisation model to
find the most likely regimes from the data. It does this by iterating
back and forth between random estimates for the parameters for the
number of distributions that have been specified.
#install.packages('mixtools')
library(mixtools)
## mixtools package, version 2.0.0, Released 2022-12-04
## This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
set.seed(123)
mix_mod <- normalmixEM(w0[, "SPYR"], k = 2, lambda = c(0.2, 0.8))
## number of iterations= 110
summary(mix_mod)
## summary of normalmixEM object:
## comp 1 comp 2
## lambda 0.792600265 0.2073997
## mu 0.000942401 -0.0017617
## sigma 0.007079912 0.0218634
## loglik at estimate: 25513.95
This shows that the two regimes (let’s name them as Crisis
and Calm) comprise 21% and 79% of the recorded returns. The
expected return of the calm period is estimated at about 28% per year
and a volatility of i11%. If you look back at other examples of expected
return and volatility, you will see that this is extremely impressive.
The expected return of the crisis regime is -34% with a volatility of
34%. However, when you look at the distribution, you will see that this
does not tell the whole story.
Turn the estimated probability that each return is in either regime
into a binary variable based on the most probable regime and plot the
distribution of returns in each regime.
regime <- apply(mix_mod$posterior, 2, round)
plot(density(w0[regime[, 1] == 0]), col = rgb(0, 1, 0, 0.5),
xlim = c(min(w0), max(w0)), ylim = c(0, 70),
main = "Probability Distribution", xlab = "Returns")
polygon(density(w0[regime[, 1] == 0]), col = rgb(0, 1, 0.8, 0.5))
lines(density(w0[regime[, 1] == 1]))
polygon(density(w0[regime[, 1] == 1]), col = rgb(.7, 0, 0, 0.5))
legend('topleft', inset = 0.05, legend = c('Crisis', 'Calm'),
col = c(rgb(0, 1, 0, 0.5), rgb(0.7, 0, 0, 0.5)), cex = 0.8,
fill = c(rgb(0, 1, 0.8), rgb(1, 0, 0)))

You can read about using regimes to understand carry-trade
risk in Hayward and Hölscher
(2014) and Hayward and Hölscher
(2017). Also available on My Studies.
Practice 10.1
- Choose a firm, download the the daily data from 2000 to the
present
- Compare the results for two and three regimes
- Which do you prefer? Why?
GARCH
GARCH will model the general autoregressive conditional
hetroskedasticity. The two main authors here are Engle (1982) and Bollerslev (1986). GARCH is a more general form
of the ARCH (Autoregressive hetroskedasticity) of the form:
\[\sigma_t^2 = \mu + \sum_{i=1}^q \alpha_i
\varepsilon_{t - i}^2 + \sum_{i = 1}^p \beta_i
\sigma_{t-i}^2\]
where \(\mu\) includes external
regressors and auto-regressive elements to remove serial correlation and
\(\varepsilon_t\) are the residuals
once target series have been filtered (usually this will be the squared
returns). The GARCH order is determined by \((q,p)\). \(P\) is the persistence calculated as:
\[P = \sum_{i = 1}^q \alpha_i +
\sum_{i=1}^p \beta_i\]
This must be less than 1 outside the IGARCH model (where it is equal
to one) and measures the persistence of volatility shocks. The other
model constraints are:
\[1 > \alpha > 0\] \[1 > \beta > 0 \] The
unconditional or long-run variance is given by:
\[ \sigma^2 = \frac{\mu}{1 - \alpha -
\beta}\]
We will use the rugarch model. This is quite complex and
sophisticated but it can be simplified as two processes. There is a more
up-to-date package that has been developed by the authors of
rugarch. This is called tsgarch. There is an
overview of the package here:
Create the specification with the ugarchspec
function. This allows you to model different types of GARCH, specify the
underlying ARIMA model that will remove serial correlation and any
seasonal effects and provide a lot of other customisation such as. You
can mostly use the defaults to start.
Fit the chosen GARCH model with ugarchfit. This will
use maximum-likelihood to estimate the chosen
model.
Main GARCH models
There are four main GARCH models (though many other variations and
subtle adjustments that can be made):
The basic GARCH (1,1) model has one
autoregresssive (ARCH) term and one moving average (GARCH) component.
This means that p and q equal one in Equation 1.
EGARCH is exponential GARCH and allows for
asymmetric effects between positive and negative effective. This is also
called the leverage effect as a result of the way that leverage
positions have to be unwound in a downturn. Nelson (1990)
IGARCH is integrated GARCH where the
autoregressive and moving average components sum to unity and any
innovation in variance is persistent. This is where the \(\alpha\) and \(\beta\) coefficients sum to one.
GJR GARCH Glosten,
Jagannathan, and Runkle (1993) also models asymmetric effects
with \(\gamma\) representing an
leverage term that takes 1 for positive values and zero
for negative values.
Asymmetric power ARCH Ding, Granger, and Engle (1993) allows for both
leverage and power effects.
Variance targetting
Variance targetting can simplify the estimation process. It is
particularly useful with complex models. This will fix the model’s
long-run (unconditional) variance to the sample variance of the data,
reducing the number of parameters to be estimated. This will help with
model stability.
Example
A plan to deal with the many possibilities that are available is to
use the basic GARCH model, carry out the diagnostic tests and then see
if they suggest an adjustment to this standard way of addressing
conditional hetroskedasticity. This may show that you need additional
lags, asymmetric effects or non-normal distributions.
From the rugarch package.
require(rugarch)
# start with the standard specification
spec <- ugarchspec()
show(spec)
##
## *---------------------------------*
## * GARCH Model Spec *
## *---------------------------------*
##
## Conditional Variance Dynamics
## ------------------------------------
## GARCH Model : sGARCH(1,1)
## Variance Targeting : FALSE
##
## Conditional Mean Dynamics
## ------------------------------------
## Mean Model : ARFIMA(1,0,1)
## Include Mean : TRUE
## GARCH-in-Mean : FALSE
##
## Conditional Distribution
## ------------------------------------
## Distribution : norm
## Includes Skew : FALSE
## Includes Shape : FALSE
## Includes Lambda : FALSE
# use S&P 500 return data that comes with the package
data(sp500ret)
# fit a standard model with the default specifications
fit <- ugarchfit(spec = spec, data = sp500ret)
show(fit)
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,1)
## Mean Model : ARFIMA(1,0,1)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.000523 0.000087 5.9926 0.00000
## ar1 0.870077 0.072183 12.0538 0.00000
## ma1 -0.897302 0.064617 -13.8865 0.00000
## omega 0.000001 0.000001 1.3971 0.16239
## alpha1 0.087711 0.013658 6.4221 0.00000
## beta1 0.904940 0.013704 66.0350 0.00000
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.000523 0.000129 4.038545 0.000054
## ar1 0.870077 0.088201 9.864737 0.000000
## ma1 -0.897302 0.080362 -11.165757 0.000000
## omega 0.000001 0.000014 0.094161 0.924982
## alpha1 0.087711 0.185043 0.474004 0.635497
## beta1 0.904940 0.190544 4.749240 0.000002
##
## LogLikelihood : 17902.41
##
## Information Criteria
## ------------------------------------
##
## Akaike -6.4807
## Bayes -6.4735
## Shibata -6.4807
## Hannan-Quinn -6.4782
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 5.553 1.845e-02
## Lag[2*(p+q)+(p+q)-1][5] 6.444 1.225e-05
## Lag[4*(p+q)+(p+q)-1][9] 7.200 1.100e-01
## d.o.f=2
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 1.105 0.2933
## Lag[2*(p+q)+(p+q)-1][5] 1.499 0.7401
## Lag[4*(p+q)+(p+q)-1][9] 1.958 0.9100
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.01981 0.500 2.000 0.8881
## ARCH Lag[5] 0.17496 1.440 1.667 0.9713
## ARCH Lag[7] 0.53652 2.315 1.543 0.9750
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 174.2382
## Individual Statistics:
## mu 0.2054
## ar1 0.1482
## ma1 0.1051
## omega 21.3506
## alpha1 0.1346
## beta1 0.1134
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.49 1.68 2.12
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.4294 6.676e-01
## Negative Sign Bias 2.9494 3.198e-03 ***
## Positive Sign Bias 2.3922 1.678e-02 **
## Joint Effect 28.9851 2.256e-06 ***
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 178.6 5.738e-28
## 2 30 188.3 2.924e-25
## 3 40 217.8 1.084e-26
## 4 50 227.7 7.646e-25
##
##
## Elapsed time : 1.128303
Diagnostic tests
In addition to the estimates of the coefficient estimates, the
ugarchfit function will return a number of diagnostics. As
usual, these are based on the requirement that the remaining error
represented by residuals from the estimated model have a constant mean
and variance and no serial correlation.
- Log-Likelihood ratios
- AIC, BIC, HQIC and SIC tests of fit (these will
deflate the Log-Likelihood for additional variables).
- Autoregressive and Arch tests of residuals. Null
hypothesis is that there is no serial correlation or hetroskedasticity
in the residuals.
- Nybold stability tests the stability of the
estimate coefficients. Null is that the parameters are constant.
- The signbias tests will assess asymmetric GARCH
effects. Null is that there are no effects.
- Goodness of fit (gof) will compare the density of the residuals with
those of the specification.
- Adjusted Pearson Goodness of Fit compares the
distribution of the residuals with the theoretical distribution. The
null is that the theoretical and null are identical.
Update the GARCH model
Now respond to the diagnostic statistics by:
- adding more autoregressive/moving average elements
- allowing for asymmetric effects
- adjusting the expected distribution
In this case of the S&P 500 it appears that more lags are
required, asymmetric responses are needed and fatter tails are required.
You can see below that the specification is now apARCH,
the distribution is std which is student-t-distribution
(fatter tails) and the armaOrder is 2,2, with two auto-regressive and
two moving average components.
spec <- ugarchspec(variance.model = list('apARCH'),
distribution.model = 'std',
mean.model = list(armaOrder = c(2,2)))
fit <- ugarchfit(spec = spec, sp500ret)
show(fit)
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,1)
## Mean Model : ARFIMA(2,0,2)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.000605 0.000089 6.7876 0.000000
## ar1 1.734997 0.006153 281.9765 0.000000
## ar2 -0.877288 0.006214 -141.1870 0.000000
## ma1 -1.755679 0.000043 -41170.7160 0.000000
## ma2 0.882688 0.000042 21107.5077 0.000000
## omega 0.000001 0.000000 2.3452 0.019015
## alpha1 0.063080 0.004490 14.0485 0.000000
## beta1 0.934264 0.004206 222.1121 0.000000
## shape 5.933294 0.453517 13.0828 0.000000
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.000605 0.000089 6.8053e+00 0.000000
## ar1 1.734997 0.005505 3.1517e+02 0.000000
## ar2 -0.877288 0.005363 -1.6358e+02 0.000000
## ma1 -1.755679 0.000038 -4.5728e+04 0.000000
## ma2 0.882688 0.000038 2.3457e+04 0.000000
## omega 0.000001 0.000001 8.7254e-01 0.382914
## alpha1 0.063080 0.020232 3.1179e+00 0.001821
## beta1 0.934264 0.018761 4.9797e+01 0.000000
## shape 5.933294 0.625160 9.4908e+00 0.000000
##
## LogLikelihood : 18119.93
##
## Information Criteria
## ------------------------------------
##
## Akaike -6.5584
## Bayes -6.5476
## Shibata -6.5584
## Hannan-Quinn -6.5546
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 2.951 0.08582
## Lag[2*(p+q)+(p+q)-1][11] 5.496 0.79562
## Lag[4*(p+q)+(p+q)-1][19] 9.475 0.55734
## d.o.f=4
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 4.400 0.03595
## Lag[2*(p+q)+(p+q)-1][5] 6.288 0.07688
## Lag[4*(p+q)+(p+q)-1][9] 6.890 0.20858
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.8179 0.500 2.000 0.3658
## ARCH Lag[5] 0.8270 1.440 1.667 0.7849
## ARCH Lag[7] 1.0441 2.315 1.543 0.9063
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 672.6332
## Individual Statistics:
## mu 0.13580
## ar1 0.04676
## ar2 0.11618
## ma1 0.04294
## ma2 0.07678
## omega 172.33133
## alpha1 1.11176
## beta1 0.74772
## shape 2.01813
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 2.1 2.32 2.82
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.2926 7.699e-01
## Negative Sign Bias 3.8169 1.366e-04 ***
## Positive Sign Bias 2.0077 4.472e-02 **
## Joint Effect 33.5364 2.482e-07 ***
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 57.78 8.633e-06
## 2 30 73.53 9.826e-06
## 3 40 91.49 4.187e-06
## 4 50 107.57 2.824e-06
##
##
## Elapsed time : 2.243246
The estimated volatility
plot(fit@fit$sigma, type = 'l', main = "Estimated volatility for S&P 500")

Forecasting
it is also possible to use the preferred model to forecast using the
ugarchforecast function.
forecast <- ugarchforecast(fit, n.head = 5)
plot(forecast, which = 1)

plot(forecast, which = 3)

Practice 10.2
- Get the returns for an asset of your choice
- Fit a GARCH model, starting with the basic and then adjust to try to
ensure that model is well-behaved.
- What range is forecast for the next 5 days?