Computing returns

For managing financial risk, we first need to measure the risk by analyzing the return series. In this exercise, you are given the S&P 500 price series and you need to plot the daily returns. You will see that large (positive or negative) returns tend to be followed by large returns, of either sign, and small returns tend to be followed by small returns. These periods of sustained low volatility and high volatility are called volatility clusters.
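
A minimal sketch of this step, assuming the quantmod and PerformanceAnalytics packages are available; the ticker "^GSPC" and the date range are illustrative assumptions:

```r
library(quantmod)
library(PerformanceAnalytics)

# Download the S&P 500 price series from Yahoo Finance
getSymbols("^GSPC", from = "2005-01-01", to = "2017-12-31")

# Compute and plot the daily returns from the closing prices
sp500prices <- Cl(GSPC)
sp500ret <- CalculateReturns(sp500prices)[-1, ]
plot(sp500ret)
```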

## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
## 
## This message is shown once per session and may be disabled by setting 
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.

Standard deviation on subsamples

In the financial crisis of 2008-2009, volatility was substantially higher than average. Let’s get our hands dirty analyzing the volatility of the daily returns for the S&P 500 index. You will see that the standard deviation over the complete sample can be substantially different from the standard deviation on subsamples. Recall that standard deviations of daily returns give daily standard deviations. They are annualized by multiplying them by the square root of the number of trading days per year (the square-root-of-time rule).

The daily returns sp500ret you calculated in the previous exercise are available in your workspace.
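
A sketch, assuming 252 trading days per year; the two subsample windows below are illustrative assumptions and need not be the exact ones behind the printed numbers:

```r
# Full-sample versus subsample annualized volatility
sqrt(252) * sd(sp500ret)            # complete sample
sqrt(252) * sd(sp500ret["2009"])    # a high-volatility subsample
sqrt(252) * sd(sp500ret["2017"])    # a low-volatility subsample
```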

## [1] 0.2087576
## [1] 0.228392
## [1] 0.09050465

Roll, roll, roll

You can visualize the time variation in volatility by using the function chart.RollingPerformance() in the package PerformanceAnalytics. An important tuning parameter is the choice of the window length: the shorter the window, the more responsive the rolling volatility estimate is to recent returns; the longer the window, the smoother it will be. The function sd.annualized lets you compute annualized volatility under the assumption that the number of trading days in a year equals the number specified in the scale argument.

In this exercise you need to complete the code to compute the rolling estimate of annualized volatility for the daily S&P 500 returns in sp500ret for the period 2005 until 2017.
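
A sketch of the rolling estimate; the window of 22 trading days (roughly one month) is an assumption:

```r
chart.RollingPerformance(R = sp500ret["2005::2017"],
                         width = 22,
                         FUN = "sd.annualized",
                         scale = 252,
                         main = "Rolling 1-month annualized volatility")
```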

The GARCH equation for volatility prediction

We can use GARCH models to predict the volatility of future returns. The input is a time series of returns.
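
For reference, the standard GARCH(1,1) variance recursion behind these predictions is

$$\sigma_t^2 = \omega + \alpha e_{t-1}^2 + \beta \sigma_{t-1}^2, \qquad e_t = R_t - \mu,$$

so the next variance is a weighted combination of a constant, the latest squared prediction error, and the current variance.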

Prediction error

Under the GARCH model, the variance is driven by the square of the prediction errors e = R − μ. In order to calculate a GARCH variance, you thus need to first compute the prediction errors. For daily returns, it is common practice to set μ equal to the sample average.

You’re going to implement this and then verify that there is a large positive autocorrelation in the absolute value of the prediction errors. Positive autocorrelation reflects the presence of volatility clusters. When volatility is above average, it stays above average for some time. When volatility is low, it stays low for some time.
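
A sketch of this analysis, assuming sp500ret holds the daily returns; the lag length of 22 is an assumption:

```r
e <- sp500ret - mean(sp500ret)   # prediction errors

par(mfrow = c(2, 1))
plot.zoo(abs(e), main = "Absolute prediction errors", xlab = "", ylab = "")
acf(as.numeric(abs(e)), lag.max = 22,
    main = "Autocorrelation of absolute prediction errors")
```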

Note the waves in the absolute prediction errors in the top plot. They correspond to the presence of high and low volatility clusters. In the bottom plot, you can see the large positive autocorrelations in the absolute prediction errors. Almost all of them are above 0.15.

The large spike in volatility in mid-September 2008 is due to the collapse of Lehman Brothers on September 15, 2008.

The rugarch package

The rugarch workflow has three steps:
- ugarchspec(): specify which GARCH model you want to use
- ugarchfit(): estimate the GARCH model on your time series of returns R_1, …, R_T
- ugarchforecast(): use the estimated GARCH model to make volatility predictions for R_{T+1}
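
A minimal sketch of the three steps; the GARCH(1,1) specification with constant mean and normal errors is an assumption for illustration:

```r
library(rugarch)

# Step 1: specify a GARCH(1,1) with constant mean and normal errors
garchspec <- ugarchspec(mean.model = list(armaOrder = c(0, 0)),
                        variance.model = list(model = "sGARCH"),
                        distribution.model = "norm")

# Step 2: estimate the model on the return series
garchfit <- ugarchfit(data = sp500ret, spec = garchspec)

# Step 3: forecast the volatility of the next returns
garchforecast <- ugarchforecast(fitORspec = garchfit, n.ahead = 5)
```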

Application to tactical asset allocation: a portfolio that invests a weight w in a risky asset (with volatility σ(t)) and keeps 1 − w on a risk-free bank deposit account has volatility equal to w·σ(t). How to set w? One approach is volatility targeting: choose w such that the predicted annualized portfolio volatility equals a target level, say 5%, so w = 0.05/σ(t). Since GARCH volatilities change over time, the optimal weight changes as well.

Specify and taste the GARCH model flavors

In the next chapters, you will see that GARCH models come in many flavors. You thus need to start off by specifying the mean model, the variance model and the error distribution that you want to use. The best model to use is application-specific. A realistic GARCH analysis thus involves specifying, estimating and testing various GARCH models.

In R, this is simple thanks to the rugarch package of Alexios Ghalanos. This package has already been loaded for you. You will apply it to analyze the daily returns in sp500ret.

## Loading required package: parallel
## 
## Attaching package: 'rugarch'
## The following object is masked from 'package:stats':
## 
##     sigma
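
A sketch of extracting the fitted volatilities, reusing garchspec and garchfit from the earlier sketch:

```r
# Extract and plot the estimated in-sample volatilities
garchvol <- sigma(garchfit)
plot(garchvol)
```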

Well done. Notice the typical GARCH behavior: after a large unexpected return, volatility spikes upwards and then decays away until there is another shock. Let’s move on to forecasting!

Out-of-sample forecasting

The garchvol series is the series of predicted volatilities for each of the returns in the observed time series sp500ret.

For decision making, it is the volatility of the future (not yet observed) return that matters. You get it by applying the ugarchforecast() function to the output of ugarchfit(). In forecasting, these are called out-of-sample volatility forecasts, as they involve predictions of returns that were not used when estimating the GARCH model.

This exercise uses the garchfit and garchvol objects that you created in the previous exercise. If you need to check which arguments a function takes, you can use ?name_of_function in the Console to access the documentation.
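
A sketch of calls that produce output like the series below, assuming garchfit and garchvol from the earlier sketches:

```r
# Last 10 in-sample volatility estimates
tail(garchvol, 10)

# Out-of-sample forecasts for T+1, ..., T+5
garchforecast <- ugarchforecast(fitORspec = garchfit, n.ahead = 5)
sigma(garchforecast)
```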

## [1] 0.01288
##                   [,1]
## 2017-12-15 0.008501463
## 2017-12-18 0.008302661
## 2017-12-19 0.008234302
## 2017-12-20 0.008072244
## 2017-12-21 0.007873386
## 2017-12-22 0.008165352
## 2017-12-26 0.007953158
## 2017-12-27 0.007821857
## 2017-12-28 0.007640453
## 2017-12-29 0.007565508
##      2017-12-29
## T+1 0.007415905
## T+2 0.007477487
## T+3 0.007538063
## T+4 0.007597661
## T+5 0.007656309

Well done. In the next exercise you learn how to use GARCH models for portfolio allocation.

Volatility targeting in tactical asset allocation

GARCH volatility predictions are of direct practical use in portfolio allocation. According to the two-fund separation theorem of James Tobin, you should invest a proportion w of your wealth in a risky portfolio and the remainder in a risk-free asset, like a US Treasury bill.

When you target a portfolio with 5% annualized volatility, and the annualized volatility of the risky asset is σt, then you should invest w = 0.05/σt of your wealth in the risky asset.
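
A sketch of the volatility-targeting weights, assuming garchvol holds the predicted daily volatilities; 252 trading days and the 5% target are assumptions:

```r
# Annualize the predicted daily volatilities, then compute the weights
annualvol <- sqrt(252) * garchvol
vt_weights <- 0.05 / annualvol
plot(merge(annualvol, vt_weights), multi.panel = TRUE)
```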

Non-normality of standardized returns

Actually, it is not realistic to assume a normal distribution when analyzing stock returns using a GARCH model: the normal distribution is not consistent with stock market phenomena such as fat tails and skewness. To account for this, change the distribution.model argument of ugarchspec() from "norm" to "sstd".

A realistic distribution thus needs to accommodate the presence of:
- fat tails: a higher probability of observing large (positive or negative) returns than under the normal distribution
- skewness: asymmetry of the return distribution

Compared to the normal distribution, the skewed Student t distribution has two extra parameters:
- the degrees-of-freedom parameter ν: the lower ν is, the fatter the tails
- the skewness parameter ξ: ξ < 1 implies negative skewness, ξ > 1 positive skewness
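
A sketch of a specification consistent with the coefficients printed below (note the extra skew and shape parameters):

```r
# GARCH(1,1) with constant mean and skewed Student t errors
sstdspec <- ugarchspec(mean.model = list(armaOrder = c(0, 0)),
                       variance.model = list(model = "sGARCH"),
                       distribution.model = "sstd")
sstdfit <- ugarchfit(data = sp500ret, spec = sstdspec)
coef(sstdfit)
```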

##           mu        omega       alpha1        beta1         skew        shape 
## 5.340569e-04 7.420376e-07 4.934102e-02 9.466871e-01 1.015420e+00 5.149017e+00

Leverage effect

The size and the sign of e_t matter for volatility prediction! Negative returns tend to be followed by higher volatility than positive returns of the same magnitude (the leverage effect). We thus need two equations:
- one explaining the GARCH variance following a negative unexpected return
- one describing the variance reaction after a positive surprise in returns
In case of a positive surprise in returns, we take the usual GARCH(1,1) response. In case of a negative surprise, the predicted variance should be higher than after a positive surprise of the same size, so we add an extra coefficient. These two equations define the GJR GARCH model.

Just change model = "sGARCH" to model = "gjrGARCH" in ugarchspec():
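
A sketch consistent with the coefficients printed below (gamma1 is the extra leverage coefficient):

```r
# GJR GARCH with constant mean and skewed Student t errors
gjrspec <- ugarchspec(mean.model = list(armaOrder = c(0, 0)),
                      variance.model = list(model = "gjrGARCH"),
                      distribution.model = "sstd")
gjrfit <- ugarchfit(data = sp500ret, spec = gjrspec)
coef(gjrfit)
```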

##           mu        omega       alpha1        beta1       gamma1         skew 
## 4.361528e-04 1.063630e-06 2.910480e-02 9.368581e-01 5.672383e-02 1.005481e+00 
##        shape 
## 5.324901e+00

Visualize volatility response using newsimpact()
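
A sketch, assuming gjrfit is the GJR fit from the previous sketch:

```r
# News impact curve: predicted variance as a function of the surprise
ni <- newsimpact(z = NULL, gjrfit)
plot(ni$zx, ni$zy, type = "l", lwd = 2,
     xlab = "Prediction error", ylab = "Predicted variance",
     main = "News impact curve")
```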

Mean model

The GARCH-in-mean model quantifies the risk-reward trade-off.
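
A sketch consistent with the coefficients printed below; using archpow = 2 (variance in the mean equation) is an assumption:

```r
# GJR GARCH-in-mean: archm = TRUE adds the conditional variance
# (archpow = 2) to the mean equation
gimspec <- ugarchspec(mean.model = list(armaOrder = c(0, 0),
                                        archm = TRUE, archpow = 2),
                      variance.model = list(model = "gjrGARCH"),
                      distribution.model = "sstd")
gimfit <- ugarchfit(data = sp500ret, spec = gimspec)
coef(gimfit)
```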

##           mu        archm        omega       alpha1        beta1       gamma1 
## 4.227656e-04 1.757936e-01 1.064220e-06 2.915116e-02 9.369175e-01 5.626739e-02 
##         skew        shape 
## 1.006670e+00 5.310319e+00

The archm coefficient is the risk premium parameter λ: it measures how much the expected return increases with the conditional variance.

The GARCH-in-mean model uses the financial theory of a risk-reward trade-off to build a conditional mean model. Let’s now use statistical theory to make a mean model that exploits the correlation between today’s return and tomorrow’s return.

The most popular model is the AR(1) model. AR(1) stands for autoregressive model of order 1. It predicts the next return using the deviation of the current return from its long-term mean value μ.

If the AR coefficient is positive, a return above the mean value tends to be followed by another return above the mean value. A possible explanation is that the market underreacts to news, and hence there is momentum in returns.

If the AR coefficient is negative, then a higher-than-average return is followed by a lower-than-average return. A possible explanation is that the market overreacts to news, and hence there is reversal in returns.
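
A sketch consistent with the coefficients printed below (note the ar1 coefficient):

```r
# AR(1) mean model combined with the GJR GARCH variance model
argarchspec <- ugarchspec(mean.model = list(armaOrder = c(1, 0)),
                          variance.model = list(model = "gjrGARCH"),
                          distribution.model = "sstd")
argarchfit <- ugarchfit(data = sp500ret, spec = argarchspec)
coef(argarchfit)
```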

##            mu           ar1         omega        alpha1         beta1 
##  4.332635e-04 -6.528526e-03  1.064851e-06  2.949195e-02  9.367644e-01 
##        gamma1          skew         shape 
##  5.606955e-02  1.005186e+00  5.307971e+00

Here the ar1 coefficient is negative, which hints at overreaction and thus a reversal on the next day.

Other popular models for the conditional mean are the MA(1) and ARMA(1,1) models.

That’s right! In contrast with an ARMA model for the mean, a GARCH-in-mean model does not exploit the correlation between today’s and tomorrow’s return. It exploits the relationship between the expected return and the variance of the return: the higher the risk in terms of variance, the higher the expected return on investment should be.

Avoid unnecessary complexity

The parameters of a GARCH model are estimated by maximum likelihood. Because of sampling uncertainty, the estimated parameters inevitably contain estimation error. If we know the true value of a parameter, it is therefore best to impose that value rather than estimate it.

Let’s do this for the daily EUR/USD returns, available in the console as the variable EURUSDret, for which an AR(1)-GARCH model with skewed Student t distribution has already been estimated and made available as the ugarchfit object called flexgarchfit.
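
A sketch using rugarch’s setfixed(); the particular values imposed (ar1 = 0, skew = 1) are hypothetical choices for illustration:

```r
# Recover the spec and impose the (hypothetically known) values
fixedspec <- getspec(flexgarchfit)
setfixed(fixedspec) <- list(ar1 = 0, skew = 1)
fixedgarchfit <- ugarchfit(data = EURUSDret, spec = fixedspec)
coef(fixedgarchfit)
```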

Parameter bounds and impact on forecasts

Let’s take again the flexible GARCH model specification for which the estimated coefficients are printed in the console. Now assume that you believe that the GARCH parameter α should be between 0.05 and 0.1, while the β parameter should be between 0.8 and 0.95. You are asked to re-estimate the model imposing those bounds and see the effect on the volatility forecasts for the next ten days, obtained using ugarchforecast().
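
A sketch using rugarch’s setbounds(); the object names are assumptions:

```r
# Impose bounds on alpha1 and beta1, then re-estimate and forecast
boundspec <- getspec(flexgarchfit)
setbounds(boundspec) <- list(alpha1 = c(0.05, 0.1),
                             beta1 = c(0.8, 0.95))
boundfit <- ugarchfit(data = EURUSDret, spec = boundspec)
ugarchforecast(boundfit, n.ahead = 10)
```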

Variance targeting

Financial return volatility clusters through time: periods of above-average volatility are followed by periods of below-average volatility. The long-run prediction is that:
- when volatility is high, it will decrease and revert to its long-run average
- when volatility is low, it will increase and revert to its long-run average
In the estimation of GARCH models, we can exploit this mean-reverting behavior of volatility by means of variance targeting: we estimate the GARCH parameters in such a way that the long-run volatility implied by the GARCH model equals the sample standard deviation.

Let’s do this for the EUR/USD returns.
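
A sketch; the AR(1) mean model and sstd distribution mirror the flexible spec above, and 252 trading days per year is an assumption:

```r
# variance.targeting = TRUE ties omega to the sample variance
targetspec <- ugarchspec(mean.model = list(armaOrder = c(1, 0)),
                         variance.model = list(model = "sGARCH",
                                               variance.targeting = TRUE),
                         distribution.model = "sstd")
targetfit <- ugarchfit(data = EURUSDret, spec = targetspec)

# The long-run GARCH volatility now matches the sample volatility
sqrt(252) * sd(EURUSDret)
sqrt(252 * uncvariance(targetfit))
```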

Well done. Since volatility is mean reverting, it is wise to use this empirical fact when estimating the GARCH model. As you’ve probably realized by now, there is a lot of skill involved in finding the right model. You sharpen those skills in the next chapter.

Statistical significance

Are the variables in your GARCH model relevant? Can you simplify the model? Equivalently: are some parameters zero?
- If the ar1 parameter is zero, you can use a constant mean model.
- If the gamma1 parameter is zero, there is no leverage effect and you can use a standard GARCH model instead of the GJR model.
It seems simple, but in practice we do not know the true parameter values; everything needs to be estimated, so significance is judged from the estimates and their standard errors.
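
One way to obtain a table like the one below, assuming argarchfit is the AR(1)-GJR fit from the earlier sketch:

```r
# Estimates, standard errors, t-statistics and p-values
round(argarchfit@fit$matcoef, 6)
```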

##         Estimate  Std. Error    t value Pr(>|t|)
## mu      0.000433    0.000136   3.195030 0.001398
## ar1    -0.006529    0.014611  -0.446814 0.655009
## omega   0.000001    0.000000   3.092377 0.001986
## alpha1  0.029492    0.004696   6.279831 0.000000
## beta1   0.936764    0.005014 186.812603 0.000000
## gamma1  0.056070    0.011499   4.875930 0.000001
## skew    1.005186    0.020473  49.098137 0.000000
## shape   5.307971    0.397403  13.356652 0.000000

The ar1 coefficient is statistically insignificant, so we can use a simpler model with a constant mean.

Goodness of fit

Do the GARCH predictions fit well with the observed returns?
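
A sketch of statistics like those printed below, assuming gjrfit from before; whether these are exactly the statistics shown is an assumption:

```r
e <- residuals(gjrfit)                 # prediction errors
mean(e^2)                              # MSE of the mean prediction
mean((e^2 - sigma(gjrfit)^2)^2)        # MSE of the variance prediction
likelihood(gjrfit)                     # log-likelihood of the fit
```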

## [1] 0.0001728729
## [1] 2.974194e-07
## [1] 14248.54

The likelihood should not be interpreted by itself; instead, compare it with the likelihood obtained using other models, like a GJR model.
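
The information criteria printed below penalize the likelihood for the number of parameters (lower is better); a sketch:

```r
infocriteria(gjrfit)
```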

##                       
## Akaike       -6.291382
## Bayes        -6.280040
## Shibata      -6.291388
## Hannan-Quinn -6.287386

That’s right! By choosing the model with the highest likelihood, you will end up with the most complex model that is not parsimonious and has a high risk of overfitting.

Diagnosing absolute standardized returns

Check 1: mean and standard deviation of the standardized returns. The sample mean should be close to 0 and the sample standard deviation close to 1.
Check 2: time series plot of the standardized returns. It should show constant variability.
Check 3: no predictability in the absolute standardized returns. Verify that there is no correlation between the past absolute standardized return and the current absolute standardized return. Why? The magnitude of the standardized returns should be constant over time, so there should be no correlation in the absolute standardized returns.
Check 4: the Ljung-Box test of the null hypothesis that there is no such autocorrelation.
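
A sketch of the four checks, assuming gjrfit from before; the lag length of 22 is an assumption matching the output below:

```r
stdret <- residuals(gjrfit, standardize = TRUE)

mean(stdret)                                   # Check 1: close to 0
sd(stdret)                                     #          close to 1
plot.zoo(stdret)                               # Check 2
acf(as.numeric(abs(stdret)), lag.max = 22)     # Check 3
Box.test(abs(stdret), lag = 22,                # Check 4
         type = "Ljung-Box")
```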

## 
##  Box-Ljung test
## 
## data:  abs(stdret)
## X-squared = 29.575, df = 22, p-value = 0.1292

Backtesting using ugarchroll

The solution to avoid look-ahead bias is rolling estimation. Two options:
- program a for loop yourself
- use the built-in function ugarchroll()
You can reduce the computational cost by re-estimating the model only every K observations.

Evaluate the accuracy of preds$Mu and preds$Sigma by comparing them with preds$Realized.
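
A sketch of the rolling backtest; the initial window of 2500 observations and re-estimation every 100 observations are assumptions:

```r
garchroll <- ugarchroll(gjrspec, data = sp500ret,
                        n.start = 2500,           # initial estimation window
                        refit.window = "moving",
                        refit.every = 100)        # re-estimate every 100 obs

preds <- as.data.frame(garchroll)
mean((preds$Realized - preds$Mu)^2)               # MSE of the mean predictions
mean((preds$Realized^2 - preds$Sigma^2)^2)        # MSE of the variance predictions
```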

## [1] 8.333045e-05
## [1] 3.764772e-08

Value-at-risk

How much would you lose in the best of the 5% worst cases?

A popular measure of downside risk: 5% value-at-risk. The 5% quantile of the return distribution represents the best return in the 5% worst scenarios.

The workflow to obtain predicted 5% quantiles from ugarchroll is:
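
A sketch, assuming garchroll is the uGARCHroll object from the backtesting sketch:

```r
garchVaR <- quantile(garchroll, probs = 0.05)   # predicted 5% quantiles
actual <- as.data.frame(garchroll)$Realized     # realized returns
mean(actual < as.numeric(garchVaR))             # coverage: share of exceedances
```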

## [1] 0.04982733

VaR coverage and model validation

Interpretation of the coverage for a VaR at loss probability α (e.g. 5%):
- A valid prediction model has a coverage close to the probability level α used.
- If coverage ≫ α: too many exceedances; the predicted quantile should be more negative. The risk of losing money has been underestimated.
- If coverage ≪ α: too few exceedances; the predicted quantile was too negative. The risk of losing money has been overestimated.

For production and simulation

Use the validated GARCH model in production:
- Use ugarchfilter() for analyzing the recent dynamics in the mean and volatility.
- Use ugarchforecast() applied to a ugarchspec object (instead of a ugarchfit object) for making predictions about the future mean and volatility.

Step 1: Define the final model specification.
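
A sketch, assuming the GJR fit from before is the validated model:

```r
# Fix the validated coefficients in the specification
progarchspec <- getspec(gjrfit)
setfixed(progarchspec) <- as.list(coef(gjrfit))
```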

Step 2: Analysis of past mean and volatility dynamics
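
A sketch using ugarchfilter() with the fixed-coefficient spec:

```r
# Filter the observed returns through the fixed-coefficient model
garchfilter <- ugarchfilter(data = sp500ret, spec = progarchspec)
plot(sigma(garchfilter))
```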

Step 3: Make predictions about future returns
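
A sketch consistent with the two columns below (predicted mean and predicted volatility):

```r
# Forecast with the spec (not the fit); data must then be supplied
garchforecast <- ugarchforecast(data = sp500ret,
                                fitORspec = progarchspec,
                                n.ahead = 10)
cbind(fitted(garchforecast), sigma(garchforecast))
```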

##        2017-12-29  2017-12-29
## T+1  0.0004132459 0.007659785
## T+2  0.0004333941 0.007707519
## T+3  0.0004332626 0.007754692
## T+4  0.0004332635 0.007801316
## T+5  0.0004332635 0.007847402
## T+6  0.0004332635 0.007892961
## T+7  0.0004332635 0.007938003
## T+8  0.0004332635 0.007982539
## T+9  0.0004332635 0.008026579
## T+10 0.0004332635 0.008070131

You can also use the complete model spec to simulate artificial log-returns, defined as the difference between the current log-price and the past log-price. The simulation is useful to assess the randomness in future returns and their impact on future price levels.

Step 1: Calibrate the simulation model

Step 2: Run the simulation with ugarchpath(). It requires you to choose:
- spec: the completely specified GARCH model
- m.sim: the number of simulated return series you want
- n.sim: the number of observations per simulated series (e.g. 252)
- rseed: any number fixing the seed used to generate the simulated series (needed for reproducibility)
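
A sketch covering the Step 1 calibration as well; the choices m.sim = 4, n.sim = 252 and rseed = 123 are assumptions:

```r
# Step 1: fix the estimated coefficients in the simulation spec
simspec <- getspec(gjrfit)
setfixed(simspec) <- as.list(coef(gjrfit))

# Step 2: simulate four one-year paths of daily returns
simgarch <- ugarchpath(spec = simspec, m.sim = 4,
                       n.sim = 252, rseed = 123)

# Step 3: inspect the simulated returns and volatilities
plot.zoo(fitted(simgarch))
plot.zoo(sigma(simgarch))
```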

Step 3: Analysis of simulated returns

Analysis of simulated prices
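
Since the simulated returns are log-returns, price paths follow by cumulating and exponentiating; a sketch with a starting price of 1 (an assumption):

```r
# Cumulate the simulated log-returns into price paths
simret <- fitted(simgarch)
simprices <- exp(apply(simret, 2, cumsum))
matplot(simprices, type = "l", lwd = 2,
        xlab = "Day", ylab = "Simulated price")
```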

That’s right! Stock returns tend to have fat tails and an asymmetric distribution. The normal distribution is not a realistic assumption for stock returns.

In a corporate environment, there is often a distinction between the stage of model engineering and the stage of using the model in production. When using the model in production, the model may not be re-estimated at each step. You then use the model with fixed coefficients, while incorporating the new data on each prediction day. The function ugarchfilter() is designed to complete this task.

In this exercise you use a model fitted on the January 1989 till December 2007 daily S&P 500 returns to make a prediction of the future volatility in a turbulent period (September 2008) and a stable period (September 2017). The model has already been specified and is available as garchspec in the R console.

Note that in the long run, volatility is predicted to revert to its average level. This explains why the predicted volatility at T+252 is similar in September 2008 and September 2017.

Model risk

Sources of model risk:
- modeling choices
- starting values in the optimization
- outliers in the return series
Solution: protect yourself through a robust approach:
- model averaging: average the predictions of multiple models
- try several starting values and choose the one that leads to the highest likelihood

If you cannot choose which model to use, you could estimate them all and use the average volatility prediction.
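
A sketch of model averaging, assuming the sGARCH fit (sstdfit) and GJR fit (gjrfit) from the earlier sketches:

```r
# Average the volatility predictions of the two fitted models
predvol <- merge(sGARCH = sigma(sstdfit), gjrGARCH = sigma(gjrfit))
avgvol <- xts(rowMeans(predvol), order.by = time(predvol))
plot(avgvol)
```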

Cleaning the data

Avoid letting outliers distort the volatility predictions. How? Through winsorization: reduce the magnitude of extreme returns to an acceptable level, using the function Return.clean() in the package PerformanceAnalytics with method = "boudt":
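
A sketch:

```r
# Winsorize the returns before estimating the GARCH model
cleanret <- Return.clean(sp500ret, method = "boudt")
plot(merge(sp500ret, cleanret))
```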

Be a robustnik: it is better to be roughly right than exactly wrong.

That’s indeed wrong: if your returns are affected by outliers, the maximum likelihood estimation of the various models will be affected as well. Garbage in, garbage out. When all models become unreliable due to the outliers, model averaging is not a solution. Instead, a robust approach should be used that either cleans the data directly or uses robust estimation methods.

GARCH covariance

GARCH volatility leads to time-varying variability of the returns. The GARCH covariance between two return series is estimated in four steps (a sketch follows after the steps):

Step 1: Use ugarchfit() to estimate the GARCH model for each return series.

Step 2: Use residuals() with standardize = TRUE to compute the standardized returns.

Step 3: Use cor() to estimate ρ as the sample correlation of the standardized returns.

Step 4: Compute the GARCH covariance by multiplying the estimated correlation and the two volatility series.
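
A sketch of the four steps; the return series msftret (MSFT) alongside sp500ret, and the common spec, are assumptions:

```r
# Step 1: same GARCH spec estimated on each series
covspec <- ugarchspec(mean.model = list(armaOrder = c(0, 0)),
                      variance.model = list(model = "sGARCH"),
                      distribution.model = "sstd")
msftfit <- ugarchfit(data = msftret, spec = covspec)
sp500fit <- ugarchfit(data = sp500ret, spec = covspec)

# Step 2: standardized returns
stdmsft <- residuals(msftfit, standardize = TRUE)
stdsp500 <- residuals(sp500fit, standardize = TRUE)

# Step 3: sample correlation of the standardized returns
rho <- as.numeric(cor(stdmsft, stdsp500))

# Step 4: time-varying GARCH covariance
garchcov <- rho * sigma(msftfit) * sigma(sp500fit)
```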

Minimum variance portfolio weights
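
For two assets, the minimum variance portfolio weight on the first asset is w₁ = (σ₂² − ρσ₁σ₂)/(σ₁² + σ₂² − 2ρσ₁σ₂). A sketch using the GARCH volatilities and correlation from the previous sketch, so the weight varies day by day:

```r
sigma_msft <- sigma(msftfit)
sigma_sp500 <- sigma(sp500fit)
w_msft <- (sigma_sp500^2 - rho * sigma_msft * sigma_sp500) /
  (sigma_msft^2 + sigma_sp500^2 - 2 * rho * sigma_msft * sigma_sp500)
plot(w_msft)
```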

The daily beta of MSFT
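
The beta of a stock is its covariance with the market divided by the market variance; with the GARCH estimates above this simplifies to β(t) = ρ·σ_MSFT(t)/σ_SP500(t). A sketch:

```r
# beta = cov(MSFT, market) / var(market) = rho * sigma_msft / sigma_sp500
msftbeta <- rho * sigma(msftfit) / sigma(sp500fit)
plot(msftbeta)
```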