Value at Risk

Value at Risk (VaR) is widely used to measure market risk. Surprisingly, and against empirical evidence, almost all institutions keep the measured risk constant over time. In this note I employ VaR with time-varying volatility. Risk changes considerably over time, so VaR should change in tandem. I also allow for non-normal distributions with heavy and asymmetric tails.

Goldman Sachs as an example

Let's use Goldman Sachs as an example. Daily price data is collected from 2010 to today. The figure below shows how the price evolved over time.

The blue line denotes closing prices adjusted for dividends and stock splits.

The 95% VaR using a Normal distribution is -3.01%. The 99% VaR using a Normal distribution is -4.28%.

There are many problems with these two measures. We will see that the risk is overstated for the 95% VaR, understated for the 99% VaR, and that the correct VaR is not constant but moves considerably over time. Let's have a sneak peek at the time-varying 99% VaR:

Some other often-used risk statistics are presented in the table below:
                                Goldman Sachs
Semi Deviation                         0.0133
Gain Deviation                         0.0130
Loss Deviation                         0.0140
Downside Deviation (MAR=210%)          0.0178
Downside Deviation (Rf=0%)             0.0132
Downside Deviation (0%)                0.0132
Maximum Drawdown                       0.5176
Historical VaR (95%)                  -0.0273
Historical ES (95%)                   -0.0435
Modified VaR (95%)                    -0.0271
Modified ES (95%)                     -0.0401
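
These statistics can be produced directly in R; a minimal sketch using the PerformanceAnalytics package, assuming the daily returns are stored in an xts object named returns (the exact arguments behind the table above are my assumption):

```r
# Downside-risk statistics: semi/gain/loss deviation, downside deviation,
# maximum drawdown, and historical/modified VaR and ES.
library(PerformanceAnalytics)
table.DownsideRisk(returns, p = 0.95)
```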

A disadvantage of all these measures is that they are static: the risk is the same no matter what the circumstances. In the graph below the returns of Goldman Sachs are in grey; the blue line represents the static 99% VaR, while the red line shows the time-varying dynamic 99% VaR (explained in detail below). In volatile times the dynamic VaR adjusts to the increased risk, and as a result far fewer exceedances take place. We can also see from the graph that most of the time the dynamic VaR lies above the static VaR, so the static version overstates risk most of the time. Most importantly, we see that volatility clusters over time and that the risk is most certainly time-varying.

Note that VaR is usually defined as a positive number, as it represents a loss. For illustration purposes, and to compare it directly with return losses, I define VaR as a negative number in this article. Let's start with examining and modelling the return series.

Are the returns stationary?

To check whether the return series is stationary, an Augmented Dickey-Fuller (ADF) test is employed. If means, variances and covariances change over time, the series is non-stationary, which may lead to unreliable forecasts. A stationary process is mean-reverting, i.e., it fluctuates around a constant mean with constant variance. The ADF test, where the null hypothesis indicates a non-stationary time series, is applied to the returns:
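
In R the test is available in the tseries package; a minimal sketch, assuming the daily returns are stored in the object returns used throughout:

```r
# Augmented Dickey-Fuller test; the null hypothesis is non-stationarity
# (a unit root), the alternative is stationarity.
library(tseries)
adf.test(returns, alternative = "stationary")
```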

## 
##  Augmented Dickey-Fuller Test
## 
## data:  returns
## Dickey-Fuller = -13.444, Lag order = 13, p-value = 0.01
## alternative hypothesis: stationary

The small p-value suggests there is sufficient evidence to reject the null hypothesis; therefore the time series is considered stationary.

Modelling the returns process

To model the return process, the Box-Jenkins approach for time series analysis is applied. It aims to find the time series model that best fits the underlying stochastic process. The method consists of three stages: identification, estimation and diagnostic checking.

To identify the model, the Akaike Information Criterion (AIC) is used. AIC is an estimator of out-of-sample prediction error and provides a means for model selection by estimating the relative quality of each model:

\[AIC = \ln\left(\frac{\sum_{t=1}^{T} e_{t}^{2}}{T}\right) + \frac{2k}{T}\]

where \(e_{t}\) are the model residuals, \(T\) is the number of observations and \(k\) is the number of estimated parameters.

When extra lag parameters are added to the model, the sum of squared residuals decreases, but overfitting problems may occur. AIC balances the risks of overfitting and underfitting: the model with the lowest AIC is selected. In this case, the best-fitting model is:
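
The identification step can be automated with the forecast package; a minimal sketch, where auto.arima searches over candidate orders and keeps the model minimising the information criterion (the object name model.arima matches the one referenced in the diagnostics below):

```r
# Search ARIMA orders for the returns and keep the lowest-AIC model;
# max.d = 0 because the return series is already stationary.
library(forecast)
model.arima <- auto.arima(returns, max.d = 0, ic = "aic")
summary(model.arima)
```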

## Series: returns 
## ARIMA(0,0,3) with zero mean 
## 
## Coefficients:
##           ma1     ma2      ma3
##       -0.0975  0.0698  -0.0285
## s.e.   0.0193  0.0196   0.0193
## 
## sigma^2 estimated as 0.0003378:  log likelihood=6914.66
## AIC=-13821.32   AICc=-13821.31   BIC=-13797.74

The coefficient estimates were obtained by Maximum Likelihood. The diagnostic checks include a plot of the standardized residuals, their ACF, a Q-Q plot and a Ljung-Box test:

There appear to be no clear patterns in the standardized residuals, and the ACF plot suggests that there is no autocorrelation in the residuals. The quantile-quantile (Q-Q) plot compares the distribution of the data against the Normal distribution N(0, \(\sigma^2\)): for normally distributed data, the observations should lie on the straight blue line, while non-normal data forms a curve that deviates from it. Here the inverted S-shape suggests fatter-than-normal tails, so there is evidence against normality of the residuals. To test the hypothesis that the residuals are not autocorrelated, we perform a Ljung-Box test. The Ljung-Box statistic is defined as:

\[Q_{LB} = T(T+2)\sum_{k=1}^{m} \frac{\hat\rho_{k}^{2}}{T-k}\]

The \(Q_{LB}\) statistic asymptotically follows a \(\chi^2\) distribution with \(m\) degrees of freedom. The null hypothesis of no autocorrelation can be formulated as \(H_{0}: \rho_{1}=\rho_{2}=\dots=\rho_{m}=0\).
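
In R this is a one-line call on the residuals of the fitted model; the choice of lag = 18 here matches the degrees of freedom in the output below:

```r
# Ljung-Box test for remaining autocorrelation in the ARIMA residuals.
Box.test(model.arima$residuals, lag = 18, type = "Ljung-Box")
```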

## 
##  Box-Ljung test
## 
## data:  model.arima$residuals
## X-squared = 50.202, df = 18, p-value = 7.036e-05

With a p-value this small we reject the null hypothesis at conventional significance levels, so some autocorrelation remains in the residuals. The estimated MA coefficients are small, however, and the ARIMA(0,0,3) model is retained as a parsimonious description of the conditional mean; the focus of this note is the volatility dynamics, to which we now turn.

Volatility modelling

Although the ACF of the residuals shows no significant lags, the ACF of the absolute residuals does: the residual series exhibits volatility clustering. The Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model will be used to estimate and forecast this volatility. The time series of returns, \(r_{t}\), can be decomposed into a predictable and an unpredictable component: in the equation \(r_{t} = E(r_{t}|I_{t-1}) + \epsilon_{t}\), \(E(r_{t}|I_{t-1})\) is the predictable part and \(\epsilon_t\) the unpredictable part.

The unpredictable part can be expressed as a GARCH process in the following form:

\[\epsilon_{t} = z_{t}\sigma_{t}\]

where \(z_{t}\) is independently and identically distributed with zero mean and unit variance. The conditional variance of \(\epsilon_{t}\) is \(\sigma_{t}^{2}\), a time-varying function of the information set at time \(t-1\).

The next step is to specify the second part of the decomposition, the conditional variance \(\sigma_{t}^{2}\). A GARCH(1,1) specification is defined as:

\[\sigma_{t}^{2} = \omega + \alpha_{1}\epsilon_{t-1}^{2} + \beta_{1}\sigma_{t-1}^{2}\]

The model proposed by Glosten, Jagannathan and Runkle (GJR) extends the GARCH(1,1) model to allow for leverage effects: negative returns increase leverage and thereby subsequent volatility. The variance equation following negative and positive unexpected returns becomes \[\sigma_{t}^{2} = \omega + \alpha_{1}\epsilon_{t-1}^{2} + \gamma_{1}I_{t-1}\epsilon_{t-1}^{2} + \beta_{1}\sigma_{t-1}^{2} \]

\(I_{t-1}\) denotes an indicator variable, equal to 1 if there was a negative surprise at time \(t-1\) and 0 otherwise. In case of a positive surprise the formula thus reduces to the usual GARCH(1,1) equation, while after a negative surprise the predicted variance is higher.
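
Such a specification can be estimated with the rugarch package; a minimal sketch, pairing the GJR variance equation with a skewed Student-t innovation distribution (consistent with the skew and shape parameters reported below; the object name garch.fit is my choice):

```r
# GJR-GARCH(1,1) with skewed Student-t innovations.
library(rugarch)
spec <- ugarchspec(
  variance.model     = list(model = "gjrGARCH", garchOrder = c(1, 1)),
  mean.model         = list(armaOrder = c(0, 0), include.mean = TRUE),
  distribution.model = "sstd"
)
garch.fit <- ugarchfit(spec, data = returns)
garch.fit
```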

             Estimate  Std. Error    t value  Pr(>|t|)
mu          0.0004302   0.0002753   1.562557 0.1181569
omega       0.0000086   0.0000013   6.515454 0.0000000
alpha1      0.0439824   0.0047331   9.292564 0.0000000
beta1       0.8859195   0.0106224  83.400910 0.0000000
gamma1      0.0847073   0.0192408   4.402477 0.0000107
skew        0.9692371   0.0257638  37.620057 0.0000000
shape       5.5684691   0.5186771  10.735907 0.0000000

Both \(\alpha_{1}\) and \(\beta_{1}\) are significantly different from zero, so it is reasonable to assume time-varying volatility of the residuals. The estimated coefficient for the leverage effect, \(\gamma_{1}\), is also significant: after a negative surprise the predicted variance is higher than after a positive surprise of the same size.

Value at Risk

Value at Risk (VaR) is a statistical measure of the downside risk of a position. It estimates how much a set of investments might lose, given normal market conditions, over a set time period. A VaR statistic has three components: a time period, a confidence level and a loss amount (or loss percentage). For the 95% confidence level, we can say that the worst daily loss will not exceed the VaR estimate in 95% of cases. Using historical data, we can estimate VaR by taking the 5% quantile of the returns. For our data this estimate is:
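
A sketch of this empirical-quantile estimate:

```r
# Historical VaR: the 5% and 1% quantiles of the daily return series.
quantile(returns, probs = c(0.05, 0.01))
```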

95% VaR: -2.73%; 99% VaR: -4.72%.

If, instead of historical data, we use a Normal distribution, we obtain:

95% VaR: -3.01%; 99% VaR: -4.28%.

Comparing these figures, we can see that the Normal distribution overstates risk for the 95% VaR but understates risk for the 99% VaR. This is most likely due to fatter-than-Normal tails. In the next section we look further into the distributional properties.

Distributional properties

A Jarque-Bera test can be used to test the hypothesis that stock returns follow a Normal distribution:

\[JB = \frac{n-k+1}{6}(S^{2} + \frac{1}{4}(K-3)^{2})\]

where \(S\) is the skewness, \(K\) is the kurtosis, \(n\) is the number of observations and \(k\) is the number of regressors (zero for a raw return series). For a Normal distribution \(JB\) will be close to zero. The \(JB\) score observed in the returns is:
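
A sketch of the test in R, again using the tseries package:

```r
# Jarque-Bera normality test based on sample skewness and kurtosis.
jarque.bera.test(returns)
```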

## 
##  Jarque Bera Test
## 
## data:  returns
## X-squared = 10490, df = 2, p-value < 0.00000000000000022

The p-value indicates that the returns are not normally distributed. Let's look at the histogram of the returns and compare it with a Normal distribution.

The blue line corresponds to the Normal distribution. From this graph we see that the returns are more concentrated around the mean and that there are more outliers than a Normal distribution would predict, especially in the left tail. Let's zoom in on this left tail:

From this figure we can see that there are many more returns in the extreme left tail than predicted by a Normal distribution: the return distribution is more prone to producing values far from its mean. It also shows that at the 95% level a Normal distribution may overestimate the Value at Risk, while at the 99% level it underestimates the risk. These facts, combined with the concentration of returns around the mean, make the Student's t-distribution more suitable, as illustrated in the two graphs below. Under the assumption of normally distributed returns with constant volatility, VaR is computed as:

\[VaR(a)=\mu + \sigma N^{-1}(a)\]

where \(\mu\) is the mean stock return, \(\sigma\) is the standard deviation of returns, \(a\) is the selected tail probability and \(N^{-1}\) is the inverse cumulative distribution function (quantile function) of the Normal distribution, generating the corresponding quantile given \(a\).
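
A sketch of this computation:

```r
# Unconditional Normal VaR at the 95% and 99% levels.
alpha <- c(0.05, 0.01)
mean(returns) + sd(returns) * qnorm(alpha)
```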

The results of such a simple model are often disappointing, and it is rarely used in practice today. The assumption of normality and constant daily variance is usually wrong, and that is the case for our data as well.

Previously we observed that returns exhibit time-varying volatility. Hence, for the estimation of VaR, we use the conditional variance given by the GARCH model estimated above. For the distributional properties of the underlying asset we use the Student's t-distribution. Under this method, Value at Risk is expressed as:

\[VaR(a)=\mu + \hat{\sigma}_{t|t-1} t^{-1}(a)\]

where \(\hat{\sigma}_{t|t-1}\) is the conditional standard deviation given the information at \(t-1\) and \(t^{-1}\) is the inverse cumulative distribution function (quantile function) of the t-distribution.
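
A sketch of how this dynamic VaR series can be computed from the rugarch fit above (reusing the hypothetical garch.fit object):

```r
# Dynamic 99% VaR: conditional mean plus conditional sigma times the 1%
# quantile of the fitted (standardized) skewed Student-t distribution.
var.t <- fitted(garch.fit) + sigma(garch.fit) *
  qdist("sstd", p = 0.01,
        skew  = coef(garch.fit)["skew"],
        shape = coef(garch.fit)["shape"])
```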

The blue line refers to the Normal 99% VaR, and the red line shows the 99% VaR produced by the dynamic GARCH model with leverage effects and a fat-tailed t-distribution. We see that the Normal VaR understated the risk in recent months. Today's VaR for Goldman Sachs, based on the dynamic model, is around -4.3%, which happens to be close to the Normal VaR.

Conclusion

Although Value at Risk (VaR) is widely used to measure market risk, the standard implementation is overly simplistic and often leads to mismeasurement of risk. In this note I employ VaR with time-varying volatility. Risk changes considerably over time, so VaR should change in tandem. In addition to the dynamic aspect, the VaR used in this article allows for fat tails and leverage effects in volatility.