Abstract
In this workshop we will learn about Value at Risk (VaR) and Expected Shortfall (ES) and how to apply it to a real-world market index. We will also use a GARCH(1,1) model to forecast volatility and VaR.You will work in RStudio. Create an R Notebook document (File -> New File -> R Notebook), where you have to write whatever is asked in this workshop.
At the beginning of the R Notebook write Workshop 9 - Financial Econometrics II and your name (as we did in previous workshop).
You have to replicate all the steps explained in this workshop, and ALSO you have to do whatever is asked. Any QUESTION or any STEP you need to do will be written in CAPITAL LETTERS. For ANY QUESTION, you have to RESPOND IN CAPITAL LETTERS right after the question.
It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.
You have to keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file. Pay attention in class to know how to generate an html file from your .Rmd.
Value at Risk (VaR) is a measure of risk that focuses on potential losses of an investment in a short period of time.
Investment firms and banks constantly monitor VaR of their portfolios in order to be prepared in case a bad scenario. VaR is usually estimated daily using a small probability such as 5%, 1%, or 0.1%.
Very bad scenarios in the short term have a certain probability p to happen in one day. For example, if the probability of losing 10% or MORE of an investment in one single day is 5%, then we say that the 5% Value at Risk (5%VaR) is equal to 10%.
We can also calculate VaR in money if we know the initial investment. For example, using the same example, if my initial investment is $1’000,000, then the 5%VaR = 10%, which is equal to $100,000.00. VaR can be reported as a positive or negative amount; this is just a matter of preference. But we have to be aware that VaR is a measure of potential loss in % or in $ with a certain probability.
p%VaR is the MINIMUM % loss of an investment with a probability of p. In other words, there is a probability p that I can lose %VaR or MORE. Another way to define p%VaR is the following. p%VaR is the MAXIMUM %loss of an investment with a probatility (1-p). In other words, there is a probability (1-p) that I can have a return of %VaR or HIGHER.
Then, we can say that p%VaR is the best scenario that can happen after a very bad scenario that can happen with probability p. Also, we can say that p%VaR is the worse scenario that can happen when the scenario is not the bad one, so when there is a probability of (1-p)%.
Let’s illustrate VaR using real data from the US index S&P500 daily returns from January 2019 to April 26, 2022.
library(quantmod)
getSymbols("^GSPC", from="2019-01-01",
to="2022-04-27")
## [1] "^GSPC"
# I get the adjusted column:
= Ad(GSPC)
SPindex # I calculate daily returns of the index:
<- na.omit(diff(log(SPindex)))
SPreturnnames(SPreturn)<-c("return")
We plot the daily index over time:
plot(SPindex)
We see strong decline in March 2020, another important decline in early 2022.
Now we do a histogram of the return of this index:
hist(SPreturn, breaks=40)
The height of each bar represents the frequency in number of days that a specific daily return range has happened.
If the daily returns behaves like a normal-distributed variable, then about 2.5% of the days the S&P500 returns would be less than the its mean minus 2 times its standard deviation. Remember that for a normal-distributed variable, in the range plus/minus 2 standard deviations from its mean, we will find about 95% of the values.
We downloaded 835 days of return, so we would expect that about 20.875 will be the # of days that the SP&500 would offer returns less than -2 times its standard deviation from its mean:
# Number of days for the S&P returns:
=nrow(SPreturn)
N N
## [1] 835
# The 2.5% of these # of days is about:
=0.025*nrow(SPreturn)
n25 n25
## [1] 20.875
Now we can look the real data and see how many days the S&P500 has offered returns less than 2 times its standard deviation from its mean. We can calculate the mean and standard deviation (volatility) of daily returns, and then identify how many days the S&P500 has offered extreme returns less than the mean minus 2 times its volatility:
# We calculate the mean and standard deviation of returns:
=mean(SPreturn$return)
mean_return mean_return
## [1] 0.0006094223
= sd(SPreturn$return)
volatility volatility
## [1] 0.01417127
# We calculate the expected minimum return of the 95% confidence interval (C.I.)
# assuming normality (the 2.5 percentile)
= mean_return-2*volatility
min95ci_returns min95ci_returns
## [1] -0.02773312
# We select the days with returns less than the minimum of the 95% C.I.
<- SPreturn[SPreturn$return<min95ci_returns,]
extremes extremes
## return
## 2019-08-05 -0.03023018
## 2019-08-14 -0.02973035
## 2020-02-24 -0.03408808
## 2020-02-25 -0.03074790
## 2020-02-27 -0.04516814
## 2020-03-03 -0.02851048
## 2020-03-05 -0.03451078
## 2020-03-09 -0.07901041
## 2020-03-11 -0.05010289
## 2020-03-12 -0.09994485
## 2020-03-16 -0.12765220
## 2020-03-18 -0.05322227
## 2020-03-20 -0.04432762
## 2020-03-23 -0.02973150
## 2020-03-27 -0.03426785
## 2020-04-01 -0.04514636
## 2020-04-21 -0.03115512
## 2020-05-01 -0.02846021
## 2020-06-11 -0.06075269
## 2020-09-03 -0.03575759
## 2020-09-08 -0.02814883
## 2020-10-28 -0.03592554
## 2022-03-07 -0.02996259
## 2022-04-22 -0.02813208
## 2022-04-26 -0.02855001
= nrow(extremes)
nextremes # Number of extreme returns:
nextremes
## [1] 25
= nextremes / N * 100
percextremes percextremes
## [1] 2.994012
In this case, we have 25 extreme days with returns less than the minimum of the 95% C.I. of daily returns. Then, it seems that we have had more extreme returns than expected. This represents about 2.994012% of the total # of days, which is greater than the expected 2.5%.
Then, 2.994012% of the time, the S&P500 has experience a loss of -2.7733123% or MORE. If we believe that in the future the S&P500 will behave in a similar manner, then we can say that there is a probability of 2.994012% that I can lose -2.7733123% or MORE in only ONE DAY.
We can also say that there is a probability of 97.005988% (100% - 2.994012%) that the MAXIMUM loss I can experience in ONE DAY is -2.7733123%. Then, the 2.994012%VaR of the daily S&P500 returns is -2.7733123%:
= min95ci_returns*100
VaR1 VaR1
## [1] -2.773312
For turbulence periods in financial markets it is likely that a bad scenario with probability p can happen. Once this scenario happens, then p%VaR is the best scenario, but in reality this the p%VaR is not too realistic. We need to be more conservative and estimate which might be the expected loss once the probability p of the bad scenario happens. This conservative estimate is given by the Expected Shortfall. We will review the concept of ES in the following section.
Expected Shortfall with a probability p refers to the expected average loss of an investment in ONE period (in this case, in one day) once the probability p of a bad scenario happens. Then, the absolute value of p%ES will always be greater (more negative) than the p%VaR. Let’s calculate the ES using the previous example of the S&P500 500 daily returns.
To estimate the ES at the 2.994012% probability, we just calculate the average of the days when the S&P500 returns have been less than -2.7733123%.We can do this as follows:
extremes
## return
## 2019-08-05 -0.03023018
## 2019-08-14 -0.02973035
## 2020-02-24 -0.03408808
## 2020-02-25 -0.03074790
## 2020-02-27 -0.04516814
## 2020-03-03 -0.02851048
## 2020-03-05 -0.03451078
## 2020-03-09 -0.07901041
## 2020-03-11 -0.05010289
## 2020-03-12 -0.09994485
## 2020-03-16 -0.12765220
## 2020-03-18 -0.05322227
## 2020-03-20 -0.04432762
## 2020-03-23 -0.02973150
## 2020-03-27 -0.03426785
## 2020-04-01 -0.04514636
## 2020-04-21 -0.03115512
## 2020-05-01 -0.02846021
## 2020-06-11 -0.06075269
## 2020-09-03 -0.03575759
## 2020-09-08 -0.02814883
## 2020-10-28 -0.03592554
## 2022-03-07 -0.02996259
## 2022-04-22 -0.02813208
## 2022-04-26 -0.02855001
=mean(extremes$return) * 100
ES ES
## [1] -4.412946
The Expected Shortfall at the 2.994012% probability is the average of these extreme return, which is -4.4129462%. As expected, the 2.994012% ES is more negative than the 2.994012%VaR (-4.4129462% vs -2.7733123%).
There are more than one method to estimate VaR and ES. Read the my Note “Introduction to Value at Risk” to learn different methods to estimate VaR and Expected Shortfall.
The most popular methods are 1) Percentile method using real distribution, 2) assuming normal distribution of returns, and 3) assuming a fat-tailed distribution of returns using the t-Student probability distribution.
Another not popular method (but more accurate) is using a volatility model like an GARCH(1,1) model to estimate and forecast daily volatility, and use this volatility to estimate Value at Risk.
Skewness and Kurtosis are statistical measures of a random variable. Here is a quick summary of Skewness and Kurtosis:
Skewness is a measure for asymmetry of the distribution. A normal distribution has about the same values that are less than then mean compared with the values that are higher than the mean. A normal distribution has a skewness close to zero. A negative skewness means that the tail to the left has more values (skewed to the left). A positive skewness means that the tail to the right has more values (skewed to the right).
Kurtosis can be used to examine know whether a variable has a distribution with much more extreme values compared to the normal distribution. According to the value of Kurtosis, we can identify whether we have a significant number of extreme values in a variable:
If Kurtosis=3 then, the variable behaves close to normal distribution
If Kurtosis>3, then this means that there are extreme values in the variable that makes the distribution to be a fat-tailed distribution.
Here is a quick review of moments in statistics, which are the base to understand Variance, Skewness and Kurtosis:
Moments in Statistics:
A Moment in Statistis is a specific measure about a probability distribution of a random variable.
The expected value of a random variable X can be calculated as the mean of X using historical values:
\[ Mean\left(X\right)=\overline{X}=E\left[X\right]=\frac{1}{N}\sum\left(X_{i}\right)=\frac{X_{1}+X_{2}+...X_{N}}{N} \]
The 4 Moments in Statistics are:
1st moment: \(E\left[X-\overline{X}\right]=0\)
2nd moment: \(E\left[X-\overline{X}\right]^{2}=\frac{1}{N}\sum\left(X_{i}-\overline{X}\right)^{2}=VAR\left(X\right)\) = Variance(X);
The Standard Deviation of X is the square root of the Variance(X):
\(SD\left(X\right)=\sqrt{VAR\left(X\right)}\)
The 3rd moment is used to estimate Skewness(X) as:
\(Skewness\left(X\right)=\frac{E\left[X-\overline{X}\right]^{3}}{SD\left(X\right)^{3}}\)
The 4rd moment is used to estimate Kurtosis(X) as:
\(Kurtosis\left(X\right)=\frac{E\left[X-\overline{X}\right]^{4}}{SD\left(X\right)^{4}}\)
Calculate Kurtosis of S&P500 daily returns. Do you see a sign of fat-tail distribution of returns
The moments library has functions for Kurtosis and Skewness. Install this library.
library(moments)
= as.numeric(kurtosis(SPreturn))
kurtosis kurtosis
## [1] 20.0883
Since kurtosis is much higher than 3, this is a sign of fat-tailed distribution.
Considering that the distribution of real returns is more a fat-tail distribution (more-than-normal extreme values), it is better to use the t distribution rather than the normal distribution. Research indicates that there is a relationship between the kurtosis and the degrees of freedom of the t distribution to model fat-tail distributions.
A t-distribution with low degrees of freedom shows fatter tails compared with a t distribution with higher degrees of freedom (or compared to a normal distribution).
The way to estimate the degrees of freedom of a t distribution to model a fat-tail distribution, we use the kurtosis of the distribution as follows:
Degrees of freedom (df)= 6/Kurtosis + 4
You have to estimate the degrees of freedom and then use the t distribution to estimate the 1% VaR of $1million for the next future day of the sample. Compare both estimations of VaR, the one calculated assuming normal distribution and this one. Briefly reports this comparison.
Hint: the function qt(0.01,df) gives you the t value, which represents the # of standard deviations you need to go from the mean to the left to arrive to the 1% percentile.
I calculate the degrees of freedom I use to simulate a fat-tailed distribution using the t-Student distribution:
# I calculate the degrees of freedom for the t distribution:
= 6/kurtosis + 4
df df
## [1] 4.298681
# I calculate the t critical value according to the probability of 1% and the df:
= qt(0.01,df)
t1 t1
## [1] -3.607109
Now I get the 1%VaR assuming fat-tail distribution:
= mean(SPreturn)
meansp = sd(SPreturn)
volatilitysp = meansp + t1 * volatilitysp
VaR2 VaR2
## [1] -0.05050791
Interpretation of this 1%VaR:
Assuming fat-tail distribution of daily returns of the S&P500, I can lose in one day -5.0507905% or more with a probability of 1%.
Compared to the first 1%VaR=-277.331227% (assuming normal distribution), this 1%VaR=-5.0507905% assuming fat-tail distribution is more conservative, and more realistic.
To calculate the Expected Shortfall assuming fat-tail distribution, we can calculate the average return of the bad days that have had returns less than the 1%VaR assuming fat-tail distribution:
<- SPreturn[SPreturn$return<=VaR2,]
baddays2 baddays2
## return
## 2020-03-09 -0.07901041
## 2020-03-12 -0.09994485
## 2020-03-16 -0.12765220
## 2020-03-18 -0.05322227
## 2020-06-11 -0.06075269
Now we can get the average of these very bad days:
= mean(baddays2$return)
ES3 ES3
## [1] -0.08411649
This would be the more conservative estimation of 1% Expected Shortfall of the S&P500. Then, assuming fat-tail distribution of daily returns of the S&P500, and using real historical daily returns, ON AVERAGE, I can lose in one day -8.4116486% with a probability of 1%.
Work with the same dataset (daily S&P500 returns).
library(rugarch)
=ugarchspec(variance.model= list(model= "sGARCH", garchOrder= c(1, 1),
Especgarch1submodel= NULL, variance.targeting= FALSE),
mean.model= list(armaOrder= c(0, 0), include.mean= TRUE, archm= FALSE,
archpow= 0, arfima= FALSE, external.regressors= NULL, archex= FALSE),
distribution.model= "norm", start.pars= list(), fixed.pars= list())
#Estimación del GARCH(1,1)
<- ugarchfit(spec=Especgarch1, data=SPreturn$return)
garch1 garch1
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : sGARCH(1,1)
## Mean Model : ARFIMA(0,0,0)
## Distribution : norm
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.001047 0.000270 3.8804 0.000104
## omega 0.000006 0.000003 2.3265 0.019992
## alpha1 0.253621 0.036941 6.8656 0.000000
## beta1 0.721068 0.011208 64.3365 0.000000
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.001047 0.000260 4.02891 0.000056
## omega 0.000006 0.000007 0.84434 0.398480
## alpha1 0.253621 0.051157 4.95771 0.000001
## beta1 0.721068 0.094269 7.64904 0.000000
##
## LogLikelihood : 2677.884
##
## Information Criteria
## ------------------------------------
##
## Akaike -6.4045
## Bayes -6.3819
## Shibata -6.4046
## Hannan-Quinn -6.3958
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.7817 0.3766
## Lag[2*(p+q)+(p+q)-1][2] 0.9404 0.5180
## Lag[4*(p+q)+(p+q)-1][5] 1.3737 0.7709
## d.o.f=0
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.4232 0.5153
## Lag[2*(p+q)+(p+q)-1][5] 1.1354 0.8283
## Lag[4*(p+q)+(p+q)-1][9] 1.7665 0.9310
## d.o.f=2
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[3] 0.01124 0.500 2.000 0.9156
## ARCH Lag[5] 0.92434 1.440 1.667 0.7555
## ARCH Lag[7] 1.06258 2.315 1.543 0.9032
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 2.5098
## Individual Statistics:
## mu 0.2157
## omega 0.2498
## alpha1 0.1074
## beta1 0.2079
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.07 1.24 1.6
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 2.0405 0.04162 **
## Negative Sign Bias 0.6192 0.53597
## Positive Sign Bias 0.4766 0.63375
## Joint Effect 7.0996 0.06879 *
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 43.06 0.0012721
## 2 30 60.44 0.0005448
## 3 40 80.40 0.0001071
## 4 50 77.28 0.0061301
##
##
## Elapsed time : 0.133045
*INTERPRETATION the model output**
What does each coefficient mean?
THE beta1 COEFFICIENT IS 0.7210678, AND IT IS SIGNIFICANTLY DIFFERENT THAN ZERO. THIS MEANS THAT ABOUT 72.1067762% OF THE RETURN VARIANCE OF YESTERDAY IS PASSED TO THE RETURN VARIANCE OF TODAY, BESIDES THE IMPACT OF THE YESTERDAY’S SQUARED ERROR. SINCE THIS COEFFICIENT IS GREATER THAN 0.7, THIS IS A SIGN OF CLUSTERING OF VOLATILITY.
CONSIDERING THE VARIANCE OF YESTERDAY, THE SQUARED ERROR OF YESTERDAY IMPACTS THE RETURN VARIANCE OF TODAY IN A POSITIVE AND SIGNIFICANT WAY, BUT ONLY ABOUT 25.3621103% OF THE SQUARED ERROR OF YESTERDAY IS PASSED TO THE VARIANCE OF TODAY.
<- ugarchforecast(garch1,n.ahead= 1) garch1.prediction
garch1.prediction will be a very special R class: an uGARCHforecast class. This is special R class object that stores information about the GARCH prediction.
For this type of R class objects, we use the @ character to get access to the content of each element stored in this object.
The forecast for the daily return is stored in:
@forecast$seriesFor garch1.prediction
## 2022-04-26
## T+1 0.001047322
The forecast for the standard deviation of returns (volatility) is stored in:
@forecast$sigmaFor garch1.prediction
## 2022-04-26
## T+1 0.0201652
Now using this predicted volatility, I calculate a new 1%VaR for tomorrow. I will assume fat-tail distribution, so I will use the t-Student distribution with the degrees of freedom I calculated above:
= meansp + t1 * garch1.prediction@forecast$sigmaFor[1]
VaR1_G VaR1_G
## [1] -0.07212867
*The 1%VaR using the predicted volatility from the GARCH(1,1) model is -7.2128673%.
We do a forecast for 10 days:
<- ugarchforecast(garch1,n.ahead= 10) garch1.prediction
We can plot the historic and future volatility:
plot(garch1.prediction,which = 3)
Now I calculate the daily 1%VaR for these 10 future days:
= meansp + t1 * garch1.prediction@forecast$sigmaFor
VaR1_G VaR1_G
## 2022-04-26
## T+1 -0.07212867
## T+2 -0.07174392
## T+3 -0.07136692
## T+4 -0.07099756
## T+5 -0.07063570
## T+6 -0.07028122
## T+7 -0.06993400
## T+8 -0.06959392
## T+9 -0.06926085
## T+10 -0.06893468
# Now I calculate 1%VaR for a $1 million per day:
= VaR1_G * 1000000
VaR1_G_1mill VaR1_G_1mill
## 2022-04-26
## T+1 -72128.67
## T+2 -71743.92
## T+3 -71366.92
## T+4 -70997.56
## T+5 -70635.70
## T+6 -70281.22
## T+7 -69934.00
## T+8 -69593.92
## T+9 -69260.85
## T+10 -68934.68
*The GARCH(1,1) fit estimation for volatility (sigma) and for stock return in the history is stored in the garch1 R object at:
garch1@fit$sigma
We can look at the last values of the volatility fitted values:
tail(garch1@fit$sigma)
## [1] 0.01017609 0.01169813 0.01026554 0.01209087 0.01809300 0.01573199
We can use these fit sigma values and attach them in the SPreturn xts dataset along with the daily returns:
$gvol = garch1@fit$sigma SPreturn
Now we add a column for the 1%VaR using these historical GARCH volatilities. I will use the t-Student distribution assuming fat-tails ( I will use the t1 I estimated above):
$gVaR = meansp + t1 * SPreturn$gvol
SPreturn$gVaR1mill = SPreturn$gVaR * 1000000 SPreturn
Now I plot daily historical returns, and historial GARCH(1,1) fitted values for daily volatilities and 1%Value at Risk:
plot(SPreturn$gVaR1mill)
plot(SPreturn[,c(2,3)])
The green line is the GARCH(1,1) 1% Value at Risk; the red line is the GARCH(1,1) fitted volatility, and the black line is the daily return series.
I recommend you to read the article GARCH 101: The use of [http://www.cmat.edu.uy/~mordecki/hk/engle.pdf||ARCH/GARCH Models in Applied Econometrics] written by Robert Engle to better understand how you can apply ARCH model to forecast volatility and VaR.
Go to Canvas and respond Quiz 9. You will have 3 attempts. Questions in this Quiz are related to concepts of the readings related to this Workshop.
The grade of this Workshop will be the following:
Remember that you have to submit your .html file through Canvas BEFORE NEXT CLASS.