Workshop 9, Financial Econometrics II

Alberto Dorantes, Ph.D. Apr 27, 2022

Consideration

Estimados estudiantes: En estos días getSymbols de la librería quantmod NO está funcionando (marca un error al bajar datos). Para arreglar el error, tienen que re-instalar la librería quantmod. Pueden hacerlo desde Packages en la ventana inferior derecha de RStudio, ó también escribir en la consola:

install.packages("quantmod")
Error in install.packages : Updating loaded packages
library(quantmod)

Abstract

In this workshop we will learn about Value at Risk (VaR) and Expected Shortfall (ES) and how to apply it to a real-world market index. We will also use a GARCH(1,1) model to forecast volatility and VaR.

Introduction to Value at Risk (VaR)

Value at Risk (VaR) is a measure of risk that focuses on potential losses of an investment in a short period of time. Investment firms and banks constantly monitor VaR of their portfolios in order to be prepared in case a bad scenario. VaR is usually estimated daily using a small probability such as 5%, 1%, or 0.1%. Very bad scenarios in the short term have a certain probability p to happen in one day. For example, if the probability of losing 10% or MORE of an investment in one single day is 5%, then we say that the 5% Value at Risk (5%VaR) is equal to 10%. We can also calculate VaR in money if we know the initial investment. For example, using the same example, if my initial investment is $1’000,000, then the 5%VaR = 10%, which is equal to $100,000.00. VaR can be reported as a positive or negative amount; this is just a matter of preference. But we have to be aware that VaR is a measure of potential loss in % or in $ with a certain probability. p%VaR is the MINIMUM % loss of an investment with a probability of p. In other words, there is a probability p that I can lose %VaR or MORE. Another way to define p%VaR is the following. p%VaR is the MAXIMUM %loss of an investment with a probatility (1-p). In other words, there is a probability (1-p) that I can have a return of %VaR or HIGHER. Then, we can say that p%VaR is the best scenario that can happen after a very bad scenario that can happen with probability p. Also, we can say that p%VaR is the worse scenario that can happen when the scenario is not the bad one, so when there is a probability of (1-p)%. Let’s illustrate VaR using real data from the US index S&P500 daily returns from January 2019 to April 26, 2022.

library(quantmod)
getSymbols("^GSPC", from="2019-01-01",
                    to="2022-04-27")
[1] "^GSPC"
# I get the adjusted column:
SPindex = Ad(GSPC)
# I calculate daily returns of the index:
SPreturn<- na.omit(diff(log(SPindex)))
names(SPreturn)<-c("return")

We plot the daily index over time:

plot(SPindex)

We see strong decline in March 2020, another important decline in early 2022.

Now we do a histogram of the return of this index

hist(SPreturn, breaks=40)

The height of each bar represents the frequency in number of days that a specific daily return range has happened.

If the daily returns behaves like a normal-distributed variable, then about 2.5% of the days the S&P500 returns would be less than the its mean minus 2 times its standard deviation. Remember that for a normal-distributed variable, in the range plus/minus 2 standard deviations from its mean, we will find about 95% of the values.

We downloaded 835 days of return, so we would expect that about 20.875 will be the # of days that the SP&500 would offer returns less than -2 times its standard deviation from its mean:

# Number of days for the S&P returns:
N=nrow(SPreturn)
N
[1] 835
# The 2.5% of these # of days is about:
n25=0.025*nrow(SPreturn)
n25
[1] 20.875

Now we can look the real data and see how many days the S&P500 has offered returns less than 2 times its standard deviation from its mean. We can calculate the mean and standard deviation (volatility) of daily returns, and then identify how many days the S&P500 has offered extreme returns less than the mean minus 2 times its volatility:

# We calculate the mean and standard deviation of returns:
mean_return=mean(SPreturn$return)
mean_return
[1] 0.0006094223
volatility = sd(SPreturn$return)
volatility
[1] 0.01417127
# We calculate the expected minimum return of the 95% confidence interval (C.I.) 
#   assuming normality (the 2.5 percentile)
min95ci_returns = mean_return-2*volatility
min95ci_returns
[1] -0.02773312
# We select the days with returns less than the minimum of the 95% C.I.
extremes<- SPreturn[SPreturn$return<min95ci_returns,]
extremes
                return
2019-08-05 -0.03023018
2019-08-14 -0.02973035
2020-02-24 -0.03408808
2020-02-25 -0.03074790
2020-02-27 -0.04516814
2020-03-03 -0.02851048
2020-03-05 -0.03451078
2020-03-09 -0.07901041
2020-03-11 -0.05010289
2020-03-12 -0.09994485
2020-03-16 -0.12765220
2020-03-18 -0.05322227
2020-03-20 -0.04432762
2020-03-23 -0.02973150
2020-03-27 -0.03426785
2020-04-01 -0.04514636
2020-04-21 -0.03115512
2020-05-01 -0.02846021
2020-06-11 -0.06075269
2020-09-03 -0.03575759
2020-09-08 -0.02814883
2020-10-28 -0.03592554
2022-03-07 -0.02996259
2022-04-22 -0.02813208
2022-04-26 -0.02855001
nextremes = nrow(extremes)
# Number of extreme returns:
nextremes
[1] 25
percextremes = nextremes / N * 100
percextremes
[1] 2.994012

In this case, we have 25 extreme days with returns less than the minimum of the 95% C.I. of daily returns. Then, it seems that we have had more extreme returns than expected. This represents about 2.994012% of the total # of days, which is greater than the expected 2.5%.

Then, 2.994012% of the time, the S&P500 has experience a loss of -2.7733123% or MORE. If we believe that in the future the S&P500 will behave in a similar manner, then we can say that there is a probability of 2.994012% that I can lose -2.7733123% or MORE in only ONE DAY.

We can also say that there is a probability of 97.005988% (100% - 2.994012%) that the MAXIMUM loss I can experience in ONE DAY is -2.7733123%. Then, the 2.994012%VaR of the daily S&P500 returns is -2.7733123%

VaR1 = min95ci_returns*100
VaR1
[1] -2.773312

For turbulence periods in financial markets it is likely that a bad scenario with probability p can happen. Once this scenario happens, then p%VaR is the best scenario, but in reality this the p%VaR is not too realistic. We need to be more conservative and estimate which might be the expected loss once the probability p of the bad scenario happens. This conservative estimate is given by the Expected Shortfall. We will review the concept of ES in the following section.

Expected Shortfall (ES)

Expected Shortfall with a probability p refers to the expected average loss of an investment in ONE period (in this case, in one day) once the probability p of a bad scenario happens. Then, the absolute value of p%ES will always be greater (more negative) than the p%VaR. Let’s calculate the ES using the previous example of the S&P500 500 daily returns.

To estimate the ES at the 2.994012% probability, we just calculate the average of the days when the S&P500 returns have been less than -2.7733123%.We can do this as follows

extremes
                return
2019-08-05 -0.03023018
2019-08-14 -0.02973035
2020-02-24 -0.03408808
2020-02-25 -0.03074790
2020-02-27 -0.04516814
2020-03-03 -0.02851048
2020-03-05 -0.03451078
2020-03-09 -0.07901041
2020-03-11 -0.05010289
2020-03-12 -0.09994485
2020-03-16 -0.12765220
2020-03-18 -0.05322227
2020-03-20 -0.04432762
2020-03-23 -0.02973150
2020-03-27 -0.03426785
2020-04-01 -0.04514636
2020-04-21 -0.03115512
2020-05-01 -0.02846021
2020-06-11 -0.06075269
2020-09-03 -0.03575759
2020-09-08 -0.02814883
2020-10-28 -0.03592554
2022-03-07 -0.02996259
2022-04-22 -0.02813208
2022-04-26 -0.02855001
ES=mean(extremes$return) * 100
ES
[1] -4.412946

The Expected Shortfall at the 2.994012% probability is the average of these extreme return, which is -4.4129462%. As expected, the 2.994012% ES is more negative than the 2.994012%VaR (-4.4129462% vs -2.7733123%).

There are more than one method to estimate VaR and ES. Read the my Note “Introduction to Value at Risk” to learn different methods to estimate VaR and Expected Shortfall.

The most popular methods are 1) Percentile method using real distribution, 2) assuming normal distribution of returns, and 3) assuming a fat-tailed distribution of returns using the t-Student probability distribution.

Another not popular method (but more accurate) is using a volatility model like an GARCH(1,1) model to estimate and forecast daily volatility, and use this volatility to estimate Value at Risk.

Kurtosis and fat-tail distribution

Skewness and Kurtosis are statistical measures of a random variable. Here is a quick summary of Skewness and Kurtosis:

Skewness is a measure for asymmetry of the distribution. A normal distribution has about the same values that are less than then mean compared with the values that are higher than the mean. A normal distribution has a skewness close to zero. A negative skewness means that the tail to the left has more values (skewed to the left). A positive skewness means that the tail to the right has more values (skewed to the right).

Kurtosis can be used to examine know whether a variable has a distribution with much more extreme values compared to the normal distribution. According to the value of Kurtosis, we can identify whether we have a significant number of extreme values in a variable:

If Kurtosis=3 then, the variable behaves close to normal distribution

If Kurtosis>3, then this means that there are extreme values in the variable that makes the distribution to be a fat-tailed distribution.

Here is a quick review of moments in statistics, which are the base to understand Variance, Skewness and Kurtosis:

Moments in Statistics:

A Moment in Statistis is a specific measure about a probability distribution of a random variable.

The expected value of a random variable X can be calculated as the mean of X using historical values:

Mean(X)=X⎯⎯⎯⎯=E[X]=1N∑(Xi)=X1+X2+…XNN

The 4 Moments in Statistics are:

1st moment: E[X−X⎯⎯⎯⎯]=0

2nd moment: E[X−X⎯⎯⎯⎯]2=1N∑(Xi−X⎯⎯⎯⎯)2=VAR(X) = Variance(X);

The Standard Deviation of X is the square root of the Variance(X): SD(X)=VAR(X)‾‾‾‾‾‾‾‾√

3rd moment: E[X−X⎯⎯⎯⎯]3=1N∑(Xi−X⎯⎯⎯⎯)3 The 3rd moment is used to estimate Skewness(X) as:

Skewness(X)=E[X−X⎯⎯⎯⎯]3SD(X)3

4rd moment: E[X−X⎯⎯⎯⎯]4=1N∑(Xi−X⎯⎯⎯⎯)4 The 4rd moment is used to estimate Kurtosis(X) as:

Kurtosis(X)=E[X−X⎯⎯⎯⎯]4SD(X)4

Calculate Kurtosis of S&P500 daily returns. Do you see a sign of fat-tail distribution of returns

The moments library has functions for Kurtosis and Skewness. Install this library

library(moments)
kurtosis = as.numeric(kurtosis(SPreturn))
kurtosis
[1] 20.0883

Since kurtosis is much higher than 3, this is a sign of fat-tailed distribution.

Estimating VaR assuming fat-tail distribution

Considering that the distribution of real returns is more a fat-tail distribution (more-than-normal extreme values), it is better to use the t distribution rather than the normal distribution. Research indicates that there is a relationship between the kurtosis and the degrees of freedom of the t distribution to model fat-tail distributions. A t-distribution with low degrees of freedom shows fatter tails compared with a t distribution with higher degrees of freedom (or compared to a normal distribution). The way to estimate the degrees of freedom of a t distribution to model a fat-tail distribution, we use the kurtosis of the distribution as follows: Degrees of freedom (df)= 6/Kurtosis + 4 You have to estimate the degrees of freedom and then use the t distribution to estimate the 1% VaR of $1million for the next future day of the sample. Compare both estimations of VaR, the one calculated assuming normal distribution and this one. Briefly reports this comparison. Hint: the function qt(0.01,df) gives you the t value, which represents the # of standard deviations you need to go from the mean to the left to arrive to the 1% percentile. I calculate the degrees of freedom I use to simulate a fat-tailed distribution using the t-Student distribution:

# I calculate the degrees of freedom for the t distribution:
df = 6/kurtosis + 4
df
[1] 4.298681
# I calculate the t critical value according to the probability of 1% and the df:
t1 = qt(0.01,df)
t1
[1] -3.607109

Now I get the 1%VaR assuming fat-tail distribution:

meansp = mean(SPreturn)
volatilitysp = sd(SPreturn)
VaR2 = meansp + t1 * volatilitysp
VaR2
[1] -0.05050791

Interpretation of this 1%VaR:

Assuming fat-tail distribution of daily returns of the S&P500, I can lose in one day -5.0507905% or more with a probability of 1%.

Compared to the first 1%VaR=-277.331227% (assuming normal distribution), this 1%VaR=-5.0507905% assuming fat-tail distribution is more conservative, and more realistic.

To calculate the Expected Shortfall assuming fat-tail distribution, we can calculate the average return of the bad days that have had returns less than the 1%VaR assuming fat-tail distribution:

baddays2 <- SPreturn[SPreturn$return<=VaR2,]
baddays2
                return
2020-03-09 -0.07901041
2020-03-12 -0.09994485
2020-03-16 -0.12765220
2020-03-18 -0.05322227
2020-06-11 -0.06075269

Now we can get the average of these very bad days:

ES3 = mean(baddays2$return)
ES3
[1] -0.08411649

This would be the more conservative estimation of 1% Expected Shortfall of the S&P500. Then, assuming fat-tail distribution of daily returns of the S&P500, and using real historical daily returns, ON AVERAGE, I can lose in one day -8.4116486% with a probability of 1%.

Forecasting VaR with GARCH(1,1) Model

Work with the same dataset (daily S&P500 returns).

1.- Run a GARCH(1,1) model. INTERPRET the model and briefly explain your results.

library(rugarch)
Especgarch1=ugarchspec(variance.model= list(model= "sGARCH", garchOrder= c(1, 1), 
submodel= NULL,  variance.targeting= FALSE), 
mean.model= list(armaOrder= c(0, 0), include.mean= TRUE, archm= FALSE, 
archpow= 0, arfima= FALSE,  external.regressors= NULL, archex= FALSE), 
distribution.model= "norm", start.pars= list(), fixed.pars= list())

#Estimación del GARCH(1,1)

garch1<- ugarchfit(spec=Especgarch1, data=SPreturn$return)
garch1

*---------------------------------*
*          GARCH Model Fit        *
*---------------------------------*

Conditional Variance Dynamics   
-----------------------------------
GARCH Model : sGARCH(1,1)
Mean Model  : ARFIMA(0,0,0)
Distribution    : norm 

Optimal Parameters
------------------------------------
        Estimate  Std. Error  t value Pr(>|t|)
mu      0.001047    0.000270   3.8804 0.000104
omega   0.000006    0.000003   2.3265 0.019993
alpha1  0.253622    0.036941   6.8656 0.000000
beta1   0.721067    0.011207  64.3398 0.000000

Robust Standard Errors:
        Estimate  Std. Error  t value Pr(>|t|)
mu      0.001047    0.000260  4.02889 0.000056
omega   0.000006    0.000007  0.84433 0.398487
alpha1  0.253622    0.051157  4.95770 0.000001
beta1   0.721067    0.094270  7.64894 0.000000

LogLikelihood : 2677.884 

Information Criteria
------------------------------------
                    
Akaike       -6.4045
Bayes        -6.3819
Shibata      -6.4046
Hannan-Quinn -6.3958

Weighted Ljung-Box Test on Standardized Residuals
------------------------------------
                        statistic p-value
Lag[1]                     0.7817  0.3766
Lag[2*(p+q)+(p+q)-1][2]    0.9404  0.5180
Lag[4*(p+q)+(p+q)-1][5]    1.3737  0.7709
d.o.f=0
H0 : No serial correlation

Weighted Ljung-Box Test on Standardized Squared Residuals
------------------------------------
                        statistic p-value
Lag[1]                     0.4232  0.5153
Lag[2*(p+q)+(p+q)-1][5]    1.1354  0.8283
Lag[4*(p+q)+(p+q)-1][9]    1.7665  0.9310
d.o.f=2

Weighted ARCH LM Tests
------------------------------------
            Statistic Shape Scale P-Value
ARCH Lag[3]   0.01123 0.500 2.000  0.9156
ARCH Lag[5]   0.92434 1.440 1.667  0.7555
ARCH Lag[7]   1.06257 2.315 1.543  0.9032

Nyblom stability test
------------------------------------
Joint Statistic:  2.5098
Individual Statistics:             
mu     0.2157
omega  0.2498
alpha1 0.1074
beta1  0.2079

Asymptotic Critical Values (10% 5% 1%)
Joint Statistic:         1.07 1.24 1.6
Individual Statistic:    0.35 0.47 0.75

Sign Bias Test
------------------------------------


Adjusted Pearson Goodness-of-Fit Test:
------------------------------------
  group statistic p-value(g-1)
1    20     43.06    0.0012721
2    30     60.44    0.0005448
3    40     80.40    0.0001071
4    50     77.28    0.0061301


Elapsed time : 0.07194638 

*INTERPRETATION the model output**

What does each coefficient mean?

THE beta1 COEFFICIENT IS 0.7210678, AND IT IS SIGNIFICANTLY DIFFERENT THAN ZERO. THIS MEANS THAT ABOUT 72.1067762% OF THE RETURN VARIANCE OF YESTERDAY IS PASSED TO THE RETURN VARIANCE OF TODAY, BESIDES THE IMPACT OF THE YESTERDAY’S SQUARED ERROR. SINCE THIS COEFFICIENT IS GREATER THAN 0.7, THIS IS A SIGN OF CLUSTERING OF VOLATILITY.

CONSIDERING THE VARIANCE OF YESTERDAY, THE SQUARED ERROR OF YESTERDAY IMPACTS THE RETURN VARIANCE OF TODAY IN A POSITIVE AND SIGNIFICANT WAY, BUT ONLY ABOUT 25.3621103% OF THE SQUARED ERROR OF YESTERDAY IS PASSED TO THE VARIANCE OF TODAY.

2.- With this GARCH(1,1) model, forecast both volatility and 1% VaR of $1 million for 1 future day (tomorrow).

garch1.prediction <- ugarchforecast(garch1,n.ahead= 1)

garch1.prediction will be a very special R class: an uGARCHforecast class. This is special R class object that stores information about the GARCH prediction.

For this type of R class objects, we use the @ character to get access to the content of each element stored in this object.

The forecast for the daily return is stored in

garch1.prediction@forecast$seriesFor
     2022-04-26
T+1 0.001047319

The forecast for the standard deviation of returns (volatility) is stored in:

garch1.prediction@forecast$sigmaFor
    2022-04-26
T+1 0.02016522

Now using this predicted volatility, I calculate a new 1%VaR for tomorrow. I will assume fat-tail distribution, so I will use the t-Student distribution with the degrees of freedom I calculated above:

VaR1_G = meansp + t1 * garch1.prediction@forecast$sigmaFor[1]
VaR1_G
[1] -0.07212872

*The 1%VaR using the predicted volatility from the GARCH(1,1) model is -7.2128673%.

Extend the dataset 10 days in the future. Do a within-sample forecast and an out-of-sample forecast for all days of the year (before and after TODAY) for daily volatility and 1%VaR for 1 day.

we do a forecast for 10 days

garch1.prediction <- ugarchforecast(garch1,n.ahead= 10)

We can plot the historic and future volatility:

plot(garch1.prediction,which = 3)

Now I calculate the daily 1%VaR for these 10 future days:

VaR1_G = meansp + t1 * garch1.prediction@forecast$sigmaFor
VaR1_G
      2022-04-26
T+1  -0.07212872
T+2  -0.07174397
T+3  -0.07136698
T+4  -0.07099762
T+5  -0.07063577
T+6  -0.07028130
T+7  -0.06993409
T+8  -0.06959401
T+9  -0.06926094
T+10 -0.06893478
# Now I calculate 1%VaR for a $1 million per day:
VaR1_G_1mill = VaR1_G * 1000000
VaR1_G_1mill
     2022-04-26
T+1   -72128.72
T+2   -71743.97
T+3   -71366.98
T+4   -70997.62
T+5   -70635.77
T+6   -70281.30
T+7   -69934.09
T+8   -69594.01
T+9   -69260.94
T+10  -68934.78

4.- Do a plot of volatility and VaR for the whole year *The GARCH(1,1) fit estimation for volatility (sigma) and for stock return in the history is stored in the garch1 R object at:

$sigma

We can look at the last values of the volatility fitted values:

tail(garch1@fit$sigma)
[1] 0.01017608 0.01169814 0.01026554 0.01209087 0.01809301 0.01573200

We can use these fit sigma values and attach them in the SPreturn xts dataset along with the daily returns:

SPreturn$gvol = garch1@fit$sigma

Now we add a column for the 1%VaR using these historical GARCH volatilities. I will use the t-Student distribution assuming fat-tails ( I will use the t1 I estimated above):

SPreturn$gVaR = meansp + t1 * SPreturn$gvol
SPreturn$gVaR1mill = SPreturn$gVaR * 1000000

Now I plot daily historical returns, and historial GARCH(1,1) fitted values for daily volatilities and 1%Value at Risk:

plot(SPreturn$gVaR1mill)

plot(SPreturn[,c(2,3)])

The green line is the GARCH(1,1) 1% Value at Risk; the red line is the GARCH(1,1) fitted volatility, and the black line is the daily return series.

I recommend you to read the article GARCH 101: The use of [http://www.cmat.edu.uy/~mordecki/hk/engle.pdf||ARCH/GARCH Models in Applied Econometrics] written by Robert Engle to better understand how you can apply ARCH model to forecast volatility and VaR

---
title: "Workshop 9, Financial Econometrics II"
author: Stefan Schweitzer
output: html_notebook
---
## Workshop 9, Financial Econometrics II
# Alberto Dorantes, Ph.D. Apr 27, 2022
  
# Consideration
Estimados estudiantes:
En estos días getSymbols de la librería quantmod NO está funcionando (marca un error al bajar datos).
Para arreglar el error, tienen que re-instalar la librería quantmod. Pueden hacerlo desde Packages en la ventana inferior derecha de RStudio, ó también escribir en la consola:

```{r}
install.packages("quantmod")
```
```{r}
library(quantmod)
```

# Abstract
In this workshop we will learn about Value at Risk (VaR) and Expected Shortfall (ES) and how to apply it to a real-world market index. We will also use a GARCH(1,1) model to forecast volatility and VaR.

# Introduction to Value at Risk (VaR)
Value at Risk (VaR) is a measure of risk that focuses on potential losses of an investment in a short period of time.
Investment firms and banks constantly monitor VaR of their portfolios in order to be prepared in case a bad scenario. VaR is usually estimated daily using a small probability such as 5%, 1%, or 0.1%.
Very bad scenarios in the short term have a certain probability p to happen in one day. For example, if the probability of losing 10% or MORE of an investment in one single day is 5%, then we say that the 5% Value at Risk (5%VaR) is equal to 10%.
We can also calculate VaR in money if we know the initial investment. For example, using the same example, if my initial investment is $1’000,000, then the 5%VaR = 10%, which is equal to $100,000.00. VaR can be reported as a positive or negative amount; this is just a matter of preference. But we have to be aware that VaR is a measure of potential loss in % or in $ with a certain probability.
p%VaR is the MINIMUM % loss of an investment with a probability of p. In other words, there is a probability p that I can lose %VaR or MORE. Another way to define p%VaR is the following. p%VaR is the MAXIMUM %loss of an investment with a probatility (1-p). In other words, there is a probability (1-p) that I can have a return of %VaR or HIGHER.
Then, we can say that p%VaR is the best scenario that can happen after a very bad scenario that can happen with probability p. Also, we can say that p%VaR is the worse scenario that can happen when the scenario is not the bad one, so when there is a probability of (1-p)%.
Let’s illustrate VaR using real data from the US index S&P500 daily returns from January 2019 to April 26, 2022.

```{r}
library(quantmod)
getSymbols("^GSPC", from="2019-01-01",
                    to="2022-04-27")
```

```{r}
# I get the adjusted column:
SPindex = Ad(GSPC)
# I calculate daily returns of the index:
SPreturn<- na.omit(diff(log(SPindex)))
names(SPreturn)<-c("return")
```

We plot the daily index over time:
```{r}
plot(SPindex)
```
We see strong decline in March 2020, another important decline in early 2022.

Now we do a histogram of the return of this index

```{r}
hist(SPreturn, breaks=40)
```
The height of each bar represents the frequency in number of days that a specific daily return range has happened.

If the daily returns behaves like a normal-distributed variable, then about 2.5% of the days the S&P500 returns would be less than the its mean minus 2 times its standard deviation. Remember that for a normal-distributed variable, in the range plus/minus 2 standard deviations from its mean, we will find about 95% of the values.

We downloaded 835 days of return, so we would expect that about 20.875 will be the # of days that the SP&500 would offer returns less than -2 times its standard deviation from its mean:

```{r}
# Number of days for the S&P returns:
N=nrow(SPreturn)
N
```

```{r}
# The 2.5% of these # of days is about:
n25=0.025*nrow(SPreturn)
n25
```

Now we can look the real data and see how many days the S&P500 has offered returns less than 2 times its standard deviation from its mean. We can calculate the mean and standard deviation (volatility) of daily returns, and then identify how many days the S&P500 has offered extreme returns less than the mean minus 2 times its volatility:

```{r}
# We calculate the mean and standard deviation of returns:
mean_return=mean(SPreturn$return)
mean_return
```

```{r}
volatility = sd(SPreturn$return)
volatility
```

```{r}
# We calculate the expected minimum return of the 95% confidence interval (C.I.) 
#   assuming normality (the 2.5 percentile)
min95ci_returns = mean_return-2*volatility
min95ci_returns
```

```{r}
# We select the days with returns less than the minimum of the 95% C.I.
extremes<- SPreturn[SPreturn$return<min95ci_returns,]
extremes
```

```{r}
nextremes = nrow(extremes)
# Number of extreme returns:
nextremes
```

```{r}
percextremes = nextremes / N * 100
percextremes
```

In this case, we have 25 extreme days with returns less than the minimum of the 95% C.I. of daily returns. Then, it seems that we have had more extreme returns than expected. This represents about 2.994012% of the total # of days, which is greater than the expected 2.5%.

Then, 2.994012% of the time, the S&P500 has experience a loss of -2.7733123% or MORE. If we believe that in the future the S&P500 will behave in a similar manner, then we can say that there is a probability of 2.994012% that I can lose -2.7733123% or MORE in only ONE DAY.

We can also say that there is a probability of 97.005988% (100% - 2.994012%) that the MAXIMUM loss I can experience in ONE DAY is -2.7733123%. Then, the 2.994012%VaR of the daily S&P500 returns is -2.7733123%

```{r}
VaR1 = min95ci_returns*100
VaR1
```
For turbulence periods in financial markets it is likely that a bad scenario with probability p can happen. Once this scenario happens, then p%VaR is the best scenario, but in reality this the p%VaR is not too realistic. We need to be more conservative and estimate which might be the expected loss once the probability p of the bad scenario happens. This conservative estimate is given by the Expected Shortfall. We will review the concept of ES in the following section.

# Expected Shortfall (ES)

Expected Shortfall with a probability p refers to the expected average loss of an investment in ONE period (in this case, in one day) once the probability p of a bad scenario happens. Then, the absolute value of p%ES will always be greater (more negative) than the p%VaR. Let’s calculate the ES using the previous example of the S&P500 500 daily returns.

To estimate the ES at the 2.994012% probability, we just calculate the average of the days when the S&P500 returns have been less than -2.7733123%.We can do this as follows

```{r}
extremes
```
```{r}
ES=mean(extremes$return) * 100
ES
```
The Expected Shortfall at the 2.994012% probability is the average of these extreme return, which is -4.4129462%. As expected, the 2.994012% ES is more negative than the 2.994012%VaR (-4.4129462% vs -2.7733123%).

There are more than one method to estimate VaR and ES. Read the my Note “Introduction to Value at Risk” to learn different methods to estimate VaR and Expected Shortfall.

The most popular methods are 1) Percentile method using real distribution, 2) assuming normal distribution of returns, and 3) assuming a fat-tailed distribution of returns using the t-Student probability distribution.

Another not popular method (but more accurate) is using a volatility model like an GARCH(1,1) model to estimate and forecast daily volatility, and use this volatility to estimate Value at Risk.

# Kurtosis and fat-tail distribution

Skewness and Kurtosis are statistical measures of a random variable. Here is a quick summary of Skewness and Kurtosis:

Skewness is a measure for asymmetry of the distribution. A normal distribution has about the same values that are less than then mean compared with the values that are higher than the mean. A normal distribution has a skewness close to zero. A negative skewness means that the tail to the left has more values (skewed to the left). A positive skewness means that the tail to the right has more values (skewed to the right).

Kurtosis can be used to examine know whether a variable has a distribution with much more extreme values compared to the normal distribution. According to the value of Kurtosis, we can identify whether we have a significant number of extreme values in a variable:

If Kurtosis=3 then, the variable behaves close to normal distribution

If Kurtosis>3, then this means that there are extreme values in the variable that makes the distribution to be a fat-tailed distribution.

Here is a quick review of moments in statistics, which are the base to understand Variance, Skewness and Kurtosis:

Moments in Statistics:

A Moment in Statistis is a specific measure about a probability distribution of a random variable.

The expected value of a random variable X can be calculated as the mean of X using historical values:

Mean(X)=X⎯⎯⎯⎯=E[X]=1N∑(Xi)=X1+X2+...XNN

The 4 Moments in Statistics are:

1st moment: E[X−X⎯⎯⎯⎯]=0

2nd moment: E[X−X⎯⎯⎯⎯]2=1N∑(Xi−X⎯⎯⎯⎯)2=VAR(X)
 = Variance(X);

The Standard Deviation of X is the square root of the Variance(X):
SD(X)=VAR(X)‾‾‾‾‾‾‾‾√

3rd moment: E[X−X⎯⎯⎯⎯]3=1N∑(Xi−X⎯⎯⎯⎯)3
The 3rd moment is used to estimate Skewness(X) as:

Skewness(X)=E[X−X⎯⎯⎯⎯]3SD(X)3

4rd moment: E[X−X⎯⎯⎯⎯]4=1N∑(Xi−X⎯⎯⎯⎯)4
The 4rd moment is used to estimate Kurtosis(X) as:

Kurtosis(X)=E[X−X⎯⎯⎯⎯]4SD(X)4

Calculate Kurtosis of S&P500 daily returns. Do you see a sign of fat-tail distribution of returns

The moments library has functions for Kurtosis and Skewness. Install this library

```{r}
library(moments)
kurtosis = as.numeric(kurtosis(SPreturn))
kurtosis
```

Since kurtosis is much higher than 3, this is a sign of fat-tailed distribution.

# Estimating VaR assuming fat-tail distribution
Considering that the distribution of real returns is more a fat-tail distribution (more-than-normal extreme values), it is better to use the t distribution rather than the normal distribution. Research indicates that there is a relationship between the kurtosis and the degrees of freedom of the t distribution to model fat-tail distributions.
A t-distribution with low degrees of freedom shows fatter tails compared with a t distribution with higher degrees of freedom (or compared to a normal distribution).
The way to estimate the degrees of freedom of a t distribution to model a fat-tail distribution, we use the kurtosis of the distribution as follows:
Degrees of freedom (df)= 6/Kurtosis + 4
You have to estimate the degrees of freedom and then use the t distribution to estimate the 1% VaR of $1million for the next future day of the sample. Compare both estimations of VaR, the one calculated assuming normal distribution and this one. Briefly reports this comparison.
Hint: the function qt(0.01,df) gives you the t value, which represents the # of standard deviations you need to go from the mean to the left to arrive to the 1% percentile.
I calculate the degrees of freedom I use to simulate a fat-tailed distribution using the t-Student distribution:

```{r}
# I calculate the degrees of freedom for the t distribution:
df = 6/kurtosis + 4
df
```

```{r}
# I calculate the t critical value according to the probability of 1% and the df:
t1 = qt(0.01,df)
t1
```

Now I get the 1%VaR assuming fat-tail distribution:

```{r}
meansp = mean(SPreturn)
volatilitysp = sd(SPreturn)
VaR2 = meansp + t1 * volatilitysp
VaR2
```
Interpretation of this 1%VaR:

Assuming fat-tail distribution of daily returns of the S&P500, I can lose in one day -5.0507905% or more with a probability of 1%.

Compared to the first 1%VaR=-277.331227% (assuming normal distribution), this 1%VaR=-5.0507905% assuming fat-tail distribution is more conservative, and more realistic.

To calculate the Expected Shortfall assuming fat-tail distribution, we can calculate the average return of the bad days that have had returns less than the 1%VaR assuming fat-tail distribution:

```{r}
baddays2 <- SPreturn[SPreturn$return<=VaR2,]
baddays2
```

Now we can get the average of these very bad days:

```{r}
ES3 = mean(baddays2$return)
ES3
```

This would be the more conservative estimation of 1% Expected Shortfall of the S&P500. Then, assuming fat-tail distribution of daily returns of the S&P500, and using real historical daily returns, ON AVERAGE, I can lose in one day -8.4116486% with a probability of 1%.

# Forecasting VaR with GARCH(1,1) Model

Work with the same dataset (daily S&P500 returns).

1.- Run a GARCH(1,1) model. INTERPRET the model and briefly explain your results.

```{r}
library(rugarch)
Especgarch1=ugarchspec(variance.model= list(model= "sGARCH", garchOrder= c(1, 1), 
submodel= NULL,  variance.targeting= FALSE), 
mean.model= list(armaOrder= c(0, 0), include.mean= TRUE, archm= FALSE, 
archpow= 0, arfima= FALSE,  external.regressors= NULL, archex= FALSE), 
distribution.model= "norm", start.pars= list(), fixed.pars= list())

#Estimación del GARCH(1,1)

garch1<- ugarchfit(spec=Especgarch1, data=SPreturn$return)
garch1
```
*INTERPRETATION the model output**

What does each coefficient mean?

THE beta1 COEFFICIENT IS 0.7210678, AND IT IS SIGNIFICANTLY DIFFERENT THAN ZERO. THIS MEANS THAT ABOUT 72.1067762% OF THE RETURN VARIANCE OF YESTERDAY IS PASSED TO THE RETURN VARIANCE OF TODAY, BESIDES THE IMPACT OF THE YESTERDAY’S SQUARED ERROR. SINCE THIS COEFFICIENT IS GREATER THAN 0.7, THIS IS A SIGN OF CLUSTERING OF VOLATILITY.

CONSIDERING THE VARIANCE OF YESTERDAY, THE SQUARED ERROR OF YESTERDAY IMPACTS THE RETURN VARIANCE OF TODAY IN A POSITIVE AND SIGNIFICANT WAY, BUT ONLY ABOUT 25.3621103% OF THE SQUARED ERROR OF YESTERDAY IS PASSED TO THE VARIANCE OF TODAY.

2.- With this GARCH(1,1) model, forecast both volatility and 1% VaR of $1 million for 1 future day (tomorrow).

```{r}
garch1.prediction <- ugarchforecast(garch1,n.ahead= 1)
```
garch1.prediction will be a very special R class: an uGARCHforecast class. This is special R class object that stores information about the GARCH prediction.

For this type of R class objects, we use the @ character to get access to the content of each element stored in this object.

The forecast for the daily return is stored in
```{r}
garch1.prediction@forecast$seriesFor
```

The forecast for the standard deviation of returns (volatility) is stored in:

```{r}
garch1.prediction@forecast$sigmaFor
```

Now using this predicted volatility, I calculate a new 1%VaR for tomorrow. I will assume fat-tail distribution, so I will use the t-Student distribution with the degrees of freedom I calculated above:

```{r}
VaR1_G = meansp + t1 * garch1.prediction@forecast$sigmaFor[1]
VaR1_G
```

*The 1%VaR using the predicted volatility from the GARCH(1,1) model is -7.2128673%.

Extend the dataset 10 days in the future. Do a within-sample forecast and an out-of-sample forecast for all days of the year (before and after TODAY) for daily volatility and 1%VaR for 1 day.

we do a forecast for 10 days
```{r}
garch1.prediction <- ugarchforecast(garch1,n.ahead= 10)
```
We can plot the historic and future volatility:
```{r}
plot(garch1.prediction,which = 3)
```
Now I calculate the daily 1%VaR for these 10 future days:
```{r}
VaR1_G = meansp + t1 * garch1.prediction@forecast$sigmaFor
VaR1_G
```
```{r}
# Now I calculate 1%VaR for a $1 million per day:
VaR1_G_1mill = VaR1_G * 1000000
VaR1_G_1mill
```

4.- Do a plot of volatility and VaR for the whole year
*The GARCH(1,1) fit estimation for volatility (sigma) and for stock return in the history is stored in the garch1 R object at:

garch1@fit$sigma

We can look at the last values of the volatility fitted values:

```{r}
tail(garch1@fit$sigma)
```

We can use these fit sigma values and attach them in the SPreturn xts dataset along with the daily returns:

```{r}
SPreturn$gvol = garch1@fit$sigma
```

Now we add a column for the 1%VaR using these historical GARCH volatilities. I will use the t-Student distribution assuming fat-tails ( I will use the t1 I estimated above):

```{r}
SPreturn$gVaR = meansp + t1 * SPreturn$gvol
SPreturn$gVaR1mill = SPreturn$gVaR * 1000000
```
Now I plot daily historical returns, and historial GARCH(1,1) fitted values for daily volatilities and 1%Value at Risk:
```{r}
plot(SPreturn$gVaR1mill)
```
```{r}
plot(SPreturn[,c(2,3)])
```
The green line is the GARCH(1,1) 1% Value at Risk; the red line is the GARCH(1,1) fitted volatility, and the black line is the daily return series.

I recommend you to read the article GARCH 101: The use of [http://www.cmat.edu.uy/~mordecki/hk/engle.pdf||ARCH/GARCH Models in Applied Econometrics] written by Robert Engle to better understand how you can apply ARCH model to forecast volatility and VaR

