1 General directions for this Workshop

You will work in RStudio. Create an R Notebook document (File -> New File -> R Notebook), where you have to write whatever is asked in this workshop.

At the beginning of the R Notebook write Workshop 9 - Financial Econometrics II and your name (as we did in previous workshop).

You have to replicate all the steps explained in this workshop, and ALSO you have to do whatever is asked. Any QUESTION or any STEP you need to do will be written in CAPITAL LETTERS. For ANY QUESTION, you have to RESPOND IN CAPITAL LETTERS right after the question.

It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.

You have to keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file. Pay attention in class to know how to generate an html file from your .Rmd.

2 Introduction to Value at Risk (VaR)

Value at Risk (VaR) is a measure of risk that focuses on potential losses of an investment in a short period of time.

Investment firms and banks constantly monitor VaR of their portfolios in order to be prepared in case a bad scenario. VaR is usually estimated daily using a small probability such as 5%, 1%, or 0.1%.

Very bad scenarios in the short term have a certain probability p to happen in one day. For example, if the probability of losing 10% or MORE of an investment in one single day is 5%, then we say that the 5% Value at Risk (5%VaR) is equal to 10%.

We can also calculate VaR in money if we know the initial investment. For example, using the same example, if my initial investment is $1’000,000, then the 5%VaR = 10%, which is equal to $100,000.00. VaR can be reported as a positive or negative amount; this is just a matter of preference. But we have to be aware that VaR is a measure of potential loss in % or in $ with a certain probability.

p%VaR is the MINIMUM % loss of an investment with a probability of p. In other words, there is a probability p that I can lose %VaR or MORE. Another way to define p%VaR is the following. p%VaR is the MAXIMUM %loss of an investment with a probatility (1-p). In other words, there is a probability (1-p) that I can have a return of %VaR or HIGHER.

Then, we can say that p%VaR is the best scenario that can happen after a very bad scenario that can happen with probability p. Also, we can say that p%VaR is the worse scenario that can happen when the scenario is not the bad one, so when there is a probability of (1-p)%.

Let’s illustrate VaR using real data from the US index S&P500 daily returns from January 2019 to April 26, 2022.

library(quantmod)
getSymbols("^GSPC", from="2019-01-01",
                    to="2022-04-27")

## [1] "^GSPC"

# I get the adjusted column:
SPindex = Ad(GSPC)
# I calculate daily returns of the index:
SPreturn<- na.omit(diff(log(SPindex)))
names(SPreturn)<-c("return")

We plot the daily index over time:

plot(SPindex)

We see strong decline in March 2020, another important decline in early 2022.

Now we do a histogram of the return of this index:

hist(SPreturn, breaks=40)

The height of each bar represents the frequency in number of days that a specific daily return range has happened.

If the daily returns behaves like a normal-distributed variable, then about 2.5% of the days the S&P500 returns would be less than the its mean minus 2 times its standard deviation. Remember that for a normal-distributed variable, in the range plus/minus 2 standard deviations from its mean, we will find about 95% of the values.

We downloaded 835 days of return, so we would expect that about 20.875 will be the # of days that the SP&500 would offer returns less than -2 times its standard deviation from its mean:

# Number of days for the S&P returns:
N=nrow(SPreturn)
N

## [1] 835

# The 2.5% of these # of days is about:
n25=0.025*nrow(SPreturn)
n25

## [1] 20.875

Now we can look the real data and see how many days the S&P500 has offered returns less than 2 times its standard deviation from its mean. We can calculate the mean and standard deviation (volatility) of daily returns, and then identify how many days the S&P500 has offered extreme returns less than the mean minus 2 times its volatility:

# We calculate the mean and standard deviation of returns:
mean_return=mean(SPreturn$return)
mean_return

## [1] 0.0006094223

volatility = sd(SPreturn$return)
volatility

## [1] 0.01417127

# We calculate the expected minimum return of the 95% confidence interval (C.I.) 
#   assuming normality (the 2.5 percentile)
min95ci_returns = mean_return-2*volatility
min95ci_returns

## [1] -0.02773312

# We select the days with returns less than the minimum of the 95% C.I.
extremes<- SPreturn[SPreturn$return<min95ci_returns,]
extremes

##                 return
## 2019-08-05 -0.03023018
## 2019-08-14 -0.02973035
## 2020-02-24 -0.03408808
## 2020-02-25 -0.03074790
## 2020-02-27 -0.04516814
## 2020-03-03 -0.02851048
## 2020-03-05 -0.03451078
## 2020-03-09 -0.07901041
## 2020-03-11 -0.05010289
## 2020-03-12 -0.09994485
## 2020-03-16 -0.12765220
## 2020-03-18 -0.05322227
## 2020-03-20 -0.04432762
## 2020-03-23 -0.02973150
## 2020-03-27 -0.03426785
## 2020-04-01 -0.04514636
## 2020-04-21 -0.03115512
## 2020-05-01 -0.02846021
## 2020-06-11 -0.06075269
## 2020-09-03 -0.03575759
## 2020-09-08 -0.02814883
## 2020-10-28 -0.03592554
## 2022-03-07 -0.02996259
## 2022-04-22 -0.02813208
## 2022-04-26 -0.02855001

nextremes = nrow(extremes)
# Number of extreme returns:
nextremes

## [1] 25

percextremes = nextremes / N * 100
percextremes

## [1] 2.994012

In this case, we have 25 extreme days with returns less than the minimum of the 95% C.I. of daily returns. Then, it seems that we have had more extreme returns than expected. This represents about 2.994012% of the total # of days, which is greater than the expected 2.5%.

Then, 2.994012% of the time, the S&P500 has experience a loss of -2.7733123% or MORE. If we believe that in the future the S&P500 will behave in a similar manner, then we can say that there is a probability of 2.994012% that I can lose -2.7733123% or MORE in only ONE DAY.

We can also say that there is a probability of 97.005988% (100% - 2.994012%) that the MAXIMUM loss I can experience in ONE DAY is -2.7733123%. Then, the 2.994012%VaR of the daily S&P500 returns is -2.7733123%:

VaR1 = min95ci_returns*100
VaR1

## [1] -2.773312

For turbulence periods in financial markets it is likely that a bad scenario with probability p can happen. Once this scenario happens, then p%VaR is the best scenario, but in reality this the p%VaR is not too realistic. We need to be more conservative and estimate which might be the expected loss once the probability p of the bad scenario happens. This conservative estimate is given by the Expected Shortfall. We will review the concept of ES in the following section.

3 Expected Shortfall (ES)

Expected Shortfall with a probability p refers to the expected average loss of an investment in ONE period (in this case, in one day) once the probability p of a bad scenario happens. Then, the absolute value of p%ES will always be greater (more negative) than the p%VaR. Let’s calculate the ES using the previous example of the S&P500 500 daily returns.

To estimate the ES at the 2.994012% probability, we just calculate the average of the days when the S&P500 returns have been less than -2.7733123%.We can do this as follows:

extremes

##                 return
## 2019-08-05 -0.03023018
## 2019-08-14 -0.02973035
## 2020-02-24 -0.03408808
## 2020-02-25 -0.03074790
## 2020-02-27 -0.04516814
## 2020-03-03 -0.02851048
## 2020-03-05 -0.03451078
## 2020-03-09 -0.07901041
## 2020-03-11 -0.05010289
## 2020-03-12 -0.09994485
## 2020-03-16 -0.12765220
## 2020-03-18 -0.05322227
## 2020-03-20 -0.04432762
## 2020-03-23 -0.02973150
## 2020-03-27 -0.03426785
## 2020-04-01 -0.04514636
## 2020-04-21 -0.03115512
## 2020-05-01 -0.02846021
## 2020-06-11 -0.06075269
## 2020-09-03 -0.03575759
## 2020-09-08 -0.02814883
## 2020-10-28 -0.03592554
## 2022-03-07 -0.02996259
## 2022-04-22 -0.02813208
## 2022-04-26 -0.02855001

ES=mean(extremes$return) * 100
ES

## [1] -4.412946

The Expected Shortfall at the 2.994012% probability is the average of these extreme return, which is -4.4129462%. As expected, the 2.994012% ES is more negative than the 2.994012%VaR (-4.4129462% vs -2.7733123%).

There are more than one method to estimate VaR and ES. Read the my Note “Introduction to Value at Risk” to learn different methods to estimate VaR and Expected Shortfall.

The most popular methods are 1) Percentile method using real distribution, 2) assuming normal distribution of returns, and 3) assuming a fat-tailed distribution of returns using the t-Student probability distribution.

Another not popular method (but more accurate) is using a volatility model like an GARCH(1,1) model to estimate and forecast daily volatility, and use this volatility to estimate Value at Risk.

4 Kurtosis and fat-tail distribution

Skewness and Kurtosis are statistical measures of a random variable. Here is a quick summary of Skewness and Kurtosis:

Skewness is a measure for asymmetry of the distribution. A normal distribution has about the same values that are less than then mean compared with the values that are higher than the mean. A normal distribution has a skewness close to zero. A negative skewness means that the tail to the left has more values (skewed to the left). A positive skewness means that the tail to the right has more values (skewed to the right).

Kurtosis can be used to examine know whether a variable has a distribution with much more extreme values compared to the normal distribution. According to the value of Kurtosis, we can identify whether we have a significant number of extreme values in a variable:

If Kurtosis=3 then, the variable behaves close to normal distribution

If Kurtosis>3, then this means that there are extreme values in the variable that makes the distribution to be a fat-tailed distribution.

Here is a quick review of moments in statistics, which are the base to understand Variance, Skewness and Kurtosis:

Moments in Statistics:

A Moment in Statistis is a specific measure about a probability distribution of a random variable.

The expected value of a random variable X can be calculated as the mean of X using historical values:

\[ Mean\left(X\right)=\overline{X}=E\left[X\right]=\frac{1}{N}\sum\left(X_{i}\right)=\frac{X_{1}+X_{2}+...X_{N}}{N} \]

The 4 Moments in Statistics are:

1st moment: $E\left[X-\overline{X}\right]=0$
2nd moment: $E\left[X-\overline{X}\right]^{2}=\frac{1}{N}\sum\left(X_{i}-\overline{X}\right)^{2}=VAR\left(X\right)$ = Variance(X);

The Standard Deviation of X is the square root of the Variance(X):

$SD\left(X\right)=\sqrt{VAR\left(X\right)}$

3rd moment: $E\left[X-\overline{X}\right]^{3}=\frac{1}{N}\sum\left(X_{i}-\overline{X}\right)^{3}$

The 3rd moment is used to estimate Skewness(X) as:

$Skewness\left(X\right)=\frac{E\left[X-\overline{X}\right]^{3}}{SD\left(X\right)^{3}}$

4rd moment: $E\left[X-\overline{X}\right]^{4}=\frac{1}{N}\sum\left(X_{i}-\overline{X}\right)^{4}$

The 4rd moment is used to estimate Kurtosis(X) as:

$Kurtosis\left(X\right)=\frac{E\left[X-\overline{X}\right]^{4}}{SD\left(X\right)^{4}}$

Calculate Kurtosis of S&P500 daily returns. Do you see a sign of fat-tail distribution of returns

The moments library has functions for Kurtosis and Skewness. Install this library.

library(moments)
kurtosis = as.numeric(kurtosis(SPreturn))
kurtosis

## [1] 20.0883

Since kurtosis is much higher than 3, this is a sign of fat-tailed distribution.

5 Estimating VaR assuming fat-tail distribution

Considering that the distribution of real returns is more a fat-tail distribution (more-than-normal extreme values), it is better to use the t distribution rather than the normal distribution. Research indicates that there is a relationship between the kurtosis and the degrees of freedom of the t distribution to model fat-tail distributions.

A t-distribution with low degrees of freedom shows fatter tails compared with a t distribution with higher degrees of freedom (or compared to a normal distribution).

The way to estimate the degrees of freedom of a t distribution to model a fat-tail distribution, we use the kurtosis of the distribution as follows:

Degrees of freedom (df)= 6/Kurtosis + 4

You have to estimate the degrees of freedom and then use the t distribution to estimate the 1% VaR of $1million for the next future day of the sample. Compare both estimations of VaR, the one calculated assuming normal distribution and this one. Briefly reports this comparison.

Hint: the function qt(0.01,df) gives you the t value, which represents the # of standard deviations you need to go from the mean to the left to arrive to the 1% percentile.

I calculate the degrees of freedom I use to simulate a fat-tailed distribution using the t-Student distribution:

# I calculate the degrees of freedom for the t distribution:
df = 6/kurtosis + 4
df

## [1] 4.298681

# I calculate the t critical value according to the probability of 1% and the df:
t1 = qt(0.01,df)
t1

## [1] -3.607109

Now I get the 1%VaR assuming fat-tail distribution:

meansp = mean(SPreturn)
volatilitysp = sd(SPreturn)
VaR2 = meansp + t1 * volatilitysp
VaR2

## [1] -0.05050791

Interpretation of this 1%VaR:

Assuming fat-tail distribution of daily returns of the S&P500, I can lose in one day -5.0507905% or more with a probability of 1%.

Compared to the first 1%VaR=-277.331227% (assuming normal distribution), this 1%VaR=-5.0507905% assuming fat-tail distribution is more conservative, and more realistic.

To calculate the Expected Shortfall assuming fat-tail distribution, we can calculate the average return of the bad days that have had returns less than the 1%VaR assuming fat-tail distribution:

baddays2 <- SPreturn[SPreturn$return<=VaR2,]
baddays2

##                 return
## 2020-03-09 -0.07901041
## 2020-03-12 -0.09994485
## 2020-03-16 -0.12765220
## 2020-03-18 -0.05322227
## 2020-06-11 -0.06075269

Now we can get the average of these very bad days:

ES3 = mean(baddays2$return)
ES3

## [1] -0.08411649

This would be the more conservative estimation of 1% Expected Shortfall of the S&P500. Then, assuming fat-tail distribution of daily returns of the S&P500, and using real historical daily returns, ON AVERAGE, I can lose in one day -8.4116486% with a probability of 1%.

6 Forecasting VaR with GARCH(1,1) Model

Work with the same dataset (daily S&P500 returns).

Run a GARCH(1,1) model. INTERPRET the model and briefly explain your results.

library(rugarch)
Especgarch1=ugarchspec(variance.model= list(model= "sGARCH", garchOrder= c(1, 1), 
submodel= NULL,  variance.targeting= FALSE), 
mean.model= list(armaOrder= c(0, 0), include.mean= TRUE, archm= FALSE, 
archpow= 0, arfima= FALSE,  external.regressors= NULL, archex= FALSE), 
distribution.model= "norm", start.pars= list(), fixed.pars= list())

#Estimación del GARCH(1,1)

garch1<- ugarchfit(spec=Especgarch1, data=SPreturn$return)
garch1

## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,1)
## Mean Model   : ARFIMA(0,0,0)
## Distribution : norm 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.001047    0.000270   3.8804 0.000104
## omega   0.000006    0.000003   2.3265 0.019992
## alpha1  0.253621    0.036941   6.8656 0.000000
## beta1   0.721068    0.011208  64.3365 0.000000
## 
## Robust Standard Errors:
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.001047    0.000260  4.02891 0.000056
## omega   0.000006    0.000007  0.84434 0.398480
## alpha1  0.253621    0.051157  4.95771 0.000001
## beta1   0.721068    0.094269  7.64904 0.000000
## 
## LogLikelihood : 2677.884 
## 
## Information Criteria
## ------------------------------------
##                     
## Akaike       -6.4045
## Bayes        -6.3819
## Shibata      -6.4046
## Hannan-Quinn -6.3958
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                     0.7817  0.3766
## Lag[2*(p+q)+(p+q)-1][2]    0.9404  0.5180
## Lag[4*(p+q)+(p+q)-1][5]    1.3737  0.7709
## d.o.f=0
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                     0.4232  0.5153
## Lag[2*(p+q)+(p+q)-1][5]    1.1354  0.8283
## Lag[4*(p+q)+(p+q)-1][9]    1.7665  0.9310
## d.o.f=2
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale P-Value
## ARCH Lag[3]   0.01124 0.500 2.000  0.9156
## ARCH Lag[5]   0.92434 1.440 1.667  0.7555
## ARCH Lag[7]   1.06258 2.315 1.543  0.9032
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  2.5098
## Individual Statistics:             
## mu     0.2157
## omega  0.2498
## alpha1 0.1074
## beta1  0.2079
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          1.07 1.24 1.6
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value    prob sig
## Sign Bias           2.0405 0.04162  **
## Negative Sign Bias  0.6192 0.53597    
## Positive Sign Bias  0.4766 0.63375    
## Joint Effect        7.0996 0.06879   *
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20     43.06    0.0012721
## 2    30     60.44    0.0005448
## 3    40     80.40    0.0001071
## 4    50     77.28    0.0061301
## 
## 
## Elapsed time : 0.133045

*INTERPRETATION the model output**

What does each coefficient mean?

THE beta1 COEFFICIENT IS 0.7210678, AND IT IS SIGNIFICANTLY DIFFERENT THAN ZERO. THIS MEANS THAT ABOUT 72.1067762% OF THE RETURN VARIANCE OF YESTERDAY IS PASSED TO THE RETURN VARIANCE OF TODAY, BESIDES THE IMPACT OF THE YESTERDAY’S SQUARED ERROR. SINCE THIS COEFFICIENT IS GREATER THAN 0.7, THIS IS A SIGN OF CLUSTERING OF VOLATILITY.

CONSIDERING THE VARIANCE OF YESTERDAY, THE SQUARED ERROR OF YESTERDAY IMPACTS THE RETURN VARIANCE OF TODAY IN A POSITIVE AND SIGNIFICANT WAY, BUT ONLY ABOUT 25.3621103% OF THE SQUARED ERROR OF YESTERDAY IS PASSED TO THE VARIANCE OF TODAY.

With this GARCH(1,1) model, forecast both volatility and 1% VaR of $1 million for 1 future day (tomorrow).

garch1.prediction <- ugarchforecast(garch1,n.ahead= 1)

garch1.prediction will be a very special R class: an uGARCHforecast class. This is special R class object that stores information about the GARCH prediction.

For this type of R class objects, we use the @ character to get access to the content of each element stored in this object.

The forecast for the daily return is stored in:

garch1.prediction@forecast$seriesFor

##      2022-04-26
## T+1 0.001047322

The forecast for the standard deviation of returns (volatility) is stored in:

garch1.prediction@forecast$sigmaFor

##     2022-04-26
## T+1  0.0201652

Now using this predicted volatility, I calculate a new 1%VaR for tomorrow. I will assume fat-tail distribution, so I will use the t-Student distribution with the degrees of freedom I calculated above:

VaR1_G = meansp + t1 * garch1.prediction@forecast$sigmaFor[1]
VaR1_G

## [1] -0.07212867

*The 1%VaR using the predicted volatility from the GARCH(1,1) model is -7.2128673%.

Extend the dataset 10 days in the future. Do a within-sample forecast and an out-of-sample forecast for all days of the year (before and after TODAY) for daily volatility and 1%VaR for 1 day.

We do a forecast for 10 days:

garch1.prediction <- ugarchforecast(garch1,n.ahead= 10)

We can plot the historic and future volatility:

plot(garch1.prediction,which = 3)

Now I calculate the daily 1%VaR for these 10 future days:

VaR1_G = meansp + t1 * garch1.prediction@forecast$sigmaFor
VaR1_G

##       2022-04-26
## T+1  -0.07212867
## T+2  -0.07174392
## T+3  -0.07136692
## T+4  -0.07099756
## T+5  -0.07063570
## T+6  -0.07028122
## T+7  -0.06993400
## T+8  -0.06959392
## T+9  -0.06926085
## T+10 -0.06893468

# Now I calculate 1%VaR for a $1 million per day:
VaR1_G_1mill = VaR1_G * 1000000
VaR1_G_1mill

##      2022-04-26
## T+1   -72128.67
## T+2   -71743.92
## T+3   -71366.92
## T+4   -70997.56
## T+5   -70635.70
## T+6   -70281.22
## T+7   -69934.00
## T+8   -69593.92
## T+9   -69260.85
## T+10  -68934.68

Do a plot of volatility and VaR for the whole year

*The GARCH(1,1) fit estimation for volatility (sigma) and for stock return in the history is stored in the garch1 R object at:

garch1@fit$sigma

We can look at the last values of the volatility fitted values:

tail(garch1@fit$sigma)

## [1] 0.01017609 0.01169813 0.01026554 0.01209087 0.01809300 0.01573199

We can use these fit sigma values and attach them in the SPreturn xts dataset along with the daily returns:

SPreturn$gvol = garch1@fit$sigma

Now we add a column for the 1%VaR using these historical GARCH volatilities. I will use the t-Student distribution assuming fat-tails ( I will use the t1 I estimated above):

SPreturn$gVaR = meansp + t1 * SPreturn$gvol
SPreturn$gVaR1mill = SPreturn$gVaR * 1000000

Now I plot daily historical returns, and historial GARCH(1,1) fitted values for daily volatilities and 1%Value at Risk:

plot(SPreturn$gVaR1mill)

plot(SPreturn[,c(2,3)])

The green line is the GARCH(1,1) 1% Value at Risk; the red line is the GARCH(1,1) fitted volatility, and the black line is the daily return series.

I recommend you to read the article GARCH 101: The use of [http://www.cmat.edu.uy/~mordecki/hk/engle.pdf||ARCH/GARCH Models in Applied Econometrics] written by Robert Engle to better understand how you can apply ARCH model to forecast volatility and VaR.

7 Quiz 9 and W9 submission

Go to Canvas and respond Quiz 9. You will have 3 attempts. Questions in this Quiz are related to concepts of the readings related to this Workshop.

The grade of this Workshop will be the following:

Complete (100%): If you submit an ORIGINAL and COMPLETE HTML file with all the activities, with your notes, and with your OWN RESPONSES to questions
Incomplete (75%): If you submit an ORIGINAL HTML file with ALL the activities but you did NOT RESPOND to the questions and/or you did not do all activities and respond to some of the questions.
Very Incomplete (10%-70%): If you complete from 10% to 75% of the workshop or you completed more but parts of your work is a copy-paste from other workshops.
Not submitted (0%)

Remember that you have to submit your .html file through Canvas BEFORE NEXT CLASS.

Workshop 9, Financial Econometrics II

Alberto Dorantes, Ph.D.

Apr 27, 2022