Synopsis
References
Section 1a - Working Directory, Required Packages, and Session Information
Section 1b - Formatting Proprietary Data
Section 1c - Downloading Stock Ticker Data
Section 2a - Exploratory Data Analysis
Section 2b - VaR Estimation of the 10 Stocks Index
Section 2c - Estimated Shortfall (CvaR) 10 Stocks Index
Section 2d - Hill Estimation of 10 Stocks Index
Section 2e - Anderson-Darling Test of Normal Distribution
Section 2f - Table of Results
Section 3a - EVT Block Maxima Estimation of 10 Stocks Index
Section 3b - VaR Forecasting of Block Maxima
Section 3c - Estimated Shortfall (CvaR) of Block Maxima
Section 3d - Hill Estimation of Block Maxima
Section 3e - Anderson-Darling Test of Normal Distribution
Section 3f - Table of Results
Section 3g - 100 days GARCH Forecasting of Block Maxima
Section 4a - EVT Peaks-Over-Threshold Estimation - 10 Stocks Index
Section 4b - 100, and 500 days VaR Forecasting of POT
Section 4c - Estimated Shortfall (CvaR) of POT
Section 4d - Hill Estimation of POT
Section 4e - Anderson-Darling Test of Normal Distribution
Section 4f - Table of Results
Section 4g - 100 days GARCH Forecasting of Peaks-Over-Threshold
Section 5a - Table of Estimation Method Influence
Section 5b - Conclusions
Section 6 - R Language Source Code
This document is an examination of implementing Extreme Value Theory (EVT) to determine Value at Risk, (and Conditional VaR), of an index of 10 stocks, utilizing the R programming language. The combined data on 10 stocks is tested for normality with an Anderson-Darling Test, and the EVT methods of Block Maxima and Peak-Over-Threshold are used for VaR/CvaR estimation. Finally, Generalized AutoRegression of Conditional Heteroskadacity (GARCH) processing is used to predict future values of the index, 20 days into the future. This document will determine the influence on model results of different methods for calculating Risk Factors.
Extreme Value Theory, (initially developed by Fisher, Tippett, and Gnedenko), demonstrates that the distribution of the block-maxima of a sample of independent identically distributed, (iid), variables converges to one of the three Extreme Value distributions.
Renewed interest by Statisticians to modeling Extreme Values has recently occurred. Extreme Value Thoery has proven useful in a variety of Risk Factor cases. After Financial Market instabilities from 1999 - 2008, Extreme Value Analysis gained in effectiveness, as opposed to previous Value-at-Risk analysis. Extreme Values represent extreme fluctuations of a system. EVT offers the ability to model the relationships of probability of extreme events, magnitude, damage, and cost of protection.
https://arxiv.org/pdf/1310.3222.pdf
https://www.ma.utexas.edu/mp_arc/c/11/11-33.pdf
http://evt2013.weebly.com/uploads/1/2/6/9/12699923/penalva.pdf
http://pubs.sciepub.com/ijefm/2/5/4/
http://www4.stat.ncsu.edu/~mannshardt/st810EVA/Lectures/Lec3GEV.pdf
http://www.sfu.ca/~rjones/econ811/readings/McNeil%201999.pdf
http://www.bankofcanada.ca/wp-content/uploads/2010/01/wp00-20.pdf
To begin the analysis, the Working Directory is set to the folder containing the stock ticker comma separated values files. Then, the needed R programming language packages are installed and included in the package library. The R packages include packages for Extreme Value Theory functions, VaR functions, time series analysis, quantitative trade analysis, regression analysis, plotting, and html formatting.
| List of Required Packages | |
|---|---|
| Required Packages | ‘dplyr’ ‘magrittr’ ‘ggplot2’ ‘tseries’ ‘vars’ ‘evd’ ‘evir’ ‘POT’ ‘fBasics’ ‘fExtremes’ ‘quantmod’ ‘PerformanceAnalytics’ ‘rugarch’ ‘fGarch’ ‘nortest’ ‘knitr’ |
The session information below is for reference when running the required packages, and R code.
| Session Information | |
|---|---|
| R Version | R version 3.3.1 (2016-06-21) |
| Platform | x86_64-w64-mingw32/x64 (64-bit) |
| Running | Windows 10 x64 (build 14393) |
| RStudio Citation | RStudio: Integrated Development for R. |
| RStudio Version | 1.0.44 |
The first file to explore for this analysis is “Data_CSV.csv”. This file contains stock ticker data for 15 German corporations listed on the DAX stock exchange, and market portfolio data for the DAX exchange. Ten corporations were selected from this data file, and the last ten years of stock price information for these corporations was downloaded from Google Finance.
## [1] "Volkswagen VG" "Adidas"
## [3] "Siemens" "Deutsche Bank"
## [5] "Henkel" "SAP"
## [7] "BASF" "Bayer"
## [9] "Allianz" "Daimler"
## [11] "Fresenius" "BMW"
## [13] "Commerzbank" "Lufthansa"
## [15] "Deutsche Post" "DAX (Market portfolio)"
## [1] "VOW3" "ADS" "SIE" "DB" "HEN3" "SAP" "BAS" "BAYN" "ALV" "DAI"
## [11] "FMS" "BMW" "CBK" "LHA" "DPW" "DAX"
The stock price data was downloaded and read into the R programming environment. The returns are calculated with “Open Price / Close Price - 1”, and the data from the ten corporations is combined in one dataframe, (with one column per corporation).
Each row of the resulting dataframe represents one business day of the 10 years of recorded stock prices. The means for each row in the dataframe is then calculated. A column of 10 years of dates is appended to the dataframe. A second dataframe containing only the row means, and date information, is also created.
A table of dataframe statistics is created containing the Minimum, Median, Mean, Maximum, Standard Deviation, 1% Quantile, 5% Quantile, 95% Quantile, 99% Quantile, and Number of Observations for each column, (or Corporation). The quantile percentages are for extreme values. A times series chart of the means of all the returns is also created.
| Stats | Volkswagen | Adidas | Siemens | DeutscheBank | Henkel |
|---|---|---|---|---|---|
| Min. | -0.99 | -0.98 | -0.99 | -0.99 | -0.98 |
| Median | 3.67 | 18.72 | 10.94 | 23.52 | 19.28 |
| Mean | 9.29 | 15.71 | 10.10 | 21.35 | 16.73 |
| Max. | 38.42 | 21.96 | 17.24 | 32.08 | 24.06 |
| Std. Dev. | 9.76 | 6.02 | 4.89 | 7.28 | 6.97 |
| 1% Quantile | -0.81 | -0.79 | -0.83 | -0.78 | -0.83 |
| 5% Quantile | -0.07 | -0.20 | -0.16 | 0.12 | -0.27 |
| 95% Quantile | 25.22 | 20.69 | 16.42 | 28.18 | 23.01 |
| 99% Quantile | 33.31 | 20.93 | 16.87 | 28.82 | 23.49 |
| No. Of Observations | 2528.00 | 2528.00 | 2528.00 | 2528.00 | 2528.00 |
| Stats | SAP | BASF | Fresenius | Allianz | Daimler |
|---|---|---|---|---|---|
| Min. | -0.99 | -0.98 | -0.96 | -0.99 | -0.98 |
| Median | 17.39 | 14.08 | 23.27 | 4.06 | 16.75 |
| Mean | 15.20 | 13.01 | 23.24 | 7.86 | 14.93 |
| Max. | 22.74 | 20.13 | 181.02 | 20.31 | 23.66 |
| Std. Dev. | 6.23 | 5.04 | 14.10 | 6.93 | 6.04 |
| 1% Quantile | -0.18 | 0.00 | 0.62 | -0.87 | -0.21 |
| 5% Quantile | 2.66 | 3.09 | 5.18 | -0.31 | 2.99 |
| 95% Quantile | 21.95 | 18.75 | 30.39 | 17.29 | 21.74 |
| 99% Quantile | 22.31 | 19.07 | 31.23 | 17.78 | 22.19 |
| No. Of Observations | 2527.00 | 2527.00 | 2528.00 | 2528.00 | 2528.00 |
To begin estimation of the future events implied by the data, an initial Value at Risk estimation is made. First the dataframe of all row means and date information is converted to time series format, then a VaR is calculated from this time series. Predictions are made from the VaR calculation of values 100, and 500, days in the future. In the subsequent predictions-only plot, the blue circles represent values for 100 days in the future, and red circles represents return values for 500 days.
For comparison purposes, the Conditional Value at Risk (CvaR or Estimated Shortfall), of the 10 Stocks Index data is calculated. First, the times series of the data is utilized to find the maximum of the worst 0.95% drawdowns. Then, an Estimated Shortfall is calculated, via the “gaussian” method, and the results of both calculations are presented in table format.
| Max Drawdown | from | to | Estimated Shortfall | |
|---|---|---|---|---|
| 21.10127 | 1933 | 2171 | -9.508332 |
Because it is presumed that the 10 Stocks Index data is a heavy-tailed distribution, with infrequent extreme changes in the data, Hill Estimation is used for parametric estimation of the tail index. The objective is to verify that the 10 Stocks data is an Extreme Value Distribution. The plot generated by the Hill Estimation confirmes that is factual.
The Anderson-Darling Test is mainly used on families of distributions, and is a powerful determinent of non-normality of a distribution. With a large sample size, (as in the 10 Stocks Index), a P-Value of less than 0.05 indicates that the distribution varies from normality. That is expected of an Extreme Value Distribution. Probability Value found with the Anderson-Darling Test was 3.7^-24, therefore confirming non-normality.
## Statistic P-Value
## A 43.35969 3.7e-24
Finally, a Table of Results of the estimations of the future values of the 10 Stocks Index is presented. The Point Estimates and ranges of the 3 VaR estimations, (and Estimated Shortfall), are tabulated for comparison.
| Method | Point Estimate | lower | upper | CI |
|---|---|---|---|---|
| VaR 100 Days | 14.1053 | 10.1452 | 18.0654 | 3.9601 |
| VaR 500 Days | 13.962 | 9.8798 | 18.0443 | 4.0823 |
| ES | -9.5083 | 1933 | 2171 | NA |
The Block Maxima approach in Extreme Value Theory is the most basic method of EVT analysis. Block Maxima consists of dividing the observation period into non-overlapping periods of equal size, and restricts attention to the maximum observation in each period. The observations created follow domain of attraction conditions, approximately an Extreme Value Distribution. Parametric statistical methods for Extreme Value Distributions are then applied to those observations.
Extreme Value Theorists have developed the Generalized Extreme Value distribution. The GEV contains a family of continuous probability distributions, the Gumbel, Frechet and Weibull distributions, (also known as Type I, II and III Extreme Value Distributions).
In the following EVT Block Maxima analysis, the 10 Stocks Index data is fitted to a GEV. The resulting distribution is plotted. A time series plot is created in order to localize the extreme events on a timeline, from 2006 to 2016. Four plots of the order of Block Maxima data are then created. Finally, table of the Block Maxima analysis parameters is created from the gev() function.
| Block Maxima Parameters | |
|---|---|
| xi | 0.1598266 |
| sigma | 0.0008699 |
| mu | 0.0027634 |
| loc | 0.0030876 |
| scale | 0.0012700 |
| shape | 0.0000035 |
In order to create a Value at Risk, (VaR), estimation from the Block Maxima data, the 10 Stocks Index GEV data is converted into a time series. A VaR estimation is made from the GEV time series data. Predictions of future values, (100 and 500 days in the future), are extrapolated from the VaR data. In the resulting plot, the blue circles represent values for 100 days in the future, and red circles represents return values for 500 days.
The Conditional Value at Risk, (“CvaR” or “Estimated Shortfall”), of the 10 Stocks Index GEV data is calculated. First, the times series of the data is utilized to find the maximum of the worst 0.95% drawdowns. Then, an Estimated Shortfall is calculated via the “modified” method for extreme distributions, and the results of both calculations are presented in table format.
| Max Drawdown | from | to | Estimated Shortfall | |
|---|---|---|---|---|
| 0.0980278 | 1243 | 1933 | -2.19e-05 |
The Hill Estimation, (used for parametric estimation of the tail index), verifies that the 10 Stocks GEV data is an Extreme Value Distribution.
The Anderson-Darling Test is a powerful determinent of non-normality of large sample size distributions. If the P-Value is less than 0.05, the distribution varies from normality. An insignificant Probability Value of 3.7^-24 was found via this test.
## Statistic P-Value
## 422.5289 3.7e-24
Finally, a Table of Results of the estimations of the future values of the 10 Stocks Index GEV is presented. The Point Estimates and ranges of the 3 GEV VaR estimations, (and GEV Estimated Shortfall), are tabulated for comparison.
| Method | Point Estimate | lower | upper | CI |
|---|---|---|---|---|
| GEV VaR 100 Days | 0.0038 | -0.0023 | 0.0099 | 0.0061 |
| GEV VaR 500 Days | 0.0039 | -0.0022 | 0.01 | 0.0061 |
| GEV ES | -2.18696219987451e-05 | 1243 | 1933 | NA |
Predictions are made for the Block Maxima EVT data via fitting the Block Maxima GEV distribution, (of the index of 10 stocks), to a GARCH(1,1), (Generalized Auto-Regressive Conditional Heteroskadacity), model. A table of the prediction formula parameters is displayed. An “Auto Correlation Function” (ACF) plot is created that displays the significant events over time. Then, an array of plots of the fitted model results is displayed. A prediction for the next 20 days, (of the stock index performance), is created. Lastly, the 20 day prediction is displayed in 2 plots.
## [1] 13440
| GARCH Parameters | |
|---|---|
| mu | 0.0060439 |
| ar1 | 0.9976487 |
| ma1 | -0.2595056 |
| omega | -3.2076335 |
| alpha1 | 0.5405849 |
| beta1 | 0.7848842 |
| gamma1 | 1.8662047 |
| shape | 2.1000002 |
In the peaks-over-threshold approach in EVT, the initial observations that exceed a certain high threshold are selected. The probability distribution of those selected observations is approximately a generalized Pareto distribution. A maximum likelihood estimation (mle) is created by fitting a Generalized Pareto Distribution. The MLE statistics are presented in tabular form. The resulting estimation is then graphically diagnosed via MLE plotting.
| POT Parameters | |
|---|---|
| Fitted Values | 14.3934587025131 |
| Standard Error | 0.286270436634011 |
| Standard Error Type | observed |
| Log Likelihood | -9269.60418661959 |
| Hessian | 12.2024489428441 |
| Threshold | 0.35 |
A Value at Risk, (VaR), estimation from the POT data is created via converting the 10 Stocks Index MLE data into a time series. A VaR estimation is made from the MLE time series data. Predictions of future values, (100 and 500 days in the future), are extrapolated from the MLE VaR data. In the resulting plot, the blue circles represent values for 100 days in the future, and red circles represents return values for 500 days.
The Conditional Value at Risk, (“CvaR” or “Estimated Shortfall”), of the 10 Stocks Index MLE data is then calculated. The times series of the data is utilized to find the maximum of the worst 0.95% drawdowns. An Estimated Shortfall is calculated, via the “modified” method for extreme distributions, and the results of both calculations are presented in table format.
| Max Drawdown | from | to | Estimated Shortfall | |
|---|---|---|---|---|
| 21.10127 | 1933 | 2171 | -0.0523317 |
The Hill Estimation, (used for parametric estimation of the tail index), verifies that the 10 Stocks MLE data is an Extreme Value Distribution.
The Anderson-Darling Test is a powerful determinent of non-normality of large sample size distributions. If the P-Value is less than 0.05, the distribution varies from normality. The resulting P-Value was 3.7^-24 for this test.
## Statistic P-Value
## 43.35969 3.7e-24
Finally, a Table of Results of the estimations of the future values of the 10 Stocks Index MLE is presented. The Point Estimates and ranges of the 3 MLE VaR estimations, (and MLE Estimated Shortfall), are tabulated for comparison.
| Method | Point Estimate | lower | upper | CI |
|---|---|---|---|---|
| MLE VaR 100 Days | 14.1053 | 10.1452 | 18.0654 | 3.9601 |
| MLE VaR 500 Days | 13.962 | 9.8798 | 18.0443 | 4.0823 |
| MLE ES | -0.0523 | 1933 | 2171 | NA |
Predictions are made for the Peaks-Over-Threshold EVT data via fitting the MLE, (maximum likelihood estimation of the index of 10 stocks), to a GARCH(1,1), (Generalized Auto-Regressive Conditional Heteroskadacity), model. A table of the prediction formula parameters is displayed. An “Auto Correlation Function” (ACF) plot is created, that displays the significant events over time. Then, an array of plots of the fitted model results are displayed. Then a prediction for the next 20 days, (of the stock index performance), is created. Lastly, the 20 day prediction, (derived from Peaks-Over-Threshold EVT extimation), is displayed in 2 plots.
| POT Parameters | |
|---|---|
| mu | 10.3337633 |
| ar1 | 0.9963151 |
| ma1 | -0.2938808 |
| omega | -0.1349384 |
| alpha1 | -0.1570853 |
| beta1 | 0.8676732 |
| gamma1 | 0.5664016 |
| shape | 2.6095115 |
The following table assembles the results of the four methods of examining the Extreme Distribution 10 Stocks Index. The first column has the names of the four estimation methods. Statistics on VaR, ES, mu (or “shape”) statistic, and Anderson-Darling P-Values are presented.
| Test | VaR | ES | mu | P-Value |
|---|---|---|---|---|
| VaR | 0.3222 | NA | NA | 3.7e-24 |
| ES | NA | -9.50833218037079 | NA | 3.7e-24 |
| Block Maxima | 0.0421 | -2.18696219987451e-05 | 0.00276335504443396 | 3.7e-24 |
| Peaks Over Threshold | 0.3222 | -0.0523317246992927 | 14.3934587025131 | 3.7e-24 |
After thorough examination of 10 years of stock returns of 10 Corporations, (listed on the German DAX stock exchange), the validity of characterising the changes in return percentage as an Extreme Value Distribution is confirmed. All Anderson-Darling Tests of the fitted values of the four analysis methods show insignificant probabilty that the distribution has normality, or all non-exteme values. The methods are in agreement regarding the Value at Risk within the returns data. The Block Maxima method produces a slight devergence of VaR estimation. Traditional VaR estimation, and POT estimation, produce the same Value at Risk. The 2 EVT methods predict lower Expected Shortfall, versus traditional CvaR estimation of stock return data. The Standard Q-Q Plots demonstrate that Peaks-Over-Threshold is the most reliable estimation method, in the case of the Index of 10 Stocks.
# Section 1a - Working Directory, Required Packages, and Session Information
# Set Working Directory to the directory with .csv, or .RData, files
# setwd(" ")
# Required Packages
# install.packages("dplyr")
# install.packages("magrittr")
# install.packages("ggplot2")
# install.packages("tseries")
# install.packages("vars")
# install.packages("evd")
# install.packages("evir")
# install.packages("POT")
# install.packages("fBasics")
# install.packages("fExtremes")
# install.packages("quantmod")
# install.packages("PerformanceAnalytics")
# install.packages("rugarch")
# install.packages("fGarch")
# install.packages("nortest")
# install.packages("knitr")
# Initialize libraries
library(dplyr)
library(magrittr)
library(ggplot2)
library(tseries)
library(vars)
library(evd)
library(evir)
library(POT)
library(fBasics)
library(fExtremes)
library(quantmod)
library(PerformanceAnalytics)
library(rugarch)
library(fGarch)
library(nortest)
library(knitr)
PackageTable <- data.frame(matrix(nrow=1, ncol=1))
rownames(PackageTable) <- "Required Packages"
colnames(PackageTable) <- "List of Required Packages"
PackageTable[1,] <- paste("'dplyr'", "'magrittr'", "'ggplot2'",
"'tseries'", "'vars'", "'evd'", "'evir'",
"'POT'", "'fBasics'", "'fExtremes'",
"'quantmod'", "'PerformanceAnalytics'",
"'rugarch'", "'fGarch'", "'nortest'",
"'knitr'")
kable(PackageTable)
session <- sessionInfo()
SessionTable <- data.frame(matrix(nrow=5, ncol=1))
rownames(SessionTable) <- c("R Version", "Platform", "Running",
"RStudio Citation","RStudio Version")
colnames(SessionTable) <- "Session Information"
SessionTable[1,] <- session$R.version$version.string
SessionTable[2,] <- session$platform
SessionTable[3,] <- session$running
SessionTable[4,] <- "RStudio: Integrated Development for R."
SessionTable[5,] <- "1.0.44"
kable(SessionTable)
## Section 1b - Formatting the Proprietary Data
stockData <- read.csv("Data_CSV.csv", header = F)
companyList <- c("Volkswagen VG", "Adidas", "Siemens",
"Deutsche Bank", "Henkel", "SAP", "BASF", "Bayer",
"Allianz", "Daimler", "Fresenius", "BMW",
"Commerzbank", "Lufthansa", "Deutsche Post",
"DAX (Market portfolio)")
companyList
stockSymbols <- c("VOW3", "ADS", "SIE", "DB", "HEN3", "SAP", "BAS",
"BAYN", "ALV", "DAI", "FMS", "BMW", "CBK", "LHA",
"DPW", "DAX")
stockSymbols
## Section 1c - Downloading Stock Ticker Data
stockData_Volkswagen <- read.csv("vow3.csv")
stockData_Adidas <- read.csv("ads.csv")
stockData_Siemens <- read.csv("sie.csv")
stockData_DeutscheBank <- read.csv("dbk.csv")
stockData_Henkel <- read.csv("hen3.csv")
stockData_SAP <- read.csv("sap.csv")
stockData_BASF <- read.csv("bas.csv")
stockData_Fresenius <- read.csv("fre.csv")
stockData_Allianz <- read.csv("alv.csv")
stockData_Daimler <- read.csv("dai.csv")
returnsFunction <- function(x) {
as.numeric(x$Open) / as.numeric(x$Close) - 1
}
returnsData_Volkswagen <- returnsFunction(stockData_Volkswagen)
returnsData_Adidas <- returnsFunction(stockData_Adidas)
returnsData_Siemens <- returnsFunction(stockData_Siemens)
returnsData_DeutscheBank <- returnsFunction(stockData_DeutscheBank)
returnsData_Henkel <- returnsFunction(stockData_Henkel)
returnsData_SAP <- returnsFunction(stockData_SAP)
returnsData_BASF <- returnsFunction(stockData_BASF)
returnsData_Fresenius <- returnsFunction(stockData_Fresenius)
returnsData_Allianz <- returnsFunction(stockData_Allianz)
returnsData_Daimler <- returnsFunction(stockData_Daimler)
allData <- cbind(returnsData_Volkswagen, returnsData_Adidas,
returnsData_Siemens, returnsData_DeutscheBank,
returnsData_Henkel, returnsData_SAP, returnsData_BASF,
returnsData_Fresenius, returnsData_Allianz,
returnsData_Daimler)
allData <- data.frame(allData)
names(allData) <- c("Volkswagen", "Adidas", "Siemens", "DeutscheBank",
"Henkel", "SAP", "BASF", "Fresenius",
"Allianz", "Daimler")
allData_Means <- rowMeans(allData)
names(stockData_Adidas) <- c("Date", "Open", "High", "Low", "Close", "Volume")
allData$Time <- as.Date(stockData_Adidas$Date[1:2528], format="%d-%b-%y")
allDF <- data.frame(allData$Time, allData_Means)
names(allDF) <- c("Time", "Observations")
# save(allData, file = "allData.RData")
# load("allData.RData")
# Section 2a - Exploratory Data Analysis
tableStats <- matrix(nrow=10, ncol=11)
tableStats <- data.frame(tableStats)
names(tableStats) <- c("Stats", "Volkswagen", "Adidas", "Siemens",
"DeutscheBank", "Henkel", "SAP", "BASF", "Fresenius",
"Allianz", "Daimler")
tableStats$Stats <- c("Min.", "Median", "Mean", "Max.", "Std. Dev.",
"1% Quantile", "5% Quantile", "95% Quantile",
"99% Quantile", "No. Of Observations")
statsFunction <- function(x) {
tableStats$x <- c(min(x), median(x), mean(x),
max(x), sd(x), quantile(x, .01),
quantile(x, .05), quantile(x, .95),
quantile(x, .99), length(x))
tableStats$x
}
tableStats$Volkswagen <- round(statsFunction(returnsData_Volkswagen), 2)
tableStats$Adidas <- round(statsFunction(returnsData_Adidas), 2)
tableStats$Siemens <- round(statsFunction(returnsData_Siemens), 2)
tableStats$DeutscheBank <- round(statsFunction(returnsData_DeutscheBank), 2)
tableStats$Henkel <- round(statsFunction(returnsData_Henkel), 2)
tableStats$SAP <- round(statsFunction(returnsData_SAP), 2)
tableStats$BASF <- round(statsFunction(returnsData_BASF), 2)
tableStats$Fresenius <- round(statsFunction(returnsData_Fresenius), 2)
tableStats$Allianz <- round(statsFunction(returnsData_Allianz), 2)
tableStats$Daimler <- round(statsFunction(returnsData_Daimler), 2)
tableStats$mean <- rowMeans(tableStats[,2:11])
firstTable <- cbind(tableStats[,1], tableStats[,2:6])
colnames(firstTable) <- c("Stats", "Volkswagen", "Adidas", "Siemens",
"DeutscheBank", "Henkel")
secondTable <- cbind(tableStats[,1], tableStats[,7:11])
colnames(secondTable) <- c("Stats", "SAP", "BASF", "Fresenius",
"Allianz", "Daimler")
firstTable
secondTable
tsData_All <- ts(allData_Means, start=c(2006, 11, 26), frequency=250)
plot.ts(tsData_All, ylab="Returns", main="Time Series of All Data")
## Section 2b - VaR Estimation of the 10 Stocks Index
# VaR estimation of Block Maxima data
allDF_var_ts <- ts(allDF)
allDF_var.2c <- VAR(allDF_var_ts, p = 2, type = "const")
# Predict the values for the next 125, 250, and 500 days
allDF_var.prd.100 <- predict(allDF_var.2c, n.ahead = 100, ci = 0.9)
allDF_var.prd.500 <- predict(allDF_var.2c, n.ahead = 500, ci = 0.9)
plot(allDF_var.prd.100)
plot(allDF_var.prd.500)
plot(allDF_var.prd.500$fcst$Time[1:1200],
allDF_var.prd.500$fcst$Observations[1:1200], col='red',
main="Plot of VaR Forecasts", xlab = "Time",
ylab = "Observations")
points(allDF_var.prd.100$fcst$Time[1:1200],
allDF_var.prd.100$fcst$Observations[1:1200], col='blue')
legend('topleft', legend=c('100 Days = blue', '500 Days = red'),
col=c('blue','red'))
## Section 2c - Estimated Shortfall (CvaR) 10 Stocks Index
# The conditional drawdown is the the mean of the worst 0.95% drawdowns
mdd <- maxdrawdown(allDF_var_ts[,2])
estSF <- ES(ts(allDF[1:2528, 2, drop=FALSE]), p=0.95, method="gaussian",
portfolio_method = "component", invert = F, operational = F)
estSFtable <- data.frame(mdd, estSF$ES)
rownames(estSFtable) <- ""
colnames(estSFtable) <- c("Max Drawdown", "from", "to", "Estimated Shortfall")
estSFtable
## Section 2d - Hill Estimation of 10 Stocks Index
hill(allDF$Observations, option="xi", start=15, end=45)
## Section 2e - Anderson-Darling Test of Robustness
adtest <- ad.test(allDF$Observations)
adtestDF <- data.frame(adtest$statistic, adtest$p.value)
names(adtestDF) <- c("Statistic", "P-Value")
adtestDF
## Section 2f - Table of Results
VaR_ES_Results <- matrix(nrow = 3, ncol = 5)
VaR_ES_Results <- data.frame(VaR_ES_Results)
names(VaR_ES_Results) <- c("Method","Point Estimate","lower","upper","CI")
VaR_ES_Results[1,] <- c("VaR 100 Days",
round(mean(allDF_var.prd.100$fcst$Observations[,1]), 4),
round(mean(allDF_var.prd.100$fcst$Observations[,2]), 4),
round(mean(allDF_var.prd.100$fcst$Observations[,3]), 4),
round(mean(allDF_var.prd.100$fcst$Observations[,4]), 4))
VaR_ES_Results[2,] <- c("VaR 500 Days",
round(mean(allDF_var.prd.500$fcst$Observations[,1]), 4),
round(mean(allDF_var.prd.500$fcst$Observations[,2]), 4),
round(mean(allDF_var.prd.500$fcst$Observations[,3]), 4),
round(mean(allDF_var.prd.500$fcst$Observations[,4]), 4))
VaR_ES_Results[3,] <- c("ES", round(estSFtable[1], 4),
estSFtable[2], estSFtable[3],
round(estSFtable[4], 4))
VaR_ES_Results
# Section 3a - EVT Block Maxima Estimation - 10 Stocks Index
allGEV <- dgev(allData_Means, xi=0.8, mu=0)
allGEVDF <- data.frame(allDF$Time, allGEV)
names(allGEVDF) <- c("Time", "Returns")
plot(allGEVDF, col="green", pch=19, cex=0.8, ylab="Returns",
main="Plot of Extreme Value Distribution - All Data")
tsGEVDF <- ts(allGEVDF, start=c(2006, 11, 26), frequency=250)
plot.ts(tsGEVDF, col="green", pch=19, cex=0.8, ylab="Data",
main="Times Series Plot of Extreme Value Distribution")
par(mfrow = c(2,2))
plot(density(allGEVDF$Returns), xlab="Returns",
main="Block Maxima of All Data", lwd=2)
hist(allGEVDF$Returns, xlab="Returns",
main="Histogram of Block Maxima")
plot(allGEVDF[1401:2528,], xlab="Time", ylab="Observations",
main="Left Tail Plot of Block Maxima", lwd=2)
plot(allGEVDF[1:1400,], xlab="Time", ylab="Observations",
main="Right Tail Plot of Block Maxima", lwd=2)
par(mfrow = c(1,1))
all_BM_GEV <- gev(as.numeric(allGEV))
all_BM_FGEV <- fgev(allGEV, std.err = FALSE)
BlockMaximaTable <- matrix(nrow=6, ncol=1)
BlockMaximaTable <- data.frame(BlockMaximaTable)
BlockMaximaTable[1:3,] <- all_BM_GEV$par.ests[1:3]
BlockMaximaTable[4:6,] <- all_BM_FGEV$param[1:3]
rownames(BlockMaximaTable) <- c(names(all_BM_GEV$par.ests[1:3]),
names(all_BM_FGEV$param[1:3]))
colnames(BlockMaximaTable) <- "Block Maxima Parameters"
BlockMaximaTable
## Section 3b - VaR Forecasting of Block Maxima
# VaR estimation of Block Maxima data
allGEV_var_ts <- ts(allGEVDF)
allGEV_var.2c <- VAR(allGEV_var_ts, p = 2, type = "const")
# Predict the values for the next 125, 250, and 500 days
allGEV_var.prd.100 <- predict(allGEV_var.2c, n.ahead = 100, ci = 0.9)
allGEV_var.prd.500 <- predict(allGEV_var.2c, n.ahead = 500, ci = 0.9)
plot(allGEV_var.prd.100)
plot(allGEV_var.prd.500)
plot(allGEV_var.prd.500$fcst$Time[1:1200],
allGEV_var.prd.500$fcst$Returns[1:1200], col='red',
main="Plot of GEV VaR Forecasts", xlab = "Time",
ylab = "Returns")
points(allGEV_var.prd.100$fcst$Time[1:1200],
allGEV_var.prd.100$fcst$Returns[1:1200], col='blue')
legend('topleft', legend=c('100 Days = blue', '500 Days = red'),
col=c('blue','red'))
## Section 3c - Estimated Shortfall (CvaR) Forecasting of Block Maxima
# The conditional drawdown is the the mean of the worst 0.95% drawdowns
mddGEV <- maxdrawdown(allGEV_var_ts[,2])
# CvaR (Expected Shortfall) Estmation
estSFGEV <- ES(ts(allGEV), p=0.95, method="modified",
portfolio_method = "component", weights = ts(allGEV[2]),
invert = F, operational = F)
estSFGEVtable <- data.frame(mddGEV, estSFGEV$MES)
rownames(estSFGEVtable) <- ""
colnames(estSFGEVtable) <- c("Max Drawdown", "from", "to", "Estimated Shortfall")
estSFGEVtable
## Section 3d - Hill Estimation of Block Maxima
hill(allGEVDF$Returns, option="xi", start=15 , end=45)
## Section 3e - Anderson-Darling Test of Robustness
adtestGEV <- ad.test(allGEVDF$Returns)
adtestGEVdf <- data.frame(adtestGEV$statistic, adtestGEV$p.value)
names(adtestGEVdf) <- c("Statistic", "P-Value")
rownames(adtestGEVdf) <- ""
adtestGEVdf
## Section 3f - Table of Results
GEV_Results <- matrix(nrow = 3, ncol = 5)
GEV_Results <- data.frame(GEV_Results)
names(GEV_Results) <- c("Method","Point Estimate","lower","upper","CI")
GEV_Results[1,] <- c("GEV VaR 100 Days",
round(mean(allGEV_var.prd.100$fcst$Returns[,1]), 4),
round(mean(allGEV_var.prd.100$fcst$Returns[,2]), 4),
round(mean(allGEV_var.prd.100$fcst$Returns[,3]), 4),
round(mean(allGEV_var.prd.100$fcst$Returns[,4]), 4))
GEV_Results[2,] <- c("GEV VaR 500 Days",
round(mean(allGEV_var.prd.500$fcst$Returns[,1]), 4),
round(mean(allGEV_var.prd.500$fcst$Returns[,2]), 4),
round(mean(allGEV_var.prd.500$fcst$Returns[,3]), 4),
round(mean(allGEV_var.prd.500$fcst$Returns[,4]), 4))
GEV_Results[3,] <- c("GEV ES", estSFGEVtable[1],
estSFGEVtable[2], estSFGEVtable[3],
"NA")
GEV_Results
## Section 3g - 100 days GARCH(1,1) Forecasting of Block Maxima
spec = ugarchspec()
nrow(expand.grid(GARCH = 1:14, VEX = 0:1, VT = 0:1, Mean = 0:1, ARCHM = 0:2, ARFIMA = 0:1, MEX = 0:1, DISTR = 1:10))
spec = ugarchspec(variance.model = list(model = 'eGARCH',
garchOrder = c(1, 1)),
distribution = 'std')
# Fit models with Generalized Auto-Regressive Conditional Heteroskadacity
all.fitted.model = ugarchfit(spec, allGEV, solver = 'hybrid')
coefBMtable <- data.frame(coef(all.fitted.model))
names(coefBMtable) <- "GARCH Parameters"
coefBMtable
plot(all.fitted.model, which=10)
par(mfrow = c(2,2))
hist(all.fitted.model@fit$residuals)
plot(all.fitted.model@fit$residuals)
plot(all.fitted.model, which=8)
plot(all.fitted.model, which=9)
par(mfrow = c(1,1))
forc1 = ugarchforecast(all.fitted.model, n.ahead = 100)
par(mfrow=c(1,2))
plot(forc1, which = 1)
plot(forc1, which = 3)
par(mfrow=c(1,1))
# Section 4a - Peaks-Over-Threshold Estimation - 10 Stocks Index
# Determine threshold
par(mfrow = c(1,2))
tcplot(allDF$Observations, u.range=c(0.3, 0.35))
par(mfrow = c(1,1))
# Fit the Generalizaed Pareto Distribution
mle <- fitgpd(allDF$Observations, thresh=.35, shape=0, est="mle")
POTtable <- matrix(nrow=6, ncol=1)
POTtable <- data.frame(POTtable)
POTtable[1,] <- mle$fitted.values
POTtable[2,] <- mle$std.err
POTtable[3,] <- mle$std.err.type
POTtable[4,] <- mle$logLik
POTtable[5,] <- mle$hessian
POTtable[6,] <- mle$threshold
rownames(POTtable) <- c("Fitted Values", "Standard Error",
"Standard Error Type", "Log Likelihood",
"Hessian", "Threshold")
colnames(POTtable) <- "POT Parameters"
POTtable
# graphic diagnostics for the fitted model
par(mfrow=c(2,2))
plot(mle, npy = 1, which = 1)
plot(mle, npy = 1, which = 4)
plot(mle$data[1:1400], xlab="Time", ylab="Observations",
main="Left Tail Plot of POT", lwd=2)
plot(mle$data[1401:2528], xlab="Time", ylab="Observations",
main="Right Tail Plot of POT", lwd=2)
par(mfrow=c(1,1))
## Section 4b - VaR Forecasting of POT
newMLE <- data.frame(allDF$Time, mle$data)
mle_var_ts <- ts(newMLE)
mle.var.2c <- VAR(mle_var_ts, p = 2, type = "const")
# Predict the values for the next 125, 250, and 500 days
mle_var.prd.100 <- predict(mle.var.2c, n.ahead = 100, ci = 0.9)
mle_var.prd.500 <- predict(mle.var.2c, n.ahead = 500, ci = 0.9)
plot(mle_var.prd.100)
plot(mle_var.prd.500)
plot(mle_var.prd.500$fcst$allDF.Time[1:200],
mle_var.prd.500$fcst$mle.data[1:200], col='red',
main="Plot of MLE VaR Forecasts", xlab = "Time",
ylab = "Returns")
points(mle_var.prd.100$fcst$allDF.Time[1:200],
mle_var.prd.100$fcst$mle.data[1:200], col='blue')
legend('topleft', legend=c('100 Days = blue', '500 Days = red'),
col=c('blue','red'))
## Section 4c - Estimated Shortfall (CvaR) Forecasting of POT
# The conditional drawdown is the the mean of the worst 0.95% drawdowns
mddMLE <- maxdrawdown(mle_var_ts[,2])
# CvaR (Expected Shortfall) Estmation
estSFmle <- ES(ts(mle$data), p=0.95, method="modified",
portfolio_method = "component", weights = ts(allGEV[2]),
invert = F, operational = F)
estSFMLEtable <- data.frame(mddMLE, estSFmle$MES)
rownames(estSFMLEtable) <- ""
colnames(estSFMLEtable) <- c("Max Drawdown", "from", "to", "Estimated Shortfall")
estSFMLEtable
## Section 4d - Hill Estimation of Peaks-Over-Threshold
hill(mle$data, option="xi", start=15 , end=45)
## Section 4e - Anderson-Darling Test of Robustness
adtestMLE <- ad.test(mle$data)
adtestMLEdf <- data.frame(adtestMLE$statistic, adtestMLE$p.value)
names(adtestMLEdf) <- c("Statistic", "P-Value")
rownames(adtestMLEdf) <- ""
adtestMLEdf
## Section 4f - Table of Results
MLE_Results <- matrix(nrow = 3, ncol = 5)
MLE_Results <- data.frame(MLE_Results)
names(MLE_Results) <- c("Method","Point Estimate","lower","upper","CI")
MLE_Results[1,] <- c("MLE VaR 100 Days",
round(mean(mle_var.prd.100$fcst$mle.data[,1]), 4),
round(mean(mle_var.prd.100$fcst$mle.data[,2]), 4),
round(mean(mle_var.prd.100$fcst$mle.data[,3]), 4),
round(mean(mle_var.prd.100$fcst$mle.data[,4]), 4))
MLE_Results[2,] <- c("MLE VaR 500 Days",
round(mean(mle_var.prd.500$fcst$mle.data[,1]), 4),
round(mean(mle_var.prd.500$fcst$mle.data[,2]), 4),
round(mean(mle_var.prd.500$fcst$mle.data[,3]), 4),
round(mean(mle_var.prd.500$fcst$mle.data[,4]), 4))
MLE_Results[3,] <- c("MLE ES", estSFMLEtable[1],
estSFMLEtable[2], estSFMLEtable[3],
"NA")
MLE_Results
## Section 4g - 100 days GARCH(1,1) Forecasting of POT
# Fit models with Generalized Auto-Regressive Conditional Heteroskadacity
pot.fitted.model = ugarchfit(spec, mle$data, solver = 'hybrid')
coefPOTtable <- data.frame(coef(pot.fitted.model))
names(coefPOTtable) <- "POT Parameters"
coefPOTtable
par(mfrow = c(2,2))
plot(pot.fitted.model, which=8)
plot(pot.fitted.model, which=3)
plot(pot.fitted.model, which=7)
plot(pot.fitted.model, which=9)
par(mfrow = c(1,1))
forc1 = ugarchforecast(pot.fitted.model, n.ahead = 100)
par(mfrow=c(1,2))
plot(forc1, which = 1)
plot(forc1, which = 3)
par(mfrow=c(1,1))
# Section 5a - Table of Estimation Method Influence
All_Results <- matrix(nrow = 4, ncol = 5)
All_Results <- data.frame(All_Results)
colnames(All_Results) <- c("Test","VaR","ES","mu","P-Value")
All_Results[1,] <- c("VaR",
round(mean(allDF_var.2c$varresult$Observations$coefficients),4),
"NA", "NA", adtest$p.value)
All_Results[2,] <- c("ES", "NA", estSF$ES, "NA", adtest$p.value)
All_Results[3,] <- c("Block Maxima",
round(mean(allGEV_var.2c$varresult$Returns$coefficients),4),
estSFGEV$MES, all_BM_GEV$par.ests[3], adtestGEV$p.value)
All_Results[4,] <- c("Peaks Over Threshold",
round(mean(mle.var.2c$varresult$mle.data$coefficients), 4),
estSFmle$MES, mle$fitted.values, adtestMLE$p.value)
All_Results