We are interested in analyzing Spanish GDP data from an EVT perspective. We will compare how estimates of statistics change before and after the shock of 2020Q2 (and to a lesser extent, 2020Q1).
The packages used for the EVT analysis, following Pasquale Cirillo’s Lesson 5 of his Risk Management course on YouTube, are evir and evd.
We use the PerformanceAnalytics package for analysis of time series data. For the plots, we use ggplot2 and scales. For data manipulation, we use tidyverse.
We will not take into account autocorrelations in the data or analogous concepts such as mixing; we will simply treat the observations as independent.
library(evir)
library(evd)
library(PerformanceAnalytics)
library(ggplot2)
library(scales)
library(tidyverse)
The website of the Spanish National Institute for Statistics (INE, Instituto Nacional de Estadística) is hard to use and decipher, so the data are obtained from here instead. This source states that the data ultimately come from the INE.
The data consist of quarterly changes in GDP from 1995Q2 to 2020Q2. This translates to 101 observations, so there is no need to attach a CSV file to this document.
gdp <- c(62, 41, 76, 59, 73, 86, 53, 101, 91, 118, 138, 88, 112, 95, 11, 1, 119, 133, 114, 162, 124, 107, 101, 102, 72, 94, 71, 43, 85, 62, 76, 95, 5, 66, 86, 6, 98, 95, 68, 98, 89, 96, 102, 114, 97, 93, 96, 91, 9, 78, 64, 22, 12, -19, -159, -258, -3, 2, -3, -3, 14, -6, 7, -15, -31, -65, -63, -95, -97, -5, -78, -32, -8, -7, 17, 37, 46, 74, 86, 115, 107, 91, 96, 68, 43, 87, 47, 75, 102, 55, 68, 49, 55, 5, 56, 53, 36, 41, 47, -521, -1774) / 10000
dates <- as.yearqtr(c("1995Q2", "1995Q3", "1995Q4", "1996Q1", "1996Q2", "1996Q3", "1996Q4", "1997Q1", "1997Q2", "1997Q3", "1997Q4", "1998Q1", "1998Q2", "1998Q3", "1998Q4", "1999Q1", "1999Q2", "1999Q3", "1999Q4", "2000Q1", "2000Q2", "2000Q3", "2000Q4", "2001Q1", "2001Q2", "2001Q3", "2001Q4", "2002Q1", "2002Q2", "2002Q3", "2002Q4", "2003Q1", "2003Q2", "2003Q3", "2003Q4", "2004Q1", "2004Q2", "2004Q3", "2004Q4", "2005Q1", "2005Q2", "2005Q3", "2005Q4", "2006Q1", "2006Q2", "2006Q3", "2006Q4", "2007Q1", "2007Q2", "2007Q3", "2007Q4", "2008Q1", "2008Q2", "2008Q3", "2008Q4", "2009Q1", "2009Q2", "2009Q3", "2009Q4", "2010Q1", "2010Q2", "2010Q3", "2010Q4", "2011Q1", "2011Q2", "2011Q3", "2011Q4", "2012Q1", "2012Q2", "2012Q3", "2012Q4", "2013Q1", "2013Q2", "2013Q3", "2013Q4", "2014Q1", "2014Q2", "2014Q3", "2014Q4", "2015Q1", "2015Q2", "2015Q3", "2015Q4", "2016Q1", "2016Q2", "2016Q3", "2016Q4", "2017Q1", "2017Q2", "2017Q3", "2017Q4", "2018Q1", "2018Q2", "2018Q3", "2018Q4", "2019Q1", "2019Q2", "2019Q3", "2019Q4", "2020Q1", "2020Q2"))
gdp_xts <- xts(gdp, order.by = dates)
gdp_df <- data.frame(gdp, dates)
We first take a look at the data graphically. We are in January 2020. A mysterious and apparently highly contagious virus has been causing trouble in China and in some neighboring countries. Not to worry though, it’s just the flu. Our economy seems to be doing just fine ten years after the financial crisis:
ggplot(gdp_df[-c(100:101),], aes(x=dates, y=gdp)) +
geom_line() +
theme_minimal() +
scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
geom_point()
Fast forward to July 2020 and the turkey’s just been plucked:
ggplot(gdp_df, aes(x=dates, y=gdp)) +
geom_line() +
theme_minimal() +
scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
geom_point()
The second-to-last observation (2020 Q1) would already be called a “Black Swan” by some, as it is a drop in GDP never before recorded. The last observation (2020 Q2) is the nail in the coffin.
How do the histograms compare?
ggplot(gdp_df[-c(100:101),], aes(x=gdp)) +
geom_histogram(binwidth = 0.0025) +
theme_minimal()
summary(gdp_df[-c(100:101),1])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.025800 0.001000 0.006600 0.004931 0.009500 0.016200
sd(gdp_df[-c(100:101),1])
## [1] 0.006470587
skewness(gdp_xts[-c(100:101)])
## [1] -1.701851
kurtosis(gdp_xts[-c(100:101)], method = "moment")
## [1] 7.681676
The distribution of quarterly changes in GDP was already negatively skewed. The mean is lower than the median, indicating a heavier left tail. The standard deviation is larger than the mean, which points to considerable variability in the data. The skewness is negative but not extreme. The kurtosis suggests the data are heavier-tailed than a Gaussian distribution, but still “well-behaved”. How do we stand today?
ggplot(gdp_df, aes(x=gdp)) +
geom_histogram(binwidth = 0.0025) +
theme_minimal()
summary(gdp_df[,1])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.177400 0.000700 0.006400 0.002561 0.009500 0.016200
sd(gdp_df[,1])
## [1] 0.02000828
skewness(gdp_xts)
## [1] -7.535625
kurtosis(gdp_xts)
## [1] 63.72674
The median, robust as it is to “outliers” (or rather, extreme events), barely changes. The mean halves. The standard deviation triples. The skewness is multiplied by a factor of 4.4. The kurtosis increases by a factor of 8.3. These are huge changes in our estimated statistics; shouldn’t the Law of Large Numbers make all of them converge to their true values? The “descriptive” statistics of the data were not describing the data at all!
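As a quick sanity check on the factors just quoted, the ratios can be computed directly from the objects defined above (nothing new here, just the arithmetic):
mean(gdp) / mean(gdp[-c(100:101)])                               # roughly one half
sd(gdp) / sd(gdp[-c(100:101)])                                   # roughly three
as.numeric(skewness(gdp_xts) / skewness(gdp_xts[-c(100:101)]))   # roughly 4.4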
Implementation of EVT to data focuses on one tail only. Here is where we can take a decision: either we look at both tails as if they were one by taking the absolute value of the data, or we look at one tail only. There is also the option of looking at both tails separately, but this will make the report way too long. Let’s choose to look only at the lower tail of the data. Why? Here comes some (brutal) honesty you’ll never find in academia: for apparently no reason whatsoever. We can always argue why we made this decision instead of that decision, but the fact remains: most times, the reasoning comes after the decision, so we will not explain this decision. In fact, as I am writing this, I have not yet written the code nor analyzed the data. I may be wrong. Let’s see.
Let’s make a Zipf plot of the data. In this case it is not necessary to show before/after plots; we will simply mark the 2020 Q1 observation in blue and the 2020 Q2 observation in red.
gdp_lower <- abs(gdp[gdp < 0])                      # absolute size of the quarterly drops
log_emp_surv <- log(1 - ppoints(sort(gdp_lower)))   # log empirical survival probabilities
x_log <- log(sort(gdp_lower))                       # log of the sorted drops
zipf <- data.frame(x_log, log_emp_surv)
ggplot(zipf, aes(x=x_log, y=log_emp_surv)) +
geom_point() +
theme_minimal() +
geom_point(data=zipf[19, ], aes(x=x_log, y=log_emp_surv), color="blue") +
geom_point(data=zipf[20, ], aes(x=x_log, y=log_emp_surv), color="red") +
ylab("1 - F(x) (log scale)") +
xlab("x (log scale)")
To read a Zipf plot, we look for linearity in the data. A random variable distributed according to a power law will exhibit linearity in this graph. Judging graphical results is subjective, but I say there is linearity, even excluding the 2020 Q1 and 2020 Q2 observations. Theoretically, the slope should equal the negative of the tail exponent, but this is not a good way to produce estimates. Nor is linearity in the Zipf plot a sufficient condition for fat tails (power-law-like tail behavior): this plot is particularly prone to showing spurious linearity with lognormal data, for example.
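Just to illustrate the point and its limitations, a naive OLS line through the Zipf plot (excluding the two 2020 observations) gives a crude idea of the slope; as said above, this is not a good way to estimate the tail exponent:
zipf_ols <- lm(log_emp_surv ~ x_log, data = zipf[-c(19, 20), ])   # crude, for illustration only
coef(zipf_ols)                                                    # slope is roughly minus the tail exponent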
We will also check using the maximum-to-sum plot.
MSplot <- function(data, p = 4) {
  # Maximum-to-sum plot: for each moment order i = 1, ..., p, plot the running
  # ratio Rn = max(|X|^i) / sum(|X|^i) against n. If the i-th moment is finite,
  # the ratio should drift towards zero as n grows.
  par(mfrow = c(2, 2))
  x <- abs(data)
  for (i in 1:p) {
    y <- x^i
    S <- cumsum(y)   # running sum of the i-th powers
    M <- cummax(y)   # running maximum of the i-th powers
    R <- M/S         # the ratio of interest
    plot(1:length(x), R, type = 'l', col = 2, lwd = 3, ylim = c(0, 1),
         xlab = 'n', ylab = 'Rn', main = paste("MSplot for p=", i))
  }
  par(mfrow = c(1, 1))
}
MSplot(gdp_lower[-c(19:20)])
This plot checks whether the \(p\)-th moment is finite. It may not be the best diagnostic given our small sample, but we can still conclude a couple of interesting things. To read the plot: if the line does not tend to zero for the \(p\)-th moment, that moment is likely not to be finite.
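A bit more formally, for an i.i.d. sequence the plotted ratio is

\[
R_n(p) = \frac{M_n(p)}{S_n(p)}, \qquad M_n(p) = \max_{i \le n} |X_i|^p, \quad S_n(p) = \sum_{i=1}^{n} |X_i|^p,
\]

and \(R_n(p) \to 0\) almost surely if and only if \(\mathbb{E}\left[|X|^p\right] < \infty\). A ratio that keeps hovering away from zero is therefore evidence against a finite \(p\)-th moment.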
Perhaps the first moment converges, but we would need more data to confirm this. The rest of the moments do not converge. The standard deviation, skewness, and kurtosis we saw before do not have finite population counterparts if the corresponding moments do not exist. Their finite-sample estimates are, therefore, quite literally, horsesh*t. As the Zipf plot suggested, the data were already fat-tailed before the turkey arrived at the slaughterhouse. Looking at the kurtosis alone would suggest the data only became fat-tailed afterwards. Of course it’s fat-tailed afterwards!
The most important takeaway from the maximum-to-sum plot is the likely range for the tail index estimates. Although we do not observe convergence for \(p=1\), it could be that, due to the small sample, the tail index is still higher than 1. It should not, however, be higher than 2.
Let’s estimate!
hill(gdp_lower[-c(19:20)], start = 2)
We see that the estimates of \(\alpha\) seem to be lower than 2. This is what we concluded from the maximum-to-sum plots! However, due to the small sample, the variation of the tail index estimates is very high. Let’s keep this in mind while we estimate.
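For intuition, here is a minimal sketch of the Hill estimator that hill() plots across thresholds; the helper hill_alpha is ours, just for illustration, and with only 18 prior drops the estimates jump around a lot:
# Hill estimate of alpha from the k largest observations (prior sample of 18 drops)
hill_alpha <- function(x, k) {
  xs <- sort(x, decreasing = TRUE)            # order statistics, largest first
  xi_hat <- mean(log(xs[1:k] / xs[k + 1]))    # Hill estimate of xi
  1 / xi_hat                                  # tail index alpha = 1 / xi
}
sapply(2:10, function(k) hill_alpha(gdp_lower[-c(19:20)], k))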
We fit a GEV.
fit_prior <- fgev(gdp_lower[-c(19:20)], std.err = F)
We get an estimate \(\hat{\xi} = 0.93\) so \(\hat{\alpha} = \frac{1}{\hat{\xi}} = 1.07\). Due to the small sample, we cannot retrieve standard errors of the estimates, so we need to take them with a grain of salt.
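For reference, these numbers come straight out of the fitted object, via the same estimate component used for the plots below:
xi_hat <- unname(fit_prior$estimate['shape'])   # GEV shape parameter (xi)
c(xi = xi_hat, alpha = 1 / xi_hat)              # implied tail index alpha = 1 / xi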
Let’s verify this tail index on the Zipf plot:
log_gev_surv <- function(x, loc, scale, shape){
  x <- exp(x)                            # undo the log transform used on the Zipf plot's x-axis
  log(1 - pgev(x, loc, scale, shape))    # log survival function of the fitted GEV
}
ggplot(zipf, aes(x=x_log, y=log_emp_surv)) +
geom_point() +
theme_minimal() +
geom_point(data=zipf[19, ], aes(x=x_log, y=log_emp_surv), color="blue") +
geom_point(data=zipf[20, ], aes(x=x_log, y=log_emp_surv), color="red") +
ylab("1 - F(x) (log scale)") +
xlab("x (log scale)") +
stat_function(fun = log_gev_surv,
args = list(loc = fit_prior$estimate['loc'],
scale = fit_prior$estimate['scale'],
shape = fit_prior$estimate['shape']))
This fit is good for the black points, which correspond to the data used to estimate the model. The model is also not bad for the “out-of-sample” observations: they are not impossibly unlikely under it. It is important to bear in mind the small number of observations when evaluating the fit; we can expect some “leeway”. Let’s see what we get when we include the 2020 Q1 and 2020 Q2 observations:
fit_post <- fgev(gdp_lower, std.err = F)
ggplot(zipf, aes(x=x_log, y=log_emp_surv)) +
geom_point() +
theme_minimal() +
geom_point(data=zipf[19, ], aes(x=x_log, y=log_emp_surv), color="blue") +
geom_point(data=zipf[20, ], aes(x=x_log, y=log_emp_surv), color="red") +
ylab("1 - F(x) (log scale)") +
xlab("x (log scale)") +
stat_function(fun = log_gev_surv,
args = list(loc = fit_post$estimate['loc'],
scale = fit_post$estimate['scale'],
shape = fit_post$estimate['shape']))
At the tail of the data, the fit seems to worsen. The estimate for \(\xi\) is also lower: \(\hat{\xi} = 0.61\), so \(\hat{\alpha} = 1.63\). In any case, given the estimation uncertainty, both values are plausible. Let’s stick to the prior estimate for an evaluation using different statistics. When I say “prior” it simply means prior to 2020 Q1. No Bayesian stuff here.
Let’s calculate the VaR for some quantiles above 90% using the prior empirical distribution, a Gaussian with the same parameters as the prior observations, and the GEV with the parameters estimated from the prior data.
quantiles <- c(seq(from = 0.9, to = 0.99, by = 0.01),
seq(from = 0.99, to = 0.999, by = 0.001),
seq(from = 0.999, to = 0.9999, by = 0.0001),
seq(from = 0.9999, to = 0.99999, by = 0.00001))
emp_var <- unname(quantile(gdp_lower[-c(19:20)],quantiles))
gauss_var <- unname(qnorm(quantiles,
mean = mean(gdp_lower[-c(19:20)]),
sd = sd(gdp_lower[-c(19:20)])))
gev_var <- unname(qgev(quantiles,
loc = fit_prior$estimate['loc'],
scale = fit_prior$estimate['scale'],
shape = fit_prior$estimate['shape']))
vars = data.frame(Quantile = quantiles, Empirical = emp_var,
Gaussian = gauss_var, GEV = gev_var)
Perhaps it is clearer if we plot the VaR against its corresponding quantile.
vars = vars %>% pivot_longer(-Quantile, names_to = "Distribution", values_to = "VaR")
points_2020 <- data.frame(Quantile = c(0.90,0.95, 0.95, 1),
value = c(gdp_lower[19], gdp_lower[19],
gdp_lower[20], gdp_lower[20]))
ggplot(vars, aes(x=Quantile, y=VaR, color=Distribution)) +
geom_line(size = 1) +
theme_minimal() +
scale_x_continuous(trans = "log10") +
scale_y_continuous(trans = "log10", labels = percent) +
geom_segment(aes(x=points_2020[1,1], y=points_2020[1,2],
xend=points_2020[2,1], yend=points_2020[2,2]),
color = "blue") +
geom_segment(aes(x=points_2020[3,1], y=points_2020[3,2],
xend=points_2020[4,1], yend=points_2020[4,2]),
color = "red") +
geom_point(data=points_2020[1,], aes(x=Quantile, y=value), color="blue") +
geom_point(data=points_2020[2,], aes(x=Quantile, y=value), color="blue") +
geom_point(data=points_2020[3,], aes(x=Quantile, y=value), color="red") +
geom_point(data=points_2020[4,], aes(x=Quantile, y=value), color="red")
The prior empirical VaR completely fails to account for such extreme events. The prior Gaussian also fails. The prior GEV VaR at the 95% level is very close to the 2020 Q1 observation. As for the 2020 Q2 observation, it is plotted here up to the quantile of 1 (the “maximum” of the distribution), so it is normal that the unbounded theoretical distributions tend to infinity there. The Gaussian does too, although this is not visible.
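That 95% claim can be checked directly by comparing the prior GEV quantile with the size of the 2020 Q1 drop:
c(gev_var_95 = unname(qgev(0.95, loc = fit_prior$estimate['loc'],
                           scale = fit_prior$estimate['scale'],
                           shape = fit_prior$estimate['shape'])),
  drop_2020Q1 = gdp_lower[19])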
The prior GEV is obviously not perfect at describing future extreme events, but it is immensely better than the prior Gaussian or even the prior empirical. It is naive to investigate “low”-probability events with methods that fail to do so effectively. We have discovered the fat-tailed nature of this data. This data is already fat-tailed. It will never be thin-tailed again. Any researcher who uses thin-tailed methods on this data to investigate its “risk properties” or “low-probability events” is being absolutely ignorant and surely has the aptitude of a boiled snail in a paella in terms of logic. Don’t do this.
We skip the Expected Shortfall analysis. I suspect the results of this to be even starker than the VaR analysis.
Note that the question is not “will we ever see another drop like this again?” The answer is yes, definitely, resoundingly. But will it happen roughly once in 10 years or once in 1000 years?
We ask this question knowing already that this drop has happened, so it makes sense to use “posterior” estimates, that is, estimates using data which includes this drop.
Empirically, a drop at least this large would have a 5% probability, because our sample of drops contains only 20 observations. So, on average, we would expect this to happen once every 20 quarterly drops in GDP. Remember how we only took the drops? Now it’s coming back to bite us in the butt.
A back-of-the-envelope calculation says that, since drops happen 19.8% of the time, the probability of observing a drop in GDP larger than the 2020 Q2 drop is 0.99% per quarter. In other words, on average, we expect a drop like this to happen once every 101 quarters, or once every 25.25 years. This, of course, coincides exactly with the data, since we have 101 observations in total.
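The same back-of-the-envelope numbers in code, just to make the arithmetic explicit:
p_drop <- 20 / length(gdp)    # a quarter shows a drop about 19.8% of the time
p_tail <- 1 / 20              # empirically, the worst of the 20 observed drops
p_emp <- p_drop * p_tail      # roughly 0.99% per quarter
c(per_quarter = p_emp, quarters = 1 / p_emp, years = 1 / (4 * p_emp))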
When using the posterior Gaussian (i.e. using all observations), we obtain the following parameters for the truncated data:
mean(gdp_lower)
## [1] 0.01621
sd(gdp_lower)
## [1] 0.0398703
And so, according to the Gaussian, a drop larger than the one observed in 2020 Q2 has a probability of:
p_gauss <- (20/length(gdp))*(1 - pnorm(gdp_lower[20], mean = mean(gdp_lower), sd = sd(gdp_lower)))
p_gauss
## [1] 5.228061e-06
round(1/p_gauss)
## [1] 191275
round(1/(4*p_gauss))
## [1] 47819
This event occurs, on average, once every 191275 quarters or once every 47819 years. We can rest assured now! This will not happen again in our lifetimes!
For the posterior GEV, a drop larger than the one observed in 2020 Q2 has a probability of:
p_gev_post <- (20/length(gdp))*(1 - unname(pgev(gdp_lower[20], loc = fit_post$estimate['loc'],
scale = fit_post$estimate['scale'],
shape = fit_post$estimate['shape'])))
p_gev_post
## [1] 0.000585842
round(1/p_gev_post)
## [1] 1707
round(1/(4*p_gev_post))
## [1] 427
The event occurs, on average, once every 1707 quarters or once every 427 years. Let’s compare results with the prior GEV:
p_gev_prior <- (20/length(gdp))*(1 - unname(pgev(gdp_lower[20],
loc = fit_prior$estimate['loc'],
scale = fit_prior$estimate['scale'],
shape = fit_prior$estimate['shape'])))
p_gev_prior
## [1] 0.001970295
round(1/p_gev_prior)
## [1] 508
round(1/(4*p_gev_prior))
## [1] 127
The event occurs, on average, once every 508 quarters or once every 127 years.
These estimations, intuitively, are still off. I myself still expect to see an event like this one, and even one much larger, in my lifetime (I am 21 years old in 2020 Q2).
So we see that the GEV offers no significant improvement on this front with regard to the empirical distribution, right? Wrong.
What probability does each model assign to a drop twice the size of the 2020 Q2 drop? That is a 35.48% drop in GDP in a single quarter. Not impossible; it is close to the annualized drop reported for the US in 2020 Q2.
Empirical: “Zero. We’ve never observed such a drop, so it won’t happen.”
Gaussian: “Zero.” It is not really zero, but due to floating-point precision it evaluates to zero. “We expect to observe it on average once every infinity years.” Again, this is an artifact of the zero.
p_gauss_X2 <- (20/length(gdp))*(1 - pnorm(2*gdp_lower[20], mean = mean(gdp_lower), sd = sd(gdp_lower)))
p_gauss_X2
## [1] 0
round(1/p_gauss_X2)
## [1] Inf
round(1/(4*p_gauss_X2))
## [1] Inf
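As an aside, the exact zero above is floating-point cancellation in 1 - pnorm(...); asking pnorm() for the upper tail directly (lower.tail = FALSE) returns a tiny but nonzero probability, which of course changes nothing about the conclusion:
(20/length(gdp)) * pnorm(2*gdp_lower[20], mean = mean(gdp_lower), sd = sd(gdp_lower),
                         lower.tail = FALSE)   # tiny, but not exactly zero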
GEV posterior: “We can expect to observe it, on average, once every 1306 years.” GEV prior: “We can expect to observe it, on average, once every 266 years.”
p_gev_post_X2 <- (20/length(gdp))*(1 - unname(pgev(2*gdp_lower[20],
loc = fit_post$estimate['loc'],
scale = fit_post$estimate['scale'],
shape = fit_post$estimate['shape'])))
p_gev_post_X2
## [1] 0.0001914754
round(1/p_gev_post_X2)
## [1] 5223
round(1/(4*p_gev_post_X2))
## [1] 1306
p_gev_prior_X2 <- (20/length(gdp))*(1 - unname(pgev(2*gdp_lower[20],
loc = fit_prior$estimate['loc'],
scale = fit_prior$estimate['scale'],
shape = fit_prior$estimate['shape'])))
p_gev_prior_X2
## [1] 0.0009401178
round(1/p_gev_prior_X2)
## [1] 1064
round(1/(4*p_gev_prior_X2))
## [1] 266
These, while not perfect, sound completely plausible. EVT wins against the world yet again. Let’s generalize.
The plot below shows \(k\) on the \(x\)-axis and, on the \(y\)-axis, the expected latency, in years, that each model assigns to a drop \(k\) times the size of the 2020 Q2 drop.
k = c(seq(from = 0.1, to = 1, by=0.1), seq(from = 1.5, to = 10, by = 0.5))
gaus_freq <- 1/(4*(20/length(gdp))*(1 - pnorm(k*gdp_lower[20],
mean = mean(gdp_lower),
sd = sd(gdp_lower))))
gev_prior_freq <- 1/(4*(20/length(gdp))*(1 - unname(pgev(k*gdp_lower[20],
loc = fit_prior$estimate['loc'],
scale = fit_prior$estimate['scale'],
shape = fit_prior$estimate['shape']))))
gev_post_freq <- 1/(4*(20/length(gdp))*(1 - unname(pgev(k*gdp_lower[20],
loc = fit_post$estimate['loc'],
scale = fit_post$estimate['scale'],
shape = fit_post$estimate['shape']))))
freq_points_y <- 1/(4*(20/length(gdp))*(1-ppoints(gdp_lower)))   # empirical latency in years
freq_points_x <- sort(gdp_lower)/gdp_lower[20]                   # drop size as a multiple of the 2020 Q2 drop
freq_points <- data.frame(k = freq_points_x, Latency = freq_points_y)
frequencies <- data.frame(k = k, Gaussian = gaus_freq, `GEV Prior` = gev_prior_freq,
`GEV Posterior` = gev_post_freq, check.names = FALSE)   # keep the spaces in the legend labels
frequencies <- frequencies %>% pivot_longer(-k, names_to = "Model", values_to = "Frequency")
ggplot(frequencies, aes(x=k, y=Frequency, color=Model)) +
geom_line() +
geom_line(data=freq_points, aes(x=k, y=Latency, color = "Empirical")) +
geom_point() +
geom_point(data=freq_points, aes(x=k, y=Latency, color = "Empirical")) +
scale_y_continuous(trans = "log10", labels = comma) +
xlab("k (\"A drop k times the 2020 Q2 drop... \")") +
ylab("Latency (\"...occurs once every y years\")") +
theme_minimal()
The variable “relative drop in GDP/price/index level” is a bounded variable. First and foremost, physical bounds apply. There will never be less than zero Euros in the territory occupied by Spain. Or will there? It is possible that there are only imports: no production, no exports, no government expenditure, etc. Then the GDP would be negative, and we could see a loss in GDP larger than 100%. We saw this happen with the price of oil. In EVT, however, we cannot ignore something just because of how unlikely it is; we must include it in our model. But, as with pandemics and numbers of infections, there is still a physical limit to the loss in GDP. It is a variable that will never be described purely by a GEV. But, as we have just seen, the GEV is immensely better, in terms of risk-management applications, than more traditional methods.
As with many variables, we model them as random not because they are random, but because we are missing information. Either covariates and other variables which influence our target variable are missing or, more likely, we do not know the exact relationship between them and the target. Most likely, even, the system is altogether too complex to model meticulously. So we don’t. We stick to calling these variables random when they’re not. Sometimes, however, they are “more predictable”: if a highly contagious pandemic has arrived inside our borders, the economy will suffer. People already knew the numbers were going to be alarming.
The point is not to fit models and try to predict what is, by definition, unpredictable. The point is that this will happen again, sooner or later, and instead of trying to predict it in order to avoid it, we must prepare for it. In the context of complex systems, for better or for worse, a system whose entities are never destroyed and rebuilt becomes a system whose entities cannot adapt to changing environments. If an external entity keeps saving them from harm, an event will eventually occur against which not even this external entity can protect them. Most likely, there will have been other events leading up to it, but the entities’ inability to rebuild themselves, better adapted, each time the environment changes will eventually make them collapse spectacularly.