Extending VaR and ES – The Choice of Confidence Level and Time Horizon

Isai Guizar


This document is intended for educational purposes as part of my Risk Financial Management course at Tec de Monterrey. For any questions or suggestions, feel free to contact me at iguizar@tec.mx.


1 Intro

In this document, we will explore how the metrics of Value at Risk (VaR) and Expected Shortfall (ES) change when adjusting two key parameters: the confidence level and the time horizon. While the previous session introduced these risk measures under the assumption of a single period and a fixed confidence level, it is essential to understand how extending the time frame or choosing a higher confidence threshold affects the magnitude of potential losses.

2 Confidence level

When selecting the confidence level for Value at Risk (VaR) and Expected Shortfall (ES), two main approaches guide the decision: regulatory requirements and industry practices in risk management. On the regulatory side, financial institutions follow confidence levels set by national and international regulatory bodies. For example, the market risk framework of the Basel Committee on Banking Supervision has based capital calculations on a 99% VaR over a 10-day horizon, and the revised framework under Basel III (the Fundamental Review of the Trading Book) replaces it with a 97.5% Expected Shortfall.


In industry practice, however, confidence levels are adjusted to specific investment strategies. A 95% confidence level is commonly used for daily risk monitoring because it provides a reasonable estimate of typical losses without being excessively conservative. A 99% confidence level is preferred when assessing more severe tail events, particularly in stress testing and long-term risk projections.


As the confidence level increases, risk metrics capture more extreme losses, leading to higher VaR and ES values. That is, as the confidence level \(X\) increases, the threshold for potential losses moves further into the left tail of the distribution. Figure 1 illustrates the impact of increasing confidence levels under the simplified assumption that the returns follow a standard normal distribution. The shaded areas represent the tails associated with 90%, 95%, and 99% confidence levels.


Figure 1: Impact of Increasing Confidence Levels on VaR and ES
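The numbers behind a figure of this kind can be sketched in a few lines. The snippet below is only an illustration under the same standard normal assumption; the helper name `normal_tail_measures` is hypothetical and not part of the course code. For each confidence level it computes the left-tail quantile and the average return beyond it.

```python
import scipy.stats as st

def normal_tail_measures(X):
    """Left-tail quantile and tail mean of a standard normal at confidence level X."""
    z = st.norm.ppf(1 - X)            # returns fall below this threshold with probability 1 - X
    es = -st.norm.pdf(z) / (1 - X)    # average return in the tail below that threshold
    return z, es

for X in (0.90, 0.95, 0.99):
    z, es = normal_tail_measures(X)
    print(f"X = {X:.0%}: quantile = {z:.3f}, tail mean = {es:.3f}")
```

Both quantities move further into the left tail as \(X\) grows, and the tail mean always lies beyond the quantile.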


Example

Let’s assume a portfolio consisting of a single asset whose historical daily log returns have a mean of 2% and a standard deviation of 10%. Compare the 1-day VaR and ES at a 90% vs. a 99% confidence level.


Load libraries

import numpy  as np
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt

Calculate the VaR

mu = 0.02
sd = 0.10

confidence90 = 0.90   # Confidence level
confidence99 = 0.99   # Confidence level


VaR90 = mu + sd*st.norm.ppf(1 - confidence90)
VaR99 = mu + sd*st.norm.ppf(1 - confidence99)

print(f"1-day 90% VaR: {-(np.exp(VaR90)-1):.2%}")
print(f"1-day 99% VaR: {-(np.exp(VaR99)-1):.2%}")
1-day 90% VaR: 10.25%
1-day 99% VaR: 19.15%


Calculate the Expected Shortfall

ES90 = mu - sd * st.norm.pdf(st.norm.ppf(1-confidence90)) / (1-confidence90)  
ES99 = mu - sd * st.norm.pdf(st.norm.ppf(1-confidence99)) / (1-confidence99)  

print(f"The Expected Shortfall (ES) at 90% is: {-(np.exp(ES90)-1):.2%}")
print(f"The Expected Shortfall (ES) at 99% is: {-(np.exp(ES99)-1):.2%}")
The Expected Shortfall (ES) at 90% is: 14.40%
The Expected Shortfall (ES) at 99% is: 21.85%

3 Time horizon

The time horizon over which risk is measured is a crucial factor in calculating Value at Risk (VaR) and Expected Shortfall (ES). Financial institutions typically assess risk over different horizons depending on the nature of their portfolios, investment objectives, regulatory requirements, and how quickly a portfolio can be unwound. For example, daily VaR is commonly used for short-term trading desks, while longer-term horizons, such as 10 days or one month, are relevant for asset managers.

Analyses frequently begin by obtaining risk measures over a one-day horizon, which are later extended to longer periods. Transforming to a longer horizon raises the issue of time aggregation. Suppose we want to aggregate the returns over two days:

\[ R_{2t} = R_t + R_{t-1} \]

The expected return over the two periods is:

\[ E(R_{2t}) = E(R_t) + E(R_{t-1}) \]

and the variance is:

\[ V(R_{2t}) = V(R_t) + V(R_{t-1}) + 2 Cov(R_t,R_{t-1}) \]

In terms of the correlation (\(\rho\)), this is:

\[ V(R_{2t}) = V(R_t) + V(R_{t-1}) + 2 \rho \sqrt{V(R_t)} \sqrt{V(R_{t-1})} \]

Under the reasonable assumption that the returns are identically distributed, the one-period expected return and the variance are constant over time, \(E(R_t)=E(R_{t-1})=\mu_{1t}\) and \(V(R_t) = V(R_{t-1}) = \sigma^2_{1t}\). The equations for the expected return and variance over two periods can then be expressed as:

\[ \mu_{2t} \equiv E(R_{2t}) = 2\mu_{1t} \]

and

\[ \sigma^2_{2t} \equiv V(R_{2t}) = 2\sigma^2_{1t}(1+\rho) \]

The expected return is not affected by the correlation (it is simply twice the expected return over one period), but the variance is. This is where the critical issue arises: what is the value of \(\rho\)?
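The two-period formulas above can be checked numerically. The following is a minimal sketch, assuming jointly normal returns and hypothetical parameter values (\(\mu_{1t}=0.05\%\), \(\sigma_{1t}=1\%\), \(\rho=0.3\)) chosen only for illustration; the function name `two_period_moments` is not part of the course code.

```python
import numpy as np

def two_period_moments(mu_1, sigma_1, rho):
    """Mean and variance of R_t + R_{t-1} for identically distributed returns."""
    mu_2 = 2 * mu_1
    var_2 = 2 * sigma_1**2 * (1 + rho)
    return mu_2, var_2

# Hypothetical one-period parameters (illustration only)
mu_1, sigma_1, rho = 0.0005, 0.01, 0.3

# Simulate pairs (R_t, R_{t-1}) with correlation rho and add them up
rng = np.random.default_rng(0)
cov = sigma_1**2 * np.array([[1.0, rho], [rho, 1.0]])
pairs = rng.multivariate_normal([mu_1, mu_1], cov, size=200_000)
R2 = pairs.sum(axis=1)

mu_2, var_2 = two_period_moments(mu_1, sigma_1, rho)
print(f"mean:     formula = {mu_2:.6f}, simulated = {R2.mean():.6f}")
print(f"variance: formula = {var_2:.6f}, simulated = {R2.var():.6f}")
```

With a positive \(\rho\) the simulated two-period variance should sit above \(2\sigma^2_{1t}\), and with a negative \(\rho\) below it.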


3.1 Uncorrelated returns


Analysts would prefer to assume that the returns are uncorrelated (\(\rho=0\)) over time. This is consistent with the efficient markets hypothesis, which states that prices in financial markets fully reflect all available information. A stronger assumption is that the returns are independent, which also implies that \(\rho=0\). In either case:

\[ \sigma^2_{2t} \equiv V(R_{2t}) = 2\sigma^2_{1t} \]

That is, the variance over two days is simply twice the daily variance of the returns.

Generalizing to \(T\) periods:

\[ \mu_T = \mu_{1t} T \] and

\[ \sigma^2_T = \sigma^2_{1t} T \ \ \ \ \ \ \text{or} \ \ \ \ \ \ \sigma_T = \sigma_{1t} \sqrt{T} \]

Thus, the Value at Risk over \(T\) periods at an \(X\%\) confidence level,

\[ VaR_T(X\%) = \mu_T + \sigma_T \cdot \Phi^{-1}(1-X\%) \]

can be expressed as:

\[ VaR_T(X\%) = \mu_{1t}T + \sigma_{1t} \sqrt{T} \cdot \Phi^{-1}(1-X\%) \]

Similarly, the Expected Shortfall:

\[ ES_T(X\%) = \mu_{1t}T - \sigma_{1t} \sqrt{T} \cdot \frac{\phi \Big(\Phi^{-1}(1-X\%)\Big)}{1-X\%} \]
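As a minimal sketch of the square-root-of-time formulas above, the function below scales one-period parameters to \(T\) periods and converts the results to percentage losses. It assumes normally distributed, uncorrelated log returns; the name `scaled_var_es` is hypothetical, and the inputs reuse the mean of 2% and standard deviation of 10% from the example in Section 2.

```python
import numpy as np
import scipy.stats as st

def scaled_var_es(mu_1, sd_1, X, T):
    """T-period VaR and ES (in log-return terms) when rho = 0."""
    mu_T = mu_1 * T                 # the mean scales with T
    sd_T = sd_1 * np.sqrt(T)        # the standard deviation scales with sqrt(T)
    z = st.norm.ppf(1 - X)
    VaR_T = mu_T + sd_T * z
    ES_T = mu_T - sd_T * st.norm.pdf(z) / (1 - X)
    return VaR_T, ES_T

for T in (1, 5, 10):
    VaR_T, ES_T = scaled_var_es(mu_1=0.02, sd_1=0.10, X=0.99, T=T)
    print(f"T = {T:>2}: VaR = {-(np.exp(VaR_T) - 1):.2%}, ES = {-(np.exp(ES_T) - 1):.2%}")
```

Setting `T=1` reproduces the one-day figures of the earlier example, which is a convenient sanity check.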


3.2 Correlated returns


When the assumption that the returns are uncorrelated does not hold, using the above calculations would lead to biased risk estimates. For example, when the correlation is positive, a movement in one direction tends to be followed by another in the same direction, so the risk is higher than it would be in the uncorrelated case.

As in Jorion() we can model this phenomenon with a first-order autoregressive process, AR(1), in which the correlation of returns over consecutive days, \(\rho\), is estimated from:

\[ R_t = \rho R_{t-1} + u_t \]


While the calculation of the mean over \(T\) periods remains the same, \(\mu_T=\mu_{1t} \cdot T\), the variance over two periods derived above, \(\sigma^2_{2t} = 2\sigma^2_{1t}(1+\rho)\), can be generalized to \(T\) periods as:

\[ \sigma^2_T = \sigma_{1t}^2 \Big(T + 2(T-1)\rho + 2(T-2)\rho^2+\cdots+ 2(1)\rho^{T-1}\Big) \]
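One way to convince yourself of this generalization is to note that, for an AR(1) process, the correlation between returns \(k\) days apart is \(\rho^k\), so the term in parentheses is just the sum of all entries of the \(T \times T\) correlation matrix. The sketch below compares the two calculations; both function names are hypothetical and the values of \(\rho\) are arbitrary.

```python
import numpy as np

def ar1_variance_factor(T, rho):
    """Term in parentheses: T + 2(T-1)rho + 2(T-2)rho^2 + ... + 2(1)rho^(T-1)."""
    return T + 2 * sum((T - k) * rho**k for k in range(1, T))

def variance_factor_from_matrix(T, rho):
    """Same quantity as the sum of the correlation matrix with entries rho^|i-j|."""
    days = np.arange(T)
    lags = np.abs(np.subtract.outer(days, days))
    return float((rho ** lags).sum())

for rho in (-0.2, 0.0, 0.3):
    closed = ar1_variance_factor(5, rho)
    matrix = variance_factor_from_matrix(5, rho)
    print(f"rho = {rho:+.1f}: formula = {closed:.4f}, matrix sum = {matrix:.4f}")
```

For \(T=2\) the factor reduces to \(2(1+\rho)\), and for \(\rho=0\) it reduces to \(T\), matching the earlier results.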


4 Application

Use the data for the SP500 that was obtained from the Federal Reserve Economic Data (FRED) to estimate the value at risk of the index over 5 days. Consider a confidence level of 99%, and daily data from Jan 01, 2021 to Jan 01, 2025.


Load the libraries

import numpy  as np
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt
import pandas_datareader.data as web
import statsmodels.api as sm
import statsmodels.tsa.api as tsa

Obtain and clean data

symbol = 'SP500'
start_date = '2021-01-01'
end_date = '2025-01-01'

SP = web.DataReader(symbol, 'fred', start=start_date, end=end_date)

# Convert index to datetime
SP.index = pd.to_datetime(SP.index)
SP.rename(columns={symbol: 'Price'}, inplace=True)

Calculate the correlation using an AR(1) model

Before calculating the VaR, we need to know whether the returns are correlated in order to proceed accordingly. To that end, we will estimate the AR(1) model and test whether the coefficient is statistically different from zero.

Important! Since the autoregressive model requires a date index with a defined frequency (e.g., daily, monthly), we will explicitly reindex the dataset using a business day frequency (freq='B'). This approach facilitates model estimation but may slightly alter the mean and standard deviation of the time series.
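To see what this kind of reindexing does, here is a minimal sketch on a hypothetical toy price series (the dates and prices are invented for illustration and are not FRED data). It shows two things: dates that are absent from the index are filled with the previous price, which produces zero returns, and a NaN that already sits on an existing date is not filled by `reindex(method='ffill')` and needs a separate value-level fill. Whether the downloaded series contains such rows is not checked here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices: Wednesday 2024-01-03 is absent from the index,
# and Thursday 2024-01-04 is present but holds a NaN.
toy = pd.Series(
    [100.0, 101.0, np.nan, 104.0],
    index=pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04', '2024-01-05']),
    name='Price',
)

# Reindex to business days; method='ffill' only fills labels that were missing
business_days = pd.date_range(start=toy.index.min(), end=toy.index.max(), freq='B')
reindexed = toy.reindex(business_days, method='ffill')

# A value-level forward fill removes the NaN that was already in the data
filled = reindexed.ffill()

# Filled days repeat the previous price, so their log return is exactly zero
returns = np.log(filled).diff().dropna()

print(reindexed)
print(returns)
```

The zero returns on filled days are the reason the mean and standard deviation can shift slightly after reindexing.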

# Generate a business day range and reindex
business_days = pd.date_range(start=SP.index.min(), end=SP.index.max(), freq='B')
SP = SP.reindex(business_days, method='ffill')  # Forward fill missing data
SP.index.name = 'DATE'

# Calculate the returns
SP['Return'] = np.log(SP['Price']).diff()
SP.dropna(inplace=True)

# Calculate the mean and standard deviation over a one day period
mu = SP['Return'].mean()
sd = SP['Return'].std()

print(f"The mean of the daily log returns is: {mu:.4%}")
print(f"The standard deviation of the daily log returns is: {sd:.4%}")
The mean of the daily log returns is: 0.0555%
The standard deviation of the daily log returns is: 1.0430%
# Fit the AR(1) Model
model = tsa.ARIMA(SP['Return'], order=(1, 0, 0)).fit()

# Print summary
print(model.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                 Return   No. Observations:                  967
Model:                 ARIMA(1, 0, 0)   Log Likelihood                3040.864
Date:                Wed, 19 Mar 2025   AIC                          -6075.728
Time:                        20:45:54   BIC                          -6061.105
Sample:                             0   HQIC                         -6070.161
                                - 967                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0006      0.000      1.637      0.102      -0.000       0.001
ar.L1         -0.0063      0.027     -0.232      0.817      -0.059       0.047
sigma2         0.0001   3.63e-06     29.905      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):               162.64
Prob(Q):                              0.99   Prob(JB):                         0.00
Heteroskedasticity (H):               0.63   Skew:                            -0.28
Prob(H) (two-sided):                  0.00   Kurtosis:                         4.93
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/tsa/base/tsa_model.py:473: ValueWarning:

A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.


The null hypothesis states that \(\rho=0\). This hypothesis is rejected if the p-value of the autoregressive coefficient is lower than the chosen significance level (commonly 10%, 5%, or 1%). Specifically, to test the hypothesis at a 95% confidence level (corresponding to a 5% significance level), use the following code to apply the decision rule:

ar1_coef = model.params['ar.L1'].round(4)
p_value  = model.pvalues['ar.L1'].round(4)
print("\nAR(1) Coefficient:", ar1_coef)
print("P-value:", p_value)

if p_value < 0.05:
    print("The AR(1) coefficient is significantly different from zero (autocorrelation present).")
else:
    print("The AR(1) coefficient is not significantly different from zero (no autocorrelation, rho=0).")

AR(1) Coefficient: -0.0063
P-value: 0.8166
The AR(1) coefficient is not significantly different from zero (no autocorrelation, rho=0).

Since we fail to reject the null hypothesis, we find no evidence of first-order autocorrelation in the returns of the SP500, which is consistent with the efficient markets hypothesis. Accordingly, we calculate the VaR with \(\rho=0\).
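As a rough cross-check on the AR(1) test, one can also look at the sample lag-1 autocorrelation directly. The sketch below is not the method used in this document: it relies on the large-sample approximation that, for uncorrelated returns, the sample autocorrelation has a standard error of about \(1/\sqrt{n}\). The function name `lag1_autocorr_check` and the simulated returns are hypothetical.

```python
import numpy as np
import pandas as pd

def lag1_autocorr_check(returns, z_crit=1.96):
    """Sample lag-1 autocorrelation, its approximate standard error,
    and whether it falls outside the +/- z_crit * se band."""
    r = pd.Series(returns).dropna()
    rho_hat = r.autocorr(lag=1)
    se = 1 / np.sqrt(len(r))          # approximation valid under rho = 0
    return rho_hat, se, bool(abs(rho_hat) > z_crit * se)

# Hypothetical independent daily returns, used only to exercise the function
rng = np.random.default_rng(42)
simulated = rng.normal(0.0005, 0.01, size=1000)

rho_iid, se_iid, significant = lag1_autocorr_check(simulated)
print(f"rho_hat = {rho_iid:.4f}, se = {se_iid:.4f}, significant at 5%: {significant}")
```

With the 967 observations reported in the model summary, the 5% band is roughly \(\pm 0.063\), and the estimated coefficient of -0.0063 falls well inside it.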


4.1 Uncorrelated returns (\(\rho =0\))

Extend the mean (\(\mu\)) and standard deviation (\(\sigma\)) to 5 days

days = 5
mu_T = mu*days
sd_T = sd*np.sqrt(days)

Calculate the Value-at-risk over 5 days at a 99% confidence level

X = 0.99   # Confidence level
VaR_T = mu_T + sd_T*st.norm.ppf(1 - X)

print(f"5-day 99% VaR: {-(np.exp(VaR_T)-1):.2%}")
5-day 99% VaR: 5.02%

Interpretation: We are 99% confident that an investment in the SP500 will not lose more than 5.02% over five days.

Calculate the Expected Shortfall over 5 days at a 99% confidence level

z  = st.norm.ppf(1-X)
ES_T = mu_T - sd_T * st.norm.pdf(z) / (1-X)  

print(f"5-day 99% Expected Shortfall is: {-(np.exp(ES_T)-1):.2%}")
5-day 99% Expected Shortfall is: 5.77%

Interpretation: If the investment in the S&P 500 loses more than its 99% Value at Risk (VaR) threshold over the next five days, the expected loss is 5.77%.
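The parametric figures can be sanity-checked by simulation. The sketch below is an illustration only: it assumes independent, normally distributed daily log returns, hard-codes the rounded mean and standard deviation printed earlier (0.0555% and 1.0430%), and uses an arbitrary seed and sample size. It sums five simulated daily returns and reads the 1% quantile and the average of the tail below it.

```python
import numpy as np
import scipy.stats as st

mu, sd = 0.000555, 0.010430     # rounded daily estimates printed earlier
days, X = 5, 0.99

# Simulate many 5-day log returns as sums of independent daily returns
rng = np.random.default_rng(1)
R5 = rng.normal(mu, sd, size=(500_000, days)).sum(axis=1)

var_sim = np.quantile(R5, 1 - X)          # empirical 1% quantile
es_sim = R5[R5 <= var_sim].mean()         # average return beyond that quantile

# Parametric counterparts with the square-root-of-time rule
z = st.norm.ppf(1 - X)
var_par = mu * days + sd * np.sqrt(days) * z
es_par = mu * days - sd * np.sqrt(days) * st.norm.pdf(z) / (1 - X)

print(f"VaR: simulated = {-(np.exp(var_sim) - 1):.2%}, parametric = {-(np.exp(var_par) - 1):.2%}")
print(f"ES:  simulated = {-(np.exp(es_sim) - 1):.2%}, parametric = {-(np.exp(es_par) - 1):.2%}")
```

The two approaches should agree closely here because the simulation uses the same normality and independence assumptions as the formulas; with real returns, which show fat tails, they would not.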


4.2 Correlated returns (\(\rho \ne 0\))

If we had rejected the null hypothesis, the estimated coefficient, \(\rho = -0.0063\), should have been used in the VaR estimation. To facilitate the calculations, I have written the following function, which generalizes the estimations:

def multi_period_risk(mu_1,sd_1t,X,T,rho):
    """
    Calculate multi-period VaR and ES for an AR(1) process.
    
    Parameters:
    - mu_1:  one-period expected return
    - sd_1t: one-period standard deviation
    - X:     confidence level (e.g., 0.95 for 95%)
    - T:     number of periods
    - rho:   autoregressive coefficient
    
    Returns:
    - VaR_T: T-period VaR
    - ES_T:  T-period ES
    """
    # Calculate multi-period variance
    variance_T = sd_1t**2 * (T + 2 * sum((T - k) * rho**k for k in range(1, T)))
    sigma_T = np.sqrt(variance_T)
    
    # Multi-period expected return
    mu_T = mu_1 * T
    
    # Z-score for the given confidence level
    z_alpha = st.norm.ppf(1-X)
    
    # Multi-period VaR
    VaR_T = mu_T +  sigma_T*z_alpha
    
    # Multi-period ES
    ES_T = mu_T - sigma_T*(st.norm.pdf(z_alpha) / (1 - X)) 
    
    return VaR_T, ES_T

We only need to specify:

  1. \(\mu_{1t}\): one-period expected return

  2. \(\sigma_{1t}\): one-period standard deviation

  3. \(X\): Confidence level (e.g., 0.95 for 95%).

  4. \(T\): Number of periods

  5. \(\rho\): Autoregressive coefficient

And the function will return the VaR and ES. It must be used as follows:

VaR_T, ES_T = multi_period_risk(mu_1=mu, sd_1t=sd, X =0.99, T=5, rho=ar1_coef)

print(f"5-day 99% VaR: {-(np.exp(VaR_T)-1):.2%}")
print(f"5-day 99% ES is: {-(np.exp(ES_T)-1):.2%}")
5-day 99% VaR: 4.99%
5-day 99% ES is: 5.74%

The autoregressive coefficient was close to zero but negative, which slightly reduced the measures of risk compared to the uncorrelated case (4.99% vs. 5.02% for the VaR and 5.74% vs. 5.77% for the ES).
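To make the direction of the effect more visible, the following sketch recomputes the 5-day 99% VaR for a few values of \(\rho\). It is a stand-alone illustration: it hard-codes the rounded daily mean and standard deviation printed earlier, and all values of \(\rho\) other than the estimated -0.0063 are hypothetical.

```python
import numpy as np
import scipy.stats as st

mu, sd, X, T = 0.000555, 0.010430, 0.99, 5   # rounded estimates from above
z = st.norm.ppf(1 - X)

def var_loss(rho):
    """5-day 99% VaR as a percentage loss for a given AR(1) coefficient."""
    factor = T + 2 * sum((T - k) * rho**k for k in range(1, T))
    VaR_T = mu * T + sd * np.sqrt(factor) * z
    return -(np.exp(VaR_T) - 1)

for rho in (-0.10, -0.0063, 0.0, 0.10):
    print(f"rho = {rho:+.4f}: 5-day 99% VaR = {var_loss(rho):.2%}")
```

Negative autocorrelation shrinks the multi-period variance and therefore the VaR, while positive autocorrelation has the opposite effect.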