Loss Distribution Documentation

This project was intended to answer two questions.

What is a loan portfolio’s expected loss over the next year?
Given the current state of the loans in the portfolio, how much could be lost in a really bad year? This amount is known as the loan portfolio’s Value at Risk (VaR).

While it is never possible to know in advance the specific losses we may suffer in a particular year, we can forecast the average level of credit losses we can reasonably expect to experience. These losses are referred to as Expected Losses (EL). Expected losses can be viewed as a cost of doing business and can be managed through credit pricing and provisioning. However, one of the functions of bank capital is to provide a buffer that protects a bank against so-called peak losses that exceed expected loss levels. Peak losses do not occur every year, but when they do occur they can be very large. Losses above the expected level are referred to as Unexpected Losses (UL). Risk premiums charged on credit exposures may absorb some components of unexpected losses, but the market will not support prices sufficient to cover all unexpected losses. Capital is needed to cover the risk of such peak losses.

Banks have an incentive to minimize the capital they hold, because reducing capital frees up economic resources that can be directed to profitable investments. On the other hand, the less capital a bank holds, the greater the likelihood that losses in a given year will not be covered by profit plus available capital.

The Expected Loss is assumed to equal the probability of default (PD) (i.e., the proportion of obligors that might default within a given time frame, which is 1 year in the Basel context), multiplied by the outstanding exposure at default (EAD), and multiplied again by the loss given default rate (LGD) (i.e., the percentage of the exposure that will not be recovered through the sale of collateral). These factors must all be estimated using historical values. Below is the mathematical expression of Expected Loss:

\(EL = PD * EAD * LGD\)

The Expected Loss for an entire portfolio is simply the sum of the individual loans’ ELs, or:

\[\sum_{i=1}^n PD_i*EAD_i*LGD_i\]

for n loans in the portfolio.
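As a quick worked example, the portfolio EL can be computed directly from per-loan vectors. The three loans below are made-up values used only for illustration; they are not part of the project's data.

```r
## Hypothetical three-loan portfolio (all values are illustrative assumptions)
pd  <- c(0.01, 0.05, 0.002)        ## one-year probabilities of default
ead <- c(100000, 250000, 500000)   ## exposures at default, in dollars
lgd <- c(0.40, 0.60, 0.25)         ## loss given default rates

el_loan <- pd * ead * lgd          ## expected loss per loan
el_portfolio <- sum(el_loan)       ## portfolio EL is the sum over the loans

el_loan
el_portfolio
```

Here the per-loan expected losses are 400, 7,500 and 250 dollars, giving a portfolio EL of 8,150 dollars.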

The likelihood that losses will exceed Expected Losses, and by how much, can be estimated by simulation. The simulation produces a range of losses along with their associated probabilities of occurrence, called a Portfolio Loss Distribution.

Vasicek Single-Factor Model

The Basel II Advanced Internal Ratings-Based (AIRB) framework was developed to set minimum regulatory capital requirements for the largest and most sophisticated internationally active banks. The Financial Stability Institute (2006) reports that 95 countries plan to implement Basel II by 2015, and more than 60 percent of them plan to include the AIRB option for credit risk capital requirements. The AIRB regulatory framework uses an asymptotic version of Vasicek’s (1987) portfolio credit loss model to approximate the annual default rate distributions on portfolios of credits that are differentiated by a bank-assigned credit rating. To approximate the portfolio credit loss distribution for each credit grade, the AIRB framework uses the Vasicek default rate model along with bank estimates of loss given default (LGD) and exposure at default (EAD). Regulatory capital requirements are set equal to the 99.9 percent upper-tail critical value of the portfolio loss distribution associated with each credit grade.
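To make the 99.9 percent critical value concrete, here is a minimal sketch of the asymptotic single-factor quantile of the default rate. The closed form is the standard textbook statement of the model, not code from this project. The function name `vasicek_quantile` is hypothetical, and `pd` (the grade's probability of default) and `rho` (the asset correlation described in the model section below) are set to illustrative values. The sketch omits the maturity adjustment and the prescribed correlation function of the actual regulatory formula.

```r
## Asymptotic single-factor (Vasicek) quantile of the portfolio default rate.
## pd, rho and q are illustrative assumptions, not calibrated values.
vasicek_quantile <- function(pd, rho, q = 0.999) {
  pnorm((qnorm(pd) + sqrt(rho) * qnorm(q)) / sqrt(1 - rho))
}

pd  <- 0.08
rho <- 0.09
wcdr <- vasicek_quantile(pd, rho)   ## default rate exceeded in only 0.1% of years
wcdr
```

Multiplying this default rate by LGD and EAD gives the corresponding loss quantile for a homogeneous, well-diversified credit grade. With `rho = 0` the function simply returns `pd`, since there is no systematic risk left to push the default rate away from its mean.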

The losses on an initial pool of loans depend on the default probability and loss severity of the underlying firms. Additionally, the degree of dependence between one firm’s default and another’s, known as the default correlation, plays an important role in the timing of the firms’ defaults (whether they tend to cluster or are completely unrelated, independent events). To derive a portfolio loss distribution, knowing the individual default probabilities of the firms isn’t enough; we also need to know their correlation structure so that we can quantify the strength of the relationship between them.

The Basel II Advanced Internal Ratings-Based framework uses Vasicek’s asymptotic single-factor model to set minimum regulatory capital requirements. The Gaussian single-factor model of portfolio credit losses (a.k.a. the Vasicek model), developed by Vasicek (1987), Finger (1999), Schönbucher (2000), Gordy (2003) and others, provides an approximation for the distribution of the default rate on a well-diversified credit portfolio. The model measures the aggregate value of the losses generated by defaulting credits; the income earned on non-defaulting credits is not included. The model assumes that uncertainty on credit \(i\) is driven by a latent, unobserved factor, \(X_i\), with the following properties:

\[X_i = \sqrt{\rho}*Z + \sqrt{1-\rho}*Zvar_i\]

where

\(Z\) is a common systematic risk factor affecting all firms (e.g., the state of the economy)
\(Zvar_i\) is an idiosyncratic factor independent for each firm (e.g., management, innovations, sales, etc.)
\(\rho\) is the correlation between the latent factors of any two firms, assumed to be the same for every pair of firms
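The role of the two square-root weights can be checked numerically. The sketch below simulates the latent factors of two firms that share the same \(Z\); the seed, the sample size `n` and the two-firm setup are arbitrary choices for illustration, and `rho = 0.09` mirrors the trial value used later in this document.

```r
## Latent factors of two firms driven by one common factor Z.
set.seed(1)
n   <- 200000                     ## number of simulated scenarios (arbitrary)
rho <- 0.09
Z     <- rnorm(n)                 ## common factor, one draw per scenario
Zvar1 <- rnorm(n)                 ## idiosyncratic factor, firm 1
Zvar2 <- rnorm(n)                 ## idiosyncratic factor, firm 2
X1 <- sqrt(rho) * Z + sqrt(1 - rho) * Zvar1
X2 <- sqrt(rho) * Z + sqrt(1 - rho) * Zvar2

var(X1)        ## close to 1: the weights keep X standard normal
cor(X1, X2)    ## close to rho
```

Because the squared weights sum to one, each \(X_i\) has unit variance, and the only thing two firms share is the \(\sqrt{\rho}*Z\) term, which is what produces a pairwise correlation of \(\rho\).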

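Credit \(i\) is assumed to default when its latent factor \(X_i\) takes on a value less than a credit-specific threshold. Because \(X_i\) is standard normal, the threshold is set at the standard normal quantile of the credit’s PD, so that the probability of falling below it equals \(PD_i\):

\(X_i < Threshold_i = \Phi^{-1}(PD_i)\).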
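In R, the threshold that corresponds to a given PD can be obtained with `qnorm`, which is how the simulation code in this document sets it. The short sketch below uses the four PD values from the sample portfolio defined later; the variable names `pds` and `thresholds` are chosen here for illustration.

```r
## Default thresholds implied by a set of PDs
pds <- c(0.00001, 0.01, 0.08, 0.0002)
thresholds <- qnorm(pds)     ## standard normal quantile of each PD
thresholds
pnorm(thresholds)            ## recovers the PDs: P(X < threshold) = PD
```

The lower the PD, the further into the left tail the threshold sits, so a safer borrower needs a much worse draw of \(X_i\) before it defaults.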

The critical concept is that, from a banker’s perspective and at a fundamental level, a company exists to service its debt. When the company’s assets fall below its debt level, it can no longer service its debt and so goes into default. We know that on a macro level there is one thing that affects all companies, large and small: the state of the economy. To reflect that, we let \(Z\), a random number drawn from a standard normal distribution (mean 0, standard deviation 1), represent the state of the economy. If we draw a high number we are in a good economic state; if we draw a low number we are in a bad economic state. To reflect the idiosyncrasy of an individual company, we let \(Zvar_i\) be a random number (also drawn from a standard normal distribution) specific to company \(i\), capturing the part of its success or failure that is determined by its management, product sales, etc.

To weight the macro-economic effect we multiply \(Z\) by \(\sqrt{\rho}\), the square root of the portfolio correlation (i.e., the strength of the relationship between all the loans in the portfolio), and to weight the company’s “uniqueness” we multiply \(Zvar_i\) by \(\sqrt{1-\rho}\). The square roots keep the variance of \(X_i\) at \(\rho + (1-\rho) = 1\), so \(X_i\) is itself standard normal. For each iteration \(m\), we draw a new economic state \(Z\), which all companies share for that iteration, and a new \(Zvar_i\) for each individual firm. When we run enough iterations, the average of the simulated losses converges on the expected loss for the portfolio, and we can observe the full loss distribution and the likelihoods associated with the entire range of losses.

To illustrate how this would work for a single borrower, let’s set the PD for our borrower at 8% and run the simulation using the latent factor \(X_i\) to determine how many times in M iterations we get a default. We’ll notice that when M is small, the default rate will likely not be 8%, but as M gets larger we will begin to approximate 8% (i.e., in the long run our default rate will converge on the expected rate).

## Monte Carlo 
M <- 10000  
## Defining variables
rho <- 0.09    ## correlation factor of portfolio...assuming at 0.09 for trial
X <- numeric(M)
threshold <- numeric(M)
iteration <- numeric(M)
set.seed(777)
Z <- rnorm(M, mean=0, sd=1)   ## generating common risk factor
Zvar <- rnorm(M, mean=0, sd=1)  ## generating one idiosyncratic risk factor per iteration
  for (m in 1:M) {
    iteration[m] <- m
    X[m] <- sqrt(rho)*Z[m] + sqrt(1-rho)*Zvar[m]
    threshold[m] <- qnorm(0.08, mean=0, sd=1)  ## PD set at 8%
    }
sim <- as.data.frame(cbind(iteration,X,threshold))
library(dplyr)
library(ggplot2)   ## provides the plotting functions used below
sim <- mutate(sim, Default = (X < threshold))

### Plot for first 20 M
p <- ggplot(sim[1:20,], aes(x=iteration, y=X)) +
  geom_hline(yintercept=threshold[1], size=1, linetype="dotted") +   ## default threshold
  geom_point(aes(colour=factor(Default)), size=2) +                  ## colour marks defaults
  labs(title="Illustration of Default Threshold for a Single Borrower", x="Iteration", y="Value of X")
p

drate1 <- sum(sim$Default[1:20])/20
drate1

## [1] 0.05

In the plot above, the horizontal dotted line represents the borrower’s default threshold. Whenever \(X_i\) falls below the line there is a default (default events are marked in light blue). The default rate for this borrower over the first 20 iterations is \(0.05\), but as we run the simulation more times it converges on 0.08.

### Plot for all M
p <- ggplot(sim, aes(x=iteration, y=X)) +
  geom_hline(yintercept=threshold[1], size=1, linetype="dotted") +   ## default threshold
  geom_point(aes(colour=factor(Default)), size=2) +                  ## colour marks defaults
  labs(title="Illustration of Default Threshold for a Single Borrower", x="Iteration", y="Value of X")
p

drate2 <- sum(sim$Default)/(dim(sim)[1])
drate2

## [1] 0.0795

After running all \(M\) iterations the default rate is \(0.0795\), with all the default events again indicated in light blue. When we run the simulation for all the borrowers in the portfolio, we can expect each borrower’s default rate to converge on its PD in the same way.

Data Preparation

The first step is to read in our prepared portfolio, which has a PD, LGD, and EAD for each borrower, and check the final product. For the purpose of illustration we will simulate our PDs and LGDs. The development of PDs and LGDs is not covered here, but PDs are taken from the historical risk grade transition matrix and LGDs from historical losses, and both are assigned to each loan in the portfolio according to its risk grade and/or industry.

## sample portfolio for trial runs
N <- 1000
ID <- seq(from=1, to=N, by = 1)
PD <- rep_len(c(0.00001, 0.01, 0.08, 0.0002), length.out = N)
LGD <- rep(.5, N)
EAD <- rep(1000000, N)
Portfolio <- data.frame(ID, PD, LGD, EAD)

Then there are a few parameters to define:

N is the number of loans in the portfolio
rho is the portfolio correlation
M is the number of iterations in the simulation
x will be the loss (in dollars) for each iteration
rate will be the default rate per iteration

N <- dim(Portfolio)[1]  ## gives us the number of loans in the dataset
rho <- 0.09    ## sets the portfolio correlation to be used in the simulation
M <- 20000   ## number of iterations
x <-numeric(M)   ## initializes loss vector
rate <- numeric(M)  ## initializes rate vector

Lastly, we run the simulation:

set.seed(777)  
for (m in 1:M) {
    Loss <- 0
    DefaultCount <- 0
    DefaultRate <- 0
  
    Z <- rnorm(1, mean=0, sd=1)   ## generating common risk factor
    Zvar <- rnorm(N, mean=0, sd=1)  ## generating N idiosyncratic risk factors
  
    for (i in 1:N) {
      X <- sqrt(rho)*Z + sqrt(1-rho)*Zvar[i]   ## evaluating X for each loan i
      threshold <- qnorm(Portfolio$PD[i], mean=0, sd=1)   ## setting loan i's default threshold 
      if (X < threshold) {
        Loss <- Loss + Portfolio$LGD[i]*Portfolio$EAD[i]   ## maintaining a running total of Losses
        DefaultCount <- DefaultCount + 1      ## counting +1 for a defaulted loan
        }
      }
    DefaultRate <- DefaultCount/N   ## default rate for this iteration, computed once after all loans
    x[m] <- Loss       ## capturing total portfolio loss per iteration
    rate[m] <- DefaultRate     ## capturing total default rate per iteration
    }

So, after the simulation is complete, how does it compare to just calculating \(EL = PD * LGD * EAD\)?

ExpectedLoss <- sum(PD*LGD*EAD)
ExpectedLoss

## [1] 11276250

SimMean <- mean(x)
SimMean

## [1] 11222250

The two values are very close because we have run enough iterations for the simulated mean to converge on the expected loss. Additionally, we now have a whole range of losses and their associated likelihoods to consider.

Value at Risk set at 99.9% can be used to calculate minimum regulatory capital requirements in the Basel II AIRB approach. The interpretation is that when capital is set at this level, 99.9 percent of all potential portfolio credit losses will be less than the capital allocation.
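One way to read the 99.9% VaR off a simulated loss distribution is with `quantile`. The sketch below is self-contained: it reruns a smaller, vectorised version of the simulation on the sample portfolio values (with `M` reduced to 5000 so it runs quickly) and then takes the empirical quantile. The names `VaR999` and `UL` are chosen here for illustration. With only a few thousand iterations the 99.9% quantile rests on a handful of tail observations, so the estimate is noisy and a larger `M` would be needed in practice.

```r
## Small vectorised rerun of the simulation, followed by the 99.9% VaR.
## Portfolio values copy the sample portfolio; M is reduced for speed.
set.seed(777)
N   <- 1000
M   <- 5000
rho <- 0.09
PD  <- rep_len(c(0.00001, 0.01, 0.08, 0.0002), length.out = N)
LGD <- rep(0.5, N)
EAD <- rep(1000000, N)
threshold <- qnorm(PD)               ## one default threshold per loan

x <- numeric(M)                      ## simulated portfolio loss per iteration
for (m in 1:M) {
  Z    <- rnorm(1)                   ## common factor for this iteration
  Zvar <- rnorm(N)                   ## idiosyncratic factors, one per loan
  Default <- (sqrt(rho) * Z + sqrt(1 - rho) * Zvar) < threshold
  x[m] <- sum(LGD[Default] * EAD[Default])
}

VaR999 <- quantile(x, probs = 0.999, names = FALSE)  ## 99.9% loss quantile
UL     <- VaR999 - mean(x)           ## loss in excess of the simulated mean
VaR999
UL
```

The difference `UL` is the part of the 99.9% loss that exceeds the expected loss, which is the portion capital is meant to absorb when expected losses are already covered by pricing and provisions.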

We can also look at the distribution of default rates across all of the iterations. In the plot, the dotted line is the mean default rate from all M iterations.
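A plot of this kind could be produced along the following lines. This is a sketch rather than the original figure code: it uses base graphics instead of ggplot2 to stay free of package dependencies, reruns a reduced simulation (`M` of 5000) on the sample portfolio PDs so that it is self-contained, and the number of histogram breaks and the axis labels are arbitrary choices.

```r
## Distribution of simulated default rates with the mean marked.
## Sample-portfolio PDs are reused; M is reduced so the example runs quickly.
set.seed(777)
N   <- 1000
M   <- 5000
rho <- 0.09
PD  <- rep_len(c(0.00001, 0.01, 0.08, 0.0002), length.out = N)
threshold <- qnorm(PD)               ## one default threshold per loan

rate <- numeric(M)                   ## default rate per iteration
for (m in 1:M) {
  X <- sqrt(rho) * rnorm(1) + sqrt(1 - rho) * rnorm(N)   ## latent factors
  rate[m] <- mean(X < threshold)     ## share of loans that defaulted
}

hist(rate, breaks = 50, main = "Distribution of Simulated Default Rates",
     xlab = "Default rate per iteration")
abline(v = mean(rate), lty = "dotted", lwd = 2)   ## mean default rate
```

The histogram is right-skewed: most iterations sit near or below the mean, while the long right tail holds the bad economic states in which many loans default together.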