Goal: assess the risk of a portfolio of loans to a number of firms.
Loss of bank \(b\) at time \(t\), \(L_{bt}\), is given by
\[ L_{bt} = \sum_{i\in I_{bt}} E_{bit} L_{bit} Y_{it} \]
where \(I_{bt}\) is the set of firms bank \(b\) lends to at time \(t\), \(E_{bit}\) is the exposure at default, \(L_{bit}\) is the loss given default, and \(Y_{it}\) is the default indicator.
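As a toy illustration of the identity in R (all numbers below are made up):

```r
## toy computation of L_{bt}; all values are hypothetical
set.seed(1)
n_firms <- 5L                    # number of firms in I_{bt}
EAD <- runif(n_firms, 1, 10)     # exposures at default, E_{bit}
LGD <- runif(n_firms, .2, .8)    # losses given default, L_{bit}
Y   <- rbinom(n_firms, 1L, .05)  # default indicators, Y_{it}
L_bt <- sum(EAD * LGD * Y)       # the loss of bank b at time t
```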
I focus on \(Y_{it}\). \(L_{bit}\) is also interesting. Joint modelling might be even more interesting.
The hazard models I use and the software that implements them.
A quick overview and my contributions.
We model the instantaneous hazard rate of firm \(i\) at time \(t\)
\[ \lambda_i(t) = \lim_{h\rightarrow 0^+}\frac{P\left(T_i \leq t + h\mid T_i \geq t\right)}{h} \]
as a piecewise constant function of firm variables \(\vect x_{ik}\) and macro variables \(\vect m_k\)
\[ \lambda_i(t) = \lambda_{ik}= \exp\left(\vect\beta^\top\vect x_{ik} + \vect\gamma^\top\vect m_k\right), \quad k - 1 < t \leq k \]
The time intervals are on a monthly, quarterly, half-yearly, or annual scale.
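A standard way to fit such a piecewise constant hazard is to split each firm's spell at the interval boundaries and fit a Poisson GLM with a log-exposure offset. Below is a minimal sketch; the data set, covariate, and cut points are made-up placeholders:

```r
library(survival)

## made-up spell data: one row per firm with an end time and an indicator
set.seed(1)
firm_dat <- data.frame(id = 1:100, x1 = rnorm(100),
                       time = rexp(100, .2), event = rbinom(100, 1L, .7))

## split each spell at the interval boundaries k = 1, 2, ...
spells <- survSplit(Surv(time, event) ~ ., data = firm_dat,
                    cut = 1:9, start = "tstart", episode = "k")

## piecewise exponential fit: Poisson GLM with a log-exposure offset
fit <- glm(event ~ x1 + offset(log(time - tstart)),
           family = poisson(), data = spells)
```

Adding `factor(k)` to the formula would give a separate baseline per period.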
The models are good at ranking firms by riskiness.
The concordance index for one-year-ahead prediction is in \([0.80,0.85]\) for private firms and \(\geq0.90\) for public firms.
Predicting the level of the hazard is harder.
Add a random effect to account for clustering:
\[ \begin{aligned} \lambda_{ik} &= \exp\left(\vect\beta^\top\vect x_{ik} + \vect\gamma^\top\vect m_k +A_k\right) \\ A_k &= \theta A_{k-1}+\epsilon_k & \epsilon_k\sim N\left(0,\sigma^2\right) \end{aligned} \]
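To see what the random effect does, here is a toy simulation of \(A_k\) and the implied hazard for one firm with a fixed linear predictor (all parameter values are made up):

```r
## simulate the AR(1) random effect and the implied hazard; toy values
set.seed(2)
n_periods <- 20L
theta <- .9; sigma <- .3
A <- numeric(n_periods)
A[1] <- rnorm(1L, sd = sigma / sqrt(1 - theta^2))  # stationary start
for(k in 2:n_periods)
  A[k] <- theta * A[k - 1L] + rnorm(1L, sd = sigma)
eta <- -4                  # a fixed beta^T x_{ik} + gamma^T m_k
lambda <- exp(eta + A)     # the piecewise constant hazard per period
```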
Relax the linearity assumption:
\[ \begin{aligned} \lambda_{ik} &= \exp\left(\vect\beta^\top\vect x_{ik}^{(1)} +\sum_{j=1}^pf_j(x_{ikj}^{(2)};\vect\nu)+ \vect\gamma^\top\vect m_k +A_k\right) \\ A_k &= \theta A_{k-1}+\epsilon_k \\ \epsilon_k&\sim N\left(0,\sigma^2\right) \end{aligned} \]
where \(\vect x_{ik} = (\vect x_{ik}^{(1)\top},\vect x_{ik}^{(2)\top})^\top\).
Consider the multivariate extension
\[ \begin{aligned} \lambda_{ik} &= \exp\left(\vect\beta^\top\vect x_{ik}^{(1)} +\sum_{j=1}^pf_j(x_{ikj}^{(2)};\vect\nu)+ \vect\gamma^\top\vect m_k + \vect A_k^\top\vect z_{ik}\right) \\ \vect A_k &= F \vect A_{k-1}+\vect \epsilon_k \\ \vect \epsilon_k&\sim N\left(\vect 0,Q\right) \end{aligned} \]
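The toy simulation extends directly; here \(\vect A_k\) acts as a random intercept and a random slope on one firm covariate (again, all values are made up):

```r
## simulate a bivariate VAR(1) random effect; toy values
set.seed(3)
n_periods <- 20L
F_mat <- diag(c(.9, .8))           # the transition matrix F
Q     <- diag(c(.3, .1)^2)         # the innovation covariance Q
A <- matrix(0, n_periods, 2L)
A[1, ] <- MASS::mvrnorm(1L, mu = numeric(2L), Sigma = Q)
for(k in 2:n_periods)
  A[k, ] <- drop(F_mat %*% A[k - 1L, ]) +
    MASS::mvrnorm(1L, mu = numeric(2L), Sigma = Q)
z <- c(1, .5)                      # z_{ik}: an intercept and a covariate
lambda <- exp(-4 + A %*% z)        # the hazard for a firm with this z
```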
Implementations of particle filters and smoothers.
E.g., a (very) fast approximation.
This was the start of the dynamichazard package.
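For instance, a call of the following kind fits the fast approximation with ddhazard. This is only a sketch based on the package documentation; the data are simulated placeholders and argument details may differ across package versions:

```r
library(dynamichazard)

## toy start-stop data purely to show the interface
set.seed(4)
n <- 200L
tmp <- rexp(n, .2)
dat <- data.frame(id = 1:n, tstart = 0, tstop = pmin(tmp, 10),
                  event = as.integer(tmp < 10), x1 = rnorm(n))

fit <- ddhazard(Surv(tstart, tstop, event) ~ x1, data = dat,
                id = dat$id, by = 1, max_T = 10,  # interval length, last time
                Q_0 = diag(10, 2), Q = diag(.1, 2))
plot(fit)  # estimated time-varying coefficient paths
```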
Use a Monte Carlo expectation-maximization (MCEM) algorithm.
Get arbitrary precision in the E-step.
Need to sample from a \(Td\)-dimensional space, where \(T\) is the number of time periods and \(\vect A_k \in \mathbb R^d\).
Go from importance sampling of the \(\vect A_1 \mid \vect y_1\) density
\[ \begin{aligned} f(\vect A_1) &\approx \sum_{i =1}^N w_1^{(i)}\delta_{\vect a_1^{(i)}}(\vect A_1)\\ \vect y_t &= \{y_{it}\}_{i\in R_t} \end{aligned} \]
to sequential importance sampling, where \(R_t\) is the set of firms at risk in period \(t\).
Can sample in \(\bigO{Td}\) time.
\[ L = \int \mu_0(\vect A_1)g_1\Cond{\vect y_1}{\vect A_1} \prod_{t = 2}^T g_t\Cond{\vect y_t}{\vect A_t}f\Cond{\vect A_t}{\vect A_{t - 1}} \diff \vect A_{1:T} \]
Given some discrete approximation \(\{\vect a_{t-1}^{(i)}, w_{t-1}^{(i)}\}_{i=1,\dots,N}\) of \(P\Cond{\vect A_{t-1}}{\vect y_{1:(t-1)}}\) and a proposal distribution \(q_t\), sample \(\vect a_t^{(i)} \sim q_t\) and set
\[ w_t^{(i)} \propto w_{t-1}^{(i)}\frac{g_t\Cond{\vect y_t}{\vect a_t^{(i)}}f\Cond{\vect a_t^{(i)}}{\vect a_{t-1}^{(i)}}}{q_t(\vect a_t^{(i)})} \]
to get a discrete approximation of \(P\Cond{\vect A_t}{\vect y_{1:t}}\).
Evaluating \(g_t\Cond{\vect y_t}{\vect a_t^{(i)}}\) is the main issue: each evaluation is \(\bigO{\lvert R_t\rvert}\).
All \(N\) computations can easily be done in parallel and scale nicely in the number of threads.
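A generic bootstrap version of the filter, with the AR(1) state model from before and \(q_t = f\), can be sketched as below. This is toy code for the univariate \(A_k\), not the package's implementation; a Bernoulli observation density stands in for \(g_t\):

```r
## bootstrap particle filter sketch (q_t = f); toy code
particle_filter <- function(y, eta, theta, sigma, N = 1000L){
  T_ <- nrow(y)                     # y is a T x n 0/1 outcome matrix
  a <- matrix(0, T_, N)             # particles
  w <- matrix(0, T_, N)             # normalized weights
  a[1, ] <- rnorm(N, sd = sigma / sqrt(1 - theta^2))  # draws from mu_0
  for(t in 1:T_){
    if(t > 1)                       # resample and propagate through f
      a[t, ] <- theta * sample(a[t - 1L, ], N, replace = TRUE,
                               prob = w[t - 1L, ]) + rnorm(N, sd = sigma)
    ## log g_t(y_t | a_t^(i)); the O(|R_t|) cost noted above
    lp <- sapply(a[t, ], function(at){
      p <- 1 - exp(-exp(eta + at))  # P(default in period t)
      sum(dbinom(y[t, ], 1L, p, log = TRUE))
    })
    w[t, ] <- exp(lp - max(lp))
    w[t, ] <- w[t, ] / sum(w[t, ])
  }
  list(particles = a, weights = w)
}
```

The `sapply` over the \(N\) particles is the embarrassingly parallel part.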
Improve the sampling:
use the auxiliary particle filter, as suggested by Pitt and Shephard (1999).
The E-step needs smoothing.
dynamichazard contains an implementation of the smoothers suggested by Briers, Doucet, and Maskell (2009) and Fearnhead, Wyncoll, and Tawn (2010).
Want the observed information matrix.
Use the methods suggested by Cappé and Moulines (2005) and Poyiadjis, Doucet, and Singh (2011).
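In dynamichazard, the pieces above are put together in PF_EM. Purely as an illustration of the kind of call, and with the caveat that argument names and defaults may differ across package versions, a fit could look like:

```r
library(dynamichazard)

## the same kind of toy data as before
set.seed(5)
n <- 200L
tmp <- rexp(n, .2)
dat <- data.frame(id = 1:n, tstart = 0, tstop = pmin(tmp, 10),
                  event = as.integer(tmp < 10), x1 = rnorm(n))

pf_fit <- PF_EM(Surv(tstart, tstop, event) ~ x1, data = dat, id = dat$id,
                by = 1, max_T = 10, Q_0 = diag(1, 2), Q = diag(.1, 2),
                control = PF_control(N_fw_n_bw = 200L, N_smooth = 500L,
                                     N_first = 1000L, n_max = 25L))
```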
Slides are on rpubs.com/boennecd/YRD-19.
dynamichazard is on CRAN at
CRAN.R-project.org/package=dynamichazard.
An example of an application is at ssrn.com/abstract=3339981.
Briers, Mark, Arnaud Doucet, and Simon Maskell. 2009. “Smoothing Algorithms for State-Space Models.” Annals of the Institute of Statistical Mathematics 62 (1): 61. doi:10.1007/s10463-009-0236-2.
Cappé, Olivier, and Eric Moulines. 2005. “Recursive Computation of the Score and Observed Information Matrix in Hidden Markov Models.” In IEEE/SP 13th Workshop on Statistical Signal Processing, 2005, 703–8. doi:10.1109/SSP.2005.1628685.
Fearnhead, Paul, David Wyncoll, and Jonathan Tawn. 2010. “A Sequential Smoothing Algorithm with Linear Computational Cost.” Biometrika 97 (2). [Oxford University Press, Biometrika Trust]: 447–64. http://www.jstor.org/stable/25734097.
Pitt, Michael K., and Neil Shephard. 1999. “Filtering via Simulation: Auxiliary Particle Filters.” Journal of the American Statistical Association 94 (446). [American Statistical Association, Taylor & Francis, Ltd.]: 590–99. http://www.jstor.org/stable/2670179.
Poyiadjis, George, Arnaud Doucet, and Sumeetpal S. Singh. 2011. “Particle Approximations of the Score and Observed Information Matrix in State Space Models with Application to Parameter Estimation.” Biometrika 98 (1). Biometrika Trust: 65–80. http://www.jstor.org/stable/29777165.