Modeling Frailty Correlated Defaults

Motivation

\[ \definecolor{gray}{RGB}{192,192,192} \def\vect#1{\boldsymbol #1} \def\bigO#1{\mathcal{O}(#1)} \def\Cond#1#2{\left(#1 \mid #2\right)} \def\diff{{\mathop{}\!\mathrm{d}}} \]

Motivation

We want to model the loss distribution of each bank \(b\) in period \(t\).

Loss is given by

\[ L_{bt} = \sum_{i\in R_{bt}} E_{bit}G_{bit}Y_{it} \]

\(R_{bt}\): risk set, \(E_{bit}\in (0,\infty)\): exposure, \(G_{bit}\in[0,1]\): loss-given-default, and \(Y_{it}\in\{0,1\}\): default indicator.

Focus on \(Y_{it}\).
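As a small illustration of the loss formula, the sketch below sums exposure times loss-given-default times the default indicator over a risk set. The `Loan` class, the function name, and the numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    exposure: float            # E_{bit}, assumed > 0
    loss_given_default: float  # G_{bit}, in [0, 1]
    default: int               # Y_{it}, 0 or 1


def bank_loss(risk_set):
    """Return L_{bt}: the sum of E * G * Y over the loans in the risk set."""
    return sum(l.exposure * l.loss_given_default * l.default for l in risk_set)


# Hypothetical risk set with three loans, two of which default.
loans = [Loan(100.0, 0.5, 1), Loan(200.0, 0.25, 0), Loan(40.0, 1.0, 1)]
print(bank_loss(loans))  # 100 * 0.5 + 0 + 40 * 1.0 = 90.0
```

Only the default indicators are random in this sketch, which matches the focus on \(Y_{it}\).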

First Idea

Assume conditional independence and, e.g., let the default intensity be

\[\log\lambda_{it} = \vect\beta^\top \vect x_{it} + \vect\gamma^\top \vect z_t\]

So the probability of default is

\[ P(Y_{it}=1\mid Y_{i1}=\cdots=Y_{i,t-1}=0) = 1 - \exp\left(-\lambda_{it}\right) \]

If conditional independence is invalid, this is a poor choice for tail risk.
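A minimal sketch of the mapping from covariates to a default probability under this model. The coefficient and covariate values are hypothetical, and the period length is assumed to be one time unit.

```python
import math


def default_probability(beta, x, gamma, z):
    """Return 1 - exp(-lambda) where log(lambda) = beta'x + gamma'z."""
    log_intensity = (sum(b * xi for b, xi in zip(beta, x))
                     + sum(g * zi for g, zi in zip(gamma, z)))
    intensity = math.exp(log_intensity)
    # -expm1(-v) equals 1 - exp(-v) but is more accurate for small v.
    return -math.expm1(-intensity)


# Hypothetical firm covariates x (with an intercept) and macro covariates z.
p = default_probability(beta=[-4.0, 0.8], x=[1.0, 0.5], gamma=[0.3], z=[-1.0])
print(p)
```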

Aggregate Fit

In-sample predicted minus realized default rate. Black bars mark differences that are outside the 90 pct. confidence intervals.

Add Frailty

Duffie et al. (2009) suggest generalizing to

\[ \begin{aligned} \log\lambda_{it} &= \vect\beta^\top \vect x_{it} + \vect\gamma^\top \vect z_t + A_t \\ A_t &= \theta A_{t-1} + \epsilon_t \\ \epsilon_t&\sim N(0,\sigma^2) \end{aligned} \]

The auto-regressive frailty, \(A_t\), captures excess clustering.
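The sketch below simulates the frailty path and the resulting default rates to show how a shared \(A_t\) moves all firms' default probabilities together. Starting the frailty at zero, using one common baseline log intensity for every firm, and all parameter values are assumptions made for the example.

```python
import math
import random


def simulate_frailty(n_periods, theta, sigma, rng):
    """Simulate A_t = theta * A_{t-1} + eps_t with eps_t ~ N(0, sigma^2).

    The path is started from A_0 = 0, which is an assumption of this sketch.
    """
    path, a = [], 0.0
    for _ in range(n_periods):
        a = theta * a + rng.gauss(0.0, sigma)
        path.append(a)
    return path


def simulate_default_rates(frailty, n_firms, baseline_log_intensity, rng):
    """Draw conditionally independent defaults given the frailty path."""
    rates = []
    for a in frailty:
        p = -math.expm1(-math.exp(baseline_log_intensity + a))
        n_defaults = sum(rng.random() < p for _ in range(n_firms))
        rates.append(n_defaults / n_firms)
    return rates


rng = random.Random(1)
frailty = simulate_frailty(n_periods=40, theta=0.9, sigma=0.3, rng=rng)
rates = simulate_default_rates(frailty, n_firms=1000,
                               baseline_log_intensity=-4.0, rng=rng)
print(min(rates), max(rates))
```

Setting `sigma=0` switches the frailty off and recovers the conditionally independent model.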

Remarks

Only time-varying intercept?

Findings in Lando et al. (2013), Filipe, Grammatikos, and Michala (2016), and Jensen, Lando, and Medhat (2017) suggest not.

Only linear effects on the log hazard scale?

Findings in Berg (2007), Christoffersen, Matin, and Mølgaard (2018), and the ML literature suggest not.

Generalize

\[ \begin{aligned} \log\lambda_{it} &= \vect\beta^{(1)\top}\vect x_{it}^{(1)} + \vect\gamma^\top \vect z_t + \vect\beta^{(2)\top} \vect f(\vect x_{it}^{(2)}) + \vect A_t^\top\vect u_{it} \\ \vect A_t &= F\vect A_{t-1} + \vect \epsilon_t \\ \vect \epsilon_t&\sim N(\vect 0, Q) \\ \vect x_{it} &= \left(\vect x_{it}^{(1)\top}, \vect x_{it}^{(2)\top}\right)^\top \end{aligned} \]

\(\vect A_t \in \mathbb{R}^p\) is low dimensional and some elements in \(\vect u_{it}\) and \(\vect x_{it}\) may match.

Talk Overview

Estimation method

Brief description of the dynamichazard and mssm packages.

Analysis and results

Estimation Method

Marginal Likelihood

\[ \begin{aligned} L &= \int_{\mathbb R^{pd}} \mu_0(\vect A_1)g_1\Cond{\vect y_1}{\vect A_1} \\ &\hspace{30pt}\cdot\prod_{t=2}^d g_t\Cond{\vect y_t}{\vect A_t} f\Cond{\vect A_t}{\vect A_{t-1}}\,\diff\vect A_{1:d} \\ \vect y_t &= \{y_{it}\}_{i\in \mathcal{O}_t} \end{aligned} \]

\(\mathcal{O}_t\) is the risk set and \(g_t\) is a conditional density.
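To make the integral concrete, here is a sketch of a plain bootstrap particle filter that estimates the log marginal likelihood for a univariate frailty. This is a simpler technique than the smoothers discussed in this talk and it is not the packages' implementation. It assumes that all firms in a period share one default probability, so \(g_t\) is binomial, and that \(\mu_0\) is the stationary distribution of the AR(1) process. All names are hypothetical.

```python
import math
import random


def log_binom_pmf(k, n, p):
    """Log of the binomial probability mass function."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def log_marginal_likelihood(defaults, n_firms, eta, theta, sigma,
                            n_particles, rng):
    """Bootstrap particle filter estimate of log L. Requires |theta| < 1."""
    # Assumed mu_0: stationary distribution N(0, sigma^2 / (1 - theta^2)).
    sd0 = sigma / math.sqrt(1.0 - theta ** 2)
    particles = [rng.gauss(0.0, sd0) for _ in range(n_particles)]
    log_lik = 0.0
    for t, (k, n) in enumerate(zip(defaults, n_firms)):
        if t > 0:
            # Propagate with the state transition f(A_t | A_{t-1}).
            particles = [theta * a + rng.gauss(0.0, sigma) for a in particles]
        # Weight each particle with g_t(y_t | A_t) on the log scale.
        log_w = [log_binom_pmf(k, n, -math.expm1(-math.exp(eta + a)))
                 for a in particles]
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        # The mean unnormalized weight estimates the period's likelihood factor.
        log_lik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return log_lik


print(log_marginal_likelihood(defaults=[1, 0, 2, 5, 3], n_firms=[50] * 5,
                              eta=-3.0, theta=0.8, sigma=0.4,
                              n_particles=500, rng=random.Random(0)))
```

With `sigma=0` every particle equals zero and the estimate reduces to the exact log-likelihood of the model without frailty, which is a convenient check.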

Monte Carlo Method

Use Monte Carlo expectation maximization (EM).

Approximate E-step with a particle smoother.

Can get arbitrary precision by increasing the number of particles.
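For reference, the expectation that the E-step has to approximate can be written as follows. The parameter symbol \(\vect\psi\) and the iteration index \(k\) are notation introduced here and are not taken from the slides.

\[ \begin{aligned} Q\Cond{\vect\psi}{\vect\psi^{(k)}} &= E\Big( \log \mu_0(\vect A_1) + \sum_{t=1}^d \log g_t\Cond{\vect y_t}{\vect A_t} \\ &\hspace{30pt} + \sum_{t=2}^d \log f\Cond{\vect A_t}{\vect A_{t-1}} \,\Big|\, \vect y_{1:d}; \vect\psi^{(k)}\Big) \end{aligned} \]

The particle smoother replaces the expectation with a weighted average over smoothed particles, and the M-step maximizes the result over \(\vect\psi\).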

Particle Smoother

Implemented the generalized two-filter smoother suggested by Briers, Doucet, and Maskell (2009).

Method is \(\bigO{N^2}\), where \(N\) is the number of particles. This is not a problem for \(N<2000\). The cost can be reduced to an average case of \(\bigO{N\log N}\) with the dual k-d tree approximation in the mssm package.
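The quadratic cost comes from pairing every backward particle with every forward particle. The sketch below shows a combination step of that form for a univariate state. It is only meant to illustrate the double loop, it is not the packages' code, and the function names and the artificial prior argument are assumptions of the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def combine_weights(fwd_particles, fwd_weights, bwd_particles, bwd_weights,
                    transition_density, artificial_prior):
    """Combine forward and backward filter output into smoothing weights.

    Each backward particle is paired with every forward particle, so the
    cost is N^2 transition density evaluations.
    """
    out = []
    for a_new, w_b in zip(bwd_particles, bwd_weights):
        s = sum(w_f * transition_density(a_new, a_old)
                for a_old, w_f in zip(fwd_particles, fwd_weights))
        out.append(w_b * s / artificial_prior(a_new))
    total = sum(out)
    return [w / total for w in out]


# Hypothetical AR(1) transition with theta = 0.9 and sigma = 0.3.
weights = combine_weights(
    fwd_particles=[-0.2, 0.1, 0.4], fwd_weights=[0.2, 0.5, 0.3],
    bwd_particles=[0.0, 0.3, 0.5], bwd_weights=[0.3, 0.4, 0.3],
    transition_density=lambda new, old: normal_pdf(new, 0.9 * old, 0.3),
    artificial_prior=lambda a: normal_pdf(a, 0.0, 1.0))
print(weights)
```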

Implemented the particle smoother suggested by Fearnhead, Wyncoll, and Tawn (2010).

\(\bigO{N}\) with some extra overhead per particle.

Features

A few options for the conditional model given the state variables.

Discrete time models with logit and cloglog link functions, and a log link in continuous time.

Implemented approximate gradient and observed information matrix.

Both the method suggested by Poyiadjis, Doucet, and Singh (2011) and the method mentioned in Cappe and Moulines (2005) are implemented. The mssm package has a dual k-d tree approximation for these methods.

Implemented in C++ and supports computation in parallel.

More Software

rollRegres package: fast rolling regression.

DtD package: fast estimation of the Merton model.

Analysis and Results

Summary

Add covariates, non-linear effects, and a random slope to the model in Duffie, Saita, and Wang (2007) and Duffie et al. (2009).

Find less evidence of a time-varying intercept.

As shown by Lando and Nielsen (2010).

Provide evidence of a time-varying size slope.

Show improved firm-level performance and industry-level performance.

Data Source

Default data from Moody’s Default Risk Service Database.

Covariates from Compustat and CRSP.

Additions

  • Working capital / size.
  • Operating income / size.
  • Market value / total liabilities.
  • Net income / size.
  • Total liabilities / size.
  • Current ratio.
  • Log relative market size.
  • Idiosyncratic volatility.

All have been used previously.

E.g., see Shumway (2001) and Chava and Jarrow (2004). Size is defined as 50 pct. total assets and 50 pct. market value.
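A short sketch of the size definition and two of the ratios listed above. The function names and the input figures are hypothetical.

```python
def size(total_assets, market_value):
    """Size: 50 pct. total assets plus 50 pct. market value."""
    return 0.5 * total_assets + 0.5 * market_value


def scaled_by_size(value, total_assets, market_value):
    """Scale an accounting figure by size, e.g. net income / size."""
    return value / size(total_assets, market_value)


# Hypothetical firm with total assets of 100 and a market value of 300.
firm_size = size(100.0, 300.0)
print(firm_size)                             # 200.0
print(scaled_by_size(10.0, 100.0, 300.0))    # net income / size = 0.05
print(scaled_by_size(120.0, 100.0, 300.0))   # total liabilities / size = 0.6
```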

Estimates without Random Effects

The figures in the parentheses are Wald \(\chi^2\) statistics. \(\mathcal{M}_1\): model similar to Duffie, Saita, and Wang (2007), \(\mathcal{M}_2\): model with additional variables, and \(\mathcal{M}_3\): model with non-linear effects and an interaction.

Estimates without Random Effects

Large difference in log-likelihood.

Similar to evidence by Lando and Nielsen (2010) and Bharath and Shumway (2008).

Estimated Splines

Estimated partial effect. A histogram is shown at the bottom. The heights of the bars are unrelated to the y-axis.

Adding Time-Varying Coefficient

\[\begin{aligned} \vect z_{it} &= (\vect x_{it}^\top, \vect m_t^\top, u_{it}, \alpha_t, b_t)^\top \\ g(P(Y_{it} = 1 \mid \vect z_{it})) &= \vect \beta^\top\vect f(\vect x_{it}) + \vect\gamma^\top\vect m_t + \alpha_t + b_tu_{it} \\ \begin{pmatrix}\alpha_t \\ b_t \end{pmatrix} &= \begin{pmatrix}\theta_1 & 0 \\ 0 & \theta_2 \end{pmatrix} \begin{pmatrix}\alpha_{t-1} \\ b_{t-1} \end{pmatrix} + \vect\epsilon_t \\ \vect\epsilon_t & \sim N\left(\vect 0, \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}\right) \end{aligned}\]
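The state equation above can be simulated as in the sketch below, which draws the correlated innovations from two independent standard normal variables. Starting both states at zero and the parameter values are assumptions of the example.

```python
import math
import random


def simulate_states(n_periods, theta1, theta2, sigma1, sigma2, rho, rng):
    """Simulate the random intercept alpha_t and random slope b_t.

    The innovations have variances sigma1^2 and sigma2^2 and correlation
    rho. Both states are started at zero, which is an assumption here.
    """
    alpha, b = 0.0, 0.0
    path = []
    for _ in range(n_periods):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        eps1 = sigma1 * z1
        eps2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        alpha = theta1 * alpha + eps1
        b = theta2 * b + eps2
        path.append((alpha, b))
    return path


path = simulate_states(n_periods=100, theta1=0.9, theta2=0.8,
                       sigma1=0.3, sigma2=0.1, rho=-0.5,
                       rng=random.Random(2))
print(path[:3])
```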

Estimates with Random Effects

The figures in the parentheses are Wald \(\chi^2\) statistics. \(\mathcal{M}_4\): model with non-linear effects, an interaction, and a random intercept and \(\mathcal{M}_5\): same as \(\mathcal{M}_4\) with a random relative market size slope.

Smoothed Predicted Random Effect

Log market size is as in Shumway (2001). The plot shows only the zero-mean random effect \(b_t\).

Out-of-Sample AUCs

Blue: lowest, black: highest. ◇: model as in Duffie, Saita, and Wang (2007), ▽: + covariates and non-linear effects, ▲: + random intercept, and ◆: + random size slope.
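For readers less familiar with the metric, the AUC is the probability that a randomly chosen defaulting firm gets a higher predicted probability than a randomly chosen surviving firm, with ties counted as one half. A minimal sketch with made-up predictions:

```python
def auc(scores, labels):
    """Area under the ROC curve computed from all default/survivor pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one survivor")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted default probabilities and realized outcomes.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 3 of 4 pairs ordered correctly: 0.75
```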

Out-of-Sample Industry Default Rate

Bars: 90% prediction interval, ○: realized rate, other points: median. ◇: model as in Duffie, Saita, and Wang (2007), ▽: + covariates and non-linear effects, ▲: + random intercept, and ◆: + random size slope.

Summary

Summary

Argued for random effects and non-linear effects.

Gave a rough overview of the dynamichazard and mssm packages.

More details are available in the packages’ vignettes or README.

Showed an application of a corporate default model with multivariate latent factors and provided evidence of non-linear associations.

Thank You!

Slides are at rpubs.com/boennecd/CSCR-20.

Packages are on CRAN and at github.com/boennecd/dynamichazard and github.com/boennecd/mssm.

References are on the next slide.

References

Berg, Daniel. 2007. “Bankruptcy Prediction by Generalized Additive Models.” Applied Stochastic Models in Business and Industry 23 (2). John Wiley & Sons, Ltd.: 129–43. https://doi.org/10.1002/asmb.658.

Bharath, Sreedhar T., and Tyler Shumway. 2008. “Forecasting Default with the Merton Distance to Default Model.” The Review of Financial Studies 21 (3): 1339–69. https://doi.org/10.1093/rfs/hhn044.

Briers, Mark, Arnaud Doucet, and Simon Maskell. 2009. “Smoothing Algorithms for State–Space Models.” Annals of the Institute of Statistical Mathematics 62 (1): 61. https://doi.org/10.1007/s10463-009-0236-2.

Cappe, O., and E. Moulines. 2005. “Recursive Computation of the Score and Observed Information Matrix in Hidden Markov Models.” In IEEE/SP 13th Workshop on Statistical Signal Processing, 2005, 703–8. https://doi.org/10.1109/SSP.2005.1628685.

Chava, Sudheer, and Robert A. Jarrow. 2004. “Bankruptcy Prediction with Industry Effects.” Review of Finance 8 (4): 537–69. https://doi.org/10.1093/rof/8.4.537.

Christoffersen, Benjamin, Rastin Matin, and Pia Mølgaard. 2018. “Can Machine Learning Models Capture Correlations in Corporate Distresses?”

Duffie, Darrell, Andreas Eckner, Guillaume Horel, and Leandro Saita. 2009. “Frailty Correlated Default.” The Journal of Finance 64 (5). Blackwell Publishing Inc: 2089–2123. https://doi.org/10.1111/j.1540-6261.2009.01495.x.

Duffie, Darrell, Leandro Saita, and Ke Wang. 2007. “Multi-Period Corporate Default Prediction with Stochastic Covariates.” Journal of Financial Economics 83 (3): 635–65. https://doi.org/10.1016/j.jfineco.2005.10.011.

Fearnhead, Paul, David Wyncoll, and Jonathan Tawn. 2010. “A Sequential Smoothing Algorithm with Linear Computational Cost.” Biometrika 97 (2). Oxford University Press, Biometrika Trust: 447–64. http://www.jstor.org/stable/25734097.

Filipe, Sara Ferreira, Theoharry Grammatikos, and Dimitra Michala. 2016. “Forecasting Distress in European SME Portfolios.” Journal of Banking & Finance 64: 112–35. https://doi.org/10.1016/j.jbankfin.2015.12.007.

Jensen, Thais, David Lando, and Mamdouh Medhat. 2017. “Cyclicality and Firm-Size in Private Firm Defaults.” International Journal of Central Banking 13 (4): 97–145.

Lando, David, Mamdouh Medhat, Mads Stenbo Nielsen, and Søren Feodor Nielsen. 2013. “Additive Intensity Regression Models in Corporate Default Analysis.” Journal of Financial Econometrics 11 (3): 443–85. https://doi.org/10.1093/jjfinec/nbs018.

Lando, David, and Mads Stenbo Nielsen. 2010. “Correlation in Corporate Defaults: Contagion or Conditional Independence?” Journal of Financial Intermediation 19 (3): 355–72. https://doi.org/10.1016/j.jfi.2010.03.002.

Poyiadjis, George, Arnaud Doucet, and Sumeetpal S. Singh. 2011. “Particle Approximations of the Score and Observed Information Matrix in State Space Models with Application to Parameter Estimation.” Biometrika 98 (1). Biometrika Trust: 65–80. http://www.jstor.org/stable/29777165.

Shumway, Tyler. 2001. “Forecasting Bankruptcy More Accurately: A Simple Hazard Model.” The Journal of Business 74 (1). The University of Chicago Press: 101–24. http://www.jstor.org/stable/10.1086/209665.