Mixed Competing Risk Models

dummy slide

Introduction

\[ \renewcommand\vec{\boldsymbol} \def\bigO#1{\mathcal{O}(#1)} \def\Cond#1#2{\left(#1\,\middle|\, #2\right)} \def\mat#1{\boldsymbol{#1}} \def\der{{\mathop{}\!\mathrm{d}}} \def\argmax{\text{arg}\,\text{max}} \def\Prob{\text{P}} \def\Expec{\text{E}} \def\logit{\text{logit}} \def\diag{\text{diag}} \]

Survival Analysis

Many examples are not binary but has a time aspect to them.

Buying a given product, terminating a subscription, or getting a new job.

Interested in the distribution of the time till an event

and therefore also for models for the probability of an event occurring over some interval given that it is has not occurred yet.

Example: first diagnosis of cancer.

Have to account for missing event times due to censoring.

Incomplete follow-up. E.g. end of follow-up or death.

Competing Risk

May have competing risk.

First diagnosed with breast cancer, colon cancer, kidney cancer, etc.

Still have to account for missing event times due to censoring.

Mixed Competing Risk

Do related individuals tend to have the same type of cancer?

Do related individuals tend to have early or later onset of a given type of cancer given that they get it?

Do related individuals who tend to have the same type of cancer and also tend to have earlier or later onset?

Competing Risk

Comments

We focus on a particular type of competing risk models.

Others exist and some more common models exist!

Cumulative Incidence

There are \(K\) competing risks. Let \(\epsilon_j^*\in\{1,\dots,K\}\) be the cause indicator.

\(j = 1,\dots,m\) observations.

Let \(T_j^*\) be the event time of cause \(\epsilon_j^*\).

We are interested in the cumulative incidence function

\[ F_{jk}(t) = P(T_j^*\leq t, \epsilon_j^* = k). \]

Cumulative Incidence Properties

To recall \[ F_{jk}(t) = P(T_j^*\leq t, \epsilon_j^* = k). \]

\(F_{jk}(t)\) must be monotonic.

The probability is the same or larger if we wait longer.

\[ F_j(t) = P(T_j^*\leq t) = \sum_{k = 1}^KF_{jk}(t). \] We usually assume \(\lim_{t \rightarrow\infty} F_j(t) = 1\). Moreover, for interesting examples, \(\lim_{t\rightarrow\infty}F_{jk}(t) < \lim_{t\rightarrow\infty} F_j(t) = 1\).

Cumulative Incidence Properties

To recall \[ F_{jk}(t) = P(T_j^*\leq t, \epsilon_j^* = k). \]

So \[\lim_{t\rightarrow \infty} F_{jk}(t) = P(\epsilon_j^* = k).\]

Model

Decompose \(F_{jk}(t)\) into

\[ F_{jk}(t) = \underbrace{P(T_j^*\leq t \mid \epsilon_j^* = k)}_{\text{trajectory}} \underbrace{P(\epsilon_j^* = k)}_{\text{risk}}. \]

Model (Cont.)

Issue \[\lim_{t\rightarrow \infty} F_{jk}(t) = P(\epsilon_j^* = k).\]

Pick a \(\tau\) and use

\[ F_{jk}(t) = P(T_j^*\leq t \mid T_j^*\leq \tau, \epsilon_j^* = k) P(T_j^* \leq \tau, \epsilon_j^* = k). \]

Thus, \[F_{jk}(\tau) = P(T_j^* \leq \tau, \epsilon_j^* = k).\]

Model (Cont.)

Model: Risk

Have covariates \(\vec x_j\) which the cumulative incidence function may depend on.

Set \[ P(T_j^* \leq \tau, \epsilon_j^* = k) = \pi_{k}(\vec x_j) = \frac{\exp(\vec x_j^\top\vec\beta_k)} {1 + \sum_{l=1}^K\exp(\vec x_j^\top\vec\beta_l)}. \]

Can use other multinomial models.

Model: Trajectory

\[ \begin{align*} P(T_j^*\leq 0 \mid T_j^*\leq \tau, \epsilon_j^* = k) = 0 \\ P(T_j^*\leq \tau \mid T_j^*\leq \tau, \epsilon_j^* = k) = 1 \end{align*} \]

Take a cumulative distribution function \(G\) with support on \(\mathbb R\) and use

\[ P(T_j^*\leq t \mid T_j^*\leq \tau, \epsilon_j^* = k) = G(a_k(t) - \vec x_j^\top\vec\gamma_k) \]

for some \(a_k:\,(0,\tau)\rightarrow\mathbb R\)

which is monotonic and increasing. Can be estimated with a constrained spline.

Model: Combined

Set \(G\) to be the normal distribution cumulative density function \(\Phi\).

Gives some computational advantages later.

The model is \[ \begin{align*} F_{jk}(t) &= \underbrace{P(T_j^*\leq t \mid \epsilon_j^* = k)}_{\text{trajectory}} \underbrace{P(\epsilon_j^* = k)}_{\text{risk}} \\ &= \Phi(a_k(t) - \vec x_j^\top\vec\gamma_k)\pi_{k}(\vec x_j)\\ \pi_{k}(\vec x_j) &= \frac{\exp(\vec x_j^\top\vec\beta_k)} {1 + \sum_{l=1}^K\exp(\vec x_j^\top\vec\beta_l)}. \end{align*} \]

Mixed Competing Risk

Introduction

Families may tend to have jointly higher or lower risk for some cause.

Families may tend to have early or later onset given that they get a cause.

Families in which they tend to get a certain cause may also tend to get the cause earlier or later.

Notation

We have \(n\) cluster.

Think families.

Each have \(j = 1,\dots,m_i\) observations.

The cumulative incidence functions becomes

\[ \begin{align*} F_{ijk}(t) &= \Phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k)\pi_{k}(\vec x_{ij})\\ \pi_{k}(\vec x_{ij}) &= \frac{\exp(\vec x_{ij}^\top\vec\beta_k)} {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l)}. \end{align*} \]

Random Effect for the Risk

Let \(\vec U_i\sim N^{(K)}(\vec 0, \mat\Sigma_{11})\) be unobserved random effects which affects the risk part.

\[ \begin{align*} F_{ijk}(t \mid\vec u_i) &= \Phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k)\pi_{k}(\vec x_{ij}, \vec u_i)\\ \pi_{k}(\vec x_{ij}, \vec u_i) &= \frac{\exp(\vec x_{ij}^\top\vec\beta_k + u_{ik})} {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l + u_{il})}. \end{align*} \]

\(u_{ik}\) makes cause \(k\) more likely all else being equal.

Random Effect for the Trajectory

Let \(\vec \eta_i\sim N^{(K)}(\vec 0, \mat\Sigma_{22})\) be an unobserved random effect which affects the trajectory part.

\[ \begin{align*} F_{ijk}(t \mid\vec u_i,\vec\eta_i) &= \Phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k -\eta_{ik})\pi_{k}(\vec x_{ij}, \vec u_i)\\ \pi_{k}(\vec x_{ij}, \vec u_i) &= \frac{\exp(\vec x_{ij}^\top\vec\beta_k + u_{ik})} {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l + u_{il})}. \end{align*} \]

A higher \(\eta_{ik}\) makes it more likely that cause \(k\) happens later given that it happens.

Dependence Between Random Effects

Let \[ \begin{pmatrix} \vec U_i \\ \vec\eta_i \end{pmatrix} \sim N^{(2K)}\left(\vec 0, \begin{pmatrix} \mat\Sigma_{11} & \mat\Sigma_{12} \\ \mat\Sigma_{21} & \mat\Sigma_{22} \end{pmatrix}\right) \]

\(\text{Cov}(\eta_{ik},u_{ik}) < 0\) implies that a higher risk within a family of cause \(k\) is associated with higher likelihood of an early onset.

Implementation

Overview

Cover the conditional likelihood.

Cover the pairwise composite likelihood.

Conditional Likelihood

Let \(T_{ij}\) be the observed time.

Last follow up or event time \(T_{ij}^*\), whichever comes first.

Let \(\epsilon_{ij} = 1_{\{T_i = T_{ij}^*\}}\epsilon_{ij}^* \in\{0,\dots,K\}\) be the observed cause indicator.

Conditional Likelihood: Observed

The density if we observe one of the causes is

\[ \begin{align*} f_{ijk}(t \mid\vec u_i,\vec\eta_i) &= \frac\partial{\partial t}F_{ijk}(t \mid\vec u_i, \vec\eta_i) \\ &= a_k'(t)\phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k -\eta_{ik}) \pi_{k}(\vec x_{ij}, \vec u_i) \end{align*} \]

\(\phi\) is the normal density.

In the observed case, \(\epsilon_{ij} > 0\), the likelihood factor is

\[l_{ij}(\vec u_i,\vec\eta_i) = f_{ijk}(t_{ij} \mid\vec u_i,\vec\eta_i).\]

Easy to compute!

Conditional Likelihood: Censored

In the censored case, \(\epsilon_{ij} = 0\), with \(t_{ij} < \tau\) the likelihood factor is

\[ l_{ij}(\vec u_i,\vec\eta_i) = 1 - \sum_{k = 1}^KP(T_{ij}^* \leq t_{ij}, \epsilon_j^* = k \mid \vec u_i,\vec\eta_i) \]

At least \(\bigO{K}\).

Conditional Likelihood: Censored

In the censored case, \(\epsilon_{ij} = 0\), with \(t_{ij} = \tau\) the likelihood factor is

\[ \begin{align*} l_{ij}(\vec u_i,\vec\eta_i) &= 1 - \sum_{k = 1}^KP(T_{ij}^* \leq \tau, \epsilon_j^* = k \mid \vec u_i,\vec\eta_i) \\ &= \frac 1 {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l + u_{il})} \end{align*} \]

Easy to compute!

Is not emphasized and perhaps not used by Cederkvist et al. (2018).

Marignal Likelihood

Combined we get \(\prod_{j = 1}^{m_i} l_{ij}(\vec u_i,\vec\eta_i)\).

But we need the marginal likelihood

\[ E\left(\prod_{j = 1}^{m_i} l_{ij}(\vec U_i,\vec\eta_i)\right). \]

Requires \(2K\) dimensional integration. Using quadrature with \(b\) nodes gives an implementation which behaves like \(\bigO{m_ib^{2K}K^3}\)

for the \(K\)s we consider.

Pairwise Composite Likelihood

We can work with the miss specified likelihood

\[ \prod_{j_1 = 1}^{m_i - 1}\prod_{j_2 = j_1 + 1}^{m_i} E\left(l_{ij_1}(\vec U_i,\vec\eta_i)l_{ij_2}(\vec U_i,\vec\eta_i)\right). \]

See Varin, Reid, and Firth (2011) for an overview of composite likelihood.

Now \(\bigO{m_i^2b^{2K}K^3}\)!

Each term is though easier to compute.

But for the particular model, we can compute it in \(\bigO{m_i^2b^KK^4}\) (Cederkvist et al. 2018).

Implementation

Made a new and faster implementation in the mmcif package.

The composite likelihood term is a bit tedious to implement with \(3^2 - 2 = 7\) types of composite likelihood terms.

Thank You!

The presentation is at rpubs.com/boennecd/KTH-competing.

The markdown is at github.com/boennecd/Talks.

The implementation is at github.com/boennecd/mmcif.

References are on the next slide.

References

Cederkvist, Luise, Klaus K Holst, Klaus K Andersen, and Thomas H Scheike. 2018. Modeling the cumulative incidence function of multivariate competing risks data allowing for within-cluster dependence of risk and timing.” Biostatistics 20 (2): 199–217. https://doi.org/10.1093/biostatistics/kxx072.
Varin, Cristiano, Nancy Reid, and David Firth. 2011. “AN OVERVIEW OF COMPOSITE LIKELIHOOD METHODS.” Statistica Sinica 21 (1): 5–42. http://www.jstor.org/stable/24309261.