\[ \renewcommand\vec{\boldsymbol} \def\bigO#1{\mathcal{O}(#1)} \def\Cond#1#2{\left(#1\,\middle|\, #2\right)} \def\mat#1{\boldsymbol{#1}} \def\der{{\mathop{}\!\mathrm{d}}} \def\argmax{\text{arg}\,\text{max}} \def\Prob{\text{P}} \def\Expec{\text{E}} \def\logit{\text{logit}} \def\diag{\text{diag}} \]
Buying a given product, terminating a subscription, or getting a new job.
and therefore also for models for the probability of an event occurring over some interval given that it is has not occurred yet.
Example: first diagnosis of cancer.
Incomplete follow-up. E.g. end of follow-up or death.
First diagnosed with breast cancer, colon cancer, kidney cancer, etc.
Still have to account for missing event times due to censoring.
Do related individuals tend to have the same type of cancer?
Do related individuals tend to have early or later onset of a given type of cancer given that they get it?
Do related individuals who tend to have the same type of cancer and also tend to have earlier or later onset?
\(j = 1,\dots,m\) observations.
Let \(T_j^*\) be the event time of cause \(\epsilon_j^*\).
We are interested in the cumulative incidence function
\[ F_{jk}(t) = P(T_j^*\leq t, \epsilon_j^* = k). \]
To recall \[ F_{jk}(t) = P(T_j^*\leq t, \epsilon_j^* = k). \]
The probability is the same or larger if we wait longer.
\[ F_j(t) = P(T_j^*\leq t) = \sum_{k = 1}^KF_{jk}(t). \] We usually assume \(\lim_{t \rightarrow\infty} F_j(t) = 1\). Moreover, for interesting examples, \(\lim_{t\rightarrow\infty}F_{jk}(t) < \lim_{t\rightarrow\infty} F_j(t) = 1\).
To recall \[ F_{jk}(t) = P(T_j^*\leq t, \epsilon_j^* = k). \]
So \[\lim_{t\rightarrow \infty} F_{jk}(t) = P(\epsilon_j^* = k).\]
Decompose \(F_{jk}(t)\) into
\[ F_{jk}(t) = \underbrace{P(T_j^*\leq t \mid \epsilon_j^* = k)}_{\text{trajectory}} \underbrace{P(\epsilon_j^* = k)}_{\text{risk}}. \]
Issue \[\lim_{t\rightarrow \infty} F_{jk}(t) = P(\epsilon_j^* = k).\]
Pick a \(\tau\) and use
\[ F_{jk}(t) = P(T_j^*\leq t \mid T_j^*\leq \tau, \epsilon_j^* = k) P(T_j^* \leq \tau, \epsilon_j^* = k). \]
Thus, \[F_{jk}(\tau) = P(T_j^* \leq \tau, \epsilon_j^* = k).\]
Have covariates \(\vec x_j\) which the cumulative incidence function may depend on.
Set \[ P(T_j^* \leq \tau, \epsilon_j^* = k) = \pi_{k}(\vec x_j) = \frac{\exp(\vec x_j^\top\vec\beta_k)} {1 + \sum_{l=1}^K\exp(\vec x_j^\top\vec\beta_l)}. \]
Can use other multinomial models.
\[ \begin{align*} P(T_j^*\leq 0 \mid T_j^*\leq \tau, \epsilon_j^* = k) = 0 \\ P(T_j^*\leq \tau \mid T_j^*\leq \tau, \epsilon_j^* = k) = 1 \end{align*} \]
Take a cumulative distribution function \(G\) with support on \(\mathbb R\) and use
\[ P(T_j^*\leq t \mid T_j^*\leq \tau, \epsilon_j^* = k) = G(a_k(t) - \vec x_j^\top\vec\gamma_k) \]
which is monotonic and increasing. Can be estimated with a constrained spline.
Gives some computational advantages later.
The model is \[ \begin{align*} F_{jk}(t) &= \underbrace{P(T_j^*\leq t \mid \epsilon_j^* = k)}_{\text{trajectory}} \underbrace{P(\epsilon_j^* = k)}_{\text{risk}} \\ &= \Phi(a_k(t) - \vec x_j^\top\vec\gamma_k)\pi_{k}(\vec x_j)\\ \pi_{k}(\vec x_j) &= \frac{\exp(\vec x_j^\top\vec\beta_k)} {1 + \sum_{l=1}^K\exp(\vec x_j^\top\vec\beta_l)}. \end{align*} \]
Families may tend to have jointly higher or lower risk for some cause.
Families may tend to have early or later onset given that they get a cause.
Families in which they tend to get a certain cause may also tend to get the cause earlier or later.
Think families.
Each have \(j = 1,\dots,m_i\) observations.
The cumulative incidence functions becomes
\[ \begin{align*} F_{ijk}(t) &= \Phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k)\pi_{k}(\vec x_{ij})\\ \pi_{k}(\vec x_{ij}) &= \frac{\exp(\vec x_{ij}^\top\vec\beta_k)} {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l)}. \end{align*} \]
Let \(\vec U_i\sim N^{(K)}(\vec 0, \mat\Sigma_{11})\) be unobserved random effects which affects the risk part.
\[ \begin{align*} F_{ijk}(t \mid\vec u_i) &= \Phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k)\pi_{k}(\vec x_{ij}, \vec u_i)\\ \pi_{k}(\vec x_{ij}, \vec u_i) &= \frac{\exp(\vec x_{ij}^\top\vec\beta_k + u_{ik})} {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l + u_{il})}. \end{align*} \]
\(u_{ik}\) makes cause \(k\) more likely all else being equal.
Let \(\vec \eta_i\sim N^{(K)}(\vec 0, \mat\Sigma_{22})\) be an unobserved random effect which affects the trajectory part.
\[ \begin{align*} F_{ijk}(t \mid\vec u_i,\vec\eta_i) &= \Phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k -\eta_{ik})\pi_{k}(\vec x_{ij}, \vec u_i)\\ \pi_{k}(\vec x_{ij}, \vec u_i) &= \frac{\exp(\vec x_{ij}^\top\vec\beta_k + u_{ik})} {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l + u_{il})}. \end{align*} \]
A higher \(\eta_{ik}\) makes it more likely that cause \(k\) happens later given that it happens.
Let \[ \begin{pmatrix} \vec U_i \\ \vec\eta_i \end{pmatrix} \sim N^{(2K)}\left(\vec 0, \begin{pmatrix} \mat\Sigma_{11} & \mat\Sigma_{12} \\ \mat\Sigma_{21} & \mat\Sigma_{22} \end{pmatrix}\right) \]
\(\text{Cov}(\eta_{ik},u_{ik}) < 0\) implies that a higher risk within a family of cause \(k\) is associated with higher likelihood of an early onset.
Cover the conditional likelihood.
Cover the pairwise composite likelihood.
Last follow up or event time \(T_{ij}^*\), whichever comes first.
Let \(\epsilon_{ij} = 1_{\{T_i = T_{ij}^*\}}\epsilon_{ij}^* \in\{0,\dots,K\}\) be the observed cause indicator.
The density if we observe one of the causes is
\[ \begin{align*} f_{ijk}(t \mid\vec u_i,\vec\eta_i) &= \frac\partial{\partial t}F_{ijk}(t \mid\vec u_i, \vec\eta_i) \\ &= a_k'(t)\phi(a_k(t) - \vec x_{ij}^\top\vec\gamma_k -\eta_{ik}) \pi_{k}(\vec x_{ij}, \vec u_i) \end{align*} \]
\(\phi\) is the normal density.
In the observed case, \(\epsilon_{ij} > 0\), the likelihood factor is
\[l_{ij}(\vec u_i,\vec\eta_i) = f_{ijk}(t_{ij} \mid\vec u_i,\vec\eta_i).\]
Easy to compute!In the censored case, \(\epsilon_{ij} = 0\), with \(t_{ij} < \tau\) the likelihood factor is
\[ l_{ij}(\vec u_i,\vec\eta_i) = 1 - \sum_{k = 1}^KP(T_{ij}^* \leq t_{ij}, \epsilon_j^* = k \mid \vec u_i,\vec\eta_i) \]
At least \(\bigO{K}\).
In the censored case, \(\epsilon_{ij} = 0\), with \(t_{ij} = \tau\) the likelihood factor is
\[ \begin{align*} l_{ij}(\vec u_i,\vec\eta_i) &= 1 - \sum_{k = 1}^KP(T_{ij}^* \leq \tau, \epsilon_j^* = k \mid \vec u_i,\vec\eta_i) \\ &= \frac 1 {1 + \sum_{l=1}^K\exp(\vec x_{ij}^\top\vec\beta_l + u_{il})} \end{align*} \]
Is not emphasized and perhaps not used by Cederkvist et al. (2018).
Combined we get \(\prod_{j = 1}^{m_i} l_{ij}(\vec u_i,\vec\eta_i)\).
But we need the marginal likelihood
\[ E\left(\prod_{j = 1}^{m_i} l_{ij}(\vec U_i,\vec\eta_i)\right). \]
for the \(K\)s we consider.
We can work with the miss specified likelihood
See Varin, Reid, and Firth (2011) for an overview of composite likelihood.
Each term is though easier to compute.
But for the particular model, we can compute it in \(\bigO{m_i^2b^KK^4}\) (Cederkvist et al. 2018).
Made a new and faster implementation in the mmcif package.
The composite likelihood term is a bit tedious to implement with \(3^2 - 2 = 7\) types of composite likelihood terms.
The presentation is at rpubs.com/boennecd/KTH-competing.
The markdown is at github.com/boennecd/Talks.
The implementation is at github.com/boennecd/mmcif.
References are on the next slide.
Comments
Others exist and some more common models exist!