Relevant packages

library(survival)
library(Epi)
library(popEpi)
## 
## Attaching package: 'popEpi'
## The following object is masked from 'package:survival':
## 
##     Surv
# popEpi::splitMulti returns a data.frame rather than a data.table
options("popEpi.datatable" = FALSE)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2
## ──
## ✔ ggplot2 3.4.1     ✔ purrr   1.0.1
## ✔ tibble  3.1.8     ✔ dplyr   1.1.0
## ✔ tidyr   1.3.0     ✔ stringr 1.5.0
## ✔ readr   2.1.4     ✔ forcats 1.0.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
clear()

Lexis object for steno2

data(steno2)
steno2 <- cal.yr(steno2)
steno2 <- transform(steno2,
                    doEnd = pmin(doEnd, doDth, na.rm = TRUE))
str(steno2)
## 'data.frame':    160 obs. of  14 variables:
##  $ id      : num  1 2 3 4 5 6 7 8 9 10 ...
##  $ allo    : Factor w/ 2 levels "Int","Conv": 1 1 2 2 2 2 2 1 1 1 ...
##  $ sex     : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 1 2 2 2 ...
##  $ baseCVD : num  0 0 0 0 0 1 0 0 0 0 ...
##  $ deathCVD: num  0 0 0 0 1 0 0 0 1 0 ...
##  $ doBth   : 'cal.yr' num  1932 1947 1943 1945 1936 ...
##  $ doDM    : 'cal.yr' num  1991 1982 1983 1977 1986 ...
##  $ doBase  : 'cal.yr' num  1993 1993 1993 1993 1993 ...
##  $ doCVD1  : 'cal.yr' num  2014 2009 2002 1995 1994 ...
##  $ doCVD2  : 'cal.yr' num  NA 2009 NA 1997 1995 ...
##  $ doCVD3  : 'cal.yr' num  NA 2010 NA 2003 1998 ...
##  $ doESRD  : 'cal.yr' num  NaN NaN NaN NaN 1998 ...
##  $ doEnd   : 'cal.yr' num  2015 2015 2002 2003 1998 ...
##  $ doDth   : 'cal.yr' num  NA NA 2002 2003 1998 ...

Start by setting up a Lexis data frame for the entire observation time for each person; from entry (doBase, date of baseline) to exit, doEnd. Note that we call the initial state Mic(roalbuminuria), because all patients in the Steno2 study had this status at entry—it was one of the inclusion criteria:

L2 <- Lexis(entry = list(per = doBase,
                         age = doBase - doBth,
                         tfi = 0),
             exit = list(per = doEnd),
      exit.status = factor(deathCVD + !is.na(doDth),
                           labels=c("Mic","D(oth)","D(CVD)")),
               id = id,
             data = steno2)
## NOTE: entry.status has been set to "Mic" for all.
summary(L2, t = TRUE)
##      
## Transitions:
##      To
## From  Mic D(oth) D(CVD)  Records:  Events: Risk time:  Persons:
##   Mic  67     55     38       160       93    2416.59       160
## 
## Timescales:
## per age tfi 
##  ""  ""  ""
boxes(L2, boxpos = TRUE, show.BE = TRUE, scale.R = 100)

How many deaths are there in the cohort?

Explain the coding of exit.status.

How many person-years are in the follow-up?

What are the time scales defined?

In this set-up we can study the CVD and the non-CVD mortality rates, a classical competing risks problem, but we want in particular to see how the mortality rates depend on albuminuria status.

In order to allocate follow-up (person-time and events) to current albuminuria status we need to know when the persons change status; this is recorded in the data frame st2alb.

We will cut the follow-up at each date of albuminuria measurement allowing the patients to change between states Normoalbuminuria, Microalbuminuria and Macroalbuminuria at each of these dates, possibly several times per person. To this end we use the function rcutLexis (recurrent cuts), which requires a data frame of transitions with columns lex.id, cut and new.state — see ?rcutLexis.

We change the scale of the date of transition to year by cal.yr (to align with the per variable in L2), and in order to comply with the requirements of rcutLexis rename the id variable id to lex.id, the date variable doTr to cut and the state variable state to new.state:

data(st2alb)
cut2 <- cal.yr(st2alb)
names(cut2)
## [1] "id"    "doTr"  "state"
names(cut2) <- c("lex.id", "cut", "new.state")
str(cut2)
## 'data.frame':    563 obs. of  3 variables:
##  $ lex.id   : num  1 1 1 1 1 2 2 2 2 2 ...
##  $ cut      : 'cal.yr' num  1993 1995 2000 2002 2007 ...
##  $ new.state: Factor w/ 3 levels "Norm","Mic","Mac": 2 1 2 1 2 1 2 3 2 2 ...
head(cut2)
##   lex.id      cut new.state
## 1      1 1993.444       Mic
## 2      1 1995.361      Norm
## 3      1 2000.067       Mic
## 4      1 2001.984      Norm
## 5      1 2007.317       Mic
## 6      2 1993.786      Norm

How many persons are in the cut2 data frame? We can answer this in two different ways, the second illustrating the tidyverse (pipe) philosophy:

addmargins(table(table(cut2$lex.id)))
## 
##   1   2   3   4   5 Sum 
##   4  25  40  46  41 156
cut2$lex.id %>% table %>% table %>% addmargins
## .
##   1   2   3   4   5 Sum 
##   4  25  40  46  41 156

Explain the entries in this table.

Now cut the follow-up at intermediate transition times (note that rcutLexis assumes that values in the cut column refer to the first timescale by default, and the first of the timescales in L2 is per):

L3 <- rcutLexis(L2, cut2)
summary(L3)
##       
## Transitions:
##      To
## From   Mic Norm Mac D(oth) D(CVD)  Records:  Events: Risk time:  Persons:
##   Mic  299   72  65     27     13       476      177    1381.57       160
##   Norm  31   90   5     14      7       147       57     607.86        69
##   Mac   20    3  44     14     18        99       55     427.16        64
##   Sum  350  165 114     55     38       722      289    2416.59       160
boxes(L3, boxpos = TRUE, cex = 0.8)

Note in the figure the default layout of the 5 boxes placed on a circle, including the jumps directly between Norm and Mac.

Note that there are transitions both ways between all three of Norm, Mic and Mac, which is a bit illogical since we have a natural ordering of states: Norm < Mic < Mac. Hence, transitions from Norm to Mac (and vice versa) should go through Mic.

In order to remedy this anomaly we find all transitions Norm → Mac and provide a transition Norm → Mic in between, so that each transition Norm → Mac is replaced by two: Norm → Mic and Mic → Mac. And of course similarly for transitions Mac → Norm.

The relevant “jump” transitions are easily found:

(jump <-
subset(L3, (lex.Cst == "Norm" & lex.Xst == "Mac") |
           (lex.Xst == "Norm" & lex.Cst == "Mac"))[,
       c("lex.id", "per", "lex.dur","lex.Cst", "lex.Xst")])
##  lex.id     per lex.dur lex.Cst lex.Xst
##      70 1999.49    2.67     Mac    Norm
##      86 2001.76   12.82    Norm     Mac
##     130 2000.91    1.88     Mac    Norm
##     131 1997.76    4.24    Norm     Mac
##     136 1997.21    0.47     Mac    Norm
##     136 1997.69    4.24    Norm     Mac
##     171 1996.39    5.34    Norm     Mac
##     175 2004.58    9.88    Norm     Mac

What we need to do for each of these “jumps” is to insert an extra transition to Mic at some time during the stay in the current state (lex.Cst, either Norm or Mac), i.e. somewhere between per and per + lex.dur in these records. Quite arbitrarily we choose a random time in the middle 80% of this interval:

set.seed(1952)
xcut <- select(transform(jump,
                          cut = per + lex.dur * runif(per, 0.1, 0.9),
                    new.state = "Mic"),
               c(lex.id, cut, new.state))
xcut
##  lex.id      cut new.state
##      70 2001.789       Mic
##      86 2012.232       Mic
##     130 2001.488       Mic
##     131 2001.032       Mic
##     136 1997.610       Mic
##     136 2000.780       Mic
##     171 1997.057       Mic
##     175 2013.472       Mic

How many extra records will be generated when cutting the follow-up?

Now make extra cuts (transitions to Mic) at these dates using rcutLexis with xcut on the L3 object, and order the levels sensibly:

L4 <- rcutLexis(L3, xcut)
L4 <- Relevel(L4, c("Norm","Mic","Mac","D(CVD)","D(oth)"))
summary(L4)
##       
## Transitions:
##      To
## From   Norm Mic Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm   90  35   0      6     13       144       54     581.04        66
##   Mic    72 312  65     14     30       493      181    1435.14       160
##   Mac     0  22  41     18     12        93       52     400.41        60
##   Sum   162 369 106     38     55       730      287    2416.59       160

We see that there are now no transitions directly between Norm and Mac in L4, so we can make a more intelligible plot of the transitions (remember to read the help page for boxes.Lexis):

clr <- c("forestgreen","orange","red","blue",gray(0.3))
boxes(L4, boxpos = list(x = c(20,20,20,80,80),
                        y = c(10,50,90,75,25)),
         show.BE = "nz",
         scale.R = 100,
        digits.R = 2,
             cex = 0.9,
         pos.arr = 0.3,
          col.bg = clr,
      col.border = clr,
      main = "Transitions between states in the Steno2 study",
         col.txt = c("white","black")[c(1,2,1,1,1)])

Explain the arguments of boxes.

Explain the numbers in the graph.

Describe the overall effect of albuminuria on the two mortality rates.

What we now have in the Lexis object L4 is the follow-up of the 160 persons classified in the 5 states shown. So with this multistate model (well, there is no model yet) set up we can look at mortality rates and see how they depend on the current albuminuria state, or look at the transition rates between the different albuminuria states and assess how these depend on covariates.

Transition rates: multiple time scales

We will model the transition rates with parametric functions, so we need to split the dataset along some time scale; we will use 1 month intervals (they should be sufficiently small to accommodate an assumption of constant transition rates in each interval):

S4 <- splitMulti(L4, tfi = seq(0, 25, 1/12))
summary(L4)
##       
## Transitions:
##      To
## From   Norm Mic Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm   90  35   0      6     13       144       54     581.04        66
##   Mic    72 312  65     14     30       493      181    1435.14       160
##   Mac     0  22  41     18     12        93       52     400.41        60
##   Sum   162 369 106     38     55       730      287    2416.59       160
summary(S4)
##       
## Transitions:
##      To
## From   Norm   Mic  Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm 7061    35    0      6     13      7115       54     581.04        66
##   Mic    72 17453   65     14     30     17634      181    1435.14       160
##   Mac     0    22 4844     18     12      4896       52     400.41        60
##   Sum  7133 17510 4909     38     55     29645      287    2416.59       160

We see that the number of events (transitions) and the person-years are the same in the two Lexis objects, but the number of records in S4 is substantially larger than in L4.
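
As a quick sanity check (a sketch using only objects defined above), we can compare the record counts directly:

# number of records before and after the split; events and risk time are unchanged
c(L4 = nrow(L4), S4 = nrow(S4))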

We can now model the overall mortality rates as functions of age and duration (time since entry) using the defaults for glm.Lexis (this function call will trigger a warning):

ma <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                      Ns(age, knots = seq(50, 80, 10)) +
                      lex.Cst)
## NOTE:
## Multiple transitions *from* state ' Mac', 'Mic', 'Norm ' - are you sure? 
## The analysis requested is effectively merging outcome states. 
## You may want analyses using a *stacked* dataset - see ?stack.Lexis
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->D(CVD)
## Mic->D(CVD)
## Mac->D(CVD)
## Norm->D(oth)
## Mic->D(oth)
## Mac->D(oth)

The warning triggered here just tells you that you are modeling the occurrence of any type of death, which amounts to modeling of the sum of CVD and non-CVD death rates—the overall mortality.

We can illustrate the actual model underlying this by collapsing the two causes of death using Relevel:

clr <- c("forestgreen","orange","red",gray(0.3))
summary(Relevel(L4, list("Dead" = 4:5), first = FALSE))
##       
## Transitions:
##      To
## From   Norm Mic Mac Dead  Records:  Events: Risk time:  Persons:
##   Norm   90  35   0   19       144       54     581.04        66
##   Mic    72 312  65   44       493      181    1435.14       160
##   Mac     0  22  41   30        93       52     400.41        60
##   Sum   162 369 106   93       730      287    2416.59       160
  boxes(Relevel(L4, list("Dead" = 4:5), first = FALSE),
          boxpos = list(x = c(20,20,20,80),
                        y = c(10,50,90,50)),
         show.BE = "nz",
         scale.R = 100,
        digits.R = 2,
             cex = 0.9,
         pos.arr = 0.3,
          col.bg = clr,
      col.border = clr,
         col.txt = c("white","black")[c(1,2,1,1)])

This figure shows the transitions between states in the Steno2 study, with the two causes of death collapsed into a single all-cause mortality state.

The model structure with lex.Cst as an additive term assumes that the overall mortality rates are proportional between the albuminuria states.
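
The proportionality assumption can be probed; as a sketch (our addition, not part of the original analysis), we could let the tfi effect differ between albuminuria states and compare the two fits with a likelihood-ratio test:

# sketch: an interaction between the tfi spline and current state relaxes the
# proportionality assumption; compare with the additive model ma by LR test
ma2 <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) * lex.Cst +
                       Ns(age, knots = seq(50, 80, 10)))
anova(ma, ma2, test = "Chisq")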

The default for glm.Lexis is to model all transitions into absorbing states, which in this case are the two “dead” states, D(oth) and D(CVD).
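
As a side note (a sketch, assuming your version of the Epi package exports these helpers), the transient and absorbing states of a Lexis object can be listed directly:

# list the state types of S4 (assumes Epi provides transient()/absorbing())
transient(S4)
absorbing(S4)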

# The glm.Lexis above is just a convenience wrapper for:

m1 <- glm(cbind(lex.Xst %in% c("D(oth)", "D(CVD)")
                 & lex.Cst != lex.Xst,
                 lex.dur)
           ~ Ns(tfi, knots = seq( 0, 20, 5)) +
             Ns(age, knots = seq(50, 80, 10)) +
             lex.Cst,
           family = poisreg,
             data = subset(S4, lex.Cst %in% c("Norm","Mic","Mac")))
# which will also give the same results as:

m2 <- glm((lex.Xst %in% c("D(oth)", "D(CVD)")
            & lex.Cst != lex.Xst)
           ~ Ns(tfi, knots = seq( 0, 20, 5)) +
             Ns(age, knots = seq(50, 80, 10)) +
             lex.Cst,
           offset = log(lex.dur),
           family = poisson,
             data = subset(S4, lex.Cst %in% c("Norm","Mic","Mac")))

Note the difference between the families poisreg and poisson: poisreg enters events and person-time more logically as part of the outcome, whereas poisson enters events as the response and person-years (lex.dur) via the offset argument.
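
As a quick check (a sketch using the objects just fitted) that the three formulations give the same estimates:

# rate ratios (first column of ci.exp) from the three equivalent fits
round(cbind(glm.Lexis = ci.exp(ma)[, 1],
            poisreg   = ci.exp(m1)[, 1],
            poisson   = ci.exp(m2)[, 1]), 4)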

The parameters from any of these formulations are on the log scale, so we want to see them exponentiated, i.e. on the rate scale:

round(ci.exp(ma), 2)
##                                   exp(Est.) 2.5%   97.5%
## (Intercept)                            0.00 0.00    0.01
## Ns(tfi, knots = seq(0, 20, 5))1        6.57 1.23   35.14
## Ns(tfi, knots = seq(0, 20, 5))2        4.29 0.97   18.89
## Ns(tfi, knots = seq(0, 20, 5))3       45.97 0.93 2265.04
## Ns(tfi, knots = seq(0, 20, 5))4        0.52 0.17    1.58
## Ns(age, knots = seq(50, 80, 10))1      3.26 1.33    8.00
## Ns(age, knots = seq(50, 80, 10))2     11.00 1.34   90.24
## Ns(age, knots = seq(50, 80, 10))3     12.41 5.56   27.71
## lex.CstMic                             0.96 0.56    1.65
## lex.CstMac                             1.71 0.95    3.06

What are the mortality rate ratios (hazard ratios)? What do they refer to: rates of what, between which groups?

We see there is a higher mortality in the Mac state but no discernible difference between the Mic and the Norm states.

It can be formally tested whether the three states carry the same mortality, using a Wald test (testing whether the Mic and Mac parameters are both 0 on the log scale):

Wald(ma, subset = "lex.Cst")
##      Chisq       d.f.          P 
## 6.15752406 2.00000000 0.04601619

What is the meaning of this test (i.e. what is the null hypothesis)?

Do you like the formal 5% significance level? What about 4.5% instead?

Now do the same analysis for the two causes of death separately, using the to argument to glm.Lexis:

# other causes of death
mo <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                      Ns(age, knots = seq(50, 80, 10)) +
                      lex.Cst,
                to = "D(oth)")
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->D(oth)
## Mic->D(oth)
## Mac->D(oth)
round(ci.exp(mo), 3)
##                                   exp(Est.)  2.5%        97.5%
## (Intercept)                           0.000 0.000 7.000000e-03
## Ns(tfi, knots = seq(0, 20, 5))1     161.560 2.938 8.882984e+03
## Ns(tfi, knots = seq(0, 20, 5))2      37.189 1.461 9.467380e+02
## Ns(tfi, knots = seq(0, 20, 5))3   39009.388 4.355 3.494245e+08
## Ns(tfi, knots = seq(0, 20, 5))4       2.086 0.345 1.263200e+01
## Ns(age, knots = seq(50, 80, 10))1     2.684 0.881 8.176000e+00
## Ns(age, knots = seq(50, 80, 10))2     1.868 0.185 1.887100e+01
## Ns(age, knots = seq(50, 80, 10))3    12.728 4.572 3.543300e+01
## lex.CstMic                            1.004 0.520 1.937000e+00
## lex.CstMac                            1.001 0.449 2.232000e+00
#
# CVD death
mC <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                      Ns(age, knots = seq(50, 80, 10)) +
                      lex.Cst,
                to = "D(CVD)")
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->D(CVD)
## Mic->D(CVD)
## Mac->D(CVD)
round(ci.exp(mC), 3)
##                                   exp(Est.)  2.5%      97.5%
## (Intercept)                           0.001 0.000      0.014
## Ns(tfi, knots = seq(0, 20, 5))1       1.171 0.164      8.344
## Ns(tfi, knots = seq(0, 20, 5))2       2.008 0.303     13.333
## Ns(tfi, knots = seq(0, 20, 5))3       1.493 0.018    126.727
## Ns(tfi, knots = seq(0, 20, 5))4       0.144 0.019      1.072
## Ns(age, knots = seq(50, 80, 10))1     6.833 1.078     43.327
## Ns(age, knots = seq(50, 80, 10))2   486.871 1.658 142982.410
## Ns(age, knots = seq(50, 80, 10))3    15.110 3.524     64.789
## lex.CstMic                            0.919 0.351      2.410
## lex.CstMac                            3.229 1.271      8.203

What is the conclusion w.r.t. the effect of albuminuria state on the two cause-specific mortality rates?

Now make formal tests of relevant hypotheses using Wald.

Wald(mo, subset = "Cst")
##        Chisq         d.f.            P 
## 0.0001452741 2.0000000000 0.9999273656
Wald(mC, subset = "Cst")
##        Chisq         d.f.            P 
## 13.785510275  2.000000000  0.001015113

What is the conclusion from these w.r.t. mortality dependence on albuminuria status?

We can show how fitted mortality rates look for persons currently in state Mic entering the study at a set of specific ages. The entry ages are in the vector L2$age (L2 is the initial Lexis object with one record per person):

summary(L2$age)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   37.39   48.52   56.60   55.13   61.06   67.50

Based on this we shall use ages (at entry) 45, 55 and 65, and show mortality rates for persons entering at these ages, as functions of their current age. We need a prediction data frame with values for all the variables in the model: time from entry (tfi), current age (age) and lex.Cst. Here expand.grid is our friend; note that we use ain for age at entry:

expand.grid(tfi = c(NA, seq(0, 20, 5)),
            ain = c(45, 55, 65))
##    tfi ain
## 1   NA  45
## 2    0  45
## 3    5  45
## 4   10  45
## 5   15  45
## 6   20  45
## 7   NA  55
## 8    0  55
## 9    5  55
## 10  10  55
## 11  15  55
## 12  20  55
## 13  NA  65
## 14   0  65
## 15   5  65
## 16  10  65
## 17  15  65
## 18  20  65

This gives all combinations of the values in the supplied vectors as a data.frame. The NAs are there for plotting purposes: we get a break in the plotted curves wherever there is an NA in the data. We want the tfi points to be closer than in this illustrative example:

prf <- transform(expand.grid(tfi = c(NA, seq(0, 19, 0.5)),
                             ain = c(45, 55, 65))[-1,],
                 age = ain + tfi,
             lex.Cst = "Mic")
prf[ 1: 5,]
##   tfi ain  age lex.Cst
## 2 0.0  45 45.0     Mic
## 3 0.5  45 45.5     Mic
## 4 1.0  45 46.0     Mic
## 5 1.5  45 46.5     Mic
## 6 2.0  45 47.0     Mic
prf[40:44,]
##    tfi ain  age lex.Cst
## 41  NA  55   NA     Mic
## 42 0.0  55 55.0     Mic
## 43 0.5  55 55.5     Mic
## 44 1.0  55 56.0     Mic
## 45 1.5  55 56.5     Mic
matshade(prf$age, cbind(ci.pred(mo, prf),
                        ci.pred(mC, prf)) * 100,
         lwd = 3, col = c("black","blue"),
         log = "y", ylim = c(0.02,20), plot = TRUE,
         xlab = "Age at follow-up (years)",
         main = "CVD mortality rates (blue) and non-CVD mortality rates (black), with 95% confidence intervals as shades", cex.main = 0.78,
         ylab = "Mortality rate per 100 PY")
abline(v = c(45, 55, 65), lty = 3)

The rates of death from other causes are very small at the beginning and increase steeply over the first 5 years of follow-up, while the CVD mortality rates are fairly stable, with a foreseeable increase by age.

Give an informal description of the curves, and a possible reason for their shape.

We can show the impact of albuminuria state on the mortality rates in a 3-panel layout:

par(mfrow=c(1,3))
for(st in c("Norm","Mic","Mac"))
   {
matshade(prf$age, pmin(pmax(
                  cbind(ci.pred(mo, transform(prf, lex.Cst = st)),
                        ci.pred(mC, transform(prf, lex.Cst = st))) * 100,
                            0.05), 60),
         lwd = 3, col = c("black","blue"),
         log = "y", ylim = c(0.1,50), plot = TRUE)
text(60, 50, st, adj = 0)
   }

How are the curves in the three panels related?

Describe the effect of albuminuria status on the two types of mortality.

How can you see this from the model parameters of the models mo and mC?

State probabilities

We would like to see how the probabilities of being in each of the states in figure 3.2 look as a function of time since entry, and we will in particular be interested in how this depends on allo, the allocation to intensified or standard treatment.

Models for transition rates

Thus we will need models for 1) the cause-specific mortality rates and 2) transition rates between albuminuria states. And of course models which all include the effect of allo (treatment allocation). We already fitted models for the mortality rates, but here we refit them in a slightly different guise, namely including the treatment allocation (allo) as covariate too.

Mortality rates

We first model the non-CVD mortality rates (transitions to D(oth)) using a proportional hazards model, allowing the mortality level to differ between the two allocation groups (allo) and the three albuminuria states (lex.Cst), including their interaction:

mix <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst * allo,
                 to = "D(oth)")
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->D(oth)
## Mic->D(oth)
## Mac->D(oth)
round(ci.exp(mix), 3)
##                                   exp(Est.)  2.5%        97.5%
## (Intercept)                           0.000 0.000 5.000000e-03
## Ns(tfi, knots = seq(0, 20, 5))1     195.021 3.367 1.129679e+04
## Ns(tfi, knots = seq(0, 20, 5))2      43.667 1.644 1.159968e+03
## Ns(tfi, knots = seq(0, 20, 5))3   59106.199 5.928 5.893319e+08
## Ns(tfi, knots = seq(0, 20, 5))4       2.617 0.429 1.595000e+01
## Ns(age, knots = seq(50, 80, 10))1     2.673 0.884 8.081000e+00
## Ns(age, knots = seq(50, 80, 10))2     1.479 0.141 1.552500e+01
## Ns(age, knots = seq(50, 80, 10))3    11.747 4.215 3.274000e+01
## lex.CstMic                            0.967 0.361 2.593000e+00
## lex.CstMac                            1.630 0.532 5.000000e+00
## alloConv                              1.797 0.591 5.463000e+00
## lex.CstMic:alloConv                   1.072 0.281 4.092000e+00
## lex.CstMac:alloConv                   0.362 0.072 1.821000e+00

We would however like to see the allocation effect on mortality separately for each albuminuria state; this is done by the “/” operator in the model formula (pronounced: allo effect within lex.Cst):

mox <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst / allo,
                 to = "D(oth)")
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->D(oth)
## Mic->D(oth)
## Mac->D(oth)
round(ci.exp(mox), 3)
##                                   exp(Est.)  2.5%        97.5%
## (Intercept)                           0.000 0.000 5.000000e-03
## Ns(tfi, knots = seq(0, 20, 5))1     195.021 3.367 1.129679e+04
## Ns(tfi, knots = seq(0, 20, 5))2      43.667 1.644 1.159968e+03
## Ns(tfi, knots = seq(0, 20, 5))3   59106.199 5.928 5.893319e+08
## Ns(tfi, knots = seq(0, 20, 5))4       2.617 0.429 1.595000e+01
## Ns(age, knots = seq(50, 80, 10))1     2.673 0.884 8.081000e+00
## Ns(age, knots = seq(50, 80, 10))2     1.479 0.141 1.552500e+01
## Ns(age, knots = seq(50, 80, 10))3    11.747 4.215 3.274000e+01
## lex.CstMic                            0.967 0.361 2.593000e+00
## lex.CstMac                            1.630 0.532 5.000000e+00
## lex.CstNorm:alloConv                  1.797 0.591 5.463000e+00
## lex.CstMic:alloConv                   1.927 0.925 4.013000e+00
## lex.CstMac:alloConv                   0.651 0.205 2.071000e+00
c(deviance(mox), deviance(mix))
## [1] 734.0855 734.0855

The identical deviances are a good indication that the two fits are actually the same model, just differently parametrized.
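
The relationship between the two parametrizations can also be seen directly from the estimates; as a sketch, the Conv vs. Int rate ratio within the Mic state from the nested model (mox) is the product of the main allo effect and the Mic:allo interaction from the crossed model (mix):

# reconstruct the within-Mic allocation effect from the crossed model mix
exp(coef(mix)["alloConv"] + coef(mix)["lex.CstMic:alloConv"])
# the same quantity directly from the nested model mox
ci.exp(mox, subset = "lex.CstMic:alloConv")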

What is the meaning of the parameters for the allo effect?

We also need a similar model for the CVD-mortality:

mCx <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst / allo,
                 to = "D(CVD)")
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->D(CVD)
## Mic->D(CVD)
## Mac->D(CVD)
round(ci.exp(mCx), 3)
##                                   exp(Est.)  2.5%      97.5%
## (Intercept)                           0.000 0.000      0.013
## Ns(tfi, knots = seq(0, 20, 5))1       1.004 0.140      7.179
## Ns(tfi, knots = seq(0, 20, 5))2       2.291 0.354     14.834
## Ns(tfi, knots = seq(0, 20, 5))3       1.361 0.016    115.800
## Ns(tfi, knots = seq(0, 20, 5))4       0.125 0.016      0.988
## Ns(age, knots = seq(50, 80, 10))1     7.289 1.122     47.353
## Ns(age, knots = seq(50, 80, 10))2   641.288 1.868 220199.762
## Ns(age, knots = seq(50, 80, 10))3    20.847 4.730     91.871
## lex.CstMic                            0.808 0.199      3.275
## lex.CstMac                            1.250 0.245      6.373
## lex.CstNorm:alloConv                  1.399 0.278      7.047
## lex.CstMic:alloConv                   1.682 0.579      4.889
## lex.CstMac:alloConv                   4.856 1.367     17.250

What is the conclusion for the intervention effect on CVD mortality rates?

Albuminuria state rates

For a complete description of the transitions in the model we also need models for the transitions between albuminuria states (the vertical arrows in figure 3.2).

We will use different models for deterioration (arrows up in figure 3.2) and improvement (arrows down) of albuminuria. Again the model specification is simplified by glm.Lexis, but now both the from and to arguments must be specified:

det <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst / allo,
                 from = c("Norm","Mic"),
                   to = c("Mic","Mac"))
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Norm->Mic
## Mic->Mac
imp <- glm.Lexis(S4, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst / allo,
                 from = c("Mac","Mic"),
                   to = c("Mic","Norm"))
## stats::glm Poisson analysis of Lexis object S4 with log link:
## Rates for transitions:
## Mac->Mic
## Mic->Norm
round(ci.exp(det), 3)
##                                   exp(Est.)  2.5%  97.5%
## (Intercept)                           0.071 0.032  0.158
## Ns(tfi, knots = seq(0, 20, 5))1       0.648 0.248  1.692
## Ns(tfi, knots = seq(0, 20, 5))2       0.276 0.079  0.961
## Ns(tfi, knots = seq(0, 20, 5))3       0.281 0.043  1.823
## Ns(tfi, knots = seq(0, 20, 5))4       0.218 0.065  0.738
## Ns(age, knots = seq(50, 80, 10))1     1.977 0.839  4.656
## Ns(age, knots = seq(50, 80, 10))2     3.588 0.948 13.574
## Ns(age, knots = seq(50, 80, 10))3     2.728 0.795  9.358
## lex.CstMic                            0.393 0.223  0.695
## lex.CstNorm:alloConv                  0.490 0.222  1.082
## lex.CstMic:alloConv                   1.965 1.178  3.278
round(ci.exp(imp), 3)
##                                   exp(Est.)  2.5% 97.5%
## (Intercept)                           0.208 0.128 0.339
## Ns(tfi, knots = seq(0, 20, 5))1       0.239 0.075 0.759
## Ns(tfi, knots = seq(0, 20, 5))2       0.069 0.011 0.420
## Ns(tfi, knots = seq(0, 20, 5))3       0.046 0.008 0.277
## Ns(tfi, knots = seq(0, 20, 5))4       0.172 0.034 0.859
## Ns(age, knots = seq(50, 80, 10))1     0.844 0.292 2.444
## Ns(age, knots = seq(50, 80, 10))2     0.342 0.069 1.690
## Ns(age, knots = seq(50, 80, 10))3     0.580 0.073 4.624
## lex.CstMac                            1.055 0.465 2.393
## lex.CstMic:alloConv                   0.526 0.324 0.855
## lex.CstMac:alloConv                   1.341 0.545 3.303
round(ci.exp(det, subset="al"), 1)
##                      exp(Est.) 2.5% 97.5%
## lex.CstNorm:alloConv       0.5  0.2   1.1
## lex.CstMic:alloConv        2.0  1.2   3.3
round(ci.exp(imp, subset="al"), 1)
##                     exp(Est.) 2.5% 97.5%
## lex.CstMic:alloConv       0.5  0.3   0.9
## lex.CstMac:alloConv       1.3  0.5   3.3

What was the meaning of “different models for det and imp”?

What do the parameters in the models represent?

What are the assumptions in the models?

Label the transitions in figure 3.2 with the models for each of the transitions.

Simulation of state probabilities

We now have statistical models for all transitions: two models for the cause-specific mortality rates, and two models for the transitions between albuminuria states. The state probabilities that can in principle be derived from these are not trivial to compute; essentially they can only be computed by simulation.

First we need an explicit specification of which transitions are described by which model, since the simulated transitions will use predictions from these models. This is specified in a list of lists (remember what a list is??).

There must be one element in the list for each transient state (of which we have 3):

Tr <- list(Norm = list("Mic" = det,
                    "D(oth)" = mox,
                    "D(CVD)" = mCx),
            Mic = list("Mac" = det,
                      "Norm" = imp,
                    "D(oth)" = mox,
                    "D(CVD)" = mCx),
            Mac = list("Mic" = imp,
                    "D(oth)" = mox,
                    "D(CVD)" = mCx))
lapply(Tr, names)
## $Norm
## [1] "Mic"    "D(oth)" "D(CVD)"
## 
## $Mic
## [1] "Mac"    "Norm"   "D(oth)" "D(CVD)"
## 
## $Mac
## [1] "Mic"    "D(oth)" "D(CVD)"

For example, the object Tr$Norm$Mic is the model det, the model for the transition rate Norm → Mic. We see that there are 10 entries in the specification of Tr, corresponding to the 10 transitions in the diagram in figure 3.2. Some of the entries in Tr point to the same model; all the models fitted were actually joint models for more than one transition, but with specific parameters representing differences between the specific transition rates modeled.
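
A small check (a sketch) that Tr contains the 10 intended transitions, and that for example the Norm → Mic entry is the det model:

# number of modeled transitions out of each transient state, and in total
sapply(Tr, length)
sum(sapply(Tr, length))
# the Norm -> Mic entry is (the same object as) the deterioration model
identical(Tr$Norm$Mic, det)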

We can use the estimated rates to simulate transitions between states for a group of people with a given set of covariates, the initial cohort. The simulated data can then be used to assess the probability of being in each of the states at a given time after entry to the study, simply by counting how many of the persons from the initial cohort are simulated to be in each state. These probabilities depend on the covariates used for modeling the transition rates, that is, any of the covariates in mox, mCx, det and imp. We can choose our initial cohort in (at least) two different ways:

  • Use a population with the same covariate distribution as the entire study population at entry (“population-averaged”)
  • Use a population with a prespecified set of covariates at entry (“conditional”).

Either way, what is needed is a data frame of persons indicating their initial status. simLexis will then simulate their individual trajectories through states (what transition takes place when) and produce a simulated cohort of persons in the form of a Lexis object. The initial (baseline) data frame should also be a Lexis object, but the values of lex.Xst and lex.dur need not be given, since these will be simulated.

Study population cohort

First construct a cohort with the same covariate distribution as the entire study for each of the allocation groups:

ini <- L2[,c("per", "age", "tfi")]
ini <- rbind(transform(ini, lex.Cst = factor("Mic"), allo = factor("Int")),
             transform(ini, lex.Cst = factor("Mic"), allo = factor("Conv")))
str(ini)
## Classes 'Lexis' and 'data.frame':    320 obs. of  5 variables:
##  $ per    : 'cal.yr' num  1993 1993 1993 1993 1993 ...
##  $ age    : 'cal.yr' num  61.1 46.6 49.9 48.5 57.3 ...
##  $ tfi    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ lex.Cst: Factor w/ 1 level "Mic": 1 1 1 1 1 1 1 1 1 1 ...
##  $ allo   : Factor w/ 2 levels "Int","Conv": 1 1 1 1 1 1 1 1 1 1 ...
##  - attr(*, "breaks")=List of 3
##   ..$ per: NULL
##   ..$ age: NULL
##   ..$ tfi: NULL
##  - attr(*, "time.scales")= chr [1:3] "per" "age" "tfi"
##  - attr(*, "time.since")= chr [1:3] "" "" ""

These will be the initial values in the cohort we follow through states; we have the starting state in lex.Cst and the covariates (at start): the timescales (per, age, tfi) and the other covariate, allo.

First we simulate follow-up for a large cohort that looks like the study population, using 100 copies of each record in ini (see ?simLexis):

?simLexis
## starting httpd help server ... done
set.seed(1952)
system.time(
Sorg <- simLexis(Tr = Tr,  # models for each transition
               init = ini, # cohort of starters
                  N = 100, # how many copies of each person in ini
            t.range = 21,  # how long should we simulate before censoring
              n.int = 200))# how many intervals for evaluating rates
##    user  system elapsed 
##  110.04   10.50  120.56

There is no guaranteed order of the states in the Sorg object, so we explicitly reorder the states:

Sorg <- Relevel(Sorg, c("Norm", "Mic", "Mac", "D(CVD)", "D(oth)"))
summary(Sorg, by = "allo")
## $Int
##       
## Transitions:
##      To
## From    Norm   Mic  Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm  2293  5472    0    648   1324      9737     7444   76184.55      8601
##   Mic   9737  4202 4916   1258   2590     22703    18501  151493.64     16000
##   Mac      0  1231 1523    619   1543      4916     3393   33953.90      4674
##   Sum  12030 10905 6439   2525   5457     37356    29338  261632.09     16000
## 
## $Conv
##       
## Transitions:
##      To
## From   Norm  Mic  Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm 1460 1574    0    597   1457      5088     3628   47906.47      4927
##   Mic  5088 2237 8302   1479   3169     20275    18038  129297.70     16000
##   Mac     0 2701 1439   3154   1008      8302     6863   45188.18      7404
##   Sum  6548 6512 9741   5230   5634     33665    28529  222392.35     16000
subset(Sorg, lex.id %in% 28:32)
##  lex.id     per   age   tfi lex.dur lex.Cst lex.Xst allo     cens
##      28 1993.33 61.05  0.00    1.16     Mic    Norm  Int 2014.326
##      28 1994.49 62.21  1.16    2.31    Norm     Mic  Int 2014.326
##      28 1996.79 64.52  3.47    3.13     Mic  D(oth)  Int 2014.326
##      29 1993.33 61.05  0.00    3.17     Mic    Norm  Int 2014.326
##      29 1996.50 64.22  3.17   14.87    Norm     Mic  Int 2014.326
##      29 2011.37 79.10 18.04    2.96     Mic     Mic  Int 2014.326
##      30 1993.33 61.05  0.00    8.20     Mic  D(CVD)  Int 2014.326
##      31 1993.33 61.05  0.00    9.08     Mic    Norm  Int 2014.326
##      31 2002.40 70.13  9.08    3.40    Norm     Mic  Int 2014.326
##      31 2005.80 73.53 12.48    1.90     Mic  D(CVD)  Int 2014.326
##      32 1993.33 61.05  0.00    3.15     Mic     Mac  Int 2014.326
##      32 1996.48 64.20  3.15    0.84     Mac     Mic  Int 2014.326
##      32 1997.32 65.05  4.00   11.54     Mic  D(CVD)  Int 2014.326

Describe in words how the simulated data looks, and what each record represents.

What is it really we simulated?

addmargins(table(table(Sorg$lex.id)))
## 
##     1     2     3     4     5     6     7     8     9   Sum 
##  8416 13680  5765  3125   708   248    44    12     2 32000

What does this table mean?

Now we can simply count how many of the 32,000 simulated persons are in each of the states at each of a set of times since entry; this is done by the function nState:

system.time(
Nst <- nState(Sorg,
                at = seq(0, 20, 0.2),
              from = 0,
        time.scale = "tfi"))
##    user  system elapsed 
##   10.98    0.15   11.14
str(Nst)
##  'table' int [1:101, 1:5] 0 873 1625 2346 2985 3547 4056 4585 5055 5477 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ when : chr [1:101] "0" "0.2" "0.4" "0.6" ...
##   ..$ State: chr [1:5] "Norm" "Mic" "Mac" "D(CVD)" ...
head(Nst)
##      State
## when   Norm   Mic   Mac D(CVD) D(oth)
##   0       0 32000     0      0      0
##   0.2   873 30754   350     23      0
##   0.4  1625 29711   611     53      0
##   0.6  2346 28678   889     84      3
##   0.8  2985 27739  1160    113      3
##   1    3547 26903  1408    136      6

This is however not necessarily a relevant summary; we would be interested in seeing how things look in each of the allocation groups, Int and Conv.

Nint <- nState(subset(Sorg, allo == "Int"),
               at = seq(0, 20.2, 0.1),
             from = 0,
       time.scale = "tfi")
Nconv<- nState(subset(Sorg, allo == "Conv"),
               at = seq(0, 20.2, 0.1),
             from = 0,
       time.scale = "tfi")
cbind(
head(Nint), NA,
head(Nconv))
##     Norm   Mic Mac D(CVD) D(oth)    Norm   Mic Mac D(CVD) D(oth)
## 0      0 16000   0      0      0 NA    0 16000   0      0      0
## 0.1  294 15629  71      6      0 NA  157 15723 115      5      0
## 0.2  575 15287 129      9      0 NA  298 15467 221     14      0
## 0.3  820 14995 170     15      0 NA  405 15270 304     21      0
## 0.4 1091 14656 232     21      0 NA  534 15055 379     32      0
## 0.5 1327 14364 278     29      2 NA  676 14793 491     39      1

If we divide each of these counts by the number of persons, we get the probabilities of being in each of the states at the different times since entry.
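
As a minimal sketch (assuming the 16,000 simulated persons per allocation group above), dividing the counts by 16,000 gives the state probabilities; cumulating these across states (row-wise) is exactly what pState below does:

# state probabilities in the Int group
head(Nint / 16000)
# cumulated over states; compare with head(Pint) further down
head(t(apply(Nint / 16000, 1, cumsum)))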

If we want the state probabilities cumulated over states, we can derive these with pState, which yields a matrix of the cumulative state probabilities.

Pint  <- pState(Nint )
Pconv <- pState(Nconv)
 str(Pint)
##  'pState' num [1:203, 1:5] 0 0.0184 0.0359 0.0512 0.0682 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ when : chr [1:203] "0" "0.1" "0.2" "0.3" ...
##   ..$ State: chr [1:5] "Norm" "Mic" "Mac" "D(CVD)" ...
head(Pint)
##      State
## when       Norm       Mic       Mac   D(CVD) D(oth)
##   0   0.0000000 1.0000000 1.0000000 1.000000      1
##   0.1 0.0183750 0.9951875 0.9996250 1.000000      1
##   0.2 0.0359375 0.9913750 0.9994375 1.000000      1
##   0.3 0.0512500 0.9884375 0.9990625 1.000000      1
##   0.4 0.0681875 0.9841875 0.9986875 1.000000      1
##   0.5 0.0829375 0.9806875 0.9980625 0.999875      1

Describe the structure of Pint.

There is a standard plotting method for a pState object; it plots the state probabilities stacked in the order given by the perm argument (not used here because the states are already in the order we want):

clr <- c("forestgreen", "orange", "red", "blue", gray(0.4))
par(mfrow = c(1,2), mar=c(3,3,2,2))
plot(Pint , col = clr, xlim = c(0, 20))
plot(Pconv, col = clr, xlim = c(20, 0))

A more sophisticated version:

clr <- c("forestgreen", "orange", "red", "blue", gray(0.4))
par(mfrow = c(1,2), mar=c(3,3,2,2))

plot(Pint, col = clr, xlim = c(0, 20))
# the survival curve
lines(as.numeric(rownames(Pint)), Pint[,"Mac"], lwd = 3, col = "black")
lines(as.numeric(rownames(Pint)), Pint[,"Mac"], lwd = 1, col = "white")
text(rownames(Pint)[150],
     Pint[150,] - diff(c(0, Pint[150,]))/2,
     colnames(Pint), col = "white", cex = 0.8)

plot(Pconv, col = clr, xlim = c(20, 0))
# the survival curve
lines(as.numeric(rownames(Pconv)), Pconv[,"Mac"], lwd = 3, col = "black")
lines(as.numeric(rownames(Pconv)), Pconv[,"Mac"], lwd = 1, col = "white")
text(rownames(Pconv)[150],
     Pconv[150,] - diff(c(0, Pconv[150,]))/2,
     colnames(Pint), col = "white", cex = 0.8)

mtext(c("Intensive care","Conventional care"),
      side = 3, at = c(1,3)/4, outer = TRUE, line = -2)

The figure shows the state probabilities for the two intervention groups, for populations of the same structure as the original, total Steno2 population.

Redo the plot with proper labeling of axes, including units where needed.

Describe the results and conclude on the probabilities shown.

The plot in figure 3.5 may be of limited interest; the probabilities shown are really “the probability that a randomly chosen person from the Steno 2 study...”. So we are referring to a universe that is not generalizable; the reference is to the particular distribution of ages at entry etc. in the study. The plot is only partially relevant for showing the intervention effect, and the absolute sizes of the state probabilities are, strictly speaking, irrelevant.

Initiation cohort with predefined variables

Even if we take the modeling background seriously and accept that occurrence rates depend only on current age (age), time since entry (tfi) and treatment allocation (allo), the assumption of an age-distribution as in the Steno 2 study is quite absurd; who would want to refer to this? Often this is disguised in terms such as “population averaged”.

Therefore, it would be more relevant to show the results for a homogeneous population of persons at selected ages at entry. This just requires a different init data frame. Note that it must be a Lexis object, most easily obtained by copying a few records from S4:

ini <- S4[1:10,c("lex.id", "per", "age", "tfi", "lex.Cst", "allo")]
ini[,"lex.id"]  <- 1:10
ini[,"per"]     <- 1993 # not used but it is a time scale in S4
ini[,"age"]     <-
ini[,"ain"]     <- rep(seq(45,65,5), 2)
ini[,"tfi"]     <- 0
ini[,"lex.Cst"] <- factor("Mic",
                          levels = c("Norm","Mic","Mac","D(CVD)","D(oth)"))
ini[,"allo"]    <- factor(rep(c("Int","Conv"), each = 5))
ini
##  lex.id  per age tfi lex.Cst allo ain
##       1 1993  45   0     Mic  Int  45
##       2 1993  50   0     Mic  Int  50
##       3 1993  55   0     Mic  Int  55
##       4 1993  60   0     Mic  Int  60
##       5 1993  65   0     Mic  Int  65
##       6 1993  45   0     Mic Conv  45
##       7 1993  50   0     Mic Conv  50
##       8 1993  55   0     Mic Conv  55
##       9 1993  60   0     Mic Conv  60
##      10 1993  65   0     Mic Conv  65
str(ini)
## Classes 'Lexis' and 'data.frame':    10 obs. of  7 variables:
##  $ lex.id : int  1 2 3 4 5 6 7 8 9 10
##  $ per    : num  1993 1993 1993 1993 1993 ...
##  $ age    : num  45 50 55 60 65 45 50 55 60 65
##  $ tfi    : num  0 0 0 0 0 0 0 0 0 0
##  $ lex.Cst: Factor w/ 5 levels "Norm","Mic","Mac",..: 2 2 2 2 2 2 2 2 2 2
##  $ allo   : Factor w/ 2 levels "Conv","Int": 2 2 2 2 2 1 1 1 1 1
##  $ ain    : num  45 50 55 60 65 45 50 55 60 65
##  - attr(*, "time.scales")= chr [1:3] "per" "age" "tfi"
##  - attr(*, "time.since")= chr [1:3] "" "" ""
##  - attr(*, "breaks")=List of 3
##   ..$ per: NULL
##   ..$ age: NULL
##   ..$ tfi: num [1:301] 0 0.0833 0.1667 0.25 0.3333 ...

Note that it is important to enter the variable lex.Cst as a factor with the same levels as in the Lexis object S4, in the order we want the states reported, because ini will be the basis for the predictions of rates needed in the simulation. Further, allo must also be entered as a factor, otherwise it is not possible to compute predictions from the models, because allo was included as a factor in the models.
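
A small defensive check (a sketch) that the factors in ini are compatible with those used when fitting the models; note that the level order of allo in ini does not matter for prediction, only that all its values are valid levels:

# lex.Cst must have the same levels as in S4; allo values must be known levels
identical(levels(ini$lex.Cst), levels(S4$lex.Cst))
all(ini$allo %in% levels(S4$allo))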

For each of these combinations of age (at entry) and treatment allocation we will simulate 1,000 persons (note that we are using the same transition rates, i.e. the models in Tr):

system.time(
Sdef <- simLexis(Tr = Tr,
               init = ini,
                  N = 1000,
            t.range = 21,
              n.int = 200))
##    user  system elapsed 
##   33.86    2.22   36.08
summary(Sdef, by = "allo")
## $Conv
##       
## Transitions:
##      To
## From   Norm  Mic  Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm  455  496    0    175    426      1552     1097   14534.05      1498
##   Mic  1552  726 2591    432   1021      6322     5596   41195.79      5000
##   Mac     0  826  520    941    304      2591     2071   14578.57      2316
##   Sum  2007 2048 3111   1548   1751     10465     8764   70308.41      5000
## 
## $Int
##       
## Transitions:
##      To
## From   Norm  Mic  Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm  706 1648    0    216    396      2966     2260   22922.78      2652
##   Mic  2966 1362 1553    381    775      7037     5675   48019.31      5000
##   Mac     0  389  493    186    485      1553     1060   11131.40      1488
##   Sum  3672 3399 2046    783   1656     11556     8995   82073.49      5000
subset(Sdef, lex.id < 5)
##  lex.id     per   age  tfi lex.dur lex.Cst lex.Xst allo ain cens
##       1 1993.00 45.00 0.00    2.76     Mic    Norm  Int  45 2014
##       1 1995.76 47.76 2.76    0.29    Norm  D(oth)  Int  45 2014
##       2 1993.00 45.00 0.00    4.50     Mic     Mac  Int  45 2014
##       2 1997.50 49.50 4.50    2.33     Mac  D(oth)  Int  45 2014
##       3 1993.00 45.00 0.00    2.51     Mic    Norm  Int  45 2014
##       3 1995.51 47.51 2.51   18.49    Norm    Norm  Int  45 2014
##       4 1993.00 45.00 0.00    7.19     Mic    Norm  Int  45 2014
##       4 2000.19 52.19 7.19   13.81    Norm    Norm  Int  45 2014

In real applications we would use 5000 or 10,000 replicates of each to minimize the simulation error.

Now we will repeat the graph above, but for the 10 combinations of age at enrollment (ain) and allocation; we start with the 45-year-olds allocated to Int:

P45i <- nState(subset(Sdef, ain == 45 & allo == "Int"),
               at = seq(0, 20, 0.1),
             from = 0,
       time.scale = "tfi")
head(P45i)
##      State
## when  Norm  Mic  Mac D(CVD) D(oth)
##   0      0 1000    0      0      0
##   0.1   20  980    0      0      0
##   0.2   44  953    3      0      0
##   0.3   57  939    4      0      0
##   0.4   78  915    7      0      0
##   0.5   93  900    7      0      0
head(pState(P45i))
##      State
## when   Norm   Mic Mac D(CVD) D(oth)
##   0   0.000 1.000   1      1      1
##   0.1 0.020 1.000   1      1      1
##   0.2 0.044 0.997   1      1      1
##   0.3 0.057 0.996   1      1      1
##   0.4 0.078 0.993   1      1      1
##   0.5 0.093 0.993   1      1      1

This should then be repeated for the four other ages at enrollment and both allocations; this time we store only the state probabilities (via pState):

P45c <- pState(nState(subset(Sdef, ain == 45 & allo == "Conv"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P45i <- pState(nState(subset(Sdef, ain == 45 & allo == "Int"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P50c <- pState(nState(subset(Sdef, ain == 50 & allo == "Conv"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P50i <- pState(nState(subset(Sdef, ain == 50 & allo == "Int"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P55c <- pState(nState(subset(Sdef, ain == 55 & allo == "Conv"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P55i <- pState(nState(subset(Sdef, ain == 55 & allo == "Int"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P60c <- pState(nState(subset(Sdef, ain == 60 & allo == "Conv"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P60i <- pState(nState(subset(Sdef, ain == 60 & allo == "Int"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P65c <- pState(nState(subset(Sdef, ain == 65 & allo == "Conv"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))
P65i <- pState(nState(subset(Sdef, ain == 65 & allo == "Int"),
               at = seq(0, 21, 0.1),
             from = 0,
       time.scale = "tfi"))

Then we can plot these in a multiple layout:

par(mfrow = c(5,2), mar = c(1,1,0,0),
      oma = c(3,3,1,0), mgp=c(3,1,0)/1.6)

plot(P45i, col = clr, xlim = c(0,20))
plot(P45c, col = clr, xlim = c(20,0))

plot(P50i, col = clr, xlim = c(0,20))
plot(P50c, col = clr, xlim = c(20,0))

plot(P55i, col = clr, xlim = c(0,20))
plot(P55c, col = clr, xlim = c(20,0))

plot(P60i, col = clr, xlim = c(0,20))
plot(P60c, col = clr, xlim = c(20,0))

plot(P65i, col = clr, xlim = c(0,20))
plot(P65c, col = clr, xlim = c(20,0))

mtext(c("Int","Conv"), side = 3, at = c(1,3)/4, outer = TRUE, line = 0)
mtext(paste(seq(45,65,5)), side = 2, at = (5:1*2-1)/10,
      outer = TRUE, line = 0)

The figure shows the predicted probabilities of being in each of the states for persons aged 45, 50, 55, 60 and 65 at entry, separately for the two intervention groups.

We see that the curves are quite ragged; this is due to simulation error, and they would be smoother if we simulated more copies of each person.

Digression: the previous approach involves a lot of hard-coding; we would like to be able to easily make a plot for only a subset of the ages. To this end it is more convenient to collect the state probabilities in an array:

(ain <- seq(45, 65, 5))
## [1] 45 50 55 60 65
(alo <- levels(S4$allo))
## [1] "Int"  "Conv"
pdef <- NArray(c(list(ain = ain,
                     allo = alo),
                 dimnames(P45i)))
str(pdef)
##  logi [1:5, 1:2, 1:211, 1:5] NA NA NA NA NA NA ...
##  - attr(*, "dimnames")=List of 4
##   ..$ ain  : chr [1:5] "45" "50" "55" "60" ...
##   ..$ allo : chr [1:2] "Int" "Conv"
##   ..$ when : chr [1:211] "0" "0.1" "0.2" "0.3" ...
##   ..$ State: chr [1:5] "Norm" "Mic" "Mac" "D(CVD)" ...

But when we stick the results into an array we lose the pState class, so we resort to the mat2pol function, which stacks the probabilities and plots them; we simply take the result from nState and divide by the number of persons in the initial state (Mic) using sweep:

for(aa in ain)
for(gg in alo)
   pdef[paste(aa), gg, ,] <-
   nState(subset(Sdef, ain == aa & allo == gg),
          at = as.numeric(dimnames(pdef)[["when"]]),
        from = 0,
  time.scale = "tfi")
pdef <- sweep(pdef, 1:2, pdef[,,1,"Mic"], "/")
str(pdef)
##  num [1:5, 1:2, 1:211, 1:5] 0 0 0 0 0 0 0 0 0 0 ...
##  - attr(*, "dimnames")=List of 4
##   ..$ ain  : chr [1:5] "45" "50" "55" "60" ...
##   ..$ allo : chr [1:2] "Int" "Conv"
##   ..$ when : chr [1:211] "0" "0.1" "0.2" "0.3" ...
##   ..$ State: chr [1:5] "Norm" "Mic" "Mac" "D(CVD)" ...

Then we have the state probabilities in the array pdef, and can plot them for any subset of the entry ages:

ain <- seq(45, 65, 10)
par(mfrow = c(length(ain),2),
      mar = c(1,2,1,2)/5,
      oma = c(2,4,2,3),
      mgp = c(3,1,0) / 1.6)
for(aa in ain)
   {
mat2pol(pdef[paste(aa),"Int" ,,], col = clr, xlim = c(0,20),
        xaxs = "i", yaxs = "i", xaxt = "n", yaxt = "n")
axis(side = 2)
axis(side = 4, labels = NA)
mat2pol(pdef[paste(aa),"Conv",,], col = clr, xlim = c(20,0),
        xaxs = "i", yaxs = "i", xaxt = "n", yaxt = "n")
axis(side = 2, labels = NA)
axis(side = 4)
   }
mtext(c("Int","Conv"), side = 3, at = c(1,3)/4, outer = TRUE, line = 0)
mtext(ain, side = 2, at = (length(ain):1 * 2 - 1) / (length(ain) * 2),
      outer = TRUE, line = 2)

The figure shows the predicted probabilities of being in each of the states for persons aged 45, 55 and 65 at entry, separately for the two intervention groups.

State probabilities using the Aalen-Johansen approach from survival

The survival package allows estimation of state probabilities by the Aalen-Johansen estimator, similar to what we did for competing risks. The Aalen-Johansen estimator is the multi-state counterpart of the Kaplan-Meier estimator of the survival probability in a 2-state (alive/dead) model. As mentioned under competing risks, the results refer to a population of the same structure as the study population, so the absolute sizes of the state probabilities will not be generalizable to other populations. The results here correspond to the results we derived using the original Steno2 population cohort in section 3.3.2.1 on page 54 ff. The estimates of state probabilities in section 3.3.2.1 are based on parametric models for the transition rates, where some of the rates depend on age and duration in the same way. The estimates from the Aalen-Johansen approach are non-parametric in the sense that the transition rates can have any shape; the downside is that they cannot depend on more than one time scale (here, sensibly, time since entry), and that their shape and size are not easily retrievable.

In the Epi package there is a function, AaJ.Lexis, that does the hard work of reshaping a Lexis object for use by survfit, and then calls survfit to return an object of class survfitms:

AaJepi <- AaJ.Lexis(L4, timeScale = "tfi")
## NOTE: Timescale is tfi
class(AaJepi)
## [1] "survfitms" "survfit"
AaJepi
## Call: survfit(formula = form, data = Lx, id = lex.id, istate = lex.Cst)
## 
##          n nevent   rmean*
## Norm   730     72 3.718799
## Mic    730     57 9.173179
## Mac    730     65 2.574825
## D(CVD) 730     38 2.855333
## D(oth) 730     55 3.580671
##    *restricted mean time in state (max time = 21.90281 )

We find the same numbers of transitions between different states in the transitions slot of the result from AaJ.Lexis as in summary.Lexis (as we should):

AaJepi$transitions
##         to
## from     Norm Mic Mac D(CVD) D(oth) (censored)
##   Norm     72  35   0      6     13         18
##   Mic      72 276  65     14     30         36
##   Mac       0  22  28     18     12         13
##   D(CVD)    0   0   0      0      0          0
##   D(oth)    0   0   0      0      0          0
summary(L4, simp = FALSE)
##         
## Transitions:
##      To
## From     Norm Mic Mac D(CVD) D(oth)  Records:  Events: Risk time:  Persons:
##   Norm     90  35   0      6     13       144       54     581.04        66
##   Mic      72 312  65     14     30       493      181    1435.14       160
##   Mac       0  22  41     18     12        93       52     400.41        60
##   D(CVD)    0   0   0      0      0         0        0         NA        NA
##   D(oth)    0   0   0      0      0         0        0         NA        NA
##   Sum     162 369 106     38     55       730      287    2416.59       160

The predicted state probabilities are in the slot called pstate, and the confidence intervals in the corresponding slots lower and upper.

names(AaJepi)
##  [1] "n"           "time"        "n.risk"      "n.event"     "n.censor"   
##  [6] "pstate"      "p0"          "cumhaz"      "std.err"     "sp0"        
## [11] "logse"       "transitions" "lower"       "upper"       "conf.type"  
## [16] "conf.int"    "states"      "type"        "call"
AaJepi$states
## [1] "Norm"   "Mic"    "Mac"    "D(CVD)" "D(oth)"
head(AaJepi$pstate)
##         [,1]    [,2]    [,3] [,4] [,5]
## [1,] 0.00000 0.99375 0.00625    0    0
## [2,] 0.00625 0.98750 0.00625    0    0
## [3,] 0.00625 0.98750 0.00625    0    0
## [4,] 0.01250 0.98125 0.00625    0    0
## [5,] 0.01250 0.98125 0.00625    0    0
## [6,] 0.01250 0.98125 0.00625    0    0
head(AaJepi$lower)
##              [,1]      [,2]         [,3] [,4] [,5]
## [1,]           NA 0.9816133 0.0008858142   NA   NA
## [2,] 0.0008858142 0.9704340 0.0008858142   NA   NA
## [3,] 0.0008858142 0.9704340 0.0008858142   NA   NA
## [4,] 0.0031535032 0.9604561 0.0008858142   NA   NA
## [5,] 0.0031535032 0.9604561 0.0008858142   NA   NA
## [6,] 0.0031535032 0.9604561 0.0008858142   NA   NA
head(AaJepi$upper)
##            [,1] [,2]       [,3] [,4] [,5]
## [1,]         NA    1 0.04409785   NA   NA
## [2,] 0.04409785    1 0.04409785   NA   NA
## [3,] 0.04409785    1 0.04409785   NA   NA
## [4,] 0.04954807    1 0.04409785   NA   NA
## [5,] 0.04954807    1 0.04409785   NA   NA
## [6,] 0.04954807    1 0.04409785   NA   NA

We can now show the Aalen-Johansen estimator of the state probabilities:

par(mfrow = c(1,1))
mat2pol(AaJepi$pstate, perm = c(1:3, 5, 4),
        x = AaJepi$time, col = clr,
        main = "Overall state probabilities from the Aalen-Johansen model")
lines(AaJepi$time, apply(AaJepi$pstate[,1:3], 1, sum), lwd = 5)

But as above, we are interested in seeing the results from each of the allocation groups, so we do the calculation for each:

AaJallo <- AaJ.Lexis(L4, ~ allo, timeScale = "tfi")
## NOTE: Timescale is tfi
AaJallo
## Call: survfit(formula = form, data = Lx, id = lex.id, istate = lex.Cst)
## 
##                     n nevent   rmean*
## allo=Int, Norm    384     47 4.794785
## allo=Conv, Norm   346     25 2.640688
## allo=Int, Mic     384     34 9.933409
## allo=Conv, Mic    346     23 8.401154
## allo=Int, Mac     384     23 2.073443
## allo=Conv, Mac    346     42 3.077278
## allo=Int, D(CVD)  384     12 2.024196
## allo=Conv, D(CVD) 346     26 3.691880
## allo=Int, D(oth)  384     26 3.076973
## allo=Conv, D(oth) 346     29 4.091806
##    *restricted mean time in state (max time = 21.90281 )

The results in the AaJallo object are stored as one long vector time and one long matrix pstate, the two parts corresponding to Int and Conv stacked after one another; the length of each part is given in the strata component.

AaJallo$states
## [1] "Norm"   "Mic"    "Mac"    "D(CVD)" "D(oth)"
AaJallo$strata
##  allo=Int allo=Conv 
##       375       337
wh <- rep(substr(names(AaJallo$strata), 6, 9), AaJallo$strata)
table(wh)
## wh
## Conv  Int 
##  337  375
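As an aside, the summary method from the survival package can also pick out state probabilities at chosen times, one block per stratum; a small sketch using the AaJallo object (the times 5, 10 and 15 years are arbitrary):

summary(AaJallo, times = c(5, 10, 15))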

So we just make the plots for the two subsets and place them next to each other as before:

par(mfrow = c(1,2), mar=c(3,3,2,0), oma = c(0,0,0,3), las = 1, bty = "n")
mat2pol(AaJallo$pstate[wh=="Int",],
        x = AaJallo$time[wh=="Int"],
        col = clr, xlim = c(0,21), xaxs = "i", yaxs = "i")
lines(AaJallo$time[wh=="Int"],
      apply(AaJallo$pstate[,1:3], 1, sum)[wh=="Int"], lwd = 4)

mat2pol(AaJallo$pstate[wh=="Conv",],
        x = AaJallo$time[wh=="Conv"],
        col = clr, xlim = c(21,0), xaxs = "i", yaxs = "i")
lines(AaJallo$time[wh=="Conv"],
      apply(AaJallo$pstate[,1:3], 1, sum)[wh=="Conv"], lwd = 4)
axis(side = 4)
mtext(c("Int","Conv"), side = 3, at = c(1,3)/4, outer = TRUE, line = -2)

The figure shows the Aalen-Johansen estimator of the state probabilities for the original Steno2 data, subdivided by intervention allocation (Int in the left panel, Conv in the right).

The estimator of the state probabilities in figure 3.9 can be considered the empirical counterpart of figure 3.5: the state probabilities for a population like the one in the study. But not quite so: the models underlying figure 3.5 are proportional hazards models in the sense that the effects of age and time since enrollment are proportional across combinations of state and allocation (6 groups for mortality, 4 groups for the albuminuria states), whereas the curves in figure 3.9 are based on separate models for each transition and allocation, with time since enrollment as the only timescale (no age effect assumed).

We have confidence intervals for each of the state probabilities in the slots lower and upper, but not for sums of these, and it is the (cumulative) sums of state probabilities that we have shown in the graph. Moreover, we would also want confidence intervals for the areas under the curves. Neither is available from the Aalen-Johansen approach, nor from the simulation approach; the simulation approach does not even give confidence intervals for the individual state probabilities.

Time spent in albuminuria states

Besides the state probabilities at different times after entry for groups of patients, we may also want to assess the time spent in each state during, say, the first 15 or 20 years after entry; that is, to compare groups by the expected time spent in the albuminuria states during the first, say, 20 years. The expected time in a state is simply the time-integral of the state probability, so we could compute it from pdef; there each probability represents an interval of length 0.1, so we just average the probabilities from pstate at the two ends of each interval and sum these averages times the interval length.
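The object pdef is not shown here, but as an illustration of the idea, here is a minimal sketch of such a numerical integration applied instead to the Aalen-Johansen estimates in AaJepi, restricted to the first 20 years (a trapezoidal rule over the, here unequally spaced, event times):

keep <- AaJepi$time <= 20
tt  <- AaJepi$time  [keep]
pp  <- AaJepi$pstate[keep, ]
mid <- (pp[-1, ] + pp[-nrow(pp), ]) / 2    # average probability over each interval
ery <- colSums(mid * diff(tt))             # times the interval lengths, summed
round(setNames(ery, AaJepi$states), 2)     # expected years in each state, first 20 years

These numbers should be broadly comparable to the restricted mean times in state printed for AaJepi above, which use the maximal follow-up time of about 21.9 years rather than 20.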

This is, however, a bit like bringing coal to Newcastle: we have the simulated cohort Sdef with simulated sojourn times in each state, so we can simply sum these and use them as estimates of the time spent in each state.

tLive <- xtabs(lex.dur ~ ain + allo + lex.Cst, data = Sdef) /
         nid(Sdef) * 10
str(mtLive <- addmargins(tLive, 3))
##  'table' num [1:5, 1:2, 1:6] 4.69 3.56 3.05 1.94 1.3 ...
##  - attr(*, "dimnames")=List of 3
##   ..$ ain    : chr [1:5] "45" "50" "55" "60" ...
##   ..$ allo   : chr [1:2] "Conv" "Int"
##   ..$ lex.Cst: chr [1:6] "Norm" "Mic" "Mac" "D(CVD)" ...
round(ftable(mtLive          , col.vars = c(3,2)), 1)
##     lex.Cst Norm       Mic       Mac      D(CVD)      D(oth)       Sum     
##     allo    Conv  Int Conv  Int Conv  Int   Conv  Int   Conv  Int Conv  Int
## ain                                                                        
## 45           4.7  7.5  9.7  9.5  3.0  1.8    0.0  0.0    0.0  0.0 17.4 18.7
## 50           3.6  5.7  9.8 10.5  3.4  2.3    0.0  0.0    0.0  0.0 16.8 18.4
## 55           3.0  4.1  8.5 10.9  3.2  2.5    0.0  0.0    0.0  0.0 14.8 17.6
## 60           1.9  3.2  7.4  9.2  2.7  2.6    0.0  0.0    0.0  0.0 12.0 15.0
## 65           1.3  2.4  5.8  8.0  2.3  2.0    0.0  0.0    0.0  0.0  9.4 12.4
round(ftable(mtLive[,,-(4:5)], col.vars = c(3,2)), 1)
##     lex.Cst Norm       Mic       Mac       Sum     
##     allo    Conv  Int Conv  Int Conv  Int Conv  Int
## ain                                                
## 45           4.7  7.5  9.7  9.5  3.0  1.8 17.4 18.7
## 50           3.6  5.7  9.8 10.5  3.4  2.3 16.8 18.4
## 55           3.0  4.1  8.5 10.9  3.2  2.5 14.8 17.6
## 60           1.9  3.2  7.4  9.2  2.7  2.6 12.0 15.0
## 65           1.3  2.4  5.8  8.0  2.3  2.0  9.4 12.4

With the results in an array we can also easily show the difference between the intervention and the control arms of the trial:

round((mtLive[,"Int",-(4:5)] - mtLive[,"Conv",-(4:5)]), 1)
##     lex.Cst
## ain  Norm  Mic  Mac  Sum
##   45  2.8 -0.2 -1.2  1.3
##   50  2.2  0.6 -1.1  1.7
##   55  1.1  2.4 -0.7  2.8
##   60  1.3  1.8 -0.1  3.0
##   65  1.1  2.2 -0.3  3.0

We might also want to know the lifetime lost (during the first 20 years) to each of the two causes of death. This timespan is not directly available in Sdef; it is the time from death until 20 years (the time frame we have chosen). For persons who die, the time of death is tfi + lex.dur from the record with lex.Xst ∈ {D(oth), D(CVD)}, so the lost lifetime is quite easily evaluated with xtabs too:

tDead <- xtabs((20 - tfi - lex.dur) ~ ain + allo + lex.Xst,
         data = subset(Sdef, lex.Xst %in% c("D(oth)", "D(CVD)"))) /
         nid(Sdef) * 10
str(mtDead <- addmargins(tDead[,,4:5], 3))
##  'table' num [1:5, 1:2, 1:3] 0.439 1.767 3.069 4.327 5.738 ...
##  - attr(*, "dimnames")=List of 3
##   ..$ ain    : chr [1:5] "45" "50" "55" "60" ...
##   ..$ allo   : chr [1:2] "Conv" "Int"
##   ..$ lex.Xst: chr [1:3] "D(CVD)" "D(oth)" "Sum"
round(ftable(mtDead          , col.vars = c(3,2)), 1)
##     lex.Xst D(CVD)      D(oth)       Sum     
##     allo      Conv  Int   Conv  Int Conv  Int
## ain                                          
## 45             0.4  0.3    2.8  1.8  3.3  2.1
## 50             1.8  0.8    2.0  1.4  3.8  2.3
## 55             3.1  1.2    2.5  1.8  5.6  3.0
## 60             4.3  2.1    3.8  3.3  8.1  5.4
## 65             5.7  2.8    4.9  5.0 10.6  7.8

The differences in these numbers between the intervention and the control group are the years gained by the intervention (with a minus sign, since fewer years are lost in the intervention group), subdivided by cause of death:

round((mtDead[,"Int",] - mtDead[,"Conv",]), 1)
##     lex.Xst
## ain  D(CVD) D(oth)  Sum
##   45   -0.2   -1.0 -1.2
##   50   -0.9   -0.6 -1.5
##   55   -1.8   -0.7 -2.5
##   60   -2.3   -0.5 -2.8
##   65   -3.0    0.1 -2.9

Draw a conclusion from these numbers.

Clinical variables

So far we have only considered covariates whose values we know at any time point, including future time points: the allocation status, and timescales such as age and time since inclusion.

In the dataset st2clin are clinical measurements taken at different dates, up to six different occasions per person:

data(st2clin)
str(st2clin)
## 'data.frame':    750 obs. of  5 variables:
##  $ id  : num  1 2 3 4 5 6 7 8 9 10 ...
##  $ doV : Date, format: "1993-05-07" "1993-05-05" ...
##  $ a1c : num  87.3 66.5 73 61.2 102.7 ...
##  $ chol: num  3.9 6.6 5.6 5.2 6 4.8 8.6 5.1 4.2 5.4 ...
##  $ crea: num  83 83 68 97 149 55 56 78 123 79 ...
st2clin <- cal.yr(st2clin)
names(st2clin)
## [1] "id"   "doV"  "a1c"  "chol" "crea"
names(st2clin)[1:2] <- c("lex.id","per")
summary(st2clin)
##      lex.id            per            a1c              chol       
##  Min.   :  1.00   Min.   :1993   Min.   : 32.80   Min.   : 2.200  
##  1st Qu.: 39.00   1st Qu.:1995   1st Qu.: 54.80   1st Qu.: 4.000  
##  Median : 84.50   Median :1997   Median : 66.35   Median : 4.800  
##  Mean   : 85.81   Mean   :2000   Mean   : 68.22   Mean   : 4.941  
##  3rd Qu.:131.00   3rd Qu.:2002   3rd Qu.: 79.38   3rd Qu.: 5.700  
##  Max.   :176.00   Max.   :2015   Max.   :147.60   Max.   :14.000  
##                                  NA's   :4        NA's   :3       
##       crea        
##  Min.   :  28.00  
##  1st Qu.:  67.00  
##  Median :  88.00  
##  Mean   :  99.16  
##  3rd Qu.: 115.25  
##  Max.   :1067.00  
##  NA's   :2
addmargins(table(table(st2clin$lex.id)))
## 
##   1   2   3   4   5   6 Sum 
##   2   6  23  38  31  60 160

Explain the contents of the table.

We can use addCov.Lexis to amend the follow-up data with the clinical measurements:

S5 <- addCov.Lexis(S4, st2clin, "per")
tt <- table(st2clin$lex.id)
(who <- names(tt[tt == 3])[1])
## [1] "5"
subset(st2clin, lex.id == who)
##     lex.id      per   a1c chol crea
## 5        5 1993.151 102.7  6.0  149
## 165      5 1995.511  54.7  8.8  140
## 321      5 1997.496  41.9  5.8  141
subset(S5,
       lex.id == who,
       select = c(lex.id,per,tfi,tfc,exnam,a1c,chol,crea))
##  lex.id     per  tfi  tfc exnam   a1c chol crea
##       5 1993.22 0.00 0.07   ex1 102.7  6.0  149
##       5 1993.31 0.08 0.15   ex1 102.7  6.0  149
##       5 1993.39 0.17 0.24   ex1 102.7  6.0  149
##       5 1993.47 0.25 0.32   ex1 102.7  6.0  149
##       5 1993.56 0.33 0.40   ex1 102.7  6.0  149
##       5 1993.64 0.42 0.49   ex1 102.7  6.0  149
##       5 1993.72 0.50 0.57   ex1 102.7  6.0  149
##       5 1993.77 0.55 0.62   ex1 102.7  6.0  149
##       5 1993.81 0.58 0.65   ex1 102.7  6.0  149
##       5 1993.89 0.67 0.74   ex1 102.7  6.0  149
##       5 1993.97 0.75 0.82   ex1 102.7  6.0  149
##       5 1994.06 0.83 0.90   ex1 102.7  6.0  149
##       5 1994.14 0.92 0.99   ex1 102.7  6.0  149
##       5 1994.22 1.00 1.07   ex1 102.7  6.0  149
##       5 1994.31 1.08 1.15   ex1 102.7  6.0  149
##       5 1994.39 1.17 1.24   ex1 102.7  6.0  149
##       5 1994.47 1.25 1.32   ex1 102.7  6.0  149
##       5 1994.56 1.33 1.40   ex1 102.7  6.0  149
##       5 1994.64 1.42 1.49   ex1 102.7  6.0  149
##       5 1994.72 1.50 1.57   ex1 102.7  6.0  149
##       5 1994.81 1.58 1.65   ex1 102.7  6.0  149
##       5 1994.89 1.67 1.74   ex1 102.7  6.0  149
##       5 1994.97 1.75 1.82   ex1 102.7  6.0  149
##       5 1995.06 1.83 1.90   ex1 102.7  6.0  149
##       5 1995.14 1.92 1.99   ex1 102.7  6.0  149
##       5 1995.22 2.00 2.07   ex1 102.7  6.0  149
##       5 1995.31 2.08 2.15   ex1 102.7  6.0  149
##       5 1995.39 2.17 2.24   ex1 102.7  6.0  149
##       5 1995.47 2.25 2.32   ex1 102.7  6.0  149
##       5 1995.51 2.29 0.00   ex2  54.7  8.8  140
##       5 1995.56 2.33 0.04   ex2  54.7  8.8  140
##       5 1995.64 2.42 0.13   ex2  54.7  8.8  140
##       5 1995.72 2.50 0.21   ex2  54.7  8.8  140
##       5 1995.81 2.58 0.29   ex2  54.7  8.8  140
##       5 1995.89 2.67 0.38   ex2  54.7  8.8  140
##       5 1995.97 2.75 0.46   ex2  54.7  8.8  140
##       5 1996.06 2.83 0.54   ex2  54.7  8.8  140
##       5 1996.14 2.92 0.63   ex2  54.7  8.8  140
##       5 1996.22 3.00 0.71   ex2  54.7  8.8  140
##       5 1996.31 3.08 0.79   ex2  54.7  8.8  140
##       5 1996.39 3.17 0.88   ex2  54.7  8.8  140
##       5 1996.47 3.25 0.96   ex2  54.7  8.8  140
##       5 1996.56 3.33 1.04   ex2  54.7  8.8  140
##       5 1996.64 3.42 1.13   ex2  54.7  8.8  140
##       5 1996.72 3.50 1.21   ex2  54.7  8.8  140
##       5 1996.81 3.58 1.29   ex2  54.7  8.8  140
##       5 1996.89 3.67 1.38   ex2  54.7  8.8  140
##       5 1996.97 3.75 1.46   ex2  54.7  8.8  140
##       5 1997.06 3.83 1.54   ex2  54.7  8.8  140
##       5 1997.07 3.85 1.56   ex2  54.7  8.8  140
##       5 1997.14 3.92 1.63   ex2  54.7  8.8  140
##       5 1997.22 4.00 1.71   ex2  54.7  8.8  140
##       5 1997.31 4.08 1.79   ex2  54.7  8.8  140
##       5 1997.39 4.17 1.88   ex2  54.7  8.8  140
##       5 1997.47 4.25 1.96   ex2  54.7  8.8  140
##       5 1997.50 4.27 0.00   ex3  41.9  5.8  141
##       5 1997.56 4.33 0.06   ex3  41.9  5.8  141
##       5 1997.64 4.42 0.14   ex3  41.9  5.8  141
##       5 1997.72 4.50 0.23   ex3  41.9  5.8  141
##       5 1997.81 4.58 0.31   ex3  41.9  5.8  141
##       5 1997.89 4.67 0.39   ex3  41.9  5.8  141
##       5 1997.97 4.75 0.48   ex3  41.9  5.8  141
timeScales(S5)
## [1] "per" "age" "tfi" "tfc"
timeSince(S5)
## per age tfi tfc 
##  ""  ""  "" "X"

We see that tfc is included as a time scale, but it is not a proper time scale: it is reset to 0 at every clinical visit, and it also has some missing values, as do the clinical variables. The missing values occur where there is follow-up before the earliest clinical measurement for a person. But tfc needs to be a time scale in the Lexis object in order to be handled properly when the Lexis object is subsequently cut or split.
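A small check (not in the original output) of how much follow-up precedes the earliest clinical measurement, that is, the records where tfc is missing:

with(S5, c(records = sum(is.na(tfc)),
           p.years = round(sum(lex.dur[is.na(tfc)]), 1)))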

The values of the clinical measurements in st2clin are added to the follow-up data: extra cut points are inserted at the measurement dates, and the values of the clinical variables are carried forward as LOCF (Last Observation Carried Forward), so it is possible to model the effect of these clinical variables on the transition rates. Creatinine is traditionally modeled on a log scale; here we use the base 2 logarithm, so its coefficient is the rate ratio associated with a doubling of creatinine.

detc <- glm.Lexis(S5, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                        Ns(age, knots = seq(50, 80, 10)) +
                        lex.Cst / allo +
                        a1c + chol + log2(crea),
                  from = c("Norm","Mic"),
                    to = c("Mic","Mac"))
## stats::glm Poisson analysis of Lexis object S5 with log link:
## Rates for transitions:
## Norm->Mic
## Mic->Mac
impc <- glm.Lexis(S5, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                        Ns(age, knots = seq(50, 80, 10)) +
                        lex.Cst / allo +
                        a1c + chol + log2(crea),
                    to = c("Norm","Mic"),
                  from = c("Mic","Mac"))
## stats::glm Poisson analysis of Lexis object S5 with log link:
## Rates for transitions:
## Mic->Norm
## Mac->Mic
round(ci.exp(detc), 3)
##                                   exp(Est.)  2.5%  97.5%
## (Intercept)                           0.080 0.005  1.337
## Ns(tfi, knots = seq(0, 20, 5))1       0.714 0.271  1.880
## Ns(tfi, knots = seq(0, 20, 5))2       0.300 0.084  1.067
## Ns(tfi, knots = seq(0, 20, 5))3       0.297 0.045  1.967
## Ns(tfi, knots = seq(0, 20, 5))4       0.226 0.066  0.776
## Ns(age, knots = seq(50, 80, 10))1     2.032 0.862  4.789
## Ns(age, knots = seq(50, 80, 10))2     4.453 1.131 17.535
## Ns(age, knots = seq(50, 80, 10))3     3.387 0.948 12.106
## lex.CstMic                            0.390 0.220  0.692
## a1c                                   1.005 0.993  1.018
## chol                                  1.093 0.912  1.311
## log2(crea)                            0.864 0.582  1.281
## lex.CstNorm:alloConv                  0.433 0.193  0.975
## lex.CstMic:alloConv                   1.698 0.974  2.959

We do not really need to see the first 8 parameter estimates (the intercept and the spline terms), so we omit them:

round(ci.exp(detc, subset = -(1:8), pval = T), 3)
##                      exp(Est.)  2.5% 97.5%     P
## lex.CstMic               0.390 0.220 0.692 0.001
## a1c                      1.005 0.993 1.018 0.398
## chol                     1.093 0.912 1.311 0.336
## log2(crea)               0.864 0.582 1.281 0.466
## lex.CstNorm:alloConv     0.433 0.193 0.975 0.043
## lex.CstMic:alloConv      1.698 0.974 2.959 0.062
round(ci.exp(impc, subset = -(1:8), pval = T), 3)
##                     exp(Est.)  2.5% 97.5%     P
## lex.CstMac              1.053 0.465 2.382 0.902
## a1c                     0.990 0.978 1.003 0.129
## chol                    0.964 0.804 1.157 0.697
## log2(crea)              0.869 0.578 1.308 0.501
## lex.CstMic:alloConv     0.598 0.359 0.996 0.048
## lex.CstMac:alloConv     1.525 0.611 3.804 0.366

We also look at the effect of the clinical variables on the mortality rates:

moc <- glm.Lexis(S5, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst / allo +
                       a1c + chol + log2(crea),
                    to = "D(oth)")
## stats::glm Poisson analysis of Lexis object S5 with log link:
## Rates for transitions:
## Norm->D(oth)
## Mic->D(oth)
## Mac->D(oth)
mCc <- glm.Lexis(S5, ~ Ns(tfi, knots = seq( 0, 20, 5)) +
                       Ns(age, knots = seq(50, 80, 10)) +
                       lex.Cst / allo +
                       a1c + chol + log2(crea),
                    to = "D(CVD)")
## stats::glm Poisson analysis of Lexis object S5 with log link:
## Rates for transitions:
## Norm->D(CVD)
## Mic->D(CVD)
## Mac->D(CVD)
round(ci.exp(moc, subset = -(1:8), pval = T), 3)
##                      exp(Est.)  2.5% 97.5%     P
## lex.CstMic               0.972 0.361 2.615 0.955
## lex.CstMac               1.304 0.411 4.135 0.652
## a1c                      1.005 0.987 1.024 0.582
## chol                     0.839 0.630 1.117 0.229
## log2(crea)               1.860 1.145 3.019 0.012
## lex.CstNorm:alloConv     1.900 0.606 5.962 0.271
## lex.CstMic:alloConv      1.944 0.878 4.306 0.101
## lex.CstMac:alloConv      0.777 0.235 2.575 0.680
round(ci.exp(mCc, subset = -(1:8), pval = T), 3)
##                      exp(Est.)  2.5%  97.5%     P
## lex.CstMic               0.806 0.199  3.274 0.763
## lex.CstMac               1.144 0.221  5.916 0.873
## a1c                      0.999 0.980  1.019 0.951
## chol                     1.002 0.734  1.367 0.990
## log2(crea)               1.351 0.757  2.409 0.309
## lex.CstNorm:alloConv     1.397 0.272  7.177 0.689
## lex.CstMic:alloConv      1.680 0.552  5.113 0.361
## lex.CstMac:alloConv      5.067 1.387 18.513 0.014

Among the clinical variables, only crea has any discernible effect: a doubling of creatinine is associated with a 1.86 times higher mortality rate from other (non-CVD) causes. The confidence interval is (1.14, 3.02), so the effect is not terribly precisely determined.
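Since the coefficient of log2(crea) is the rate ratio for a doubling of creatinine, rate ratios for other relative changes follow by a monotone transformation of the estimate and its confidence limits; a small sketch (not in the original) of the rate ratio for a 10% higher creatinine:

round(ci.exp(moc, subset = "crea")^log2(1.10), 3)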

There are limitations to using clinical measurements as time-dependent covariates without a model for the clinical variables themselves. In order to simulate events based on models for transition rates we must know all covariates at all (future) times, so models with non-deterministically varying covariates cannot be used for this purpose. Timescales are also time-varying covariates, but they vary deterministically, so their values for each person are known at any time of future follow-up.

So the models with effects of clinical variables as presented here cannot be used for prediction of state probabilities; that would also require a model for the change of the clinical variables over time.

Several transitions from one state: stack

So far, we have only jointly modeled transitions originating in different states, for example:

Mic → Mac and Norm → Mic; Norm → D(CVD), Mic → D(CVD) and Mac → D(CVD).

As long as the rates modeled originate in different states, the likelihood will have at most one contribution from each record in the Lexis follow-up data set.

But if we want to create a joint model for more than one rate originating in a given state, we must repeat some of the risk time in different contributions to the likelihood. This means that the modeling cannot be based on (subsets of) a Lexis object; we must repeat some records. This is detailed in the section on competing risks in the chapter on the multistate likelihood in PMM (Practical Multistate Modeling, http://bendixcarstensen.com/MSbook.pdf).

This behaviour can be achieved by the stack.Lexis function:

St4 <- stack(S4)
c(nrow(S4), nrow(St4))
## [1]  29645 106569
table(S4$lex.Cst)
## 
##   Norm    Mic    Mac D(CVD) D(oth) 
##   7115  17634   4896      0      0
table(St4$lex.Tr, St4$lex.Cst)
##               
##                 Norm   Mic   Mac D(CVD) D(oth)
##   Mac->D(CVD)      0     0  4896      0      0
##   Mac->D(oth)      0     0  4896      0      0
##   Mac->Mic         0     0  4896      0      0
##   Mic->D(CVD)      0 17634     0      0      0
##   Mic->D(oth)      0 17634     0      0      0
##   Mic->Mac         0 17634     0      0      0
##   Mic->Norm        0 17634     0      0      0
##   Norm->D(CVD)  7115     0     0      0      0
##   Norm->D(oth)  7115     0     0      0      0
##   Norm->Mic     7115     0     0      0      0
ftable(St4$lex.Tr, St4$lex.Xst, St4$lex.Fail, col.vars = 2:3)
##                Norm         Mic         Mac       D(CVD)       D(oth)      
##               FALSE  TRUE FALSE  TRUE FALSE  TRUE  FALSE  TRUE  FALSE  TRUE
##                                                                            
## Mac->D(CVD)       0     0    22     0  4844     0      0    18     12     0
## Mac->D(oth)       0     0    22     0  4844     0     18     0      0    12
## Mac->Mic          0     0     0    22  4844     0     18     0     12     0
## Mic->D(CVD)      72     0 17453     0    65     0      0    14     30     0
## Mic->D(oth)      72     0 17453     0    65     0     14     0      0    30
## Mic->Mac         72     0 17453     0     0    65     14     0     30     0
## Mic->Norm         0    72 17453     0    65     0     14     0     30     0
## Norm->D(CVD)   7061     0    35     0     0     0      0     6     13     0
## Norm->D(oth)   7061     0    35     0     0     0      6     0      0    13
## Norm->Mic      7061     0     0    35     0     0      6     0     13     0

We see that lex.Fail is only TRUE where lex.Xst is equal to the second part of lex.Tr.
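As a check on the duplication of risk time described above (not part of the original output): in S4 the person-years are counted once per current state, whereas in St4 each rate out of a state carries the full person-years from that state:

with(S4 , round(tapply(lex.dur, lex.Cst, sum), 1))
with(St4, round(tapply(lex.dur, lex.Tr , sum), 1))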

The two ways of representing the data for person 102 are quite different:

subset(S4 , lex.id == 102)[,1:8]
##  lex.id     per   age  tfi lex.dur lex.Cst lex.Xst  id
##     102 1993.54 58.27 0.00    0.08     Mic     Mic 102
##     102 1993.62 58.36 0.08    0.08     Mic     Mic 102
##     102 1993.71 58.44 0.17    0.08     Mic     Mic 102
##     102 1993.79 58.52 0.25    0.08     Mic     Mic 102
##     102 1993.87 58.61 0.33    0.08     Mic     Mic 102
##     102 1993.96 58.69 0.42    0.08     Mic     Mic 102
##     102 1994.04 58.77 0.50    0.08     Mic     Mic 102
##     102 1994.12 58.86 0.58    0.08     Mic     Mic 102
##     102 1994.21 58.94 0.67    0.08     Mic     Mic 102
##     102 1994.29 59.02 0.75    0.08     Mic     Mic 102
##     102 1994.37 59.11 0.83    0.08     Mic     Mic 102
##     102 1994.46 59.19 0.92    0.08     Mic     Mic 102
##     102 1994.54 59.27 1.00    0.08     Mic     Mic 102
##     102 1994.62 59.36 1.08    0.08     Mic     Mic 102
##     102 1994.71 59.44 1.17    0.08     Mic     Mic 102
##     102 1994.79 59.52 1.25    0.08     Mic     Mic 102
##     102 1994.87 59.61 1.33    0.08     Mic     Mic 102
##     102 1994.96 59.69 1.42    0.08     Mic     Mic 102
##     102 1995.04 59.77 1.50    0.08     Mic     Mic 102
##     102 1995.12 59.86 1.58    0.08     Mic     Mic 102
##     102 1995.21 59.94 1.67    0.03     Mic  D(CVD) 102
subset(St4, lex.id == 102)[,1:9]
##        lex.id      per      age        tfi    lex.dur lex.Cst lex.Xst
## 18070     102 1993.540 58.27242 0.00000000 0.08333333     Mic     Mic
## 18071     102 1993.623 58.35575 0.08333333 0.08333333     Mic     Mic
## 18072     102 1993.707 58.43908 0.16666667 0.08333333     Mic     Mic
## 18073     102 1993.790 58.52242 0.25000000 0.08333333     Mic     Mic
## 18074     102 1993.873 58.60575 0.33333333 0.08333333     Mic     Mic
## 18075     102 1993.957 58.68908 0.41666667 0.08333333     Mic     Mic
## 18076     102 1994.040 58.77242 0.50000000 0.08333333     Mic     Mic
## 18077     102 1994.123 58.85575 0.58333333 0.08333333     Mic     Mic
## 18078     102 1994.207 58.93908 0.66666667 0.08333333     Mic     Mic
## 18079     102 1994.290 59.02242 0.75000000 0.08333333     Mic     Mic
## 18080     102 1994.373 59.10575 0.83333333 0.08333333     Mic     Mic
## 18081     102 1994.457 59.18908 0.91666667 0.08333333     Mic     Mic
## 18082     102 1994.540 59.27242 1.00000000 0.08333333     Mic     Mic
## 18083     102 1994.623 59.35575 1.08333333 0.08333333     Mic     Mic
## 18084     102 1994.707 59.43908 1.16666667 0.08333333     Mic     Mic
## 18085     102 1994.790 59.52242 1.25000000 0.08333333     Mic     Mic
## 18086     102 1994.873 59.60575 1.33333333 0.08333333     Mic     Mic
## 18087     102 1994.957 59.68908 1.41666667 0.08333333     Mic     Mic
## 18088     102 1995.040 59.77242 1.50000000 0.08333333     Mic     Mic
## 18089     102 1995.123 59.85575 1.58333333 0.08333333     Mic     Mic
## 18090     102 1995.207 59.93908 1.66666667 0.03353867     Mic  D(CVD)
## 180701    102 1993.540 58.27242 0.00000000 0.08333333     Mic     Mic
## 180711    102 1993.623 58.35575 0.08333333 0.08333333     Mic     Mic
## 180721    102 1993.707 58.43908 0.16666667 0.08333333     Mic     Mic
## 180731    102 1993.790 58.52242 0.25000000 0.08333333     Mic     Mic
## 180741    102 1993.873 58.60575 0.33333333 0.08333333     Mic     Mic
## 180751    102 1993.957 58.68908 0.41666667 0.08333333     Mic     Mic
## 180761    102 1994.040 58.77242 0.50000000 0.08333333     Mic     Mic
## 180771    102 1994.123 58.85575 0.58333333 0.08333333     Mic     Mic
## 180781    102 1994.207 58.93908 0.66666667 0.08333333     Mic     Mic
## 180791    102 1994.290 59.02242 0.75000000 0.08333333     Mic     Mic
## 180801    102 1994.373 59.10575 0.83333333 0.08333333     Mic     Mic
## 180811    102 1994.457 59.18908 0.91666667 0.08333333     Mic     Mic
## 180821    102 1994.540 59.27242 1.00000000 0.08333333     Mic     Mic
## 180831    102 1994.623 59.35575 1.08333333 0.08333333     Mic     Mic
## 180841    102 1994.707 59.43908 1.16666667 0.08333333     Mic     Mic
## 180851    102 1994.790 59.52242 1.25000000 0.08333333     Mic     Mic
## 180861    102 1994.873 59.60575 1.33333333 0.08333333     Mic     Mic
## 180871    102 1994.957 59.68908 1.41666667 0.08333333     Mic     Mic
## 180881    102 1995.040 59.77242 1.50000000 0.08333333     Mic     Mic
## 180891    102 1995.123 59.85575 1.58333333 0.08333333     Mic     Mic
## 180901    102 1995.207 59.93908 1.66666667 0.03353867     Mic  D(CVD)
## 180702    102 1993.540 58.27242 0.00000000 0.08333333     Mic     Mic
## 180712    102 1993.623 58.35575 0.08333333 0.08333333     Mic     Mic
## 180722    102 1993.707 58.43908 0.16666667 0.08333333     Mic     Mic
## 180732    102 1993.790 58.52242 0.25000000 0.08333333     Mic     Mic
## 180742    102 1993.873 58.60575 0.33333333 0.08333333     Mic     Mic
## 180752    102 1993.957 58.68908 0.41666667 0.08333333     Mic     Mic
## 180762    102 1994.040 58.77242 0.50000000 0.08333333     Mic     Mic
## 180772    102 1994.123 58.85575 0.58333333 0.08333333     Mic     Mic
## 180782    102 1994.207 58.93908 0.66666667 0.08333333     Mic     Mic
## 180792    102 1994.290 59.02242 0.75000000 0.08333333     Mic     Mic
## 180802    102 1994.373 59.10575 0.83333333 0.08333333     Mic     Mic
## 180812    102 1994.457 59.18908 0.91666667 0.08333333     Mic     Mic
## 180822    102 1994.540 59.27242 1.00000000 0.08333333     Mic     Mic
## 180832    102 1994.623 59.35575 1.08333333 0.08333333     Mic     Mic
## 180842    102 1994.707 59.43908 1.16666667 0.08333333     Mic     Mic
## 180852    102 1994.790 59.52242 1.25000000 0.08333333     Mic     Mic
## 180862    102 1994.873 59.60575 1.33333333 0.08333333     Mic     Mic
## 180872    102 1994.957 59.68908 1.41666667 0.08333333     Mic     Mic
## 180882    102 1995.040 59.77242 1.50000000 0.08333333     Mic     Mic
## 180892    102 1995.123 59.85575 1.58333333 0.08333333     Mic     Mic
## 180902    102 1995.207 59.93908 1.66666667 0.03353867     Mic  D(CVD)
## 180703    102 1993.540 58.27242 0.00000000 0.08333333     Mic     Mic
## 180713    102 1993.623 58.35575 0.08333333 0.08333333     Mic     Mic
## 180723    102 1993.707 58.43908 0.16666667 0.08333333     Mic     Mic
## 180733    102 1993.790 58.52242 0.25000000 0.08333333     Mic     Mic
## 180743    102 1993.873 58.60575 0.33333333 0.08333333     Mic     Mic
## 180753    102 1993.957 58.68908 0.41666667 0.08333333     Mic     Mic
## 180763    102 1994.040 58.77242 0.50000000 0.08333333     Mic     Mic
## 180773    102 1994.123 58.85575 0.58333333 0.08333333     Mic     Mic
## 180783    102 1994.207 58.93908 0.66666667 0.08333333     Mic     Mic
## 180793    102 1994.290 59.02242 0.75000000 0.08333333     Mic     Mic
## 180803    102 1994.373 59.10575 0.83333333 0.08333333     Mic     Mic
## 180813    102 1994.457 59.18908 0.91666667 0.08333333     Mic     Mic
## 180823    102 1994.540 59.27242 1.00000000 0.08333333     Mic     Mic
## 180833    102 1994.623 59.35575 1.08333333 0.08333333     Mic     Mic
## 180843    102 1994.707 59.43908 1.16666667 0.08333333     Mic     Mic
## 180853    102 1994.790 59.52242 1.25000000 0.08333333     Mic     Mic
## 180863    102 1994.873 59.60575 1.33333333 0.08333333     Mic     Mic
## 180873    102 1994.957 59.68908 1.41666667 0.08333333     Mic     Mic
## 180883    102 1995.040 59.77242 1.50000000 0.08333333     Mic     Mic
## 180893    102 1995.123 59.85575 1.58333333 0.08333333     Mic     Mic
## 180903    102 1995.207 59.93908 1.66666667 0.03353867     Mic  D(CVD)
##             lex.Tr lex.Fail
## 18070    Mic->Norm    FALSE
## 18071    Mic->Norm    FALSE
## 18072    Mic->Norm    FALSE
## 18073    Mic->Norm    FALSE
## 18074    Mic->Norm    FALSE
## 18075    Mic->Norm    FALSE
## 18076    Mic->Norm    FALSE
## 18077    Mic->Norm    FALSE
## 18078    Mic->Norm    FALSE
## 18079    Mic->Norm    FALSE
## 18080    Mic->Norm    FALSE
## 18081    Mic->Norm    FALSE
## 18082    Mic->Norm    FALSE
## 18083    Mic->Norm    FALSE
## 18084    Mic->Norm    FALSE
## 18085    Mic->Norm    FALSE
## 18086    Mic->Norm    FALSE
## 18087    Mic->Norm    FALSE
## 18088    Mic->Norm    FALSE
## 18089    Mic->Norm    FALSE
## 18090    Mic->Norm    FALSE
## 180701    Mic->Mac    FALSE
## 180711    Mic->Mac    FALSE
## 180721    Mic->Mac    FALSE
## 180731    Mic->Mac    FALSE
## 180741    Mic->Mac    FALSE
## 180751    Mic->Mac    FALSE
## 180761    Mic->Mac    FALSE
## 180771    Mic->Mac    FALSE
## 180781    Mic->Mac    FALSE
## 180791    Mic->Mac    FALSE
## 180801    Mic->Mac    FALSE
## 180811    Mic->Mac    FALSE
## 180821    Mic->Mac    FALSE
## 180831    Mic->Mac    FALSE
## 180841    Mic->Mac    FALSE
## 180851    Mic->Mac    FALSE
## 180861    Mic->Mac    FALSE
## 180871    Mic->Mac    FALSE
## 180881    Mic->Mac    FALSE
## 180891    Mic->Mac    FALSE
## 180901    Mic->Mac    FALSE
## 180702 Mic->D(CVD)    FALSE
## 180712 Mic->D(CVD)    FALSE
## 180722 Mic->D(CVD)    FALSE
## 180732 Mic->D(CVD)    FALSE
## 180742 Mic->D(CVD)    FALSE
## 180752 Mic->D(CVD)    FALSE
## 180762 Mic->D(CVD)    FALSE
## 180772 Mic->D(CVD)    FALSE
## 180782 Mic->D(CVD)    FALSE
## 180792 Mic->D(CVD)    FALSE
## 180802 Mic->D(CVD)    FALSE
## 180812 Mic->D(CVD)    FALSE
## 180822 Mic->D(CVD)    FALSE
## 180832 Mic->D(CVD)    FALSE
## 180842 Mic->D(CVD)    FALSE
## 180852 Mic->D(CVD)    FALSE
## 180862 Mic->D(CVD)    FALSE
## 180872 Mic->D(CVD)    FALSE
## 180882 Mic->D(CVD)    FALSE
## 180892 Mic->D(CVD)    FALSE
## 180902 Mic->D(CVD)     TRUE
## 180703 Mic->D(oth)    FALSE
## 180713 Mic->D(oth)    FALSE
## 180723 Mic->D(oth)    FALSE
## 180733 Mic->D(oth)    FALSE
## 180743 Mic->D(oth)    FALSE
## 180753 Mic->D(oth)    FALSE
## 180763 Mic->D(oth)    FALSE
## 180773 Mic->D(oth)    FALSE
## 180783 Mic->D(oth)    FALSE
## 180793 Mic->D(oth)    FALSE
## 180803 Mic->D(oth)    FALSE
## 180813 Mic->D(oth)    FALSE
## 180823 Mic->D(oth)    FALSE
## 180833 Mic->D(oth)    FALSE
## 180843 Mic->D(oth)    FALSE
## 180853 Mic->D(oth)    FALSE
## 180863 Mic->D(oth)    FALSE
## 180873 Mic->D(oth)    FALSE
## 180883 Mic->D(oth)    FALSE
## 180893 Mic->D(oth)    FALSE
## 180903 Mic->D(oth)    FALSE

Suppose we want to fit a model for the two types of mortality assuming that, say, the effect of sex is the same for both. Since some of the transitions we put in the same model originate from the same state, we need the stacked data representation, where each record corresponds to one likelihood term.

cbind(with(subset(St4, grepl("D", lex.Tr)), table(lex.Tr)))
##               [,1]
## Mac->D(CVD)   4896
## Mac->D(oth)   4896
## Mac->Mic         0
## Mic->D(CVD)  17634
## Mic->D(oth)  17634
## Mic->Mac         0
## Mic->Norm        0
## Norm->D(CVD)  7115
## Norm->D(oth)  7115
## Norm->Mic        0

We can then fit a model with a common effect of sex for the two types of mortality:

stD <- glm(cbind(lex.Fail, lex.dur)
           ~ Ns(tfi, knots = seq( 0, 20,  5)) * lex.Tr +
             Ns(age, knots = seq(50, 80, 10)) * lex.Tr +
             lex.Tr / allo + sex,
          family = poisreg,
            data = subset(St4, grepl("D", lex.Tr)))
## Warning: glm.fit: fitted rates numerically 0 occurred
round(ci.exp(stD)[,1,drop=F],3)
##                                                          exp(Est.)
## (Intercept)                                           0.000000e+00
## Ns(tfi, knots = seq(0, 20, 5))1                       2.069800e+01
## Ns(tfi, knots = seq(0, 20, 5))2                       2.756800e+01
## Ns(tfi, knots = seq(0, 20, 5))3                       7.291900e+01
## Ns(tfi, knots = seq(0, 20, 5))4                       5.860000e-01
## lex.TrMac->D(oth)                                     0.000000e+00
## lex.TrMic->D(CVD)                                     7.322000e+00
## lex.TrMic->D(oth)                                     9.000000e-03
## lex.TrNorm->D(CVD)                                    0.000000e+00
## lex.TrNorm->D(oth)                                    1.350600e+01
## Ns(age, knots = seq(50, 80, 10))1                     7.378000e+00
## Ns(age, knots = seq(50, 80, 10))2                     2.669160e+02
## Ns(age, knots = seq(50, 80, 10))3                     6.845600e+01
## sexM                                                  1.440000e+00
## Ns(tfi, knots = seq(0, 20, 5))1:lex.TrMac->D(oth)     5.334862e+72
## Ns(tfi, knots = seq(0, 20, 5))2:lex.TrMac->D(oth)     3.913890e+51
## Ns(tfi, knots = seq(0, 20, 5))3:lex.TrMac->D(oth)    1.417514e+141
## Ns(tfi, knots = seq(0, 20, 5))4:lex.TrMac->D(oth)     2.902424e+30
## Ns(tfi, knots = seq(0, 20, 5))1:lex.TrMic->D(CVD)     5.000000e-03
## Ns(tfi, knots = seq(0, 20, 5))2:lex.TrMic->D(CVD)     7.600000e-02
## Ns(tfi, knots = seq(0, 20, 5))3:lex.TrMic->D(CVD)     3.100000e-02
## Ns(tfi, knots = seq(0, 20, 5))4:lex.TrMic->D(CVD)     1.690000e-01
## Ns(tfi, knots = seq(0, 20, 5))1:lex.TrMic->D(oth)     2.015738e+03
## Ns(tfi, knots = seq(0, 20, 5))2:lex.TrMic->D(oth)     8.679800e+01
## Ns(tfi, knots = seq(0, 20, 5))3:lex.TrMic->D(oth)     3.163383e+07
## Ns(tfi, knots = seq(0, 20, 5))4:lex.TrMic->D(oth)     3.372500e+01
## Ns(tfi, knots = seq(0, 20, 5))1:lex.TrNorm->D(CVD)    4.469074e+04
## Ns(tfi, knots = seq(0, 20, 5))2:lex.TrNorm->D(CVD)    5.744664e+04
## Ns(tfi, knots = seq(0, 20, 5))3:lex.TrNorm->D(CVD)    1.297133e+11
## Ns(tfi, knots = seq(0, 20, 5))4:lex.TrNorm->D(CVD)    1.000000e-03
## Ns(tfi, knots = seq(0, 20, 5))1:lex.TrNorm->D(oth)    2.480000e-01
## Ns(tfi, knots = seq(0, 20, 5))2:lex.TrNorm->D(oth)    1.960000e-01
## Ns(tfi, knots = seq(0, 20, 5))3:lex.TrNorm->D(oth)    1.109700e+01
## Ns(tfi, knots = seq(0, 20, 5))4:lex.TrNorm->D(oth)    2.950000e-01
## lex.TrMac->D(oth):Ns(age, knots = seq(50, 80, 10))1   2.060000e-01
## lex.TrMic->D(CVD):Ns(age, knots = seq(50, 80, 10))1   2.002000e+00
## lex.TrMic->D(oth):Ns(age, knots = seq(50, 80, 10))1   2.490000e-01
## lex.TrNorm->D(CVD):Ns(age, knots = seq(50, 80, 10))1  1.990000e+00
## lex.TrNorm->D(oth):Ns(age, knots = seq(50, 80, 10))1  1.107000e+00
## lex.TrMac->D(oth):Ns(age, knots = seq(50, 80, 10))2   1.100000e-02
## lex.TrMic->D(CVD):Ns(age, knots = seq(50, 80, 10))2   6.593000e+00
## lex.TrMic->D(oth):Ns(age, knots = seq(50, 80, 10))2   5.000000e-03
## lex.TrNorm->D(CVD):Ns(age, knots = seq(50, 80, 10))2  1.000000e-03
## lex.TrNorm->D(oth):Ns(age, knots = seq(50, 80, 10))2  1.200000e-02
## lex.TrMac->D(oth):Ns(age, knots = seq(50, 80, 10))3   1.490000e-01
## lex.TrMic->D(CVD):Ns(age, knots = seq(50, 80, 10))3   2.070000e-01
## lex.TrMic->D(oth):Ns(age, knots = seq(50, 80, 10))3   1.650000e-01
## lex.TrNorm->D(CVD):Ns(age, knots = seq(50, 80, 10))3  0.000000e+00
## lex.TrNorm->D(oth):Ns(age, knots = seq(50, 80, 10))3  2.940000e-01
## lex.TrMac->D(CVD):alloConv                            8.706000e+00
## lex.TrMac->D(oth):alloConv                            6.270000e-01
## lex.TrMic->D(CVD):alloConv                            1.688000e+00
## lex.TrMic->D(oth):alloConv                            2.096000e+00
## lex.TrNorm->D(CVD):alloConv                           2.054000e+00
## lex.TrNorm->D(oth):alloConv                           1.800000e+00
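Whether a common sex effect is tenable could in principle be checked by allowing transition-specific sex effects and comparing the two fits; a minimal sketch (the name stDs is ours, and the numerical warnings above may well carry over):

stDs <- update(stD, . ~ . - sex + lex.Tr / sex)
anova(stD, stDs, test = "Chisq")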

So under the assumption that the sex effect is the same for all 6 mortality rates in figure 3.2, the M/W mortality rate ratio is 1.44 (sexM above). But it is only rarely that we want to jointly model different rates out of the same state, so use of stack.Lexis is seldom needed.