This is an intructory tutorial on egocentric relational event modeling (EREM) going through the worked example in Marcum and Butts (2015) using their informR package and the relational event modeling package “relevent”. You can find the original paper and source code at Marcum & Butts (2015). The main difference here is that I’ve added interleaved my comments with the code. The example will however make a lot more sense if you read through Marcum and Butts’ paper first (and they also go into a lot more detail).

First we need to load the packages we need. If you haven’t installed them already you’ll also need to install the packages in R. There are various ways to do this but install.packages(‘relevent’) and install.packages(‘informR’) should work directly from the R console.

library(relevent)
## Loading required package: trust
## Loading required package: sna
## Loading required package: statnet.common
## 
## Attaching package: 'statnet.common'
## The following object is masked from 'package:base':
## 
##     order
## Loading required package: network
## network: Classes for Relational Data
## Version 1.16.1 created on 2020-10-06.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
##                     Mark S. Handcock, University of California -- Los Angeles
##                     David R. Hunter, Penn State University
##                     Martina Morris, University of Washington
##                     Skye Bender-deMoll, University of Washington
##  For citation information, type citation("network").
##  Type help("network-package") to get started.
## sna: Tools for Social Network Analysis
## Version 2.6 created on 2020-10-5.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
##  For citation information, type citation("sna").
##  Type help(package="sna") to get started.
## Loading required package: coda
## relevent: Relational Event Models
## Version 1.0-4 created on March 9, 2015.
## copyright (c) 2007, Carter T. Butts, University of California-
## Irvine
##  For citation information, type citation("relevent").
##  Type help(package="relevent") to get started.
library(informR)
## Loading required package: abind
## informR: Sequence Statistics for Relational Event Models
## Version 1.0-5 created on 2015-03-09.
## copyright (c) 2010, Christopher Steven Marcum, University of California-
## Irvine
##  For citation information, type citation("informR").
##  Type help(package="informR") to get started.

The next few lines load two versions of the example data (from the American Time Use Survey) and show use the first few cases. There are two versions of the data (both for a subset of participants aged 80 or above): atus80ord records the order of events for each participant and atus80int includes timing information. This data file is identical except each event is split into two records coding the start and end of the activity. Note that times are zero-referenced for each person (with zero representing the start of data collection or observation). Apart from that we just have the participant ID and a covariate indicating whether the participant is male or female.

data("atus80ord", package = "informR")
data("atus80int", package = "informR")

atus80ord[1:5, ]
##                 Activities       TUCASEID SEX
## 1348              Sleeping 20030101031049   2
## 1349                Eating 20030101031049   2
## 1350 Private personal care 20030101031049   2
## 1351  Household Production 20030101031049   2
## 1352                Eating 20030101031049   2
atus80int[1:10, ]
##                                       Events    Time       TUCASEID SEX
## 200301010310491               Sleeping|START       0 20030101031049   2
## 200301010310492                Sleeping|STOP     240 20030101031049   2
## 200301010310493                 Eating|START 240.001 20030101031049   2
## 200301010310494                  Eating|STOP     270 20030101031049   2
## 200301010310495  Private personal care|START 270.001 20030101031049   2
## 200301010310496   Private personal care|STOP     300 20030101031049   2
## 200301010310497   Household Production|START 300.001 20030101031049   2
## 200301010310498    Household Production|STOP     780 20030101031049   2
## 200301010310499                 Eating|START 780.001 20030101031049   2
## 2003010103104910                 Eating|STOP     810 20030101031049   2

Ordinal data

The next step is some pre-processing of the ordinal data to get it in the right format for the relevent package. The informR package makes this somewhat easier, but it still requires some setup. There is an option to code some events (here “NA” observations) as missing (which means that they aren’t modeled directly). An eventlist object is then created using a matrix of observed events and participant IDs and a statement defining the missing/null events.

atus80ord[which(is.na(atus80ord[, "Activities"])), "Activities"] <- "MISSING"
rawevents <- cbind(atus80ord$Activities, atus80ord$TUCASEID)
evls <- gen.evl(rawevents, null.events = c("MISSING"))

This shows the components that make up the event list, the eventlist entry for the first participant in the data set, the event key and null events that are defined. The event key is particularly useful as it shows how the events are labeled with letters and what event each letter corresponds to. For example here ‘a’ is ‘Sleeping’ and ‘b’ is ‘Eating’. It is often useful to refer back to the key as it may change (for example if you analyze a subset of the data). Note that ‘a’ is the code for the first observed event and we can have up to 52 events.

names(evls)
## [1] "eventlist"   "event.key"   "null.events"
evls$eventlist[[1]]
## [1] 1 2 3 4 2 4 1
## attr(,"char")
## [1] "a" "b" "c" "d" "b" "d" "a"
evls$event.key
##       id  event.type             
##  [1,] "a" "Sleeping"             
##  [2,] "b" "Eating"               
##  [3,] "c" "Private personal care"
##  [4,] "d" "Household Production" 
##  [5,] "e" "Travel"               
##  [6,] "f" "Communication"        
##  [7,] "g" "Leisure"              
##  [8,] "h" "Personal Care"        
##  [9,] "i" "MISSING"              
## [10,] "j" "Waiting"              
## [11,] "k" "Volunteering"         
## [12,] "l" "Caregiving"           
## [13,] "m" "Education"            
## [14,] "n" "Work Production"
evls$null.events
## [1] "MISSING"

The next step is to define what sufficient statistics will be estimated by defining the intercepts. This is more-or-less just setting up a contrast matrix for the model so that we get parameter estimates for each event (at least at this stage; this will get modified in more complex models).

alpha.ints <- gen.intercepts(evls, basecat = "Sleeping")
length(alpha.ints)
## [1] 3430
dim(alpha.ints[[1]][[1]])
## [1]  7 13 12
alpha.ints[[1]][[1]][1, , ]
##                       Eating Private personal care Household Production Travel
## Sleeping                   0                     0                    0      0
## Eating                     1                     0                    0      0
## Private personal care      0                     1                    0      0
## Household Production       0                     0                    1      0
## Travel                     0                     0                    0      1
## Communication              0                     0                    0      0
## Leisure                    0                     0                    0      0
## Personal Care              0                     0                    0      0
## Waiting                    0                     0                    0      0
## Volunteering               0                     0                    0      0
## Caregiving                 0                     0                    0      0
## Education                  0                     0                    0      0
## Work Production            0                     0                    0      0
##                       Communication Leisure Personal Care Waiting Volunteering
## Sleeping                          0       0             0       0            0
## Eating                            0       0             0       0            0
## Private personal care             0       0             0       0            0
## Household Production              0       0             0       0            0
## Travel                            0       0             0       0            0
## Communication                     1       0             0       0            0
## Leisure                           0       1             0       0            0
## Personal Care                     0       0             1       0            0
## Waiting                           0       0             0       1            0
## Volunteering                      0       0             0       0            1
## Caregiving                        0       0             0       0            0
## Education                         0       0             0       0            0
## Work Production                   0       0             0       0            0
##                       Caregiving Education Work Production
## Sleeping                       0         0               0
## Eating                         0         0               0
## Private personal care          0         0               0
## Household Production           0         0               0
## Travel                         0         0               0
## Communication                  0         0               0
## Leisure                        0         0               0
## Personal Care                  0         0               0
## Waiting                        0         0               0
## Volunteering                   0         0               0
## Caregiving                     1         0               0
## Education                      0         1               0
## Work Production                0         0               1

Now we are able to fit an intercept only model for the ordinal version of the data. Various estimation approaches are implemented, but the default is “BPM” which estimates the posterior modes using a Bayesian approach with a very weakly informative prior by default. The BPM approach works well for initial modeling, being a good compromise between efficiency and accuracy. I’d reccommend checking the estimates using MCMC (“BMCMC”) or simulated importance resampling (“BSIR”). The latter also estimates the mean rather than the mode and therefore may be attractive for that reason. Frequentist “MLE” estimates are also implemented. You can vary the parameters of the prior. The mean scale and df of the t distribution here are (mu = 0, sigma = 100, nu = 4), though the default of (mu = 0, sigma = 1000, nu = 4) is even less informative. With either prior estimates are driven almost entirely by the data and will be very similar to those of frequentist statistics.

The output here is an intercept only model which effectively equivalent to a series of Poisson count models. The model is parameterized with a reference category (ethe ‘a’ event ‘Sleeping’), so all the estimates are relative to the that event. Thus we can see that on average only household production, travel and leisure are more common than sleeping. This sort of model is of limited interest (especially for ordinal data).

alpha.fit <- rem(eventlist = evls$eventlist, statslist = alpha.ints, estimator = "BPM", 
  prior.param = list(mu = 0, sigma = 100, nu = 4))
summary(alpha.fit)
## Egocentric Relational Event Model (Ordinal Likelihood)
## 
##                        Post.Mode    Post.SD  Z value  Pr(>|z|)    
## Eating                -0.0524425  0.0156947  -3.3414 0.0008335 ***
## Private personal care -0.6843110  0.0189127 -36.1827 < 2.2e-16 ***
## Household Production   0.3306130  0.0143563  23.0292 < 2.2e-16 ***
## Travel                 0.0078847  0.0154572   0.5101 0.6099800    
## Communication         -0.7191476  0.0191342 -37.5844 < 2.2e-16 ***
## Leisure                0.5687237  0.0137056  41.4956 < 2.2e-16 ***
## Personal Care         -1.8716218  0.0299893 -62.4097 < 2.2e-16 ***
## Waiting               -3.6080413  0.0674144 -53.5203 < 2.2e-16 ***
## Volunteering          -3.5948544  0.0669831 -53.6681 < 2.2e-16 ***
## Caregiving            -3.3621502  0.0598342 -56.1911 < 2.2e-16 ***
## Education             -5.0212337  0.1352830 -37.1165 < 2.2e-16 ***
## Work Production       -3.7153701  0.0710354 -52.3031 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Null deviance: 315678.6 on 61537 degrees of freedom
## Residual deviance: 245959.8 on 61525 degrees of freedom
##  Chi-square: 69718.73 on 12 degrees of freedom, asymptotic p-value 0 
## AIC: 245983.8 AICC: 245983.9 BIC: 246092.2 
## Log posterior: -124191.7 
## Prior parameters: mu=0 sigma=100 nu=4

We can also confirm that the estimates from the intercept only model match what we’d get from a Poisson model of the raw counts:

pois.mle <- log(prop.table(table(atus80ord$Activities))[-c(7, 10)]/prop.table(table(atus80ord$Activities))[10])
round(cbind(BPM = alpha.fit$coef[order(names(alpha.fit$coef))], pois.mle), 4)
##                           BPM pois.mle
## Caregiving            -3.3622  -3.3622
## Communication         -0.7191  -0.7191
## Eating                -0.0524  -0.0524
## Education             -5.0212  -5.0212
## Household Production   0.3306   0.3306
## Leisure                0.5687   0.5687
## Personal Care         -1.8716  -1.8716
## Private personal care -0.6843  -0.6843
## Travel                 0.0079   0.0079
## Volunteering          -3.5949  -3.5949
## Waiting               -3.6080  -3.6080
## Work Production       -3.7154  -3.7154

Where the EREM is particularly interesting is in being able to flexibly fit models for complex sequences of events defined using s-forms (sequence forms). These can be though of as events defined by patterns in two or more events. The patterns can be very simple or arbitrarily complex. The example here is for a relatively simple s-form involving repetition of an event type (e.g., Eating followed by Eating). The code here sets up the s-form and also shows how the s-forms are represented in R (see Marcum and Butts’ paper for a more detailed explanation).

a1 <- paste(evls$event.key[-9, 1], evls$event.key[-9, 1], sep = "")
beta.sforms <- gen.sformlist(evls, a1)
beta.sforms[[1]][1, , ]
##   aa bb cc dd ee ff gg hh jj kk ll mm nn
## a  0  0  0  0  0  0  0  0  0  0  0  0  0
## b  0  0  0  0  0  0  0  0  0  0  0  0  0
## c  0  0  0  0  0  0  0  0  0  0  0  0  0
## d  0  0  0  0  0  0  0  0  0  0  0  0  0
## e  0  0  0  0  0  0  0  0  0  0  0  0  0
## f  0  0  0  0  0  0  0  0  0  0  0  0  0
## g  0  0  0  0  0  0  0  0  0  0  0  0  0
## h  0  0  0  0  0  0  0  0  0  0  0  0  0
## j  0  0  0  0  0  0  0  0  0  0  0  0  0
## k  0  0  0  0  0  0  0  0  0  0  0  0  0
## l  0  0  0  0  0  0  0  0  0  0  0  0  0
## m  0  0  0  0  0  0  0  0  0  0  0  0  0
## n  0  0  0  0  0  0  0  0  0  0  0  0  0
evls$eventlist$"20030101033341"
## [1] 1 2 2 7 1
## attr(,"char")
## [1] "a" "b" "b" "g" "a"
sapply(attr(evls$eventlist$"20030101033341", "char")[1:3], sf2nms, event.key = evls$event.key)
##          a          b          b 
## "Sleeping"   "Eating"   "Eating"
beta.sforms$"20030101033341"[1:4, , c("aa", "bb")]
## , , aa
## 
##   a b c d e f g h j k l m n
## 1 0 0 0 0 0 0 0 0 0 0 0 0 0
## 2 1 0 0 0 0 0 0 0 0 0 0 0 0
## 3 0 0 0 0 0 0 0 0 0 0 0 0 0
## 4 0 0 0 0 0 0 0 0 0 0 0 0 0
## 
## , , bb
## 
##   a b c d e f g h j k l m n
## 1 0 0 0 0 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0 0 0 0 0
## 3 0 1 0 0 0 0 0 0 0 0 0 0 0
## 4 0 1 0 0 0 0 0 0 0 0 0 0 0

We need to combine the sufficient statistics for the intercepts and the s-forms and then we can fit the model. (Here the code just reports the new coefficients for the s-forms. You can get the full model from “summary(beta.fit)”. Note that I changed the code slightly from that in the paper where beta.fit just extracted the coefficients rather than the full model. This suggests that some activities are very unlikely to be repeated (e.g., Eating) whereas for others it is relatively more likely (e.g., Caregiving or Work production).

beta.ints <- slbind(beta.sforms, alpha.ints, new.names = TRUE, event.key = evls$event.key)
beta.fit <- rem(evls$eventlist, beta.ints, estimator = "BPM", prior.param = list(mu = 0, 
  sigma = 100, nu = 4))
round(cbind(beta.fit$coef[13:25]), 4)
##                                               [,1]
## SleepingSleeping                           -1.0863
## EatingEating                               -3.8336
## Private personal carePrivate personal care -2.5093
## Household ProductionHousehold Production   -0.0121
## TravelTravel                               -0.3470
## CommunicationCommunication                  0.4067
## LeisureLeisure                             -0.1748
## Personal CarePersonal Care                  0.3683
## WaitingWaiting                              1.6504
## VolunteeringVolunteering                    3.2567
## CaregivingCaregiving                        3.2422
## EducationEducation                         -4.4263
## Work ProductionWork Production              2.9645

The coefficients are log hazards - the log of the hazard ratio for two events. For instance the communication log hazard is -0.8062075 and exp(-0.8062075) gives a hazard ratio of 0.45 or so relative to the reference event (Sleeping). For s-forms we can get the conditional probability of an event from the total hazard of that event (the baseline times the s-form), divided by the total hazard of all events (the communication hazard plus the baseline hazards for all other events). Here it is computed for the communication -> communication digram (our repeated event s-form). This reveals that the chance of a communication event following another communication event is about 5%.

comComHaz <- exp(sum(beta.fit$coef[c(5, 18)]))
allOthHaz <- exp(beta.fit$coef[(1:12)[-5]])
comComHaz/sum(comComHaz + allOthHaz)
## [1] 0.05074562

Although not very exciting its often a good idea to include s-forms for repeated events if they can occur in your data, as this will improve the overall fit of the model. Its is usually more interesting to set up s-forms to address specific research questions. The next section of code sets up a complex s-form to see if sleep is interrupted by caregiving for another person or by personal care. The first segment of code fits these, while the second segment adds an s-form for other sleep interruptions as a control. The final output is a sumamry of the BPM model coefficients for the three s-forms and their z statistics. (I’m not focusing on inference here, but in principle you can get likelihood ratio tests, z tests and p values from the “MLE” option. Models can be compared using AIC or BIC and the Bayesian approaches give posterior probability (credibility) intervals and a Bayesian analog of a p value. This suggests that sleep interruptions generally a very rare except for those for personal care.

a2 <- c("a(c|h)a", "al+a")
gamma.sforms <- gen.sformlist(evls, a2)
## Note: 
##       a(c|h)a S-form(s) contains special regex syntax. 
##       Using slow search methods. 
##  Note: 
##       al+a S-form(s) contains special regex syntax. 
##       Using slow search methods.
gamma.ints <- slbind(gamma.sforms, beta.ints, new.names = TRUE, event.key = evls$event.key)
gamma.fit <- rem(evls$eventlist, gamma.ints, estimator = "BPM", prior.param = list(mu = 0, 
  sigma = 100, nu = 4))
summary(gamma.fit)
## Egocentric Relational Event Model (Ordinal Likelihood)
## 
##                                                       Post.Mode   Post.SD
## Eating                                                 0.100907  0.016147
## Private personal care                                 -0.787081  0.019822
## Household Production                                   0.339104  0.015493
## Travel                                                 0.052881  0.016301
## Communication                                         -0.746241  0.020027
## Leisure                                                0.616306  0.014958
## Personal Care                                         -2.036929  0.030837
## Waiting                                               -3.621061  0.068077
## Volunteering                                          -3.672055  0.069781
## Caregiving                                            -3.432682  0.064196
## Education                                             -5.019067  0.135321
## Work Production                                       -3.766753  0.073064
## SleepingSleeping                                      -0.848855  0.066511
## EatingEating                                          -3.858526  0.189726
## Private personal carePrivate personal care            -2.273460  0.201444
## Household ProductionHousehold Production              -0.040021  0.027019
## TravelTravel                                          -0.371463  0.038617
## CommunicationCommunication                             0.386483  0.058181
## LeisureLeisure                                        -0.208772  0.023220
## Personal CarePersonal Care                             0.556270  0.171652
## WaitingWaiting                                         1.632468  0.508972
## VolunteeringVolunteering                               3.238798  0.255101
## CaregivingCaregiving                                   3.214235  0.211677
## EducationEducation                                    -4.440779 38.358737
## Work ProductionWork Production                         2.946673  0.318480
## Sleeping(Private personal care|Personal Care)Sleeping  1.215852  0.031546
## SleepingCaregiving+Sleeping                           -0.370198  0.267511
##                                                        Z value  Pr(>|z|)    
## Eating                                                  6.2495 4.118e-10 ***
## Private personal care                                 -39.7081 < 2.2e-16 ***
## Household Production                                   21.8873 < 2.2e-16 ***
## Travel                                                  3.2440  0.001179 ** 
## Communication                                         -37.2617 < 2.2e-16 ***
## Leisure                                                41.2014 < 2.2e-16 ***
## Personal Care                                         -66.0543 < 2.2e-16 ***
## Waiting                                               -53.1903 < 2.2e-16 ***
## Volunteering                                          -52.6227 < 2.2e-16 ***
## Caregiving                                            -53.4723 < 2.2e-16 ***
## Education                                             -37.0901 < 2.2e-16 ***
## Work Production                                       -51.5542 < 2.2e-16 ***
## SleepingSleeping                                      -12.7626 < 2.2e-16 ***
## EatingEating                                          -20.3373 < 2.2e-16 ***
## Private personal carePrivate personal care            -11.2858 < 2.2e-16 ***
## Household ProductionHousehold Production               -1.4812  0.138548    
## TravelTravel                                           -9.6192 < 2.2e-16 ***
## CommunicationCommunication                              6.6428 3.078e-11 ***
## LeisureLeisure                                         -8.9909 < 2.2e-16 ***
## Personal CarePersonal Care                              3.2407  0.001192 ** 
## WaitingWaiting                                          3.2074  0.001339 ** 
## VolunteeringVolunteering                               12.6961 < 2.2e-16 ***
## CaregivingCaregiving                                   15.1846 < 2.2e-16 ***
## EducationEducation                                     -0.1158  0.907835    
## Work ProductionWork Production                          9.2523 < 2.2e-16 ***
## Sleeping(Private personal care|Personal Care)Sleeping  38.5421 < 2.2e-16 ***
## SleepingCaregiving+Sleeping                            -1.3839  0.166401    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Null deviance: 315678.6 on 61537 degrees of freedom
## Residual deviance: 241397.7 on 61510 degrees of freedom
##  Chi-square: 74280.92 on 27 degrees of freedom, asymptotic p-value 0 
## AIC: 241451.7 AICC: 241451.7 BIC: 241695.4 
## Log posterior: -123425.3 
## Prior parameters: mu=0 sigma=100 nu=4
sleep.sfs <- paste("a", letters[2:14], "a", sep = "")
sleep.sforms <- glb.sformlist(evls, sforms = list(sleep.sfs), new.names = "InterSleep")
sleep.ints <- slbind(sleep.sforms, gamma.ints)
sleep.fit <- rem(evls$eventlist, sleep.ints, estimator = "BPM", prior.param = list(mu = 0, 
  sigma = 100, nu = 4))
round(cbind(estimate= sleep.fit$coef, z = sleep.fit$coef/sleep.fit$sd)[26:28, ], 4)
##                                                       estimate        z
## Sleeping(Private personal care|Personal Care)Sleeping   1.5334  44.7723
## SleepingCaregiving+Sleeping                            -0.2575  -0.9647
## InterSleep                                             -1.2908 -22.2002

Marcum and Butts provide a much more thorough discussion of how to set up s-forms (and I’ve skipped some of that here). However, its useful to show one further segment for the ordinal data. This includes an interaction with a the covariate (sex) to see if personal care interruptions to sleep are more common for older men than older women (e.g., because of prostate problems).

sl.ind <- 26:27
fem.evs <- unlist(glapply(atus80ord$SEX, atus80ord$TUCASEID, unique, regroup = FALSE))
fem.evs <- ifelse(fem.evs == 1, 1, 0)
gamma.ints2 <- slbind.cond(fem.evs, gamma.ints, sl.ind = sl.ind, var.suffix = "FEM")
sl.ind <- 26:29
gamma.fit2 <- rem(evls$eventlist, gamma.ints2, estimator = "BPM", prior.param = list(mu = 0, 
  sigma = 100, nu = 4))
round(cbind(BPM = gamma.fit2$coef[sl.ind], Z = gamma.fit2$coef[sl.ind]/gamma.fit2$sd[sl.ind]), 
      4)
##                                                               BPM       Z
## Sleeping(Private personal care|Personal Care)Sleeping      1.2195 32.9940
## SleepingCaregiving+Sleeping                               -0.5895 -1.6327
## Sleeping(Private personal care|Personal Care)Sleeping.FEM -0.0120 -0.1929
## SleepingCaregiving+Sleeping.FEM                            0.5522  1.0512

Interval data

As the interval timing version of the model takes a while to fit the example in Marcum and Butts just uses a random subset of 500 participants (all aged 80 or over). The random seed is set to ensure output matches what they get in the paper. (Note: I’ve altered the code slightly to get the seed to match older versions of R). In a real application you would nearly always want to use the full data set. The struture of the data is very similar to that of the ordinal version except that each event includes zero-referenced timing information (each person’s record starts at 0) and events are split into a onset/termination pairs denoted by “|START” or “|STOP” suffixes. Importantly times can’t overlap so it may be necessary to add a small constant to the onset event. Here this has already been done. The data are in minutes so each onset after the first has had 1/1000 of a minute added to it. This is not sufficient to distort things, but you might want to use a smaller constant for some data sets.

evlsint <- gen.evl(atus80int[, 1:3])
RNGkind(sample.kind = "Rounding")
## Warning in RNGkind(sample.kind = "Rounding"): non-uniform 'Rounding' sampler
## used
set.seed(570819)
evlsint$eventlist <- evlsint$eventlist[sample(1:length(evlsint$eventlist), 500)]
evlsint$event.key
##       id  event.type                   
##  [1,] "a" "Sleeping|START"             
##  [2,] "b" "Sleeping|STOP"              
##  [3,] "c" "Eating|START"               
##  [4,] "d" "Eating|STOP"                
##  [5,] "e" "Private personal care|START"
##  [6,] "f" "Private personal care|STOP" 
##  [7,] "g" "Household Production|START" 
##  [8,] "h" "Household Production|STOP"  
##  [9,] "i" "Travel|START"               
## [10,] "j" "Travel|STOP"                
## [11,] "k" "Communication|START"        
## [12,] "l" "Communication|STOP"         
## [13,] "m" "Leisure|START"              
## [14,] "n" "Leisure|STOP"               
## [15,] "o" "Personal Care|START"        
## [16,] "p" "Personal Care|STOP"         
## [17,] "q" "NA|START"                   
## [18,] "r" "NA|STOP"                    
## [19,] "s" "Waiting|START"              
## [20,] "t" "Waiting|STOP"               
## [21,] "u" "Volunteering|START"         
## [22,] "v" "Volunteering|STOP"          
## [23,] "w" "Caregiving|START"           
## [24,] "x" "Caregiving|STOP"            
## [25,] "y" "Education|START"            
## [26,] "z" "Education|STOP"             
## [27,] "A" "Work Production|START"      
## [28,] "B" "Work Production|STOP"
evlsint$eventlist$"20040706041558"
##       [,1]     [,2]
##  [1,]    1    0.000
##  [2,]    2  240.000
##  [3,]    3  240.001
##  [4,]    4  300.000
##  [5,]    7  300.001
##  [6,]    8  310.000
##  [7,]   13  310.001
##  [8,]   14  340.000
##  [9,]   13  340.001
## [10,]   14  450.000
## [11,]    3  450.001
## [12,]    4  510.000
## [13,]    1  510.001
## [14,]    2  570.000
## [15,]   13  570.001
## [16,]   14  870.000
## [17,]   13  870.001
## [18,]   14 1020.000
## [19,]    1 1020.001
## [20,]    2 1650.001
## attr(,"char")
##  [1] "a" "b" "c" "d" "g" "h" "m" "n" "m" "n" "c" "d" "a" "b" "m" "n" "m" "n" "a"
## [20] "b"

We then need to set up the list of statistics (statslist) that defines the coefficients for the intercept only version of the model. For interval data we want to set the contrasts to FALSE so that we don’t have a reference event category (similar to a cell means type parameterization in an ANOVA model if you are familiar with that). This means we’ll be estimating a separate parameter for each event type.

int.base <- gen.intercepts(evlsint, type = 1, contr = FALSE)

The next bit is somewhat tricky. For an interval timing model you need to constrain the parameters that are estimated. This is is sometimes necessary for other models but for interval timing there is always the constraint that a “STOP” event has to follow a “START” event of the same type. Constraints are specified in a supplementary list (supplist) that has to be set up carefully. The following code creates such a list for the interval data and can fairly easily be adapted for other data sets. I’ve added a call to display the first element of the supplist so one can see its structure (basically 1 for events that are going to be estimated and zero otherwise).

eT <- evlsint$event.key[, 2]
supplist <- list()
for (i in 1:length(evlsint$eventlist)) {
  supplist[[i]] <- matrix(0, nrow = NROW(evlsint$eventlist[[i]]), ncol = length(eT))
  colnames(supplist[[i]]) <- eT
}
for (i in 1:length(evlsint$eventlist)) {
  evl <- evlsint$eventlist[[i]]
  supplist[[i]][1, grep("START", eT)] <- 1
  for (j in 2:(NROW(evl))) {
    CE <- evl[j, 1]
    PE <- evl[j - 1, 1]
    if (PE == 0) {
      supplist[[i]][j, grep("STOP", eT)] <- 1
    }
    if (PE != 0) {
      if (grepl("START", eT[PE])) {
        supplist[[i]][j, eT[CE]] <- 1
      }
      if (grepl("STOP", eT[PE])) {
        supplist[[i]][j, grepl("START", eT)] <- 1
      }
    }
  }
}
supplist[1]
## [[1]]
##       Sleeping|START Sleeping|STOP Eating|START Eating|STOP
##  [1,]              1             0            1           0
##  [2,]              0             1            0           0
##  [3,]              1             0            1           0
##  [4,]              0             0            0           0
##  [5,]              1             0            1           0
##  [6,]              0             0            0           0
##  [7,]              1             0            1           0
##  [8,]              0             0            0           1
##  [9,]              1             0            1           0
## [10,]              0             0            0           0
## [11,]              1             0            1           0
## [12,]              0             0            0           0
## [13,]              1             0            1           0
## [14,]              0             0            0           0
## [15,]              1             0            1           0
## [16,]              0             0            0           0
## [17,]              1             0            1           0
## [18,]              0             0            0           1
## [19,]              1             0            1           0
## [20,]              0             0            0           0
## [21,]              1             0            1           0
## [22,]              0             1            0           0
##       Private personal care|START Private personal care|STOP
##  [1,]                           1                          0
##  [2,]                           0                          0
##  [3,]                           1                          0
##  [4,]                           0                          0
##  [5,]                           1                          0
##  [6,]                           0                          0
##  [7,]                           1                          0
##  [8,]                           0                          0
##  [9,]                           1                          0
## [10,]                           0                          0
## [11,]                           1                          0
## [12,]                           0                          0
## [13,]                           1                          0
## [14,]                           0                          0
## [15,]                           1                          0
## [16,]                           0                          0
## [17,]                           1                          0
## [18,]                           0                          0
## [19,]                           1                          0
## [20,]                           0                          0
## [21,]                           1                          0
## [22,]                           0                          0
##       Household Production|START Household Production|STOP Travel|START
##  [1,]                          1                         0            1
##  [2,]                          0                         0            0
##  [3,]                          1                         0            1
##  [4,]                          0                         0            0
##  [5,]                          1                         0            1
##  [6,]                          0                         0            0
##  [7,]                          1                         0            1
##  [8,]                          0                         0            0
##  [9,]                          1                         0            1
## [10,]                          0                         0            0
## [11,]                          1                         0            1
## [12,]                          0                         0            0
## [13,]                          1                         0            1
## [14,]                          0                         0            0
## [15,]                          1                         0            1
## [16,]                          0                         1            0
## [17,]                          1                         0            1
## [18,]                          0                         0            0
## [19,]                          1                         0            1
## [20,]                          0                         0            0
## [21,]                          1                         0            1
## [22,]                          0                         0            0
##       Travel|STOP Communication|START Communication|STOP Leisure|START
##  [1,]           0                   1                  0             1
##  [2,]           0                   0                  0             0
##  [3,]           0                   1                  0             1
##  [4,]           0                   0                  0             0
##  [5,]           0                   1                  0             1
##  [6,]           1                   0                  0             0
##  [7,]           0                   1                  0             1
##  [8,]           0                   0                  0             0
##  [9,]           0                   1                  0             1
## [10,]           0                   0                  1             0
## [11,]           0                   1                  0             1
## [12,]           1                   0                  0             0
## [13,]           0                   1                  0             1
## [14,]           0                   0                  0             0
## [15,]           0                   1                  0             1
## [16,]           0                   0                  0             0
## [17,]           0                   1                  0             1
## [18,]           0                   0                  0             0
## [19,]           0                   1                  0             1
## [20,]           0                   0                  0             0
## [21,]           0                   1                  0             1
## [22,]           0                   0                  0             0
##       Leisure|STOP Personal Care|START Personal Care|STOP NA|START NA|STOP
##  [1,]            0                   1                  0        1       0
##  [2,]            0                   0                  0        0       0
##  [3,]            0                   1                  0        1       0
##  [4,]            1                   0                  0        0       0
##  [5,]            0                   1                  0        1       0
##  [6,]            0                   0                  0        0       0
##  [7,]            0                   1                  0        1       0
##  [8,]            0                   0                  0        0       0
##  [9,]            0                   1                  0        1       0
## [10,]            0                   0                  0        0       0
## [11,]            0                   1                  0        1       0
## [12,]            0                   0                  0        0       0
## [13,]            0                   1                  0        1       0
## [14,]            1                   0                  0        0       0
## [15,]            0                   1                  0        1       0
## [16,]            0                   0                  0        0       0
## [17,]            0                   1                  0        1       0
## [18,]            0                   0                  0        0       0
## [19,]            0                   1                  0        1       0
## [20,]            1                   0                  0        0       0
## [21,]            0                   1                  0        1       0
## [22,]            0                   0                  0        0       0
##       Waiting|START Waiting|STOP Volunteering|START Volunteering|STOP
##  [1,]             1            0                  1                 0
##  [2,]             0            0                  0                 0
##  [3,]             1            0                  1                 0
##  [4,]             0            0                  0                 0
##  [5,]             1            0                  1                 0
##  [6,]             0            0                  0                 0
##  [7,]             1            0                  1                 0
##  [8,]             0            0                  0                 0
##  [9,]             1            0                  1                 0
## [10,]             0            0                  0                 0
## [11,]             1            0                  1                 0
## [12,]             0            0                  0                 0
## [13,]             1            0                  1                 0
## [14,]             0            0                  0                 0
## [15,]             1            0                  1                 0
## [16,]             0            0                  0                 0
## [17,]             1            0                  1                 0
## [18,]             0            0                  0                 0
## [19,]             1            0                  1                 0
## [20,]             0            0                  0                 0
## [21,]             1            0                  1                 0
## [22,]             0            0                  0                 0
##       Caregiving|START Caregiving|STOP Education|START Education|STOP
##  [1,]                1               0               1              0
##  [2,]                0               0               0              0
##  [3,]                1               0               1              0
##  [4,]                0               0               0              0
##  [5,]                1               0               1              0
##  [6,]                0               0               0              0
##  [7,]                1               0               1              0
##  [8,]                0               0               0              0
##  [9,]                1               0               1              0
## [10,]                0               0               0              0
## [11,]                1               0               1              0
## [12,]                0               0               0              0
## [13,]                1               0               1              0
## [14,]                0               0               0              0
## [15,]                1               0               1              0
## [16,]                0               0               0              0
## [17,]                1               0               1              0
## [18,]                0               0               0              0
## [19,]                1               0               1              0
## [20,]                0               0               0              0
## [21,]                1               0               1              0
## [22,]                0               0               0              0
##       Work Production|START Work Production|STOP
##  [1,]                     1                    0
##  [2,]                     0                    0
##  [3,]                     1                    0
##  [4,]                     0                    0
##  [5,]                     1                    0
##  [6,]                     0                    0
##  [7,]                     1                    0
##  [8,]                     0                    0
##  [9,]                     1                    0
## [10,]                     0                    0
## [11,]                     1                    0
## [12,]                     0                    0
## [13,]                     1                    0
## [14,]                     0                    0
## [15,]                     1                    0
## [16,]                     0                    0
## [17,]                     1                    0
## [18,]                     0                    0
## [19,]                     1                    0
## [20,]                     0                    0
## [21,]                     1                    0
## [22,]                     0                    0

We can now estimate the model (again using BPM for convenience). This gives similar output for the ordinal model except that we now have parameters for the onset “START” events and the termination “STOP” events. The former are log hazards for the frequency of an event occurring and the latter are parameters estimating the duration of an event. The latter have an inverse waiting time distribution. Note that the z and p values here for the intercepts are not that interesting as they test the hypothesis that estimate is different from zero. To test differences in intercepts one could use the interval estimates (though this is a bit fiddly as the SE for the difference is approximately 1.414 times the SE for the coefficient assuming the two SEs are equal) or refit the model combing the two categories and comparing Chi-square (or AIC or BIC).

Because of the way we’ve parameterized the interval model its is pretty simple to get the probability of an event. The hazard for an event is given by the exponential of its coefficient and the probability by its hazard divided by the sum of all hazards. So for sleeping it is exp(4.924765) which is 137.657 divided by 1057.597 giving a probability of .1301602 or around 13%. So around 13% of events are sleeping events.

For the average duration of an event (strictly the mode or mean depending on estimation method) you need the inverse of the exponential of the coefficient. So for sleeping 1/(exp(-5.690678)) = 296.0943. This is about 4.9 hours (which seems reasonable for a sample of participants aged 80+).

The intercept-only version of the interval model is therefore very useful in simultaneously estimating both the probabilty of an event and its mean duration. The model also gives us standard errors in the form of the posterior SD of the estimate (a standard error is also the SD of a parameter estimate, and its the latter terminology that is preferred in a Bayesian context). This makes it realtively easy to construct graphs with error bars for the frequency or duration of different events. (I’ll give an example of how to do this in a later publication).

intfit.base <- rem(evlsint$eventlist, int.base, supplist = supplist, timing = "interval", 
  estimator = "BPM")
summary(intfit.base)
## Egocentric Relational Event Model (Interval Likelihood)
## 
##                             Post.Mode   Post.SD   Z value  Pr(>|z|)    
## Sleeping|START               4.967864  0.028421  174.7954 < 2.2e-16 ***
## Sleeping|STOP               -5.690678  0.028421 -200.2278 < 2.2e-16 ***
## Eating|START                 4.900197  0.029399  166.6787 < 2.2e-16 ***
## Eating|STOP                 -3.589547  0.029399 -122.0974 < 2.2e-16 ***
## Private personal care|START  4.282762  0.040032  106.9833 < 2.2e-16 ***
## Private personal care|STOP  -3.497837  0.040032  -87.3759 < 2.2e-16 ***
## Household Production|START   5.272565  0.024405  216.0465 < 2.2e-16 ***
## Household Production|STOP   -3.760817  0.024405 -154.1017 < 2.2e-16 ***
## Travel|START                 4.934185  0.028904  170.7114 < 2.2e-16 ***
## Travel|STOP                 -2.790073  0.028904  -96.5301 < 2.2e-16 ***
## Communication|START          4.256786  0.040555  104.9624 < 2.2e-16 ***
## Communication|STOP          -3.899224  0.040555  -96.1457 < 2.2e-16 ***
## Leisure|START                5.509079  0.021683  254.0755 < 2.2e-16 ***
## Leisure|STOP                -4.653257  0.021683 -214.6055 < 2.2e-16 ***
## Personal Care|START          3.217249  0.068199   47.1741 < 2.2e-16 ***
## Personal Care|STOP          -3.998140  0.068199  -58.6242 < 2.2e-16 ***
## NA|START                     2.698641  0.088388   30.5316 < 2.2e-16 ***
## NA|STOP                     -4.242211  0.088388  -47.9951 < 2.2e-16 ***
## Waiting|START                1.343119  0.174078    7.7156 1.199e-14 ***
## Waiting|STOP                -3.856553  0.174078  -22.1542 < 2.2e-16 ***
## Volunteering|START           1.213907  0.185695    6.5371 6.273e-11 ***
## Volunteering|STOP           -4.582844  0.185695  -24.6794 < 2.2e-16 ***
## Caregiving|START             1.430130  0.166667    8.5808 < 2.2e-16 ***
## Caregiving|STOP             -4.013357  0.166667  -24.0801 < 2.2e-16 ***
## Education|START             -0.207479  0.377964   -0.5489     0.583    
## Education|STOP              -4.302118  0.377964  -11.3823 < 2.2e-16 ***
## Work Production|START        1.430130  0.166667    8.5808 < 2.2e-16 ***
## Work Production|STOP        -4.863244  0.166667  -29.1795 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Null deviance: 174603.4 on 18228 degrees of freedom
## Residual deviance: 21536.87 on 18200 degrees of freedom
##  Chi-square: 153066.6 on 28 degrees of freedom, asymptotic p-value 0 
## AIC: 21592.87 AICC: 21592.96 BIC: 21811.57 
## Log posterior: -38795.9 
## Prior parameters: mu=0 sigma=1000 nu=4

The next segment of code just shows how the event probabilities from the model match those from the raw data (labeled MLE). This can be a useful check that the model is set up correctly.

bpm.start <- (exp(intfit.base$coef[grep("START", names(intfit.base$coef))])/sum(exp(intfit.base$coef[grep("START", 
  names(intfit.base$coef))])))
in.ids <- which(atus80int$TUCASEID %in% names(evlsint$eventlist) & grepl("START", 
  atus80int$Events))
mle.start <- prop.table(table(atus80int[in.ids, "Events"]))
round(cbind(BPM = sort(bpm.start), MLE = sort(mle.start)), 4)
##                                BPM    MLE
## Education|START             0.0008 0.0008
## Volunteering|START          0.0032 0.0032
## Waiting|START               0.0036 0.0036
## Caregiving|START            0.0039 0.0039
## Work Production|START       0.0039 0.0039
## NA|START                    0.0140 0.0140
## Personal Care|START         0.0236 0.0236
## Communication|START         0.0667 0.0667
## Private personal care|START 0.0685 0.0685
## Eating|START                0.1269 0.1269
## Travel|START                0.1313 0.1313
## Sleeping|START              0.1358 0.1358
## Household Production|START  0.1842 0.1842
## Leisure|START               0.2334 0.2334

The interval model can also incorporate s-forms - allowing us to consider both the impact on the probability and duration of a particular pattern. The following segment repeats the s-form from the ordinal model for seeing if sleep is interrupted by personal care or by any other activity. The main difference is that the s-form has to explicitly code for both onset and termination events. You can see how the sequences are definied in the sleepA and sleepB vectors (which code for the onset and terination of the s-form for interruption by non-personal care events).

Here the example calculates the conditional probability of a sleep interruption - which is around 5.7% (so rare). Note the importance of including the sleep coefficient in the calculation (as sleep interruptions can’t occur if you weren’t first asleep). An interruption reduces sleep duration by a factor of 1(exp(0.353266)) = 0.702 or about 30% (about ninety minutes). This reduction isn’t statistically significant (though one could reasonably argue that a one-sided test is appropriate here in which case the Bayesian p value approximation is around .03).

sleepA <- paste("ab", c(letters[seq(1, 26, 2)], "A"), c(letters[seq(2, 26, 2)], "B"), 
  "a", sep = "")
sleepB <- paste("ab", c(letters[seq(1, 26, 2)], "A"), c(letters[seq(2, 26, 2)], "B"), 
  "ab", sep = "")
sleep.glbs <- glb.sformlist(evlsint, list(sleepA, sleepB, c("abefa", "abopa"), c("abefab", 
  "abopab")), cond = TRUE, new.names = c("Inter.Sl.Ons", "Inter.Sl.Dur", "PerCare.Sl.Ons", 
  "PerCare.Sl.Dur"))
sleep.int <- slbind(sleep.glbs, int.base)
intfit.sleep <- rem(evlsint$eventlist, sleep.int, supplist = supplist, timing = "interval", 
  estimator = "BPM")
summary(intfit.sleep)
## Egocentric Relational Event Model (Interval Likelihood)
## 
##                             Post.Mode   Post.SD   Z value  Pr(>|z|)    
## Sleeping|START               4.989059  0.029361  169.9212 < 2.2e-16 ***
## Sleeping|STOP               -5.707896  0.029248 -195.1566 < 2.2e-16 ***
## Eating|START                 4.900197  0.029399  166.6787 < 2.2e-16 ***
## Eating|STOP                 -3.589547  0.029399 -122.0974 < 2.2e-16 ***
## Private personal care|START  4.282762  0.040032  106.9833 < 2.2e-16 ***
## Private personal care|STOP  -3.497837  0.040032  -87.3759 < 2.2e-16 ***
## Household Production|START   5.272565  0.024405  216.0465 < 2.2e-16 ***
## Household Production|STOP   -3.760817  0.024405 -154.1017 < 2.2e-16 ***
## Travel|START                 4.934185  0.028904  170.7114 < 2.2e-16 ***
## Travel|STOP                 -2.790073  0.028904  -96.5301 < 2.2e-16 ***
## Communication|START          4.256786  0.040555  104.9624 < 2.2e-16 ***
## Communication|STOP          -3.899224  0.040555  -96.1457 < 2.2e-16 ***
## Leisure|START                5.509079  0.021683  254.0755 < 2.2e-16 ***
## Leisure|STOP                -4.653257  0.021683 -214.6055 < 2.2e-16 ***
## Personal Care|START          3.217249  0.068199   47.1741 < 2.2e-16 ***
## Personal Care|STOP          -3.998140  0.068199  -58.6242 < 2.2e-16 ***
## NA|START                     2.698641  0.088388   30.5316 < 2.2e-16 ***
## NA|STOP                     -4.242211  0.088388  -47.9951 < 2.2e-16 ***
## Waiting|START                1.343119  0.174078    7.7156 1.199e-14 ***
## Waiting|STOP                -3.856553  0.174078  -22.1542 < 2.2e-16 ***
## Volunteering|START           1.213907  0.185695    6.5371 6.273e-11 ***
## Volunteering|STOP           -4.582844  0.185695  -24.6794 < 2.2e-16 ***
## Caregiving|START             1.430130  0.166667    8.5808 < 2.2e-16 ***
## Caregiving|STOP             -4.013357  0.166667  -24.0801 < 2.2e-16 ***
## Education|START             -0.207479  0.377964   -0.5489   0.58305    
## Education|STOP              -4.302118  0.377964  -11.3823 < 2.2e-16 ***
## Work Production|START        1.430130  0.166667    8.5808 < 2.2e-16 ***
## Work Production|STOP        -4.863244  0.166667  -29.1795 < 2.2e-16 ***
## Inter.Sl.Ons                -0.890707  0.184920   -4.8167 1.459e-06 ***
## Inter.Sl.Dur                 0.353266  0.187985    1.8792   0.06021 .  
## PerCare.Sl.Ons               1.314628  0.232737    5.6485 1.618e-08 ***
## PerCare.Sl.Dur               0.021551  0.243891    0.0884   0.92959    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Null deviance: 174603.4 on 18228 degrees of freedom
## Residual deviance: 21489.3 on 18196 degrees of freedom
##  Chi-square: 153114.1 on 32 degrees of freedom, asymptotic p-value 0 
## AIC: 21553.3 AICC: 21553.41 BIC: 21803.24 
## Log posterior: -42776.04 
## Prior parameters: mu=0 sigma=1000 nu=4
exp(-0.890707 + 4.989059)/sum(exp(intfit.sleep$coef[grep("START", names(intfit.sleep$coef))]))
## [1] 0.05677088

The original paper includes an example for dyadic data (in an appendix) which I’m not going over here. In a future post I plan to go over applying the EREM approach to focal follow data based on my paper with Kate Ellis-Davies, Sheina Lew-Levy, Eleanor Fleming and Adam H. Boyette. Ellis-Davies et al., (2021)