Temporal Exponential Random Graph Models (TERGMs) for dynamic network modeling in statnet

Last updated: 2021-06-27

master_lib <- "~/R/GHmaster-library"

# Put master_lib first on the library search path
defaultPaths <- .libPaths()
.libPaths(c(master_lib, defaultPaths))
.libPaths()
[1] "C:/Users/Martina Morris/Documents/R/GHmaster-library"
[2] "C:/Users/Martina Morris/Documents/R/win-library/4.0" 
[3] "C:/Program Files/R/R-4.0.5/library"                  
library(tergm, lib.loc=master_lib)
# statnet packages
install.packages('tergm')
install.packages('tsna')
install.packages('ndtv')

# other packages to enhance graphical output
install.packages('htmlwidgets')
install.packages('latticeExtra')
library(tergm)
library(tsna)
library(ndtv)
library(htmlwidgets)
library(latticeExtra)
sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] latticeExtra_0.6-29      lattice_0.20-44          htmlwidgets_1.5.3       
 [4] ndtv_0.13.0              sna_2.6                  statnet.common_4.5.0-362
 [7] animation_2.6            tsna_0.3.3               tergm_4.0-2301          
[10] networkDynamic_0.11.0    ergm_4.0-6512            network_1.17.1-685      
[13] knitr_1.33              

loaded via a namespace (and not attached):
 [1] xfun_0.23               bslib_0.2.5.1           purrr_0.3.4            
 [4] rle_0.9.2-234           vctrs_0.3.8             htmltools_0.5.1.1      
 [7] yaml_2.2.1              utf8_1.2.1              rlang_0.4.11           
[10] jquerylib_0.1.4         pillar_1.6.1            RColorBrewer_1.1-2     
[13] jpeg_0.1-8.1            trust_0.1-8             lifecycle_1.0.0        
[16] robustbase_0.93-8       stringr_1.4.0           lpSolveAPI_5.5.2.0-17.7
[19] coda_0.19-4             memoise_2.0.0           evaluate_0.14          
[22] fastmap_1.1.0           parallel_4.0.5          fansi_0.5.0            
[25] DEoptimR_1.0-9          openssl_1.4.4           cachem_1.0.5           
[28] base64_2.0              jsonlite_1.7.2          png_0.1-7              
[31] askpass_1.1             digest_0.6.27           stringi_1.6.2          
[34] grid_4.0.5              tools_4.0.5             magrittr_2.0.1         
[37] sass_0.4.0              tibble_3.1.2            crayon_1.4.1           
[40] pkgconfig_2.0.3         MASS_7.3-54             ellipsis_0.3.2         
[43] Matrix_1.3-4            rmarkdown_2.8           R6_2.5.0               
[46] nlme_3.1-152            compiler_4.0.5         
set.seed(1)
data(samplk)
ls()
[1] "defaultPaths" "master_lib"   "samplk1"      "samplk2"      "samplk3"     
samplist <- list(samplk1,samplk2,samplk3)
sampdyn <- networkDynamic(network.list = samplist)
Neither start or onsets specified, assuming start=0
Onsets and termini not specified, assuming each network in network.list should have a discrete spell of length 1
Argument base.net not specified, using first element of network.list instead
Created net.obs.period to describe network
 Network observation period info:
  Number of observation spells: 1 
  Maximal time range observed: 0 until 3 
  Temporal mode: discrete 
  Time unit: step 
  Suggested time increment: 1 
sampdyn
NetworkDynamic properties:
  distinct change times: 4 
  maximal time range: 0 until  3 

Includes optional net.obs.period attribute:
 Network observation period info:
  Number of observation spells: 1 
  Maximal time range observed: 0 until 3 
  Temporal mode: discrete 
  Time unit: step 
  Suggested time increment: 1 

 Network attributes:
  vertices = 18 
  directed = TRUE 
  hyper = FALSE 
  loops = FALSE 
  multiple = FALSE 
  bipartite = FALSE 
  net.obs.period: (not shown)
  total edges= 88 
    missing edges= 0 
    non-missing edges= 88 

 Vertex attribute names: 
    active cloisterville group vertex.names 

 Edge attribute names: 
    active 
network.extract(sampdyn, at = 3) # empty
network.extract(sampdyn, at = 0) # the first network, as expected
# Equivalent statements
#network.extract(sampdyn, at = 0) # first network
#network.extract(sampdyn, onset = 0, terminus=1) # same thing
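The empty result at `at = 3` follows from how `networkDynamic` stores activity. Each spell is a half-open interval [onset, terminus), so the three panels occupy [0,1), [1,2) and [2,3), and time 3 falls outside all of them. A minimal base-R sketch of that rule, where the helper `is_active_at()` is hypothetical and not part of `networkDynamic`:

```r
# Hypothetical helper: TRUE when time t falls inside the half-open spell
# [onset, terminus). This mimics a point query, not the package internals.
is_active_at <- function(onset, terminus, t) {
  onset <= t & t < terminus
}

# The three observed panels, stored as unit-length spells
panels <- data.frame(panel = 1:3, onset = 0:2, terminus = 1:3)

# Which panel is active at each query time? NA means none.
active_panel <- sapply(0:3, function(t) {
  hit <- panels$panel[is_active_at(panels$onset, panels$terminus, t)]
  if (length(hit) == 0) NA_integer_ else hit
})
active_panel
```

Querying times 0, 1 and 2 returns panels 1, 2 and 3, while time 3 matches no spell.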
vignette("networkDynamic")
par(mfrow = c(2,2), oma=c(1,1,1,1), mar=c(4,1,1,1))
plot(network.extract(sampdyn, at = 0), main = "Time 1", 
     displaylabels = T, label.cex = 0.6, vertex.cex = 2, pad = 0.5)
plot(network.extract(sampdyn, at = 1), main = "Time 2", 
     displaylabels = T, label.cex = 0.6, vertex.cex = 2, pad = 0.5)
plot(network.extract(sampdyn, at = 2), main = "Time 3", 
     displaylabels = T, label.cex = 0.6, vertex.cex = 2, pad = 0.5)
plot(sampdyn, main = "Collapsed", 
     displaylabels = T, label.cex = 0.6, vertex.cex = 2, pad = 0.5)
vignette("tsna_vignette")
tSnaStats(sampdyn,"degree") # Changes in degree centrality
Time Series:
Start = 0 
End = 3 
Frequency = 1 
  John Bosco Gregory Basil Peter Bonaventure Berthold Mark Victor Ambrose
0         12      10     5     6           8        4    7      7       5
1         11      12     4     8          10        5    6      5       4
2          7       9     7     7           9        5    8      5       7
3         NA      NA    NA    NA          NA       NA   NA     NA      NA
  Romauld Louis Winfrid Amand Hugh Boniface Albert Elias Simplicius
0       4     5       4     5   10        4      5     4          5
1       4     5       7     5    6        6      5     5          6
2       4     5       9     5    5        5      4     5          6
3      NA    NA      NA    NA   NA       NA     NA    NA         NA
tErgmStats(sampdyn, "~ edges+triangle") # Notice the increase in triangles
Time Series:
Start = 0 
End = 3 
Frequency = 1 
  edges triangle
0    55       31
1    57       56
2    56       62
3     0        0
# who is v1?
get.vertex.attribute(sampdyn, "vertex.names")[1]
[1] "John Bosco"
# who is in v1's FRS?
tp <- tPath(sampdyn, v=1, direction = 'fwd')
print(tp)
$tdist
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

$previous
 [1]  0  3  1  5  1  4  2  7  6  4  5 14  5  1 14  7  3 13

$gsteps
 [1] 0 2 1 2 1 3 3 4 4 3 2 2 2 1 2 4 2 3

$start
[1] 0

$end
[1] Inf

$direction
[1] "fwd"

$type
[1] "earliest.arrive"

attr(,"class")
[1] "tPath" "list" 
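One way to read the `previous` and `gsteps` elements, assuming `previous` holds each vertex's predecessor on its earliest-arriving path and `gsteps` the number of hops: walk back from any vertex until the source (whose entry is 0) is reached. The sketch below uses the vectors printed above; `trace_path()` is a hypothetical helper, not a `tsna` function.

```r
# Vectors copied from the printed tPath result above
previous <- c(0, 3, 1, 5, 1, 4, 2, 7, 6, 4, 5, 14, 5, 1, 14, 7, 3, 13)
gsteps   <- c(0, 2, 1, 2, 1, 3, 3, 4, 4, 3, 2, 2, 2, 1, 2, 4, 2, 3)

# Hypothetical helper: walk back from `target` until reaching the source,
# whose `previous` entry is 0, and return the path in forward order.
trace_path <- function(target, previous) {
  path <- target
  while (previous[path[1]] != 0) {
    path <- c(previous[path[1]], path)
  }
  path
}

trace_path(8, previous)   # vertices visited on the way from v1 to v8

# Under the assumed reading, each path has gsteps + 1 vertices
hops <- sapply(seq_along(previous),
               function(v) length(trace_path(v, previous)) - 1)
all(hops == gsteps)
```

Every vertex is reachable in this example, so the loop always terminates; an unreachable vertex would need separate handling.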
par(mfrow=c(1,2))
coords <- plot(tp, main="Forward Reachable Set from v1", cex.main=.8)
plotPaths(sampdyn, tp, 
          coord = coords,
          main = "Overlaid on collapsed Network",
          label.cex=.8, cex.main=.8)
table(edgeDuration(sampdyn, mode = 'duration', subject = 'spells')) 

 1  2  3 
48 15 30 
# Note bimodality
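With `subject = 'spells'`, each activity spell contributes one duration, so an edge that dissolves and re-forms is counted more than once. A small base-R sketch of the same tabulation on a hypothetical spell table (made-up values, not the Sampson data):

```r
# Hypothetical spell list: one row per activity spell of an edge
spells <- data.frame(
  tail     = c(1, 1, 2, 3, 3),
  head     = c(2, 3, 3, 4, 4),
  onset    = c(0, 0, 1, 0, 2),
  terminus = c(3, 1, 3, 1, 3)
)

# Duration of a half-open spell [onset, terminus)
spells$duration <- spells$terminus - spells$onset

# Tabulate durations; the edge 3 -> 4 contributes two separate spells
dur_table <- table(spells$duration)
dur_table
```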
data(short.stergm.sim)
render.d3movie(short.stergm.sim, 
               plot.par=list(displaylabels=T))
render.d3movie(short.stergm.sim,
               plot.par=list(displaylabels=T),
               output.mode = 'htmlWidget') # using htmlwidgets package here
slice parameters:
  start:0
  end:25
  interval:1
  aggregate.dur:1
  rule:latest
proximity.timeline(short.stergm.sim,default.dist = 6,
          mode = 'sammon',labels.at = 17,vertex.cex = 4)
# ergm(my.network ~ edges + nodefactor('age') + gwesp(0, fixed=T))     #do not run this!

# stergm(my.network ~                               #do not run this!
#     Form(~ edges + gwesp(0, fixed=T)) +
#     Diss(~ edges + nodefactor('age')),
#     estimate =  `insert method`
# )
# samp.fit <- stergm(samplist,
#   formation =  ~edges+mutual+cyclicalties+transitiveties,
#   dissolution = ~edges+mutual+cyclicalties+transitiveties,
#   estimate = "CMLE",
#   times = c(1:3)
#   )

summary(NetSeries(samplist) ~
    Form(~edges+mutual+cyclicalties+transitiveties) +
    Diss(~edges+mutual+cyclicalties+transitiveties)
    )

summary(NetSeries(samplist) ~
    Cross(~edges+mutual+cyclicalties+transitiveties) +
    Change(~edges+mutual+cyclicalties+transitiveties)
    )

samp.fit2 <- tergm(samplist ~
    Cross(~edges+mutual+cyclicalties+transitiveties) +
    Change(~edges+mutual+cyclicalties+transitiveties),
  estimate = "CMLE",
  times = c(1:3)
    )

samp.fit <- tergm(samplist ~
    Form(~edges+mutual+cyclicalties+transitiveties) +
    Diss(~edges+mutual+cyclicalties+transitiveties),
  estimate = "CMLE",
  times = c(1:3)
    )
Fitting formation:  

Starting maximum likelihood estimation via MCMLE:
Iteration 1 of at most 20:  

 = = = = = = Lots of output snipped = = = = = = 

This model was fit using MCMC.  To examine model diagnostics and check for degeneracy, use the mcmc.diagnostics() function.
summary(samp.fit)
Call:
tergm(formula = samplist ~ Form(~edges + mutual + cyclicalties + 
    transitiveties) + Diss(~edges + mutual + cyclicalties + transitiveties), 
    estimate = "CMLE", times = c(1:3))

Monte Carlo Conditional Maximum Likelihood Results:

                    Estimate Std. Error MCMC % z value Pr(>|z|)    
Form~edges           -3.5025     0.3508      0  -9.984   <1e-04 ***
Form~mutual           2.0590     0.3872      0   5.317   <1e-04 ***
Form~cyclicalties    -0.1399     0.2092      0  -0.669   0.5036    
Form~transitiveties   0.4049     0.2640      0   1.534   0.1251    
Diss~edges           -0.1795     0.3001      0  -0.598   0.5498    
Diss~mutual          -0.8415     0.5156      0  -1.632   0.1027    
Diss~cyclicalties     0.2151     0.2591      0   0.830   0.4064    
Diss~transitiveties  -0.5163     0.2696      0  -1.915   0.0555 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 848.4  on 612  degrees of freedom
 Residual Deviance: 381.0  on 604  degrees of freedom

AIC: 397  BIC: 432.4  (Smaller is better. MC Std. Err. = 1.556)
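The fit statistics above can be checked by hand. The sketch below assumes the 612 degrees of freedom count directed dyads over both transitions (18 x 17 x 2) and that the null model gives every dyad a probability of one half; the variable names are illustrative.

```r
n_vertices    <- 18   # monks in the Sampson data
n_transitions <- 2    # time 1 -> 2 and time 2 -> 3
n_params      <- 8    # four formation + four dissolution coefficients

# Directed dyads per transition, summed over transitions
n_dyads <- n_vertices * (n_vertices - 1) * n_transitions

# Null model: every dyad state is a fair coin flip
null_deviance <- 2 * n_dyads * log(2)

# Information criteria from the reported residual deviance
resid_deviance <- 381.0
aic <- resid_deviance + 2 * n_params
bic <- resid_deviance + n_params * log(n_dyads)

round(c(dyads = n_dyads, null = null_deviance, AIC = aic, BIC = bic), 1)
```

The results agree with the printed summary up to rounding of the residual deviance.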
samp.fit.2 <- tergm(
  samplist ~
    Form(~edges+mutual+cyclicalties+transitiveties) +
    Diss(~edges+mutual+cyclicalties+transitiveties),
  estimate = "CMLE",
  times = c(1:2)
    )
summary(samp.fit.2)
Call:
tergm(formula = samplist ~ Form(~edges + mutual + cyclicalties + 
    transitiveties) + Diss(~edges + mutual + cyclicalties + transitiveties), 
    estimate = "CMLE", times = c(1:2))

Monte Carlo Conditional Maximum Likelihood Results:

                    Estimate Std. Error MCMC % z value Pr(>|z|)    
Form~edges           -3.4869     0.3380      0 -10.315   <1e-04 ***
Form~mutual           2.0255     0.4075      0   4.971   <1e-04 ***
Form~cyclicalties    -0.1386     0.2033      0  -0.682   0.4953    
Form~transitiveties   0.3992     0.2452      0   1.628   0.1036    
Diss~edges           -0.2296     0.3123      0  -0.735   0.4622    
Diss~mutual          -0.7594     0.5177      0  -1.467   0.1424    
Diss~cyclicalties     0.1921     0.2472      0   0.777   0.4372    
Diss~transitiveties  -0.5258     0.2922      0  -1.799   0.0719 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 848.4  on 612  degrees of freedom
 Residual Deviance: 379.1  on 604  degrees of freedom

AIC: 395.1  BIC: 430.4  (Smaller is better. MC Std. Err. = 1.755)
theta.diss <- log(9)
data(florentine)
X11()

# stergm.fit.1 <- stergm(flobusiness,
#   formation =  ~edges+gwesp(0,fixed = T),
#   dissolution = ~offset(edges),
#   targets = "formation",
#   offset.coef.diss = theta.diss,
#   estimate = "EGMME",
#   control = control.stergm(SA.plot.progress = TRUE)
#   )
startTime <- Sys.time()
tergm.fit.1 <- tergm(
  flobusiness ~ 
    Form(~ edges + gwesp(0, fixed=T)) + 
    Persist(~ offset(edges)),
  targets = "formation",
  offset.coef = log(9),
  estimate = "EGMME",
  control = control.tergm(SA.plot.progress=TRUE)
  )
stopTime <- Sys.time()
print(paste("Estimation time:", format(stopTime - startTime)))  # format() keeps the time units

dev.off()
mcmc.diagnostics(tergm.fit.1, which="plots") # only returns the plots
tergm.fit.1

Call:
tergm(formula = flobusiness ~ Form(~edges + gwesp(0, fixed = T)) + 
    Persist(~offset(edges)), offset.coef = log(9), estimate = "EGMME", 
    control = control.tergm(SA.plot.progress = TRUE), targets = "formation")

Last MCMC sample of size 2500 based on:
[1]  NULL

Gradient Descent Equilibrium Generalized Method of Moments Results Coefficients:
           Form~edges     Form~gwesp.fixed.0  offset(Persist~edges)  
               -6.583                  2.344                  2.197  
names(tergm.fit.1)
 [1] "newnetwork"    "newnetworks"   "init"          "covar"        
 [5] "mc.se"         "eta"           "opt.history"   "sample"       
 [9] "network"       "network"       "coef"          "targets"      
[13] "target.stats"  "estimate"      "sample.obs"    "control"      
[17] "reference"     "constraints"   "etamap"        "offset"       
[21] "mle.lik"       "MPLE_is_MLE"   "ergm_version"  "call"         
[25] "formula"       "estimate.desc" "tergm_version"
summary(tergm.fit.1)
Call:
tergm(formula = flobusiness ~ Form(~edges + gwesp(0, fixed = T)) + 
    Persist(~offset(edges)), offset.coef = log(9), estimate = "EGMME", 
    control = control.tergm(SA.plot.progress = TRUE), targets = "formation")

Gradient Descent Equilibrium Generalized Method of Moments Results Results:

                      Estimate Std. Error MCMC % z value Pr(>|z|)    
Form~edges             -6.5826     0.5951      0 -11.060   <1e-04 ***
Form~gwesp.fixed.0      2.3439     0.5338      0   4.391   <1e-04 ***
offset(Persist~edges)   2.1972     0.0000      0     Inf   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 The following terms are fixed by offset and are not estimated:
  offset(Persist~edges) 
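The offset of 2.1972 is `log(9)`. In an edges-only persistence model the coefficient is the log-odds that an existing tie survives one time step, so `log(9)` corresponds to a survival probability of 0.9 and a mean tie duration of 10 steps. A base-R sketch of that conversion; the two helper functions are hypothetical conveniences, not `tergm` functions.

```r
# Hypothetical helpers relating an edges-only persistence coefficient
# to the expected tie duration (in time steps)
duration_to_theta <- function(duration) log(duration - 1)
theta_to_duration <- function(theta) 1 + exp(theta)

theta.diss <- duration_to_theta(10)   # same value as log(9)
p_persist  <- plogis(theta.diss)      # per-step survival probability
mean_dur   <- 1 / (1 - p_persist)     # mean of a geometric duration

c(theta = theta.diss, p_persist = p_persist, mean_duration = mean_dur)
```

The same mapping gives `log(99)` for a target duration of 100 steps.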
tergm.sim.1 <- simulate(tergm.fit.1, nsim = 1, 
                        time.slices = 1000)
wealthsize <- log(get.vertex.attribute(flobusiness, "wealth")) * 2/3
slice.par = list(start = 0, 
               end = 25, 
               interval = 1, 
               aggregate.dur = 1, 
               rule = "any")
compute.animation(tergm.sim.1, slice.par = slice.par)
render.par = list(tween.frames = 5,
                show.time = T,
                show.stats = "~edges+gwesp(0,fixed = T)")
plot.par = list(edge.col = "darkgray",
              displaylabels = T,
              label.cex = .8,
              label.col = "blue",
              vertex.cex = wealthsize)
render.d3movie(tergm.sim.1,
               render.par = render.par,
               plot.par = plot.par,
               output.mode = 'htmlWidget')
cbind(observed = summary(flobusiness ~ edges + gwesp(0, fixed = T)),
      simulated = colMeans(attributes(tergm.sim.1)$stats))
              observed simulated
edges               15    16.529
gwesp.fixed.0       12    13.116
plot(attributes(tergm.sim.1)$stats)
plot(as.matrix(attributes(tergm.sim.1)$stats))
# create dataFrame for direct estimation
tergm.sim.1.df <- as.data.frame(tergm.sim.1)
names(tergm.sim.1.df)
[1] "onset"             "terminus"          "tail"             
[4] "head"              "onset.censored"    "terminus.censored"
[7] "duration"          "edge.id"          
tergm.sim.1.df[1,]
  onset terminus tail head onset.censored terminus.censored duration edge.id
1     0       36    3    5           TRUE             FALSE       36       1
cbind(directEst = mean(tergm.sim.1.df$duration),
      ndEst = mean(edgeDuration(tergm.sim.1, 
                                mode = 'duration', 
                                subject = 'spells')))
     directEst    ndEst
[1,]  9.882915 9.882915
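The simulated mean of about 9.88 steps is close to the 10 steps implied by `log(9)`. Part of any gap can come from censoring: spells flagged `onset.censored` or `terminus.censored` were cut off by the observation window, so their recorded durations are lower bounds. The sketch below compares the two summaries on a hypothetical data frame that reuses the column names printed above; the values are made up for illustration.

```r
# Hypothetical spell table with the column names printed above
spell_df <- data.frame(
  onset             = c(0, 0, 12, 40, 990),
  terminus          = c(36, 8, 30, 47, 1000),
  onset.censored    = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  terminus.censored = c(FALSE, FALSE, FALSE, FALSE, TRUE)
)
spell_df$duration <- spell_df$terminus - spell_df$onset

# Mean over all spells, censored or not
mean_all <- mean(spell_df$duration)

# Mean over spells observed from start to finish
complete <- !spell_df$onset.censored & !spell_df$terminus.censored
mean_complete <- mean(spell_df$duration[complete])

c(all = mean_all, complete = mean_complete)
```

Neither summary is unbiased in general, since long spells are the ones most likely to be censored; with a long simulation and short durations the difference is usually small.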
theta.diss.100 <- log(99)
ergm.fit1 <- ergm(flobusiness ~ edges + gwesp(0, fixed = T))
summary(ergm.fit1)
Call:
ergm(formula = flobusiness ~ edges + gwesp(0, fixed = T))

Monte Carlo Maximum Likelihood Results:

              Estimate Std. Error MCMC % z value Pr(>|z|)    
edges          -3.3419     0.6301      0  -5.304   <1e-04 ***
gwesp.fixed.0   1.5464     0.6101      0   2.535   0.0113 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 166.36  on 120  degrees of freedom
 Residual Deviance:  78.23  on 118  degrees of freedom

AIC: 82.23  BIC: 87.81  (Smaller is better. MC Std. Err. = 0.1898)
#theta.form <- ergm.fit1$coef 
theta.form <- coef(ergm.fit1)
theta.form
        edges gwesp.fixed.0 
    -3.341914      1.546387 
theta.form[1] <- theta.form[1] - theta.diss.100
theta.form
        edges gwesp.fixed.0 
    -7.937034      1.546387 
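The adjustment above relies on an approximation: for an edges-only persistence model, the cross-sectional edges coefficient is roughly the formation edges coefficient plus `log(duration - 1)`. Subtracting `log(99)` therefore gives a formation coefficient intended to reproduce the observed density with a mean duration of 100 steps. The sketch below repeats the arithmetic with the values printed above, so it runs without fitting the model; the name `theta.ergm` is illustrative.

```r
# Values copied from the output above
theta.ergm     <- c(edges = -3.341914, gwesp.fixed.0 = 1.546387)
theta.diss.100 <- log(100 - 1)   # persistence coefficient for duration 100

# Shift only the edges coefficient; other terms are left unchanged
theta.form <- theta.ergm
theta.form["edges"] <- theta.ergm["edges"] - theta.diss.100

round(theta.form, 6)
```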
# tergm.sim.2 <- simulate(flobusiness, 
#                          formation = ~edges+gwesp(0,fixed = T),
#                          dissolution = ~edges, 
#                          monitor = "all",
#                          coef.form = theta.form, 
#                          coef.diss = theta.diss.100,
#                          time.slices = 50000)

# Persist() corresponds to the legacy `dissolution =` formula: a coefficient of
# log(99) on Persist(~edges) gives a per-step persistence probability of 0.99,
# i.e. a mean tie duration of about 100 steps. Diss() negates this statistic,
# so the same coefficient there would dissolve nearly every tie after one step.
tergm.sim.2 <- simulate(
  flobusiness ~ 
    Form(~ edges + gwesp(0,fixed=T)) +
    Persist(~ edges),
  monitor = ~ edges + gwesp(0,fixed=T), # track the cross-sectional statistics
  coef = c(theta.form, theta.diss.100), 
  time.slices = 50000,
  dynamic = TRUE)
# first, recovery of the cross sectional statistics in the formation model
cbind(observed = summary(flobusiness ~ edges + gwesp(0,fixed = T)),
      simulated = colMeans(attributes(tergm.sim.2)$stats))
plot(attributes(tergm.sim.2)$stats)
# second, recovery of the tie duration
tergm.sim.dm.2 <- as.data.frame(tergm.sim.2)
mean(tergm.sim.dm.2$duration)
     Form(~ edges + degree(2:10))
     Diss(~ edges)

Function              Static Networks   Dynamic Networks
Data Storage          network           networkDynamic
Descriptive Stats     sna               tsna
Visualization         plot.network      ndtv
Statistical Modeling  ergm              tergm

Statnet Development Team

The `statnet` Project

Introduction

Prerequisites

Software Installation

Temporal network data

Types of temporal network data

Overview of modeling frameworks for temporal network data

Exploratory tools

`networkDynamic`

`tsna`

SNA metrics

ERGM terms

Temporal paths

Durations

`ndtv`

Network Movies

Proximity Timelines

Statistical Modeling with `tergm`

Elements of a model

Model terms

Term Operators

Joint vs. Separable Models

`tergm` syntax

Examples

CMLE

EGMME (1)

EGMME (2)

Additional functionality

References

Appendices

The separable model: STERGMs

Intuition

Formal representation

Independence: within vs. between timesteps
