ERGM-easy-to-use

0.1 Types of network structures
0.2 Why ERGM
- 0.2.1 Model simulation
0.3 all credits to

0.1 Types of network structures

0.1.1 Full graph

library(igraph)

## 
## Attaching package: 'igraph'

## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum

## The following object is masked from 'package:base':
## 
##     union

fg <- make_full_graph(100)

plot(fg, vertex.size=10, vertex.label=NA)

Everyone is connected to everyone: information or disease spreads perfectly. In real-world networks, however, such situations occur very rarely. What could this graph show? My idea: a node is a political scientist, and a link is attendance at the same summer school. In all other situations (who communicates with whom, who trades with whom, has sex, retweets, etc.) the graphs will certainly not be full.

0.1.2 Simple star graph

st <- make_star(100, mode = "undirected")

plot(st, vertex.size=10, vertex.label=NA)

Everyone is connected to one person: a boss, or perhaps a teacher. One important node, and all the others.

0.1.3 Tree graph

tr <- make_tree(100, children = 3, mode = "undirected")

plot(tr, vertex.size=10, vertex.label=NA)

A tree is a more hierarchical network: each node has 3 others subordinate to it. Such a structure can work in a company with different departments.

0.1.4 Erdos-Renyi random graph model

(‘n’ is the number of nodes, ‘m’ is the number of edges.)

er <- sample_gnm(n=100, m=80) 

plot(er, vertex.size=6, vertex.label=NA)

The edges are placed purely at random: every pair of nodes has the same probability of being linked.
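A minimal sketch of what sample_gnm guarantees: the graph always has exactly m edges, spread uniformly over all possible pairs. The seed and the object name g are illustrative choices, not part of the original example.

```r
library(igraph)

set.seed(1)                       # illustrative seed, only for reproducibility
g <- sample_gnm(n = 100, m = 80)

ecount(g)                         # always exactly m = 80 edges
edge_density(g)                   # 80 / choose(100, 2): share of pairs that are linked
sum(degree(g) == 0)               # number of isolated nodes in this sparse graph
```

With so few edges relative to the number of possible pairs, some nodes typically end up with no links at all.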

0.1.5 Watts-Strogatz small-world model

sw <- sample_smallworld(dim=2, size=10, nei=1, p=0.1)

plot(sw, vertex.size=6, vertex.label=NA, layout=layout_in_circle)

The main point of a small-world network is that nodes are connected to their neighbors and occasionally know someone from a completely different part of the network. Adding a small number of random connections therefore greatly reduces the average distance across the network. Remember the theory of six handshakes; this is an example of it. Even if you only communicate with your friends, you most likely know at least one person who knows someone from a completely different sphere, or you may be such a person yourself!
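The effect of the random connections can be seen directly by comparing a pure lattice (p = 0) with a lightly rewired one. This is only a sketch: the one-dimensional ring, the seed, and the object names are assumptions made for the illustration, not the 10 x 10 lattice used above.

```r
library(igraph)

set.seed(42)  # illustrative seed
# Ring of 100 nodes, each linked to its 2 nearest neighbours on either side
lattice <- sample_smallworld(dim = 1, size = 100, nei = 2, p = 0)    # no rewiring
rewired <- sample_smallworld(dim = 1, size = 100, nei = 2, p = 0.1)  # each link rewired with probability 0.1

mean_distance(lattice, directed = FALSE)   # long paths around the ring
mean_distance(rewired, directed = FALSE)   # a few shortcuts cut the average path
```

Both graphs have the same number of links; only the placement of a few of them differs.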

0.1.6 Barabasi-Albert preferential attachment model for scale-free graphs

(n is the number of nodes, power is the power of attachment (1 is linear), m is the number of edges added at each time step.)

ba <- sample_pa(n=100, power=1, m=1, directed=FALSE)

plot(ba, vertex.size=6, vertex.label=NA)

A node with more links has a better chance of getting a new link. Citation networks, for example, work this way: all other things being equal, aren't you more likely to cite a paper that already has a large number of citations?

ceb.sw <- cluster_edge_betweenness(sw) 
ceb.fg <- cluster_edge_betweenness(fg) 
ceb.st <- cluster_edge_betweenness(st) 
ceb.ba <- cluster_edge_betweenness(ba)
ceb.er <- cluster_edge_betweenness(er)


deg.sw <- degree(sw, mode="all")
deg.st <- degree(st, mode="all")
deg.ba <- degree(ba, mode="all")
deg.er <- degree(er, mode="all")

Let’s count the number of links (the degree) of each node and detect groups (communities) using edge betweenness. We will use both later.
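To see what these two functions return, here is a small sketch on a hypothetical graph that is not part of the tutorial data: two fully connected groups of five nodes joined by a single bridge.

```r
library(igraph)

# Two groups of five fully connected nodes, joined by one bridge (1 -- 6)
g <- disjoint_union(make_full_graph(5), make_full_graph(5))
g <- add_edges(g, c(1, 6))

ceb <- cluster_edge_betweenness(g)
length(ceb)         # number of detected groups
membership(ceb)     # group label of each node
degree(g)           # the two bridge nodes have one link more than the rest
```

The bridge carries every shortest path between the two groups, so it has the highest edge betweenness and is removed first, which splits the graph into the two groups.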

plot(ceb.fg, fg, vertex.size=6, vertex.label=NA, layout=layout_with_fr)

For example, a full graph can have only one community, because all nodes are equal and have the same number of links.

par(mfrow=c(2,2))
plot(ceb.sw, sw, vertex.size=6, vertex.label=NA, layout=layout_with_fr, main = "Small world")  
plot(ceb.st, st, vertex.size=6, vertex.label=NA, layout=layout_with_fr, main = "Star")  
plot(ceb.ba, ba, vertex.size=6, vertex.label=NA, layout=layout_with_fr, main = "preferential attachment")  
plot(ceb.er, er, vertex.size=6, vertex.label=NA, layout=layout_with_fr, main = "random network")

The star is in a similar situation to the full graph: everyone unconditionally belongs to the same group. In the small-world network, most groups consist of nodes that are next to each other, but sometimes groups overlap and one group can stretch across half the network. With preferential attachment, a group forms on each branch. And in the random network, isolated, unconnected nodes appear in addition to the groups.

Think about the distribution of links (degree) in terms of inequality, and only then look at the pictures below.

. . . .

Why did this happen?

par(mfrow=c(2,2))
hist(deg.sw, breaks=0:(vcount(sw)-1), main="Histogram of node degree SW", xlim = c(0, 15))
hist(deg.st, breaks=0:(vcount(st)-1), main="Histogram of node degree Star", xlim = c(0, 100))
hist(deg.ba, breaks=0:(vcount(ba)-1), main="Histogram of node degree Pref Att", xlim = c(0, 15))
hist(deg.er, breaks=0:(vcount(er)-1), main="Histogram of node degree Random", xlim = c(0, 15))
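The inequality visible in the histograms can also be put into a single number. The sketch below uses a hypothetical helper, top_share, which is not an igraph function: it returns the share of all link ends held by the best-connected node. The seed and object names are illustrative as well.

```r
library(igraph)

set.seed(7)  # illustrative seed
star <- make_star(100, mode = "undirected")
pa   <- sample_pa(n = 100, power = 1, m = 1, directed = FALSE)

# Hypothetical helper: share of all link ends that belong to the biggest hub
top_share <- function(g) max(degree(g)) / sum(degree(g))

top_share(star)   # the centre holds exactly half of all link ends
top_share(pa)     # a hub exists, but it is less extreme than in the star
table(degree(pa)) # many nodes with a single link, very few with many
```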

On average, how many links separate two random nodes in a network? In other words, how easily does information (or anything else) spread over the network?

print("MEAN PATH DISTANCE")

## [1] "MEAN PATH DISTANCE"

print("Small world:")

## [1] "Small world:"

mean_distance(sw, directed=F)

## [1] 3.678586

print("Star:")

## [1] "Star:"

mean_distance(st, directed=F)

## [1] 1.98

print("preferential attachment:")

## [1] "preferential attachment:"

mean_distance(ba, directed=F)

## [1] 5.403838

print("Random:")

## [1] "Random:"

mean_distance(er, directed=F)

## [1] 5.944486

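Notice how much more efficiently the small-world network works: its mean path (3.68) is clearly shorter than in the preferential attachment (5.40) and random (5.94) networks. Only the star is shorter (1.98), because every path runs through the single central node.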
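The value reported for the star can be checked by hand, which also shows what mean_distance measures: the average shortest-path length over all pairs of nodes. The variable names below are illustrative.

```r
library(igraph)

n <- 100
pairs_at_1 <- n - 1               # the centre and each of the 99 other nodes
pairs_at_2 <- choose(n - 1, 2)    # any two non-central nodes, linked via the centre
by_hand <- (1 * pairs_at_1 + 2 * pairs_at_2) / choose(n, 2)

by_hand                                                             # 1.98
mean_distance(make_star(n, mode = "undirected"), directed = FALSE)  # same value
```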

0.2 Why ERGM

Cranmer and Desmarais (2011)

Actually, this picture is more about networks versus dyads.

Most of the network-related packages are made by statnet:

https://statnet.github.io/Workshops/ergm_tutorial.html

library(ergm)

## Loading required package: network

## network: Classes for Relational Data
## Version 1.15 created on 2019-04-01.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
##                     Mark S. Handcock, University of California -- Los Angeles
##                     David R. Hunter, Penn State University
##                     Martina Morris, University of Washington
##                     Skye Bender-deMoll, University of Washington
##  For citation information, type citation("network").
##  Type help("network-package") to get started.

## 
## Attaching package: 'network'

## The following objects are masked from 'package:igraph':
## 
##     %c%, %s%, add.edges, add.vertices, delete.edges,
##     delete.vertices, get.edge.attribute, get.edges,
##     get.vertex.attribute, is.bipartite, is.directed,
##     list.edge.attributes, list.vertex.attributes,
##     set.edge.attribute, set.vertex.attribute

## 
## ergm: version 3.10.4, created on 2019-06-10
## Copyright (c) 2019, Mark S. Handcock, University of California -- Los Angeles
##                     David R. Hunter, Penn State University
##                     Carter T. Butts, University of California -- Irvine
##                     Steven M. Goodreau, University of Washington
##                     Pavel N. Krivitsky, University of Wollongong
##                     Martina Morris, University of Washington
##                     with contributions from
##                     Li Wang
##                     Kirk Li, University of Washington
##                     Skye Bender-deMoll, University of Washington
##                     Chad Klumb
## Based on "statnet" project software (statnet.org).
## For license and citation information see statnet.org/attribution
## or type citation("ergm").

## NOTE: Versions before 3.6.1 had a bug in the implementation of the
## bd() constriant which distorted the sampled distribution somewhat.
## In addition, Sampson's Monks datasets had mislabeled vertices. See
## the NEWS and the documentation for more details.

## NOTE: Some common term arguments pertaining to vertex attribute
## and level selection have changed in 3.10.0. See terms help for
## more details. Use 'options(ergm.term=list(version="3.9.4"))' to
## use old behavior.

data(package='ergm') # lists the datasets in the ergm package
data(florentine) # loads flomarriage and flobusiness data
?flomarriage # Let's look at the flomarriage data

plot(flomarriage) # Let's view the flomarriage network

The basic plot shows only the structure, without any additional information.

But what could we visualize?

Let's take a look at the data and its attributes.

flomarriage

##  Network attributes:
##   vertices = 16 
##   directed = FALSE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 20 
##     missing edges= 0 
##     non-missing edges= 20 
## 
##  Vertex attribute names: 
##     priorates totalties vertex.names wealth 
## 
## No edge attributes

Put a question mark before the object name (?flomarriage) to get additional information about the data.

ggraph and igraph are useful packages for working with networks. Have you seen them before?

The commented-out lines below won't work. Uncomment them (Ctrl+Shift+C) and try to run them.

What is the problem?

library(ggraph)

## Loading required package: ggplot2

library(igraph)
# plot(flomarriage, vertex.shape="none", vertex.label=V(flomarriage)$vertex.names,
# vertex.label.font=2, vertex.label.color="gray40",
# vertex.label.cex=.7, edge.color="gray85")
# 
# ggraph(flomarriage) + 
#     geom_edge_link() + 
#     geom_node_point()+
#     theme_void()+
#   geom_node_text(V(flomarriage)$vertex.names)

Since ergm uses “network” objects, while igraph and ggraph use objects of a different type, we need to convert them with the help of the intergraph package to get a nice visualization.

library(intergraph)
g_m <- intergraph::asIgraph(flomarriage)
g_b <- intergraph::asIgraph(flobusiness)
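The conversion can be tried out on a tiny example. This is a sketch on a hypothetical 3-node network, not on the Florentine data; functions are called with the package prefix (network::, igraph::) so that the name clashes between the two packages do not matter.

```r
library(intergraph)

# A hypothetical path 1 -- 2 -- 3 stored as a "network" object
adj <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 1, 0), nrow = 3, byrow = TRUE)
net <- network::network(adj, directed = FALSE)

ig <- asIgraph(net)                 # network -> igraph
class(net)
class(ig)
igraph::ecount(ig)                  # both links survive the conversion

back <- asNetwork(ig)               # igraph -> network
network::network.edgecount(back)
```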


ggraph(g_m) +
  geom_edge_link(color = "gray") +
  theme_void() +
  geom_node_text(aes(label = vertex.names, size = wealth))

## Using `nicely` as default layout

Size is wealth, a link is a marriage. Try to Google some of these families. Then change the data object in the plot to g_b to see the business network.

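Let's start with a basic model. The ergm function expects a network data object, “~”, and ergm terms.

Basically, ergm works like a logistic regression, with the link between two nodes as the dependent variable and everything else as independent. For a model with only dyad-independent terms, such as edges alone, it is exactly a logistic regression.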

flomodel.01 <- ergm(flomarriage ~ edges)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

summary(flomodel.01)

## 
## ==========================
## Summary of model fit
## ==========================
## 
## Formula:   flomarriage ~ edges
## 
## Iterations:  5 out of 20 
## 
## Monte Carlo MLE Results:
##       Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges  -1.6094     0.2449      0  -6.571   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.4  on 120  degrees of freedom
##  Residual Deviance: 108.1  on 119  degrees of freedom
##  
## AIC: 110.1    BIC: 112.9    (Smaller is better.)

The base model explains only the probability of a link forming between two random nodes. To interpret it, look first at the p-value (smaller is better). An MCMC % of 0-1% is good. The estimate is the strength of the term's effect; as a rough rule of thumb (in absolute value), 0-1 is weak, 1-2 moderate, and 2+ strong. AIC and BIC are tools for comparing different models; smaller is better.

Interpretation is harder because the estimates are log-odds, but we can convert them to probabilities:

plogis(coef(flomodel.01)[1])

##     edges 
## 0.1666667

So there is about a 17% chance (20 of the 120 possible ties) of a marriage between two random families.
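The edges estimate can be reproduced by hand from the counts printed for flomarriage (16 vertices, 20 edges). The sketch below assumes nothing beyond those two numbers; the variable names are illustrative.

```r
n_families <- 16
n_ties     <- 20
n_dyads    <- choose(n_families, 2)     # 120 possible pairs of families

density  <- n_ties / n_dyads            # 0.1666667, the probability returned by plogis()
log_odds <- log(density / (1 - density))

log_odds                                # -1.609438, the edges estimate in the summary
plogis(log_odds)                        # and back to the probability
```

In other words, an edges-only model just re-expresses the density of the network on the log-odds scale.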

flomodel.02 <- ergm(flomarriage~edges+triangle)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Starting Monte Carlo maximum likelihood estimation (MCMLE):

## Iteration 1 of at most 20:

## Optimizing with step length 1.

## The log-likelihood improved by 0.004853.

## Step length converged once. Increasing MCMC sample size.

## Iteration 2 of at most 20:

## Optimizing with step length 1.

## The log-likelihood improved by < 0.0001.

## Step length converged twice. Stopping.

## Finished MCMLE.

## Evaluating log-likelihood at the estimate. Using 20 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 .
## This model was fit using MCMC.  To examine model diagnostics and
## check for degeneracy, use the mcmc.diagnostics() function.

summary(flomodel.02)

## 
## ==========================
## Summary of model fit
## ==========================
## 
## Formula:   flomarriage ~ edges + triangle
## 
## Iterations:  2 out of 20 
## 
## Monte Carlo MLE Results:
##          Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges     -1.6735     0.3489      0  -4.797   <1e-04 ***
## triangle   0.1607     0.5975      0   0.269    0.788    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.4  on 120  degrees of freedom
##  Residual Deviance: 108.1  on 118  degrees of freedom
##  
## AIC: 112.1    BIC: 117.6    (Smaller is better.)

plogis(coef(flomodel.02)[1])

##     edges 
## 0.1579571

Adding a term can change the estimates of the other terms, even if the new term is not significant.
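The AIC and BIC values in the two summaries can be reproduced from the residual deviance, which shows why the triangle model scores worse: it spends one more parameter without reducing the deviance. This is a sketch assuming the usual definitions (AIC = deviance + 2k, BIC = deviance + k log n, with n taken as the 120 dyads); because the deviance is taken from the rounded printout, the last digit can differ slightly.

```r
deviance_both <- 108.1      # residual deviance printed for both models
n_dyads       <- 120        # degrees of freedom of the null deviance

aic <- function(dev, k) dev + 2 * k
bic <- function(dev, k) dev + k * log(n_dyads)

aic(deviance_both, k = 1); bic(deviance_both, k = 1)   # edges only
aic(deviance_both, k = 2); bic(deviance_both, k = 2)   # edges + triangle
```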

The nodecov term adds a single network statistic to the model for each quantitative attribute or matrix column, equal to the sum of attr(i) and attr(j) over all edges (i,j) in the network.
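This definition can be checked by hand: sum wealth(i) + wealth(j) over the edge list and compare the result with the statistic that ergm computes. A minimal sketch; the object names wealth, el, by_hand and by_ergm are illustrative.

```r
library(ergm)        # also loads the network package
data(florentine)

wealth <- flomarriage %v% "wealth"
el <- as.matrix(flomarriage, matrix.type = "edgelist")   # one row per marriage tie

by_hand <- sum(wealth[el[, 1]] + wealth[el[, 2]])        # attr(i) + attr(j) over all edges
by_ergm <- summary(flomarriage ~ nodecov("wealth"))      # statistic as ergm computes it

by_hand
by_ergm
```

Calling summary on a formula, without fitting anything, is a quick way to inspect the observed value of any ergm term.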

flomodel.03 <- ergm(flomarriage~edges+nodecov('wealth'))

## Warning: `set_attrs()` is deprecated as of rlang 0.3.0
## This warning is displayed once per session.

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

summary(flomodel.03)

## 
## ==========================
## Summary of model fit
## ==========================
## 
## Formula:   flomarriage ~ edges + nodecov("wealth")
## 
## Iterations:  4 out of 20 
## 
## Monte Carlo MLE Results:
##                 Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges          -2.594929   0.536056      0  -4.841   <1e-04 ***
## nodecov.wealth  0.010546   0.004674      0   2.256   0.0241 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.4  on 120  degrees of freedom
##  Residual Deviance: 103.1  on 118  degrees of freedom
##  
## AIC: 107.1    BIC: 112.7    (Smaller is better.)

?nodecov
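Because this model contains only dyad-independent terms, the two estimates in the summary above can be turned directly into tie probabilities. The sketch below copies the estimates from the printed output; the helper tie_prob and the wealth values passed to it are hypothetical.

```r
b_edges  <- -2.594929   # edges estimate from the summary above
b_wealth <-  0.010546   # nodecov.wealth estimate from the summary above

# Probability of a marriage tie between families with wealth w_i and w_j
tie_prob <- function(w_i, w_j) plogis(b_edges + b_wealth * (w_i + w_j))

tie_prob(10, 10)        # two hypothetical families of modest wealth
tie_prob(100, 100)      # two hypothetical very wealthy families
```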

It can be interpreted as a tendency to form ties: each additional unit of wealth (summed over the two families) raises the log-odds of a tie by about 0.01. Compare these results with the business network:

# flomodel.04 <- ergm(PUT_NAME_OF_BUSINESS_NET_HERE   ~edges+nodecov('wealth'))
# summary(flomodel.04)

For the business network, it is not important that both families are very wealthy. So marriages are made between the strong to avoid enmity, while business is conducted in whatever way is most profitable.

For further exploration, we switch to a social network of school kids.

?faux.mesa.high

data(faux.mesa.high) 
mesa <- faux.mesa.high

par(mfrow=c(1,1)) # Back to 1-panel plots
plot(mesa, vertex.col='Grade')
legend('bottomleft',fill=7:12,
       legend=paste('Grade',7:12),cex=0.75)

plot(mesa, vertex.col='Race')
legend('bottomleft',fill=1:5,
       legend=c("Black", "Hisp", "NatAm", "Other", "White"),cex=0.75)

fauxmodel.01 <- ergm(mesa ~ edges + 
                       nodefactor('Grade') + nodematch('Grade',diff=T) +
                       nodefactor('Race') + nodematch('Race',diff=T))

## Observed statistic(s) nodematch.Race.Black and nodematch.Race.Other are at their smallest attainable values. Their coefficients will be fixed at -Inf.

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

summary(fauxmodel.01)

## 
## ==========================
## Summary of model fit
## ==========================
## 
## Formula:   mesa ~ edges + nodefactor("Grade") + nodematch("Grade", diff = T) + 
##     nodefactor("Race") + nodematch("Race", diff = T)
## 
## Iterations:  7 out of 20 
## 
## Monte Carlo MLE Results:
##                       Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges                  -8.0538     1.2561      0  -6.412  < 1e-04 ***
## nodefactor.Grade.8      1.5201     0.6858      0   2.216 0.026663 *  
## nodefactor.Grade.9      2.5284     0.6493      0   3.894  < 1e-04 ***
## nodefactor.Grade.10     2.8652     0.6512      0   4.400  < 1e-04 ***
## nodefactor.Grade.11     2.6291     0.6563      0   4.006  < 1e-04 ***
## nodefactor.Grade.12     3.4629     0.6566      0   5.274  < 1e-04 ***
## nodematch.Grade.7       7.4662     1.1730      0   6.365  < 1e-04 ***
## nodematch.Grade.8       4.2882     0.7150      0   5.997  < 1e-04 ***
## nodematch.Grade.9       2.0371     0.5538      0   3.678 0.000235 ***
## nodematch.Grade.10      1.2489     0.6233      0   2.004 0.045111 *  
## nodematch.Grade.11      2.4521     0.6124      0   4.004  < 1e-04 ***
## nodematch.Grade.12      1.2987     0.6981      0   1.860 0.062824 .  
## nodefactor.Race.Hisp   -1.6659     0.2963      0  -5.622  < 1e-04 ***
## nodefactor.Race.NatAm  -1.4725     0.2869      0  -5.132  < 1e-04 ***
## nodefactor.Race.Other  -2.9618     1.0372      0  -2.856 0.004296 ** 
## nodefactor.Race.White  -0.8488     0.2958      0  -2.869 0.004112 ** 
## nodematch.Race.Black      -Inf     0.0000      0    -Inf  < 1e-04 ***
## nodematch.Race.Hisp     0.6912     0.3451      0   2.003 0.045153 *  
## nodematch.Race.NatAm    1.2482     0.3550      0   3.517 0.000437 ***
## nodematch.Race.Other      -Inf     0.0000      0    -Inf  < 1e-04 ***
## nodematch.Race.White    0.3140     0.6405      0   0.490 0.623947    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 28987  on 20910  degrees of freedom
##  Residual Deviance:  1827  on 20889  degrees of freedom
##  
## AIC: 1869    BIC: 2036    (Smaller is better.) 
## 
##  Warning: The following terms have infinite coefficient estimates:
##   nodematch.Race.Black nodematch.Race.Other

table(mesa %v% 'Race') # Frequencies of race

## 
## Black  Hisp NatAm Other White 
##     6   109    68     4    18

mixingmatrix(mesa, "Race")

## Note:  Marginal totals can be misleading
##  for undirected mixing matrices.
##       Black Hisp NatAm Other White
## Black     0    8    13     0     5
## Hisp      8   53    41     1    22
## NatAm    13   41    46     0    10
## Other     0    1     0     0     0
## White     5   22    10     0     4
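The zeros on the diagonal of this matrix explain the -Inf estimates in the model summary: there are no Black-Black or Other-Other ties, so those nodematch statistics sit at their smallest attainable value. A short sketch that reads the same counts through ergm; the object name within_race is illustrative.

```r
library(ergm)
data(faux.mesa.high)

# Observed number of ties within each race (the diagonal of the mixing matrix)
within_race <- summary(faux.mesa.high ~ nodematch("Race", diff = TRUE))
within_race
names(within_race)[within_race == 0]   # terms with no observed ties at all
```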

mixingmatrix(mesa, "Grade")

## Note:  Marginal totals can be misleading
##  for undirected mixing matrices.
##     7  8  9 10 11 12
## 7  75  0  0  1  1  1
## 8   0 33  2  4  2  1
## 9   0  2 23  7  6  4
## 10  1  4  7  9  1  5
## 11  1  2  6  1 17  5
## 12  1  1  4  5  5  6

fauxmodel.01.gof <- gof(fauxmodel.01)
fauxmodel.01.gof

## 
## Goodness-of-fit for degree 
## 
##    obs min  mean max MC p-value
## 0   57  19 34.11  53       0.00
## 1   51  42 54.23  73       0.74
## 2   30  39 51.62  64       0.00
## 3   28  22 33.88  44       0.26
## 4   18   7 17.48  32       0.98
## 5   10   2  8.02  21       0.64
## 6    2   0  3.53  11       0.68
## 7    4   0  1.40   4       0.04
## 8    1   0  0.45   2       0.82
## 9    2   0  0.22   2       0.02
## 10   1   0  0.04   1       0.08
## 11   0   0  0.01   1       1.00
## 12   0   0  0.01   1       1.00
## 13   1   0  0.00   0       0.00
## 
## Goodness-of-fit for edgewise shared partner 
## 
##      obs min   mean max MC p-value
## esp0  83 154 186.31 218          0
## esp1  70   3  15.98  35          0
## esp2  36   0   0.83   6          0
## esp3  13   0   0.04   2          0
## esp5   1   0   0.00   0          0
## 
## Goodness-of-fit for minimum geodesic distance 
## 
##       obs  min    mean   max MC p-value
## 1     203  168  203.16   243       1.00
## 2     411  267  412.32   589       1.00
## 3     561  381  710.81  1075       0.28
## 4     591  493  996.33  1629       0.06
## 5     710  521 1170.10  1982       0.12
## 6     875  442 1242.75  2265       0.40
## 7     860  316 1241.17  2318       0.44
## 8     824  224 1164.07  2009       0.48
## 9     704  152 1018.27  1689       0.44
## 10    563   55  831.88  1365       0.44
## 11    402   14  634.60  1295       0.54
## 12    246    1  461.93  1060       0.60
## 13    104    0  322.49   924       0.56
## 14     77    0  219.34   838       0.76
## 15     33    0  146.00   789       0.80
## 16      4    0   93.13   623       0.64
## 17      0    0   55.89   462       0.64
## 18      0    0   31.98   380       0.80
## 19      0    0   17.51   255       1.00
## 20      0    0    9.37   133       1.00
## 21      0    0    4.55   108       1.00
## 22      0    0    1.85    74       1.00
## 23      0    0    0.82    44       1.00
## 24      0    0    0.31    17       1.00
## 25      0    0    0.12     8       1.00
## 26      0    0    0.02     2       1.00
## Inf 13742 5504 9919.23 17558       0.32
## 
## Goodness-of-fit for model statistics 
## 
##                       obs min   mean max MC p-value
## edges                 203 168 203.16 243       1.00
## nodefactor.Grade.8     75  44  76.37 108       0.92
## nodefactor.Grade.9     65  37  64.78  92       1.00
## nodefactor.Grade.10    36  19  36.25  56       1.00
## nodefactor.Grade.11    49  32  49.73  73       0.96
## nodefactor.Grade.12    28  16  27.51  43       1.00
## nodematch.Grade.7      75  54  74.32  96       1.00
## nodematch.Grade.8      33  19  33.46  50       0.92
## nodematch.Grade.9      23  11  23.17  36       1.00
## nodematch.Grade.10      9   3   9.04  17       1.00
## nodematch.Grade.11     17   8  17.25  25       0.98
## nodematch.Grade.12      6   1   5.68  11       1.00
## nodefactor.Race.Hisp  178 125 176.89 224       0.96
## nodefactor.Race.NatAm 156 115 156.98 197       0.92
## nodefactor.Race.Other   1   0   1.06   3       1.00
## nodefactor.Race.White  45  27  44.60  63       1.00
## nodematch.Race.Hisp    53  36  52.52  71       0.96
## nodematch.Race.NatAm   46  26  46.14  67       1.00
## nodematch.Race.White    4   0   4.05   8       1.00

plot(fauxmodel.01.gof)

This is an OK model. For a terrible one, look below. WARNING!

fauxmodel.bad <- ergm(mesa ~ edges +
                        nodefactor('Grade') + nodemix('Grade') +
                        nodefactor('Race') + nodematch('Race',diff=F) +
                        kstar(1) + nodematch("Sex") + nodefactor('Sex'),
                      control = control.ergm(MPLE.samplesize = 5000,
                                             MCMLE.maxit = 20, MCMC.interval = 10))

## Observed statistic(s) mix.Grade.7.8 and mix.Grade.7.9 are at their smallest attainable values. Their coefficients will be fixed at -Inf.

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Starting Monte Carlo maximum likelihood estimation (MCMLE):

## Iteration 1 of at most 20:

## Optimizing with step length 0.326477745335357.

## The log-likelihood improved by 2.013.

## Iteration 2 of at most 20:

## Optimizing with step length 0.196436592898565.

## The log-likelihood improved by 1.637.

## Iteration 3 of at most 20:

## Optimizing with step length 0.0822102067120781.

## The log-likelihood improved by 1.644.

## Iteration 4 of at most 20:

## Optimizing with step length 0.0732125350054855.

## The log-likelihood improved by 1.707.

## Iteration 5 of at most 20:

## Optimizing with step length 0.100447990277291.

## The log-likelihood improved by 1.571.

## Iteration 6 of at most 20:

## Optimizing with step length 0.123036000041802.

## The log-likelihood improved by 1.406.

## Iteration 7 of at most 20:

## Optimizing with step length 0.163770634291013.

## The log-likelihood improved by 1.895.

## Iteration 8 of at most 20:

## Optimizing with step length 0.105233117432089.

## The log-likelihood improved by 1.713.

## Iteration 9 of at most 20:

## Optimizing with step length 0.0683156805522523.

## The log-likelihood improved by 1.383.

## Iteration 10 of at most 20:

## Optimizing with step length 0.0810583270684751.

## The log-likelihood improved by 1.399.

## Iteration 11 of at most 20:

## Optimizing with step length 0.115755301923427.

## The log-likelihood improved by 1.395.

## Iteration 12 of at most 20:

## Optimizing with step length 0.166904716690295.

## The log-likelihood improved by 1.952.

## Iteration 13 of at most 20:

## Optimizing with step length 0.213339995692825.

## The log-likelihood improved by 1.463.

## Iteration 14 of at most 20:

## Optimizing with step length 0.0827426138689308.

## The log-likelihood improved by 1.97.

## Iteration 15 of at most 20:

## Optimizing with step length 0.144233980115854.

## The log-likelihood improved by 1.571.

## Iteration 16 of at most 20:

## Optimizing with step length 0.159745519733487.

## The log-likelihood improved by 1.486.

## Iteration 17 of at most 20:

## Optimizing with step length 0.162539957998691.

## The log-likelihood improved by 2.213.

## Iteration 18 of at most 20:

## Optimizing with step length 0.195186127804424.

## The log-likelihood improved by 1.939.

## Iteration 19 of at most 20:

## Optimizing with step length 0.192498255525989.

## The log-likelihood improved by 2.262.

## Iteration 20 of at most 20:

## Optimizing with step length 0.174546597046615.

## Error in solve.default(H, tol = 1e-20) : 
##   Lapack routine dgesv: system is exactly singular: U[33,33] = 0

## Warning in ergm.MCMCse.lognormal(theta = theta, init = init, statsmatrix =
## statsmatrix0, : Approximate Hessian matrix is singular. Standard errors due
## to MCMC approximation of the likelihood cannot be evaluated. This is likely
## due to insufficient MCMC sample size or highly correlated model terms.

## The log-likelihood improved by 1.874.

## MCMLE estimation did not converge after 20 iterations. The estimated coefficients may not be accurate. Estimation may be resumed by passing the coefficients as initial values; see 'init' under ?control.ergm for details.

## Finished MCMLE.

## Evaluating log-likelihood at the estimate. Using 20 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 .
## This model was fit using MCMC.  To examine model diagnostics and
## check for degeneracy, use the mcmc.diagnostics() function.

?control.ergm()

summary(fauxmodel.bad)

## 
## ==========================
## Summary of model fit
## ==========================
## 
## Formula:   mesa ~ edges + nodefactor("Grade") + nodemix("Grade") + nodefactor("Race") + 
##     nodematch("Race", diff = F) + kstar(1) + nodematch("Sex") + 
##     nodefactor("Sex")
## 
## Iterations:  20 out of 20 
## 
## Monte Carlo MLE Results:
##                       Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges                  -8.7318         NA     NA      NA       NA    
## nodefactor.Grade.8      0.4327         NA     NA      NA       NA    
## nodefactor.Grade.9      1.7006         NA     NA      NA       NA    
## nodefactor.Grade.10     2.3653         NA     NA      NA       NA    
## nodefactor.Grade.11     2.1709         NA     NA      NA       NA    
## nodefactor.Grade.12     3.8942         NA     NA      NA       NA    
## mix.Grade.7.7           7.6698         NA     NA      NA       NA    
## mix.Grade.7.8             -Inf     0.0000      0    -Inf   <1e-04 ***
## mix.Grade.8.8           7.3286         NA     NA      NA       NA    
## mix.Grade.7.9             -Inf     0.0000      0    -Inf   <1e-04 ***
## mix.Grade.8.9           1.2302         NA     NA      NA       NA    
## mix.Grade.9.9           4.0833         NA     NA      NA       NA    
## mix.Grade.7.10          1.7798         NA     NA      NA       NA    
## mix.Grade.8.10          3.0876         NA     NA      NA       NA    
## mix.Grade.9.10          1.1296         NA     NA      NA       NA    
## mix.Grade.10.10         2.6920         NA     NA      NA       NA    
## mix.Grade.7.11          1.2900         NA     NA      NA       NA    
## mix.Grade.8.11          1.8182         NA     NA      NA       NA    
## mix.Grade.9.11          2.1327         NA     NA      NA       NA    
## mix.Grade.10.11        -1.6857         NA     NA      NA       NA    
## mix.Grade.11.11         3.0084         NA     NA      NA       NA    
## mix.Grade.7.12          1.0299         NA     NA      NA       NA    
## mix.Grade.8.12         -0.1170         NA     NA      NA       NA    
## mix.Grade.9.12          0.4223         NA     NA      NA       NA    
## mix.Grade.10.12        -0.2418         NA     NA      NA       NA    
## mix.Grade.11.12         0.4191         NA     NA      NA       NA    
## mix.Grade.12.12        -0.9897         NA     NA      NA       NA    
## nodefactor.Race.Hisp   -1.7851         NA     NA      NA       NA    
## nodefactor.Race.NatAm  -1.1222         NA     NA      NA       NA    
## nodefactor.Race.Other  -2.0082         NA     NA      NA       NA    
## nodefactor.Race.White  -0.9903         NA     NA      NA       NA    
## nodematch.Race          0.8691         NA     NA      NA       NA    
## kstar1                  0.3664         NA     NA      NA       NA    
## nodematch.Sex           0.4004         NA     NA      NA       NA    
## nodefactor.Sex.M       -0.4324         NA     NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 28987  on 20910  degrees of freedom
##  Residual Deviance:   NaN  on 20875  degrees of freedom
##  
## AIC: NaN    BIC: NaN    (Smaller is better.) 
## 
##  Warning: The following terms have infinite coefficient estimates:
##   mix.Grade.7.8 mix.Grade.7.9

Bad goodness of fit:

par(mfrow=c(2,2))
plot(gof(fauxmodel.bad))

# mcmc.diagnostics(fauxmodel.bad)

0.2.1 Model simulation

par(mfrow=c(1,2)) # Two panels side by side
plot(mesa, vertex.col='Grade', main = "real")
plot(simulate(fauxmodel.01), vertex.col='Grade', main =  "simulated")
legend('bottomleft',fill=7:12,
       legend=paste('Grade',7:12),cex=0.75)
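A single simulated network is only one draw. To judge a model, it helps to simulate many networks and compare a statistic with the observed one. The sketch below uses the small edges-only Florentine model so that it runs quickly; the seed, the number of simulations, and the object names are illustrative choices.

```r
library(ergm)
data(florentine)

fit  <- ergm(flomarriage ~ edges)              # simple model, fitted quickly
sims <- simulate(fit, nsim = 100, seed = 42)   # 100 simulated networks

edge_counts <- sapply(sims, network.edgecount)
summary(edge_counts)                           # should scatter around the observed 20 ties
```

The same pattern applies to fauxmodel.01: replace fit with the fitted model and network.edgecount with any statistic of interest.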

All ERGM terms, frequently and rarely used, with their descriptions:

vignette('ergm-term-crossRef')

## starting httpd help server ... done