1 Introducción

Los modelos de grafos aleatorios exponenciales (exponential random graph models, ERGMs) o modelos \(p^*\) se especifican de manera análoga a los modelos lineales generalizados (generalized linear models, GLMs).

Frank, O. & Strauss, D. (1986), ‘Markov graphs’, Journal of the American Statistical Association 81(395), 832–842.

https://cran.r-project.org/web/packages/ergm/vignettes/ergm.pdf

Wasserman, S. & Pattison, P. (1996), ‘Logit models and logistic regressions for social networks:I. an introduction to markov graphs and p’, Psychometrika 61(3), 401–425.*

Un ERGM es un modelo de la forma \[ p(\mathbf{y}\mid\boldsymbol{\theta}) = \frac{1}{\kappa}\,\exp{\left\{ \boldsymbol{\theta}^{\textsf{T}}\,\boldsymbol{\mathsf{g}}(\mathbf{y}) \right\}} \] donde:

\(\mathbf{y}=[y_{i,j}]\) es la realización de una matriz de adyacencia aleatoria \(\mathbf{Y}=[Y_{i,j}]\) asociada con un grafo \(G=(V,E)\) simple (no dirigido), i.e., \(y_{i,j}\in\{0,1\}\) y \(y_{i,j}\equiv y_{j,i}\).
\(\boldsymbol{\mathsf{g}}(\mathbf{y}) = [\mathsf{g}_1(\mathbf{y}),\ldots,\mathsf{g}_K(\mathbf{y})]^{\textsf{T}}\) es un vector \(K\) dimensional constituido por estadísticos de \(\mathbf{y}\) y/o funciones (conocidas) de \(\mathbf{y}\) y atributos nodales \(\mathbf{x}\).
\(\boldsymbol{\theta} = [\theta_1,\ldots,\theta_K]^{\textsf{T}}\) es un vector \(K\) dimensional de parámetros desconocidos.
\(\kappa\equiv\kappa(\boldsymbol{\theta})\) es la constante de normalización, \(\kappa = \sum_{\mathbf{y}} \exp{\left\{ \boldsymbol{\theta}^{\textsf{T}}\,\boldsymbol{\mathsf{g}}(\mathbf{y}) \right\}}\).

Los coeficientes \(\boldsymbol{\theta}\) representan el tamaño y la dirección de los efectos de \(\boldsymbol{\mathsf{g}}(\mathbf{y})\) sobre la probabilidad general de la red.

La probabilidad del grafo completo se puede volver a expresar en escala logit en términos de las probabilidades condicionales de observar una arista arista manteniendo el resto de la red fija entre dos actores: \[ \text{logit}\,\textsf{Pr}(y_{i,j}=1\mid\mathbf{y}_{-(i,j)}) = \boldsymbol{\theta}^{\textsf{T}}\,\boldsymbol{\delta}_{i,j}(\mathbf{y}) \] donde:

\(\mathbf{y}_{-(i,j)}\) corresponde a \(\mathbf{y}\) exceptuando la observación \(y_{i,j}\).
\(\boldsymbol{\delta}_{i,j}(\mathbf{y})\) es la estadística de cambio que se calcula como el valor \(\boldsymbol{\mathsf{g}}(\mathbf{y})\) cuando \(y_{i,j}=1\) menos el valor \(\boldsymbol{\mathsf{g}}(\mathbf{y})\) cuando \(y_{i,j}=0\), manteniendo constante el valor de todas las demás díadas en \(\mathbf{y}\).
Los coeficientes se pueden interpretar como la contribución de los términos a la probabilidad (en escala logit) de observar una arista particular, condicionado a que todas las demás díadas permanezcan iguales.

2 Modelo Bernoulli

Para cada par de vértices la presencia o ausencia de una arista es independiente del estado de otras posibles aristas: \[ p(\mathbf{y}\mid\boldsymbol{\theta}) = \frac{1}{\kappa}\,\exp{\left\{ \sum_{i<j} \theta_{i,j}\, y_{i,j} \right\}}\,. \] Además, asumiendo \(\theta_{i,j}\equiv\theta\), i.e., homogeneidad a través de toda la red: \[ p(\mathbf{y}\mid\boldsymbol{\theta}) = \frac{1}{\kappa}\,\exp{\left\{ \theta\, S(\mathbf{y}) \right\}} \] donde \(S(\mathbf{y}) = \sum_{i<j}\, y_{i,j}\) es el número de aristas en el grafo. Esto es, el log odds de de observar \(\mathbf{y}\) es proporcional al número de aristas en el grafo.

Simplificando, se tiene que el modelo resultante es equivalente a: \[ y_{i,j}\mid\theta\stackrel{\text{iid}}{\sim}\textsf{Bernoulli}(\textsf{expit}(\theta)) \] para \(i<j\), donde \(\textsf{expit}(x) = 1/(1+\exp(-x))\) es la función inversa de la función \(\textsf{logit}(x) = \log(x/(1-x))\).

edges en ergm.

3 Estrellas y triángulos

Es de interés incorporar estadísticos de orden superior asociados con la estructura global del grafo, como el número de \(k\)-estrellas (\(k\)-stars, \(k+1\) vértices y \(k\) aristas) \(S_k(\mathbf{y})\) y el número de triángulos (3-clique) \(T(\mathbf{y})\).

\(S_1(\mathbf{y})\) es el número de aristas.

Se especifica el modelo: \[ p(\mathbf{y}\mid\boldsymbol{\theta}) = \frac{1}{\kappa}\,\exp{\left\{ \sum_{k=1}^{n-1}\theta_k\,S_k(\mathbf{y}) + \theta_T\,T(\mathbf{y}) \right\}} \] donde \(n\) es el orden del grafo.

Comúnmente se ajusta considerando \(S_1(\mathbf{y})\), \(S_2(\mathbf{y})\), \(S_3(\mathbf{y})\), y \(T(\mathbf{y})\).
El estado de cualquier triángulo que contenga la díada \(y_{i,j}\) depende del estado de las díadas de la forma \(y_{i,k}\) y \(y_{j,k}\). Por lo tanto, las díadas no son probabilísticamente independientes entre sí.
En contraste con la independencia del modelo de Bernoulli, produce grafos con ciertos tipos de dependencia entre las aristas (dependencia Markoviana, dos aristas son dependientes siempre que compartan un vértice, dadas todas las demás aristas posibles).
Con frecuencia no se ajusta bien a los datos dado que tiene problemas de degeneración (distribución de probabilidad que coloca una cantidad desproporcionadamente grande de su masa en un conjunto de resultados pequeño).

kstar y triangle en ergm.

4 Restricciones paramétricas

Con el fin de mejorar el ajuste del modelo se proponen restricciones sobre los parámetros de las estrellas e la forma \(\theta_k \propto (-1)^k/\lambda^{k-2}\), para \(k\geq 2\) y algún \(\lambda\geq 1\). Esto combina las \(k\)-estrellas en un solo estadístico alternante (alternating \(k\)-star statistic) de la forma: \[ \textsf{AKS}(\mathbf{y}) = \sum_{k=2}^{n-1} \frac{(-1)^{k}}{\lambda^{k-2}}\,S_k(\mathbf{y}) \] el cual está asociado con un solo parámetro \(\theta_{\text{AKS}}\) que tiene en cuenta los efectos de todas las \(k\)-estrellas simultáneamente.

altkstar en ergm.

También se puede utilizar el conteo de grado ponderado geométricamente (geometrically weighted degree count), si se quiere incorporar el grado en el modelo: \[ \textsf{GWD}(\mathbf{y}) = \sum_{d=0}^{n-1} e^{-\gamma\,d} N_d(\mathbf{y}) \] donde \(N_d(\mathbf{y})\) es el número de vértices de grado \(d\) y \(\gamma>0\).

gwdegree o gwesp en ergm.

5 Atributos de los vértices

La plausibilidad de que una arista una a dos vértices depende no solo del estado (0 o 1) de las aristas entre otros pares de vértices, sino también de los atributos de los vértices (variables exógenas).

Los atributos de los vértices se pueden incluir en la formulación de un ERGM mediante \[ \boldsymbol{\mathsf{g}}(\mathbf{y},\mathbf{x}) = \sum_{i<j} y_{i,j}\,h(\mathbf{x}_i,\mathbf{x}_j) \] donde \(h(\mathbf{x}_i,\mathbf{x}_j)\) es una función simétrica de \(\mathbf{x}_i\) y \(\mathbf{x}_j\), y \(\mathbf{x}_i\) es el vector de atributos observados del vértice \(i\).

Efectos principales: \(h(x_i,x_j) = x_i + x_j\). nodecov o nodemain en ergm.
Efectos de segundo orden (homofilia): \(h(x_i,x_j) = |x_i - x_j|\). nodematch o absdif en ergm.
Efectos de segundo orden (homofilia): \(h(x_i,x_j) = I_{\{x_i=x_j\}}\). match en ergm.

6 Ajuste del modelo

El estimador de máxima verosimilitud (maximum likelihood estimator, MLE) de \(\boldsymbol{\theta}\) es \[ \hat{\boldsymbol{\theta}}_{\text{MLE}} = \text{arg max}_{\boldsymbol{\theta}}\, \ell(\boldsymbol{\theta}) = \text{arg max}_{\boldsymbol{\theta}}\, \left(\boldsymbol{\theta}^{\textsf{T}}\,\boldsymbol{\mathsf{g}}(\mathbf{y}) - \psi(\boldsymbol{\theta})\right) \] donde \(\psi(\boldsymbol{\theta}) = \log \kappa(\boldsymbol{\theta})\).

La función \(\psi(\boldsymbol{\theta})\) no se puede evaluar explícitamente, dado que involucra una suma sobre \(2^{\binom{n}{2}}\) términos (uno para cada posible valor de \(\mathbf{y}\)), para cada \(\boldsymbol{\theta}\).

El paquete ergm utiliza una versión estimación de máxima verosimilitud por medio de cadenas de Markov de Monte Carlo (Markoc chain Monte Carlo, MCMC).

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society: Series B (Methodological), 54(3), 657-683.
Hunter, D. R., & Handcock, M. S. (2006). Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15(3), 565-583.

7 Ejemplo: Familias Florentinas

Conjunto de datos de lazos matrimoniales y comerciales entre familias florentinas del Renacimiento.

Padgett, John F. 1994. Marriage and Elite Structure in Renaissance Florence, 1282-1500. Paper delivered to the Social Science History Association.

# paquetes
suppressMessages(suppressWarnings(library(ergm)))
# datos
data(florentine)
flomarriage

##  Network attributes:
##   vertices = 16 
##   directed = FALSE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 20 
##     missing edges= 0 
##     non-missing edges= 20 
## 
##  Vertex attribute names: 
##     priorates totalties vertex.names wealth 
## 
## No edge attributes

# clase
class(flomarriage)

## [1] "network"

# atributos
wealth <- flomarriage %v% 'wealth'
wealth

##  [1]  10  36  55  44  20  32   8  42 103  48  49   3  27  10 146  48

Parace haber una relación entre la riqueza y la sociabilidad.

# grafico
par(mfrow=c(1,2)) # Setup a 2 panel plot
plot(flomarriage, main = "Matrimonios", label = network.vertex.names(flomarriage))
plot(flomarriage, vertex.cex = wealth/25, main = "Matrimoios por riqueza")

7.1 Modelo Bernoulli

# formulacion del modelo
ergm.01 <- formula(flomarriage ~ edges)
summary(ergm.01)

## edges 
##    20

# ajuste del modelo
set.seed(42)
ergm.fit.01 <- ergm(formula = ergm.01)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

anova(ergm.fit.01)

## Analysis of Variance Table
## 
## Model 1: flomarriage ~ edges
##          Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
## NULL                       120     166.35                 
## Model 1:  1   58.221       119     108.14    2.343e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(ergm.fit.01)

## Call:
## ergm(formula = ergm.01)
## 
## Maximum Likelihood Results:
## 
##       Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges  -1.6094     0.2449      0  -6.571   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.4  on 120  degrees of freedom
##  Residual Deviance: 108.1  on 119  degrees of freedom
##  
## AIC: 110.1  BIC: 112.9  (Smaller is better. MC Std. Err. = 0)

# extraer coeficientes
coef(ergm.fit.01)

##     edges 
## -1.609438

# de network a igraph
A <- as.matrix.network.adjacency(x = flomarriage)
g <- igraph::graph_from_adjacency_matrix(adjmatrix = A, mode = "undirected")
# expit de -1.4992
1/(1+exp(-(-1.609438)))

## [1] 0.1666667

exp(-1.609438)/(1+exp(-1.609438))

## [1] 0.1666667

# densidad
igraph::edge_density(graph = g)

## [1] 0.1666667

# mas componentes 
names(ergm.fit.01)

##  [1] "coefficients"    "iterations"      "MCMCtheta"       "gradient"       
##  [5] "hessian"         "covar"           "failure"         "glm"            
##  [9] "glm.null"        "xmat.full"       "call"            "ergm_version"   
## [13] "offset"          "MPLE_is_MLE"     "drop"            "estimable"      
## [17] "network"         "reference"       "newnetwork"      "formula"        
## [21] "constrained"     "constraints"     "obs.constraints" "nw.stats"       
## [25] "etamap"          "estimate"        "estimate.desc"   "control"        
## [29] "null.lik"        "mle.lik"

7.2 Modelo con aristas y triángulos

# formulacion del modelo
ergm.02 <- formula(flomarriage ~ edges + triangles)
summary(ergm.02)

##    edges triangle 
##       20        3

# ajuste del modelo
set.seed(42)
ergm.fit.02 <- ergm(formula = ergm.02)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Starting Monte Carlo maximum likelihood estimation (MCMLE):

## Iteration 1 of at most 60:

## Optimizing with step length 1.0000.

## The log-likelihood improved by 0.0062.

## Convergence test p-value: < 0.0001. Converged with 99% confidence.
## Finished MCMLE.
## Evaluating log-likelihood at the estimate. Fitting the dyad-independent submodel...
## Bridging between the dyad-independent submodel and the full model...
## Setting up bridge sampling...
## Using 16 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 .
## Bridging finished.
## This model was fit using MCMC.  To examine model diagnostics and check
## for degeneracy, use the mcmc.diagnostics() function.

anova(ergm.fit.02)

## Analysis of Variance Table
## 
## Model 1: flomarriage ~ edges + triangles
##          Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
## NULL                       120     166.35                 
## Model 1:  2   58.285       118     108.07    2.206e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(ergm.fit.02)

## Call:
## ergm(formula = ergm.02)
## 
## Monte Carlo Maximum Likelihood Results:
## 
##          Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges     -1.6898     0.3595      0  -4.700   <1e-04 ***
## triangle   0.1622     0.6342      0   0.256    0.798    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.4  on 120  degrees of freedom
##  Residual Deviance: 108.1  on 118  degrees of freedom
##  
## AIC: 112.1  BIC: 117.6  (Smaller is better. MC Std. Err. = 0.009767)

Para una arista que no cree ningún triángulo, el log-odds es \(-1.6898\), y la probabilidad correspondiente es \(0.1558021\).
Para una arista que cree un triángulo, el log-odds es \(-1.6898 + 0.1622 = -1.5276\), y la probabilidad correspondiente es \(0.1783451\).

# probabilidades
expit <- function(x) 1/(1+exp(-x))
expit(-1.6898)

## [1] 0.1558021

expit(-1.5276)

## [1] 0.1783451

7.3 Modelo con una covariable

La riqueza parecía estar asociada con un mayor grado en esta red. En efecto, hay un efecto positivo de la riqueza sobre la probabilidad de un observar una arista:

# covariable
summary(wealth)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3.00   17.50   39.00   42.56   48.25  146.00

# formulacion del modelo 
ergm.03 <- formula(flomarriage ~ edges + nodemain('wealth'))
summary(ergm.03)

##          edges nodecov.wealth 
##             20           2168

# ajuste del modelo
set.seed(42)
ergm.fit.03 <- ergm(formula = ergm.03)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

anova(ergm.fit.03)

## Analysis of Variance Table
## 
## Model 1: flomarriage ~ edges + nodemain("wealth")
##          Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
## NULL                       120     166.35                 
## Model 1:  2   63.247       118     103.11    1.846e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(ergm.fit.03)

## Call:
## ergm(formula = ergm.03)
## 
## Maximum Likelihood Results:
## 
##                 Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges          -2.594929   0.536056      0  -4.841   <1e-04 ***
## nodecov.wealth  0.010546   0.004674      0   2.256   0.0241 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.4  on 120  degrees of freedom
##  Residual Deviance: 103.1  on 118  degrees of freedom
##  
## AIC: 107.1  BIC: 112.7  (Smaller is better. MC Std. Err. = 0)

La probabilidad condicional en escala logit (log-odds) de que una arista esté presente entre dos actores, manteniendo fija el resto de la red, es \[ \text{logit}\,\textsf{Pr}(y_{i,j}=1\mid\mathbf{y}_{-(i,j)}) = -2.594929\cdot\text{cambio en el n. de aristas} + 0.010546\cdot\text{suma de la riqueza de los dos nodos} \] - Para una arista entre dos nodos con la mínima riqueza, el log-odds condicional es \(-2.594929 + 0.010546*(3+3)=-2.531653\). - Para una arista entre dos nodos con la máxima riqueza, el log-odds condicional es \(-2.594929 + 0.010546*(146+146)=0.484503\). - Las probabilidades correspondientes son \(0.07366876\) y \(0.6188106\).

# probabilidades
expit(-2.531653)

## [1] 0.07366876

expit(0.484503)

## [1] 0.6188106

8 Ejemplo: AddHealth

Red de amistad mutua basada en una de las escuelas del estudio AddHealth, Wave I. La comunidad escolar en la que se basa se encuentra en la zona rural del oeste de los EE. UU., con un alumnado mayoritariamente hispano y nativo americano.

Resnick M.D., Bearman, P.S., Blum R.W. et al. (1997). Protecting adolescents from harm. Findings from the National Longitudinal Study on Adolescent Health, Journal of the American Medical Association, 278: 823-32.

Se examina la homofilia en las amistades por grado y raza. Ambos son atributos discretos.

# datos (version simulada)
data(faux.mesa.high)
mesa <- faux.mesa.high
mesa

##  Network attributes:
##   vertices = 205 
##   directed = FALSE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 203 
##     missing edges= 0 
##     non-missing edges= 203 
## 
##  Vertex attribute names: 
##     Grade Race Sex 
## 
## No edge attributes

table(mesa %v% 'Grade')

## 
##  7  8  9 10 11 12 
## 62 40 42 25 24 12

table(mesa %v% 'Race')

## 
## Black  Hisp NatAm Other White 
##     6   109    68     4    18

table(mesa %v% 'Sex')

## 
##   F   M 
##  99 106

# grafico
par(mfrow = c(1,1))
plot(mesa, vertex.col = 'Grade')
legend('bottomleft', fill = 7:12, legend = paste('Grade',7:12), cex = 0.75)

# formulacion del modelo
ergm.04 <- formula(mesa ~ edges + nodefactor('Grade') + nodematch('Grade', diff=T) + nodefactor('Race') + nodematch('Race', diff=T))
summary(ergm.04)

##                 edges    nodefactor.Grade.8    nodefactor.Grade.9 
##                   203                    75                    65 
##   nodefactor.Grade.10   nodefactor.Grade.11   nodefactor.Grade.12 
##                    36                    49                    28 
##     nodematch.Grade.7     nodematch.Grade.8     nodematch.Grade.9 
##                    75                    33                    23 
##    nodematch.Grade.10    nodematch.Grade.11    nodematch.Grade.12 
##                     9                    17                     6 
##  nodefactor.Race.Hisp nodefactor.Race.NatAm nodefactor.Race.Other 
##                   178                   156                     1 
## nodefactor.Race.White  nodematch.Race.Black   nodematch.Race.Hisp 
##                    45                     0                    53 
##  nodematch.Race.NatAm  nodematch.Race.Other  nodematch.Race.White 
##                    46                     0                     4

¡Cuidado! Hay varios 0s.

# ajuste del modelo
set.seed(42)
ergm.fit.04 <- ergm(formula = ergm.04)

## Observed statistic(s) nodematch.Race.Black and nodematch.Race.Other are at their smallest attainable values. Their coefficients will be fixed at -Inf.

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

anova(ergm.fit.04)

## Analysis of Variance Table
## 
## Model 1: mesa ~ edges + nodefactor("Grade") + nodematch("Grade", diff = T) + 
##     nodefactor("Race") + nodematch("Race", diff = T)
##          Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
## NULL                     20910    28987.4                 
## Model 1: 19    27161     20891     1826.8    < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(ergm.fit.04)

## Call:
## ergm(formula = ergm.04)
## 
## Maximum Likelihood Results:
## 
##                       Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges                  -8.0538     1.2561      0  -6.412  < 1e-04 ***
## nodefactor.Grade.8      1.5201     0.6858      0   2.216 0.026663 *  
## nodefactor.Grade.9      2.5284     0.6493      0   3.894  < 1e-04 ***
## nodefactor.Grade.10     2.8652     0.6512      0   4.400  < 1e-04 ***
## nodefactor.Grade.11     2.6291     0.6563      0   4.006  < 1e-04 ***
## nodefactor.Grade.12     3.4629     0.6566      0   5.274  < 1e-04 ***
## nodematch.Grade.7       7.4662     1.1730      0   6.365  < 1e-04 ***
## nodematch.Grade.8       4.2882     0.7150      0   5.997  < 1e-04 ***
## nodematch.Grade.9       2.0371     0.5538      0   3.678 0.000235 ***
## nodematch.Grade.10      1.2489     0.6233      0   2.004 0.045111 *  
## nodematch.Grade.11      2.4521     0.6124      0   4.004  < 1e-04 ***
## nodematch.Grade.12      1.2987     0.6981      0   1.860 0.062824 .  
## nodefactor.Race.Hisp   -1.6659     0.2963      0  -5.622  < 1e-04 ***
## nodefactor.Race.NatAm  -1.4725     0.2869      0  -5.132  < 1e-04 ***
## nodefactor.Race.Other  -2.9618     1.0372      0  -2.856 0.004296 ** 
## nodefactor.Race.White  -0.8488     0.2958      0  -2.869 0.004112 ** 
## nodematch.Race.Black      -Inf     0.0000      0    -Inf  < 1e-04 ***
## nodematch.Race.Hisp     0.6912     0.3451      0   2.003 0.045153 *  
## nodematch.Race.NatAm    1.2482     0.3550      0   3.517 0.000437 ***
## nodematch.Race.Other      -Inf     0.0000      0    -Inf  < 1e-04 ***
## nodematch.Race.White    0.3140     0.6405      0   0.490 0.623947    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 28987  on 20910  degrees of freedom
##  Residual Deviance:  1827  on 20889  degrees of freedom
##  
## AIC: 1865  BIC: 2016  (Smaller is better. MC Std. Err. = 0)
## 
##  Warning: The following terms have infinite coefficient estimates:
##   nodematch.Race.Black nodematch.Race.Other

Algunos coeficientes dan como resultado -Inf:

mixingmatrix(object = mesa, attrname = "Race")

##       Black Hisp NatAm Other White
## Black     0    8    13     0     5
## Hisp      8   53    41     1    22
## NatAm    13   41    46     0    10
## Other     0    1     0     0     0
## White     5   22    10     0     4

## Note:  Marginal totals can be misleading for undirected mixing matrices.

9 Ejemplo: Sampson

Sampson (1969) registró las interacciones sociales entre un grupo de monjes mientras él era un residente como experimentador en el claustro. De particular interés son los datos sobre relaciones de afecto positivo (“agrado”), en los que se preguntó a cada monje si tenía relaciones positivas con cada uno de los demás monjes. Cada monje clasificó solo sus tres opciones principales (o cuatro, en el caso de empates) en “agrado”. Aquí, consideramos que existe auna arista dirigida del monje A al monje B si A nombró a B entre estas opciones principales.

Sampson, S.~F. (1968), A novitiate in a period of change: An experimental and case study of relationships, Unpublished Ph.D. dissertation, Department of Sociology, Cornell University.

# datos
data(samplk)
samplk3

##  Network attributes:
##   vertices = 18 
##   directed = TRUE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 56 
##     missing edges= 0 
##     non-missing edges= 56 
## 
##  Vertex attribute names: 
##     cloisterville group vertex.names 
## 
## No edge attributes

# grafico
plot(samplk3, vertex.cex = 2)

# formulacion del modelo
ergm.05 <- formula(samplk3 ~ edges + mutual)
summary(ergm.05)

##  edges mutual 
##     56     15

15 de las 56 aristas dirigidas son mutuas.

# ajuste del modelo
set.seed(42)
ergm.fit.05 <- ergm(formula = ergm.05)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Starting Monte Carlo maximum likelihood estimation (MCMLE):

## Iteration 1 of at most 60:

## Optimizing with step length 1.0000.

## The log-likelihood improved by 0.0031.

## Convergence test p-value: < 0.0001. Converged with 99% confidence.
## Finished MCMLE.
## Evaluating log-likelihood at the estimate. Fitting the dyad-independent submodel...
## Bridging between the dyad-independent submodel and the full model...
## Setting up bridge sampling...
## Using 16 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 .
## Bridging finished.
## This model was fit using MCMC.  To examine model diagnostics and check
## for degeneracy, use the mcmc.diagnostics() function.

anova(ergm.fit.05)

## Analysis of Variance Table
## 
## Model 1: samplk3 ~ edges + mutual
##          Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
## NULL                       306     424.21                 
## Model 1:  2   156.28       304     267.92    < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(ergm.fit.05)

## Call:
## ergm(formula = ergm.05)
## 
## Monte Carlo Maximum Likelihood Results:
## 
##        Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges   -2.1685     0.2199      0  -9.860   <1e-04 ***
## mutual   2.3127     0.4745      0   4.874   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 424.2  on 306  degrees of freedom
##  Residual Deviance: 267.9  on 304  degrees of freedom
##  
## AIC: 271.9  BIC: 279.4  (Smaller is better. MC Std. Err. = 0.3185)

Hay un efecto de reciprocidad fuerte y significativo.
Los coeficientes asociados la densidad y reciprocidad se cancelan aproximadamente para una arista mutua,lo que significa que la probabilidad correspondiente es aproximadamente del 50%.
Para una arista no mutua se tiene un log-odds condicional del \(-2,16\), esto es, una probabilidad del 10%.

10 Ejemplo: Lazega

Red de relaciones de trabajo colaborativo entre miembros de una firma de abogados. Disponible en la librería sand de R.

Lazega, E. (2001). The collegial phenomenon: The social mechanisms of cooperation among peers in a corporate law partnership. Oxford University Press on Demand.

10.1 Preparación de los datos

# paquetes
suppressMessages(suppressWarnings(library(ergm)))
suppressMessages(suppressWarnings(library(sand)))
# datos
data(lazega)
# matriz de adyacencia
A <- igraph::as_adjacency_matrix(graph = lazega)
# simetrica?
isSymmetric(as.matrix(A))

## [1] TRUE

# atributos (variables exogenas)
v.attrs <- igraph::as_data_frame(x = lazega, what = "vertices")
head(v.attrs)

##    name Seniority Status Gender Office Years Age Practice School
## V1   V1         1      1      1      1    31  64        1      1
## V2   V2         2      1      1      1    32  62        2      1
## V3   V3         3      1      1      2    13  67        1      1
## V4   V4         4      1      1      1    31  59        2      3
## V5   V5         5      1      1      2    31  59        1      2
## V6   V6         6      1      1      2    29  55        1      1

# formato network 
lazega.s <- network::as.network(x = as.matrix(A), directed = F)
# establecer atributos
network::set.vertex.attribute(x = lazega.s, attrname = "Office",    value = v.attrs$Office)
network::set.vertex.attribute(x = lazega.s, attrname = "Practice",  value = v.attrs$Practice)
network::set.vertex.attribute(x = lazega.s, attrname = "Gender",    value = v.attrs$Gender)
network::set.vertex.attribute(x = lazega.s, attrname = "Seniority", value = v.attrs$Seniority)

10.2 Modelo de estrellas y triángulos

# formulacion del modelo
my.ergm <- formula(lazega.s ~ edges + kstar(2) + kstar(3) + triangle)
summary(my.ergm)

##    edges   kstar2   kstar3 triangle 
##      115      926     2681      120

10.3 Modelo con restricciones paramétricas

\(\textsf{GWD}(\mathbf{y})\) con \(\gamma = 1\).

# formulacion del modelo
my.ergm <- formula(lazega.s ~ edges + gwesp(decay = 1, fixed = T))
summary(my.ergm)

##         edges gwesp.fixed.1 
##      115.0000      213.1753

10.4 Modelo con variables exógenas

\[ p(\mathbf{y}\mid\theta_1,\theta_2,\boldsymbol{\beta}) = \frac{1}{\kappa(\theta_1,\theta_2,\boldsymbol{\beta})}\,\exp{\left\{ \theta_1 S_1(\mathbf{y}) + \theta_2\textsf{GWD}(\mathbf{y}) + \boldsymbol{\beta}^{\textsf{T}}\boldsymbol{\mathsf{g}}(\mathbf{y},\mathbf{x}) \right\}} \]

lazega.ergm <- formula(lazega.s ~ edges + gwesp(decay = 1, fixed = T) 
                       + nodemain("Seniority")
                       + nodemain("Practice")
                       + match("Practice")
                       + match("Gender")
                       + match("Office"))
summary(lazega.ergm)

##              edges      gwesp.fixed.1  nodecov.Seniority   nodecov.Practice 
##           115.0000           213.1753          4687.0000           359.0000 
## nodematch.Practice   nodematch.Gender   nodematch.Office 
##            72.0000            99.0000            85.0000

set.seed(42)
lazega.ergm.fit <- ergm(formula = lazega.ergm)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Starting Monte Carlo maximum likelihood estimation (MCMLE):

## Iteration 1 of at most 60:

## Optimizing with step length 1.0000.

## The log-likelihood improved by 0.3105.

## Estimating equations are not within tolerance region.

## Iteration 2 of at most 60:

## Optimizing with step length 1.0000.

## The log-likelihood improved by 0.0645.

## Convergence test p-value: 0.3379. Not converged with 99% confidence; increasing sample size.
## Iteration 3 of at most 60:
## Optimizing with step length 1.0000.
## The log-likelihood improved by 0.0488.
## Convergence test p-value: 0.7349. Not converged with 99% confidence; increasing sample size.
## Iteration 4 of at most 60:
## Optimizing with step length 1.0000.
## The log-likelihood improved by 0.0206.
## Convergence test p-value: 0.1764. Not converged with 99% confidence; increasing sample size.
## Iteration 5 of at most 60:
## Optimizing with step length 1.0000.
## The log-likelihood improved by 0.0066.
## Convergence test p-value: < 0.0001. Converged with 99% confidence.
## Finished MCMLE.
## Evaluating log-likelihood at the estimate. Fitting the dyad-independent submodel...
## Bridging between the dyad-independent submodel and the full model...
## Setting up bridge sampling...
## Using 16 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 .
## Bridging finished.
## This model was fit using MCMC.  To examine model diagnostics and check
## for degeneracy, use the mcmc.diagnostics() function.

anova(lazega.ergm.fit)

## Analysis of Variance Table
## 
## Model 1: lazega.s ~ edges + gwesp(decay = 1, fixed = T) + nodemain("Seniority") + 
##     nodemain("Practice") + match("Practice") + match("Gender") + 
##     match("Office")
##          Df Deviance Resid. Df Resid. Dev Pr(>|Chisq|)    
## NULL                       630     873.37                 
## Model 1:  7   415.71       623     457.66    < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(lazega.ergm.fit)

## Call:
## ergm(formula = lazega.ergm)
## 
## Monte Carlo Maximum Likelihood Results:
## 
##                     Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges              -7.095481   0.683921      0 -10.375  < 1e-04 ***
## gwesp.fixed.1       0.666311   0.099177      0   6.718  < 1e-04 ***
## nodecov.Seniority   0.024377   0.006338      0   3.846 0.000120 ***
## nodecov.Practice    0.398218   0.107373      0   3.709 0.000208 ***
## nodematch.Practice  0.770408   0.187214      0   4.115  < 1e-04 ***
## nodematch.Gender    0.724602   0.248256      0   2.919 0.003514 ** 
## nodematch.Office    1.167166   0.194039      0   6.015  < 1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 873.4  on 630  degrees of freedom
##  Residual Deviance: 457.7  on 623  degrees of freedom
##  
## AIC: 471.7  BIC: 502.8  (Smaller is better. MC Std. Err. = 0.2505)

Hay evidencia de un efecto de transitividad no trivial.

Manteniendo constantes los valores de las otras estadísticas, condicionalmente se tiene que:

Practicar derecho corporativo, en lugar de litigios, aumenta el odds de cooperación en un factor de \(\text{exp}(0.3946) = 1.484\), lo que corresponde a una probabilidad del 50%.
Ser del mismo género el oods de cooperación, ya que \(\text{exp}(0.7377) = 2.091\).

11 Ejemplo: Familias Florentinas (cont.)

Se considera el término degree(1) para determinar si hay más (o menos) nodos de grado 1 de los que se esperarían, dada la densidad.

# formulacion del modelo
summary(flobusiness ~ edges + degree(1))

##   edges degree1 
##      15       3

# ajuste el modelo
fit <- ergm(flobusiness ~ edges + degree(1))

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Starting Monte Carlo maximum likelihood estimation (MCMLE):

## Iteration 1 of at most 60:

## Optimizing with step length 1.0000.

## The log-likelihood improved by 0.3101.

## Estimating equations are not within tolerance region.

## Iteration 2 of at most 60:

## Optimizing with step length 1.0000.

## The log-likelihood improved by 0.0351.

## Convergence test p-value: 0.0003. Converged with 99% confidence.
## Finished MCMLE.
## Evaluating log-likelihood at the estimate. Fitting the dyad-independent submodel...
## Bridging between the dyad-independent submodel and the full model...
## Setting up bridge sampling...
## Using 16 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 .
## Bridging finished.
## This model was fit using MCMC.  To examine model diagnostics and check
## for degeneracy, use the mcmc.diagnostics() function.

summary(fit)

## Call:
## ergm(formula = flobusiness ~ edges + degree(1))
## 
## Monte Carlo Maximum Likelihood Results:
## 
##         Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges    -2.1179     0.2965      0  -7.144   <1e-04 ***
## degree1  -0.6034     0.6906      0  -0.874    0.382    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 166.36  on 120  degrees of freedom
##  Residual Deviance:  89.37  on 118  degrees of freedom
##  
## AIC: 93.37  BIC: 98.95  (Smaller is better. MC Std. Err. = 0.0338)

11.1 Convergencia

Cuando los términos dependientes de la díada están en el modelo, los algoritmos computacionales en ergm usan MCMC (con un muestreador de Metropolis-Hastings) para estimar los parámetros.

mcmc.diagnostics(fit)

## Sample statistics summary:
## 
## Iterations = 20992:131072
## Thinning interval = 512 
## Number of chains = 1 
## Sample size per chain = 216 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##             Mean    SD Naive SE Time-series SE
## edges    0.06944 3.593   0.2445         0.2708
## degree1 -0.39352 1.543   0.1050         0.1050
## 
## 2. Quantiles for each variable:
## 
##         2.5%   25% 50%  75% 97.5%
## edges     -6 -2.25   0 2.00 6.625
## degree1   -3 -1.00  -1 0.25 3.000
## 
## 
## Are sample statistics significantly different from observed?
##                 edges       degree1 Overall (Chi^2)
## diff.      0.06944444 -0.3935185185              NA
## test stat. 0.25647099 -3.7494437496    15.029026139
## P-val.     0.79758717  0.0001772272     0.000725652
## 
## Sample statistics cross-correlations:
##              edges    degree1
## edges    1.0000000 -0.3323922
## degree1 -0.3323922  1.0000000
## 
## Sample statistics auto-correlation:
## Chain 1 
##                edges     degree1
## Lag 0     1.00000000 1.000000000
## Lag 512   0.09943339 0.038685469
## Lag 1024  0.09820085 0.009829421
## Lag 1536 -0.05466601 0.115088038
## Lag 2048 -0.03678104 0.052230451
## Lag 2560  0.03714065 0.047601743
## 
## Sample statistics burn-in diagnostic (Geweke):
## Chain 1 
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##   edges degree1 
## -1.0000 -0.3513 
## 
## Individual P-values (lower = worse):
##     edges   degree1 
## 0.3173269 0.7253321 
## Joint P-value (lower = worse):  0.3485651 .

## 
## MCMC diagnostics shown here are from the last round of simulation, prior to computation of final parameter estimates. Because the final estimates are refinements of those used for this simulation run, these diagnostics may understate model performance. To directly assess the performance of the final model on in-model statistics, please use the GOF command: gof(ergmFitObject, GOF=~model).

11.2 Simulación

Una vez que hemos estimado los coeficientes de un ERGM, el modelo está completamente especificado. Define una distribución de probabilidad en todas las redes de este tamaño. Si el modelo se ajusta bien a los datos observados, es más probable que las redes simuladas de esta distribución se “parezcan” a los datos observados.

# ajuste del modelo
set.seed(42)
ergm.03 <- formula(flomarriage ~ edges + nodemain('wealth'))
ergm.fit.03 <- ergm(formula = ergm.03)

## Starting maximum pseudolikelihood estimation (MPLE):

## Evaluating the predictor and response matrix.

## Maximizing the pseudolikelihood.

## Finished MPLE.

## Stopping at the initial estimate.

## Evaluating log-likelihood at the estimate.

# simulacion
sim.03 <- simulate(object = ergm.fit.03, nsim = 10, seed = 42)
# clase
class(sim.03)

## [1] "network.list"

# resumen de la simulacion
summary(sim.03)

## Number of Networks: 10 
## Model: flomarriage ~ edges + nodemain("wealth") 
## Reference: ~Bernoulli 
## Constraints: ~. 
## Stored network statistics:
##       edges nodecov.wealth
##  [1,]    21           2351
##  [2,]    19           2266
##  [3,]    18           2026
##  [4,]    23           2451
##  [5,]    18           2204
##  [6,]    17           1841
##  [7,]    18           2188
##  [8,]    20           2385
##  [9,]    15           1628
## [10,]    21           2315
## attr(,"monitored")
## [1] FALSE FALSE

## Number of Networks: 10 
## Model: flomarriage ~ edges + nodemain("wealth") 
## Reference: ~Bernoulli 
## Constraints: ~.

# atributos de la simulacion
attributes(sim.03)

## $coefficients
##          edges nodecov.wealth 
##    -2.59492903     0.01054591 
## 
## $control
## Control parameter list generated by 'control.simulate.formula' or equivalent. Non-empty parameters:
## MCMC.burnin: 16384
## MCMC.interval: 1024
## MCMC.scale: 1
## MCMC.prop: ~sparse
## MCMC.prop.weights: "default"
## MCMC.batch: 0
## MCMC.effectiveSize.damp: 10
## MCMC.effectiveSize.maxruns: 1000
## MCMC.effectiveSize.burnin.pval: 0.2
## MCMC.maxedges: Inf
## MCMC.runtime.traceplot: FALSE
## network.output: "network"
## parallel: 0
## parallel.version.check: TRUE
## parallel.inherit.MT: FALSE
## MCMC.samplesize: 10
## obs.MCMC.mul: 0.25
## obs.MCMC.samplesize.mul: 0.5
## obs.MCMC.interval.mul: 0.5
## obs.MCMC.burnin.mul: 0.5
## obs.MCMC.prop: ~sparse
## obs.MCMC.prop.weights: "default"
## MCMC.save_networks: TRUE
## 
## $response
## [1] NA
## 
## $class
## [1] "network.list"
## 
## $stats
##       edges nodecov.wealth
##  [1,]    21           2351
##  [2,]    19           2266
##  [3,]    18           2026
##  [4,]    23           2451
##  [5,]    18           2204
##  [6,]    17           1841
##  [7,]    18           2188
##  [8,]    20           2385
##  [9,]    15           1628
## [10,]    21           2315
## attr(,"monitored")
## [1] FALSE FALSE
## 
## $formula
## flomarriage ~ edges + nodemain("wealth")
## attr(,".Basis")
##  Network attributes:
##   vertices = 16 
##   directed = FALSE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 20 
##     missing edges= 0 
##     non-missing edges= 20 
## 
##  Vertex attribute names: 
##     priorates totalties vertex.names wealth 
## 
## No edge attributes
## 
## $constraints
## ~.
## <environment: 0x0000000021bd4388>
## 
## $reference
## ~Bernoulli
## <environment: 0x0000000021cd65b0>

# valores simulados vs valores observados
rbind("Valoes obs." = summary(ergm.03),
      "Media  sim." = colMeans(attr(sim.03, "stats")))

##             edges nodecov.wealth
## Valoes obs.    20         2168.0
## Media  sim.    19         2165.5

# grafico de una red simulada
plot(sim.03[[1]], label = sim.03[[1]] %v% "vertex.names", vertex.cex = (sim.03[[1]] %v% "wealth")/25)

11.3 Bondad de ajuste

Una prueba de si un modelo “se ajusta a los datos” es qué tan bien reproduce las propiedades de la red observada que no están en el modelo. Hacemos esto eligiendo un estadístico de prueba que no esté en el modelo y comparando el valor del estadístico observado en la red original con la distribución de valores que obtenemos en redes simuladas del modelo.

# bondad de ajuste del modelo
# solo tres estadistico de prueba
gof.03 <- gof(object = ergm.fit.03)
gof.03

## 
## Goodness-of-fit for degree 
## 
##          obs min mean max MC p-value
## degree0    1   0 1.24   4       1.00
## degree1    4   0 3.55   9       0.94
## degree2    2   1 4.35  10       0.34
## degree3    6   0 3.12   6       0.08
## degree4    2   0 1.93   5       1.00
## degree5    0   0 1.09   5       0.70
## degree6    1   0 0.42   3       0.62
## degree7    0   0 0.20   1       1.00
## degree8    0   0 0.02   1       1.00
## degree9    0   0 0.06   1       1.00
## degree10   0   0 0.01   1       1.00
## degree12   0   0 0.01   1       1.00
## 
## Goodness-of-fit for edgewise shared partner 
## 
##      obs min  mean max MC p-value
## esp0  12   2 12.09  22        1.0
## esp1   7   0  6.05  13        0.9
## esp2   1   0  1.35   8        1.0
## esp3   0   0  0.26   3        1.0
## esp4   0   0  0.05   2        1.0
## esp5   0   0  0.01   1        1.0
## 
## Goodness-of-fit for minimum geodesic distance 
## 
##     obs min  mean max MC p-value
## 1    20  12 19.81  30       1.00
## 2    35  10 34.41  70       1.00
## 3    32   0 26.67  41       0.64
## 4    15   0 11.66  24       0.64
## 5     3   0  4.01  16       0.98
## 6     0   0  0.90   8       1.00
## 7     0   0  0.18   5       1.00
## 8     0   0  0.03   2       1.00
## Inf  15   0 22.33  87       1.00
## 
## Goodness-of-fit for model statistics 
## 
##                 obs  min    mean  max MC p-value
## edges            20   12   19.81   30          1
## nodecov.wealth 2168 1233 2151.59 3792          1

# graficos
plot(gof.03)

Modelos de grafos aleatorios exponenciales

Juan Sosa PhD

jcsosam@unal.edu.co