I have always believed that you cannot fully understand something until you build it from scratch. Applying this to networks, we cannot fully understand their properties until we simulated some networks with given parameters. Here is an example. Even if you have a lot of experience working with networks, I guarantee that you will be surprised once you have worked through the code–I was.
Start by loading the necessary packages:
library(ergm)
library(network)
library(ggplot2)
library(devtools)
library(scales)
library(geomnet)
library(RColorBrewer)
Suppose the goal is to simulate a random undirected network in which each pair of blue nodes is connected with probability p1, each pair of red notes is connected with probability p2, and each pair of non-matching nodes is connected with probability p.
This is a bit distracting, but it will make sense momentarily. I start by write a function that returns the parity of a number (we’ll use this to assign color to nodes).
odd<-function(x){
if (x %% 2 == 0) {
return(0)
}
else {return(1)
}
}
Now the actual function that simulates a network:
gen.net = function(n, p, p1, p2) {
#Genarate a list of possible edge ids among n nodes:
ids<-expand.grid(id1=1:n,id2=1:n)
ids<-ids[ids[,1]<ids[,2],]
#Generate an edgewise covariate based on "color":
twored<-NULL
twoblue<-NULL
edges<-NULL
for (i in seq(1:nrow(ids))) {
twored[i]<-as.numeric(odd(ids[i,1])==1 & odd(ids[i,2])==1)
twoblue[i]<-as.numeric(odd(ids[i,1])==0 & odd(ids[i,2])==0)
}
#Generate connections based on color and probabilities:
for (i in seq(1:nrow(ids))) {
if (twoblue[i]==1){
edges[i]<-rbinom(1,1,p1)
}
else if (twored[i]==1){
edges[i]<-rbinom(1,1,p2)
}
else {edges[i]<-rbinom(1,1,p)
}
}
#return(cbind(ids,twoblue, twored, edge))
# From the values above, generate a symmetric matrix
networkmat = matrix(rep(NA, n * n), ncol = n)
for (i in 1:length(edges)) {
k<-ids[i,1]
j<-ids[i,2]
networkmat[k, j] = edges[i]
networkmat[j, k] = edges[i]
}
diag(networkmat)<-0
return(networkmat)
}
Let’s test it out by giving it the following arguments: n=100 (we want to simulate a network with 100 nodes), p=.01 (probability of an edge between two nodes of different color), p1=0.03 (probability of an edge beween two blue nodes), p2=0.05 (probability of an edge beween two red nodes). Notice how small each of the probabilities are.
mynet <- gen.net(100, 0.01,0.03,0.05)
What would such a network look like? The next step if to prepare the data for graphing using geom_net:
#Get a list of edges:
mynet<-network(mynet, directed=FALSE) # I actually find it easier to briefly convert the matrix to network class in order to quickly extract the list of edges.
d <- as.matrix(mynet,matrix.type="edgelist")
colnames(d)<-c("id2","id1") #I always make sure that the first column contains the nodes with smaller id.
mynet<-as.data.frame(d[,c(2,1)]) #This is just re-ordering of the columns for my own convenience.
#Get a list of vertices:
mynodes<-attr(d,"vnames")
mynodes<-as.data.frame(mynodes)
#Combine edges and nodes into an object suitable for geomnet:
mynet1<-fortify(as.edgedf(mynet), mynodes, directed=FALSE) #This is the most straightforward way to get the data into the format necessary for geom_net. Notice that #all you need to specify is a list of edges and a list of nodes.
Before graphing, I also want to add the node color to the data. I don’t know if you noticed before, but I used the parity of the node identifyier (even, odd) to give color to each node (odd=blue, even=red). Now I will use the same rule to add colors to mynet1
for (i in seq(1:nrow(mynet1))) {
mynet1$color[i]<-as.numeric(odd(mynet1$from_id[i])==1)
}
Now we can make the actual graph:
#Specify theme:
theme_set(theme_grey() + theme(legend.position="none",
panel.background = element_rect(fill = NA, color = 'black'),
axis.ticks = element_blank(), axis.text = element_blank()))
#Make a graph:
p<-ggplot(data = mynet1)+ geom_net(aes(from_id =as.factor(from_id), to_id = as.factor(to_id),color=factor(color), shape=factor(color)),directed=FALSE,
singletons=TRUE,size=3,labelon=FALSE)+ylab("")+xlab("")+
scale_colour_manual(values = c("#D55E00","#0072B2"))
p
And voila! Does the network look like what you expected? I know I did not expect it to have this many connections (although it makes sense once you think more of it). I was also surprised to see all the complicated-looking configurations. Can you see those long chains of connections and what looks like some circular patterns? We know that those are completely random (we designed this network), but don’t they look like one may want to model them if you were to, say, run an ergm? Well, next time you are running an ergm, think about this.