Based on Prof. Katherine Ognyanova, Rutgers University 2017 Workshop http://kateto.net/network-visualization.
The main concern in designing a network visualization is the purpose it has to serve. What are the structure properties that we want to highlight? What are the key concerns we want to address? Modern graph layouts are optimized for speed and aesthetics. In particular, they seek to minimize overlaps and edge crossing, and ensure similar edge length across the graph.
The main packages we are going to use are igraph (maintained by Gabor Csardi and Tamas Nepusz), sna & network (maintained by Carter Butts and the Statnet team), visNetwork (maintained by Benoit Thieurmel), threejs (maintained by Bryan W. Lewis), NetworkD3 (maintained by Christopher Gandrud), and ndtv (maintained by Skye Bender-deMoll).
install.packages(“igraph”) install.packages(“network”) install.packages(“sna”) install.packages(“visNetwork”) install.packages(“threejs”) install.packages(“networkD3”) install.packages(“ndtv”)
Specify the color in R see # http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf.
# install.packages('RColorBrewer')
library('RColorBrewer')
# install.packages('extrafont')
library('extrafont')
## Registering fonts with R
plot(x=1:10, y=rep(5,10), pch=19, cex=3, col="red")
points(x=1:10, y=rep(6, 10), pch=19, cex=3, col="mediumblue")
points(x=1:10, y=rep(4, 10), pch=19, cex=3, col="green")
plot(x=1:5, y=rep(5,5), pch=19, cex=24, col=rgb(.25, .5, .3, alpha=.5), xlim=c(0,6))
par(bg="gray40")
col.tr <- grDevices::adjustcolor("557799", alpha=0.7)
plot(x=1:5, y=rep(5,5), pch=19, cex=24, col=col.tr, xlim=c(0,6))
dev.off()
## null device
## 1
pal1 <- heat.colors(5, alpha=1) # 5 colors from the heat palette, opaque
pal2 <- rainbow(5, alpha=.5) # 5 colors from the heat palette, transparent
plot(x=1:10, y=1:10, pch=19, cex=15, col=pal1)
plot(x=1:10, y=1:10, pch=19, cex=15, col=pal2)
display.brewer.all()
display.brewer.pal(8, "Set3")
display.brewer.pal(8, "Spectral")
display.brewer.pal(8, "Blues")
pal3 <- brewer.pal(10, "Set3")
plot(x=10:1, y=10:1, pch=19, cex=16, col=pal3)
Import system fonts - may take a while font_import() fonts() # See what font families are available to you now. loadfonts(device = “win”) # use device = “pdf” for pdf plot output.
# library('extrafont')
plot(x=10:1, y=10:1, pch=19, cex=13,
main="This is a Arial Black Font", col=pal3,
family="Arial Black" ,xlab="Value in X",ylab="Y Values")
plot(x=10:1, y=10:1, pch=19, cex=13,
main="This is a Symbol Font", col=pal3,
family="Symbol" ,xlab="Value in X",ylab="Y Values")
3 Data format, size, and preparation
Two small example data sets. Both contain data about media organizations. One involves a network of hyperlinks and mentions among news sources. The second is a network of links between media venues and consumers. In fact, when drawing very big networks we may even want to hide the network edges, and focus on identifying and visualizing communities of nodes. At this point, the size of the networks you can visualize in R is limited mainly by the RAM of your machine. One thing to emphasize though is that in many cases, visualizing larger networks as giant hairballs is less helpful than providing charts that show key characteristics of the graph.
The first data set we are going to work with consists of two files.
library(igraph)
## Warning: package 'igraph' was built under R version 3.4.2
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
edges=read.csv("D:/R_Files/Dataset1_EDGES.csv")
head(edges)
## from to weight type
## 1 s01 s02 10 hyperlink
## 2 s01 s02 12 hyperlink
## 3 s01 s03 22 hyperlink
## 4 s01 s04 21 hyperlink
## 5 s04 s11 22 mention
## 6 s05 s15 21 mention
nodes=read.csv("D:/R_Files/Dataset1_NODES.csv")
head(nodes)
## id media media.type type.label audience.size
## 1 s01 NY Times 1 Newspaper 20
## 2 s02 Washington Post 1 Newspaper 25
## 3 s03 Wall Street Journal 1 Newspaper 30
## 4 s04 USA Today 1 Newspaper 32
## 5 s05 LA Times 1 Newspaper 20
## 6 s06 New York Post 1 Newspaper 50
nrow(nodes); length(unique(nodes$id))
## [1] 17
## [1] 17
nrow(edges); nrow(unique(edges[,c("from", "to")]))
## [1] 52
## [1] 49
Notice that there are more links than unique from-to combinations. That means we have cases in the data where there are multiple links between the same two nodes. We will collapse all links of the same type between the same two nodes by summing their weights, using aggregate() by “from”, “to”, & “type”:
links <- aggregate(edges[,3], edges[,-3], sum)
links <- links[order(links$from, links$to),]
colnames(links)[4] <- "weight"
rownames(links) <- NULL
We can see that links2 is an adjacency matrix for a two-mode network. Two-mode or bipartite graphs have two different types of actors and links that go across, but not within each type. Our second media example is a network of that kind, examining links between news sources and their consumers.
nodes2 <- read.csv("D:/R_Files/Dataset2_NODES.csv", header=T, as.is=T)
links2 <- read.csv("D:/R_Files/Dataset2_EDGES.csv", header=T, row.names=1)
head(nodes2)
## id media media.type media.name audience.size
## 1 s01 NYT 1 Newspaper 20
## 2 s02 WaPo 1 Newspaper 25
## 3 s03 WSJ 1 Newspaper 30
## 4 s04 USAT 1 Newspaper 32
## 5 s05 LATimes 1 Newspaper 20
## 6 s06 CNN 2 TV 56
head(links2)
## U01 U02 U03 U04 U05 U06 U07 U08 U09 U10 U11 U12 U13 U14 U15 U16 U17
## s01 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## s02 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
## s03 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
## s04 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0
## s05 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0
## s06 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1
## U18 U19 U20
## s01 0 0 0
## s02 0 0 1
## s03 0 0 0
## s04 0 0 0
## s05 0 0 0
## s06 0 0 0
links2 <- as.matrix(links2)
dim(links2)
## [1] 10 20
dim(nodes2)
## [1] 30 5
net2 <- graph_from_incidence_matrix(links2)
table(V(net2)$type)
##
## FALSE TRUE
## 10 20
We start by converting the raw data to an igraph network object.
4 Plotting networks with igraph 4.1 Turning networks into igraph objects 4.2 Plotting parameters 4.3 Network layouts 4.4 Highlighting aspects of the network 4.5 Highlighting specific nodes or links 4.6 Interactive plotting with tkplot 4.7 Other ways to represent a network 4.8 Plotting two-mode networks 4.9 Plotting multiplex networks
DATASET 1: edgelist To convert our data into an igraph network, we will use the graph_from_data_frame() function, which takes two data frames: d and vertices. . d describes the edges of the network. Its first two columns are the IDs of the source and the target node for each edge. The following columns are edge attributes (weight, type, label, or anything else). . vertices starts with a column of node IDs. Any following columns are interpreted as node attributes.
net <- graph_from_data_frame(d=links, vertices=nodes, directed=T)
net
## IGRAPH 4a05468 DNW- 17 49 --
## + attr: name (v/c), media (v/c), media.type (v/n), type.label
## | (v/c), audience.size (v/n), type (e/c), weight (e/n)
## + edges from 4a05468 (vertex names):
## [1] s01->s02 s01->s03 s01->s04 s01->s15 s02->s01 s02->s03 s02->s09
## [8] s02->s10 s03->s01 s03->s04 s03->s05 s03->s08 s03->s10 s03->s11
## [15] s03->s12 s04->s03 s04->s06 s04->s11 s04->s12 s04->s17 s05->s01
## [22] s05->s02 s05->s09 s05->s15 s06->s06 s06->s16 s06->s17 s07->s03
## [29] s07->s08 s07->s10 s07->s14 s08->s03 s08->s07 s08->s09 s09->s10
## [36] s10->s03 s12->s06 s12->s13 s12->s14 s13->s12 s13->s17 s14->s11
## [43] s14->s13 s15->s01 s15->s04 s15->s06 s16->s06 s16->s17 s17->s04
The description of an igraph object starts with four letters: 1. D or U, for a directed or undirected graph 2. N for a named graph (where nodes have a name attribute) 3. W for a weighted graph (where edges have a weight attribute) 4. B for a bipartite (two-mode) graph (where nodes have a type attribute)
The two numbers that follow (17 49) refer to the number of nodes and edges in the graph. The description also lists node & edge attributes, for example: . (g/c) - graph-level character attribute . (v/c) - vertex-level character attribute . (e/n) - edge-level numeric attribute We also have easy access to nodes, edges, and their attributes with:
E(net) # The edges of the "net" object
## + 49/49 edges from 4a05468 (vertex names):
## [1] s01->s02 s01->s03 s01->s04 s01->s15 s02->s01 s02->s03 s02->s09
## [8] s02->s10 s03->s01 s03->s04 s03->s05 s03->s08 s03->s10 s03->s11
## [15] s03->s12 s04->s03 s04->s06 s04->s11 s04->s12 s04->s17 s05->s01
## [22] s05->s02 s05->s09 s05->s15 s06->s06 s06->s16 s06->s17 s07->s03
## [29] s07->s08 s07->s10 s07->s14 s08->s03 s08->s07 s08->s09 s09->s10
## [36] s10->s03 s12->s06 s12->s13 s12->s14 s13->s12 s13->s17 s14->s11
## [43] s14->s13 s15->s01 s15->s04 s15->s06 s16->s06 s16->s17 s17->s04
V(net) # The vertices of the "net" object
## + 17/17 vertices, named, from 4a05468:
## [1] s01 s02 s03 s04 s05 s06 s07 s08 s09 s10 s11 s12 s13 s14 s15 s16 s17
E(net)$type # Edge attribute "type"
## [1] "hyperlink" "hyperlink" "hyperlink" "mention" "hyperlink"
## [6] "hyperlink" "hyperlink" "hyperlink" "hyperlink" "hyperlink"
## [11] "hyperlink" "hyperlink" "mention" "hyperlink" "hyperlink"
## [16] "hyperlink" "mention" "mention" "hyperlink" "mention"
## [21] "mention" "hyperlink" "hyperlink" "mention" "hyperlink"
## [26] "hyperlink" "mention" "mention" "mention" "hyperlink"
## [31] "mention" "hyperlink" "mention" "mention" "mention"
## [36] "hyperlink" "mention" "hyperlink" "mention" "hyperlink"
## [41] "mention" "mention" "mention" "hyperlink" "hyperlink"
## [46] "hyperlink" "hyperlink" "mention" "hyperlink"
V(net)$media # Vertex attribute "media"
## [1] "NY Times" "Washington Post" "Wall Street Journal"
## [4] "USA Today" "LA Times" "New York Post"
## [7] "CNN" "MSNBC" "FOX News"
## [10] "ABC" "BBC" "Yahoo News"
## [13] "Google News" "Reuters.com" "NYTimes.com"
## [16] "WashingtonPost.com" "AOL.com"
(that returns oblects of type vertex sequence/edge sequence)
V(net)[media=="BBC"]
## + 1/17 vertex, named, from 4a05468:
## [1] s11
E(net)[type=="mention"]
## + 20/49 edges from 4a05468 (vertex names):
## [1] s01->s15 s03->s10 s04->s06 s04->s11 s04->s17 s05->s01 s05->s15
## [8] s06->s17 s07->s03 s07->s08 s07->s14 s08->s07 s08->s09 s09->s10
## [15] s12->s06 s12->s14 s13->s17 s14->s11 s14->s13 s16->s17
net[1,]
## s01 s02 s03 s04 s05 s06 s07 s08 s09 s10 s11 s12 s13 s14 s15 s16 s17
## 0 22 22 21 0 0 0 0 0 0 0 0 0 0 20 0 0
net[5,7]
## [1] 0
It is also easy to extract an edge list or matrix back from the igraph network: ### Get an edge list or a matrix:
as_edgelist(net, names=T)
## [,1] [,2]
## [1,] "s01" "s02"
## [2,] "s01" "s03"
## [3,] "s01" "s04"
## [4,] "s01" "s15"
## [5,] "s02" "s01"
## [6,] "s02" "s03"
## [7,] "s02" "s09"
## [8,] "s02" "s10"
## [9,] "s03" "s01"
## [10,] "s03" "s04"
## [11,] "s03" "s05"
## [12,] "s03" "s08"
## [13,] "s03" "s10"
## [14,] "s03" "s11"
## [15,] "s03" "s12"
## [16,] "s04" "s03"
## [17,] "s04" "s06"
## [18,] "s04" "s11"
## [19,] "s04" "s12"
## [20,] "s04" "s17"
## [21,] "s05" "s01"
## [22,] "s05" "s02"
## [23,] "s05" "s09"
## [24,] "s05" "s15"
## [25,] "s06" "s06"
## [26,] "s06" "s16"
## [27,] "s06" "s17"
## [28,] "s07" "s03"
## [29,] "s07" "s08"
## [30,] "s07" "s10"
## [31,] "s07" "s14"
## [32,] "s08" "s03"
## [33,] "s08" "s07"
## [34,] "s08" "s09"
## [35,] "s09" "s10"
## [36,] "s10" "s03"
## [37,] "s12" "s06"
## [38,] "s12" "s13"
## [39,] "s12" "s14"
## [40,] "s13" "s12"
## [41,] "s13" "s17"
## [42,] "s14" "s11"
## [43,] "s14" "s13"
## [44,] "s15" "s01"
## [45,] "s15" "s04"
## [46,] "s15" "s06"
## [47,] "s16" "s06"
## [48,] "s16" "s17"
## [49,] "s17" "s04"
as_adjacency_matrix(net, attr="weight")
## 17 x 17 sparse Matrix of class "dgCMatrix"
## [[ suppressing 17 column names 's01', 's02', 's03' ... ]]
##
## s01 . 22 22 21 . . . . . . . . . . 20 . .
## s02 23 . 21 . . . . . 1 5 . . . . . . .
## s03 21 . . 22 1 . . 4 . 2 1 1 . . . . .
## s04 . . 23 . . 1 . . . . 22 3 . . . . 2
## s05 1 21 . . . . . . 2 . . . . . 21 . .
## s06 . . . . . 1 . . . . . . . . . 21 21
## s07 . . 1 . . . . 22 . 21 . . . 4 . . .
## s08 . . 2 . . . 21 . 23 . . . . . . . .
## s09 . . . . . . . . . 21 . . . . . . .
## s10 . . 2 . . . . . . . . . . . . . .
## s11 . . . . . . . . . . . . . . . . .
## s12 . . . . . 2 . . . . . . 22 22 . . .
## s13 . . . . . . . . . . . 21 . . . . 1
## s14 . . . . . . . . . . 1 . 21 . . . .
## s15 22 . . 1 . 4 . . . . . . . . . . .
## s16 . . . . . 23 . . . . . . . . . . 21
## s17 . . . 4 . . . . . . . . . . . . .
as_data_frame(net, what="edges")
## from to type weight
## 1 s01 s02 hyperlink 22
## 2 s01 s03 hyperlink 22
## 3 s01 s04 hyperlink 21
## 4 s01 s15 mention 20
## 5 s02 s01 hyperlink 23
## 6 s02 s03 hyperlink 21
## 7 s02 s09 hyperlink 1
## 8 s02 s10 hyperlink 5
## 9 s03 s01 hyperlink 21
## 10 s03 s04 hyperlink 22
## 11 s03 s05 hyperlink 1
## 12 s03 s08 hyperlink 4
## 13 s03 s10 mention 2
## 14 s03 s11 hyperlink 1
## 15 s03 s12 hyperlink 1
## 16 s04 s03 hyperlink 23
## 17 s04 s06 mention 1
## 18 s04 s11 mention 22
## 19 s04 s12 hyperlink 3
## 20 s04 s17 mention 2
## 21 s05 s01 mention 1
## 22 s05 s02 hyperlink 21
## 23 s05 s09 hyperlink 2
## 24 s05 s15 mention 21
## 25 s06 s06 hyperlink 1
## 26 s06 s16 hyperlink 21
## 27 s06 s17 mention 21
## 28 s07 s03 mention 1
## 29 s07 s08 mention 22
## 30 s07 s10 hyperlink 21
## 31 s07 s14 mention 4
## 32 s08 s03 hyperlink 2
## 33 s08 s07 mention 21
## 34 s08 s09 mention 23
## 35 s09 s10 mention 21
## 36 s10 s03 hyperlink 2
## 37 s12 s06 mention 2
## 38 s12 s13 hyperlink 22
## 39 s12 s14 mention 22
## 40 s13 s12 hyperlink 21
## 41 s13 s17 mention 1
## 42 s14 s11 mention 1
## 43 s14 s13 mention 21
## 44 s15 s01 hyperlink 22
## 45 s15 s04 hyperlink 1
## 46 s15 s06 hyperlink 4
## 47 s16 s06 hyperlink 23
## 48 s16 s17 mention 21
## 49 s17 s04 hyperlink 4
as_data_frame(net, what="vertices")
## name media media.type type.label audience.size
## s01 s01 NY Times 1 Newspaper 20
## s02 s02 Washington Post 1 Newspaper 25
## s03 s03 Wall Street Journal 1 Newspaper 30
## s04 s04 USA Today 1 Newspaper 32
## s05 s05 LA Times 1 Newspaper 20
## s06 s06 New York Post 1 Newspaper 50
## s07 s07 CNN 2 TV 56
## s08 s08 MSNBC 2 TV 34
## s09 s09 FOX News 2 TV 60
## s10 s10 ABC 2 TV 23
## s11 s11 BBC 2 TV 34
## s12 s12 Yahoo News 3 Online 33
## s13 s13 Google News 3 Online 23
## s14 s14 Reuters.com 3 Online 12
## s15 s15 NYTimes.com 3 Online 24
## s16 s16 WashingtonPost.com 3 Online 28
## s17 s17 AOL.com 3 Online 33
plot(net)
net <- simplify(net, remove.multiple = F, remove.loops = T)
You might notice that we could have used simplify to combine multiple edges by summing their weights with a command like simplify(net, edge.attr.comb=list(Weight=“sum”,“ignore”)). The problem is that this would also combine multiple edge types (in our data: “hyperlinks” and “mentions”). Let’s and reduce the arrow size and remove the labels (we do that by setting them to NA):
plot(net, edge.arrow.size=.4,vertex.label=NA)
As we have seen above, the edges of our second network are in a matrix format. We can read those into a graph object using graph_from_incidence_matrix(). In igraph, bipartite networks have a node attribute called type that is FALSE (or 0) for vertices in one mode and TRUE (or 1) for those in the other mode. To transform a one-mode network matrix into an igraph object, use graph_from_adjacency_matrix().
head(nodes2)
## id media media.type media.name audience.size
## 1 s01 NYT 1 Newspaper 20
## 2 s02 WaPo 1 Newspaper 25
## 3 s03 WSJ 1 Newspaper 30
## 4 s04 USAT 1 Newspaper 32
## 5 s05 LATimes 1 Newspaper 20
## 6 s06 CNN 2 TV 56
head(links2)
## U01 U02 U03 U04 U05 U06 U07 U08 U09 U10 U11 U12 U13 U14 U15 U16 U17
## s01 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## s02 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
## s03 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
## s04 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0
## s05 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0
## s06 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1
## U18 U19 U20
## s01 0 0 0
## s02 0 0 1
## s03 0 0 0
## s04 0 0 0
## s05 0 0 0
## s06 0 0 0
net2 <- graph_from_incidence_matrix(links2)
table(V(net2)$type)
##
## FALSE TRUE
## 10 20
Plotting with igraph: the network plots have a wide set of parameters you can set. Those include node options (starting with vertex.) and edge options (starting with edge.). A list of selected options is included below, but you can also check out ?igraph.plotting for more information.
We can set the node & edge options in two ways - the first one is to specify them in the plot() function, as we are doing below.
Plot with curved edges (edge.curved=.1) and reduce arrow size: Note that using curved edges will allow you to see multiple links between two nodes (e.g. links going in either direction, or multiplex links)
plot(net, edge.arrow.size=.4, edge.curved=.1)
Replace the vertex label with the node names stored in “media”
plot(net, edge.arrow.size=.2, edge.color="orange",
vertex.color="orange", vertex.frame.color="#ffffff",
vertex.label=V(net)$media, vertex.label.color="black")
The second way to set attributes is to add them to the igraph object. Let’s say we want to color our network nodes based on type of media, and size them based on degree centrality (more links ->larger node) We will also change the width of the edges based on their weight.
# Generate colors based on media type:
colrs <- c("gray50", "tomato", "gold")
V(net)$color <- colrs[V(net)$media.type]
# Compute node degrees (#links) and use that to set node size:
deg <- degree(net, mode="all")
V(net)$size <- deg*3
# We could also use the audience size value:
V(net)$size <- V(net)$audience.size*0.6
# The labels are currently node IDs.
# Setting them to NA will render no labels:
V(net)$label <- NA
# Set edge width based on weight:
E(net)$width <- E(net)$weight/6
#change arrow size and edge color:
E(net)$arrow.size <- .2
E(net)$edge.color <- "gray80"
E(net)$width <- 1+E(net)$weight/12
plot(net)
plot(net, edge.color="orange", vertex.color="gray50")
It helps to add a legend explaining the meaning of the colors we used
plot(net)
legend(x=-1.5, y=-1.1, c("Newspaper","Television", "Online News"), pch=21,
col="#777777", pt.bg=colrs, pt.cex=2, cex=.8, bty="n", ncol=1)
plot(net, vertex.shape="none", vertex.label=V(net)$media,
vertex.label.font=2, vertex.label.color="gray40",
vertex.label.cex=.7, edge.color="gray85")
edge.start <- ends(net, es=E(net), names=F)[,1]
edge.col <- V(net)$color[edge.start]
plot(net, edge.color=edge.col, edge.curved=.1)
Network layouts are simply algorithms that return coordinates for each node in a network.
For the purposes of exploring layouts, we will generate a slightly larger 80-node graph. We use the sample_pa() function which generates a simple graph starting from one node and adding more nodes and links based on a preset level of preferential attachment (Barabasi-Albert model).
net.bg <- sample_pa(80)
V(net.bg)$size <- 8
V(net.bg)$frame.color <- "white"
V(net.bg)$color <- "orange"
V(net.bg)$label <- ""
E(net.bg)$arrow.mode <- 0
plot(net.bg)
plot(net.bg, layout=layout_randomly)
calculate the vertex coordinates in advance:
l <- layout_in_circle(net.bg)
plot(net.bg, layout=l)
l is simply a matrix of x, y coordinates (N x 2) for the N nodes in the graph. You can easily generate your own:
l <- cbind(1:vcount(net.bg), c(1, vcount(net.bg):2))
plot(net.bg, layout=l)
Randomly placed vertices
l <- layout_randomly(net.bg)
plot(net.bg, layout=l)
Circle layout
l <- layout_in_circle(net.bg)
plot(net.bg, layout=l)
3D sphere layout
l <- layout_on_sphere(net.bg)
plot(net.bg, layout=l)
Fruchterman-Reingold is one of the most used force-directed layout algorithms out there.
Force-directed layouts try to get a nice-looking graph where edges are similar in length and cross each other as little as possible. They simulate the graph as a physical system. Nodes are electrically charged particles that repulse each other when they get too close. The edges act as springs that attract connected nodes closer together. As a result, nodes are evenly distributed through the chart area, and the layout is intuitive in that nodes which share more connections are closer to each other. The disadvantage of these algorithms is that they are rather slow and therefore less often used in graphs larger than ~1000 vertices. You can set the “weight” parameter which increases the attraction forces among nodes connected by heavier edges.
l <- layout_with_fr(net.bg)
plot(net.bg, layout=l)
You will notice that this layout is not deterministic
par(mfrow=c(2,2), mar=c(0,0,0,0)) # plot four figures - 2 rows, 2 columns
plot(net.bg, layout=layout_with_fr)
plot(net.bg, layout=layout_with_fr)
plot(net.bg, layout=l)
plot(net.bg, layout=l)
par(mfrow=c(1,1))
Rescale to create more compact or spread out layout versions
l <- layout_with_fr(net.bg)
l <- norm_coords(l, ymin=-1, ymax=1, xmin=-1, xmax=1)
par(mfrow=c(2,2), mar=c(0,0,0,0))
plot(net.bg, rescale=F, layout=l*0.4)
plot(net.bg, rescale=F, layout=l*0.6)
plot(net.bg, rescale=F, layout=l*0.8)
plot(net.bg, rescale=F, layout=l*1.0)
Another popular force-directed algorithm that produces nice results for connected graphs is Kamada Kawai. Like Fruchterman Reingold, it attempts to minimize the energy in a spring system.
l <- layout_with_kk(net.bg)
plot(net.bg, layout=l)
The LGL algorithm is meant for large, connected graphs. Here you can also specify a root: a node that will be placed in the middle of the layout.
plot(net.bg, layout=layout_with_lgl)
The MDS (multidimensional scaling) algorithm tries to place nodes based on some measure of similarity or distance between them. More similar nodes are plotted closer to each other. By default, the measure used is based on the shortest paths between nodes in the network. We can change that by using our own distance matrix (however defined) with the parameter dist. MDS layouts are nice because positions and distances have a clear interpretation. The problem with them is visual clarity: nodes often overlap, or are placed on top of each other.
plot(net.bg, layout=layout_with_mds)
Available layouts in igraph:
layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1]
# Remove layouts that do not apply to our graph.
layouts <- layouts[!grepl("bipartite|merge|norm|sugiyama|tree", layouts)]
par(mfrow=c(3,3), mar=c(1,1,1,1))
for (layout in layouts) {
print(layout)
l <- do.call(layout, list(net))
plot(net, edge.arrow.mode=0, layout=l, main=layout) }
## [1] "layout_as_star"
## [1] "layout_components"
## [1] "layout_in_circle"
## [1] "layout_nicely"
## [1] "layout_on_grid"
## [1] "layout_on_sphere"
## [1] "layout_randomly"
## [1] "layout_with_dh"
## [1] "layout_with_drl"
## [1] "layout_with_fr"
## [1] "layout_with_gem"
## [1] "layout_with_graphopt"
## [1] "layout_with_kk"
## [1] "layout_with_lgl"
## [1] "layout_with_mds"
There are more sophisticated ways to extract the key edges, but for the purposes of this exercise we’ll only keep ones that have weight higher than the mean for the network. In igraph, we can delete edges using delete_edges(net, edges):
par(mfrow=c(1,1))
hist(links$weight)
mean(links$weight)
## [1] 12.40816
sd(links$weight)
## [1] 9.905635
cut.off <- mean(links$weight)
net.sp <- delete_edges(net, E(net)[weight<cut.off])
plot(net.sp)
#par(mfrow=c(1,2))
# Community detection based on label propagation:
#clp <- cluster_label_prop(net)
#class(clp)
# Community detection returns an object of class "communities"
# which igraph knows how to plot:
#plot(clp, net)
# We can also plot the communities without relying on their built-in plot:
#V(net)$community <- clp$membership
#colrs <- adjustcolor( c("gray50", "tomato", "gold", "yellowgreen"), alpha=.6)
#plot(net, vertex.color=colrs[V(net)$community])
#dev.off()
Sometimes we want to focus the visualization on a particular node or a group of nodes. In our example media network, we can examine the spread of information from focal actors. For instance, let’s represent distance from the NYT.
The distances function returns a matrix of shortest paths from nodes listed in the v parameter to ones included in the to parameter.
par(mfrow=c(1,1))
dist.from.NYT <- distances(net, v=V(net)[media=="NY Times"],
to=V(net), weights=NA)
# Set colors to plot the distances:
oranges <- colorRampPalette(c("dark red", "gold"))
col <- oranges(max(dist.from.NYT)+1)
col <- col[dist.from.NYT+1]
plot(net, vertex.color=col, vertex.label=dist.from.NYT, edge.arrow.size=.6,
vertex.label.color="white")
We can also highlight a path in the network:
news.path <- shortest_paths(net,
from = V(net)[media=="MSNBC"],
to = V(net)[media=="New York Post"],
output = "both") # both path nodes and edges
# Generate edge color variable to plot the path:
ecol <- rep("gray80", ecount(net))
ecol[unlist(news.path$epath)] <- "orange"
# Generate edge width variable to plot the path:
ew <- rep(2, ecount(net))
ew[unlist(news.path$epath)] <- 4
# Generate node color variable to plot the path:
vcol <- rep("gray40", vcount(net))
vcol[unlist(news.path$vpath)] <- "gold"
plot(net, vertex.color=vcol, edge.color=ecol,
edge.width=ew, edge.arrow.mode=0)
We can highlight the edges going into or out of a vertex, for instance the WSJ. For a single node, use incident(), for multiple nodes use incident_edges()
inc.edges <- incident(net, V(net)[media=="Wall Street Journal"], mode="all")
# Set colors to plot the selected edges.
ecol <- rep("gray80", ecount(net))
ecol[inc.edges] <- "orange"
vcol <- rep("grey40", vcount(net))
vcol[V(net)$media=="Wall Street Journal"] <- "gold"
plot(net, vertex.color=vcol, edge.color=ecol)
We can also point to the immediate neighbors of a vertex, say WSJ. The neighbors function finds all nodes one step out from the focal actor.To find the neighbors for multiple nodes, use adjacent_vertices() instead of neighbors(). To find node neighborhoods going more than one step out, use function ego() with parameter order set to the number of steps out to go from the focal node(s).
neigh.nodes <- neighbors(net, V(net)[media=="Wall Street Journal"], mode="out")
# Set colors to plot the neighbors:
vcol[neigh.nodes] <- "#ff9d00"
plot(net, vertex.color=vcol)
A way to draw attention to a group of nodes (we saw this before with communities) is to “mark” them:
par(mfrow=c(1,2))
plot(net, mark.groups=c(1,4,5,8), mark.col="#C5E5E7", mark.border=NA)
# Mark multiple groups:
plot(net, mark.groups=list(c(1,4,5,8), c(15:17)),
mark.col=c("#C5E5E7","#ECD89A"), mark.border=NA)
par(mfrow=c(1,1))
R and igraph allow for interactive plotting of networks. This might be a useful option for you if you want to tweak slightly the layout of a small graph. After adjusting the layout manually, you can get the coordinates of the nodes and use them for other plots.
tkid <- tkplot(net) #tkid is the id of the tkplot that will open
l <- tkplot.getcoords(tkid) # grab the coordinates from tkplot
plot(net, layout=l)
At this point it might be useful to provide a quick reminder that there are many ways to represent a network not limited to a hairball plot.
For example, here is a quick heatmap of the network matrix:
netm <- as_adjacency_matrix(net, attr="weight", sparse=F)
colnames(netm) <- V(net)$media
rownames(netm) <- V(net)$media
palf <- colorRampPalette(c("gold", "dark orange"))
heatmap(netm[,17:1], Rowv = NA, Colv = NA, col = palf(100),
scale="none", margins=c(10,10) )
Depending on what properties of the network or its nodes and edges are most important to you, simple graphs can often be more informative than network maps.
# Plot the degree distribution for our network:
deg.dist <- degree_distribution(net, cumulative=T, mode="all")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange",
xlab="Degree", ylab="Cumulative Frequency")
As you might remember, our second media example is a two-mode network examining links between news sources and their consumers.
head(nodes2)
## id media media.type media.name audience.size
## 1 s01 NYT 1 Newspaper 20
## 2 s02 WaPo 1 Newspaper 25
## 3 s03 WSJ 1 Newspaper 30
## 4 s04 USAT 1 Newspaper 32
## 5 s05 LATimes 1 Newspaper 20
## 6 s06 CNN 2 TV 56
head(links2)
## U01 U02 U03 U04 U05 U06 U07 U08 U09 U10 U11 U12 U13 U14 U15 U16 U17
## s01 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## s02 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
## s03 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
## s04 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0
## s05 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0
## s06 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1
## U18 U19 U20
## s01 0 0 0
## s02 0 0 1
## s03 0 0 0
## s04 0 0 0
## s05 0 0 0
## s06 0 0 0
net2
## IGRAPH 4b68af1 UN-B 30 31 --
## + attr: type (v/l), name (v/c)
## + edges from 4b68af1 (vertex names):
## [1] s01--U01 s01--U02 s01--U03 s02--U04 s02--U05 s02--U20 s03--U06
## [8] s03--U07 s03--U08 s03--U09 s04--U09 s04--U10 s04--U11 s05--U11
## [15] s05--U12 s05--U13 s06--U13 s06--U14 s06--U17 s07--U14 s07--U15
## [22] s07--U16 s08--U16 s08--U17 s08--U18 s08--U19 s09--U06 s09--U19
## [29] s09--U20 s10--U01 s10--U11
plot(net2, vertex.label=NA)
As with one-mode networks, we can modify the network object to include the visual properties that will be used by default when plotting the network. Notice that this time we will also change the shape of the nodes - media outlets will be squares, and their users will be circles.
# Media outlets are blue squares, audience nodes are orange circles:
V(net2)$color <- c("steel blue", "orange")[V(net2)$type+1]
V(net2)$shape <- c("square", "circle")[V(net2)$type+1]
# Media outlets will have name labels, audience members will not:
V(net2)$label <- ""
V(net2)$label[V(net2)$type==F] <- nodes2$media[V(net2)$type==F]
V(net2)$label.cex=.6
V(net2)$label.font=2
plot(net2, vertex.label.color="white", vertex.size=(2-V(net2)$type)*8)
In igraph, there is also a special layout for bipartite networks (though it doesn’t always work great, and you might be better off generating your own two-mode layout).
plot(net2, vertex.label=NA, vertex.size=7, layout=layout.bipartite)
Using text as nodes may be helpful at times:
plot(net2, vertex.shape="none", vertex.label=nodes2$media,
vertex.label.color=V(net2)$color, vertex.label.font=2,
vertex.label.cex=.6, edge.color="gray70", edge.width=2)
In this example, we will also experiment with the use of images as nodes. In order to do this, you will need the png package (if missing, install with install.packages(‘png’)
# install.packages('png')
library('png')
img.1 =readPNG("D:/R_Files/news.png")
img.2 =readPNG("D:/R_Files/user.png")
V(net2)$raster <- list(img.1, img.2)[V(net2)$type+1]
plot(net2, vertex.shape="raster", vertex.label=NA,
vertex.size=16, vertex.size2=16, edge.width=2)
library('png')
img.3 =readPNG("D:/R_Files/tiger.png")
plot(net2, vertex.shape="raster", vertex.label=NA,
vertex.size=16, vertex.size2=16, edge.width=2)
rasterImage(img.3, xleft=0, xright=1.9, ybottom=0, ytop=1.5)
Plot bipartite projections for the two-mode network:
In some cases, the networks we want to plot are multigraphs: they can have multiple edges connecting the same two nodes. A related concept, multiplex networks, contain multiple types of ties. For instance, we can represent friendship, romantic, and work relationships between individuals in a single multiplex network. In our example network, we also have two tie types: hyperlinks and mentions. One thing we can do with them is plot each type of tie separately:
E(net)$width <- 1.5
plot(net, edge.color=c("dark red", "slategrey")[(E(net)$type=="hyperlink")+1],
vertex.color="gray40", layout=layout_in_circle, edge.curved=.3)
net.m <- net - E(net)[E(net)$type=="hyperlink"] # another way to delete edges:
net.h <- net - E(net)[E(net)$type=="mention"] # using the minus operator
# Plot the two links separately:
par(mfrow=c(1,2))
plot(net.h, vertex.color="orange", main="Tie: Hyperlink")
plot(net.m, vertex.color="lightsteelblue2", main="Tie: Mention")
# Make sure the nodes stay in place in both plots:
l <- layout_with_fr(net)
plot(net.h, vertex.color="orange", layout=l, main="Tie: Hyperlink")
plot(net.m, vertex.color="lightsteelblue2", layout=l, main="Tie: Mention")
In our example network, it so happens that we do not have node dyads connected by multiple types of connections. That is to say, we never have both a ‘hyperlink’ and a ‘mention’ tie between the same two news outlets. However, this could easily happen in a multiplex network.
One challenge in visualizing multigraphs is that multiple edges between the same two nodes may get plotted on top of each other in a way that makes impossible to see them clearly. For example, let us generate a very simple multiplex network with two nodes and three ties between them:
par(mfrow=c(1,1))
multigtr <- graph( edges=c(1,2, 1,2, 1,2), n=2 )
l <- layout_with_kk(multigtr)
# Let's just plot the graph:
plot(multigtr, vertex.color="lightsteelblue", vertex.frame.color="white",
vertex.size=40, vertex.shape="circle", vertex.label=NA,
edge.color=c("gold", "tomato", "yellowgreen"), edge.width=10,
edge.arrow.size=3, edge.curved=0.1, layout=l)
Because all edges in the graph have the same curvature, they are drawn over each other so that we only see one of them. What we can do is assign each edge a different curvature. One useful function in igraph called curve_multiple can help us here. For a graph G, curve.multiple(G) will generate a curvature for each edge that maximizes visibility.
plot(multigtr, vertex.color="lightsteelblue", vertex.frame.color="white",
vertex.size=40, vertex.shape="circle", vertex.label=NA,
edge.color=c("gold", "tomato", "yellowgreen"), edge.width=10,
edge.arrow.size=3, edge.curved=curve_multiple(multigtr), layout=l)
It is a good practice to detach packages when we stop needing them. Try to remember that especially with igraph and the statnet family packages, as bad things tend to happen if you have them loaded together.
detach('package:igraph')