email: jc3181 AT columbia DOT edu

DiagrammeR

A relatively new package that I have been keen to experiment with is DiagrammeR by Rich Iannone. It looks like it will become the go-to package for creating all sorts of flow-charts, sequence diagrams and chained box plots in R.

I decided to try the package out by plotting a sociogram, a plot that shows interactions between individual actors in a social group. In particular, I wanted a circular-layout sociogram showing which actors have direct influence over which others.

If you haven’t already, install and load the DiagrammeR package. The ReadMe, vignettes and help pages are great; read them for much more detail.

devtools::install_github('rich-iannone/DiagrammeR')
library("DiagrammeR")

 

Example Data

Here are some data. We have 12 individuals (A to L). The square matrix should be read from rows to columns: a ‘1’ means that the individual in the row dominates the individual in the column, and a ‘0’ means that it does not.

mat <-
structure(c(0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 
0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 
0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 
0), .Dim = c(12L, 12L), .Dimnames = list(c("A", "B", "C", "D", 
"E", "F", "G", "H", "I", "J", "K", "L"), c("A", "B", "C", "D", 
"E", "F", "G", "H", "I", "J", "K", "L")))

mat
##   A B C D E F G H I J K L
## A 0 0 0 1 0 0 1 0 0 0 0 0
## B 1 0 0 1 0 0 0 0 0 0 0 0
## C 0 1 0 0 0 0 0 0 1 0 0 1
## D 0 0 0 0 0 0 0 0 0 0 0 1
## E 0 1 0 0 0 0 0 0 0 0 0 0
## F 0 0 1 1 0 0 1 0 0 0 0 0
## G 0 1 0 0 0 0 0 0 0 0 0 1
## H 1 1 1 1 1 1 1 0 1 1 1 1
## I 0 1 0 1 1 0 1 0 0 0 0 0
## J 1 1 1 1 1 1 1 0 1 0 1 1
## K 1 1 1 1 1 1 1 0 1 0 0 1
## L 0 1 0 0 0 1 0 0 0 0 0 0
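To make the row-to-column reading concrete, here is a minimal sketch on a hypothetical 3 x 3 matrix (the names `toy`, `wins` and `losses` and the individuals X, Y and Z are illustrative, not part of the data above). Indexing by row then column asks "does the row individual dominate the column individual?", and the row and column sums count how many others each individual dominates or is dominated by.

```r
# Hypothetical toy matrix (not the post's data): rows dominate columns
toy <- matrix(c(0, 1, 1,
                0, 0, 1,
                0, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("X", "Y", "Z"), c("X", "Y", "Z")))

toy["X", "Y"]  # 1: X dominates Y
toy["Y", "X"]  # 0: Y does not dominate X

wins   <- rowSums(toy)  # number of individuals each actor dominates
losses <- colSums(toy)  # number of individuals that dominate each actor
```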

 

We want to get these data into a format from which we can write the code for DiagrammeR. I am still a fan of reshape2’s melt function. Applying it transforms the matrix into three columns (Indiv 1, Indiv 2, value of the corresponding matrix cell). We only want to keep cells that contain a ‘1’.

library(reshape2)
temp <- melt(mat)               # long format: Var1 (row), Var2 (column), value
temp <- temp[temp[, 3] == 1, ]  # keep only the cells that contain a 1
temp
##     Var1 Var2 value
## 2      B    A     1
## 8      H    A     1
## 10     J    A     1
## 11     K    A     1
## 15     C    B     1
## 17     E    B     1
## 19     G    B     1
## 20     H    B     1
## 21     I    B     1
## 22     J    B     1
## 23     K    B     1
## 24     L    B     1
## 30     F    C     1
## 32     H    C     1
## 34     J    C     1
## 35     K    C     1
## 37     A    D     1
## 38     B    D     1
## 42     F    D     1
## 44     H    D     1
## 45     I    D     1
## 46     J    D     1
## 47     K    D     1
## 56     H    E     1
## 57     I    E     1
## 58     J    E     1
## 59     K    E     1
## 68     H    F     1
## 70     J    F     1
## 71     K    F     1
## 72     L    F     1
## 73     A    G     1
## 78     F    G     1
## 80     H    G     1
## 81     I    G     1
## 82     J    G     1
## 83     K    G     1
## 99     C    I     1
## 104    H    I     1
## 106    J    I     1
## 107    K    I     1
## 116    H    J     1
## 128    H    K     1
## 130    J    K     1
## 135    C    L     1
## 136    D    L     1
## 139    G    L     1
## 140    H    L     1
## 142    J    L     1
## 143    K    L     1
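If you would rather not depend on reshape2, the same edge list can probably be built in base R. The sketch below is an alternative to melt, not the approach used in this post; it runs on a hypothetical toy matrix, and the names `toy`, `idx` and `edges` are illustrative. `which(..., arr.ind = TRUE)` returns the row and column positions of every cell equal to 1, in the same column-by-column order that melt produces.

```r
# Hypothetical toy matrix (not the post's data): rows dominate columns
toy <- matrix(c(0, 1, 1,
                0, 0, 1,
                0, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("X", "Y", "Z"), c("X", "Y", "Z")))

# Row/column positions of every cell that equals 1
idx <- which(toy == 1, arr.ind = TRUE)

# Translate the positions back into individual names
edges <- data.frame(from = rownames(toy)[idx[, "row"]],
                    to   = colnames(toy)[idx[, "col"]],
                    stringsAsFactors = FALSE)
edges
```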

The ‘nodes’ of our sociogram are the individuals A-L, which we can get from the row names of the matrix, each followed by a semi-colon. The ‘edges’ combine the first and second columns of our data frame, separated by an arrow ‘->’ and followed by a semi-colon.

We can print this to our console and get rid of the quotation mark formatting by using cat.

nodes <- paste0(rownames(mat), ";")
myedges <- paste0(paste(temp[,1], temp[,2], sep="->"),";")

cat(nodes)
## A; B; C; D; E; F; G; H; I; J; K; L;
cat(myedges)
## B->A; H->A; J->A; K->A; C->B; E->B; G->B; H->B; I->B; J->B; K->B; L->B; F->C; H->C; J->C; K->C; A->D; B->D; F->D; H->D; I->D; J->D; K->D; H->E; I->E; J->E; K->E; H->F; J->F; K->F; L->F; A->G; F->G; H->G; I->G; J->G; K->G; C->I; H->I; J->I; K->I; H->J; H->K; J->K; C->L; D->L; G->L; H->L; J->L; K->L;

 

Writing the DiagrammeR function

The format required is as follows. I simply copied and pasted the above output from cat (leaving off the very last semi-colon). In time, I’d quite like to automate this so that I can produce these plots from any input matrix in a package that I’m writing. I’ll work out how to do that soon (help appreciated!).

sociogram <- "
digraph boxes_and_circles {

# several 'node' statements
node [shape = circle, fixedsize = true, width = 0.9, color = blue] // sets as circles
A; B; C; D; E; F; G; H; I; J; K; L

# several 'edge' statements
edge [color = red] // this sets all edges to be red 
B->A; H->A; J->A; K->A; C->B; E->B; G->B; H->B; I->B; J->B; K->B; L->B; F->C; H->C; J->C; K->C; A->D; B->D; F->D; H->D; I->D; J->D; K->D; H->E; I->E; J->E; K->E; H->F; J->F; K->F; L->F; A->G; F->G; H->G; I->G; J->G; K->G; C->I; H->I; J->I; K->I; H->J; H->K; J->K; C->L; D->L; G->L; H->L; J->L; K->L

# a 'graph' statement
graph [overlap = true, fontsize = 20]
}
"

 

The grViz function renders the graph specification. I have chosen a circular layout by setting the engine to "circo".

grViz(sociogram, engine = "circo")
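One possible way to automate the copy-and-paste step is to assemble the graph specification from the matrix with paste. This is only a sketch under stated assumptions: `matrix_to_dot` and its arguments are hypothetical names, not part of DiagrammeR, and it assumes the row and column names are plain letters or words that need no quoting in the graph specification. It is shown on a toy matrix so that it runs on its own.

```r
# Hypothetical helper (not a DiagrammeR function): build a digraph string
# from a square 0/1 matrix in which rows dominate columns.
matrix_to_dot <- function(m,
                          node_attrs = "shape = circle, fixedsize = true, width = 0.9, color = blue",
                          edge_attrs = "color = red") {
  idx   <- which(m == 1, arr.ind = TRUE)          # positions of every 1
  nodes <- paste(rownames(m), collapse = "; ")    # "A; B; C"
  edges <- paste(rownames(m)[idx[, "row"]],
                 colnames(m)[idx[, "col"]],
                 sep = "->", collapse = "; ")     # "A->B; A->C"
  paste0("digraph sociogram {\n",
         "node [", node_attrs, "]\n", nodes, "\n",
         "edge [", edge_attrs, "]\n", edges, "\n",
         "graph [overlap = true, fontsize = 20]\n",
         "}")
}

# Hypothetical toy matrix (not the post's data)
toy <- matrix(c(0, 1, 1,
                0, 0, 1,
                0, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("X", "Y", "Z"), c("X", "Y", "Z")))

dot <- matrix_to_dot(toy)
cat(dot)

# With DiagrammeR loaded, the string could then be passed on, for example:
# grViz(matrix_to_dot(mat), engine = "circo")
```

Keeping the node and edge attributes as arguments means the styling can be changed without touching the code that builds the node and edge lists.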

 

I’m pretty happy with the final product. It’s clear and informative (at least to me, as someone used to this sort of graph). Sociomatrix data can be very clustered and confusing, so this circular layout is very helpful.