Circular Visualization of Flow Matrices

This memo documents the steps in R to develop figures similar to the one shown below (Source: http://www.brookings.edu/research/reports2/2015/06/metro-freight#interactives). The circlize package tutorial, available at https://cran.r-project.org/web/packages/circlize/circlize.pdf, goes into exhaustive detail about how to develop circular matrix visualizations.

As a first step, read in the data. The sample data shows rail flows between certain county pairs in Florida and is available here: https://goo.gl/G7sZhr

Step 1. Convert the input table to a Matrix

# Load the input dataset. This example loads it from C:/TEMP. If the file
# lies in the same directory as your project and R file, use
# load('i75_from_flow.RData') instead; otherwise give the full path, as in
# load('Path/To/File/i75_from_flow.RData').

load("C:/TEMP/i75_from_flow.RData")

ORIGFIPS	TERMFIPS	Tons
12001	12113	3082.917
12001	12057	3585.423
12047	12047	1517.133
12023	12117	5890.677
12119	12105	4959.093

# Load Libraries
library(circlize)
library(reshape)

# Convert the input data from long to wide format using library(reshape).
# Naming the value column explicitly avoids relying on cast() to guess it.
t1 <- cast(i75_from_flow, ORIGFIPS ~ TERMFIPS, value = "Tons")

The resulting wide table is shown below.

ORIGFIPS	12047	12057	12105	12113	12117
12001	NA	3585.423	NA	3082.917	NA
12023	NA	NA	NA	NA	5890.677
12047	1517.133	NA	NA	NA	NA
12119	NA	NA	4959.093	NA	NA

# Convert the wide table to a matrix, dropping the first column (ORIGFIPS)
i75_from_flow2 <- data.matrix(subset(t1, select = -c(1)))

# Set the matrix row names from the origin codes. The column names
# (destination codes) carry over from the wide table, so they need no
# reassignment.
rownames(i75_from_flow2) <- t1$ORIGFIPS

# View the Matrix
View(i75_from_flow2)

The final matrix is shown below.

	12047	12057	12105	12113	12117
12001	NA	3585.423	NA	3082.917	NA
12023	NA	NA	NA	NA	5890.677
12047	1517.133	NA	NA	NA	NA
12119	NA	NA	4959.093	NA	NA
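
For readers without the .RData file, the result of Step 1 can be reproduced in base R. The following is a minimal sketch, assuming the five sample rows shown above are typed in by hand. The names `flows` and `flow_mat` are hypothetical, and `tapply` is swapped in for `cast` from the reshape package; it is not the method used in this memo.

```r
# Sample rows from the input table, entered by hand (FIPS codes kept as text)
flows <- data.frame(
  ORIGFIPS = c("12001", "12001", "12047", "12023", "12119"),
  TERMFIPS = c("12113", "12057", "12047", "12117", "12105"),
  Tons     = c(3082.917, 3585.423, 1517.133, 5890.677, 4959.093),
  stringsAsFactors = FALSE
)

# Cross-tabulate tons by origin (rows) and destination (columns).
# O/D pairs with no record come back as NA, as in the cast() output.
flow_mat <- with(flows, tapply(Tons, list(ORIGFIPS, TERMFIPS), sum))
```

Because `tapply` returns a matrix with row and column names already set, the separate conversion and naming steps are not needed in this variant.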

Step 2. Convert All NAs to 0

Note: The latest updates to the circlize library take care of this step automatically.

i75_from_flow2[is.na(i75_from_flow2)] = 0

After this step, every NA in the matrix is replaced with 0.

	12047	12057	12105	12113	12117
12001	0.000	3585.423	0.000	3082.917	0.000
12023	0.000	0.000	0.000	0.000	5890.677
12047	1517.133	0.000	0.000	0.000	0.000
12119	0.000	0.000	4959.093	0.000	0.000

Step 3. Transpose the Matrix [Optional]

By default, chordDiagram colors each flow link to match its row sector rather than its column sector. This step therefore transposes the matrix so that the original columns (TERMFIPS) become rows and the links pick up their colors.

i75_from_flow3 <- t(i75_from_flow2)

The transposed matrix (i75_from_flow3) is shown below.

	12001	12023	12047	12119
12047	0.000	0.000	1517.133	0.000
12057	3585.423	0.000	0.000	0.000
12105	0.000	0.000	0.000	4959.093
12113	3082.917	0.000	0.000	0.000
12117	0.000	5890.677	0.000	0.000

Step 4. Create the circular diagram

First, specify the colors you want to use.

# Initialize grid colors
grid.col = NULL

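# Set row and column label colors. 12047 appears in both the rows and the
# columns, so the row color assigned on the second line overrides its grey.
grid.col[colnames(i75_from_flow3)] = "grey"
grid.col[rownames(i75_from_flow3)] = c("red", "green", "blue", "yellow", "purple")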

# Parameters for circos layout. The gap.degree specifies the gap between two
# neighbour sectors. It can be a single value or a vector. If it is a
# vector, the first value corresponds to the gap after the first sector
circos.par(gap.degree = 8)

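# The chordDiagram command draws the links between O/D pairs. For details on
# what each parameter means see the tutorial document linked above.
# Note: because of the transpose in Step 3, rows are destinations and columns
# are origins. directional = TRUE treats rows as the start of each link; in
# recent circlize versions, directional = -1 reverses this.
chordDiagram(i75_from_flow3, grid.col = grid.col, directional = TRUE, annotationTrack = "grid",
    preAllocateTracks = list(list(track.height = 0.05), list(track.height = 0.05)))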

While the resulting image shows flows, it does not include the labels that make the output meaningful. The next code chunk adds them. Ideally the labels would show county names rather than FIPS codes, but I have not done so for this example.

circos.trackPlotRegion(track.index = 1, panel.fun = function(x, y) {
    xlim = get.cell.meta.data("xlim")
    ylim = get.cell.meta.data("ylim")
    sector.index = get.cell.meta.data("sector.index")
    circos.text(mean(xlim), mean(ylim), sector.index, facing = "inside", niceFacing = TRUE)
}, bg.border = NA)
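
One way to get county names onto the labels is to rename the matrix rows and columns before calling chordDiagram, since the sector labels come from those names. The following is a minimal sketch under stated assumptions: the lookup `county_names` and the helper `relabel_fips` are hypothetical, and the county names should be checked against an official FIPS list before use.

```r
# Hypothetical lookup from FIPS code to county name (verify before use)
county_names <- c(
  "12001" = "Alachua", "12023" = "Columbia", "12047" = "Hamilton",
  "12057" = "Hillsborough", "12105" = "Polk", "12113" = "Santa Rosa",
  "12117" = "Seminole", "12119" = "Sumter"
)

# Replace row and column names with county names; codes missing from the
# lookup are left unchanged.
relabel_fips <- function(m, lookup) {
  swap <- function(codes) {
    hit <- codes %in% names(lookup)
    codes[hit] <- unname(lookup[codes[hit]])
    codes
  }
  dimnames(m) <- list(swap(rownames(m)), swap(colnames(m)))
  m
}

# Small stand-in for part of the transposed matrix from Step 3
demo <- matrix(c(0, 3585.423, 1517.133, 0), nrow = 2,
               dimnames = list(c("12047", "12057"), c("12001", "12047")))
labelled <- relabel_fips(demo, county_names)
```

If the matrix is relabelled this way, grid.col has to be built from the new names as well, because the colors are assigned by row and column name.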

Finally, the following command resets the circos layout parameters. This is important because parameters set with circos.par otherwise persist and distort the layout of any later plot.

circos.clear()
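
The comment on gap.degree in Step 4 notes that it can be a vector. The following is a minimal sketch of building such a vector so that the row sectors are set apart from the column sectors. It assumes the sectors are drawn in the order of the row names followed by any column names not already used; that ordering, and the names `row_ids`, `col_ids`, `sectors` and `gaps`, are assumptions to verify against the circlize documentation.

```r
# Stand-in dimension names matching the transposed matrix from Step 3
row_ids <- c("12047", "12057", "12105", "12113", "12117")  # destinations
col_ids <- c("12001", "12023", "12047", "12119")           # origins

# Assumed sector order: row names first, then column names not already used
sectors <- union(row_ids, col_ids)

# Small gap after every sector, wider gap after the last row sector and
# after the last sector overall
gaps <- rep(2, length(sectors))
gaps[c(length(row_ids), length(sectors))] <- 10

# circos.par(gap.degree = gaps)  # would replace circos.par(gap.degree = 8)
```

The vector must have one entry per sector. Here that is 8 rather than 9, because 12047 appears as both a row and a column.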

TO DO

This is a first step toward clear matrix flow visualizations that can be useful for presentations. Many more parameters can be adjusted to get the visualization to look exactly as you want.

Krishnan Viswanathan

July 28, 2015