This memo documents the steps in R to develop figures similar to the one shown below (Source: http://www.brookings.edu/research/reports2/2015/06/metro-freight#interactives). There is a tutorial available here: https://cran.r-project.org/web/packages/circlize/circlize.pdf which goes into exhaustive detail about how to develop circular matrix visualizations.
As a first step, read in the data. The sample data shows rail flows between certain county pairs in Florida and is available here: https://goo.gl/G7sZhr
Step 1. Convert the input table to a Matrix
# Load the input dataset. The call below assumes the file sits in C:/TEMP.
# If the data lies in the same directory as your project and R file, use
# load('i75_from_flow.RData') instead; otherwise use
# load('Path/To/File/i75_from_flow.RData').
load("C:/TEMP/i75_from_flow.RData")
| ORIGFIPS | TERMFIPS | Tons |
|---|---|---|
| 12001 | 12113 | 3082.917 |
| 12001 | 12057 | 3585.423 |
| 12047 | 12047 | 1517.133 |
| 12023 | 12117 | 5890.677 |
| 12119 | 12105 | 4959.093 |
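Because `load()` restores objects under the names they were saved with, it can help to confirm that the expected object and columns arrived before reshaping. The following is a minimal sketch, not the memo's own code: it rebuilds the five preview rows above as a stand-in for the real file, and `demo_path` and `scratch` are hypothetical names used only for illustration.

```r
# Stand-in for the real dataset: the five preview rows shown above.
i75_from_flow <- data.frame(
  ORIGFIPS = c(12001, 12001, 12047, 12023, 12119),
  TERMFIPS = c(12113, 12057, 12047, 12117, 12105),
  Tons     = c(3082.917, 3585.423, 1517.133, 5890.677, 4959.093)
)

# Save to a temporary file (hypothetical path) so the sketch is self-contained.
demo_path <- tempfile(fileext = ".RData")
save(i75_from_flow, file = demo_path)

# Load into a scratch environment; load() returns the names of restored objects.
scratch <- new.env()
loaded_names <- load(demo_path, envir = scratch)

# Confirm the expected object and columns exist before going further.
stopifnot("i75_from_flow" %in% loaded_names)
stopifnot(all(c("ORIGFIPS", "TERMFIPS", "Tons") %in% names(scratch$i75_from_flow)))
str(scratch$i75_from_flow)
```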
# Load Libraries
library(circlize)
library(reshape)
# Convert the input data from long to wide format using library(reshape).
# Naming the value column explicitly avoids relying on cast() to guess it.
t1 <- cast(i75_from_flow, ORIGFIPS ~ TERMFIPS, value = "Tons")
The resulting wide table is shown below.
| ORIGFIPS | 12047 | 12057 | 12105 | 12113 | 12117 |
|---|---|---|---|---|---|
| 12001 | NA | 3585.423 | NA | 3082.917 | NA |
| 12023 | NA | NA | NA | NA | 5890.677 |
| 12047 | 1517.133 | NA | NA | NA | NA |
| 12119 | NA | NA | 4959.093 | NA | NA |
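As a cross-check on the long-to-wide conversion, the same table can be approximated in base R with `tapply()`. This is a sketch under the assumption that the data has exactly the three columns shown in the preview; it uses only the five preview rows, and `flows` and `wide` are illustrative names.

```r
# Stand-in for the real dataset: the five preview rows from the memo.
flows <- data.frame(
  ORIGFIPS = c(12001, 12001, 12047, 12023, 12119),
  TERMFIPS = c(12113, 12057, 12047, 12117, 12105),
  Tons     = c(3082.917, 3585.423, 1517.133, 5890.677, 4959.093)
)

# Sum Tons for each origin (rows) by destination (columns).
# O/D pairs with no records come back as NA, as in the wide table above.
wide <- tapply(flows$Tons, list(flows$ORIGFIPS, flows$TERMFIPS), sum)
print(wide)
```

Unlike the `cast()` result, `tapply()` already returns a matrix with row and column names, so no separate conversion step is needed for this small case.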
# Convert the wide table to a matrix, dropping the first column (ORIGFIPS)
i75_from_flow2 <- data.matrix(subset(t1, select = -c(1)))
# Set the matrix row and column names
rownames(i75_from_flow2) <- t1$ORIGFIPS
colnames(i75_from_flow2) <- colnames(t1)[-1]
# View the Matrix
View(i75_from_flow2)
The final matrix is shown below.
| | 12047 | 12057 | 12105 | 12113 | 12117 |
|---|---|---|---|---|---|
| 12001 | NA | 3585.423 | NA | 3082.917 | NA |
| 12023 | NA | NA | NA | NA | 5890.677 |
| 12047 | 1517.133 | NA | NA | NA | NA |
| 12119 | NA | NA | 4959.093 | NA | NA |
Step 2. Convert All NAs to 0
Note: Recent updates to the library take care of this step automatically.
i75_from_flow2[is.na(i75_from_flow2)] <- 0
After this step, every NA in the matrix is replaced with 0.
| | 12047 | 12057 | 12105 | 12113 | 12117 |
|---|---|---|---|---|---|
| 12001 | 0.000 | 3585.423 | 0.000 | 3082.917 | 0.000 |
| 12023 | 0.000 | 0.000 | 0.000 | 0.000 | 5890.677 |
| 12047 | 1517.133 | 0.000 | 0.000 | 0.000 | 0.000 |
| 12119 | 0.000 | 0.000 | 4959.093 | 0.000 | 0.000 |
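If the goal is a zero-filled origin-destination matrix, base R's `xtabs()` can produce one directly from the long table, since it reports 0 rather than NA for pairs without records. The sketch below is an alternative to the steps above, not the memo's method; it again assumes only the five preview rows, and `flows`, `tab`, and `flow_matrix` are illustrative names.

```r
# Stand-in for the real dataset: the five preview rows from the memo.
flows <- data.frame(
  ORIGFIPS = c(12001, 12001, 12047, 12023, 12119),
  TERMFIPS = c(12113, 12057, 12047, 12117, 12105),
  Tons     = c(3082.917, 3585.423, 1517.133, 5890.677, 4959.093)
)

# Cross-tabulate Tons by origin and destination; empty cells are 0.
tab <- xtabs(Tons ~ ORIGFIPS + TERMFIPS, data = flows)

# Strip the table class so downstream code sees a plain numeric matrix.
flow_matrix <- matrix(
  as.numeric(tab),
  nrow = nrow(tab),
  dimnames = list(rownames(tab), colnames(tab))
)

# Sanity checks: no NAs, and total tonnage is unchanged.
stopifnot(!anyNA(flow_matrix))
stopifnot(isTRUE(all.equal(sum(flow_matrix), sum(flows$Tons))))
print(flow_matrix)
```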
Step 3. Transpose the Matrix [Optional]
By default, chordDiagram colors each flow link to match its row sector, not its column sector. This step therefore transposes the matrix so that the original columns become rows. Keep in mind that after transposing, the rows hold the destination counties (TERMFIPS) and the columns hold the origin counties (ORIGFIPS).
i75_from_flow3 <- t(i75_from_flow2)
The transposed matrix (i75_from_flow3) is shown below.
| | 12001 | 12023 | 12047 | 12119 |
|---|---|---|---|---|
| 12047 | 0.000 | 0.000 | 1517.133 | 0.000 |
| 12057 | 3585.423 | 0.000 | 0.000 | 0.000 |
| 12105 | 0.000 | 0.000 | 0.000 | 4959.093 |
| 12113 | 3082.917 | 0.000 | 0.000 | 0.000 |
| 12117 | 0.000 | 5890.677 | 0.000 | 0.000 |
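To make the effect of the transpose concrete, the sketch below applies `t()` to a two-by-two excerpt of the matrix and checks that each flow moves from `[origin, destination]` to `[destination, origin]`. The labeled dimnames (`ORIGFIPS`, `TERMFIPS`) and the object names `od` and `do` are added here for illustration only; the memo's matrix does not carry them.

```r
# Two-by-two excerpt: rows are origins, columns are destinations.
od <- matrix(
  c(0,        3585.423,
    1517.133, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(ORIGFIPS = c("12001", "12047"),
                  TERMFIPS = c("12047", "12057"))
)

# Transpose: rows become destinations, columns become origins.
do <- t(od)

# The 12001 -> 12057 flow is now found at [destination, origin].
stopifnot(od["12001", "12057"] == do["12057", "12001"])
stopifnot(identical(names(dimnames(do)), c("TERMFIPS", "ORIGFIPS")))
print(do)
```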
Step 4. Create the circular diagram
First, specify the colors you want to use.
# Initialize grid colors
grid.col = NULL
# Set row and column labels colors
grid.col[colnames(i75_from_flow3)] = "grey"
grid.col[rownames(i75_from_flow3)] = c("red", "green", "blue", "yellow", "purple")
# Parameters for circos layout. The gap.degree specifies the gap between two
# neighbour sectors. It can be a single value or a vector. If it is a
# vector, the first value corresponds to the gap after the first sector
circos.par(gap.degree = 8)
# The chordDiagram command draws the links between O/D pairs. For details on
# what each parameter means see the tutorial document linked above.
chordDiagram(i75_from_flow3, grid.col = grid.col, directional = TRUE,
             annotationTrack = "grid",
             preAllocateTracks = list(list(track.height = 0.05),
                                      list(track.height = 0.05)))
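The five hard-coded colors above fit only a matrix with exactly five rows. One way to build the color vector for any matrix size is sketched below, using base R's `rainbow()` palette as a stand-in for hand-picked colors. The names `row_ids`, `col_ids`, and `grid_col` are illustrative; in practice the IDs would come from `rownames()` and `colnames()` of the transposed matrix.

```r
# Sector IDs as they appear in the transposed matrix of this memo.
row_ids <- c("12047", "12057", "12105", "12113", "12117")
col_ids <- c("12001", "12023", "12047", "12119")

# Start with every sector grey, keyed by name. union() keeps 12047,
# which appears as both a row and a column, as a single entry.
sector_ids <- union(row_ids, col_ids)
grid_col <- setNames(rep("grey", length(sector_ids)), sector_ids)

# Overwrite the row sectors with one distinct color each.
grid_col[row_ids] <- rainbow(length(row_ids))

print(grid_col)
```

The resulting named vector can be passed as `grid.col` in the same way as the hand-built one. Sector 12047 ends up with a row color, which matches the behavior of the assignments above.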
Although the resulting image shows the flows, it has no labels to make the output meaningful. The next code chunk adds them. Ideally the labels would be county names rather than FIPS codes, but this example keeps the codes.
circos.trackPlotRegion(track.index = 1, panel.fun = function(x, y) {
  xlim = get.cell.meta.data("xlim")
  ylim = get.cell.meta.data("ylim")
  sector.index = get.cell.meta.data("sector.index")
  circos.text(mean(xlim), mean(ylim), sector.index, facing = "inside", niceFacing = TRUE)
}, bg.border = NA)
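County-name labels could be produced with a small lookup from FIPS code to name. The sketch below is one possible approach, not part of the original workflow: the county names are my own lookup for the eight codes in the sample and should be verified against an authoritative FIPS list, and `county_names` and `label_for` are hypothetical helpers.

```r
# Assumed FIPS-to-county lookup for the eight codes in the sample data.
# Verify these names against an official FIPS list before relying on them.
county_names <- c(
  "12001" = "Alachua",  "12023" = "Columbia",     "12047" = "Hamilton",
  "12057" = "Hillsborough", "12105" = "Polk",     "12113" = "Santa Rosa",
  "12117" = "Seminole", "12119" = "Sumter"
)

# Return the county name for each code, falling back to the code itself
# when it is missing from the lookup.
label_for <- function(fips) {
  fips <- as.character(fips)
  out <- unname(county_names[fips])
  missing <- is.na(out)
  out[missing] <- fips[missing]
  out
}

print(label_for(c("12057", "12119", "99999")))
```

One option is to pass `label_for(sector.index)` to `circos.text()` in place of `sector.index`. Another is to rename the matrix rows and columns with `label_for()` before plotting, in which case the names in `grid.col` would need the same renaming.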
Finally, the following command resets the circos layout parameters. This is important: if you forget it, the current settings persist and can distort the layout of any later plot.
circos.clear()
TO DO
This is a first step toward clear matrix flow visualizations that can be useful for presentations. Many more parameters can be adjusted to get the visualization to look exactly as you want.