A Sankey diagram illustrates the connections between entities, or “nodes”. Each connection is drawn as a flow: a weighted link whose width reflects how much moves from a source node to a target node in the network.
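As a rough sketch of what a flow looks like in data terms, the toy table below stores one row per flow and totals the weight leaving each source and arriving at each target. The name `demo_links` and all of its values are hypothetical and used only for illustration; only base R is assumed.

```r
# Hypothetical toy table: each row is one flow from a source to a target
demo_links <- data.frame(
  source = c("Freshman", "Freshman", "Junior"),
  target = c("Informal Resolution", "Dismissal", "Dismissal"),
  value  = c(2, 8, 3),
  stringsAsFactors = FALSE
)

# Total weight leaving each source and arriving at each target
outflow <- tapply(demo_links$value, demo_links$source, sum)
inflow  <- tapply(demo_links$value, demo_links$target, sum)

print(outflow)
print(inflow)
```

The two totals always add up to the same number, since every flow leaves exactly one source and arrives at exactly one target.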
There are a variety of different uses for the Sankey Network diagram. For this illustration, a made-up set of data was created illustrating how cases might flow in a Student Conduct System within Higher Education.
This example incorporates instruction and code examples from the R Graph Gallery.
The data below were invented for this illustration and are not based on actual cases.
# Load networkD3 for sankeyNetwork() and dplyr for the %>% pipe
library(networkD3)
library(dplyr)
# Create a data frame that lists each origin (source), the outcome it leads to (target), and the size of the flow (value)
links <- data.frame(
source=c("Freshman","Sophomore","Junior", "Senior", "Graduate", "Employees","Employees", "Employees", "Freshman", "Junior"),
target=c("Informal Resolution","Dismissal", "Referral", "Formal Administrative Hearing", "Withdrawn Complaint", "Left University", "Withdrawn Complaint", "Informal Resolution","Dismissal", "Formal Administrative Hearing"),
value=c(2,6, 2, 5, 1, 3, 3, 5, 8, 15)
)
# From these "flows" create a data frame: it lists every entity involved in the flow
nodes <- data.frame(
name=c(as.character(links$source),
as.character(links$target)) %>% unique()
)
# With networkD3, connections must be given as zero-based ids rather than names, hence the - 1
links$IDsource <- match(links$source, nodes$name)-1
links$IDtarget <- match(links$target, nodes$name)-1
# Make the Network
cases <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "IDsource", Target = "IDtarget",
Value = "value", NodeID = "name",
sinksRight=FALSE)
cases
Figure: Network Showing Cases Outcome, no formatting.
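The `- 1` in the `match()` calls is needed because networkD3 expects node ids that start at 0, while R counts positions from 1. A minimal sketch of that conversion, with a sanity check that could be run before plotting, is shown here. The `demo_*` names and values are hypothetical stand-ins so that the `links` and `nodes` objects are not overwritten.

```r
# Hypothetical stand-in for the links table (values are made up)
demo_links <- data.frame(
  source = c("Freshman", "Junior", "Freshman"),
  target = c("Informal Resolution", "Referral", "Dismissal"),
  value  = c(2, 2, 8),
  stringsAsFactors = FALSE
)
demo_nodes <- data.frame(
  name = unique(c(demo_links$source, demo_links$target)),
  stringsAsFactors = FALSE
)

# match() returns 1-based positions; subtracting 1 gives zero-based ids
demo_links$IDsource <- match(demo_links$source, demo_nodes$name) - 1
demo_links$IDtarget <- match(demo_links$target, demo_nodes$name) - 1

# Sanity check: no missing ids, lowest id is 0, highest id is the last node
ids <- c(demo_links$IDsource, demo_links$IDtarget)
ids_ok <- !anyNA(ids) && min(ids) == 0 && max(ids) == nrow(demo_nodes) - 1
print(ids_ok)
```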
# Formatting arguments: fontSize, fontFamily, nodeWidth, nodePadding, sinksRight
# sinksRight = TRUE moves the last nodes to the right border of the plot
cases <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "IDsource", Target = "IDtarget",
Value = "value", NodeID = "name", fontSize = 15,
fontFamily = "Bookman", nodeWidth = 30, nodePadding = 10,
sinksRight=TRUE)
cases
Figure: Network Showing Cases Outcome with formatting.
This example incorporates instruction and code examples from the R Graph Gallery.
#Add a column to the data frame to create a "group" for each of the nodes
nodes$group <- as.factor(c("a","a","a","a","a","a","b","b","b","b","b","b"))
# Assign a color for each group:
node_color <- 'd3.scaleOrdinal() .domain(["a", "b"]) .range(["orange", "black"])'
link_color <- 'd3.scaleOrdinal() .domain(["a", "b"]) .range(["orange", "black"])'
# Add the fields colourScale and NodeGroup to Network
cases <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "IDsource", Target = "IDtarget",
Value = "value", NodeID = "name", fontSize = 15,
fontFamily = "Bookman", nodeWidth = 30, nodePadding = 10, colourScale = node_color,
NodeGroup = "group",
sinksRight=TRUE)
cases
Figure: Network Showing Cases Outcome, color change.
For more information on changing colors with d3.scaleOrdinal(), see the R Graph Gallery page on customizing colors in a Sankey diagram, listed in the references.
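Typing out one group letter per node works for a small network but is easy to get out of step with the node order. As a hedged alternative, the sketch below derives the group from the data: any node that appears as a source is placed in group "a" and every other node in group "b". The `demo_*` names and values are hypothetical, and the plot is only drawn if networkD3 is installed.

```r
# Hypothetical stand-in data, named demo_* so links/nodes are not overwritten
demo_links <- data.frame(
  source = c("Freshman", "Junior", "Employees"),
  target = c("Dismissal", "Referral", "Dismissal"),
  value  = c(8, 2, 3),
  stringsAsFactors = FALSE
)
demo_nodes <- data.frame(
  name = unique(c(demo_links$source, demo_links$target)),
  stringsAsFactors = FALSE
)
demo_links$IDsource <- match(demo_links$source, demo_nodes$name) - 1
demo_links$IDtarget <- match(demo_links$target, demo_nodes$name) - 1

# Group "a" = node appears as a source, group "b" = node appears only as a target
demo_nodes$group <- factor(ifelse(demo_nodes$name %in% demo_links$source, "a", "b"))

# Same two-colour scale as in the example above
demo_color <- 'd3.scaleOrdinal().domain(["a", "b"]).range(["orange", "black"])'

# Draw the diagram only when networkD3 is available
if (requireNamespace("networkD3", quietly = TRUE)) {
  demo_plot <- networkD3::sankeyNetwork(
    Links = demo_links, Nodes = demo_nodes,
    Source = "IDsource", Target = "IDtarget",
    Value = "value", NodeID = "name",
    NodeGroup = "group", colourScale = demo_color,
    sinksRight = TRUE
  )
}
```

This assumes a two-column layout in which no node is both a source and a target; a network with intermediate nodes would need a different grouping rule.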
The sankeyNetwork() function in networkD3 has no built-in argument for adding a title or column headers. They can be added, however, with functions from the R package htmlwidgets, such as onRender(), which runs custom JavaScript after the widget is drawn. See the htmlwidgets documentation for more information.
This example incorporates instruction and code examples from Stack Overflow.
cases <- sankeyNetwork(Links = links, Nodes = nodes,
Source = "IDsource", Target = "IDtarget",
Value = "value", NodeID = "name", fontSize = 15,
fontFamily = "Bookman", nodeWidth = 30, nodePadding = 10, colourScale = node_color,
NodeGroup = "group",
sinksRight=TRUE)
# Add Column Headers
htmlwidgets::onRender(cases, '
function(el) {
var cols_x = this.sankey.nodes().map(d => d.x).filter((v, i, a) => a.indexOf(v) === i);
var labels = ["Complainant", "Outcome"];
cols_x.forEach((d, i) => {
d3.select(el).select("svg")
.append("text")
.attr("x", d)
.attr("y", 12)
.text(labels[i]);
})
}
')
Figure: Network Showing Cases Outcome, column headers.
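The column headers above are drawn inside the SVG. For an overall title, one option is htmlwidgets::prependContent(), which places extra HTML above the widget. The sketch below is a minimal illustration under stated assumptions: the title text and the `demo_*` objects are hypothetical, the heading is built with htmltools, and the code only runs if networkD3 and htmltools are installed.

```r
# Hypothetical title text, used only for this illustration
title_text <- "Student Conduct Cases by Outcome"

if (requireNamespace("networkD3", quietly = TRUE) &&
    requireNamespace("htmltools", quietly = TRUE)) {
  # Smallest possible network: one flow between two nodes, ids already zero-based
  demo_links <- data.frame(source = 0, target = 1, value = 5)
  demo_nodes <- data.frame(name = c("Freshman", "Dismissal"))

  demo_plot <- networkD3::sankeyNetwork(
    Links = demo_links, Nodes = demo_nodes,
    Source = "source", Target = "target",
    Value = "value", NodeID = "name"
  )

  # Attach an HTML heading that is rendered above the diagram
  titled_plot <- htmlwidgets::prependContent(
    demo_plot,
    htmltools::tags$h3(title_text)
  )
}
```

Printing `titled_plot` in an R Markdown document or the viewer should show the heading followed by the diagram.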
christophergandrud.github.io. (n.d.). https://christophergandrud.github.io/networkD3/#sankey.
Creating Custom Sankey Diagrams Using R. Displayr. (2020, December 7). https://www.displayr.com/sankey-diagrams-r/.
Holtz, Y. (2018, December 10). Pimp my RMD: a few tips for R Markdown. https://holtzy.github.io/Pimp-my-rmd/.
Holtz, Y. (n.d.). Customize colors in Sankey Diagram. – the R Graph Gallery. https://www.r-graph-gallery.com/322-custom-colours-in-sankey-diagram.html.
Holtz, Y. (n.d.). Most basic Sankey Diagram. – the R Graph Gallery. https://www.r-graph-gallery.com/321-introduction-to-interactive-sankey-diagram-2.html.
Mark K, & CJ Yetman. (n.d.). Adjust background picture and title for plot from networkD3’s forceNetwork. Stack Overflow. https://stackoverflow.com/questions/53828831/adjust-background-picture-and-title-for-plot-from-networkd3s-forcenetwork.
User2321, & CJ Yetman. (n.d.). How to add columnn titles in a Sankey chart networkD3. Stack Overflow. https://stackoverflow.com/questions/66813278/how-to-add-columnn-titles-in-a-sankey-chart-networkd3.