packages <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse')
# Install any package that is missing, then attach it
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
GAStech_nodes <- read_csv("data/GAStech_email_node.csv")
GAStech_edges <- read_csv("data/GAStech_email_edge-v2.csv")
GAStech_edges$SentDate = dmy(GAStech_edges$SentDate)
GAStech_edges$Weekday = wday(GAStech_edges$SentDate, label = TRUE, abbr = FALSE)
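The two lines above parse the sent dates and derive a weekday label from them. A minimal sketch of the same steps on a few made-up strings follows; the vector `sent_raw` is hypothetical and only assumes that `SentDate` is stored in day/month/year order, which is what `dmy()` expects.

```r
library(lubridate)

# Hypothetical day/month/year strings (assumed to mirror the SentDate format)
sent_raw <- c("6/1/2014", "7/1/2014", "10/1/2014")

# dmy() reads the first number as the day and the second as the month
sent_date <- dmy(sent_raw)

# label = TRUE returns an ordered factor of weekday names;
# abbr = FALSE asks for full names rather than abbreviations
sent_weekday <- wday(sent_date, label = TRUE, abbr = FALSE)

# Without label = TRUE, wday() returns numbers (1 = Sunday by default)
sent_weekday_number <- wday(sent_date)
```

Because the weekday is an ordered factor, grouping or plotting by it keeps the days in calendar order rather than alphabetical order.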
GAStech_edges_aggregated <- GAStech_edges %>%
filter(MainSubject == "Work related") %>%
group_by(source, target, Weekday) %>%
summarise(Weight = n()) %>%
filter(source!=target) %>%
filter(Weight > 1) %>%
ungroup()
GAStech_edges_aggregated
## # A tibble: 1,456 x 4
## source target Weekday Weight
## <dbl> <dbl> <ord> <int>
## 1 1 2 Monday 4
## 2 1 2 Tuesday 3
## 3 1 2 Wednesday 5
## 4 1 2 Friday 8
## 5 1 3 Monday 4
## 6 1 3 Tuesday 3
## 7 1 3 Wednesday 5
## 8 1 3 Friday 8
## 9 1 4 Monday 4
## 10 1 4 Tuesday 3
## # ... with 1,446 more rows
GAStech_graph <- tbl_graph(nodes = GAStech_nodes, edges = GAStech_edges_aggregated, directed = TRUE)
GAStech_graph %>%
activate(edges) %>%
arrange(desc(Weight))
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Edge Data: 1,456 x 4 (active)
## from to Weekday Weight
## <int> <int> <ord> <int>
## 1 40 41 Tuesday 23
## 2 40 43 Tuesday 19
## 3 41 43 Tuesday 15
## 4 41 40 Tuesday 14
## 5 42 41 Tuesday 13
## 6 42 40 Tuesday 12
## # ... with 1,450 more rows
## #
## # Node Data: 54 x 4
## id label Department Title
## <dbl> <chr> <chr> <chr>
## 1 1 Mat.Bramar Administration Assistant to CEO
## 2 2 Anda.Ribera Administration Assistant to CFO
## 3 3 Rachel.Pantanal Administration Assistant to CIO
## # ... with 51 more rows
g <- GAStech_graph %>%
mutate(betweenness_centrality = centrality_betweenness()) %>%
mutate(closeness_centrality = centrality_closeness()) %>%
ggraph(layout = "nicely") +
geom_edge_link(aes()) +
geom_node_point(aes(colour = closeness_centrality, size=betweenness_centrality))
g + theme_graph()
g <- ggraph(GAStech_graph, layout = "nicely") +
geom_edge_link(aes()) +
geom_node_point(aes(colour = centrality_closeness(), size=centrality_betweenness()))
g + theme_graph()
ggraph 2.0 gives access to tidygraph algorithms inside ggraph code, so the centrality measures can be computed directly in the aesthetic mappings without a separate mutate() step.
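To see what the centrality functions return, the sketch below computes betweenness on a small made-up graph and compares it with the value igraph reports for the same graph. The graph `toy_graph` and its node names are hypothetical and are not part of the GAStech data.

```r
library(tidygraph)

# Hypothetical toy graph: a directed path a -> b -> c -> d
toy_graph <- tbl_graph(
  nodes = data.frame(name = c("a", "b", "c", "d")),
  edges = data.frame(from = c(1, 2, 3), to = c(2, 3, 4)),
  directed = TRUE
)

# centrality_betweenness() is evaluated in the context of the active nodes
toy_scored <- toy_graph %>%
  activate(nodes) %>%
  mutate(betweenness = centrality_betweenness())

# A tbl_graph is also an igraph object, so igraph accessors work on it
toy_betweenness <- igraph::vertex_attr(toy_scored, "betweenness")

# Reference value computed by igraph directly
igraph_betweenness <- as.numeric(igraph::betweenness(toy_graph))
```

In this path graph only the two inner nodes sit on shortest paths between other nodes, so they are the only ones with a non-zero score.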
| No. | Issue | Proposed Improvement |
|---|---|---|
| 1 | The edge colours are too dark and compete visually with the nodes, and the edges cannot be told apart by importance. | Change the edge colour to a lighter one closer to the background colour, such as light grey, and map the edge weight to the opacity of the edge line. |
| 2 | The node colour does not indicate the department each node belongs to, which is an important feature in an organisation chart. Overlapping nodes also cannot be told apart by colour. | Colour the nodes by department, and use opacity to represent closeness centrality instead. |
| 3 | The nodes are not labelled, making it difficult to know which node represents which employee. | Label the nodes by employee name. |
g_alt <- ggraph(GAStech_graph, layout='kk') +
geom_edge_link(colour = "grey50", aes(edge_alpha=Weight)) +
geom_node_point(aes(colour = Department, size=centrality_betweenness(), alpha=centrality_closeness())) +
# geom_node_text() takes node positions from the layout, so g_alt need not exist yet
geom_node_text(aes(label = label), size = 2, alpha = 0.8)
g_alt + theme_graph()
GAStech_edges_aggregated <- GAStech_edges %>%
left_join(GAStech_nodes, by = c("sourceLabel" = "label")) %>%
rename(from = id) %>%
left_join(GAStech_nodes, by = c("targetLabel" = "label")) %>%
rename(to = id) %>%
filter(MainSubject == "Work related") %>%
group_by(from, to) %>%
summarise(weight = n()) %>%
filter(from!=to) %>%
filter(weight > 1) %>%
ungroup()
GAStech_nodes <- GAStech_nodes %>%
rename(group = Department)
visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
visIgraphLayout(layout = "layout_with_fr") %>%
visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)
Improvement 1: When a name is selected from the drop-down list, the corresponding node is both highlighted and labelled, and all nodes linked to it are labelled as well.
Improvement 2: When a node in the interactive graph is selected, the node is both highlighted and labelled, and all nodes linked to it are labelled as well.
visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
visIgraphLayout(layout = "layout_with_fr") %>%
visNodes(label=GAStech_nodes$id, shape="circle") %>%
visOptions(highlightNearest = list(enabled = TRUE, labelOnly = FALSE), nodesIdSelection = TRUE)
| No. | Issue | Proposed Improvement |
|---|---|---|
| 1 | The node labels cannot be seen unless the viewer zooms in, which is neither practical nor intuitive. The labels also do not provide much information about the node. | Show node information on hover to give the user further detail. Putting the information in a tooltip also means it is only shown upon user interaction, so the graph does not become cluttered. |
| 2 | There is no legend to indicate what each colour represents. | Include a legend to show the corresponding department for each colour. |
| 3 | The graph reflects neither directionality nor weight. In this context, weight is the number of emails sent from one employee to another, and directionality reflects the sender and receiver. Both are important information which should be included in the network. | Indicate directionality on the edges, as well as the weights. |
Data Preparation
GAStech_edges_aggregated <- GAStech_edges_aggregated %>%
rename(value = weight)
GAStech_nodes$title = paste0(GAStech_nodes$label, "<br> Title: ", GAStech_nodes$Title)
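The renames above matter because visNetwork reads specific column names: `title` on the nodes becomes the hover tooltip, `group` drives node colour and the legend, and `value` on the edges scales edge width. A minimal sketch on made-up data follows; `toy_nodes`, `toy_edges` and the names in them are hypothetical.

```r
library(visNetwork)

# Hypothetical toy nodes; names and departments are invented for illustration
toy_nodes <- data.frame(
  id    = 1:3,
  label = c("Ann", "Ben", "Cho"),
  group = c("Admin", "Admin", "IT"),
  stringsAsFactors = FALSE
)

# 'title' holds the tooltip; HTML such as <br> is allowed
toy_nodes$title <- paste0(toy_nodes$label, "<br> Department: ", toy_nodes$group)

# 'value' is used to scale the width of each edge
toy_edges <- data.frame(
  from  = c(1, 1, 2),
  to    = c(2, 3, 3),
  value = c(5, 2, 8)
)

toy_network <- visNetwork(toy_nodes, toy_edges) %>%
  visEdges(arrows = "to") %>%   # arrow heads show sender -> receiver
  visLegend()                   # legend built from the 'group' column
```

Printing `toy_network` in an interactive session or a knitted document renders the widget.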
Improved Network Graph
visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
visIgraphLayout(layout = "layout_with_fr") %>%
visNodes(label = GAStech_nodes$id, shape = "circle") %>%
visEdges(arrows = "to", color = list(highlight = "blue")) %>%
visLegend(main = "Department") %>%
visOptions(highlightNearest = list(enabled = TRUE, labelOnly = FALSE, degree=1), nodesIdSelection = TRUE)
Sources:
https://www.rdocumentation.org/packages/visNetwork/versions/2.0.8/topics/visOptions
https://www.data-imaginist.com/2017/ggraph-introduction-edges/
https://www.data-imaginist.com/2017/ggraph-introduction-nodes/
https://stackoverflow.com/questions/46664645/r-visnetwork-igraph-weighted-network-visualization-with-visedges