packages <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'tidyverse', 'lubridate')
for (p in packages) {
  # Install the package only if it is not already available
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
GAStech_nodes <- read_csv("data/GAStech_email_node.csv")
GAStech_edges <- read_csv("data/GAStech_email_edge-v2.csv")
GAStech_edges$SentDate = dmy(GAStech_edges$SentDate)
GAStech_edges$Weekday = wday(GAStech_edges$SentDate, label = TRUE, abbr = FALSE)
GAStech_edges_aggregated <- GAStech_edges %>%
  filter(MainSubject == "Work related") %>%
  group_by(source, target, Weekday) %>%
  summarise(Weight = n()) %>%
  filter(source != target) %>%
  filter(Weight > 1) %>%
  ungroup()
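The aggregation above can be illustrated on a small, made-up edge list. This is only a sketch: `toy_edges` and its values are hypothetical, and only the column names mirror `GAStech_edges`. It shows that repeated emails between the same pair on the same weekday are counted into `Weight`, and that self-loops and single emails are dropped.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only
toy_edges <- tibble(
  source      = c(1, 1, 1, 2, 2, 2, 3),
  target      = c(2, 2, 2, 1, 2, 2, 1),
  Weekday     = c("Monday", "Monday", "Tuesday", "Monday",
                  "Monday", "Monday", "Monday"),
  MainSubject = rep("Work related", 7)
)

toy_aggregated <- toy_edges %>%
  filter(MainSubject == "Work related") %>%
  group_by(source, target, Weekday) %>%
  summarise(Weight = n(), .groups = "drop") %>%  # one row per pair and weekday
  filter(source != target, Weight > 1)           # drop self-loops and single emails

toy_aggregated
```

Only the pair 1 to 2 on Monday survives: the self-loop 2 to 2 is removed even though it occurs twice, and every other combination occurs only once.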
# GAStech_nodes <- GAStech_nodes %>%
# rename(group = Department)
GAStech_graph <- tbl_graph(nodes = GAStech_nodes, edges = GAStech_edges_aggregated, directed = TRUE)
GAStech_graph %>%
  activate(edges) %>%
  arrange(desc(Weight))
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Edge Data: 1,456 x 4 (active)
## from to Weekday Weight
## <int> <int> <ord> <int>
## 1 40 41 Tuesday 23
## 2 40 43 Tuesday 19
## 3 41 43 Tuesday 15
## 4 41 40 Tuesday 14
## 5 42 41 Tuesday 13
## 6 42 40 Tuesday 12
## # ... with 1,450 more rows
## #
## # Node Data: 54 x 4
## id label Department Title
## <dbl> <chr> <chr> <chr>
## 1 1 Mat.Bramar Administration Assistant to CEO
## 2 2 Anda.Ribera Administration Assistant to CFO
## 3 3 Rachel.Pantanal Administration Assistant to CIO
## # ... with 51 more rows
Current graph:
g <- GAStech_graph %>%
  mutate(betweenness_centrality = centrality_betweenness()) %>%
  mutate(closeness_centrality = centrality_closeness()) %>%
  ggraph(layout = "nicely") +
  geom_edge_link(aes()) +
  geom_node_point(aes(colour = closeness_centrality, size = betweenness_centrality))
g + theme_graph()
Changes using latest functions provided in ggraph 2.0:
1) Using the new default “stress” layout
2) Using tidygraph algorithms as input to aesthetic mappings
g <- GAStech_graph %>%
  ggraph() +
  geom_edge_link(aes()) +
  geom_node_point(aes(colour = centrality_closeness(), size = centrality_betweenness()))
g + theme_graph()
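To get a feel for the values that `centrality_betweenness()` and `centrality_closeness()` feed into the aesthetics, they can be computed on a tiny graph where the answer is easy to reason about. The sketch below uses a hypothetical five-node undirected star (`toy_star`), not the GAStech data.

```r
library(dplyr)
library(tidygraph)

# Hypothetical toy graph: an undirected star with node 1 at the centre
toy_star <- create_star(5)

toy_centrality <- toy_star %>%
  mutate(
    betweenness = centrality_betweenness(),  # shortest paths passing through a node
    closeness   = centrality_closeness()     # inverse of total distance to all others
  ) %>%
  as_tibble()

toy_centrality
```

Every shortest path between two leaves passes through the centre, so the centre has the only non-zero betweenness (6, one for each of the six leaf pairs) and also the highest closeness.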
Edge lines
The edge lines are too dark, which makes the graph hard to read and visually unappealing. Moreover, nodes with higher closeness centrality, which are drawn in colours close to black, are almost impossible to see, especially when their betweenness centrality is also low. Both issues can be addressed by giving the edges a more contrasting colour such as orange.
Layout
With the current layout, the graph is too dense to see how each node is connected to the others.
Graph does not show information such as department
The graph is too complex for users to extract information at a glance. The only information that is easy to read off is which nodes have low closeness centrality.
Changes:
1) Change the layout to circle so that the nodes and edges are separated and can be seen clearly.
2) Use orange for the edges so that they can be distinguished from the black nodes and are more visually appealing.
3) Map node shape to department for a more insightful analysis.
# g <- GAStech_graph %>%
# ggraph(layout="layout_in_circle") +
# geom_edge_link(colour="grey",alpha=0.1) +
# geom_node_point(aes(colour = centrality_closeness(), size=centrality_betweenness()))
# g <- GAStech_graph %>%
# ggraph(layout="circle") +
# geom_edge_link(colour="grey",alpha=0.1) +
# geom_node_point(aes(colour = centrality_closeness(), size=centrality_betweenness()))
g2 <- GAStech_graph %>%
  ggraph(layout = "circle") +
  geom_edge_link(colour = "#FF7E00", alpha = 0.1) +
  geom_node_point(aes(shape = Department, colour = centrality_closeness(), size = centrality_betweenness()))
g2 + theme_graph()
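As a quick check on what the circle layout does, the node coordinates can be inspected directly with `create_layout()`. The sketch below uses a hypothetical eight-node ring (`toy_ring`) rather than the GAStech graph, and assumes the "circle" layout places nodes on a unit circle, as igraph's circle layout does.

```r
library(tidygraph)
library(ggraph)

# Hypothetical toy graph: eight nodes connected in a ring
toy_ring <- create_ring(8)

# Compute the layout without drawing anything; x and y hold node positions
circle_xy <- create_layout(toy_ring, layout = "circle")

# Distance of each node from the origin
radius <- sqrt(circle_xy$x^2 + circle_xy$y^2)
radius
```

All nodes sit at the same distance from the centre, so no node is hidden behind another and every edge is drawn as a chord across the interior.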
GAStech_nodes <- GAStech_nodes %>%
  rename(group = Department)
GAStech_edges_aggregated <- GAStech_edges %>%
  left_join(GAStech_nodes, by = c("sourceLabel" = "label")) %>%
  rename(from = id) %>%
  left_join(GAStech_nodes, by = c("targetLabel" = "label")) %>%
  rename(to = id) %>%
  filter(MainSubject == "Work related") %>%
  group_by(from, to) %>%
  summarise(weight = n()) %>%
  filter(from != to) %>%
  filter(weight > 1) %>%
  ungroup()
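The two joins above translate sender and recipient labels into the numeric `from` and `to` ids that visNetwork expects. A minimal sketch on made-up data follows; `toy_nodes` and `toy_emails` are hypothetical. As a variation on the code above, it selects only the id column before each join, which avoids the duplicated `.x`/`.y` columns that joining the full node table twice would create.

```r
library(dplyr)

# Hypothetical node and email tables; names and values are invented
toy_nodes <- tibble(
  id    = 1:3,
  label = c("Ann", "Bob", "Cy"),
  group = c("A", "A", "B")
)
toy_emails <- tibble(
  sourceLabel = c("Ann", "Ann", "Bob", "Cy"),
  targetLabel = c("Bob", "Bob", "Ann", "Cy")
)

toy_vis_edges <- toy_emails %>%
  # Look up the sender id, then the recipient id
  left_join(select(toy_nodes, label, from = id), by = c("sourceLabel" = "label")) %>%
  left_join(select(toy_nodes, label, to = id), by = c("targetLabel" = "label")) %>%
  count(from, to, name = "weight") %>%   # emails per ordered pair
  filter(from != to, weight > 1)         # drop self-loops and single emails

toy_vis_edges
```

Only the Ann to Bob pair (ids 1 and 2) remains, with a weight of 2.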
Current graph:
visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)
1) When a name is selected from the drop-down list, the corresponding node is both highlighted and labelled. All nodes linked to the selected node are labelled as well.
2) When a node in the interactive graph is selected, that node is both highlighted and labelled. All nodes linked to the selected node are labelled as well.
visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visNodes(font = list(size = 50)) %>%
  visOptions(highlightNearest = list(enabled = TRUE, labelOnly = FALSE),
             nodesIdSelection = TRUE) %>%
  visLayout(randomSeed = 123)
Legends
The graph has no legend to show what each colour represents.
Not directed
The edges do not indicate direction, so users cannot tell whether an email was sent to the node, from the node, or both.
Colours when selected
When a node is selected, the colourful edges make it hard to see which edges are highlighted, and the graph can look messy. Edges that are not selected could instead be coloured, for example, light grey.
Changes:
1) Add a legend to let users know what the colours represent.
2) Add directed arrows so that users can identify who sent an email to whom.
3) Grey out all edges and nodes that are not one degree away from the selected node, so that the selected and highlighted nodes stand out clearly.
visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  # Arrow heads at the "to" end only, so each edge points from sender to recipient
  visEdges(arrows = "to") %>%
  visOptions(highlightNearest = list(enabled = TRUE, algorithm = "hierarchical", labelOnly = FALSE),
             nodesIdSelection = list(enabled = TRUE),
             collapse = TRUE) %>%
  # Node labels come from the label column of GAStech_nodes
  visNodes(color = list(background = "lightblue",
                        highlight = "yellow"),
           shadow = list(enabled = TRUE, size = 10),
           font = list(size = 50)) %>%
  visLayout(randomSeed = 123) %>%
  visLegend(main = "Legend for Departments")
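The three changes (legend, directed arrows, and highlighting only the immediate neighbours) can be tried out on a small, self-contained network before applying them to the full data. This is a sketch under assumptions: `toy_nodes` and `toy_edges` are hypothetical, and `degree = 1` is spelled out explicitly to restrict highlighting to direct neighbours.

```r
library(visNetwork)

# Hypothetical toy network; the group column drives node colours and the legend
toy_nodes <- data.frame(
  id    = 1:3,
  label = c("Ann", "Bob", "Cy"),
  group = c("A", "A", "B")
)
toy_edges <- data.frame(from = c(1, 2), to = c(2, 3))

toy_widget <- visNetwork(toy_nodes, toy_edges) %>%
  visEdges(arrows = "to") %>%                       # arrow head at the recipient
  visOptions(highlightNearest = list(enabled = TRUE, degree = 1),
             nodesIdSelection = TRUE) %>%           # drop-down to pick a node
  visLegend(main = "Group") %>%                     # one legend entry per group
  visLayout(randomSeed = 123)                       # reproducible layout

toy_widget
```

Printing `toy_widget` in an R Markdown document or the RStudio viewer renders the interactive graph; selecting "Bob" greys out everything that is not directly connected to that node.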