Task 1: Static Organisation Graph

Q1. Improve the code chunk used to create the organisation network graph by using the latest functions provided in ggraph 2.0.

Original code:

First, let's load the packages:

packages <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse', 'ggplot2', 'deldir')

# Install any package that is missing, then attach it
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

Prepare the data available:

GAStech_nodes <- read_csv("GAStech_email_node.csv")
## Parsed with column specification:
## cols(
##   id = col_double(),
##   label = col_character(),
##   Department = col_character(),
##   Title = col_character()
## )
GAStech_edges <- read_csv("GAStech_email_edge-v2.csv")
## Parsed with column specification:
## cols(
##   source = col_double(),
##   target = col_double(),
##   SentDate = col_character(),
##   SentTime = col_time(format = ""),
##   Subject = col_character(),
##   MainSubject = col_character(),
##   sourceLabel = col_character(),
##   targetLabel = col_character()
## )
GAStech_edges$SentDate  = dmy(GAStech_edges$SentDate)
GAStech_edges$Weekday = wday(GAStech_edges$SentDate, label = TRUE, abbr = FALSE)
GAStech_edges_aggregated <- GAStech_edges %>%
  filter(MainSubject == "Work related") %>%
  group_by(source, target, Weekday) %>%
    summarise(Weight = n()) %>%
  filter(source!=target) %>%
  filter(Weight > 1) %>%
  ungroup()
GAStech_edges_aggregated
## # A tibble: 1,456 x 4
##    source target Weekday   Weight
##     <dbl>  <dbl> <ord>      <int>
##  1      1      2 Monday         4
##  2      1      2 Tuesday        3
##  3      1      2 Wednesday      5
##  4      1      2 Friday         8
##  5      1      3 Monday         4
##  6      1      3 Tuesday        3
##  7      1      3 Wednesday      5
##  8      1      3 Friday         8
##  9      1      4 Monday         4
## 10      1      4 Tuesday        3
## # … with 1,446 more rows
GAStech_graph <- tbl_graph(nodes = GAStech_nodes, edges = GAStech_edges_aggregated, directed = TRUE)
GAStech_graph %>%
  activate(edges) %>%
  arrange(desc(Weight))
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Edge Data: 1,456 x 4 (active)
##    from    to Weekday Weight
##   <int> <int> <ord>    <int>
## 1    40    41 Tuesday     23
## 2    40    43 Tuesday     19
## 3    41    43 Tuesday     15
## 4    41    40 Tuesday     14
## 5    42    41 Tuesday     13
## 6    42    40 Tuesday     12
## # … with 1,450 more rows
## #
## # Node Data: 54 x 4
##      id label           Department     Title           
##   <dbl> <chr>           <chr>          <chr>           
## 1     1 Mat.Bramar      Administration Assistant to CEO
## 2     2 Anda.Ribera     Administration Assistant to CFO
## 3     3 Rachel.Pantanal Administration Assistant to CIO
## # … with 51 more rows

This is the original graph:

g <- GAStech_graph %>%
  mutate(betweenness_centrality = centrality_betweenness()) %>%
  mutate(closeness_centrality = centrality_closeness()) %>%
  ggraph(layout = "nicely") + 
  geom_edge_link(aes()) +
  geom_node_point(aes(colour = closeness_centrality, size=betweenness_centrality))

g + theme_graph()

Solution to Q1: We no longer need to compute the centrality measures with mutate() beforehand. In ggraph 2.0, the GAStech_graph object can be passed straight to ggraph(), and tidygraph functions such as centrality_closeness() and centrality_betweenness() can be called directly inside aes(), here mapped to colour and size respectively.

g <- 
  ggraph(GAStech_graph,layout = "nicely") + 
  geom_edge_link(aes()) +
  geom_node_point(aes(colour = centrality_closeness(), size=centrality_betweenness()))

g + theme_graph()

Q2. Identify three aspects of the graph visualisation in Section 6.1 that can be improved.

First Problem: The thick black edges between the nodes make the graph hard to read and obscure the relationships between the nodes. Every edge drawn by geom_edge_link is equally thick, so the weight of a link cannot be seen. It is also not clear which direction a link runs in or on which weekday the emails were sent.

First Solution: Color the links by the weekday on which the emails were sent, and let the boldness of each link represent its weight.
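As a rough illustration of this first solution, the sketch below colors edges by weekday and maps weight to transparency on a small made-up graph. The toy_nodes, toy_edges and toy_graph objects are hypothetical stand-ins, not the GAStech data. The arrow and end_cap settings are one possible way to show direction and are an assumption, since the solution above does not prescribe them.

```r
library(tidygraph)
library(ggraph)
library(ggplot2)

# Toy email network (hypothetical values, not the GAStech data)
toy_nodes <- data.frame(label = c("Ann", "Bob", "Cho", "Dee"))
toy_edges <- data.frame(
  from    = c(1L, 1L, 2L, 3L, 4L),
  to      = c(2L, 3L, 3L, 4L, 1L),
  Weekday = factor(c("Monday", "Tuesday", "Monday", "Friday", "Tuesday")),
  Weight  = c(4, 2, 7, 3, 5)
)
toy_graph <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = TRUE)

p_edges <- ggraph(toy_graph, layout = "nicely") +
  # Edge color encodes the weekday, edge transparency encodes the weight
  geom_edge_link(
    aes(colour = Weekday, alpha = Weight),
    arrow   = arrow(length = unit(2, "mm")),  # arrowhead shows direction
    end_cap = circle(3, "mm")                 # stop the edge short of the node
  ) +
  geom_node_point(size = 3)

p_edges
```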

Second Problem: The nodes are not labelled in any way. A reader has no idea who each circle (node) represents or which department that person is in. The original monotone blue color scale also does not stand out against the black edge links.

Second Solution: Display more information about each node, such as the department, name or title of the person it represents. Use a color gradient that distinguishes itself from the background colors (if there are any).
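A minimal sketch of this second solution, assuming a small made-up graph: the node color shows the department and geom_node_text adds the person's name. The toy data and the grey edge color are illustrative assumptions only.

```r
library(tidygraph)
library(ggraph)
library(ggplot2)

# Toy network (hypothetical values, not the GAStech data)
toy_nodes <- data.frame(
  label      = c("Ann", "Bob", "Cho", "Dee"),
  Department = c("Administration", "Administration", "Engineering", "Security")
)
toy_edges <- data.frame(from = c(1L, 1L, 2L, 3L), to = c(2L, 3L, 3L, 4L))
toy_graph <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = TRUE)

p_nodes <- ggraph(toy_graph, layout = "nicely") +
  # Light edges so that the node colors stand out
  geom_edge_link(edge_colour = "grey70") +
  # Node color identifies the department
  geom_node_point(aes(colour = Department), size = 4) +
  # Name labels, nudged apart so that they do not overlap
  geom_node_text(aes(label = label), repel = TRUE)

p_nodes
```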

Third Problem: The current arrangement (layout) of the nodes and edges is very messy. It is hard to develop insights from it.

Third Solution: Find a better layout that can display all the information clearly.
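One way to search for a better layout is to draw the same graph under several layout algorithms and compare them side by side. The sketch below does this on a made-up graph. The three layout names ("fr", "kk", "circle") are igraph layouts that ggraph accepts and were chosen only as examples.

```r
library(tidygraph)
library(ggraph)
library(ggplot2)

# Toy network (hypothetical values, not the GAStech data)
toy_graph <- tbl_graph(
  nodes = data.frame(label = c("Ann", "Bob", "Cho", "Dee", "Eli")),
  edges = data.frame(from = c(1L, 1L, 2L, 3L, 4L, 5L),
                     to   = c(2L, 3L, 3L, 4L, 5L, 1L)),
  directed = TRUE
)

# Candidate layouts to compare
layouts <- c("fr", "kk", "circle")

# Build one plot per layout, titled with the layout name
plots <- lapply(layouts, function(l) {
  ggraph(toy_graph, layout = l) +
    geom_edge_link() +
    geom_node_point(size = 3) +
    ggtitle(l)
})
names(plots) <- layouts

plots[["circle"]]
```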

Q3. Provide the sketch of your alternative design.

Q4. Using appropriate ggraph functions, plot the alternative design.

g <- 
  ggraph(GAStech_graph,  layout = "linear", circular = TRUE) + 
  geom_node_voronoi(aes(fill = Department), 
                    max.radius = 0.1, 
                    colour = 'white',alpha=0.5)+
  geom_edge_arc(aes(alpha = Weight, 
                    color = Weekday), 
                width = 0.5) +
  geom_node_text(aes(label = label), 
                 size = 3,  # constant text size, set outside aes() so it does not share the betweenness size scale
                 repel = TRUE)+
  scale_color_gradient(low = "white",
                       high = "black")+
  geom_node_point(aes(colour = centrality_closeness(), 
                      size=centrality_betweenness()))+
  labs(title="GAStech Network Graph",
       colour="Closeness Centrality",
       size="Betweenness Centrality")
g

Task 2: Interactive Organisation Graph

Data preparation:

GAStech_edges_aggregated <- GAStech_edges %>%
  left_join(GAStech_nodes, by = c("sourceLabel" = "label")) %>%
  rename(from = id) %>%
  left_join(GAStech_nodes, by = c("targetLabel" = "label")) %>%
  rename(to = id) %>%
  filter(MainSubject == "Work related") %>%
  group_by(from, to) %>%
    summarise(weight = n()) %>%
  filter(from!=to) %>%
  filter(weight > 1) %>%
  ungroup()
GAStech_nodes <- GAStech_nodes %>%
  rename(group = Department)

Q1. Improve the design of the graph by incorporating the following interactivity:

A. When a name is selected from the drop-down list, the corresponding node will not only be highlighted but also labelled. Furthermore, all the nodes linked to the selected node will be labelled too.

B. When a node of the interactive graph is selected, the node will not only be highlighted but also labelled. Furthermore, all the nodes linked to the selected node will be labelled as well.

Original code:

visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)

New Code: Based on the requirements, we have expanded the visOptions call so that the selected node and its adjacent nodes are highlighted, and added visInteraction(hover = TRUE) and visNodes() to enable hover events and enlarge the label font. Reference: https://www.rdocumentation.org/packages/visNetwork/versions/2.0.8/topics/visOptions

visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = list(enabled = TRUE, 
                                     labelOnly = FALSE,
                                     algorithm = "hierarchical"),
             nodesIdSelection = TRUE, 
             autoResize = TRUE,
             collapse = TRUE) %>%
  visInteraction(hover = TRUE)%>%
  visNodes(mass = 50, 
           font = list(size = 30))

Q2. Identify three aspects of the graph visualisation in Section 7.4 that can be improved.

First Problem: All the nodes and edges are color coded, but there is no legend explaining what the colors mean.

First Solution: Create a legend to show the groups/departments of the employees.
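A minimal sketch of adding such a legend, assuming a made-up node table in which the group column plays the role of the department. visLegend() builds the legend from the node groups by default. The toy data are hypothetical.

```r
library(visNetwork)

# Toy data (hypothetical); `group` stands in for Department
toy_nodes <- data.frame(
  id    = 1:4,
  label = c("Ann", "Bob", "Cho", "Dee"),
  group = c("Administration", "Administration", "Engineering", "Security")
)
toy_edges <- data.frame(from = c(1, 2, 3), to = c(2, 3, 4))

# Nodes are colored by group; visLegend() adds one legend entry per group
net_legend <- visNetwork(toy_nodes, toy_edges) %>%
  visLegend()

net_legend
```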

Second Problem: There is a drop-down filter for selecting individual employees, but none for departments. A user who wants to understand the relationships within a department is unable to do so.

Second Solution: Add a drop-down filter for departments.
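The department filter can be sketched with the selectedBy argument of visOptions(), which adds a second drop-down built from a node column. The toy data below are hypothetical, with group standing in for the department.

```r
library(visNetwork)

# Toy data (hypothetical); `group` stands in for Department
toy_nodes <- data.frame(
  id    = 1:4,
  label = c("Ann", "Bob", "Cho", "Dee"),
  group = c("Administration", "Administration", "Engineering", "Security")
)
toy_edges <- data.frame(from = c(1, 2, 3), to = c(2, 3, 4))

net_filter <- visNetwork(toy_nodes, toy_edges) %>%
  visOptions(
    nodesIdSelection = TRUE,     # drop-down for individual nodes
    selectedBy       = "group"   # drop-down for the values of `group`
  )

net_filter
```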

Third Problem: The nodes are labelled, but the labels cannot be read easily. They sit outside the circles and are hidden once the graph is zoomed out to a certain extent. They also tend to clump together when zoomed out.

Third Solution: Change the node shape from a circle to a box and put the label inside it. Try to keep the labels from overlapping and keep them visible when zoomed out to a certain extent.

Extras: The edges now have arrows, so the reader can tell which node each link runs from and to.
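The box-shaped nodes and directed arrows described above can be sketched as follows on made-up data. The font size is an arbitrary illustrative choice.

```r
library(visNetwork)

# Toy data (hypothetical, not the GAStech data)
toy_nodes <- data.frame(
  id    = 1:4,
  label = c("Ann", "Bob", "Cho", "Dee")
)
toy_edges <- data.frame(from = c(1, 2, 3), to = c(2, 3, 4))

net_boxes <- visNetwork(toy_nodes, toy_edges) %>%
  # "box" draws the label inside the node instead of beneath it
  visNodes(shape = "box", font = list(size = 20)) %>%
  # Arrowhead at the target end of every edge
  visEdges(arrows = "to")

net_boxes
```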

Q3. Provide the sketch of your alternative design.

Q4. Using appropriate visNetwork functions, plot the alternative design.

reference: https://www.rdocumentation.org/packages/visNetwork/versions/2.0.8/topics/visNetwork

g <- visNetwork(GAStech_nodes, GAStech_edges_aggregated, main = "GAStech Network Graph") %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = list(enabled = TRUE,
                                     labelOnly = FALSE,
                                     algorithm = "hierarchical"),
             autoResize = TRUE,
             collapse = TRUE,
             nodesIdSelection = list(enabled = TRUE,
                                     values = unique(GAStech_nodes$id)),
             selectedBy = list(variable = "group")) %>%
  # Box-shaped nodes so that the label sits inside the node
  visNodes(shape = "box",
           mass = 50,
           font = list(size = 30)) %>%
  # Straight edges with an arrowhead at the target node
  visEdges(arrows = "to", smooth = FALSE) %>%
  visInteraction(hover = TRUE) %>%
  visPhysics(stabilization = FALSE)
# Add a legend for the node groups (departments)
visLegend(g)