1. Installing and Launching R Packages

Before we get started, it is important to ensure that tidyverse, tidygraph, igraph and ggraph, along with the other packages listed in the code chunk below, have been installed in R. The code chunk installs any package that is missing and then loads all of them.

  # Packages used in this exercise
  packages <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse', 'gganimate', 'dplyr', 'readxl')

  # Install any package that is not yet available, then attach it
  for (p in packages) {
    if (!require(p, character.only = TRUE)) {
      install.packages(p)
    }
    library(p, character.only = TRUE)
  }
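
As an alternative to the loop above, the check can be separated from the installation. The following is a minimal sketch using only base R; the helper names missing_packages() and ensure_packages() are hypothetical and not part of the original exercise.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install whatever is missing, then attach everything.
ensure_packages <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Base packages ship with R, so nothing is reported as missing here.
missing_packages(c("stats", "utils"))
```

Calling ensure_packages() with the package vector defined above would then do the same job as the loop, while installing all missing packages in a single install.packages() call.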

If you have installed the above packages before, you can use the code chunk below to load them in R.

  p <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse')
  lapply(p, require, character.only = TRUE)
## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] TRUE
## 
## [[5]]
## [1] TRUE
## 
## [[6]]
## [1] TRUE

2. Data Wrangling

In this step, you will import and prepare the data as in Hands-on Exercise 10.

  GAStech_nodes <- read_csv("data/GAStech_email_node.csv")
  GAStech_edges <- read_csv("data/GAStech_email_edge-v2.csv")
  
  glimpse(GAStech_edges)
## Observations: 9,063
## Variables: 8
## $ source      <dbl> 43, 43, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 26...
## $ target      <dbl> 41, 40, 51, 52, 53, 45, 44, 46, 48, 49, 47, 54, 27...
## $ SentDate    <chr> "6/1/2014", "6/1/2014", "6/1/2014", "6/1/2014", "6...
## $ SentTime    <time> 08:39:00, 08:39:00, 08:58:00, 08:58:00, 08:58:00,...
## $ Subject     <chr> "GT-SeismicProcessorPro Bug Report", "GT-SeismicPr...
## $ MainSubject <chr> "Work related", "Work related", "Work related", "W...
## $ sourceLabel <chr> "Sven.Flecha", "Sven.Flecha", "Kanon.Herrero", "Ka...
## $ targetLabel <chr> "Isak.Baza", "Lucas.Alcazar", "Felix.Resumir", "Hi...
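
The SentDate column is read as character strings such as "6/1/2014", which could be interpreted day-first or month-first. The short check below is a sketch that assumes lubridate is installed; it shows how the two readings differ. The exercise uses the day-first reading via dmy().

```r
library(lubridate)

raw <- "6/1/2014"   # first SentDate value shown in the glimpse() output

day_first   <- dmy(raw)   # read as 6 January 2014
month_first <- mdy(raw)   # read as 1 June 2014

# Weekday as a number with Monday = 1, which avoids locale-dependent labels
wday(day_first, week_start = 1)
wday(month_first, week_start = 1)
```

The two readings fall on different weekdays, so choosing the wrong parser would silently change the Weekday column derived from SentDate.
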
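  # Parse the day/month/year strings into dates and derive the weekday name
  GAStech_edges$SentDate <- dmy(GAStech_edges$SentDate)
  GAStech_edges$Weekday <- wday(GAStech_edges$SentDate, label = TRUE, abbr = FALSE)
  
  # Count work-related emails per sender, receiver and weekday,
  # then drop self-emails and pairs with only a single email
  GAStech_edges_aggregated <- GAStech_edges %>%
    filter(MainSubject == "Work related") %>%
    group_by(source, target, Weekday) %>%
    dplyr::summarise(Weight = n()) %>%
    filter(source != target) %>%
    filter(Weight > 1) %>%
    ungroup()
  GAStech_edges_aggregated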
## # A tibble: 1,456 x 4
##    source target Weekday   Weight
##     <dbl>  <dbl> <ord>      <int>
##  1      1      2 Monday         4
##  2      1      2 Tuesday        3
##  3      1      2 Wednesday      5
##  4      1      2 Friday         8
##  5      1      3 Monday         4
##  6      1      3 Tuesday        3
##  7      1      3 Wednesday      5
##  8      1      3 Friday         8
##  9      1      4 Monday         4
## 10      1      4 Tuesday        3
## # ... with 1,446 more rows
  GAStech_graph <- tbl_graph(nodes = GAStech_nodes, edges = GAStech_edges_aggregated, directed = TRUE)
  GAStech_graph
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Node Data: 54 x 4 (active)
##      id label              Department    Title                             
##   <dbl> <chr>              <chr>         <chr>                             
## 1     1 Mat.Bramar         Administrati~ Assistant to CEO                  
## 2     2 Anda.Ribera        Administrati~ Assistant to CFO                  
## 3     3 Rachel.Pantanal    Administrati~ Assistant to CIO                  
## 4     4 Linda.Lagos        Administrati~ Assistant to COO                  
## 5     5 Ruscella.Mies.Hab~ Administrati~ Assistant to Engineering Group Ma~
## 6     6 Carla.Forluniau    Administrati~ Assistant to IT Group Manager     
## # ... with 48 more rows
## #
## # Edge Data: 1,456 x 4
##    from    to Weekday   Weight
##   <int> <int> <ord>      <int>
## 1     1     2 Monday         4
## 2     1     2 Tuesday        3
## 3     1     2 Wednesday      5
## # ... with 1,453 more rows
  GAStech_graph %>%
  activate(edges) %>%
  arrange(desc(Weight))
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Edge Data: 1,456 x 4 (active)
##    from    to Weekday Weight
##   <int> <int> <ord>    <int>
## 1    40    41 Tuesday     23
## 2    40    43 Tuesday     19
## 3    41    43 Tuesday     15
## 4    41    40 Tuesday     14
## 5    42    41 Tuesday     13
## 6    42    40 Tuesday     12
## # ... with 1,450 more rows
## #
## # Node Data: 54 x 4
##      id label           Department     Title           
##   <dbl> <chr>           <chr>          <chr>           
## 1     1 Mat.Bramar      Administration Assistant to CEO
## 2     2 Anda.Ribera     Administration Assistant to CFO
## 3     3 Rachel.Pantanal Administration Assistant to CIO
## # ... with 51 more rows
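
To see what the aggregation and tbl_graph() steps above do, the same pipeline can be run on a few made-up rows. This is a sketch only: toy_nodes and toy_emails are hypothetical data, not taken from the GAStech files, and .groups = "drop" (available in dplyr 1.0.0 or later) is used here in place of the explicit ungroup().

```r
library(dplyr)
library(tidygraph)

# Hypothetical toy data, not taken from the GAStech files
toy_nodes <- data.frame(id = 1:3, label = c("A", "B", "C"))
toy_emails <- data.frame(
  source  = c(1, 1, 1, 2, 2, 3),
  target  = c(2, 2, 3, 3, 3, 3),
  Weekday = c("Monday", "Monday", "Monday", "Tuesday", "Tuesday", "Friday")
)

# Count emails per sender, receiver and weekday,
# then drop self-emails and pairs with only a single email
toy_edges <- toy_emails %>%
  group_by(source, target, Weekday) %>%
  summarise(Weight = n(), .groups = "drop") %>%
  filter(source != target, Weight > 1)

# The first two edge columns are taken as the edge endpoints
toy_graph <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = TRUE)
toy_graph
```

Because source and target are numeric, tbl_graph() treats them as row positions in the node table, so this sketch relies on the node ids running from 1 upwards in row order, as they do in GAStech_nodes.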

3. Task 1: Static Organisation Graph

With reference to the organisation network graph in Section 6.1 of Hands-on Exercise 10, you are required to complete the following tasks:

  • Improve the code chunk used to create the organisation network graph by using the latest functions provided in ggraph 2.0.
  • Identify three aspects of the graph visualisation in Section 6.1 that can be improved.
  • Provide the sketch of your alternative design.
  • Using appropriate ggraph functions, plot the alternative design.

3.1 Improving code chunk

This is the previous code chunk that is used to create the organisation network graph.

  g <- GAStech_graph %>%
    mutate(betweenness_centrality = centrality_betweenness()) %>%
    mutate(closeness_centrality = centrality_closeness()) %>%
    ggraph(layout = "nicely") + 
    geom_edge_link(aes()) +
    geom_node_point(aes(colour = closeness_centrality, size=betweenness_centrality))
  
  g + theme_graph()

This is the improved code chunk used to create the organisation network graph, rewritten with the functions introduced in ggraph 2.0.

In ggraph, the initial ggraph() call specifies the layout used by the subsequent node and edge geoms. Many layouts use node or edge variables in their calculations, e.g. a node size or an edge weight. Prior to v2, these arguments took a string naming the variable to use; since v2, they use Non-Standard Evaluation (NSE), as known from dplyr and from aes() calls in ggplot2. ggraph 2.0 also allows tidygraph algorithms such as centrality_closeness() and centrality_betweenness() to be called directly inside aes(), so the separate mutate() steps are no longer needed.

  ggraph(GAStech_graph, layout = "nicely") + 
    geom_edge_link() +
    geom_node_point(aes(colour = centrality_closeness(), size = centrality_betweenness())) +
    theme_graph()
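
The difference between the two styles can be checked on a small graph where the centrality values are easy to reason about. The sketch below assumes tidygraph and ggraph 2.0 or later are installed; toy_star is a hypothetical example graph, not part of the exercise data.

```r
library(tidygraph)
library(ggraph)

# Hypothetical toy graph: one hub connected to four leaves
toy_star <- create_star(5)

# Explicit style: store the centrality as a node column first
with_column <- toy_star %>%
  mutate(betweenness = centrality_betweenness())

# ggraph 2.0 style: call the tidygraph algorithm directly inside aes()
p <- ggraph(toy_star, layout = "nicely") +
  geom_edge_link() +
  geom_node_point(aes(size = centrality_betweenness())) +
  theme_graph()
```

In this graph only the hub lies on shortest paths between other nodes, so it is the only node with a non-zero betweenness and should be drawn as the largest point in either style.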

3.2 Three aspects that can be improved

| No. | Problem | Solution |
|-----|---------|----------|
| 1 | Dark-coloured nodes are hard to distinguish from one another | Change the nodes to a brighter colour scale so that they can be told apart clearly |
| 2 | Node names are not visible | Add a label to each node so that users can see which person it represents |
| 3 | The "nicely" layout makes the added labels hard to read | Change the layout from "nicely" to "circle" to give a clearer view |

3.3 Sketch

[Image: sketch of the alternative design]

3.4 Alternative Design

  ggraph(GAStech_graph, layout = "circle") + 
    geom_edge_link(alpha = 0.05) +
    geom_node_point(aes(colour = centrality_closeness(), size = centrality_betweenness())) +
    scale_color_gradient(low = 'yellow', high = 'orange') + 
    geom_node_text(aes(label = label), colour = "black", size = 1.5, repel = TRUE) +
    theme_graph()

4. Task 2: Interactive Organisation Graph

With reference to the organisation network graph in Section 7.4 of Hands-on Exercise 10, you are required to complete the following tasks:

  • Improve the design of the graph by incorporating the following interactivity:
      • When a name is selected from the drop-down list, the corresponding node is highlighted and labelled, and all nodes linked to it are labelled as well.
      • When a node of the interactive graph is selected, the node is highlighted and labelled, and all nodes linked to it are labelled as well.
  • Identify three aspects of the graph visualisation in Section 7.4 that can be improved.
  • Provide the sketch of your alternative design.
  • Using appropriate visNetwork functions, plot the alternative design.

Data preparation

  GAStech_edges_aggregated <- GAStech_edges %>%
    left_join(GAStech_nodes, by = c("sourceLabel" = "label")) %>%
    rename(from = id) %>%
    left_join(GAStech_nodes, by = c("targetLabel" = "label")) %>%
    rename(to = id) %>%
    filter(MainSubject == "Work related") %>%
    group_by(from, to) %>%
      dplyr::summarise(weight = n()) %>%
    filter(from!=to) %>%
    filter(weight > 1) %>%
    ungroup()

  GAStech_nodes <- GAStech_nodes %>%
    # visNetwork colours nodes by a column named "group", so rename Department
    rename(group = Department)
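
visNetwork() expects a node table with an id column and an edge table with from and to columns; label and group are optional node columns used for the displayed name and the colour grouping. A minimal sketch with made-up data (toy_nodes and toy_edges are hypothetical, not the GAStech tables) shows the shape the preparation above is aiming for:

```r
library(visNetwork)

# Hypothetical node table: id is required, label and group are optional
toy_nodes <- data.frame(
  id    = 1:3,
  label = c("A", "B", "C"),
  group = c("Administration", "Administration", "Engineering")
)

# Hypothetical edge table: from and to refer to values of the id column
toy_edges <- data.frame(
  from   = c(1, 2),
  to     = c(2, 3),
  weight = c(4, 2)
)

toy_widget <- visNetwork(toy_nodes, toy_edges) %>%
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE) %>%
  visLegend()
```

Printing toy_widget in an interactive session or an R Markdown document renders the network, with one colour per value of group.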

4.1 Improving code chunk

This is the previous code chunk that is used to create the interactive organisation graph.

  visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
    visIgraphLayout(layout = "layout_with_fr") %>%
    visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)

This is the improved code chunk used to create the interactive organisation graph. It is intended to meet the two interactivity requirements listed at the start of this task: whether a name is selected from the drop-down list or a node is clicked in the graph, the selected node and its linked nodes are highlighted and labelled.

  visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
    visIgraphLayout(layout = "layout_with_fr") %>%
    visOptions(nodesIdSelection = TRUE, highlightNearest = list(labelOnly = FALSE, enabled= TRUE))

4.2 Three aspects that can be improved

| No. | Problem | Solution |
|-----|---------|----------|
| 1 | Node names are not visible | Change the nodes to a shape that displays the name inside each node |
| 2 | Scrolling the page zooms the visualisation in and out unintentionally | Add navigation buttons so that users can press “+” to zoom in and “-” to zoom out, and disable zooming on scroll (zoomView) |
| 3 | It is unclear which department each node belongs to, even though departments are clearly separated by colour | Add a legend showing which colour corresponds to which department |

4.3 Sketch

[Image: sketch of the alternative design]

4.4 Alternative Design

  visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
    visNodes(shape = "box") %>%
    visEdges(arrows = "to") %>% 
    visIgraphLayout(layout = "layout_with_fr") %>%
    visOptions(selectedBy = "group",
               nodesIdSelection = TRUE,
               highlightNearest = list(labelOnly = FALSE, enabled = TRUE),
               manipulation = TRUE) %>%
    visLegend(position = "right", main = "Department") %>%
    visInteraction(dragNodes = FALSE, dragView = FALSE, zoomView = FALSE, navigationButtons = TRUE)