Pre-Requisite

Disable messages and warnings, and set the default plot width and height to 10 and 5 respectively.

## Global options
knitr::opts_chunk$set(
               message=FALSE,
               warning=FALSE,
               fig.width=10,
               fig.height = 5)

The required packages must be installed and loaded before moving on to the subsequent tasks.

packages = c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse')

for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

Import the network data. Ensure the file paths are correct.

GAStech_nodes <- read_csv("data/GAStech_email_node.csv")
GAStech_edges <- read_csv("data/GAStech_email_edge-v2.csv")

Task 1: Static Organisation Graph

Data Wrangling from Hands-on Exercise 10

GAStech_edges$SentDate  = dmy(GAStech_edges$SentDate)
GAStech_edges$Weekday = wday(GAStech_edges$SentDate, label = TRUE, abbr = FALSE)

GAStech_edges_aggregated <- GAStech_edges %>%
  filter(MainSubject == "Work related") %>%
  group_by(source, target, Weekday) %>%
    summarise(Weight = n()) %>%
  filter(source!=target) %>%
  filter(Weight > 1) %>%
  ungroup()
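
To make the effect of each step in this pipeline easier to see, here is a minimal sketch on a toy edge list. The data frame toy_edges and all of its values are invented for illustration. Only the column names follow the GAStech data used above.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy edge list (not the GAStech data)
toy_edges <- data.frame(
  source      = c(1, 1, 1, 2, 2, 3),
  target      = c(2, 2, 2, 1, 2, 1),
  SentDate    = c("6/1/2014", "6/1/2014", "7/1/2014",
                  "6/1/2014", "6/1/2014", "6/1/2014"),
  MainSubject = c("Work related", "Work related", "Work related",
                  "Work related", "Work related", "Non-work related"),
  stringsAsFactors = FALSE
)

toy_aggregated <- toy_edges %>%
  # Parse day/month/year strings and derive the weekday name
  mutate(SentDate = dmy(SentDate),
         Weekday  = wday(SentDate, label = TRUE, abbr = FALSE)) %>%
  # Keep work-related emails only
  filter(MainSubject == "Work related") %>%
  # Count emails per sender, receiver and weekday
  group_by(source, target, Weekday) %>%
  summarise(Weight = n(), .groups = "drop") %>%
  # Drop self-loops and pairs that emailed only once
  filter(source != target, Weight > 1)

toy_aggregated
```

Only the pair 1 -> 2 on the first date survives, with Weight equal to 2. The self-loop, the single emails and the non-work email are all filtered out.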

Creating Network Objects from Hands-on Exercise 10

GAStech_graph <- 
  tbl_graph(nodes = GAStech_nodes, edges = GAStech_edges_aggregated, directed = TRUE)

GAStech_graph %>%
  activate(edges) %>%
  arrange(desc(Weight))

With reference to the organisation network graph in Section 6.1 of Hands-on Exercise 10, you are required to complete the following tasks:

1.1 Improve the code chunk used to create the organisation network graph by using the latest functions provided in ggraph 2.0.

ggraph 2.0 introduces a new function, qgraph(), that creates a quick organisation network chart for exploratory purposes. It plays a similar role to qmap, which produces a quick map.

Reference: https://cran.rstudio.com/web/packages/ggraph/news/news.html

g <- GAStech_graph %>%
  mutate(betweenness_centrality = centrality_betweenness()) %>%
  mutate(closeness_centrality = centrality_closeness()) %>%
  # Modified code: qgraph() replaces the explicit ggraph() layers
  qgraph(node_size = betweenness_centrality, node_colour = closeness_centrality)
g

1.2 Identify three aspects of the graph visualisation in Section 6.1 that can be improved.

  • The general lack of labels, and of any categorisation by Weekday or Department, makes it difficult for users to extract meaningful information.

  • The edges are difficult to differentiate because they are thick, and the black edges overlap with the darker-shaded nodes.

  • Nodes with lower centrality face visibility problems, as follows:

    • The node sizes are not scaled properly
    • Lower closeness centrality leads to darker shades.

1.3 Provide the sketch of your alternative design.

1.4 Using appropriate ggraph functions, plot the alternative design.

Overall network graph

The improved chart will include:

  • The key employees’ names

  • A change of edge colour to grey for a lighter shade

  • A change to a concentric layout, where the most central nodes are placed in the centre

  • Scaling of edge width and node size. Scaling the edges reduces their thickness, while the nodes are scaled larger to improve readability

GAStech_graph<- GAStech_graph %>%
  mutate(betweenness_centrality = centrality_betweenness()) %>%
  mutate(closeness_centrality = centrality_closeness()) 

g <- ggraph(GAStech_graph, layout="nicely") +
  geom_edge_link0(aes(edge_width=Weight), color="grey66", alpha=0.3) +
  geom_node_point(aes(colour = closeness_centrality, size=betweenness_centrality)) +
  scale_color_gradient(low="#f44336", high="#FFEB3B") +
  theme_graph() + theme(legend.position = "left")

g + geom_node_label(aes(filter=closeness_centrality > 0.015, label= label), repel = TRUE)+
  ggtitle("Centrality indices Chart") 
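
The chunk above uses the "nicely" layout. One way the concentric arrangement listed among the design changes might be obtained is ggraph's "centrality" layout, which places nodes on circles according to a centrality measure. The following is only a sketch under stated assumptions: it uses the Zachary karate club graph from tidygraph as a stand-in for GAStech_graph, and it assumes ggraph 2.0 or later with the graphlayouts package available.

```r
library(tidygraph)
library(ggraph)

# Stand-in graph (assumption: any connected tbl_graph works the same way)
toy_graph <- create_notable("zachary") %>%
  mutate(betweenness_centrality = centrality_betweenness())

# "centrality" layout: the more central a node, the closer to the middle
p <- ggraph(toy_graph, layout = "centrality",
            centrality = betweenness_centrality) +
  geom_edge_link0(edge_colour = "grey66", edge_alpha = 0.3) +
  geom_node_point(aes(size = betweenness_centrality)) +
  theme_graph()
p
```

Swapping layout = "nicely" for this layout in the chunk above would be a design choice rather than a required fix. The force-directed layout remains a reasonable default when the concentric rings are not needed.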

Network graph by Department

The overall network graph provides critical information about the key employees and their interactions with other employees. However, it does not show which departments Mat, Birgitta, Hideki and Ruscella interact with most often, nor how frequent those interactions are. Frequency is indicated by the weight of the edges: the thicker the edge, the more frequent the interaction.

The graph below is categorised/faceted by Department.

g + facet_nodes(~Department) + 
  geom_node_label(aes(filter=closeness_centrality > 0.010, label= label), 
                  repel = TRUE) +
  ggtitle("Centrality indices Chart by Department") 

The chart reveals the following key information:

  • Engineering, Security and Administration are vital parts of the company’s day-to-day operation

  • Key employees for each department

    • Hideki is in charge of Security
    • Mat and Ruscella are in charge of Administration
    • Birgitta is in charge of Engineering

  • The Executive department is rarely involved in the company’s day-to-day operation, and its members have very little interaction with each other

  • Interaction occurs more often among employees represented by orange nodes.

  • Birgitta interacts with two teams, which suggests there are two sub-departments within Engineering

Task 2: Interactive Organisation Graph

Data Preparation from Hands-on Exercise 10

GAStech_edges_aggregated <- GAStech_edges %>%
  left_join(GAStech_nodes, by = c("sourceLabel" = "label")) %>%
  rename(from = id) %>%
  left_join(GAStech_nodes, by = c("targetLabel" = "label")) %>%
  rename(to = id) %>%
  filter(MainSubject == "Work related") %>%
  group_by(from, to) %>%
    summarise(weight = n()) %>%
  filter(from!=to) %>%
  filter(weight > 1) %>%
  ungroup()
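
The two joins in this chunk translate employee names into node ids, which become the from and to columns that visNetwork expects. A minimal sketch of that mapping on invented data follows. The data frames toy_nodes and toy_emails and the names in them are hypothetical, and count() is used here as shorthand for the group_by() and summarise() pair above.

```r
library(dplyr)

# Hypothetical node and email tables (not the GAStech data)
toy_nodes <- data.frame(
  id    = c(1, 2, 3),
  label = c("Ann", "Bob", "Cy"),
  stringsAsFactors = FALSE
)
toy_emails <- data.frame(
  sourceLabel = c("Ann", "Ann", "Bob"),
  targetLabel = c("Bob", "Bob", "Cy"),
  stringsAsFactors = FALSE
)

toy_from_to <- toy_emails %>%
  # Look up the sender's id by name
  left_join(toy_nodes, by = c("sourceLabel" = "label")) %>%
  rename(from = id) %>%
  # Look up the receiver's id by name
  left_join(toy_nodes, by = c("targetLabel" = "label")) %>%
  rename(to = id) %>%
  # Number of emails per (from, to) pair
  count(from, to, name = "weight")

toy_from_to
```

The result has one row per sender and receiver pair: 1 -> 2 with weight 2 and 2 -> 3 with weight 1.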

As noted in Hands-on Exercise 10, visNetwork() looks for a field called “group”, so the Department column is renamed to group.

GAStech_nodes <- GAStech_nodes %>%
  rename(group = Department)
GAStech_nodes

2.1 Improve the design of the graph by incorporating the following interactivity:

  • When a name is selected from the drop-down list, the corresponding node will not only be highlighted but also labelled. Furthermore, all the nodes linked to the selected node will be labelled as well.

  • When a node of the interactive graph is selected, the node will not only be highlighted but also labelled. Furthermore, all the nodes linked to the selected node will be labelled as well.

In the code chunk below, visNodes() is used to set the font size and the label scaling. Label scaling has two relevant properties:

  • drawThreshold - When zooming out, the font is drawn smaller. This property defines the lower limit below which the label is no longer drawn

  • maxVisible - When zooming in, the font is drawn larger. This property defines the upper limit on the perceived font size

visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visNodes(font = list(size = 30),
           # vis.js names the lower limit drawThreshold
           scaling = list(label = list(drawThreshold = 30, maxVisible = 60))) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = list(enabled = TRUE, degree = 0), nodesIdSelection = TRUE)
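
The two scaling limits can be tried on a small toy network. This is only a sketch: the node and edge data are invented, and it assumes that the lower limit is passed under the vis.js option name drawThreshold and that label scaling acts on nodes carrying a numeric value column.

```r
library(visNetwork)

# Hypothetical toy network (not the GAStech data)
toy_nodes <- data.frame(
  id    = 1:3,
  label = c("Ann", "Bob", "Cy"),
  value = c(1, 5, 10)   # assumed to drive node and label scaling
)
toy_edges <- data.frame(from = c(1, 2), to = c(2, 3))

vn <- visNetwork(toy_nodes, toy_edges) %>%
  visNodes(scaling = list(
    label = list(enabled       = TRUE,
                 drawThreshold = 30,   # lower limit when zooming out
                 maxVisible    = 60)   # upper limit when zooming in
  ))
vn
```

Zooming in and out of the rendered widget shows how the two limits affect when labels appear and how large they can grow.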

2.2 Identify three aspects of the graph visualisation in Section 7.4 that can be improved.

2.3 Provide the sketch of your alternative design.

2.4 Using appropriate visNetwork functions, plot the alternative design.

visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = list(enabled = TRUE, degree = 1, hover = TRUE,
                                     algorithm = "hierarchical"),
            selectedBy = list(variable = "group", main = "Department"),
            nodesIdSelection = list(main = "Employee Name")) %>%
  visNodes(labelHighlightBold = TRUE, shape = "box", shadow = list(enabled = TRUE, size = 50),
           font = list(size = 30), scaling = list(label = list(drawThreshold = 30, maxVisible = 60))) %>%
  visLegend(position="left", zoom=FALSE) %>%
  visEdges(arrows = "to", color = list(highlight="#424242"))