1.0 Import the library & packages.

First, we install (if necessary) and load the required packages.

packages <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse')

# Install any package that cannot be loaded yet, then attach it.
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

If the packages may already be installed, the following code chunk checks whether each one can be loaded. require() returns TRUE for every package that is available.

p <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse')
lapply(p, require, character.only = TRUE)
## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] TRUE
## 
## [[5]]
## [1] TRUE
## 
## [[6]]
## [1] TRUE
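
As a complementary check, one could list which packages are missing without attaching any of them. The following is a minimal sketch using base R only; the names `is_installed` and `missing_pkgs` are illustrative and are not part of the original exercise.

```r
packages <- c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'tidyverse')

# installed.packages() returns a matrix whose row names are the installed package names
is_installed <- packages %in% rownames(installed.packages())
names(is_installed) <- packages

# Packages that would still need install.packages()
missing_pkgs <- packages[!is_installed]
missing_pkgs
```

An empty `missing_pkgs` vector means nothing needs to be installed.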

Since all of them are already installed, the same loop as before simply attaches them.

for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

2.0 Import the datasets

2.1 Data Wrangling

Next, we import the node and edge datasets.

GAStech_nodes <- read_csv("data/GAStech_email_node.csv")
GAStech_edges <- read_csv("data/GAStech_email_edge-v2.csv")

Next, we examine the structure of the edge dataset using glimpse() from dplyr.

glimpse(GAStech_edges)
## Observations: 9,063
## Variables: 8
## $ source      <dbl> 43, 43, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 26...
## $ target      <dbl> 41, 40, 51, 52, 53, 45, 44, 46, 48, 49, 47, 54, 27...
## $ SentDate    <chr> "6/1/2014", "6/1/2014", "6/1/2014", "6/1/2014", "6...
## $ SentTime    <time> 08:39:00, 08:39:00, 08:58:00, 08:58:00, 08:58:00,...
## $ Subject     <chr> "GT-SeismicProcessorPro Bug Report", "GT-SeismicPr...
## $ MainSubject <chr> "Work related", "Work related", "Work related", "W...
## $ sourceLabel <chr> "Sven.Flecha", "Sven.Flecha", "Kanon.Herrero", "Ka...
## $ targetLabel <chr> "Isak.Baza", "Lucas.Alcazar", "Felix.Resumir", "Hi...

SentDate was read in as a character string, so we convert it to a date and derive the day of the week from it.

GAStech_edges$SentDate  = dmy(GAStech_edges$SentDate)
GAStech_edges$Weekday = wday(GAStech_edges$SentDate, label = TRUE, abbr = FALSE)
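
The choice of parser matters because a string such as "6/1/2014" is ambiguous. The sketch below, which assumes only that lubridate is installed, parses the same string both ways; `raw_date`, `as_dmy` and `as_mdy` are illustrative names.

```r
library(lubridate)

raw_date <- "6/1/2014"

# Day-month-year reading, as used above: 6 January 2014
as_dmy <- dmy(raw_date)
# Month-day-year reading of the same string: 1 June 2014
as_mdy <- mdy(raw_date)

format(as_dmy)   # "2014-01-06"
format(as_mdy)   # "2014-06-01"

# Numeric weekday with lubridate's default week start (1 = Sunday), so a Monday is 2
wday(as_dmy)
```

With label = TRUE and abbr = FALSE, as in the chunk above, wday() returns the full weekday name as an ordered factor instead of a number.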

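Next, we count the work-related emails for each source, target and weekday combination, then drop emails that a person sent to themselves and combinations with only one email.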

GAStech_edges_aggregated <- GAStech_edges %>%
  filter(MainSubject == "Work related") %>%
  group_by(source, target, Weekday) %>%
    summarise(Weight = n()) %>%
  filter(source!=target) %>%
  filter(Weight > 1) %>%
  ungroup()
GAStech_edges_aggregated
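
To see what each step of this pipeline does, here is a small sketch on made-up data (the `toy_edges` table is hypothetical and much smaller than the real edge list). It assumes dplyr 1.0 or later for the `.groups` argument.

```r
library(dplyr)

# Six made-up emails between three people
toy_edges <- tibble(
  source  = c(1, 1, 1, 2, 2, 3),
  target  = c(2, 2, 2, 1, 2, 1),
  Weekday = c("Monday", "Monday", "Tuesday", "Monday", "Monday", "Monday")
)

toy_aggregated <- toy_edges %>%
  group_by(source, target, Weekday) %>%
  summarise(Weight = n(), .groups = "drop") %>%  # one row per combination, with its email count
  filter(source != target) %>%                   # drops the 2 -> 2 self-email
  filter(Weight > 1)                             # drops combinations seen only once

toy_aggregated
```

Only the 1 -> 2 Monday combination survives, with a Weight of 2, because every other combination occurs once or is a self-email.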

2.2 Create network objects using tidygraph

We now create the network object with tbl_graph() and then sort its edges by weight.

GAStech_graph <- tbl_graph(nodes = GAStech_nodes, edges = GAStech_edges_aggregated, directed = TRUE)
GAStech_graph
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Node Data: 54 x 4 (active)
##      id label              Department    Title                             
##   <dbl> <chr>              <chr>         <chr>                             
## 1     1 Mat.Bramar         Administrati~ Assistant to CEO                  
## 2     2 Anda.Ribera        Administrati~ Assistant to CFO                  
## 3     3 Rachel.Pantanal    Administrati~ Assistant to CIO                  
## 4     4 Linda.Lagos        Administrati~ Assistant to COO                  
## 5     5 Ruscella.Mies.Hab~ Administrati~ Assistant to Engineering Group Ma~
## 6     6 Carla.Forluniau    Administrati~ Assistant to IT Group Manager     
## # ... with 48 more rows
## #
## # Edge Data: 1,456 x 4
##    from    to Weekday   Weight
##   <int> <int> <ord>      <int>
## 1     1     2 Monday         4
## 2     1     2 Tuesday        3
## 3     1     2 Wednesday      5
## # ... with 1,453 more rows
GAStech_graph %>%
  activate(edges) %>%
  arrange(desc(Weight))
## # A tbl_graph: 54 nodes and 1456 edges
## #
## # A directed multigraph with 1 component
## #
## # Edge Data: 1,456 x 4 (active)
##    from    to Weekday Weight
##   <int> <int> <ord>    <int>
## 1    40    41 Tuesday     23
## 2    40    43 Tuesday     19
## 3    41    43 Tuesday     15
## 4    41    40 Tuesday     14
## 5    42    41 Tuesday     13
## 6    42    40 Tuesday     12
## # ... with 1,450 more rows
## #
## # Node Data: 54 x 4
##      id label           Department     Title           
##   <dbl> <chr>           <chr>          <chr>           
## 1     1 Mat.Bramar      Administration Assistant to CEO
## 2     2 Anda.Ribera     Administration Assistant to CFO
## 3     3 Rachel.Pantanal Administration Assistant to CIO
## # ... with 51 more rows
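
The same pattern can be tried on a tiny graph. This is a minimal sketch with made-up nodes and edges (`toy_nodes`, `toy_edges` and `toy_graph` are hypothetical names), assuming tidygraph and dplyr are installed.

```r
library(tidygraph)
library(dplyr)

toy_nodes <- data.frame(id = 1:3, label = c("A", "B", "C"))
toy_edges <- data.frame(from = c(1, 2, 3), to = c(2, 3, 1), Weight = c(2, 7, 4))

toy_graph <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = TRUE)

# activate(edges) makes the dplyr verbs that follow operate on the edge table
sorted_edges <- toy_graph %>%
  activate(edges) %>%
  arrange(desc(Weight)) %>%
  as_tibble()

sorted_edges
```

Calling as_tibble() at the end extracts the active (edge) table, which makes it easy to inspect the sorted weights on their own.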

With the data prepared, we can start Take-home Exercise 2.

3.0 Task 1: Static Organisation Graph

With reference to the organisation network graph in Section 6.1 of Hands-on Exercise 10, you are required to complete the following tasks:

3.1 Improve the code chunk used to create the organisation network graph by using the latest functions provided in ggraph 2.0.

Following Section 6.1 of Hands-on Exercise 10, the code chunk below recreates the organisation network graph using a function introduced in ggraph 2.0.

newgraph <- GAStech_graph

qgraph(
  newgraph, 
  node_colour = centrality_closeness(), 
  node_size = centrality_betweenness()
)

As shown above, the code is much shorter with qgraph(), a new addition in ggraph 2.0. It automatically creates an appropriate network and sends it to the plotting method.
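
For comparison, a layer-by-layer version of a similar plot might look like the sketch below. This is not the code from Section 6.1; it is an illustration on a made-up four-node graph (`toy_graph` is a hypothetical name) that assumes tidygraph and ggraph 2.0 or later are installed.

```r
library(tidygraph)
library(ggraph)

# A small, strongly connected directed graph so that both centralities are defined
toy_graph <- tbl_graph(
  nodes = data.frame(label = c("A", "B", "C", "D")),
  edges = data.frame(from = c(1, 2, 3, 4, 1), to = c(2, 3, 4, 1, 3)),
  directed = TRUE
)

# Each visual element is added as its own layer
p <- ggraph(toy_graph, layout = "nicely") +
  geom_edge_link(colour = "grey70") +
  geom_node_point(aes(colour = centrality_closeness(),
                      size = centrality_betweenness()))

p
```

The explicit form is longer, but it gives control over each layer, which the alternative designs later in this exercise rely on.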

3.2 Identify three aspects of the graph visualisation in Section 6.1 that can be improved.

Three aspects of this chart can be improved:

  1. In terms of aesthetics, the edges are too dark, which is visually unappealing. To improve this, the edge colour can be lightened to something like light grey.

  2. In terms of clarity, the nodes do not indicate which department they belong to. To improve this, the outline colour of each node can be used to indicate its department.

  3. In terms of clarity, the chart has no title. To improve this, a title will be added.

3.3 Provide the sketch of your alternative design

3.4 Using appropriate ggraph functions, plot the alternative design.

The following two alternative designs, based on the sketches, improve on the original chart shown above.

Chart 1

ggraph(GAStech_graph, layout = 'nicely') +
  ggtitle("Betweenness and closeness centrality of each node, by department") +
  theme(plot.title = element_text(size = 12),
        legend.title = element_text(size = 10)) +
  geom_edge_link(colour = "darkgray") +
  # log1p() keeps the radius finite for nodes whose betweenness is 0
  geom_node_circle(aes(fill = centrality_closeness(),
                       r = log1p(centrality_betweenness()) / 50,
                       colour = Department), size = 1)

Chart 2

cols_f <- colorRampPalette(RColorBrewer::brewer.pal(11, 'Spectral'))

# set_graph_style() changes the global theme defaults rather than adding a layer,
# so it is called before the plot is built
set_graph_style(size = 10)

ggraph(GAStech_graph, layout = 'nicely') +
  ggtitle("Betweenness and closeness centrality of each node, faceted by department") +
  theme(plot.title = element_text(size = 8),
        legend.title = element_text(size = 6)) +
  geom_edge_link(aes(width = Weight), alpha = 0.8, colour = "darkgray") +
  scale_edge_width(range = c(0.1, 5)) +
  geom_node_point(aes(size = centrality_betweenness(), colour = centrality_closeness())) +
  facet_nodes(~Department) +
  th_foreground(foreground = "#32CD32", border = TRUE)

Both charts improve on the original. The edges are no longer black; they are drawn in grey so that they contrast with the nodes without dominating the chart. Each chart now has a title that tells the user what it shows. The departments are also distinguishable in both designs: the first chart encodes the department in the colour of each node's outline, while the second chart places each department in its own facet.

4.0 Task 2: Interactive Organisation Graph

4.1 Data Preparation and Assign Groups to Data

GAStech_edges_aggregated <- GAStech_edges %>%
  left_join(GAStech_nodes, by = c("sourceLabel" = "label")) %>%
  rename(from = id) %>%
  left_join(GAStech_nodes, by = c("targetLabel" = "label")) %>%
  rename(to = id) %>%
  filter(MainSubject == "Work related") %>%
  group_by(from, to) %>%
    summarise(weight = n()) %>%
  filter(from!=to) %>%
  filter(weight > 1) %>%
  ungroup()

GAStech_nodes <- GAStech_nodes %>%
  rename(group = Department)
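
The two joins above translate the sender and recipient labels into node ids. The sketch below shows the idea on made-up data (`toy_nodes` and `toy_emails` are hypothetical), assuming dplyr 1.0 or later.

```r
library(dplyr)

toy_nodes <- tibble(id = 1:3, label = c("Ann.A", "Bob.B", "Cy.C"))
toy_emails <- tibble(
  sourceLabel = c("Ann.A", "Ann.A", "Bob.B"),
  targetLabel = c("Bob.B", "Bob.B", "Cy.C")
)

toy_edges <- toy_emails %>%
  left_join(toy_nodes, by = c("sourceLabel" = "label")) %>%
  rename(from = id) %>%   # id of the sender
  left_join(toy_nodes, by = c("targetLabel" = "label")) %>%
  rename(to = id) %>%     # id of the recipient
  group_by(from, to) %>%
  summarise(weight = n(), .groups = "drop")

toy_edges
```

The result has one row per sender and recipient pair, with `from` and `to` columns in the form visNetwork expects for its edge table.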

With reference to the organisation network graph in Section 7.4 of Hands-on Exercise 10, you are required to complete the following tasks:

4.2 Improve the design of the graph by incorporating the following interactivity

  1. When a name is selected from the drop-down list, the corresponding node is both highlighted and labelled. All nodes linked to the selected node are labelled as well.
  2. When a node of the interactive graph is selected, the node is both highlighted and labelled. All nodes linked to the selected node are labelled as well.

GAStech_nodes <- GAStech_nodes %>%
  mutate(font.size = 48)

visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = list(enabled= TRUE, labelOnly = FALSE), nodesIdSelection = TRUE)

4.3 Identify three aspects of the graph visualisation in Section 7.4 that can be improved.

Three aspects of this chart can be improved:

  1. In terms of clarity, the chart does not show which group or department each node belongs to. To improve this, a drop-down selection will be added so that the user can see which nodes belong to which department.

  2. In terms of aesthetics, the drop-down box is too narrow to show its values in full. To improve this, the drop-down selection box will be widened so that all values are fully visible.

  3. In terms of aesthetics, the labels overlap one another. To improve this, a new layout (the sphere layout) will be adopted to reduce the overlap between node labels. Where labels still overlap, the user can manually drag the nodes to an empty space.

4.4 Provide the sketch of your alternative design.

4.5 Using appropriate visNetwork functions, plot the alternative design.

GAStech_nodes <- GAStech_nodes %>%
  mutate(font.size = 48)

p <- visNetwork(GAStech_nodes, GAStech_edges_aggregated) %>%
  visIgraphLayout(layout = "layout_on_sphere") %>%
  visOptions(selectedBy = list(variable = "group",
                               style = 'width: 160px; height: 26px;'),
             highlightNearest = list(enabled= TRUE, 
                                     labelOnly = FALSE),
             nodesIdSelection = list(enabled = TRUE,
                                     style = 'width: 175px; height: 26px;')) 

visInteraction(p,dragNodes = TRUE)

By applying the improvements identified above and following the sketch, the alternative design is easier to read because the sphere layout reduces the overlap between node labels. The user can also select a department from a drop-down menu, and the employee drop-down menu ("Select by id") is now wide enough to show its values in full.
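
The interactive options can be tried in isolation on a small network. The following is a minimal sketch with made-up nodes and edges (`toy_nodes`, `toy_edges` and `toy_widget` are hypothetical names), assuming only that visNetwork is installed. It leaves out visIgraphLayout() so that igraph is not required.

```r
library(visNetwork)

# visNetwork expects an id column for nodes and from/to columns for edges
toy_nodes <- data.frame(
  id = 1:4,
  label = c("Ann", "Bob", "Cy", "Dee"),
  group = c("Admin", "Admin", "IT", "IT")
)
toy_edges <- data.frame(from = c(1, 1, 2, 3), to = c(2, 3, 4, 4))

toy_widget <- visNetwork(toy_nodes, toy_edges) %>%
  visOptions(
    # drop-down that selects nodes by their group column
    selectedBy = list(variable = "group",
                      style = 'width: 160px; height: 26px;'),
    highlightNearest = list(enabled = TRUE, labelOnly = FALSE),
    # drop-down that selects a single node by id
    nodesIdSelection = list(enabled = TRUE,
                            style = 'width: 175px; height: 26px;')
  ) %>%
  visInteraction(dragNodes = TRUE)

toy_widget
```

Printing `toy_widget` in an interactive session or a knitted document renders the network with both drop-down menus.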