The network analysis presented below aims to show the importance (connectivity) of the relationships between nodes (communities at sea) over time. The script uses the igraph package for the analysis, together with the tidygraph and ggraph packages, which provide a workflow similar to the tidyverse and `ggplot2`. The code starts by building a dummy dataset with random values; however, the functions and structure are ready to receive real data.

Packages

The code starts by loading the necessary packages. Note that ggraph requires ggplot2, which is loaded here as part of the tidyverse.

library(tidyverse)
library(igraph)
library(tidygraph)
library(ggraph)
library(kableExtra)  # for rendering nicely formatted tables

Data simulation

To test this script, a random dataset was generated with the variables indicated by Becca. The ports included in this test are COCA sites, and all gear types are included. For the years, a sampling period covering 2000-2019 was simulated. For the interactions, every combination of origin and landing port was considered; at the end of the simulation, however, a random sample of rows was drawn to simulate “the absence of relations”. Finally, the variables “num_trips” and “total_fisherdays” were filled with random integers drawn uniformly (from 1-50 and 1-40, respectively).

dummy.portori <- c('seabrook','point.pleasant','point.judith','chicoteague','cape.may')
dummy.portlnd <- dummy.portori
dummy.gear <- c('65minus','65plus','dredge','gillnet','midwater','pots.traps')
dummy.year <- c(2000:2019)

set.seed(5)
dummy.data <- expand_grid(dummy.year,dummy.gear,dummy.portori,dummy.portlnd) %>% 
                    rename(year=dummy.year, gear=dummy.gear, port_origin=dummy.portori, portlnd1=dummy.portlnd) %>%
                    add_column(num_trips = sample(1:50, nrow(.), replace = TRUE),
                               total_fisherdays = sample(1:40, nrow(.), replace = TRUE)) %>% 
                    # keep a random subset of 250 rows (out of 3,000) to simulate missing relations
                    .[sample(nrow(.), 250, replace = FALSE), ]
head(dummy.data) %>% 
  kbl() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

| year | gear | port_origin | portlnd1 | num_trips | total_fisherdays |
|---|---|---|---|---|---|
| 2014 | pots.traps | point.judith | cape.may | 6 | 18 |
| 2018 | 65plus | cape.may | seabrook | 44 | 34 |
| 2001 | midwater | seabrook | seabrook | 41 | 18 |
| 2001 | 65minus | seabrook | seabrook | 41 | 22 |
| 2006 | dredge | point.judith | chicoteague | 6 | 28 |
| 2016 | gillnet | cape.may | point.judith | 6 | 26 |
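
To get a feel for how sparse the simulated data are after the random subsetting, the grid size and the number of observations left per gear / origin / landing combination can be checked. The following is a minimal base-R sketch, not part of the original script: the names `full.grid`, `kept` and `obs.per.link` are hypothetical, and because the random numbers are drawn in a different sequence here, the rows kept will not match `dummy.data` exactly.

```r
# Base-R sketch with illustrative names (not part of the original script)
ports <- c('seabrook','point.pleasant','point.judith','chicoteague','cape.may')
gears <- c('65minus','65plus','dredge','gillnet','midwater','pots.traps')

# Full grid: 20 years x 6 gears x 5 origin ports x 5 landing ports
full.grid <- expand.grid(year = 2000:2019, gear = gears,
                         port_origin = ports, portlnd1 = ports,
                         stringsAsFactors = FALSE)
nrow(full.grid)  # 3000 rows before subsetting

# Keep 250 random rows, as in the simulation step
set.seed(5)
kept <- full.grid[sample(nrow(full.grid), 250), ]

# Observations left for each gear / origin / landing combination
obs.per.link <- table(paste(kept$gear, kept$port_origin, kept$portlnd1))

# Combinations with a single observation have no year-to-year difference
sum(obs.per.link == 1)

# Share of the 150 possible combinations with no observation at all
1 - length(obs.per.link) / (length(gears) * length(ports)^2)
```

With 250 rows spread over 150 possible combinations, many links are left with only one or two observations, which is worth keeping in mind when reading any per-link average.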

Calculating Average Percentage Change over different years

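Once the dataset was simulated, the Average Percentage Change (APC) over the years was calculated for each variable and each combination of gear, origin port and landing port. The resulting table summarizes the time series as the average change of each variable between consecutive observations during the analysis period. As written, the code averages the raw differences returned by diff(), so the values are in the original units rather than percentages, and combinations with a single observation yield NaN.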

apc.data <- dummy.data %>% group_by(gear, port_origin, portlnd1) %>% 
                        summarise(apc_num_trips = mean(c(NA, diff(num_trips)),na.rm = TRUE),
                                  apc_total_fisherdays = mean(c(NA, diff(total_fisherdays)),na.rm = TRUE)) %>% 
                        ungroup()

head(apc.data) %>% 
  kbl() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

| gear | port_origin | portlnd1 | apc_num_trips | apc_total_fisherdays |
|---|---|---|---|---|
| 65minus | cape.may | cape.may | -6.50 | 0.50 |
| 65minus | cape.may | chicoteague | -0.25 | 3.75 |
| 65minus | cape.may | point.judith | NaN | NaN |
| 65minus | cape.may | point.pleasant | 1.00 | 25.00 |
| 65minus | cape.may | seabrook | NaN | NaN |
| 65minus | chicoteague | cape.may | NaN | NaN |
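
The summary above relies on `mean(diff(x))`, which gives an average difference in the original units and depends on the row order within each group. If a true year-over-year percentage change is wanted, one possible variant is sketched below. This is an assumption about the intended calculation, not the author's method: the helper `mean_pct_change` and the toy data are hypothetical, rows are sorted by year first, and values are assumed to be non-zero (true for the simulated data, which start at 1).

```r
library(dplyr)

# Hypothetical helper: mean percentage change between consecutive values.
# Assumes x is already ordered by year and contains no zeros.
mean_pct_change <- function(x) {
  if (length(x) < 2) return(NA_real_)   # a single observation has no change
  mean(diff(x) / head(x, -1) * 100)
}

# Toy data (illustrative only), deliberately out of year order
toy <- tibble(
  year        = c(2002, 2000, 2001, 2000),
  gear        = c("dredge", "dredge", "dredge", "gillnet"),
  port_origin = "cape.may",
  portlnd1    = "seabrook",
  num_trips   = c(30, 10, 20, 5)
)

apc.toy <- toy %>%
  arrange(year) %>%                                  # order before differencing
  group_by(gear, port_origin, portlnd1) %>%
  summarise(apc_num_trips = mean_pct_change(num_trips), .groups = "drop")

apc.toy
# dredge: 10 -> 20 -> 30 gives (100% + 50%) / 2 = 75
# gillnet: one observation, so NA
```

Returning `NA` instead of `NaN` for single-observation links is a design choice here; either value flags that no change could be computed.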

Network Analysis

Data preparation

Igraph requires two data frames. The first contains the nodes (vertices), which are the actors; in this case, each combination of port and gear type. The second contains the edges (links), which are defined by the port_origin and portlnd1 variables.

Note that since the network is based on nodes defined by the communities at sea, the code concatenates ‘port’ + ‘gear’ so that each edge endpoint matches the ‘name’ of a node. After this step, ‘port’ and ‘gear’ are attributes of the nodes rather than of the edges, so ‘gear’ is dropped from the links data frame (I know it’s a bit cryptic, but we can discuss it together).

#Vectors to obtain every port and gear type

v.ports <- unique(c(apc.data$port_origin,apc.data$portlnd1))
v.gear <-  unique(apc.data$gear)

# Node data base with name = community at sea, port and gear type

nodes <- expand_grid(port=v.ports,gear=v.gear) %>% mutate(name = paste(port,gear,sep = "-")) %>% select(name,port,gear)

head(nodes) %>% 
  kbl() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

| name | port | gear |
|---|---|---|
| cape.may-65minus | cape.may | 65minus |
| cape.may-65plus | cape.may | 65plus |
| cape.may-dredge | cape.may | dredge |
| cape.may-gillnet | cape.may | gillnet |
| cape.may-midwater | cape.may | midwater |
| cape.may-pots.traps | cape.may | pots.traps |

# edge/links dataframe
edges <- apc.data %>% select(port_origin:apc_total_fisherdays,gear) %>% 
                      mutate(port_origin = paste(port_origin, gear ,sep = "-"),
                             portlnd1 = paste(portlnd1, gear, sep = "-")) %>% 
                      select(-gear)

head(edges) %>% 
  kbl() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

| port_origin | portlnd1 | apc_num_trips | apc_total_fisherdays |
|---|---|---|---|
| cape.may-65minus | cape.may-65minus | -6.50 | 0.50 |
| cape.may-65minus | chicoteague-65minus | -0.25 | 3.75 |
| cape.may-65minus | point.judith-65minus | NaN | NaN |
| cape.may-65minus | point.pleasant-65minus | 1.00 | 25.00 |
| cape.may-65minus | seabrook-65minus | NaN | NaN |
| chicoteague-65minus | cape.may-65minus | NaN | NaN |
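
Because the edges are matched to the nodes through the concatenated ‘port-gear’ name, it can help to confirm that every edge endpoint exists in the node table before building the graph. The following base-R sketch illustrates the check on toy data; `toy.nodes`, `toy.edges` and `missing.nodes` are hypothetical names, and the same `setdiff()` call could be applied to the real `edges` and `nodes` objects.

```r
# Toy node table: name = port + gear, as in the preparation step
toy.nodes <- data.frame(port = c("cape.may", "seabrook"),
                        gear = "dredge",
                        stringsAsFactors = FALSE)
toy.nodes$name <- paste(toy.nodes$port, toy.nodes$gear, sep = "-")

# Toy edge table; the last origin deliberately has no matching node
toy.edges <- data.frame(
  port_origin = c("cape.may-dredge", "seabrook-dredge", "cape.may-gillnet"),
  portlnd1    = c("seabrook-dredge", "cape.may-dredge", "seabrook-dredge"),
  stringsAsFactors = FALSE
)

# Endpoints used by the edges that are absent from the node names
endpoints     <- unique(c(toy.edges$port_origin, toy.edges$portlnd1))
missing.nodes <- setdiff(endpoints, toy.nodes$name)

missing.nodes  # "cape.may-gillnet"
```

An empty result means every link can be attached to a node; anything else points to a naming mismatch (for example, a different separator or a misspelled port).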

Network function

The data prepared in the previous section are passed to the tbl_graph function, which takes the nodes, the edges and whether the relations are directed. In our case, each vessel has a port of origin and a landing port, so the relationship is directed. Additionally, degree centrality is calculated to measure the relative importance of each node given its connectivity.

As a second step, the `ggraph` function generates a graphical representation of the network. Each gear type forms its own network (one facet per gear), in which the nodes are the ports and the edges run from the origin port to the landing port. The edge color is mapped to the average percentage change of ‘total_fisherdays’, while that of ‘num_trips’ sets the edge width. (This is fully modifiable.)

# Building the network-matrix
net <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE) %>%
                      mutate(degree = centrality_degree(normalized = TRUE)) # Degree centrality (out-degree by default)


# Plotting the network

ggraph(net, layout = 'star') + 
  geom_edge_link(aes(width = apc_num_trips, color = apc_total_fisherdays), alpha = 0.8) + 
  scale_edge_width(range = c(0.2, 2)) +
  scale_edge_color_viridis()+
  geom_node_point(aes(size= degree)) +
  geom_node_text(aes(label = port), repel = TRUE) +
  labs(edge_width = "num_trips") +
  facet_nodes(~gear)+
  coord_fixed()
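
To make the degree measure concrete, the sketch below computes it on a three-node toy graph using igraph directly. It is illustrative only: the graph and object names are hypothetical, and it assumes (as I understand the tidygraph default) that `centrality_degree()` counts outgoing edges unless another mode is requested.

```r
library(igraph)

# Toy directed graph: a -> b, a -> c, b -> c
toy.edges <- data.frame(from = c("a", "a", "b"),
                        to   = c("b", "c", "c"))
g <- graph_from_data_frame(toy.edges, directed = TRUE)

# Out-degree: number of edges leaving each node
out.deg <- degree(g, mode = "out")    # a = 2, b = 1, c = 0

# In-degree: number of edges arriving at each node
in.deg <- degree(g, mode = "in")      # a = 0, b = 1, c = 2

# Normalized out-degree: divided by (number of nodes - 1)
out.norm <- degree(g, mode = "out", normalized = TRUE)  # a = 1, b = 0.5, c = 0
```

In the port network, out-degree reflects how many landing ports a community sends trips to, while in-degree reflects how many origins land at it; which one is more relevant depends on the question being asked.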

Calculating centrality scores and other vertex attributes

Besides the graphical representation, the code concludes by preparing a score table for each node. The indices calculated are eigenvector centrality, hub and authority scores, closeness and betweenness; the table reports degree, closeness, betweenness and eigenvector centrality.

V(net)$eig <- eigen_centrality(net)$vector          # Eigenvector centrality
V(net)$hubs <- hub_score(net)$vector                # "Hub" centrality
V(net)$authorities <- authority_score(net)$vector   # "Authority" centrality
V(net)$closeness <- closeness(net)                  # Closeness centrality
V(net)$betweenness <- betweenness(net)              # Vertex betweenness centrality

network.stat <- data.frame(centrality  = V(net)$degree,
                           closeness   = V(net)$closeness,
                           betweenness = V(net)$betweenness,
                           eigenvector = V(net)$eig)

head(network.stat) %>% 
  kbl() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

|  | centrality | closeness | betweenness | eigenvector |
|---|---|---|---|---|
| cape.may-65minus | 5 | 0.0013263 | 2.8333333 | 0.0000000 |
| cape.may-65plus | 3 | 0.0013245 | 2.5000000 | 0.0000000 |
| cape.may-dredge | 3 | 0.0013245 | 0.3333333 | 0.0000000 |
| cape.may-gillnet | 4 | 0.0013245 | 0.0000000 | 0.8027136 |
| cape.may-midwater | 5 | 0.0013263 | 2.5000000 | 0.0000000 |
| cape.may-pots.traps | 3 | 0.0013228 | 1.5000000 | 0.0000000 |
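
As an aid to reading the betweenness column, the sketch below builds a small directed chain and collects the scores in a table with an explicit name column. The graph and the object `toy.scores` are hypothetical; the point is only that a node lying on the shortest path between two others receives a positive betweenness, while end points receive zero.

```r
library(igraph)

# Toy directed chain: a -> b -> c
g <- graph_from_data_frame(data.frame(from = c("a", "b"),
                                      to   = c("b", "c")),
                           directed = TRUE)

# Score table with the node name as a regular column
toy.scores <- data.frame(
  name        = V(g)$name,
  degree      = degree(g, mode = "all"),          # in + out edges
  betweenness = betweenness(g, directed = TRUE),  # shortest paths through the node
  row.names   = NULL
)

toy.scores
# b sits on the only a -> c path, so its betweenness is 1; a and c get 0
```

Keeping the node name as a column rather than as row names makes it easier to join the scores back to the node attributes (port and gear) later on.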