Strategic and Competitive Intelligence, Master Degree in Data Science and Business Informatics, Università di Pisa
Author
Irene Spada
This exercise introduces the fundamentals of network analysis using the tidygraph and ggraph packages in R. Building on the text mining concepts explored in the previous lecture—such as tokenization, n-grams, and word correlations—we now shift our focus toward network structures derived from relational data.
You will learn how to:
Transform an edge list into a graph object suitable for analysis;
Compute key metrics such as degree centrality and detect communities using the Louvain algorithm;
Visualize networks using intuitive and customizable layouts with ggraph.
Preliminary step
# Load necessary libraries
library(tidyverse) # for data manipulation
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidygraph) #for graph manipulation
Attaching package: 'tidygraph'
The following object is masked from 'package:stats':
filter
library(ggraph) #for graph visualization
Step 1: Convert a Data Frame to a Graph
We first define an edge list representing connections between individuals. Next, we use as_tbl_graph() to convert the edge list into a tbl_graph object, which is a tidy representation of a graph suitable for analysis with tidygraph.
# Create a sample edge list data frame
edges <- tibble::tibble(
  from = c("Alice", "Alice", "Bob", "Carol", "Dave", "Eve", "Frank", "Grace"),
  to = c("Bob", "Carol", "Dave", "Eve", "Frank", "Grace", "Alice", "Bob")
)

# Convert the edge list into a tbl_graph object
graph <- as_tbl_graph(edges, directed = FALSE)

# View the graph structure
graph
# A tbl_graph: 7 nodes and 8 edges
#
# An undirected simple graph with 1 component
#
# Node Data: 7 × 1 (active)
name
<chr>
1 Alice
2 Bob
3 Carol
4 Dave
5 Eve
6 Frank
7 Grace
#
# Edge Data: 8 × 2
from to
<int> <int>
1 1 2
2 1 3
3 2 4
# ℹ 5 more rows
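The printed Edge Data stores integer positions rather than names: each from/to value points to a row of the Node Data. The sketch below reproduces that mapping by hand with match(). The names nodes and edge_index are illustrative, and the code shows the idea only; it is not meant to describe how tidygraph implements the conversion internally.

```r
# Same edge list as in the exercise
edges <- tibble::tibble(
  from = c("Alice", "Alice", "Bob", "Carol", "Dave", "Eve", "Frank", "Grace"),
  to = c("Bob", "Carol", "Dave", "Eve", "Frank", "Grace", "Alice", "Bob")
)

# Node table: every distinct name, in order of first appearance
nodes <- tibble::tibble(name = unique(c(edges$from, edges$to)))

# Edge table: replace each name with its row position in the node table
edge_index <- tibble::tibble(
  from = match(edges$from, nodes$name),
  to = match(edges$to, nodes$name)
)

nodes
edge_index
```

The first three rows of edge_index are (1, 2), (1, 3) and (2, 4), which agrees with the Edge Data printed above.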
Step 2: Compute Centrality Measures and Detect Communities
We first set the context to node-level operations with activate(nodes). Next, centrality_degree() calculates the degree centrality of each node, that is, the number of connections it has. Finally, group_louvain() detects communities with the Louvain algorithm, which groups nodes so as to maximize the modularity score of the resulting partition. Its output is a vector of integer group IDs, one per node, which we store in the community column.
# Compute centrality and communities
graph <- graph %>%
  activate(nodes) %>% # Focus on node data
  mutate(
    degree = centrality_degree(), # Degree centrality
    community = group_louvain() # Community detection
  )

# View the updated node data
graph %>% as_tibble()
# A tibble: 7 × 3
name degree community
<chr> <dbl> <int>
1 Alice 3 1
2 Bob 3 2
3 Carol 2 1
4 Dave 2 3
5 Eve 2 1
6 Frank 2 3
7 Grace 2 2
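To make the two columns more concrete, the following base R sketch recomputes the degree of each node directly from the edge list and then evaluates the modularity of the partition shown in the table. It assumes the standard definition for an unweighted, undirected graph, Q = sum over communities of (L_c / m - (d_c / (2m))^2), where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The membership vector is copied from the output above; another Louvain run may return different labels, so treat it as an assumption.

```r
# Same edge list as in the exercise (plain data frame, no packages needed)
edges <- data.frame(
  from = c("Alice", "Alice", "Bob", "Carol", "Dave", "Eve", "Frank", "Grace"),
  to = c("Bob", "Carol", "Dave", "Eve", "Frank", "Grace", "Alice", "Bob"),
  stringsAsFactors = FALSE
)

# Community labels copied from the printed node table (one particular run)
membership <- c(Alice = 1, Bob = 2, Carol = 1, Dave = 3, Eve = 1, Frank = 3, Grace = 2)

# Degree: how many times each node appears as an endpoint of an edge
endpoints <- c(edges$from, edges$to)
degree <- vapply(names(membership), function(n) sum(endpoints == n), numeric(1))

# Modularity of the partition
m <- nrow(edges)
is_internal <- membership[edges$from] == membership[edges$to] # edge inside one community?
community_degree <- tapply(degree, membership, sum) # total degree per community
modularity_q <- sum(is_internal) / m - sum((community_degree / (2 * m))^2)

degree
modularity_q # 0.1484375
```

The hand-computed degrees match the degree column, and modularity_q gives a single number that can be used to compare this partition with alternative groupings of the same nodes.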
Step 3: Visualize the Graph with ggraph
First, we initialize the plot with ggraph(), specifying the layout. Next, we add the remaining data and visual elements of the graph:
geom_edge_link() adds edges between nodes.
geom_node_point() plots nodes, sizing them by their degree centrality and coloring them by community membership.
geom_node_text() adds labels to the nodes.
theme_void() removes background annotations for a cleaner look.
labs() adds a title to the plot.
# Visualize the graph
ggraph(graph, layout = "fr") + # Use Fruchterman-Reingold layout
  geom_edge_link(alpha = 0.8) + # Draw edges with some transparency
  geom_node_point(aes(size = degree, color = as.factor(community))) + # Nodes sized by degree and colored by community
  geom_node_text(aes(label = name), repel = TRUE, size = 3) + # Add node labels
  theme_void() + # Clean theme without axes
  labs(title = "Network Graph with Centrality and Community Detection")
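The Fruchterman-Reingold layout starts from random node positions, so the same code can draw the network differently on each run. One possible way to get a stable picture, sketched here under the assumption that the layout draws on R's random number generator, is to call set.seed() and compute the layout once with create_layout(). The seed value 123 and the names layout_a, layout_b and p are arbitrary choices for illustration.

```r
library(tidygraph)
library(ggraph)

# Same edge list and graph as in the exercise
edges <- tibble::tibble(
  from = c("Alice", "Alice", "Bob", "Carol", "Dave", "Eve", "Frank", "Grace"),
  to = c("Bob", "Carol", "Dave", "Eve", "Frank", "Grace", "Alice", "Bob")
)
graph <- as_tbl_graph(edges, directed = FALSE)

# Compute the layout twice with the same seed
set.seed(123)
layout_a <- create_layout(graph, layout = "fr")
set.seed(123)
layout_b <- create_layout(graph, layout = "fr")

# Check whether both runs placed the nodes at the same coordinates
same_positions <- isTRUE(all.equal(layout_a$x, layout_b$x)) &&
  isTRUE(all.equal(layout_a$y, layout_b$y))
same_positions

# Plot from the precomputed layout so the figure does not change between renders
p <- ggraph(layout_a) +
  geom_edge_link(alpha = 0.8) +
  geom_node_point() +
  geom_node_text(aes(label = name), repel = TRUE, size = 3) +
  theme_void()
p
```

Depending on the installed igraph version, group_louvain() may also involve randomness, so setting the seed before Step 2 as well may help keep the community labels stable.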