library(igraph)
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ lubridate::%--%() masks igraph::%--%()
## ✖ dplyr::as_data_frame() masks tibble::as_data_frame(), igraph::as_data_frame()
## ✖ purrr::compose() masks igraph::compose()
## ✖ tidyr::crossing() masks igraph::crossing()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::simplify() masks igraph::simplify()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(igraphdata)
Networks are powerful representations of relationships, allowing us to explore social, organizational, and technological systems in rich detail. In this project, I analyze Zachary’s Karate Club dataset, a classic network in social science. The dataset records friendships among the 34 members of a university karate club, observed by Wayne Zachary from 1970 to 1972. Eventually, a dispute caused the club to split into two factions. This project examines how network structure reflects that real-world division by calculating and interpreting centrality metrics and community structure. I use R and the igraph package to explore how leadership, influence, and group membership can be revealed through connections alone. This work builds on my data visualization and analysis skills and offers a compelling artifact to share with future employers.
data("karate")
g <- karate
Nodes with higher degree centrality have more direct ties, which marks them as more socially active and well-connected. In this dataset, Node 1 and Node 34 stand out with the highest degrees, reflecting their roles as the instructor and the club president, respectively. Their centrality suggests significant influence over the group’s dynamics and communication.
cat("Number of nodes:", vcount(g), "\n")
## This graph was created by an old(er) igraph version.
## ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
## For now we convert it on the fly...
## Number of nodes: 34
cat("Number of edges:", ecount(g), "\n")
## Number of edges: 78
# Node labels represent individual members, node size is uniform, and color is set to light blue.
# The layout helps reveal natural groupings and overall structure in the network.
plot(g,
vertex.label = V(g)$name,
vertex.size = 6,
vertex.color = "skyblue",
edge.arrow.size = 0.5,
layout = layout_with_fr,
main = "Zachary's Karate Club Network")
# This code calculates the degree centrality for each node in the network,
# prints a summary of the degree values, and then visualizes them using a bar-like plot.
# Nodes with higher degree have more direct connections, indicating higher social activity or influence.
deg <- degree(g)
print(summary(deg))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 3.000 4.588 5.000 17.000
plot(deg,
type = "h",
main = "Degree Centrality",
xlab = "Node",
ylab = "Degree")
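The degree plot above shows that a few nodes dominate, but it does not name them. The sketch below ranks nodes by degree to make the leaders explicit. It is a minimal illustration that assumes the unweighted Zachary graph shipped with igraph (`make_graph("Zachary")`) has the same ties as `g`; the names `g_demo`, `deg_demo`, and `top_deg` are introduced here only for the example.

```r
library(igraph)

# Assumption: igraph's built-in Zachary graph stands in for `g` (same ties, no names or weights).
g_demo <- make_graph("Zachary")

# Degree = number of direct ties per node.
deg_demo <- degree(g_demo)

# Label each value with its node index, then keep the five best-connected nodes.
top_deg <- sort(setNames(deg_demo, seq_along(deg_demo)), decreasing = TRUE)[1:5]
print(top_deg)
```

In this graph, Node 34 ranks first with 17 ties and Node 1 second with 16, which matches the maximum reported in the summary.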
# This code computes betweenness centrality for each node, summarizes the values,
# and displays the distribution in a histogram. Nodes with high betweenness act as
# bridges or brokers in the network, playing key roles in connecting different parts of the graph.
btwn <- betweenness(g)
print(summary(btwn))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.000 2.167 26.194 15.950 250.150
hist(btwn,
main = "Betweenness Centrality",
xlab = "Betweenness",
col = "lightgreen")
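One detail worth checking when reading the betweenness values: the `karate` graph from igraphdata carries a `weight` edge attribute (the number of activities two members shared), and `betweenness()` uses it by default, treating each weight as a distance. Under that reading, stronger ties count as longer paths, which may not be the intended interpretation. The sketch below compares the default result with a hop-count version. The names `g_demo`, `btwn_weighted`, and `btwn_unweighted` are introduced only for this example.

```r
library(igraph)
library(igraphdata)

data("karate")
# Convert the stored graph to the current igraph format.
g_demo <- upgrade_graph(karate)

# Default call: the `weight` edge attribute is picked up automatically
# and interpreted as a distance.
btwn_weighted <- betweenness(g_demo)

# weights = NA tells igraph to ignore the weights and count hops only.
btwn_unweighted <- betweenness(g_demo, weights = NA)

# Compare the five highest-ranked nodes under each definition.
print(head(sort(btwn_weighted, decreasing = TRUE), 5))
print(head(sort(btwn_unweighted, decreasing = TRUE), 5))
```

If both rankings share the same top nodes, the brokerage interpretation does not depend on how the weights are read.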
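# This code computes closeness centrality for each node and summarizes the values.
# Nodes with high closeness sit at a short average distance from every other node.
cl <- closeness(g)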
print(summary(cl))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.003289 0.004751 0.005348 0.005451 0.006051 0.007692
plot(cl,
type = "h",
main = "Closeness Centrality",
xlab = "Node",
ylab = "Closeness")
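The raw closeness values are tiny because igraph reports the reciprocal of the summed distances to all other nodes. Passing `normalized = TRUE` rescales each value by the number of other nodes, which gives a score that is easier to read and to compare across networks. The sketch below is an illustration on the unweighted Zachary graph built into igraph, so its numbers differ from the weighted summary above; `g_demo`, `cl_raw`, and `cl_norm` are names introduced only for this example.

```r
library(igraph)

# Assumption: igraph's built-in unweighted Zachary graph stands in for `g`.
g_demo <- make_graph("Zachary")
n <- vcount(g_demo)

# Raw closeness: 1 / (sum of shortest-path distances to all other nodes).
cl_raw <- closeness(g_demo)

# Normalized closeness: 1 / (mean shortest-path distance to all other nodes).
cl_norm <- closeness(g_demo, normalized = TRUE)

# In a connected graph the two differ only by the factor (n - 1).
stopifnot(isTRUE(all.equal(cl_norm, cl_raw * (n - 1))))

print(summary(cl_norm))
cat("Node with highest closeness:", which.max(cl_norm), "\n")
```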
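# Average of the local clustering coefficients: how often a node's neighbors are also tied to each other.
avg_clustering <- transitivity(g, type = "average")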
cat("Average clustering coefficient:", avg_clustering, "\n")
## Average clustering coefficient: 0.5879306
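"Clustering coefficient" can refer to more than one quantity, so it helps to see how the choice changes the number. The sketch below contrasts three variants that `transitivity()` supports, using the unweighted Zachary graph built into igraph as a stand-in for `g`. The variable names are introduced only for this example.

```r
library(igraph)

# Assumption: igraph's built-in Zachary graph has the same ties as `g`.
g_demo <- make_graph("Zachary")

# Mean of the local coefficients, skipping nodes with fewer than two neighbors.
avg_local <- transitivity(g_demo, type = "average")

# Same mean, but nodes with fewer than two neighbors count as zero.
avg_local_zero <- mean(transitivity(g_demo, type = "local", isolates = "zero"))

# Global transitivity: share of connected triples that close into triangles.
global_cc <- transitivity(g_demo, type = "global")

cat("Average local (undefined nodes skipped):", avg_local, "\n")
cat("Average local (undefined nodes as zero):", avg_local_zero, "\n")
cat("Global transitivity:", global_cc, "\n")
```

The global value is much lower than the average local one because the two hubs contribute many open triples, while low-degree nodes tend to sit in tightly closed neighborhoods.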
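# Detect communities by greedy modularity optimization, then plot the network
# with one color per detected community.
fg <- cluster_fast_greedy(g)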
plot(fg, g,
layout = layout_with_fr,
vertex.size = 6,
main = "Community Detection (Fast Greedy)")
print(sizes(fg))
## Community sizes
## 1 2 3
## 18 11 5
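The algorithm returns three communities, while the club split into two factions, so the match is worth checking directly. The `karate` graph in igraphdata stores the recorded faction of each member in the `Faction` vertex attribute. The sketch below cross-tabulates detected communities against that attribute, then cuts the fast greedy dendrogram at two groups for a like-for-like comparison. It is one possible check rather than the only one; `g_demo`, `fg_demo`, `detected`, `faction`, `two_groups`, and `nmi` are names introduced only for this example.

```r
library(igraph)
library(igraphdata)

data("karate")
# Convert the stored graph to the current igraph format.
g_demo <- upgrade_graph(karate)

fg_demo <- cluster_fast_greedy(g_demo)
detected <- as.integer(membership(fg_demo))
faction <- as.integer(V(g_demo)$Faction)

# Rows: detected communities; columns: recorded factions.
print(table(detected = detected, faction = faction))

# Cut the merge dendrogram at two groups to mirror the two-faction split.
two_groups <- as.integer(cut_at(fg_demo, no = 2))
print(table(two_groups = two_groups, faction = faction))

# Normalized mutual information: 1 means identical partitions, 0 means unrelated.
nmi <- compare(two_groups, faction, method = "nmi")
cat("NMI between two-group cut and recorded factions:", nmi, "\n")
```

A table with most members concentrated in one cell per row would support the claim that the detected structure follows the historical division.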
Through this network analysis of Zachary’s Karate Club, we’ve seen how simple measures like degree, betweenness, and closeness can reveal key players and communication pathways in a social system. More importantly, the communities found by the fast greedy algorithm broadly mirror the real-world split that occurred in the club, even though the algorithm returns three groups rather than two factions. This shows how structural data can reflect and even anticipate human behavior. This project not only deepened my skills in network analysis and R visualization, but also reinforced the importance of combining quantitative analysis with thoughtful interpretation. It serves as a foundation for future work in social network analysis, organizational research, and any domain where relationships matter.
cat("\n--- Final Summary ---\n")
##
## --- Final Summary ---
cat("• The network contains", vcount(g), "nodes and", ecount(g), "edges.\n")
## • The network contains 34 nodes and 78 edges.
cat("• Nodes with highest degree are likely leaders (Node 1 and 34).\n")
## • Nodes with highest degree are likely leaders (Node 1 and 34).
cat("• High betweenness nodes act as information bridges.\n")
## • High betweenness nodes act as information bridges.
cat("• The community detection algorithm identified", length(fg), "subgroups.\n")
## • The community detection algorithm identified 3 subgroups.
cat("• The split aligns with the historical club division.\n")
## • The split aligns with the historical club division.
cat("• This analysis demonstrates how structural measures can reflect social outcomes.\n")
## • This analysis demonstrates how structural measures can reflect social outcomes.