2026-5-4

Groups!

##           group 1         group 2         group 3        group 4
## 1 Batson, Anthony    Mendoza, Ava   Pacheco, Alex           <NA>
## 2    Myoung, Sein Randall, Javion Bell, Mary Rose    Ong, Alyssa
## 3     Qin, Celine Kang, Christine     Smith, Reid    Pham, Canon
## 4                   Leahy, Olivia  Knowles, Genny Devir, Lindsey
##             group 5
## 1 Wolfenstein, Luci
## 2 Mahoney, Brigette
## 3      Barga, Jolie
## 4     Moore, Allana

Warm-up

  • Create a visualization with the provided network data
  • How did you represent the data?
  • What are the important components?
  • What message do you seek to communicate?
  • What challenges/questions came up?

Plotting Networks

  • On Wednesday, we will plot networks!

Today’s Class

  • Warm-up: network data
  • What is network data?
  • Network metrics
  • Activity: Calculating networks by hand
  • Mid-quarter evaluation

Wednesday’s Class

  • Introduction to igraph
  • Network centrality with authorship data
  • Network data and tidycensus

Office Hours

  • Fridays, 1:30-3:30pm (Tyler)
  • Tuesdays, 10:30am-12:00pm (Yao)

Miscellaneous

  • Final Project: Collaboration encouraged, track individual contributions (e.g. through GitHub)
  • Final Project Rubric: Will be available this week
  • May 13th: Guest speaker from Recidiviz

Event of Interest

Learning Goals

  • Motivate examination of network data
  • Understand network structure (node vs. edge, directed vs. undirected)
  • Understand different types of centrality and what these mean
  • (Wednesday) Learn how to plot networks using igraph

What is Network Data?

Why Networks?

  • Up to this point, our units of analysis have mostly been people and places
  • What if we want to study connections between people or places?

Why Networks?

  • Activity Spaces
  • Friendship ties
  • Funding connections
  • Migration flows

Basics of Network Data

  • Two elements:
  1. Nodes
  2. Edges

Network example. From Nykamp DQ, An introduction to networks.

What Are Networks?

  • Two types of edges:
  1. Directed
  • e.g. cash payments, job applications, academic citations, etc.
  2. Undirected

Directed Network. From Nykamp DQ, An introduction to networks.

What Are Networks?

  • Two types of edges:
  1. Directed
  2. Undirected
  • e.g. friendship ties, common board members, shared destinations, etc.

Undirected Network. From Nykamp DQ, An introduction to networks.
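
One way to see the directed/undirected distinction is through the adjacency matrix. The sketch below uses a hypothetical four-node network (the node names and ties are made up for illustration) and base R only. A directed network generally gives an asymmetric matrix, while an undirected one is always symmetric.

```r
# Hypothetical four-node example; names and ties are illustrative only.
nodes <- c("A", "B", "C", "D")

# Directed ties: each row is sender -> receiver (e.g. a citation).
edges <- data.frame(from = c("A", "A", "B", "C"),
                    to   = c("B", "C", "C", "D"),
                    stringsAsFactors = FALSE)

# Directed adjacency matrix: entry [i, j] is 1 if i sends a tie to j.
adj_directed <- matrix(0, nrow = 4, ncol = 4,
                       dimnames = list(nodes, nodes))
adj_directed[cbind(edges$from, edges$to)] <- 1

# Undirected version: a tie in either direction counts,
# so the matrix equals its own transpose.
adj_undirected <- pmax(adj_directed, t(adj_directed))

isSymmetric(adj_directed)    # FALSE
isSymmetric(adj_undirected)  # TRUE
```

Each undirected tie appears twice in the symmetric matrix (once per direction), which is why the row sums of `adj_undirected` give each node's number of ties.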

Friendship Networks?

From McMillan, 2019: Friendship network at Sunshine High School by immigrant generation status. Circles represent students, and curved lines represent friendships. For the purpose of this illustration, both reciprocated and nonreciprocated friendships have been graphed.

Funding Flows

What Are Networks?

  • In groups:
  • In the warm-up, what are the nodes? The edges?
  • Are the data undirected or directed?
  • We can modify the nodes and edges according to various characteristics to make our visuals more interesting

Data Visualization

  • Nodes are users
  • Edges are Twitter interactions
  • Colors based on opinions

Data Visualization

  • Nodes are actors
  • Edges are information shared
  • Color based on agreement with ‘There should be an international binding commitment on all nations to reduce GHG emissions’.

How to Represent Network Data

  • One-mode vs. Two-mode

Network Visualization

  • Some networks have multiple “modes” or levels
  • For example, two-mode network below
  • Small nodes are actors, large nodes are organizations:

Network Visualization

  • Can represent this as one-mode network
  • Now ties represent shared actors
  • Colors represent funding from Koch/Exxon (green) or not (red)

Network Visualization

  • In pairs:
  • How would you describe the following network? (one mode, two mode)
  • Could it be projected to a one-mode network? How?

Recap of Network Basics

  • Nodes and edges are the building blocks of networks
  • Networks can be directed or undirected
  • Two-mode networks have two levels (e.g. individuals, institutions)
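
As a rough illustration of projecting a two-mode network to one mode, the sketch below uses a hypothetical actor-by-organization membership matrix (all names are invented). Multiplying the matrix by its transpose counts shared actors, which is one common way to build the "ties represent shared actors" network.

```r
# Hypothetical two-mode data: rows are actors, columns are organizations.
# A 1 means the actor belongs to the organization.
membership <- matrix(c(1, 1, 0,
                       1, 0, 0,
                       0, 1, 1,
                       0, 0, 1),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(c("actor_1", "actor_2", "actor_3", "actor_4"),
                                     c("org_A", "org_B", "org_C")))

# One-mode projection onto organizations:
# entry [i, j] counts the actors shared by organizations i and j.
org_net <- t(membership) %*% membership

# The diagonal holds each organization's own size, not a tie, so zero it out.
diag(org_net) <- 0
org_net
```

Using `membership %*% t(membership)` instead would project onto actors, with ties counting shared organizations.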

Measures of Centrality

Why Centrality?

  • We might want to evaluate the central nodes in our network
  • For example: centrality in a network might represent social capital or influence

Measures of Centrality

  • Degree Centrality
  • Node with the highest number of ties is most central
  • Think: most friends

Measures of Centrality

  • Betweenness Centrality
  • Based on “shortest paths” between nodes, node involved in most “shortest paths” is most central
  • Think: who is involved in the most friendship pathways?

Measures of Centrality

  • Closeness Centrality
  • Based on “shortest paths” between nodes, node with lowest average “shortest path” to the others is most central
  • Think: who could spread a message fastest?
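
To make "lowest average shortest path" concrete, here is a minimal base-R sketch on a hypothetical five-node undirected network. The helper `bfs_distances()` is written for this example (it is not a library function) and uses breadth-first search, which finds shortest paths when every tie counts as one step.

```r
# Hypothetical undirected network: A is tied to B, C, D; D is tied to E.
nodes <- c("A", "B", "C", "D", "E")
ties  <- rbind(c("A", "B"), c("A", "C"), c("A", "D"), c("D", "E"))

adj <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
adj[ties] <- 1          # fill in each tie...
adj[ties[, 2:1]] <- 1   # ...and its reverse, since ties are undirected

# Breadth-first search: number of steps from `start` to every node.
bfs_distances <- function(adj, start) {
  dist <- setNames(rep(NA_integer_, nrow(adj)), rownames(adj))
  dist[start] <- 0L
  queue <- start
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    for (nb in names(which(adj[current, ] == 1))) {
      if (is.na(dist[nb])) {          # first visit is the shortest route
        dist[nb] <- dist[current] + 1L
        queue <- c(queue, nb)
      }
    }
  }
  dist
}

# Average shortest path from each node to all the others.
avg_path <- sapply(nodes, function(v) {
  d <- bfs_distances(adj, v)
  mean(d[names(d) != v])
})
avg_path
names(which.min(avg_path))  # "A" has the lowest average distance
```

Software packages often report closeness as the inverse of the summed (or average) distance, so that larger values mean more central; the ranking of nodes is the same.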

Measures of Centrality

  • Eigenvector Centrality
  • Based on the centrality of adjacent nodes, node with “friends in high places” is most central
  • Think: who has friends with the most connections?
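
The "friends in high places" idea (usually called eigenvector centrality) can be sketched with base R: each node's score is proportional to the sum of its neighbors' scores, which makes the scores the leading eigenvector of the adjacency matrix. The five-node network below is hypothetical, and rescaling so the top score equals 1 is just one convention.

```r
# Hypothetical undirected network: A is tied to B, C, D; D is tied to E.
nodes <- c("A", "B", "C", "D", "E")
ties  <- rbind(c("A", "B"), c("A", "C"), c("A", "D"), c("D", "E"))

adj <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
adj[ties] <- 1
adj[ties[, 2:1]] <- 1   # undirected, so add the reverse of each tie

# eigen() returns eigenvalues in decreasing order for a symmetric matrix,
# so the first column is the leading eigenvector.
leading <- eigen(adj)$vectors[, 1]

# The sign of an eigenvector is arbitrary; take absolute values and
# rescale so the most central node scores 1.
ev <- setNames(abs(leading) / max(abs(leading)), nodes)
round(ev, 3)
```

In this example D scores higher than B even though neither is the hub: both are tied to A, but D has an additional tie of its own.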

Measures of Centrality

  • Group activity:
  • Which point(s) have the highest degree centrality? (most edges)
  • Betweenness centrality? (involved in most “shortest paths”)
  • Closeness centrality? (lowest average “shortest path” to others)
  • Eigenvector centrality? (“friends in high places”)

Measures of Centrality

Centrality Recap

  • Centrality is a way to measure the importance of a network position
  • There are many different measures of centrality
  • We can choose a measure based on our theory of why centrality is important (e.g. number of edges, betweenness, closeness, friends in high places)

Warm-up

  • Create a visualization with the provided network data
  • Consider using color/scale/labels/comparisons
  • In your visualization, which are most central nodes?

Wednesday’s Class

  • Introduction to igraph
  • Network visualization and centrality with authorship data

Miscellaneous

  • PSets are due on Tuesdays, starting with PSet 6
  • Final Project Rubric available on Canvas
  • May 13th: Guest speaker from Recidiviz

Learning Goals

  • Build our own networks in R
  • Calculate centrality measures
  • Set network characteristics

Building Our First Networks

Building Our First Networks

  • Let’s create a network!
  • We’ll represent friendship ties among 7 people
  • We’ll call it el for “edge list”
  • Notice how the data structure is different
  • Needs to be a matrix
library(magrittr) # provides the %>% pipe

# edge list: one row per friendship tie
el <- data.frame(person_1 = c("Lionel", "Frederica", "Garrett", "Lina",
                              "Sampson", "Garrett", "Frederica", "Lionel",
                              "Frederica", "Frederica", "Ahmad"),
                 person_2 = c("Lina", "Charlotte", "Lina", "Sampson",
                              "Charlotte", "Lionel", "Lina", "Sampson",
                              "Sampson", "Garrett", "Garrett")) %>%
  as.matrix()
# take a look
el
##       person_1    person_2   
##  [1,] "Lionel"    "Lina"     
##  [2,] "Frederica" "Charlotte"
##  [3,] "Garrett"   "Lina"     
##  [4,] "Lina"      "Sampson"  
##  [5,] "Sampson"   "Charlotte"
##  [6,] "Garrett"   "Lionel"   
##  [7,] "Frederica" "Lina"     
##  [8,] "Lionel"    "Sampson"  
##  [9,] "Frederica" "Sampson"  
## [10,] "Frederica" "Garrett"  
## [11,] "Ahmad"     "Garrett"

Building Our First Networks

  • Create your own:

Building Our First Networks

  • We can use igraph to create a network object
  • Should it be directed?
library(igraph)

# create network from edge list
net <- graph_from_edgelist(el)

Building Our First Networks

  • Let’s assume friendships are bidirectional (undirected)
library(igraph)

# create network from edge list
net <- graph_from_edgelist(el, directed = FALSE)

Building Our First Networks

  • Let’s try plotting our network!
# plot network
plot(net)

Building Our First Networks

  • We can adjust the graph using arguments that start with vertex and edge
# customize network plot
plot(net, vertex.size = 30, 
     vertex.color = "lavender", 
     vertex.frame.color = NA, 
     vertex.label.cex = .7, 
     edge.curved = .05, 
     edge.arrow.size = .3, 
     edge.width = .7, 
     edge.color = "gray1")

Measures of Centrality

Measures of Centrality

  • Degree Centrality
  • Based on number of edges, node with most edges is most central
  • degree()

Measures of Centrality

  • Betweenness Centrality
  • Based on “shortest paths” between nodes, node involved in most “shortest paths” is most central
  • betweenness()

Measures of Centrality

  • Closeness Centrality
  • Based on “shortest paths” between nodes, node with lowest average “shortest path” to the others is most central
  • closeness()

Measures of Centrality

  • Eigenvector Centrality
  • Based on the centrality of adjacent nodes, node with “friends in high places” is most central
  • eigen_centrality() (called evcent() in older igraph versions)

Calculating Centrality

  • With el, calculate centrality!
  • Use degree(), betweenness(), closeness(), and eigen_centrality()
  • Which node has the highest closeness score?
  • Which node has the highest eigenvector centrality score?
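
One possible way to work through this exercise is sketched below. It rebuilds the friendship edge list from the earlier slide and collects the four measures in one data frame; the `centrality` data frame and its column names are choices made for this example. Note that `eigen_centrality()` returns a list, with the scores in its `$vector` element.

```r
library(igraph)

# Friendship edge list from the slides, as a two-column character matrix.
el <- cbind(
  person_1 = c("Lionel", "Frederica", "Garrett", "Lina",
               "Sampson", "Garrett", "Frederica", "Lionel",
               "Frederica", "Frederica", "Ahmad"),
  person_2 = c("Lina", "Charlotte", "Lina", "Sampson",
               "Charlotte", "Lionel", "Lina", "Sampson",
               "Sampson", "Garrett", "Garrett")
)

# Undirected network: friendships are assumed to be mutual.
net <- graph_from_edgelist(el, directed = FALSE)

# One row per person, one column per centrality measure.
centrality <- data.frame(
  degree      = degree(net),
  betweenness = betweenness(net),
  closeness   = closeness(net),
  eigenvector = eigen_centrality(net)$vector
)

# Sort by closeness to see who could reach everyone else fastest.
centrality[order(centrality$closeness, decreasing = TRUE), ]
```

Sorting by a different column (for example `centrality$eigenvector`) answers the other questions on the slide.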

Setting Network Characteristics

How do we set network characteristics?

  • How would we normally create a variable?
  • mutate()
  • Does this work with a network?
  • Try net %>% mutate()

How do we set network characteristics?

  • Network data are not like other data we’ve worked with
  • Is net a dataframe? How would we find out?
  • class(net)

How do we set network characteristics?

  • We can explore network characteristics with V()
  • Try it! V(net)
  • We can call variables with the $ sign (similar to dataframes)
  • We can set variables using V(net)$variable <- newvariable
  • Let’s try it!
V(net)$betweenness <- betweenness(net)

How do we set network characteristics?

  • Try setting betweenness, closeness, degree, eigenvector centrality
V(net)$betweenness <- betweenness(net)
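
If you want to return to familiar data-frame tools after setting vertex attributes, one option is to convert the vertices to a data frame. The sketch below uses a small hypothetical three-person network; `igraph::as_data_frame()` is written with its package prefix because other packages export a function with the same name.

```r
library(igraph)

# Small hypothetical network: three people in a line.
net <- graph_from_edgelist(rbind(c("Ana", "Ben"), c("Ben", "Cy")),
                           directed = FALSE)

class(net)  # "igraph", not a data frame, so mutate() does not apply directly

# Store centrality scores as vertex attributes.
V(net)$betweenness <- betweenness(net)
V(net)$degree      <- degree(net)

# List the attributes now attached to the vertices.
vertex_attr_names(net)

# Convert the vertex attributes to an ordinary data frame.
vertex_df <- igraph::as_data_frame(net, what = "vertices")
vertex_df
```

The resulting `vertex_df` has one row per node, so it can be filtered, sorted, or summarized like any other data frame.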

Benefits of setting network characteristics?

  • A benefit of setting characteristics: we can plot with them!
  • Try modifying vertex.size = V(net)$betweenness
  • Try modifying other characteristics too
plot(net, 
     vertex.size = V(net)$betweenness*10, 
     vertex.color = "lavender", 
     vertex.frame.color = NA, 
     vertex.label.cex = .5, 
     edge.curved = .05, 
     edge.width = .1, 
     edge.color = "gray1")

Benefits of setting network characteristics?

IPCC Report Authorship

IPCC Report Authorship

  • IPCC has written 6 climate assessments since 1990 (FAR - AR6)
  • Three working groups:
  • WGI: physical and scientific basis of climate change
  • WGII: vulnerability of natural and social systems to climate change
  • WGIII: mitigation strategies and policy recommendations
  • All of these influence global policies and mitigation strategies
  • Geographic bias? Most authors come from “Global North” countries

IPCC Report Authorship

  • Let’s take a look at the WGIII authors
  • Try the following code (section 6.4 of course site)
  • Then examine your data with View()
  • How would you describe your data?
library(RCurl)    # getURL()
library(readr)    # read_csv()
library(dplyr)    # mutate_all(), select()
library(magrittr) # %<>% assignment pipe

# fetch the raw CSV text from github
url <- getURL("https://raw.githubusercontent.com/tylermcdaniel/css/main/Data/IPCC_co_authorship.csv")

# parse authorship data (from github)
ipcc_authors <- read_csv(url, skip = 1)

# replace NA's with 0's
ipcc_authors %<>%
  mutate_all(~replace(., is.na(.), 0))

IPCC Report Authorship

  • Need to clean up data
library(dplyr)
# reorder column names to match rows
ipcc_authors %<>%
  select(order(colnames(.)))


# turn dataframe into matrix
ipcc_authors %<>%
  select(-Author) %>%
  as.matrix()

IPCC Report Coauthorship

  • Now we can turn it into a network!
ipcc_net <- graph_from_adjacency_matrix(ipcc_authors,
                                        mode = "undirected")

IPCC Report Coauthorship

  • Try it on your own!

IPCC Report Coauthorship

  • Try plotting
plot(ipcc_net, vertex.size = 2, 
     vertex.color = "lavender", 
     vertex.frame.color = NA, 
     vertex.label.cex = .5, 
     edge.curved = .05, 
     edge.width = .1, 
     edge.color = "gray1")

IPCC Report Coauthorship

  • Let’s only look at ties with 2 or more collaborations
# replace 1s with 0s (drops single collaborations)
ipcc_authors[ipcc_authors == 1] <- 0

# replace NAs with 0s
ipcc_authors[is.na(ipcc_authors)] <- 0

IPCC Report Coauthorship

  • Let’s remove authors who did not collaborate with anyone
# indices of authors who do not collaborate with others (row sums of 0)
no_colab <- which(rowSums(ipcc_authors, na.rm = TRUE) == 0)

# then we can remove these from ipcc_authors
# (assumes at least one such author; an empty no_colab would drop every row)
ipcc_authors <- ipcc_authors[-no_colab,
                             -no_colab]
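
The thresholding and filtering steps can be checked on a toy example. The sketch below uses a hypothetical four-author collaboration matrix (names and counts are invented) and a logical index rather than negative positions, which is an alternative that also behaves sensibly when no author is isolated.

```r
# Hypothetical co-authorship counts among four authors (symmetric matrix).
authors <- c("Author_A", "Author_B", "Author_C", "Author_D")
collabs <- matrix(c(0, 3, 1, 0,
                    3, 0, 2, 0,
                    1, 2, 0, 1,
                    0, 0, 1, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(authors, authors))

# Keep only ties with two or more collaborations.
collabs[collabs < 2] <- 0

# TRUE for authors who still have at least one tie.
has_tie <- rowSums(collabs) > 0

# Keep the same authors in both rows and columns so the matrix stays square.
collabs <- collabs[has_tie, has_tie, drop = FALSE]
collabs
```

Here Author_D only had a single collaboration, so that author drops out once the threshold is applied.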

IPCC Report Coauthorship

  • Now let’s try again
# create network, again
ipcc_net <- graph_from_adjacency_matrix(ipcc_authors,
                                        mode = "undirected")

plot(ipcc_net, vertex.size = 2, 
     vertex.color = "lavender", 
     vertex.frame.color = NA, 
     vertex.label.cex = .5, 
     edge.curved = .05, 
     edge.width = .1, 
     edge.color = "gray1")

IPCC Report Coauthorship

  • We can set network characteristics using V()
# add betweenness centrality
V(ipcc_net)$betweenness <- betweenness(ipcc_net)

IPCC Report Coauthorship

  • We can use network characteristics to format plot
plot(ipcc_net, 
     vertex.size = V(ipcc_net)$betweenness/max(V(ipcc_net)$betweenness) * 20, 
     vertex.color = "lavender", 
     vertex.frame.color = NA, 
     vertex.label.cex = .5, 
     edge.curved = .05, 
     edge.width = .1, 
     edge.color = "gray1")

IPCC Report Coauthorship

  • Create a new igraph object
  • Set weighted = TRUE to weight edges by the number of collaborations
# create network, again
ipcc_net <- graph_from_adjacency_matrix(ipcc_authors,
                                        mode = "undirected",
                                        weighted = TRUE)

IPCC Report Coauthorship

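  • We set network characteristics using V()
# add betweenness centrality
# note: in a weighted network, betweenness() treats edge weights as distances
# by default (larger weight = farther apart); pass weights = NA to ignore them
V(ipcc_net)$betweenness <- betweenness(ipcc_net)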

IPCC Report Coauthorship

  • We can examine the “most central” authors, according to betweenness scores
# show authors in descending order
V(ipcc_net)[order(V(ipcc_net)$betweenness, decreasing = TRUE)]
## + 148/148 vertices, named, from 23c573d:
##   [1] Shukla             Schaeffer          Smith             
##   [4] Edenhofer          Price              Carraro           
##   [7] Uerge-Vorsatz      Krey               Riahi             
##  [10] Clarke             Rogner             Lutz              
##  [13] Kahn-Ribeiro       Nakicenovic        Winkler           
##  [16] Hoehne             Paltsev            Michaelowa        
##  [19] Strachan           Sathaye            Faaij             
##  [22] Hourcade           Ravindranath       Halsnaes          
##  [25] Bosetti            Zhang              Skea              
##  [28] Den-Elzen          Lecocq             Fischedick        
## + ... omitted several vertices

IPCC Report Coauthorship

  • We can specify that we want edge width to be proportional to weights:
# plot network
plot(ipcc_net, 
     edge.width = E(ipcc_net)$weight / 5,
     vertex.size = V(ipcc_net)$betweenness / max(V(ipcc_net)$betweenness) * 20, 
     # igraph has no vertex.alpha argument; set transparency via the color itself
     vertex.color = adjustcolor("lavender", alpha.f = 0.5), 
     vertex.frame.color = NA, 
     vertex.label.cex = .5, 
     edge.curved = .05, 
     edge.color = "gray1")

IPCC Report Coauthorship

  • What else could we add to the plot?

IPCC Report Coauthorship

  • We could add region to examine differences in network centrality
# input url from github
url <- getURL("https://raw.githubusercontent.com/tylermcdaniel/css/main/Data/IPCC_cv.csv")

# download authorship data (from github)
ipcc_cvs <- read_csv(url)

# remove authors who did not collaborate
ipcc_cvs %<>%
  slice(-no_colab)

IPCC Report Coauthorship

  • We could add region to examine differences in network centrality
# define global regions
ipcc_cvs %<>%
  mutate(region = case_match(`IPCC representing country (i.e. Country of residence from IPCC doc)`,
                             # Europe will be green
                             c("Austria", "Belgium", "Denmark", "Finland", 
                               "France", "Germany", "Greece", "Hungary", "Italy",
                               "Netherlands", "Norway", "Spain", "Sweden", 
                               "Switzerland", "UK") ~ "green4", 
                             # North America will be red
                             c("Canada", "USA", "Mexico") ~ "tomato",
                             # BRICS will be blue
                             c("Brazil", "Russia", "India", "China", 
                               "South Africa") ~ "steelblue", 
                             # Other countries will be gold
                             .default = "gold"))

IPCC Report Coauthorship

  • Define region variable for the vertices
# define region variable (stored as a color name for each author)
V(ipcc_net)$region <- ipcc_cvs$region
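
Once region is attached to each vertex, one simple way to examine differences in centrality is to average a centrality score within each region. The sketch below uses made-up values; in practice the scores would come from `V(ipcc_net)$betweenness`, and the region labels here are names rather than the color codes used for plotting.

```r
# Hypothetical vertex-level data (all values invented for illustration).
authors <- data.frame(
  name        = c("a1", "a2", "a3", "a4", "a5", "a6"),
  region      = c("Europe", "Europe", "North America",
                  "BRICS", "BRICS", "Other"),
  betweenness = c(120, 80, 60, 10, 30, 0),
  stringsAsFactors = FALSE
)

# Mean betweenness within each region.
mean_by_region <- tapply(authors$betweenness, authors$region, mean)

# Regions ordered from most to least central on average.
sort(mean_by_region, decreasing = TRUE)
```

Averages can hide a lot of variation within a region, so it may also be worth looking at the distribution (for example with `boxplot(betweenness ~ region, data = authors)`).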

IPCC Report Coauthorship

  • Can change vertex color using vertex.color
# plot network
plot(ipcc_net, 
     edge.width = E(ipcc_net)$weight / 5,
     vertex.size = V(ipcc_net)$betweenness / max(V(ipcc_net)$betweenness) * 20, 
     # igraph has no vertex.alpha argument; set transparency via the colors
     vertex.color = adjustcolor(V(ipcc_net)$region, alpha.f = 0.5), 
     vertex.frame.color = NA, 
     vertex.label.cex = .5, 
     edge.curved = .05, 
     edge.color = "gray1")

IPCC Report Coauthorship

  • Finally, we can add a legend manually
# add legend
legend(
  "bottomright",
  legend = c("Europe", "North America", "BRICS", "Other"),
  pt.bg  = c("green4", "tomato", "steelblue", "gold"),
  pch    = 21,
  cex    = 1,
  title  = "Region"
  )

Interactive Networks with networkD3

Interactive Networks with networkD3

  • First, transform the igraph object to a networkD3 object
library(networkD3)

# create dataframe for networkD3
ipcc_d3 <- igraph_to_networkD3(ipcc_net, group = V(ipcc_net)$region)

Interactive Networks with networkD3

  • Next, set color palette (optional)
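# set color scale
# the domain must match the group values, which here are the color names
# stored in V(ipcc_net)$region (Europe, North America, BRICS, Other)
ColourScale <- 'd3.scaleOrdinal()
            .domain(["green4", "tomato", "steelblue", "gold"])
            .range(["#07C343", "#D2140C", "#0C4AD2", "#C7D20C"]);'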

Interactive Networks with networkD3

  • Then use forceNetwork() to build interactive network
p <- forceNetwork(Links = ipcc_d3$links,
                  Nodes = ipcc_d3$nodes,
                  Group = "group",
                  height="400px", width="800px",
                  NodeID = "name",
                  fontSize = 14,
                  colourScale = JS(ColourScale),
                  zoom = TRUE)

Interactive Networks with networkD3

  • Plot network!
# plot network!
p
