This case study analyzes the Social Network Analysis and Education Year 1 Collaboration data set. The data cover 43 school leaders within one school district. The study examines the centrality and the reciprocity of the school leader collaboration network in Year 1.
Five libraries are needed in this case study: tidygraph 📦 igraph 📦 ggraph 📦 readxl 📦 tidyverse 📦
library(tidygraph)
##
## Attaching package: 'tidygraph'
## The following object is masked from 'package:stats':
##
## filter
library(igraph)
##
## Attaching package: 'igraph'
## The following object is masked from 'package:tidygraph':
##
## groups
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
library(ggraph)
## Loading required package: ggplot2
library(readxl)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ tibble 3.1.6 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ✓ purrr 0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::as_data_frame() masks tibble::as_data_frame(), igraph::as_data_frame()
## x purrr::compose() masks igraph::compose()
## x tidyr::crossing() masks igraph::crossing()
## x dplyr::filter() masks tidygraph::filter(), stats::filter()
## x dplyr::groups() masks igraph::groups(), tidygraph::groups()
## x dplyr::lag() masks stats::lag()
## x purrr::simplify() masks igraph::simplify()
year_1_collaboration <- read_excel("data/year_1_collaboration.xlsx",
col_names = FALSE)
## New names:
## * `` -> ...1
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * ...
Building on the Unit 2 practice, this case study is guided by two questions:
1. How do the centrality measures for the Year 1 directed network reveal the collaboration pattern formed at the beginning of the reform?
2. What are the reciprocated ties in the Year 1 directed network?
In particular, this case study looks at the concepts of centrality and reciprocity in more depth.
#Add row and column names for Year 1
rownames(year_1_collaboration) <- 1:43
## Warning: Setting row names on a tibble is deprecated.
colnames(year_1_collaboration) <- 1:43
#View the Year 1 dataset with added names
year_1_collaboration
## # A tibble: 43 × 43
## `1` `2` `3` `4` `5` `6` `7` `8` `9` `10` `11` `12` `13`
## * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 0 3 0 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0 0 0 0 0
## 3 4 0 0 0 0 0 0 0 0 0 0 0 0
## 4 0 0 0 0 0 0 0 0 0 0 0 0 0
## 5 0 0 0 0 0 0 0 0 3 0 0 0 0
## 6 3 0 0 0 0 0 0 0 0 0 0 0 0
## 7 0 0 0 0 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 0 0 0 4 0 0 0
## 9 0 0 0 0 4 0 0 0 0 0 0 0 0
## 10 0 0 0 0 0 0 0 4 0 0 0 0 0
## # … with 33 more rows, and 30 more variables: `14` <dbl>, `15` <dbl>,
## # `16` <dbl>, `17` <dbl>, `18` <dbl>, `19` <dbl>, `20` <dbl>, `21` <dbl>,
## # `22` <dbl>, `23` <dbl>, `24` <dbl>, `25` <dbl>, `26` <dbl>, `27` <dbl>,
## # `28` <dbl>, `29` <dbl>, `30` <dbl>, `31` <dbl>, `32` <dbl>, `33` <dbl>,
## # `34` <dbl>, `35` <dbl>, `36` <dbl>, `37` <dbl>, `38` <dbl>, `39` <dbl>,
## # `40` <dbl>, `41` <dbl>, `42` <dbl>, `43` <dbl>
#Convert to matrix object for Year 1
year_1_matrix <- as.matrix(year_1_collaboration)
#Convert to graph object for Year 1 (directed)
year_1_network_D <- as_tbl_graph(year_1_matrix, directed = TRUE)
#View the Year 1 directed dataset after the conversion
year_1_network_D
## # A tbl_graph: 43 nodes and 82 edges
## #
## # A directed simple graph with 3 components
## #
## # Node Data: 43 × 1 (active)
## name
## <chr>
## 1 1
## 2 2
## 3 3
## 4 4
## 5 5
## 6 6
## # … with 37 more rows
## #
## # Edge Data: 82 × 3
## from to weight
## <int> <int> <dbl>
## 1 1 3 3
## 2 3 1 4
## 3 3 24 3
## # … with 79 more rows
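The conversion above reads each nonzero cell of the matrix as a directed tie from the row ID to the column ID, with the cell value as the tie weight. For example, row 1, column 3 holds 3, which becomes the edge from 1 to 3 with weight 3. The following base R sketch reproduces that reading on a hypothetical 3 × 3 matrix. The names `toy_matrix` and `toy_edges` are illustrative only, and this is not meant to describe how `as_tbl_graph()` works internally.

```r
# Hypothetical 3 x 3 collaboration matrix (rows = sender, columns = receiver)
toy_matrix <- matrix(0, nrow = 3, ncol = 3, dimnames = list(1:3, 1:3))
toy_matrix[1, 3] <- 3
toy_matrix[3, 1] <- 4
toy_matrix[3, 2] <- 3

# Locate every nonzero cell as a (row, column) pair
cells <- which(toy_matrix != 0, arr.ind = TRUE)

# Build an edge list: one row per directed tie, with the cell value as weight
toy_edges <- data.frame(
  from = unname(cells[, "row"]),
  to = unname(cells[, "col"]),
  weight = toy_matrix[cells]
)

# Sort by sender, then receiver, to match the ordering printed above
toy_edges <- toy_edges[order(toy_edges$from, toy_edges$to), ]
rownames(toy_edges) <- NULL
toy_edges
```

Three nonzero cells give three directed ties, just as the 82 nonzero cells in the Year 1 matrix give 82 edges.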
In a directed network, degree centrality splits into in-degree (ties received) and out-degree (ties sent). The in-degree centralization for the Year 1 network is about 0.10:
# in-degree
InD <- centr_degree(year_1_network_D, mode = "in")
InD
## $res
## [1] 4 0 3 1 2 0 2 6 3 4 1 0 0 1 2 1 1 3 2 1 2 2 2 3 2 3 1 0 6 2 2 1 1 3 2 0 4 0
## [39] 4 1 1 2 1
##
## $centralization
## [1] 0.09745293
##
## $theoretical_max
## [1] 1806
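The `$centralization` value can be checked by hand from the other two parts of the output. The sketch below assumes the Freeman-style definition: the summed gap between the largest in-degree and every node's in-degree, divided by the theoretical maximum. Reading the reported 1806 as 43 × 42 is my own interpretation of the output, not something the source states.

```r
# In-degree of each of the 43 school leaders, copied from the $res output above
in_degree <- c(4, 0, 3, 1, 2, 0, 2, 6, 3, 4, 1, 0, 0, 1, 2, 1, 1, 3, 2, 1,
               2, 2, 2, 3, 2, 3, 1, 0, 6, 2, 2, 1, 1, 3, 2, 0, 4, 0,
               4, 1, 1, 2, 1)
n <- length(in_degree)

# Summed gap between the most central node and every node
observed_gap <- sum(max(in_degree) - in_degree)

# Assumed form of the theoretical maximum; 43 * 42 equals the reported 1806
theoretical_max <- n * (n - 1)

in_centralization <- observed_gap / theoretical_max
in_centralization
```

The gap sums to 176, and 176 / 1806 reproduces the reported 0.09745293, so the "about 0.10" figure is this ratio rounded.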
# in-degree histogram
hist(degree(year_1_network_D, mode = "in"), col = "light blue",
main = "In-degree distribution in Year 1 Network",
xlab = "Degree", ylab = "Frequency",
ylim = c(0, 20),
xlim = c(0, 10)
)
The out-degree centralization for Year 1 is about 0.03:
# out-degree
OuD <- centr_degree(year_1_network_D, mode = "out")
OuD
## $res
## [1] 1 0 2 2 1 3 2 2 1 3 3 1 0 3 2 2 1 2 2 3 3 2 1 3 2 1 3 3 2 2 2 2 2 3 1 3 0 3
## [39] 2 1 2 2 1
##
## $centralization
## [1] 0.02602436
##
## $theoretical_max
## [1] 1806
# out-degree histogram
hist(degree(year_1_network_D, mode = "out"), col = "blue",
main = "Out-degree distribution in Year 1 Network",
xlab = "Degree", ylab = "Frequency",
ylim = c(0, 20),
xlim = c(0, 10), breaks = 4)
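The two `$res` vectors printed above can also be compared node by node. The sketch below copies both vectors from the output and checks which IDs sit at the top of each. It uses base R only, and the variable names are illustrative.

```r
# Degree vectors copied from the two centr_degree() outputs above
in_degree <- c(4, 0, 3, 1, 2, 0, 2, 6, 3, 4, 1, 0, 0, 1, 2, 1, 1, 3, 2, 1,
               2, 2, 2, 3, 2, 3, 1, 0, 6, 2, 2, 1, 1, 3, 2, 0, 4, 0,
               4, 1, 1, 2, 1)
out_degree <- c(1, 0, 2, 2, 1, 3, 2, 2, 1, 3, 3, 1, 0, 3, 2, 2, 1, 2, 2, 3,
                3, 2, 1, 3, 2, 1, 3, 3, 2, 2, 2, 2, 2, 3, 1, 3, 0, 3,
                2, 1, 2, 2, 1)

# Every tie has one sender and one receiver, so both vectors sum to the 82 edges
degree_totals <- c(received = sum(in_degree), sent = sum(out_degree))

# IDs holding the maximum in-degree and the maximum out-degree
top_in <- which(in_degree == max(in_degree))
top_out <- which(out_degree == max(out_degree))

# IDs that are at the top of both rankings (empty if there are none)
top_both <- intersect(top_in, top_out)

degree_totals
top_in
top_out
top_both
```

In-degree ranges from 0 to 6 while out-degree only ranges from 0 to 3, which is why the in-degree centralization is larger. IDs 8 and 29 share the highest in-degree, and neither of them is among the IDs with the highest out-degree.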
Because the in- and out-degree centralization scores differ considerably, I also looked at the hub score of each node.
# Hub Score
hub.score(year_1_network_D)$vector
## 1 2 3 4 5 6
## 8.545589e-04 0.000000e+00 1.972920e-02 4.238596e-01 8.957799e-03 1.360977e-02
## 7 8 9 10 11 12
## 1.196560e-01 8.120156e-04 1.675361e-03 1.854966e-03 1.795702e-03 5.027704e-05
## 13 14 15 16 17 18
## 0.000000e+00 1.063527e-01 3.085444e-04 5.818655e-01 1.491743e-05 6.261540e-01
## 19 20 21 22 23 24
## 1.234457e-01 9.490801e-04 1.413754e-02 7.957694e-01 4.910621e-04 6.350592e-03
## 25 26 27 28 29 30
## 4.205193e-03 8.545589e-04 9.902890e-03 1.000000e+00 7.841599e-01 1.576128e-03
## 31 32 33 34 35 36
## 4.364355e-03 7.685158e-03 5.756383e-17 9.557115e-01 0.000000e+00 6.202752e-01
## 37 38 39 40 41 42
## 0.000000e+00 6.896191e-01 2.398443e-02 1.377725e-01 1.727447e-05 1.540214e-03
## 43
## 3.048151e-04
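In the hub score output above, ID 28 has the maximum score of 1 and the other IDs are scaled relative to it. A hub score (from Kleinberg's HITS approach) is high for a node that sends ties to nodes that receive many ties. It can be obtained as the principal eigenvector of A Aᵀ, where A is the adjacency matrix. The sketch below illustrates this on a hypothetical, unweighted 4-node network using base R's `eigen()`. It is not the igraph implementation, and to my knowledge igraph uses the `weight` edge attribute by default when one is present, so the Year 1 values above probably reflect the tie weights as well.

```r
# Hypothetical network: 1 -> 2, 1 -> 3, 1 -> 4, 2 -> 3, 4 -> 3
A <- matrix(0, nrow = 4, ncol = 4, dimnames = list(1:4, 1:4))
A[1, c(2, 3, 4)] <- 1
A[2, 3] <- 1
A[4, 3] <- 1

# Entry (i, j) of A %*% t(A) counts the receivers that i and j have in common
hub_matrix <- A %*% t(A)

# Principal eigenvector; eigen() sorts eigenvalues in decreasing order
principal <- eigen(hub_matrix, symmetric = TRUE)$vectors[, 1]

# The sign of an eigenvector is arbitrary, so take absolute values,
# then scale so that the largest hub score equals 1
toy_hub <- abs(principal) / max(abs(principal))
names(toy_hub) <- rownames(A)
round(toy_hub, 3)
```

Node 1 gets a score of 1 because it sends ties to every other node, nodes 2 and 4 get 0.5, and node 3 gets 0 because it sends no ties at all.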
#Add a new variable to examine whether the tie is mutual or not
year_1_network_D <- year_1_network_D %>%
activate(edges) %>%
mutate(reciprocated = edge_is_mutual())
year_1_network_D
## # A tbl_graph: 43 nodes and 82 edges
## #
## # A directed simple graph with 3 components
## #
## # Edge Data: 82 × 4 (active)
## from to weight reciprocated
## <int> <int> <dbl> <lgl>
## 1 1 3 3 TRUE
## 2 3 1 4 TRUE
## 3 3 24 3 FALSE
## 4 4 29 3 FALSE
## 5 4 41 4 FALSE
## 6 5 9 3 TRUE
## # … with 76 more rows
## #
## # Node Data: 43 × 1
## name
## <chr>
## 1 1
## 2 2
## 3 3
## # … with 40 more rows
reciprocity(year_1_network_D)
## [1] 0.1707317
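The reciprocity value can be turned back into a count of ties. The sketch below assumes the default reading of `reciprocity()`, namely the share of directed edges that have a matching edge in the opposite direction. It first does the arithmetic for the Year 1 network and then applies the same ratio to a hypothetical 3-node matrix. All names are illustrative.

```r
# Values reported above for the Year 1 network
n_edges <- 82
reciprocity_value <- 0.1707317

# Number of directed edges that are reciprocated, and the number of mutual pairs
mutual_edges <- round(reciprocity_value * n_edges)
mutual_pairs <- mutual_edges / 2

# Same ratio on a hypothetical matrix with ties 1 -> 3, 3 -> 1 and 3 -> 2
toy <- matrix(0, nrow = 3, ncol = 3)
toy[1, 3] <- 3
toy[3, 1] <- 4
toy[3, 2] <- 3

has_tie <- toy != 0
# A tie i -> j is reciprocated when the tie j -> i also exists
toy_reciprocity <- sum(has_tie & t(has_tie)) / sum(has_tie)

c(mutual_edges = mutual_edges, mutual_pairs = mutual_pairs)
toy_reciprocity
```

For Year 1 this gives 14 reciprocated edges out of 82, that is, 7 mutual pairs. In the toy matrix two of the three ties are reciprocated, so the ratio is 2/3.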
set.seed(555)
# plot.igraph() draws the graph as a side effect and returns NULL,
# so the result is not assigned to an object
plot.igraph(year_1_network_D,
vertex.size = degree(year_1_network_D, mode = "in"),
main = "In-degree")
set.seed(555)
# Drawn directly for the same reason: plot.igraph() returns NULL
plot.igraph(year_1_network_D,
vertex.size = degree(year_1_network_D, mode = "out"),
main = "Out-degree")
set.seed(555)
# Drawn directly for the same reason: plot.igraph() returns NULL
plot.igraph(year_1_network_D,
vertex.size = hub.score(year_1_network_D)$vector * 30, # Re-scaled
main = "Hubs")
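In the three plots above, the raw degree or hub score is used as the vertex size, so a school leader with a value of 0 (for example ID 2, with in-degree 0 and out-degree 0) is drawn with size 0. One possible way around this is to map the values onto a fixed size range first. The helper `rescale_size()` below is hypothetical and not part of igraph, and the size limits 4 and 16 are arbitrary choices.

```r
# Hypothetical helper: map any numeric vector linearly onto [min_size, max_size]
rescale_size <- function(x, min_size = 4, max_size = 16) {
  value_range <- range(x)
  # If all values are equal, use the midpoint to avoid dividing by zero
  if (value_range[1] == value_range[2]) {
    return(rep((min_size + max_size) / 2, length(x)))
  }
  share <- (x - value_range[1]) / (value_range[2] - value_range[1])
  min_size + share * (max_size - min_size)
}

# Degrees 0, 3 and 6 become sizes 4, 10 and 16
rescale_size(c(0, 3, 6))

# Possible use with the plots above (not run here):
# plot.igraph(year_1_network_D,
#             vertex.size = rescale_size(degree(year_1_network_D, mode = "in")),
#             main = "In-degree")
```

With this mapping every node stays visible while the ordering of the sizes still follows the ordering of the scores.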
# Set the color of all nodes to white
V(year_1_network_D)$color <- "white"
# Graph layout (Fruchterman-Reingold)
fr_layout <- layout_with_fr(year_1_network_D)
# igraph plot
plot(year_1_network_D, layout = fr_layout)
ggraph(year_1_network_D, layout="stress") +
geom_edge_link(aes(color = reciprocated), alpha = 0.5,
start_cap = circle(2, 'mm'), end_cap = circle(2, 'mm')) +
scale_edge_width(range = c(0.5, 2.5)) +
geom_node_point(color = V(year_1_network_D)$color, size = 5, alpha = 0.5) +
geom_node_text(aes(label = name), repel = TRUE) +
theme_void() +
theme(legend.position = "none")
ggraph(year_1_network_D, layout = "linear") +
geom_edge_arc(aes(width = weight), alpha = 0.8) +
scale_edge_width(range = c(0.2, 2)) +
geom_node_text(aes(label = name)) +
labs(edge_width = "Weight") +
theme_graph()
R_graph <- ggraph(year_1_network_D, layout = "linear") +
geom_edge_arc(aes(colour = factor(reciprocated), width = weight), alpha = 0.8) +
scale_edge_width(range = c(0.2, 2)) +
geom_node_text(aes(label = name)) +
labs(edge_width = "Weight") +
theme_graph()
# Print the stored plot so that it is displayed
R_graph
In this case study, I used the Year 1 school leader collaboration dataset to examine the network's collaboration pattern through degree centrality and reciprocity.
As shown in Section 3, the network’s in-degree centralization is about 0.10 (10%). The in-degree output in Section 3 and the in-degree network graph in Section 4 show that IDs 8 and 29 received the most collaboration requests, with an in-degree of 6 each. The network’s out-degree centralization is only about 0.03 (3%), and its graph shows that the school leaders who had higher in-degrees did not always have higher out-degrees. Only a few IDs (e.g., IDs 34, 28, 29) appear in a clustered location in the hub score graph in Section 4. This finding indicates that the collaboration network is not very strong: one cluster of school leaders had some collaboration, while the others might not have similar or stronger connections, which could affect how resources and information are shared among the school leaders.
Reciprocity refers to the mutuality of the ties within a network. For Year 1, reciprocity for the entire network is about 0.17, which indicates that 17% of the ties are reciprocated. This is not surprising given the context of the data set: Year 1 is the beginning of the reform, and collaboration opportunities are likely to increase as the school leaders gain more time and experience building connections. The first three graphs show the reciprocated ties. In the graphs, there are 7 reciprocated pairs (14 of the 82 directed ties), which indicates that information did flow in both directions between those pairs. The last graph in Section 4 shows not only the reciprocated ties but also the strength of each connection. In this graph, we can see that reciprocated ties seem to have a higher weight compared with non-reciprocated ties. However, there are also some weaker connections among the reciprocated ties, such as the relationship between IDs 1 and 3. Given the context of the data, we can learn that the collaboration relationship among school leaders is not strong, with low degree centralization and a small number of reciprocated ties.