My key interest in this analysis was to investigate whether district site and trust scores predict tie formation in a school leaders network. Because the question concerns which attributes are associated with the presence of ties, I use an exponential random graph model (ERGM) to test the significance of the two variables.
The main research question guiding this study is: Do district site and trust scores predict tie formation in the school leaders network?
This independent study extends the Unit 4 Case Study and uses the Alan Daly school leaders dataset referenced in Chapter 9 of Carolan (2014).
# Loading Libraries
library(statnet)
library(tidyverse)
library(readxl)
library(igraph)
library(tidygraph)
library(ggraph)
library(skimr)
library(janitor)
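The wrangling process involved importing the school leaders node and edge Excel files into the R environment. The valued adjacency matrix was then dichotomized (ratings of 3 or higher coded as a tie, all others as no tie) and converted to an edge list.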
schoolleader_nodes <- read_excel("data/School Leaders Data Chapter 9_e.xlsx",
col_types = c("text", "numeric", "numeric", "numeric", "numeric")) |>
clean_names()
schoolleader_nodes
## # A tibble: 43 × 5
## id efficacy trust district_site male
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1 6.06 4 1 0
## 2 2 6.56 5.63 1 0
## 3 3 7.39 4.63 1 0
## 4 4 4.89 4 1 0
## 5 5 6.06 5.75 0 1
## 6 6 7.39 4.38 0 0
## 7 7 5.56 3.63 0 1
## 8 8 7.5 5.63 1 1
## 9 9 7.67 5.25 0 0
## 10 10 6.64 4.78 0 0
## # … with 33 more rows
schoolleader_matrix <- read_excel("data/School Leaders Data Chapter 9_d.xlsx",
col_names=FALSE) |>
clean_names()
schoolleader_matrix
schoolleader_matrix <- schoolleader_matrix |>
as.matrix()
schoolleader_matrix[schoolleader_matrix <= 2] <- 0
schoolleader_matrix[schoolleader_matrix >= 3] <- 1
rownames(schoolleader_matrix) <- schoolleader_nodes$id
colnames(schoolleader_matrix) <- schoolleader_nodes$id
schoolleader_matrix
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
## 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
## 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
## 5 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 7 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
## 8 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1
## 9 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 10 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0
## 11 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
## 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 13 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 14 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
## 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
## 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0
## 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## 20 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 21 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0
## 22 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
## 23 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
## 25 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
## 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
## 28 1 1 1 1 0 1 0 1 1 1 0 0 1 0 0 0 0 1 0 0 1 1 0 0 1 0 1 1
## 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
## 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
## 34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1
## 35 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
## 36 1 0 1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 1
## 37 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 38 1 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 1
## 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 41 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 42 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0
## 43 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 1 1 0 0 0 0 0
## 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
## 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 3 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
## 4 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0
## 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0
## 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 10 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
## 11 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
## 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 13 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0
## 14 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
## 15 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1
## 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 18 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## 19 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
## 20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 21 0 0 0 0 1 0 1 0 1 1 0 0 0 0 1
## 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 24 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
## 25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 26 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
## 27 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0
## 28 1 0 0 1 0 1 0 1 1 1 0 0 0 0 0
## 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 34 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 35 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 36 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## 37 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
## 38 0 0 0 0 1 1 0 1 1 0 1 0 0 0 1
## 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 41 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
## 42 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
## 43 0 0 0 1 1 1 0 0 1 0 0 0 0 1 0
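The dichotomization step above can be illustrated on a small toy matrix. This is only a sketch: the ratings in `toy_valued` are made up for illustration and are not taken from the school leaders data. It shows that a single logical comparison gives the same result as the two-step assignment when all ratings are whole numbers.

```r
# Toy valued matrix; these ratings are hypothetical, for illustration only.
toy_valued <- matrix(c(0, 4, 1,
                       3, 0, 2,
                       2, 3, 0),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# One-step recode: ratings of 3 or more become 1, everything else becomes 0.
toy_binary <- (toy_valued >= 3) * 1

# Two-step recode, mirroring the assignments used on schoolleader_matrix.
toy_two_step <- toy_valued
toy_two_step[toy_two_step <= 2] <- 0
toy_two_step[toy_two_step >= 3] <- 1

# Both approaches agree cell by cell on integer ratings.
all(toy_binary == toy_two_step)

# Number of directed ties that survive the threshold.
sum(toy_binary)
```

One design note: the one-step version also handles non-integer ratings between 2 and 3, which the two-step version would leave unchanged.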
# graph_from_adjacency_matrix() replaces the deprecated graph.adjacency()
adjacency_matrix <- graph_from_adjacency_matrix(schoolleader_matrix,
diag = FALSE)
class(adjacency_matrix)
## [1] "igraph"
adjacency_matrix
## IGRAPH f3ff7fd DN-- 43 143 --
## + attr: name (v/c)
## + edges from f3ff7fd (vertex names):
## [1] 3 ->1 3 ->27 3 ->28 3 ->36 3 ->38 4 ->16 4 ->28 4 ->34 4 ->36 5 ->9
## [11] 7 ->10 7 ->20 8 ->3 8 ->10 8 ->25 8 ->27 8 ->28 8 ->37 8 ->38 9 ->5
## [21] 9 ->14 10->3 10->7 10->8 10->11 10->18 10->20 10->38 11->10 11->21
## [31] 11->23 11->35 13->8 13->35 13->42 14->37 15->21 15->39 15->42 15->43
## [41] 16->22 17->26 18->29 18->34 19->31 20->7 20->10 21->1 21->11 21->12
## [51] 21->15 21->33 21->35 21->37 21->38 21->43 22->8 22->16 22->28 23->5
## [61] 23->11 24->27 24->42 25->3 25->8 26->17 26->41 27->24 27->32 27->42
## [71] 28->1 28->2 28->3 28->4 28->6 28->8 28->9 28->10 28->13 28->18
## + ... omitted several edges
# igraph::as_data_frame() replaces the deprecated get.data.frame()
schoolleader_edges <- igraph::as_data_frame(adjacency_matrix) |>
mutate(from = as.character(from)) |>
mutate(to = as.character(to))
schoolleader_edges
## from to
## 1 3 1
## 2 3 27
## 3 3 28
## 4 3 36
## 5 3 38
## 6 4 16
## 7 4 28
## 8 4 34
## 9 4 36
## 10 5 9
## 11 7 10
## 12 7 20
## 13 8 3
## 14 8 10
## 15 8 25
## 16 8 27
## 17 8 28
## 18 8 37
## 19 8 38
## 20 9 5
## 21 9 14
## 22 10 3
## 23 10 7
## 24 10 8
## 25 10 11
## 26 10 18
## 27 10 20
## 28 10 38
## 29 11 10
## 30 11 21
## 31 11 23
## 32 11 35
## 33 13 8
## 34 13 35
## 35 13 42
## 36 14 37
## 37 15 21
## 38 15 39
## 39 15 42
## 40 15 43
## 41 16 22
## 42 17 26
## 43 18 29
## 44 18 34
## 45 19 31
## 46 20 7
## 47 20 10
## 48 21 1
## 49 21 11
## 50 21 12
## 51 21 15
## 52 21 33
## 53 21 35
## 54 21 37
## 55 21 38
## 56 21 43
## 57 22 8
## 58 22 16
## 59 22 28
## 60 23 5
## 61 23 11
## 62 24 27
## 63 24 42
## 64 25 3
## 65 25 8
## 66 26 17
## 67 26 41
## 68 27 24
## 69 27 32
## 70 27 42
## 71 28 1
## 72 28 2
## 73 28 3
## 74 28 4
## 75 28 6
## 76 28 8
## 77 28 9
## 78 28 10
## 79 28 13
## 80 28 18
## 81 28 21
## 82 28 22
## 83 28 25
## 84 28 27
## 85 28 29
## 86 28 32
## 87 28 34
## 88 28 36
## 89 28 37
## 90 28 38
## 91 31 19
## 92 32 24
## 93 33 21
## 94 34 18
## 95 34 22
## 96 34 28
## 97 34 29
## 98 35 21
## 99 36 1
## 100 36 3
## 101 36 4
## 102 36 8
## 103 36 16
## 104 36 18
## 105 36 26
## 106 36 27
## 107 36 28
## 108 36 34
## 109 37 3
## 110 37 14
## 111 37 38
## 112 38 1
## 113 38 3
## 114 38 8
## 115 38 11
## 116 38 18
## 117 38 22
## 118 38 26
## 119 38 27
## 120 38 28
## 121 38 33
## 122 38 34
## 123 38 36
## 124 38 37
## 125 38 39
## 126 38 43
## 127 42 9
## 128 42 15
## 129 42 24
## 130 42 27
## 131 42 43
## 132 43 5
## 133 43 8
## 134 43 15
## 135 43 18
## 136 43 21
## 137 43 22
## 138 43 23
## 139 43 32
## 140 43 33
## 141 43 34
## 142 43 37
## 143 43 42
schoolleader_graph <- tbl_graph(edges = schoolleader_edges,
nodes = schoolleader_nodes,
directed = TRUE)
schoolleader_graph
## # A tbl_graph: 43 nodes and 143 edges
## #
## # A directed simple graph with 4 components
## #
## # Node Data: 43 × 5 (active)
## id efficacy trust district_site male
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1 6.06 4 1 0
## 2 2 6.56 5.63 1 0
## 3 3 7.39 4.63 1 0
## 4 4 4.89 4 1 0
## 5 5 6.06 5.75 0 1
## 6 6 7.39 4.38 0 0
## # … with 37 more rows
## #
## # Edge Data: 143 × 2
## from to
## <int> <int>
## 1 3 1
## 2 3 27
## 3 3 28
## # … with 140 more rows
In the explore phase, I calculated degree centrality measures (in-degree, out-degree, and total degree) for each leader.
# Calculating centrality measures
schoolleader_measures <- schoolleader_graph |>
activate(nodes) |>
mutate(in_degree = centrality_degree(mode = "in")) |>
mutate(out_degree = centrality_degree(mode = "out"))|>
mutate(degree = centrality_degree(mode = "all"))
schoolleader_measures
## # A tbl_graph: 43 nodes and 143 edges
## #
## # A directed simple graph with 4 components
## #
## # Node Data: 43 × 8 (active)
## id efficacy trust district_site male in_degree out_degree degree
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 6.06 4 1 0 5 0 5
## 2 2 6.56 5.63 1 0 1 0 1
## 3 3 7.39 4.63 1 0 7 5 12
## 4 4 4.89 4 1 0 2 4 6
## 5 5 6.06 5.75 0 1 3 1 4
## 6 6 7.39 4.38 0 0 1 0 1
## # … with 37 more rows
## #
## # Edge Data: 143 × 2
## from to
## <int> <int>
## 1 3 1
## 2 3 27
## 3 3 28
## # … with 140 more rows
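As a rough illustration of what the three degree measures count, the sketch below computes them directly from a toy adjacency matrix in base R. The matrix `toy_adj` is hypothetical; it assumes, as in the school leaders matrix, that rows are senders and columns are receivers.

```r
# Hypothetical 3-node directed network: rows send ties, columns receive them.
toy_adj <- matrix(c(0, 1, 1,
                    1, 0, 0,
                    0, 0, 0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

out_degree   <- rowSums(toy_adj)          # ties each node sends
in_degree    <- colSums(toy_adj)          # ties each node receives
total_degree <- in_degree + out_degree    # ties sent plus ties received

data.frame(node = rownames(toy_adj), in_degree, out_degree, total_degree)

# Every tie has one sender and one receiver, so the mean in-degree and the
# mean out-degree both equal (number of ties) / (number of nodes).
mean(in_degree) == mean(out_degree)
```

The same identity applies to the school leaders network: 143 ties across 43 nodes gives a mean in-degree and mean out-degree of 143 / 43, or about 3.33.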
node_measures <- schoolleader_measures |>
activate(nodes) |>
as_tibble()
summary(node_measures)
## id efficacy trust district_site
## Length:43 Min. :4.610 Min. :3.630 Min. :0.0000
## Class :character 1st Qu.:5.670 1st Qu.:4.130 1st Qu.:0.0000
## Mode :character Median :6.780 Median :4.780 Median :0.0000
## Mean :6.649 Mean :4.783 Mean :0.4186
## 3rd Qu.:7.470 3rd Qu.:5.440 3rd Qu.:1.0000
## Max. :8.500 Max. :5.880 Max. :1.0000
## male in_degree out_degree degree
## Min. :0.0000 Min. :0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.:0.0000 1st Qu.:2.000 1st Qu.: 1.000 1st Qu.: 3.000
## Median :0.0000 Median :3.000 Median : 2.000 Median : 4.000
## Mean :0.4419 Mean :3.326 Mean : 3.326 Mean : 6.651
## 3rd Qu.:1.0000 3rd Qu.:5.000 3rd Qu.: 4.000 3rd Qu.: 9.500
## Max. :1.0000 Max. :8.000 Max. :20.000 Max. :27.000
skim(node_measures)
| Data summary | |
|---|---|
| Name | node_measures |
| Number of rows | 43 |
| Number of columns | 8 |
| Column type frequency: | |
| character | 1 |
| numeric | 7 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| id | 0 | 1 | 1 | 2 | 0 | 43 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| efficacy | 0 | 1 | 6.65 | 1.10 | 4.61 | 5.67 | 6.78 | 7.47 | 8.50 | ▅▅▃▇▃ |
| trust | 0 | 1 | 4.78 | 0.71 | 3.63 | 4.13 | 4.78 | 5.44 | 5.88 | ▆▆▅▆▇ |
| district_site | 0 | 1 | 0.42 | 0.50 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 | ▇▁▁▁▆ |
| male | 0 | 1 | 0.44 | 0.50 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 | ▇▁▁▁▆ |
| in_degree | 0 | 1 | 3.33 | 2.12 | 0.00 | 2.00 | 3.00 | 5.00 | 8.00 | ▅▇▂▅▂ |
| out_degree | 0 | 1 | 3.33 | 4.26 | 0.00 | 1.00 | 2.00 | 4.00 | 20.00 | ▇▁▁▁▁ |
| degree | 0 | 1 | 6.65 | 5.80 | 0.00 | 3.00 | 4.00 | 9.50 | 27.00 | ▇▃▂▁▁ |
Group statistics for trust by district site
node_measures |>
group_by(district_site) |>
summarise(n = n(),
mean = mean(trust),
sd = sd(trust))
## # A tibble: 2 × 4
## district_site n mean sd
## <dbl> <int> <dbl> <dbl>
## 1 0 25 4.90 0.719
## 2 1 18 4.62 0.688
schoolleader_measures |>
ggraph(layout = "fr") +
geom_edge_link(color = "grey",width= 0.3) +
geom_node_point(shape=21,aes(fill = factor(district_site)),size= 3) +
theme_graph()+
labs(title= "Year 3 Confidential Exchange Network")
schoolleader_network <- as.network(schoolleader_edges,
vertices = schoolleader_nodes)
schoolleader_network
## Network attributes:
## vertices = 43
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 143
## missing edges= 0
## non-missing edges= 143
##
## Vertex attribute names:
## district_site efficacy male trust vertex.names
##
## No edge attributes
summary(schoolleader_network ~ edges +
mutual +
transitive +
gwesp(0.25, fixed=T) + nodefactor("district_site") + nodecov("trust"))
## edges mutual
## 143.0000 36.0000
## transitive gwesp.fixed.0.25
## 202.0000 116.2971
## nodefactor.district_site.1 nodecov.trust
## 166.0000 1389.3900
# Estimating the model
ergm_schoolleader <- ergm(schoolleader_network ~ edges +
mutual +
gwesp(0.25, fixed=T) +
nodefactor('district_site') +
nodecov('trust')
)
# Summarize the fitted model object
summary(ergm_schoolleader)
## Call:
## ergm(formula = schoolleader_network ~ edges + mutual + gwesp(0.25,
## fixed = T) + nodefactor("district_site") + nodecov("trust"))
##
## Monte Carlo Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -4.87814 0.51707 0 -9.434 < 1e-04 ***
## mutual 2.28851 0.31422 0 7.283 < 1e-04 ***
## gwesp.fixed.0.25 1.00733 0.13915 0 7.239 < 1e-04 ***
## nodefactor.district_site.1 0.22278 0.07790 0 2.860 0.00424 **
## nodecov.trust 0.07832 0.05099 0 1.536 0.12451
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 2503.6 on 1806 degrees of freedom
## Residual Deviance: 795.1 on 1801 degrees of freedom
##
## AIC: 805.1 BIC: 832.6 (Smaller is better. MC Std. Err. = 0.655)
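The fit statistics at the bottom of the summary can be checked by hand. The sketch below is a back-of-the-envelope calculation in base R, assuming the null model assigns probability 0.5 to every possible tie and that AIC and BIC use the usual deviance-plus-penalty formulas. The inputs are the values printed in the summary; the variable names are my own.

```r
n_dyads           <- 43 * 42   # possible directed ties, the 1806 null df
n_params          <- 5         # edges, mutual, gwesp, nodefactor, nodecov
residual_deviance <- 795.1     # as printed in the model summary

# Deviance of a model in which every tie has probability 0.5.
null_deviance <- 2 * n_dyads * log(2)

# Standard information criteria built from the residual deviance.
aic <- residual_deviance + 2 * n_params
bic <- residual_deviance + n_params * log(n_dyads)

round(c(null_deviance = null_deviance, aic = aic, bic = bic), 1)
```

These values appear to agree with the reported null deviance (2503.6), AIC (805.1), and BIC (832.6). The large drop from the null to the residual deviance reflects how much the five terms improve on a coin-flip model of tie presence.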
The main objective of the analysis was to investigate whether district site and trust predict ties in the school leaders network. The summary table indicates that, controlling for density (edges), reciprocity (mutual), and transitivity (GWESP), all of which are significant, district site is significantly associated with tie formation (estimate = 0.223, p = 0.004). Because ERGM coefficients are on the log-odds scale, this means that each district-level leader involved in a potential tie raises the conditional log-odds of that tie by about 0.22, which corresponds to roughly 25% higher odds (exp(0.223) ≈ 1.25). Trust score, by contrast, is not a significant predictor in this model (estimate = 0.078, p = 0.125), so there is no evidence here that it is associated with tie formation in the network.
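To make the coefficient interpretation concrete, the sketch below converts the reported estimates to odds ratios and recomputes the Wald test for the district site term. It is a minimal base R illustration that copies the estimates and standard errors from the summary table by hand rather than extracting them from the fitted object; the vector names are my own.

```r
# Estimates and standard errors copied from the model summary.
estimate <- c(edges = -4.87814, mutual = 2.28851, gwesp = 1.00733,
              district_site = 0.22278, trust = 0.07832)
std_error <- c(edges = 0.51707, mutual = 0.31422, gwesp = 0.13915,
               district_site = 0.07790, trust = 0.05099)

# ERGM coefficients are conditional log-odds; exponentiate for odds ratios.
odds_ratio <- exp(estimate)

# Wald z statistic and two-sided p-value for each term.
z_value <- estimate / std_error
p_value <- 2 * pnorm(-abs(z_value))

round(data.frame(estimate, odds_ratio, z_value, p_value), 3)
```

Read this way, the district site odds ratio is about 1.25 per district-level endpoint of a tie, while the trust odds ratio is close to 1 and its p-value is above 0.05.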
Understanding which attributes significantly predict ties helps explain the factors that shape relations among actors in the network. With this information, policy makers, administrators, and other stakeholders can anticipate productive relationships, information flow, and collaboration involving district-level leaders. Furthermore, knowing that a variable such as “trust” does not predict tie formation in this model creates an opportunity to identify other variables that may be better predictors, which in turn points to further work that could improve the model.