Prepare

The purpose of this case study is to explore network structures in the Witcher book series. I was interested in the data set because I like the Witcher series and it seemed like a fun data set to work with for my final project. The project also connects to my main research interests in English and literary studies. My research question for this analysis is:

Research Question: How does the network structure change over the course of the seven books?

The network data is specifically built from the seven books in the Witcher series by Andrzej Sapkowski. For this data set, Ava Sadasivan (2022) created “a function which searched each line in the book for the name of the characters appearing in the sentence and added them as the Source character. Then [she] defined a window size of 5 lines and made it so that if two characters were listed within 5 lined of eachother there would be an edge with a weight of one assigned to them OR if they were already in the dataset for that book +1 would be added to their weight.”
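The windowing rule quoted above can be sketched in a few lines of base R. This is only an illustration of one reading of that description, not Sadasivan's code: the function name `cooccurrence_edges`, the sample lines, and the choice to count each pair of mentions once are all assumptions made for the example.

```r
# Hypothetical sketch of window-based co-occurrence counting (not the data set author's code).
cooccurrence_edges <- function(lines, characters, window = 5) {
  # One row per (line number, character name) mention.
  mentions <- do.call(rbind, lapply(seq_along(lines), function(i) {
    found <- characters[vapply(characters, grepl, logical(1), x = lines[i], fixed = TRUE)]
    data.frame(line = rep(i, length(found)), name = found, stringsAsFactors = FALSE)
  }))

  weights <- numeric(0)
  n <- nrow(mentions)
  if (n >= 2) {
    for (p in 1:(n - 1)) {
      for (q in (p + 1):n) {
        near <- abs(mentions$line[q] - mentions$line[p]) <= window
        different <- mentions$name[p] != mentions$name[q]
        if (near && different) {
          # Sort the two names so that A-B and B-A share one undirected key.
          key <- paste(sort(c(mentions$name[p], mentions$name[q])), collapse = "|")
          current <- weights[key]
          weights[key] <- if (is.na(current)) 1 else current + 1
        }
      }
    }
  }

  parts <- strsplit(as.character(names(weights)), "|", fixed = TRUE)
  data.frame(
    Source = vapply(parts, `[`, character(1), 1),
    Target = vapply(parts, `[`, character(1), 2),
    Weight = as.numeric(weights),
    stringsAsFactors = FALSE
  )
}

# Made-up text: the last line is more than 5 lines away from the first two.
sample_lines <- c(
  "Geralt met Ciri on the road.",
  "Yennefer arrived later.",
  rep("The wind howled.", 6),
  "Geralt rode on alone."
)
toy_cooccurrence <- cooccurrence_edges(sample_lines, c("Geralt", "Ciri", "Yennefer"), window = 5)
toy_cooccurrence
```

In this toy text, the final mention of Geralt falls outside the window of every other mention, so it adds no edges.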

The key variables are:

Source & Target: The two characters who have had an interaction. Together, the source and target form a character set.
Type: The type of connection, directed or undirected. All of the connections in this data set are undirected, likely due to the manner of data collection.
Weight: Number of interactions between two characters.
Book: Number of the book in which this character set occurs (a character set may occur in multiple books).

The primary audience for my analysis would be people who are interested in the Witcher series or who are interested in SNA for literary analysis. The second group could use this as a model for how to conduct similar analyses in the future.

Load Libraries

The first step in preparing myself for actually working with the data is to load the relevant libraries.

library(tidygraph)

## 
## Attaching package: 'tidygraph'

## The following object is masked from 'package:stats':
## 
##     filter

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks tidygraph::filter(), stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ggraph)
library(igraph)

## 
## Attaching package: 'igraph'
## 
## The following objects are masked from 'package:lubridate':
## 
##     %--%, union
## 
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## 
## The following objects are masked from 'package:purrr':
## 
##     compose, simplify
## 
## The following object is masked from 'package:tidyr':
## 
##     crossing
## 
## The following object is masked from 'package:tibble':
## 
##     as_data_frame
## 
## The following object is masked from 'package:tidygraph':
## 
##     groups
## 
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## 
## The following object is masked from 'package:base':
## 
##     union

library(janitor)

## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

library(tidytext)
library(dplyr)

Wrangle

Before working with my data, I had to make a few changes. This data set did not require a lot of wrangling because the raw data was in an easily usable format. I didn’t have to add new variables or features at this point in my project, although I do add a few during my actual analysis of my networks. My main focus was on building network objects for the entire Witcher series and for each individual book.

Import Data

I uploaded the “witcher_network.csv” file I got from Kaggle to my project and then imported it as a data file. This is my raw data before doing any wrangling.

witcher_raw <- read.csv("witcher_network.csv")

Create Edgelists

My next step is to create edgelists. I decided to make an edgelist for the entire series and for each of the seven books. This made the data more manageable later in my project and made it less work to evaluate each book as I continued working. The only change from the raw data that I needed to make for the entire series was to remove the first column, labeled “X”. This column’s purpose was to number the entries and it interfered with making nodelists later. From there, I made an edgelist for each book by filtering on the “book” feature in the data.

witcher_edgelist <- witcher_raw |>
  select(!X)

book_1 <- witcher_edgelist |>
  filter(book == "1")

book_2 <- witcher_edgelist |>
  filter(book == "2")

book_3 <- witcher_edgelist |>
  filter(book == "3")

book_4 <- witcher_edgelist |>
  filter(book == "4")

book_5 <- witcher_edgelist |>
  filter(book == "5")

book_6 <- witcher_edgelist |>
  filter(book == "6")

book_7 <- witcher_edgelist |>
  filter(book == "7")

Create Nodelists

After having all of my edgelists, I went on to make my nodelists. I selected the “Source” and “Target” features and then reshaped the data frame to have a single column of the “actors” in the network. I repeated this for all of the edgelists I created so I would have a pair for the entire series and for each book.

actors <- witcher_edgelist |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_1 <- book_1 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_2 <- book_2 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_3 <- book_3 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_4 <- book_4 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_5 <- book_5 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_6 <- book_6 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()

actors_7 <- book_7 |>
  select(Source, Target) |>
  pivot_longer(cols = c(Source, Target)) |>
  select(value) |>
  rename(actors = value) |> 
  distinct()
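Because the edgelist and nodelist steps repeat the same pattern for every book, they can also be written once and applied per book. The following is a minimal base R sketch under stated assumptions: the toy edgelist values are made up, and `make_actors`, `edges_by_book`, and `actors_by_book` are hypothetical names that do not appear in the original analysis.

```r
# Toy edgelist with the same column names as the case study (values are made up).
toy_edgelist <- data.frame(
  Source = c("Geralt", "Geralt", "Ciri", "Ciri"),
  Target = c("Yennefer", "Ciri", "Yennefer", "Emhyr"),
  Weight = c(3, 2, 1, 4),
  book   = c(1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# One edgelist per book, stored in a list keyed by book number.
edges_by_book <- split(toy_edgelist, toy_edgelist$book)

# Nodelist helper: every distinct name that appears as Source or Target.
# Transposing interleaves Source and Target row by row, which keeps the
# order of first appearance, as pivot_longer() followed by distinct() does.
make_actors <- function(edges) {
  pairs <- as.matrix(edges[, c("Source", "Target")])
  data.frame(actors = unique(as.vector(t(pairs))), stringsAsFactors = FALSE)
}

# One nodelist per book.
actors_by_book <- lapply(edges_by_book, make_actors)
actors_by_book[["2"]]
```

Keeping the per-book objects in lists means that later steps can loop over them instead of being copied seven times.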

Create Network Objects

Next, I made network objects for the entire series and each individual book. This was necessary so I could analyze the network data.

witcher_network <- tbl_graph(edges = witcher_edgelist, 
                          nodes = actors, 
                          directed = FALSE)

book_1_network <- tbl_graph(edges = book_1, 
                          nodes = actors_1, 
                          directed = FALSE)

book_2_network <- tbl_graph(edges = book_2, 
                          nodes = actors_2, 
                          directed = FALSE)

book_3_network <- tbl_graph(edges = book_3, 
                          nodes = actors_3, 
                          directed = FALSE)

book_4_network <- tbl_graph(edges = book_4, 
                          nodes = actors_4, 
                          directed = FALSE)

book_5_network <- tbl_graph(edges = book_5, 
                          nodes = actors_5, 
                          directed = FALSE)

book_6_network <- tbl_graph(edges = book_6, 
                          nodes = actors_6, 
                          directed = FALSE)

book_7_network <- tbl_graph(edges = book_7, 
                          nodes = actors_7, 
                          directed = FALSE)

Analyze

Size

The first part of my analysis involved looking at the size of each of the networks. I added the feature “size” to each of my networks and looked at the local size for my actors. Local size is the number of actors within one step of a node; tidygraph's local_size() includes the node itself in that count, so Geralt's size of 59 in book 1 corresponds to 58 distinct neighbors. In order of largest local size, the first three actors for each network are:

Complete Series (224 actors): Geralt, Ciri, and Yennefer
Book 1 (67 actors): Geralt, Calanthe, and Pavetta
Book 2 (36 actors): Geralt, Dandelion, and Ciri
Book 3 (56 actors): Ciri, Geralt, and Yennefer
Book 4 (83 actors): Ciri, Emhyr, and Geralt
Book 5 (76 actors): Geralt, Dandelion, and Milva
Book 6 (72 actors): Ciri, Geralt, and Emhyr
Book 7 (117 actors): Ciri, Geralt, and Yennefer

The size data shows how the main characters shift throughout the series and shows the network size for all of the books. Book 2 has the smallest network and book 7 has the largest network, apart from considering the entire series as a whole.
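To make the idea of local size concrete, here is a small base R sketch that counts distinct neighbors from an edgelist. It is an illustration on made-up data, not the tidygraph implementation: `neighbors_of`, `local_size_sketch`, and `edge_rows_for` are hypothetical helper names, and adding one for the node itself is my understanding of how local_size() counts at its default settings.

```r
# Toy edgelist (made up) in which the Geralt-Ciri and Geralt-Yennefer pairs appear twice.
toy_edges <- data.frame(
  Source = c("Geralt", "Ciri", "Geralt", "Geralt", "Yennefer"),
  Target = c("Ciri", "Geralt", "Yennefer", "Dandelion", "Geralt"),
  stringsAsFactors = FALSE
)

# Distinct actors directly connected to `actor`, ignoring edge direction.
neighbors_of <- function(edges, actor) {
  partners <- c(edges$Target[edges$Source == actor],
                edges$Source[edges$Target == actor])
  unique(partners[partners != actor])
}

# Assumed convention: the neighborhood includes the node itself.
local_size_sketch <- function(edges, actor) {
  length(neighbors_of(edges, actor)) + 1
}

# Number of edgelist rows that touch `actor`; repeated pairs count every time.
edge_rows_for <- function(edges, actor) {
  sum(edges$Source == actor | edges$Target == actor)
}

local_size_sketch(toy_edges, "Geralt")
edge_rows_for(toy_edges, "Geralt")
```

The two counts differ whenever a pair of characters appears in more than one row, which is worth keeping in mind when comparing local size with other per-node counts.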

# total

witcher_network <- witcher_network |>
  activate(nodes) |>
  mutate(size = local_size())

witcher_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 224 × 2
##    actors      size
##    <chr>      <dbl>
##  1 Geralt       138
##  2 Ciri         103
##  3 Yennefer      73
##  4 Dandelion     70
##  5 Emhyr         63
##  6 Triss         46
##  7 Philippa      46
##  8 Vilgefortz    45
##  9 King          44
## 10 Calanthe      40
## # ℹ 214 more rows

# book 1

book_1_network <- book_1_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_1_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 67 × 2
##    actors     size
##    <chr>     <dbl>
##  1 Geralt       59
##  2 Calanthe     15
##  3 Pavetta      12
##  4 Yennefer     11
##  5 Eist         11
##  6 Crach        10
##  7 Rainfarn      9
##  8 Mousesack     9
##  9 Stregobor     8
## 10 Renfri        8
## # ℹ 57 more rows

# book 2

book_2_network <- book_2_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_2_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 36 × 2
##    actors     size
##    <chr>     <dbl>
##  1 Geralt       31
##  2 Dandelion    19
##  3 Ciri         13
##  4 Braenn       10
##  5 Yurga        10
##  6 Yennefer      8
##  7 Eithné        8
##  8 Sir           7
##  9 Agloval       7
## 10 Essi          7
## # ℹ 26 more rows

# book 3

book_3_network <- book_3_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_3_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 56 × 2
##    actors     size
##    <chr>     <dbl>
##  1 Ciri         36
##  2 Geralt       26
##  3 Yennefer     23
##  4 Dandelion    19
##  5 Rience       17
##  6 Calanthe     14
##  7 Triss        14
##  8 King         13
##  9 Foltest      13
## 10 Yarpen       11
## # ℹ 46 more rows

# book 4

book_4_network <- book_4_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_4_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 83 × 2
##    actors      size
##    <chr>      <dbl>
##  1 Ciri          42
##  2 Emhyr         36
##  3 Geralt        33
##  4 Yennefer      29
##  5 Vilgefortz    23
##  6 Philippa      20
##  7 Codringher    19
##  8 Sabrina       19
##  9 Gar           19
## 10 Tissaia       19
## # ℹ 73 more rows

# book 5

book_5_network <- book_5_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_5_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 76 × 2
##    actors     size
##    <chr>     <dbl>
##  1 Geralt       36
##  2 Dandelion    29
##  3 Milva        28
##  4 Ciri         27
##  5 Francesca    26
##  6 Assire       24
##  7 Philippa     22
##  8 Yennefer     22
##  9 Emhyr        21
## 10 Cahir        21
## # ℹ 66 more rows

# book 6

book_6_network <- book_6_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_6_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 72 × 2
##    actors      size
##    <chr>      <dbl>
##  1 Ciri          38
##  2 Geralt        29
##  3 Emhyr         22
##  4 Baron         16
##  5 Falka         14
##  6 Yennefer      14
##  7 Cahir         14
##  8 Vilgefortz    14
##  9 Rience        14
## 10 King          13
## # ℹ 62 more rows

# book 7

book_7_network <- book_7_network |>
  activate(nodes) |>
  mutate(size = local_size())

book_7_network |> 
  as_tibble() |>
  arrange(desc(size)) |> 
  select(actors, size)

## # A tibble: 117 × 2
##    actors      size
##    <chr>      <dbl>
##  1 Ciri          50
##  2 Geralt        45
##  3 Yennefer      29
##  4 Dandelion     28
##  5 Emhyr         27
##  6 King          26
##  7 Zoltan        24
##  8 Triss         23
##  9 Count         22
## 10 Vilgefortz    22
## # ℹ 107 more rows

Density

Next, I looked at the density of all of my networks. Density is the number of actual ties divided by the number of possible ties in the network. I didn’t add density as a feature to my network since I wasn’t considering using it when graphing my network models later in my analysis. The density values (rounded to 3 decimal places) for each of the networks are:

Complete Series: 0.104
Book 1: 0.120
Book 2: 0.252
Book 3: 0.195
Book 4: 0.125
Book 5: 0.158
Book 6: 0.132
Book 7: 0.098

The most dense social network is found in book 2 and the least dense is found in book 7. This aligns with book 2 having the fewest actors and book 7 having the most actors of any single book. Overall, the networks are not dense. This makes sense for a series with a few characters around whom most of the action is centered and side characters who don’t interact very much with each other.
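The density definition used in this section can be written out as a short calculation. The sketch below assumes an undirected network, where the number of possible ties among n actors is n(n - 1)/2; `edge_density_sketch` is a hypothetical helper, and the tie count of 159 for book 2 is inferred from the reported density and actor count rather than taken from the data.

```r
# Density of an undirected network: actual ties divided by possible ties.
edge_density_sketch <- function(n_actors, n_ties) {
  possible_ties <- n_actors * (n_actors - 1) / 2
  n_ties / possible_ties
}

# Book 2 has 36 actors, so 36 * 35 / 2 = 630 ties are possible.
# An inferred count of 159 ties gives 159 / 630, in line with the
# value of 0.252381 reported for book 2.
edge_density_sketch(36, 159)
```

Because every row of the edgelist counts as a tie, repeated character pairs would raise this ratio, so the values are best read as the density of the edgelist as recorded.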

# total

edge_density(witcher_network)

## [1] 0.1040999

# book 1

edge_density(book_1_network)

## [1] 0.1198553

# book 2

edge_density(book_2_network)

## [1] 0.252381

# book 3

edge_density(book_3_network)

## [1] 0.1948052

# book 4

edge_density(book_4_network)

## [1] 0.1254775

# book 5

edge_density(book_5_network)

## [1] 0.1578947

# book 6

edge_density(book_6_network)

## [1] 0.1318466

# book 7

edge_density(book_7_network)

## [1] 0.09755379

Centrality

Centrality examines how much the network revolves around a specific node. I looked at three centrality measures: degree, closeness, and betweenness centrality:

Degree centrality is a measure of the total ties connected to a node. A node with high degree centrality has many ties, and a network in which a few nodes hold most of the ties is highly centralized, while one in which ties are spread evenly is decentralized.
Closeness centrality is a measure of how close a node is to all the other nodes in the network, based on its distances to them. As computed here, it is the reciprocal of the summed distances, so the higher the closeness centrality, the closer a node is to all other nodes in the network.
Betweenness centrality measures how often a node lies on the shortest path between two other nodes. The higher the betweenness centrality, the more often the node lies on the shortest paths between other nodes.
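A toy example may help with reading the degree and closeness values. The base R sketch below works on a made-up four-node path (A - B - C - D) and treats closeness as the reciprocal of a node's summed shortest-path distances, which is my understanding of the unnormalized value that igraph reports; `bfs_distances`, `closeness_sketch`, and `degree_sketch` are hypothetical helpers, not functions from the packages used in this project.

```r
# Toy undirected path graph: A - B - C - D (made up for illustration).
toy_path <- data.frame(
  from = c("A", "B", "C"),
  to   = c("B", "C", "D"),
  stringsAsFactors = FALSE
)

# Shortest-path distances from `start` to every node, by breadth-first search.
bfs_distances <- function(edges, start) {
  nodes <- unique(c(edges$from, edges$to))
  dist <- setNames(rep(NA_integer_, length(nodes)), nodes)
  dist[start] <- 0L
  frontier <- start
  while (length(frontier) > 0) {
    next_frontier <- character(0)
    for (node in frontier) {
      nbrs <- c(edges$to[edges$from == node], edges$from[edges$to == node])
      for (nb in nbrs) {
        if (is.na(dist[nb])) {
          dist[nb] <- dist[node] + 1L
          next_frontier <- c(next_frontier, nb)
        }
      }
    }
    frontier <- next_frontier
  }
  dist
}

# Closeness as the reciprocal of the summed distances to reachable nodes.
closeness_sketch <- function(edges, node) {
  d <- bfs_distances(edges, node)
  1 / sum(d[names(d) != node], na.rm = TRUE)
}

# Degree as the number of edge rows that touch the node.
degree_sketch <- function(edges, node) {
  sum(edges$from == node) + sum(edges$to == node)
}

closeness_sketch(toy_path, "B")  # distances 1, 1, 2 sum to 4
closeness_sketch(toy_path, "A")  # distances 1, 2, 3 sum to 6
degree_sketch(toy_path, "B")
```

B sits nearer the middle of the path than A, and its closeness is the larger of the two, which illustrates why larger values indicate a more central position under this convention.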

I arranged the tibbles in order of descending degree centrality so that I could look at the characters most central to the network by degree. These values were added to the networks as additional features.

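The top characters according to degree centrality for each network were almost exactly the same as the top characters according to local size. The degree values are larger than the local sizes (404 versus 138 for Geralt in the complete series), which suggests that degree counts every edge in the edgelist, including repeated character pairs, while local size counts each neighbor once. If we look just at Geralt and Ciri, we can see the trends in their centrality over each book: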

Geralt

Complete Series: degree: 404; closeness: 0.00326; betweenness: 10855
Book 1: degree: 99; closeness: 0.0135; betweenness: 1896
Book 2: degree: 56; closeness: 0.0250; betweenness: 362
Book 3: degree: 41; closeness: 0.0115; betweenness: 205
Book 4: degree: 50; closeness: 0.00763; betweenness: 364
Book 5: degree: 48; closeness: 0.00885; betweenness: 483
Book 6: degree: 41; closeness: 0.00806; betweenness: 489
Book 7: degree: 69; closeness: 0.00500; betweenness: 1226

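While Geralt maintains a relatively high degree centrality throughout the whole series, it ebbs and flows, dropping from 99 in book 1 to 41 in books 3 and 6 before rising to 69 in book 7. His raw closeness peaks at 0.0250 in book 2 and falls to 0.00500 by book 7. Because the closeness reported here is the reciprocal of a node's total distance to all other nodes, a smaller value means Geralt is farther from the other characters, although raw values also shrink as a network gains actors, so part of this decline reflects the larger casts of the later books. Lastly, Geralt's betweenness is highest in book 1, falls through book 3, and then rises from book 4 onward as the series finishes up.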

Ciri

Complete Series: degree: 331; closeness: 0.00289; betweenness: 4726
Book 1: Not in book 1
Book 2: degree: 19; closeness: 0.0169; betweenness: 79
Book 3: degree: 59; closeness: 0.0128; betweenness: 581
Book 4: degree: 65; closeness: 0.00826; betweenness: 1067
Book 5: degree: 44; closeness: 0.00826; betweenness: 351
Book 6: degree: 63; closeness: 0.00935; betweenness: 999
Book 7: degree: 81; closeness: 0.00521; betweenness: 1663

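Ciri becomes increasingly central to the plot as the series progresses, from not appearing in the book 1 network to having the highest degree and betweenness centrality in book 7. Her raw closeness centrality decreases over the same span, from 0.0169 in book 2 to 0.00521 in book 7. Since closeness here is the reciprocal of total distance, lower values mean greater distance from the other actors, and because raw values are not adjusted for network size, this decline likely reflects the growing casts of the later books more than Ciri drifting toward the periphery.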
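Raw closeness values are hard to compare across books because the networks differ in size. One common adjustment multiplies raw closeness by the number of other actors (n - 1). The sketch below applies that adjustment to the closeness values and actor counts reported in this case study; it assumes each book's network is connected, and `normalize_closeness` and `closeness_normalized` are hypothetical names that are not part of the original analysis.

```r
# Actor counts and raw closeness values as reported in this case study.
actors_per_book <- c(67, 36, 56, 83, 76, 72, 117)
geralt_raw <- c(0.0135, 0.0250, 0.0115, 0.00763, 0.00885, 0.00806, 0.00500)
ciri_raw   <- c(NA, 0.0169, 0.0128, 0.00826, 0.00826, 0.00935, 0.00521)  # not in book 1

# Normalized closeness: raw closeness times the number of other actors.
# Assumption: every actor can reach every other actor in the network.
normalize_closeness <- function(raw, n_actors) {
  raw * (n_actors - 1)
}

closeness_normalized <- data.frame(
  book   = 1:7,
  geralt = normalize_closeness(geralt_raw, actors_per_book),
  ciri   = normalize_closeness(ciri_raw, actors_per_book)
)
closeness_normalized
```

On this scale a value of 1 would mean a character is directly connected to every other actor in that book, so the books can be compared without the size of the cast dominating the result.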

# total
# degree centrality

witcher_network <- witcher_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

witcher_network <- witcher_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

witcher_network <- witcher_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(witcher_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 224 × 4
##    actors     degree closeness betweenness
##    <chr>       <dbl>     <dbl>       <dbl>
##  1 Geralt        404   0.00326      10855.
##  2 Ciri          331   0.00289       4726.
##  3 Yennefer      208   0.00262       1579.
##  4 Dandelion     188   0.00262       1998.
##  5 Emhyr         150   0.00254       1374.
##  6 Philippa      125   0.00242        765.
##  7 Triss         108   0.00243        333.
##  8 King          103   0.00242        681.
##  9 Vilgefortz    102   0.00240        516.
## 10 Calanthe       98   0.00236        645.
## # ℹ 214 more rows

# book 1
# degree centrality

book_1_network <- book_1_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_1_network <- book_1_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_1_network <- book_1_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_1_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 67 × 4
##    actors    degree closeness betweenness
##    <chr>      <dbl>     <dbl>       <dbl>
##  1 Geralt        99   0.0135      1896.  
##  2 Calanthe      21   0.00826      108.  
##  3 Pavetta       18   0.00806       25.9 
##  4 Yennefer      17   0.00769       12.0 
##  5 Renfri        14   0.00769       36.2 
##  6 Eist          14   0.00787       15.5 
##  7 Crach         14   0.00781        1.87
##  8 Chireadan     13   0.00752        2.22
##  9 Rainfarn      12   0.00775       13.1 
## 10 Nenneke       11   0.00746        2.92
## # ℹ 57 more rows

# book 2
# degree centrality

book_2_network <- book_2_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_2_network <- book_2_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_2_network <- book_2_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_2_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 36 × 4
##    actors    degree closeness betweenness
##    <chr>      <dbl>     <dbl>       <dbl>
##  1 Geralt        56    0.025       362.  
##  2 Dandelion     31    0.0192       94.7 
##  3 Ciri          19    0.0169       79.4 
##  4 Yurga         16    0.0161       22.6 
##  5 Braenn        13    0.0161       12.2 
##  6 Eithné        13    0.0156        5.22
##  7 Agloval       12    0.0147        1.47
##  8 Yennefer      11    0.0154       22.1 
##  9 Calanthe      11    0.0154        3.05
## 10 Essi          10    0.0149        1.19
## # ℹ 26 more rows

# book 3
# degree centrality

book_3_network <- book_3_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_3_network <- book_3_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_3_network <- book_3_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_3_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 56 × 4
##    actors    degree closeness betweenness
##    <chr>      <dbl>     <dbl>       <dbl>
##  1 Ciri          59   0.0128        581. 
##  2 Geralt        41   0.0115        205. 
##  3 Yennefer      35   0.0109        208. 
##  4 Dandelion     31   0.0103        189. 
##  5 Rience        23   0.0103        138. 
##  6 Calanthe      21   0.0101         83.5
##  7 Triss         20   0.00935        21.6
##  8 Foltest       18   0.00893       131. 
##  9 Yarpen        16   0.00901        43.4
## 10 King          15   0.00971        34.4
## # ℹ 46 more rows

# book 4
# degree centrality

book_4_network <- book_4_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_4_network <- book_4_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_4_network <- book_4_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_4_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 83 × 4
##    actors     degree closeness betweenness
##    <chr>       <dbl>     <dbl>       <dbl>
##  1 Ciri           65   0.00826      1067. 
##  2 Geralt         50   0.00763       364. 
##  3 Emhyr          46   0.00794      1006. 
##  4 Yennefer       46   0.00730       290. 
##  5 Philippa       30   0.00645        49.1
##  6 Tissaia        29   0.00676        97.5
##  7 Vilgefortz     29   0.00699       223. 
##  8 Sabrina        28   0.00621        94.5
##  9 Codringher     25   0.00676       118. 
## 10 Gar            25   0.00621       112. 
## # ℹ 73 more rows

# book 5
# degree centrality

book_5_network <- book_5_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_5_network <- book_5_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_5_network <- book_5_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_5_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 76 × 4
##    actors    degree closeness betweenness
##    <chr>      <dbl>     <dbl>       <dbl>
##  1 Geralt        48   0.00885       483. 
##  2 Milva         45   0.00806       426. 
##  3 Dandelion     45   0.00826       343. 
##  4 Ciri          44   0.00826       351. 
##  5 Francesca     37   0.008         275. 
##  6 Assire        37   0.00775       211. 
##  7 Yennefer      34   0.00787       177. 
##  8 Philippa      33   0.00752       165. 
##  9 Cahir         28   0.00781       152. 
## 10 Fringilla     28   0.00752        85.0
## # ℹ 66 more rows

# book 6
# degree centrality

book_6_network <- book_6_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_6_network <- book_6_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_6_network <- book_6_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_6_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 72 × 4
##    actors     degree closeness betweenness
##    <chr>       <dbl>     <dbl>       <dbl>
##  1 Ciri           63   0.00935       999. 
##  2 Geralt         41   0.00806       489. 
##  3 Emhyr          25   0.00787       394. 
##  4 Yennefer       24   0.00694        49.3
##  5 Falka          20   0.00694       115. 
##  6 Baron          20   0.00704       110. 
##  7 Rience         20   0.00694       110. 
##  8 Cahir          19   0.00730        87.1
##  9 Crach          18   0.00671        22.2
## 10 Vilgefortz     18   0.00699       131. 
## # ℹ 62 more rows

# book 7
# degree centrality

book_7_network <- book_7_network |>
  activate(nodes) |>
  mutate(degree = centrality_degree(mode = "all"))

# closeness centrality

book_7_network <- book_7_network |>
  activate(nodes) |>
  mutate(closeness = centrality_closeness(mode = "all"))

# betweenness centrality

book_7_network <- book_7_network |>
  activate(nodes) |>
  mutate(betweenness = centrality_betweenness(directed = FALSE))

as_tibble(book_7_network) |>
  arrange(desc(degree)) |> 
  select(actors, degree, closeness, betweenness)

## # A tibble: 117 × 4
##    actors    degree closeness betweenness
##    <chr>      <dbl>     <dbl>       <dbl>
##  1 Ciri          81   0.00521       1663.
##  2 Geralt        69   0.005         1226.
##  3 Dandelion     45   0.00446        434.
##  4 Emhyr         42   0.00437        500.
##  5 Yennefer      41   0.00426        323.
##  6 Fringilla     36   0.004          211.
##  7 King          35   0.00444        489.
##  8 Philippa      33   0.00385        293.
##  9 Triss         32   0.00424        190.
## 10 Jarre         32   0.00426        713.
## # ℹ 107 more rows

Network Models

For my network models, I decided to model each of my networks in the same way. I mapped degree to both the size and the color of each node and added a label for each character in the network. These models show the changes seen in the size, density, and centrality measures in a more readable fashion.

Complete Series Network

ggraph(witcher_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## Warning: ggrepel: 93 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Book 1 Network

Book 2 Network

ggraph(book_2_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

Book 3 Network

ggraph(book_3_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

Book 4 Network

ggraph(book_4_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

## Warning: ggrepel: 8 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Book 5 Network

ggraph(book_5_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

## Warning: ggrepel: 4 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Book 6 Network

ggraph(book_6_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

Book 7 Network

ggraph(book_7_network, layout = "fr") + 
  geom_node_point(aes(size = degree,
                      color = degree)) +
  geom_node_text(aes(label = actors,
                     size = degree/2,
                     color = degree),
                 repel=TRUE) +
  geom_edge_link(arrow = NULL, 
                 end_cap = circle(3, 'mm'),
                 alpha = .3) +
    theme_graph()

## Warning: ggrepel: 12 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Communicate

Key Findings and Insights

Research Question: How does the network structure change over the course of the seven books?

The network structure changes in size over the course of the series and it changes in who is most central to the network. In the beginning, the series is focused most prominently on Geralt; however, as the series continues, Ciri begins to take a more prominent role. Outside of those two characters, the series gradually becomes more and more decentralized, which shows the increasing depth and development of many characters throughout the series.

Potential Action

Since this project isn’t focused on research that could improve an educational setting, this is a bit tricky. This type of analysis could be conducted by other people interested in literary research or by people who want to use it as educational material. In these cases, it’s important to have a good data set to work with. Additionally, they might want to collect information about the specific characters that could make the analysis more intricate, such as gender, age, and profession. These demographic features would allow analyses that look for trends in relationships that would otherwise not be possible, such as examining the social networks for just women or just men in the series.

Limitations

I can’t think of any limitations of this data set for the particular question I set out to answer. Additionally, I don’t think there are any ethical or legal issues. The data set is open-source and the network is mapping the relationships of fictional characters.

References

Sadasivan, A. (2022). Witcher network. [Data set]. Kaggle. https://www.kaggle.com/datasets/avasadasivan/witcher-network?resource=download

ECI 589 Final Project

Grace Wiedrich

2023-04-29
