Identify a data source

The data source I selected for my independent analysis was the Fraternity Data from Ch. 5 of “Social Network Analysis and Education: Theory, Methods & Applications”.

Formulate a question

There are several things we know about the Ch. 5 Fraternity data. First, the recorded data set is directed (asymmetric) and binary (nonvalued). Second, the network is fairly small (17 actors) but well connected, with no isolates. Third, it has few reciprocal friendships and a varying “thickness” (density), which indicates subgroups, or “cliques and clans”.

For this analysis, I am interested in reciprocity and the inferences it supports about this network. From the text, we also know that the reciprocity of this network is 0.31. A network’s reciprocity can tell us about the flow of resources, stability, and hierarchy.
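To make the measure concrete, here is a minimal sketch on a hypothetical four-actor network (the actors A-D and their ties are invented for illustration and are not part of the fraternity data). igraph's reciprocity() supports two common definitions, an edge-based one and a dyad-based one. Which of these the text's 0.31 corresponds to is not stated here, so both are shown.

```r
library(igraph)

# Hypothetical toy network: A and B nominate each other,
# while A -> C and C -> D are not returned.
edges <- matrix(c("A", "B",
                  "B", "A",
                  "A", "C",
                  "C", "D"), ncol = 2, byrow = TRUE)
toy <- graph_from_edgelist(edges, directed = TRUE)

# Edge-based: share of edges whose reverse edge also exists (2 of 4 edges)
edge_based <- reciprocity(toy, mode = "default")

# Dyad-based: mutual dyads / (mutual + asymmetric dyads) (1 of 3 dyads)
dyad_based <- reciprocity(toy, mode = "ratio")

edge_based
dyad_based
```

The two definitions can differ noticeably on the same network, so it is worth noting which one is being reported.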

RQ: Which actors have reciprocal friendship rankings and what can be inferred from them?

Analyze the data

First, we need to load libraries.

install.packages("tidyverse")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.1'
## (as 'lib' is unspecified)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ dplyr   1.0.7
## ✓ tidyr   1.2.0     ✓ stringr 1.4.0
## ✓ readr   2.1.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(igraph)
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## The following objects are masked from 'package:purrr':
## 
##     compose, simplify
## The following object is masked from 'package:tidyr':
## 
##     crossing
## The following object is masked from 'package:tibble':
## 
##     as_data_frame
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
library(tidygraph)
## 
## Attaching package: 'tidygraph'
## The following object is masked from 'package:igraph':
## 
##     groups
## The following object is masked from 'package:stats':
## 
##     filter
library(ggraph)
library(readxl)

Now that our libraries have been loaded, we’ll read in our selected data set. For now, we’ll keep the file’s name as the object name.

Fraternity_Data_Chapter_5 <- read_excel("data/Fraternity Data Chapter 5.xlsx", col_names = FALSE)
## New names:
## * `` -> ...1
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * ...
Fraternity_Data_Chapter_5
## # A tibble: 17 × 17
##     ...1  ...2  ...3  ...4  ...5  ...6  ...7  ...8  ...9 ...10 ...11 ...12 ...13
##    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1     0     0     0     0     0     0     0     0     0     0     1     0     1
##  2     0     0     0     1     0     0     1     0     0     0     0     0     0
##  3     0     0     0     0     0     0     0     0     0     0     1     1     0
##  4     0     1     0     0     0     0     1     0     0     0     0     0     0
##  5     0     0     0     0     0     0     0     0     0     0     1     1     0
##  6     0     0     0     1     0     0     0     1     0     0     0     0     1
##  7     0     0     0     1     0     0     0     0     0     0     0     1     0
##  8     0     0     0     0     0     1     0     0     0     1     1     0     0
##  9     0     0     0     0     0     0     0     0     0     0     1     1     0
## 10     1     0     0     0     0     0     1     0     0     0     0     0     0
## 11     0     0     0     0     0     0     0     0     1     0     0     1     0
## 12     0     0     1     0     0     0     0     0     0     0     1     0     0
## 13     1     0     0     0     0     1     0     0     0     0     0     0     0
## 14     0     0     0     0     0     0     1     0     1     1     0     0     0
## 15     0     0     0     0     1     0     0     0     0     1     1     0     0
## 16     0     0     0     1     0     0     0     0     1     0     1     0     0
## 17     0     0     0     1     0     0     0     0     1     0     0     1     0
## # … with 4 more variables: ...14 <dbl>, ...15 <dbl>, ...16 <dbl>, ...17 <dbl>

Our data appears as a 17 x 17 tibble, as expected. Because the file was read without headers, the columns carry placeholder names (...1, ...2, and so on) that do not identify the actors, so we’ll rename both the rows and the columns. For simplicity, we’ll use 1-17 for each.

rownames(Fraternity_Data_Chapter_5) <- 1:17
## Warning: Setting row names on a tibble is deprecated.
colnames(Fraternity_Data_Chapter_5) <- 1:17

Next, we’ll convert the data into a matrix and check its class to confirm the conversion worked.

fraternity_matrix <- as.matrix(Fraternity_Data_Chapter_5)

class(fraternity_matrix)
## [1] "matrix" "array"
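Beyond the class, a few structural checks can confirm that a matrix behaves like a directed, binary adjacency matrix. The sketch below uses a hypothetical 3 x 3 matrix named m so that it runs on its own. The same checks could be applied to fraternity_matrix, on the assumption that it holds only 0s and 1s.

```r
# Hypothetical 3 x 3 adjacency matrix (rows nominate columns)
m <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 0, 0), nrow = 3, byrow = TRUE,
            dimnames = list(as.character(1:3), as.character(1:3)))

is_square    <- nrow(m) == ncol(m)      # one row and one column per actor
is_binary    <- all(m %in% c(0, 1))     # nonvalued ties only
no_self_ties <- all(diag(m) == 0)       # nobody nominates themselves
is_directed  <- !isSymmetric(m)         # at least one tie is not returned
out_degree   <- rowSums(m)              # nominations made by each actor

out_degree
```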

Now we’ll convert the matrix we just created into a network object using as_tbl_graph() from the tidygraph library.

fraternity_network <- as_tbl_graph(fraternity_matrix, directed = TRUE)

fraternity_network
## # A tbl_graph: 17 nodes and 51 edges
## #
## # A directed simple graph with 1 component
## #
## # Node Data: 17 × 1 (active)
##   name 
##   <chr>
## 1 1    
## 2 2    
## 3 3    
## 4 4    
## 5 5    
## 6 6    
## # … with 11 more rows
## #
## # Edge Data: 51 × 3
##    from    to weight
##   <int> <int>  <dbl>
## 1     1    11      1
## 2     1    13      1
## 3     1    17      1
## # … with 48 more rows

Create a data product

Now that we have a network object, we’ll use the ggraph library to plot the data. Since we’re interested in reciprocity, our data product has a few specifications that make this information easier to digest. First, we’ve added arrows to indicate directionality. Second, we’ve labeled each node with the actor names assigned earlier to show where each actor sits in the network. Finally, we’ve added start and end caps so the arrows stop short of the labels, which makes it easy to scan how many nominations each actor received and which of them were reciprocated.

ggraph(fraternity_network) +
  geom_edge_link(arrow = arrow(length = unit(2, 'mm')),
                 start_cap = circle(5, 'mm'),
                 end_cap = circle(5, 'mm')) +
  geom_node_text(aes(label = name))
## Using `stress` as default layout
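One possible refinement is to color the edges by whether they are reciprocated, so mutual ties stand out without tracing individual arrows. The sketch below uses a hypothetical three-actor network and tidygraph’s edge_is_mutual(). The object names toy_edges, toy_network, and the edge column mutual are illustrative assumptions, not part of the original analysis.

```r
library(tidygraph)
library(ggraph)

# Hypothetical toy network: A and B nominate each other; A -> C is not returned
toy_edges <- data.frame(from = c("A", "B", "A"),
                        to   = c("B", "A", "C"))

toy_network <- as_tbl_graph(toy_edges, directed = TRUE) %>%
  activate(edges) %>%
  mutate(mutual = edge_is_mutual())  # TRUE when the reverse edge also exists

# Pull the edge table out to inspect the flags
mutual_flags <- toy_network %>% activate(edges) %>% as_tibble()

# Same plot specification as above, with edge colour mapped to the flag
p <- ggraph(toy_network, layout = "stress") +
  geom_edge_link(aes(colour = mutual),
                 arrow = arrow(length = unit(2, 'mm')),
                 start_cap = circle(5, 'mm'),
                 end_cap = circle(5, 'mm')) +
  geom_node_text(aes(label = name))
```

Applying the same activate(edges) and mutate() steps to fraternity_network should add the flag there as well.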

Key Findings and Potential Implications

Using the data product above, we can tell which nominations were reciprocal. They are as follows: [1] 1 and 13, [2] 2 and 4, [3] 3 and 12, [4] 4 and 7, [5] 4 and 17, [6] 6 and 8, [7] 6 and 13, [8] 7 and 10, [9] 9 and 11, [10] 9 and 17, [11] 10 and 15, and [12] 11 and 12.
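Reading pairs off a plot can be error-prone when arrows overlap, so it may help to confirm them from the adjacency matrix. A minimal sketch, using a hypothetical 4 x 4 matrix m in place of fraternity_matrix: multiplying the matrix element-wise by its transpose leaves a 1 only where both directions of a tie exist.

```r
# Hypothetical adjacency matrix: 1 <-> 2 and 3 <-> 4 are mutual, 1 -> 3 is not
m <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 0,
              0, 0, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE,
            dimnames = list(as.character(1:4), as.character(1:4)))

# m[i, j] * m[j, i] equals 1 only when i -> j and j -> i both exist
mutual <- m * t(m)

# Keep the upper triangle so each pair is listed once
idx <- which(mutual == 1 & upper.tri(mutual), arr.ind = TRUE)

mutual_pairs <- data.frame(actor_a = rownames(m)[idx[, "row"]],
                           actor_b = colnames(m)[idx[, "col"]])
mutual_pairs
```

Running the same steps on fraternity_matrix should return one row per reciprocal pair, which can then be compared against the list above.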

Actor 4 had the highest number of reciprocal nominations at three (with 2, 7, and 17). Since this data was collected at the beginning of the study, these ties suggest that the individuals had existing friendships. Interestingly, 2 nominated 7 and 7 nominated 17, but neither nomination was returned. This could suggest that actor 4 had a deeper friendship with these individuals or knew them before they met each other. Perhaps actor 4 is older or was in the fraternity before the others, which could be a reason the others joined.
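The count of reciprocal ties per actor can also be computed directly and set beside each actor’s in-degree (nominations received). The sketch below again uses a hypothetical 4 x 4 matrix m rather than the fraternity data, and the names mutual_count, in_degree, and summary_table are illustrative.

```r
# Hypothetical adjacency matrix: actor 1 has mutual ties with 2 and 3;
# 4 -> 1 and 2 -> 3 are not returned
m <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 0, 0, 0,
              1, 0, 0, 0), nrow = 4, byrow = TRUE,
            dimnames = list(as.character(1:4), as.character(1:4)))

mutual_count <- rowSums(m * t(m))  # reciprocated ties per actor
in_degree    <- colSums(m)         # nominations received per actor

summary_table <- data.frame(actor = rownames(m),
                            mutual_ties = mutual_count,
                            nominations_received = in_degree)

# Actor with the most reciprocated ties (first one listed if there is a tie)
top_actor <- names(which.max(mutual_count))
summary_table
```

An actor with many nominations received but few reciprocated ties would fit the hierarchy reading mentioned earlier, while a high count of mutual ties points to a close-knit core.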

This method of analysis, if fine-tuned, could be used by organizations (fraternities and others) to identify trends in recruitment numbers and patterns. It could also help identify members who are more likely to discontinue their affiliation.