Unit 2, Part 3: Independent Analysis

Step 1

Identify a data source.

I will be using the Peer_Groups_Data_Chapter_3_b dataset.

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ dplyr   1.0.7
## ✓ tidyr   1.2.0     ✓ stringr 1.4.0
## ✓ readr   2.1.2     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(readr)
Peer_Groups_Chapter_3_b_Sheet1_1_ <- read_csv(here::here("/cloud/project/Peer_Groups_Chapter_3_b - Sheet1 (1).csv"))

## Rows: 27 Columns: 27

## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (27): 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Peer_Groups_Data <- Peer_Groups_Chapter_3_b_Sheet1_1_

Peer_Groups_Data

## # A tibble: 27 × 27
##      `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`  `12`  `13`
##    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1     0     2     0     2     2     2     1     2     2     0     0     2     0
##  2     1     0     0     0     2     0     0     0     0     2     2     0     0
##  3     2     0     0     2     0     0     0     2     0     2     0     0     0
##  4     1     0     0     0     0     0     0     0     0     0     0     2     0
##  5     1     2     0     2     0     2     2     2     2     0     2     2     2
##  6     2     0     0     0     1     0     0     0     2     0     2     2     2
##  7     2     0     1     2     0     0     0     0     2     0     0     0     2
##  8     2     0     2     1     2     0     2     0     2     2     1     0     2
##  9     2     0     0     0     0     2     2     0     0     0     2     0     1
## 10     2     1     2     2     2     0     2     2     0     0     1     2     2
## # … with 17 more rows, and 14 more variables: `14` <dbl>, `15` <dbl>,
## #   `16` <dbl>, `17` <dbl>, `18` <dbl>, `19` <dbl>, `20` <dbl>, `21` <dbl>,
## #   `22` <dbl>, `23` <dbl>, `24` <dbl>, `25` <dbl>, `26` <dbl>, `27` <dbl>

Step 2

Formulate a question. I recommend keeping this simple and limiting to no more than one or two questions. Your question(s) should be appropriate to your data set and ideally be answered by applying concepts and skills from our course readings and case study. For example, you may be interested in determining measures of centrality for a network and identifying key actors.

RQ1: What is the reciprocity between “best” friend nominations (coded in this dataset as a 1)?

RQ2: Are there any students who might be identified as “at risk” (defined as receiving no friendship nominations of either 1 or 2)?

Step 3

Analyze the data.

The file located in the Data panel did not match the file in the course textbook resources: the column names and the first row of data had been entered into the same cell, which reduced the number of rows from 27 to 26. I therefore downloaded the original file from the textbook website and used that version in this R Markdown file.

# After examining my Peer_Groups_Data, I noted that the columns were named but the rows were not. Therefore, I used the rownames() function to add row names 1-27 and inspected the object.

rownames(Peer_Groups_Data) <- 1:27

## Warning: Setting row names on a tibble is deprecated.

Peer_Groups_Data

## # A tibble: 27 × 27
##      `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`  `12`  `13`
##  * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1     0     2     0     2     2     2     1     2     2     0     0     2     0
##  2     1     0     0     0     2     0     0     0     0     2     2     0     0
##  3     2     0     0     2     0     0     0     2     0     2     0     0     0
##  4     1     0     0     0     0     0     0     0     0     0     0     2     0
##  5     1     2     0     2     0     2     2     2     2     0     2     2     2
##  6     2     0     0     0     1     0     0     0     2     0     2     2     2
##  7     2     0     1     2     0     0     0     0     2     0     0     0     2
##  8     2     0     2     1     2     0     2     0     2     2     1     0     2
##  9     2     0     0     0     0     2     2     0     0     0     2     0     1
## 10     2     1     2     2     2     0     2     2     0     0     1     2     2
## # … with 17 more rows, and 14 more variables: `14` <dbl>, `15` <dbl>,
## #   `16` <dbl>, `17` <dbl>, `18` <dbl>, `19` <dbl>, `20` <dbl>, `21` <dbl>,
## #   `22` <dbl>, `23` <dbl>, `24` <dbl>, `25` <dbl>, `26` <dbl>, `27` <dbl>

# Next, with properly labeled columns and rows, I converted the dataset to a matrix using the as.matrix() function and inspected my new object.
Peer_Groups_Data <- as.matrix(Peer_Groups_Data)

Peer_Groups_Data

##    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
## 1  0 2 0 2 2 2 1 2 2  0  0  2  0  0  0  0  2  0  0  0  2  0  2  0  2  0  2
## 2  1 0 0 0 2 0 0 0 0  2  2  0  0  0  0  0  2  0  0  0  2  0  0  0  0  0  0
## 3  2 0 0 2 0 0 0 2 0  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## 4  1 0 0 0 0 0 0 0 0  0  0  2  0  0  0  2  0  0  0  0  2  0  0  0  0  2  0
## 5  1 2 0 2 0 2 2 2 2  0  2  2  2  2  0  0  2  2  0  0  2  1  0  0  2  0  2
## 6  2 0 0 0 1 0 0 0 2  0  2  2  2  2  0  0  0  0  0  0  0  0  0  0  0  0  0
## 7  2 0 1 2 0 0 0 0 2  0  0  0  2  0  0  0  0  0  0  0  0  0  2  0  0  0  0
## 8  2 0 2 1 2 0 2 0 2  2  1  0  2  0  0  0  0  0  0  0  2  0  0  2  2  2  2
## 9  2 0 0 0 0 2 2 0 0  0  2  0  1  0  0  0  0  1  0  1  2  0  2  2  2  2  2
## 10 2 1 2 2 2 0 2 2 0  0  1  2  2  0  0  0  2  0  0  0  2  2  0  0  0  0  2
## 11 2 2 1 2 2 2 2 2 2  2  0  2  2  2  0  0  0  2  0  1  2  0  2  0  2  0  2
## 12 2 1 0 2 2 2 0 0 0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  2  0  2
## 13 0 0 0 0 2 2 1 1 2  0  0  0  0  0  0  0  0  0  0  0  0  0  2  0  0  0  2
## 14 0 0 0 2 0 2 2 0 0  0  2  0  0  0  0  0  2  0  0  0  0  2  0  0  0  0  0
## 15 0 0 0 0 0 0 0 2 0  0  0  0  0  0  0  2  0  0  0  0  0  0  0  0  0  0  0
## 16 0 0 0 0 0 0 0 2 0  2  0  0  0  0  1  0  0  0  0  2  0  0  0  0  0  0  0
## 17 2 0 0 2 2 0 0 0 0  0  0  0  0  2  0  0  0  0  0  0  0  2  0  0  0  0  0
## 18 0 0 0 0 2 0 0 0 2  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  2  0  0
## 19 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## 20 0 0 0 0 0 0 0 0 0  0  1  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
## 21 2 2 0 2 2 0 2 0 2  0  0  0  0  2  0  0  0  2  0  0  0  0  2  0  2  2  2
## 22 0 1 0 0 0 0 0 0 0  0  0  0  0  1  0  0  2  0  0  0  2  0  0  0  2  2  0
## 23 2 0 0 0 0 0 0 0 2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  2  0  0
## 24 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
## 25 2 0 0 0 2 0 0 0 2  0  2  2  0  0  0  0  0  2  0  0  2  2  2  0  0  0  2
## 26 0 0 0 2 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  2  2  0  0  0  0  0
## 27 2 0 0 2 2 0 2 0 2  0  2  2  2  2  0  0  0  0  0  2  0  0  2  0  2  0  0
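
The row-name deprecation warning shown earlier can likely be avoided by converting to a matrix first and labeling the rows afterwards, since base R matrices support row names directly. The sketch below illustrates the idea on a small, made-up 3 x 3 table; `toy` and `toy_matrix` are hypothetical names and values, not the peer groups data.

```r
# Hypothetical 3 x 3 nomination table standing in for the 27 x 27 peer groups data.
toy <- data.frame(
  `1` = c(0, 1, 2),
  `2` = c(2, 0, 0),
  `3` = c(0, 2, 0),
  check.names = FALSE  # keep the column names as "1", "2", "3"
)

# Convert to a matrix first, then copy the column labels onto the rows.
toy_matrix <- as.matrix(toy)
rownames(toy_matrix) <- colnames(toy_matrix)

# A sociomatrix should be square, with matching row and column labels.
stopifnot(nrow(toy_matrix) == ncol(toy_matrix))
stopifnot(identical(rownames(toy_matrix), colnames(toy_matrix)))

toy_matrix
```

The two `stopifnot()` checks would also have flagged the 26-row version of the file, because that table was no longer square.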

Once the data were in a matrix, I used the as_tbl_graph() function from tidygraph to convert the matrix into a tbl_graph object that ggraph can plot.

library("ggraph")
library("tidygraph")

## 
## Attaching package: 'tidygraph'

## The following object is masked from 'package:stats':
## 
##     filter

Peer_Groups_Network <- as_tbl_graph(Peer_Groups_Data, directed = TRUE)

Peer_Groups_Network

## # A tbl_graph: 27 nodes and 203 edges
## #
## # A directed simple graph with 2 components
## #
## # Node Data: 27 × 1 (active)
##   name 
##   <chr>
## 1 1    
## 2 2    
## 3 3    
## 4 4    
## 5 5    
## 6 6    
## # … with 21 more rows
## #
## # Edge Data: 203 × 3
##    from    to weight
##   <int> <int>  <dbl>
## 1     1     2      2
## 2     1     4      2
## 3     1     5      2
## # … with 200 more rows
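
The edge table printed above suggests that every non-zero cell of the matrix became one directed edge, with the cell value stored as `weight`. As a rough illustration of that reading (an assumption based on the printed output, not a description of tidygraph's internals), the same kind of edge list can be built in base R from a small, made-up matrix.

```r
# Hypothetical 3-student sociomatrix: rows send nominations, columns receive them.
A <- matrix(c(0, 2, 0,
              1, 0, 0,
              2, 2, 0),
            nrow = 3, byrow = TRUE)

# Locate every non-zero cell; each one is a directed tie from row to column.
idx <- which(A > 0, arr.ind = TRUE)

# Assemble a from/to/weight table, then sort it by sender and receiver.
edges <- data.frame(from = idx[, "row"],
                    to = idx[, "col"],
                    weight = A[idx])
edges <- edges[order(edges$from, edges$to), ]
rownames(edges) <- NULL

edges
```

Counting the rows of such a table for the full 27 x 27 matrix is one way to confirm the number of edges reported for the network.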

Step 4

Create a data product. When you feel you’ve wrangled and analyzed the data to your satisfaction, create an R Markdown file that includes a polished sociogram and/or data table and a narrative highlighting your research question, data source, and key findings and potential implications. Your R Markdown file should include a polished sociogram and/or table, a title and narrative, and all code necessary to read, wrangle, and explore your data.

First, I created a sociogram of the full Peer Groups Network. I colored the edges by weight, which shows visually that there are far more “friend” nominations (weight = 2) within the network than “best friend” nominations (weight = 1). The visual also shows one isolate, completely separate from the rest of the network, which provides an answer to RQ2. I used the “star” layout so that the density of ties in and out of nodes would be easier to see, and I labeled the nodes so that we could identify the students with few (or no) ties to other students. With these labels, we can see that Student 19 has no ties to other students, and Students 15, 16, 17, 18, 20, 24, and 26 have five or fewer ties. If I were the teacher of this classroom, I would pay attention to these 8 students and their interactions with others, and I would try to find out whether they have sources of peer support outside of this particular classroom.

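library(ggraph)
# Draw the edges first so that nodes and labels sit on top of them.
ggraph(Peer_Groups_Network, layout = "star") +
  geom_edge_link(aes(colour = weight)) +
  # Constant aesthetics go outside aes(); inside aes() they are treated as data.
  geom_node_point(fill = "darksalmon", size = 3, shape = 22) +
  geom_node_text(aes(label = name), family = "serif") +
  theme_dark()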
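
The isolate that answers RQ2 can also be found numerically rather than by reading the plot. A minimal sketch, assuming rows are senders and columns are receivers as in the matrix printed earlier, counts incoming and outgoing ties per student. The 4 x 4 matrix below is made up for illustration.

```r
# Hypothetical 4-student sociomatrix: 1 = best friend, 2 = friend, 0 = other.
A <- matrix(c(0, 2, 0, 0,
              1, 0, 0, 0,
              2, 2, 0, 0,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(as.character(1:4), as.character(1:4)))

# Count ties regardless of weight: any value above 0 is a nomination.
in_degree  <- colSums(A > 0)  # nominations received
out_degree <- rowSums(A > 0)  # nominations sent

# "At risk" under RQ2: students who received no nominations at all.
no_incoming <- names(in_degree)[in_degree == 0]

# Isolates: students who neither sent nor received a nomination.
isolates <- names(in_degree)[in_degree == 0 & out_degree == 0]

no_incoming
isolates
```

Applied to `Peer_Groups_Data`, the same two lines would list every student with no incoming nominations, which is the RQ2 definition of “at risk.”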

Next, I determined the reciprocity of the Peer Groups Network in its entirety so that I would have a point of comparison when answering RQ1. This showed that the network as a whole has a reciprocity of roughly 67%.

library(tidygraph)
library(igraph)

## 
## Attaching package: 'igraph'

## The following object is masked from 'package:tidygraph':
## 
##     groups

## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union

## The following objects are masked from 'package:purrr':
## 
##     compose, simplify

## The following object is masked from 'package:tidyr':
## 
##     crossing

## The following object is masked from 'package:tibble':
## 
##     as_data_frame

## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum

## The following object is masked from 'package:base':
## 
##     union

reciprocity(Peer_Groups_Network)

## [1] 0.6699507
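
As a cross-check on the figure above, reciprocity can be computed by hand as the share of directed ties whose reverse tie also exists, ignoring tie weights. This sketch assumes that this definition matches what `reciprocity()` reports by default. The matrix is a made-up example, not the peer groups data.

```r
# Hypothetical 4-student sociomatrix: rows send nominations, columns receive them.
A <- matrix(c(0, 2, 0, 0,
              1, 0, 0, 0,
              2, 2, 0, 0,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

# Treat any non-zero value as a tie and ignore self-nominations.
ties <- A > 0
diag(ties) <- FALSE

# A tie i -> j is reciprocated when the tie j -> i also exists.
reciprocated_ties <- sum(ties & t(ties))
reciprocity_by_hand <- reciprocated_ties / sum(ties)

reciprocity_by_hand
```

In the toy matrix, two of the four ties (1 to 2 and 2 to 1) are returned, so the result is 0.5.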

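To examine the reciprocity of best-friend nominations (edge weight = 1), I used mutate() to flag each edge as reciprocated or not and filter() to keep only the edges with a weight of 1, storing the result as a new network, Best_Friends_Network. Because edge_is_mutual() runs before the filter, the flag records whether a nomination was returned by a nomination of either type.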

Best_Friends_Network <- Peer_Groups_Network |>  
  activate(edges) |> 
  mutate(reciprocated = edge_is_mutual()) |> 
  filter(weight == 1)  

Best_Friends_Network

## # A tbl_graph: 27 nodes and 27 edges
## #
## # A directed simple graph with 7 components
## #
## # Edge Data: 27 × 4 (active)
##    from    to weight reciprocated
##   <int> <int>  <dbl> <lgl>       
## 1     1     7      1 TRUE        
## 2     2     1      1 TRUE        
## 3     4     1      1 TRUE        
## 4     5     1      1 TRUE        
## 5     5    22      1 FALSE       
## 6     6     5      1 TRUE        
## # … with 21 more rows
## #
## # Node Data: 27 × 1
##   name 
##   <chr>
## 1 1    
## 2 2    
## 3 3    
## # … with 24 more rows

The new network has 7 components: five isolated students, one separate pair (Students 15 and 16), and one large component. I used the reciprocity() function to determine the reciprocity of best-friend nominations and found that it is surprisingly low: only 7.4% (2 of the 27 edges), compared with roughly 67% for the network as a whole. Therefore, the answer to RQ1 is 7.4%. However, best-friend nominations are rare to begin with: only 27 of the 203 edges in the network (about 13%) are “best friend” edges. There are far more “friend” nominations (edge weight = 2) than “best friend” nominations (edge weight = 1).

reciprocity(Best_Friends_Network)

## [1] 0.07407407
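
The share of edges at each weight can be tabulated directly from the matrix to see how rare best-friend nominations are. The sketch below uses a made-up 4 x 4 matrix; the object names are hypothetical.

```r
# Hypothetical 4-student sociomatrix: 1 = best friend, 2 = friend, 0 = other.
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 0,
              2, 0, 0, 1,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

# Keep only the non-zero cells, i.e. the weights of the actual nominations.
weights <- A[A > 0]

# Count the nominations at each weight and convert the counts to proportions.
weight_counts <- table(weights)
weight_share <- prop.table(weight_counts)

weight_counts
weight_share
```

Running the same lines on `Peer_Groups_Data` should give the split between “best friend” and “friend” nominations among the 203 edges.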

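Finally, I used the Best Friends Network to create a sociogram similar to the one above in order to highlight reciprocity among best-friend nominations. Because the reciprocated flag was computed on the full network before filtering, it marks a best-friend nomination as reciprocated when the nominee returned a nomination of either type. By that measure, the visual shows 15 pairs in which a best-friend nomination was returned: 1-7; 1-5; 1-4; 1-2; 2-10; 5-6; 7-13; 8-11; 8-13; 9-13; 9-18; 10-11; 11-20; 14-22; 15-16. Only one of these pairs, 11-20, is a mutual best-friend nomination, which is why the reciprocity of the Best Friends Network is just 7.4% (2 of 27 edges). The visual also shows that five students (17, 19, 25, 26, and 27) neither send nor receive best-friend nominations.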

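# Draw the edges first so that nodes and labels sit on top of them.
ggraph(Best_Friends_Network, layout = "star") +
  geom_edge_link(aes(colour = reciprocated)) +
  # Constant aesthetics go outside aes(); inside aes() they are treated as data.
  geom_node_point(fill = "darksalmon", size = 3, shape = 22) +
  geom_node_text(aes(label = name), family = "serif") +
  theme_dark()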
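
Because `reciprocated` is computed before the filter on weight, a best-friend nomination counts as reciprocated when any nomination comes back, not only a best-friend one. The sketch below separates the two readings on a made-up 4 x 4 matrix; it is an illustration in base R, not the tidygraph code used above.

```r
# Hypothetical 4-student sociomatrix: 1 = best friend, 2 = friend, 0 = other.
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 0,
              2, 0, 0, 1,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

best <- A == 1                    # best-friend nominations only
returned_any  <- best & t(A > 0)  # nominee sent back a 1 or a 2
returned_best <- best & t(best)   # nominee also sent back a 1

sum(best)           # number of best-friend nominations
sum(returned_any)   # returned by a nomination of either type
sum(returned_best)  # returned by a best-friend nomination

# Strict best-friend reciprocity: share of best-friend ties that are mutual.
strict_reciprocity <- sum(returned_best) / sum(best)
strict_reciprocity
```

The gap between `returned_any` and `returned_best` is the same gap that separates the edges flagged as reciprocated in the sociogram from the much lower reciprocity of the filtered network.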

Step 5

Share your findings.

Reflection: My main takeaway from this analysis was that although the peer groups network had high reciprocity overall, the reciprocity within “best friends” was quite low. Also, I learned that one student, Student 19, neither sent nor received any friendship nominations. There are several possible explanations for this: perhaps the student is new to the school; perhaps the student has friends in a different network and just not within this class; perhaps the student was absent on the day of the survey and this was not reported in the information we were given. However, if I were the teacher of this class, I would be sure to get to know Student 19 and investigate the possible explanations behind their lack of connectivity with classmates.

Furthermore, with a more complete dataset, I would have been interested in further exploring the gradations labeled herein as “other” (value = 0). The textbook indicates that the survey used a 5-point scale, and there seems to me to be a large difference between coding someone as a “5,” which indicates dislike, and coding someone as a “3,” which indicates neutrality. As a teacher, I consider classroom dynamics extremely important, and understanding the relationships (and perceived relationships) within the class is integral to overall productivity and student performance.

Finally, even with the incomplete dataset that we have, which gives us only values of 0, 1, and 2, I would be interested in exploring cases in which a student “sent” a best-friend nomination (value = 1) to a particular person but “received” an “other” rating (value = 0) from that person in return. Understanding such mismatches in perceived relationships is also important for understanding overall classroom functioning.
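
One way to start on that last question is to list every pair in which a best-friend nomination (1) was met with an “other” rating (0). The sketch below does this for a made-up 4 x 4 matrix; `unreturned` and `mismatches` are hypothetical names.

```r
# Hypothetical 4-student sociomatrix: 1 = best friend, 2 = friend, 0 = other.
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 0,
              2, 0, 0, 1,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(as.character(1:4), as.character(1:4)))

# Sender rated the receiver 1, while the receiver rated the sender 0.
unreturned <- which(A == 1 & t(A) == 0, arr.ind = TRUE)

# Translate the matrix positions back into student labels.
mismatches <- data.frame(
  sender = rownames(A)[unreturned[, "row"]],
  receiver = colnames(A)[unreturned[, "col"]],
  stringsAsFactors = FALSE
)

mismatches
```

In the toy matrix only Student 3's nomination of Student 4 qualifies, because every other best-friend nomination drew at least a “friend” rating in return.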

Step 6

Please see course Moodle.

References

Carolan, Brian. 2014. “Social Network Analysis and Education: Theory, Methods & Applications.” https://doi.org/10.4135/9781452270104.