This code through explores some basics of networks and takes a deeper dive into the R package DiagrammeR.
Before beginning, make sure that you install DiagrammeR and load it into your library. This code through also uses functions from the packages dplyr, pander, and kableExtra.
#Clear out your Global Environment
rm(list = ls())
install.packages("DiagrammeR")
library(DiagrammeR)
library(dplyr)
library(pander)
library(kableExtra)
I will demonstrate how to construct a network using data from a course discussion board. I will then show how this information may be visualized.
This topic is valuable because networks are everywhere! Think about the last 10 people you have spoken to. Now how about the last 10 people each of them spoke to? That’s a network! Now what about in the news? Networks have also been used to track the spread of disease!
Following COVID-19, many classes moved online, changing the ways we all interact. This example will look at how students interacted with one another in a class discussion board.
Specifically, you’ll learn how to…
1.) Load data into DiagrammeR as a nodelist and edgelist
2.) Graph a network
3.) Customize how the network is visualized
Let’s start with a few definitions. For more detail, see this fabulous overview of network visualization.
Nodes: Vertices in a network. These can be people, places, organizations, etc. In this example, they are students in a class identified by their initials.
Edges: Connections between nodes. These can be conversations, relationships, etc. In this example, an interaction occurred when a student posted on another student’s discussion post.
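To make these two definitions concrete, here is a minimal sketch in base R of a node list and an edge list for four students. The IDs, initials, and replies are taken from the class data used later in this code through; the object names toy_nodes and toy_edges are made up for illustration.

```r
# Node list: one row per student (IDs and initials from the class data)
toy_nodes <- data.frame(
  id    = c(3, 14, 21, 23),
  label = c("MP", "MO", "SK", "SS")
)

# Edge list: one row per reply, pointing from the responder to the poster
toy_edges <- data.frame(
  from = c(3, 14, 23),
  to   = c(21, 23, 14)
)

# Translate the IDs back to initials to read each interaction
toy_edges$from_label <- as.character(toy_nodes$label[match(toy_edges$from, toy_nodes$id)])
toy_edges$to_label   <- as.character(toy_nodes$label[match(toy_edges$to, toy_nodes$id)])

print(toy_edges)
```

Each row of toy_edges reads as "from_label replied to to_label", which is the same structure DiagrammeR expects in its edge data frame.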
We will begin with an example created from a class discussion board.
The first data set, discussion nodes, contains the initials of all students. It also includes some basic demographic information, such as field of interest and whether or not the student has experience in R.
The second data set, discussion edges, contains one line for each response on the second discussion board. Let’s look at the data:
##Examining Your Data
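# fileEncoding = "UTF-8-BOM" strips the byte order mark that would otherwise
# turn the first column name into "ï..Initials"
nodes_messy <- read.csv("discussion_nodes.csv", fileEncoding = "UTF-8-BOM")
# Keep only the columns used in this example
nodes <- nodes_messy %>%
  select("Student.ID", "Initials", "Graduate")
# Preview the first six rows
n <- head(nodes)
kable(n) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| Student.ID | Initials | Graduate |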
|---|---|---|
| 1 | UA | 1 |
| 2 | GB | 1 |
| 3 | MP | 1 |
| 4 | MR | 1 |
| 5 | BL | 1 |
| 6 | SW | 1 |
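A note on the import step: when a CSV file saved with a byte order mark (BOM) is read with default settings, the first column name can come back with a stray prefix such as ï.. in front of it. The sketch below writes a small temporary file (made-up contents, not the course data) that starts with a BOM and reads it back with fileEncoding = "UTF-8-BOM", which removes the mark so the column keeps its plain name.

```r
# Write a tiny CSV that starts with the UTF-8 byte order mark (EF BB BF).
# The file contents are made up for illustration.
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("Initials,Student.ID\nUA,1\nGB,2\n")), tmp)

# Reading with fileEncoding = "UTF-8-BOM" drops the mark before parsing
clean <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
print(names(clean))

# Remove the temporary file
unlink(tmp)
```

With the mark removed at read time, no rename step is needed to repair the first column name.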
# fileEncoding = "UTF-8-BOM" strips the byte order mark from the first column name
edges_messy <- read.csv("discussion_edges.csv", fileEncoding = "UTF-8-BOM")
# Only the ID columns are needed, so the initials columns are dropped
# rather than renamed
edges <- edges_messy %>%
  select("Poster.ID", "Responder.ID")
# Preview the first six rows
e <- edges %>%
  head()
kable(e) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| Poster.ID | Responder.ID |
|---|---|
| 17 | 13 |
| 13 | 11 |
| 3 | 11 |
| 3 | 8 |
| 8 | 11 |
| 6 | 14 |
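To map a network, DiagrammeR requires a graph object of class dgr_graph, which is built from a node data frame and an edge data frame. For more detailed instructions, see Creating Simple Graphs from NDFs/EDFs in the sources at the end.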
# Create nodes (initials and area of study for all class members)
class_node <-
  create_node_df(n = 25,
                 label = c("UA", "GB", "MP", "MR", "BL",
                           "SW", "AE", "IQ", "SC", "BS",
                           "KP", "EC", "RZ", "MO", "AC",
                           "DS", "CK", "AG", "CC", "AF",
                           "SK", "AS", "SS", "BP", "SQ"),
                 shape = "circle",
                 type = "student",
                 data = c("Economics", "Economics", "Criminal Justice",
                          "Psychology", "Accounting", "Applied Linguistics",
                          "Public Health", "Political Science", "Neuroscience",
                          "Economics", "Public Health", "Political Science",
                          "Finance", "Finance", "Political Science", "Political Science",
                          "Economics", "Public Health", "Public Policy", "Economics",
                          "Political Science", "Public Policy", "Criminal Justice",
                          "Economics", "Public Health"))
# Create an edgelist (interactions on discussion board #2)
# Each edge points from the responder to the student whose post they replied to
class_edge <-
  create_edge_df(from = c(13, 11, 11, 8, 11, 14, 11, 17, 17, 23,
                          15, 23, 20, 4, 11, 23, 4, 3, 4, 19, 11,
                          4, 13, 4, 22, 14),
                 to = c(17, 13, 3, 3, 8, 6, 5, 5, 19, 14, 22, 22,
                        11, 11, 4, 20, 20, 21, 21, 15, 15, 15,
                        12, 12, 12, 23))
# Combine the node and edge data frames into a dgr_graph object
class_disscussion <-
  create_graph(nodes_df = class_node,
               edges_df = class_edge)
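Typing the IDs by hand works for a small class, but the same vectors can be taken straight from the edge table. The sketch below assumes the direction used above (each edge runs from the responder to the poster) and rebuilds the first six edges from the rows previewed earlier. The names edges_head and edge_df_from_table are stand-ins chosen for this example.

```r
library(DiagrammeR)

# First six rows of the edge table previewed earlier (stand-in for `edges`)
edges_head <- data.frame(
  Poster.ID    = c(17, 13, 3, 3, 8, 6),
  Responder.ID = c(13, 11, 11, 8, 11, 14)
)

# Each edge runs from the responder to the student who wrote the post
edge_df_from_table <- create_edge_df(
  from = edges_head$Responder.ID,
  to   = edges_head$Poster.ID
)

print(edge_df_from_table)
```

Passing the full columns instead of the six-row preview should reproduce the hand-typed from and to vectors, and it avoids transcription mistakes when the discussion board grows.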
#Let's check our Network
ndf <- get_node_df(class_disscussion) %>%
head()
kable(ndf) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| id | type | label | shape | data |
|---|---|---|---|---|
| 1 | student | UA | circle | Economics |
| 2 | student | GB | circle | Economics |
| 3 | student | MP | circle | Criminal Justice |
| 4 | student | MR | circle | Psychology |
| 5 | student | BL | circle | Accounting |
| 6 | student | SW | circle | Applied Linguistics |
edf <- get_edge_df(class_disscussion) %>%
head()
kable(edf) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| id | from | to | rel |
|---|---|---|---|
| 1 | 13 | 17 | NA |
| 2 | 11 | 13 | NA |
| 3 | 11 | 3 | NA |
| 4 | 8 | 3 | NA |
| 5 | 11 | 8 | NA |
| 6 | 14 | 6 | NA |
# Graph Your Network
Now that we have made our network, we should graph it.
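One way to draw the network is DiagrammeR's render_graph() function. The sketch below is self-contained, so it rebuilds a four-student slice of the class (MP, SK, MO, SS) with renumbered IDs rather than reusing the full graph. Passing class_disscussion to render_graph() in the same way should draw the whole class.

```r
library(DiagrammeR)

# Four students from the class, renumbered 1-4 for this small example
small_nodes <- create_node_df(
  n     = 4,
  label = c("MP", "SK", "MO", "SS"),
  type  = "student",
  shape = "circle"
)

# MP replied to SK; MO and SS replied to each other
small_edges <- create_edge_df(
  from = c(1, 3, 4),
  to   = c(2, 4, 3)
)

small_graph <- create_graph(nodes_df = small_nodes,
                            edges_df = small_edges)

# Draw the graph; arrows show the direction of each reply
small_plot <- render_graph(small_graph)
small_plot
```

The result is an HTML widget, so it appears in the RStudio Viewer or inline in a knitted document.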
As you can see below, this is a directed network. This means that the direction of the interaction matters and is indicated by an arrow (instead of just a line).
If we look at student MP, we can see that they replied to student SK’s post. Students MO and SS replied to each other.
For more information on creating network graphs, see Creating Simple Graphs from NDFs/EDFs.
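These readings can be checked against the edge list without looking at the picture. The sketch below uses base R only and copies the labels and the from and to vectors from the code above. The helper name replied_to is made up for this example.

```r
# Labels and edges copied from the node and edge definitions above
initials <- c("UA", "GB", "MP", "MR", "BL", "SW", "AE", "IQ", "SC", "BS",
              "KP", "EC", "RZ", "MO", "AC", "DS", "CK", "AG", "CC", "AF",
              "SK", "AS", "SS", "BP", "SQ")
from <- c(13, 11, 11, 8, 11, 14, 11, 17, 17, 23, 15, 23, 20,
          4, 11, 23, 4, 3, 4, 19, 11, 4, 13, 4, 22, 14)
to   <- c(17, 13, 3, 3, 8, 6, 5, 5, 19, 14, 22, 22, 11,
          11, 4, 20, 20, 21, 21, 15, 15, 15, 12, 12, 12, 23)

# TRUE if student `a` replied to a post written by student `b`
replied_to <- function(a, b) {
  any(initials[from] == a & initials[to] == b)
}

# One-way reply: MP to SK, but not SK to MP
mp_to_sk <- replied_to("MP", "SK")
sk_to_mp <- replied_to("SK", "MP")

# Mutual reply: MO and SS each replied to the other
mo_ss_mutual <- replied_to("MO", "SS") && replied_to("SS", "MO")

print(c(mp_to_sk = mp_to_sk, sk_to_mp = sk_to_mp, mo_ss_mutual = mo_ss_mutual))
```

Because the graph is directed, the order of the two arguments matters: swapping them asks a different question.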
Although DiagrammeR automatically generates a network graph, we can tell it to arrange the data differently:
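A minimal sketch of changing the arrangement, assuming the layout argument of render_graph() and two of the layout names listed in its documentation ("circle" and "fr"). It uses a small four-student graph with renumbered IDs so the block runs on its own; the same calls should apply to class_disscussion.

```r
library(DiagrammeR)

# Small stand-in graph: MP replied to SK; MO and SS replied to each other
small_graph <- create_graph(
  nodes_df = create_node_df(n = 4,
                            label = c("MP", "SK", "MO", "SS"),
                            shape = "circle"),
  edges_df = create_edge_df(from = c(1, 3, 4),
                            to   = c(2, 4, 3))
)

# Place the nodes evenly around a circle
circle_plot <- render_graph(small_graph, layout = "circle")

# Use a force-directed layout (Fruchterman-Reingold)
fr_plot <- render_graph(small_graph, layout = "fr")

circle_plot
```

Circular layouts make it easy to compare who connects to whom, while force-directed layouts tend to pull closely connected students together, so it is worth trying more than one.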
Networks are extremely visual, so choosing the arrangement that is best suited to your data is important!
Learn more about visualizing networks with the following:
Resource I Preparing Network Data in R
Resource II Introduction to Network Analysis
Resource III Creating a Node Selection
This code through references and cites the following sources:
cran.r-project.org DiagrammeR
cran.r-project.org Creating Simple Graphs from NDFs/EDFs
cran.r-project.org Selections