For this practicum, you will be designing an (imaginary) network, loading it into R, and visualizing it.
The objectives for this practicum are to:
To save time, you will work in groups on the first part of this practicum.
Each group of four people will be given a type of network by me. (If you were not in class on this day, let me know and I will give you a type of network to model.) Work together to design what a network of that sort would look like. You may draw it out, describe it, or do anything that you like to figure out what this should look like.
People tend to get a little overzealous with this exercise. To avoid problems later, please do not design a network with more than fifteen (15) nodes.
Important: Create a folder labeled “Practicum 2” on your Desktop - or wherever you will be able to easily find it again. Then, set your working directory to that folder in RStudio:
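From the console, the same step can be done with setwd(). This is a minimal sketch, assuming the folder sits on the Desktop under your home directory; the path is only an example, so change it to wherever you actually created the folder.

```r
# Hypothetical location: change this to wherever you created the folder.
practicum_dir <- file.path(path.expand("~"), "Desktop", "Practicum 2")

# Create the folder if it does not exist yet (stays quiet if it already does).
dir.create(practicum_dir, recursive = TRUE, showWarnings = FALSE)

# Point R at the folder, then confirm where R is now looking.
setwd(practicum_dir)
getwd()
```

If getwd() prints the folder you expect, any file you save without a full path will land there.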
If you have not already done so, install igraph into your instance of R.
install.packages("igraph", dependencies=TRUE)
Next, each person from the group will work on their own to create an edgelist. You may help one another, but I want everyone doing their own work when they are entering the information. Use a spreadsheet program like Excel or Google Sheets for this part. Save the edgelist as a .csv file and name it whatever you like.
As a refresher, remember that edgelists are just lists of who has a connection to whom. To create an edgelist, open a spreadsheet and enter the names of the nodes. For directed networks, treat the first column (Column A) as the “from” column. The second column (Column B) is the node that is receiving the tie (the “to” column). If the tie goes in both directions in a directed network, then make sure that you reverse the order of the names on the next row down.
In undirected networks, the order of the names does not matter. But, be sure that you do not enter a pair of nodes more than once. Otherwise, you will be telling igraph that the nodes have two undirected ties between them. This isn’t a big problem. But, we won’t be covering what you can do to clean that up for a while.
Once you have entered all the names into the edgelist, you should have a spreadsheet that looks something like this:
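Since the layout is easier to see than to describe, here is a minimal sketch of a directed edgelist built in R instead of a spreadsheet. The names (Ana, Ben, Cal, Dee) and the file name are made up for illustration. Note that the tie between Ana and Ben runs in both directions, so it takes two rows.

```r
# Hypothetical five-tie directed network; column 1 is "from", column 2 is "to".
edges <- data.frame(
  from = c("Ana", "Ben", "Ben", "Cal", "Dee"),
  to   = c("Ben", "Ana", "Cal", "Dee", "Ana"),
  stringsAsFactors = FALSE
)

# Write it with no header row and no row names, so the file holds only the ties.
example_file <- file.path(tempdir(), "ExampleEL.csv")   # hypothetical file name
write.table(edges, file = example_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Read it back to see the plain two-column layout.
check <- read.csv(example_file, header = FALSE)
print(check)
```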
Once you have completed your edgelist, use the following code to import your network into igraph:
library(igraph)
Data <- read.csv(file.choose(), header=FALSE, check.names=TRUE) # First find and load your data.
DataMatrix <- as.matrix(Data) # Next, make sure it is in matrix format - just in case...
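Before converting, it can be worth confirming that the import really produced a clean two-column table. The sketch below is one way to do that and is not part of the assigned steps. It writes a small made-up file (with a stray space after one name) so that it runs on its own; you would apply the same checks to your own DataMatrix.

```r
# Build a small hypothetical CSV; "Ben " has a trailing space on purpose.
tmp <- file.path(tempdir(), "CheckEL.csv")
writeLines(c("Ana,Ben ", "Ben,Cal", "Cal,Ana"), tmp)

Data <- read.csv(tmp, header = FALSE)
DataMatrix <- as.matrix(Data)

dim(DataMatrix)                 # rows are ties; there should be 2 columns
sort(unique(c(DataMatrix)))     # "Ben" and "Ben " appear as different nodes

# Trim stray spaces so each node has exactly one spelling.
DataMatrix[] <- trimws(DataMatrix)
sort(unique(c(DataMatrix)))
```

Stray spaces and inconsistent capitalization are easy to miss in a spreadsheet, and igraph treats each distinct spelling as a separate node.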
Finally, convert the matrix to a network object that igraph will recognize using one of the two functions, below.
## For directed networks, use this:
g <- graph_from_edgelist(DataMatrix, directed=TRUE)
## For undirected networks, use this:
g <- graph_from_edgelist(DataMatrix, directed=FALSE)
At this point, the astute observer will have observed that the only difference in the two lines is whether the directed argument is set to “TRUE” or “FALSE”.
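To see what that one argument changes, here is a minimal sketch that builds both versions from the same made-up edgelist, in which the Ana and Ben pair is entered twice (once in each direction). The names are hypothetical; any_multiple() is igraph's check for repeated ties.

```r
library(igraph)

# Hypothetical edgelist: the Ana-Ben tie is entered in both directions.
el <- matrix(c("Ana", "Ben",
               "Ben", "Ana",
               "Ben", "Cal"), ncol = 2, byrow = TRUE)

g_dir <- graph_from_edgelist(el, directed = TRUE)
g_und <- graph_from_edgelist(el, directed = FALSE)

is_directed(g_dir)   # TRUE
is_directed(g_und)   # FALSE

# Both graphs keep every row of the edgelist as an edge.
ecount(g_dir)
ecount(g_und)

# Only the undirected graph treats the two Ana-Ben rows as a repeated tie.
any_multiple(g_dir)  # FALSE: Ana->Ben and Ben->Ana are different directed ties
any_multiple(g_und)  # TRUE
```

In the directed version the two rows describe a reciprocated tie. In the undirected version they describe the same tie twice, which is the duplicate-pair situation described earlier.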
Challenge: Use the help function to find out what the “check.names” portion of the read.csv function means.
Finally, take a look at the details about the network you just entered.
g
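Each piece of that printout can also be requested on its own, which helps when matching the codes in the header to actual properties of the network. A minimal sketch, using a small made-up graph in place of your own g:

```r
library(igraph)

# Hypothetical undirected, named graph standing in for your own g.
el <- matrix(c("Ana", "Ben",
               "Ben", "Cal",
               "Cal", "Dee"), ncol = 2, byrow = TRUE)
g <- graph_from_edgelist(el, directed = FALSE)

is_directed(g)         # the U or D flag
is_named(g)            # the N flag
is_weighted(g)         # the W flag
is_bipartite(g)        # the B flag
vcount(g)              # number of vertices
ecount(g)              # number of edges
vertex_attr_names(g)   # vertex attributes, such as "name"
edge_attr_names(g)     # edge attributes (none in this example)
```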
Here is a handy guide to understanding igraph objects, provided by Brendan Knapp - superstar.
Let’s understand the information contained in an igraph object:
IGRAPH simply annotates g as an igraph object
6ab1e54, 16b4918, or whatever follows IGRAPH is simply how igraph identifies g for itself
UN-- refers to descriptive details of g:
U tells us that g is an undirected graph
D would tell us that it is a directed graph
N indicates that g is a named graph, in that the vertices have a name attribute
-- refers to attributes not applicable to g, but we will see them in the future:
W would refer to a weighted graph, where edges have a weight attribute
B would refer to a bipartite graph, where vertices have a type attribute
571 refers to the number of vertices in g
61102 refers to the number of edges in g
attr: is a list of attributes within the graph. We only see name, but we will see multiple attributes in the future.
(v/c), which we see following name, tells us that it is a vertex attribute of a character data type. character is simply what R calls a string.
(e/c) or (e/n) refer to edge attributes that are of character or numeric data types
(g/c) or (g/n) refer to graph attributes that are of character or numeric data types
+ edges from *arbitrary igraph name* (vertex names): lists a sample of g’s edges using the names of the vertices which they connect
There are a lot of ways to save your data.
Let’s suppose that you want to name your network “DesignTrial”. Then you could save it as an R object that you can use later in R.
save(g, file="DesignTrial.rda")
# Later you can load this using:
load("DesignTrial.rda") # This will work only if you have set the working directory to where you have stored this data set.
To save the network as either an edgelist or a matrix, it is best to use .csv format.
# To extract an edgelist:
get.edgelist(g)
# Save the edgelist as an object
el <- get.edgelist(g)
write.csv(el, file="DesignTrialEL.csv") # Then save as CSV
# To extract an adjacency matrix:
get.adjacency(g, names=TRUE)
# Save the adjacency matrix as an object
am <- get.adjacency(g, names=TRUE)
amMat <- as.matrix(am) # One extra step: make it a matrix before exporting
write.csv(amMat, file="DesignTrialAM.csv")
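One detail worth knowing: by default, write.csv() adds a header row and a first column of row numbers, so an edgelist file saved this way will not read back as a clean two-column table with header=FALSE. Below is a minimal sketch of a round trip, assuming a small made-up graph and a temporary file name. It uses as_edgelist(), the current igraph name for get.edgelist(), and adds row.names=FALSE to the export.

```r
library(igraph)

# Hypothetical directed graph standing in for your own g.
g <- graph_from_edgelist(matrix(c("Ana", "Ben",
                                  "Ben", "Cal",
                                  "Cal", "Ana"), ncol = 2, byrow = TRUE),
                         directed = TRUE)

# Export the edgelist without the extra column of row numbers.
el <- as_edgelist(g)
out_file <- file.path(tempdir(), "RoundTripEL.csv")   # hypothetical file name
write.csv(el, file = out_file, row.names = FALSE)

# The file still has a header row (V1, V2), so read it with header=TRUE.
back <- as.matrix(read.csv(out_file, header = TRUE))
g2 <- graph_from_edgelist(back, directed = TRUE)

# The rebuilt graph should be the same size as the original.
c(vcount(g), vcount(g2))
c(ecount(g), ecount(g2))
```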
The native plot() function in R will produce a network visualization. For more explicit plotting options, use plot.igraph() and check out its help section using ?plot.igraph.
Alternatively, you can use tkplot() if you would rather use an interactive graph plotter. One word of caution: tkplot is not very efficient, so large graphs will be difficult - and slow - to plot in tkplot.
Below are a few options that you can try out:
# R's native plot function (operates like plot.igraph)
plot(g)
# igraph-specific plotting function
plot.igraph(g)
# Some layout options:
plot.igraph(g, layout=layout.fruchterman.reingold) # Try a few to see what you like.
plot.igraph(g, layout=layout.kamada.kawai)
plot.igraph(g, layout=layout.circle)
plot.igraph(g, layout=layout.davidson.harel)
plot.igraph(g, layout=layout.mds)
plot.igraph(g, layout=layout.gem)
plot.igraph(g, layout=layout.sphere)
plot.igraph(g, layout=layout.spring) # Note: the spring layout was removed from igraph, so newer versions may warn or stop here
# To use the interactive tkplot:
tkplot(g)
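Force-directed layouts such as Fruchterman-Reingold start from random positions, so the same call can draw the network differently each time. Here is a minimal sketch of one way to get a repeatable picture, assuming a small made-up graph. layout_with_fr() is the current igraph name for layout.fruchterman.reingold, and the seed value and display settings are arbitrary choices.

```r
library(igraph)

# Hypothetical small graph standing in for your own g.
g <- graph_from_edgelist(matrix(c("Ana", "Ben",
                                  "Ben", "Cal",
                                  "Cal", "Dee",
                                  "Dee", "Ana",
                                  "Ana", "Cal"), ncol = 2, byrow = TRUE),
                         directed = TRUE)

# Fix the random seed so the layout comes out the same on every run.
set.seed(42)
lay <- layout_with_fr(g)   # one row of x, y coordinates per vertex

# Reuse the stored coordinates and adjust a few display options.
plot(g,
     layout = lay,
     vertex.size = 25,        # size of the node circles
     vertex.label.cex = 0.8,  # text size of the node labels
     edge.arrow.size = 0.5)   # size of the arrowheads
```

Storing the coordinates in lay also means you can redraw the same arrangement later with different colors or sizes.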
When using the common plot() or plot.igraph() functions, the network will be visualized in the plot window of RStudio. To save your visualization in RStudio, look in the “Plots” viewer for “Export”:
From there:
In tkplot, you have only one option:
Everyone should - electronically - submit a Google document with the following to satisfy the terms of this practicum: