Setup

For this practicum, you will design an (imaginary) network, load it into R, and visualize it.

The objectives for this practicum are to:

  • Provide a little insight into networks and what they represent
  • Provide some experience with getting network data into R
  • Add some satisfaction by letting you visualize your network



Instructions

Group Work

To save time, you will work in groups on the first part of this practicum.

Each group of four people will be given a type of network by me. (If you were not in class on this day, let me know and I will give you a type of network to model.) Work together to design what a network of that sort would look like. You may draw it out, describe it, or do anything that you like to figure out what this should look like.

People tend to get a little overzealous with this exercise. To avoid problems later, please do not design a network with more than fifteen (15) nodes.

Individual Work

Important: Create a folder labeled “Practicum 2” on your Desktop, or wherever you will be able to easily find it again. Then, set your working directory to that folder in RStudio:

  • Session
    • Set Working Directory
    • Choose Directory…

If you have not already done so, install igraph into your instance of R.

install.packages("igraph", dependencies=TRUE)

Next, each person from the group will work on their own to create an edgelist. You may help one another, but I want everyone doing their own work when they are entering the information. Use a spreadsheet program like Excel or Google Sheets for this part. Save the edgelist as a .csv file and name it whatever you like.

As a refresher, remember that edgelists are just lists of who has a connection to whom. To create an edgelist, open a spreadsheet and enter the names of the nodes, one tie per row. For directed networks, treat the first column (Column A) as the “from” column, the node that is sending the tie. The second column (Column B) is the node that is receiving the tie (the “to” column). If the tie goes in both directions in a directed network, enter it twice: once as written, and again on the next row down with the order of the names reversed.

In undirected networks, the order of the names does not matter. But be sure that you do not enter a pair of nodes more than once. Otherwise, you will be telling igraph that the nodes have two undirected ties between them. This isn’t a big problem, but we won’t be covering how to clean that up for a while.

Once you have entered all the names into the edgelist, you should have a spreadsheet that looks something like this:
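
As a rough illustration, the sketch below builds a small directed edgelist and prints it the way the saved .csv file should read: two columns, one tie per row, no header row. The names (Ana, Ben, Cal, Dee) and the file name are made up for this example and are not part of the assignment.

```r
# Hypothetical directed network; every name here is invented for illustration.
edges <- data.frame(
  from = c("Ana", "Ben", "Ana", "Cal"),
  to   = c("Ben", "Ana", "Cal", "Dee"),
  stringsAsFactors = FALSE
)

# Write the file the way the spreadsheet should be saved:
# comma-separated, no header row, no row numbers.
csv_path <- file.path(tempdir(), "example_edgelist.csv")
write.table(edges, file = csv_path, sep = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Print the raw contents of the file.
csv_lines <- readLines(csv_path)
writeLines(csv_lines)
```

The first two rows show a tie that runs in both directions between Ana and Ben: the same pair, entered twice, with the order reversed.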

Pro Tips:

  • To make this just a little easier on you, don’t use headers. (Don’t label your columns.)
  • Double-check to make sure that you saved the edgelist as a .csv file.
  • Make sure that you spelled (and capitalized) all the names consistently. (You can use the “sort” function in the spreadsheet to help with this.)
  • Name the file something that you will remember.




Once you have completed your edgelist, use the following code to import your network into igraph:

library(igraph)

Data <- read.csv(file.choose(), header=FALSE, check.names=TRUE) # First find and load your data.
DataMatrix <- as.matrix(Data)      # Next, make sure it is in matrix format - just in case...
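
Before converting the matrix, it can help to check that the names were entered consistently. The following is only a sketch using a made-up stand-in for Data (the names, including the deliberate slip "ana", are hypothetical); with your own file, skip the first statement and run the rest on your DataMatrix.

```r
# Hypothetical stand-in for an imported edgelist.
# "ana" is a deliberate capitalization slip.
Data <- data.frame(V1 = c("Ana", "ana", "Ben"),
                   V2 = c("Ben", "Cal", "Cal"),
                   stringsAsFactors = FALSE)
DataMatrix <- as.matrix(Data)

# Rows are ties; there should be exactly two columns.
dim(DataMatrix)

# Every distinct name that appears anywhere in the edgelist.
node_names <- sort(unique(as.vector(DataMatrix)))
print(node_names)

# Names that differ only by capitalization are probably the same node.
lowered  <- tolower(node_names)
suspects <- node_names[lowered %in% lowered[duplicated(lowered)]]
print(suspects)
```

If suspects is not empty, fix the spelling in the spreadsheet, save it again, and reload it.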

Finally, convert the matrix to a network object that igraph will recognize using one of the two function calls below.

## For directed networks, use this:
g <- graph_from_edgelist(DataMatrix, directed=TRUE)

## For undirected networks, use this:
g <- graph_from_edgelist(DataMatrix, directed=FALSE)
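
To see what the directed argument changes, here is a minimal sketch that builds both versions from the same small edgelist. The names and the objects el_demo, g_dir, and g_und are hypothetical and exist only for this illustration.

```r
library(igraph)

# Hypothetical edgelist; Ana and Ben appear twice, once in each order.
el_demo <- matrix(c("Ana", "Ben",
                    "Ben", "Ana",
                    "Ana", "Cal",
                    "Cal", "Dee"),
                  ncol = 2, byrow = TRUE)

g_dir <- graph_from_edgelist(el_demo, directed = TRUE)
g_und <- graph_from_edgelist(el_demo, directed = FALSE)

is_directed(g_dir)    # TRUE
is_directed(g_und)    # FALSE

# Both graphs keep all four rows as edges.
ecount(g_dir)         # 4
ecount(g_und)         # 4

# In the directed graph the two Ana/Ben rows are distinct ties.
any_multiple(g_dir)   # FALSE
# In the undirected graph they become two ties between the same pair.
any_multiple(g_und)   # TRUE
```

This is the reason a reciprocated tie is entered twice in a directed edgelist, while a pair should appear only once in an undirected one.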

At this point, the astute observer will have observed that the only difference in the two lines is whether the directed argument is set to “TRUE” or “FALSE”.

Challenge: Use the help function to find out what the “check.names” portion of the read.csv function means.


Finally, take a look at the details about the network you just entered.

g    

Here is a handy guide to understanding igraph objects, provided by Brendan Knapp - superstar.

Let’s understand the information contained in an igraph object:

  • IGRAPH simply annotates g as an igraph object
  • 6ab1e54, 16b4918, or whatever follows IGRAPH is simply how igraph identifies the g for itself
    • it’s not important for our purposes and will be referred to as arbitrary igraph name
  • UN-- refers to descriptive details of g:
    • U tells us that g is an undirected graph
      • D would tell us that it is a directed graph
    • N indicates that g is a named graph, in that the vertices have a name attribute
    • -- refers to attributes not applicable to g, but we will see them in the future:
      • W would refer to a weighted graph, where edges have a weight attribute
      • B would refer to a bipartite graph, where vertices have a type attribute
  • 571 refers to the number of vertices in g
  • 61102 refers to the number of edges in g
  • attr: is a list of attributes within the graph. We only see name, but we will see multiple attributes in the future.
    • (v/c), which we see following name, tells us that it is a vertex attribute of a character data type. character is simply what R calls a string.
    • In the future we will also see:
      • (e/c) or (e/n) referring to edge attributes that are of character or numeric data types
      • (g/c) or (g/n) referring to graph attributes that are of character or numeric data types
  • + edges from *arbitrary igraph name* (vertex names): lists a sample of g’s edges using the names of the vertices which they connect
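
Each piece of that summary can also be requested on its own. The sketch below uses a small made-up undirected graph (g_demo and its names are hypothetical); the same functions should work on your own g.

```r
library(igraph)

# Hypothetical undirected network, used only to illustrate the summary.
el_demo <- matrix(c("Ana", "Ben",
                    "Ben", "Cal",
                    "Cal", "Dee"),
                  ncol = 2, byrow = TRUE)
g_demo <- graph_from_edgelist(el_demo, directed = FALSE)

print(g_demo)               # the full summary described above

is_directed(g_demo)         # FALSE, so the first flag is U
is_named(g_demo)            # TRUE, so the second flag is N
is_weighted(g_demo)         # FALSE, so there is no W
is_bipartite(g_demo)        # FALSE, so there is no B

vcount(g_demo)              # number of vertices: 4
ecount(g_demo)              # number of edges: 3

vertex_attr_names(g_demo)   # "name", the only attribute so far
```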

Save your Network Data

There are a lot of ways to save your data.

Let’s suppose that you want to name your network “DesignTrial”. Then you could save it as an R object that you can use later in R.

save(g, file="DesignTrial.rda")
  # Later you can load this using:
load("DesignTrial.rda") # This will work only if you have set the working directory to where you have stored this data set.

To save the network as either an edgelist or a matrix, it is best to use the .csv format.

# To extract an edgelist (as_edgelist() is the current name for get.edgelist()):
as_edgelist(g)
  # Save the edgelist as an object
el <- as_edgelist(g)
write.csv(el, file="DesignTrialEL.csv", row.names=FALSE) # Then save as CSV, without a row-number column

# To extract an adjacency matrix (as_adjacency_matrix() is the current name for get.adjacency()):
as_adjacency_matrix(g, names=TRUE)
  # Save the adjacency matrix as an object
am <- as_adjacency_matrix(g, names=TRUE)
amMat <- as.matrix(am) # One extra step: make it a matrix before exporting
write.csv(amMat, file="DesignTrialAM.csv")
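
One thing to keep in mind is that write.csv() adds a header row, so a saved edgelist should be read back with header=TRUE rather than header=FALSE. The sketch below checks that round trip on a small made-up graph; g_demo, the names, and the temporary file name are hypothetical.

```r
library(igraph)

# Hypothetical directed network for the round trip.
el_demo <- matrix(c("Ana", "Ben",
                    "Ben", "Cal",
                    "Cal", "Dee"),
                  ncol = 2, byrow = TRUE)
g_demo <- graph_from_edgelist(el_demo, directed = TRUE)

# Save the edgelist to a temporary file.
csv_path <- file.path(tempdir(), "DemoEL.csv")
write.csv(as_edgelist(g_demo), file = csv_path, row.names = FALSE)

# The first line of the file is a header that write.csv() added.
readLines(csv_path)[1]

# Read it back, telling read.csv() to expect that header.
el_back <- unname(as.matrix(read.csv(csv_path, header = TRUE)))
g_back  <- graph_from_edgelist(el_back, directed = TRUE)

# The rebuilt graph should have the same ties as the original.
all(as_edgelist(g_back) == as_edgelist(g_demo))
```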

Last, try some visualizations

The native plot() function in R will produce a network visualization. For more explicit plotting options, use plot.igraph() and check out its help section using ?plot.igraph.

Alternatively, you can use tkplot() if you would rather use an interactive graph plotter. One word of caution: tkplot is not very efficient, so large graphs will be difficult - and slow - to plot in tkplot.

Below are a few options that you can try out:

# R's generic plot function (calls plot.igraph for igraph objects)
plot(g)

# igraph-specific plotting function
plot.igraph(g)
    # Some layout options (current igraph names):
plot.igraph(g, layout=layout_with_fr) # Fruchterman-Reingold. Try a few to see what you like.
plot.igraph(g, layout=layout_with_kk) # Kamada-Kawai
plot.igraph(g, layout=layout_in_circle)
plot.igraph(g, layout=layout_with_dh) # Davidson-Harel
plot.igraph(g, layout=layout_with_mds)
plot.igraph(g, layout=layout_with_gem)
plot.igraph(g, layout=layout_on_sphere)
plot.igraph(g, layout=layout_nicely) # Lets igraph choose; the old spring layout is no longer available

# To use the interactive tkplot:
tkplot(g)

When using the common plot() or plot.igraph() functions, the network will be visualized in the plot window of RStudio. To save your visualization in RStudio, look in the “Plots” viewer for “Export”. From there:

  • Export
    • Save as Image…
      (use jpeg)
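
If you would rather save the image from code than from the menu, base R's jpeg() device is one way to do it. This is a minimal sketch, not a required step: g_demo is a placeholder graph and the file name is made up. To save your own network, plot g instead and give a file name in your working directory.

```r
library(igraph)

# Placeholder graph so the sketch runs on its own.
g_demo <- make_ring(5)

# Open a jpeg file, draw the plot into it, then close the file.
jpeg_path <- file.path(tempdir(), "DesignTrialPlot.jpg")
jpeg(filename = jpeg_path, width = 800, height = 800)
plot(g_demo)
dev.off()

file.exists(jpeg_path)   # confirms that the image was written
```

Nothing appears in the “Plots” viewer while the jpeg device is open, because the drawing goes to the file instead.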

In tkplot, you have only one option:

  • Export
    • Postscript

Deliverable

Everyone should submit, electronically, a Google document with the following to satisfy the terms of this practicum:

  • Your name and student ID
  • An edgelist for your network
  • A matrix representation of your network
  • Your favorite version of the plotted network
  • Describe your network.
    • (Use your words.)
    • What is it supposed to be?
    • Why did you design it the way you did?
    • What patterns should I see that tell me what type of network it is?
  • Let me know who you worked with to design the network.
  • Bonus (This is not required and you don’t actually get any extra points for this, but it is cool.) Try using the visualization tips and tricks for igraph to make your visualization look better. Change node and edge colors, fonts, or anything else you would like to try with your graph. Use the “Resources for R” links for ideas.
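
As a possible starting point for the bonus, the sketch below changes a few colors, sizes, and fonts through arguments to plot(). The graph, the names, and every styling value are arbitrary choices for illustration; swap in your own g and experiment.

```r
library(igraph)

# Hypothetical directed network to style.
el_demo <- matrix(c("Ana", "Ben",
                    "Ben", "Cal",
                    "Cal", "Dee",
                    "Dee", "Ana"),
                  ncol = 2, byrow = TRUE)
g_demo <- graph_from_edgelist(el_demo, directed = TRUE)

# Force-directed layouts start from random positions, so fix the seed
# and store the coordinates to get the same picture every time.
set.seed(42)
coords <- layout_with_fr(g_demo)

plot(g_demo, layout = coords,
     vertex.color        = "lightblue",  # node fill
     vertex.frame.color  = "gray40",     # node border
     vertex.size         = 30,
     vertex.label.color  = "black",
     vertex.label.family = "sans",       # label font
     vertex.label.cex    = 0.9,          # label size
     edge.color          = "gray50",
     edge.arrow.size     = 0.5,
     main                = "Hypothetical demo network")
```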