Setup

To run a “chunk” - the code between the backticks - click on the green arrow at the right of the chunk. For more options see Run at the top of the source editor window (this window).

In this tutorial we are going to work with the igraph package. Run this command:
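Loading a package is a single call to library(). A minimal version, assuming igraph is already installed on your machine:

```r
# Attach igraph so that its functions (make_graph, plot, ...) are available
library(igraph)
```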

If and only if you get an error saying that the package is unknown, run the chunk above called Package_Installation. Then, or otherwise, continue with the next line.

The best source of information on igraph is the online documentation.

Prelude: Minimal R

R is not a command language - e.g., just a bunch of commands for statistical operations - but a full programming language. This provides great flexibility, but also means that one has to understand a little bit about programming. Really, just a little bit. Occasionally, therefore, we need to learn - or remind ourselves of - something about R. It’ll always be short and to the point. Two elementary things today: assignment and the vector constructor c().

Assignment

You can assign a value to an object using assign(), “<-”, or “=”:

x <- 3         # Assignment
x              # Evaluate the expression and print result
[1] 3
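As a small sketch, the three forms of assignment can be put side by side. The object names a, b and d are made up for this illustration:

```r
a <- 3            # arrow form, the most common style in R code
b = 3             # equals form, equivalent at the top level
assign("d", 3)    # function form; the object name is given as a string

# All three objects hold the same value
identical(a, b) && identical(b, d)
```

In everyday code the arrow form is what you will see most often; assign() is mainly useful when the name of the object is itself computed.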

Now you: Assign 4 to y and evaluate.

y = 4
y
[1] 4

Next, what do you expect as result when you run this chunk:

y + 5
[1] 9
y
[1] 4

Why did the value of y not change?

Because it was not assigned a new value; it was only used in an evaluation.
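To actually change y, the result has to be assigned back to it. A minimal sketch:

```r
y <- 4
y + 5         # evaluates the expression; y itself is untouched
y <- y + 5    # assigns the result back to y
y             # y now holds the new value
```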

We can combine objects in expressions, like in algebra:

z =  x + 5*y    # Assignment
z             # Evaluation
[1] 23

Not only can you create objects, you can also destroy them:

rm(z)          # Remove z: deletes the object.
z              # Error!
Error: object 'z' not found

Lists and vectors

Lists are basic to R. This is a list: (apple, pear, banana, cherry). Simple lists of this kind, in which all elements have the same type (characters, numbers) and no lists are embedded in lists, are called vectors in R. The vector, not an individual value (atom), is the basic data type in R. One way to construct vectors is with c().

fruit <- c("apple", "pear", "banana", "cherry")   # Construct the vector
fruit                                             # Evaluate and print
[1] "apple"  "pear"   "banana" "cherry"

Now you: Make a vector - a simple list - of the numbers 1 to 4. Assign it to numVec.

numVec <- c(1, 2, 3, 4)
numVec
[1] 1 2 3 4

R is all about vectors, so math is done on vectors if they contain numbers:

numVec + 1
[1] 2 3 4 5
numVec * numVec
[1]  1  4  9 16

(Those of you who have done a bit of coding will see immediately what this means: Much less need for looping!)
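To make the remark about looping concrete, here is a small comparison. The names squaresLoop and squaresVec are made up for this sketch:

```r
numVec <- c(1, 2, 3, 4)

# Loop version: fill a result vector element by element
squaresLoop <- numeric(length(numVec))
for (i in seq_along(numVec)) {
  squaresLoop[i] <- numVec[i] * numVec[i]
}

# Vectorised version: one expression does the same work
squaresVec <- numVec * numVec

# Both approaches give the same result
identical(squaresLoop, squaresVec)
```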

If you want to pick an element, or a set of elements, you can use the index:

numVec[2]
[1] 2
c("Eins", "zwei", "drei")[1]
[1] "Eins"
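A few more ways to pick elements, as a short sketch. Note that R starts counting at 1, not at 0:

```r
numVec <- c(1, 2, 3, 4)

firstAndThird <- numVec[c(1, 3)]     # pick several positions at once
allButFirst   <- numVec[-1]          # a negative index drops that position
largerThanTwo <- numVec[numVec > 2]  # a logical condition selects matching elements

firstAndThird
allButFirst
largerThanTwo
```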

When we assign different kinds of atoms to a vector, they will be transformed automatically to be of one type - a process called coercion. Like so:

c(1, 2, 3, "four")
[1] "1"    "2"    "3"    "four"

Better not to try arithmetic on this vector.
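If you are curious what happens, the following sketch checks the type of the coerced vector and catches the error that arithmetic would raise. The names mixed, result and converted are made up for this illustration:

```r
mixed <- c(1, 2, 3, "four")
class(mixed)    # everything was coerced to character

# Arithmetic on a character vector fails; tryCatch() captures the error message
result <- tryCatch(mixed + 1, error = function(e) conditionMessage(e))
result

# Converting back works only for the elements that look like numbers;
# "four" becomes NA (the warning about this is suppressed here)
converted <- suppressWarnings(as.numeric(mixed))
converted
```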

Now the difference between a vector and a list can be demonstrated, using the list() function to create, you guessed it, a list.

list(1, "apple", 5, "pear")
[[1]]
[1] 1

[[2]]
[1] "apple"

[[3]]
[1] 5

[[4]]
[1] "pear"

You see, because vectors are R’s ‘primitive’ data structure, a list is a list of vectors, even when these vectors have length 1.

MixedBag <- list(fruit, numVec)
MixedBag
[[1]]
[1] "apple"  "pear"   "banana" "cherry"

[[2]]
[1] 1 2 3 4

As you see, lists can contain multiple vectors - and, indeed, multiple lists - and these vectors can be of different types (strings and numbers in this case).
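Getting things back out of a list works with double brackets, which match the [[1]] and [[2]] in the printout. A short sketch:

```r
fruit <- c("apple", "pear", "banana", "cherry")
numVec <- c(1, 2, 3, 4)
MixedBag <- list(fruit, numVec)

MixedBag[[1]]          # double brackets return the vector stored at position 1
MixedBag[[2]][3]       # third element of the second vector
class(MixedBag[1])     # single brackets return a (shorter) list
class(MixedBag[[1]])   # double brackets return the element itself
```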

This ends the Prelude.

Create networks

Example network: undirected

Usually we would read network data from files, but for an initial understanding - and for small networks - we can create them “manually”. For instance, an undirected graph with 3 edges:

g1 <- make_graph(edges=c(1,2, 2,3, 3,1), directed=FALSE)
  • g1 is a variable (in a programming sense)
  • <- is the assignment operator: “Assign the value of what is to the right of the operator to what is to the left of it”. It’s meant to look like an arrow.
  • c() concatenates what is between the parentheses into a vector.
  • The numbers are interpreted as vertex IDs, so the edges are 1–2, 2–3, 3–1.
  • The graph is undirected (the default, directed=TRUE, is overridden)

A simple plot of g1:

plot(g1)

And some information about g1:

g1
IGRAPH 094b0d4 U--- 3 3 -- 
+ edges from 094b0d4:
[1] 1--2 2--3 1--3

The first line is a bit cryptic, but informative. The description of an igraph object starts with the class name and an identifier, followed by four letter codes and two counts:

  • IGRAPH 094b0d4: Class and an internal identifier for the graph object - we can safely ignore this
  • D or U, for a directed or undirected graph
  • N for a named graph (where nodes have a name attribute), else -.
  • W for a weighted graph (where edges have a weight attribute), else -.
  • B for a bipartite (two-mode) graph (where nodes have a type attribute), else -.
  • number of vertices (nodes)
  • number of edges (links, ties)
  • followed by two dashes --, after which the name of the graph (its name attribute) is printed if it has one; our graph has none.

The lines after the first contain the network connections.
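The same information can also be asked for directly. The sketch below uses igraph's query functions to read off each part of the first line for g1:

```r
library(igraph)

g1 <- make_graph(edges = c(1,2, 2,3, 3,1), directed = FALSE)

is_directed(g1)    # FALSE, hence U rather than D
is_named(g1)       # FALSE, hence - instead of N
is_weighted(g1)    # FALSE, hence - instead of W
is_bipartite(g1)   # FALSE, hence - instead of B
vcount(g1)         # number of vertices
ecount(g1)         # number of edges
```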

Question: What does it tell you that the listing of edges (links) is preceded by [1]?

It means that the links are stored in a vector, but it’s not quite clear to me how because the links are not represented as strings:

c("1--2", "2--3", "1--3")
[1] "1--2" "2--3" "1--3"

I presume igraph defined a new data type that makes a tie such as 1--2 or 1->2 a ‘primitive’.
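One way to look into this, as a hedged sketch: E() returns the edges of a graph, and the class of that object suggests that igraph does indeed define its own type for a sequence of edges. The [1] then plays the same role as for ordinary vectors: it gives the index of the first element printed on that line.

```r
library(igraph)

g1 <- make_graph(edges = c(1,2, 2,3, 3,1), directed = FALSE)

es <- E(g1)        # the edges of g1
class(es)          # igraph's own edge sequence class
length(es)         # one element per edge

# If you prefer plain numbers, the edges can be converted to a two-column matrix
el <- as_edgelist(g1)
el
```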

Example network: directed

g2 <- make_graph( edges=c(1,2, 2,3, 3,1), n=10 )   # directed=TRUE is the default
g2
IGRAPH 90fe0b9 D--- 10 3 -- 
+ edges from 90fe0b9:
[1] 1->2 2->3 3->1

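Because directed defaults to TRUE, the edges are directed. Setting n=10 yields a network with 10 vertices, of which only vertices 1 to 3 are connected; vertices 4 to 10 are isolates.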
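Which vertices are unconnected can be checked with degree(), which counts the edges attached to each vertex. A short sketch; the names deg and unconnected are made up for this illustration:

```r
library(igraph)

g2 <- make_graph(edges = c(1,2, 2,3, 3,1), n = 10)

deg <- degree(g2)                 # incoming plus outgoing edges per vertex
deg

unconnected <- which(deg == 0)    # IDs of the vertices without any edge
unconnected
```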

plot(g2)

Example network: named vertices

g3 <- make_graph( c("John", "Jim", "Jim", "Jill", "Jill", "John"))
plot(g3)

g3
IGRAPH 013dfe5 DN-- 3 3 -- 
+ attr: name (v/c)
+ edges from 013dfe5 (vertex names):
[1] John->Jim  Jim ->Jill Jill->John

You can see in line 1 that this is a Directed and Named graph. The description also lists node and edge attributes. In this case, the name (v/c) string means “vertex-level attribute name with data type character”. In general:

  • (g/c) - graph-level character attribute
  • (v/c) - vertex-level character attribute
  • (e/n) - edge-level numeric attribute
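To see an edge-level numeric attribute in action, the sketch below adds a weight to each edge of g3. The weight values 1, 2, 3 are invented purely for illustration:

```r
library(igraph)

g3 <- make_graph(c("John", "Jim", "Jim", "Jill", "Jill", "John"))

V(g3)$name                    # the vertex-level character attribute (v/c)
E(g3)$weight <- c(1, 2, 3)    # add an edge-level numeric attribute (e/n)

vertex_attr_names(g3)         # attributes stored on vertices
edge_attr_names(g3)           # attributes stored on edges
is_weighted(g3)               # TRUE once a weight attribute exists

g3                            # the description now also lists weight (e/n)
```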

Example network: named vertices and with named isolates

g4 <- make_graph( c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"), 
                  isolates=c("Jesse", "Janis", "Jennifer", "Justin"))
plot(g4, edge.arrow.size=.5, vertex.color="gold", vertex.size=15, 
     vertex.frame.color="gray", vertex.label.color="black", 
     vertex.label.cex=0.8, vertex.label.dist=2, edge.curved=0.2)

This example shows how to label isolates in addition to connected vertices, and demonstrates some of the ways one can use the plot() function to control display features.

Feel free to explore the plot function a bit more. For instance, try to change the vertex colour to red or green instead of gold. Remember, you can add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I (Ctrl+Alt+I on Windows and Linux).

(insert one or more chunks here, as you like.)

Exercise

Create a network object (and plot) that satisfies this IGRAPH signature:

DN-- 7 4 --

I copy Adrienne’s solution as an example (correct and first submitted):

g5 <- make_graph( c("Chris", "Adrienne", "Adrienne", "Kate", "Kate", "Jesse", "Jesse", "Chris"),
                  isolates=c("Mary", "Ado", "Kuzma"))
g5
IGRAPH 35058c0 DN-- 7 4 -- 
+ attr: name (v/c)
+ edges from 35058c0 (vertex names):
[1] Chris   ->Adrienne Adrienne->Kate     Kate    ->Jesse    Jesse   ->Chris   
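The solution is not unique. As a sketch, here is another graph with the same signature, together with a check of each part of it. The object name g6 and the vertex names A to G are made up for this illustration:

```r
library(igraph)

# Four directed edges among five named vertices, plus two named isolates
g6 <- make_graph(c("A", "B", "B", "C", "C", "D", "D", "E"),
                 isolates = c("F", "G"))

# Check the signature DN-- 7 4 piece by piece; stopifnot() is silent if all hold
stopifnot(
  is_directed(g6),     # D
  is_named(g6),        # N
  !is_weighted(g6),    # - (no weight attribute)
  !is_bipartite(g6),   # - (no type attribute)
  vcount(g6) == 7,     # 7 vertices
  ecount(g6) == 4      # 4 edges
)
g6
```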
