Overview

The analysis of two-mode data is important to the field of network analysis, mainly because it opens up many more opportunities to use the network analysis toolset. This comes with a caveat: we should not treat data as though they are a two-mode network just because we can make the computer do so. Rather, two-mode networks are appropriate when the association that you have found provides a reasonable depiction of latent ties, of the opportunity to form a tie, or of a fair amount of certainty that a tie is implied.

The trouble with two-mode networks is that their sparse nature, with the nodes in one mode being reachable only through a node from the other mode, limits how they may be analyzed.

This practicum is designed to familiarize you with two-mode data. As such, it does not go much further than getting the network into R, converting it from two modes to one, and running a few simple plots and analyses.

To download the data (davis.csv), go to the Network Data page on the course website: https://goo.gl/3rkUK4

Setup

This part is important. Please be sure to do this part again.

Create a folder labeled “Bipartite Networks in igraph” someplace on your computer, such as your Desktop or wherever you will be able to easily find it again. Then, set your working directory to that folder in RStudio:

Session
- Set Working Directory
- Choose Directory…

Getting Started

Start by loading igraph.

library(igraph)

Loading and configuring two-mode data

You can download the example data here: http://bit.ly/2xzM1po

If you have downloaded the file into the folder you created for this exercise, then you can use the following script:

davis <- read.csv("davis.csv", header=FALSE)

If you are not sure where you put it and would appreciate the ability to look for the file, then use the file.choose() function.

davis <- read.csv(file.choose(), header=FALSE)

Once the edgelist is loaded, take a look at it. We don’t need to see it all to get an idea of what is in there. So, use the head() function to view just the first six rows of the data.

head(davis)

##       V1 V2
## 1 EVELYN  1
## 2 EVELYN  2
## 3 EVELYN  3
## 4 EVELYN  4
## 5 EVELYN  5
## 6 EVELYN  6

As you can see, the first column is the women from Davis’ Southern Women network and the second column is the events that they attended. This is how a two-mode edgelist should be organized: the first mode will be whatever is represented in the first column and the second mode is represented in the second column.

Recall that, for this to be a two-mode network, ties should exist only between modes, and not within modes. In this case, that means that ties are only possible between women and events, not between women and women or between events and events. Any direct ties between nodes within a mode may be derived, as we will do below. But they should not appear within the network at this point.
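One quick way to check this condition before building the network is to confirm that no label appears in both columns of the edgelist. The sketch below uses base R only and a small made-up edgelist; toy_edges is a hypothetical stand-in for davis, not part of the course data.

```r
# Toy two-mode edgelist: column 1 = people, column 2 = events (made-up data).
toy_edges <- data.frame(
  V1 = c("ANN", "ANN", "BEA", "CAT"),
  V2 = c("E1", "E2", "E2", "E3"),
  stringsAsFactors = FALSE
)

# A label found in both columns would mean a node is being used in both modes.
shared_labels <- intersect(toy_edges$V1, toy_edges$V2)

# TRUE when the two columns describe disjoint sets of nodes.
is_two_mode <- length(shared_labels) == 0
is_two_mode
```

If shared_labels is not empty, it is worth inspecting those rows before going any further.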

Go ahead and make the network.

g <- graph_from_data_frame(davis, directed=FALSE)  ## current name for graph.data.frame()

Because igraph does not automatically recognize two-mode networks, it is necessary to tell igraph that there are two types of vertices. There are multiple methods for doing this. We cover two options here:

Using igraph’s native bipartite_mapping() function
Manually telling igraph that it has a two-mode (bipartite) network

Igraph’s `bipartite_mapping()` function

Igraph can evaluate the network that you have entered for whether it meets the criteria of a two-mode network. Those criteria are that there are (1) two sets of nodes in the network, and (2) there are only ties between node sets and not within them. That is, there are two sets of entities in the network, and the entities from each set are only connected with one another through the other node set. If the network meets the criteria, igraph will identify which nodes belong in each mode.

To see what the function does, try running it:

bipartite_mapping(g)

## $res
## [1] TRUE
## 
## $type
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Igraph returns two responses:

Whether the network meets the criteria of a two-mode network ($res), and
Which nodes fall into each mode ($type).

The “type” attribute is what igraph uses to identify the two modes. We can add it to the network fairly easily.

V(g)$type <- bipartite_mapping(g)$type  ## Add the "type" attribute
                                            ##  to the network.
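To see both responses on something small, here is a minimal sketch using a made-up network (toy_edges and toy_g are hypothetical names, not part of the Davis data). It also shows a case that fails the criteria: a triangle, which cannot be split into two modes.

```r
library(igraph)

# Made-up two-mode edgelist: three people (column 1) and two events (column 2).
toy_edges <- data.frame(
  person = c("ANN", "ANN", "BEA", "CAT"),
  event  = c("E1", "E2", "E2", "E2")
)
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

# $res reports whether a two-mode split exists; $type gives one mode per vertex.
toy_map <- bipartite_mapping(toy_g)
V(toy_g)$type <- toy_map$type

# Every edge should join a vertex of one type to a vertex of the other type.
toy_el <- as_edgelist(toy_g, names = FALSE)
edges_cross_modes <- all(V(toy_g)$type[toy_el[, 1]] != V(toy_g)$type[toy_el[, 2]])
edges_cross_modes

# A triangle (ring of three vertices) has no valid two-mode split.
triangle_res <- bipartite_mapping(make_ring(3))$res
triangle_res
```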

Note: If you would like to learn more about manually assigning nodes to each mode, check out our page on importing Pajek (.net) files, at: http://rpubs.com/pjmurphy/317505

First Look: Visualization of Two-Mode Networks

There is a range of options for visualizing two-mode networks. At a minimum, you will want some way to show which mode each node belongs to. Beyond that, it will be up to the analyst to decide which option best suits their particular needs.

What follows are just a few of the basic options for visualizing two-mode networks.

Plotting a Bipartite Network

plot(g)

We can do better than that. Let’s make our plot more easily recognizable as being bipartite.

Tweaking Labels

Let’s take a look at our bipartite network, but let’s tweak the labels a tad to clean up the visualization.

We’re going to use

vertex.label.cex to change the size of the labels
- vertex.label.cex acts as a scale by which label size is multiplied
vertex.label.color
- which is self-explanatory

If you haven’t yet, you should take a look at the standard options for plotting igraph objects (run ?igraph.plotting).

plot(g, vertex.label.cex = 0.8, vertex.label.color = "black")

`V(g)$color` and `V(g)$shape`

Since we’re dealing with two different kinds of vertices in a bipartite network, the first thing we should do is make identification of the type of each vertex more visually intuitive.

Let’s do this using color and shape, with the ifelse() function keyed on each vertex’s type.

Since our network’s types are already either TRUE or FALSE, we only need to pass V(g)$type as the first argument to ifelse().

If a value is already TRUE, there’s no need to compare it to anything. This is one of the ways that understanding logical evaluations in R can simplify your code.

The result of using ifelse() is that "lightblue" is going to be assigned to the color of our TRUE vertices and "salmon" will be assigned to the color of our FALSE vertices.

V(g)$color <- ifelse(V(g)$type, "lightblue", "salmon")
V(g)$shape <- ifelse(V(g)$type, "circle", "square")
E(g)$color <- "lightgray"

plot(g, vertex.label.cex = 0.8, vertex.label.color = "black")
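The ifelse() logic used for color and shape can be checked on a plain logical vector, without a network. This is a minimal sketch in base R; toy_type is a made-up stand-in for V(g)$type.

```r
# A logical vector standing in for V(g)$type (made-up values).
toy_type <- c(FALSE, FALSE, TRUE, TRUE, FALSE)

# No comparison needed: ifelse() uses the TRUE/FALSE values directly.
toy_color <- ifelse(toy_type, "lightblue", "salmon")

# Writing toy_type == TRUE gives the same result, with extra typing.
same_result <- identical(
  toy_color,
  ifelse(toy_type == TRUE, "lightblue", "salmon")
)

toy_color
same_result
```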

You might guess that customizing our plot() arguments can get really complicated, but we can make life easier.

When we want to change things like vertex.label.color or vertex.label.cex, we’re modifying how the network is visually represented.

Translation? We can just add the plot arguments that we want to the igraph object itself in the same way we did for color and shape.

Since we modify vertices by using V(), we can omit the vertex. prefix.

For example:

Instead of using vertex.label.color = "black"
- as an argument in plot(),
we can just assign this value to the igraph object
- by using V(g)$label.color <- "black"

Let’s…

make the labels for our vertices "black"
assign the label.cex value to adjust our label size
- a value of 1 keeps the labels at their default size; because the numbers don’t take up as much room as the names, you could also vary the size by mode with ifelse()
assign frame.color to "gray" to soften the vertex borders
- setting it to NA instead would remove the borders entirely
increase the size of the vertices slightly by using size
- the default size is 15

Optionally, label.font can be set to igraph’s value for bold font (2).

We can then use layout_with_graphopt to minimize vertex overlap when we plot().

V(g)$label.color <- "black"
V(g)$label.cex <- 1
V(g)$frame.color <- "gray"
V(g)$size <- 18
## Optional tweaks, left switched off here:
## V(g)$label.font <- 2                           ## bold labels
## V(g)$label.dist <- 0                           ## label centered on the vertex
## V(g)$label.cex <- ifelse(V(g)$type, 0.8, 1.2)  ## label size by mode

plot(g, layout = layout_with_graphopt)

Alternatively, we can use the bipartite-specific layout.

plot(g, layout=layout.bipartite, vertex.size=7, vertex.label.cex=0.6)

There’s no silver bullet for visualization. It is as much an art as it is a science, and you may feel that you can represent the network in a better way than shown here. Play with the options until you get a visualization that is easily understandable and highlights the features to which you’re trying to draw interest.

We can make our visualization more informative by incorporating measurements into our plot. For instructions on how to do so, see the sections below.

Analytic Options for Two-Mode Networks

Two-mode networks can be very rewarding to study. But, they are also somewhat of a challenge, as their analysis is still not well defined. Below, we present three options for analyzing two-mode networks.

Pretend like they are one-mode networks and analyze as usual
Analyze each mode independently using metrics that are specialized for use with two-mode networks
Convert the two-mode network to two one-mode networks and analyze them as usual

Each of the above options has its trade-offs and strengths. For a larger discussion of these trade-offs, see this week’s readings. For a description of how to do each, we’ll approach each option in order.

Option 1: Proceed as Usual for One-Mode Networks

Igraph was not designed with two-mode networks in mind. It does, however, recognize that the network is two-mode. Keep in mind that when you run centrality measures on a two-mode network, igraph treats all of the nodes as though they are in the same mode. Igraph makes no allowance for calculating centralities that are specific to the special case of two-mode networks.

If you are interested in understanding the relative prominence of nodes in each mode, relative to other nodes in that mode, then the best you will be able to do in igraph will be to analyze each mode separately. For more on that, see the following two sections.

Calculating Centrality

types <- V(g)$type                 ## getting each vertex `type` lets us sort easily
deg <- degree(g)
bet <- betweenness(g)
clos <- closeness(g)
eig <- eigen_centrality(g)$vector

cent_df <- data.frame(types, deg, bet, clos, eig)

cent_df[order(cent_df$types, decreasing = TRUE),] ## sort w/ `order` by `types`

##           types deg         bet       clos       eig
## 1          TRUE   3   0.9737485 0.01190476 0.2801697
## 2          TRUE   3   0.9440910 0.01190476 0.2970199
## 3          TRUE   6   8.2374247 0.01282051 0.4990855
## 4          TRUE   4   3.4813278 0.01219512 0.3473754
## 5          TRUE   8  17.0378880 0.01351351 0.6350431
## 6          TRUE   8  29.3873904 0.01562500 0.6466602
## 8          TRUE  14 110.2063970 0.01923077 1.0000000
## 7          TRUE  10  58.5347884 0.01666667 0.7569644
## 9          TRUE  12 101.9321435 0.01785714 0.7490484
## 10         TRUE   5   5.1719208 0.01250000 0.3363400
## 12         TRUE   6   8.1785703 0.01282051 0.4002802
## 13         TRUE   3   1.0127765 0.01190476 0.2229050
## 11         TRUE   4   8.8887567 0.01219512 0.1767478
## 14         TRUE   3   1.0127765 0.01190476 0.2229050
## EVELYN    FALSE   8  42.9802009 0.01666667 0.6607035
## LAURA     FALSE   7  22.8541379 0.01515152 0.6103528
## THERESA   FALSE   8  38.9796350 0.01666667 0.7314245
## BRENDA    FALSE   7  22.0215369 0.01515152 0.6178219
## CHARLOTTE FALSE   4   4.7152628 0.01250000 0.3320230
## FRANCES   FALSE   4   4.7678826 0.01388889 0.4124632
## ELEANOR   FALSE   4   4.2026349 0.01388889 0.4507133
## PEARL     FALSE   3   3.0261439 0.01388889 0.3553458
## RUTH      FALSE   4   7.4684831 0.01470588 0.4659001
## VERNE     FALSE   4   7.0032613 0.01470588 0.4310787
## MYRNA     FALSE   4   7.2732350 0.01428571 0.3686892
## KATHERINE FALSE   6  21.0764255 0.01515152 0.4348144
## SYLVIA    FALSE   7  31.9105696 0.01612903 0.5470918
## NORA      FALSE   8  50.4903061 0.01666667 0.5208987
## HELEN     FALSE   5  18.8624553 0.01515152 0.3960796
## DOROTHY   FALSE   2   0.8693195 0.01351351 0.2594293
## OLIVIA    FALSE   2   2.2492548 0.01219512 0.1373196
## FLORA     FALSE   2   2.2492548 0.01219512 0.1373196
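Because these scores mix both modes, it can help to rank each node only against the other nodes in its own mode. The sketch below does that in base R with a handful of degree values copied from the table above; toy_cent and rank_in_mode are illustrative names, not part of the original workflow.

```r
# A few degree values copied from the table above (other columns omitted).
toy_cent <- data.frame(
  types = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE),
  deg   = c(8, 3, 7, 14, 12, 3),
  row.names = c("EVELYN", "PEARL", "SYLVIA", "8", "9", "1")
)

# Sort by mode first, then by descending degree within each mode.
toy_sorted <- toy_cent[order(toy_cent$types, -toy_cent$deg), ]

# Rank each node against the nodes in its own mode only (1 = highest degree).
toy_sorted$rank_in_mode <- ave(-toy_sorted$deg, toy_sorted$types, FUN = rank)

toy_sorted
```

The same two lines should work on the full cent_df, swapping deg for whichever centrality column you want to compare.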

Sizing Vertices by Centrality

V(g)$size <- degree(g)
V(g)$label.cex <- degree(g) * 0.2

plot(g, layout = layout_with_graphopt)
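Using raw degree as the vertex size can leave low-degree nodes too small to read. One option, sketched here under the assumption that a linear rescaling is acceptable for your plot, is to map the scores into a fixed size range first. rescale_to_range is a hypothetical helper written for this example, not an igraph function.

```r
# Hypothetical helper: linearly rescale values into [new_min, new_max].
rescale_to_range <- function(x, new_min = 8, new_max = 24) {
  span <- max(x) - min(x)
  if (span == 0) {
    # All values equal: fall back to the midpoint of the target range.
    return(rep((new_min + new_max) / 2, length(x)))
  }
  new_min + (x - min(x)) / span * (new_max - new_min)
}

# Made-up degree values standing in for degree(g).
toy_deg <- c(2, 4, 8, 14)
toy_size <- rescale_to_range(toy_deg)
toy_size
```

With the network above, this could be applied as V(g)$size <- rescale_to_range(degree(g)).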

Option 2: Analyze Each Mode Separately Using Two-Mode Metrics

There are presently few options for analyzing two-mode networks in R. One of the more established options is Tore Opsahl’s tnet package. You can find more information about tnet, as well as many other two-mode network analysis ideas, on Tore’s blog (https://toreopsahl.com/), starting with this post: https://toreopsahl.com/2009/06/12/tnet-software-for-analysing-weighted-networks/

You can also find a lot of information about the package by running ?tnet once it is loaded.

To install tnet, just do as you usually would.

install.packages("tnet", dependencies=TRUE)

library(tnet)  # start tnet

Working with the `tnet` data format

When using tnet, keep in mind that it was not designed to work like igraph and it was not designed to work with igraph data objects. This means that you will have to convert the data that you have entered into igraph to work with tnet.

The tnet package expects the data to be formatted as a numeric edgelist. Do not include names. Thankfully, producing a numeric edgelist from an igraph data object is fairly easy and quick using the get.edgelist function in igraph.

To suppress the names that would normally exist in an edgelist like this, include the following argument: names=FALSE.

tm <- get.edgelist(g, names=FALSE)

head(tm)  # check to make sure it worked

##      [,1] [,2]
## [1,]    1   19
## [2,]    1   20
## [3,]    1   21
## [4,]    1   22
## [5,]    1   23
## [6,]    1   24

A notable inconvenience to using this particular data format is that it can be difficult to keep up with the names of the various nodes. To help with that, you can extract the labels of the nodes in the igraph data object for later use.

NodeLabels <- V(g)$name

head(NodeLabels)   # Again, check

## [1] "EVELYN"    "LAURA"     "THERESA"   "BRENDA"    "CHARLOTTE" "FRANCES"

There is just one more thing to keep in mind as you work with tnet. The package only analyzes the first of the two modes (whichever one is in column one). To analyze the second mode, you will need to swap the two columns.

Here, we store the swapped edgelist as “mt”.

mt <- tm[, c(2, 1)]

head(mt)

##      [,1] [,2]
## [1,]   19    1
## [2,]   20    1
## [3,]   21    1
## [4,]   22    1
## [5,]   23    1
## [6,]   24    1

Calculating centrality in `tnet`

deg_tm <- degree_tm(tm)

deg_mt <- degree_tm(mt)
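Two-mode degree for a first-mode node is simply the number of second-mode nodes it is tied to, so it can be cross-checked in base R by counting how often each id appears in column one. This is a minimal sketch on a made-up edgelist, assuming the edgelist has no duplicated rows; toy_tm and toy_labels are hypothetical stand-ins for tm and NodeLabels.

```r
# Toy numeric edgelist laid out like tm: column 1 = first mode, column 2 = second mode.
toy_tm <- cbind(c(1, 1, 1, 2, 2, 3),
                c(4, 5, 6, 4, 6, 6))
toy_labels <- c("ANN", "BEA", "CAT", "E1", "E2", "E3")  # made-up names, indexed by node id

# Degree of each first-mode node = number of rows with its id in column 1.
toy_deg <- as.vector(table(toy_tm[, 1]))
names(toy_deg) <- toy_labels[sort(unique(toy_tm[, 1]))]

# Swapping the columns gives the same count for the second mode.
toy_mt <- toy_tm[, c(2, 1)]
toy_deg_events <- as.vector(table(toy_mt[, 1]))
names(toy_deg_events) <- toy_labels[sort(unique(toy_mt[, 1]))]

toy_deg
toy_deg_events
```

The label lookup is also a convenient way to put names back onto numeric ids after working in tnet.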

Okay, so Tore has only defined how to calculate degree in a two-mode network. Other programs, like UCINET and Pajek, have already defined two-mode centrality measures, with appropriate normalizations. In this case, however, Tore has explained that he feels that it is more appropriate to convert the two-mode network into a one-mode network with weighted ties and then analyze that. He has designed some methods for analyzing weighted centrality in one-mode networks (also included in tnet) for that purpose.

You may feel that this was a lot of buildup for very little payoff, and you are correct. But, we will be able to use tnet once we have completed the next section. This brings us to option 3: converting two-mode networks to one-mode networks for further analysis. We’ll pick tnet back up at the end of option three.

Option 3: Converting Two-Mode to One-Mode Networks

Because igraph does not include measures that are designed specifically for use with two-mode data, you may wish to convert the two-mode network into two one-mode networks. So, if you have a network of women and events, as we do here, you will be able to create a woman-by-woman network and an event-by-event network. These resulting networks may initially be valued to reflect similarity, opportunity, or simple overlaps in behavior. The tnet package is designed to handle weighted networks to produce measures of centrality.

Alternatively, you may decide to binarize the networks in order to better reflect your own consideration of what should constitute a tie under the circumstances that you are researching.
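Binarizing usually comes down to picking a cutoff and keeping only the ties at or above it. The sketch below shows one way to do that in base R on a made-up valued matrix; the cutoff of 2 is an arbitrary assumption for illustration, not a recommendation.

```r
# Made-up weighted one-mode matrix (co-attendance counts among four people).
toy_names <- c("ANN", "BEA", "CAT", "DEE")
toy_w <- matrix(c(0, 3, 1, 0,
                  3, 0, 2, 1,
                  1, 2, 0, 0,
                  0, 1, 0, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(toy_names, toy_names))

# Assumed rule: a tie exists only when two people share at least `cutoff` events.
cutoff <- 2
toy_bin <- (toy_w >= cutoff) * 1

toy_bin
```

The right cutoff depends on what you believe constitutes a tie in your own setting.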

There are several ways to convert a two-mode network. In this introduction, we will focus only on five: a simple count of overlaps, simple matching, Jaccard similarity, Pearson’s Correlation, and Yule’s Q. Although there are many, many options, these should give you a good start.

Table 1: Options for converting 2-modes to one-mode

To read the table above, consider the two-by-two tables just below it. For each pair of nodes in one of the modes, we may count the number of nodes in the other mode to which they both have a connection (a); the number of nodes in the other mode to which neither has a connection (d); and the number of nodes in the other mode to which one has a connection, but the other does not (b and c). For example, if we consider Davis’ Southern Women network, below, then we can see that Ruth and Pearl attended two events in common. Similarly, there were nine events that neither attended. Also, Ruth attended two events that Pearl did not attend, and Pearl attended one event that Ruth did not.
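The Ruth and Pearl counts can be reproduced in base R. The sketch below tallies a, b, c, and d from the two women’s attendance rows and then computes each measure. The formulas used are the common textbook definitions and are assumed, not confirmed, to match Table 1; the n_a to n_d names are illustrative.

```r
# Attendance rows for Ruth and Pearl, copied from the women-by-events matrix
# printed later in this practicum (event order: 1 2 3 4 5 6 8 7 9 10 12 13 11 14).
ruth  <- c(0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0)
pearl <- c(0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Cells of the two-by-two table.
n_a <- sum(ruth == 1 & pearl == 1)  # events both attended
n_b <- sum(ruth == 1 & pearl == 0)  # events only Ruth attended
n_c <- sum(ruth == 0 & pearl == 1)  # events only Pearl attended
n_d <- sum(ruth == 0 & pearl == 0)  # events neither attended

# Assumed textbook definitions of the five measures.
overlap         <- n_a
simple_matching <- (n_a + n_d) / (n_a + n_b + n_c + n_d)
jaccard         <- n_a / (n_a + n_b + n_c)
pearson         <- cor(ruth, pearl)
yules_q         <- (n_a * n_d - n_b * n_c) / (n_a * n_d + n_b * n_c)

c(a = n_a, b = n_b, c = n_c, d = n_d)
c(overlap = overlap, simple_matching = simple_matching,
  jaccard = jaccard, pearson = pearson, yules_q = yules_q)
```

Note that Jaccard ignores the joint absences (d), while simple matching counts them as agreement; that difference matters a great deal in sparse data.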

We can use this information to calculate the potential for ties or overall similarities between women or between events in the Southern Women network, or within modes of any two-mode network. We will treat each of the five methods listed above in the order that they appear in the table.

Note: This list is far from exhaustive. But, it should be a good place to start.

Overlap Count through Manual Projection

`as_incidence_matrix()`

An overlap count is the simplest, and most common, approach to converting two modes to one. As the name suggests, this is a count of the number of nodes in the second mode that each pair in the first mode have in common, and vice versa. To use the Southern Women as an example, to convert the network into a woman-by-woman network, this would be the number of events that each pair of women co-attended. Conversely, if you are converting the network into an event-by-event network, then it would be the number of women that each pair of events had in common.

To count overlaps, we first convert g to a rectangular matrix where the “Southern Women” are rows and the events are columns.

bipartite_matrix <- as_incidence_matrix(g)

bipartite_matrix

##           1 2 3 4 5 6 8 7 9 10 12 13 11 14
## EVELYN    1 1 1 1 1 1 1 0 1  0  0  0  0  0
## LAURA     1 1 1 0 1 1 1 1 0  0  0  0  0  0
## THERESA   0 1 1 1 1 1 1 1 1  0  0  0  0  0
## BRENDA    1 0 1 1 1 1 1 1 0  0  0  0  0  0
## CHARLOTTE 0 0 1 1 1 0 0 1 0  0  0  0  0  0
## FRANCES   0 0 1 0 1 1 1 0 0  0  0  0  0  0
## ELEANOR   0 0 0 0 1 1 1 1 0  0  0  0  0  0
## PEARL     0 0 0 0 0 1 1 0 1  0  0  0  0  0
## RUTH      0 0 0 0 1 0 1 1 1  0  0  0  0  0
## VERNE     0 0 0 0 0 0 1 1 1  0  1  0  0  0
## MYRNA     0 0 0 0 0 0 1 0 1  1  1  0  0  0
## KATHERINE 0 0 0 0 0 0 1 0 1  1  1  1  0  1
## SYLVIA    0 0 0 0 0 0 1 1 1  1  1  1  0  1
## NORA      0 0 0 0 0 1 0 1 1  1  1  1  1  1
## HELEN     0 0 0 0 0 0 1 1 0  1  1  0  1  0
## DOROTHY   0 0 0 0 0 0 1 0 1  0  0  0  0  0
## OLIVIA    0 0 0 0 0 0 0 0 1  0  0  0  1  0
## FLORA     0 0 0 0 0 0 0 0 1  0  0  0  1  0

`t()` transpose

Next, let’s look at the transpose of bipartite_matrix by using the function t().

In formal terms, if we refer to the set of nodes and ties as a matrix A, then its transpose is referred to as A’. This will be important in the section below.

t(bipartite_matrix)

##    EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH VERNE MYRNA
## 1       1     1       0      1         0       0       0     0    0     0     0
## 2       1     1       1      0         0       0       0     0    0     0     0
## 3       1     1       1      1         1       1       0     0    0     0     0
## 4       1     0       1      1         1       0       0     0    0     0     0
## 5       1     1       1      1         1       1       1     0    1     0     0
## 6       1     1       1      1         0       1       1     1    0     0     0
## 8       1     1       1      1         0       1       1     1    1     1     1
## 7       0     1       1      1         1       0       1     0    1     1     0
## 9       1     0       1      0         0       0       0     1    1     1     1
## 10      0     0       0      0         0       0       0     0    0     0     1
## 12      0     0       0      0         0       0       0     0    0     1     1
## 13      0     0       0      0         0       0       0     0    0     0     0
## 11      0     0       0      0         0       0       0     0    0     0     0
## 14      0     0       0      0         0       0       0     0    0     0     0
##    KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
## 1          0      0    0     0       0      0     0
## 2          0      0    0     0       0      0     0
## 3          0      0    0     0       0      0     0
## 4          0      0    0     0       0      0     0
## 5          0      0    0     0       0      0     0
## 6          0      0    1     0       0      0     0
## 8          1      1    0     1       1      0     0
## 7          0      1    1     1       0      0     0
## 9          1      1    1     0       1      1     1
## 10         1      1    1     1       0      0     0
## 12         1      1    1     1       0      0     0
## 13         1      1    1     0       0      0     0
## 11         0      0    1     1       0      1     1
## 14         1      1    1     0       0      0     0

Now that we’ve seen how t() works, we can multiply bipartite_matrix by its transpose: t(bipartite_matrix).

Similar to the %in% operator we saw earlier, R gives us a special operator to use for matrix multiplication: %*%.

Matrix Multiplication - For Overlap Count

The method we’re going to use to project our bipartite matrix to one-mode matrices is the Cross-Product Method with manual matrix multiplication. This is the method that we covered in class, where we multiply a two-mode matrix by its transpose to produce a one-mode network that reflects the ties between the nodes in one of the two modes.

In the instructions that follow, we are using the Southern Women data, shown above. As you can see, the women are on the y axis, and the events are on the x axis. In matrix multiplication, the order in which you enter the matrices that you are multiplying into the expression matters.

To produce a Y by Y (women x women) network, multiply the matrix we have above by its transpose (AA'). To produce an X by X network (event x event), multiply the transposed network by the original (A'A). So, let’s try it.

We’re going to multiply

the transpose of bipartite_matrix (t(bipartite_matrix))
- by
the original bipartite_matrix
- using the matrix multiplication operator %*%
and assign the whole thing to a new variable called event_matrix_prod.
- Using the transposed matrix (t(bipartite_matrix)) as our first variable in the multiplication will produce the event-by-event matrix, as the events are bipartite_matrix’s columns.
It doesn’t make any sense that there would be loops when projecting a bipartite network (as we would then be adding information that did not exist in the original network), so we want to set the diagonal of our result to 0.
- To do this, we use the diag() function with event_matrix_prod as the argument.

event_matrix_prod <- t(bipartite_matrix) %*% bipartite_matrix 
## crossprod(bipartite_matrix) does the same and scales better, but writing out the multiplication is better at first so you understand the method

diag(event_matrix_prod) <- 0

event_matrix_prod

##    1 2 3 4 5 6 8 7 9 10 12 13 11 14
## 1  0 2 3 2 3 3 3 2 1  0  0  0  0  0
## 2  2 0 3 2 3 3 3 2 2  0  0  0  0  0
## 3  3 3 0 4 6 5 5 4 2  0  0  0  0  0
## 4  2 2 4 0 4 3 3 3 2  0  0  0  0  0
## 5  3 3 6 4 0 6 7 6 3  0  0  0  0  0
## 6  3 3 5 3 6 0 7 5 4  1  1  1  1  1
## 8  3 3 5 3 7 7 0 8 9  4  5  2  1  2
## 7  2 2 4 3 6 5 8 0 5  3  4  2  2  2
## 9  1 2 2 2 3 4 9 5 0  4  5  3  3  3
## 10 0 0 0 0 0 1 4 3 4  0  5  3  2  3
## 12 0 0 0 0 0 1 5 4 5  5  0  3  2  3
## 13 0 0 0 0 0 1 2 2 3  3  3  0  1  3
## 11 0 0 0 0 0 1 1 2 3  2  2  1  0  1
## 14 0 0 0 0 0 1 2 2 3  3  3  3  1  0

You may also want to do the same thing to get the person matrix. To do this, all we have to do is reverse the order of the variables that we are multiplying.

Using the original matrix (bipartite_matrix) as our first variable in the multiplication will produce person_matrix_prod, as the women are bipartite_matrix’s rows.
- Again, we want to set the diagonal to 0 by using the diag() function.

person_matrix_prod <- bipartite_matrix %*% t(bipartite_matrix)

diag(person_matrix_prod) <- 0

person_matrix_prod

##           EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
## EVELYN         0     6       7      6         3       4       3     3    3
## LAURA          6     0       6      6         3       4       4     2    3
## THERESA        7     6       0      6         4       4       4     3    4
## BRENDA         6     6       6      0         4       4       4     2    3
## CHARLOTTE      3     3       4      4         0       2       2     0    2
## FRANCES        4     4       4      4         2       0       3     2    2
## ELEANOR        3     4       4      4         2       3       0     2    3
## PEARL          3     2       3      2         0       2       2     0    2
## RUTH           3     3       4      3         2       2       3     2    0
## VERNE          2     2       3      2         1       1       2     2    3
## MYRNA          2     1       2      1         0       1       1     2    2
## KATHERINE      2     1       2      1         0       1       1     2    2
## SYLVIA         2     2       3      2         1       1       2     2    3
## NORA           2     2       3      2         1       1       2     2    2
## HELEN          1     2       2      2         1       1       2     1    2
## DOROTHY        2     1       2      1         0       1       1     2    2
## OLIVIA         1     0       1      0         0       0       0     1    1
## FLORA          1     0       1      0         0       0       0     1    1
##           VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
## EVELYN        2     2         2      2    2     1       2      1     1
## LAURA         2     1         1      2    2     2       1      0     0
## THERESA       3     2         2      3    3     2       2      1     1
## BRENDA        2     1         1      2    2     2       1      0     0
## CHARLOTTE     1     0         0      1    1     1       0      0     0
## FRANCES       1     1         1      1    1     1       1      0     0
## ELEANOR       2     1         1      2    2     2       1      0     0
## PEARL         2     2         2      2    2     1       2      1     1
## RUTH          3     2         2      3    2     2       2      1     1
## VERNE         0     3         3      4    3     3       2      1     1
## MYRNA         3     0         4      4    3     3       2      1     1
## KATHERINE     3     4         0      6    5     3       2      1     1
## SYLVIA        4     4         6      0    6     4       2      1     1
## NORA          3     3         5      6    0     4       1      2     2
## HELEN         3     3         3      4    4     0       1      1     1
## DOROTHY       2     2         2      2    1     1       0      1     1
## OLIVIA        1     1         1      1    2     1       1      0     2
## FLORA         1     1         1      1    2     1       1      2     0

`graph_from_adjacency_matrix()`

women_overlap <- graph_from_adjacency_matrix(person_matrix_prod, 
                                        mode = "undirected", 
                                        weighted = TRUE)

women_overlap

## IGRAPH d6e4c80 UNW- 18 139 -- 
## + attr: name (v/c), weight (e/n)
## + edges from d6e4c80 (vertex names):
##  [1] EVELYN --LAURA     EVELYN --THERESA   EVELYN --BRENDA    EVELYN --CHARLOTTE
##  [5] EVELYN --FRANCES   EVELYN --ELEANOR   EVELYN --PEARL     EVELYN --RUTH     
##  [9] EVELYN --VERNE     EVELYN --MYRNA     EVELYN --KATHERINE EVELYN --SYLVIA   
## [13] EVELYN --NORA      EVELYN --HELEN     EVELYN --DOROTHY   EVELYN --OLIVIA   
## [17] EVELYN --FLORA     LAURA  --THERESA   LAURA  --BRENDA    LAURA  --CHARLOTTE
## [21] LAURA  --FRANCES   LAURA  --ELEANOR   LAURA  --PEARL     LAURA  --RUTH     
## [25] LAURA  --VERNE     LAURA  --MYRNA     LAURA  --KATHERINE LAURA  --SYLVIA   
## [29] LAURA  --NORA      LAURA  --HELEN     LAURA  --DOROTHY   THERESA--BRENDA   
## + ... omitted several edges

events_overlap <- graph_from_adjacency_matrix(event_matrix_prod, 
                                       mode = "undirected", 
                                       weighted = TRUE)

events_overlap

## IGRAPH 2b116bd UNW- 14 66 -- 
## + attr: name (v/c), weight (e/n)
## + edges from 2b116bd (vertex names):
##  [1] 1 --2  1 --3  1 --4  1 --5  1 --6  1 --8  1 --7  1 --9  2 --3  2 --4 
## [11] 2 --5  2 --6  2 --8  2 --7  2 --9  3 --4  3 --5  3 --6  3 --8  3 --7 
## [21] 3 --9  4 --5  4 --6  4 --8  4 --7  4 --9  5 --6  5 --8  5 --7  5 --9 
## [31] 6 --8  6 --7  6 --9  6 --10 6 --12 6 --13 6 --11 6 --14 8 --7  8 --9 
## [41] 8 --10 8 --12 8 --13 8 --11 8 --14 7 --9  7 --10 7 --12 7 --13 7 --11
## [51] 7 --14 9 --10 9 --12 9 --13 9 --11 9 --14 10--12 10--13 10--11 10--14
## [61] 12--13 12--11 12--14 13--11 13--14 11--14

Notice that in both women_overlap and events_overlap, the attr: row lists an attribute called weight. This is how igraph handles the results of our matrix multiplication.

You can take a peek using the E() function, which is short for edges in the same way that V() is short for vertices.

E(women_overlap)$weight

##   [1] 6 7 6 3 4 3 3 3 2 2 2 2 2 1 2 1 1 6 6 3 4 4 2 3 2 1 1 2 2 2 1 6 4 4 4 3 4
##  [38] 3 2 2 3 3 2 2 1 1 4 4 4 2 3 2 1 1 2 2 2 1 2 2 2 1 1 1 1 3 2 2 1 1 1 1 1 1
##  [75] 1 2 3 2 1 1 2 2 2 1 2 2 2 2 2 2 1 2 1 1 3 2 2 3 2 2 2 1 1 3 3 4 3 3 2 1 1
## [112] 4 4 3 3 2 1 1 6 5 3 2 1 1 6 4 2 1 1 4 1 2 2 1 1 1 1 1 2

E(events_overlap)$weight

##  [1] 2 3 2 3 3 3 2 1 3 2 3 3 3 2 2 4 6 5 5 4 2 4 3 3 3 2 6 7 6 3 7 5 4 1 1 1 1 1
## [39] 8 9 4 5 2 1 2 5 3 4 2 2 2 4 5 3 3 3 5 3 2 3 3 2 3 1 3 1

Another way to produce an overlap count in igraph: `bipartite_projection()`

igraph includes a built-in function that will project a bipartite network to one-mode networks for you by using the same cross product method.

That said, it is not the only method that can be used to project a two-mode network and now you know how to perform calculations on a matrix by its transpose.

We’re going to take a look at how igraph handles this task with bipartite_projection(), but first let’s reload our data and have a fresh g without the attributes that we added when we made our visualizations.

davis <- read.csv(file.choose(), header=FALSE)

g <- graph_from_data_frame(davis, directed=FALSE)  ## Convert to an igraph network

V(g)$type <- bipartite_mapping(g)$type        ## Add the "type" attribute
                                              ##  to the network.
g

## IGRAPH e59aeed UN-B 32 89 -- 
## + attr: name (v/c), type (v/l)
## + edges from e59aeed (vertex names):
##  [1] EVELYN   --1 EVELYN   --2 EVELYN   --3 EVELYN   --4 EVELYN   --5
##  [6] EVELYN   --6 EVELYN   --8 LAURA    --1 LAURA    --2 LAURA    --3
## [11] LAURA    --5 LAURA    --6 LAURA    --7 THERESA  --2 THERESA  --3
## [16] THERESA  --4 THERESA  --5 THERESA  --6 THERESA  --7 THERESA  --8
## [21] BRENDA   --1 BRENDA   --3 BRENDA   --4 BRENDA   --5 BRENDA   --6
## [26] BRENDA   --7 CHARLOTTE--3 CHARLOTTE--4 CHARLOTTE--5 FRANCES  --3
## [31] FRANCES  --5 FRANCES  --6 ELEANOR  --5 ELEANOR  --6 ELEANOR  --7
## [36] PEARL    --6 PEARL    --8 RUTH     --5 RUTH     --7 RUTH     --8
## + ... omitted several edges

We’re going to make a new variable called projected_g and assign our projected network to it. The minimal argument that you need to provide is your igraph object, g.

projected_g <- bipartite_projection(g, multiplicity = TRUE)

projected_g

## $proj1
## IGRAPH 9856ab6 UNW- 18 139 -- 
## + attr: name (v/c), weight (e/n)
## + edges from 9856ab6 (vertex names):
##  [1] EVELYN --LAURA     EVELYN --BRENDA    EVELYN --THERESA   EVELYN --CHARLOTTE
##  [5] EVELYN --FRANCES   EVELYN --ELEANOR   EVELYN --RUTH      EVELYN --PEARL    
##  [9] EVELYN --NORA      EVELYN --VERNE     EVELYN --MYRNA     EVELYN --KATHERINE
## [13] EVELYN --SYLVIA    EVELYN --HELEN     EVELYN --DOROTHY   EVELYN --OLIVIA   
## [17] EVELYN --FLORA     LAURA  --BRENDA    LAURA  --THERESA   LAURA  --CHARLOTTE
## [21] LAURA  --FRANCES   LAURA  --ELEANOR   LAURA  --RUTH      LAURA  --PEARL    
## [25] LAURA  --NORA      LAURA  --VERNE     LAURA  --MYRNA     LAURA  --KATHERINE
## [29] LAURA  --SYLVIA    LAURA  --HELEN     LAURA  --DOROTHY   THERESA--BRENDA   
## + ... omitted several edges
## 
## $proj2
## IGRAPH b5c25bb UNW- 14 66 -- 
## + attr: name (v/c), weight (e/n)
## + edges from b5c25bb (vertex names):
##  [1] 1 --2  1 --3  1 --4  1 --5  1 --6  1 --8  1 --9  1 --7  2 --3  2 --4 
## [11] 2 --5  2 --6  2 --8  2 --9  2 --7  3 --4  3 --5  3 --6  3 --8  3 --9 
## [21] 3 --7  4 --5  4 --6  4 --8  4 --9  4 --7  5 --6  5 --8  5 --9  5 --7 
## [31] 6 --8  6 --9  6 --7  6 --10 6 --12 6 --13 6 --11 6 --14 8 --9  8 --7 
## [41] 8 --12 8 --10 8 --13 8 --14 8 --11 7 --9  7 --12 7 --10 7 --13 7 --14
## [51] 7 --11 9 --12 9 --10 9 --13 9 --14 9 --11 10--12 10--13 10--14 10--11
## [61] 12--13 12--14 12--11 13--14 13--11 11--14

bipartite_projection() returns a list of two networks, one for each projected mode from the original bipartite network, which are notated with $ as proj1 and proj2.

There are options to tweak bipartite_projection() that will allow you to specify which network you want (which can save significant compute time on larger, denser networks) as well as disable the storing of weights produced by the matrix multiplication. Check it out with ?bipartite_projection.
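
As a rough sketch of those options, the example below projects a small, made-up bipartite graph (the vertex names and edges are hypothetical, not the Davis data). It assumes that `which = "false"` returns only the projection for the vertices whose type is FALSE and that `multiplicity = FALSE` leaves out the weight attribute, as described in ?bipartite_projection.

```r
library(igraph)

# Hypothetical toy edge list: three people (A, B, C) and two events (e1, e2)
toy_edges <- data.frame(from = c("A", "A", "B", "C", "C"),
                        to   = c("e1", "e2", "e1", "e1", "e2"))
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

# Mark events as TRUE and people as FALSE (set by hand here for clarity)
V(toy_g)$type <- V(toy_g)$name %in% c("e1", "e2")

# Project only the people (type == FALSE), keeping overlap counts as weights
people_weighted <- bipartite_projection(toy_g, which = "false")

# Same projection, but without storing the weights
people_unweighted <- bipartite_projection(toy_g, which = "false",
                                          multiplicity = FALSE)

edge_attr_names(people_weighted)    # includes "weight"
edge_attr_names(people_unweighted)  # no "weight" attribute
```

Because only one mode is requested, each call returns a single igraph object rather than a list, so there is no need to pull out proj1 or proj2 afterwards.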

In order to access each of our graphs, we can use $ which allows us to access values from lists and data frames (which are technically still lists).

Let’s assign each of the networks to its own variable with $ to simplify our work.

south_women_auto_projected <- projected_g$proj1
events_auto_projected <- projected_g$proj2

south_women_auto_projected

## IGRAPH 9856ab6 UNW- 18 139 -- 
## + attr: name (v/c), weight (e/n)
## + edges from 9856ab6 (vertex names):
##  [1] EVELYN --LAURA     EVELYN --BRENDA    EVELYN --THERESA   EVELYN --CHARLOTTE
##  [5] EVELYN --FRANCES   EVELYN --ELEANOR   EVELYN --RUTH      EVELYN --PEARL    
##  [9] EVELYN --NORA      EVELYN --VERNE     EVELYN --MYRNA     EVELYN --KATHERINE
## [13] EVELYN --SYLVIA    EVELYN --HELEN     EVELYN --DOROTHY   EVELYN --OLIVIA   
## [17] EVELYN --FLORA     LAURA  --BRENDA    LAURA  --THERESA   LAURA  --CHARLOTTE
## [21] LAURA  --FRANCES   LAURA  --ELEANOR   LAURA  --RUTH      LAURA  --PEARL    
## [25] LAURA  --NORA      LAURA  --VERNE     LAURA  --MYRNA     LAURA  --KATHERINE
## [29] LAURA  --SYLVIA    LAURA  --HELEN     LAURA  --DOROTHY   THERESA--BRENDA   
## + ... omitted several edges

events_auto_projected

## IGRAPH b5c25bb UNW- 14 66 -- 
## + attr: name (v/c), weight (e/n)
## + edges from b5c25bb (vertex names):
##  [1] 1 --2  1 --3  1 --4  1 --5  1 --6  1 --8  1 --9  1 --7  2 --3  2 --4 
## [11] 2 --5  2 --6  2 --8  2 --9  2 --7  3 --4  3 --5  3 --6  3 --8  3 --9 
## [21] 3 --7  4 --5  4 --6  4 --8  4 --9  4 --7  5 --6  5 --8  5 --9  5 --7 
## [31] 6 --8  6 --9  6 --7  6 --10 6 --12 6 --13 6 --11 6 --14 8 --9  8 --7 
## [41] 8 --12 8 --10 8 --13 8 --14 8 --11 7 --9  7 --12 7 --10 7 --13 7 --14
## [51] 7 --11 9 --12 9 --10 9 --13 9 --14 9 --11 10--12 10--13 10--14 10--11
## [61] 12--13 12--14 12--11 13--14 13--11 11--14

When we manually projected our network, we multiplied the matrix by its transpose. igraph’s bipartite_projection() does not perform that matrix multiplication. Instead, it counts the neighbors that each pair of same-mode nodes shares. For a binary incidence matrix, the two approaches give the same counts, which is why the edge weights match. The documentation for bipartite_projection() gives this explanation of its multiplicity argument:

If TRUE, then igraph keeps the multiplicity of the edges as an edge attribute. E.g. if there is an A-C-B and also an A-D-B triple in the bipartite graph (but no more X, such that A-X-B is also in the graph), then the multiplicity of the A-B edge in the projection will be 2.
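
To check that the two routes agree, here is a minimal sketch on a made-up incidence matrix (the names toy, toy_g, and proj are hypothetical). It builds the graph from the matrix, projects it with bipartite_projection(), and then looks up the manual cross-product count for each projected edge.

```r
library(igraph)

# Hypothetical incidence matrix: 3 people (rows) by 3 events (columns)
toy <- matrix(c(1, 1, 0,
                1, 0, 1,
                1, 1, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("e1", "e2", "e3")))

# Turn the 1s into an edge list and build the bipartite graph
idx <- which(toy == 1, arr.ind = TRUE)
edges <- data.frame(from = rownames(toy)[idx[, "row"]],
                    to   = colnames(toy)[idx[, "col"]])
toy_g <- graph_from_data_frame(edges, directed = FALSE)
V(toy_g)$type <- V(toy_g)$name %in% colnames(toy)   # events are TRUE

# Route 1: igraph's projection of the people (type == FALSE)
proj <- bipartite_projection(toy_g, which = "false")

# Route 2: manual cross product
manual <- toy %*% t(toy)

# Look up the manual count for the two endpoints of every projected edge
edge_ends <- ends(proj, E(proj))      # two-column matrix of vertex names
manual_w <- manual[edge_ends]

all(manual_w == E(proj)$weight)
```

The diagonal of the manual matrix (each person's total number of events) has no counterpart in the projection, since igraph does not create self-loops here.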

Jaccard Similarity

For the next two measures, you will need to install the package ade4 if you do not already have it. As with other installations, you will only need to do this once.

install.packages("ade4", dependencies = TRUE)

To learn more about what is available in the distance function in ade4, check out ?dist.binary once you have loaded ade4. In the meantime, just to keep this fairly straightforward, we’ll keep the code for these conversions fairly compact.

library(ade4) # If you have not already done so

bipartite_matrix <- as_incidence_matrix(g)  # Extract the matrix

# Method #1 is the "Jaccard Index". Note that dist.binary() returns distances
# of the form sqrt(1 - s), so larger values mean LESS similar pairs.
women_jaccard <- dist.binary(bipartite_matrix, method=1, upper=TRUE, diag = FALSE)
event_jaccard <- dist.binary(t(bipartite_matrix), method=1, upper=TRUE, diag = FALSE)

women_jaccard <- as.matrix(women_jaccard)   
diag(women_jaccard)<-0

# women_jaccard          # Look at the matrix before you binarize
jaccard_women <- ifelse(women_jaccard>0.95, 1, 0)     # Binarize: 1 where the value exceeds 0.95

# jaccard_women      # Take a look at the matrix if you like.

jacc_women <- graph_from_adjacency_matrix(jaccard_women,    # Create an igraph network
                                        mode = "undirected")
plot(jacc_women)
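
To make the Jaccard calculation concrete, the sketch below works it out by hand for two made-up attendance vectors (x and y are hypothetical, not rows from the Davis data). It also shows the relationship between the similarity s and the distance sqrt(1 - s) that ade4's documentation gives for dist.binary(), in case you would rather threshold on similarity.

```r
# Hypothetical attendance vectors for two people across six events
x <- c(1, 1, 1, 0, 1, 0)
y <- c(1, 0, 1, 1, 1, 0)

n_both   <- sum(x == 1 & y == 1)   # events both attended
n_x_only <- sum(x == 1 & y == 0)   # events only x attended
n_y_only <- sum(x == 0 & y == 1)   # events only y attended

# Jaccard similarity: shared events over events attended by either person
jaccard_sim <- n_both / (n_both + n_x_only + n_y_only)

# Distance in the form that dist.binary(method = 1) reports
jaccard_dist <- sqrt(1 - jaccard_sim)

# Going back from a distance to a similarity
recovered_sim <- 1 - jaccard_dist^2

c(similarity = jaccard_sim, distance = jaccard_dist)
```

The same conversion, 1 - d^2, can be applied to a whole matrix of distances before binarizing if you want a tie to mean that two nodes are similar.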

Simple Matching

Simple matching is also carried out using the ade4 package.

library(ade4)

bipartite_matrix <- as_incidence_matrix(g)  # Extract the matrix

women_match <- dist.binary(bipartite_matrix, method=2, upper=TRUE, diag = FALSE) # Method #2 is "simple matching"
event_match <- dist.binary(t(bipartite_matrix), method=2, upper=TRUE, diag = FALSE) # Method #2 is "simple matching"

The matrix that is returned will hold values between 0 and 1. If you treat it as though it is a normal network, there will be a value in nearly every cell. That will not be very helpful, so you will need to binarize the new matrices. Keep in mind that dist.binary() reports distances of the form sqrt(1 - s), where s is the similarity coefficient, so larger values indicate less similar pairs.

To binarize the matrices, first choose your cutoff value. For instance, if you choose a cutoff of 0.80, then use the following code to change all values greater than 0.80 to 1, and all other values to 0.

women_match <- as.matrix(women_match)
matching_women <- ifelse(women_match>0.8, 1, 0)
matching_women

##           EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
## EVELYN         0     0       0      0         0       0       0     0    0
## LAURA          0     0       0      0         0       0       0     0    0
## THERESA        0     0       0      0         0       0       0     0    0
## BRENDA         0     0       0      0         0       0       0     0    0
## CHARLOTTE      0     0       0      0         0       0       0     0    0
## FRANCES        0     0       0      0         0       0       0     0    0
## ELEANOR        0     0       0      0         0       0       0     0    0
## PEARL          0     0       0      0         0       0       0     0    0
## RUTH           0     0       0      0         0       0       0     0    0
## VERNE          0     0       0      0         0       0       0     0    0
## MYRNA          0     1       0      1         0       0       0     0    0
## KATHERINE      1     1       1      1         1       0       0     0    0
## SYLVIA         1     1       1      1         1       1       0     0    0
## NORA           1     1       1      1         1       1       0     0    0
## HELEN          1     0       1      0         0       0       0     0    0
## DOROTHY        0     0       0      0         0       0       0     0    0
## OLIVIA         0     1       0      1         0       0       0     0    0
## FLORA          0     1       0      1         0       0       0     0    0
##           VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
## EVELYN        0     0         1      1    1     1       0      0     0
## LAURA         0     1         1      1    1     0       0      1     1
## THERESA       0     0         1      1    1     1       0      0     0
## BRENDA        0     1         1      1    1     0       0      1     1
## CHARLOTTE     0     0         1      1    1     0       0      0     0
## FRANCES       0     0         0      1    1     0       0      0     0
## ELEANOR       0     0         0      0    0     0       0      0     0
## PEARL         0     0         0      0    0     0       0      0     0
## RUTH          0     0         0      0    0     0       0      0     0
## VERNE         0     0         0      0    0     0       0      0     0
## MYRNA         0     0         0      0    0     0       0      0     0
## KATHERINE     0     0         0      0    0     0       0      0     0
## SYLVIA        0     0         0      0    0     0       0      0     0
## NORA          0     0         0      0    0     0       0      0     0
## HELEN         0     0         0      0    0     0       0      0     0
## DOROTHY       0     0         0      0    0     0       0      0     0
## OLIVIA        0     0         0      0    0     0       0      0     0
## FLORA         0     0         0      0    0     0       0      0     0

Then you can change this back into an igraph object and plot it.

match_women <- graph_from_adjacency_matrix(matching_women, 
                                        mode = "undirected")
plot(match_women)
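
For reference, simple matching can also be computed by hand with matrix multiplication, which keeps the result on a similarity scale. The sketch below uses a made-up incidence matrix, and the cutoff of 0.7 is arbitrary. The names toy, sm_sim, and sm_ties are hypothetical.

```r
# Hypothetical incidence matrix: 3 people (rows) by 4 events (columns)
toy <- matrix(c(1, 1, 0, 0,
                1, 1, 1, 0,
                0, 0, 1, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), paste0("e", 1:4)))

both_present <- toy %*% t(toy)              # events both people attended
both_absent  <- (1 - toy) %*% t(1 - toy)    # events both people skipped

# Simple matching similarity: share of events on which the pair agrees
sm_sim <- (both_present + both_absent) / ncol(toy)
diag(sm_sim) <- 0                           # ignore self-comparisons

# Binarize: a tie exists where similarity exceeds the chosen cutoff
sm_ties <- ifelse(sm_sim > 0.7, 1, 0)
sm_ties
```

Here A and B agree on three of the four events, so they are the only pair that ends up tied.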

Pearson’s Correlation

bipartite_matrix <- as_incidence_matrix(g)  # Extract the matrix

women_correl <- cor(t(bipartite_matrix))
event_correl <- cor(bipartite_matrix)

women_correl <- as.matrix(women_correl)   
# women_correl          # Look at the matrix before you binarize
correl_women <- ifelse(women_correl>0.6, 1, 0)    # Binarize 
diag(correl_women)<-0
# correl_women    # Take a look at the matrix if you like


corr_women <- graph_from_adjacency_matrix(correl_women,     # Create an igraph network
                                        mode = "undirected")
plot(corr_women)

Yule’s Q

Yule’s Q is a correlation calculation that is designed for binary data. Compare your results to what you get with Pearson’s correlation, which is designed for continuous data.
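
As a small illustration of the difference, the sketch below computes Yule's Q by hand from the 2x2 table of two made-up binary vectors, using Q = (ad - bc) / (ad + bc), and compares it with Pearson's correlation on the same vectors. The vectors are hypothetical, and psych's YuleCor() may apply a small correction, so its values need not match a hand calculation exactly.

```r
# Hypothetical attendance vectors for two people across eight events
x <- c(1, 1, 1, 0, 1, 0, 0, 1)
y <- c(1, 0, 1, 0, 1, 0, 1, 1)

# Cells of the 2x2 table
n11 <- sum(x == 1 & y == 1)   # both attended
n10 <- sum(x == 1 & y == 0)   # only x attended
n01 <- sum(x == 0 & y == 1)   # only y attended
n00 <- sum(x == 0 & y == 0)   # neither attended

# Yule's Q: difference of the cross products over their sum
yule_q <- (n11 * n00 - n10 * n01) / (n11 * n00 + n10 * n01)

# Pearson's correlation on the same binary vectors
pearson_r <- cor(x, y)

c(yule_q = yule_q, pearson_r = pearson_r)
```

On these vectors Q is noticeably larger than Pearson's r, which is worth keeping in mind when you pick a cutoff for binarizing.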

As with Jaccard and simple matching, you will need to install a new package in order to run Yule’s Q. You will only need to do this once.

install.packages("psych", dependencies = TRUE)

library(psych)

bipartite_matrix <- as_incidence_matrix(g)  # Extract the matrix

women_Q <-YuleCor(t(bipartite_matrix))$rho
event_Q <-YuleCor(bipartite_matrix)$rho

women_Q <- as.matrix(women_Q) 
women_Q        # Look at the matrix before you binarize

##               EVELYN      LAURA    THERESA     BRENDA  CHARLOTTE    FRANCES
## EVELYN     0.9995953  0.8617594  0.9353287  0.8617594  0.4761905  0.9677419
## LAURA      0.8617594  0.9996033  0.8617594  0.9370120  0.6148591  0.9789259
## THERESA    0.9353287  0.8617594  0.9995953  0.8617594  0.9677419  0.9677419
## BRENDA     0.8617594  0.9370120  0.8617594  0.9996033  0.9789259  0.9789259
## CHARLOTTE  0.4761905  0.6148591  0.9677419  0.9789259  0.9995171  0.5882353
## FRANCES    0.9677419  0.9789259  0.9677419  0.9789259  0.5882353  0.9995171
## ELEANOR    0.4761905  0.9789259  0.9677419  0.9789259  0.5882353  0.9177430
## PEARL      0.9474768  0.3908795  0.9474768  0.3908795 -0.8941878  0.7843137
## RUTH       0.4761905  0.6148591  0.9677419  0.6148591  0.5882353  0.5882353
## VERNE     -0.1960784  0.0000000  0.4761905  0.0000000 -0.1033295 -0.1033295
## MYRNA     -0.1960784 -0.6148591 -0.1960784 -0.6148591 -0.9299656 -0.1033295
## KATHERINE -0.7002039 -0.8617594 -0.7002039 -0.8617594 -0.9677419 -0.4761905
## SYLVIA    -0.8617594 -0.7100592 -0.5251641 -0.7100592 -0.6148591 -0.6148591
## NORA      -0.9887761 -0.8617594 -0.7681849 -0.8617594 -0.7317073 -0.7317073
## HELEN     -0.8529599 -0.2948403 -0.4878049 -0.2948403 -0.3089598 -0.3089598
## DOROTHY    0.9090909  0.0000000  0.9090909  0.0000000 -0.8280255  0.4918033
## OLIVIA    -0.1639344 -0.9338521 -0.1639344 -0.9338521 -0.8280255 -0.8280255
## FLORA     -0.1639344 -0.9338521 -0.1639344 -0.9338521 -0.8280255 -0.8280255
##              ELEANOR       PEARL       RUTH      VERNE      MYRNA  KATHERINE
## EVELYN     0.4761905  0.94747683  0.4761905 -0.1960784 -0.1960784 -0.7002039
## LAURA      0.9789259  0.39087948  0.6148591  0.0000000 -0.6148591 -0.8617594
## THERESA    0.9677419  0.94747683  0.9677419  0.4761905 -0.1960784 -0.7002039
## BRENDA     0.9789259  0.39087948  0.6148591  0.0000000 -0.6148591 -0.8617594
## CHARLOTTE  0.5882353 -0.89418778  0.5882353 -0.1033295 -0.9299656 -0.9677419
## FRANCES    0.9177430  0.78431373  0.5882353 -0.1033295 -0.1033295 -0.4761905
## ELEANOR    0.9995171  0.78431373  0.9177430  0.5882353 -0.1033295 -0.4761905
## PEARL      0.7843137  0.99941894  0.7843137  0.7843137  0.7843137  0.5355304
## RUTH       0.9177430  0.78431373  0.9995171  0.9177430  0.5882353  0.1960784
## VERNE      0.5882353  0.78431373  0.9177430  0.9995171  0.9177430  0.7317073
## MYRNA     -0.1033295  0.78431373  0.5882353  0.9177430  0.9995171  0.9874327
## KATHERINE -0.4761905  0.53553038  0.1960784  0.7317073  0.9874327  0.9995953
## SYLVIA     0.0000000  0.39087948  0.6148591  0.9789259  0.9789259  0.9949332
## NORA      -0.1960784  0.22962113 -0.1960784  0.4761905  0.4761905  0.7681849
## HELEN      0.3921569 -0.04872107  0.3921569  0.8315098  0.8315098  0.4878049
## DOROTHY    0.4918033  0.99060632  0.9803922  0.9803922  0.9803922  0.9529277
## OLIVIA    -0.8280255  0.65573770  0.4918033  0.4918033  0.4918033  0.1639344
## FLORA     -0.8280255  0.65573770  0.4918033  0.4918033  0.4918033  0.1639344
##               SYLVIA       NORA       HELEN    DOROTHY     OLIVIA      FLORA
## EVELYN    -0.8617594 -0.9887761 -0.85295990  0.9090909 -0.1639344 -0.1639344
## LAURA     -0.7100592 -0.8617594 -0.29484029  0.0000000 -0.9338521 -0.9338521
## THERESA   -0.5251641 -0.7681849 -0.48780488  0.9090909 -0.1639344 -0.1639344
## BRENDA    -0.7100592 -0.8617594 -0.29484029  0.0000000 -0.9338521 -0.9338521
## CHARLOTTE -0.6148591 -0.7317073 -0.30895984 -0.8280255 -0.8280255 -0.8280255
## FRANCES   -0.6148591 -0.7317073 -0.30895984  0.4918033 -0.8280255 -0.8280255
## ELEANOR    0.0000000 -0.1960784  0.39215686  0.4918033 -0.8280255 -0.8280255
## PEARL      0.3908795  0.2296211 -0.04872107  0.9906063  0.6557377  0.6557377
## RUTH       0.6148591 -0.1960784  0.39215686  0.9803922  0.4918033  0.4918033
## VERNE      0.9789259  0.4761905  0.83150985  0.9803922  0.4918033  0.4918033
## MYRNA      0.9789259  0.4761905  0.83150985  0.9803922  0.4918033  0.4918033
## KATHERINE  0.9949332  0.7681849  0.48780488  0.9529277  0.1639344  0.1639344
## SYLVIA     0.9996033  0.8617594  0.76002815  0.9338521  0.0000000  0.0000000
## NORA       0.8617594  0.9995953  0.64516129 -0.1639344  0.9090909  0.9090909
## HELEN      0.7600281  0.6451613  0.99956915  0.3278689  0.3278689  0.3278689
## DOROTHY    0.9338521 -0.1639344  0.32786885  0.9992132  0.8196721  0.8196721
## OLIVIA     0.0000000  0.9090909  0.32786885  0.8196721  0.9992132  0.9992132
## FLORA      0.0000000  0.9090909  0.32786885  0.8196721  0.9992132  0.9992132

Q_women <- ifelse(women_Q>0.9, 1, 0) # Binarize
diag(Q_women)<-0
# Q_women    # Take a look at the matrix

YQ_women <- graph_from_adjacency_matrix(Q_women,     # Create an igraph network
                                        mode = "undirected")
plot(YQ_women)

One-Mode Metrics

Once you have converted the two-mode network into two one-mode networks, you have another choice to make. You may analyze the networks that you converted to binary ties with igraph, or you may analyze the initial valued ties using tnet.

Centrality measures using igraph

You can use any of the above binary networks as you would any one-mode network. For example…

women_deg <- degree(jacc_women)
women_bet <- betweenness(jacc_women)
women_clos <- closeness(jacc_women)

## Warning in closeness(jacc_women): At centrality.c:2784 :closeness centrality is
## not well-defined for disconnected graphs

women_eig <- eigen_centrality(jacc_women)$vector

women_cent_df <- data.frame(women_deg, women_bet, women_clos, women_eig)

women_cent_df

##           women_deg women_bet  women_clos women_eig
## EVELYN            1  0.000000 0.003460208 0.0000000
## LAURA             3  1.485714 0.007633588 0.6665521
## THERESA           0  0.000000 0.003267974 0.0000000
## BRENDA            3  1.485714 0.007633588 0.6665521
## CHARLOTTE         7 32.328571 0.008130081 1.0000000
## FRANCES           3  2.500000 0.007633588 0.6146365
## ELEANOR           2  0.200000 0.007518797 0.4979648
## PEARL             1  0.000000 0.007518797 0.2688099
## RUTH              0  0.000000 0.003267974 0.0000000
## VERNE             0  0.000000 0.003267974 0.0000000
## MYRNA             1  0.000000 0.007518797 0.2688099
## KATHERINE         3  3.400000 0.007751938 0.6271615
## SYLVIA            0  0.000000 0.003267974 0.0000000
## NORA              2  1.476190 0.007633588 0.4340303
## HELEN             1  0.000000 0.003460208 0.0000000
## DOROTHY           1  0.000000 0.007518797 0.2688099
## OLIVIA            5 11.061905 0.008000000 0.9262398
## FLORA             5 11.061905 0.008000000 0.9262398

…and for events…

Note: The Jaccard conversion for events (which produced jaccard_event and jacc_event) was run but is not shown above, in order to conserve space. See if you can replicate it by writing the code yourself.

events_deg <- degree(jacc_event)
events_bet <- betweenness(jacc_event)
events_clos <- closeness(jacc_event)

## Warning in closeness(jacc_event): At centrality.c:2784 :closeness centrality is
## not well-defined for disconnected graphs

events_eig <- eigen_centrality(jacc_event)$vector

events_cent_df <- data.frame(events_deg, events_bet, events_clos, events_eig)

events_cent_df

##    events_deg events_bet events_clos events_eig
## 1           6       14.8 0.031250000  1.0000000
## 2           5        3.8 0.029411765  0.9609569
## 3           5        3.8 0.029411765  0.9609569
## 4           5        3.8 0.029411765  0.9609569
## 5           5        3.8 0.029411765  0.9609569
## 6           1        0.0 0.022727273  0.1968016
## 8           1        0.0 0.022727273  0.1968016
## 7           0        0.0 0.005494505  0.0000000
## 9           1        0.0 0.023255814  0.1975932
## 10          5        2.8 0.028571429  0.9571074
## 12          6       13.8 0.030303030  0.9959940
## 13          5        2.8 0.028571429  0.9571074
## 11          6       13.8 0.030303030  0.9959940
## 14          5        2.8 0.028571429  0.9571074

Centrality measures using tnet

The second option is to keep the tie values and use tnet’s weighted centrality functions. To do so, you will first need to convert the matrices you created above (not the igraph objects) into a tnet data object.

For the example, we’ll use the Jaccard matrix.

JW <- as.tnet(women_jaccard)

## Warning in as.tnet(women_jaccard): Data assumed to be weighted one-mode tnet (if
## this is not correct, specify type)

head(JW)

##      i j         w
## [1,] 1 2 0.5773503
## [2,] 1 3 0.4714045
## [3,] 1 4 0.5773503
## [4,] 1 5 0.8164966
## [5,] 1 6 0.7071068
## [6,] 1 7 0.8164966

JE <- as.tnet(jaccard_event)

## Warning in as.tnet(jaccard_event): Data assumed to be weighted one-mode tnet (if
## this is not correct, specify type)

head(JE)

##      i  j w
## [1,] 1  9 1
## [2,] 1 10 1
## [3,] 1 11 1
## [4,] 1 12 1
## [5,] 1 13 1
## [6,] 1 14 1

Now that you have the data format that tnet expects, you are free to calculate weighted centrality measures.

women_Wdeg <- degree_w(JW)
women_Wbet <- betweenness_w(JW)
women_Wclos <- closeness_w(JW, gconly=FALSE)
# Note: tnet does not include eigenvector centrality

women_W_cent_df <- data.frame(women_Wdeg, women_Wbet, women_Wclos)

women_W_cent_df

##    node degree   output node.1 betweenness node.2 closeness n.closeness
## 1     1     17 13.82975      1   0.0000000      1  16.54190   0.9730528
## 2     2     17 13.86178      2   0.2000000      2  16.58020   0.9753061
## 3     3     17 13.26816      3   0.0000000      3  15.87017   0.9335393
## 4     4     17 13.72586      4   0.2000000      4  16.41763   0.9657432
## 5     5     17 15.18170      5   1.2000000      5  18.15897   1.0681747
## 6     6     17 14.29000      6   0.2000000      6  17.09241   1.0054357
## 7     7     17 13.88997      7   0.2000000      7  16.61392   0.9772894
## 8     8     17 14.01449      8   0.0000000      8  16.76286   0.9860509
## 9     9     17 13.48588      9   0.0000000      9  16.13058   0.9488579
## 10   10     17 13.61255     10   0.0000000     10  16.28210   0.9577705
## 11   11     17 14.01298     11   0.0000000     11  16.76106   0.9859445
## 12   12     17 14.17401     12   0.0000000     12  17.08389   1.0049345
## 13   13     17 13.73522     13   0.0000000     13  16.55905   0.9740617
## 14   14     17 14.51753     14   0.0000000     14  17.36456   1.0214446
## 15   15     17 14.53363     15   0.0000000     15  17.38381   1.0225773
## 16   16     17 14.24073     16   0.0000000     16  17.03347   1.0019688
## 17   17     16 14.89155     17   0.6666667     17  18.40998   1.0829401
## 18   18     16 14.89155     18   0.6666667     18  18.40998   1.0829401
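
The tnet output identifies nodes by number rather than by name. Assuming the numbers follow the row order of the matrix that was passed to as.tnet() (which is what the i and j columns in head(JW) suggest), the names can be attached again with a simple lookup. The objects below are made-up stand-ins for a named matrix such as women_jaccard and a result such as degree_w(JW).

```r
# Hypothetical stand-in for a named, valued matrix
toy_matrix <- matrix(c(0.0, 0.6, 0.8,
                       0.6, 0.0, 0.5,
                       0.8, 0.5, 0.0),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Hypothetical stand-in for a tnet centrality result
toy_result <- data.frame(node   = 1:3,
                         degree = c(2, 2, 2),
                         output = c(1.4, 1.1, 1.3))

# Look up each node number in the matrix's row names
toy_result$name <- rownames(toy_matrix)[toy_result$node]
toy_result
```

With the names attached, it is easier to line the weighted results up against the igraph table.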

event_Wdeg <- degree_w(JE)
event_Wbet <- betweenness_w(JE)
event_Wclos <- closeness_w(JE, gconly=FALSE)
# Note: tnet does not include eigenvector centrality

Event_W_cent_df <- data.frame(event_Wdeg, event_Wbet, event_Wclos)

Event_W_cent_df

##    node degree output node.1 betweenness node.2 closeness n.closeness
## 1     1      6      6      1        14.8      1  9.000000   0.6923077
## 2     2      5      5      2         3.8      2  8.333333   0.6410256
## 3     3      5      5      3         3.8      3  8.333333   0.6410256
## 4     4      5      5      4         3.8      4  8.333333   0.6410256
## 5     5      5      5      5         3.8      5  8.333333   0.6410256
## 6     6      1      1      6         0.0      6  5.416667   0.4166667
## 7     7      1      1      7         0.0      7  5.416667   0.4166667
## 8     8      0      0      8         0.0      8  0.000000   0.0000000
## 9     9      1      1      9         0.0      9  5.500000   0.4230769
## 10   10      5      5     10         2.8     10  8.166667   0.6282051
## 11   11      6      6     11        13.8     11  8.833333   0.6794872
## 12   12      5      5     12         2.8     12  8.166667   0.6282051
## 13   13      6      6     13        13.8     13  8.833333   0.6794872
## 14   14      5      5     14         2.8     14  8.166667   0.6282051

Compare the measures that you produced using igraph with the binary networks, and then with the valued ties using tnet. Was there a difference in, say, the top five nodes? What do you notice when you take the tie values into account?
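
One way to approach that comparison is to rank the nodes under each measure and see which ones the two top lists share. The sketch below uses a made-up data frame with hypothetical values, so substitute your own columns from the igraph and tnet results.

```r
# Hypothetical centrality scores for six nodes
cent <- data.frame(node            = c("A", "B", "C", "D", "E", "F"),
                   binary_degree   = c(3, 5, 1, 4, 2, 0),
                   weighted_degree = c(2.1, 3.0, 2.8, 3.5, 0.9, 0.4))

# Top three nodes under each measure
top_binary   <- cent$node[order(cent$binary_degree, decreasing = TRUE)][1:3]
top_weighted <- cent$node[order(cent$weighted_degree, decreasing = TRUE)][1:3]

intersect(top_binary, top_weighted)   # nodes in both top lists
setdiff(top_binary, top_weighted)     # top by binary ties only
setdiff(top_weighted, top_binary)     # top by valued ties only
```

Changing 1:3 to 1:5 gives the top five for the full data.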

This all took longer than expected, so there is no deliverable for this practicum. The more adventurous among you may import one or more of the attributes from the movie as an edgelist to analyze using two-mode techniques. This is not required, however.

For now, use this as a reference guide to analyzing two-mode networks in R.

Bipartite/Two-Mode Networks in igraph

Phil Murphy & Brendan Knapp

Overview

Setup

This part is important. Please be sure to do this part again

Getting Started

Loading and configuring two-mode data

Igraph’s `bipartite.mapping()` function

First Look: Visualization of Two-Mode Networks

Plotting a bipartite Network

Tweaking Labels

`V(g)$color` and `V(g)$shape`

Analytic Options for Two-Mode Networks

Option 1: Proceed as Usual for One-Mode Networks

Calculating Centrality

Sizing Vertices by Centrality

Option 2: Analyze Each Mode Separately Using Two-Mode Metrics

Working with the `tnet` data format

Calculating centrality in `tnet`

Option 3: Converting Two-Mode to One-Mode Networks

Overlap Count through Manual Projection

`as_incidence_matrix()`

`t()` transpose

Matrix Multiplication - For Overlap Count

`graph_from_adjacency_matrix()`

Another way to produce an overlap count in igraph `bipartite_projection()`

Jaccard Similarity

Simple Matching

Pearson’s Correlation

Yule’s Q

One-Mode Metrics

Centrality measures using igraph

Centrality measures using tnet
