Overview

This is part two of a two-part entry on two-mode networks in R. The last entry focused on entering two-mode (referred to in igraph as bipartite) network data into igraph. This part covers the same topic, but in the statnet suite of packages. For those who are reading only this version, a few things from the igraph tutorial bear repeating. (If you have read through or completed the igraph version of this page, feel free to skip ahead to the Setup section.)

The analysis of two-mode data is important to the field of network analysis, mainly because it opens up so many more opportunities to use the network analysis toolset. This, of course, also comes with the caveat that we should not treat data as though they are a two-mode network just because we can make the computer do so. Rather, two-mode networks are appropriate if the association that you have found provides a reasonable depiction of latent ties, the opportunity to form a tie, or a fair amount of certainty that a tie is implied.

The trouble with two-mode networks is that their sparse nature - with the nodes in one mode being reachable only through a node from the other mode - provides certain limitations on how they may be analyzed.

This practicum is designed to familiarize you with working with two-mode data. As such, it does not go too much further than working to manipulate the network in order to get it into R, convert it from two modes to one, and to run a few simple plots and analyses.

To download the data, go to the Network Data link on the course website: https://goo.gl/3rkUK4. You will find the data (davis.csv) on the Network Data page.


Setup

This part is important. Please be sure to do this part again.

Create a folder labeled “Bipartite Networks in statnet” someplace on your computer, such as your Desktop or wherever you will be able to easily find it again. Then, set your working directory to that folder in RStudio:

  • Session
    • Set Working Directory
    • Choose Directory…
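
The same step can also be scripted rather than done through the menus. The following is a minimal sketch, assuming a throwaway folder under tempdir() so that it runs on any machine; on your own computer you would substitute the path to the folder you actually created. The name project_dir is only illustrative.

```r
# Hypothetical location: a temporary folder stands in for your own folder.
project_dir <- file.path(tempdir(), "Bipartite Networks in statnet")
dir.create(project_dir, showWarnings = FALSE)  # create the folder if needed

old_dir <- setwd(project_dir)  # setwd() returns the previous directory
basename(getwd())              # confirm where R is now looking for files
```

Calling setwd(old_dir) afterwards returns R to wherever it was working before.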


Getting Started

Start by loading statnet.

library(statnet)

Loading and configuring two-mode data in statnet

You can download the example data here: http://bit.ly/2xzM1po

If you have downloaded the file into the folder you created for this exercise, then you can use the following script:

davis <- read.csv("davis.csv", 
                  header=FALSE)   # read.csv() has no directed argument; direction is set when the network is created

If you are not sure where you put it and would appreciate the ability to look for the file, then use the file.choose() function.

davis <- read.csv(file.choose(),
                  header=FALSE)   # read.csv() has no directed argument; direction is set when the network is created

Once the edgelist is loaded, take a look at it. We don’t need to see it all to get an idea of what is in there. So, use the head() function to view just the first six rows of the data.

head(davis)
##       V1 V2
## 1 EVELYN  1
## 2 EVELYN  2
## 3 EVELYN  3
## 4 EVELYN  4
## 5 EVELYN  5
## 6 EVELYN  6

As you can see, the first column is the women from Davis’ Southern Women network and the second column is the events that they attended. This is how a two-mode edgelist should be organized: the first mode will be whatever is represented in the first column and the second mode is represented in the second column.

Recall that, for this to be a two-mode network, ties should exist only between modes, and not within modes. In this case, that means that ties are only possible between women and events, not between women and women or between events and events. Any direct ties between nodes within a mode may be derived, as we will do below. But they should not appear within the network at this point.
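
One quick sanity check on this requirement is to confirm that no label appears in both columns of the edgelist. The sketch below uses a small toy edgelist (a handful of real woman-event pairs from the Davis data) rather than the full file, and the object names are only illustrative.

```r
# Toy edgelist: column 1 = women, column 2 = events
toy <- data.frame(V1 = c("EVELYN", "EVELYN", "LAURA", "LAURA"),
                  V2 = c(1, 2, 1, 3),
                  stringsAsFactors = FALSE)

# Any label found in both columns would signal a possible within-mode tie
overlap <- intersect(as.character(toy$V1), as.character(toy$V2))
length(overlap) == 0   # TRUE means every tie runs between the two modes
```

The same two lines can be run on davis once it is loaded.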

Because statnet does not automatically recognize two-mode networks, it is necessary to tell it that there are two types of vertices. The way that we do this in statnet is to tell it how many nodes are in each mode. This, of course, requires you to know something about the network before you load it.

The functions native to R can help you to summarize how many nodes are in each mode. You can identify the number of unique names in each mode with the unique() function in R. Recall that square brackets can be used to subset a data object like a dataframe or a matrix. The format is always data[rows, columns]. If we leave one of those two options blank (rows or columns), then everything under that option will be included. For example, we can get the first five rows of the first column from a data object with data[1:5, 1], or we can get all rows in the first column with data[ , 1].
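
To make the bracket notation concrete, here is a short sketch on a toy data frame. The object toy and its column names are made up for illustration; the woman-event pairs themselves come from the Davis data.

```r
# Toy data frame with two columns
toy <- data.frame(woman = c("EVELYN", "EVELYN", "EVELYN",
                            "LAURA", "LAURA", "LAURA"),
                  event = c(1, 2, 3, 1, 2, 3),
                  stringsAsFactors = FALSE)

first_three <- toy[1:3, 1]  # rows 1 to 3 of column 1
all_events  <- toy[, 2]     # every row of column 2
second_row  <- toy[2, ]     # row 2, all columns

first_three
all_events
second_row
```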

With the four-line script below, we are saving all the unique names from the first column (first mode) into an object called “mode1”, and all the unique names from the second column (second mode) into an object called “mode2”. The length() function counts the number of items in a vector. With it, we can count the number of unique names in each mode of the network.

mode1 <- unique(davis[,1])
mode2 <- unique(davis[,2])
length(mode1)
## [1] 18
length(mode2)
## [1] 14

Now that we know that there are 14 nodes in the second mode of the network, we can tell statnet how to import the two-mode network.

net <- as.network(davis, 
                  bipartite=14,   # Number of nodes in the second mode (events)
                  directed=FALSE)

This process is a little simpler than igraph, which requires an extra step to make the network identifiable as two-mode.



First Look: Visualization of Two-Mode Networks

There are a range of options for visualizing two-mode networks. At a minimum, you will want to find some manner in which to differentiate which node belongs to each mode. Beyond that, it will be up to the analyst to decide which option best suits their particular needs.

What follows are just a few of the basic options for visualizing two-mode networks.

Plotting a Bipartite Network

Here is what it looks like if you just plot the network without any special attention to this being a two-mode network.

gplot(net)

This is obviously hideous. We can do better than that.

This network could use at least three big improvements: labeled nodes, removal of arrowheads, and some recognition that it has two modes.

Tweaking Labels

To start, statnet will automatically recolor and vary the shapes of nodes in a two-mode network, provided that gmode is set to "twomode". Although arrowheads are used by default, they are meaningless in a two-mode network, since all ties should be considered as undirected. To remove arrowheads, set usearrows to FALSE. Last, to include the labels for the nodes set displaylabels to TRUE. Also, to improve the legibility of the labels, we can lighten the edges by setting edge.col to gray.

For more options, check out the help section of gplot using help(gplot).

gplot(net, 
      gmode="twomode", 
      usearrows = FALSE, 
      displaylabels = TRUE, 
      edge.col="gray")



Analytic Options for Two-Mode Networks

Two-mode networks can be very rewarding to study. But, they are also somewhat of a challenge, as their analysis is still not well defined. Below, we present three options for analyzing two-mode networks.

  • Pretend like they are one-mode networks and analyze as usual
  • Analyze each mode independently using metrics that are specialized for use with two-mode networks
  • Convert the two-mode network to two one-mode networks and analyze them as usual

Each of the above options has its trade-offs and strengths. For a larger discussion of these trade-offs, see this week’s readings. For a description of how to do each, we’ll approach each option in order.

Option 1: Proceed as Usual for One-Mode Networks

Statnet recognizes that the network is two-mode when it was created. Keep in mind, however, that when you run centrality measures on a two-mode network, statnet will be treating all of these nodes as though they are in the same mode. statnet makes no allowance for calculating centralities that are specific to the special case of two-mode networks.

If you are interested in understanding the relative prominence of nodes in each mode, relative to other nodes in that mode, then the best you will be able to do in statnet will be to analyze each mode separately. For more on that, see the following two sections.

Calculating Centrality

IDs <- get.vertex.attribute(net, "vertex.names")
deg <- degree(net)
bet <- betweenness(net)
clos <- closeness(net, cmode="suminvdir")
eig <- evcent(net)
## Warning in evcent(net): Maximum iterations exceeded in evcent_R without
## convergence. This matrix may be pathological - increase maxiter or try
## eigen().
cent_df <- data.frame(IDs, deg, bet, clos, eig)

cent_df
##          IDs deg        bet      clos        eig
## 1          1   6   1.947497 0.4274194 0.09097313
## 2          2   6   1.888182 0.4274194 0.09644452
## 3          3  12  16.474849 0.4919355 0.16205670
## 4          4   8   6.962656 0.4489247 0.11279531
## 5          5  16  34.075776 0.5349462 0.20620311
## 6          6  16  58.774781 0.5752688 0.20997526
## 7          7  20 117.069577 0.6182796 0.24579184
## 8          8  28 220.412794 0.7043011 0.32470726
## 9          9  24 203.864287 0.6612903 0.24322144
## 10        10  10  10.343842 0.4704301 0.10921206
## 11        11   8  17.777513 0.4489247 0.05739130
## 12        12  12  16.357141 0.4919355 0.12997389
## 13        13   6   2.025553 0.4274194 0.07237887
## 14        14   6   2.025553 0.4274194 0.07237887
## 15    BRENDA  14  44.043074 0.5591398 0.24026989
## 16 CHARLOTTE   8   9.430526 0.4623656 0.12912318
## 17   DOROTHY   4   1.738639 0.4596774 0.10089161
## 18   ELEANOR   8   8.405270 0.4946237 0.17528165
## 19    EVELYN  16  85.960402 0.5967742 0.25694647
## 20     FLORA   4   4.498510 0.4274194 0.05340336
## 21   FRANCES   8   9.535765 0.4946237 0.16040623
## 22     HELEN  10  37.724911 0.5322581 0.15403470
## 23 KATHERINE  12  42.152851 0.5456989 0.16909858
## 24     LAURA  14  45.708276 0.5591398 0.23736519
## 25     MYRNA   8  14.546470 0.5026882 0.14338260
## 26      NORA  16 100.980612 0.5967742 0.20257662
## 27    OLIVIA   4   4.498510 0.4274194 0.05340336
## 28     PEARL   6   6.052288 0.4811828 0.13819337
## 29      RUTH   8  14.936966 0.5107527 0.18118778
## 30    SYLVIA  14  63.821139 0.5752688 0.21276310
## 31   THERESA  16  77.959270 0.5967742 0.28444976
## 32     VERNE   8  14.006523 0.5107527 0.16764578

Sizing Vertices and Labels by Centrality

We can size the nodes according to betweenness centrality and size the labels according to degree centrality. The only problem is that both of these parameters default to a size of “1”. A look at the centralities, above, reveals values that are well in excess of one, indicating that using these centralities will make the nodes appear huge. We remedy this by dividing each centrality in order to reduce the magnitude, while preserving the relative differences.

gplot(net, 
      gmode="twomode", 
      usearrows = FALSE, 
      displaylabels = TRUE, 
      edge.col="gray", 
      vertex.cex=betweenness(net)/75, # Resize nodes
      label.cex=degree(net)/10)       # Resize labels
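
Dividing by a constant works, but the right constant has to be found by trial and error. An alternative is to rescale a centrality onto a fixed range of plotting sizes. The helper below, rescale(), is a hypothetical function written for this sketch and is not part of statnet. Unlike division, it preserves the ordering of the values but not their ratios.

```r
# Hypothetical helper: map values linearly onto [to_min, to_max]
rescale <- function(x, to_min = 0.5, to_max = 3) {
  rng <- range(x)
  if (rng[1] == rng[2]) {
    # All values equal: give every node the midpoint size
    return(rep(mean(c(to_min, to_max)), length(x)))
  }
  to_min + (x - rng[1]) / (rng[2] - rng[1]) * (to_max - to_min)
}

# Betweenness scores for events 1, 6 and 8, copied from the table above
bet_toy <- c(1.947497, 58.774781, 220.412794)
sizes <- rescale(bet_toy)
sizes
```

With the full network, the result could be passed to gplot() as vertex.cex = rescale(betweenness(net)).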


Option 2: Analyze Each Mode Separately Using Two-Mode Metrics

There are presently few options for analyzing two-mode networks in R. One of the more established options is Tore Opsahl’s tnet package. You can find more information about tnet, as well as many other two-mode network analysis ideas, on Tore’s blog: https://toreopsahl.com/2009/06/12/tnet-software-for-analysing-weighted-networks/ (main site: https://toreopsahl.com/)

You can also find a lot of information about the package by running ?tnet once it is loaded.

To install tnet, just do as you usually would.

install.packages("tnet", dependencies=TRUE)

library(tnet)  # start tnet


Working with the tnet data format

When using tnet, keep in mind that it was not designed to work like statnet and it was not designed to work with statnet data objects. This means that you will have to convert the network to work with tnet.

The tnet package expects the data to be formatted as a numeric edgelist. Do not include names. Thankfully, producing a numeric edgelist from a statnet data object is fairly easy and quick using the as.edgelist function in statnet.

tm<-as.edgelist(net, names=FALSE)

head(tm)  # check to make sure it worked
##      [,1] [,2]
## [1,]    1   15
## [2,]    1   19
## [3,]    1   24
## [4,]    2   19
## [5,]    2   24
## [6,]    2   31

A notable inconvenience to using this particular data format is that it can be difficult to keep up with the names of the various nodes. To help with that, you can extract the labels of the nodes in the statnet data object for later use.

NodeLabels <- net%v%"vertex.names"
  
head(NodeLabels)   # Again, check
## [1] " 1" " 2" " 3" " 4" " 5" " 6"
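
Once the labels are saved, the numeric IDs in the edgelist can be translated back into names by using them as an index. The sketch below rebuilds small stand-ins for tm and NodeLabels so that it runs on its own; tm_toy and labels_toy are illustrative names, and the ordering (14 events followed by the 18 women in alphabetical order) follows the centrality table shown earlier.

```r
# Stand-in for the first three rows of the numeric edgelist shown above
tm_toy <- matrix(c(1, 15,
                   1, 19,
                   1, 24), ncol = 2, byrow = TRUE)

# Stand-in for the label vector: events 1-14, then the women
labels_toy <- c(as.character(1:14),
                "BRENDA", "CHARLOTTE", "DOROTHY", "ELEANOR", "EVELYN",
                "FLORA", "FRANCES", "HELEN", "KATHERINE", "LAURA",
                "MYRNA", "NORA", "OLIVIA", "PEARL", "RUTH",
                "SYLVIA", "THERESA", "VERNE")

# Index the labels by the numeric IDs in each column
named_edges <- data.frame(event = labels_toy[tm_toy[, 1]],
                          woman = labels_toy[tm_toy[, 2]],
                          stringsAsFactors = FALSE)
named_edges
```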

There is just one more thing to keep in mind as you work with tnet. The package only analyzes the first of the two modes (whichever one is in column one). To analyze the second mode, you will need to swap the two columns of the edgelist.

Here, we save the edgelist with its columns swapped as “mt”.

mt <- tm[, c(2, 1)]

head(mt) 
##      [,1] [,2]
## [1,]   15    1
## [2,]   19    1
## [3,]   24    1
## [4,]   19    2
## [5,]   24    2
## [6,]   31    2


Calculating centrality in tnet

deg_tm <- degree_tm(tm)

deg_mt <- degree_tm(mt)

Okay, so Tore has only defined how to calculate degree in a two-mode network. Other programs, like UCINET and Pajek, have already defined two-mode centrality measures, with appropriate normalizations. In this case, however, Tore has explained that he feels that it is more appropriate to convert the two-mode network into a one-mode network with weighted ties and then analyze that. He has designed some methods for analyzing weighted centrality in one-mode networks (also included in tnet) for that purpose.

You may feel that this was a lot of buildup for very little payoff, and you are correct. But, we will be able to use tnet once we have completed the next section. This brings us to option 3: converting two-mode networks to one-mode networks for further analysis. We’ll pick tnet back up at the end of option three.




Option 3: Converting Two-Mode to One-Mode Networks

Because statnet does not include measures that are designed specifically for use with two-mode data, you may wish to convert the two-mode network into two one-mode networks. So, if you have a network of women and events, as we do here, you will be able to create a woman-by-woman network and an event-by-event network. These resulting networks may initially be valued to reflect similarity, opportunity, or simple overlaps in behavior. The tnet package is designed to handle weighted networks to produce measures of centrality.

Alternatively, you may decide to binarize the networks in order to better reflect your own consideration of what should constitute a tie under the circumstances that you are researching.
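
Binarizing amounts to comparing every cell against a cutoff. The sketch below shows the idea on a small valued matrix holding the co-attendance counts for three of the women. The cutoff of 2 is an arbitrary choice made for illustration, not a recommendation.

```r
# Co-attendance counts for three of the women
w <- c("BRENDA", "CHARLOTTE", "DOROTHY")
valued <- matrix(c(0, 4, 1,
                   4, 0, 0,
                   1, 0, 0),
                 nrow = 3, byrow = TRUE, dimnames = list(w, w))

cutoff <- 2                        # hypothetical threshold
binary <- (valued >= cutoff) * 1   # TRUE/FALSE converted to 1/0
binary
```

Here only Brenda and Charlotte remain tied, since they are the only pair with at least two events in common.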

There are several ways to convert a two-mode network. In this introduction, we will focus only on five: a simple count of overlaps, simple matching, Jaccard similarity, Pearson’s Correlation, and Yule’s Q. Although there are many, many options, these should give you a good start.


Table 1: Options for converting 2-modes to one-mode


To read the table, above, consider the two-by-two tables just below it. For each pair of nodes in one of the modes, we may count the number of nodes in the other mode to which they both have a connection (a); the number of nodes in the other mode to which neither has a connection (d); and the number of nodes in the other mode to which one has a connection, but the other does not (b and c). For example, if we consider Davis’ Southern Women network, below, then we can see that Ruth and Pearl attended two events in common. Similarly, there were nine events that neither attended. Also, Ruth attended two events that Pearl did not attend, and Pearl attended one event that Ruth did not.

We can use this information to calculate the potential for ties or overall similarities between women or between events in the Southern Women network, or within modes of any two-mode network. We will treat each of the methods listed above in the order that they appear in the table.
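
The counts described for Ruth and Pearl can be reproduced in a few lines. The sketch below builds the four cells of the two-by-two table from each woman's attendance and then applies the usual textbook definitions of the measures. Those formulas are assumed to match Table 1, which is not reproduced here, and the variable names are illustrative.

```r
events <- 1:14
ruth  <- events %in% c(5, 7, 8, 9)   # events Ruth attended
pearl <- events %in% c(6, 8, 9)      # events Pearl attended

both       <- sum(ruth & pearl)      # cell a
ruth_only  <- sum(ruth & !pearl)     # cell b
pearl_only <- sum(!ruth & pearl)     # cell c
neither    <- sum(!ruth & !pearl)    # cell d

overlap  <- both
matching <- (both + neither) / length(events)
jaccard  <- both / (both + ruth_only + pearl_only)
yule_q   <- (both * neither - ruth_only * pearl_only) /
            (both * neither + ruth_only * pearl_only)
pearson  <- cor(as.numeric(ruth), as.numeric(pearl))

c(a = both, b = ruth_only, c = pearl_only, d = neither)
c(overlap = overlap, matching = matching, jaccard = jaccard,
  yule_q = yule_q, pearson = pearson)
```

Notice how differently the measures treat the nine events that neither woman attended: simple matching counts them as agreement, while Jaccard ignores them.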

Note: This list is far from exhaustive. But, it should be a good place to start.



Overlap Count through Manual Projection

as.matrix()

An overlap count is the simplest, and most common, approach to converting two modes to one. As the name suggests, this is a count of the number of nodes in the second mode that each pair in the first mode have in common, and vice versa. To use the Southern Women as an example, to convert the network into a woman-by-woman network, this would be the number of events that each pair of women co-attended. Conversely, if you are converting the network into an event-by-event network, then it would be the number of women that each pair of events had in common.

To count overlaps, we first convert net to a rectangular matrix where the “Southern Women” are columns and the events are rows.

bipartite_matrix <- as.matrix(net)

bipartite_matrix
##    BRENDA CHARLOTTE DOROTHY ELEANOR EVELYN FLORA FRANCES HELEN KATHERINE
##  1      1         0       0       0      1     0       0     0         0
##  2      0         0       0       0      1     0       0     0         0
##  3      1         1       0       0      1     0       1     0         0
##  4      1         1       0       0      1     0       0     0         0
##  5      1         1       0       1      1     0       1     0         0
##  6      1         0       0       1      1     0       1     0         0
##  7      1         1       0       1      0     0       0     1         0
##  8      1         0       1       1      1     0       1     1         1
##  9      0         0       1       0      1     1       0     0         1
## 10      0         0       0       0      0     0       0     1         1
## 11      0         0       0       0      0     1       0     1         0
## 12      0         0       0       0      0     0       0     1         1
## 13      0         0       0       0      0     0       0     0         1
## 14      0         0       0       0      0     0       0     0         1
##    LAURA MYRNA NORA OLIVIA PEARL RUTH SYLVIA THERESA VERNE
##  1     1     0    0      0     0    0      0       0     0
##  2     1     0    0      0     0    0      0       1     0
##  3     1     0    0      0     0    0      0       1     0
##  4     0     0    0      0     0    0      0       1     0
##  5     1     0    0      0     0    1      0       1     0
##  6     1     0    1      0     1    0      0       1     0
##  7     1     0    1      0     0    1      1       1     1
##  8     1     1    0      0     1    1      1       1     1
##  9     0     1    1      1     1    1      1       1     1
## 10     0     1    1      0     0    0      1       0     0
## 11     0     0    1      1     0    0      0       0     0
## 12     0     1    1      0     0    0      1       0     1
## 13     0     0    1      0     0    0      1       0     0
## 14     0     0    1      0     0    0      1       0     0

t() transpose

Next, let’s look at the transpose of bipartite_matrix by using the function t().

In formal terms, if we refer to the set of nodes and ties as a matrix A, then its transpose is referred to as A’. This will be important in the section below.

t(bipartite_matrix)
##            1  2  3  4  5  6  7  8  9 10 11 12 13 14
## BRENDA     1  0  1  1  1  1  1  1  0  0  0  0  0  0
## CHARLOTTE  0  0  1  1  1  0  1  0  0  0  0  0  0  0
## DOROTHY    0  0  0  0  0  0  0  1  1  0  0  0  0  0
## ELEANOR    0  0  0  0  1  1  1  1  0  0  0  0  0  0
## EVELYN     1  1  1  1  1  1  0  1  1  0  0  0  0  0
## FLORA      0  0  0  0  0  0  0  0  1  0  1  0  0  0
## FRANCES    0  0  1  0  1  1  0  1  0  0  0  0  0  0
## HELEN      0  0  0  0  0  0  1  1  0  1  1  1  0  0
## KATHERINE  0  0  0  0  0  0  0  1  1  1  0  1  1  1
## LAURA      1  1  1  0  1  1  1  1  0  0  0  0  0  0
## MYRNA      0  0  0  0  0  0  0  1  1  1  0  1  0  0
## NORA       0  0  0  0  0  1  1  0  1  1  1  1  1  1
## OLIVIA     0  0  0  0  0  0  0  0  1  0  1  0  0  0
## PEARL      0  0  0  0  0  1  0  1  1  0  0  0  0  0
## RUTH       0  0  0  0  1  0  1  1  1  0  0  0  0  0
## SYLVIA     0  0  0  0  0  0  1  1  1  1  0  1  1  1
## THERESA    0  1  1  1  1  1  1  1  1  0  0  0  0  0
## VERNE      0  0  0  0  0  0  1  1  1  0  0  1  0  0

Now that we’ve seen how t() works, we can multiply bipartite_matrix by its transpose: t(bipartite_matrix).

Similar to the %in% operator we saw earlier, R gives us a special operator to use for matrix multiplication: %*%.

Matrix Multiplication - For Overlap Count

The method we’re going to use to project our bipartite matrix to one-mode matrices is the Cross-Product Method with manual matrix multiplication. This is the method that we covered in class where we can multiply a two-mode matrix by its transpose to produce a one-mode network that reflects the ties between the nodes in one of the two modes.

In the instructions that follow, we are using the Southern Women data, shown above. In the transposed matrix, the women are the rows (the y axis) and the events are the columns (the x axis). In matrix multiplication, the order in which you enter the matrices that you are multiplying into the expression matters.

To produce a Y by Y (women x women) network, multiply the matrix we have above by its transpose (AA'). To produce an X by X network (event x event), multiply the transposed network by the original (A'A). So, let’s try it.
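
A small example makes the effect of the ordering easier to see. The sketch below uses a toy matrix with three events as rows and two women as columns (the same orientation as bipartite_matrix), with values copied from the first three events for Brenda and Evelyn. The object names are illustrative.

```r
# Toy two-mode matrix: events in rows, women in columns
toy <- matrix(c(1, 1,
                0, 1,
                1, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("1", "2", "3"), c("BRENDA", "EVELYN")))

events_by_events <- toy %*% t(toy)   # 3 x 3: rows of toy against rows of toy
women_by_women   <- t(toy) %*% toy   # 2 x 2: columns of toy against columns

dim(events_by_events)
women_by_women

# Base R shortcuts that compute the same products
all.equal(events_by_events, tcrossprod(toy))
all.equal(women_by_women, crossprod(toy))
```

The result always takes its dimension from the rows of the first matrix, so whichever mode sits in the rows of the first term is the mode you end up with. The diagonal holds each node's own total (Evelyn attended all three toy events), which is why it is reset to 0 before treating the result as a network.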

We’re going to multiply

  • the original bipartite_matrix
    • by
  • its transpose (t(bipartite_matrix))
    • using the matrix multiplication operator %*%
  • and assign the whole thing to a new variable called event_matrix_prod.
    • Using the original matrix (bipartite_matrix) as the first term in the multiplication will produce event_matrix_prod, as the events are bipartite_matrix’s rows.
  • It doesn’t make any sense that there would be loops when projecting a bipartite network (as we would then be adding information that did not exist in the original network), so we want to set the diagonal of our result to 0.
    • To do this, we use the diag() function with event_matrix_prod as the argument.

event_matrix_prod <- bipartite_matrix %*% t(bipartite_matrix)
## tcrossprod(bipartite_matrix) gives the same result and scales better, but writing out the multiplication is better at first so you understand the method

diag(event_matrix_prod) <- 0

event_matrix_prod
##     1  2  3  4  5  6  7  8  9 10 11 12 13 14
##  1  0  2  3  2  3  3  2  3  1  0  0  0  0  0
##  2  2  0  3  2  3  3  2  3  2  0  0  0  0  0
##  3  3  3  0  4  6  5  4  5  2  0  0  0  0  0
##  4  2  2  4  0  4  3  3  3  2  0  0  0  0  0
##  5  3  3  6  4  0  6  6  7  3  0  0  0  0  0
##  6  3  3  5  3  6  0  5  7  4  1  1  1  1  1
##  7  2  2  4  3  6  5  0  8  5  3  2  4  2  2
##  8  3  3  5  3  7  7  8  0  9  4  1  5  2  2
##  9  1  2  2  2  3  4  5  9  0  4  3  5  3  3
## 10  0  0  0  0  0  1  3  4  4  0  2  5  3  3
## 11  0  0  0  0  0  1  2  1  3  2  0  2  1  1
## 12  0  0  0  0  0  1  4  5  5  5  2  0  3  3
## 13  0  0  0  0  0  1  2  2  3  3  1  3  0  3
## 14  0  0  0  0  0  1  2  2  3  3  1  3  3  0

You may also want to do the same thing to get the person matrix. To do this, all we have to do is reverse the order of the matrices that we are multiplying.

  • Using the transposed matrix (t(bipartite_matrix)) as the first term in the multiplication will produce person_matrix_prod, as the “Southern Women” are bipartite_matrix’s columns.

    • Again, we want to set the diagonal to 0 by using the diag() function.

person_matrix_prod <- t(bipartite_matrix) %*% bipartite_matrix

diag(person_matrix_prod) <- 0

person_matrix_prod
##           BRENDA CHARLOTTE DOROTHY ELEANOR EVELYN FLORA FRANCES HELEN
## BRENDA         0         4       1       4      6     0       4     2
## CHARLOTTE      4         0       0       2      3     0       2     1
## DOROTHY        1         0       0       1      2     1       1     1
## ELEANOR        4         2       1       0      3     0       3     2
## EVELYN         6         3       2       3      0     1       4     1
## FLORA          0         0       1       0      1     0       0     1
## FRANCES        4         2       1       3      4     0       0     1
## HELEN          2         1       1       2      1     1       1     0
## KATHERINE      1         0       2       1      2     1       1     3
## LAURA          6         3       1       4      6     0       4     2
## MYRNA          1         0       2       1      2     1       1     3
## NORA           2         1       1       2      2     2       1     4
## OLIVIA         0         0       1       0      1     2       0     1
## PEARL          2         0       2       2      3     1       2     1
## RUTH           3         2       2       3      3     1       2     2
## SYLVIA         2         1       2       2      2     1       1     4
## THERESA        6         4       2       4      7     1       4     2
## VERNE          2         1       2       2      2     1       1     3
##           KATHERINE LAURA MYRNA NORA OLIVIA PEARL RUTH SYLVIA THERESA
## BRENDA            1     6     1    2      0     2    3      2       6
## CHARLOTTE         0     3     0    1      0     0    2      1       4
## DOROTHY           2     1     2    1      1     2    2      2       2
## ELEANOR           1     4     1    2      0     2    3      2       4
## EVELYN            2     6     2    2      1     3    3      2       7
## FLORA             1     0     1    2      2     1    1      1       1
## FRANCES           1     4     1    1      0     2    2      1       4
## HELEN             3     2     3    4      1     1    2      4       2
## KATHERINE         0     1     4    5      1     2    2      6       2
## LAURA             1     0     1    2      0     2    3      2       6
## MYRNA             4     1     0    3      1     2    2      4       2
## NORA              5     2     3    0      2     2    2      6       3
## OLIVIA            1     0     1    2      0     1    1      1       1
## PEARL             2     2     2    2      1     0    2      2       3
## RUTH              2     3     2    2      1     2    0      3       4
## SYLVIA            6     2     4    6      1     2    3      0       3
## THERESA           2     6     2    3      1     3    4      3       0
## VERNE             3     2     3    3      1     2    3      4       3
##           VERNE
## BRENDA        2
## CHARLOTTE     1
## DOROTHY       2
## ELEANOR       2
## EVELYN        2
## FLORA         1
## FRANCES       1
## HELEN         3
## KATHERINE     3
## LAURA         2
## MYRNA         3
## NORA          3
## OLIVIA        1
## PEARL         2
## RUTH          3
## SYLVIA        4
## THERESA       3
## VERNE         0

as.matrix()

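# person_matrix_prod is already a matrix; as.matrix() ignores igraph-style
# arguments such as mode and weighted, so they are omitted here
women_overlap <- as.matrix(person_matrix_prod)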

women_overlap
##           BRENDA CHARLOTTE DOROTHY ELEANOR EVELYN FLORA FRANCES HELEN
## BRENDA         0         4       1       4      6     0       4     2
## CHARLOTTE      4         0       0       2      3     0       2     1
## DOROTHY        1         0       0       1      2     1       1     1
## ELEANOR        4         2       1       0      3     0       3     2
## EVELYN         6         3       2       3      0     1       4     1
## FLORA          0         0       1       0      1     0       0     1
## FRANCES        4         2       1       3      4     0       0     1
## HELEN          2         1       1       2      1     1       1     0
## KATHERINE      1         0       2       1      2     1       1     3
## LAURA          6         3       1       4      6     0       4     2
## MYRNA          1         0       2       1      2     1       1     3
## NORA           2         1       1       2      2     2       1     4
## OLIVIA         0         0       1       0      1     2       0     1
## PEARL          2         0       2       2      3     1       2     1
## RUTH           3         2       2       3      3     1       2     2
## SYLVIA         2         1       2       2      2     1       1     4
## THERESA        6         4       2       4      7     1       4     2
## VERNE          2         1       2       2      2     1       1     3
##           KATHERINE LAURA MYRNA NORA OLIVIA PEARL RUTH SYLVIA THERESA
## BRENDA            1     6     1    2      0     2    3      2       6
## CHARLOTTE         0     3     0    1      0     0    2      1       4
## DOROTHY           2     1     2    1      1     2    2      2       2
## ELEANOR           1     4     1    2      0     2    3      2       4
## EVELYN            2     6     2    2      1     3    3      2       7
## FLORA             1     0     1    2      2     1    1      1       1
## FRANCES           1     4     1    1      0     2    2      1       4
## HELEN             3     2     3    4      1     1    2      4       2
## KATHERINE         0     1     4    5      1     2    2      6       2
## LAURA             1     0     1    2      0     2    3      2       6
## MYRNA             4     1     0    3      1     2    2      4       2
## NORA              5     2     3    0      2     2    2      6       3
## OLIVIA            1     0     1    2      0     1    1      1       1
## PEARL             2     2     2    2      1     0    2      2       3
## RUTH              2     3     2    2      1     2    0      3       4
## SYLVIA            6     2     4    6      1     2    3      0       3
## THERESA           2     6     2    3      1     3    4      3       0
## VERNE             3     2     3    3      1     2    3      4       3
##           VERNE
## BRENDA        2
## CHARLOTTE     1
## DOROTHY       2
## ELEANOR       2
## EVELYN        2
## FLORA         1
## FRANCES       1
## HELEN         3
## KATHERINE     3
## LAURA         2
## MYRNA         3
## NORA          3
## OLIVIA        1
## PEARL         2
## RUTH          3
## SYLVIA        4
## THERESA       3
## VERNE         0

# As above, as.matrix() needs no mode or weighted arguments
events_overlap <- as.matrix(event_matrix_prod)

events_overlap
##     1  2  3  4  5  6  7  8  9 10 11 12 13 14
##  1  0  2  3  2  3  3  2  3  1  0  0  0  0  0
##  2  2  0  3  2  3  3  2  3  2  0  0  0  0  0
##  3  3  3  0  4  6  5  4  5  2  0  0  0  0  0
##  4  2  2  4  0  4  3  3  3  2  0  0  0  0  0
##  5  3  3  6  4  0  6  6  7  3  0  0  0  0  0
##  6  3  3  5  3  6  0  5  7  4  1  1  1  1  1
##  7  2  2  4  3  6  5  0  8  5  3  2  4  2  2
##  8  3  3  5  3  7  7  8  0  9  4  1  5  2  2
##  9  1  2  2  2  3  4  5  9  0  4  3  5  3  3
## 10  0  0  0  0  0  1  3  4  4  0  2  5  3  3
## 11  0  0  0  0  0  1  2  1  3  2  0  2  1  1
## 12  0  0  0  0  0  1  4  5  5  5  2  0  3  3
## 13  0  0  0  0  0  1  2  2  3  3  1  3  0  3
## 14  0  0  0  0  0  1  2  2  3  3  1  3  3  0



Jaccard Similarity

For the next two measures, you will need to install the package ade4 if you do not already have it. As with other installations, you will only need to do this once.

install.packages("ade4", dependencies = TRUE)

To learn more about what is available in the distance function in ade4, check out ?dist.binary once you have loaded ade4. In the meantime, just to keep this fairly straightforward, we’ll keep the code for these conversions fairly compact.

library(ade4) # If you have not already done so

bipartite_matrix <- as.matrix(net)  # Extract the matrix

women_jaccard <- dist.binary(t(bipartite_matrix), 
                             method=1, upper=TRUE, 
                             diag = FALSE) # Method #1 is "Jaccard Index"
event_jaccard <- dist.binary(bipartite_matrix, 
                             method=1, upper=TRUE, 
                             diag = FALSE) 

women_jaccard <- as.matrix(women_jaccard)   
diag(women_jaccard)<-0

# women_jaccard          # Look at the matrix before you binarize
jaccard_women <- ifelse(women_jaccard>0.95, 1, 0)     # Binarize

# jaccard_women      # Take a look at the matrix if you like.

jacc_women <- as.network(jaccard_women,    # Create a statnet network
                         directed=FALSE)
gplot(jacc_women,
      usearrows = FALSE, 
      displaylabels = TRUE)
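
To see what the Jaccard calculation is doing for a single pair, here is a minimal base R sketch. The vectors x and y are made up for illustration (they are not rows from the Davis data), and the last step applies the sqrt(1 - s) form that ade4 documents for dist.binary.

```r
# Hypothetical attendance records for two women across six events
x <- c(1, 1, 1, 0, 0, 0)
y <- c(1, 1, 0, 1, 0, 0)

both   <- sum(x == 1 & y == 1)   # Events both attended
only_x <- sum(x == 1 & y == 0)   # Events only x attended
only_y <- sum(x == 0 & y == 1)   # Events only y attended

# Jaccard ignores events that neither attended
jaccard_sim  <- both / (both + only_x + only_y)   # Similarity, s
jaccard_dist <- sqrt(1 - jaccard_sim)             # Distance, sqrt(1 - s)

jaccard_sim
jaccard_dist
```

If you would rather set your cutoff on similarity, you can recover it from a distance matrix with 1 - d^2.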



Simple Matching

Simple matching is also carried out using the ade4 package.

library(ade4)

bipartite_matrix <- as.matrix(net)  # Extract the matrix

women_match <- dist.binary(t(bipartite_matrix), 
                           method=2, upper=TRUE,
                           diag = FALSE) # Method #2 is "simple matching"
event_match <- dist.binary(bipartite_matrix, 
                           method=2, upper=TRUE, 
                           diag = FALSE) # Method #2 is "simple matching"

The matrix that is returned will be valued between 0 and 1. Because dist.binary returns distances of the form sqrt(1 - s), where s is the similarity, higher values mean that two nodes are less alike. If you treat it as though it is a normal network, there will be a value in nearly every cell. That will not be very helpful, so you will need to binarize the new matrices.

To binarize the matrices, first choose your cutoff value. For instance, if you decide that 0.80 is the cutoff that should constitute a tie between nodes, then use the following code to change all values greater than 0.80 to 1, and all values of 0.80 or lower to 0.

women_match <- as.matrix(women_match)
matching_women <- ifelse(women_match>0.8, 1, 0)
matching_women
##           BRENDA CHARLOTTE DOROTHY ELEANOR EVELYN FLORA FRANCES HELEN
## BRENDA         0         0       0       0      0     1       0     0
## CHARLOTTE      0         0       0       0      0     0       0     0
## DOROTHY        0         0       0       0      0     0       0     0
## ELEANOR        0         0       0       0      0     0       0     0
## EVELYN         0         0       0       0      0     0       0     1
## FLORA          1         0       0       0      0     0       0     0
## FRANCES        0         0       0       0      0     0       0     0
## HELEN          0         0       0       0      1     0       0     0
## KATHERINE      1         1       0       0      1     0       0     0
## LAURA          0         0       0       0      0     1       0     0
## MYRNA          1         0       0       0      0     0       0     0
## NORA           1         1       0       0      1     0       1     0
## OLIVIA         1         0       0       0      0     0       0     0
## PEARL          0         0       0       0      0     0       0     0
## RUTH           0         0       0       0      0     0       0     0
## SYLVIA         1         1       0       0      1     0       1     0
## THERESA        0         0       0       0      0     0       0     1
## VERNE          0         0       0       0      0     0       0     0
##           KATHERINE LAURA MYRNA NORA OLIVIA PEARL RUTH SYLVIA THERESA
## BRENDA            1     0     1    1      1     0    0      1       0
## CHARLOTTE         1     0     0    1      0     0    0      1       0
## DOROTHY           0     0     0    0      0     0    0      0       0
## ELEANOR           0     0     0    0      0     0    0      0       0
## EVELYN            1     0     0    1      0     0    0      1       0
## FLORA             0     1     0    0      0     0    0      0       0
## FRANCES           0     0     0    1      0     0    0      1       0
## HELEN             0     0     0    0      0     0    0      0       1
## KATHERINE         0     1     0    0      0     0    0      0       1
## LAURA             1     0     1    1      1     0    0      1       0
## MYRNA             0     1     0    0      0     0    0      0       0
## NORA              0     1     0    0      0     0    0      0       1
## OLIVIA            0     1     0    0      0     0    0      0       0
## PEARL             0     0     0    0      0     0    0      0       0
## RUTH              0     0     0    0      0     0    0      0       0
## SYLVIA            0     1     0    0      0     0    0      0       1
## THERESA           1     0     0    1      0     0    0      1       0
## VERNE             0     0     0    0      0     0    0      0       0
##           VERNE
## BRENDA        0
## CHARLOTTE     0
## DOROTHY       0
## ELEANOR       0
## EVELYN        0
## FLORA         0
## FRANCES       0
## HELEN         0
## KATHERINE     0
## LAURA         0
## MYRNA         0
## NORA          0
## OLIVIA        0
## PEARL         0
## RUTH          0
## SYLVIA        0
## THERESA       0
## VERNE         0

Then you can change this back into a statnet object and plot it.

match_women <- as.network(matching_women,    # Create a statnet network
                          directed = FALSE)
gplot(match_women, 
      usearrows = FALSE, 
      displaylabels = TRUE)
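
If you want ties to reflect how alike two women are, one option is to compute the simple matching similarity yourself and set the cutoff on that. The following is a minimal base R sketch under that assumption. The matrix toy is hypothetical (not the Davis data) and has women in the rows.

```r
# Hypothetical 3-women x 5-events matrix (not the Davis data)
toy <- matrix(c(1, 1, 0, 0, 1,
                1, 1, 0, 1, 1,
                0, 0, 1, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), paste0("E", 1:5)))

n_events <- ncol(toy)

# Matches = events both attended      (toy %*% t(toy))
#         + events both stayed away from ((1 - toy) %*% t(1 - toy))
match_sim <- (toy %*% t(toy) + (1 - toy) %*% t(1 - toy)) / n_events

# The same information in distance form, sqrt(1 - s)
match_dist <- sqrt(1 - match_sim)

# Binarize on similarity: a tie when at least 80% of events match
match_tie <- ifelse(match_sim >= 0.8, 1, 0)
diag(match_tie) <- 0
match_tie
```

Here only A and B end up tied, because they agree on four of the five events.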



Pearson’s Correlation

bipartite_matrix <- as.matrix(net)  # Extract the matrix

women_correl <- cor(bipartite_matrix)
event_correl <- cor(t(bipartite_matrix))

women_correl <- as.matrix(women_correl)   
# women_correl          # Look at the matrix before you binarize
correl_women <- ifelse(women_correl>0.6, 1, 0)    # Binarize 
diag(correl_women)<-0
# correl_women    # Take a look at the matrix if you like


corr_women <- as.network(correl_women,     # Create a statnet network
                         directed = FALSE)
gplot(corr_women, 
      usearrows = FALSE, 
      displaylabels = TRUE)

Yule’s Q

Yule’s Q is a correlation calculation that is designed for binary data. Compare your results to what you get with Pearson’s correlation, which is designed for continuous data.

As with Jaccard and simple matching, you will need to install a new package in order to run Yule’s Q. You will only need to do this once.

install.packages("psych", dependencies = TRUE)
library(psych)

bipartite_matrix <- as.matrix(net)  # Extract the matrix

women_Q <-YuleCor(bipartite_matrix)$rho
event_Q <-YuleCor(t(bipartite_matrix))$rho

women_Q <- as.matrix(women_Q) 
women_Q        # Look at the matrix before you binarize
##               BRENDA  CHARLOTTE    DOROTHY    ELEANOR     EVELYN
## BRENDA     0.9996033  0.9789259  0.0000000  0.9789259  0.8617594
## CHARLOTTE  0.9789259  0.9995171 -0.8280255  0.5882353  0.4761905
## DOROTHY    0.0000000 -0.8280255  0.9992132  0.4918033  0.9090909
## ELEANOR    0.9789259  0.5882353  0.4918033  0.9995171  0.4761905
## EVELYN     0.8617594  0.4761905  0.9090909  0.4761905  0.9995953
## FLORA     -0.9338521 -0.8280255  0.8196721 -0.8280255 -0.1639344
## FRANCES    0.9789259  0.5882353  0.4918033  0.9177430  0.9677419
## HELEN     -0.2948403 -0.3089598  0.3278689  0.3921569 -0.8529599
## KATHERINE -0.8617594 -0.9677419  0.9529277 -0.4761905 -0.7002039
## LAURA      0.9370120  0.6148591  0.0000000  0.9789259  0.8617594
## MYRNA     -0.6148591 -0.9299656  0.9803922 -0.1033295 -0.1960784
## NORA      -0.8617594 -0.7317073 -0.1639344 -0.1960784 -0.9887761
## OLIVIA    -0.9338521 -0.8280255  0.8196721 -0.8280255 -0.1639344
## PEARL      0.3908795 -0.8941878  0.9906063  0.7843137  0.9474768
## RUTH       0.6148591  0.5882353  0.9803922  0.9177430  0.4761905
## SYLVIA    -0.7100592 -0.6148591  0.9338521  0.0000000 -0.8617594
## THERESA    0.8617594  0.9677419  0.9090909  0.9677419  0.9353287
## VERNE      0.0000000 -0.1033295  0.9803922  0.5882353 -0.1960784
##                FLORA    FRANCES       HELEN  KATHERINE      LAURA
## BRENDA    -0.9338521  0.9789259 -0.29484029 -0.8617594  0.9370120
## CHARLOTTE -0.8280255  0.5882353 -0.30895984 -0.9677419  0.6148591
## DOROTHY    0.8196721  0.4918033  0.32786885  0.9529277  0.0000000
## ELEANOR   -0.8280255  0.9177430  0.39215686 -0.4761905  0.9789259
## EVELYN    -0.1639344  0.9677419 -0.85295990 -0.7002039  0.8617594
## FLORA      0.9992132 -0.8280255  0.32786885  0.1639344 -0.9338521
## FRANCES   -0.8280255  0.9995171 -0.30895984 -0.4761905  0.9789259
## HELEN      0.3278689 -0.3089598  0.99956915  0.4878049 -0.2948403
## KATHERINE  0.1639344 -0.4761905  0.48780488  0.9995953 -0.8617594
## LAURA     -0.9338521  0.9789259 -0.29484029 -0.8617594  0.9996033
## MYRNA      0.4918033 -0.1033295  0.83150985  0.9874327 -0.6148591
## NORA       0.9090909 -0.7317073  0.64516129  0.7681849 -0.8617594
## OLIVIA     0.9992132 -0.8280255  0.32786885  0.1639344 -0.9338521
## PEARL      0.6557377  0.7843137 -0.04872107  0.5355304  0.3908795
## RUTH       0.4918033  0.5882353  0.39215686  0.1960784  0.6148591
## SYLVIA     0.0000000 -0.6148591  0.76002815  0.9949332 -0.7100592
## THERESA   -0.1639344  0.9677419 -0.48780488 -0.7002039  0.8617594
## VERNE      0.4918033 -0.1033295  0.83150985  0.7317073  0.0000000
##                MYRNA       NORA     OLIVIA       PEARL       RUTH
## BRENDA    -0.6148591 -0.8617594 -0.9338521  0.39087948  0.6148591
## CHARLOTTE -0.9299656 -0.7317073 -0.8280255 -0.89418778  0.5882353
## DOROTHY    0.9803922 -0.1639344  0.8196721  0.99060632  0.9803922
## ELEANOR   -0.1033295 -0.1960784 -0.8280255  0.78431373  0.9177430
## EVELYN    -0.1960784 -0.9887761 -0.1639344  0.94747683  0.4761905
## FLORA      0.4918033  0.9090909  0.9992132  0.65573770  0.4918033
## FRANCES   -0.1033295 -0.7317073 -0.8280255  0.78431373  0.5882353
## HELEN      0.8315098  0.6451613  0.3278689 -0.04872107  0.3921569
## KATHERINE  0.9874327  0.7681849  0.1639344  0.53553038  0.1960784
## LAURA     -0.6148591 -0.8617594 -0.9338521  0.39087948  0.6148591
## MYRNA      0.9995171  0.4761905  0.4918033  0.78431373  0.5882353
## NORA       0.4761905  0.9995953  0.9090909  0.22962113 -0.1960784
## OLIVIA     0.4918033  0.9090909  0.9992132  0.65573770  0.4918033
## PEARL      0.7843137  0.2296211  0.6557377  0.99941894  0.7843137
## RUTH       0.5882353 -0.1960784  0.4918033  0.78431373  0.9995171
## SYLVIA     0.9789259  0.8617594  0.0000000  0.39087948  0.6148591
## THERESA   -0.1960784 -0.7681849 -0.1639344  0.94747683  0.9677419
## VERNE      0.9177430  0.4761905  0.4918033  0.78431373  0.9177430
##               SYLVIA    THERESA      VERNE
## BRENDA    -0.7100592  0.8617594  0.0000000
## CHARLOTTE -0.6148591  0.9677419 -0.1033295
## DOROTHY    0.9338521  0.9090909  0.9803922
## ELEANOR    0.0000000  0.9677419  0.5882353
## EVELYN    -0.8617594  0.9353287 -0.1960784
## FLORA      0.0000000 -0.1639344  0.4918033
## FRANCES   -0.6148591  0.9677419 -0.1033295
## HELEN      0.7600281 -0.4878049  0.8315098
## KATHERINE  0.9949332 -0.7002039  0.7317073
## LAURA     -0.7100592  0.8617594  0.0000000
## MYRNA      0.9789259 -0.1960784  0.9177430
## NORA       0.8617594 -0.7681849  0.4761905
## OLIVIA     0.0000000 -0.1639344  0.4918033
## PEARL      0.3908795  0.9474768  0.7843137
## RUTH       0.6148591  0.9677419  0.9177430
## SYLVIA     0.9996033 -0.5251641  0.9789259
## THERESA   -0.5251641  0.9995953  0.4761905
## VERNE      0.9789259  0.4761905  0.9995171
Q_women <- ifelse(women_Q>0.9, 1, 0) # Binarize
diag(Q_women)<-0
# Q_women    # Take a look at the matrix

YQ_women <- as.network(Q_women,     # Create a statnet network
                       directed = FALSE)
gplot(YQ_women, 
      usearrows = FALSE, 
      displaylabels = TRUE)
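
To make the comparison with Pearson's correlation concrete, here is a minimal base R sketch of the textbook Yule's Q formula for one pair. The vectors x and y are hypothetical. Values from YuleCor can differ slightly from this plain formula (notice that the diagonal in the output above is not exactly 1), so treat this as an illustration rather than a replacement.

```r
# Hypothetical attendance records for two women across eight events
x <- c(1, 1, 1, 1, 0, 0, 0, 0)
y <- c(1, 1, 1, 0, 1, 0, 0, 0)

# Cells of the 2 x 2 table
n11 <- sum(x == 1 & y == 1)   # Both attended
n10 <- sum(x == 1 & y == 0)   # Only x attended
n01 <- sum(x == 0 & y == 1)   # Only y attended
n00 <- sum(x == 0 & y == 0)   # Neither attended

# Yule's Q: (agreements product - disagreements product) / their sum
yule_q  <- (n11 * n00 - n10 * n01) / (n11 * n00 + n10 * n01)
pearson <- cor(x, y)          # Pearson's r on the same binary vectors

yule_q
pearson
```

For this pair, Q is noticeably higher than Pearson's r, which is worth keeping in mind when you choose a cutoff for each measure.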





One-Mode Metrics

Once you have converted the two-mode network into two one-mode networks, you have another choice to make. You may analyze the networks that you converted to binary ties with statnet, or you may analyze the initial valued ties using tnet.

Centrality measures using statnet

You can use any of the above binary networks as you would any one-mode network. For example…


IDs <- jacc_women%v%"vertex.names"
women_deg  <- degree(jacc_women)
women_bet  <- betweenness(jacc_women)
women_clos <- closeness(jacc_women,
                        cmode="suminvdir")
women_eig  <- evcent(jacc_women)

women_cent_df <- data.frame(IDs, 
                            women_deg, 
                            women_bet, 
                            women_clos, 
                            women_eig)

women_cent_df
##          IDs women_deg women_bet women_clos women_eig
## 1     BRENDA         6  2.971429 0.37254902 0.2856491
## 2  CHARLOTTE        14 64.657143 0.52941176 0.4285473
## 3    DOROTHY         2  0.000000 0.31372549 0.1243714
## 4    ELEANOR         4  0.400000 0.33333333 0.2134015
## 5     EVELYN         2  0.000000 0.05882353 0.0000000
## 6      FLORA        10 22.123810 0.47058824 0.4285473
## 7    FRANCES         6  5.000000 0.37254902 0.2634008
## 8      HELEN         2  0.000000 0.05882353 0.0000000
## 9  KATHERINE         6  6.800000 0.39215686 0.2901715
## 10     LAURA         6  2.971429 0.37254902 0.2856491
## 11     MYRNA         2  0.000000 0.31372549 0.1243714
## 12      NORA         4  2.952381 0.35294118 0.2008146
## 13    OLIVIA        10 22.123810 0.47058824 0.4285473
## 14     PEARL         2  0.000000 0.31372549 0.1243714
## 15      RUTH         0  0.000000 0.00000000 0.0000000
## 16    SYLVIA         0  0.000000 0.00000000 0.0000000
## 17   THERESA         0  0.000000 0.00000000 0.0000000
## 18     VERNE         0  0.000000 0.00000000 0.0000000
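
You may have noticed that every degree score above is even. That is what you would expect if each undirected tie is counted twice: sna's degree() defaults to gmode = "digraph", which adds in-ties and out-ties. A minimal base R sketch of the difference follows, using a hypothetical adjacency matrix (adj is made up, not the Davis data).

```r
# Hypothetical symmetric adjacency matrix for four nodes
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                1, 1, 0, 1,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(LETTERS[1:4], LETTERS[1:4]))

undirected_degree <- rowSums(adj)                  # Number of neighbors
digraph_degree    <- rowSums(adj) + colSums(adj)   # In-ties plus out-ties

data.frame(undirected_degree, digraph_degree)
```

If you want the undirected count from sna directly, pass gmode = "graph" to degree().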

…and for events…

Note: The event conversion (the binarized matrix jaccard_event and the statnet network jacc_event built from it) was run, but is not shown above in order to conserve space. See if you can replicate these by creating the code yourself.

IDs <- jacc_event%v%"vertex.names"
events_deg  <- degree(jacc_event)
events_bet  <- betweenness(jacc_event)
events_clos <- closeness(jacc_event,
                         cmode="suminvdir")
events_eig  <- evcent(jacc_event)

events_cent_df <- data.frame(IDs,
                             events_deg, 
                             events_bet, 
                             events_clos, 
                             events_eig)

events_cent_df
##    IDs events_deg events_bet events_clos events_eig
## 1    1         12       29.6   0.6923077 0.32922140
## 2    2         10        7.6   0.6410256 0.31636758
## 3    3         10        7.6   0.6410256 0.31636758
## 4    4         10        7.6   0.6410256 0.31636758
## 5    5         10        7.6   0.6410256 0.31636758
## 6    6          2        0.0   0.4166667 0.06479131
## 7    7          0        0.0   0.0000000 0.00000000
## 8    8          2        0.0   0.4166667 0.06479131
## 9    9          2        0.0   0.4230769 0.06285933
## 10  10         10        5.6   0.6282051 0.30447976
## 11  11         12       27.6   0.6794872 0.31685058
## 12  12         12       27.6   0.6794872 0.31685058
## 13  13         10        5.6   0.6282051 0.30447976
## 14  14         10        5.6   0.6282051 0.30447976

Centrality measures using tnet

The second option is to keep the tie values and use tnet’s weighted centrality functions. To do so, you will first need to convert the matrices you created above (not the statnet objects) into a tnet data object.

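For the example, we’ll use the Jaccard matrices: women_jaccard, which still holds the valued output from dist.binary, and jaccard_event, which was already binarized (which is why every weight in JE below is 1).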

library(tnet) # If you have not already done so

JW <- as.tnet(women_jaccard)
## Warning in as.tnet(women_jaccard): Data assumed to be weighted one-mode
## tnet (if this is not correct, specify type)
head(JW)
##      i j         w
## [1,] 1 2 0.6546537
## [2,] 1 3 0.9354143
## [3,] 1 4 0.6546537
## [4,] 1 5 0.5773503
## [5,] 1 6 1.0000000
## [6,] 1 7 0.6546537
JE <- as.tnet(jaccard_event)
## Warning in as.tnet(jaccard_event): Data assumed to be weighted one-mode
## tnet (if this is not correct, specify type)
head(JE)
##      i  j w
## [1,] 1  9 1
## [2,] 1 10 1
## [3,] 1 11 1
## [4,] 1 12 1
## [5,] 1 13 1
## [6,] 1 14 1
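
The i, j, w layout shown above is just a weighted edgelist. As a rough illustration of the idea (not tnet's own code), here is a minimal base R sketch that builds the same kind of three-column table from a small hypothetical weight matrix. Keeping both directions of each symmetric tie is an assumption about how you want undirected ties stored.

```r
# Hypothetical symmetric weight matrix for three nodes
w_mat <- matrix(c(0.0, 0.5, 0.0,
                  0.5, 0.0, 0.9,
                  0.0, 0.9, 0.0),
                nrow = 3, byrow = TRUE)

idx <- which(w_mat != 0, arr.ind = TRUE)   # Row and column of each nonzero cell

edge_list <- data.frame(i = idx[, "row"],  # Sender
                        j = idx[, "col"],  # Receiver
                        w = w_mat[idx])    # Tie weight

# Sort by sender, then receiver, and reset the row labels
edge_list <- edge_list[order(edge_list$i, edge_list$j), ]
rownames(edge_list) <- NULL
edge_list
```

Cells equal to zero never make it into the edgelist, so a pair with a value of exactly 0 is treated as having no tie.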

Now that you have the data format that tnet expects, you are free to calculate weighted centrality measures.

women_Wdeg  <- degree_w(JW)[,2]   # Column 2 counts ties; the "output" column sums their weights
women_Wbet  <- betweenness_w(JW)[,2]
women_Wclos <- closeness_w(JW, gconly=FALSE)[,2]
# Note: tnet does not include eigenvector centrality

women_W_cent_df <- data.frame(women_Wdeg, 
                              women_Wbet, 
                              women_Wclos)

women_W_cent_df
##    women_Wdeg women_Wbet women_Wclos
## 1          17  0.2000000    16.41763
## 2          17  1.2000000    18.15897
## 3          17  0.0000000    17.03347
## 4          17  0.2000000    16.61392
## 5          17  0.0000000    16.54190
## 6          16  0.6666667    18.40998
## 7          17  0.2000000    17.09241
## 8          17  0.0000000    17.38381
## 9          17  0.0000000    17.08389
## 10         17  0.2000000    16.58020
## 11         17  0.0000000    16.76106
## 12         17  0.0000000    17.36456
## 13         16  0.6666667    18.40998
## 14         17  0.0000000    16.76286
## 15         17  0.0000000    16.13058
## 16         17  0.0000000    16.55905
## 17         17  0.0000000    15.87017
## 18         17  0.0000000    16.28210
event_Wdeg  <- degree_w(JE)[,2]
event_Wbet  <- betweenness_w(JE)[,2]
event_Wclos <- closeness_w(JE, gconly=FALSE)[,2]
# Note: tnet does not include eigenvector centrality

Event_W_cent_df <- data.frame(event_Wdeg, 
                              event_Wbet, 
                              event_Wclos)

Event_W_cent_df
##    event_Wdeg event_Wbet event_Wclos
## 1           6       14.8    9.000000
## 2           5        3.8    8.333333
## 3           5        3.8    8.333333
## 4           5        3.8    8.333333
## 5           5        3.8    8.333333
## 6           1        0.0    5.416667
## 7           0        0.0    0.000000
## 8           1        0.0    5.416667
## 9           1        0.0    5.500000
## 10          5        2.8    8.166667
## 11          6       13.8    8.833333
## 12          6       13.8    8.833333
## 13          5        2.8    8.166667
## 14          5        2.8    8.166667

Compare the measures that you produced using statnet with the binary networks, and then with the valued ties using tnet. Was there a difference in, say, the top five nodes? What do you notice when you take the tie values into account?
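
One way to make that comparison is to line up the top five nodes from each approach. Here is a minimal sketch using the closeness scores printed in the tables above. The helper top_n() is hypothetical, and the sketch assumes that tnet's node numbers 1 through 18 follow the same order as the statnet vertex names.

```r
# Closeness values copied from the statnet and tnet tables above.
# Assumption: tnet node numbers follow the order of the statnet vertex names.
women <- c("BRENDA", "CHARLOTTE", "DOROTHY", "ELEANOR", "EVELYN", "FLORA",
           "FRANCES", "HELEN", "KATHERINE", "LAURA", "MYRNA", "NORA",
           "OLIVIA", "PEARL", "RUTH", "SYLVIA", "THERESA", "VERNE")
clos_binary <- c(0.37254902, 0.52941176, 0.31372549, 0.33333333, 0.05882353,
                 0.47058824, 0.37254902, 0.05882353, 0.39215686, 0.37254902,
                 0.31372549, 0.35294118, 0.47058824, 0.31372549, 0, 0, 0, 0)
clos_valued <- c(16.41763, 18.15897, 17.03347, 16.61392, 16.54190, 18.40998,
                 17.09241, 17.38381, 17.08389, 16.58020, 16.76106, 17.36456,
                 18.40998, 16.76286, 16.13058, 16.55905, 15.87017, 16.28210)

# Hypothetical helper: labels of the n highest-scoring nodes
top_n <- function(scores, labels, n = 5) {
  labels[order(scores, decreasing = TRUE)][seq_len(n)]
}

top_binary <- top_n(clos_binary, women)
top_valued <- top_n(clos_valued, women)

intersect(top_binary, top_valued)   # Nodes in both top fives
setdiff(top_binary, top_valued)     # Nodes only in the binary top five
setdiff(top_valued, top_binary)     # Nodes only in the valued top five
```

Several women are tied at the fifth-place binary score, so which of them lands in the top five depends on how order() breaks ties.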

This all took longer than expected, so there is no deliverable for this practicum. The more adventurous among you may import one or more of the attributes from the movie as an edgelist to analyze using two-mode techniques, but this is not required.

For now, use this as a reference guide to analyzing two-mode networks in R.