Imagine we have three people in a (very small) college seminar class: \(Taylor\), \(Ash\), and \(Morgan\). At the beginning of the semester, none of these three people know each other, but at the end we observe that \(Taylor\) and \(Ash\) have become friends, \(Taylor\) and \(Morgan\) have become friends, but \(Ash\) and \(Morgan\) are not friends. How many different ways could friendship have formed between these 3 people? In this post I provide an answer to this question by drawing on insights from set theory, graph theory, and combinatorics.
Graph theory is a field of mathematics concerned with the structure of relationships between objects. These objects are typically abstracted away, but can be anything from circuits in an electrical grid, webpages on a website, or people interacting with each other in a community. These relational structures are called “graphs”. The individual components within these graphs (e.g. the people within a social network, the specific webpages) are often called “nodes” or “vertices” and the connections between them “edges”. There is an added layer of nuance to these structures such that these graphs can be categorized as “directed” or “undirected”. In an undirected graph, edges represent binary relationships between nodes (e.g. Are these two people friends or not?). As indicated by its name, a directed graph encodes directional information between vertices. A website on the Internet, for example, is a directed graph. Web Page A may link or point to Web Page B, but Web Page B may not link back to Web Page A. The answer to the question I posed above is contingent on the type of network being studied. For this post, I focus primarily on un-directed graphs and assume friendship is binary between our 3 college students.
Below is an undirected graph of order 3. This means that this graph has 3 vertices.
library(igraph)
library(dplyr)
#Defining friendship network
G <- cbind(
c("Taylor","Ash","Morgan"),
c("Ash","Ash","Taylor")
) %>%
graph_from_data_frame(directed = F) %>%
simplify #Removes the loop from/to Ash
#Plotting our friendship network
plot(G,
layout = layout_in_circle,
vertex.label.cex = 3,
vertex.shape = 'none',
edge.width = 4)
We can think of this graph as a combination of two collections: a collection of vertices and a collection of the edges connecting them. These collections are often formally referred to as “sets”. From the example earlier, we would denote the set of vertces as \(V(G)\) and we can see it is comprised of \(Taylor\), \(Ash\), and \(Morgan\). Again, the set of edges from our graph, which we would denote as \(E(G)\), contains an edge connecting \(Taylor\) to \(Ash\) and an edge connecting \(Taylor\) to \(Morgan\). Recall that this graph is referred to as an undirected graph, such that the edge represents a binary state between any two nodes.
\[V(G) =\]
## + 3/3 vertices, named, from 7020db7:
## [1] Taylor Ash Morgan
\[E(G) =\]
## + 2/2 edges from 7020db7 (vertex names):
## [1] Taylor--Ash Taylor--Morgan
Combinatorics is a field of mathematics concerned with counting. Often when people discuss permutations, combinations, sampling with replacement, sampling without replacement, sampling where order does matter, sampling when order doesn’t matter, etc. they are referring to topics within combinatorics. For this problem, I specifically focus on sampling without replacement where order does not matter.
The original question I posed requires us to ask a follow-up: What is the maximum number of edges a graph can have? To answer this, we need to rely on what is called the “binomial coefficient” which we define as \[ \binom{N}{k} = \frac{N!}{k!(N-k)!} \] where \(k\) is equal to 2 and \(N\) is the number of vertices. In this context, this tells us the maximum number of edges present in an undirected graph. From a combinatoric perspective, this value tells us the total number of ways we can draw any two “things” at the same time from a larger collection of \(N\) “things” (again, ignoring the order in which we draw the two “things” and sampling them without replacement).
Thus, for a graph of order \(N\) (recall that order refers to the size of a graph’s vertex set, \(V(G)\)), the maximum number of edges is \[\binom{N}{2}\] From our earlier friendship network example, the maximum number of edges that could exist in this graph is \(\binom{3}{2}= 3\).
Now that we’ve defined both the binomial coefficient and the size of the edge set, we only need these two tools to compute a solution to question I posed at the beginning of this post. For a group of three people there are
\[\sum_{i=0}^{\binom{3}{2}}\binom{\binom{3}{2}}{i} = 2^3 = 8\]
different ways this network of friendship could have occured. This formalization shows us the total number of ways we can draw all possible samples of edges, ranging from 0 to the drawing the entire edge set, from the larger collection of total possible edges in our undirected graph.
To generalize this solution for \(N\) people we can simply replace 3 with \(N\)
\[\sum_{i=0}^{\binom{N}{2}}\binom{\binom{N}{2}}{i} = 2^\binom{N}{2}\]
As a bonus, I wanted to provide visualizations of what all of the friendship networks for \(Taylor\), \(Ash\), and \(Morgan\) may have looked like.