Let us try to compute the expected number of \(k\)-cycles and \(k\)-cycles with targets in an Erdos-Renyi graph.
Theorem: Let \(H\) be a directed graph on \(k\) vertices with \(m\) edges. Let \(G \sim G(n,p)\) be the Erdos-Renyi graph on \(n\) vertices with edge probability \(p\). Let \(X_k\) denote the number of subgraphs of size \(k\) of \(G\) which are isomorphic to \(H\). Then
\[\mathbb{E}(X_k) = \frac{n!p^m(1-p)^{k(k-1)-m}}{(n-k)!Aut(H)}.\]
Proof: First, we compute the probability that an Erdos-Renyi graph \(G \sim G(k,p)\) is isomorphic to \(H\). If we can compute this, by noting that every subgraph of an Erdos-Renyi graph is in fact Erdos-Renyi gives us the desired expected value, by multiplying by \(\binom{n}{k}\).
We have that there are \(k!/Aut(H)\) distinct labellings of the vertices of \(G\) modded by the automorphism group of \(H\) (Figure 1). Given one such labelling, there are precisely \(m\) edges and \(k(k-1)-m\) non-edges which must be filled to make a graph isomorphic to \(H\). The probability of an edge and non-edge are respectively \(p\) and \(1-p\). Hence, the probability of all \(k(k-1)\) such events occurring is \(p^m(1-p)^{k(k-1)-m}\) and hence the probability that \(G(k,p)\) is isomorphic to \(H\) is
\[\mathbb{E}(G(k,p) \cong H) = \frac{k!p^m(1-p)^{k(k-1)-m}}{Aut(H)}.\]
Hence we have that
\[ \mathbb{E}(X_k) = \binom{n}{k} \frac{k!p^m(1-p)^{k(k-1)-m}}{Aut(H)} =\frac{n!p^m(1-p)^{k(k-1)-m}}{(n-k)!Aut(H)}. \square\]Figure 1: A. The graph depicted has automorphism group \(C_3 \times C_2\). B. An equivalence class of labellings of the graph in A, up to applications of elements of its automorphism group. As this will partition the \(5!\) labellings, we have that there are \(5!/6 = 20\) different labellings up to graph automorphism.
Given this formula, we have the following two corollaries
Corollary 1: The expected number of \(k\)-cycles in \(G \sim G(n,p)\)is
\[ \frac{n!p^k(1-p)^{k(k-2)}}{(n-k)!k}.\]
We say that a subgraph \(H\) of \(G\) has a no back edge target if there exists \(j \in V(G) \setminus V(H)\) such that the directed edge \(i \to j \in E(G)\), but the directed edge \(j \to i \not\in E(G)\) for all \(i \in V(H)\).
Corollary 2: The expected number of \(k\)-cycles with a no back edge target in \(G \sim G(n,p)\) is \[\frac{n!p^{2k}(1-p)^{k(k-1)}}{(n-k-1)!k}.\]
Proof: Let \(H'\) be a \(k\)-cycle with a target \(i\), and let \(H = H' \cup i.\) Then \(H\) has \(k + 1\) vertices and \(2k\) edges. The automorphism group of \(H\) is \(C_k\) whose order is \(k\). \(\square\)
Let’s check these results with just 20 trials, to ease the computational complexity.
First we compute the predictions for our experiments.
# This function implements the estimate above
E_k_cycle <- function(n,p,k){
x = factorial(n)*p^(k)*(1-p)^(k*(k-2))/(factorial(n-k)*k)
return(x)
}
E_k_cycle_with_target <- function(n,p,k){
x = factorial(n)*p^(2*k)*(1-p)^(k*(k-1))/(factorial(n-k-1)*k)
return(x)
}
E_k_cycle(100,0.2,3)
## [1] 1324.646
E_k_cycle_with_target(100,0.2,3)
## [1] 526.2979
Now we generate the number of 3-cycles in 20 Erdos-Renyi graphs with parameters \(n = 100, p = 0.2\).
# Define parameters
n = 100
p = 0.2
adj = matrix(c(0,1,0,
0,0,1,
1,0,0),3,3)
H <- graph_from_adjacency_matrix(adj)
k = nrow(adj)
num_trials = 20
dist = matrix(0, nrow = 1, ncol = num_trials)
for (i in 1:num_trials){
# Generate an Erdos Renyi Graph G(n,p)
G = sample_gnp(n, p, directed = TRUE, loops = FALSE)
subgraph_isomorphisms = count_subgraph_isomorphisms(H, G, method = c("lad"), induced = TRUE)
size_of_automorphism_group_H = count_isomorphisms(H,H)
count = subgraph_isomorphisms/size_of_automorphism_group_H
dist[1,i] = count
}
hist(dist, main = "number of 3-cycles in G(100, 0.2)")
mu = mean(dist)
pred = E_k_cycle(100,0.2,3)
abline(v=c(mu,pred), col=c("blue", "red"), lty=c(1,2), lwd=c(1, 2))
legend("topleft", legend=c("mean", "predicted"),
col=c("blue", "red"), lty=1:2, cex=0.8)
We display the actual mean and the predicted values.
## [1] "mean of generated data:"
## [1] 1321
## [1] "predicted expected vallue:"
## [1] 1324.646
We repeat this process for the number of targets
We again display the actual mean and the predicted values.
## [1] "mean of generated data:"
## [1] 522.95
## [1] "predicted expected vallue:"
## [1] 526.2979
The following two plots are for a fixed \(n\) and \(k\), while varying \(p\) for our estimate in Corollary 1. We see very good agreement.
We now explore whether Corollary 2 is a “good” first-order approximation of the number of 3-cycles with targets. There are two problems with this estimate. The first problem is that this estimate does not count all numbers of sinks. The second problem is that a 3-cycle may have multiple targets, and for large \(p\), this leads to an overcount of the number of 3-cycles with targets.
In the following figure (Figure 4), in dashed lines we see the result of experiments for 1. the number of 3 cycles, the number of 3-cycles with targets, and the number of target free 3-cycles. In solid lines, we see in blue the estimate for the number of 3-cycles, in red the number of 3-cycles with a no back edge target (see above), and in green, subtracting the two previous values.