You’ll verify for yourself that PageRank works by performing calculations on a small universe of web pages. Let’s use the 6-page universe that we had in the course notes.

\[\mathbf{A} =\begin{bmatrix} 0 & \frac { 1 }{ 2 } & \frac { 1 }{ 2 } & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \frac { 1 }{ 3 } & \frac { 1 }{ 3 } & 0 & 0 & \frac { 1 }{ 3 } & 0 \\ 0 & 0 & 0 & 0 & \frac { 1 }{ 2 } & \frac { 1 }{ 2 } \\ 0 & 0 & 0 & \frac { 1 }{ 2 } & 0 & \frac { 1 }{ 2 } \\ 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}\]

1. Problem set 1

For this directed graph, perform the following calculations in R.

The above discrete-time Markov chain since it is not stochastic (square with non-negative real elements that sum to one), irreducible (have a path from every node to every other node), or aperiodic (have a greatest common divisor, period, of 1 for the length of every cycle in the graph). This is due to the all-zero dangling node in the second row which represents a page with no outgoing links of its own. The discrete-time Markov chain is therefore introduced in order to show how the calculations yield the results expected for a well-behaved matrix (network, graph).

\[\mathbf{A} =\begin{bmatrix} 0 & \frac { 1 }{ 2 } & \frac { 1 }{ 2 } & 0 & 0 & 0 \\ \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{red}{0} \\ \frac { 1 }{ 3 } & \frac { 1 }{ 3 } & 0 & 0 & \frac { 1 }{ 3 } & 0 \\ 0 & 0 & 0 & 0 & \frac { 1 }{ 2 } & \frac { 1 }{ 2 } \\ 0 & 0 & 0 & \frac { 1 }{ 2 } & 0 & \frac { 1 }{ 2 } \\ 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}, \quad \mathbf{G} =\begin{bmatrix} 0 & \frac { 1 }{ 2 } & \frac { 1 }{ 2 } & 0 & 0 & 0 \\ \color{red}{\frac { 1 }{ 6 }} & \color{red}{\frac { 1 }{ 6 }} & \color{red}{\frac { 1 }{ 6 }} & \color{red}{\frac { 1 }{ 6 }} & \color{red}{\frac { 1 }{ 6 }} & \color{red}{\frac { 1 }{ 6 }} \\ \frac { 1 }{ 3 } & \frac { 1 }{ 3 } & 0 & 0 & \frac { 1 }{ 3 } & 0 \\ 0 & 0 & 0 & 0 & \frac { 1 }{ 2 } & \frac { 1 }{ 2 } \\ 0 & 0 & 0 & \frac { 1 }{ 2 } & 0 & \frac { 1 }{ 2 } \\ 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}\]

(1.a) Form the \(\mathbf{A}\) matrix. Then, introduce decay (multiply by a number smaller than unity) and form the \(\mathbf{B}\) matrix as we did in the course notes.

a <- c(0,1/2,1/2,0,0,0, 0,0,0,0,0,0, 1/3,1/3,0,0,1/3,0,
       0,0,0,0,1/2,1/2, 0,0,0,1/2,0,1/2, 0,0,0,1,0,0)
n <- sqrt(length(a))
behavior <- rowSums(matrix(a, nrow=n))
order <- all(behavior == 1 | behavior ==0)
(A <- matrix(a, nrow = n, ncol = n, byrow=order))
##      [,1] [,2]      [,3] [,4] [,5] [,6]
## [1,]  0.0    0 0.3333333  0.0  0.0    0
## [2,]  0.5    0 0.3333333  0.0  0.0    0
## [3,]  0.5    0 0.0000000  0.0  0.0    0
## [4,]  0.0    0 0.0000000  0.0  0.5    1
## [5,]  0.0    0 0.3333333  0.5  0.0    0
## [6,]  0.0    0 0.0000000  0.5  0.5    0
G <- A; G[ ,2] <- rep(1 / n, n); G
##      [,1]      [,2]      [,3] [,4] [,5] [,6]
## [1,]  0.0 0.1666667 0.3333333  0.0  0.0    0
## [2,]  0.5 0.1666667 0.3333333  0.0  0.0    0
## [3,]  0.5 0.1666667 0.0000000  0.0  0.0    0
## [4,]  0.0 0.1666667 0.0000000  0.0  0.5    1
## [5,]  0.0 0.1666667 0.3333333  0.5  0.0    0
## [6,]  0.0 0.1666667 0.0000000  0.5  0.5    0

Damping effectively links to all pages in this matrix (network, graph, web) as seen below. \[{ B }=d\times { A }+\frac { \left( 1-d \right) }{ n }\]

d <- 0.85
(B_1 <- d * A + (1 - d) / n)
##       [,1]  [,2]      [,3]  [,4]  [,5]  [,6]
## [1,] 0.025 0.025 0.3083333 0.025 0.025 0.025
## [2,] 0.450 0.025 0.3083333 0.025 0.025 0.025
## [3,] 0.450 0.025 0.0250000 0.025 0.025 0.025
## [4,] 0.025 0.025 0.0250000 0.025 0.450 0.875
## [5,] 0.025 0.025 0.3083333 0.450 0.025 0.025
## [6,] 0.025 0.025 0.0250000 0.450 0.450 0.025
  1. Start with a uniform rank vector \(r\) and perform power iterations on \(\mathbf{B}\) till convergence. That is, compute the solution \(r=\mathbf{B}^{ n } \times r\). Attempt this for a sufficiently large \(n\) so that \(r\) actually converges.

In R, A^n is to A %*% A.

pagerank <- function(B) {
  pages <- dim(B)[1]; Test <- B; n = 1; check <- F
  r <- r_i <- r_f <- matrix(rep(1/pages, pages), nrow = pages, ncol = 1)
  while (check != TRUE) {
    r_i <- r
    Test <- Test %*% B
    r <- Test %*% r
    r_f <- r
    n = n + 1
    check <- all.equal(r_i, r_f)
    }
  r <- r_f / sqrt(sum(r_f^2)) # normalize
  results <- list("Iterations" = n, 
                  "r" = r, 
                  "Stabilized" = round(Test, 7))
  return(results)
}
(PageRank_1 <- pagerank(B_1))
## $Iterations
## [1] 26
## 
## $r
##            [,1]
## [1,] 0.07608991
## [2,] 0.11007101
## [3,] 0.08481827
## [4,] 0.71837250
## [5,] 0.39690895
## [6,] 0.54891237
## 
## $Stabilized
##           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
## [1,] 0.0048201 0.0012777 0.0066545 0.0119616 0.0119616 0.0119616
## [2,] 0.0069728 0.0018483 0.0096263 0.0173035 0.0173035 0.0173035
## [3,] 0.0053731 0.0014243 0.0074178 0.0133337 0.0133337 0.0133337
## [4,] 0.0455075 0.0120629 0.0628256 0.1129303 0.1129303 0.1129303
## [5,] 0.0251434 0.0066649 0.0347119 0.0623953 0.0623953 0.0623953
## [6,] 0.0347725 0.0092173 0.0480054 0.0862907 0.0862907 0.0862907

At a tolerance of 0.000000015, \(r_f \rightarrow r_i\) after \(n = 26\) iterations.

  1. Compute the eigen-decomposition of \(\mathbf{B}\) and verify that you indeed get an eigenvalue of \(\lambda = 1\) as the largest eigenvalue and that its corresponding eigenvector \(v\) is the same vector \(r\) that you obtained in the previous power iteration method. Further, this eigenvector has all positive entries and it sums to \(1\).
max(Re(eigen(B_1)$values))
## [1] 0.9516527
data.frame(
  "r" = PageRank_1$r[ ,1] / sqrt(sum(PageRank_1$r[ ,1]^2)),
  "v" = matrix(Re(eigen(B_1)$vectors)[ ,1]))
##            r          v
## 1 0.07608991 0.07608991
## 2 0.11007101 0.11007101
## 3 0.08481827 0.08481827
## 4 0.71837250 0.71837250
## 5 0.39690895 0.39690895
## 6 0.54891237 0.54891237
sum(PageRank_1$Stabilized[,1])
## [1] 0.1225894

Given the decayed matrix \(\mathbf{B_1}\) obtained by decaying matrix \(\mathbf{A}\) which is not well-behaved, \(r=v\) for \(\max { \left( \lambda \right)}\), but the largest eigenvalue of \(\mathbf{B_1}\) is \(\lambda = 0.9516527 \neq 1\). Furthermore, the Perron Frobenius Theorem does not hold. Although the largest eigenvalue has an associated eigenvector that is strictly positive, the values do not sum to one: \(\sum { v } =0.1225894\neq 1\)

B_2 <- d * G + (1 - d) / n
PageRank_2 <- pagerank(B_2)
max(Re(eigen(B_2)$values))
## [1] 1
data.frame(
  "r" = PageRank_2$r[ ,1] / sqrt(sum(PageRank_2$r[ ,1]^2)),
  "v" = matrix(Re(eigen(B_2)$vectors)[ ,1]))
##           r         v
## 1 0.1044385 0.1044385
## 2 0.1488249 0.1488249
## 3 0.1159674 0.1159674
## 4 0.7043472 0.7043472
## 5 0.4037861 0.4037861
## 6 0.5425377 0.5425377
sum(PageRank_2$Stabilized[,1])
## [1] 1

Given the decayed matrix \(\mathbf{B_2}\) obtained by decaying the matrix \(\mathbf{G}\) which is well-behaved, \(r=v\) for \(\max { \left( \lambda \right)}\) and the largest eigenvalue of \(\mathbf{B_2}\) is \(\lambda = 1\). Furthermore, the Perron Frobenius Theorem holds. The largest eigenvalue has an associated eigenvector that is strictly positive, and the values sum to one: \(\sum { v } =1\)

  1. Use the igraph package in R and its page_rank method to compute the Page Rank of the graph as given in \(\mathbf{A}\). Note that you don’t need to apply decay. The package starts with a connected graph and applies decay internally with damping = 0.85 by default. Verify that you do get the same PageRank vector as the two approaches above.
library(igraph)
g_1 <- graph_from_adjacency_matrix(t(A), weighted = TRUE, mode = "directed")
data.frame(
  "r" = PageRank_1$r,
  "v" = matrix(Re(eigen(B_1)$vectors)[ ,1]),
  "igraph" = matrix(page_rank(g_1)$vector) / sqrt(sum(matrix(page_rank(g_1)$vector)^2))
)
##            r          v    igraph
## 1 0.07608991 0.07608991 0.1044385
## 2 0.11007101 0.11007101 0.1488249
## 3 0.08481827 0.08481827 0.1159674
## 4 0.71837250 0.71837250 0.7043472
## 5 0.39690895 0.39690895 0.4037861
## 6 0.54891237 0.54891237 0.5425377

When normalized, the igraph package page_rank method does not return the same vector as the previous two approaches for matrix \(\mathbf{A}\) which is not well-behaved.

g_2 <- graph_from_adjacency_matrix(t(G), weighted = TRUE, mode = "directed")
data.frame(
  "r" = PageRank_2$r,
  "v" = matrix(Re(eigen(B_2)$vectors)[ ,1]),
  "igraph" = matrix(page_rank(g_2)$vector) / sqrt(sum(matrix(page_rank(g_2)$vector)^2))
)
##           r         v    igraph
## 1 0.1044385 0.1044385 0.1044385
## 2 0.1488249 0.1488249 0.1488249
## 3 0.1159674 0.1159674 0.1159674
## 4 0.7043472 0.7043472 0.7043472
## 5 0.4037861 0.4037861 0.4037861
## 6 0.5425377 0.5425377 0.5425377

When normalized, the igraph package page_rank method does return the same vector as the previous two approaches for matrix \(\mathbf{G}\) which is well-behaved.

par(mfrow=c(1, 2))
plot.igraph(g_1, main="Matrix A", xlab ="Not Well-Behaved")
plot.igraph(g_2, main="Matrix G", xlab ="Well-Behaved")

References

http://mathworld.wolfram.com/Perron-FrobeniusTheorem.html

http://www.math.cornell.edu/~mec/Winter2009/RalucaRemus/Lecture3/lecture3.html

http://www.cs.cmu.edu/~elaw/pagerank.pdf

http://cesg.tamu.edu/wp-content/uploads/2014/09/ECEN689-lec18.pdf

http://www.math.uah.edu/stat/markov/Periodicity.html

http://www.math.uah.edu/stat/markov/Periodicity.html

http://www.mathematik.uni-ulm.de/stochastik/lehre/ss06/markov/skript_engl/node12.html

http://igraph.org/r/doc/page_rank.html

https://cran.r-project.org/web/packages/igraph/igraph.pdf

http://michael.hahsler.net/SMU/LearnROnYourOwn/code/igraph.html