Problem Set 1 - Playing with PageRank

Start with a uniform rank vector r and perform power iterations on B till convergence. That is, compute the solution \(r = B^n \times r\). Attempt this for a sufficiently large \(n\) so that \(r\) actually converges.

Define \(A\) matrix and vector \(r\).

A <- matrix(c(0, 0, 0.25, 0, 0, 0, 
            0.5, 0, 0.25, 0, 0, 0,
            0.5, 1, 0, 0, 0, 0.5,
            0, 0, 0, 0, 0.5, 0.5,
            0, 0, 0.25, 0.5, 0, 0,
            0, 0, 0.25, 0.5, 0.5, 0), nrow = 6, byrow = T)
print(A)
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]  0.0    0 0.25  0.0  0.0  0.0
## [2,]  0.5    0 0.25  0.0  0.0  0.0
## [3,]  0.5    1 0.00  0.0  0.0  0.5
## [4,]  0.0    0 0.00  0.0  0.5  0.5
## [5,]  0.0    0 0.25  0.5  0.0  0.0
## [6,]  0.0    0 0.25  0.5  0.5  0.0
r <- (matrix(c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6), nrow = 6, byrow = T))
print(r)
##           [,1]
## [1,] 0.1666667
## [2,] 0.1666667
## [3,] 0.1666667
## [4,] 0.1666667
## [5,] 0.1666667
## [6,] 0.1666667

Define PageRank (PR) function and iterate to convergence.

PRfunc <- function(x, n, r) {
  newx = diag(nrow(x))
  for(i in 1:n) {
    newx <- newx %*% x
  }
  PR <- round((newx %*% r), 7)
  print(PR)
}

Test lecture notes example: 2 iterations, no decay.

PRfunc(A, 2, r)
##           [,1]
## [1,] 0.0833333
## [2,] 0.1041667
## [3,] 0.2500000
## [4,] 0.1666667
## [5,] 0.1666667
## [6,] 0.2291667

Apply 85% decay to \(A\), iterate through matrix \(B\) until convergence. Convergence happens at 24 iterations.

B <- 0.85 * A + 0.15/6

PRfunc(B, 10, r)
##           [,1]
## [1,] 0.0773800
## [2,] 0.1102758
## [3,] 0.2464459
## [4,] 0.1862835
## [5,] 0.1565573
## [6,] 0.2230575
PRfunc(B, 15, r)
##           [,1]
## [1,] 0.0773596
## [2,] 0.1102377
## [3,] 0.2463963
## [4,] 0.1863525
## [5,] 0.1565583
## [6,] 0.2230956
PRfunc(B, 23, r)
##           [,1]
## [1,] 0.0773589
## [2,] 0.1102364
## [3,] 0.2463947
## [4,] 0.1863539
## [5,] 0.1565593
## [6,] 0.2230969
PRfunc(B, 24, r)
##           [,1]
## [1,] 0.0773589
## [2,] 0.1102364
## [3,] 0.2463946
## [4,] 0.1863539
## [5,] 0.1565593
## [6,] 0.2230970
PRfunc(B, 100, r)
##           [,1]
## [1,] 0.0773589
## [2,] 0.1102364
## [3,] 0.2463946
## [4,] 0.1863539
## [5,] 0.1565593
## [6,] 0.2230970

Compute the eigen-decomposition of \(B\) and verify that you ineeed get an eigenvalue of 1 as the largest eigenvalue and that its corresponding eigenvector is the same vector that you obtained in the previous power iteration method. Further, this eigenvector has all positive entries and it sums to 1.

Compute eigenvalues and eigenvectors.

evB <- eigen(B)
sum_evB <- sum(evB$vectors[,1])
PR_evB <- round((evB$vectors[,1] * 1 / sum_evB), 7)

cat("Eigenvalues:", eigen(B)$values)
## Eigenvalues: 1+0i 0.5063824+0i -0.425+0i -0.425-0i -0.2531912+0.108131i -0.2531912-0.108131i
cat("Eigenvectors:", PR_evB)
## Eigenvectors: 0.0773589+0i 0.1102364+0i 0.2463946+0i 0.1863539+0i 0.1565593+0i 0.223097+0i
cat("Eigenvectors sum to:", sum(PR_evB))
## Eigenvectors sum to: 1+0i

Use the igraph package in R and its page.rank method to compute the PageRank of the graph as given in \(A\)…Verify that you do get the same PageRank vector as the two approaches above.

Use igraph to plot network of 6 URLs, compute PageRank.

library(igraph)
## 
## Attaching package: 'igraph'
## 
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## 
## The following object is masked from 'package:base':
## 
##     union
g <- graph.adjacency(t(A), weighted = T, mode = "directed")
plot(g)

PR_ig <- round((page.rank(g)$vector), 7)

print(PR_ig)
## [1] 0.0773589 0.1102364 0.2463946 0.1863539 0.1565593 0.2230970

Compare all 3 methods.

PRfunc(B, 24, r)
##           [,1]
## [1,] 0.0773589
## [2,] 0.1102364
## [3,] 0.2463946
## [4,] 0.1863539
## [5,] 0.1565593
## [6,] 0.2230970
print(PR_evB)
## [1] 0.0773589+0i 0.1102364+0i 0.2463946+0i 0.1863539+0i 0.1565593+0i
## [6] 0.2230970+0i
print(PR_ig)
## [1] 0.0773589 0.1102364 0.2463946 0.1863539 0.1565593 0.2230970
isTRUE(round(PRfunc(B, 24, r), 4) == round(PR_evB, 4) && round(PRfunc(B, 24, r), 4) == round(PR_ig, 4))
##           [,1]
## [1,] 0.0773589
## [2,] 0.1102364
## [3,] 0.2463946
## [4,] 0.1863539
## [5,] 0.1565593
## [6,] 0.2230970
##           [,1]
## [1,] 0.0773589
## [2,] 0.1102364
## [3,] 0.2463946
## [4,] 0.1863539
## [5,] 0.1565593
## [6,] 0.2230970
## [1] TRUE