You’ll verify for yourself that PageRank works by performing calculations on a small universe of web pages.
Let’s use the 6 page universe that we had in the course notes. For this directed graph, perform the following calculations in R.
library(knitr)
library(igraph)
##
## Attaching package: 'igraph'
##
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
##
## The following object is masked from 'package:base':
##
## union
#Matrix A
A <- matrix(c(0,0,1/4,0,0,0,
1/2,0,1/4,0,0,0,
1/2,1,0,0,0,1/2,
0,0,0,0,1/2,1/2,
0,0,1/4,1/2,0,0,
0,0,1/4,1/2,1/2,0),nrow=6, byrow = TRUE)
A
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0 0 0.25 0.0 0.0 0.0
## [2,] 0.5 0 0.25 0.0 0.0 0.0
## [3,] 0.5 1 0.00 0.0 0.0 0.5
## [4,] 0.0 0 0.00 0.0 0.5 0.5
## [5,] 0.0 0 0.25 0.5 0.0 0.0
## [6,] 0.0 0 0.25 0.5 0.5 0.0
B <- 0.85*A +0.15/6
B
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.025 0.025 0.2375 0.025 0.025 0.025
## [2,] 0.450 0.025 0.2375 0.025 0.025 0.025
## [3,] 0.450 0.875 0.0250 0.025 0.025 0.450
## [4,] 0.025 0.025 0.0250 0.025 0.450 0.450
## [5,] 0.025 0.025 0.2375 0.450 0.025 0.025
## [6,] 0.025 0.025 0.2375 0.450 0.450 0.025
# Matrix r
r<- matrix(c(1/6,1/6,1/6,1/6,1/6,1/6),nrow=6)
converged <- FALSE
B_i <- diag(6)
n<- 10000
i <-1
r_i <- r
while(!converged & i<n) {
B_i <- B_i%*%B
r_i <- B_i%*%r_i
converged <- isTRUE(all.equal(r_i, B_i%*%r_i))
i <- i+1
}
#Solution found in
i
## [1] 8
#and the rank for the decayed matrix A is:
r_i
## [,1]
## [1,] 0.07735886
## [2,] 0.11023638
## [3,] 0.24639464
## [4,] 0.18635389
## [5,] 0.15655927
## [6,] 0.22309696
ev <- eigen(B)
ev$values
## [1] 1.0000000+0.000000i 0.5063824+0.000000i -0.4250000+0.000000i
## [4] -0.4250000-0.000000i -0.2531912+0.108131i -0.2531912-0.108131i
As we see in the above list of eigenvalues, \(\lambda=1\) is the largest eigenvalue. The corresponding vector is:
r_ev <- ev$vectors[,1]
r_ev
## [1] -0.1784825+0i -0.2543376+0i -0.5684822+0i -0.4299561+0i -0.3612138+0i
## [6] -0.5147297+0i
This is not equal to the vector obtained thru the iterative method. In addition, it is interesting to note that the vector r_ev:
However, after some experimentation, I found that \(r_{i} = \frac{r_{ev}}{\sum{r_{ev}}}\). Therefore, the correct solution is:
r_ev <- as.numeric(r_ev / sum(r_ev))
r_ev
## [1] 0.07735886 0.11023638 0.24639464 0.18635389 0.15655927 0.22309696
g <- random.graph.game(20, 5/20, directed=TRUE)
g <- make_empty_graph(n = 6) %>%
add_edges(c(1,2 , 1,3 ,
2,3 ,
3,1 , 3,2 , 3,5 , 3,6 ,
4,5 , 4,6 ,
5,4 , 5,6 ,
6,3 , 6,4
))
plot(g)
r_graph <- page.rank(g)$vector
r_graph
## [1] 0.07735886 0.11023638 0.24639464 0.18635389 0.15655927 0.22309696
The obtained vectors compare as:
# Compare the iterative vs graph pagerank
all.equal(as.matrix(r_i),as.matrix(r_graph))
## [1] TRUE
# Compare the eigenvector solution vs graph pagerank
all.equal(as.matrix(r_ev),as.matrix(r_graph))
## [1] TRUE