You’ll verify for yourself that PageRank works by performing calculations on a small universe of web pages. Let’s use the 6 page universe that we had in the previous discussion For this directed graph, perform the following calculations in R.
Matrix A from the notes:
A <- matrix(c(0, 1/2, 1/2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1/3, 1/3, 0, 0, 1/3, 0, 0, 0, 0, 0, 1/2, 1/2, 0, 0, 0, 1/2, 0, 1/2, 0, 0, 0, 1, 0, 0), nrow = 6, byrow = TRUE)
A
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0000000 0.5000000 0.5 0.0 0.0000000 0.0
## [2,] 0.0000000 0.0000000 0.0 0.0 0.0000000 0.0
## [3,] 0.3333333 0.3333333 0.0 0.0 0.3333333 0.0
## [4,] 0.0000000 0.0000000 0.0 0.0 0.5000000 0.5
## [5,] 0.0000000 0.0000000 0.0 0.5 0.0000000 0.5
## [6,] 0.0000000 0.0000000 0.0 1.0 0.0000000 0.0
The second row should be filled with 1/6 instead of 0s as we are building the matrix for the the 6 page universe:
ri <- matrix(c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6),ncol=6)
A[2,]<-ri
A
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000
## [2,] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
## [3,] 0.3333333 0.3333333 0.0000000 0.0000000 0.3333333 0.0000000
## [4,] 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000
## [5,] 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000
## [6,] 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000
\[ A= \begin{bmatrix} 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \\ 1/3 & 1/3 & 0 & 0 & 1/3 & 0 \\ 0& 0& 0& 0& 1/2& 1/2 \\ 0& 0& 0& 1/2& 0&1/2 \\ 0& 0& 0& 1& 0& 0 \end{bmatrix}\]
A new matrix B which is a decayed version of the original matrix A: \[B=0.85\cdot A + \frac{1-d}{n}\]
In the classic version by Page and Brin, the decay and n (size of matrix A) is: \[d=0.85, n=6\]
decay=0.85
n <- nrow(A)
B <- decay * A + (1-decay)/n
B
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0250000 0.4500000 0.4500000 0.0250000 0.0250000 0.0250000
## [2,] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
## [3,] 0.3083333 0.3083333 0.0250000 0.0250000 0.3083333 0.0250000
## [4,] 0.0250000 0.0250000 0.0250000 0.0250000 0.4500000 0.4500000
## [5,] 0.0250000 0.0250000 0.0250000 0.4500000 0.0250000 0.4500000
## [6,] 0.0250000 0.0250000 0.0250000 0.8750000 0.0250000 0.0250000
\[ B = \begin{bmatrix} 0.025 & 0.45 & 0.45 & 0.025 & 0.025 & 0.025 \\ 0.167 & 0.167 & 0.167 & 0.167 & 0.167 & 0.167 \\ 0.308 & 0.308 & 0.025 & 0.025 & 0.308 & 0.025 \\ 0.025 & 0.025 & 0.025 & 0.025 & 0.45 & 0.45 \\ 0.025 & 0.025 & 0.025 & 0.45 & 0.025 & 0.45 \\ 0.025 & 0.025 & 0.025 & 0.875 & 0.025 & 0.025 \end{bmatrix} \]
The convergence happens at the interation=30.
r_i <- rep(1/6, 6)
n <- 0
iter_0 <- matrix.power(t(B), n) %*% r_i
iter_0
## [,1]
## [1,] 0.1666667
## [2,] 0.1666667
## [3,] 0.1666667
## [4,] 0.1666667
## [5,] 0.1666667
## [6,] 0.1666667
n <- 10
iter_10 <- matrix.power(t(B), n) %*% r_i
iter_10
## [,1]
## [1,] 0.05205661
## [2,] 0.07428990
## [3,] 0.05782138
## [4,] 0.34797267
## [5,] 0.19975859
## [6,] 0.26810085
n <- 20
iter_20 <- matrix.power(t(B), n) %*% r_i
iter_20
## [,1]
## [1,] 0.05170616
## [2,] 0.07368173
## [3,] 0.05741406
## [4,] 0.34870083
## [5,] 0.19990313
## [6,] 0.26859408
n <- 30
r_converge <- matrix.power(t(B), n) %*% r_i
r_converge
## [,1]
## [1,] 0.05170475
## [2,] 0.07367927
## [3,] 0.05741242
## [4,] 0.34870367
## [5,] 0.19990381
## [6,] 0.26859607
n <- 40
iter_40 <- matrix.power(t(B), n) %*% r_i
iter_40
## [,1]
## [1,] 0.05170475
## [2,] 0.07367926
## [3,] 0.05741241
## [4,] 0.34870369
## [5,] 0.19990381
## [6,] 0.26859608
The eigen-decomposition of B:
eig_b <- eigen(B)
eig_b
## eigen() decomposition
## $values
## [1] 1.00000000 0.57619235 -0.42500001 -0.42499999 -0.34991524 -0.08461044
##
## $vectors
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.4082483 -0.7278031 -5.345224e-01 5.345225e-01 -0.795670150
## [2,] -0.4082483 -0.3721164 -5.216180e-09 -5.216180e-09 0.059710287
## [3,] -0.4082483 -0.5389259 5.345225e-01 -5.345225e-01 0.602762996
## [4,] -0.4082483 0.1174605 -2.672613e-01 2.672612e-01 0.002611877
## [5,] -0.4082483 0.1174605 -2.672613e-01 2.672612e-01 0.002611877
## [6,] -0.4082483 0.1174605 5.345225e-01 -5.345224e-01 0.002611877
## [,6]
## [1,] 0.486246420
## [2,] -0.673469294
## [3,] 0.556554233
## [4,] -0.009145393
## [5,] -0.009145393
## [6,] -0.009145393
The eigenvalue of 1 is the largest eigenvalue:
max_value <- which.max(eigen(B)$values)
max_value
## [1] 1
The eigenvector has all positive entries and it sums to 1:
pagerank_2 <- eig_b$vectors[,1]/sum(eig_b$vectors[,1])
sum(pagerank_2)
## [1] 1
We got the same results from the plot:
graph <- graph_from_adjacency_matrix(A, weighted = T)
plot(graph)
pageRank <- page.rank(graph)$vector
as.matrix(page.rank(graph)$vector)
## [,1]
## [1,] 0.05170475
## [2,] 0.07367926
## [3,] 0.05741241
## [4,] 0.34870369
## [5,] 0.19990381
## [6,] 0.26859608
results <- (round(r_converge, 4) == round(pageRank, 4))
results
## [,1]
## [1,] TRUE
## [2,] TRUE
## [3,] TRUE
## [4,] TRUE
## [5,] TRUE
## [6,] TRUE