Part I Playing with PageRank
You’ll verify for yourself that PageRank works by performing calculations on a small universe of web pages.
Let’s use the 6 page universe that we had in the previous discussion For this directed graph, perform the following calculations in R.• Form the A matrix. Then, introduce decay and form the B matrix as we did in the course notes. (5 Points)
ri <- matrix(c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6),ncol=6)
print(round(ri,3))## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.167 0.167 0.167 0.167 0.167 0.167
A <- matrix(c(0,1/2,1/2,0,0,0,0,0,0,0,0,0,1/3,1/3,0,0,1/3,0,0,0,0,0,1/2,1/2,0,0,0,1/2,0,1/2,0,0,0,1,0,0),nrow=6,ncol=6,byrow = T)
round(A,3)## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.000 0.500 0.5 0.0 0.000 0.0
## [2,] 0.000 0.000 0.0 0.0 0.000 0.0
## [3,] 0.333 0.333 0.0 0.0 0.333 0.0
## [4,] 0.000 0.000 0.0 0.0 0.500 0.5
## [5,] 0.000 0.000 0.0 0.5 0.000 0.5
## [6,] 0.000 0.000 0.0 1.0 0.000 0.0
custom_page_rank<-function(r,L,trials){
last_r <- matrix(c(0,0,0,0,0,0),ncol=6)
for(i in 1:trials){
r <- (r%*%L)
if (all.equal(last_r,r)==TRUE){
print(paste0('Convergence occurs at ',i,'. With values of ',round(r[1],4),', ',round(r[2],4),', ',
round(r[3],4),', ',round(r[4],4),', ',round(r[5],4),', ',round(r[6],4)))
return(r)
break
}
last_r<- r
}
}
pr_A <- custom_page_rank(ri,A,50)## [1] "Convergence occurs at 28. With values of 0, 0, 0, 0.2667, 0.1333, 0.2"
Add row uniform rank vector to stabilize the convergence.
A[2,]<-ri
print(A)## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000
## [2,] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
## [3,] 0.3333333 0.3333333 0.0000000 0.0000000 0.3333333 0.0000000
## [4,] 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000
## [5,] 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000
## [6,] 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000
• Start with a uniform rank vector r and perform power iterations on B till convergence. That is, compute the solution r = B n × r. Attempt this for a sufficiently large n so that r actually converges. (5 Points)
B <- .85*A+.15/nrow(A)
pr_B <- custom_page_rank(ri,B,50)## [1] "Convergence occurs at 32. With values of 0.0517, 0.0737, 0.0574, 0.3487, 0.1999, 0.2686"
• Compute the eigen-decomposition of B and verify that you indeed get an eigenvalue of 1 as the largest eigenvalue and that its corresponding eigenvector is the same vector that you obtained in the previous power iteration method. Further, this eigenvector has all positive entries and it sums to 1.(10 points)
## [1] 1.00000000+0i 0.57619235+0i -0.42500000+0i -0.42500000-0i -0.34991524+0i
## [6] -0.08461044+0i
## [,1] [,2] [,3]
## [1,] -0.4082483+0i -0.7278031+0i 5.345225e-01+0.000000e+00i
## [2,] -0.4082483+0i -0.3721164+0i 0.000000e+00-2.443024e-09i
## [3,] -0.4082483+0i -0.5389259+0i -5.345225e-01+0.000000e+00i
## [4,] -0.4082483+0i 0.1174605+0i 2.672612e-01-0.000000e+00i
## [5,] -0.4082483+0i 0.1174605+0i 2.672612e-01-0.000000e+00i
## [6,] -0.4082483+0i 0.1174605+0i -5.345225e-01+0.000000e+00i
## [,4] [,5] [,6]
## [1,] 5.345225e-01+0.000000e+00i -0.795670150+0i -0.486246420+0i
## [2,] 0.000000e+00+2.443024e-09i 0.059710287+0i 0.673469294+0i
## [3,] -5.345225e-01-0.000000e+00i 0.602762996+0i -0.556554233+0i
## [4,] 2.672612e-01+0.000000e+00i 0.002611877+0i 0.009145393+0i
## [5,] 2.672612e-01+0.000000e+00i 0.002611877+0i 0.009145393+0i
## [6,] -5.345225e-01-0.000000e+00i 0.002611877+0i 0.009145393+0i
## [1] "Max eigen value is 1"
corresponding_v <- as.numeric(eigen(B)$vectors[,which.max(eigen(B)$values)])
corresponding_v## [1] -0.4082483 -0.4082483 -0.4082483 -0.4082483 -0.4082483 -0.4082483
sum_B <- (1/sum(corresponding_v))*corresponding_v
sum_B## [1] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
sum(sum_B)## [1] 1
• Use the graph package in R and its page.rank method to compute the Page Rank of the graph as given in A. Note that you don’t need to apply decay. The package starts with a connected graph and applies decay internally. Verify that you do get the same PageRank vector as the two approaches above. (10 points)
library(igraph)##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
A_2 <- matrix(c(0,1/2,1/2,0,0,0,0,0,0,0,0,0,1/3,1/3,0,0,1/3,0,0,0,0,0,1/2,1/2,0,0,0,1/2,0,1/2,0,0,0,1,0,0),nrow=6,ncol=6,byrow = T)
igraph_A <- graph_from_adjacency_matrix(A_2, weighted=TRUE,mode = "directed")
plot(igraph_A)ipageRank <- as.matrix(page.rank(igraph_A)$vector)
round(ipageRank,4) == round(pr_B[1,],4)## [,1]
## [1,] TRUE
## [2,] TRUE
## [3,] TRUE
## [4,] TRUE
## [5,] TRUE
## [6,] TRUE