1.Playing with PageRank

You’ll verify for yourself that PageRank works by performing calculations on a small universe of web pages. Let’s use the 6 page universe that we had in the course notes. For this directed graph, perform the following calculations in R.

Form the A matrix. Then, introduce decay and form the B matrix as we did in the course notes.

#Transition Matrix for six websites(nodes)
A <- matrix(c(0,1/2,1/2,0,0,0,0,0,0,0,0,0,1/3,1/3,0,0,1/3,0,0,0,0,0,1/2,1/2,0,0,0,1/2,0,1/2,0,0,0,1,0,0),nrow  = 6)
A = t(A)
A
##           [,1]      [,2] [,3] [,4]      [,5] [,6]
## [1,] 0.0000000 0.5000000  0.5  0.0 0.0000000  0.0
## [2,] 0.0000000 0.0000000  0.0  0.0 0.0000000  0.0
## [3,] 0.3333333 0.3333333  0.0  0.0 0.3333333  0.0
## [4,] 0.0000000 0.0000000  0.0  0.0 0.5000000  0.5
## [5,] 0.0000000 0.0000000  0.0  0.5 0.0000000  0.5
## [6,] 0.0000000 0.0000000  0.0  1.0 0.0000000  0.0
#Initial rank matrix r
r <- matrix(c(1/6,1/6,1/6,1/6,1/6,1/6),ncol=1)
r
##           [,1]
## [1,] 0.1666667
## [2,] 0.1666667
## [3,] 0.1666667
## [4,] 0.1666667
## [5,] 0.1666667
## [6,] 0.1666667
Decay Matrix B

\[ d =0 .85 \\ B = d * A + (1-d)/ n \]

#As one column is having a dead end, we need to convert the matrix in stochastic

n=dim(A)[1]

A = apply(A,1,  function(x) if(sum(x)!=1) return(x+ (1/n)) else return(x))

#matrix(c(unlist(A1)),nrow=6)

A
##      [,1]      [,2]      [,3] [,4] [,5] [,6]
## [1,]  0.0 0.1666667 0.3333333  0.0  0.0    0
## [2,]  0.5 0.1666667 0.3333333  0.0  0.0    0
## [3,]  0.5 0.1666667 0.0000000  0.0  0.0    0
## [4,]  0.0 0.1666667 0.0000000  0.0  0.5    1
## [5,]  0.0 0.1666667 0.3333333  0.5  0.0    0
## [6,]  0.0 0.1666667 0.0000000  0.5  0.5    0
#Damping factor
d = .85
#Identity matrix - Can be used or just the nodes
I = matrix(c(rep(1,36)),ncol =6)

#Number of Nodes
n=dim(A)[1]

#Decay from A (Which can be used for disconnected matrix)
B = d*A+((1-d)*(I/n))

#Decomposed matrix
B
##       [,1]      [,2]      [,3]  [,4]  [,5]  [,6]
## [1,] 0.025 0.1666667 0.3083333 0.025 0.025 0.025
## [2,] 0.450 0.1666667 0.3083333 0.025 0.025 0.025
## [3,] 0.450 0.1666667 0.0250000 0.025 0.025 0.025
## [4,] 0.025 0.1666667 0.0250000 0.025 0.450 0.875
## [5,] 0.025 0.1666667 0.3083333 0.450 0.025 0.025
## [6,] 0.025 0.1666667 0.0250000 0.450 0.450 0.025

Start with a uniform rank vector r and perform power iterations on B till convergence. That is, compute the solution \(r = B^n * r\). Attempt this for a sufficiently large n so that r actually converges.

\(r\)’ converges at 30.

#Power iterations on decompostion matrix - 30 

br = format((B %^% 30) %*% r, scientific=F)
br = as.numeric(br)
br
## [1] 0.05170475 0.07367927 0.05741242 0.34870367 0.19990381 0.26859607
#Power iterations on decompostion matrix - 100

format((B %^% 100) %*% r, scientific=F)
##      [,1]        
## [1,] "0.05170475"
## [2,] "0.07367926"
## [3,] "0.05741241"
## [4,] "0.34870369"
## [5,] "0.19990381"
## [6,] "0.26859608"

Compute the eigen-decomposition of B and verify that you indeed get an eigenvalue of 1 as the largest eigenvalue and that its corresponding eigenvector is the same vector that you obtained in the previous power iteration method. Further, this eigenvector has all positive entries and it sums to 1.

#Eigen value for B
eigen(B)
## $values
## [1]  1.00000000+0i  0.57619235+0i -0.42500000+0i -0.42500000-0i
## [5] -0.34991524+0i -0.08461044+0i
## 
## $vectors
##              [,1]          [,2]                        [,3]
## [1,] 0.1044385+0i  0.2931457+0i  9.162390e-16+5.425344e-23i
## [2,] 0.1488249+0i  0.5093703+0i -8.139802e-16-0.000000e+00i
## [3,] 0.1159674+0i  0.3414619+0i -7.362635e-17-5.073466e-23i
## [4,] 0.7043472+0i -0.5890805+0i -7.071068e-01+0.000000e+00i
## [5,] 0.4037861+0i -0.1413606+0i  7.071068e-01+0.000000e+00i
## [6,] 0.5425377+0i -0.4135367+0i  0.000000e+00-1.460109e-08i
##                             [,4]           [,5]            [,6]
## [1,]  9.162390e-16-5.425344e-23i -0.06471710+0i -0.212296003+0i
## [2,] -8.139802e-16+0.000000e+00i  0.01388698+0i  0.854071294+0i
## [3,] -7.362635e-17+5.073466e-23i  0.07298180+0i -0.363638739+0i
## [4,] -7.071068e-01+0.000000e+00i -0.66058664+0i  0.018399984+0i
## [5,]  7.071068e-01-0.000000e+00i  0.73761812+0i -0.304719509+0i
## [6,]  0.000000e+00+1.460109e-08i -0.09918316+0i  0.008182973+0i
# Sum of the vector is greater than 1 for the eigen value 1
sum(eigen(B)$vectors[,1])
## [1] 2.019902+0i
# Change it to unit vector
e1 = as.numeric(eigen(B)$vectors[,1]/ sum(eigen(B)$vectors[,1]))
e1
## [1] 0.05170475 0.07367926 0.05741241 0.34870369 0.19990381 0.26859608
# Difference between power method and eigen vector is negligible
sum(br - e1)
## [1] -1e-08

Use the graph package in R and its page.rank method to compute the Page Rank of the graph as given in A. Note that you don’t need to apply decay. The package starts with a connected graph and applies decay internally. Verify that you do get the same PageRank vector as the two approaches above.

#Converting to graph from adjacency matrix
g1 = graph_from_adjacency_matrix(t(A),weighted = T)

#Plot the graph
plot(g1)

#Resultant vector
page_rank(g1)$vector
## [1] 0.05170475 0.07367926 0.05741241 0.34870369 0.19990381 0.26859608