Playing with PageRank

You’ll verify for yourself that PageRank works by performing calculations on a small universe of web pages. Let’s use the 6 page universe that we had in the course notes. For this directed graph, perform the following calculations in R. • Form the A matrix. Then, introduce decay and form the B matrix as we did in the course notes.

t <- 1/3
A <- matrix(c(0, .5, .5, 0, 0, 0, 0, 0, 0, 0, 0, 0, t, t, 0, 0, t, 0, 0, 0, 0, 0, .5, .5, 0, 0, 0, .5, 0, .5, 0, 0, 0, 1, 0, 0), nrow = 6, byrow = TRUE)
#just to check, let's make sure our A looks right
A
##           [,1]      [,2] [,3] [,4]      [,5] [,6]
## [1,] 0.0000000 0.5000000  0.5  0.0 0.0000000  0.0
## [2,] 0.0000000 0.0000000  0.0  0.0 0.0000000  0.0
## [3,] 0.3333333 0.3333333  0.0  0.0 0.3333333  0.0
## [4,] 0.0000000 0.0000000  0.0  0.0 0.5000000  0.5
## [5,] 0.0000000 0.0000000  0.0  0.5 0.0000000  0.5
## [6,] 0.0000000 0.0000000  0.0  1.0 0.0000000  0.0
#we will take the decay formula directly from page 3 of our notes
n <- 6 #number of webpages we are working with 

B <- .85 * A + (.15/n)
B
##           [,1]      [,2]  [,3]  [,4]      [,5]  [,6]
## [1,] 0.0250000 0.4500000 0.450 0.025 0.0250000 0.025
## [2,] 0.0250000 0.0250000 0.025 0.025 0.0250000 0.025
## [3,] 0.3083333 0.3083333 0.025 0.025 0.3083333 0.025
## [4,] 0.0250000 0.0250000 0.025 0.025 0.4500000 0.450
## [5,] 0.0250000 0.0250000 0.025 0.450 0.0250000 0.450
## [6,] 0.0250000 0.0250000 0.025 0.875 0.0250000 0.025

• Start with a uniform rank vector r and perform power iterations on B till convergence. That is, compute the solution r = Bn × r. Attempt this for a sufficiently large n so that r actually converges.

r <- matrix(c(.167, .167, .167, .167, .167, .167), nrow = 6, byrow = FALSE)
r
##       [,1]
## [1,] 0.167
## [2,] 0.167
## [3,] 0.167
## [4,] 0.167
## [5,] 0.167
## [6,] 0.167
#we will first start with n = 6, then n = 10, 50, 100, 500, and maybe more if need be

n <- 6
r <- B^n %*% r
r
##              [,1]
## [1,] 2.773458e-03
## [2,] 2.446289e-10
## [3,] 4.304877e-04
## [4,] 2.773458e-03
## [5,] 2.773458e-03
## [6,] 7.494882e-02
n <- 10
r <- B^n %*% r
r
##              [,1]
## [1,] 1.465838e-07
## [2,] 7.982224e-18
## [3,] 4.307808e-08
## [4,] 2.646492e-05
## [5,] 2.646492e-05
## [6,] 7.296290e-04
n <- 50
r <- B^n %*% r
r
##              [,1]
## [1,] 1.971887e-25
## [2,] 6.174797e-84
## [3,] 7.517845e-31
## [4,] 3.460999e-21
## [5,] 3.460999e-21
## [6,] 3.334827e-08
n <- 100
r <- B^n %*% r
r
##               [,1]
## [1,]  1.575233e-65
## [2,] 2.075268e-168
## [3,]  2.762320e-72
## [4,]  6.987546e-43
## [5,]  6.987546e-43
## [6,]  5.495494e-27
n <- 500
r <- B^n %*% r
r
##               [,1]
## [1,] 1.115657e-245
## [2,]  0.000000e+00
## [3,] 2.262370e-298
## [4,] 2.219543e-200
## [5,] 2.219543e-200
## [6,]  7.052631e-72
n <- 1000
r <- B^n %*% r
r
##               [,1]
## [1,]  0.000000e+00
## [2,]  0.000000e+00
## [3,]  0.000000e+00
## [4,]  0.000000e+00
## [5,]  0.000000e+00
## [6,] 2.261084e-258
#at n = 1000 we see that the values converge to 2.261084e-258

• Compute the eigen-decomposition of B and verify that you indeed get an eigenvalue of 1 as the largest eigenvalue and that its corresponding eigenvector is the same vector that you obtained in the previous power iteration method. Further, this eigenvector has all positive entries and it sums to 1.

eigen.decomp <- eigen(B)
eigen.decomp
## $values
## [1]  0.95165271+0i -0.42500000+0i -0.42500000-0i  0.41436582+0i
## [5] -0.34733611+0i -0.01868242+0i
## 
## $vectors
##               [,1]                        [,2]                        [,3]
## [1,] -0.2159123+0i -5.345225e-01-0.000000e+00i -5.345225e-01+0.000000e+00i
## [2,] -0.0572329+0i -5.200810e-17+3.664536e-10i -5.200810e-17-3.664536e-10i
## [3,] -0.2980792+0i  5.345225e-01+0.000000e+00i  5.345225e-01+0.000000e+00i
## [4,] -0.5358032+0i -2.672612e-01+0.000000e+00i -2.672612e-01-0.000000e+00i
## [5,] -0.5358032+0i -2.672612e-01+0.000000e+00i -2.672612e-01-0.000000e+00i
## [6,] -0.5358032+0i  5.345225e-01-0.000000e+00i  5.345225e-01+0.000000e+00i
##                [,4]            [,5]           [,6]
## [1,] -0.77869913+0i -0.775079691+0i  0.55060201+0i
## [2,] -0.07537631+0i  0.009090375+0i -0.61511270+0i
## [3,] -0.61034825+0i  0.631781588+0i  0.56386946+0i
## [4,]  0.07169632+0i  0.002637034+0i -0.01322899+0i
## [5,]  0.07169632+0i  0.002637034+0i -0.01322899+0i
## [6,]  0.07169632+0i  0.002637034+0i -0.01322899+0i
#The highest value I got for the eigen value is .95165271+0i, which is not 1 but very close to 1. It does not look as though my eigenvectors are the same as the one from above, so I think I may have made a mistake with my iterations.  

• Use the graph package in R and its page.rank method to compute the Page Rank of the graph as given in A. Note that you don’t need to apply decay. The package starts with a connected graph and applies decay internally. Verify that you do get the same PageRank vector as the two approaches above.

require(igraph)
## Loading required package: igraph
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
net <- graph.adjacency(A, mode = "directed")
pagerank.A <- page.rank(net, directed = TRUE)$vector
pagerank.A
## [1] 0.1459854 0.1459854 0.1459854 0.2700730 0.1459854 0.1459854

Please document all your experiments in an R Markdown document.