This document pertains to assignment# 10 and is meant to explore the PageRank. We will do so by using various method for calcuating the PageRank of a limited universe of 6 websites.
The relations between the websites is as follows:
Site | connects to |
---|---|
1 | 2 |
1 | 3 |
2 | none |
3 | 1 |
3 | 2 |
3 | 5 |
4 | 5 |
4 | 6 |
5 | 4 |
5 | 6 |
6 | 4 |
These relationships can be represented in the following Adjancy Matrix.
connections <- c(0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0)
ad_A <- matrix(connections, nrow=6, ncol=6, byrow = TRUE)
ad_A
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0 1 1 0 0 0
## [2,] 0 0 0 0 0 0
## [3,] 1 1 0 0 1 0
## [4,] 0 0 0 0 1 1
## [5,] 0 0 0 1 0 1
## [6,] 0 0 0 1 0 0
From examining the adjancy matrix and the sites relationships, it is clear that we have node 2 as a dandling node. There are no outlink from node 2. We will remedy this by replacing row 2 in Matrix by value 1.
# Handling dangling node; 2, by adding 1 all accross in row 2
ad_A_fix<- ad_A
ad_A_fix[2, ] <- c(1, 1, 1, 1, 1, 1)
ad_A_fix
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0 1 1 0 0 0
## [2,] 1 1 1 1 1 1
## [3,] 1 1 0 0 1 0
## [4,] 0 0 0 0 1 1
## [5,] 0 0 0 1 0 1
## [6,] 0 0 0 1 0 0
We will now convert our matrix into a row-wise stochastic matrix by dividing each row by the number of outlinks for that node. For node 2 (i.e. row 2, we will divide by 6 so that sum for the row equal to 1).
#Convert Adjancy matrix into stochastic matrix by dividing each rows by the number of outlinks for this node,
#sum of each row is 1, for row 2 we will divide by number of nodes.
sto_A <- ad_A_fix
sto_A[1,] <- ad_A_fix[1,]/2
sto_A[2,] <- ad_A_fix[2,]/6
sto_A[3,] <- ad_A_fix[3,]/3
sto_A[4,] <- ad_A_fix[4,]/2
sto_A[5,] <- ad_A_fix[5,]/2
sto_A[6,] <- ad_A_fix[6,]/1
sto_A
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0000000 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000
## [2,] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
## [3,] 0.3333333 0.3333333 0.0000000 0.0000000 0.3333333 0.0000000
## [4,] 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.5000000
## [5,] 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000 0.5000000
## [6,] 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000
We will now introduce the “decay” element and construc the matrix B based on following formula:
\(B\quad =\quad \alpha \cdot A\quad +\quad \frac { (1-\alpha ) }{ n } \cdot E,\quad where\quad E\quad is\quad matrix\quad with\quad all\quad 1's\).
We will use \(\alpha =0.85\)
ones <- rep(1, 36)
E <- matrix (ones, nrow=6, ncol=6, byrow=TRUE)
E
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 1 1 1 1 1
## [2,] 1 1 1 1 1 1
## [3,] 1 1 1 1 1 1
## [4,] 1 1 1 1 1 1
## [5,] 1 1 1 1 1 1
## [6,] 1 1 1 1 1 1
a <- 0.85
n <- 6
B <- a*sto_A + (1-a)/n*E
B
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0250000 0.4500000 0.4500000 0.0250000 0.0250000 0.0250000
## [2,] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
## [3,] 0.3083333 0.3083333 0.0250000 0.0250000 0.3083333 0.0250000
## [4,] 0.0250000 0.0250000 0.0250000 0.0250000 0.4500000 0.4500000
## [5,] 0.0250000 0.0250000 0.0250000 0.4500000 0.0250000 0.4500000
## [6,] 0.0250000 0.0250000 0.0250000 0.8750000 0.0250000 0.0250000
We will now perform the power interactions on matrix B untill we have convergence. We will do so by calling the following function. This function will calculate successive\({ r }_{ i+1 }\quad =\quad { r }_{ i }\cdot B\) until we have \({ r }_{ i+1 }={ r }_{ i }\).
B_converge <- function(B, r, l){
# Calculate r%*%B until convergence
r_i<- r
n <- 0
loop_indicator <- TRUE
while (loop_indicator){
n <- n + 1
# r_j <- B%*%r_i
r_j <- r_i%*%B
if (all(r_j == r_i)){
loop_indicator <- FALSE # end while loop when convergence occurred
result <- list(r_i, n-1, 'converged') # return results
}else if(n>= l){
loop_indicator <- FALSE # end while loop when convergence occurred
result <- list(r_i, n-1, 'limit') # return results
} #end of if else
r_i<-r_j
} #end of while loop
return(result)
}
We will start calling the function with a limit of 100 (the iterations will end eventhough we have no convergence). We will start with a uniform rank vector, c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6)
r0 <- c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6)
res1 <- B_converge (B, r0, 100)
res1[[3]]
## [1] "converged"
r_final <- res1[[1]]
r_final
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.05170475 0.07367926 0.05741241 0.3487037 0.1999038 0.2685961
We wil now calculate the (left) eigen values and vectors for the matrix B.
b_eigen_vectors <- eigen(t(B))$vectors
b_eigen_values <- eigen(t(B))$values
b_eigen_values
## [1] 1.00000000+0i 0.57619235+0i -0.42500000+0i -0.42500000-0i
## [5] -0.34991524+0i -0.08461044+0i
b_eigen_vectors
## [,1] [,2] [,3]
## [1,] 0.1044385+0i 0.2931457+0i 2.486934e-15+0.0000e+00i
## [2,] 0.1488249+0i 0.5093703+0i -8.528385e-16-6.9832e-23i
## [3,] 0.1159674+0i 0.3414619+0i -1.930646e-15-0.0000e+00i
## [4,] 0.7043472+0i -0.5890805+0i -7.071068e-01+0.0000e+00i
## [5,] 0.4037861+0i -0.1413606+0i 7.071068e-01+0.0000e+00i
## [6,] 0.5425377+0i -0.4135367+0i 0.000000e+00-1.7058e-08i
## [,4] [,5] [,6]
## [1,] 2.486934e-15-0.0000e+00i -0.06471710+0i -0.212296003+0i
## [2,] -8.528385e-16+6.9832e-23i 0.01388698+0i 0.854071294+0i
## [3,] -1.930646e-15+0.0000e+00i 0.07298180+0i -0.363638739+0i
## [4,] -7.071068e-01+0.0000e+00i -0.66058664+0i 0.018399984+0i
## [5,] 7.071068e-01-0.0000e+00i 0.73761812+0i -0.304719509+0i
## [6,] 0.000000e+00+1.7058e-08i -0.09918316+0i 0.008182973+0i
v1 <- Re(b_eigen_vectors[,1])
As expected, we have the largest eignvalue (in absolute value) = 1 and the corresponding vector v1.
We will now compare this vector with the vector we found by convergence.
v2 <- v1/r_final
v2
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 2.019902 2.019902 2.019902 2.019902 2.019902 2.019902
Since their are a scalar multiple of each other, we can conclude that the eigen vector and the vector found by convergence are the same.
We will now find the PageRank by using the igraph package.
library(igraph)
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
g <- graph_from_adjacency_matrix(ad_A, 'directed', weighted=TRUE)
plot(g)
v3 <- page_rank(g)$vector
v3
## [1] 0.05170475 0.07367926 0.05741241 0.34870369 0.19990381 0.26859608
v3/r_final
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 1 1 1 1 1