Form the A matrix. Then, introduce decay and form the B matrix as we did in the course notes.
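For reference, the decay step used below follows the standard PageRank damping formulation with $d = 0.85$: $B = d\,A + \frac{1-d}{n}J$, where $J$ is the all-ones matrix and $n$ is the number of pages, so each column of $B$ remains a probability distribution.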
# Form the A matrix. With the default byrow = FALSE, each line of data below
# becomes a *column* of A, so entry A[i, j] is the probability of moving from
# page j to page i.
A <- matrix(c(0, 1/2, 1/2, 0, 0, 0,
              0, 0, 0, 0, 0, 0,
              1/3, 1/3, 0, 0, 1/3, 0,
              0, 0, 0, 0, 1/2, 1/2,
              0, 0, 0, 1/2, 0, 1/2,
              0, 0, 0, 1, 0, 0),
            nrow = 6)
A
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0 0 0.3333333 0.0 0.0 0
## [2,] 0.5 0 0.3333333 0.0 0.0 0
## [3,] 0.5 0 0.0000000 0.0 0.0 0
## [4,] 0.0 0 0.0000000 0.0 0.5 1
## [5,] 0.0 0 0.3333333 0.5 0.0 0
## [6,] 0.0 0 0.0000000 0.5 0.5 0
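A quick check makes the dangling node visible: every column of a column-stochastic matrix should sum to 1, but column 2 (page p2) sums to 0.

# column sums of A: the dangling p2 column sums to 0, not 1
colSums(A)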
# Make A column-stochastic by replacing the dangling column for p2 with a
# uniform vector. If column 2 stays all zeros (indicating p2 has no outgoing
# links), PageRank will not converge because of the dangling node.
A <- matrix(c(0, 1/2, 1/2, 0, 0, 0,
              1/6, 1/6, 1/6, 1/6, 1/6, 1/6,
              1/3, 1/3, 0, 0, 1/3, 0,
              0, 0, 0, 0, 1/2, 1/2,
              0, 0, 0, 1/2, 0, 1/2,
              0, 0, 0, 1, 0, 0),
            nrow = 6)
print(A)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0 0.1666667 0.3333333 0.0 0.0 0
## [2,] 0.5 0.1666667 0.3333333 0.0 0.0 0
## [3,] 0.5 0.1666667 0.0000000 0.0 0.0 0
## [4,] 0.0 0.1666667 0.0000000 0.0 0.5 1
## [5,] 0.0 0.1666667 0.3333333 0.5 0.0 0
## [6,] 0.0 0.1666667 0.0000000 0.5 0.5 0
# Next we form the decay matrix B, spreading the remaining 15% of the
# probability uniformly across all pages
n <- 6
B <- 0.85 * A + 0.15 / n
print(B)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.025 0.1666667 0.3083333 0.025 0.025 0.025
## [2,] 0.450 0.1666667 0.3083333 0.025 0.025 0.025
## [3,] 0.450 0.1666667 0.0250000 0.025 0.025 0.025
## [4,] 0.025 0.1666667 0.0250000 0.025 0.450 0.875
## [5,] 0.025 0.1666667 0.3083333 0.450 0.025 0.025
## [6,] 0.025 0.1666667 0.0250000 0.450 0.450 0.025
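As a sanity check on the B just formed, the decay step preserves column-stochasticity, since each column sums to $0.85 \times 1 + 6 \times 0.025 = 1$:

# every column of B should sum to 1
colSums(B)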
Start with a uniform rank vector $r$ and perform power iterations on B until convergence. That is, compute the solution $r = B^n \times r$, attempting this for a sufficiently large $n$ so that $r$ actually converges.
# Perform power iterations on B, starting from a uniform rank vector
r <- matrix(c(1/6, 1/6, 1/6, 1/6, 1/6, 1/6), nrow=6, ncol=1)
r
## [,1]
## [1,] 0.1666667
## [2,] 0.1666667
## [3,] 0.1666667
## [4,] 0.1666667
## [5,] 0.1666667
## [6,] 0.1666667
n <- 30
ns <- c(0)
rs <- c(r[4])   # track the 4th component of r across iterations
n_r <- r
Bn <- B
for (i in 1:n) {
  Bn <- Bn %*% B      # accumulate the matrix power B^(i+1)
  n_r <- Bn %*% r     # current rank estimate
  rs <- append(rs, n_r[4])
  ns <- append(ns, i)
}
n_r
## [,1]
## [1,] 0.05170475
## [2,] 0.07367927
## [3,] 0.05741242
## [4,] 0.34870368
## [5,] 0.19990381
## [6,] 0.26859608
# the 4th component settles quickly, showing convergence
plot(ns, rs, xlab = "iteration", ylab = "4th component of r")
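Rather than fixing the number of iterations in advance, one can also iterate until successive rank vectors stop changing; a minimal sketch, using the B defined above and an illustrative tolerance:

# power-iterate until the L1 change between successive rank vectors is tiny
r_new <- matrix(rep(1/6, 6), ncol = 1)
tol <- 1e-10
repeat {
  r_old <- r_new
  r_new <- B %*% r_old
  if (sum(abs(r_new - r_old)) < tol) break
}
r_new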
Compute the eigen-decomposition of B and verify that the largest eigenvalue is indeed 1, and that its corresponding eigenvector is the same vector obtained by the power iteration method above. Further, once normalized, this eigenvector has all positive entries and sums to 1.
# The largest eigenvalue of B is indeed 1
x <- eigen(B)
x$values
## [1] 1.00000000+0i 0.57619235+0i -0.42500000+0i -0.42500000-0i -0.34991524+0i
## [6] -0.08461044+0i
# Normalized so it sums to 1, the dominant eigenvector matches the
# power-iteration result
x$vectors[, 1] / sum(x$vectors[, 1])
## [1] 0.05170475+0i 0.07367926+0i 0.05741241+0i 0.34870369+0i 0.19990381+0i
## [6] 0.26859608+0i
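Using x and n_r from above, a few quick checks confirm the claims: the entries are positive, sum to 1, and agree with the power-iteration result.

# drop the zero imaginary parts and normalize
v <- Re(x$vectors[, 1]) / sum(Re(x$vectors[, 1]))
all(v > 0)                      # expect TRUE
sum(v)                          # expect 1
max(abs(v - as.vector(n_r)))    # expect ~0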
Use the igraph package in R and its page_rank method to compute the PageRank of the graph as given in A. Note that you don't need to apply decay: the package starts with a connected graph and applies decay internally. Verify that you get the same PageRank vector as in the two approaches above.
# Rebuild A row-wise for igraph: a nonzero A[i, j] becomes an edge i -> j
A <- matrix(c(0, 1/2, 1/2, 0, 0, 0,
              0, 0, 0, 0, 0, 0,
              1/3, 1/3, 0, 0, 1/3, 0,
              0, 0, 0, 0, 1/2, 1/2,
              0, 0, 0, 1/2, 0, 1/2,
              0, 0, 0, 1, 0, 0),
            nrow = 6, byrow = TRUE)
library(igraph)
g <- graph_from_adjacency_matrix(A, weighted = TRUE)
plot(g)
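To complete the verification, igraph's page_rank() (the modern name for page.rank) can be called on g; it applies the 0.85 damping internally, so the result should match the vector from the power iteration and the eigen-decomposition:

# PageRank with igraph's built-in damping
pr <- page_rank(g, damping = 0.85)
pr$vector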
# Read the MNIST-style digit data (read_csv is from readr)
library(readr)
testdigds <- read_csv("testdig.csv")
## Rows: 28000 Columns: 784
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl (784): pixel0, pixel1, pixel2, pixel3, pixel4, pixel5, pixel6, pixel7, p...
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
testdigds
## # A tibble: 28,000 x 784
## pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9 pixel10
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 0 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0 0 0
## 3 0 0 0 0 0 0 0 0 0 0 0
## 4 0 0 0 0 0 0 0 0 0 0 0
## 5 0 0 0 0 0 0 0 0 0 0 0
## 6 0 0 0 0 0 0 0 0 0 0 0
## 7 0 0 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 0 0 0 0 0
## 9 0 0 0 0 0 0 0 0 0 0 0
## 10 0 0 0 0 0 0 0 0 0 0 0
## # ... with 27,990 more rows, and 773 more variables: pixel11 <dbl>,
## # pixel12 <dbl>, pixel13 <dbl>, pixel14 <dbl>, pixel15 <dbl>, pixel16 <dbl>,
## # pixel17 <dbl>, pixel18 <dbl>, pixel19 <dbl>, pixel20 <dbl>, pixel21 <dbl>,
## # pixel22 <dbl>, pixel23 <dbl>, pixel24 <dbl>, pixel25 <dbl>, pixel26 <dbl>,
## # pixel27 <dbl>, pixel28 <dbl>, pixel29 <dbl>, pixel30 <dbl>, pixel31 <dbl>,
## # pixel32 <dbl>, pixel33 <dbl>, pixel34 <dbl>, pixel35 <dbl>, pixel36 <dbl>,
## # pixel37 <dbl>, pixel38 <dbl>, pixel39 <dbl>, pixel40 <dbl>, ...
traindigds <- read_csv("traindig.csv")
## Rows: 42000 Columns: 785
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl (785): label, pixel0, pixel1, pixel2, pixel3, pixel4, pixel5, pixel6, pi...
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
traindigds
## # A tibble: 42,000 x 785
## label pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0 0 0
## 3 1 0 0 0 0 0 0 0 0 0 0
## 4 4 0 0 0 0 0 0 0 0 0 0
## 5 0 0 0 0 0 0 0 0 0 0 0
## 6 0 0 0 0 0 0 0 0 0 0 0
## 7 7 0 0 0 0 0 0 0 0 0 0
## 8 3 0 0 0 0 0 0 0 0 0 0
## 9 5 0 0 0 0 0 0 0 0 0 0
## 10 3 0 0 0 0 0 0 0 0 0 0
## # ... with 41,990 more rows, and 774 more variables: pixel10 <dbl>,
## # pixel11 <dbl>, pixel12 <dbl>, pixel13 <dbl>, pixel14 <dbl>, pixel15 <dbl>,
## # pixel16 <dbl>, pixel17 <dbl>, pixel18 <dbl>, pixel19 <dbl>, pixel20 <dbl>,
## # pixel21 <dbl>, pixel22 <dbl>, pixel23 <dbl>, pixel24 <dbl>, pixel25 <dbl>,
## # pixel26 <dbl>, pixel27 <dbl>, pixel28 <dbl>, pixel29 <dbl>, pixel30 <dbl>,
## # pixel31 <dbl>, pixel32 <dbl>, pixel33 <dbl>, pixel34 <dbl>, pixel35 <dbl>,
## # pixel36 <dbl>, pixel37 <dbl>, pixel38 <dbl>, pixel39 <dbl>, ...
# Reshape a 784-element pixel vector into a 28 x 28 image and draw it
plot_img <- function(row) {
  img <- matrix(row, nrow = 28, ncol = 28)
  mode(img) <- "numeric"
  image(img, useRaster = TRUE, axes = FALSE)
}
for (i in 1:10) {
  # drop the label column so the remaining 784 pixel values fill the 28 x 28 grid
  plot_img(as.numeric(traindigds[i, -1]))
}
# Scale pixels to [0, 1] and reshape into a 4-D array; array_reshape()
# is provided by the keras/reticulate packages
library(keras)
trainingds_pix <- traindigds[, 2:ncol(traindigds)] / 255
trainingds_test <- array_reshape(as.matrix(trainingds_pix), c(nrow(trainingds_pix), 28, 28, 1))
hist(trainingds_test)
hist(trainingds_test[trainingds_test > 0])
## Question 3
The pixel columns near the image border all have mean 0, and the means increase toward the middle of each row. This simply means the digits are drawn toward the center of the image.
colMeans(traindigds)[1:60]
## label pixel0 pixel1 pixel2 pixel3 pixel4
## 4.4566428571 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## pixel5 pixel6 pixel7 pixel8 pixel9 pixel10
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## pixel11 pixel12 pixel13 pixel14 pixel15 pixel16
## 0.0000000000 0.0030000000 0.0111904762 0.0051428571 0.0002142857 0.0000000000
## pixel17 pixel18 pixel19 pixel20 pixel21 pixel22
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## pixel23 pixel24 pixel25 pixel26 pixel27 pixel28
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## pixel29 pixel30 pixel31 pixel32 pixel33 pixel34
## 0.0000000000 0.0000000000 0.0000000000 0.0003809524 0.0013095238 0.0105476190
## pixel35 pixel36 pixel37 pixel38 pixel39 pixel40
## 0.0272619048 0.0509047619 0.0664047619 0.1295714286 0.1741190476 0.1913095238
## pixel41 pixel42 pixel43 pixel44 pixel45 pixel46
## 0.1905952381 0.1960476190 0.1713571429 0.1644761905 0.1517142857 0.1053095238
## pixel47 pixel48 pixel49 pixel50 pixel51 pixel52
## 0.0607857143 0.0450714286 0.0154047619 0.0105238095 0.0050476190 0.0000000000
## pixel53 pixel54 pixel55 pixel56 pixel57 pixel58
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0015238095
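To see this pattern directly, the per-pixel means can be reshaped into an image using the plot_img helper defined earlier (dropping the label column); the intensity should concentrate in the center of the grid:

# the "mean digit" image: brightness sits in the middle of the 28 x 28 grid
plot_img(as.numeric(colMeans(traindigds[, -1])))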
traindsx <- traindigds
# Scale the pixels
traindsxreduced <- traindsx / 255
traindsxcov <- cov(traindsxreduced)
# Reduce to principal components (PCA run on the covariance matrix)
pcatraindsx <- prcomp(traindsxcov)
(cumsum(pcatraindsx$sdev^2) / sum(pcatraindsx$sdev^2))[1:50]
## [1] 0.2533214 0.4211306 0.5443493 0.6382875 0.7099679 0.7691895 0.8028369
## [8] 0.8302285 0.8531289 0.8711703 0.8853394 0.8986566 0.9080934 0.9174708
## [15] 0.9256099 0.9328111 0.9384893 0.9438301 0.9484009 0.9527328 0.9564977
## [22] 0.9598726 0.9629213 0.9656491 0.9682131 0.9705047 0.9726596 0.9746378
## [29] 0.9764288 0.9779685 0.9793811 0.9807173 0.9818937 0.9830211 0.9840639
## [36] 0.9850215 0.9858701 0.9866476 0.9873883 0.9880994 0.9887699 0.9894182
## [43] 0.9899904 0.9905079 0.9909907 0.9914508 0.9918774 0.9922720 0.9926428
## [50] 0.9929782
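A convenient follow-up is to read off how many components reach a given share of the variance; for example:

# first component index reaching 95% of the cumulative variance
cum_var <- cumsum(pcatraindsx$sdev^2) / sum(pcatraindsx$sdev^2)
which(cum_var >= 0.95)[1]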
The leading principal directions appear to be mostly noise; looking at the variation across a variety of images, each one resembles a centered blob.
for (i in 1:10) {
  # each column of rotation is one principal direction; drop the label
  # entry so the remaining 784 loadings reshape into a 28 x 28 image
  img <- matrix(pcatraindsx$rotation[-1, i], nrow = 28, ncol = 28)
  image(img, useRaster = TRUE, axes = FALSE)
}
Re-run PCA so that the components account for all of the variance (100%), this time on the 8's only.
# Filter the training data down to the 8's and drop the label column
library(dplyr)
traindigds_8 <- traindigds %>% filter(label == 8)
traindigds_8 <- dplyr::select(traindigds_8, 2:785)
dim(traindigds_8)
## [1] 4063 784
knitr::opts_chunk$set(cache = TRUE)
X_8 <- traindigds_8
# Scale the pixels
X_8reduced <- X_8 / 255
X_8cov <- cov(X_8reduced)
# Reduce to principal components
pcaX_8 <- prcomp(X_8cov)
for (i in 1:10) {
  # with no label column, each 784-entry column of rotation reshapes
  # directly into a 28 x 28 image
  img <- matrix(pcaX_8$rotation[, i], nrow = 28, ncol = 28)
  image(img, useRaster = TRUE, axes = FALSE)
}
# All ten of the resulting images resemble an 8
An incorrect approach to predicting the images would be to build a linear regression model with y as the digit values and X as the pixel matrix, since the labels are categories rather than quantities. Instead, we can build a multinomial model that classifies the digits. Build a multinomial model on the entirety of the training set, then report its classification accuracy (percent correctly identified) as well as a matrix of observed versus forecast values (confusion matrix). This will be a 10 × 10 matrix, with correct classifications on the diagonal.
# Divide the training dataset into trainds_1 (a 20% sample) and trainds_2 (the rest)
trainds_size <- floor(0.2 * nrow(traindigds))
set.seed(2200)
trainds_index <- sample(seq_len(nrow(traindigds)), size = trainds_size)
trainds_1 <- traindigds[trainds_index, ]
trainds_2 <- traindigds[-trainds_index, ]
trainds1_x <- trainds_1[, names(trainds_1) != "label"]
trainds1_x <- lapply(trainds1_x, as.numeric)
trainds1_y <- as.factor(trainds_1$label)
trainds2_x <- trainds_2[, names(trainds_2) != "label"]
trainds2_x <- lapply(trainds2_x, as.numeric)
trainds2_y <- as.factor(trainds_2$label)
knitr::opts_chunk$set(cache = TRUE)
library(nnet)
# Fit a multinomial logistic regression; MaxNWts is raised well above the
# default limit of 1000 to allow one weight per pixel per class
trainds_mn_model <- multinom(trainds1_y ~ ., trainds1_x, MaxNWts = 100000)
## # weights: 7860 (7065 variable)
## initial value 19341.714781
## iter 10 value 4865.900733
## iter 20 value 3912.737730
## iter 30 value 3581.493792
## iter 40 value 3453.794647
## iter 50 value 3395.592725
## iter 60 value 3361.886757
## iter 70 value 3324.984639
## iter 80 value 3292.613879
## iter 90 value 3267.937840
## iter 100 value 3248.331137
## final value 3248.331137
## stopped after 100 iterations
library("caret")
## Loading required package: lattice
##
## Attaching package: 'lattice'
## The following objects are masked from 'package:openintro':
##
## ethanol, lsegments
##
## Attaching package: 'caret'
## The following object is masked from 'package:openintro':
##
## dotPlot
## The following object is masked from 'package:purrr':
##
## lift
trainds_prediction <- predict(trainds_mn_model, newdata = trainds2_x)
confusionMatrix(trainds_prediction,trainds2_y)
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1 2 3 4 5 6 7 8 9
## 0 3061 2 64 41 13 55 70 42 84 48
## 1 2 3555 57 30 45 22 40 79 162 33
## 2 24 29 2785 130 47 16 106 78 65 19
## 3 21 9 60 2687 13 106 7 51 98 45
## 4 8 2 79 13 2794 31 34 58 16 159
## 5 99 73 35 359 101 2590 114 56 398 152
## 6 46 6 74 19 39 42 2886 18 25 7
## 7 8 13 64 47 41 16 9 2759 15 82
## 8 34 43 108 80 42 110 24 35 2281 19
## 9 7 5 37 69 154 39 14 355 62 2794
##
## Overall Statistics
##
## Accuracy : 0.839
## 95% CI : (0.8351, 0.843)
## No Information Rate : 0.1112
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.8212
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Statistics by Class:
##
## Class: 0 Class: 1 Class: 2 Class: 3 Class: 4 Class: 5
## Sensitivity 0.92477 0.9513 0.82813 0.77324 0.84950 0.85563
## Specificity 0.98617 0.9843 0.98300 0.98639 0.98680 0.95463
## Pos Pred Value 0.87960 0.8832 0.84420 0.86761 0.87477 0.65124
## Neg Pred Value 0.99173 0.9938 0.98092 0.97417 0.98372 0.98525
## Prevalence 0.09851 0.1112 0.10009 0.10342 0.09789 0.09009
## Detection Rate 0.09110 0.1058 0.08289 0.07997 0.08315 0.07708
## Detection Prevalence 0.10357 0.1198 0.09818 0.09217 0.09506 0.11836
## Balanced Accuracy 0.95547 0.9678 0.90557 0.87981 0.91815 0.90513
## Class: 6 Class: 7 Class: 8 Class: 9
## Sensitivity 0.87349 0.78137 0.71148 0.83204
## Specificity 0.99089 0.99019 0.98371 0.97546
## Pos Pred Value 0.91271 0.90341 0.82169 0.79016
## Neg Pred Value 0.98627 0.97473 0.96999 0.98124
## Prevalence 0.09833 0.10509 0.09542 0.09994
## Detection Rate 0.08589 0.08211 0.06789 0.08315
## Detection Prevalence 0.09411 0.09089 0.08262 0.10524
## Balanced Accuracy 0.93219 0.88578 0.84760 0.90375
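As a one-line cross-check of the caret summary (Accuracy: 0.839), the proportion correctly identified can also be computed directly from the predictions:

# proportion of held-out digits classified correctly
mean(trainds_prediction == trainds2_y)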