EXERCISE 01
Part (a)
stonks <- matrix(c(0.00023, 0.00038, 0.00022, 0.00007, 0.00006,
0.00038, 0.00134, 0.00041, 0.00013, 0.00008,
0.00022, 0.00041, 0.00093, 0.00019, 0.00002,
0.00007, 0.00013, 0.00019, 0.00068, 0.00037,
0.00006, 0.00008, 0.00002, 0.00037, 0.00053),
nrow = 5, ncol = 5)
rownames(stonks) <- c("SNDL", "WRN", "NGD", "UPH", "WISH")
colnames(stonks) <- c("SNDL", "WRN", "NGD", "UPH", "WISH")
stonks
## SNDL WRN NGD UPH WISH
## SNDL 0.00023 0.00038 0.00022 0.00007 0.00006
## WRN 0.00038 0.00134 0.00041 0.00013 0.00008
## NGD 0.00022 0.00041 0.00093 0.00019 0.00002
## UPH 0.00007 0.00013 0.00019 0.00068 0.00037
## WISH 0.00006 0.00008 0.00002 0.00037 0.00053
Part (b)
R <- solve(sqrt(diag(diag(stonks)))) %*% stonks %*%
t(solve(sqrt(diag(diag(stonks)))))
R
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.0000000 0.68449027 0.47568263 0.1770026 0.17184995
## [2,] 0.6844903 1.00000000 0.36727383 0.1361873 0.09492916
## [3,] 0.4756826 0.36727383 1.00000000 0.2389228 0.02848725
## [4,] 0.1770026 0.13618726 0.23892284 1.0000000 0.61632436
## [5,] 0.1718499 0.09492916 0.02848725 0.6163244 1.00000000
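As a cross-check on the matrix algebra above, base R's cov2cor() applies the same rescaling of a covariance matrix to a correlation matrix. The sketch below rebuilds stonks so that it runs on its own; the names tickers, D_inv, R_manual, and R_builtin are illustrative and were not part of the original answer.

```r
# Rebuild the covariance matrix from Part (a) so this block is self-contained.
tickers <- c("SNDL", "WRN", "NGD", "UPH", "WISH")
stonks <- matrix(c(0.00023, 0.00038, 0.00022, 0.00007, 0.00006,
                   0.00038, 0.00134, 0.00041, 0.00013, 0.00008,
                   0.00022, 0.00041, 0.00093, 0.00019, 0.00002,
                   0.00007, 0.00013, 0.00019, 0.00068, 0.00037,
                   0.00006, 0.00008, 0.00002, 0.00037, 0.00053),
                 nrow = 5, ncol = 5, dimnames = list(tickers, tickers))

# Inverse of the diagonal matrix of standard deviations.
D_inv <- solve(sqrt(diag(diag(stonks))))

# Manual rescaling, as in Part (b).
R_manual <- D_inv %*% stonks %*% t(D_inv)

# Built-in rescaling; this version keeps the ticker names.
R_builtin <- cov2cor(stonks)

# Compare the numbers only, ignoring the row and column names.
print(all.equal(R_manual, R_builtin, check.attributes = FALSE))
```

A practical difference is that cov2cor() keeps the row and column names, so the correlation matrix prints with tickers instead of [,1] ... [,5].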
Part (c)
library(corrplot)
## corrplot 0.92 loaded
corrplot(R, method = "color")

Part (d)
From the heat map, there appear to be two clusters, similar to the clusters we saw in class! There's one 3x3 block in the top left and one 2x2 block in the bottom right.
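One way to sanity-check a visual grouping like this is hierarchical clustering with 1 minus the correlation as the distance. This is a swapped-in technique, not something the exercise asked for, and the choices here (complete linkage, which is the hclust() default, and cutting at k = 2) are assumptions made for illustration. The names dist_corr, hc, and groups are hypothetical.

```r
# Rebuild the covariance matrix from Part (a) so this block is self-contained.
tickers <- c("SNDL", "WRN", "NGD", "UPH", "WISH")
stonks <- matrix(c(0.00023, 0.00038, 0.00022, 0.00007, 0.00006,
                   0.00038, 0.00134, 0.00041, 0.00013, 0.00008,
                   0.00022, 0.00041, 0.00093, 0.00019, 0.00002,
                   0.00007, 0.00013, 0.00019, 0.00068, 0.00037,
                   0.00006, 0.00008, 0.00002, 0.00037, 0.00053),
                 nrow = 5, ncol = 5, dimnames = list(tickers, tickers))
R <- cov2cor(stonks)

# Treat highly correlated stocks as "close": distance = 1 - correlation.
dist_corr <- as.dist(1 - R)

# Agglomerative clustering (complete linkage by default).
hc <- hclust(dist_corr)

# Cut the tree into two groups (assumption: k = 2, matching the heat-map guess).
groups <- cutree(hc, k = 2)
print(groups)
```

If the clustering agrees with the heat map, the first three tickers should share one label and the last two should share the other.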
Part (e)
eigen(R)$values
## [1] 2.2315142 1.4234986 0.7052701 0.3628567 0.2768604
eigen(R)$vectors
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.5510097 0.2846241 -0.2465648 0.3535425 0.6554319
## [2,] -0.5050987 0.3282943 -0.4429402 -0.4423280 -0.4952253
## [3,] -0.4443831 0.2314799 0.7976778 0.1901316 -0.2765878
## [4,] -0.3794004 -0.5840746 0.2169151 -0.5950724 0.3372668
## [5,] -0.3159789 -0.6453572 -0.2442697 0.5376732 -0.3673027
Part (f)
Using Kaiser's criterion, there should be two clusters (so the intrinsic dimensionality is 2). This is because two of the eigenvalues are greater than 1.
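The count can also be done in code rather than by reading the eigenvalues off the printout. The sketch below rebuilds the correlation matrix with cov2cor() so that it runs on its own; ev, kaiser_k, and prop_var are illustrative names, and the cumulative proportion of variance is an extra not asked for in the exercise.

```r
# Rebuild the covariance matrix from Part (a) so this block is self-contained.
tickers <- c("SNDL", "WRN", "NGD", "UPH", "WISH")
stonks <- matrix(c(0.00023, 0.00038, 0.00022, 0.00007, 0.00006,
                   0.00038, 0.00134, 0.00041, 0.00013, 0.00008,
                   0.00022, 0.00041, 0.00093, 0.00019, 0.00002,
                   0.00007, 0.00013, 0.00019, 0.00068, 0.00037,
                   0.00006, 0.00008, 0.00002, 0.00037, 0.00053),
                 nrow = 5, ncol = 5, dimnames = list(tickers, tickers))
R <- cov2cor(stonks)

# Eigenvalues of the correlation matrix, returned in decreasing order.
ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values

# Kaiser's criterion: keep the dimensions whose eigenvalue exceeds 1.
kaiser_k <- sum(ev > 1)
print(kaiser_k)

# Cumulative share of total variance explained by the leading dimensions.
prop_var <- cumsum(ev) / sum(ev)
print(round(prop_var, 3))
```

For a correlation matrix the eigenvalues sum to the number of variables (here 5), so the cutoff of 1 amounts to keeping dimensions that explain more than an average variable's share.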
Part (g)
Wow! My heat-map guess of two clusters matches the two dimensions from Kaiser's criterion. Heat maps really can be good.
EXERCISE 02
Part (a)
southAfrica <- c(20.225,8.432,2.356,2.156,1.835,.895,.779,.701,.653,.601,.552,.492,.452,.401,.369,.301,.215,.198,.182,.173,.167,.161,.157,.152,.107,.101,.040,.037,.034)
southAfrica
## [1] 20.225 8.432 2.356 2.156 1.835 0.895 0.779 0.701 0.653 0.601
## [11] 0.552 0.492 0.452 0.401 0.369 0.301 0.215 0.198 0.182 0.173
## [21] 0.167 0.161 0.157 0.152 0.107 0.101 0.040 0.037 0.034
Part (b)
plot(southAfrica, pch = 16)

Part (c)
We think that the inflection point is at point 5, which means the intrinsic dimensionality would be 4 based on this point.
Part (d)
Using Kaiser's criterion, the intrinsic dimensionality would be 5.
Part (e)
Using Jolliffe's criterion, the intrinsic dimensionality would be 8.
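Both counts can be reproduced directly from the eigenvalue vector. This is a minimal sketch assuming the usual cutoffs of 1 for Kaiser and 0.7 for Jolliffe; kaiser_k and jolliffe_k are illustrative names.

```r
# Eigenvalues from Part (a), repeated so this block is self-contained.
southAfrica <- c(20.225, 8.432, 2.356, 2.156, 1.835, .895, .779, .701, .653,
                 .601, .552, .492, .452, .401, .369, .301, .215, .198, .182,
                 .173, .167, .161, .157, .152, .107, .101, .040, .037, .034)

# Kaiser: count eigenvalues above 1.
kaiser_k <- sum(southAfrica > 1)

# Jolliffe: count eigenvalues above 0.7.
jolliffe_k <- sum(southAfrica > 0.7)

print(c(Kaiser = kaiser_k, Jolliffe = jolliffe_k))
```

Note that the eighth eigenvalue, 0.701, only just clears the 0.7 cutoff, so the Jolliffe count is sensitive to how the eigenvalues were rounded.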
Part (f)
There are so many differences between the inflection point, Kaiser, and Jolliffe! I'm thinking that Jolliffe missed the 8am team Zoom calls to discuss intrinsic dimensionality, since that number of 8 was way different from 4 and 5. Since they are different, we would have to look more closely at what the clusters are to see if they make sense, and then choose from there.
EXERCISE 03
Part (a)
epilepsy <- read.csv("C:/Users/Sarah Chock/OneDrive - University of St. Thomas/Senior Year/STAT 360 Comp Stat and Data Analysis/Exploratory Factor Analysis/Epilepsy Detection.csv")
Part (b)
corrEpilepsy <- cor(epilepsy)
Part (c)
corrplot(corrEpilepsy, method = "color", tl.pos = 'n')

Part (d)
I think there are 8 clusters based on this data, after rigorous discussion with Connor. We counted up the clusters along the diagonal, splitting them wherever light sections appeared in between the dark sections.
Part (e)
eigenvalues <- eigen(corrEpilepsy)$values
Part (f)
sum(eigenvalues > 1) # Kaiser
## [1] 54
sum(eigenvalues > 0.7) # Jolliffe
## [1] 72
Part (g)
Well, needless to say, there certainly are differences between my answer of 8 and the criteria's answers of 54 and 72. I think this is because my eyes are not very discerning of all of these clusters, since there are so many dimensions to begin with.