EXERCISE 01
Part (a)
library(readxl)
DistressData <- read_excel("C:/Users/Sarah Chock/OneDrive - University of St. Thomas/Senior Year/STAT 360 Comp Stat and Data Analysis/Exploratory Data Analysis/DistressData.xlsx")
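dd <- as.matrix(DistressData)
# Swap the response codes 3 and 5, using 6 as a temporary placeholder
# (this relies on 6 not already being a value in the data)
dd[which(dd == 3)] <- 6
dd[which(dd == 5)] <- 3
dd[which(dd == 6)] <- 5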
head(DistressData)
## # A tibble: 6 x 10
## Hopelessness Overwhelmed Exhausted VeryLonely VerySad Depressed Anxiety
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 5 3 3 5 5 5 4
## 2 4 4 4 5 5 2 3
## 3 1 4 3 4 2 2 2
## 4 5 3 4 3 4 2 3
## 5 5 5 5 5 5 1 1
## 6 3 4 4 3 3 4 3
## # ... with 3 more variables: SelfHarm <dbl>, SuicidalThoughts <dbl>,
## # SuicidalAttempts <dbl>
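The three-step swap of codes 3 and 5 in Part (a) can also be written as a single lookup, which avoids the temporary value. This is only a sketch on made-up responses; the names `x`, `swap_3_5`, `recoded`, and `m` are hypothetical, and it assumes every entry is a whole number from 1 to 5.

```r
# Hypothetical toy responses on a 1-5 scale (not the DistressData values)
x <- c(1, 3, 5, 2, 4, 3)

# Position i of the lookup vector holds the new code for old code i,
# so 3 maps to 5 and 5 maps to 3 while 1, 2 and 4 stay the same
swap_3_5 <- c(1, 2, 5, 4, 3)
recoded <- swap_3_5[x]
recoded

# Applied to a matrix, assigning into m[] keeps the dimensions
m <- matrix(x, nrow = 2)
m[] <- swap_3_5[m]
m
```

Because every old code is mapped in one pass, the order of the steps no longer matters and no unused placeholder value is needed.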
Part (b)
R <- cor(dd)
eigen(R)$values
## [1] 4.8390360 1.6895478 0.8790759 0.4963882 0.4727502 0.4058604 0.3471259
## [8] 0.3220688 0.2873694 0.2607774
# The intrinsic dimensionality using Kaiser's criterion is 2: only the first two eigenvalues exceed 1
library(psych)
## Warning: package 'psych' was built under R version 4.0.5
A <- pca(R, 2, rotate = "varimax")$loadings[]
A
## RC1 RC2
## Hopelessness 0.736782679 0.35542955
## Overwhelmed 0.750335749 -0.10891644
## Exhausted 0.766728049 -0.04801078
## VeryLonely 0.763674783 0.25509207
## VerySad 0.801576138 0.26866823
## Depressed 0.691210438 0.47498740
## Anxiety 0.742612047 0.22977641
## SelfHarm 0.162446158 0.78686905
## SuicidalThoughts 0.232894533 0.82454863
## SuicidalAttempts 0.005108308 0.80251514
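The Kaiser count quoted in Part (b) can be checked directly from the printed eigenvalues. The sketch below copies those values by hand; `ev`, `n_keep`, and `prop_var` are names introduced here for illustration.

```r
# Eigenvalues copied from the eigen(R)$values output in Part (b)
ev <- c(4.8390360, 1.6895478, 0.8790759, 0.4963882, 0.4727502,
        0.4058604, 0.3471259, 0.3220688, 0.2873694, 0.2607774)

# Kaiser's criterion: keep the components whose eigenvalue exceeds 1
n_keep <- sum(ev > 1)
n_keep

# Share of total variance carried by the retained components;
# for a correlation matrix the eigenvalues sum to the number of variables
prop_var <- sum(ev[seq_len(n_keep)]) / sum(ev)
round(prop_var, 3)
```

The two retained components account for roughly 65% of the total variance in the ten items.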
Part (c)
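There is still some complexity in our varimax-rotated matrix. Hopelessness and Depressed are both complex: each loads heavily on RC1 (0.74 and 0.69) while also carrying a non-trivial loading on RC2 (0.36 and 0.47).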
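One way to make the complexity check mechanical is to flag every variable with more than one loading above a cutoff. The cutoff of 0.3 used here is an assumption (a common rule of thumb, not something stated in the exercise), and `cutoff`, `n_salient`, and `complex_vars` are hypothetical names.

```r
# Varimax loadings copied from the output in Part (b)
A <- matrix(c(0.736782679,  0.35542955,
              0.750335749, -0.10891644,
              0.766728049, -0.04801078,
              0.763674783,  0.25509207,
              0.801576138,  0.26866823,
              0.691210438,  0.47498740,
              0.742612047,  0.22977641,
              0.162446158,  0.78686905,
              0.232894533,  0.82454863,
              0.005108308,  0.80251514),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("Hopelessness", "Overwhelmed", "Exhausted",
                              "VeryLonely", "VerySad", "Depressed", "Anxiety",
                              "SelfHarm", "SuicidalThoughts", "SuicidalAttempts"),
                            c("RC1", "RC2")))

cutoff <- 0.3  # assumed salience cutoff, not taken from the exercise

# Count how many components each variable loads on above the cutoff
n_salient <- rowSums(abs(A) > cutoff)

# A variable is treated as complex if it loads saliently on more than one component
complex_vars <- names(n_salient)[n_salient > 1]
complex_vars
```

With this cutoff the flagged variables are the same two named above; a stricter or looser cutoff would change the list.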
Part (d)
Looking at the variables that are complex, it seems like Hopelessness and Depressed would be correlated with the theme of sadness and emotions in factor 1, as well as the mental health problems of self-harm and suicidal ideas in factor 2. I think factor 1 has to do with negative emotions, whereas factor 2 has more to do with the impacts of mental health. Even in the rows that aren't complex, the correlations with the other factor are still not negligible, so it's possible we would benefit from an oblique rotation.
Part (e)
plot(A, xlim = c(-1,1), ylim = c(-1,1))
abline(v = 0, h = 0)

Part (f)
It looks like the axes would pass through the centers of our clusters better if we did an oblique rotation. If we rotate our x axis (RC1) by about 30 degrees and our y axis (RC2) by about -15 degrees, we would hit these clusters better.
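The eyeballed rotation can be roughly cross-checked by measuring the angle of each variable's loading vector. This is only a sketch: the split into a seven-item cluster and a three-item cluster is an assumption read off the plot, the mean angle is just one crude summary of a cluster's center, and none of this reproduces what an oblimin rotation actually optimizes. The names `rc1`, `rc2`, `angle_deg`, `distress`, and `selfharm` are hypothetical.

```r
# Varimax loadings copied from the output in Part (b), in row order
rc1 <- c(0.736782679, 0.750335749, 0.766728049, 0.763674783, 0.801576138,
         0.691210438, 0.742612047, 0.162446158, 0.232894533, 0.005108308)
rc2 <- c(0.35542955, -0.10891644, -0.04801078, 0.25509207, 0.26866823,
         0.47498740, 0.22977641, 0.78686905, 0.82454863, 0.80251514)

# Angle of each loading vector, counterclockwise from the RC1 axis, in degrees
angle_deg <- atan2(rc2, rc1) * 180 / pi

# Assumed cluster membership: first seven items vs. last three
distress <- 1:7
selfharm <- 8:10

# Offset of the first cluster from the RC1 axis (positive = counterclockwise)
rc1_shift <- mean(angle_deg[distress])

# Offset of the second cluster from the RC2 axis, which sits at 90 degrees
rc2_shift <- mean(angle_deg[selfharm]) - 90

c(RC1 = rc1_shift, RC2 = rc2_shift)
```

A positive value for RC1 and a negative value for RC2 means the two axes would need to tilt toward each other to pass through the clusters, which is the pattern an oblique rotation allows.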
Part (g)
library(GPArotation)
Aoblique <- pca(R, 2, rotate = "oblimin")$loadings[]
Aoblique
## TC1 TC2
## Hopelessness 0.72687711 0.20895915
## Overwhelmed 0.80001663 -0.27378107
## Exhausted 0.80946160 -0.21436714
## VeryLonely 0.76779000 0.09949767
## VerySad 0.80577935 0.10538218
## Depressed 0.66395180 0.34229585
## Anxiety 0.74893402 0.07786394
## SelfHarm 0.07033278 0.77819767
## SuicidalThoughts 0.13936505 0.80199550
## SuicidalAttempts -0.09650970 0.82817441
plot(Aoblique, xlim = c(-1,1), ylim = c(-1,1))
abline(v = 0, h = 0)

Part (h)
Z <- scale(dd)
B <- solve(R) %*% Aoblique
factorscores <- Z %*% B
head(factorscores)
## TC1 TC2
## [1,] 0.2541373 -0.1937682
## [2,] 0.1713971 -0.6322462
## [3,] -0.3687477 -0.7357213
## [4,] 0.7575780 -0.9600768
## [5,] -1.0760793 -0.1615432
## [6,] 1.0521125 -0.6240366
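The score computation in Part (h), standardized data times `solve(R) %*% loadings`, can be sanity-checked on simulated data. The sketch below uses made-up data and unrotated principal component loadings built with base R only (not the oblimin loadings from Part (g)); in that unrotated case the scores should come out standardized and uncorrelated. All object names are hypothetical.

```r
set.seed(1)  # hypothetical simulated data, not the DistressData responses
X <- matrix(rnorm(200 * 4), ncol = 4)
X[, 2] <- X[, 1] + 0.5 * X[, 2]  # make columns 1 and 2 correlated
X[, 4] <- X[, 3] + 0.5 * X[, 4]  # make columns 3 and 4 correlated

Z <- scale(X)   # standardized data
Rsim <- cor(X)  # correlation matrix
e <- eigen(Rsim)

# Unrotated loadings for the first two components:
# eigenvectors scaled by the square roots of their eigenvalues
Asim <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

# Score coefficients and scores, following the same recipe as Part (h)
Bsim <- solve(Rsim) %*% Asim
scores_sim <- Z %*% Bsim

# For unrotated components this should be the 2 x 2 identity matrix
round(cov(scores_sim), 6)
```

If the covariance matrix of the scores is not the identity in this unrotated case, something is off in the standardization or in the loadings.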
Part (i)
phi <- cor(factorscores)
phi
## TC1 TC2
## TC1 1.0000000 -0.3198569
## TC2 -0.3198569 1.0000000
Part (j)
Yes, this rotation was necessary, but only by the skin of its teeth! For an oblique rotation to be worthwhile, the absolute value of the correlation between the factors needs to be bigger than .3, and it is, at .32. The factors are correlated enough, so we can stick with our oblique rotation.
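The decision rule in Part (j) can be written out as a one-line check. The matrix below is copied from the `phi` output in Part (i); `threshold`, `r12`, and `margin` are names introduced here for illustration.

```r
# Factor-score correlation matrix copied from the output in Part (i)
phi <- matrix(c(1, -0.3198569,
                -0.3198569, 1),
              nrow = 2,
              dimnames = list(c("TC1", "TC2"), c("TC1", "TC2")))

threshold <- 0.3  # cutoff quoted in Part (j)

# Off-diagonal entry: correlation between the two factors
r12 <- phi["TC1", "TC2"]

# TRUE means the factors are correlated enough to keep the oblique rotation
abs(r12) > threshold

# How far past the cutoff the correlation sits
margin <- abs(r12) - threshold
margin
```

The margin is small, which is why the conclusion above is a close call.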
library(cats)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.5
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
stonks <- matrix(c(0.00023, 0.00038, 0.00022, 0.00007, 0.00006,
0.00038, 0.00134, 0.00041, 0.00013, 0.00008,
0.00022, 0.00041, 0.00093, 0.00019, 0.00002,
0.00007, 0.00013, 0.00019, 0.00068, 0.00037,
0.00006, 0.00008, 0.00002, 0.00037, 0.00053), nrow = 5,ncol = 5)
stonks <- as.data.frame(stonks)
# as.data.frame() names the columns of an unnamed matrix V1, V2, ..., so map them by name
ggplot(stonks, aes(x = V1, y = V2)) + add_cat() + geom_point()

here_kitty()

## meow