library(rmarkdown); library(knitr)
library(dplyr)
library(psych); library(corrplot); library(GPArotation)
library(lavaan); library(moments)
ACADEMICDATA <- read.csv("https://www.dropbox.com/s/4qtq7noyk76axk4/?dl=1")
attach(ACADEMICDATA)
set.seed(37)
Use pca() to generate the unrotated (factor) loading matrix, A, for an exploratory factor analysis of the interaction between PHYSICAL, SEXUAL, STDI, ILLNESS, and PAIN
r <- cor(cbind(PHYSICAL, SEXUAL, STDI, ILLNESS, PAIN), use = "pairwise.complete.obs")
#e <- eigen(r)$values
#v <- eigen(r)$vectors
#A <- v[,1:2] %*% diag(sqrt(e[1:2]))
(load <- pca(r, nfactors = 2, rotate = "none")$loadings[])
##                PC1        PC2
## PHYSICAL 0.6849157  0.4313925
## SEXUAL   0.6342701  0.4462439
## STDI     0.5494984  0.2996891
## ILLNESS  0.6333489 -0.5878605
## PAIN     0.6382143 -0.5810972
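The commented-out lines above point at what an unrotated principal components solution amounts to: each retained eigenvector of the correlation matrix is scaled by the square root of its eigenvalue. A minimal base-R sketch of that step follows. It uses a made-up 3 x 3 correlation matrix (`r_toy` and `A_toy` are hypothetical names, not the ACADEMICDATA correlations), and eigenvector signs are arbitrary, so columns may come out reflected relative to what pca() prints.

```r
# Hypothetical correlation matrix for illustration only (not the course data).
r_toy <- matrix(c(1.0, 0.6, 0.3,
                  0.6, 1.0, 0.2,
                  0.3, 0.2, 1.0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("X1", "X2", "X3"), c("X1", "X2", "X3")))

eig <- eigen(r_toy, symmetric = TRUE)  # eigenvalues are returned in decreasing order
k   <- 2                               # number of components to retain

# Loadings: each retained eigenvector scaled by the square root of its eigenvalue.
A_toy <- eig$vectors[, 1:k] %*% diag(sqrt(eig$values[1:k]))
dimnames(A_toy) <- list(rownames(r_toy), paste0("PC", 1:k))

A_toy
colSums(A_toy^2)  # sum of squared loadings per component equals its eigenvalue
```

Keeping all three components instead of two would reproduce `r_toy` exactly; dropping components is what creates residuals.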
Use plot() to generate a scatter plot of the loadings of the original variables onto each of the factors extracted in the exploratory factor analysis above
plot(load, xlim = c(-2, 2), ylim = c(-1, 1))
abline(h = 0, v = 0)
Compute the rotation matrix, PSI37, for a 37 degree clockwise rotation of the loadings onto the extracted factors
psi <- -37*pi/180
PSI37 <- matrix(c(cos(psi), -sin(psi), sin(psi), cos(psi)), byrow = TRUE, nrow = 2)
Use A and PSI37 to compute the rotated (by 37 degrees) loading matrix, A37
(A37 <- load %*% PSI37)
##               [,1]        [,2]
## PHYSICAL 0.2873795  0.75671792
## SEXUAL   0.2379944  0.73809951
## STDI     0.2584915  0.57003882
## ILLNESS  0.8595982 -0.08832740
## PAIN     0.8594137 -0.07999791
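A quick check on the rotation above: an orthogonal rotation only redistributes each variable's variance between the two factors, so the row sums of squared loadings should not change. The sketch below re-enters the unrotated loadings as printed earlier (rounded to 7 decimals) so that it runs on its own.

```r
# Unrotated loadings, copied from the printed pca() output above.
load <- matrix(c(0.6849157,  0.4313925,
                 0.6342701,  0.4462439,
                 0.5494984,  0.2996891,
                 0.6333489, -0.5878605,
                 0.6382143, -0.5810972),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN"),
                               c("PC1", "PC2")))

psi   <- -37 * pi / 180
PSI37 <- matrix(c(cos(psi), -sin(psi),
                  sin(psi),  cos(psi)), nrow = 2, byrow = TRUE)
A37   <- load %*% PSI37

# An orthogonal rotation matrix times its transpose is the identity ...
round(PSI37 %*% t(PSI37), 10)

# ... so each variable's communality (row sum of squared loadings) is unchanged.
cbind(unrotated = rowSums(load^2), rotated = rowSums(A37^2))
```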
Explain what the values in each row of A37 mean: are there any remaining complex variables, and was rotation therefore successful in decreasing complexity among the variables?
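Rotation was successful: no complex variables remain. PHYSICAL, SEXUAL, and STDI load mainly on the second rotated factor (0.57 to 0.76) and below 0.29 on the first, while ILLNESS and PAIN load about 0.86 on the first factor and under 0.09 in absolute value on the second.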
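"Complex" can also be put on a numeric scale instead of being judged by eye. One option is Hofmann's complexity index, (sum of squared loadings)^2 / (sum of fourth powers of loadings), which equals 1 when a variable loads on a single factor and 2 when it is split evenly across two. This is an added illustration rather than part of the exercise, and the helper name `complexity` is made up here. Both matrices are re-entered from the printed output above.

```r
vars <- c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN")

# Unrotated loadings and the 37-degree rotated loadings, as printed above.
load <- matrix(c(0.6849157,  0.4313925,
                 0.6342701,  0.4462439,
                 0.5494984,  0.2996891,
                 0.6333489, -0.5878605,
                 0.6382143, -0.5810972),
               ncol = 2, byrow = TRUE, dimnames = list(vars, NULL))
A37 <- matrix(c(0.2873795,  0.75671792,
                0.2379944,  0.73809951,
                0.2584915,  0.57003882,
                0.8595982, -0.08832740,
                0.8594137, -0.07999791),
              ncol = 2, byrow = TRUE, dimnames = list(vars, NULL))

# Hofmann's index: 1 = loads on one factor only, 2 = evenly split over two factors.
complexity <- function(L) rowSums(L^2)^2 / rowSums(L^4)

round(cbind(unrotated = complexity(load), rotated37 = complexity(A37)), 2)
```

Values closer to 1 after rotation indicate simpler variables.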
Use plot() to generate a scatter plot of the rotated loadings above
plot(A37, xlim = c(-2, 2), ylim = c(-1, 1))
abline(h = 0, v = 0)
Explain what the scatter plot above means in terms of whether or not the rotation helped to decrease complexity among variables
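The rotation helped. The points now lie close to the axes, with ILLNESS and PAIN near the horizontal axis (first factor) and PHYSICAL, SEXUAL, and STDI near the vertical axis (second factor), so it is much more obvious which factor each variable belongs to.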
Use pca() to generate the optimally rotated (factor) loading matrix, AVARIMAX, for the original variables based on the varimax algorithm
(Avar <- pca(r, nfactors = 2, rotate = "varimax"))$loadings[]
##                RC1        RC2
## PHYSICAL 0.8039160 0.09448853
## SEXUAL   0.7738338 0.05112269
## STDI     0.6158211 0.11192182
## ILLNESS  0.1236224 0.85523580
## PAIN     0.1316563 0.85302883
Use pca() to generate the optimally rotated (factor) loading matrix, AEQUAMAX, for the original variables based on the equamax algorithm
(Aequ <- pca(r, nfactors = 2, rotate = "equamax"))$loadings[]
##                 RC1        RC2
## PHYSICAL 0.79976600 0.12483329
## SEXUAL   0.77134584 0.08036047
## STDI     0.61114625 0.13513841
## ILLNESS  0.09118004 0.85930028
## PAIN     0.09929169 0.85739881
Explain what the two rotated loading matrices above mean in terms of whether the varimax or equamax algorithms were more effective in decreasing complexity among variables
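The two algorithms were about equally effective. The matrices are nearly identical (no loading differs by more than about 0.03), and neither is uniformly simpler: varimax gives slightly smaller cross-loadings for PHYSICAL, SEXUAL, and STDI, while equamax gives slightly smaller cross-loadings for ILLNESS and PAIN.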
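For comparison with the hand-picked 37-degree rotation, base R's varimax() (from the stats package) can search for the rotation itself. The sketch below applies it to the unrotated loadings printed earlier. The result should resemble the RC columns above, but column order and signs are not guaranteed to match what pca() prints, so no expected output is shown.

```r
# Unrotated loadings, copied from the printed pca() output above.
load <- matrix(c(0.6849157,  0.4313925,
                 0.6342701,  0.4462439,
                 0.5494984,  0.2996891,
                 0.6333489, -0.5878605,
                 0.6382143, -0.5810972),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN"),
                               c("PC1", "PC2")))

vm   <- varimax(load)         # Kaiser normalization is on by default
L_vm <- unclass(vm$loadings)  # drop the "loadings" class to get a plain matrix

round(L_vm, 3)  # rotated loadings
vm$rotmat       # the orthogonal rotation matrix that varimax chose

# As with any orthogonal rotation, communalities are unchanged.
cbind(unrotated = rowSums(load^2), varimax = rowSums(L_vm^2))
```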
Use the optimally rotated (factor) loading matrix based on the varimax algorithm, AVARIMAX, above to compute the communalities for each of the variables
Avar$communality
##  PHYSICAL    SEXUAL      STDI   ILLNESS      PAIN
## 0.6552090 0.6014322 0.3917621 0.7467108 0.7449916
Explain what the communalities above mean in terms of how much of the variability in PHYSICAL was lost by performing the exploratory factor analysis
The communality of PHYSICAL is 0.6552, so the two retained factors reproduce about 65.5% of its variance; the remaining 34.5% (1 - 0.6552) is lost.
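The communalities reported by pca() can be recovered by hand: each one is the row sum of squared loadings, and the lost share (the uniqueness) is 1 minus that. A short sketch using the varimax loadings as printed above (rounded, so agreement is to about six decimals; the names `L_vm`, `communality`, and `uniqueness` are chosen here for illustration):

```r
# Varimax-rotated loadings, copied from the printed output above.
L_vm <- matrix(c(0.8039160, 0.09448853,
                 0.7738338, 0.05112269,
                 0.6158211, 0.11192182,
                 0.1236224, 0.85523580,
                 0.1316563, 0.85302883),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN"),
                               c("RC1", "RC2")))

communality <- rowSums(L_vm^2)  # variance of each variable kept by the two factors
uniqueness  <- 1 - communality  # variance of each variable that is lost

round(cbind(communality, uniqueness), 4)
```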
Use the optimally rotated (factor) loading matrix based on the varimax algorithm, AVARIMAX, above to compute the variance accounted for by each rotated factor
colSums(Avar$loadings[]^2)/5
##       RC1       RC2
## 0.3313902 0.2966309
Explain what the variances above mean in terms of how much variability in the original variables is explained by the first rotated factor
33.14% of the variability in the original variables is explained by the first rotated factor.
Explain what the variances above mean in terms of how much additional variability in the original DVs is explained by the addition of the second rotated factor
The second rotated factor explains an additional 29.66% of the variability.
Explain what the variances above mean in terms of how much variability in the original variables is explained by the exploratory factor analysis as a whole
Together, the two rotated factors explain 62.80% of the variability (33.14% + 29.66%).
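The three percentages above come from one calculation: column sums of squared loadings, divided by the number of variables, then accumulated. A sketch using the printed varimax loadings (the names `ss_loadings`, `prop_var`, and `cum_var` are chosen here for illustration):

```r
# Varimax-rotated loadings, copied from the printed output above.
L_vm <- matrix(c(0.8039160, 0.09448853,
                 0.7738338, 0.05112269,
                 0.6158211, 0.11192182,
                 0.1236224, 0.85523580,
                 0.1316563, 0.85302883),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN"),
                               c("RC1", "RC2")))

p           <- nrow(L_vm)        # number of variables; each contributes variance 1
ss_loadings <- colSums(L_vm^2)   # variance captured by each rotated factor
prop_var    <- ss_loadings / p   # share of total variance per factor
cum_var     <- cumsum(prop_var)  # running total across factors

round(rbind(ss_loadings, prop_var, cum_var), 4)
```

The cumulative proportion for the last factor is also the mean of the five communalities, which ties this table back to the communality step.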
Compute the transpose of the rotated (factor) loading matrix based on the varimax algorithm, AVARIMAX, above
(tAvar <- t(Avar$loadings[]))
##       PHYSICAL     SEXUAL      STDI   ILLNESS      PAIN
## RC1 0.80391599 0.77383376 0.6158211 0.1236224 0.1316563
## RC2 0.09448853 0.05112269 0.1119218 0.8552358 0.8530288
Use the optimally rotated (factor) loading matrix based on the varimax algorithm, AVARIMAX, and its transpose above to compute the reproduced correlation matrix, RR, for the exploratory factor analysis
(rr <- Avar$loadings[] %*% tAvar)
##           PHYSICAL    SEXUAL      STDI   ILLNESS      PAIN
## PHYSICAL 0.6552090 0.6269278 0.5056438 0.1801920 0.1864420
## SEXUAL   0.6269278 0.6014322 0.4822649 0.1393851 0.1454892
## STDI     0.5056438 0.4822649 0.3917621 0.1718488 0.1765493
## ILLNESS  0.1801920 0.1393851 0.1718488 0.7467108 0.7458165
## PAIN     0.1864420 0.1454892 0.1765493 0.7458165 0.7449916
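Two properties of the reproduced matrix are worth checking. Its diagonal holds the communalities, and it does not depend on the rotation: the unrotated loadings give the same product as the varimax loadings. The sketch below checks both with the matrices re-entered from the printed output, so agreement is limited by rounding; `rr_unrot` is a name introduced here.

```r
vars <- c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN")

# Unrotated and varimax-rotated loadings, copied from the printed output above.
load <- matrix(c(0.6849157,  0.4313925,
                 0.6342701,  0.4462439,
                 0.5494984,  0.2996891,
                 0.6333489, -0.5878605,
                 0.6382143, -0.5810972),
               ncol = 2, byrow = TRUE, dimnames = list(vars, NULL))
L_vm <- matrix(c(0.8039160, 0.09448853,
                 0.7738338, 0.05112269,
                 0.6158211, 0.11192182,
                 0.1236224, 0.85523580,
                 0.1316563, 0.85302883),
               ncol = 2, byrow = TRUE, dimnames = list(vars, NULL))

rr       <- L_vm %*% t(L_vm)  # reproduced correlations from the rotated loadings
rr_unrot <- load %*% t(load)  # same product from the unrotated loadings

max(abs(rr - rr_unrot))                # essentially zero: rotation does not change the fit
round(diag(rr) - rowSums(L_vm^2), 10)  # diagonal of rr equals the communalities
```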
Use the original correlation matrix, R, and the reproduced correlation matrix, RR, above to compute the residual correlation matrix, RES
r-rr
##              PHYSICAL      SEXUAL        STDI      ILLNESS         PAIN
## PHYSICAL  0.344790999 -0.19066527 -0.21075232 -0.003490247  0.004385892
## SEXUAL   -0.190665274  0.39856779 -0.25831362  0.020204505  0.010868493
## STDI     -0.210752324 -0.25831362  0.60823787 -0.017708486 -0.023224057
## ILLNESS  -0.003490247  0.02020450 -0.01770849  0.253289232 -0.252445365
## PAIN      0.004385892  0.01086849 -0.02322406 -0.252445365  0.255008438
Explain what the residual correlation matrix above means in terms of how successful the exploratory factor analysis was in reproducing the original correlation structure between PHYSICAL, SEXUAL, STDI, ILLNESS, and PAIN
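The exploratory factor analysis was only somewhat successful. Residuals between variables on different factors are small (all under 0.03 in absolute value), but the residuals between variables on the same factor are large: PHYSICAL/SEXUAL (-0.19), PHYSICAL/STDI (-0.21), SEXUAL/STDI (-0.26), and ILLNESS/PAIN (-0.25). The negative signs mean the two-factor solution overstates those correlations. The diagonal entries are the uniquenesses (1 - communality), so STDI (0.61) is the variable reproduced least well.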
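The residual matrix can be reduced to a couple of summary numbers. The sketch below re-enters the printed residuals and computes the root mean square of the off-diagonal entries, plus a count of how many exceed 0.05 in absolute value. The 0.05 cutoff is a common rule of thumb assumed here, not something specified in the exercise, and the names `off`, `rmsr`, and `n_large` are made up for this sketch.

```r
vars <- c("PHYSICAL", "SEXUAL", "STDI", "ILLNESS", "PAIN")

# Residual matrix (r - rr), copied row by row from the printed output above.
res <- matrix(c( 0.344790999, -0.19066527, -0.21075232, -0.003490247,  0.004385892,
                -0.190665274,  0.39856779, -0.25831362,  0.020204505,  0.010868493,
                -0.210752324, -0.25831362,  0.60823787, -0.017708486, -0.023224057,
                -0.003490247,  0.02020450, -0.01770849,  0.253289232, -0.252445365,
                 0.004385892,  0.01086849, -0.02322406, -0.252445365,  0.255008438),
              nrow = 5, byrow = TRUE, dimnames = list(vars, vars))

off     <- res[lower.tri(res)]  # the 10 distinct off-diagonal residuals
rmsr    <- sqrt(mean(off^2))    # root mean square off-diagonal residual
n_large <- sum(abs(off) > 0.05) # residuals above the assumed 0.05 cutoff

rmsr
n_large

# Which pairs are poorly reproduced?
idx <- which(abs(res) > 0.05 & lower.tri(res), arr.ind = TRUE)
data.frame(var1 = vars[idx[, "row"]], var2 = vars[idx[, "col"]],
           residual = round(res[idx], 3))
```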