STAT 360: Computational Statistics and Data Analysis

Load R Libraries, Import and Attach Relevant Data, and Specify Seed

library(rmarkdown); library(knitr); library(readxl)
set.seed(37)

EXERCISE 01

Part (a)

DCF <- matrix(c( 1.00, -0.72, -0.09, -0.38,
                -0.72,  1.00,  0.23,  0.49,
                -0.09,  0.23,  1.00, -0.46,
                -0.38,  0.49, -0.46,  1.00),
              nrow = 4, ncol = 4)
rownames(DCF) <- c("DBH", "TYPE", "SLOPE", "ASPECT")
colnames(DCF) <- c("DBH", "TYPE", "SLOPE", "ASPECT")
DCF

##          DBH  TYPE SLOPE ASPECT
## DBH     1.00 -0.72 -0.09  -0.38
## TYPE   -0.72  1.00  0.23   0.49
## SLOPE  -0.09  0.23  1.00  -0.46
## ASPECT -0.38  0.49 -0.46   1.00
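Before factoring a hand-entered matrix, it can help to confirm that it is a valid correlation matrix. The sketch below re-enters the same values and checks symmetry, a unit diagonal, and strictly positive eigenvalues. These checks are an addition for illustration and are not part of the original exercise; the names `vars`, `is_symmetric`, `unit_diagonal`, and `pos_definite` are hypothetical.

```r
# Re-enter the correlation matrix from Part (a)
vars <- c("DBH", "TYPE", "SLOPE", "ASPECT")
DCF <- matrix(c( 1.00, -0.72, -0.09, -0.38,
                -0.72,  1.00,  0.23,  0.49,
                -0.09,  0.23,  1.00, -0.46,
                -0.38,  0.49, -0.46,  1.00),
              nrow = 4, ncol = 4, dimnames = list(vars, vars))

# A correlation matrix must be symmetric with ones on the diagonal
is_symmetric  <- isSymmetric(DCF)
unit_diagonal <- all(diag(DCF) == 1)

# All eigenvalues positive means the matrix is positive definite
pos_definite  <- all(eigen(DCF, symmetric = TRUE)$values > 0)

c(is_symmetric, unit_diagonal, pos_definite)
```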

Part (b)

eigen(DCF)$values

## [1] 2.0746888 1.3654830 0.3926504 0.1671779
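The call to `pca()` below retains two components. A minimal sketch of how that number can be obtained from the eigenvalues, assuming Kaiser's criterion (retain components with eigenvalue greater than 1) is the rule being applied here, as it is in Exercise 02. The vector `ev` is copied from the printed output and `n_factors` is a hypothetical name.

```r
# Eigenvalues of DCF as printed above
ev <- c(2.0746888, 1.3654830, 0.3926504, 0.1671779)

# Kaiser's criterion: count the eigenvalues that exceed 1
n_factors <- sum(ev > 1)
n_factors
```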

library(psych)


Arot <- pca(DCF, 2, rotate = "varimax")$loadings[]
Arot

##               RC1         RC2
## DBH    -0.8854395  0.02236984
## TYPE    0.9433740  0.02898144
## SLOPE   0.2220135  0.92760753
## ASPECT  0.5426109 -0.74879998

Part (c)

Yes, the rotated loading matrix shows complexity. ASPECT loads substantially on both components (0.54 on RC1 and -0.75 on RC2), while DBH, TYPE, and SLOPE each load strongly on only one. Because ASPECT is the only complex variable, we add it to the model as a measured variable.
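The complexity check can be sketched in code by counting, for each variable, how many components it loads on above a cutoff. The cutoff of 0.4 is an assumption chosen for illustration, not a value stated in the exercise. The loadings are copied from the Part (b) output, and `cutoff`, `n_salient`, and `complex_vars` are hypothetical names.

```r
# Varimax-rotated loadings copied from the Part (b) output
Arot <- matrix(c(-0.8854395,  0.02236984,
                  0.9433740,  0.02898144,
                  0.2220135,  0.92760753,
                  0.5426109, -0.74879998),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("DBH", "TYPE", "SLOPE", "ASPECT"),
                               c("RC1", "RC2")))

cutoff <- 0.4  # assumed salience cutoff, for illustration only

# Number of components each variable loads on in absolute value
n_salient <- rowSums(abs(Arot) > cutoff)

# A variable is complex if it loads saliently on more than one component
complex_vars <- names(n_salient)[n_salient > 1]
complex_vars
```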

Part (d)

library(knitr)
include_graphics("C:/Users/Sarah Chock/OneDrive - University of St. Thomas/Senior Year/STAT 360 Comp Stat and Data Analysis/Structural Equation Models/PS12 Q1.png")

EXERCISE 02

Part (a)

library(readxl)
DistressData <- read_excel("C:/Users/Sarah Chock/OneDrive - University of St. Thomas/Senior Year/STAT 360 Comp Stat and Data Analysis/Exploratory Data Analysis/DistressData.xlsx")
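dd <- as.matrix(DistressData)
# Swap the codes 3 and 5, using 6 as a temporary placeholder
# (this assumes the value 6 does not otherwise occur in the data)
dd[which(dd == 3)] <- 6
dd[which(dd == 5)] <- 3
dd[which(dd == 6)] <- 5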
head(DistressData)

## # A tibble: 6 x 10
##   Hopelessness Overwhelmed Exhausted VeryLonely VerySad Depressed Anxiety
##          <dbl>       <dbl>     <dbl>      <dbl>   <dbl>     <dbl>   <dbl>
## 1            5           3         3          5       5         5       4
## 2            4           4         4          5       5         2       3
## 3            1           4         3          4       2         2       2
## 4            5           3         4          3       4         2       3
## 5            5           5         5          5       5         1       1
## 6            3           4         4          3       3         4       3
## # ... with 3 more variables: SelfHarm <dbl>, SuicidalThoughts <dbl>,
## #   SuicidalAttempts <dbl>
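The recoding in Part (a) swaps the codes 3 and 5 by passing through the unused value 6. An alternative sketch that needs no placeholder uses a lookup vector indexed by the old code. It assumes every response is an integer from 1 to 5 with no missing values; the toy vector `x` and the name `lookup` are hypothetical.

```r
# Toy responses on a 1-5 scale (not taken from DistressData)
x <- c(5, 3, 1, 4, 2, 3, 5)

# Position i holds the new code for old code i: 3 -> 5, 5 -> 3, others unchanged
lookup <- c(1, 2, 5, 4, 3)

x_swapped <- lookup[x]
x_swapped
```

For a matrix such as `dd`, the same idea would be written `dd[] <- lookup[dd]`, which keeps the dimensions and column names.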

Part (b)

R <- cor(dd)
eigen(R)$values

##  [1] 4.8390360 1.6895478 0.8790759 0.4963882 0.4727502 0.4058604 0.3471259
##  [8] 0.3220688 0.2873694 0.2607774

# Two eigenvalues exceed 1, so the intrinsic dimensionality by Kaiser's criterion is 2
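As a complement to the eigenvalue count, the share of total variance captured by the two retained components can be read off the same eigenvalues. This is a sketch using the values printed above; `k` and `prop_var` are hypothetical names.

```r
# Eigenvalues of R as printed above
ev <- c(4.8390360, 1.6895478, 0.8790759, 0.4963882, 0.4727502,
        0.4058604, 0.3471259, 0.3220688, 0.2873694, 0.2607774)

k <- 2  # number of components retained

# Cumulative proportion of variance; the eigenvalues of a
# correlation matrix sum to the number of variables (here 10)
prop_var <- cumsum(ev) / sum(ev)
round(prop_var[k], 3)
```

With these values the first two components account for roughly 65% of the total variance.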
library(psych)
A <- pca(R, 2, rotate = "varimax")$loadings[]
A

##                          RC1         RC2
## Hopelessness     0.736782679  0.35542955
## Overwhelmed      0.750335749 -0.10891644
## Exhausted        0.766728049 -0.04801078
## VeryLonely       0.763674783  0.25509207
## VerySad          0.801576138  0.26866823
## Depressed        0.691210438  0.47498740
## Anxiety          0.742612047  0.22977641
## SelfHarm         0.162446158  0.78686905
## SuicidalThoughts 0.232894533  0.82454863
## SuicidalAttempts 0.005108308  0.80251514

Part (c)

library(GPArotation)
Aoblique <- pca(R, 2, rotate = "oblimin")$loadings[]
Aoblique

##                          TC1         TC2
## Hopelessness      0.72687711  0.20895915
## Overwhelmed       0.80001663 -0.27378107
## Exhausted         0.80946160 -0.21436714
## VeryLonely        0.76779000  0.09949767
## VerySad           0.80577935  0.10538218
## Depressed         0.66395180  0.34229585
## Anxiety           0.74893402  0.07786394
## SelfHarm          0.07033278  0.77819767
## SuicidalThoughts  0.13936505  0.80199550
## SuicidalAttempts -0.09650970  0.82817441

Part (d)

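Z <- scale(dd)                 # standardize each item (mean 0, sd 1)
B <- solve(R) %*% Aoblique     # score weights from the oblimin loadings
factorscores <- Z %*% B        # one row of scores per respondent
phi <- cor(factorscores)       # correlation between the two sets of scores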
phi

##            TC1        TC2
## TC1  1.0000000 -0.3198569
## TC2 -0.3198569  1.0000000
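Because the columns of `Z` are standardized, the covariance of the scores `Z %*% B` equals `t(B) %*% R %*% B`, which reduces to `t(A) %*% solve(R) %*% A` when `B = solve(R) %*% A`. The sketch below checks this identity on simulated data, so the score correlation can be obtained without forming the scores. The data, the loading matrix `A`, and the names `phi_scores` and `phi_direct` are hypothetical and unrelated to DistressData.

```r
set.seed(1)

# Simulated data with some correlation between columns 1 and 2
X <- matrix(rnorm(200 * 4), ncol = 4)
X[, 2] <- X[, 1] + 0.5 * X[, 2]
R <- cor(X)

# Hypothetical 4 x 2 loading matrix, for illustration only
A <- matrix(c(0.8, 0.7, 0.1, 0.2,
              0.1, 0.2, 0.8, 0.7), ncol = 2)

# Route 1: build the scores, then correlate them
Z <- scale(X)
B <- solve(R) %*% A
phi_scores <- cor(Z %*% B)

# Route 2: use the identity cov(Z B) = A' R^{-1} A directly
phi_direct <- cov2cor(t(A) %*% solve(R) %*% A)

all.equal(phi_scores, phi_direct, check.attributes = FALSE)
```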

Part (e)

Yes, oblique rotation was necessary. The two factors have a correlation of -0.32, which exceeds the 0.3 threshold in magnitude, so they cannot be treated as uncorrelated. Had the correlation been smaller than 0.3 in magnitude, an orthogonal rotation would have been adequate. In our case, oblique rotation was the correct choice.

Part (f)

include_graphics("C:/Users/Sarah Chock/OneDrive - University of St. Thomas/Senior Year/STAT 360 Comp Stat and Data Analysis/Structural Equation Models/PS12 Q2.png")