Problem 1:

The data file SVI GU-2010 contains variables derived from the 2010 U.S. Census. A detailed description of these variables can be found in the SVI Documentation file.

#Loading libraries.
library(dplyr)
library(psych)
library(ggplot2)
library(readr)
library(lavaan)
## This is lavaan 0.6-19
library(semPlot)
#Loading the data.
svi <- read.csv('/Users/michaelcajigal/Desktop/MA564/Homework 3/SVI GU-2010.csv')
head(svi)
##    VILLAGE_NAME          CDP_NAME CDP_POP CIV_POP POP_25OVER POP_5OVER
## 1      Mangilao        Adacao CDP   4,184   1,951       2384     3,840
## 2      Sinajana         Afame CDP     758     369        437       667
## 3 Agana Heights Agana Heights CDP   3,718   1,724       2181     3,426
## 4          Agat          Agat CDP   3,677   1,568       2171     3,354
## 5          Yigo          Anao CDP   1,952     850       1074     1,777
## 6          Yigo  Andersen AFB CDP   3,061     530       1343     2,633
##   HOUSE_UNITS HOUSEHOLDS CIV_NONINST  A1_P  A2_P  A3_P  A4_P  B1_P  B2_P  B3_P
## 1       1,203      1,079       4,123 19.46  6.05 16931 18.88  6.79 32.22  6.69
## 2         311        224         731 15.83 10.03 19894 18.76  8.44 30.87 11.48
## 3       1,236      1,077       3,668 18.24  7.08 19300 17.24  8.42 31.33  9.84
## 4       1,197        974       3,617 18.96 11.48 16551 21.42 10.23 30.43 12.59
## 5         546        476       1,919 21.98  7.41 14534 24.21  6.51 35.09  5.74
## 6       1,079        812       2,181  8.82 11.13 16149  7.00  0.72 38.61  1.76
##    B4_P  C1_P C1_NEW_P C2_P  D1_P D2_P  D3_P D4_P D5_P H1_P  H2_P H3A_P H3B_P
## 1 11.58 95.41    30.88 0.26  0.50 1.66 23.63 3.71 0.36 1.16 52.20 10.81 16.96
## 2 14.29 93.80    31.00 0.00 36.01 0.64 21.43 6.70 0.00 0.00 29.26  3.54  6.11
## 3 15.04 94.16    29.18 0.15 14.97 0.49 16.90 6.22 0.40 0.24 12.70  2.67  5.66
## 4 14.17 97.47    20.40 0.06  8.94 1.84 23.61 7.39 2.80 0.50  6.93  7.44 15.46
## 5 15.55 96.36    36.32 0.56  0.18 1.10 30.46 3.99 1.59 1.83 82.23 14.10 22.16
## 6 10.10 47.11    85.56 0.11  1.02 1.02  5.54 2.96 8.62 0.00  5.28  1.76  8.53
##   H3C_P  H4_P  H5_P H6_P  H7_P  H8_P H9_P  P1_P  P2_P  P3_P  P4_P
## 1  1.75 14.88  9.64 4.45 21.96 22.52 6.49 16.40 36.35 34.94 20.57
## 2  0.96  7.72 10.61 4.91 30.36 17.41 8.04 10.16 15.70 37.05 18.74
## 3  0.73 10.19  4.94 3.71 19.78 24.14 7.15  9.87 15.06 25.44 13.47
## 4  2.09 14.12 13.03 3.49 20.53 29.98 8.01 11.04 22.19 34.29 15.87
## 5  4.03 15.75 16.85 4.83 20.80 28.15 5.67 22.75 37.04 38.87 23.29
## 6  1.20  1.30  0.37 0.86 25.74  2.59 1.23  5.68 10.81 21.67  5.18
summary(svi)
##  VILLAGE_NAME         CDP_NAME           CDP_POP            CIV_POP         
##  Length:57          Length:57          Length:57          Length:57         
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    POP_25OVER    POP_5OVER         HOUSE_UNITS         HOUSEHOLDS       
##  Min.   :  19   Length:57          Length:57          Length:57         
##  1st Qu.: 627   Class :character   Class :character   Class :character  
##  Median :1441   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :1484                                                           
##  3rd Qu.:2042                                                           
##  Max.   :3705                                                           
##                                                                         
##  CIV_NONINST             A1_P            A2_P             A3_P      
##  Length:57          Min.   : 5.56   Min.   : 0.000   Min.   : 8970  
##  Class :character   1st Qu.:18.15   1st Qu.: 6.310   1st Qu.:13972  
##  Mode  :character   Median :21.63   Median : 8.120   Median :16338  
##                     Mean   :21.89   Mean   : 8.475   Mean   :17468  
##                     3rd Qu.:25.96   3rd Qu.: 9.620   3rd Qu.:19515  
##                     Max.   :37.94   Max.   :18.580   Max.   :34296  
##                                                                     
##       A4_P            B1_P             B2_P            B3_P       
##  Min.   : 4.07   Min.   : 0.000   Min.   :10.00   Min.   : 1.110  
##  1st Qu.:16.67   1st Qu.: 5.410   1st Qu.:30.41   1st Qu.: 6.670  
##  Median :20.40   Median : 6.580   Median :32.94   Median : 7.390  
##  Mean   :20.39   Mean   : 6.503   Mean   :32.23   Mean   : 7.638  
##  3rd Qu.:23.84   3rd Qu.: 7.960   3rd Qu.:35.58   3rd Qu.: 9.230  
##  Max.   :47.99   Max.   :11.250   Max.   :40.94   Max.   :12.590  
##                                                                   
##       B4_P            C1_P          C1_NEW_P          C2_P       
##  Min.   : 0.00   Min.   :40.11   Min.   :14.95   Min.   :0.0000  
##  1st Qu.:11.95   1st Qu.:92.14   1st Qu.:27.81   1st Qu.:0.0900  
##  Median :15.33   Median :94.93   Median :34.12   Median :0.2550  
##  Mean   :14.81   Mean   :90.80   Mean   :38.47   Mean   :0.4631  
##  3rd Qu.:18.00   3rd Qu.:97.13   3rd Qu.:43.51   3rd Qu.:0.6200  
##  Max.   :24.31   Max.   :98.65   Max.   :88.99   Max.   :2.0600  
##                                                  NA's   :3       
##       D1_P            D2_P             D3_P            D4_P       
##  Min.   : 0.00   Min.   :0.0000   Min.   : 2.82   Min.   : 0.000  
##  1st Qu.: 0.27   1st Qu.:0.2800   1st Qu.:16.55   1st Qu.: 4.100  
##  Median : 1.69   Median :0.7300   Median :22.73   Median : 6.750  
##  Mean   :13.00   Mean   :0.8086   Mean   :22.53   Mean   : 7.473  
##  3rd Qu.:21.71   3rd Qu.:1.2100   3rd Qu.:30.46   3rd Qu.: 8.150  
##  Max.   :76.29   Max.   :2.1300   Max.   :45.10   Max.   :52.630  
##                                                                   
##       D5_P            H1_P              H2_P           H3A_P       
##  Min.   : 0.00   Min.   : 0.0000   Min.   : 1.86   Min.   : 0.000  
##  1st Qu.: 0.12   1st Qu.: 0.0000   1st Qu.: 6.93   1st Qu.: 1.760  
##  Median : 0.68   Median : 0.2600   Median :27.35   Median : 5.180  
##  Mean   : 4.84   Mean   : 0.8351   Mean   :31.17   Mean   : 7.679  
##  3rd Qu.: 3.04   3rd Qu.: 0.7100   3rd Qu.:52.02   3rd Qu.: 9.290  
##  Max.   :47.78   Max.   :10.3600   Max.   :93.44   Max.   :49.290  
##                                                                    
##      H3B_P           H3C_P             H4_P            H5_P      
##  Min.   : 0.00   Min.   : 0.000   Min.   : 0.00   Min.   : 0.37  
##  1st Qu.: 4.46   1st Qu.: 0.570   1st Qu.: 5.96   1st Qu.: 5.00  
##  Median : 9.56   Median : 1.350   Median :12.13   Median : 9.47  
##  Mean   :13.15   Mean   : 2.048   Mean   :12.08   Mean   :11.10  
##  3rd Qu.:16.96   3rd Qu.: 2.500   3rd Qu.:14.88   3rd Qu.:13.23  
##  Max.   :65.36   Max.   :12.860   Max.   :38.13   Max.   :44.64  
##                                                                  
##       H6_P             H7_P            H8_P            H9_P       
##  Min.   : 0.000   Min.   : 0.00   Min.   : 1.30   Min.   : 1.230  
##  1st Qu.: 3.010   1st Qu.:19.78   1st Qu.:19.42   1st Qu.: 5.670  
##  Median : 3.710   Median :24.10   Median :24.55   Median : 6.730  
##  Mean   : 4.353   Mean   :23.71   Mean   :24.75   Mean   : 6.913  
##  3rd Qu.: 5.000   3rd Qu.:29.19   3rd Qu.:30.07   3rd Qu.: 8.380  
##  Max.   :11.760   Max.   :39.12   Max.   :50.20   Max.   :15.790  
##                                                                   
##       P1_P            P2_P            P3_P            P4_P      
##  Min.   : 2.45   Min.   : 5.23   Min.   :15.15   Min.   : 4.65  
##  1st Qu.: 7.77   1st Qu.:13.60   1st Qu.:24.11   1st Qu.:13.86  
##  Median :17.32   Median :26.72   Median :29.63   Median :19.88  
##  Mean   :16.90   Mean   :27.62   Mean   :29.89   Mean   :19.80  
##  3rd Qu.:21.70   3rd Qu.:38.60   3rd Qu.:36.21   3rd Qu.:24.58  
##  Max.   :64.46   Max.   :72.48   Max.   :44.40   Max.   :43.35  
## 
#Checking for missing data, since NAs affect how the PCA is run and interpreted.
colSums(is.na(svi))
## VILLAGE_NAME     CDP_NAME      CDP_POP      CIV_POP   POP_25OVER    POP_5OVER 
##            0            0            0            0            0            0 
##  HOUSE_UNITS   HOUSEHOLDS  CIV_NONINST         A1_P         A2_P         A3_P 
##            0            0            0            0            0            0 
##         A4_P         B1_P         B2_P         B3_P         B4_P         C1_P 
##            0            0            0            0            0            0 
##     C1_NEW_P         C2_P         D1_P         D2_P         D3_P         D4_P 
##            0            3            0            0            0            0 
##         D5_P         H1_P         H2_P        H3A_P        H3B_P        H3C_P 
##            0            0            0            0            0            0 
##         H4_P         H5_P         H6_P         H7_P         H8_P         H9_P 
##            0            0            0            0            0            0 
##         P1_P         P2_P         P3_P         P4_P 
##            0            0            0            0

Note. C2_P has 3 missing values. Because of these, the analyses below use listwise deletion, which reduces the sample from 57 to 54 CDPs. Interpretations should acknowledge the potential bias from the excluded observations.
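As a minimal sketch of what listwise deletion does, the toy data frame below (hypothetical values, not the SVI file) is filtered with `complete.cases()` and the number of dropped rows is counted. The names `toy`, `keep`, `toy_clean`, and `n_dropped` are illustrative only.

```r
# Toy data with a few NAs (hypothetical values, NOT the SVI data).
toy <- data.frame(
  x = c(1.2, 3.4, NA, 5.6, 7.8),
  y = c(10, NA, 30, 40, 50),
  z = c(0.5, 0.6, 0.7, 0.8, 0.9)
)

# complete.cases() flags rows that have no NA in any column.
keep <- complete.cases(toy)

# Listwise deletion: keep only the fully observed rows.
toy_clean <- toy[keep, ]

# Report how many rows were lost, so the reduced n can be stated explicitly.
n_dropped <- sum(!keep)
cat("Rows before:", nrow(toy), " after:", nrow(toy_clean),
    " dropped:", n_dropped, "\n")
```

Reporting the before and after row counts alongside the results makes the cost of the deletion visible to the reader.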

a. Perform a Principal Component Analysis (PCA) with varimax rotation on the variables A1_P to P4_P.

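#Keep only the numeric columns for the PCA on variables A1_P to P4_P.
#The count columns stored with thousands separators (e.g. CDP_POP) were read as character, so they are dropped here.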
svi_numeric <- svi %>% select(where(is.numeric))
head(svi_numeric)
##   POP_25OVER  A1_P  A2_P  A3_P  A4_P  B1_P  B2_P  B3_P  B4_P  C1_P C1_NEW_P
## 1       2384 19.46  6.05 16931 18.88  6.79 32.22  6.69 11.58 95.41    30.88
## 2        437 15.83 10.03 19894 18.76  8.44 30.87 11.48 14.29 93.80    31.00
## 3       2181 18.24  7.08 19300 17.24  8.42 31.33  9.84 15.04 94.16    29.18
## 4       2171 18.96 11.48 16551 21.42 10.23 30.43 12.59 14.17 97.47    20.40
## 5       1074 21.98  7.41 14534 24.21  6.51 35.09  5.74 15.55 96.36    36.32
## 6       1343  8.82 11.13 16149  7.00  0.72 38.61  1.76 10.10 47.11    85.56
##   C2_P  D1_P D2_P  D3_P D4_P D5_P H1_P  H2_P H3A_P H3B_P H3C_P  H4_P  H5_P H6_P
## 1 0.26  0.50 1.66 23.63 3.71 0.36 1.16 52.20 10.81 16.96  1.75 14.88  9.64 4.45
## 2 0.00 36.01 0.64 21.43 6.70 0.00 0.00 29.26  3.54  6.11  0.96  7.72 10.61 4.91
## 3 0.15 14.97 0.49 16.90 6.22 0.40 0.24 12.70  2.67  5.66  0.73 10.19  4.94 3.71
## 4 0.06  8.94 1.84 23.61 7.39 2.80 0.50  6.93  7.44 15.46  2.09 14.12 13.03 3.49
## 5 0.56  0.18 1.10 30.46 3.99 1.59 1.83 82.23 14.10 22.16  4.03 15.75 16.85 4.83
## 6 0.11  1.02 1.02  5.54 2.96 8.62 0.00  5.28  1.76  8.53  1.20  1.30  0.37 0.86
##    H7_P  H8_P H9_P  P1_P  P2_P  P3_P  P4_P
## 1 21.96 22.52 6.49 16.40 36.35 34.94 20.57
## 2 30.36 17.41 8.04 10.16 15.70 37.05 18.74
## 3 19.78 24.14 7.15  9.87 15.06 25.44 13.47
## 4 20.53 29.98 8.01 11.04 22.19 34.29 15.87
## 5 20.80 28.15 5.67 22.75 37.04 38.87 23.29
## 6 25.74  2.59 1.23  5.68 10.81 21.67  5.18
summary(svi_numeric)
##    POP_25OVER        A1_P            A2_P             A3_P      
##  Min.   :  19   Min.   : 5.56   Min.   : 0.000   Min.   : 8970  
##  1st Qu.: 627   1st Qu.:18.15   1st Qu.: 6.310   1st Qu.:13972  
##  Median :1441   Median :21.63   Median : 8.120   Median :16338  
##  Mean   :1484   Mean   :21.89   Mean   : 8.475   Mean   :17468  
##  3rd Qu.:2042   3rd Qu.:25.96   3rd Qu.: 9.620   3rd Qu.:19515  
##  Max.   :3705   Max.   :37.94   Max.   :18.580   Max.   :34296  
##                                                                 
##       A4_P            B1_P             B2_P            B3_P       
##  Min.   : 4.07   Min.   : 0.000   Min.   :10.00   Min.   : 1.110  
##  1st Qu.:16.67   1st Qu.: 5.410   1st Qu.:30.41   1st Qu.: 6.670  
##  Median :20.40   Median : 6.580   Median :32.94   Median : 7.390  
##  Mean   :20.39   Mean   : 6.503   Mean   :32.23   Mean   : 7.638  
##  3rd Qu.:23.84   3rd Qu.: 7.960   3rd Qu.:35.58   3rd Qu.: 9.230  
##  Max.   :47.99   Max.   :11.250   Max.   :40.94   Max.   :12.590  
##                                                                   
##       B4_P            C1_P          C1_NEW_P          C2_P       
##  Min.   : 0.00   Min.   :40.11   Min.   :14.95   Min.   :0.0000  
##  1st Qu.:11.95   1st Qu.:92.14   1st Qu.:27.81   1st Qu.:0.0900  
##  Median :15.33   Median :94.93   Median :34.12   Median :0.2550  
##  Mean   :14.81   Mean   :90.80   Mean   :38.47   Mean   :0.4631  
##  3rd Qu.:18.00   3rd Qu.:97.13   3rd Qu.:43.51   3rd Qu.:0.6200  
##  Max.   :24.31   Max.   :98.65   Max.   :88.99   Max.   :2.0600  
##                                                  NA's   :3       
##       D1_P            D2_P             D3_P            D4_P       
##  Min.   : 0.00   Min.   :0.0000   Min.   : 2.82   Min.   : 0.000  
##  1st Qu.: 0.27   1st Qu.:0.2800   1st Qu.:16.55   1st Qu.: 4.100  
##  Median : 1.69   Median :0.7300   Median :22.73   Median : 6.750  
##  Mean   :13.00   Mean   :0.8086   Mean   :22.53   Mean   : 7.473  
##  3rd Qu.:21.71   3rd Qu.:1.2100   3rd Qu.:30.46   3rd Qu.: 8.150  
##  Max.   :76.29   Max.   :2.1300   Max.   :45.10   Max.   :52.630  
##                                                                   
##       D5_P            H1_P              H2_P           H3A_P       
##  Min.   : 0.00   Min.   : 0.0000   Min.   : 1.86   Min.   : 0.000  
##  1st Qu.: 0.12   1st Qu.: 0.0000   1st Qu.: 6.93   1st Qu.: 1.760  
##  Median : 0.68   Median : 0.2600   Median :27.35   Median : 5.180  
##  Mean   : 4.84   Mean   : 0.8351   Mean   :31.17   Mean   : 7.679  
##  3rd Qu.: 3.04   3rd Qu.: 0.7100   3rd Qu.:52.02   3rd Qu.: 9.290  
##  Max.   :47.78   Max.   :10.3600   Max.   :93.44   Max.   :49.290  
##                                                                    
##      H3B_P           H3C_P             H4_P            H5_P      
##  Min.   : 0.00   Min.   : 0.000   Min.   : 0.00   Min.   : 0.37  
##  1st Qu.: 4.46   1st Qu.: 0.570   1st Qu.: 5.96   1st Qu.: 5.00  
##  Median : 9.56   Median : 1.350   Median :12.13   Median : 9.47  
##  Mean   :13.15   Mean   : 2.048   Mean   :12.08   Mean   :11.10  
##  3rd Qu.:16.96   3rd Qu.: 2.500   3rd Qu.:14.88   3rd Qu.:13.23  
##  Max.   :65.36   Max.   :12.860   Max.   :38.13   Max.   :44.64  
##                                                                  
##       H6_P             H7_P            H8_P            H9_P       
##  Min.   : 0.000   Min.   : 0.00   Min.   : 1.30   Min.   : 1.230  
##  1st Qu.: 3.010   1st Qu.:19.78   1st Qu.:19.42   1st Qu.: 5.670  
##  Median : 3.710   Median :24.10   Median :24.55   Median : 6.730  
##  Mean   : 4.353   Mean   :23.71   Mean   :24.75   Mean   : 6.913  
##  3rd Qu.: 5.000   3rd Qu.:29.19   3rd Qu.:30.07   3rd Qu.: 8.380  
##  Max.   :11.760   Max.   :39.12   Max.   :50.20   Max.   :15.790  
##                                                                   
##       P1_P            P2_P            P3_P            P4_P      
##  Min.   : 2.45   Min.   : 5.23   Min.   :15.15   Min.   : 4.65  
##  1st Qu.: 7.77   1st Qu.:13.60   1st Qu.:24.11   1st Qu.:13.86  
##  Median :17.32   Median :26.72   Median :29.63   Median :19.88  
##  Mean   :16.90   Mean   :27.62   Mean   :29.89   Mean   :19.80  
##  3rd Qu.:21.70   3rd Qu.:38.60   3rd Qu.:36.21   3rd Qu.:24.58  
##  Max.   :64.46   Max.   :72.48   Max.   :44.40   Max.   :43.35  
## 
#Removing "POP_25OVER", which is outside the A1_P to P4_P range.
#Note: D5_P is also dropped on this line, so the PCA below uses 30 of the 31 variables in that range.
svi_num <- svi_numeric %>% select(-POP_25OVER, -D5_P)
#Listwise deletion: Keep only complete rows (no NAs)
svi_clean <- svi_num[complete.cases(svi_num), ]

#Compute the covariance matrix.
cov_svi <- cov(svi_clean)
print(cov_svi)
##                   A1_P          A2_P          A3_P          A4_P          B1_P
## A1_P      5.065629e+01  1.063180e+01   -26128.0398  4.192061e+01   -3.43255849
## A2_P      1.063180e+01  8.809417e+00    -9816.8146  1.466883e+01   -2.50685472
## A3_P     -2.612804e+04 -9.816815e+03 27880188.0657 -3.092975e+04 4074.50792453
## A4_P      4.192061e+01  1.466883e+01   -30929.7488  5.539837e+01   -1.72805472
## B1_P     -3.432558e+00 -2.506855e+00     4074.5079 -1.728055e+00    4.65871321
## B2_P      1.076905e+01  8.257583e+00   -14917.6977  1.734929e+01   -2.55725094
## B3_P      4.238662e+00  2.637835e+00    -2431.5978  6.576044e+00    2.14509245
## B4_P      1.849997e+01  7.862490e+00   -15815.1071  1.967373e+01   -1.99226415
## C1_P      4.742181e+01  3.638733e+00   -24991.3939  4.859398e+01    8.57250189
## C1_NEW_P -3.137257e+01 -1.205021e+01    39028.5371 -5.791651e+01  -12.66355849
## C2_P      3.626387e-01 -5.516303e-01      599.6663 -3.016268e-01    0.01535094
## D1_P     -3.997941e+00 -2.177523e+01    50763.4982 -5.110366e+01   -0.68656038
## D2_P      9.808066e-01  4.164093e-01    -1476.6444  2.218507e+00    0.08587358
## D3_P      5.313013e+01  1.514244e+01   -42275.1022  6.282416e+01   -5.05245660
## D4_P      1.652533e+01  4.017010e+00    -6615.5639  1.250066e+01   -1.26758868
## H1_P      5.595557e+00  3.424814e+00    -4434.3938  1.019967e+01   -0.84123019
## H2_P      3.927664e+01  2.523316e+01   -52652.4138  1.012381e+02   -5.85781887
## H3A_P     3.566779e+01  1.912095e+01   -27892.8401  5.528167e+01   -3.55532075
## H3B_P     4.679148e+01  2.494628e+01   -40652.7180  7.611261e+01   -4.20950566
## H3C_P     8.345731e+00  4.659195e+00    -6961.1540  1.396031e+01   -1.11983396
## H4_P      2.328295e+01  1.352923e+01   -25081.4711  4.636701e+01    0.73100566
## H5_P      3.291549e+01  1.727469e+01   -27559.6125  5.383710e+01   -2.23443962
## H6_P      1.198075e+01  4.291952e+00    -6170.3282  1.337447e+01   -1.05274717
## H7_P      1.362357e+01 -8.727526e-02    -4062.6795  9.006687e+00   -3.50354528
## H8_P      6.074150e+01  1.641751e+01   -36893.1038  6.300685e+01   -1.00387170
## H9_P      8.429944e+00  1.692624e+00    -4539.1119  8.454221e+00    0.67770189
## P1_P      3.222465e+01 -8.388021e+00    -4598.7041  1.302488e+01   -3.28612453
## P2_P      3.396561e+01 -1.674442e+01    -7325.4001  1.576086e+01    0.88498302
## P3_P      8.234755e-01 -7.377804e+00    -5723.9795  4.614531e+00    4.20072264
## P4_P      2.678682e+01 -4.737138e+00    -8641.0139  1.659495e+01   -0.28850000
##                   B2_P          B3_P          B4_P          C1_P      C1_NEW_P
## A1_P         10.769055     4.2386623  1.849997e+01  4.742181e+01   -31.3725726
## A2_P          8.257583     2.6378352  7.862490e+00  3.638733e+00   -12.0502053
## A3_P     -14917.697743 -2431.5978407 -1.581511e+04 -2.499139e+04 39028.5370650
## A4_P         17.349293     6.5760444  1.967373e+01  4.859398e+01   -57.9165122
## B1_P         -2.557251     2.1450925 -1.992264e+00  8.572502e+00   -12.6635585
## B2_P         20.596909     1.7168432  1.210744e+01  8.534765e+00   -33.2409662
## B3_P          1.716843     4.7797950  4.357610e+00  1.387777e+01   -22.5695050
## B4_P         12.107443     4.3576105  1.670760e+01  2.065626e+01   -33.9791395
## C1_P          8.534765    13.8777679  2.065626e+01  1.135332e+02  -125.2223708
## C1_NEW_P    -33.240966   -22.5695050 -3.397914e+01 -1.252224e+02   241.7171110
## C2_P         -1.459437    -0.2400254 -6.809239e-01  3.814324e-01     3.0394360
## D1_P        -56.395780    -6.4531901 -2.884046e+01 -9.202507e+00   158.5892466
## D2_P          1.088362     0.1960143  4.389358e-01  1.775529e+00    -3.2215886
## D3_P         22.227788     6.0906073  2.392003e+01  6.741360e+01   -84.3828247
## D4_P         -1.360083     1.5559235  5.624201e+00  1.154843e+01     0.9508801
## H1_P          3.842592     0.4978256  2.134121e+00  4.117350e+00    -6.1464744
## H2_P         70.917389     9.4128776  3.988927e+01  8.153938e+01  -177.3238470
## H3A_P        22.666148     5.7007092  1.701547e+01  2.922673e+01   -48.1678162
## H3B_P        32.541585     7.3589287  2.280993e+01  3.952785e+01   -69.5704552
## H3C_P         5.707493     1.1244415  3.875334e+00  6.725265e+00   -11.4014858
## H4_P         20.700348     7.4358566  1.421481e+01  3.773432e+01   -75.8766189
## H5_P         19.483405     8.6248920  1.896237e+01  3.897824e+01   -66.0617891
## H6_P          3.857423     1.6848235  4.815907e+00  1.087627e+01    -9.4904746
## H7_P         -9.794245    -3.3719059 -2.676290e+00  4.135528e+00    36.0448790
## H8_P         11.607260    10.5511635  2.466232e+01  7.342985e+01   -76.6300544
## H9_P          1.629560     2.7889807  4.796369e+00  1.449939e+01   -15.4389353
## P1_P        -24.309437    -8.1983191 -1.077908e+01  2.622728e+01    60.6197856
## P2_P        -30.005847   -12.5305887 -1.927533e+01  4.194815e+01    63.5203821
## P3_P         -4.315880    -2.6778107 -6.338611e+00  2.063958e+01   -12.1611390
## P4_P        -12.032892    -2.3512069 -2.855668e+00  3.697291e+01    10.4042119
##                  C2_P          D1_P          D2_P          D3_P          D4_P
## A1_P       0.36263868    -3.9979406  9.808066e-01  5.313013e+01    16.5253264
## A2_P      -0.55163029   -21.7752278  4.164093e-01  1.514244e+01     4.0170105
## A3_P     599.66626136 50763.4981901 -1.476644e+03 -4.227510e+04 -6615.5638714
## A4_P      -0.30162676   -51.1036630  2.218507e+00  6.282416e+01    12.5006632
## B1_P       0.01535094    -0.6865604  8.587358e-02 -5.052457e+00    -1.2675887
## B2_P      -1.45943704   -56.3957796  1.088362e+00  2.222779e+01    -1.3600831
## B3_P      -0.24002537    -6.4531901  1.960143e-01  6.090607e+00     1.5559235
## B4_P      -0.68092393   -28.8404640  4.389358e-01  2.392003e+01     5.6242010
## C1_P       0.38143239    -9.2025072  1.775529e+00  6.741360e+01    11.5484289
## C1_NEW_P   3.03943595   158.5892466 -3.221589e+00 -8.438282e+01     0.9508801
## C2_P       0.29195405     4.5500650 -2.334846e-02 -4.191169e-01     0.4701437
## D1_P       4.55006495   396.2088991 -5.493134e+00 -5.914266e+01    16.1912234
## D2_P      -0.02334846    -5.4931338  3.524698e-01  2.715407e+00    -0.1622783
## D3_P      -0.41911691   -59.1426614  2.715407e+00  9.325790e+01    11.9678685
## D4_P       0.47014375    16.1912234 -1.622783e-01  1.196787e+01    10.4018103
## H1_P      -0.19143683   -10.4012297  5.044059e-01  1.066492e+01     1.6152054
## H2_P      -3.39296862  -273.1013940  8.744930e+00  1.271998e+02   -11.7608188
## H3A_P     -1.05479144   -75.1440208  3.024738e+00  6.105544e+01     9.0074817
## H3B_P     -1.45700297  -114.8469367  4.692193e+00  8.554705e+01    10.0009989
## H3C_P     -0.23849843   -19.2840230  7.446940e-01  1.577069e+01     1.6484239
## H4_P      -1.11993082   -97.0518440  2.965043e+00  5.523625e+01     1.4788818
## H5_P      -0.76349294   -70.6547875  2.346698e+00  6.101309e+01     9.1193923
## H6_P      -0.05930908    -2.6779653  3.391312e-01  1.513244e+01     4.3308896
## H7_P       1.25452397    68.2047607 -3.834597e-01  1.188406e+01    10.6447767
## H8_P       0.23220870   -33.8385926  1.907509e+00  7.579414e+01    22.3639790
## H9_P       0.01613630     1.3588625  1.387803e-01  9.378868e+00     2.8502244
## P1_P       3.82765573   113.9471283 -3.746188e-02  2.621714e+01    16.6159723
## P2_P       4.68359969   129.0248544  1.145626e+00  3.529756e+01    15.0340654
## P3_P       0.74408721    -3.2275078  1.408819e+00  1.532527e+01    -3.6285910
## P4_P       2.26545828    66.3357878  5.131621e-01  2.950054e+01    10.2867342
##                   H1_P          H2_P         H3A_P         H3B_P         H3C_P
## A1_P         5.5955566     39.276636  3.566779e+01     46.791480     8.3457311
## A2_P         3.4248140     25.233162  1.912095e+01     24.946284     4.6591953
## A3_P     -4434.3938015 -52652.413767 -2.789284e+04 -40652.718036 -6961.1540252
## A4_P        10.1996682    101.238066  5.528167e+01     76.112607    13.9603138
## B1_P        -0.8412302     -5.857819 -3.555321e+00     -4.209506    -1.1198340
## B2_P         3.8425919     70.917389  2.266615e+01     32.541585     5.7074931
## B3_P         0.4978256      9.412878  5.700709e+00      7.358929     1.1244415
## B4_P         2.1341210     39.889268  1.701547e+01     22.809927     3.8753336
## C1_P         4.1173503     81.539384  2.922673e+01     39.527851     6.7252651
## C1_NEW_P    -6.1464744   -177.323847 -4.816782e+01    -69.570455   -11.4014858
## C2_P        -0.1914368     -3.392969 -1.054791e+00     -1.457003    -0.2384984
## D1_P       -10.4012297   -273.101394 -7.514402e+01   -114.846937   -19.2840230
## D2_P         0.5044059      8.744930  3.024738e+00      4.692193     0.7446940
## D3_P        10.6649215    127.199757  6.105544e+01     85.547048    15.7706874
## D4_P         1.6152054    -11.760819  9.007482e+00     10.000999     1.6484239
## H1_P         3.5442166     27.256046  1.523568e+01     20.246029     4.1293384
## H2_P        27.2560458    751.536649  1.574188e+02    218.404098    39.5908805
## H3A_P       15.2356818    157.418822  8.290074e+01    112.013340    20.4972676
## H3B_P       20.2460288    218.404098  1.120133e+02    156.395646    28.0184412
## H3C_P        4.1293384     39.590881  2.049727e+01     28.018441     5.6896142
## H4_P        12.2113390    161.349440  6.471721e+01     90.926165    17.0377094
## H5_P        13.0062272    138.178029  6.959961e+01     94.819865    17.9883031
## H6_P         3.0081978     24.585532  1.620865e+01     20.822982     3.9368616
## H7_P         1.5601925    -58.495619 -7.813718e-01     -3.660758     0.3323572
## H8_P        10.3658662     82.630642  6.138033e+01     82.182341    14.9422858
## H9_P         0.8138725      8.884561  5.068677e+00      7.177738     1.2374871
## P1_P         0.5010082    -66.694507 -5.435555e+00     -7.915443    -1.1147531
## P2_P         0.1752686    -97.275027 -1.066949e+01     -9.840459    -2.9401368
## P3_P        -1.5268958     -5.887369 -6.331693e+00     -2.048023    -2.0031019
## P4_P         0.6821602    -25.947304  1.234623e+00      2.499920     0.4030774
##                   H4_P          H5_P          H6_P          H7_P          H8_P
## A1_P      2.328295e+01  3.291549e+01  1.198075e+01  1.362357e+01  6.074150e+01
## A2_P      1.352923e+01  1.727469e+01  4.291952e+00 -8.727526e-02  1.641751e+01
## A3_P     -2.508147e+04 -2.755961e+04 -6.170328e+03 -4.062680e+03 -3.689310e+04
## A4_P      4.636701e+01  5.383710e+01  1.337447e+01  9.006687e+00  6.300685e+01
## B1_P      7.310057e-01 -2.234440e+00 -1.052747e+00 -3.503545e+00 -1.003872e+00
## B2_P      2.070035e+01  1.948341e+01  3.857423e+00 -9.794245e+00  1.160726e+01
## B3_P      7.435857e+00  8.624892e+00  1.684823e+00 -3.371906e+00  1.055116e+01
## B4_P      1.421481e+01  1.896237e+01  4.815907e+00 -2.676290e+00  2.466232e+01
## C1_P      3.773432e+01  3.897824e+01  1.087627e+01  4.135528e+00  7.342985e+01
## C1_NEW_P -7.587662e+01 -6.606179e+01 -9.490475e+00  3.604488e+01 -7.663005e+01
## C2_P     -1.119931e+00 -7.634929e-01 -5.930908e-02  1.254524e+00  2.322087e-01
## D1_P     -9.705184e+01 -7.065479e+01 -2.677965e+00  6.820476e+01 -3.383859e+01
## D2_P      2.965043e+00  2.346698e+00  3.391312e-01 -3.834597e-01  1.907509e+00
## D3_P      5.523625e+01  6.101309e+01  1.513244e+01  1.188406e+01  7.579414e+01
## D4_P      1.478882e+00  9.119392e+00  4.330890e+00  1.064478e+01  2.236398e+01
## H1_P      1.221134e+01  1.300623e+01  3.008198e+00  1.560193e+00  1.036587e+01
## H2_P      1.613494e+02  1.381780e+02  2.458553e+01 -5.849562e+01  8.263064e+01
## H3A_P     6.471721e+01  6.959961e+01  1.620865e+01 -7.813718e-01  6.138033e+01
## H3B_P     9.092617e+01  9.481987e+01  2.082298e+01 -3.660758e+00  8.218234e+01
## H3C_P     1.703771e+01  1.798830e+01  3.936862e+00  3.323572e-01  1.494229e+01
## H4_P      7.012037e+01  6.216375e+01  1.156781e+01 -1.228482e+01  5.236899e+01
## H5_P      6.216375e+01  7.415478e+01  1.580125e+01 -2.358774e+00  6.426191e+01
## H6_P      1.156781e+01  1.580125e+01  5.302512e+00  3.510088e+00  1.743551e+01
## H7_P     -1.228482e+01 -2.358774e+00  3.510088e+00  4.087370e+01  1.100183e+01
## H8_P      5.236899e+01  6.426191e+01  1.743551e+01  1.100183e+01  1.015260e+02
## H9_P      6.956740e+00  8.580773e+00  2.287347e+00  9.099312e-01  1.284785e+01
## P1_P     -1.818015e+01 -1.144097e+01  3.194119e+00  4.835925e+01  2.811541e+01
## P2_P     -2.233044e+01 -2.252899e+01  1.288443e+00  6.513438e+01  2.394500e+01
## P3_P      6.934755e-01 -8.776734e+00 -1.860634e+00  8.783971e+00 -1.694850e+00
## P4_P     -3.597668e+00 -2.596841e-01  3.836770e+00  3.019090e+01  2.864361e+01
##                   H9_P          P1_P          P2_P          P3_P          P4_P
## A1_P         8.4299443  3.222465e+01    33.9656123     0.8234755    26.7868245
## A2_P         1.6926240 -8.388021e+00   -16.7444217    -7.3778044    -4.7371384
## A3_P     -4539.1119008 -4.598704e+03 -7325.4000629 -5723.9794759 -8641.0139413
## A4_P         8.4542209  1.302488e+01    15.7608610     4.6145306    16.5949545
## B1_P         0.6777019 -3.286125e+00     0.8849830     4.2007226    -0.2885000
## B2_P         1.6295601 -2.430944e+01   -30.0058465    -4.3158801   -12.0328920
## B3_P         2.7889807 -8.198319e+00   -12.5305887    -2.6778107    -2.3512069
## B4_P         4.7963690 -1.077908e+01   -19.2753343    -6.3386109    -2.8556681
## C1_P        14.4993884  2.622728e+01    41.9481500    20.6395774    36.9729075
## C1_NEW_P   -15.4389353  6.061979e+01    63.5203821   -12.1611390    10.4042119
## C2_P         0.0161363  3.827656e+00     4.6835997     0.7440872     2.2654583
## D1_P         1.3588625  1.139471e+02   129.0248544    -3.2275078    66.3357878
## D2_P         0.1387803 -3.746188e-02     1.1456261     1.4088193     0.5131621
## D3_P         9.3788683  2.621714e+01    35.2975610    15.3252671    29.5005400
## D4_P         2.8502244  1.661597e+01    15.0340654    -3.6285910    10.2867342
## H1_P         0.8138725  5.010082e-01     0.1752686    -1.5268958     0.6821602
## H2_P         8.8845607 -6.669451e+01   -97.2750270    -5.8873690   -25.9473042
## H3A_P        5.0686768 -5.435555e+00   -10.6694871    -6.3316933     1.2346231
## H3B_P        7.1777380 -7.915443e+00    -9.8404588    -2.0480229     2.4999199
## H3C_P        1.2374871 -1.114753e+00    -2.9401368    -2.0031019     0.4030774
## H4_P         6.9567403 -1.818015e+01   -22.3304396     0.6934755    -3.5976679
## H5_P         8.5807730 -1.144097e+01   -22.5289950    -8.7767338    -0.2596841
## H6_P         2.2873470  3.194119e+00     1.2884428    -1.8606344     3.8367700
## H7_P         0.9099312  4.835925e+01    65.1343799     8.7839709    30.1909023
## H8_P        12.8478510  2.811541e+01    23.9449991    -1.6948497    28.6436069
## H9_P         4.4841796  1.049362e+00    -0.9150619    -0.8956602     3.0830197
## P1_P         1.0493617  1.190871e+02   158.0575733    34.8546520    73.0023853
## P2_P        -0.9150619  1.580576e+02   244.3075236    75.5246038   102.9644038
## P3_P        -0.8956602  3.485465e+01    75.5246038    53.1956365    29.2132006
## P4_P         3.0830197  7.300239e+01   102.9644038    29.2132006    52.3501270
#Compute total variance
trace_cov <- sum(diag(cov_svi))
# Print the result
print(trace_cov)
## [1] 27882971
#Perform PCA on the covariance matrix (unstandardized, unrotated); prcomp() centers but does not scale by default.
pca_unstand <- prcomp(svi_clean)

#Compute eigenvalues from PCA
ev1 <- (pca_unstand$sdev)^2  

#Print the eigenvalues
print(ev1)
##  [1] 2.788080e+07 9.853312e+02 4.722027e+02 2.454876e+02 1.893916e+02
##  [6] 1.388021e+02 4.278436e+01 1.890971e+01 1.740406e+01 1.303513e+01
## [11] 1.142581e+01 7.953393e+00 6.038056e+00 4.126621e+00 3.457004e+00
## [16] 2.915039e+00 2.407362e+00 1.860315e+00 1.685880e+00 1.468750e+00
## [21] 9.248961e-01 7.729898e-01 6.449708e-01 4.865241e-01 3.357214e-01
## [26] 2.930370e-01 2.421454e-01 1.182695e-01 1.089978e-01 4.843496e-02
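
As a quick consistency check, the eigenvalues of a covariance-based PCA should sum to the trace of the covariance matrix, so each component's share of the total variance is its eigenvalue divided by that trace. The sketch below illustrates this on synthetic data; the data frame `toy` and its columns are hypothetical stand-ins, not the SVI variables, with one column deliberately placed on a much larger scale than the others.

```r
# Synthetic stand-in data (hypothetical; not the SVI file)
set.seed(1)
toy <- data.frame(
  income = rnorm(50, mean = 17000, sd = 5000),  # dollar-scale variable
  pct_a  = rnorm(50, mean = 20, sd = 5),        # percentage-scale variables
  pct_b  = rnorm(50, mean = 10, sd = 3)
)

pca_toy   <- prcomp(toy)           # covariance-based PCA (no scaling)
ev_toy    <- pca_toy$sdev^2        # eigenvalues
trace_toy <- sum(diag(cov(toy)))   # total variance

# The eigenvalues sum to the trace of the covariance matrix
print(all.equal(sum(ev_toy), trace_toy))

# Share of total variance carried by each component
prop_var <- ev_toy / trace_toy
print(round(prop_var, 5))
```

Applied to the results above, the same ratio for PC1 is 2.788080e+07 / 27882971, or about 0.9999, so a single large-variance variable accounts for nearly all of the unstandardized total.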

b. How many principal components should be retained? Justify your choice.

#1. Kaiser Method:
#Eigenvalues greater than 1
ev1[ev1 > 1]
##  [1] 2.788080e+07 9.853312e+02 4.722027e+02 2.454876e+02 1.893916e+02
##  [6] 1.388021e+02 4.278436e+01 1.890971e+01 1.740406e+01 1.303513e+01
## [11] 1.142581e+01 7.953393e+00 6.038056e+00 4.126621e+00 3.457004e+00
## [16] 2.915039e+00 2.407362e+00 1.860315e+00 1.685880e+00 1.468750e+00
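
The eigenvalue-greater-than-1 cutoff is calibrated for a correlation matrix, where every standardized variable contributes a variance of 1 and the mean eigenvalue is therefore 1. On a covariance matrix the count depends on the measurement units, so a commonly used analogue is to compare each eigenvalue with the mean eigenvalue instead. A minimal sketch on synthetic data (all names are hypothetical):

```r
set.seed(2)
toy <- data.frame(
  income = rnorm(60, mean = 17000, sd = 5000),  # dollar-scale variable
  pct_a  = rnorm(60, mean = 20, sd = 5),
  pct_b  = rnorm(60, mean = 10, sd = 3),
  pct_c  = rnorm(60, mean = 30, sd = 8)
)

ev_cov <- prcomp(toy)$sdev^2                 # covariance-based eigenvalues
ev_cor <- prcomp(toy, scale. = TRUE)$sdev^2  # correlation-based eigenvalues

# Fixed cutoff of 1 on the covariance scale: driven by measurement units
n_fixed <- sum(ev_cov > 1)

# Mean-eigenvalue cutoff: the analogue for the covariance case
n_mean <- sum(ev_cov > mean(ev_cov))

# On the correlation scale the mean eigenvalue is 1
mean_cor <- mean(ev_cor)

print(c(fixed_cutoff = n_fixed, mean_cutoff = n_mean, mean_cor_eigenvalue = mean_cor))
```

For the unstandardized SVI results, the mean eigenvalue is 27882971 / 30, roughly 929,432, and only the first eigenvalue exceeds it.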
#2. Scree Plot (Elbow method):
#Scree plot
#Create a dataframe for plotting
scree_svi <- data.frame(
  PC = paste0("PC", 1:length(ev1)),  
  Eigenvalue = ev1)

#Fix ordering by converting to a factor with correct levels
scree_svi$PC <- factor(scree_svi$PC, levels = paste0("PC", 1:length(ev1)))

#Plot the Scree Plot using ggplot2
ggplot(scree_svi, aes(x = PC, y = Eigenvalue)) +
  geom_line(aes(group = 1), color = "red", linetype = "dashed") +  
  geom_point(color = "red", size = 3) +  
  labs(title = "Scree Plot of PCA", x = "Principal Components", y = "Eigenvalues") +
  theme_minimal()

print(pca_unstand)
## Standard deviations (1, .., p=30):
##  [1] 5280.2272908   31.3899863   21.7302250   15.6680437   13.7619635
##  [6]   11.7814297    6.5409752    4.3485295    4.1718179    3.6104194
## [11]    3.3802079    2.8201761    2.4572456    2.0314086    1.8593021
## [16]    1.7073484    1.5515675    1.3639338    1.2984144    1.2119201
## [21]    0.9617152    0.8791984    0.8031008    0.6975128    0.5794147
## [26]    0.5413289    0.4920827    0.3439033    0.3301481    0.2200794
## 
## Rotation (n x k) = (30 x 30):
##                    PC1           PC2          PC3          PC4           PC5
## A1_P      9.371463e-04  0.0402265944 -0.127392020  0.088495385  1.413354e-01
## A2_P      3.521053e-04 -0.0216986621  0.018733062 -0.012058194  1.190082e-01
## A3_P     -9.999890e-01 -0.0029755650 -0.001972866  0.001535198  7.585626e-05
## A4_P      1.109380e-03 -0.0433185076 -0.125713189  0.081856406  1.174598e-01
## B1_P     -1.461392e-04 -0.0102759303 -0.005137695  0.087035831 -4.532912e-02
## B2_P      5.350642e-04 -0.0725841303  0.044907085 -0.020915906  2.770979e-02
## B3_P      8.721810e-05 -0.0192018140  0.005551392  0.097525208  4.850881e-02
## B4_P      5.672481e-04 -0.0237467607  0.024585082  0.056894170  8.469832e-02
## C1_P      8.963905e-04 -0.0172475862 -0.212633142  0.506421825 -8.043978e-02
## C1_NEW_P -1.399881e-03  0.2170523850 -0.041835755 -0.722773834  1.895611e-01
## C2_P     -2.150878e-05  0.0068816817 -0.010955642 -0.001933044 -4.182338e-03
## D1_P     -1.820786e-03  0.3876105399 -0.373700327  0.063464326  4.720959e-01
## D2_P      5.296432e-05 -0.0066781976 -0.006765822 -0.002714922 -9.527141e-03
## D3_P      1.516309e-03 -0.0315127252 -0.171522520  0.125352068  3.041054e-02
## D4_P      2.372808e-04  0.0413683989 -0.042512538  0.026091848  1.056459e-01
## H1_P      1.590539e-04 -0.0221111303 -0.033040395 -0.013274929  5.194766e-02
## H2_P      1.888587e-03 -0.7128890504 -0.479375321 -0.269624757 -1.440444e-01
## H3A_P     1.000464e-03 -0.1332055334 -0.143371862 -0.035326822  2.495197e-01
## H3B_P     1.458135e-03 -0.1819366143 -0.184722675 -0.061705890  2.765618e-01
## H3C_P     2.496837e-04 -0.0337859006 -0.035716959 -0.012616082  6.299550e-02
## H4_P      8.996294e-04 -0.1611071759 -0.092475244  0.071632799  7.922777e-02
## H5_P      9.885105e-04 -0.1232520169 -0.103574076  0.078808352  2.517136e-01
## H6_P      2.213169e-04 -0.0116518056 -0.044918401  0.018040585  8.797587e-02
## H7_P      1.457070e-04  0.1352118613 -0.121079191 -0.058468747  6.652222e-02
## H8_P      1.323270e-03 -0.0059692796 -0.163781809  0.230117471  2.338082e-01
## H9_P      1.628073e-04 -0.0003258136 -0.018028667  0.066506910  4.353532e-02
## P1_P      1.649313e-04  0.2251321586 -0.343651106 -0.080870252 -1.303621e-01
## P2_P      2.627268e-04  0.3104372748 -0.459634460 -0.070600530 -4.324737e-01
## P3_P      2.053023e-04  0.0596162047 -0.105783929  0.031259021 -3.758482e-01
## P4_P      3.099249e-04  0.1355559866 -0.237536229  0.046992881 -1.114773e-01
##                   PC6          PC7           PC8          PC9          PC10
## A1_P     -0.030272857  0.361569766 -0.3352735899  0.345102220 -0.0795362800
## A2_P     -0.034980296 -0.005780770 -0.0084164988  0.098828494  0.0518011257
## A3_P     -0.002022545  0.000754501  0.0001027412  0.001006447 -0.0002375071
## A4_P     -0.143188668  0.048871385  0.0499135303  0.230700986  0.2591936224
## B1_P     -0.039552431 -0.018748245 -0.0767564244 -0.059612371  0.2078407388
## B2_P      0.038874484 -0.111708546 -0.2198975363  0.400846419 -0.0747292591
## B3_P      0.004810981 -0.002843460 -0.0302893107 -0.055218686  0.1329216934
## B4_P      0.089396148  0.087362540 -0.1376153789  0.183705971  0.1427661033
## C1_P      0.096759375  0.138324070 -0.1605266762  0.129182845  0.1791404405
## C1_NEW_P -0.070497703  0.204527684 -0.1849909367 -0.032439322  0.1011123285
## C2_P     -0.004141620  0.024567069  0.0106157063 -0.023540039 -0.0014944693
## D1_P      0.535909169 -0.377780956 -0.1027971499 -0.136461636 -0.0390516736
## D2_P     -0.012188809 -0.019335665 -0.0161816124 -0.007329717  0.0158730520
## D3_P     -0.069604463 -0.068208465  0.1945137851  0.303831759 -0.6104217016
## D4_P     -0.021377070  0.198904557  0.0293479047  0.030491794  0.2228858297
## H1_P     -0.064282594 -0.046408613  0.0579864897  0.030137065  0.0031740072
## H2_P      0.386086163  0.072687995  0.0329843076 -0.027083191  0.0719680401
## H3A_P    -0.305162605 -0.134968971 -0.1092126188  0.134002473  0.0329328454
## H3B_P    -0.438976981 -0.279863969 -0.2430676183  0.063888876  0.0018367177
## H3C_P    -0.080133672 -0.039891105  0.0375867057  0.023813220 -0.0445997276
## H4_P     -0.235600847 -0.153917372  0.1008339092 -0.355182278 -0.1394665300
## H5_P     -0.219343607 -0.078105666  0.2691842623 -0.172049490  0.0086784444
## H6_P     -0.035293180  0.011266739 -0.0080580411  0.069093303  0.0531088670
## H7_P     -0.019858804 -0.084924988  0.6512990997  0.330879187  0.4245774879
## H8_P     -0.158608786  0.433950617 -0.0527478151 -0.390554254  0.1204652648
## H9_P      0.027883557  0.028940432 -0.0199144408 -0.068315583  0.0926676872
## P1_P     -0.088429906  0.305363946  0.1271010822 -0.090549671 -0.2502938410
## P2_P     -0.250051494 -0.061167649 -0.0200366274  0.078535856  0.0505228195
## P3_P     -0.109603865 -0.404513504 -0.3125925395 -0.129197165  0.2541553970
## P4_P     -0.016449920  0.070909202  0.0096777048 -0.064459116 -0.0825627497
##                   PC11          PC12          PC13          PC14          PC15
## A1_P      0.1441949024 -0.0884818743  0.1737593681  0.1628728696 -0.0223876311
## A2_P      0.0702706765  0.0942202269  0.0776773717 -0.2901701610 -0.1633320316
## A3_P      0.0007019939  0.0002528734  0.0002476212  0.0001569519 -0.0002747373
## A4_P      0.2123258528 -0.1654037907 -0.1132850188 -0.3180489055 -0.2209057190
## B1_P     -0.0708463205 -0.0914833367  0.0201346827 -0.0411694563 -0.0435557229
## B2_P     -0.0043826707 -0.1669054326  0.0840708991  0.3591506162 -0.1374576396
## B3_P      0.0564844243 -0.0373355346 -0.0141449981 -0.2843690996  0.0321087068
## B4_P      0.1180411704 -0.0194313527 -0.1393686272 -0.0701349665 -0.2479811595
## C1_P     -0.0742983314 -0.3473453473 -0.0398893713 -0.1159802544  0.3751516696
## C1_NEW_P  0.2275486558 -0.2722258752  0.0486623402 -0.0094536493  0.2075097276
## C2_P      0.0138814314 -0.0228363841 -0.0374793281 -0.0418078687  0.0305130508
## D1_P     -0.0499101424  0.0266749544 -0.0050888142  0.0194331677 -0.0921467013
## D2_P     -0.0252576096 -0.0031663117  0.0057684949 -0.0185358514  0.0470677610
## D3_P      0.3748405326  0.1387166300  0.0983940538 -0.0211550188 -0.0225482108
## D4_P      0.0031489869  0.2209618753 -0.0966371812 -0.0580407738 -0.3938183114
## H1_P     -0.0528003048 -0.0577102258  0.0262860344  0.0406823385 -0.1189119227
## H2_P     -0.0020050890  0.0534699929  0.0017587202  0.0244533903 -0.0174186269
## H3A_P    -0.1956922792  0.1935261313  0.0332821796 -0.0960858341  0.0750466226
## H3B_P    -0.2737135558  0.1508182045 -0.0351176242 -0.1340435399  0.1659115893
## H3C_P    -0.0211319545 -0.0531484062  0.0473632740 -0.0261962754  0.0844911792
## H4_P      0.0034624262 -0.5349407512  0.4808242795 -0.0102441865 -0.3076630921
## H5_P      0.3516282339 -0.1805244679 -0.6295045442  0.2666476540  0.1095071064
## H6_P      0.1457679617  0.0203851690  0.0110065663  0.1820133806  0.0241765276
## H7_P      0.0120505226  0.0579442220  0.3540528059  0.1032865203  0.1495094696
## H8_P      0.0734179235  0.3488561690  0.2426866714  0.3522026250 -0.0096438256
## H9_P      0.1046349154 -0.2665423328 -0.0033181412 -0.0415125545 -0.1300662741
## P1_P     -0.0192470685  0.0599971978 -0.0781935062 -0.4112433548 -0.1430836456
## P2_P     -0.2843101528 -0.0914963459 -0.1974516354  0.3138693534 -0.2343058983
## P3_P      0.5833876757  0.2387443785  0.1564926989 -0.0280578523 -0.0129605502
## P4_P      0.0806557496 -0.0204825039  0.1126064057 -0.1117541940  0.4612913247
##                   PC16          PC17          PC18          PC19          PC20
## A1_P     -0.0652526104 -0.0245430902 -0.4749995309  0.1446511471  4.752964e-02
## A2_P      0.1621181060  0.4051261358  0.1412194492  0.3153166234 -2.814212e-01
## A3_P     -0.0001022706 -0.0002118591  0.0002997828 -0.0001062818 -1.925671e-05
## A4_P      0.5379747433 -0.3597958548  0.0980779196  0.0425299549  2.571322e-01
## B1_P      0.0509373861 -0.2151794704 -0.0315809260  0.1230107742 -1.699755e-01
## B2_P      0.3345700260  0.1613736324  0.1616016835 -0.2876066752 -2.519286e-01
## B3_P     -0.1200166304 -0.0450488301  0.0433302576  0.3610775200 -4.623489e-01
## B4_P     -0.5680371853 -0.1132461850  0.4351540050 -0.1196436761  3.213863e-01
## C1_P     -0.1092265957  0.2754904054  0.2138745515 -0.1073629486 -1.008666e-01
## C1_NEW_P -0.0778632270  0.0278748192  0.2165832042  0.0427964903 -1.009914e-01
## C2_P     -0.0187106705 -0.0024403820  0.0052926851  0.0052486991  3.037472e-02
## D1_P      0.0576268681 -0.0238995591 -0.0162300730  0.0154915312  2.192761e-02
## D2_P      0.0083217630 -0.0484174739 -0.0173975608  0.0025642892 -6.760688e-02
## D3_P     -0.1631326189 -0.1979101946  0.1956105974  0.1407502436 -1.583336e-01
## D4_P     -0.1155383381  0.1333142357 -0.1951343190  0.0813728069  3.856065e-02
## H1_P      0.1582135144  0.1786211158  0.1090812043 -0.0983939699  1.684168e-01
## H2_P      0.0057089612 -0.0149439361 -0.0186549102  0.0335767548 -2.405821e-02
## H3A_P    -0.1190525556  0.3353372944 -0.0593655322  0.1080282659  2.420808e-01
## H3B_P    -0.0970767689 -0.2921354098 -0.0541463393 -0.2208694744 -1.775381e-01
## H3C_P     0.0609625687  0.0586532304  0.1338168320 -0.1664365900  1.377214e-01
## H4_P     -0.1131326654  0.0998037759  0.0101607033  0.0904971480  1.287349e-01
## H5_P      0.0016402017  0.1630550592 -0.0800450277  0.0092971998 -6.828621e-02
## H6_P     -0.0368041928  0.2213580683 -0.2341219343  0.1071679429  1.540656e-01
## H7_P     -0.1414834128 -0.0071709640 -0.0485799245 -0.1325671125 -1.002728e-01
## H8_P      0.1323170931 -0.1166162335  0.3078654788 -0.0767047817 -1.222998e-01
## H9_P     -0.1949120277 -0.2445896532 -0.3637915873 -0.2884249586 -2.278681e-01
## P1_P      0.0467914001  0.2338562982 -0.1009595725 -0.4768012752 -1.034936e-01
## P2_P     -0.0613650027 -0.0710942879  0.1100502535  0.2799735585 -6.140062e-02
## P3_P     -0.0221714085  0.1249466332 -0.0511584204 -0.1640802128  1.001056e-02
## P4_P      0.1350903761 -0.1197708012 -0.0673699496  0.1929226377  3.399778e-01
##                   PC21          PC22         PC23          PC24          PC25
## A1_P      0.1913431916  0.1512895269  0.320339382 -6.919663e-02 -0.0486615054
## A2_P     -0.0099557341 -0.3814377788  0.272429699 -2.240989e-01 -0.2060639190
## A3_P     -0.0001637726 -0.0003391257  0.000253230 -6.364153e-05  0.0001135195
## A4_P      0.1493222645  0.0089435388 -0.146257556 -1.056848e-01  0.1516552119
## B1_P     -0.1334821508  0.5411973600  0.055422277  2.706844e-01 -0.3656844754
## B2_P     -0.3358071707 -0.0012389897  0.142229047  1.482304e-01  0.1714446518
## B3_P      0.0275823313  0.1007501935  0.098128063  1.276471e-01  0.0250026829
## B4_P     -0.0852070630 -0.0233054761  0.301543941 -9.080648e-02 -0.0434342872
## C1_P     -0.0024947309  0.0066773597 -0.314965003 -2.570712e-02  0.0495434210
## C1_NEW_P -0.0076234396  0.0290041051 -0.218458431  7.301196e-02  0.0171780598
## C2_P     -0.1147163059 -0.0631520564  0.057202323 -8.287028e-02  0.1061761332
## D1_P      0.0358095014  0.0521254930  0.008891373 -1.560675e-02  0.0078317323
## D2_P     -0.0257981438 -0.0142697469  0.037982787 -2.593380e-02  0.0171401281
## D3_P     -0.0096776137  0.0233851616 -0.281807480  1.487627e-01 -0.0541987766
## D4_P     -0.5592540715 -0.0187861744 -0.293408991  2.694691e-01  0.0142419160
## H1_P      0.1132404391 -0.1522529144 -0.066735779  3.368808e-01 -0.3930893154
## H2_P      0.0169946133 -0.0030414816  0.013513565 -1.450018e-03 -0.0211442819
## H3A_P     0.3095573791  0.0437872078 -0.053089795  4.073489e-01  0.2794157371
## H3B_P    -0.2326365816 -0.0637502723 -0.010193948 -2.738466e-01 -0.0862431116
## H3C_P     0.0963211563  0.0496697907  0.107314811  9.134381e-02 -0.6168987604
## H4_P     -0.1339212384  0.0888798980  0.040570657 -9.303015e-02  0.1317365431
## H5_P     -0.0419152634  0.0765175566  0.235194213  3.330057e-02  0.0381647898
## H6_P     -0.1136408908  0.0243284816 -0.415338714 -4.625959e-01 -0.3037137587
## H7_P      0.0084010327  0.0560036789  0.120199553 -4.573186e-02  0.0501332588
## H8_P      0.0954480380 -0.0852487048 -0.001443675 -6.304282e-04  0.0310721700
## H9_P      0.1639614254 -0.5599190921 -0.091618447  2.563349e-01 -0.1005874056
## P1_P     -0.0125978982  0.2257108037  0.122540944 -8.247181e-02  0.0345950212
## P2_P      0.1092006740 -0.1354329905 -0.001005758 -5.109178e-02 -0.0354263369
## P3_P      0.0507680289  0.0557830717  0.024550346  3.079098e-02  0.0247423978
## P4_P     -0.4676419711 -0.2774071700  0.271444130  2.020981e-01 -0.0163299966
##                   PC26          PC27          PC28          PC29          PC30
## A1_P     -0.2473479954 -1.027811e-02 -2.923534e-02 -6.665917e-02 -9.745189e-02
## A2_P      0.0238734331  3.624083e-01 -5.739891e-02 -1.304725e-03  6.075194e-02
## A3_P      0.0000249722 -5.077513e-05  2.548705e-06 -3.628739e-05  7.901751e-05
## A4_P      0.0096458098 -4.453436e-02  3.409996e-02  3.179810e-02  4.693956e-02
## B1_P      0.3236350026  4.138135e-01 -1.534105e-01  5.386134e-02 -3.435017e-02
## B2_P      0.2221964996 -1.463586e-01  7.309511e-02  7.360951e-02  6.227738e-02
## B3_P      0.1178325185 -6.570891e-01  1.552880e-01 -2.872853e-02 -9.426008e-02
## B4_P      0.1979840861 -4.366997e-02 -5.877928e-02 -5.748934e-02  1.993041e-02
## C1_P     -0.1761186570  1.279857e-01 -1.820889e-02 -3.394940e-02 -6.369682e-03
## C1_NEW_P -0.0593174671  7.180593e-02 -1.708277e-02 -1.019688e-02  1.246032e-04
## C2_P     -0.0206307377  1.583927e-02 -1.333142e-01  7.228465e-01 -6.425515e-01
## D1_P     -0.0282533716  1.031502e-02 -1.419686e-03  1.742368e-02 -1.535459e-03
## D2_P     -0.1301916186 -2.303054e-01 -8.177557e-01  2.290677e-01  4.401395e-01
## D3_P     -0.0175157236  1.101499e-01 -5.032005e-02  3.453586e-02 -9.363013e-03
## D4_P     -0.3344973983 -1.042682e-02  7.739927e-02  4.668390e-02  8.377166e-02
## H1_P     -0.0751092178 -2.216165e-01 -3.440557e-01 -3.761691e-01 -4.820720e-01
## H2_P     -0.0101430534  4.692137e-03  4.525985e-03  7.028494e-04 -6.189241e-03
## H3A_P     0.2715091422  4.514356e-02  1.545582e-02  1.736232e-01  1.141629e-01
## H3B_P    -0.1338302002  1.261383e-02 -3.391543e-03 -1.546973e-01 -1.147346e-01
## H3C_P    -0.2621149239 -1.923647e-01  3.434105e-01  4.216021e-01  2.878288e-01
## H4_P     -0.0662047575 -1.178551e-02  1.287262e-02 -1.592123e-02  3.048296e-02
## H5_P     -0.0589934415  5.425133e-02 -1.622696e-02 -4.850895e-02  1.677066e-02
## H6_P      0.4936632684 -2.138346e-01 -3.198967e-02  4.240741e-02  1.905349e-02
## H7_P     -0.0312324944  1.176897e-02 -9.861873e-03 -2.897464e-02 -3.773744e-02
## H8_P      0.0814285089 -1.268111e-02  1.282658e-03  3.422353e-02  7.392552e-03
## H9_P      0.2427422777  9.208825e-02  3.159580e-02  1.111390e-01  8.250639e-02
## P1_P      0.1833871301 -6.914249e-02 -2.812592e-02 -3.867814e-02 -2.277636e-04
## P2_P     -0.0123653680 -2.012628e-02  4.515578e-02  5.051627e-02  3.138426e-02
## P3_P     -0.1047404540 -1.484707e-03 -1.024219e-03 -5.975386e-03 -3.133110e-02
## P4_P      0.1926923487 -4.413354e-02  2.527696e-02 -7.748037e-02  5.678309e-02
#3. Variance Explained:
#Print PCA summary (eigenvalues and explained variance)
summary(pca_unstand)
## Importance of components:
##                              PC1      PC2      PC3      PC4      PC5   PC6
## Standard deviation     5280.2273 31.38999 21.73023 15.66804 13.76196 11.78
## Proportion of Variance    0.9999  0.00004  0.00002  0.00001  0.00001  0.00
## Cumulative Proportion     0.9999  0.99996  0.99997  0.99998  0.99999  1.00
##                          PC7   PC8   PC9 PC10 PC11 PC12  PC13  PC14  PC15  PC16
## Standard deviation     6.541 4.349 4.172 3.61 3.38 2.82 2.457 2.031 1.859 1.707
## Proportion of Variance 0.000 0.000 0.000 0.00 0.00 0.00 0.000 0.000 0.000 0.000
## Cumulative Proportion  1.000 1.000 1.000 1.00 1.00 1.00 1.000 1.000 1.000 1.000
##                         PC17  PC18  PC19  PC20   PC21   PC22   PC23   PC24
## Standard deviation     1.552 1.364 1.298 1.212 0.9617 0.8792 0.8031 0.6975
## Proportion of Variance 0.000 0.000 0.000 0.000 0.0000 0.0000 0.0000 0.0000
## Cumulative Proportion  1.000 1.000 1.000 1.000 1.0000 1.0000 1.0000 1.0000
##                          PC25   PC26   PC27   PC28   PC29   PC30
## Standard deviation     0.5794 0.5413 0.4921 0.3439 0.3301 0.2201
## Proportion of Variance 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## Cumulative Proportion  1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

c. How would you interpret the resulting components?

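#Perform PCA with principal(), retaining one component.
#Note: principal() analyzes the correlation matrix (see the output header), and varimax rotation has no effect on a single component.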
pca_rotated <- principal(svi_clean, nfactors = 1, rotate = "varimax", scores = TRUE)
print(pca_rotated)
## Principal Components Analysis
## Call: principal(r = svi_clean, nfactors = 1, rotate = "varimax", scores = TRUE)
## Standardized loadings (pattern matrix) based upon correlation matrix
##            PC1      h2   u2 com
## A1_P      0.73 5.3e-01 0.47   1
## A2_P      0.74 5.5e-01 0.45   1
## A3_P     -0.79 6.3e-01 0.37   1
## A4_P      0.94 8.9e-01 0.11   1
## B1_P     -0.15 2.2e-02 0.98   1
## B2_P      0.65 4.2e-01 0.58   1
## B3_P      0.47 2.2e-01 0.78   1
## B4_P      0.71 5.0e-01 0.50   1
## C1_P      0.57 3.3e-01 0.67   1
## C1_NEW_P -0.59 3.5e-01 0.65   1
## C2_P     -0.24 5.6e-02 0.94   1
## D1_P     -0.46 2.1e-01 0.79   1
## D2_P      0.55 3.0e-01 0.70   1
## D3_P      0.87 7.5e-01 0.25   1
## D4_P      0.41 1.7e-01 0.83   1
## H1_P      0.77 6.0e-01 0.40   1
## H2_P      0.62 3.8e-01 0.62   1
## H3A_P     0.90 8.1e-01 0.19   1
## H3B_P     0.90 8.0e-01 0.20   1
## H3C_P     0.86 7.4e-01 0.26   1
## H4_P      0.86 7.4e-01 0.26   1
## H5_P      0.91 8.4e-01 0.16   1
## H6_P      0.80 6.5e-01 0.35   1
## H7_P     -0.01 6.9e-05 1.00   1
## H8_P      0.83 6.9e-01 0.31   1
## H9_P      0.53 2.8e-01 0.72   1
## P1_P     -0.05 2.5e-03 1.00   1
## P2_P     -0.08 6.2e-03 0.99   1
## P3_P     -0.04 1.9e-03 1.00   1
## P4_P      0.11 1.2e-02 0.99   1
## 
##                  PC1
## SS loadings    12.46
## Proportion Var  0.42
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 component is sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.22 
##  with the empirical chi square  2190.47  with prob <  3.5e-242 
## 
## Fit based upon off diagonal values = 0.77
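
As the output header states, these loadings are based on the correlation matrix. For a single unrotated component they can be sketched as the first eigenvector of the correlation matrix scaled by the square root of its eigenvalue, in which case "SS loadings" equals that eigenvalue. The example below uses base R only and synthetic data; every name in it is hypothetical, and the sign of an eigenvector is arbitrary, so loadings may come out with all signs flipped.

```r
set.seed(3)
# Synthetic correlated variables (hypothetical stand-ins)
z <- rnorm(80)
toy <- data.frame(
  v1 = z + rnorm(80, sd = 0.5),
  v2 = z + rnorm(80, sd = 0.8),
  v3 = -z + rnorm(80, sd = 0.6),
  v4 = rnorm(80)                 # unrelated variable
)

eig <- eigen(cor(toy))

# Loadings on the first component: eigenvector times sqrt(eigenvalue)
load1 <- eig$vectors[, 1] * sqrt(eig$values[1])
names(load1) <- names(toy)

h2 <- load1^2            # communalities for a one-component solution
u2 <- 1 - h2             # uniquenesses
ss_loadings <- sum(h2)   # equals the first eigenvalue

print(round(cbind(loading = load1, h2 = h2, u2 = u2), 2))
print(c(ss_loadings = ss_loadings, first_eigenvalue = eig$values[1]))
```

This is consistent with the output above: the reported SS loadings of 12.46 agrees with the first eigenvalue of the standardized PCA in Problem 2 (12.461755).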

d. There are two variables related to minority status. Do they load onto the same component, or do they belong to different components?

Problem 2:

Standardize all the variables, then repeat the PCA. How do the results compare with the unstandardized version?

#Perform PCA (standardized).
pca_stand <- prcomp(svi_clean, scale. = TRUE)

#Compute eigenvalues from PCA (standardized).
ev2 <- (pca_stand$sdev)^2  
#1. Kaiser Method:
#Eigenvalues greater than 1
ev2[ev2 > 1]
## [1] 12.461755  5.805486  2.906425  2.574283  1.627996
#2. Scree Plot (Elbow method):
#Scree plot
#Create a dataframe for plotting
scree_svi <- data.frame(
  PC = paste0("PC", 1:length(ev2)),  
  Eigenvalue = ev2)

#Fix ordering by converting to a factor with correct levels
scree_svi$PC <- factor(scree_svi$PC, levels = paste0("PC", 1:length(ev2)))

#Plot the Scree Plot using ggplot2
ggplot(scree_svi, aes(x = PC, y = Eigenvalue)) +
  geom_line(aes(group = 1), color = "red", linetype = "dashed") +  
  geom_point(color = "red", size = 3) +  
  labs(title = "Scree Plot of PCA", x = "Principal Components", y = "Eigenvalues") +
  theme_minimal()

#3. Variance Explained:
#Print PCA summary (eigenvalues and explained variance)
summary(pca_stand)
## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5     PC6    PC7
## Standard deviation     3.5301 2.4095 1.70482 1.60446 1.27593 0.87756 0.8179
## Proportion of Variance 0.4154 0.1935 0.09688 0.08581 0.05427 0.02567 0.0223
## Cumulative Proportion  0.4154 0.6089 0.70579 0.79160 0.84586 0.87153 0.8938
##                            PC8     PC9    PC10    PC11   PC12    PC13    PC14
## Standard deviation     0.70127 0.65466 0.60143 0.57656 0.5423 0.50475 0.45971
## Proportion of Variance 0.01639 0.01429 0.01206 0.01108 0.0098 0.00849 0.00704
## Cumulative Proportion  0.91022 0.92451 0.93657 0.94765 0.9575 0.96594 0.97299
##                           PC15    PC16    PC17    PC18    PC19    PC20    PC21
## Standard deviation     0.38401 0.36995 0.29603 0.28287 0.26799 0.24038 0.22402
## Proportion of Variance 0.00492 0.00456 0.00292 0.00267 0.00239 0.00193 0.00167
## Cumulative Proportion  0.97790 0.98247 0.98539 0.98805 0.99045 0.99237 0.99405
##                          PC22    PC23    PC24    PC25    PC26   PC27    PC28
## Standard deviation     0.2053 0.17940 0.17406 0.14116 0.13642 0.1226 0.10182
## Proportion of Variance 0.0014 0.00107 0.00101 0.00066 0.00062 0.0005 0.00035
## Cumulative Proportion  0.9954 0.99652 0.99753 0.99820 0.99882 0.9993 0.99967
##                           PC29    PC30
## Standard deviation     0.08303 0.05614
## Proportion of Variance 0.00023 0.00011
## Cumulative Proportion  0.99989 1.00000
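
One way to see why the standardized results differ so sharply from the unstandardized ones is that a covariance-based PCA changes when a variable's units change, whereas a correlation-based PCA does not. The sketch below demonstrates this on synthetic data by re-expressing a dollar-scale column in thousands; the data frame and column names are hypothetical.

```r
set.seed(4)
toy <- data.frame(
  income = rnorm(70, mean = 17000, sd = 5000),  # dollar-scale variable
  pct_a  = rnorm(70, mean = 20, sd = 5),
  pct_b  = rnorm(70, mean = 10, sd = 3)
)
toy$pct_c <- 0.5 * toy$pct_a + rnorm(70, sd = 2)  # induce some correlation

# Same data with income expressed in thousands
toy_k <- transform(toy, income = income / 1000)

ev_cov   <- prcomp(toy)$sdev^2
ev_cov_k <- prcomp(toy_k)$sdev^2
ev_cor   <- prcomp(toy, scale. = TRUE)$sdev^2
ev_cor_k <- prcomp(toy_k, scale. = TRUE)$sdev^2

same_cov  <- isTRUE(all.equal(ev_cov, ev_cov_k))  # covariance PCA depends on units
same_cor  <- isTRUE(all.equal(ev_cor, ev_cor_k))  # correlation PCA does not
total_cor <- sum(ev_cor)                          # equals the number of variables

print(c(same_cov = same_cov, same_cor = same_cor))
print(total_cor)
```

In the standardized SVI analysis the eigenvalues sum to the number of variables (30), so the first component's share is 12.461755 / 30, or about 0.4154, matching the summary table.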
#Perform PCA with varimax rotation retaining 5 components.
pca_rotated <- principal(svi_clean, nfactors = 5, rotate = "varimax", scores = TRUE)
print(pca_rotated)
## Principal Components Analysis
## Call: principal(r = svi_clean, nfactors = 5, rotate = "varimax", scores = TRUE)
## Standardized loadings (pattern matrix) based upon correlation matrix
##            RC1   RC2   RC3   RC5   RC4   h2    u2 com
## A1_P      0.40  0.41  0.38  0.63 -0.10 0.88 0.119 3.3
## A2_P      0.58 -0.24  0.07  0.54 -0.35 0.82 0.183 3.1
## A3_P     -0.43 -0.03 -0.18 -0.80 -0.24 0.92 0.082 1.9
## A4_P      0.73  0.17  0.38  0.47  0.07 0.94 0.063 2.4
## B1_P     -0.15 -0.08  0.62 -0.59  0.25 0.83 0.170 2.5
## B2_P      0.42 -0.55  0.01  0.58  0.18 0.85 0.150 3.0
## B3_P      0.18 -0.26  0.84  0.01 -0.19 0.85 0.151 1.4
## B4_P      0.24 -0.24  0.42  0.76 -0.13 0.88 0.119 2.1
## C1_P      0.20  0.24  0.83  0.22  0.26 0.91 0.091 1.7
## C1_NEW_P -0.23  0.36 -0.73 -0.22 -0.34 0.89 0.114 2.4
## C2_P     -0.10  0.69  0.00 -0.25 -0.03 0.55 0.447 1.3
## D1_P     -0.37  0.62 -0.02 -0.24 -0.41 0.75 0.254 2.8
## D2_P      0.58 -0.05  0.07  0.06  0.56 0.67 0.333 2.1
## D3_P      0.60  0.21  0.32  0.58  0.25 0.91 0.093 3.2
## D4_P      0.19  0.54  0.29  0.42 -0.48 0.82 0.178 3.8
## H1_P      0.93  0.04 -0.03  0.10 -0.08 0.88 0.117 1.0
## H2_P      0.65 -0.31  0.07  0.09  0.30 0.62 0.380 2.0
## H3A_P     0.95 -0.05  0.10  0.21 -0.03 0.96 0.040 1.1
## H3B_P     0.94 -0.06  0.09  0.21  0.07 0.95 0.048 1.1
## H3C_P     0.95 -0.04  0.04  0.17 -0.02 0.94 0.058 1.1
## H4_P      0.86 -0.22  0.30  0.09  0.20 0.92 0.084 1.5
## H5_P      0.86 -0.09  0.31  0.22 -0.09 0.91 0.088 1.5
## H6_P      0.71  0.21  0.28  0.32 -0.28 0.82 0.185 2.3
## H7_P      0.01  0.78 -0.15  0.18 -0.18 0.70 0.304 1.3
## H8_P      0.57  0.28  0.54  0.42 -0.11 0.88 0.118 3.4
## H9_P      0.18  0.11  0.73  0.27 -0.17 0.68 0.320 1.6
## P1_P     -0.01  0.96 -0.07  0.06  0.12 0.95 0.047 1.1
## P2_P     -0.03  0.90 -0.06  0.00  0.37 0.96 0.041 1.3
## P3_P     -0.08  0.40  0.06 -0.03  0.79 0.79 0.213 1.5
## P4_P      0.03  0.92  0.16  0.13  0.26 0.96 0.036 1.3
## 
##                        RC1  RC2  RC3  RC5  RC4
## SS loadings           8.82 5.75 4.20 4.16 2.45
## Proportion Var        0.29 0.19 0.14 0.14 0.08
## Cumulative Var        0.29 0.49 0.63 0.76 0.85
## Proportion Explained  0.35 0.23 0.17 0.16 0.10
## Cumulative Proportion 0.35 0.57 0.74 0.90 1.00
## 
## Mean item complexity =  2
## Test of the hypothesis that 5 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.03 
##  with the empirical chi square  49.51  with prob <  1 
## 
## Fit based upon off diagonal values = 0.99
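
An orthogonal rotation such as varimax redistributes variance among the retained components without changing each variable's communality or the total variance explained. The sketch below shows this with `stats::varimax()` applied to unrotated loadings from a correlation matrix. It uses synthetic data and hypothetical names, and it is an illustration of the idea rather than a reproduction of what `principal()` does internally.

```r
set.seed(5)
# Two synthetic latent dimensions with three indicators each (hypothetical)
f1 <- rnorm(100)
f2 <- rnorm(100)
toy <- data.frame(
  a1 = f1 + rnorm(100, sd = 0.5),
  a2 = f1 + rnorm(100, sd = 0.5),
  a3 = f1 + rnorm(100, sd = 0.5),
  b1 = f2 + rnorm(100, sd = 0.5),
  b2 = f2 + rnorm(100, sd = 0.5),
  b3 = f2 + rnorm(100, sd = 0.5)
)

k   <- 2
eig <- eigen(cor(toy))

# Unrotated loadings: eigenvectors scaled by sqrt(eigenvalues)
L_unrot <- eig$vectors[, 1:k] %*% diag(sqrt(eig$values[1:k]))
rownames(L_unrot) <- names(toy)

rot   <- varimax(L_unrot)        # orthogonal varimax rotation
L_rot <- unclass(rot$loadings)

h2_unrot <- rowSums(L_unrot^2)   # communalities before rotation
h2_rot   <- rowSums(L_rot^2)     # communalities after rotation
ss_unrot <- colSums(L_unrot^2)   # SS loadings before rotation
ss_rot   <- colSums(L_rot^2)     # SS loadings after rotation

print(round(cbind(h2_unrot, h2_rot), 3))
print(round(rbind(ss_unrot, ss_rot), 3))
```

The same pattern appears in the output above: the rotated SS loadings (8.82, 5.75, 4.20, 4.16, 2.45) sum to about 25.38, which agrees with the sum of the first five unrotated eigenvalues (about 25.376).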

Problem 3:

The CDC’s Social Vulnerability Index (SVI) is structured into four themes:

• Socioeconomic Status = A1_P + A2_P + A3_P + A4_P

• Household Composition & Disability = B1_P + B2_P + B3_P + B4_P

• Minority Status & Language = C1_P + C2_P

• Housing Type & Transportation = D1_P + D2_P + D3_P + D4_P

Using this structure as a theoretical model, perform a Confirmatory Factor Analysis (CFA) on the variables (A1_P to D4_P without C1_NEW_P) to assess how well the data fits the CDC SVI framework.

#Keep only the 14 variables in the theoretical model (A1_P to D4_P, excluding C1_NEW_P); the CFA uses no other variables.
svi_cdc_C1 <- svi_clean %>% select(A1_P, A2_P, A3_P, A4_P, B1_P, B2_P, B3_P, B4_P, C1_P, C2_P, D1_P, D2_P, D3_P, D4_P)

#Perform PCA using the correlation matrix
pca_svi_cdc <- prcomp(svi_cdc_C1, scale. = TRUE)

#Compute eigenvalues from PCA
ev3 <- (pca_svi_cdc$sdev)^2  
#Print the eigenvalues
print(ev3)
##  [1] 6.18895658 2.36954486 1.86295968 1.23155445 0.62422264 0.43848260
##  [7] 0.36362128 0.29060188 0.24146283 0.11695753 0.09704565 0.07780270
## [13] 0.05286245 0.04392488
#1. Kaiser Method:
#Eigenvalues greater than 1
ev3[ev3 > 1]
## [1] 6.188957 2.369545 1.862960 1.231554
#2. Scree Plot (Elbow method):
#Scree plot
#Create a dataframe for plotting
scree_svi <- data.frame(
  PC = paste0("PC", 1:length(ev3)),  
  Eigenvalue = ev3)

# Fix ordering by converting to a factor with correct levels
scree_svi$PC <- factor(scree_svi$PC, levels = paste0("PC", 1:length(ev3)))

#Plot the Scree Plot using ggplot2
ggplot(scree_svi, aes(x = PC, y = Eigenvalue)) +
  geom_line(aes(group = 1), color = "red", linetype = "dashed") +  
  geom_point(color = "red", size = 3) +  
  labs(title = "Scree Plot of PCA", x = "Principal Components", y = "Eigenvalues") +
  theme_minimal()

#3. Variance Explained:
#Print PCA summary (eigenvalues and explained variance)
summary(pca_svi_cdc)
## Importance of components:
##                           PC1    PC2    PC3     PC4     PC5     PC6     PC7
## Standard deviation     2.4878 1.5393 1.3649 1.10975 0.79008 0.66218 0.60301
## Proportion of Variance 0.4421 0.1693 0.1331 0.08797 0.04459 0.03132 0.02597
## Cumulative Proportion  0.4421 0.6113 0.7444 0.83236 0.87695 0.90827 0.93424
##                            PC8     PC9    PC10    PC11    PC12    PC13    PC14
## Standard deviation     0.53908 0.49139 0.34199 0.31152 0.27893 0.22992 0.20958
## Proportion of Variance 0.02076 0.01725 0.00835 0.00693 0.00556 0.00378 0.00314
## Cumulative Proportion  0.95500 0.97224 0.98060 0.98753 0.99309 0.99686 1.00000
#Correlated four-factor solution, variance standardization method (std.lv = TRUE)

svi_cdc_model <- 'f1 =~ A1_P + A2_P + A3_P + A4_P
        f2 =~ B1_P + B2_P + B3_P + B4_P
        f3 =~ C1_P + C2_P
        f4 =~ D1_P + D2_P + D3_P + D4_P' 
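
Before fitting, it can help to count the free parameters the model specified above implies when every loading is estimated and the factor variances are fixed at 1 (which is what `std.lv = TRUE` requests). The arithmetic below is a hand count under that assumption, with hypothetical variable names:

```r
p <- 14                          # observed variables
k <- 4                           # latent factors

n_moments  <- p * (p + 1) / 2    # unique variances and covariances in the data
n_loadings <- p                  # one free loading per indicator
n_resid    <- p                  # one residual variance per indicator
n_fcov     <- choose(k, 2)       # covariances among the factors

n_par    <- n_loadings + n_resid + n_fcov
df_model <- n_moments - n_par

print(c(moments = n_moments, parameters = n_par, df = df_model))
```

These counts agree with the 34 model parameters and 71 degrees of freedom that lavaan reports in the summary. Note also that f3 has only two indicators, which can make a factor harder to estimate stably, especially in a small sample.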

fourfac14items_a <- cfa(svi_cdc_model, data=svi_cdc_C1, std.lv=TRUE) 
## Warning: lavaan->lav_data_full():  
##    some observed variances are (at least) a factor 1000 times larger than 
##    others; use varTable(fit) to investigate
## Warning: lavaan->lav_data_full():  
##    some observed variances are larger than 1000000 use varTable(fit) to 
##    investigate
## Warning: lavaan->lav_model_vcov():  
##    Could not compute standard errors! The information matrix could not be 
##    inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_model_vcov():  
##    Could not compute standard errors! The information matrix could not be 
##    inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_object_post_check():  
##    some estimated ov variances are negative
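
The first two warnings point to observed variances that differ by several orders of magnitude, which is what one would expect when a dollar-scale variable such as A3_P sits alongside percentages. One common remedy, offered here only as a sketch, is to re-express the large-scale variable in coarser units before fitting. This leaves the correlations untouched and may help the optimizer. The data and names below are synthetic and hypothetical.

```r
set.seed(6)
toy <- data.frame(
  income = rnorm(54, mean = 17000, sd = 5000),  # dollar-scale variable
  pct_a  = rnorm(54, mean = 20, sd = 5),
  pct_b  = rnorm(54, mean = 10, sd = 3)
)

vars_before  <- sapply(toy, var)
ratio_before <- max(vars_before) / min(vars_before)

# Express the dollar-scale variable in thousands
toy_rescaled <- toy
toy_rescaled$income <- toy_rescaled$income / 1000

vars_after  <- sapply(toy_rescaled, var)
ratio_after <- max(vars_after) / min(vars_after)

# Rescaling a column does not change the correlation structure
same_cor <- isTRUE(all.equal(cor(toy), cor(toy_rescaled)))

print(c(ratio_before = ratio_before, ratio_after = ratio_after))
print(same_cor)
```

In the analysis above, the analogous step would be to divide A3_P by 1000 in `svi_cdc_C1` before calling `cfa()`, then refit and check whether the warnings about the information matrix and negative variances persist.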
summary(fourfac14items_a,fit.measures=TRUE, standardized=TRUE)
## lavaan 0.6-19 ended normally after 246 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        34
## 
##   Number of observations                            54
## 
## Model Test User Model:
##                                                       
##   Test statistic                               444.778
##   Degrees of freedom                                71
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                               785.467
##   Degrees of freedom                                91
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.462
##   Tucker-Lewis Index (TLI)                       0.310
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -2286.359
##   Loglikelihood unrestricted model (H1)      -2063.970
##                                                       
##   Akaike (AIC)                                4640.718
##   Bayesian (BIC)                              4708.343
##   Sample-size adjusted Bayesian (SABIC)       4601.526
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.312
##   90 Percent confidence interval - lower         0.285
##   90 Percent confidence interval - upper         0.340
##   P-value H_0: RMSEA <= 0.050                    0.000
##   P-value H_0: RMSEA >= 0.080                    1.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.188
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate   Std.Err  z-value  P(>|z|)   Std.lv   Std.all
##   f1 =~                                                                   
##     A1_P              -4.902       NA                      -4.902   -0.695
##     A2_P              -1.842       NA                      -1.842   -0.626
##     A3_P            5229.526       NA                    5229.526    1.000
##     A4_P              -5.803       NA                      -5.803   -0.787
##   f2 =~                                                                   
##     B1_P              -0.497       NA                      -0.497   -0.233
##     B2_P               2.857       NA                       2.857    0.635
##     B3_P               1.060       NA                       1.060    0.489
##     B4_P               4.133       NA                       4.133    1.021
##   f3 =~                                                                   
##     C1_P              34.240       NA                      34.240    3.212
##     C2_P               0.024       NA                       0.024    0.044
##   f4 =~                                                                   
##     D1_P               8.118       NA                       8.118    0.411
##     D2_P              -0.288       NA                      -0.288   -0.489
##     D3_P              -8.380       NA                      -8.380   -0.876
##     D4_P              -1.234       NA                      -1.234   -0.386
## 
## Covariances:
##                    Estimate   Std.Err  z-value  P(>|z|)   Std.lv   Std.all
##   f1 ~~                                                                   
##     f2                -0.712       NA                      -0.712   -0.712
##     f3                -0.146       NA                      -0.146   -0.146
##     f4                 0.962       NA                       0.962    0.962
##   f2 ~~                                                                   
##     f3                 0.155       NA                       0.155    0.155
##     f4                -0.670       NA                      -0.670   -0.670
##   f3 ~~                                                                   
##     f4                -0.223       NA                      -0.223   -0.223
## 
## Variances:
##                    Estimate   Std.Err  z-value  P(>|z|)   Std.lv   Std.all
##    .A1_P              25.675       NA                      25.675    0.517
##    .A2_P               5.252       NA                       5.252    0.608
##    .A3_P            3051.233       NA                    3051.233    0.000
##    .A4_P              20.679       NA                      20.679    0.380
##    .B1_P               4.325       NA                       4.325    0.946
##    .B2_P              12.050       NA                      12.050    0.596
##    .B3_P               3.568       NA                       3.568    0.761
##    .B4_P              -0.689       NA                      -0.689   -0.042
##    .C1_P           -1058.763       NA                   -1058.763   -9.317
##    .C2_P               0.286       NA                       0.286    0.998
##    .D1_P             323.575       NA                     323.575    0.831
##    .D2_P               0.263       NA                       0.263    0.761
##    .D3_P              21.278       NA                      21.278    0.233
##    .D4_P               8.686       NA                       8.686    0.851
##     f1                 1.000                                1.000    1.000
##     f2                 1.000                                1.000    1.000
##     f3                 1.000                                1.000    1.000
##     f4                 1.000                                1.000    1.000
semPaths(fourfac14items_a, what="std", edge.label.cex = 1.2)

Problem 4:

Repeat problem 3 but replace C1_P with C1_NEW_P.

#Keeping only the 14 indicators in the theoretical model (with C1_NEW_P in place of C1_P); CFA is not performed on the remaining variables.
svi_cdc_C1_NEW <- svi_clean %>% select(A1_P, A2_P, A3_P, A4_P, B1_P, B2_P, B3_P, B4_P, C1_NEW_P, C2_P, D1_P, D2_P, D3_P, D4_P)

#Correlated four-factor solution, variance standardization method (std.lv=TRUE fixes each factor variance to 1)

svi_cdc_model <- 'f1 =~ A1_P + A2_P + A3_P + A4_P
        f2 =~ B1_P + B2_P + B3_P + B4_P
        f3 =~ C1_NEW_P + C2_P
        f4 =~ D1_P + D2_P + D3_P + D4_P' 

fourfac14items_b <- cfa(svi_cdc_model, data=svi_cdc_C1_NEW, std.lv=TRUE)
## Warning: lavaan->lav_data_full():  
##    some observed variances are (at least) a factor 1000 times larger than 
##    others; use varTable(fit) to investigate
## Warning: lavaan->lav_data_full():  
##    some observed variances are larger than 1000000 use varTable(fit) to 
##    investigate
## Warning: lavaan->lav_model_vcov():  
##    Could not compute standard errors! The information matrix could not be 
##    inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_model_vcov():  
##    Could not compute standard errors! The information matrix could not be 
##    inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_object_post_check():  
##    covariance matrix of latent variables is not positive definite ; use 
##    lavInspect(fit, "cov.lv") to investigate.
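The first two warnings above concern the scale of the observed variables. Judging from the values printed by head(svi), A3_P (in the tens of thousands) is the likely source. The following is a minimal base-R sketch, not part of the original analysis, of how rescaling such a variable changes the variance ratio. The data frame `toy` and the column name `A3_P_K` are hypothetical; the numbers are the six rows shown by head(svi).

```r
# Hypothetical illustration using only the six rows printed by head(svi).
toy <- data.frame(
  A1_P = c(19.46, 15.83, 18.24, 18.96, 21.98, 8.82),
  A3_P = c(16931, 19894, 19300, 16551, 14534, 16149)
)

# Ratio of sample variances before rescaling
ratio_before <- var(toy$A3_P) / var(toy$A1_P)

# Express A3_P in thousands (A3_P_K is a made-up column name)
toy$A3_P_K <- toy$A3_P / 1000

# Ratio of sample variances after rescaling
ratio_after <- var(toy$A3_P_K) / var(toy$A1_P)

print(c(before = ratio_before, after = ratio_after))
```

Dividing by 1000 shrinks the variance by a factor of 1000^2. Whether this also helps with the standard-error warnings would have to be checked by refitting the model on the rescaled data.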
summary(fourfac14items_b,fit.measures=TRUE, standardized=TRUE)
## lavaan 0.6-19 ended normally after 79 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        34
## 
##   Number of observations                            54
## 
## Model Test User Model:
##                                                       
##   Test statistic                               361.158
##   Degrees of freedom                                71
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                               780.138
##   Degrees of freedom                                91
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.579
##   Tucker-Lewis Index (TLI)                       0.460
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -2267.617
##   Loglikelihood unrestricted model (H1)      -2087.038
##                                                       
##   Akaike (AIC)                                4603.234
##   Bayesian (BIC)                              4670.860
##   Sample-size adjusted Bayesian (SABIC)       4564.042
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.275
##   90 Percent confidence interval - lower         0.247
##   90 Percent confidence interval - upper         0.304
##   P-value H_0: RMSEA <= 0.050                    0.000
##   P-value H_0: RMSEA >= 0.080                    1.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.179
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate     Std.Err  z-value  P(>|z|)   Std.lv     Std.all
##   f1 =~                                                                       
##     A1_P                 5.876       NA                         5.876    0.833
##     A2_P                 2.021       NA                         2.021    0.687
##     A3_P             -4601.424       NA                     -4601.424   -0.880
##     A4_P                 6.691       NA                         6.691    0.907
##   f2 =~                                                                       
##     B1_P                 0.119       NA                         0.119    0.056
##     B2_P                 3.081       NA                         3.081    0.685
##     B3_P                 1.314       NA                         1.314    0.607
##     B4_P                 3.165       NA                         3.165    0.782
##   f3 =~                                                                       
##     C1_NEW_P            12.839       NA                        12.839    0.834
##     C2_P                 0.232       NA                         0.232    0.434
##   f4 =~                                                                       
##     D1_P                 7.223       NA                         7.223    0.366
##     D2_P                -0.169       NA                        -0.169   -0.288
##     D3_P                -7.189       NA                        -7.189   -0.751
##     D4_P                -1.692       NA                        -1.692   -0.529
## 
## Covariances:
##                    Estimate     Std.Err  z-value  P(>|z|)   Std.lv     Std.all
##   f1 ~~                                                                       
##     f2                   0.881       NA                         0.881    0.881
##     f3                  -0.562       NA                        -0.562   -0.562
##     f4                  -1.221       NA                        -1.221   -1.221
##   f2 ~~                                                                       
##     f3                  -0.949       NA                        -0.949   -0.949
##     f4                  -0.920       NA                        -0.920   -0.920
##   f3 ~~                                                                       
##     f4                   0.738       NA                         0.738    0.738
## 
## Variances:
##                    Estimate     Std.Err  z-value  P(>|z|)   Std.lv     Std.all
##    .A1_P                15.189       NA                        15.189    0.306
##    .A2_P                 4.561       NA                         4.561    0.528
##    .A3_P           6190738.785       NA                   6190738.785    0.226
##    .A4_P                 9.600       NA                         9.600    0.177
##    .B1_P                 4.558       NA                         4.558    0.997
##    .B2_P                10.724       NA                        10.724    0.530
##    .B3_P                 2.966       NA                         2.966    0.632
##    .B4_P                 6.378       NA                         6.378    0.389
##    .C1_NEW_P            72.415       NA                        72.415    0.305
##    .C2_P                 0.233       NA                         0.233    0.812
##    .D1_P               336.696       NA                       336.696    0.866
##    .D2_P                 0.317       NA                         0.317    0.917
##    .D3_P                39.856       NA                        39.856    0.435
##    .D4_P                 7.348       NA                         7.348    0.720
##     f1                   1.000                                  1.000    1.000
##     f2                   1.000                                  1.000    1.000
##     f3                   1.000                                  1.000    1.000
##     f4                   1.000                                  1.000    1.000
semPaths(fourfac14items_b, what="std", edge.label.cex = 1.2)
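The warning that the covariance matrix of the latent variables is not positive definite can be checked by hand from the Covariances table above. The sketch below is an illustration in base R only: it retypes the six printed latent correlations for fourfac14items_b into a matrix (the name `phi` is an assumption) and inspects its eigenvalues and any entries outside [-1, 1].

```r
# Latent correlations copied from the Std.all column of the Covariances
# table for fourfac14items_b (factor variances are fixed to 1 by std.lv=TRUE).
f <- c("f1", "f2", "f3", "f4")
phi <- diag(4)
dimnames(phi) <- list(f, f)

phi["f1", "f2"] <-  0.881
phi["f1", "f3"] <- -0.562
phi["f1", "f4"] <- -1.221
phi["f2", "f3"] <- -0.949
phi["f2", "f4"] <- -0.920
phi["f3", "f4"] <-  0.738

# Mirror the upper triangle into the lower triangle
phi[lower.tri(phi)] <- t(phi)[lower.tri(phi)]

# A positive definite matrix has only strictly positive eigenvalues
eig <- eigen(phi, symmetric = TRUE, only.values = TRUE)$values

# Factor pairs whose estimated correlation lies outside [-1, 1]
out_of_range <- which(abs(phi) > 1 & upper.tri(phi), arr.ind = TRUE)

print(eig)
print(out_of_range)
```

With the fitted object available, lavInspect(fourfac14items_b, "cov.lv") returns the same matrix directly, as the warning message suggests.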

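#Chi-square difference test (valid for nested models only; these two models use different indicators and have the same df, so the result below is not interpretable)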
anova(fourfac14items_a, fourfac14items_b)
## Warning: lavaan->lavTestLRT():  
##    some models are based on a different set of observed variables
## Warning: lavaan->lavTestLRT():  
##    Some restricted models fit better than less restricted models; either 
##    these models are not nested, or the less restricted model failed to reach 
##    a global optimum.Smallest difference = -83.61966233577.
## Warning: lavaan->lavTestLRT():  
##    some models have the same degrees of freedom
## 
## Chi-Squared Difference Test
## 
##                  Df    AIC    BIC  Chisq Chisq diff RMSEA Df diff Pr(>Chisq)
## fourfac14items_a 71 4640.7 4708.3 444.78                                    
## fourfac14items_b 71 4603.2 4670.9 361.16     -83.62     0       0
#Selected fit indices for each model
fitMeasures(fourfac14items_a, c("cfi", "tli", "rmsea", "srmr", "aic", "bic"))
##      cfi      tli    rmsea     srmr      aic      bic 
##    0.462    0.310    0.312    0.188 4640.718 4708.343
fitMeasures(fourfac14items_b, c("cfi", "tli", "rmsea", "srmr", "aic", "bic"))
##      cfi      tli    rmsea     srmr      aic      bic 
##    0.579    0.460    0.275    0.179 4603.234 4670.860
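As a cross-check on the fit indices reported for fourfac14items_b, the sketch below recomputes them in base R from the quantities printed in the summary (sample size, number of parameters, log-likelihood, and the user and baseline chi-square statistics). The variable names are illustrative, and the formulas are the standard textbook definitions; the RMSEA version with N (rather than N - 1) in the denominator is the one that reproduces the printed value here.

```r
# Quantities copied from summary(fourfac14items_b, ...)
n      <- 54         # number of observations
k      <- 34         # number of model parameters
loglik <- -2267.617  # loglikelihood of the user model (H0)
chisq  <- 361.158    # user model test statistic
df     <- 71         # user model degrees of freedom
chisq0 <- 780.138    # baseline model test statistic
df0    <- 91         # baseline model degrees of freedom

# Information criteria
aic <- -2 * loglik + 2 * k
bic <- -2 * loglik + k * log(n)

# Incremental fit indices relative to the baseline model
cfi <- 1 - max(chisq - df, 0) / max(chisq - df, chisq0 - df0, 0)
tli <- (chisq0 / df0 - chisq / df) / (chisq0 / df0 - 1)

# RMSEA point estimate
rmsea <- sqrt(max(chisq - df, 0) / (df * n))

print(round(c(cfi = cfi, tli = tli, rmsea = rmsea, aic = aic, bic = bic), 3))
```

The results should agree with the fitMeasures() output above up to rounding of the inputs, which were copied at three decimal places.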

• Are there any indicators (variables) that load poorly onto their expected factors?