The data file SVI GU-2010 contains variables derived from the 2010 U.S. Census. A detailed description of these variables can be found in the SVI Documentation file.
#Loading libraries.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(psych)
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(readr)
library(lavaan)
## This is lavaan 0.6-19
## lavaan is FREE software! Please report any bugs.
##
## Attaching package: 'lavaan'
## The following object is masked from 'package:psych':
##
## cor2cov
library(semPlot)
#Loading the data.
svi <- read.csv('/Users/michaelcajigal/Desktop/MA564/Homework 3/SVI GU-2010.csv')
head(svi)
## VILLAGE_NAME CDP_NAME CDP_POP CIV_POP POP_25OVER POP_5OVER
## 1 Mangilao Adacao CDP 4,184 1,951 2384 3,840
## 2 Sinajana Afame CDP 758 369 437 667
## 3 Agana Heights Agana Heights CDP 3,718 1,724 2181 3,426
## 4 Agat Agat CDP 3,677 1,568 2171 3,354
## 5 Yigo Anao CDP 1,952 850 1074 1,777
## 6 Yigo Andersen AFB CDP 3,061 530 1343 2,633
## HOUSE_UNITS HOUSEHOLDS CIV_NONINST A1_P A2_P A3_P A4_P B1_P B2_P B3_P
## 1 1,203 1,079 4,123 19.46 6.05 16931 18.88 6.79 32.22 6.69
## 2 311 224 731 15.83 10.03 19894 18.76 8.44 30.87 11.48
## 3 1,236 1,077 3,668 18.24 7.08 19300 17.24 8.42 31.33 9.84
## 4 1,197 974 3,617 18.96 11.48 16551 21.42 10.23 30.43 12.59
## 5 546 476 1,919 21.98 7.41 14534 24.21 6.51 35.09 5.74
## 6 1,079 812 2,181 8.82 11.13 16149 7.00 0.72 38.61 1.76
## B4_P C1_P C1_NEW_P C2_P D1_P D2_P D3_P D4_P D5_P H1_P H2_P H3A_P H3B_P
## 1 11.58 95.41 30.88 0.26 0.50 1.66 23.63 3.71 0.36 1.16 52.20 10.81 16.96
## 2 14.29 93.80 31.00 0.00 36.01 0.64 21.43 6.70 0.00 0.00 29.26 3.54 6.11
## 3 15.04 94.16 29.18 0.15 14.97 0.49 16.90 6.22 0.40 0.24 12.70 2.67 5.66
## 4 14.17 97.47 20.40 0.06 8.94 1.84 23.61 7.39 2.80 0.50 6.93 7.44 15.46
## 5 15.55 96.36 36.32 0.56 0.18 1.10 30.46 3.99 1.59 1.83 82.23 14.10 22.16
## 6 10.10 47.11 85.56 0.11 1.02 1.02 5.54 2.96 8.62 0.00 5.28 1.76 8.53
## H3C_P H4_P H5_P H6_P H7_P H8_P H9_P P1_P P2_P P3_P P4_P
## 1 1.75 14.88 9.64 4.45 21.96 22.52 6.49 16.40 36.35 34.94 20.57
## 2 0.96 7.72 10.61 4.91 30.36 17.41 8.04 10.16 15.70 37.05 18.74
## 3 0.73 10.19 4.94 3.71 19.78 24.14 7.15 9.87 15.06 25.44 13.47
## 4 2.09 14.12 13.03 3.49 20.53 29.98 8.01 11.04 22.19 34.29 15.87
## 5 4.03 15.75 16.85 4.83 20.80 28.15 5.67 22.75 37.04 38.87 23.29
## 6 1.20 1.30 0.37 0.86 25.74 2.59 1.23 5.68 10.81 21.67 5.18
summary(svi)
## VILLAGE_NAME CDP_NAME CDP_POP CIV_POP
## Length:57 Length:57 Length:57 Length:57
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## POP_25OVER POP_5OVER HOUSE_UNITS HOUSEHOLDS
## Min. : 19 Length:57 Length:57 Length:57
## 1st Qu.: 627 Class :character Class :character Class :character
## Median :1441 Mode :character Mode :character Mode :character
## Mean :1484
## 3rd Qu.:2042
## Max. :3705
##
## CIV_NONINST A1_P A2_P A3_P
## Length:57 Min. : 5.56 Min. : 0.000 Min. : 8970
## Class :character 1st Qu.:18.15 1st Qu.: 6.310 1st Qu.:13972
## Mode :character Median :21.63 Median : 8.120 Median :16338
## Mean :21.89 Mean : 8.475 Mean :17468
## 3rd Qu.:25.96 3rd Qu.: 9.620 3rd Qu.:19515
## Max. :37.94 Max. :18.580 Max. :34296
##
## A4_P B1_P B2_P B3_P
## Min. : 4.07 Min. : 0.000 Min. :10.00 Min. : 1.110
## 1st Qu.:16.67 1st Qu.: 5.410 1st Qu.:30.41 1st Qu.: 6.670
## Median :20.40 Median : 6.580 Median :32.94 Median : 7.390
## Mean :20.39 Mean : 6.503 Mean :32.23 Mean : 7.638
## 3rd Qu.:23.84 3rd Qu.: 7.960 3rd Qu.:35.58 3rd Qu.: 9.230
## Max. :47.99 Max. :11.250 Max. :40.94 Max. :12.590
##
## B4_P C1_P C1_NEW_P C2_P
## Min. : 0.00 Min. :40.11 Min. :14.95 Min. :0.0000
## 1st Qu.:11.95 1st Qu.:92.14 1st Qu.:27.81 1st Qu.:0.0900
## Median :15.33 Median :94.93 Median :34.12 Median :0.2550
## Mean :14.81 Mean :90.80 Mean :38.47 Mean :0.4631
## 3rd Qu.:18.00 3rd Qu.:97.13 3rd Qu.:43.51 3rd Qu.:0.6200
## Max. :24.31 Max. :98.65 Max. :88.99 Max. :2.0600
## NA's :3
## D1_P D2_P D3_P D4_P
## Min. : 0.00 Min. :0.0000 Min. : 2.82 Min. : 0.000
## 1st Qu.: 0.27 1st Qu.:0.2800 1st Qu.:16.55 1st Qu.: 4.100
## Median : 1.69 Median :0.7300 Median :22.73 Median : 6.750
## Mean :13.00 Mean :0.8086 Mean :22.53 Mean : 7.473
## 3rd Qu.:21.71 3rd Qu.:1.2100 3rd Qu.:30.46 3rd Qu.: 8.150
## Max. :76.29 Max. :2.1300 Max. :45.10 Max. :52.630
##
## D5_P H1_P H2_P H3A_P
## Min. : 0.00 Min. : 0.0000 Min. : 1.86 Min. : 0.000
## 1st Qu.: 0.12 1st Qu.: 0.0000 1st Qu.: 6.93 1st Qu.: 1.760
## Median : 0.68 Median : 0.2600 Median :27.35 Median : 5.180
## Mean : 4.84 Mean : 0.8351 Mean :31.17 Mean : 7.679
## 3rd Qu.: 3.04 3rd Qu.: 0.7100 3rd Qu.:52.02 3rd Qu.: 9.290
## Max. :47.78 Max. :10.3600 Max. :93.44 Max. :49.290
##
## H3B_P H3C_P H4_P H5_P
## Min. : 0.00 Min. : 0.000 Min. : 0.00 Min. : 0.37
## 1st Qu.: 4.46 1st Qu.: 0.570 1st Qu.: 5.96 1st Qu.: 5.00
## Median : 9.56 Median : 1.350 Median :12.13 Median : 9.47
## Mean :13.15 Mean : 2.048 Mean :12.08 Mean :11.10
## 3rd Qu.:16.96 3rd Qu.: 2.500 3rd Qu.:14.88 3rd Qu.:13.23
## Max. :65.36 Max. :12.860 Max. :38.13 Max. :44.64
##
## H6_P H7_P H8_P H9_P
## Min. : 0.000 Min. : 0.00 Min. : 1.30 Min. : 1.230
## 1st Qu.: 3.010 1st Qu.:19.78 1st Qu.:19.42 1st Qu.: 5.670
## Median : 3.710 Median :24.10 Median :24.55 Median : 6.730
## Mean : 4.353 Mean :23.71 Mean :24.75 Mean : 6.913
## 3rd Qu.: 5.000 3rd Qu.:29.19 3rd Qu.:30.07 3rd Qu.: 8.380
## Max. :11.760 Max. :39.12 Max. :50.20 Max. :15.790
##
## P1_P P2_P P3_P P4_P
## Min. : 2.45 Min. : 5.23 Min. :15.15 Min. : 4.65
## 1st Qu.: 7.77 1st Qu.:13.60 1st Qu.:24.11 1st Qu.:13.86
## Median :17.32 Median :26.72 Median :29.63 Median :19.88
## Mean :16.90 Mean :27.62 Mean :29.89 Mean :19.80
## 3rd Qu.:21.70 3rd Qu.:38.60 3rd Qu.:36.21 3rd Qu.:24.58
## Max. :64.46 Max. :72.48 Max. :44.40 Max. :43.35
##
#Checking for missing data, since missing values affect how the PCA is run and interpreted.
colSums(is.na(svi))
## VILLAGE_NAME CDP_NAME CDP_POP CIV_POP POP_25OVER POP_5OVER
## 0 0 0 0 0 0
## HOUSE_UNITS HOUSEHOLDS CIV_NONINST A1_P A2_P A3_P
## 0 0 0 0 0 0
## A4_P B1_P B2_P B3_P B4_P C1_P
## 0 0 0 0 0 0
## C1_NEW_P C2_P D1_P D2_P D3_P D4_P
## 0 3 0 0 0 0
## D5_P H1_P H2_P H3A_P H3B_P H3C_P
## 0 0 0 0 0 0
## H4_P H5_P H6_P H7_P H8_P H9_P
## 0 0 0 0 0 0
## P1_P P2_P P3_P P4_P
## 0 0 0 0
Note. C2_P has 3 missing values. Because of these, the PCA/FCA was performed using listwise deletion, which reduces the sample size from 57 to 54 observations. Interpretations should acknowledge the potential bias from the excluded observations.
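As a minimal sketch of what listwise deletion does to the sample size, the toy data frame below (hypothetical values, not the SVI data) has one missing entry. The names `toy`, `na_counts`, and `toy_complete` are introduced here for illustration only.

```r
# Toy data frame (hypothetical values) with one missing entry in column b
toy <- data.frame(a = c(1.2, 3.4, 5.6, 7.8),
                  b = c(0.5, NA, 0.7, 0.9))

# Count missing values per column, in the same way as colSums(is.na(svi))
na_counts <- colSums(is.na(toy))

# Listwise deletion: keep only rows with no missing value in any column
toy_complete <- toy[complete.cases(toy), ]

n_before <- nrow(toy)
n_after  <- nrow(toy_complete)
print(c(before = n_before, after = n_after))
```

A single missing cell removes the entire row, which is why three missing values in one column cost three full observations.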
a. Perform a Principal Component Analysis (PCA) with varimax rotation on the variables A1_P to P4_P.
#Need to extract numeric columns only for PCA on variables A1_P to P4_P.
svi_numeric <- svi %>% select(where(is.numeric))
head(svi_numeric)
## POP_25OVER A1_P A2_P A3_P A4_P B1_P B2_P B3_P B4_P C1_P C1_NEW_P
## 1 2384 19.46 6.05 16931 18.88 6.79 32.22 6.69 11.58 95.41 30.88
## 2 437 15.83 10.03 19894 18.76 8.44 30.87 11.48 14.29 93.80 31.00
## 3 2181 18.24 7.08 19300 17.24 8.42 31.33 9.84 15.04 94.16 29.18
## 4 2171 18.96 11.48 16551 21.42 10.23 30.43 12.59 14.17 97.47 20.40
## 5 1074 21.98 7.41 14534 24.21 6.51 35.09 5.74 15.55 96.36 36.32
## 6 1343 8.82 11.13 16149 7.00 0.72 38.61 1.76 10.10 47.11 85.56
## C2_P D1_P D2_P D3_P D4_P D5_P H1_P H2_P H3A_P H3B_P H3C_P H4_P H5_P H6_P
## 1 0.26 0.50 1.66 23.63 3.71 0.36 1.16 52.20 10.81 16.96 1.75 14.88 9.64 4.45
## 2 0.00 36.01 0.64 21.43 6.70 0.00 0.00 29.26 3.54 6.11 0.96 7.72 10.61 4.91
## 3 0.15 14.97 0.49 16.90 6.22 0.40 0.24 12.70 2.67 5.66 0.73 10.19 4.94 3.71
## 4 0.06 8.94 1.84 23.61 7.39 2.80 0.50 6.93 7.44 15.46 2.09 14.12 13.03 3.49
## 5 0.56 0.18 1.10 30.46 3.99 1.59 1.83 82.23 14.10 22.16 4.03 15.75 16.85 4.83
## 6 0.11 1.02 1.02 5.54 2.96 8.62 0.00 5.28 1.76 8.53 1.20 1.30 0.37 0.86
## H7_P H8_P H9_P P1_P P2_P P3_P P4_P
## 1 21.96 22.52 6.49 16.40 36.35 34.94 20.57
## 2 30.36 17.41 8.04 10.16 15.70 37.05 18.74
## 3 19.78 24.14 7.15 9.87 15.06 25.44 13.47
## 4 20.53 29.98 8.01 11.04 22.19 34.29 15.87
## 5 20.80 28.15 5.67 22.75 37.04 38.87 23.29
## 6 25.74 2.59 1.23 5.68 10.81 21.67 5.18
summary(svi_numeric)
## POP_25OVER A1_P A2_P A3_P
## Min. : 19 Min. : 5.56 Min. : 0.000 Min. : 8970
## 1st Qu.: 627 1st Qu.:18.15 1st Qu.: 6.310 1st Qu.:13972
## Median :1441 Median :21.63 Median : 8.120 Median :16338
## Mean :1484 Mean :21.89 Mean : 8.475 Mean :17468
## 3rd Qu.:2042 3rd Qu.:25.96 3rd Qu.: 9.620 3rd Qu.:19515
## Max. :3705 Max. :37.94 Max. :18.580 Max. :34296
##
## A4_P B1_P B2_P B3_P
## Min. : 4.07 Min. : 0.000 Min. :10.00 Min. : 1.110
## 1st Qu.:16.67 1st Qu.: 5.410 1st Qu.:30.41 1st Qu.: 6.670
## Median :20.40 Median : 6.580 Median :32.94 Median : 7.390
## Mean :20.39 Mean : 6.503 Mean :32.23 Mean : 7.638
## 3rd Qu.:23.84 3rd Qu.: 7.960 3rd Qu.:35.58 3rd Qu.: 9.230
## Max. :47.99 Max. :11.250 Max. :40.94 Max. :12.590
##
## B4_P C1_P C1_NEW_P C2_P
## Min. : 0.00 Min. :40.11 Min. :14.95 Min. :0.0000
## 1st Qu.:11.95 1st Qu.:92.14 1st Qu.:27.81 1st Qu.:0.0900
## Median :15.33 Median :94.93 Median :34.12 Median :0.2550
## Mean :14.81 Mean :90.80 Mean :38.47 Mean :0.4631
## 3rd Qu.:18.00 3rd Qu.:97.13 3rd Qu.:43.51 3rd Qu.:0.6200
## Max. :24.31 Max. :98.65 Max. :88.99 Max. :2.0600
## NA's :3
## D1_P D2_P D3_P D4_P
## Min. : 0.00 Min. :0.0000 Min. : 2.82 Min. : 0.000
## 1st Qu.: 0.27 1st Qu.:0.2800 1st Qu.:16.55 1st Qu.: 4.100
## Median : 1.69 Median :0.7300 Median :22.73 Median : 6.750
## Mean :13.00 Mean :0.8086 Mean :22.53 Mean : 7.473
## 3rd Qu.:21.71 3rd Qu.:1.2100 3rd Qu.:30.46 3rd Qu.: 8.150
## Max. :76.29 Max. :2.1300 Max. :45.10 Max. :52.630
##
## D5_P H1_P H2_P H3A_P
## Min. : 0.00 Min. : 0.0000 Min. : 1.86 Min. : 0.000
## 1st Qu.: 0.12 1st Qu.: 0.0000 1st Qu.: 6.93 1st Qu.: 1.760
## Median : 0.68 Median : 0.2600 Median :27.35 Median : 5.180
## Mean : 4.84 Mean : 0.8351 Mean :31.17 Mean : 7.679
## 3rd Qu.: 3.04 3rd Qu.: 0.7100 3rd Qu.:52.02 3rd Qu.: 9.290
## Max. :47.78 Max. :10.3600 Max. :93.44 Max. :49.290
##
## H3B_P H3C_P H4_P H5_P
## Min. : 0.00 Min. : 0.000 Min. : 0.00 Min. : 0.37
## 1st Qu.: 4.46 1st Qu.: 0.570 1st Qu.: 5.96 1st Qu.: 5.00
## Median : 9.56 Median : 1.350 Median :12.13 Median : 9.47
## Mean :13.15 Mean : 2.048 Mean :12.08 Mean :11.10
## 3rd Qu.:16.96 3rd Qu.: 2.500 3rd Qu.:14.88 3rd Qu.:13.23
## Max. :65.36 Max. :12.860 Max. :38.13 Max. :44.64
##
## H6_P H7_P H8_P H9_P
## Min. : 0.000 Min. : 0.00 Min. : 1.30 Min. : 1.230
## 1st Qu.: 3.010 1st Qu.:19.78 1st Qu.:19.42 1st Qu.: 5.670
## Median : 3.710 Median :24.10 Median :24.55 Median : 6.730
## Mean : 4.353 Mean :23.71 Mean :24.75 Mean : 6.913
## 3rd Qu.: 5.000 3rd Qu.:29.19 3rd Qu.:30.07 3rd Qu.: 8.380
## Max. :11.760 Max. :39.12 Max. :50.20 Max. :15.790
##
## P1_P P2_P P3_P P4_P
## Min. : 2.45 Min. : 5.23 Min. :15.15 Min. : 4.65
## 1st Qu.: 7.77 1st Qu.:13.60 1st Qu.:24.11 1st Qu.:13.86
## Median :17.32 Median :26.72 Median :29.63 Median :19.88
## Mean :16.90 Mean :27.62 Mean :29.89 Mean :19.80
## 3rd Qu.:21.70 3rd Qu.:38.60 3rd Qu.:36.21 3rd Qu.:24.58
## Max. :64.46 Max. :72.48 Max. :44.40 Max. :43.35
##
#Removing POP_25OVER, which is a population count and not one of the A1_P to P4_P variables.
#Note: this call also drops D5_P, so the PCA below uses 30 variables.
svi_num <- svi_numeric %>% select(-POP_25OVER, -D5_P)
#Listwise deletion: Keep only complete rows (no NAs)
svi_clean <- svi_num[complete.cases(svi_num), ]
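One way to make the variable selection explicit is to pick columns by position between two names instead of filtering on type. The sketch below uses a hypothetical toy data frame whose column names only mimic the layout of the SVI file; `first_col`, `last_col`, `toy_vars`, and `toy_clean` are illustrative names.

```r
# Toy data frame (hypothetical columns): an ID column, a count column,
# then the block of analysis variables
toy <- data.frame(NAME = c("x", "y", "z"),
                  POP  = c(10, 20, 30),
                  A1_P = c(1.0, 2.0, 3.0),
                  B1_P = c(4.0, NA, 6.0),
                  P4_P = c(7.0, 8.0, 9.0))

# Positions of the first and last analysis variables
first_col <- match("A1_P", names(toy))
last_col  <- match("P4_P", names(toy))

# Keep every column from A1_P through P4_P, then apply listwise deletion
toy_vars  <- toy[, first_col:last_col]
toy_clean <- toy_vars[complete.cases(toy_vars), ]

print(names(toy_vars))
print(dim(toy_clean))
```

Selecting by name range keeps every variable in the block, so any column left out afterwards has to be dropped in a separate, visible step.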
#Compute the covariance matrix.
cov_svi <- cov(svi_clean)
print(cov_svi)
## A1_P A2_P A3_P A4_P B1_P
## A1_P 5.065629e+01 1.063180e+01 -26128.0398 4.192061e+01 -3.43255849
## A2_P 1.063180e+01 8.809417e+00 -9816.8146 1.466883e+01 -2.50685472
## A3_P -2.612804e+04 -9.816815e+03 27880188.0657 -3.092975e+04 4074.50792453
## A4_P 4.192061e+01 1.466883e+01 -30929.7488 5.539837e+01 -1.72805472
## B1_P -3.432558e+00 -2.506855e+00 4074.5079 -1.728055e+00 4.65871321
## B2_P 1.076905e+01 8.257583e+00 -14917.6977 1.734929e+01 -2.55725094
## B3_P 4.238662e+00 2.637835e+00 -2431.5978 6.576044e+00 2.14509245
## B4_P 1.849997e+01 7.862490e+00 -15815.1071 1.967373e+01 -1.99226415
## C1_P 4.742181e+01 3.638733e+00 -24991.3939 4.859398e+01 8.57250189
## C1_NEW_P -3.137257e+01 -1.205021e+01 39028.5371 -5.791651e+01 -12.66355849
## C2_P 3.626387e-01 -5.516303e-01 599.6663 -3.016268e-01 0.01535094
## D1_P -3.997941e+00 -2.177523e+01 50763.4982 -5.110366e+01 -0.68656038
## D2_P 9.808066e-01 4.164093e-01 -1476.6444 2.218507e+00 0.08587358
## D3_P 5.313013e+01 1.514244e+01 -42275.1022 6.282416e+01 -5.05245660
## D4_P 1.652533e+01 4.017010e+00 -6615.5639 1.250066e+01 -1.26758868
## H1_P 5.595557e+00 3.424814e+00 -4434.3938 1.019967e+01 -0.84123019
## H2_P 3.927664e+01 2.523316e+01 -52652.4138 1.012381e+02 -5.85781887
## H3A_P 3.566779e+01 1.912095e+01 -27892.8401 5.528167e+01 -3.55532075
## H3B_P 4.679148e+01 2.494628e+01 -40652.7180 7.611261e+01 -4.20950566
## H3C_P 8.345731e+00 4.659195e+00 -6961.1540 1.396031e+01 -1.11983396
## H4_P 2.328295e+01 1.352923e+01 -25081.4711 4.636701e+01 0.73100566
## H5_P 3.291549e+01 1.727469e+01 -27559.6125 5.383710e+01 -2.23443962
## H6_P 1.198075e+01 4.291952e+00 -6170.3282 1.337447e+01 -1.05274717
## H7_P 1.362357e+01 -8.727526e-02 -4062.6795 9.006687e+00 -3.50354528
## H8_P 6.074150e+01 1.641751e+01 -36893.1038 6.300685e+01 -1.00387170
## H9_P 8.429944e+00 1.692624e+00 -4539.1119 8.454221e+00 0.67770189
## P1_P 3.222465e+01 -8.388021e+00 -4598.7041 1.302488e+01 -3.28612453
## P2_P 3.396561e+01 -1.674442e+01 -7325.4001 1.576086e+01 0.88498302
## P3_P 8.234755e-01 -7.377804e+00 -5723.9795 4.614531e+00 4.20072264
## P4_P 2.678682e+01 -4.737138e+00 -8641.0139 1.659495e+01 -0.28850000
## B2_P B3_P B4_P C1_P C1_NEW_P
## A1_P 10.769055 4.2386623 1.849997e+01 4.742181e+01 -31.3725726
## A2_P 8.257583 2.6378352 7.862490e+00 3.638733e+00 -12.0502053
## A3_P -14917.697743 -2431.5978407 -1.581511e+04 -2.499139e+04 39028.5370650
## A4_P 17.349293 6.5760444 1.967373e+01 4.859398e+01 -57.9165122
## B1_P -2.557251 2.1450925 -1.992264e+00 8.572502e+00 -12.6635585
## B2_P 20.596909 1.7168432 1.210744e+01 8.534765e+00 -33.2409662
## B3_P 1.716843 4.7797950 4.357610e+00 1.387777e+01 -22.5695050
## B4_P 12.107443 4.3576105 1.670760e+01 2.065626e+01 -33.9791395
## C1_P 8.534765 13.8777679 2.065626e+01 1.135332e+02 -125.2223708
## C1_NEW_P -33.240966 -22.5695050 -3.397914e+01 -1.252224e+02 241.7171110
## C2_P -1.459437 -0.2400254 -6.809239e-01 3.814324e-01 3.0394360
## D1_P -56.395780 -6.4531901 -2.884046e+01 -9.202507e+00 158.5892466
## D2_P 1.088362 0.1960143 4.389358e-01 1.775529e+00 -3.2215886
## D3_P 22.227788 6.0906073 2.392003e+01 6.741360e+01 -84.3828247
## D4_P -1.360083 1.5559235 5.624201e+00 1.154843e+01 0.9508801
## H1_P 3.842592 0.4978256 2.134121e+00 4.117350e+00 -6.1464744
## H2_P 70.917389 9.4128776 3.988927e+01 8.153938e+01 -177.3238470
## H3A_P 22.666148 5.7007092 1.701547e+01 2.922673e+01 -48.1678162
## H3B_P 32.541585 7.3589287 2.280993e+01 3.952785e+01 -69.5704552
## H3C_P 5.707493 1.1244415 3.875334e+00 6.725265e+00 -11.4014858
## H4_P 20.700348 7.4358566 1.421481e+01 3.773432e+01 -75.8766189
## H5_P 19.483405 8.6248920 1.896237e+01 3.897824e+01 -66.0617891
## H6_P 3.857423 1.6848235 4.815907e+00 1.087627e+01 -9.4904746
## H7_P -9.794245 -3.3719059 -2.676290e+00 4.135528e+00 36.0448790
## H8_P 11.607260 10.5511635 2.466232e+01 7.342985e+01 -76.6300544
## H9_P 1.629560 2.7889807 4.796369e+00 1.449939e+01 -15.4389353
## P1_P -24.309437 -8.1983191 -1.077908e+01 2.622728e+01 60.6197856
## P2_P -30.005847 -12.5305887 -1.927533e+01 4.194815e+01 63.5203821
## P3_P -4.315880 -2.6778107 -6.338611e+00 2.063958e+01 -12.1611390
## P4_P -12.032892 -2.3512069 -2.855668e+00 3.697291e+01 10.4042119
## C2_P D1_P D2_P D3_P D4_P
## A1_P 0.36263868 -3.9979406 9.808066e-01 5.313013e+01 16.5253264
## A2_P -0.55163029 -21.7752278 4.164093e-01 1.514244e+01 4.0170105
## A3_P 599.66626136 50763.4981901 -1.476644e+03 -4.227510e+04 -6615.5638714
## A4_P -0.30162676 -51.1036630 2.218507e+00 6.282416e+01 12.5006632
## B1_P 0.01535094 -0.6865604 8.587358e-02 -5.052457e+00 -1.2675887
## B2_P -1.45943704 -56.3957796 1.088362e+00 2.222779e+01 -1.3600831
## B3_P -0.24002537 -6.4531901 1.960143e-01 6.090607e+00 1.5559235
## B4_P -0.68092393 -28.8404640 4.389358e-01 2.392003e+01 5.6242010
## C1_P 0.38143239 -9.2025072 1.775529e+00 6.741360e+01 11.5484289
## C1_NEW_P 3.03943595 158.5892466 -3.221589e+00 -8.438282e+01 0.9508801
## C2_P 0.29195405 4.5500650 -2.334846e-02 -4.191169e-01 0.4701437
## D1_P 4.55006495 396.2088991 -5.493134e+00 -5.914266e+01 16.1912234
## D2_P -0.02334846 -5.4931338 3.524698e-01 2.715407e+00 -0.1622783
## D3_P -0.41911691 -59.1426614 2.715407e+00 9.325790e+01 11.9678685
## D4_P 0.47014375 16.1912234 -1.622783e-01 1.196787e+01 10.4018103
## H1_P -0.19143683 -10.4012297 5.044059e-01 1.066492e+01 1.6152054
## H2_P -3.39296862 -273.1013940 8.744930e+00 1.271998e+02 -11.7608188
## H3A_P -1.05479144 -75.1440208 3.024738e+00 6.105544e+01 9.0074817
## H3B_P -1.45700297 -114.8469367 4.692193e+00 8.554705e+01 10.0009989
## H3C_P -0.23849843 -19.2840230 7.446940e-01 1.577069e+01 1.6484239
## H4_P -1.11993082 -97.0518440 2.965043e+00 5.523625e+01 1.4788818
## H5_P -0.76349294 -70.6547875 2.346698e+00 6.101309e+01 9.1193923
## H6_P -0.05930908 -2.6779653 3.391312e-01 1.513244e+01 4.3308896
## H7_P 1.25452397 68.2047607 -3.834597e-01 1.188406e+01 10.6447767
## H8_P 0.23220870 -33.8385926 1.907509e+00 7.579414e+01 22.3639790
## H9_P 0.01613630 1.3588625 1.387803e-01 9.378868e+00 2.8502244
## P1_P 3.82765573 113.9471283 -3.746188e-02 2.621714e+01 16.6159723
## P2_P 4.68359969 129.0248544 1.145626e+00 3.529756e+01 15.0340654
## P3_P 0.74408721 -3.2275078 1.408819e+00 1.532527e+01 -3.6285910
## P4_P 2.26545828 66.3357878 5.131621e-01 2.950054e+01 10.2867342
## H1_P H2_P H3A_P H3B_P H3C_P
## A1_P 5.5955566 39.276636 3.566779e+01 46.791480 8.3457311
## A2_P 3.4248140 25.233162 1.912095e+01 24.946284 4.6591953
## A3_P -4434.3938015 -52652.413767 -2.789284e+04 -40652.718036 -6961.1540252
## A4_P 10.1996682 101.238066 5.528167e+01 76.112607 13.9603138
## B1_P -0.8412302 -5.857819 -3.555321e+00 -4.209506 -1.1198340
## B2_P 3.8425919 70.917389 2.266615e+01 32.541585 5.7074931
## B3_P 0.4978256 9.412878 5.700709e+00 7.358929 1.1244415
## B4_P 2.1341210 39.889268 1.701547e+01 22.809927 3.8753336
## C1_P 4.1173503 81.539384 2.922673e+01 39.527851 6.7252651
## C1_NEW_P -6.1464744 -177.323847 -4.816782e+01 -69.570455 -11.4014858
## C2_P -0.1914368 -3.392969 -1.054791e+00 -1.457003 -0.2384984
## D1_P -10.4012297 -273.101394 -7.514402e+01 -114.846937 -19.2840230
## D2_P 0.5044059 8.744930 3.024738e+00 4.692193 0.7446940
## D3_P 10.6649215 127.199757 6.105544e+01 85.547048 15.7706874
## D4_P 1.6152054 -11.760819 9.007482e+00 10.000999 1.6484239
## H1_P 3.5442166 27.256046 1.523568e+01 20.246029 4.1293384
## H2_P 27.2560458 751.536649 1.574188e+02 218.404098 39.5908805
## H3A_P 15.2356818 157.418822 8.290074e+01 112.013340 20.4972676
## H3B_P 20.2460288 218.404098 1.120133e+02 156.395646 28.0184412
## H3C_P 4.1293384 39.590881 2.049727e+01 28.018441 5.6896142
## H4_P 12.2113390 161.349440 6.471721e+01 90.926165 17.0377094
## H5_P 13.0062272 138.178029 6.959961e+01 94.819865 17.9883031
## H6_P 3.0081978 24.585532 1.620865e+01 20.822982 3.9368616
## H7_P 1.5601925 -58.495619 -7.813718e-01 -3.660758 0.3323572
## H8_P 10.3658662 82.630642 6.138033e+01 82.182341 14.9422858
## H9_P 0.8138725 8.884561 5.068677e+00 7.177738 1.2374871
## P1_P 0.5010082 -66.694507 -5.435555e+00 -7.915443 -1.1147531
## P2_P 0.1752686 -97.275027 -1.066949e+01 -9.840459 -2.9401368
## P3_P -1.5268958 -5.887369 -6.331693e+00 -2.048023 -2.0031019
## P4_P 0.6821602 -25.947304 1.234623e+00 2.499920 0.4030774
## H4_P H5_P H6_P H7_P H8_P
## A1_P 2.328295e+01 3.291549e+01 1.198075e+01 1.362357e+01 6.074150e+01
## A2_P 1.352923e+01 1.727469e+01 4.291952e+00 -8.727526e-02 1.641751e+01
## A3_P -2.508147e+04 -2.755961e+04 -6.170328e+03 -4.062680e+03 -3.689310e+04
## A4_P 4.636701e+01 5.383710e+01 1.337447e+01 9.006687e+00 6.300685e+01
## B1_P 7.310057e-01 -2.234440e+00 -1.052747e+00 -3.503545e+00 -1.003872e+00
## B2_P 2.070035e+01 1.948341e+01 3.857423e+00 -9.794245e+00 1.160726e+01
## B3_P 7.435857e+00 8.624892e+00 1.684823e+00 -3.371906e+00 1.055116e+01
## B4_P 1.421481e+01 1.896237e+01 4.815907e+00 -2.676290e+00 2.466232e+01
## C1_P 3.773432e+01 3.897824e+01 1.087627e+01 4.135528e+00 7.342985e+01
## C1_NEW_P -7.587662e+01 -6.606179e+01 -9.490475e+00 3.604488e+01 -7.663005e+01
## C2_P -1.119931e+00 -7.634929e-01 -5.930908e-02 1.254524e+00 2.322087e-01
## D1_P -9.705184e+01 -7.065479e+01 -2.677965e+00 6.820476e+01 -3.383859e+01
## D2_P 2.965043e+00 2.346698e+00 3.391312e-01 -3.834597e-01 1.907509e+00
## D3_P 5.523625e+01 6.101309e+01 1.513244e+01 1.188406e+01 7.579414e+01
## D4_P 1.478882e+00 9.119392e+00 4.330890e+00 1.064478e+01 2.236398e+01
## H1_P 1.221134e+01 1.300623e+01 3.008198e+00 1.560193e+00 1.036587e+01
## H2_P 1.613494e+02 1.381780e+02 2.458553e+01 -5.849562e+01 8.263064e+01
## H3A_P 6.471721e+01 6.959961e+01 1.620865e+01 -7.813718e-01 6.138033e+01
## H3B_P 9.092617e+01 9.481987e+01 2.082298e+01 -3.660758e+00 8.218234e+01
## H3C_P 1.703771e+01 1.798830e+01 3.936862e+00 3.323572e-01 1.494229e+01
## H4_P 7.012037e+01 6.216375e+01 1.156781e+01 -1.228482e+01 5.236899e+01
## H5_P 6.216375e+01 7.415478e+01 1.580125e+01 -2.358774e+00 6.426191e+01
## H6_P 1.156781e+01 1.580125e+01 5.302512e+00 3.510088e+00 1.743551e+01
## H7_P -1.228482e+01 -2.358774e+00 3.510088e+00 4.087370e+01 1.100183e+01
## H8_P 5.236899e+01 6.426191e+01 1.743551e+01 1.100183e+01 1.015260e+02
## H9_P 6.956740e+00 8.580773e+00 2.287347e+00 9.099312e-01 1.284785e+01
## P1_P -1.818015e+01 -1.144097e+01 3.194119e+00 4.835925e+01 2.811541e+01
## P2_P -2.233044e+01 -2.252899e+01 1.288443e+00 6.513438e+01 2.394500e+01
## P3_P 6.934755e-01 -8.776734e+00 -1.860634e+00 8.783971e+00 -1.694850e+00
## P4_P -3.597668e+00 -2.596841e-01 3.836770e+00 3.019090e+01 2.864361e+01
## H9_P P1_P P2_P P3_P P4_P
## A1_P 8.4299443 3.222465e+01 33.9656123 0.8234755 26.7868245
## A2_P 1.6926240 -8.388021e+00 -16.7444217 -7.3778044 -4.7371384
## A3_P -4539.1119008 -4.598704e+03 -7325.4000629 -5723.9794759 -8641.0139413
## A4_P 8.4542209 1.302488e+01 15.7608610 4.6145306 16.5949545
## B1_P 0.6777019 -3.286125e+00 0.8849830 4.2007226 -0.2885000
## B2_P 1.6295601 -2.430944e+01 -30.0058465 -4.3158801 -12.0328920
## B3_P 2.7889807 -8.198319e+00 -12.5305887 -2.6778107 -2.3512069
## B4_P 4.7963690 -1.077908e+01 -19.2753343 -6.3386109 -2.8556681
## C1_P 14.4993884 2.622728e+01 41.9481500 20.6395774 36.9729075
## C1_NEW_P -15.4389353 6.061979e+01 63.5203821 -12.1611390 10.4042119
## C2_P 0.0161363 3.827656e+00 4.6835997 0.7440872 2.2654583
## D1_P 1.3588625 1.139471e+02 129.0248544 -3.2275078 66.3357878
## D2_P 0.1387803 -3.746188e-02 1.1456261 1.4088193 0.5131621
## D3_P 9.3788683 2.621714e+01 35.2975610 15.3252671 29.5005400
## D4_P 2.8502244 1.661597e+01 15.0340654 -3.6285910 10.2867342
## H1_P 0.8138725 5.010082e-01 0.1752686 -1.5268958 0.6821602
## H2_P 8.8845607 -6.669451e+01 -97.2750270 -5.8873690 -25.9473042
## H3A_P 5.0686768 -5.435555e+00 -10.6694871 -6.3316933 1.2346231
## H3B_P 7.1777380 -7.915443e+00 -9.8404588 -2.0480229 2.4999199
## H3C_P 1.2374871 -1.114753e+00 -2.9401368 -2.0031019 0.4030774
## H4_P 6.9567403 -1.818015e+01 -22.3304396 0.6934755 -3.5976679
## H5_P 8.5807730 -1.144097e+01 -22.5289950 -8.7767338 -0.2596841
## H6_P 2.2873470 3.194119e+00 1.2884428 -1.8606344 3.8367700
## H7_P 0.9099312 4.835925e+01 65.1343799 8.7839709 30.1909023
## H8_P 12.8478510 2.811541e+01 23.9449991 -1.6948497 28.6436069
## H9_P 4.4841796 1.049362e+00 -0.9150619 -0.8956602 3.0830197
## P1_P 1.0493617 1.190871e+02 158.0575733 34.8546520 73.0023853
## P2_P -0.9150619 1.580576e+02 244.3075236 75.5246038 102.9644038
## P3_P -0.8956602 3.485465e+01 75.5246038 53.1956365 29.2132006
## P4_P 3.0830197 7.300239e+01 102.9644038 29.2132006 52.3501270
#Compute total variance
trace_cov <- sum(diag(cov_svi))
# Print the result
print(trace_cov)
## [1] 27882971
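The total variance computed above is the trace of the sample covariance matrix, which also equals the sum of its eigenvalues. The notation below is introduced here for illustration: S is the covariance matrix, s_jj its diagonal entries, lambda_j its eigenvalues, and p = 30 the number of variables. The second line uses the printed variance of A3_P and the printed trace.

```latex
\operatorname{tr}(S) \;=\; \sum_{j=1}^{p} s_{jj} \;=\; \sum_{j=1}^{p} \lambda_j ,
\qquad p = 30

\frac{s_{\text{A3\_P},\,\text{A3\_P}}}{\operatorname{tr}(S)}
\;\approx\; \frac{27880188}{27882971} \;\approx\; 0.9999
```

On the raw scale, A3_P alone therefore accounts for nearly all of the total variance, because its values are several orders of magnitude larger than the percentage variables.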
#Perform PCA (unstandardized).
pca_unstand <- prcomp(svi_clean)
#Compute eigenvalues from PCA
ev1 <- (pca_unstand$sdev)^2
#Print the eigenvalues
print(ev1)
## [1] 2.788080e+07 9.853312e+02 4.722027e+02 2.454876e+02 1.893916e+02
## [6] 1.388021e+02 4.278436e+01 1.890971e+01 1.740406e+01 1.303513e+01
## [11] 1.142581e+01 7.953393e+00 6.038056e+00 4.126621e+00 3.457004e+00
## [16] 2.915039e+00 2.407362e+00 1.860315e+00 1.685880e+00 1.468750e+00
## [21] 9.248961e-01 7.729898e-01 6.449708e-01 4.865241e-01 3.357214e-01
## [26] 2.930370e-01 2.421454e-01 1.182695e-01 1.089978e-01 4.843496e-02
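A quick check on these eigenvalues is the share of total variance carried by the first component. The sketch below copies the two printed numbers by hand (so the first eigenvalue is rounded as printed); the variable names are illustrative.

```r
# Values copied from the printed output above
trace_cov_printed <- 27882971       # sum(diag(cov_svi))
ev_pc1_printed    <- 2.788080e+07   # first eigenvalue, rounded as printed

# Proportion of total variance attributed to PC1
prop_pc1 <- ev_pc1_printed / trace_cov_printed
print(prop_pc1)

# With the objects from the analysis, the full vector of proportions
# would be ev1 / sum(ev1)
```

The ratio is about 99.99%, which suggests that the unstandardized PC1 is essentially the A3_P variable; this is consistent with its loading of about -1 on A3_P in the rotation matrix.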
b. How many principal components should be retained? Justify your choice.
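#1. Kaiser method: retain components with eigenvalues greater than 1.
#Caution: this cutoff is meant for eigenvalues of the correlation matrix (standardized data).
#ev1 comes from the covariance matrix, so the 20 values flagged below depend on the units of the variables.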
ev1[ev1 > 1]
## [1] 2.788080e+07 9.853312e+02 4.722027e+02 2.454876e+02 1.893916e+02
## [6] 1.388021e+02 4.278436e+01 1.890971e+01 1.740406e+01 1.303513e+01
## [11] 1.142581e+01 7.953393e+00 6.038056e+00 4.126621e+00 3.457004e+00
## [16] 2.915039e+00 2.407362e+00 1.860315e+00 1.685880e+00 1.468750e+00
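The Kaiser criterion is usually stated for eigenvalues of the correlation matrix, where each variable contributes exactly one unit of variance. The sketch below, on hypothetical toy data with one large-scale variable, contrasts unstandardized and standardized PCA; the data and names (`toy`, `ev_cov`, `ev_cor`) are assumptions for illustration, not the SVI results.

```r
set.seed(1)

# Toy data (hypothetical): two percentage-like variables and one
# variable on a much larger scale
n <- 50
toy <- data.frame(pct_a  = rnorm(n, mean = 20, sd = 5),
                  pct_b  = rnorm(n, mean = 8, sd = 2),
                  income = rnorm(n, mean = 17000, sd = 5000))

# Unstandardized PCA: eigenvalues of the covariance matrix
ev_cov <- prcomp(toy)$sdev^2

# Standardized PCA: eigenvalues of the correlation matrix
ev_cor <- prcomp(toy, scale. = TRUE)$sdev^2

# Correlation-matrix eigenvalues sum to the number of variables,
# which is what gives the "greater than 1" cutoff its meaning
print(sum(ev_cor))
print(ev_cor[ev_cor > 1])
```

For the analysis here, the analogous call would be `prcomp(svi_clean, scale. = TRUE)`, after which the Kaiser rule could be applied to the resulting eigenvalues.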
#2. Scree plot (elbow method):
#Create a dataframe for plotting
scree_svi <- data.frame(
PC = paste0("PC", 1:length(ev1)),
Eigenvalue = ev1)
#Fix ordering by converting to a factor with correct levels
scree_svi$PC <- factor(scree_svi$PC, levels = paste0("PC", 1:length(ev1)))
#Plot the Scree Plot using ggplot2
ggplot(scree_svi, aes(x = PC, y = Eigenvalue)) +
geom_line(aes(group = 1), color = "red", linetype = "dashed") +
geom_point(color = "red", size = 3) +
labs(title = "Scree Plot of PCA", x = "Principal Components", y = "Eigenvalues") +
theme_minimal()
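When one eigenvalue is far larger than the rest, a scree plot on the raw scale tends to show a single spike followed by a flat line, so the cumulative proportion of variance can be a useful companion. The sketch below uses a hypothetical eigenvalue vector and an assumed 80% threshold; neither comes from the SVI results.

```r
# Hypothetical eigenvalues (not the SVI results), in decreasing order
ev <- c(4.5, 2.5, 1.5, 0.8, 0.4, 0.3)

# Proportion and cumulative proportion of variance explained
prop_var <- ev / sum(ev)
cum_var  <- cumsum(prop_var)

# Smallest number of components reaching the assumed 80% threshold
k80 <- which(cum_var >= 0.80)[1]

print(round(cum_var, 3))
print(k80)
```

With the objects from the analysis, the same computation would be `cumsum(ev1) / sum(ev1)`; adding `scale_y_log10()` to the ggplot call is another option for making the smaller eigenvalues visible.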
print(pca_unstand)
## Standard deviations (1, .., p=30):
## [1] 5280.2272908 31.3899863 21.7302250 15.6680437 13.7619635
## [6] 11.7814297 6.5409752 4.3485295 4.1718179 3.6104194
## [11] 3.3802079 2.8201761 2.4572456 2.0314086 1.8593021
## [16] 1.7073484 1.5515675 1.3639338 1.2984144 1.2119201
## [21] 0.9617152 0.8791984 0.8031008 0.6975128 0.5794147
## [26] 0.5413289 0.4920827 0.3439033 0.3301481 0.2200794
##
## Rotation (n x k) = (30 x 30):
## PC1 PC2 PC3 PC4 PC5
## A1_P 9.371463e-04 0.0402265944 -0.127392020 0.088495385 1.413354e-01
## A2_P 3.521053e-04 -0.0216986621 0.018733062 -0.012058194 1.190082e-01
## A3_P -9.999890e-01 -0.0029755650 -0.001972866 0.001535198 7.585626e-05
## A4_P 1.109380e-03 -0.0433185076 -0.125713189 0.081856406 1.174598e-01
## B1_P -1.461392e-04 -0.0102759303 -0.005137695 0.087035831 -4.532912e-02
## B2_P 5.350642e-04 -0.0725841303 0.044907085 -0.020915906 2.770979e-02
## B3_P 8.721810e-05 -0.0192018140 0.005551392 0.097525208 4.850881e-02
## B4_P 5.672481e-04 -0.0237467607 0.024585082 0.056894170 8.469832e-02
## C1_P 8.963905e-04 -0.0172475862 -0.212633142 0.506421825 -8.043978e-02
## C1_NEW_P -1.399881e-03 0.2170523850 -0.041835755 -0.722773834 1.895611e-01
## C2_P -2.150878e-05 0.0068816817 -0.010955642 -0.001933044 -4.182338e-03
## D1_P -1.820786e-03 0.3876105399 -0.373700327 0.063464326 4.720959e-01
## D2_P 5.296432e-05 -0.0066781976 -0.006765822 -0.002714922 -9.527141e-03
## D3_P 1.516309e-03 -0.0315127252 -0.171522520 0.125352068 3.041054e-02
## D4_P 2.372808e-04 0.0413683989 -0.042512538 0.026091848 1.056459e-01
## H1_P 1.590539e-04 -0.0221111303 -0.033040395 -0.013274929 5.194766e-02
## H2_P 1.888587e-03 -0.7128890504 -0.479375321 -0.269624757 -1.440444e-01
## H3A_P 1.000464e-03 -0.1332055334 -0.143371862 -0.035326822 2.495197e-01
## H3B_P 1.458135e-03 -0.1819366143 -0.184722675 -0.061705890 2.765618e-01
## H3C_P 2.496837e-04 -0.0337859006 -0.035716959 -0.012616082 6.299550e-02
## H4_P 8.996294e-04 -0.1611071759 -0.092475244 0.071632799 7.922777e-02
## H5_P 9.885105e-04 -0.1232520169 -0.103574076 0.078808352 2.517136e-01
## H6_P 2.213169e-04 -0.0116518056 -0.044918401 0.018040585 8.797587e-02
## H7_P 1.457070e-04 0.1352118613 -0.121079191 -0.058468747 6.652222e-02
## H8_P 1.323270e-03 -0.0059692796 -0.163781809 0.230117471 2.338082e-01
## H9_P 1.628073e-04 -0.0003258136 -0.018028667 0.066506910 4.353532e-02
## P1_P 1.649313e-04 0.2251321586 -0.343651106 -0.080870252 -1.303621e-01
## P2_P 2.627268e-04 0.3104372748 -0.459634460 -0.070600530 -4.324737e-01
## P3_P 2.053023e-04 0.0596162047 -0.105783929 0.031259021 -3.758482e-01
## P4_P 3.099249e-04 0.1355559866 -0.237536229 0.046992881 -1.114773e-01
## PC6 PC7 PC8 PC9 PC10
## A1_P -0.030272857 0.361569766 -0.3352735899 0.345102220 -0.0795362800
## A2_P -0.034980296 -0.005780770 -0.0084164988 0.098828494 0.0518011257
## A3_P -0.002022545 0.000754501 0.0001027412 0.001006447 -0.0002375071
## A4_P -0.143188668 0.048871385 0.0499135303 0.230700986 0.2591936224
## B1_P -0.039552431 -0.018748245 -0.0767564244 -0.059612371 0.2078407388
## B2_P 0.038874484 -0.111708546 -0.2198975363 0.400846419 -0.0747292591
## B3_P 0.004810981 -0.002843460 -0.0302893107 -0.055218686 0.1329216934
## B4_P 0.089396148 0.087362540 -0.1376153789 0.183705971 0.1427661033
## C1_P 0.096759375 0.138324070 -0.1605266762 0.129182845 0.1791404405
## C1_NEW_P -0.070497703 0.204527684 -0.1849909367 -0.032439322 0.1011123285
## C2_P -0.004141620 0.024567069 0.0106157063 -0.023540039 -0.0014944693
## D1_P 0.535909169 -0.377780956 -0.1027971499 -0.136461636 -0.0390516736
## D2_P -0.012188809 -0.019335665 -0.0161816124 -0.007329717 0.0158730520
## D3_P -0.069604463 -0.068208465 0.1945137851 0.303831759 -0.6104217016
## D4_P -0.021377070 0.198904557 0.0293479047 0.030491794 0.2228858297
## H1_P -0.064282594 -0.046408613 0.0579864897 0.030137065 0.0031740072
## H2_P 0.386086163 0.072687995 0.0329843076 -0.027083191 0.0719680401
## H3A_P -0.305162605 -0.134968971 -0.1092126188 0.134002473 0.0329328454
## H3B_P -0.438976981 -0.279863969 -0.2430676183 0.063888876 0.0018367177
## H3C_P -0.080133672 -0.039891105 0.0375867057 0.023813220 -0.0445997276
## H4_P -0.235600847 -0.153917372 0.1008339092 -0.355182278 -0.1394665300
## H5_P -0.219343607 -0.078105666 0.2691842623 -0.172049490 0.0086784444
## H6_P -0.035293180 0.011266739 -0.0080580411 0.069093303 0.0531088670
## H7_P -0.019858804 -0.084924988 0.6512990997 0.330879187 0.4245774879
## H8_P -0.158608786 0.433950617 -0.0527478151 -0.390554254 0.1204652648
## H9_P 0.027883557 0.028940432 -0.0199144408 -0.068315583 0.0926676872
## P1_P -0.088429906 0.305363946 0.1271010822 -0.090549671 -0.2502938410
## P2_P -0.250051494 -0.061167649 -0.0200366274 0.078535856 0.0505228195
## P3_P -0.109603865 -0.404513504 -0.3125925395 -0.129197165 0.2541553970
## P4_P -0.016449920 0.070909202 0.0096777048 -0.064459116 -0.0825627497
## PC11 PC12 PC13 PC14 PC15
## A1_P 0.1441949024 -0.0884818743 0.1737593681 0.1628728696 -0.0223876311
## A2_P 0.0702706765 0.0942202269 0.0776773717 -0.2901701610 -0.1633320316
## A3_P 0.0007019939 0.0002528734 0.0002476212 0.0001569519 -0.0002747373
## A4_P 0.2123258528 -0.1654037907 -0.1132850188 -0.3180489055 -0.2209057190
## B1_P -0.0708463205 -0.0914833367 0.0201346827 -0.0411694563 -0.0435557229
## B2_P -0.0043826707 -0.1669054326 0.0840708991 0.3591506162 -0.1374576396
## B3_P 0.0564844243 -0.0373355346 -0.0141449981 -0.2843690996 0.0321087068
## B4_P 0.1180411704 -0.0194313527 -0.1393686272 -0.0701349665 -0.2479811595
## C1_P -0.0742983314 -0.3473453473 -0.0398893713 -0.1159802544 0.3751516696
## C1_NEW_P 0.2275486558 -0.2722258752 0.0486623402 -0.0094536493 0.2075097276
## C2_P 0.0138814314 -0.0228363841 -0.0374793281 -0.0418078687 0.0305130508
## D1_P -0.0499101424 0.0266749544 -0.0050888142 0.0194331677 -0.0921467013
## D2_P -0.0252576096 -0.0031663117 0.0057684949 -0.0185358514 0.0470677610
## D3_P 0.3748405326 0.1387166300 0.0983940538 -0.0211550188 -0.0225482108
## D4_P 0.0031489869 0.2209618753 -0.0966371812 -0.0580407738 -0.3938183114
## H1_P -0.0528003048 -0.0577102258 0.0262860344 0.0406823385 -0.1189119227
## H2_P -0.0020050890 0.0534699929 0.0017587202 0.0244533903 -0.0174186269
## H3A_P -0.1956922792 0.1935261313 0.0332821796 -0.0960858341 0.0750466226
## H3B_P -0.2737135558 0.1508182045 -0.0351176242 -0.1340435399 0.1659115893
## H3C_P -0.0211319545 -0.0531484062 0.0473632740 -0.0261962754 0.0844911792
## H4_P 0.0034624262 -0.5349407512 0.4808242795 -0.0102441865 -0.3076630921
## H5_P 0.3516282339 -0.1805244679 -0.6295045442 0.2666476540 0.1095071064
## H6_P 0.1457679617 0.0203851690 0.0110065663 0.1820133806 0.0241765276
## H7_P 0.0120505226 0.0579442220 0.3540528059 0.1032865203 0.1495094696
## H8_P 0.0734179235 0.3488561690 0.2426866714 0.3522026250 -0.0096438256
## H9_P 0.1046349154 -0.2665423328 -0.0033181412 -0.0415125545 -0.1300662741
## P1_P -0.0192470685 0.0599971978 -0.0781935062 -0.4112433548 -0.1430836456
## P2_P -0.2843101528 -0.0914963459 -0.1974516354 0.3138693534 -0.2343058983
## P3_P 0.5833876757 0.2387443785 0.1564926989 -0.0280578523 -0.0129605502
## P4_P 0.0806557496 -0.0204825039 0.1126064057 -0.1117541940 0.4612913247
## PC16 PC17 PC18 PC19 PC20
## A1_P -0.0652526104 -0.0245430902 -0.4749995309 0.1446511471 4.752964e-02
## A2_P 0.1621181060 0.4051261358 0.1412194492 0.3153166234 -2.814212e-01
## A3_P -0.0001022706 -0.0002118591 0.0002997828 -0.0001062818 -1.925671e-05
## A4_P 0.5379747433 -0.3597958548 0.0980779196 0.0425299549 2.571322e-01
## B1_P 0.0509373861 -0.2151794704 -0.0315809260 0.1230107742 -1.699755e-01
## B2_P 0.3345700260 0.1613736324 0.1616016835 -0.2876066752 -2.519286e-01
## B3_P -0.1200166304 -0.0450488301 0.0433302576 0.3610775200 -4.623489e-01
## B4_P -0.5680371853 -0.1132461850 0.4351540050 -0.1196436761 3.213863e-01
## C1_P -0.1092265957 0.2754904054 0.2138745515 -0.1073629486 -1.008666e-01
## C1_NEW_P -0.0778632270 0.0278748192 0.2165832042 0.0427964903 -1.009914e-01
## C2_P -0.0187106705 -0.0024403820 0.0052926851 0.0052486991 3.037472e-02
## D1_P 0.0576268681 -0.0238995591 -0.0162300730 0.0154915312 2.192761e-02
## D2_P 0.0083217630 -0.0484174739 -0.0173975608 0.0025642892 -6.760688e-02
## D3_P -0.1631326189 -0.1979101946 0.1956105974 0.1407502436 -1.583336e-01
## D4_P -0.1155383381 0.1333142357 -0.1951343190 0.0813728069 3.856065e-02
## H1_P 0.1582135144 0.1786211158 0.1090812043 -0.0983939699 1.684168e-01
## H2_P 0.0057089612 -0.0149439361 -0.0186549102 0.0335767548 -2.405821e-02
## H3A_P -0.1190525556 0.3353372944 -0.0593655322 0.1080282659 2.420808e-01
## H3B_P -0.0970767689 -0.2921354098 -0.0541463393 -0.2208694744 -1.775381e-01
## H3C_P 0.0609625687 0.0586532304 0.1338168320 -0.1664365900 1.377214e-01
## H4_P -0.1131326654 0.0998037759 0.0101607033 0.0904971480 1.287349e-01
## H5_P 0.0016402017 0.1630550592 -0.0800450277 0.0092971998 -6.828621e-02
## H6_P -0.0368041928 0.2213580683 -0.2341219343 0.1071679429 1.540656e-01
## H7_P -0.1414834128 -0.0071709640 -0.0485799245 -0.1325671125 -1.002728e-01
## H8_P 0.1323170931 -0.1166162335 0.3078654788 -0.0767047817 -1.222998e-01
## H9_P -0.1949120277 -0.2445896532 -0.3637915873 -0.2884249586 -2.278681e-01
## P1_P 0.0467914001 0.2338562982 -0.1009595725 -0.4768012752 -1.034936e-01
## P2_P -0.0613650027 -0.0710942879 0.1100502535 0.2799735585 -6.140062e-02
## P3_P -0.0221714085 0.1249466332 -0.0511584204 -0.1640802128 1.001056e-02
## P4_P 0.1350903761 -0.1197708012 -0.0673699496 0.1929226377 3.399778e-01
## PC21 PC22 PC23 PC24 PC25
## A1_P 0.1913431916 0.1512895269 0.320339382 -6.919663e-02 -0.0486615054
## A2_P -0.0099557341 -0.3814377788 0.272429699 -2.240989e-01 -0.2060639190
## A3_P -0.0001637726 -0.0003391257 0.000253230 -6.364153e-05 0.0001135195
## A4_P 0.1493222645 0.0089435388 -0.146257556 -1.056848e-01 0.1516552119
## B1_P -0.1334821508 0.5411973600 0.055422277 2.706844e-01 -0.3656844754
## B2_P -0.3358071707 -0.0012389897 0.142229047 1.482304e-01 0.1714446518
## B3_P 0.0275823313 0.1007501935 0.098128063 1.276471e-01 0.0250026829
## B4_P -0.0852070630 -0.0233054761 0.301543941 -9.080648e-02 -0.0434342872
## C1_P -0.0024947309 0.0066773597 -0.314965003 -2.570712e-02 0.0495434210
## C1_NEW_P -0.0076234396 0.0290041051 -0.218458431 7.301196e-02 0.0171780598
## C2_P -0.1147163059 -0.0631520564 0.057202323 -8.287028e-02 0.1061761332
## D1_P 0.0358095014 0.0521254930 0.008891373 -1.560675e-02 0.0078317323
## D2_P -0.0257981438 -0.0142697469 0.037982787 -2.593380e-02 0.0171401281
## D3_P -0.0096776137 0.0233851616 -0.281807480 1.487627e-01 -0.0541987766
## D4_P -0.5592540715 -0.0187861744 -0.293408991 2.694691e-01 0.0142419160
## H1_P 0.1132404391 -0.1522529144 -0.066735779 3.368808e-01 -0.3930893154
## H2_P 0.0169946133 -0.0030414816 0.013513565 -1.450018e-03 -0.0211442819
## H3A_P 0.3095573791 0.0437872078 -0.053089795 4.073489e-01 0.2794157371
## H3B_P -0.2326365816 -0.0637502723 -0.010193948 -2.738466e-01 -0.0862431116
## H3C_P 0.0963211563 0.0496697907 0.107314811 9.134381e-02 -0.6168987604
## H4_P -0.1339212384 0.0888798980 0.040570657 -9.303015e-02 0.1317365431
## H5_P -0.0419152634 0.0765175566 0.235194213 3.330057e-02 0.0381647898
## H6_P -0.1136408908 0.0243284816 -0.415338714 -4.625959e-01 -0.3037137587
## H7_P 0.0084010327 0.0560036789 0.120199553 -4.573186e-02 0.0501332588
## H8_P 0.0954480380 -0.0852487048 -0.001443675 -6.304282e-04 0.0310721700
## H9_P 0.1639614254 -0.5599190921 -0.091618447 2.563349e-01 -0.1005874056
## P1_P -0.0125978982 0.2257108037 0.122540944 -8.247181e-02 0.0345950212
## P2_P 0.1092006740 -0.1354329905 -0.001005758 -5.109178e-02 -0.0354263369
## P3_P 0.0507680289 0.0557830717 0.024550346 3.079098e-02 0.0247423978
## P4_P -0.4676419711 -0.2774071700 0.271444130 2.020981e-01 -0.0163299966
## PC26 PC27 PC28 PC29 PC30
## A1_P -0.2473479954 -1.027811e-02 -2.923534e-02 -6.665917e-02 -9.745189e-02
## A2_P 0.0238734331 3.624083e-01 -5.739891e-02 -1.304725e-03 6.075194e-02
## A3_P 0.0000249722 -5.077513e-05 2.548705e-06 -3.628739e-05 7.901751e-05
## A4_P 0.0096458098 -4.453436e-02 3.409996e-02 3.179810e-02 4.693956e-02
## B1_P 0.3236350026 4.138135e-01 -1.534105e-01 5.386134e-02 -3.435017e-02
## B2_P 0.2221964996 -1.463586e-01 7.309511e-02 7.360951e-02 6.227738e-02
## B3_P 0.1178325185 -6.570891e-01 1.552880e-01 -2.872853e-02 -9.426008e-02
## B4_P 0.1979840861 -4.366997e-02 -5.877928e-02 -5.748934e-02 1.993041e-02
## C1_P -0.1761186570 1.279857e-01 -1.820889e-02 -3.394940e-02 -6.369682e-03
## C1_NEW_P -0.0593174671 7.180593e-02 -1.708277e-02 -1.019688e-02 1.246032e-04
## C2_P -0.0206307377 1.583927e-02 -1.333142e-01 7.228465e-01 -6.425515e-01
## D1_P -0.0282533716 1.031502e-02 -1.419686e-03 1.742368e-02 -1.535459e-03
## D2_P -0.1301916186 -2.303054e-01 -8.177557e-01 2.290677e-01 4.401395e-01
## D3_P -0.0175157236 1.101499e-01 -5.032005e-02 3.453586e-02 -9.363013e-03
## D4_P -0.3344973983 -1.042682e-02 7.739927e-02 4.668390e-02 8.377166e-02
## H1_P -0.0751092178 -2.216165e-01 -3.440557e-01 -3.761691e-01 -4.820720e-01
## H2_P -0.0101430534 4.692137e-03 4.525985e-03 7.028494e-04 -6.189241e-03
## H3A_P 0.2715091422 4.514356e-02 1.545582e-02 1.736232e-01 1.141629e-01
## H3B_P -0.1338302002 1.261383e-02 -3.391543e-03 -1.546973e-01 -1.147346e-01
## H3C_P -0.2621149239 -1.923647e-01 3.434105e-01 4.216021e-01 2.878288e-01
## H4_P -0.0662047575 -1.178551e-02 1.287262e-02 -1.592123e-02 3.048296e-02
## H5_P -0.0589934415 5.425133e-02 -1.622696e-02 -4.850895e-02 1.677066e-02
## H6_P 0.4936632684 -2.138346e-01 -3.198967e-02 4.240741e-02 1.905349e-02
## H7_P -0.0312324944 1.176897e-02 -9.861873e-03 -2.897464e-02 -3.773744e-02
## H8_P 0.0814285089 -1.268111e-02 1.282658e-03 3.422353e-02 7.392552e-03
## H9_P 0.2427422777 9.208825e-02 3.159580e-02 1.111390e-01 8.250639e-02
## P1_P 0.1833871301 -6.914249e-02 -2.812592e-02 -3.867814e-02 -2.277636e-04
## P2_P -0.0123653680 -2.012628e-02 4.515578e-02 5.051627e-02 3.138426e-02
## P3_P -0.1047404540 -1.484707e-03 -1.024219e-03 -5.975386e-03 -3.133110e-02
## P4_P 0.1926923487 -4.413354e-02 2.527696e-02 -7.748037e-02 5.678309e-02
#3. Variance Explained:
#Print PCA summary (eigenvalues and explained variance)
summary(pca_unstand)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6
## Standard deviation 5280.2273 31.38999 21.73023 15.66804 13.76196 11.78
## Proportion of Variance 0.9999 0.00004 0.00002 0.00001 0.00001 0.00
## Cumulative Proportion 0.9999 0.99996 0.99997 0.99998 0.99999 1.00
## PC7 PC8 PC9 PC10 PC11 PC12 PC13 PC14 PC15 PC16
## Standard deviation 6.541 4.349 4.172 3.61 3.38 2.82 2.457 2.031 1.859 1.707
## Proportion of Variance 0.000 0.000 0.000 0.00 0.00 0.00 0.000 0.000 0.000 0.000
## Cumulative Proportion 1.000 1.000 1.000 1.00 1.00 1.00 1.000 1.000 1.000 1.000
## PC17 PC18 PC19 PC20 PC21 PC22 PC23 PC24
## Standard deviation 1.552 1.364 1.298 1.212 0.9617 0.8792 0.8031 0.6975
## Proportion of Variance 0.000 0.000 0.000 0.000 0.0000 0.0000 0.0000 0.0000
## Cumulative Proportion 1.000 1.000 1.000 1.000 1.0000 1.0000 1.0000 1.0000
## PC25 PC26 PC27 PC28 PC29 PC30
## Standard deviation 0.5794 0.5413 0.4921 0.3439 0.3301 0.2201
## Proportion of Variance 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## Cumulative Proportion 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
c. How would you interpret the resulting components?
#Perform PCA retaining the first component (with a single component, the varimax rotation has no effect).
pca_rotated <- principal(svi_clean, nfactors = 1, rotate = "varimax", scores = TRUE)
print(pca_rotated)
## Principal Components Analysis
## Call: principal(r = svi_clean, nfactors = 1, rotate = "varimax", scores = TRUE)
## Standardized loadings (pattern matrix) based upon correlation matrix
## PC1 h2 u2 com
## A1_P 0.73 5.3e-01 0.47 1
## A2_P 0.74 5.5e-01 0.45 1
## A3_P -0.79 6.3e-01 0.37 1
## A4_P 0.94 8.9e-01 0.11 1
## B1_P -0.15 2.2e-02 0.98 1
## B2_P 0.65 4.2e-01 0.58 1
## B3_P 0.47 2.2e-01 0.78 1
## B4_P 0.71 5.0e-01 0.50 1
## C1_P 0.57 3.3e-01 0.67 1
## C1_NEW_P -0.59 3.5e-01 0.65 1
## C2_P -0.24 5.6e-02 0.94 1
## D1_P -0.46 2.1e-01 0.79 1
## D2_P 0.55 3.0e-01 0.70 1
## D3_P 0.87 7.5e-01 0.25 1
## D4_P 0.41 1.7e-01 0.83 1
## H1_P 0.77 6.0e-01 0.40 1
## H2_P 0.62 3.8e-01 0.62 1
## H3A_P 0.90 8.1e-01 0.19 1
## H3B_P 0.90 8.0e-01 0.20 1
## H3C_P 0.86 7.4e-01 0.26 1
## H4_P 0.86 7.4e-01 0.26 1
## H5_P 0.91 8.4e-01 0.16 1
## H6_P 0.80 6.5e-01 0.35 1
## H7_P -0.01 6.9e-05 1.00 1
## H8_P 0.83 6.9e-01 0.31 1
## H9_P 0.53 2.8e-01 0.72 1
## P1_P -0.05 2.5e-03 1.00 1
## P2_P -0.08 6.2e-03 0.99 1
## P3_P -0.04 1.9e-03 1.00 1
## P4_P 0.11 1.2e-02 0.99 1
##
## PC1
## SS loadings 12.46
## Proportion Var 0.42
##
## Mean item complexity = 1
## Test of the hypothesis that 1 component is sufficient.
##
## The root mean square of the residuals (RMSR) is 0.22
## with the empirical chi square 2190.47 with prob < 3.5e-242
##
## Fit based upon off diagonal values = 0.77
d. There are two variables related to minority status. Do they load onto the same component, or do they belong to different components?
Standardize all the variables, then repeat the PCA. How do the results compare with the unstandardized version?
#Perform PCA (standardized).
pca_stand <- prcomp(svi_clean, scale. = TRUE)
#Compute eigenvalues from PCA (standardized).
ev2 <- (pca_stand$sdev)^2
#1. Kaiser Method:
#Eigenvalues greater than 1
ev2[ev2 > 1]
## [1] 12.461755 5.805486 2.906425 2.574283 1.627996
#2. Scree Plot (Elbow method):
#Scree plot
#Create a dataframe for plotting
scree_svi <- data.frame(
PC = paste0("PC", 1:length(ev2)),
Eigenvalue = ev2)
#Fix ordering by converting to a factor with correct levels
scree_svi$PC <- factor(scree_svi$PC, levels = paste0("PC", 1:length(ev2)))
#Plot the Scree Plot using ggplot2
ggplot(scree_svi, aes(x = PC, y = Eigenvalue)) +
geom_line(aes(group = 1), color = "red", linetype = "dashed") +
geom_point(color = "red", size = 3) +
labs(title = "Scree Plot of PCA", x = "Principal Components", y = "Eigenvalues") +
theme_minimal()
#3. Variance Explained:
#Print PCA summary (eigenvalues and explained variance)
summary(pca_stand)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.5301 2.4095 1.70482 1.60446 1.27593 0.87756 0.8179
## Proportion of Variance 0.4154 0.1935 0.09688 0.08581 0.05427 0.02567 0.0223
## Cumulative Proportion 0.4154 0.6089 0.70579 0.79160 0.84586 0.87153 0.8938
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 0.70127 0.65466 0.60143 0.57656 0.5423 0.50475 0.45971
## Proportion of Variance 0.01639 0.01429 0.01206 0.01108 0.0098 0.00849 0.00704
## Cumulative Proportion 0.91022 0.92451 0.93657 0.94765 0.9575 0.96594 0.97299
## PC15 PC16 PC17 PC18 PC19 PC20 PC21
## Standard deviation 0.38401 0.36995 0.29603 0.28287 0.26799 0.24038 0.22402
## Proportion of Variance 0.00492 0.00456 0.00292 0.00267 0.00239 0.00193 0.00167
## Cumulative Proportion 0.97790 0.98247 0.98539 0.98805 0.99045 0.99237 0.99405
## PC22 PC23 PC24 PC25 PC26 PC27 PC28
## Standard deviation 0.2053 0.17940 0.17406 0.14116 0.13642 0.1226 0.10182
## Proportion of Variance 0.0014 0.00107 0.00101 0.00066 0.00062 0.0005 0.00035
## Cumulative Proportion 0.9954 0.99652 0.99753 0.99820 0.99882 0.9993 0.99967
## PC29 PC30
## Standard deviation 0.08303 0.05614
## Proportion of Variance 0.00023 0.00011
## Cumulative Proportion 0.99989 1.00000
#Perform PCA with varimax rotation retaining 5 components.
pca_rotated <- principal(svi_clean, nfactors = 5, rotate = "varimax", scores = TRUE)
print(pca_rotated)
## Principal Components Analysis
## Call: principal(r = svi_clean, nfactors = 5, rotate = "varimax", scores = TRUE)
## Standardized loadings (pattern matrix) based upon correlation matrix
## RC1 RC2 RC3 RC5 RC4 h2 u2 com
## A1_P 0.40 0.41 0.38 0.63 -0.10 0.88 0.119 3.3
## A2_P 0.58 -0.24 0.07 0.54 -0.35 0.82 0.183 3.1
## A3_P -0.43 -0.03 -0.18 -0.80 -0.24 0.92 0.082 1.9
## A4_P 0.73 0.17 0.38 0.47 0.07 0.94 0.063 2.4
## B1_P -0.15 -0.08 0.62 -0.59 0.25 0.83 0.170 2.5
## B2_P 0.42 -0.55 0.01 0.58 0.18 0.85 0.150 3.0
## B3_P 0.18 -0.26 0.84 0.01 -0.19 0.85 0.151 1.4
## B4_P 0.24 -0.24 0.42 0.76 -0.13 0.88 0.119 2.1
## C1_P 0.20 0.24 0.83 0.22 0.26 0.91 0.091 1.7
## C1_NEW_P -0.23 0.36 -0.73 -0.22 -0.34 0.89 0.114 2.4
## C2_P -0.10 0.69 0.00 -0.25 -0.03 0.55 0.447 1.3
## D1_P -0.37 0.62 -0.02 -0.24 -0.41 0.75 0.254 2.8
## D2_P 0.58 -0.05 0.07 0.06 0.56 0.67 0.333 2.1
## D3_P 0.60 0.21 0.32 0.58 0.25 0.91 0.093 3.2
## D4_P 0.19 0.54 0.29 0.42 -0.48 0.82 0.178 3.8
## H1_P 0.93 0.04 -0.03 0.10 -0.08 0.88 0.117 1.0
## H2_P 0.65 -0.31 0.07 0.09 0.30 0.62 0.380 2.0
## H3A_P 0.95 -0.05 0.10 0.21 -0.03 0.96 0.040 1.1
## H3B_P 0.94 -0.06 0.09 0.21 0.07 0.95 0.048 1.1
## H3C_P 0.95 -0.04 0.04 0.17 -0.02 0.94 0.058 1.1
## H4_P 0.86 -0.22 0.30 0.09 0.20 0.92 0.084 1.5
## H5_P 0.86 -0.09 0.31 0.22 -0.09 0.91 0.088 1.5
## H6_P 0.71 0.21 0.28 0.32 -0.28 0.82 0.185 2.3
## H7_P 0.01 0.78 -0.15 0.18 -0.18 0.70 0.304 1.3
## H8_P 0.57 0.28 0.54 0.42 -0.11 0.88 0.118 3.4
## H9_P 0.18 0.11 0.73 0.27 -0.17 0.68 0.320 1.6
## P1_P -0.01 0.96 -0.07 0.06 0.12 0.95 0.047 1.1
## P2_P -0.03 0.90 -0.06 0.00 0.37 0.96 0.041 1.3
## P3_P -0.08 0.40 0.06 -0.03 0.79 0.79 0.213 1.5
## P4_P 0.03 0.92 0.16 0.13 0.26 0.96 0.036 1.3
##
## RC1 RC2 RC3 RC5 RC4
## SS loadings 8.82 5.75 4.20 4.16 2.45
## Proportion Var 0.29 0.19 0.14 0.14 0.08
## Cumulative Var 0.29 0.49 0.63 0.76 0.85
## Proportion Explained 0.35 0.23 0.17 0.16 0.10
## Cumulative Proportion 0.35 0.57 0.74 0.90 1.00
##
## Mean item complexity = 2
## Test of the hypothesis that 5 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0.03
## with the empirical chi square 49.51 with prob < 1
##
## Fit based upon off diagonal values = 0.99
Five principal components were retained based on the scree plot, the eigenvalues, and the cumulative variance explained. The scree plot shows an “elbow” after the fourth component, where the curve begins to level off and each additional component explains progressively less variance. Under the eigenvalue criterion (Kaiser rule), the first five components all have eigenvalues greater than 1 (12.46, 5.81, 2.91, 2.57, and 1.63), so the fifth component is kept even though it falls just past the elbow. Together, these five components account for approximately 84.6% of the total variance, suggesting they capture the most meaningful structure in the data while leaving out components that mostly reflect noise.
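The retention criteria described above can be computed directly from the eigenvalues. The following is a minimal sketch in base R; `retention_summary` is a hypothetical helper that is not part of the original analysis, and it is illustrated on the built-in `USArrests` data because `svi_clean` is not reproduced here.

```r
# Hypothetical helper: summarises the Kaiser rule and cumulative variance
# for a PCA on standardized variables (base R only).
retention_summary <- function(data) {
  pca <- prcomp(data, scale. = TRUE)
  ev <- pca$sdev^2                     # eigenvalues of the correlation matrix
  n_kaiser <- sum(ev > 1)              # Kaiser rule: eigenvalues greater than 1
  cum_var <- cumsum(ev) / sum(ev)      # cumulative proportion of variance
  list(eigenvalues = ev,
       n_kaiser = n_kaiser,
       cum_var_at_kaiser = cum_var[n_kaiser],
       cum_var = cum_var)
}

# Illustration on a built-in data set (an assumption for this sketch).
res <- retention_summary(USArrests)
print(res$n_kaiser)
print(round(res$cum_var, 3))
```

Applied to `svi_clean`, the same helper should reproduce the five components and the roughly 84.6% cumulative variance reported above, assuming the data are unchanged.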
| Variable | PC1 | PC2 | PC3 | PC4 | PC5 |
|---|---|---|---|---|---|
| A1_P | | | | 0.63 | |
| A2_P | 0.58 | | | | |
| A3_P | | | | -0.80 | |
| A4_P | 0.73 | | | | |
| B1_P | | | 0.62 | | |
| B2_P | | | | 0.58 | |
| B3_P | | | 0.84 | | |
| B4_P | | | | 0.76 | |
| C1_P | | | 0.83 | | |
| C1_NEW_P | | | -0.73 | | |
| C2_P | | 0.69 | | | |
| D1_P | | 0.62 | | | |
| D2_P | 0.58 | | | | |
| D3_P | 0.60 | | | | |
| D4_P | | 0.54 | | | |
| H1_P | 0.93 | | | | |
| H2_P | 0.65 | | | | |
| H3A_P | 0.95 | | | | |
| H3B_P | 0.94 | | | | |
| H3C_P | 0.95 | | | | |
| H4_P | 0.86 | | | | |
| H5_P | 0.86 | | | | |
| H6_P | 0.71 | | | | |
| H7_P | | 0.78 | | | |
| H8_P | 0.57 | | | | |
| H9_P | | | 0.73 | | |
| P1_P | | 0.96 | | | |
| P2_P | | 0.90 | | | |
| P3_P | | | | | 0.79 |
| P4_P | | 0.92 | | | |
The resulting components align closely with the CDC’s Social Vulnerability Index themes. Component 1 reflects housing and infrastructure vulnerability, with strong loadings on items related to water access, plumbing, and internet availability, corresponding to SVI Themes 5 and 6. Component 2 captures immigration-related and economic stress factors, including non-citizenship, remittances, and lack of health insurance, aligning with Theme 7. Component 3 represents minority status and language barriers, based on high loadings for race/ethnicity and limited English proficiency variables from Theme 3. Component 4 indicates socioeconomic disadvantage, including poverty, unemployment, and lower educational attainment, consistent with Theme 1. Lastly, Component 5 appears to reflect aspects of housing density and transportation access, linking to Theme 4. Together, these components reflect distinct yet interconnected domains of social vulnerability as defined by the SVI framework.
The two variables related to minority status, C1_P and C2_P, load onto different components: C1_P loads most strongly on Component 3 (0.83), while C2_P loads primarily on Component 2 (0.69). The same holds for the adapted variable: C1_NEW_P also loads most strongly on Component 3 (−0.73, so in the opposite direction from C1_P), while C2_P remains on Component 2. Although both variables relate to minority status, they appear to capture different dimensions of the construct, such as structural versus identity-based barriers, or differences in how minority status affects access versus perception.
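The assignment of each variable to its dominant component can be checked mechanically by taking the largest absolute loading in each row. The sketch below uses base R with the three relevant rows copied from the rotated `principal()` output above; the object names are illustrative.

```r
# Rotated loadings for the minority-status variables, copied from the
# principal() output (columns in the order in which they are printed).
loadings_c <- matrix(
  c( 0.20, 0.24,  0.83,  0.22,  0.26,
    -0.23, 0.36, -0.73, -0.22, -0.34,
    -0.10, 0.69,  0.00, -0.25, -0.03),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("C1_P", "C1_NEW_P", "C2_P"),
                  c("RC1", "RC2", "RC3", "RC5", "RC4"))
)

# For each variable, pick the component with the largest absolute loading.
dominant <- colnames(loadings_c)[apply(abs(loadings_c), 1, which.max)]
names(dominant) <- rownames(loadings_c)
print(dominant)
# C1_P and C1_NEW_P map to RC3; C2_P maps to RC2
```

The same `apply(abs(...), 1, which.max)` call can be run on the full loadings matrix (for example `unclass(pca_rotated$loadings)`) to rebuild the summary table for all 30 variables.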
The CDC’s Social Vulnerability Index (SVI) is structured into four themes:
• Socioeconomic Status = A1_P + A2_P + A3_P + A4_P
• Household Composition & Disability = B1_P + B2_P + B3_P + B4_P
• Minority Status & Language = C1_P + C2_P
• Housing Type & Transportation = D1_P + D2_P + D3_P + D4_P
Using this structure as a theoretical model, perform a Confirmatory Factor Analysis (CFA) on the variables (A1_P to D5_P without C1_NEW_P) to assess how well the data fits the CDC SVI framework.
#Keep only the 14 indicators in the theoretical four-theme model; the remaining variables are not used in the CFA.
svi_cdc_C1 <- svi_clean %>% select(A1_P, A2_P, A3_P, A4_P, B1_P, B2_P, B3_P, B4_P, C1_P, C2_P, D1_P, D2_P, D3_P, D4_P)
#Perform PCA using the correlation matrix
pca_svi_cdc <- prcomp(svi_cdc_C1, scale. = TRUE)
#Compute eigenvalues from PCA
ev3 <- (pca_svi_cdc$sdev)^2
#Print the eigenvalues
print(ev3)
## [1] 6.18895658 2.36954486 1.86295968 1.23155445 0.62422264 0.43848260
## [7] 0.36362128 0.29060188 0.24146283 0.11695753 0.09704565 0.07780270
## [13] 0.05286245 0.04392488
#1. Kaiser Method:
#Eigenvalues greater than 1
ev3[ev3 > 1]
## [1] 6.188957 2.369545 1.862960 1.231554
#2. Scree Plot (Elbow method):
#Scree plot
#Create a dataframe for plotting
scree_svi <- data.frame(
PC = paste0("PC", 1:length(ev3)),
Eigenvalue = ev3)
# Fix ordering by converting to a factor with correct levels
scree_svi$PC <- factor(scree_svi$PC, levels = paste0("PC", 1:length(ev3)))
#Plot the Scree Plot using ggplot2
ggplot(scree_svi, aes(x = PC, y = Eigenvalue)) +
geom_line(aes(group = 1), color = "red", linetype = "dashed") +
geom_point(color = "red", size = 3) +
labs(title = "Scree Plot of PCA", x = "Principal Components", y = "Eigenvalues") +
theme_minimal()
#3. Variance Explained:
#Print PCA summary (eigenvalues and explained variance)
summary(pca_svi_cdc)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 2.4878 1.5393 1.3649 1.10975 0.79008 0.66218 0.60301
## Proportion of Variance 0.4421 0.1693 0.1331 0.08797 0.04459 0.03132 0.02597
## Cumulative Proportion 0.4421 0.6113 0.7444 0.83236 0.87695 0.90827 0.93424
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 0.53908 0.49139 0.34199 0.31152 0.27893 0.22992 0.20958
## Proportion of Variance 0.02076 0.01725 0.00835 0.00693 0.00556 0.00378 0.00314
## Cumulative Proportion 0.95500 0.97224 0.98060 0.98753 0.99309 0.99686 1.00000
#Correlated four-factor solution; factor variances fixed to 1 (std.lv = TRUE) rather than the marker method
svi_cdc_model <- 'f1 =~ A1_P + A2_P + A3_P + A4_P
f2 =~ B1_P + B2_P + B3_P + B4_P
f3 =~ C1_P + C2_P
f4 =~ D1_P + D2_P + D3_P + D4_P'
fourfac14items_a <- cfa(svi_cdc_model, data=svi_cdc_C1, std.lv=TRUE)
## Warning: lavaan->lav_data_full():
## some observed variances are (at least) a factor 1000 times larger than
## others; use varTable(fit) to investigate
## Warning: lavaan->lav_data_full():
## some observed variances are larger than 1000000 use varTable(fit) to
## investigate
## Warning: lavaan->lav_model_vcov():
## Could not compute standard errors! The information matrix could not be
## inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_model_vcov():
## Could not compute standard errors! The information matrix could not be
## inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_object_post_check():
## some estimated ov variances are negative
summary(fourfac14items_a,fit.measures=TRUE, standardized=TRUE)
## lavaan 0.6-19 ended normally after 246 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 34
##
## Number of observations 54
##
## Model Test User Model:
##
## Test statistic 444.778
## Degrees of freedom 71
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 785.467
## Degrees of freedom 91
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.462
## Tucker-Lewis Index (TLI) 0.310
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2286.359
## Loglikelihood unrestricted model (H1) -2063.970
##
## Akaike (AIC) 4640.718
## Bayesian (BIC) 4708.343
## Sample-size adjusted Bayesian (SABIC) 4601.526
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.312
## 90 Percent confidence interval - lower 0.285
## 90 Percent confidence interval - upper 0.340
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.188
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## f1 =~
## A1_P -4.902 NA -4.902 -0.695
## A2_P -1.842 NA -1.842 -0.626
## A3_P 5229.526 NA 5229.526 1.000
## A4_P -5.803 NA -5.803 -0.787
## f2 =~
## B1_P -0.497 NA -0.497 -0.233
## B2_P 2.857 NA 2.857 0.635
## B3_P 1.060 NA 1.060 0.489
## B4_P 4.133 NA 4.133 1.021
## f3 =~
## C1_P 34.240 NA 34.240 3.212
## C2_P 0.024 NA 0.024 0.044
## f4 =~
## D1_P 8.118 NA 8.118 0.411
## D2_P -0.288 NA -0.288 -0.489
## D3_P -8.380 NA -8.380 -0.876
## D4_P -1.234 NA -1.234 -0.386
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## f1 ~~
## f2 -0.712 NA -0.712 -0.712
## f3 -0.146 NA -0.146 -0.146
## f4 0.962 NA 0.962 0.962
## f2 ~~
## f3 0.155 NA 0.155 0.155
## f4 -0.670 NA -0.670 -0.670
## f3 ~~
## f4 -0.223 NA -0.223 -0.223
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .A1_P 25.675 NA 25.675 0.517
## .A2_P 5.252 NA 5.252 0.608
## .A3_P 3051.233 NA 3051.233 0.000
## .A4_P 20.679 NA 20.679 0.380
## .B1_P 4.325 NA 4.325 0.946
## .B2_P 12.050 NA 12.050 0.596
## .B3_P 3.568 NA 3.568 0.761
## .B4_P -0.689 NA -0.689 -0.042
## .C1_P -1058.763 NA -1058.763 -9.317
## .C2_P 0.286 NA 0.286 0.998
## .D1_P 323.575 NA 323.575 0.831
## .D2_P 0.263 NA 0.263 0.761
## .D3_P 21.278 NA 21.278 0.233
## .D4_P 8.686 NA 8.686 0.851
## f1 1.000 1.000 1.000
## f2 1.000 1.000 1.000
## f3 1.000 1.000 1.000
## f4 1.000 1.000 1.000
semPaths(fourfac14items_a, what="std", edge.label.cex = 1.2)
The CFA model tested the CDC’s four-theme structure but demonstrated poor fit across multiple indices:
χ²(71) = 444.78;
p < .001;
CFI = 0.462;
TLI = 0.310;
RMSEA = 0.312;
SRMR = 0.188.
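Additionally, lavaan flagged negative variance estimates (C1_P and B4_P) and could not compute standard errors, which points to model misspecification or estimation problems. Likely contributors are the small sample size (n = 54), a factor with only two indicators (f3), and the very large variance of A3_P relative to the other indicators, which lavaan also warned about.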
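The variance-scale warning can be inspected before fitting the model by comparing the variances of the observed variables. The sketch below is base R on simulated stand-in data (the column names `pct_var` and `income_var` are hypothetical, chosen to mimic a percentage and a dollar amount); rescaling the large variable is one commonly suggested remedy, not something the original analysis did.

```r
set.seed(1)

# Simulated stand-ins: a percentage-scale variable and a dollar-scale variable.
toy <- data.frame(
  pct_var    = rnorm(54, mean = 15, sd = 5),
  income_var = rnorm(54, mean = 17000, sd = 5000)
)

# Ratio of the largest to the smallest observed variance.
vars <- sapply(toy, var)
ratio <- max(vars) / min(vars)
print(ratio > 1000)              # TRUE: same condition lavaan warns about

# Option 1: express the dollar variable in thousands.
toy_rescaled <- transform(toy, income_var = income_var / 1000)
ratio_rescaled <- max(sapply(toy_rescaled, var)) / min(sapply(toy_rescaled, var))
print(ratio_rescaled < 1000)

# Option 2: z-score every column so that all variances equal 1.
toy_z <- as.data.frame(scale(toy))
vars_z <- sapply(toy_z, var)
print(round(vars_z, 3))
```

Rescaling addresses only the numerical warning; it would not by itself fix the two-indicator factor or the small sample.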
Based on the results, does your data support the CDC’s four-theme structure? If not, where does it diverge?
Based on these results, the data do not support the CDC’s four-theme structure. Estimation problems such as negative variances and standardized loadings well above 1 or close to 0 suggest that the model is misspecified.
The model diverges in several areas:
Factor 3 (Minority Status & Language) shows instability: C1_P has a standardized loading far above 1 (3.212) and a negative residual variance, while C2_P barely loads at all (0.044), indicating that these items may not represent a cohesive latent construct.
Factor 2 (Household Composition & Disability) includes B1_P, which loads weakly (−0.233) and with the opposite sign from the other three indicators.
Factor 4 (Housing & Transportation) contains D2_P (−0.489) and D4_P (−0.386), both with weak, negative loadings, and D1_P (0.411), which loads weakly with the opposite sign.
The presence of Heywood cases (negative variances for C1_P and B4_P) further suggests that the factor structure imposed by the CDC model may not reflect the actual relationships in this sample.
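The link between an out-of-range loading and a negative variance can be verified with a short calculation. For an indicator that loads on a single factor, the standardized residual variance equals 1 minus the squared standardized loading, so any standardized loading above 1 in absolute value implies a negative residual variance. The sketch below uses base R with the Std.all loadings copied from the summary above.

```r
# Standardized loadings (Std.all) copied from the CFA summary.
std_loading <- c(C1_P = 3.212, B4_P = 1.021, A3_P = 1.000, B1_P = -0.233)

# Standardized residual variance for a single-factor indicator: 1 - lambda^2.
std_resid_var <- 1 - std_loading^2
print(round(std_resid_var, 3))
# C1_P: -9.317, B4_P: -0.042, A3_P: 0.000, B1_P: 0.946

# Indicators with a negative residual variance (Heywood cases).
heywood <- names(std_resid_var)[std_resid_var < 0]
print(heywood)
# "C1_P" "B4_P"
```

These values match the Std.all column of the Variances block, which confirms that the negative variances follow directly from the inflated loadings.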
Are there any indicators (variables) that load poorly onto their expected factors?
Repeat problem 3 but replace C1_P with C1_NEW_P.
#Keep only the 14 indicators in the theoretical four-theme model, now with C1_NEW_P in place of C1_P.
svi_cdc_C1_NEW <- svi_clean %>% select(A1_P, A2_P, A3_P, A4_P, B1_P, B2_P, B3_P, B4_P, C1_NEW_P, C2_P, D1_P, D2_P, D3_P, D4_P)
#Correlated four-factor solution; factor variances fixed to 1 (std.lv = TRUE) rather than the marker method
svi_cdc_model <- 'f1 =~ A1_P + A2_P + A3_P + A4_P
f2 =~ B1_P + B2_P + B3_P + B4_P
f3 =~ C1_NEW_P + C2_P
f4 =~ D1_P + D2_P + D3_P + D4_P'
fourfac14items_b <- cfa(svi_cdc_model, data=svi_cdc_C1_NEW, std.lv=TRUE)
## Warning: lavaan->lav_data_full():
## some observed variances are (at least) a factor 1000 times larger than
## others; use varTable(fit) to investigate
## Warning: lavaan->lav_data_full():
## some observed variances are larger than 1000000 use varTable(fit) to
## investigate
## Warning: lavaan->lav_model_vcov():
## Could not compute standard errors! The information matrix could not be
## inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_model_vcov():
## Could not compute standard errors! The information matrix could not be
## inverted. This may be a symptom that the model is not identified.
## Warning: lavaan->lav_object_post_check():
## covariance matrix of latent variables is not positive definite ; use
## lavInspect(fit, "cov.lv") to investigate.
summary(fourfac14items_b,fit.measures=TRUE, standardized=TRUE)
## lavaan 0.6-19 ended normally after 79 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 34
##
## Number of observations 54
##
## Model Test User Model:
##
## Test statistic 361.158
## Degrees of freedom 71
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 780.138
## Degrees of freedom 91
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.579
## Tucker-Lewis Index (TLI) 0.460
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2267.617
## Loglikelihood unrestricted model (H1) -2087.038
##
## Akaike (AIC) 4603.234
## Bayesian (BIC) 4670.860
## Sample-size adjusted Bayesian (SABIC) 4564.042
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.275
## 90 Percent confidence interval - lower 0.247
## 90 Percent confidence interval - upper 0.304
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.179
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## f1 =~
## A1_P 5.876 NA 5.876 0.833
## A2_P 2.021 NA 2.021 0.687
## A3_P -4601.424 NA -4601.424 -0.880
## A4_P 6.691 NA 6.691 0.907
## f2 =~
## B1_P 0.119 NA 0.119 0.056
## B2_P 3.081 NA 3.081 0.685
## B3_P 1.314 NA 1.314 0.607
## B4_P 3.165 NA 3.165 0.782
## f3 =~
## C1_NEW_P 12.839 NA 12.839 0.834
## C2_P 0.232 NA 0.232 0.434
## f4 =~
## D1_P 7.223 NA 7.223 0.366
## D2_P -0.169 NA -0.169 -0.288
## D3_P -7.189 NA -7.189 -0.751
## D4_P -1.692 NA -1.692 -0.529
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## f1 ~~
## f2 0.881 NA 0.881 0.881
## f3 -0.562 NA -0.562 -0.562
## f4 -1.221 NA -1.221 -1.221
## f2 ~~
## f3 -0.949 NA -0.949 -0.949
## f4 -0.920 NA -0.920 -0.920
## f3 ~~
## f4 0.738 NA 0.738 0.738
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .A1_P 15.189 NA 15.189 0.306
## .A2_P 4.561 NA 4.561 0.528
## .A3_P 6190738.785 NA 6190738.785 0.226
## .A4_P 9.600 NA 9.600 0.177
## .B1_P 4.558 NA 4.558 0.997
## .B2_P 10.724 NA 10.724 0.530
## .B3_P 2.966 NA 2.966 0.632
## .B4_P 6.378 NA 6.378 0.389
## .C1_NEW_P 72.415 NA 72.415 0.305
## .C2_P 0.233 NA 0.233 0.812
## .D1_P 336.696 NA 336.696 0.866
## .D2_P 0.317 NA 0.317 0.917
## .D3_P 39.856 NA 39.856 0.435
## .D4_P 7.348 NA 7.348 0.720
## f1 1.000 1.000 1.000
## f2 1.000 1.000 1.000
## f3 1.000 1.000 1.000
## f4 1.000 1.000 1.000
semPaths(fourfac14items_b, what="std", edge.label.cex = 1.2)
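#Chi-square difference test. Note: the two models use different observed variables and have the same degrees of freedom, so they are not nested and the test below is not valid.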
anova(fourfac14items_a, fourfac14items_b)
## Warning: lavaan->lavTestLRT():
## some models are based on a different set of observed variables
## Warning: lavaan->lavTestLRT():
## Some restricted models fit better than less restricted models; either
## these models are not nested, or the less restricted model failed to reach
## a global optimum.Smallest difference = -83.61966233577.
## Warning: lavaan->lavTestLRT():
## some models have the same degrees of freedom
##
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff RMSEA Df diff Pr(>Chisq)
## fourfac14items_a 71 4640.7 4708.3 444.78
## fourfac14items_b 71 4603.2 4670.9 361.16 -83.62 0 0
#For all fit indices
fitMeasures(fourfac14items_a, c("cfi", "tli", "rmsea", "srmr", "aic", "bic"))
## cfi tli rmsea srmr aic bic
## 0.462 0.310 0.312 0.188 4640.718 4708.343
fitMeasures(fourfac14items_b, c("cfi", "tli", "rmsea", "srmr", "aic", "bic"))
## cfi tli rmsea srmr aic bic
## 0.579 0.460 0.275 0.179 4603.234 4670.860
χ²(71) = 361.16;
p < .001;
CFI = 0.579;
TLI = 0.460;
RMSEA = 0.275 (90% CI: [0.247, 0.304]);
SRMR = 0.179.
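The very high RMSEA and the low CFI and TLI values indicate substantial model misfit. In addition, lavaan could not compute standard errors and reported that the covariance matrix of the latent variables is not positive definite; the estimated correlation between f1 and f4 (−1.221) lies outside the admissible range. These issues may be due to model misspecification, the limited sample size (n = 54), or factors defined by too few indicators. Further refinement or respecification of the model is recommended.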
Based on the results, does your data support the CDC’s four-theme structure? If not, where does it diverge?
Based on the results, the data still do not support the CDC’s four-theme structure. The model demonstrated poor fit across all major indices (CFI = 0.579, TLI = 0.460, RMSEA = 0.275, SRMR = 0.179), and the chi-square test was significant (χ²(71) = 361.16, p < .001), indicating a mismatch between the hypothesized model and the observed data.
The divergence appears in the way certain indicators load onto the proposed factors. For example, the minority status variables (C1_NEW_P and C2_P), which are theoretically grouped under one theme, load onto different components in the exploratory analysis, suggesting they represent distinct constructs. Within the CFA itself, B1_P (0.056) and D2_P (−0.288) load weakly on their assigned factors. The low TLI and high RMSEA may also reflect factors that have too few indicators or weak loadings, which undermines the stability of the CDC’s predefined structure.
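However, compared to the first model (with C1_P, labeled CDC), the alternative model (with C1_NEW_P, labeled GU) showed better values on multiple indices (CFI (CDC) = 0.462 vs. CFI (GU) = 0.579, TLI (CDC) = 0.310 vs. TLI (GU) = 0.460, RMSEA (CDC) = 0.312 vs. RMSEA (GU) = 0.275, AIC (CDC) = 4640.718 vs. AIC (GU) = 4603.234), suggesting that the adapted variable fits the four-theme structure somewhat better. Both models remain far from acceptable fit, and because they are fitted to different sets of observed variables, the AIC and chi-square values are not strictly comparable.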
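The comparison can be laid out side by side and checked against commonly cited rules of thumb for acceptable fit. The sketch below is base R with the index values copied from the `fitMeasures()` output above; the cutoffs (CFI and TLI of at least 0.95, RMSEA of at most 0.06, SRMR of at most 0.08) are conventional guidelines assumed for illustration, not thresholds stated in the assignment.

```r
# Fit indices copied from the fitMeasures() output for both models.
fit <- data.frame(
  model = c("CDC (C1_P)", "GU (C1_NEW_P)"),
  cfi   = c(0.462, 0.579),
  tli   = c(0.310, 0.460),
  rmsea = c(0.312, 0.275),
  srmr  = c(0.188, 0.179)
)

# Conventional rule-of-thumb cutoffs (an assumption for this sketch).
fit$acceptable <- with(fit,
  cfi >= 0.95 & tli >= 0.95 & rmsea <= 0.06 & srmr <= 0.08)

# Change in each index when C1_P is replaced by C1_NEW_P.
change <- unlist(fit[2, c("cfi", "tli", "rmsea", "srmr")]) -
          unlist(fit[1, c("cfi", "tli", "rmsea", "srmr")])

print(fit)
print(round(change, 3))
```

Every index moves in the better direction for the GU model, yet neither model comes close to the conventional cutoffs.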
• Are there any indicators (variables) that load poorly onto their expected factors?