PCA & Factor Analysis - Data Sosial Ekonomi BPS

1. Data Loading and Cleaning

This analysis uses provincial socioeconomic data from BPS (Badan Pusat Statistik). The dataset contains multiple numerical variables representing various social and economic indicators across Indonesian provinces. Non-analytical columns such as province names, year, poverty line values, and poverty headcount figures were excluded from the analysis, as they either serve as identifiers or are directly derived from the target variables. After removing non-analytical variables and performing manual imputation prior to loading, the dataset was ready for multivariate analysis.

df <- read.csv("sps data bps - full imputasi manual - data bps (2).csv",
               stringsAsFactors = FALSE, check.names = FALSE)

head(df)

##           PROVINSI TAHUN APS 13-15 APS 16-18 APS 19-24 TPT - Februari
## 1             ACEH  2021     98,42     83,28     32,61            6,3
## 2   SUMATERA UTARA  2021     96,99     78,66     27,05           6,01
## 3   SUMATERA BARAT  2021     96,63     84,07     36,41           6,67
## 4             RIAU  2021     95,66     77,81     28,79           4,96
## 5            JAMBI  2021     96,39      72,5     24,14           4,76
## 6 SUMATERA SELATAN  2021     94,85     71,53     18,81           5,17
##   TPT - Agustus TPAK - Februari TPAK - Agustus Garis Kemiskinan - Maret (Rp)
## 1           6,3           65,14          63,78                        541109
## 2          6,33           69,39           69,1                        525756
## 3          6,52           68,41          67,72                        568703
## 4          4,42           65,81          65,03                        565937
## 5          5,09            67,3          67,17                        506355
## 6          4,98           69,95          68,77                        457455
##   Garis Kemiskinan - September (Rp) Jumlah Penduduk Miskin - Maret (ribu)
## 1                            665855                                834,24
## 2                            648336                               1343,86
## 3                            714991                                370,67
## 4                            702620                                500,81
## 5                            658100                                293,86
## 6                            564462                               1113,76
##   Jumlah Penduduk Miskin - September (ribu) Persentase Penduduk Miskin - Maret
## 1                                    718,96                              15,33
## 2                                   1110,92                               9,01
## 3                                    315,43                               6,63
## 4                                    473,04                               7,12
## 5                                     272,7                               8,09
## 6                                    948,84                              12,84
##   Persentase Penduduk Miskin - September Indeks Pembangunan Manusia
## 1                                  12,64                      73,48
## 2                                   7,19                      73,84
## 3                                   5,42                      74,56
## 4                                   6,36                      73,89
## 5                                   7,26                      72,62
## 6                                  10,51                      71,83
##   Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
## 1                                                                   9,77
## 2                                                                   9,88
## 3                                                                   9,46
## 4                                                                   9,52
## 5                                                                   9,03
## 6                                                                   8,78
##    Harapan Lama Sekolah (Tahun)\n
## 1                           14,36
## 2                           13,27
## 3                           14,09
## 4                           13,28
## 5                           13,04
## 6                           12,54

summary(df)

##    PROVINSI             TAHUN       APS 13-15          APS 16-18        
##  Length:152         Min.   :2021   Length:152         Length:152        
##  Class :character   1st Qu.:2022   Class :character   Class :character  
##  Mode  :character   Median :2022   Mode  :character   Mode  :character  
##                     Mean   :2022                                        
##                     3rd Qu.:2023                                        
##                     Max.   :2024                                        
##   APS 19-24         TPT - Februari     TPT - Agustus      TPAK - Februari   
##  Length:152         Length:152         Length:152         Length:152        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  TPAK - Agustus     Garis Kemiskinan - Maret (Rp)
##  Length:152         Min.   : 364251              
##  Class :character   1st Qu.: 490383              
##  Mode  :character   Median : 569536              
##                     Mean   : 583579              
##                     3rd Qu.: 662321              
##                     Max.   :1007060              
##  Garis Kemiskinan - September (Rp) Jumlah Penduduk Miskin - Maret (ribu)
##  Min.   :460283                    Length:152                           
##  1st Qu.:547751                    Class :character                     
##  Median :646222                    Mode  :character                     
##  Mean   :652547                                                         
##  3rd Qu.:708811                                                         
##  Max.   :917673                                                         
##  Jumlah Penduduk Miskin - September (ribu) Persentase Penduduk Miskin - Maret
##  Length:152                                Length:152                        
##  Class :character                          Class :character                  
##  Mode  :character                          Mode  :character                  
##                                                                              
##                                                                              
##                                                                              
##  Persentase Penduduk Miskin - September Indeks Pembangunan Manusia
##  Length:152                             Length:152                
##  Class :character                       Class :character          
##  Mode  :character                       Mode  :character          
##                                                                   
##                                                                   
##                                                                   
##  Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
##  Length:152                                                            
##  Class :character                                                      
##  Mode  :character                                                      
##                                                                        
##                                                                        
##                                                                        
##   Harapan Lama Sekolah (Tahun)\n
##  Length:152                     
##  Class :character               
##  Mode  :character               
##                                 
##                                 
##

str(df)

## 'data.frame':    152 obs. of  18 variables:
##  $ PROVINSI                                                              : chr  "ACEH" "SUMATERA UTARA" "SUMATERA BARAT" "RIAU" ...
##  $ TAHUN                                                                 : int  2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
##  $ APS 13-15                                                             : chr  "98,42" "96,99" "96,63" "95,66" ...
##  $ APS 16-18                                                             : chr  "83,28" "78,66" "84,07" "77,81" ...
##  $ APS 19-24                                                             : chr  "32,61" "27,05" "36,41" "28,79" ...
##  $ TPT - Februari                                                        : chr  "6,3" "6,01" "6,67" "4,96" ...
##  $ TPT - Agustus                                                         : chr  "6,3" "6,33" "6,52" "4,42" ...
##  $ TPAK - Februari                                                       : chr  "65,14" "69,39" "68,41" "65,81" ...
##  $ TPAK - Agustus                                                        : chr  "63,78" "69,1" "67,72" "65,03" ...
##  $ Garis Kemiskinan - Maret (Rp)                                         : int  541109 525756 568703 565937 506355 457455 548934 471439 752203 642425 ...
##  $ Garis Kemiskinan - September (Rp)                                     : int  665855 648336 714991 702620 658100 564462 672816 599018 917673 807602 ...
##  $ Jumlah Penduduk Miskin - Maret (ribu)                                 : chr  "834,24" "1343,86" "370,67" "500,81" ...
##  $ Jumlah Penduduk Miskin - September (ribu)                             : chr  "718,96" "1110,92" "315,43" "473,04" ...
##  $ Persentase Penduduk Miskin - Maret                                    : chr  "15,33" "9,01" "6,63" "7,12" ...
##  $ Persentase Penduduk Miskin - September                                : chr  "12,64" "7,19" "5,42" "6,36" ...
##  $ Indeks Pembangunan Manusia                                            : chr  "73,48" "73,84" "74,56" "73,89" ...
##  $ Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi: chr  "9,77" "9,88" "9,46" "9,52" ...
##  $  Harapan Lama Sekolah (Tahun)
##                                        : chr  "14,36" "13,27" "14,09" "13,28" ...

colnames(df)

##  [1] "PROVINSI"                                                              
##  [2] "TAHUN"                                                                 
##  [3] "APS 13-15"                                                             
##  [4] "APS 16-18"                                                             
##  [5] "APS 19-24"                                                             
##  [6] "TPT - Februari"                                                        
##  [7] "TPT - Agustus"                                                         
##  [8] "TPAK - Februari"                                                       
##  [9] "TPAK - Agustus"                                                        
## [10] "Garis Kemiskinan - Maret (Rp)"                                         
## [11] "Garis Kemiskinan - September (Rp)"                                     
## [12] "Jumlah Penduduk Miskin - Maret (ribu)"                                 
## [13] "Jumlah Penduduk Miskin - September (ribu)"                             
## [14] "Persentase Penduduk Miskin - Maret"                                    
## [15] "Persentase Penduduk Miskin - September"                                
## [16] "Indeks Pembangunan Manusia"                                            
## [17] "Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi"
## [18] " Harapan Lama Sekolah (Tahun)\n"

for (col in colnames(df)) {
  if (is.character(df[[col]]) && col != "PROVINSI") {
    df[[col]] <- as.numeric(gsub(",", ".", df[[col]]))
  }
}

drop_vars <- c(
  "PROVINSI",
  "TAHUN",
  "Garis Kemiskinan - Maret (Rp)",
  "Garis Kemiskinan - September (Rp)",
  "Jumlah Penduduk Miskin - Maret (ribu)",
  "Jumlah Penduduk Miskin - September (ribu)"
)

data_clean <- df[, !(colnames(df) %in% drop_vars)]
data_clean <- data_clean[, sapply(data_clean, is.numeric)]
data_clean <- na.omit(data_clean)

str(data_clean)

## 'data.frame':    152 obs. of  12 variables:
##  $ APS 13-15                                                             : num  98.4 97 96.6 95.7 96.4 ...
##  $ APS 16-18                                                             : num  83.3 78.7 84.1 77.8 72.5 ...
##  $ APS 19-24                                                             : num  32.6 27.1 36.4 28.8 24.1 ...
##  $ TPT - Februari                                                        : num  6.3 6.01 6.67 4.96 4.76 ...
##  $ TPT - Agustus                                                         : num  6.3 6.33 6.52 4.42 5.09 4.98 3.65 4.69 5.03 9.91 ...
##  $ TPAK - Februari                                                       : num  65.1 69.4 68.4 65.8 67.3 ...
##  $ TPAK - Agustus                                                        : num  63.8 69.1 67.7 65 67.2 ...
##  $ Persentase Penduduk Miskin - Maret                                    : num  15.33 9.01 6.63 7.12 8.09 ...
##  $ Persentase Penduduk Miskin - September                                : num  12.64 7.19 5.42 6.36 7.26 ...
##  $ Indeks Pembangunan Manusia                                            : num  73.5 73.8 74.6 73.9 72.6 ...
##  $ Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi: num  9.77 9.88 9.46 9.52 9.03 ...
##  $  Harapan Lama Sekolah (Tahun)
##                                        : num  14.4 13.3 14.1 13.3 13 ...

dim(data_clean)

## [1] 152  12

colSums(is.na(df[, !(colnames(df) %in% drop_vars)]))

##                                                              APS 13-15 
##                                                                      0 
##                                                              APS 16-18 
##                                                                      0 
##                                                              APS 19-24 
##                                                                      0 
##                                                         TPT - Februari 
##                                                                      0 
##                                                          TPT - Agustus 
##                                                                      0 
##                                                        TPAK - Februari 
##                                                                      0 
##                                                         TPAK - Agustus 
##                                                                      0 
##                                     Persentase Penduduk Miskin - Maret 
##                                                                      0 
##                                 Persentase Penduduk Miskin - September 
##                                                                      0 
##                                             Indeks Pembangunan Manusia 
##                                                                      0 
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi 
##                                                                      0 
##                                         Harapan Lama Sekolah (Tahun)\n 
##                                                                      0

2. Sample Size Adequacy

Before proceeding with Factor Analysis, it is important to assess whether the available sample size is sufficient relative to the number of variables. This dataset consists of provincial observations from BPS, which typically yields around 34 to 38 observations (one per province per year). Factor Analysis requires adequate sample size to produce stable and replicable factor solutions. Two commonly applied guidelines are: a minimum ratio of 5 to 10 observations per variable (Hair et al., 2019), and an absolute minimum of n > 100 observations as a traditional rule of thumb. When sample size is small, results should be interpreted with caution, and communalities and factor loadings should be cross-validated if possible.

n_obs  <- nrow(data_clean)
n_vars <- ncol(data_clean)
ratio  <- n_obs / n_vars

cat("Number of observations (n)   :", n_obs,  "\n")

## Number of observations (n)   : 152

cat("Number of variables (p)      :", n_vars, "\n")

## Number of variables (p)      : 12

cat("Observation-to-variable ratio:", round(ratio, 2), "\n\n")

## Observation-to-variable ratio: 12.67

if (ratio >= 10) {
  cat("Ratio >= 10: Sample size is adequate (Hair et al., 2019).\n")
} else if (ratio >= 5) {
  cat("Ratio 5-10: Marginally adequate. Interpret results with caution.\n")
} else {
  cat("Ratio < 5: Sample size may be insufficient. Results should be treated as exploratory.\n")
}

## Ratio >= 10: Sample size is adequate (Hair et al., 2019).

if (n_obs >= 100) {
  cat("n >= 100: Meets traditional absolute minimum.\n")
} else {
  cat("n < 100: Below traditional absolute minimum. FA results may be unstable.\n")
  cat("Interpretation should be treated as preliminary and exploratory.\n")
}

## n >= 100: Meets traditional absolute minimum.

3. Data Distribution and Normality

The histograms below provide a visual overview of each variable’s distribution. Variables tied to economic conditions such as expenditure or poverty-related indicators typically exhibit right-skewed distributions, while variables related to access and participation rates tend to be more symmetrically distributed.

The Shapiro-Wilk test is included for descriptive purposes only. Because this analysis uses Principal Axis Factoring (PAF) as the extraction method, strict multivariate normality is not required. Normality assessment would only be critical if Maximum Likelihood estimation were used. Results from the Shapiro-Wilk test are therefore interpreted as supplementary distributional information rather than a prerequisite check.

normality_results <- data.frame(
  Variable  = colnames(data_clean),
  Shapiro_W = NA_real_,
  p_value   = NA_real_,
  Skewness  = NA_real_,
  Kurtosis  = NA_real_,
  stringsAsFactors = FALSE
)

for (i in seq_along(colnames(data_clean))) {
  col <- colnames(data_clean)[i]
  x   <- data_clean[[col]]
  sw  <- shapiro.test(x)
  normality_results$Shapiro_W[i] <- round(sw$statistic, 4)
  normality_results$p_value[i]   <- round(sw$p.value, 4)
  normality_results$Skewness[i]  <- round(psych::skew(x), 4)
  normality_results$Kurtosis[i]  <- round(psych::kurtosi(x), 4)
}

normality_results$Normal <- ifelse(normality_results$p_value > 0.05, "Yes", "No")
print(normality_results)

##                                                                  Variable
## 1                                                               APS 13-15
## 2                                                               APS 16-18
## 3                                                               APS 19-24
## 4                                                          TPT - Februari
## 5                                                           TPT - Agustus
## 6                                                         TPAK - Februari
## 7                                                          TPAK - Agustus
## 8                                      Persentase Penduduk Miskin - Maret
## 9                                  Persentase Penduduk Miskin - September
## 10                                             Indeks Pembangunan Manusia
## 11 Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
## 12                                         Harapan Lama Sekolah (Tahun)\n
##    Shapiro_W p_value Skewness Kurtosis Normal
## 1     0.7139  0.0000  -1.8599   2.4117     No
## 2     0.9729  0.0042   0.2860  -0.5467     No
## 3     0.8821  0.0000   1.5081   3.7814     No
## 4     0.9086  0.0000   0.9788   0.3434     No
## 5     0.9555  0.0001   0.6822   0.2138     No
## 6     0.9662  0.0009   0.5784  -0.0068     No
## 7     0.9428  0.0000   0.5368  -0.4535     No
## 8     0.8742  0.0000   1.0844   0.3242     No
## 9     0.8530  0.0000   1.3093   1.2354     No
## 10    0.9403  0.0000  -0.4440   0.3897     No
## 11    0.9641  0.0005  -0.6898   1.1973     No
## 12    0.9099  0.0000  -0.4487   2.1534     No

n_cols_plot <- 3
n_rows_plot <- ceiling(n_vars / n_cols_plot)

colors <- colorRampPalette(c("lightblue", "steelblue", "lightgreen",
                              "lightpink", "khaki", "plum"))(n_vars)

par(mfrow = c(n_rows_plot, n_cols_plot), mar = c(4, 4, 3, 1))

for (i in seq_along(colnames(data_clean))) {
  col <- colnames(data_clean)[i]
  hist(data_clean[[col]],
       main   = paste("Distribution of", col),
       xlab   = col,
       col    = colors[i],
       border = "white",
       breaks = 15)
}

par(mfrow = c(1, 1))

4. Outlier Detection

Outliers were assessed using the Z-score method. Observations with a Z-score greater than plus or minus 3 are considered extreme outliers. Extreme outliers in socioeconomic data often reflect genuine variation between provinces. DKI Jakarta, for example, typically has very different income or expenditure levels compared to other provinces. Rather than removing such observations, they should be noted and interpreted with care in subsequent analyses.

z_scores      <- scale(data_clean)
outlier_count <- colSums(abs(z_scores) > 3)
print(data.frame(Variable = names(outlier_count), Outliers_Z3 = outlier_count))

##                                                                                                                                      Variable
## APS 13-15                                                                                                                           APS 13-15
## APS 16-18                                                                                                                           APS 16-18
## APS 19-24                                                                                                                           APS 19-24
## TPT - Februari                                                                                                                 TPT - Februari
## TPT - Agustus                                                                                                                   TPT - Agustus
## TPAK - Februari                                                                                                               TPAK - Februari
## TPAK - Agustus                                                                                                                 TPAK - Agustus
## Persentase Penduduk Miskin - Maret                                                                         Persentase Penduduk Miskin - Maret
## Persentase Penduduk Miskin - September                                                                 Persentase Penduduk Miskin - September
## Indeks Pembangunan Manusia                                                                                         Indeks Pembangunan Manusia
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
##  Harapan Lama Sekolah (Tahun)\n                                                                                Harapan Lama Sekolah (Tahun)\n
##                                                                        Outliers_Z3
## APS 13-15                                                                        0
## APS 16-18                                                                        0
## APS 19-24                                                                        4
## TPT - Februari                                                                   1
## TPT - Agustus                                                                    2
## TPAK - Februari                                                                  0
## TPAK - Agustus                                                                   0
## Persentase Penduduk Miskin - Maret                                               1
## Persentase Penduduk Miskin - September                                           4
## Indeks Pembangunan Manusia                                                       0
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi           1
##  Harapan Lama Sekolah (Tahun)\n                                                  1

par(mfrow = c(n_rows_plot, n_cols_plot), mar = c(4, 4, 3, 1))

for (i in seq_along(colnames(data_clean))) {
  col <- colnames(data_clean)[i]
  boxplot(data_clean[[col]],
          main       = col,
          col        = colors[i],
          border     = "gray30",
          horizontal = TRUE)
}

par(mfrow = c(1, 1))

5. Correlation Matrix

The correlation matrix was computed to examine linear relationships among all variables. For Factor Analysis to be appropriate, there should be sufficient intercorrelation among variables generally, pairs with an absolute correlation of at least 0.30 are desirable. Variables with very low correlations across the board contribute little shared variance and may not be suitable for factor extraction.

The determinant of the correlation matrix is reported as a check for multicollinearity. A determinant close to zero indicates extreme multicollinearity that can cause numerical instability. Variable pairs with absolute correlation exceeding 0.90 are additionally flagged, as extreme correlations may indicate near-redundancy and could destabilise the factor solution.

cor_matrix <- cor(data_clean)
round(cor_matrix, 3)

##                                                                        APS 13-15
## APS 13-15                                                                  1.000
## APS 16-18                                                                  0.715
## APS 19-24                                                                  0.327
## TPT - Februari                                                             0.443
## TPT - Agustus                                                              0.516
## TPAK - Februari                                                           -0.566
## TPAK - Agustus                                                            -0.624
## Persentase Penduduk Miskin - Maret                                        -0.657
## Persentase Penduduk Miskin - September                                    -0.652
## Indeks Pembangunan Manusia                                                 0.785
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi     0.732
##  Harapan Lama Sekolah (Tahun)\n                                            0.735
##                                                                        APS 16-18
## APS 13-15                                                                  0.715
## APS 16-18                                                                  1.000
## APS 19-24                                                                  0.654
## TPT - Februari                                                             0.209
## TPT - Agustus                                                              0.236
## TPAK - Februari                                                           -0.218
## TPAK - Agustus                                                            -0.229
## Persentase Penduduk Miskin - Maret                                        -0.260
## Persentase Penduduk Miskin - September                                    -0.273
## Indeks Pembangunan Manusia                                                 0.539
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi     0.658
##  Harapan Lama Sekolah (Tahun)\n                                            0.790
##                                                                        APS 19-24
## APS 13-15                                                                  0.327
## APS 16-18                                                                  0.654
## APS 19-24                                                                  1.000
## TPT - Februari                                                            -0.070
## TPT - Agustus                                                             -0.044
## TPAK - Februari                                                           -0.066
## TPAK - Agustus                                                            -0.055
## Persentase Penduduk Miskin - Maret                                         0.038
## Persentase Penduduk Miskin - September                                     0.038
## Indeks Pembangunan Manusia                                                 0.265
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi     0.369
##  Harapan Lama Sekolah (Tahun)\n                                            0.728
##                                                                        TPT - Februari
## APS 13-15                                                                       0.443
## APS 16-18                                                                       0.209
## APS 19-24                                                                      -0.070
## TPT - Februari                                                                  1.000
## TPT - Agustus                                                                   0.953
## TPAK - Februari                                                                -0.611
## TPAK - Agustus                                                                 -0.622
## Persentase Penduduk Miskin - Maret                                             -0.378
## Persentase Penduduk Miskin - September                                         -0.389
## Indeks Pembangunan Manusia                                                      0.467
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi          0.538
##  Harapan Lama Sekolah (Tahun)\n                                                 0.152
##                                                                        TPT - Agustus
## APS 13-15                                                                      0.516
## APS 16-18                                                                      0.236
## APS 19-24                                                                     -0.044
## TPT - Februari                                                                 0.953
## TPT - Agustus                                                                  1.000
## TPAK - Februari                                                               -0.613
## TPAK - Agustus                                                                -0.649
## Persentase Penduduk Miskin - Maret                                            -0.433
## Persentase Penduduk Miskin - September                                        -0.441
## Indeks Pembangunan Manusia                                                     0.502
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi         0.539
##  Harapan Lama Sekolah (Tahun)\n                                                0.193
##                                                                        TPAK - Februari
## APS 13-15                                                                       -0.566
## APS 16-18                                                                       -0.218
## APS 19-24                                                                       -0.066
## TPT - Februari                                                                  -0.611
## TPT - Agustus                                                                   -0.613
## TPAK - Februari                                                                  1.000
## TPAK - Agustus                                                                   0.916
## Persentase Penduduk Miskin - Maret                                               0.563
## Persentase Penduduk Miskin - September                                           0.555
## Indeks Pembangunan Manusia                                                      -0.522
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi          -0.610
##  Harapan Lama Sekolah (Tahun)\n                                                 -0.364
##                                                                        TPAK - Agustus
## APS 13-15                                                                      -0.624
## APS 16-18                                                                      -0.229
## APS 19-24                                                                      -0.055
## TPT - Februari                                                                 -0.622
## TPT - Agustus                                                                  -0.649
## TPAK - Februari                                                                 0.916
## TPAK - Agustus                                                                  1.000
## Persentase Penduduk Miskin - Maret                                              0.596
## Persentase Penduduk Miskin - September                                          0.584
## Indeks Pembangunan Manusia                                                     -0.552
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi         -0.616
##  Harapan Lama Sekolah (Tahun)\n                                                -0.378
##                                                                        Persentase Penduduk Miskin - Maret
## APS 13-15                                                                                          -0.657
## APS 16-18                                                                                          -0.260
## APS 19-24                                                                                           0.038
## TPT - Februari                                                                                     -0.378
## TPT - Agustus                                                                                      -0.433
## TPAK - Februari                                                                                     0.563
## TPAK - Agustus                                                                                      0.596
## Persentase Penduduk Miskin - Maret                                                                  1.000
## Persentase Penduduk Miskin - September                                                              0.967
## Indeks Pembangunan Manusia                                                                         -0.812
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi                             -0.553
##  Harapan Lama Sekolah (Tahun)\n                                                                    -0.410
##                                                                        Persentase Penduduk Miskin - September
## APS 13-15                                                                                              -0.652
## APS 16-18                                                                                              -0.273
## APS 19-24                                                                                               0.038
## TPT - Februari                                                                                         -0.389
## TPT - Agustus                                                                                          -0.441
## TPAK - Februari                                                                                         0.555
## TPAK - Agustus                                                                                          0.584
## Persentase Penduduk Miskin - Maret                                                                      0.967
## Persentase Penduduk Miskin - September                                                                  1.000
## Indeks Pembangunan Manusia                                                                             -0.797
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi                                 -0.530
##  Harapan Lama Sekolah (Tahun)\n                                                                        -0.393
##                                                                        Indeks Pembangunan Manusia
## APS 13-15                                                                                   0.785
## APS 16-18                                                                                   0.539
## APS 19-24                                                                                   0.265
## TPT - Februari                                                                              0.467
## TPT - Agustus                                                                               0.502
## TPAK - Februari                                                                            -0.522
## TPAK - Agustus                                                                             -0.552
## Persentase Penduduk Miskin - Maret                                                         -0.812
## Persentase Penduduk Miskin - September                                                     -0.797
## Indeks Pembangunan Manusia                                                                  1.000
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi                      0.700
##  Harapan Lama Sekolah (Tahun)\n                                                             0.625
##                                                                        Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
## APS 13-15                                                                                                                               0.732
## APS 16-18                                                                                                                               0.658
## APS 19-24                                                                                                                               0.369
## TPT - Februari                                                                                                                          0.538
## TPT - Agustus                                                                                                                           0.539
## TPAK - Februari                                                                                                                        -0.610
## TPAK - Agustus                                                                                                                         -0.616
## Persentase Penduduk Miskin - Maret                                                                                                     -0.553
## Persentase Penduduk Miskin - September                                                                                                 -0.530
## Indeks Pembangunan Manusia                                                                                                              0.700
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi                                                                  1.000
##  Harapan Lama Sekolah (Tahun)\n                                                                                                         0.677
##                                                                         Harapan Lama Sekolah (Tahun)\n
## APS 13-15                                                                                        0.735
## APS 16-18                                                                                        0.790
## APS 19-24                                                                                        0.728
## TPT - Februari                                                                                   0.152
## TPT - Agustus                                                                                    0.193
## TPAK - Februari                                                                                 -0.364
## TPAK - Agustus                                                                                  -0.378
## Persentase Penduduk Miskin - Maret                                                              -0.410
## Persentase Penduduk Miskin - September                                                          -0.393
## Indeks Pembangunan Manusia                                                                       0.625
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi                           0.677
##  Harapan Lama Sekolah (Tahun)\n                                                                  1.000

det_val <- det(cor_matrix)
cat("Determinant of Correlation Matrix:", round(det_val, 8), "\n")

## Determinant of Correlation Matrix: 1.9e-07

if (det_val > 0.00001) {
  cat("No extreme multicollinearity detected. Matrix is suitable for FA.\n")
} else {
  cat("Determinant is very small. Check for multicollinearity.\n")
}

## Determinant is very small. Check for multicollinearity.

cor_tmp <- cor_matrix
diag(cor_tmp) <- NA
high_corr_pairs <- which(abs(cor_tmp) > 0.90, arr.ind = TRUE)
high_corr_pairs <- high_corr_pairs[high_corr_pairs[, 1] < high_corr_pairs[, 2], , drop = FALSE]

if (nrow(high_corr_pairs) > 0) {
  cat("\nVariable pairs with |r| > 0.90 (potential near-redundancy):\n")
  for (k in seq_len(nrow(high_corr_pairs))) {
    r1 <- rownames(cor_matrix)[high_corr_pairs[k, 1]]
    r2 <- colnames(cor_matrix)[high_corr_pairs[k, 2]]
    cat(sprintf("  %s  &  %s  : r = %.4f\n",
                r1, r2, cor_matrix[high_corr_pairs[k, 1], high_corr_pairs[k, 2]]))
  }
} else {
  cat("\nNo variable pairs exceed |r| = 0.90. No near-redundancy detected.\n")
}

## 
## Variable pairs with |r| > 0.90 (potential near-redundancy):
##   TPT - Februari  &  TPT - Agustus  : r = 0.9530
##   TPAK - Februari  &  TPAK - Agustus  : r = 0.9165
##   Persentase Penduduk Miskin - Maret  &  Persentase Penduduk Miskin - September  : r = 0.9667

6. Correlation Heatmap

The correlation heatmap provides a visual representation of the correlation structure. Blue shades indicate positive correlations, while red shades indicate negative correlations. Clusters of correlated variables provide an initial indication of potential latent factor groupings.

corrplot(cor_matrix,
         method        = "color",
         type          = "full",
         order         = "hclust",
         hclust.method = "ward.D2",
         col           = colorRampPalette(c("red", "white", "blue"))(200),
         addCoef.col   = "black",
         number.cex    = 0.5,
         tl.cex        = 0.6,
         tl.col        = "black",
         diag          = FALSE,
         mar           = c(0, 0, 1.5, 0))

corr_tmp2 <- cor_matrix
corr_tmp2[lower.tri(corr_tmp2, diag = TRUE)] <- NA
corr_table <- na.omit(as.data.frame(as.table(corr_tmp2)))
colnames(corr_table) <- c("Variable_1", "Variable_2", "Correlation")
corr_table <- corr_table[order(-abs(corr_table$Correlation)), ]
head(corr_table, 5)

##                                 Variable_1
## 104     Persentase Penduduk Miskin - Maret
## 52                          TPT - Februari
## 78                         TPAK - Februari
## 116     Persentase Penduduk Miskin - Maret
## 117 Persentase Penduduk Miskin - September
##                                 Variable_2 Correlation
## 104 Persentase Penduduk Miskin - September   0.9667073
## 52                           TPT - Agustus   0.9530010
## 78                          TPAK - Agustus   0.9164744
## 116             Indeks Pembangunan Manusia  -0.8121996
## 117             Indeks Pembangunan Manusia  -0.7974913

7. Kaiser-Meyer-Olkin (KMO) Test

The KMO measure assesses the proportion of variance among variables that might be common variance, that is, variance attributable to underlying factors. Values above 0.70 are considered acceptable, with higher values indicating greater suitability for Factor Analysis.

KMO Value	Interpretation
>= 0.90	Marvelous
>= 0.80	Meritorious
>= 0.70	Middling
>= 0.60	Mediocre
< 0.50	Unacceptable

kmo_result <- KMO(data_clean)

interpret_kmo <- function(val) {
  if      (val >= 0.90) "Marvelous"
  else if (val >= 0.80) "Meritorious"
  else if (val >= 0.70) "Middling"
  else if (val >= 0.60) "Mediocre"
  else                  "Unacceptable"
}

cat("Overall KMO    :", round(kmo_result$MSA, 4), "\n")

## Overall KMO    : 0.8192

cat("Interpretation :", interpret_kmo(kmo_result$MSA), "\n\n")

## Interpretation : Meritorious

cat("KMO per Variable:\n")

## KMO per Variable:

print(data.frame(Variable = names(kmo_result$MSAi),
                 KMO      = round(kmo_result$MSAi, 4)))

##                                                                                                                                      Variable
## APS 13-15                                                                                                                           APS 13-15
## APS 16-18                                                                                                                           APS 16-18
## APS 19-24                                                                                                                           APS 19-24
## TPT - Februari                                                                                                                 TPT - Februari
## TPT - Agustus                                                                                                                   TPT - Agustus
## TPAK - Februari                                                                                                               TPAK - Februari
## TPAK - Agustus                                                                                                                 TPAK - Agustus
## Persentase Penduduk Miskin - Maret                                                                         Persentase Penduduk Miskin - Maret
## Persentase Penduduk Miskin - September                                                                 Persentase Penduduk Miskin - September
## Indeks Pembangunan Manusia                                                                                         Indeks Pembangunan Manusia
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
##  Harapan Lama Sekolah (Tahun)\n                                                                                Harapan Lama Sekolah (Tahun)\n
##                                                                           KMO
## APS 13-15                                                              0.8396
## APS 16-18                                                              0.8067
## APS 19-24                                                              0.6362
## TPT - Februari                                                         0.7176
## TPT - Agustus                                                          0.7462
## TPAK - Februari                                                        0.8311
## TPAK - Agustus                                                         0.8252
## Persentase Penduduk Miskin - Maret                                     0.7843
## Persentase Penduduk Miskin - September                                 0.8115
## Indeks Pembangunan Manusia                                             0.9351
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi 0.9307
##  Harapan Lama Sekolah (Tahun)\n                                        0.8480

8. Bartlett’s Test of Sphericity

Bartlett’s test examines whether the correlation matrix is an identity matrix. If the correlation matrix were an identity matrix, all variables would be completely uncorrelated and Factor Analysis would be meaningless. A significant result (p < 0.05) indicates that the variables are sufficiently correlated to justify Factor Analysis.

bartlett_result <- cortest.bartlett(cor_matrix, n = nrow(data_clean))

cat("Bartlett's Test of Sphericity\n")

## Bartlett's Test of Sphericity

cat("Chi-square :", round(bartlett_result$chisq, 4), "\n")

## Chi-square : 2263.651

cat("df         :", bartlett_result$df, "\n")

## df         : 66

cat("p-value    :", format(bartlett_result$p.value, scientific = TRUE), "\n")

## p-value    : 0e+00

if (bartlett_result$p.value < 0.05) {
  cat("p < 0.05: H0 rejected. Data is suitable for Factor Analysis.\n")
} else {
  cat("p >= 0.05: H0 not rejected. Data may not be suitable for Factor Analysis.\n")
}

## p < 0.05: H0 rejected. Data is suitable for Factor Analysis.

9. Standardization

Standardization using Z-score transformation was applied after the KMO and Bartlett tests, because both tests are based on the correlation matrix which is scale-invariant. Standardization is essential for PCA, as variables with larger variances would otherwise dominate the principal components. The same standardized data object X_scaled is used as input for both PCA and Factor Analysis to ensure a fair and consistent comparison.

X_scaled <- scale(data_clean)

cat("Mean after scaling (should be ~0):", round(mean(colMeans(X_scaled)), 6), "\n")

## Mean after scaling (should be ~0): 0

cat("SD after scaling   (should be ~1):", round(mean(apply(X_scaled, 2, sd)), 6), "\n")

## SD after scaling   (should be ~1): 1

10. Principal Component Analysis (PCA)

PCA was conducted to reduce dimensionality and identify the main components that explain the maximum variance in the socioeconomic dataset. Unlike Factor Analysis, PCA is based on total variance and does not distinguish between common and unique variance. The number of components to retain was determined using three criteria: the Kaiser criterion (eigenvalue > 1), the cumulative variance threshold (>= 60%), and Parallel Analysis. Parallel Analysis is used as the primary criterion, as it is generally considered the most statistically rigorous method (Hayton et al., 2004).

pca_result        <- prcomp(X_scaled, center = FALSE, scale. = FALSE)
pca_eigenvalues   <- pca_result$sdev^2
pca_var_explained <- pca_eigenvalues / sum(pca_eigenvalues) * 100
pca_cum_var       <- cumsum(pca_var_explained)

pca_table <- data.frame(
  Component      = 1:length(pca_eigenvalues),
  Eigenvalue     = round(pca_eigenvalues, 4),
  Pct_Variance   = round(pca_var_explained, 2),
  Cumulative_Pct = round(pca_cum_var, 2),
  Kaiser         = ifelse(pca_eigenvalues > 1, "Yes", "No")
)
print(pca_table)

##    Component Eigenvalue Pct_Variance Cumulative_Pct Kaiser
## 1          1     6.6180        55.15          55.15    Yes
## 2          2     2.3112        19.26          74.41    Yes
## 3          3     1.2471        10.39          84.80    Yes
## 4          4     0.6838         5.70          90.50     No
## 5          5     0.3457         2.88          93.38     No
## 6          6     0.2683         2.24          95.62     No
## 7          7     0.1524         1.27          96.89     No
## 8          8     0.1384         1.15          98.04     No
## 9          9     0.1000         0.83          98.87     No
## 10        10     0.0696         0.58          99.45     No
## 11        11     0.0368         0.31          99.76     No
## 12        12     0.0288         0.24         100.00     No

n_kaiser_pca <- sum(pca_eigenvalues > 1)
n_cumvar_pca <- which(pca_cum_var >= 60)[1]

cat("\nKaiser Criterion           :", n_kaiser_pca, "components\n")

## 
## Kaiser Criterion           : 3 components

cat("Cumulative Variance >= 60% :", n_cumvar_pca,
    "components (", round(pca_cum_var[n_cumvar_pca], 2), "%)\n")

## Cumulative Variance >= 60% : 2 components ( 74.41 %)

pa_pca   <- fa.parallel(X_scaled,
                        fa          = "pc",
                        n.iter      = 100,
                        main        = "Parallel Analysis - PCA",
                        show.legend = TRUE)

## Parallel analysis suggests that the number of factors =  NA  and the number of components =  2

n_pa_pca <- pa_pca$ncomp

cat("Parallel Analysis          :", n_pa_pca, "components\n\n")

## Parallel Analysis          : 2 components

n_components <- n_pa_pca
cat("Final decision (Parallel Analysis as primary criterion):", n_components, "components\n")

## Final decision (Parallel Analysis as primary criterion): 2 components

10.1 Scree Plot

plot(pca_eigenvalues,
     type  = "b",
     pch   = 19,
     col   = "darkblue",
     xlab  = "Principal Component",
     ylab  = "Eigenvalue",
     main  = "Scree Plot - PCA",
     ylim  = c(0, max(pca_eigenvalues) * 1.1))

abline(h = 1,            col = "red",       lty = 2, lwd = 1.5)
abline(v = n_components, col = "darkgreen", lty = 3, lwd = 1.5)

legend("topright",
       legend = c("Kaiser Criterion (= 1)",
                  paste0("Selected n = ", n_components, " (Parallel Analysis)")),
       col    = c("red", "darkgreen"),
       lty    = c(2, 3),
       bty    = "n")

10.2 PCA Loadings

The loadings represent the correlation between original variables and each principal component.

print(round(pca_result$rotation, 4))

##                                                                            PC1
## APS 13-15                                                              -0.3444
## APS 16-18                                                              -0.2414
## APS 19-24                                                              -0.1183
## TPT - Februari                                                         -0.2564
## TPT - Agustus                                                          -0.2716
## TPAK - Februari                                                         0.2992
## TPAK - Agustus                                                          0.3097
## Persentase Penduduk Miskin - Maret                                      0.3054
## Persentase Penduduk Miskin - September                                  0.3028
## Indeks Pembangunan Manusia                                             -0.3393
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.3317
##  Harapan Lama Sekolah (Tahun)\n                                        -0.2721
##                                                                            PC2
## APS 13-15                                                              -0.1317
## APS 16-18                                                              -0.4317
## APS 19-24                                                              -0.5363
## TPT - Februari                                                          0.3078
## TPT - Agustus                                                           0.2936
## TPAK - Februari                                                        -0.2160
## TPAK - Agustus                                                         -0.2198
## Persentase Penduduk Miskin - Maret                                     -0.1438
## Persentase Penduduk Miskin - September                                 -0.1465
## Indeks Pembangunan Manusia                                             -0.0457
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.1115
##  Harapan Lama Sekolah (Tahun)\n                                        -0.4196
##                                                                            PC3
## APS 13-15                                                              -0.0619
## APS 16-18                                                               0.1325
## APS 19-24                                                               0.2155
## TPT - Februari                                                          0.4354
## TPT - Agustus                                                           0.3914
## TPAK - Februari                                                        -0.1470
## TPAK - Agustus                                                         -0.1271
## Persentase Penduduk Miskin - Maret                                      0.4883
## Persentase Penduduk Miskin - September                                  0.4817
## Indeks Pembangunan Manusia                                             -0.2533
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.1307
##  Harapan Lama Sekolah (Tahun)\n                                         0.0005
##                                                                            PC4
## APS 13-15                                                              -0.0860
## APS 16-18                                                              -0.2402
## APS 19-24                                                               0.1771
## TPT - Februari                                                         -0.3353
## TPT - Agustus                                                          -0.3399
## TPAK - Februari                                                        -0.5763
## TPAK - Agustus                                                         -0.5192
## Persentase Penduduk Miskin - Maret                                      0.0489
## Persentase Penduduk Miskin - September                                  0.0772
## Indeks Pembangunan Manusia                                             -0.2269
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.0150
##  Harapan Lama Sekolah (Tahun)\n                                         0.1175
##                                                                            PC5
## APS 13-15                                                              -0.4181
## APS 16-18                                                              -0.3790
## APS 19-24                                                               0.6405
## TPT - Februari                                                          0.1801
## TPT - Agustus                                                           0.1867
## TPAK - Februari                                                         0.0319
## TPAK - Agustus                                                          0.0750
## Persentase Penduduk Miskin - Maret                                     -0.1525
## Persentase Penduduk Miskin - September                                 -0.1647
## Indeks Pembangunan Manusia                                              0.2299
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.2979
##  Harapan Lama Sekolah (Tahun)\n                                         0.0733
##                                                                            PC6
## APS 13-15                                                               0.4926
## APS 16-18                                                               0.1232
## APS 19-24                                                               0.0135
## TPT - Februari                                                         -0.0217
## TPT - Agustus                                                           0.1648
## TPAK - Februari                                                        -0.0019
## TPAK - Agustus                                                         -0.1514
## Persentase Penduduk Miskin - Maret                                      0.0674
## Persentase Penduduk Miskin - September                                 -0.0080
## Indeks Pembangunan Manusia                                             -0.0594
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.8221
##  Harapan Lama Sekolah (Tahun)\n                                         0.0858
##                                                                            PC7
## APS 13-15                                                              -0.2687
## APS 16-18                                                               0.5385
## APS 19-24                                                               0.1605
## TPT - Februari                                                          0.0374
## TPT - Agustus                                                           0.0400
## TPAK - Februari                                                        -0.1273
## TPAK - Agustus                                                          0.0392
## Persentase Penduduk Miskin - Maret                                     -0.2074
## Persentase Penduduk Miskin - September                                 -0.4046
## Indeks Pembangunan Manusia                                             -0.5470
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.0805
##  Harapan Lama Sekolah (Tahun)\n                                        -0.2810
##                                                                            PC8
## APS 13-15                                                               0.0661
## APS 16-18                                                              -0.3249
## APS 19-24                                                              -0.1154
## TPT - Februari                                                          0.0502
## TPT - Agustus                                                           0.1811
## TPAK - Februari                                                         0.2268
## TPAK - Agustus                                                          0.0575
## Persentase Penduduk Miskin - Maret                                     -0.1700
## Persentase Penduduk Miskin - September                                 -0.1274
## Indeks Pembangunan Manusia                                             -0.5918
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.1073
##  Harapan Lama Sekolah (Tahun)\n                                         0.6163
##                                                                            PC9
## APS 13-15                                                               0.3851
## APS 16-18                                                              -0.2245
## APS 19-24                                                               0.3634
## TPT - Februari                                                         -0.2088
## TPT - Agustus                                                           0.1431
## TPAK - Februari                                                         0.4237
## TPAK - Agustus                                                         -0.3147
## Persentase Penduduk Miskin - Maret                                     -0.0589
## Persentase Penduduk Miskin - September                                 -0.0324
## Indeks Pembangunan Manusia                                             -0.1742
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.2525
##  Harapan Lama Sekolah (Tahun)\n                                        -0.4774
##                                                                           PC10
## APS 13-15                                                               0.4131
## APS 16-18                                                              -0.2461
## APS 19-24                                                               0.1701
## TPT - Februari                                                         -0.0266
## TPT - Agustus                                                           0.0363
## TPAK - Februari                                                        -0.4974
## TPAK - Agustus                                                          0.6578
## Persentase Penduduk Miskin - Maret                                      0.0052
## Persentase Penduduk Miskin - September                                 -0.0374
## Indeks Pembangunan Manusia                                             -0.1293
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.0931
##  Harapan Lama Sekolah (Tahun)\n                                        -0.1758
##                                                                           PC11
## APS 13-15                                                              -0.1911
## APS 16-18                                                               0.1076
## APS 19-24                                                              -0.0811
## TPT - Februari                                                         -0.6285
## TPT - Agustus                                                           0.6380
## TPAK - Februari                                                        -0.1204
## TPAK - Agustus                                                          0.0690
## Persentase Penduduk Miskin - Maret                                     -0.2130
## Persentase Penduduk Miskin - September                                  0.2672
## Indeks Pembangunan Manusia                                              0.0774
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.0102
##  Harapan Lama Sekolah (Tahun)\n                                         0.0327
##                                                                           PC12
## APS 13-15                                                              -0.0581
## APS 16-18                                                              -0.0568
## APS 19-24                                                              -0.0584
## TPT - Februari                                                         -0.2476
## TPT - Agustus                                                           0.2146
## TPAK - Februari                                                        -0.0049
## TPAK - Agustus                                                         -0.0092
## Persentase Penduduk Miskin - Maret                                      0.7048
## Persentase Penduduk Miskin - September                                 -0.6065
## Indeks Pembangunan Manusia                                              0.0905
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.0713
##  Harapan Lama Sekolah (Tahun)\n                                         0.0687

10.3 Component Scores

Component scores represent the transformed coordinates of provinces in the reduced-dimensional space.

pca_scores <- as.data.frame(pca_result$x)
head(pca_scores)

##          PC1         PC2        PC3        PC4        PC5         PC6
## 1 -2.3385944 -0.81753946  1.8216797  0.6196472 -0.2868767  0.52823632
## 2 -1.4780383  0.21602878  0.2746672 -0.8369233 -0.1982245 -0.15525600
## 3 -2.5170324 -0.92239718  0.6337863 -0.5938328  0.8828663  0.38296703
## 4 -1.4767519  0.09712381 -0.4223455  0.9035499 -0.1313916 -0.05285344
## 5 -0.6330726  0.86905284 -0.5752711  0.4319450 -0.2429342  0.29389941
## 6  0.5386700  1.37184033 -0.2174194 -0.2567865 -0.7299706  0.24436089
##            PC7         PC8          PC9        PC10        PC11        PC12
## 1 -0.015815183  0.04570412 -0.353356031 -0.46766128  0.05419049  0.18219926
## 2  0.274518462  0.15055580  0.139154946  0.01102609  0.13234260  0.12159473
## 3  0.908669444  0.19548492 -0.136271556 -0.28384336 -0.04055757 -0.11948260
## 4  0.450826559 -0.33586797  0.004319478 -0.27874119 -0.11557566 -0.08304870
## 5  0.003528384  0.11061277  0.091832455  0.06327489  0.14111966  0.05254338
## 6 -0.282485122 -0.04758487 -0.037531699 -0.17180244 -0.05797341  0.15314752

10.4 Biplot

The biplot simultaneously visualizes provinces and variables in the space of the first two principal components.

biplot(pca_result, scale = 0, cex = 0.6)

11. Determining the Number of Factors

Three complementary methods were used to determine the appropriate number of factors: the Kaiser criterion (eigenvalue > 1), the Scree Plot, and Parallel Analysis. Parallel Analysis is the primary criterion, as it compares observed eigenvalues against those generated from random data of the same dimensions, providing a statistically principled threshold that avoids over- or under-extraction.

Because X_scaled is the input for Factor Analysis, its covariance matrix is equivalent to the correlation matrix of the original variables. The eigenvalues below are therefore identical to those derived from the PCA step above. They are presented here alongside the FA-specific Parallel Analysis for transparency.

11.1 Eigenvalue Table

eigen_values <- pca_eigenvalues
cum_var_fa   <- cumsum(eigen_values / sum(eigen_values)) * 100

eigen_table <- data.frame(
  Component      = 1:length(eigen_values),
  Eigenvalue     = round(eigen_values, 4),
  Pct_Variance   = round(eigen_values / sum(eigen_values) * 100, 2),
  Cumulative_Pct = round(cum_var_fa, 2),
  Kaiser         = ifelse(eigen_values > 1, "Yes", "No")
)
print(eigen_table)

##    Component Eigenvalue Pct_Variance Cumulative_Pct Kaiser
## 1          1     6.6180        55.15          55.15    Yes
## 2          2     2.3112        19.26          74.41    Yes
## 3          3     1.2471        10.39          84.80    Yes
## 4          4     0.6838         5.70          90.50     No
## 5          5     0.3457         2.88          93.38     No
## 6          6     0.2683         2.24          95.62     No
## 7          7     0.1524         1.27          96.89     No
## 8          8     0.1384         1.15          98.04     No
## 9          9     0.1000         0.83          98.87     No
## 10        10     0.0696         0.58          99.45     No
## 11        11     0.0368         0.31          99.76     No
## 12        12     0.0288         0.24         100.00     No

n_kaiser <- sum(eigen_values > 1)
cat("\nFactors with eigenvalue > 1 (Kaiser):", n_kaiser, "\n")

## 
## Factors with eigenvalue > 1 (Kaiser): 3

11.2 Scree Plot

plot(eigen_values,
     type  = "b",
     pch   = 19,
     col   = "steelblue",
     xlab  = "Component",
     ylab  = "Eigenvalue",
     main  = "Scree Plot - Factor Analysis",
     ylim  = c(0, max(eigen_values) * 1.1))

abline(h = 1, col = "red", lty = 2, lwd = 1.5)
legend("topright", legend = "Kaiser Criterion (= 1)",
       col = "red", lty = 2, bty = "n")

11.3 Parallel Analysis for Factor Analysis

The result of Parallel Analysis using fa = “fa” is saved and used as the primary criterion for determining the number of factors to extract. This is consistent with the approach used for PCA, ensuring methodological coherence across both analyses.

pa_fa   <- fa.parallel(X_scaled,
                       fa          = "fa",
                       n.iter      = 100,
                       main        = "Parallel Analysis - Factor Analysis",
                       show.legend = TRUE)

## Parallel analysis suggests that the number of factors =  3  and the number of components =  NA

n_pa_fa <- pa_fa$nfact

cat("Kaiser Criterion  :", n_kaiser, "factors\n")

## Kaiser Criterion  : 3 factors

cat("Parallel Analysis :", n_pa_fa,  "factors\n\n")

## Parallel Analysis : 3 factors

n_factors <- n_pa_fa
cat("Final decision (Parallel Analysis as primary criterion):", n_factors, "factors\n")

## Final decision (Parallel Analysis as primary criterion): 3 factors

12. Factor Extraction

Based on Parallel Analysis as the primary criterion, n_factors factors were extracted using Principal Axis Factoring (PAF). PAF is preferred over Maximum Likelihood when the normality assumption is not fully met, which is common in real-world socioeconomic data (Fabrigar et al., 1999).

fa_result <- fa(X_scaled,
                nfactors = n_factors,
                rotate   = "none",
                fm       = "pa")

cat("Factor Variance (Before Rotation):\n")

## Factor Variance (Before Rotation):

print(fa_result$Vaccounted)

##                             PA1       PA2        PA3
## SS loadings           6.4306748 2.0970008 1.13581570
## Proportion Var        0.5358896 0.1747501 0.09465131
## Cumulative Var        0.5358896 0.7106396 0.80529094
## Proportion Explained  0.6654608 0.2170024 0.11753679
## Cumulative Proportion 0.6654608 0.8824632 1.00000000

13. Factor Rotation

Two rotation methods are applied for comparison. Varimax (orthogonal) is used when factors are assumed to be uncorrelated, while Oblimin (oblique) is used when factors are allowed to correlate. In social science research, oblique rotation is often more theoretically realistic because underlying constructs rarely operate entirely independently. The factor correlation matrix Phi from Oblimin is inspected to determine whether orthogonality is a reasonable assumption: if inter-factor correlations are below 0.30, Varimax is appropriate; if they are 0.30 or above, Oblimin is preferred.

13.1 Varimax Rotation (Orthogonal)

fa_varimax <- fa(X_scaled,
                 nfactors = n_factors,
                 rotate   = "varimax",
                 fm       = "pa")

cat("Factor Loadings (Varimax):\n")

## Factor Loadings (Varimax):

print(fa_varimax$loadings, cutoff = 0.30, sort = TRUE)

## 
## Loadings:
##                                                                        PA1   
## TPT - Februari                                                          0.925
## TPT - Agustus                                                           0.903
## TPAK - Februari                                                        -0.670
## TPAK - Agustus                                                         -0.689
## Persentase Penduduk Miskin - Maret                                           
## Persentase Penduduk Miskin - September                                       
## Indeks Pembangunan Manusia                                              0.345
## APS 13-15                                                               0.404
## APS 16-18                                                                    
## APS 19-24                                                                    
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.506
##  Harapan Lama Sekolah (Tahun)\n                                              
##                                                                        PA3   
## TPT - Februari                                                               
## TPT - Agustus                                                                
## TPAK - Februari                                                         0.414
## TPAK - Agustus                                                          0.445
## Persentase Penduduk Miskin - Maret                                      0.953
## Persentase Penduduk Miskin - September                                  0.925
## Indeks Pembangunan Manusia                                             -0.698
## APS 13-15                                                              -0.539
## APS 16-18                                                                    
## APS 19-24                                                                    
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.385
##  Harapan Lama Sekolah (Tahun)\n                                        -0.311
##                                                                        PA2   
## TPT - Februari                                                               
## TPT - Agustus                                                                
## TPAK - Februari                                                              
## TPAK - Agustus                                                               
## Persentase Penduduk Miskin - Maret                                           
## Persentase Penduduk Miskin - September                                       
## Indeks Pembangunan Manusia                                              0.434
## APS 13-15                                                               0.588
## APS 16-18                                                               0.859
## APS 19-24                                                               0.804
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.575
##  Harapan Lama Sekolah (Tahun)\n                                         0.891
## 
##                  PA1   PA3   PA2
## SS loadings    3.309 3.244 3.110
## Proportion Var 0.276 0.270 0.259
## Cumulative Var 0.276 0.546 0.805

13.2 Oblimin Rotation (Oblique)

fa_oblimin <- fa(X_scaled,
                 nfactors = n_factors,
                 rotate   = "oblimin",
                 fm       = "pa")

cat("Pattern Matrix (Oblimin):\n")

## Pattern Matrix (Oblimin):

print(fa_oblimin$loadings, cutoff = 0.30, sort = TRUE)

## 
## Loadings:
##                                                                        PA2   
## APS 13-15                                                               0.532
## APS 16-18                                                               0.883
## APS 19-24                                                               0.870
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.528
##  Harapan Lama Sekolah (Tahun)\n                                         0.901
## TPT - Februari                                                               
## TPT - Agustus                                                                
## TPAK - Februari                                                              
## TPAK - Agustus                                                               
## Persentase Penduduk Miskin - Maret                                           
## Persentase Penduduk Miskin - September                                       
## Indeks Pembangunan Manusia                                              0.352
##                                                                        PA3   
## APS 13-15                                                                    
## APS 16-18                                                                    
## APS 19-24                                                                    
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.402
##  Harapan Lama Sekolah (Tahun)\n                                              
## TPT - Februari                                                          0.994
## TPT - Agustus                                                           0.949
## TPAK - Februari                                                        -0.610
## TPAK - Agustus                                                         -0.621
## Persentase Penduduk Miskin - Maret                                           
## Persentase Penduduk Miskin - September                                       
## Indeks Pembangunan Manusia                                                   
##                                                                        PA1   
## APS 13-15                                                              -0.394
## APS 16-18                                                                    
## APS 19-24                                                                    
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi       
##  Harapan Lama Sekolah (Tahun)\n                                              
## TPT - Februari                                                               
## TPT - Agustus                                                                
## TPAK - Februari                                                              
## TPAK - Agustus                                                               
## Persentase Penduduk Miskin - Maret                                      1.013
## Persentase Penduduk Miskin - September                                  0.977
## Indeks Pembangunan Manusia                                             -0.625
## 
##                  PA2   PA3   PA1
## SS loadings    3.048 2.901 2.822
## Proportion Var 0.254 0.242 0.235
## Cumulative Var 0.254 0.496 0.731

cat("\nFactor Correlation Matrix (Phi):\n")

## 
## Factor Correlation Matrix (Phi):

print(round(fa_oblimin$Phi, 4))

##         PA2     PA3     PA1
## PA2  1.0000  0.2544 -0.3448
## PA3  0.2544  1.0000 -0.5100
## PA1 -0.3448 -0.5100  1.0000

14. Model Evaluation

14.1 Communalities

Communalities represent the proportion of each variable’s variance explained by the extracted factors. A communality of 0.50 or above is generally considered acceptable, indicating that the factor model accounts for at least half of the variance in each variable.

comm_df <- data.frame(
  Variable    = names(fa_varimax$communality),
  Communality = round(fa_varimax$communality, 4),
  Status      = ifelse(fa_varimax$communality >= 0.5, "Adequate", "Low (<0.5)")
)
print(comm_df)

##                                                                                                                                      Variable
## APS 13-15                                                                                                                           APS 13-15
## APS 16-18                                                                                                                           APS 16-18
## APS 19-24                                                                                                                           APS 19-24
## TPT - Februari                                                                                                                 TPT - Februari
## TPT - Agustus                                                                                                                   TPT - Agustus
## TPAK - Februari                                                                                                               TPAK - Februari
## TPAK - Agustus                                                                                                                 TPAK - Agustus
## Persentase Penduduk Miskin - Maret                                                                         Persentase Penduduk Miskin - Maret
## Persentase Penduduk Miskin - September                                                                 Persentase Penduduk Miskin - September
## Indeks Pembangunan Manusia                                                                                         Indeks Pembangunan Manusia
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
##  Harapan Lama Sekolah (Tahun)\n                                                                                Harapan Lama Sekolah (Tahun)\n
##                                                                        Communality
## APS 13-15                                                                   0.7998
## APS 16-18                                                                   0.7828
## APS 19-24                                                                   0.6618
## TPT - Februari                                                              0.8739
## TPT - Agustus                                                               0.8551
## TPAK - Februari                                                             0.6434
## TPAK - Agustus                                                              0.6964
## Persentase Penduduk Miskin - Maret                                          0.9836
## Persentase Penduduk Miskin - September                                      0.9352
## Indeks Pembangunan Manusia                                                  0.7945
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi      0.7346
##  Harapan Lama Sekolah (Tahun)\n                                             0.9024
##                                                                          Status
## APS 13-15                                                              Adequate
## APS 16-18                                                              Adequate
## APS 19-24                                                              Adequate
## TPT - Februari                                                         Adequate
## TPT - Agustus                                                          Adequate
## TPAK - Februari                                                        Adequate
## TPAK - Agustus                                                         Adequate
## Persentase Penduduk Miskin - Maret                                     Adequate
## Persentase Penduduk Miskin - September                                 Adequate
## Indeks Pembangunan Manusia                                             Adequate
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi Adequate
##  Harapan Lama Sekolah (Tahun)\n                                        Adequate

low_comm <- comm_df$Variable[fa_varimax$communality < 0.5]
if (length(low_comm) > 0) {
  cat("\nVariables with low communality:", paste(low_comm, collapse = ", "), "\n")
} else {
  cat("\nAll variables have communality >= 0.50.\n")
}

## 
## All variables have communality >= 0.50.

14.2 Factor Loadings Heatmap

The heatmap displays the Varimax factor loading matrix. Darker blue indicates a strong positive loading and darker red indicates a strong negative loading. Variables with a loading magnitude above 0.50 are considered to have a substantial relationship with the corresponding factor.

loadings_matrix <- matrix(fa_varimax$loadings,
                          nrow = nrow(fa_varimax$loadings),
                          ncol = ncol(fa_varimax$loadings))
rownames(loadings_matrix) <- rownames(fa_varimax$loadings)
colnames(loadings_matrix) <- paste0("Factor ", 1:n_factors)

corrplot(loadings_matrix,
         is.corr     = FALSE,
         col         = colorRampPalette(c("red", "white", "blue"))(200),
         tl.cex      = 0.7,
         tl.col      = "black",
         cl.lim      = c(-1, 1),
         addCoef.col = "black",
         number.cex  = 0.65,
         title       = "Factor Loadings Heatmap (Varimax)",
         mar         = c(0, 0, 2, 0))

14.3 Total Variance Explained

var_table <- fa_varimax$Vaccounted
print(round(var_table, 4))

##                          PA1    PA3    PA2
## SS loadings           3.3093 3.2443 3.1099
## Proportion Var        0.2758 0.2704 0.2592
## Cumulative Var        0.2758 0.5461 0.8053
## Proportion Explained  0.3425 0.3357 0.3218
## Cumulative Proportion 0.3425 0.6782 1.0000

total_var <- sum(var_table["Proportion Var", ]) * 100
cat("\nTotal Variance Explained:", round(total_var, 2), "%\n")

## 
## Total Variance Explained: 80.53 %

if (total_var >= 60) {
  cat("Total variance >= 60%: Model is acceptable (Hair et al., 2019).\n")
} else {
  cat("Total variance < 60%: Consider extracting additional factors.\n")
}

## Total variance >= 60%: Model is acceptable (Hair et al., 2019).

prop_var      <- var_table["Proportion Var", ] * 100
cum_var_f     <- cumsum(prop_var)
factor_labels <- paste0("Factor ", 1:n_factors)

barplot(prop_var,
        names.arg = factor_labels,
        col       = "steelblue",
        border    = "white",
        ylim      = c(0, max(cum_var_f) * 1.15),
        ylab      = "% Variance Explained",
        main      = "Variance Explained per Factor")

lines(x   = seq(0.7, by = 1.2, length.out = n_factors),
      y   = cum_var_f,
      col = "red", type = "b", pch = 19, lwd = 2)

abline(h = 60, col = "darkgreen", lty = 2, lwd = 1.5)

legend("topright",
       legend = c("% per Factor", "Cumulative %", "60% Target"),
       col    = c("steelblue", "red", "darkgreen"),
       lty    = c(NA, 1, 2), pch = c(15, 19, NA),
       bty    = "n")

15. Factor Interpretation and Naming

The factor structure was interpreted based on the pattern of high loadings (absolute loading >= 0.40) within each factor. Variables that load strongly on the same factor are assumed to share a common underlying construct. Factor names should be assigned based on the theoretical meaning of the variables grouped together, following Hair et al. (2019) who recommend that factor naming be grounded in substantive theory rather than statistical criteria alone.

loadings_df <- as.data.frame(matrix(fa_varimax$loadings,
                                    nrow = nrow(fa_varimax$loadings),
                                    ncol = ncol(fa_varimax$loadings)))
rownames(loadings_df) <- rownames(fa_varimax$loadings)
colnames(loadings_df) <- paste0("Factor", 1:n_factors)

for (i in 1:n_factors) {
  col_name  <- paste0("Factor", i)
  mask      <- abs(loadings_df[[col_name]]) >= 0.40
  vars_in_f <- loadings_df[mask, col_name, drop = FALSE]
  vars_in_f <- vars_in_f[order(-abs(vars_in_f[[1]])), , drop = FALSE]

  cat(paste0("Factor ", i, " (n variables = ", sum(mask), "):\n"))
  print(round(vars_in_f, 4))
  cat("Factor Name: [Fill in based on substantive theory]\n\n")
}

## Factor 1 (n variables = 6):
##                                                                        Factor1
## TPT - Februari                                                          0.9250
## TPT - Agustus                                                           0.9031
## TPAK - Agustus                                                         -0.6891
## TPAK - Februari                                                        -0.6702
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.5057
## APS 13-15                                                               0.4037
## Factor Name: [Fill in based on substantive theory]
## 
## Factor 2 (n variables = 6):
##                                        Factor2
## Persentase Penduduk Miskin - Maret      0.9534
## Persentase Penduduk Miskin - September  0.9248
## Indeks Pembangunan Manusia             -0.6981
## APS 13-15                              -0.5390
## TPAK - Agustus                          0.4451
## TPAK - Februari                         0.4144
## Factor Name: [Fill in based on substantive theory]
## 
## Factor 3 (n variables = 6):
##                                                                        Factor3
##  Harapan Lama Sekolah (Tahun)\n                                         0.8912
## APS 16-18                                                               0.8589
## APS 19-24                                                               0.8044
## APS 13-15                                                               0.5884
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.5748
## Indeks Pembangunan Manusia                                              0.4336
## Factor Name: [Fill in based on substantive theory]

16. Comparison: PCA vs Factor Analysis

This section provides a side-by-side comparison of PCA and Factor Analysis to highlight key conceptual and empirical differences. Both analyses used the same standardized input X_scaled.

16.1 Variance Explained

pca_var_selected <- pca_var_explained[1:n_components]
pca_total_var    <- sum(pca_var_selected)
fa_prop_var      <- var_table["Proportion Var", ] * 100
fa_total_var     <- sum(fa_prop_var)

cat("PCA - Variance per Component:\n")

## PCA - Variance per Component:

for (i in 1:n_components) {
  cat(sprintf("  PC%d: %.2f%%\n", i, pca_var_selected[i]))
}

##   PC1: 55.15%
##   PC2: 19.26%

cat(sprintf("  Total (%d components): %.2f%%\n\n", n_components, pca_total_var))

##   Total (2 components): 74.41%

cat("FA - Variance per Factor (Varimax):\n")

## FA - Variance per Factor (Varimax):

for (i in 1:n_factors) {
  cat(sprintf("  Factor %d: %.2f%%\n", i, fa_prop_var[i]))
}

##   Factor 1: 27.58%
##   Factor 2: 27.04%
##   Factor 3: 25.92%

cat(sprintf("  Total (%d factors): %.2f%%\n", n_factors, fa_total_var))

##   Total (3 factors): 80.53%

16.2 PCA R² vs FA Communality

In PCA, the sum of squared loadings per variable across all retained components approximates the proportion of variance reproduced (analogous to R²). In FA, this is the communality, the proportion of variance shared via the common factors. PCA R² computed from a subset of retained components is not guaranteed to exceed FA communality; the comparison below should therefore be read as a variable-level diagnostic rather than as a directional rule.

pca_load_retained <- pca_result$rotation[, 1:n_components, drop = FALSE]
pca_r2            <- rowSums(pca_load_retained^2)
fa_comm           <- fa_varimax$communality

comparison_df <- data.frame(
  Variable       = names(fa_comm),
  PCA_R2         = round(pca_r2, 4),
  FA_Communality = round(fa_comm, 4),
  Difference     = round(pca_r2 - fa_comm, 4)
)
print(comparison_df)

##                                                                                                                                      Variable
## APS 13-15                                                                                                                           APS 13-15
## APS 16-18                                                                                                                           APS 16-18
## APS 19-24                                                                                                                           APS 19-24
## TPT - Februari                                                                                                                 TPT - Februari
## TPT - Agustus                                                                                                                   TPT - Agustus
## TPAK - Februari                                                                                                               TPAK - Februari
## TPAK - Agustus                                                                                                                 TPAK - Agustus
## Persentase Penduduk Miskin - Maret                                                                         Persentase Penduduk Miskin - Maret
## Persentase Penduduk Miskin - September                                                                 Persentase Penduduk Miskin - September
## Indeks Pembangunan Manusia                                                                                         Indeks Pembangunan Manusia
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi
##  Harapan Lama Sekolah (Tahun)\n                                                                                Harapan Lama Sekolah (Tahun)\n
##                                                                        PCA_R2
## APS 13-15                                                              0.1359
## APS 16-18                                                              0.2446
## APS 19-24                                                              0.3016
## TPT - Februari                                                         0.1605
## TPT - Agustus                                                          0.1600
## TPAK - Februari                                                        0.1362
## TPAK - Agustus                                                         0.1442
## Persentase Penduduk Miskin - Maret                                     0.1139
## Persentase Penduduk Miskin - September                                 0.1132
## Indeks Pembangunan Manusia                                             0.1172
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi 0.1224
##  Harapan Lama Sekolah (Tahun)\n                                        0.2501
##                                                                        FA_Communality
## APS 13-15                                                                      0.7998
## APS 16-18                                                                      0.7828
## APS 19-24                                                                      0.6618
## TPT - Februari                                                                 0.8739
## TPT - Agustus                                                                  0.8551
## TPAK - Februari                                                                0.6434
## TPAK - Agustus                                                                 0.6964
## Persentase Penduduk Miskin - Maret                                             0.9836
## Persentase Penduduk Miskin - September                                         0.9352
## Indeks Pembangunan Manusia                                                     0.7945
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi         0.7346
##  Harapan Lama Sekolah (Tahun)\n                                                0.9024
##                                                                        Difference
## APS 13-15                                                                 -0.6638
## APS 16-18                                                                 -0.5382
## APS 19-24                                                                 -0.3602
## TPT - Februari                                                            -0.7135
## TPT - Agustus                                                             -0.6951
## TPAK - Februari                                                           -0.5072
## TPAK - Agustus                                                            -0.5522
## Persentase Penduduk Miskin - Maret                                        -0.8697
## Persentase Penduduk Miskin - September                                    -0.8220
## Indeks Pembangunan Manusia                                                -0.6772
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi    -0.6121
##  Harapan Lama Sekolah (Tahun)\n                                           -0.6523

cat("\nMean PCA R2          :", round(mean(pca_r2), 4), "\n")

## 
## Mean PCA R2          : 0.1667

cat("Mean FA Communality  :", round(mean(fa_comm), 4), "\n")

## Mean FA Communality  : 0.8053

cat("\nPCA R2 reflects only", n_components, "retained components.\n")

## 
## PCA R2 reflects only 2 retained components.

cat("When PCA R2 < FA communality, FA better captures shared variance\n")

## When PCA R2 < FA communality, FA better captures shared variance

cat("relative to the retained PCA solution.\n")

## relative to the retained PCA solution.

16.3 Summary Table

summary_table <- data.frame(
  Aspect          = c("Method objective",
                      "Variance modelled",
                      "Input data",
                      "Number retained",
                      "Rotation",
                      "Total variance explained",
                      "Interpretability"),
  PCA             = c("Dimensionality reduction",
                      "Total variance",
                      "X_scaled",
                      paste0(n_components, " components (Parallel Analysis)"),
                      "None (orthogonal by design)",
                      paste0(round(pca_total_var, 2), "%"),
                      "Linear combinations of variables"),
  Factor_Analysis = c("Latent construct discovery",
                      "Common variance only",
                      "X_scaled",
                      paste0(n_factors, " factors (Parallel Analysis, fa=fa)"),
                      "Varimax (orthogonal) & Oblimin (oblique)",
                      paste0(round(fa_total_var, 2), "%"),
                      "Underlying latent constructs")
)
print(summary_table, right = FALSE)

##   Aspect                   PCA                             
## 1 Method objective         Dimensionality reduction        
## 2 Variance modelled        Total variance                  
## 3 Input data               X_scaled                        
## 4 Number retained          2 components (Parallel Analysis)
## 5 Rotation                 None (orthogonal by design)     
## 6 Total variance explained 74.41%                          
## 7 Interpretability         Linear combinations of variables
##   Factor_Analysis                         
## 1 Latent construct discovery              
## 2 Common variance only                    
## 3 X_scaled                                
## 4 3 factors (Parallel Analysis, fa=fa)    
## 5 Varimax (orthogonal) & Oblimin (oblique)
## 6 80.53%                                  
## 7 Underlying latent constructs

16.4 Loading Pattern Comparison

n_show <- min(2, n_components, n_factors)

pca_load_df <- as.data.frame(round(pca_result$rotation[, 1:n_show, drop = FALSE], 4))
colnames(pca_load_df) <- paste0("PC", 1:n_show)

fa_load_mat <- matrix(fa_varimax$loadings,
                      nrow = nrow(fa_varimax$loadings),
                      ncol = ncol(fa_varimax$loadings))
rownames(fa_load_mat) <- rownames(fa_varimax$loadings)
fa_load_df  <- as.data.frame(round(fa_load_mat[, 1:n_show, drop = FALSE], 4))
colnames(fa_load_df) <- paste0("FA_F", 1:n_show)

print(cbind(pca_load_df, fa_load_df))

##                                                                            PC1
## APS 13-15                                                              -0.3444
## APS 16-18                                                              -0.2414
## APS 19-24                                                              -0.1183
## TPT - Februari                                                         -0.2564
## TPT - Agustus                                                          -0.2716
## TPAK - Februari                                                         0.2992
## TPAK - Agustus                                                          0.3097
## Persentase Penduduk Miskin - Maret                                      0.3054
## Persentase Penduduk Miskin - September                                  0.3028
## Indeks Pembangunan Manusia                                             -0.3393
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.3317
##  Harapan Lama Sekolah (Tahun)\n                                        -0.2721
##                                                                            PC2
## APS 13-15                                                              -0.1317
## APS 16-18                                                              -0.4317
## APS 19-24                                                              -0.5363
## TPT - Februari                                                          0.3078
## TPT - Agustus                                                           0.2936
## TPAK - Februari                                                        -0.2160
## TPAK - Agustus                                                         -0.2198
## Persentase Penduduk Miskin - Maret                                     -0.1438
## Persentase Penduduk Miskin - September                                 -0.1465
## Indeks Pembangunan Manusia                                             -0.0457
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.1115
##  Harapan Lama Sekolah (Tahun)\n                                        -0.4196
##                                                                          FA_F1
## APS 13-15                                                               0.4037
## APS 16-18                                                               0.1441
## APS 19-24                                                              -0.0768
## TPT - Februari                                                          0.9250
## TPT - Agustus                                                           0.9031
## TPAK - Februari                                                        -0.6702
## TPAK - Agustus                                                         -0.6891
## Persentase Penduduk Miskin - Maret                                     -0.2581
## Persentase Penduduk Miskin - September                                 -0.2681
## Indeks Pembangunan Manusia                                              0.3452
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi  0.5057
##  Harapan Lama Sekolah (Tahun)\n                                         0.1054
##                                                                          FA_F2
## APS 13-15                                                              -0.5390
## APS 16-18                                                              -0.1561
## APS 19-24                                                               0.0943
## TPT - Februari                                                         -0.1323
## TPT - Agustus                                                          -0.1903
## TPAK - Februari                                                         0.4144
## TPAK - Agustus                                                          0.4451
## Persentase Penduduk Miskin - Maret                                      0.9534
## Persentase Penduduk Miskin - September                                  0.9248
## Indeks Pembangunan Manusia                                             -0.6981
## Rata-Rata Lama Sekolah Penduduk Umur 15 Tahun ke Atas Menurut Provinsi -0.3853
##  Harapan Lama Sekolah (Tahun)\n                                        -0.3114

17. Conclusion

The analysis was conducted through a systematic procedure encompassing data cleaning, sample size assessment, distributional screening, outlier detection, correlation analysis, suitability testing (Bartlett and KMO), standardization, PCA, factor extraction, rotation, and model evaluation.

Parallel Analysis was consistently applied as the primary criterion for determining the number of retained components in PCA and factors in FA, ensuring full methodological coherence across both analyses. The Kaiser criterion and Scree Plot served as supplementary cross-checks. Factor extraction used Principal Axis Factoring, which does not require multivariate normality, making it appropriate for the distributional characteristics of BPS provincial data.

The Bartlett test confirmed sufficient intercorrelation among variables, and the KMO value indicated adequate sampling adequacy for Factor Analysis. Communalities indicate that the factor structure adequately captures shared variance across variables. Oblimin rotation results, particularly the Phi matrix, were used to evaluate whether orthogonal rotation via Varimax is a defensible choice.

Sample size limitations inherent to province-level BPS data were acknowledged. Given that the number of observations is typically around 34 to 38, results should be treated as exploratory and interpreted with appropriate caution, particularly if the observation-to-variable ratio falls below the 5:1 guideline. The resulting factors provide a theoretically grounded and statistically principled basis for further analysis, policy interpretation, or predictive modelling.