Penelitian ini berfokus pada analisis faktor-faktor yang mempengaruhi tingkat kebahagiaan suatu negara. Kebahagiaan atau kesejahteraan kualitas hidup tidak cukup jika hanya diukur melalui pendapatan ekonomi semata, melainkan juga harus mempertimbangkan aspek sosial dan tingkat kesehatan. Mengingat indikator-indikator tersebut sering kali memiliki korelasi dan tumpang tindih data yang tinggi, penelitian ini mengimplementasikan metode Principal Component Analysis (PCA) dan Factor Analysis (FA). Dataset yang dieksplorasi bersumber dari World Happiness Report dengan ukuran sampel sebanyak 137 negara, berfokus pada 10 variabel kuantitatif kontinu.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.0 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(psych)
##
## Attaching package: 'psych'
##
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(corrplot)
## corrplot 0.95 loaded
library(FactoMineR)
library(factoextra)
## Welcome to factoextra!
## Want to learn more? See two factoextra-related books at https://www.datanovia.com/en/product/practical-guide-to-principal-component-methods-in-r/
data <- read.csv("WHR2023.csv", check.names = FALSE)
cat("Dimensi dataset (baris, kolom)")
## Dimensi dataset (baris, kolom)
print(dim(data))
## [1] 137 19
cat("10 kolom pertama")
## 10 kolom pertama
print(head(data, 10))
## Country name Ladder score Standard error of ladder score upperwhisker
## 1 Finland 7.804 0.036 7.875
## 2 Denmark 7.586 0.041 7.667
## 3 Iceland 7.530 0.049 7.625
## 4 Israel 7.473 0.032 7.535
## 5 Netherlands 7.403 0.029 7.460
## 6 Sweden 7.395 0.037 7.468
## 7 Norway 7.315 0.044 7.402
## 8 Switzerland 7.240 0.043 7.324
## 9 Luxembourg 7.228 0.069 7.363
## 10 New Zealand 7.123 0.038 7.198
## lowerwhisker Logged GDP per capita Social support Healthy life expectancy
## 1 7.733 10.792 0.969 71.150
## 2 7.506 10.962 0.954 71.250
## 3 7.434 10.896 0.983 72.050
## 4 7.411 10.639 0.943 72.697
## 5 7.346 10.942 0.930 71.550
## 6 7.322 10.883 0.939 72.150
## 7 7.229 11.088 0.943 71.500
## 8 7.156 11.164 0.920 72.900
## 9 7.093 11.660 0.879 71.675
## 10 7.048 10.662 0.952 70.350
## Freedom to make life choices Generosity Perceptions of corruption
## 1 0.961 -0.019 0.182
## 2 0.934 0.134 0.196
## 3 0.936 0.211 0.668
## 4 0.809 -0.023 0.708
## 5 0.887 0.213 0.379
## 6 0.948 0.165 0.202
## 7 0.947 0.141 0.283
## 8 0.891 0.027 0.266
## 9 0.915 0.024 0.345
## 10 0.887 0.175 0.271
## Ladder score in Dystopia Explained by: Log GDP per capita
## 1 1.778 1.888
## 2 1.778 1.949
## 3 1.778 1.926
## 4 1.778 1.833
## 5 1.778 1.942
## 6 1.778 1.921
## 7 1.778 1.994
## 8 1.778 2.022
## 9 1.778 2.200
## 10 1.778 1.842
## Explained by: Social support Explained by: Healthy life expectancy
## 1 1.585 0.535
## 2 1.548 0.537
## 3 1.620 0.559
## 4 1.521 0.577
## 5 1.488 0.545
## 6 1.510 0.562
## 7 1.521 0.544
## 8 1.463 0.582
## 9 1.357 0.549
## 10 1.544 0.513
## Explained by: Freedom to make life choices Explained by: Generosity
## 1 0.772 0.126
## 2 0.734 0.208
## 3 0.738 0.250
## 4 0.569 0.124
## 5 0.672 0.251
## 6 0.754 0.225
## 7 0.752 0.212
## 8 0.678 0.151
## 9 0.710 0.149
## 10 0.672 0.230
## Explained by: Perceptions of corruption Dystopia + residual
## 1 0.535 2.363
## 2 0.525 2.084
## 3 0.187 2.250
## 4 0.158 2.691
## 5 0.394 2.110
## 6 0.520 1.903
## 7 0.463 1.829
## 8 0.475 1.870
## 9 0.418 1.845
## 10 0.471 1.852
data_selected <- data %>%
select(`Ladder score`,
`Standard error of ladder score`,
upperwhisker,
lowerwhisker,
`Logged GDP per capita`,
`Social support`,
`Healthy life expectancy`,
`Freedom to make life choices`,
Generosity,
`Perceptions of corruption`)
cat("Dimensi data_selected")
## Dimensi data_selected
print(dim(data_selected))
## [1] 137 10
cat("Missing values per kolom")
## Missing values per kolom
print(colSums(is.na(data_selected)))
## Ladder score Standard error of ladder score
## 0 0
## upperwhisker lowerwhisker
## 0 0
## Logged GDP per capita Social support
## 0 0
## Healthy life expectancy Freedom to make life choices
## 1 0
## Generosity Perceptions of corruption
## 0 0
cat("Head data_selected")
## Head data_selected
print(head(data_selected, 10))
## Ladder score Standard error of ladder score upperwhisker lowerwhisker
## 1 7.804 0.036 7.875 7.733
## 2 7.586 0.041 7.667 7.506
## 3 7.530 0.049 7.625 7.434
## 4 7.473 0.032 7.535 7.411
## 5 7.403 0.029 7.460 7.346
## 6 7.395 0.037 7.468 7.322
## 7 7.315 0.044 7.402 7.229
## 8 7.240 0.043 7.324 7.156
## 9 7.228 0.069 7.363 7.093
## 10 7.123 0.038 7.198 7.048
## Logged GDP per capita Social support Healthy life expectancy
## 1 10.792 0.969 71.150
## 2 10.962 0.954 71.250
## 3 10.896 0.983 72.050
## 4 10.639 0.943 72.697
## 5 10.942 0.930 71.550
## 6 10.883 0.939 72.150
## 7 11.088 0.943 71.500
## 8 11.164 0.920 72.900
## 9 11.660 0.879 71.675
## 10 10.662 0.952 70.350
## Freedom to make life choices Generosity Perceptions of corruption
## 1 0.961 -0.019 0.182
## 2 0.934 0.134 0.196
## 3 0.936 0.211 0.668
## 4 0.809 -0.023 0.708
## 5 0.887 0.213 0.379
## 6 0.948 0.165 0.202
## 7 0.947 0.141 0.283
## 8 0.891 0.027 0.266
## 9 0.915 0.024 0.345
## 10 0.887 0.175 0.271
data_clean <- na.omit(data_selected)
cat("Dimensi setelah na.omit")
## Dimensi setelah na.omit
print(dim(data_clean))
## [1] 136 10
desc <- psych::describe(data_clean)
desc_tbl <- desc[, c("n", "mean", "sd", "median", "min", "max")]
cat("Statistik Deskriptif")
## Statistik Deskriptif
print(round(desc_tbl, 3))
## n mean sd median min max
## Ladder score 136 5.54 1.14 5.69 1.86 7.80
## Standard error of ladder score 136 0.06 0.02 0.06 0.03 0.15
## upperwhisker 136 5.67 1.12 5.82 1.92 7.88
## lowerwhisker 136 5.42 1.17 5.55 1.79 7.73
## Logged GDP per capita 136 9.46 1.21 9.57 5.53 11.66
## Social support 136 0.80 0.13 0.83 0.34 0.98
## Healthy life expectancy 136 64.97 5.75 65.84 51.53 77.28
## Freedom to make life choices 136 0.79 0.11 0.80 0.38 0.96
## Generosity 136 0.02 0.14 0.00 -0.25 0.53
## Perceptions of corruption 136 0.72 0.18 0.77 0.15 0.93
data_scaled <- scale(data_clean)
cat("Cek data_scaled (mean ~ 0, sd ~ 1)")
## Cek data_scaled (mean ~ 0, sd ~ 1)
print(round(colMeans(data_scaled), 3))
## Ladder score Standard error of ladder score
## 0 0
## upperwhisker lowerwhisker
## 0 0
## Logged GDP per capita Social support
## 0 0
## Healthy life expectancy Freedom to make life choices
## 0 0
## Generosity Perceptions of corruption
## 0 0
print(round(apply(data_scaled, 2, sd), 3))
## Ladder score Standard error of ladder score
## 1 1
## upperwhisker lowerwhisker
## 1 1
## Logged GDP per capita Social support
## 1 1
## Healthy life expectancy Freedom to make life choices
## 1 1
## Generosity Perceptions of corruption
## 1 1
cor_matrix <- cor(data_scaled)
cat("Matriks korelasi (dibulatkan)")
## Matriks korelasi (dibulatkan)
print(round(cor_matrix, 3))
## Ladder score Standard error of ladder score
## Ladder score 1.000 -0.511
## Standard error of ladder score -0.511 1.000
## upperwhisker 0.999 -0.481
## lowerwhisker 0.999 -0.539
## Logged GDP per capita 0.784 -0.583
## Social support 0.838 -0.479
## Healthy life expectancy 0.747 -0.616
## Freedom to make life choices 0.662 -0.292
## Generosity 0.040 0.104
## Perceptions of corruption -0.471 0.302
## upperwhisker lowerwhisker Logged GDP per capita
## Ladder score 0.999 0.999 0.784
## Standard error of ladder score -0.481 -0.539 -0.583
## upperwhisker 1.000 0.998 0.776
## lowerwhisker 0.998 1.000 0.790
## Logged GDP per capita 0.776 0.790 1.000
## Social support 0.836 0.840 0.742
## Healthy life expectancy 0.737 0.756 0.838
## Freedom to make life choices 0.663 0.660 0.449
## Generosity 0.045 0.035 -0.162
## Perceptions of corruption -0.468 -0.473 -0.435
## Social support Healthy life expectancy
## Ladder score 0.838 0.747
## Standard error of ladder score -0.479 -0.616
## upperwhisker 0.836 0.737
## lowerwhisker 0.840 0.756
## Logged GDP per capita 0.742 0.838
## Social support 1.000 0.726
## Healthy life expectancy 0.726 1.000
## Freedom to make life choices 0.546 0.415
## Generosity 0.041 -0.135
## Perceptions of corruption -0.275 -0.404
## Freedom to make life choices Generosity
## Ladder score 0.662 0.040
## Standard error of ladder score -0.292 0.104
## upperwhisker 0.663 0.045
## lowerwhisker 0.660 0.035
## Logged GDP per capita 0.449 -0.162
## Social support 0.546 0.041
## Healthy life expectancy 0.415 -0.135
## Freedom to make life choices 1.000 0.165
## Generosity 0.165 1.000
## Perceptions of corruption -0.381 -0.118
## Perceptions of corruption
## Ladder score -0.471
## Standard error of ladder score 0.302
## upperwhisker -0.468
## lowerwhisker -0.473
## Logged GDP per capita -0.435
## Social support -0.275
## Healthy life expectancy -0.404
## Freedom to make life choices -0.381
## Generosity -0.118
## Perceptions of corruption 1.000
corrplot(cor_matrix, method = "color", type = "upper", tl.cex = 0.8)
kmo_result <- KMO(cor_matrix)
bart_result <- cortest.bartlett(cor_matrix, n = nrow(data_scaled))
cat("=== KMO ===\n")
## === KMO ===
print(kmo_result)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = cor_matrix)
## Overall MSA = 0.84
## MSA for each item =
## Ladder score Standard error of ladder score
## 0.76 0.78
## upperwhisker lowerwhisker
## 0.79 0.80
## Logged GDP per capita Social support
## 0.92 0.95
## Healthy life expectancy Freedom to make life choices
## 0.93 0.98
## Generosity Perceptions of corruption
## 0.41 0.86
cat("\nNilai KMO keseluruhan (MSA):", round(kmo_result$MSA, 3), "\n\n")
##
## Nilai KMO keseluruhan (MSA): 0.843
cat("=== Bartlett Test ===\n")
## === Bartlett Test ===
print(bart_result)
## $chisq
## [1] 4596.539
##
## $p.value
## [1] 0
##
## $df
## [1] 45
cat("\np-value Bartlett:", bart_result$p.value, "\n")
##
## p-value Bartlett: 0
Nilai KMO menunjukkan data layak untuk dilakukan analisis faktor, dan uji Bartlett menunjukkan bahwa korelasi antar variabel signifikan sehingga PCA dan FA dapat dilanjutkan.
pca_result <- prcomp(data_scaled, center = TRUE, scale. = TRUE)
print(summary(pca_result))
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 2.4871 1.1168 0.88798 0.81033 0.69896 0.54086 0.42978
## Proportion of Variance 0.6186 0.1247 0.07885 0.06566 0.04885 0.02925 0.01847
## Cumulative Proportion 0.6186 0.7433 0.82214 0.88781 0.93666 0.96591 0.98438
## PC8 PC9 PC10
## Standard deviation 0.39518 0.000727 0.0002519
## Proportion of Variance 0.01562 0.000000 0.0000000
## Cumulative Proportion 1.00000 1.000000 1.0000000
cat("=== Loading PCA ===\n")
## === Loading PCA ===
print(round(pca_result$rotation, 3))
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Ladder score -0.388 -0.081 0.136 0.097 -0.079 0.321 -0.195
## Standard error of ladder score 0.255 -0.292 0.305 0.622 -0.543 -0.259 -0.056
## upperwhisker -0.385 -0.094 0.151 0.124 -0.102 0.316 -0.201
## lowerwhisker -0.390 -0.068 0.121 0.071 -0.056 0.324 -0.189
## Logged GDP per capita -0.351 0.228 -0.064 -0.040 -0.221 -0.399 -0.170
## Social support -0.350 -0.002 0.327 -0.120 -0.176 -0.055 0.849
## Healthy life expectancy -0.342 0.243 -0.079 -0.174 -0.206 -0.533 -0.234
## Freedom to make life choices -0.273 -0.349 0.135 0.340 0.714 -0.400 0.005
## Generosity 0.000 -0.783 -0.022 -0.568 -0.177 -0.123 -0.116
## Perceptions of corruption 0.215 0.221 0.846 -0.314 0.148 -0.059 -0.252
## PC8 PC9 PC10
## Ladder score 0.037 0.022 0.816
## Standard error of ladder score 0.043 0.056 0.000
## upperwhisker 0.039 -0.703 -0.398
## lowerwhisker 0.034 0.709 -0.419
## Logged GDP per capita -0.763 0.000 0.000
## Social support 0.030 0.000 0.000
## Healthy life expectancy 0.637 0.000 0.000
## Freedom to make life choices -0.006 0.000 0.000
## Generosity -0.062 0.000 0.000
## Perceptions of corruption -0.044 0.000 0.000
eigenvalues <- pca_result$sdev^2
pca_table <- data.frame(
Komponen = paste0("PC", seq_along(eigenvalues)),
Eigenvalue = round(eigenvalues, 4),
Proporsi_Varians = round(eigenvalues/sum(eigenvalues), 4),
Kumulatif_Varians = round(cumsum(eigenvalues/sum(eigenvalues)), 4)
)
print(pca_table)
## Komponen Eigenvalue Proporsi_Varians Kumulatif_Varians
## 1 PC1 6.1856 0.6186 0.6186
## 2 PC2 1.2473 0.1247 0.7433
## 3 PC3 0.7885 0.0789 0.8221
## 4 PC4 0.6566 0.0657 0.8878
## 5 PC5 0.4885 0.0489 0.9367
## 6 PC6 0.2925 0.0293 0.9659
## 7 PC7 0.1847 0.0185 0.9844
## 8 PC8 0.1562 0.0156 1.0000
## 9 PC9 0.0000 0.0000 1.0000
## 10 PC10 0.0000 0.0000 1.0000
# Scree plot
fviz_eig(pca_result, addlabels = TRUE)
# Plot kumulatif varians
cum_var <- cumsum(pca_result$sdev^2 / sum(pca_result$sdev^2))
plot(cum_var, type = "b", xlab = "Komponen Utama", ylab = "Varians Kumulatif",
main = "Varians Kumulatif PCA")
abline(h = 0.70, lty = 2)
abline(h = 0.80, lty = 2)
fviz_pca_biplot(pca_result, repel = TRUE)
fviz_contrib(pca_result, choice = "var", axes = 1, top = 10)
fviz_contrib(pca_result, choice = "var", axes = 2, top = 10)
Berdasarkan kriteria Kaiser (Eigenvalue > 1), komponen utama yang dipertahankan adalah PC1 dan PC2. Kedua komponen menjelaskan 74.33% variasi total data (kumulatif > 70%), sehingga reduksi dimensi menjadi 2 komponen dianggap representatif.
fa.parallel(data_scaled, fa = "fa", n.iter = 20)
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : An
## ultra-Heywood case was detected. Examine the results carefully
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : An
## ultra-Heywood case was detected. Examine the results carefully
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : An
## ultra-Heywood case was detected. Examine the results carefully
## Parallel analysis suggests that the number of factors = 2 and the number of components = NA
fa_result <- fa(data_scaled, nfactors = 2, rotate = "varimax", fm = "ml")
print(fa_result)
## Factor Analysis using method = ml
## Call: fa(r = data_scaled, nfactors = 2, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 h2 u2 com
## Ladder score 1.00 0.02 0.998 0.0017 1.0
## Standard error of ladder score -0.52 0.85 0.995 0.0050 1.7
## upperwhisker 1.00 0.05 0.998 0.0017 1.0
## lowerwhisker 1.00 -0.02 0.998 0.0016 1.0
## Logged GDP per capita 0.79 -0.20 0.662 0.3379 1.1
## Social support 0.84 -0.05 0.708 0.2925 1.0
## Healthy life expectancy 0.75 -0.26 0.635 0.3649 1.2
## Freedom to make life choices 0.66 0.07 0.442 0.5584 1.0
## Generosity 0.04 0.15 0.023 0.9770 1.1
## Perceptions of corruption -0.47 0.06 0.227 0.7733 1.0
##
## ML1 ML2
## SS loadings 5.82 0.86
## Proportion Var 0.58 0.09
## Cumulative Var 0.58 0.67
## Proportion Explained 0.87 0.13
## Cumulative Proportion 0.87 1.00
##
## Mean item complexity = 1.1
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 45 with the objective function = 35.13 with Chi Square = 4596.54
## df of the model are 26 and the objective function was 16.99
##
## The root mean square of the residuals (RMSR) is 0.06
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic n.obs is 136 with the empirical chi square 20.61 with prob < 0.76
## The total n.obs was 136 with Likelihood Chi Square = 2200.13 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.165
## RMSEA index = 0.784 and the 90 % confidence intervals are 0.759 0.815
## BIC = 2072.4
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
## ML1 ML2
## Correlation of (regression) scores with factors 1 1.00
## Multiple R square of scores with factors 1 0.99
## Minimum correlation of possible factor scores 1 0.99
print(fa_result$loadings, cutoff = 0.40)
##
## Loadings:
## ML1 ML2
## Ladder score 0.999
## Standard error of ladder score -0.524 0.849
## upperwhisker 0.998
## lowerwhisker 0.999
## Logged GDP per capita 0.788
## Social support 0.840
## Healthy life expectancy 0.753
## Freedom to make life choices 0.661
## Generosity
## Perceptions of corruption -0.472
##
## ML1 ML2
## SS loadings 5.822 0.864
## Proportion Var 0.582 0.086
## Cumulative Var 0.582 0.669
L <- as.matrix(unclass(fa_result$loadings))
L <- round(L, 3)
corrplot(L, is.corr = FALSE, method = "color",
tl.cex = 0.8, number.cex = 0.8,
main = "Heatmap Factor Loadings")
fa.diagram(fa_result)
Berdasarkan hasil Factor Analysis dengan metode Maximum Likelihood dan rotasi Varimax, diperoleh dua faktor utama. Faktor pertama (ML1) memiliki loading tinggi pada variabel Ladder score, GDP per capita, Social support, Healthy life expectancy, serta Freedom to make life choices. Oleh karena itu, faktor ini diinterpretasikan sebagai Faktor Kesejahteraan dan Kualitas Hidup. Faktor kedua (ML2) didominasi oleh variabel Generosity dan Standard error of ladder score, yang berkaitan dengan aspek sosial dan persepsi publik. Oleh karena itu, faktor kedua diinterpretasikan sebagai Faktor Sosial dan Persepsi Publik. Hasil ini konsisten dengan hasil PCA sebelumnya yang juga menunjukkan dua komponen utama.
cat("=== Communality ===\n")
## === Communality ===
print(round(fa_result$communality, 3))
## Ladder score Standard error of ladder score
## 0.998 0.995
## upperwhisker lowerwhisker
## 0.998 0.998
## Logged GDP per capita Social support
## 0.662 0.708
## Healthy life expectancy Freedom to make life choices
## 0.635 0.442
## Generosity Perceptions of corruption
## 0.023 0.227
cat("\n=== Uniqueness ===\n")
##
## === Uniqueness ===
print(round(fa_result$uniquenesses, 3))
## Ladder score Standard error of ladder score
## 0.002 0.005
## upperwhisker lowerwhisker
## 0.002 0.002
## Logged GDP per capita Social support
## 0.338 0.292
## Healthy life expectancy Freedom to make life choices
## 0.365 0.558
## Generosity Perceptions of corruption
## 0.977 0.773
comm <- fa_result$communality
barplot(sort(comm, decreasing = TRUE),
las = 2, ylab = "Communality",
main = "Communality Tiap Variabel")
abline(h = 0.5, lty = 2)
Hasil Factor Analysis menunjukkan terbentuk dua faktor utama:
Faktor 1 – Kesejahteraan dan Kualitas Hidup
Faktor 2 – Sosial dan Persepsi Publik
Generosity, Standard error of ladder score, Perceptions of corruption,
Hasil ini konsisten dengan hasil PCA sebelumnya.
Berdasarkan analisis PCA dan FA, 10 variabel kuantitatif dalam dataset World Happiness Report dapat direduksi menjadi dua dimensi utama:
Kesejahteraan dan Kualitas Hidup
Sosial dan Persepsi Publik
Kedua dimensi tersebut mampu menjelaskan sebagian besar variasi data dan menunjukkan struktur yang konsisten antara PCA dan Factor Analysis.