Introduction

Trust is important for how society works. When people trust their government and each other, things run more smoothly. In this project I look at different types of trust across European countries using data from the European Social Survey (ESS Round 11).

I have 10 trust variables and want to see whether they can be reduced to fewer dimensions. Principal Component Analysis (PCA) does this: it combines correlated variables into a smaller number of uncorrelated components that keep most of the variance.

Questions I want to answer:

Are there hidden dimensions behind these 10 trust questions?
Do political trust and social trust form separate groups?
Which countries have high vs low trust?

Data source: European Social Survey Round 11 (https://ess.sikt.no/en/)

Libraries and Data

# load packages
if (!require("FactoMineR")) install.packages("FactoMineR")
if (!require("factoextra")) install.packages("factoextra")
if (!require("corrplot")) install.packages("corrplot")
if (!require("psych")) install.packages("psych")

library(FactoMineR)
library(factoextra)
library(corrplot)
library(psych)

The Data

ESS asks people to rate trust on a 0-10 scale (0 = no trust, 10 = complete trust). I use country averages from 26 European countries.
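For readers who start from the respondent-level ESS file, country averages like the ones below could be built roughly as follows. This is only a sketch under assumptions: the toy rows are invented, the country column is assumed to be called cntry, codes above 10 (77, 88, 99) are assumed to mark refusals and missing answers, and no survey weights are applied.

```r
# Hypothetical respondent-level rows (not real ESS records)
resp <- data.frame(
  cntry   = c("FI", "FI", "FI", "BG", "BG", "BG"),
  trstplt = c(6, 5, 88, 2, 77, 3)
)

# Assumption: values above 10 are missing-answer codes, so set them to NA
resp$trstplt[resp$trstplt > 10] <- NA

# Unweighted mean per country; the formula interface drops NA rows by default
country_means <- aggregate(trstplt ~ cntry, data = resp, FUN = mean)
print(country_means)
```

A real analysis would normally also apply the survey weights before averaging.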

# ESS Round 11 - country level means for trust variables
ess <- data.frame(
  Country = c("Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus",
              "Czechia", "Estonia", "Finland", "France", "Germany",
              "Greece", "Hungary", "Iceland", "Ireland", "Italy",
              "Lithuania", "Netherlands", "Norway", "Poland", "Portugal",
              "Slovakia", "Slovenia", "Spain", "Sweden", "Switzerland",
              "UK"),
  
  trstplt = c(4.2, 4.5, 2.1, 2.8, 3.5, 3.8, 4.1, 5.2, 3.9, 4.3,
              2.4, 3.6, 5.1, 4.0, 3.2, 3.4, 5.3, 5.5, 2.9, 3.1,
              2.6, 3.3, 3.0, 5.1, 5.8, 3.7),
  
  trstplc = c(6.5, 6.2, 4.8, 5.1, 5.8, 5.4, 6.8, 8.0, 6.1, 7.0,
              5.5, 5.2, 7.2, 6.5, 5.9, 5.3, 6.4, 7.5, 5.6, 5.4,
              4.6, 5.5, 5.8, 6.9, 7.3, 6.3),
  
  trstprl = c(4.5, 4.8, 2.3, 2.9, 3.8, 4.0, 4.5, 5.8, 4.2, 4.8,
              2.8, 3.9, 5.4, 4.5, 3.5, 3.6, 5.5, 6.0, 3.2, 3.4,
              2.9, 3.5, 3.3, 5.6, 6.2, 4.0),
  
  trstprt = c(3.8, 4.1, 1.9, 2.4, 3.2, 3.4, 3.7, 4.6, 3.5, 3.9,
              2.2, 3.2, 4.5, 3.6, 2.8, 3.0, 4.8, 5.0, 2.5, 2.7,
              2.3, 2.9, 2.6, 4.5, 5.2, 3.3),
  
  trstlgl = c(5.8, 5.2, 3.1, 3.5, 4.8, 4.5, 5.5, 6.8, 5.0, 5.8,
              4.0, 4.2, 6.0, 5.2, 4.5, 4.1, 5.8, 6.5, 4.0, 4.2,
              3.4, 4.0, 4.0, 6.2, 6.5, 5.0),
  
  trstep = c(4.5, 5.0, 4.2, 4.0, 4.5, 4.2, 4.8, 5.0, 4.3, 4.8,
             4.0, 4.5, 4.2, 5.2, 4.8, 4.5, 4.8, 4.5, 4.0, 4.5,
             4.0, 4.2, 4.5, 4.2, 4.5, 3.8),
  
  trstun = c(5.2, 5.5, 4.8, 4.5, 5.0, 4.8, 5.2, 5.8, 5.0, 5.2,
             4.5, 4.8, 5.5, 5.5, 5.2, 4.8, 5.5, 5.8, 4.5, 5.0,
             4.5, 4.8, 5.0, 5.5, 5.8, 4.8),
  
  ppltrst = c(5.2, 5.0, 3.5, 4.0, 4.2, 4.5, 5.8, 6.8, 4.8, 5.0,
              3.8, 4.2, 6.5, 5.5, 4.5, 4.8, 5.8, 6.8, 4.2, 4.0,
              3.8, 4.2, 4.8, 6.2, 6.0, 5.2),
  
  pphlp = c(5.0, 4.8, 3.8, 4.2, 4.5, 4.2, 5.5, 6.2, 4.5, 5.2,
            4.0, 4.0, 6.0, 5.2, 4.2, 4.5, 5.5, 6.5, 4.0, 4.2,
            3.8, 4.0, 4.5, 5.8, 5.8, 5.0),
  
  pplfair = c(5.5, 5.2, 3.5, 4.0, 4.5, 4.8, 5.8, 6.5, 5.0, 5.5,
              3.8, 4.2, 6.2, 5.5, 4.5, 4.8, 6.0, 6.8, 4.2, 4.2,
              3.8, 4.2, 4.8, 6.2, 6.2, 5.2)
)

rownames(ess) <- ess$Country

What do these variables mean?

Code	Question
trstplt	Trust in politicians
trstplc	Trust in the police
trstprl	Trust in parliament
trstprt	Trust in political parties
trstlgl	Trust in legal system
trstep	Trust in European Parliament
trstun	Trust in United Nations
ppltrst	Most people can be trusted
pphlp	Most people try to be helpful
pplfair	Most people try to be fair

# quick look
cat("We have", nrow(ess), "countries and", ncol(ess)-1, "trust variables\n\n")

## We have 26 countries and 10 trust variables

summary(ess[, 2:11])

##     trstplt         trstplc         trstprl         trstprt     
##  Min.   :2.100   Min.   :4.600   Min.   :2.300   Min.   :1.900  
##  1st Qu.:3.125   1st Qu.:5.425   1st Qu.:3.425   1st Qu.:2.725  
##  Median :3.750   Median :6.000   Median :4.000   Median :3.350  
##  Mean   :3.862   Mean   :6.100   Mean   :4.188   Mean   :3.446  
##  3rd Qu.:4.450   3rd Qu.:6.725   3rd Qu.:4.800   3rd Qu.:4.050  
##  Max.   :5.800   Max.   :8.000   Max.   :6.200   Max.   :5.200  
##     trstlgl          trstep          trstun         ppltrst     
##  Min.   :3.100   Min.   :3.800   Min.   :4.500   Min.   :3.500  
##  1st Qu.:4.025   1st Qu.:4.200   1st Qu.:4.800   1st Qu.:4.200  
##  Median :4.900   Median :4.500   Median :5.000   Median :4.800  
##  Mean   :4.908   Mean   :4.442   Mean   :5.096   Mean   :4.965  
##  3rd Qu.:5.800   3rd Qu.:4.725   3rd Qu.:5.500   3rd Qu.:5.725  
##  Max.   :6.800   Max.   :5.200   Max.   :5.800   Max.   :6.800  
##      pphlp          pplfair     
##  Min.   :3.800   Min.   :3.500  
##  1st Qu.:4.200   1st Qu.:4.200  
##  Median :4.500   Median :4.900  
##  Mean   :4.804   Mean   :5.035  
##  3rd Qu.:5.425   3rd Qu.:5.725  
##  Max.   :6.500   Max.   :6.800

Looking at the summary:

Politicians (trstplt) and parties (trstprt) have the lowest scores - means of 3.86 and 3.45
Police (trstplc) has the highest trust - mean 6.10
Social trust variables (ppltrst, pphlp, pplfair) are in the middle - means between 4.80 and 5.04

This already tells us that, on average, people trust the police more than politicians.

Exploring the Data

How are variables distributed?

# boxplots
par(mar = c(8, 4, 3, 1))
boxplot(ess[, 2:11], las = 2, col = "lightblue",
        main = "Distribution of Trust Variables Across Countries",
        ylab = "Trust Score (0-10)")
abline(h = 5, col = "red", lty = 2)

The red line is at 5 (middle of the scale). We can see:

Trust in parties (trstprt) is mostly below 5 - people don't trust parties much
Trust in police (trstplc) is mostly above 5 - police are relatively trusted
The extremes are mostly Nordic countries or Switzerland at the top and Bulgaria, Slovakia or Greece at the bottom

Correlations Between Variables

This is important for PCA. If variables are correlated, PCA can combine them.

cor_mat <- cor(ess[, 2:11])
corrplot(cor_mat, method = "color", type = "upper",
         addCoef.col = "black", number.cex = 0.7,
         tl.col = "black", tl.srt = 45)

What I see here:

Very high correlations (r > 0.90):

trstplt and trstprl (0.97) - trust in politicians and trust in parliament move almost together
trstplt and trstprt (0.96) - same with parties
ppltrst and pplfair (0.96) - countries where people are seen as trustworthy also see them as fair

This tells us: Political trust variables are basically measuring the same thing. Same for social trust variables. PCA should be able to reduce these to fewer dimensions.

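Correlations between the groups:

Political trust and social trust are related but not the same; with PC1 loadings of 0.96 or more on both groups (see the loadings table), their country-level correlations are likely around 0.9 rather than moderate
Countries with high political trust also tend to have high social trust
trstep is the exception: its PC1 loading of 0.53 points to much weaker correlations with the other items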
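To make the link between correlations and PCA concrete, here is a minimal sketch on simulated data (the toy data frame and its columns a, b, c are invented, not ESS values). It shows that PCA on standardized variables is an eigen-decomposition of the correlation matrix, so strongly correlated variables pile their variance into the first component.

```r
# Toy data: three noisy copies of one latent factor (hypothetical values)
set.seed(42)
latent <- rnorm(100)
toy <- data.frame(
  a = latent + rnorm(100, sd = 0.3),
  b = latent + rnorm(100, sd = 0.3),
  c = latent + rnorm(100, sd = 0.3)
)

# Eigenvalues of the correlation matrix = variances of the components
eig_vals <- eigen(cor(toy))$values

# prcomp() with scale. = TRUE should give the same variances (sdev^2)
pc_vars <- prcomp(toy, scale. = TRUE)$sdev^2

print(round(eig_vals, 3))
print(round(eig_vals / sum(eig_vals) * 100, 1))  # percent of variance per component
```

The eigenvalues sum to the number of variables, which is why "percent of variance" is just eigenvalue divided by the number of variables.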

Can We Do PCA? (Testing)

KMO Test

KMO tells us if the data is suitable for PCA. We want KMO > 0.7.

kmo <- KMO(ess[, 2:11])
cat("Overall KMO:", round(kmo$MSA, 3), "\n")

## Overall KMO: 0.876

KMO is above 0.7 so our data is ok for PCA.
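As a rough illustration of what the overall KMO measures, the sketch below computes it by hand: squared correlations are compared with squared partial correlations taken from the inverse correlation matrix. The function name kmo_overall and the toy data are my own, and the code is meant to mirror the usual textbook formula rather than the exact internals of psych::KMO.

```r
# Overall KMO = sum(r^2) / (sum(r^2) + sum(partial^2)), off-diagonal only
kmo_overall <- function(x) {
  r <- cor(x)
  r_inv <- solve(r)
  # partial correlations from the inverse correlation matrix
  d <- diag(1 / sqrt(diag(r_inv)))
  partial <- -d %*% r_inv %*% d
  off_diag <- row(r) != col(r)
  sum_r2 <- sum(r[off_diag]^2)
  sum_p2 <- sum(partial[off_diag]^2)
  sum_r2 / (sum_r2 + sum_p2)
}

# Toy data: four noisy copies of one latent factor (hypothetical values)
set.seed(1)
latent <- rnorm(60)
toy <- data.frame(
  v1 = latent + rnorm(60, sd = 0.5),
  v2 = latent + rnorm(60, sd = 0.5),
  v3 = latent + rnorm(60, sd = 0.5),
  v4 = latent + rnorm(60, sd = 0.5)
)

kmo_val <- kmo_overall(toy)
cat("Overall KMO (toy data):", round(kmo_val, 3), "\n")
```

Values near 1 mean the correlations are shared among many variables rather than confined to pairs, which is the situation where PCA compresses well.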

Bartlett's Test

This tests if there are actual correlations in the data (if not, PCA is pointless).

bart <- cortest.bartlett(cor_mat, n = nrow(ess))
cat("Chi-square:", round(bart$chisq, 2), "\n")

## Chi-square: 625.92

cat("P-value:", bart$p.value, "\n")

## P-value: 2.450809e-103

P-value is very small (< 0.05) so correlations exist. Good - PCA makes sense.
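The statistic behind this test can be sketched in a few lines. It uses the standard formula chi-square = -(n - 1 - (2p + 5)/6) * log(det(R)) with p(p - 1)/2 degrees of freedom; the function name bartlett_sphericity and the 3 x 3 example matrix are made up for illustration.

```r
# Bartlett's test of sphericity from a correlation matrix r and sample size n
bartlett_sphericity <- function(r, n) {
  p <- ncol(r)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(r))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Hypothetical correlation matrix: three variables, all pairwise r = 0.8
r_toy <- matrix(0.8, nrow = 3, ncol = 3)
diag(r_toy) <- 1

res <- bartlett_sphericity(r_toy, n = 26)
cat("Chi-square:", round(res$chisq, 2), " df:", res$df,
    " p-value:", signif(res$p.value, 3), "\n")
```

The more the correlation matrix departs from the identity, the smaller its determinant and the larger the chi-square.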

Running PCA

# run PCA on the numeric columns only
pca <- PCA(ess[, 2:11], scale.unit = TRUE, graph = FALSE)

How Many Components to Keep?

eig <- get_eigenvalue(pca)
print(round(eig, 2))

##        eigenvalue variance.percent cumulative.variance.percent
## Dim.1        8.69            86.90                       86.90
## Dim.2        0.83             8.28                       95.18
## Dim.3        0.25             2.46                       97.64
## Dim.4        0.11             1.08                       98.72
## Dim.5        0.08             0.80                       99.52
## Dim.6        0.02             0.22                       99.74
## Dim.7        0.02             0.15                       99.89
## Dim.8        0.01             0.07                       99.96
## Dim.9        0.00             0.03                       99.99
## Dim.10       0.00             0.01                      100.00

Looking at this table:

PC1 explains 86.9% of all variance - this is huge!
PC2 adds 8.3% - together that's 95.2%
After PC2, each component explains less than 3%

Rule of thumb: keep components with eigenvalue > 1. Strictly, only PC1 (8.69) passes; PC2 (0.83) falls just short.
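The percentages and the eigenvalue rule can be checked directly from the printed table. The sketch below re-types the rounded eigenvalues from the output above and uses the fact that 10 standardized variables have a total variance of 10.

```r
# Eigenvalues copied (rounded) from the table printed above
eig <- c(8.69, 0.83, 0.25, 0.11, 0.08, 0.02, 0.02, 0.01, 0.00, 0.00)
names(eig) <- paste0("Dim.", seq_along(eig))

p <- length(eig)           # 10 standardized variables -> total variance of 10
pct <- eig / p * 100       # percent of variance per component
cum_pct <- cumsum(pct)     # cumulative percent

n_kaiser <- sum(eig > 1)   # components passing the eigenvalue > 1 rule

print(round(pct[1:3], 1))
print(round(cum_pct[1:3], 1))
cat("Components with eigenvalue > 1:", n_kaiser, "\n")
```

With these values only Dim.1 passes the eigenvalue rule, and the first two components together cover about 95.2% of the variance.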

fviz_eig(pca, addlabels = TRUE) +
  geom_hline(yintercept = 10, linetype = "dashed", color = "red") +
  ggtitle("Scree Plot - How Much Each Component Explains")

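Clear elbow after PC1. PC2 (8.3%) sits just below the 10% line, which for 10 standardized variables is the same as eigenvalue = 1. I still keep it: it is clearly larger than the remaining components and gives a second axis for the plots below.

Decision: Keep 2 components (explaining 95.2% of variance)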

Understanding the Components

Variable Loadings

Loadings tell us how each variable relates to each component.

# get loadings
loads <- pca$var$coord[, 1:2]
colnames(loads) <- c("PC1", "PC2")
print(round(loads, 3))

##           PC1    PC2
## trstplt 0.970 -0.087
## trstplc 0.951 -0.048
## trstprl 0.981 -0.069
## trstprt 0.969 -0.072
## trstlgl 0.978 -0.060
## trstep  0.528  0.844
## trstun  0.939  0.227
## ppltrst 0.962 -0.124
## pphlp   0.965 -0.129
## pplfair 0.984 -0.090
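A loading of this kind is the correlation between a variable and a component, and it equals the eigenvector entry multiplied by the square root of the eigenvalue. The sketch below checks this with base R's prcomp() on simulated data (the toy columns x1, x2, x3 are invented); I use prcomp() here instead of FactoMineR only to keep the example dependency-free.

```r
# Toy data: two correlated variables and one unrelated variable (hypothetical)
set.seed(1)
latent <- rnorm(40)
toy <- data.frame(
  x1 = latent + rnorm(40, sd = 0.3),
  x2 = latent + rnorm(40, sd = 0.3),
  x3 = rnorm(40)
)

pc <- prcomp(toy, scale. = TRUE)

# Loadings = eigenvectors scaled by the component standard deviations
loadings <- sweep(pc$rotation, 2, pc$sdev, `*`)

# They should match the correlations between variables and component scores
cor_with_scores <- cor(toy, pc$x)

print(round(loadings[, 1:2], 3))
print(all.equal(unname(loadings), unname(cor_with_scores)))
```

Because they are correlations, squared loadings across all components sum to 1 for each variable, and squared loadings within one component sum to its eigenvalue.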

PC1 (87% of variance) - “General Trust”:

Nine of the ten variables have very high positive loadings (0.94 to 0.98); trstep is lower at 0.53. This means:

PC1 is basically “overall trust level”
Countries scoring high on PC1 trust everything more - politicians, police, each other
It's a general trust vs distrust dimension

PC2 (8% of variance) - “Trust in the European Parliament”:

Positive: trstep (0.84) dominates, trstun (0.23) is far smaller
Negative: all other variables, but only weakly (-0.05 to -0.13)
This mostly separates countries whose trust in the European Parliament is higher or lower than their general trust level would suggest, rather than a clean international vs national split

fviz_pca_var(pca, col.var = "contrib",
             gradient.cols = c("blue", "orange", "red"),
             repel = TRUE) +
  ggtitle("Variable Plot - Which Variables Contribute Most")

Reading this plot:

All arrows point right = all variables positively related to PC1 (general trust)
Variables close together are highly correlated (political trust cluster, social trust cluster)
trstep points strongly upward = it defines PC2; trstun tilts only slightly upward

Contributions to Each Component

# fviz_contrib() returns ggplot objects, so par(mfrow) has no effect; each plot prints on its own
# PC1 contributions
fviz_contrib(pca, choice = "var", axes = 1) +
  ggtitle("What Contributes to PC1 (General Trust)")

# PC2 contributions
fviz_contrib(pca, choice = "var", axes = 2) +
  ggtitle("What Contributes to PC2")

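For PC1: Nine variables contribute roughly equally (about 10-11% each); trstep contributes much less (about 3%). It's a general trust factor.

For PC2: trstep (EU Parliament) dominates with roughly 86% of the contribution; trstun (UN) adds about 6%. This component is mainly about trust in the European Parliament.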
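The contribution of a variable to a component is its squared loading divided by the sum of squared loadings on that component (which is the eigenvalue), times 100. Here is a quick check using the PC2 loadings re-typed from the output printed earlier; because the inputs are rounded, the results are approximate.

```r
# PC2 loadings copied (rounded) from the loadings table above
pc2_loadings <- c(trstplt = -0.087, trstplc = -0.048, trstprl = -0.069,
                  trstprt = -0.072, trstlgl = -0.060, trstep  =  0.844,
                  trstun  =  0.227, ppltrst = -0.124, pphlp   = -0.129,
                  pplfair = -0.090)

# Sum of squared loadings approximates the PC2 eigenvalue (about 0.83)
eig_pc2 <- sum(pc2_loadings^2)

# Contribution in percent = squared loading / eigenvalue * 100
contrib_pc2 <- pc2_loadings^2 / eig_pc2 * 100

cat("Approximate PC2 eigenvalue:", round(eig_pc2, 2), "\n")
print(round(sort(contrib_pc2, decreasing = TRUE), 1))
```

The same calculation on the PC1 column gives nearly equal shares for nine variables and a much smaller one for trstep.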

Where Do Countries Fall?

Now let's see how countries score on these dimensions.

# get country scores
scores <- pca$ind$coord[, 1:2]
scores_df <- data.frame(
  Country = rownames(scores),
  PC1 = scores[, 1],
  PC2 = scores[, 2]
)

# sort by PC1 (general trust)
scores_df <- scores_df[order(-scores_df$PC1), ]
scores_df$Rank <- 1:nrow(scores_df)

cat("Countries Ranked by General Trust (PC1):\n\n")

## Countries Ranked by General Trust (PC1):

print(scores_df[, c("Rank", "Country", "PC1", "PC2")], row.names = FALSE)

##  Rank     Country        PC1         PC2
##     1      Norway  5.2902167 -0.79564385
##     2     Finland  5.2859490  0.63264160
##     3 Switzerland  4.6612609 -0.53703467
##     4     Iceland  3.5930543 -1.39201537
##     5      Sweden  3.4209166 -1.32123767
##     6 Netherlands  3.2147554  0.40710330
##     7     Estonia  1.8470759  0.52454475
##     8     Germany  1.7406472  0.63702242
##     9     Ireland  1.6659917  1.94509838
##    10     Belgium  1.4438795  1.50154464
##    11     Austria  1.2224236 -0.07335398
##    12      France -0.2781832 -0.36534232
##    13          UK -0.3946429 -1.93200065
##    14      Cyprus -1.1132003  0.42642103
##    15       Italy -1.2249696  1.44613650
##    16     Czechia -1.3449457 -0.52263290
##    17   Lithuania -1.5349077  0.30245526
##    18       Spain -1.5959852  0.49555643
##    19     Hungary -1.9428070  0.48584965
##    20    Portugal -2.2026445  0.71673366
##    21    Slovenia -2.3818852 -0.22382591
##    22      Poland -3.0527432 -0.84337930
##    23     Croatia -3.6258353 -0.72557359
##    24      Greece -3.7765699 -0.63251965
##    25    Slovakia -4.2580789 -0.53740369
##    26    Bulgaria -4.6587721  0.38085593

High trust countries (top of PC1):

Norway, Finland, Switzerland, Iceland, Sweden, Netherlands
The four Nordic countries plus Switzerland and the Netherlands - known for good governance

Low trust countries (bottom of PC1):

Bulgaria, Slovakia, Greece, Croatia, Poland
Eastern/Southern European countries with more institutional challenges

PC2 interpretation:

High PC2: trust in the European Parliament is high relative to the country's general trust level (Ireland, Belgium, Italy)
Low PC2: trust in the European Parliament is low relative to general trust (UK, Iceland, Sweden)

fviz_pca_ind(pca, col.ind = "cos2",
             gradient.cols = c("blue", "yellow", "red"),
             repel = TRUE) +
  geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
  geom_vline(xintercept = 0, linetype = "dashed", alpha = 0.5) +
  ggtitle("Country Positions on Trust Dimensions") +
  xlab("PC1: General Trust Level (Low ← → High)") +
  ylab("PC2: International vs National Trust")

The color shows how well each country is represented by the two components (cos2: blue = low, red = high).

What we see:

Nordic countries (Finland, Norway, Sweden, Iceland) cluster on the right = high general trust
Bulgaria, Slovakia and Greece are on the left = low trust
Most Western European countries are in the middle
UK is lowest on PC2 (it has the lowest trust in the European Parliament, 3.8, which may reflect Brexit)

Biplot - Countries and Variables Together

fviz_pca_biplot(pca, repel = TRUE,
                col.var = "red", col.ind = "steelblue") +
  ggtitle("Biplot: Countries and Trust Variables") +
  xlab("PC1: General Trust (86.9%)") +
  ylab("PC2: International vs National (8.3%)")

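This shows everything together:

Finland lies far along the social trust arrows (ppltrst, pplfair, pphlp) - Finns trust each other a lot
Switzerland lies far along the police/legal trust arrows
Bulgaria is opposite to all arrows - low trust across the board
Because nine of the ten arrows nearly overlap along PC1, these readings mostly reflect overall trust level rather than a specific type of trust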

Creating Trust Groups

Let's group countries with similar trust profiles.

# simple clustering based on PCA scores
set.seed(123)
km <- kmeans(scores, centers = 4, nstart = 25)

# add cluster to scores
scores_df$Cluster <- km$cluster[match(scores_df$Country, rownames(scores))]

# plot
plot(scores[,1], scores[,2], 
     col = km$cluster, pch = 19, cex = 1.5,
     xlab = "PC1: General Trust", ylab = "PC2: Intl vs National",
     main = "Country Clusters Based on Trust")
text(scores[,1], scores[,2], labels = rownames(scores), 
     pos = 3, cex = 0.7)
abline(h = 0, v = 0, lty = 2, col = "gray")
legend("bottomright", legend = paste("Cluster", 1:4), 
       col = 1:4, pch = 19)

cat("Trust Clusters:\n")

## Trust Clusters:

cat("==============\n\n")

## ==============

for (i in 1:4) {
  countries <- scores_df$Country[scores_df$Cluster == i]
  avg_pc1 <- mean(scores_df$PC1[scores_df$Cluster == i])
  cat("Cluster", i, "(Avg Trust:", round(avg_pc1, 2), "):\n")
  cat("  ", paste(countries, collapse = ", "), "\n\n")
}

## Cluster 1 (Avg Trust: -1.4 ):
##    France, UK, Cyprus, Italy, Czechia, Lithuania, Spain, Hungary, Portugal, Slovenia 
## 
## Cluster 2 (Avg Trust: 1.58 ):
##    Estonia, Germany, Ireland, Belgium, Austria 
## 
## Cluster 3 (Avg Trust: 4.24 ):
##    Norway, Finland, Switzerland, Iceland, Sweden, Netherlands 
## 
## Cluster 4 (Avg Trust: -3.87 ):
##    Poland, Croatia, Greece, Slovakia, Bulgaria
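The choice of 4 clusters is not justified above, so here is one possible way to check it: an elbow look at the total within-cluster sum of squares for k = 1 to 6. The scores are re-typed from the ranking table (rounded to two decimals), so the numbers are approximate, and the elbow method is my addition rather than part of the original analysis.

```r
# PC1/PC2 scores copied (rounded) from the ranking table, Norway ... Bulgaria
scores <- cbind(
  PC1 = c(5.29, 5.29, 4.66, 3.59, 3.42, 3.21, 1.85, 1.74, 1.67, 1.44,
          1.22, -0.28, -0.39, -1.11, -1.22, -1.34, -1.53, -1.60, -1.94,
          -2.20, -2.38, -3.05, -3.63, -3.78, -4.26, -4.66),
  PC2 = c(-0.80, 0.63, -0.54, -1.39, -1.32, 0.41, 0.52, 0.64, 1.95, 1.50,
          -0.07, -0.37, -1.93, 0.43, 1.45, -0.52, 0.30, 0.50, 0.49,
          0.72, -0.22, -0.84, -0.73, -0.63, -0.54, 0.38)
)

# Total within-cluster sum of squares for k = 1..6
set.seed(123)
wss <- sapply(1:6, function(k) {
  kmeans(scores, centers = k, nstart = 25)$tot.withinss
})
names(wss) <- paste0("k=", 1:6)

print(round(wss, 1))
# Drop in within-cluster SS gained by each extra cluster
print(round(-diff(wss), 1))
```

A value of k after which the drops become small is a reasonable candidate; since PC1 has far more spread than PC2, the clusters mostly cut the countries along the general trust axis.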

Quality Check

How well does the 2D representation capture the original data?

fviz_cos2(pca, choice = "ind", axes = 1:2) +
  ggtitle("Quality of Representation (cos2)") +
  ylab("cos2 (higher = better)")

Most countries have cos2 > 0.7 which is good. This means the 2D picture captures their trust profile well.
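For reference, cos2 for a country on the first two axes is the squared length of its 2D coordinates divided by its squared distance from the origin across all components. A minimal sketch with prcomp() on simulated data (the toy columns are invented); the ratio does not depend on how the scores are scaled, so the idea carries over to the FactoMineR output.

```r
# Toy data: three correlated variables plus one unrelated (hypothetical values)
set.seed(7)
latent <- rnorm(30)
toy <- data.frame(
  a = latent + rnorm(30, sd = 0.4),
  b = latent + rnorm(30, sd = 0.4),
  c = latent + rnorm(30, sd = 0.4),
  d = rnorm(30)
)

pc <- prcomp(toy, scale. = TRUE)

# cos2 on axes 1:2 = squared 2D coordinates / squared coordinates on all axes
sq <- pc$x^2
cos2_2d <- rowSums(sq[, 1:2]) / rowSums(sq)

print(round(head(cos2_2d), 2))
cat("Share of rows with cos2 > 0.7:", round(mean(cos2_2d > 0.7), 2), "\n")
```

Rows with low cos2 sit close to the origin of the 2D map for reasons the two axes do not capture, so their position there should be read with care.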

Summary and Conclusions

What Did We Find?

1. Two main dimensions of trust exist:

PC1 (87%): General trust level - some countries just trust more (institutions AND people)
PC2 (8%): Trust in the European Parliament relative to general trust - some countries trust it more or less than their overall level would suggest

2. Variables group together:

Political trust (politicians, parliament, parties) = basically one thing
Social trust (people trustworthy, helpful, fair) = another group
At the country level both groups load on PC1 at 0.96 or higher, so the PCA does not separate them; only trstep stands apart

3. Country patterns:

High Trust	Medium Trust	Low Trust
Finland	Germany	Bulgaria
Norway	France	Greece
Sweden	Austria	Croatia
Switzerland	Belgium	Slovakia
Iceland	Ireland	Poland
Netherlands	Spain	Hungary

4. Nordic countries stand out:

Finland, Norway, Sweden, Iceland have much higher trust than others. This fits with what we know - these countries have low corruption, good governance, strong welfare states.

What Does This Mean?

Trust in institutions and trust in people go together - countries where institutions are trusted also have citizens who trust each other
Political parties have the lowest average trust (mean 3.45), although in a few high-trust countries (e.g. Norway, Switzerland) the European Parliament scores lower
The East-West divide in Europe shows up clearly in trust data
10 trust questions can be reduced to 2 dimensions that keep about 95% of the variance

Limitations

Using country averages hides individual variation
Data from one time point only
Self-reported trust might have bias

References

European Social Survey Round 11: https://ess.sikt.no/en/
ESS Variable Documentation

sessionInfo()

## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Tahoe 26.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Warsaw
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] psych_2.5.6      corrplot_0.95    factoextra_1.0.7 ggplot2_4.0.0   
## [5] FactoMineR_2.13 
## 
## loaded via a namespace (and not attached):
##  [1] tidyr_1.3.1          sass_0.4.10          generics_0.1.4      
##  [4] rstatix_0.7.3        lattice_0.22-7       digest_0.6.37       
##  [7] magrittr_2.0.4       evaluate_1.0.5       grid_4.5.1          
## [10] estimability_1.5.1   RColorBrewer_1.1-3   mvtnorm_1.3-3       
## [13] fastmap_1.2.0        jsonlite_2.0.0       ggrepel_0.9.6       
## [16] backports_1.5.0      Formula_1.2-5        purrr_1.1.0         
## [19] scales_1.4.0         jquerylib_0.1.4      abind_1.4-8         
## [22] mnormt_2.1.2         cli_3.6.5            rlang_1.1.6         
## [25] scatterplot3d_0.3-44 leaps_3.2            withr_3.0.2         
## [28] cachem_1.1.0         yaml_2.3.10          tools_4.5.1         
## [31] multcompView_0.1-10  parallel_4.5.1       ggsignif_0.6.4      
## [34] dplyr_1.1.4          ggpubr_0.6.2         DT_0.34.0           
## [37] flashClust_1.01-2    broom_1.0.10         vctrs_0.6.5         
## [40] R6_2.6.1             lifecycle_1.0.4      emmeans_2.0.1       
## [43] car_3.1-3            htmlwidgets_1.6.4    MASS_7.3-65         
## [46] cluster_2.1.8.1      pkgconfig_2.0.3      pillar_1.11.1       
## [49] bslib_0.9.0          gtable_0.3.6         glue_1.8.0          
## [52] Rcpp_1.1.0           xfun_0.53            tibble_3.3.0        
## [55] tidyselect_1.2.1     rstudioapi_0.17.1    knitr_1.50          
## [58] farver_2.1.2         xtable_1.8-4         nlme_3.1-168        
## [61] htmltools_0.5.8.1    labeling_0.4.3       carData_3.0-6       
## [64] rmarkdown_2.30       compiler_4.5.1       S7_0.2.0

Dimension Reduction of Trust Variables - European Social Survey

Mekhroj Doliev

2026-02-04