library(haven)
library(tidyverse)
library(summarytools)
library(psych)
library(lavaan)
library(EGAnet)
# library(qgraph)
Mydata <- read_sav("/Users/niurongting/Desktop/JPPR/Crisis Emotion - Main Study-Final.sav")
head(Mydata)
## # A tibble: 6 × 201
## StartDate EndDate Status IPAddress Progress
## <dttm> <dttm> <dbl+lbl> <chr> <dbl>
## 1 2025-03-03 09:50:07 2025-03-03 09:54:14 0 [IP Address] 100.12.196.35 100
## 2 2025-03-03 09:50:10 2025-03-03 09:55:02 0 [IP Address] 72.0.134.210 100
## 3 2025-03-03 09:50:05 2025-03-03 09:55:41 0 [IP Address] 73.188.88.156 100
## 4 2025-03-03 09:50:03 2025-03-03 09:55:46 0 [IP Address] 47.39.239.232 100
## 5 2025-03-03 09:50:10 2025-03-03 09:55:53 0 [IP Address] 173.91.121.220 100
## 6 2025-03-03 09:50:07 2025-03-03 09:55:57 0 [IP Address] 73.85.98.93 100
## # ℹ 196 more variables: Duration__in_seconds_ <dbl>, Finished <dbl+lbl>,
## # RecordedDate <dttm>, ResponseId <chr>, RecipientLastName <chr>,
## # RecipientFirstName <chr>, RecipientEmail <chr>, ExternalReference <chr>,
## # LocationLatitude <chr>, LocationLongitude <chr>, DistributionChannel <chr>,
## # UserLanguage <chr>, S0Q1 <dbl+lbl>, Q79_First_Click <dbl>,
## # Q79_Last_Click <dbl>, Q79_Page_Submit <dbl>, Q79_Click_Count <dbl>,
## # Q80_First_Click <dbl>, Q80_Last_Click <dbl>, Q80_Page_Submit <dbl>, …
Assumption Checking
Let's select the 6 items on moral outrage and the 6 items on empathetic
anger (note: the items are already reverse-scored).
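No recoding is needed here, but purely as a hypothetical sketch (the item numbers and the 1-7 response scale below are assumptions, not taken from the codebook), reverse-scoring would look like this:
# Hypothetical sketch only -- the items in this data set are already reversed.
# On a 1-7 scale, a reversed score is (min + max) - x = 8 - x.
# reversed <- Mydata %>%
#   mutate(across(c(EmpatheticAnger__2, EmpatheticAnger__5),   # placeholder item numbers
#                 ~ 8 - haven::zap_labels(.x)))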
Check for: 1) item variance/range restriction, 2) coding errors (e.g.,
out-of-range or miscoded values), and 3) non-normal item distributions
(e.g., bimodal distributions).
empatheticanger_sc <- Mydata %>%
  select(EmpatheticAnger__1:EmpatheticAnger__6) %>%
  mutate(across(everything(), haven::zap_labels)) # drop SPSS value labels so items are plain numeric
print(dfSummary(empatheticanger_sc,
                varnumbers = FALSE,
                valid.col = FALSE,
                plain.ascii = FALSE,
                graph.magnif = 0.76),
      method = 'render')
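One quick complementary check is psych::describe(), which prints each item's minimum, maximum, skew, and kurtosis, making range restriction and strongly non-normal distributions easy to spot:
describe(empatheticanger_sc) # min, max, skew, and kurtosis for each item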
Let's look at the item intercorrelations to check for 4) linear relations
between indicators and 5) item redundancy.
pairs.panels(empatheticanger_sc,
             method = "pearson",
             density = TRUE, # show density plots on the diagonal
             gap = 0)

corPlot(empatheticanger_sc, scale = FALSE) # heat map of the item correlation matrix
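If a plain numeric matrix is easier to scan, psych::lowerCor() prints the lower triangle of the Pearson correlations; correlations much above .80 would suggest redundant items:
lowerCor(empatheticanger_sc) # lower triangle of the Pearson correlation matrix, rounded to 2 decimals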

# EGAnet package to examine item redundancy. See Christensen, Garrido, & Golino, 2023.
sc_uva <- UVA(data = empatheticanger_sc)
sc_uva
## Variable pairs with wTO > 0.30 (large-to-very large redundancy)
##
## ----
##
## Variable pairs with wTO > 0.25 (moderate-to-large redundancy)
##
## ----
##
## Variable pairs with wTO > 0.20 (small-to-moderate redundancy)
##
## node_i node_j wto
## EmpatheticAnger__3 EmpatheticAnger__5 0.245
## EmpatheticAnger__1 EmpatheticAnger__2 0.228
## EmpatheticAnger__5 EmpatheticAnger__6 0.228
## EmpatheticAnger__2 EmpatheticAnger__6 0.226
## EmpatheticAnger__3 EmpatheticAnger__4 0.214
## EmpatheticAnger__3 EmpatheticAnger__6 0.214
## EmpatheticAnger__1 EmpatheticAnger__6 0.205
Selecting the Number of Factors
set.seed(123) # enables replication
scree(empatheticanger_sc) # scree plots for the observed (PC) and reduced (FA) correlation matrices

pa <- fa.parallel(empatheticanger_sc, fm = "ml", fa = "fa") # for continuous data

## Parallel analysis suggests that the number of factors = 1 and the number of components = NA
# for ordinal data, cor="poly"
# pa_ord <- fa.parallel(empatheticanger_sc,fm="uls",fa="fa",cor="poly")
# I also like to use EGA to evaluate the number of factors.
# Bootstrapped EGA (bootEGA) is best practice, but plain EGA is faster; a sketch follows below.
sc_ega <- EGA(data = empatheticanger_sc)
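A minimal bootstrapped-EGA sketch is left commented out here because it is slow; the 500 iterations and the seed value are illustrative choices, and the seed argument assumes a recent EGAnet version:
# sc_boot <- bootEGA(data = empatheticanger_sc, iter = 500, seed = 123) # slower than EGA
# dimensionStability(sc_boot) # stability of items and dimensions across bootstraps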

Now let's run the exploratory factor analysis (with a single factor, rotation is unnecessary).
# ML estimation assumes continuous indicators and multivariate normality.
# Use estimator = "mlr" for robust standard errors and a scaled test statistic.
# Add missing = "ml" to use full-information ML estimation for missing data.
efa_mlr <- efa(data = empatheticanger_sc, nfactors = 1, rotation = "none", estimator = "mlr")
summary(efa_mlr)
## This is lavaan 0.6-19 -- running exploratory factor analysis
##
## Estimator ML
## Rotation method NONE
##
## Number of observations 482
##
## Fit measures:
## aic bic sabic chisq df pvalue cfi rmsea
## nfactors = 1 7664.558 7714.694 7676.607 9.823 9 0.365 0.999 0.022
##
## Eigenvalues correlation matrix:
##
## ev1 ev2 ev3 ev4 ev5 ev6
## 4.921 0.306 0.241 0.212 0.166 0.153
##
## Standardized loadings: (* = significant at 1% level)
##
## f1 unique.var communalities
## EmpatheticAnger__1 0.870* 0.243 0.757
## EmpatheticAnger__2 0.869* 0.245 0.755
## EmpatheticAnger__3 0.909* 0.174 0.826
## EmpatheticAnger__4 0.838* 0.297 0.703
## EmpatheticAnger__5 0.909* 0.174 0.826
## EmpatheticAnger__6 0.917* 0.159 0.841
##
## f1
## Sum of squared loadings 4.708
## Proportion of total 1.000
## Proportion var 0.785
## Cumulative var 0.785
# use the polychoric correlation matrix as input if the data are ordinal
# (with 6 items, at most 3 factors are identified, so nfactors = 1:3)
# efa_poly <- efa(sample.cov = sc_cor_poly, nfactors = 1:3, sample.nobs = 482,
#                 rotation = "cf-facparsim", estimator = "uls", ov.names = names(empatheticanger_sc))
# summary(efa_poly)
# use ULS if the indicators are continuous but multivariate normality is a concern
# efa_uls <- efa(data = empatheticanger_sc, nfactors = 1:3, rotation = "cf-facparsim", estimator = "uls")
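The sc_cor_poly matrix in the polychoric template above would need to be computed first; a hedged sketch (assuming the items are treated as ordered categories and psych::polychoric converges on them) is:
# sc_poly <- psych::polychoric(empatheticanger_sc) # polychoric correlations for ordinal items
# sc_cor_poly <- sc_poly$rho # correlation matrix to pass to efa(sample.cov = ...)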