This vignette demonstrates Exploratory Factor Analysis (EFA)
functions from the rwf package. EFA is used to uncover the
latent structure underlying a set of observed variables — it identifies
groups of variables that share common variance, which are interpreted as
factors. The sections below cover loading plots, scree plots, loading
tables, residual diagnostics, and a full EFA report.
Installation instructions for rwf can be found here
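plot_loadings produces a heatmap of factor loadings,
making it easy to see which variables belong to which factor and how
strongly they load. Two matrix types are available through the matrix_type argument:
"pattern": the pattern matrix, whose loadings act like regression weights and show each factor's unique contribution to a variable.
"structure": the structure matrix, whose loadings are the correlations between each variable and each factor.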
Four datasets are shown to illustrate how the plot looks under different factor structures:
mtcars: a real dataset with a moderate two-factor structure.
model<-psych::fa(mtcars,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
plot_loadings(model=model,matrix_type="structure")
plot_loadings(model=model,matrix_type="pattern")
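For an oblique rotation such as oblimin, the two matrix types plotted above are linked by the factor correlation matrix: the structure matrix equals the pattern matrix multiplied by the factor correlations. The base R sketch below illustrates that relationship with made-up values (the loadings of 0.9 and the factor correlation of 0.125 are assumptions chosen for illustration, not output from rwf or psych).

```r
# Hypothetical pattern matrix: X1-X3 load on F1, X4-X6 load on F2
pattern_mat <- matrix(
  c(0.9, 0.9, 0.9, 0,   0,   0,
    0,   0,   0,   0.9, 0.9, 0.9),
  ncol = 2,
  dimnames = list(paste0("X", 1:6), c("F1", "F2"))
)

# Hypothetical factor correlation matrix (assumed value of 0.125)
phi <- matrix(c(1, 0.125, 0.125, 1), ncol = 2,
              dimnames = list(c("F1", "F2"), c("F1", "F2")))

# Structure matrix = pattern matrix %*% factor correlations
structure_mat <- pattern_mat %*% phi
round(structure_mat, 2)
```

When the factors are uncorrelated (an identity matrix for phi), the two matrices coincide, which is why the distinction only matters for oblique rotations.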
# Simulated data: two blocks of three variables (r = .8 within a block, r = .1 between blocks)
cm<-matrix(c(1,.8,.8,.1,.1,.1,
.8,1,.8,.1,.1,.1,
.8,.8,1,.1,.1,.1,
.1,.1,.1,1,.8,.8,
.1,.1,.1,.8,1,.8,
.1,.1,.1,.8,.8,1),
ncol=6,nrow=6)
df1<-generate_correlation_matrix(cm,nrows=10000)
model1<-psych::fa(df1,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
plot_loadings(model=model1,matrix_type="pattern",base_size=10)
# Simulated data: one block of three correlated variables (r = .8); all other correlations are .1
cm<-matrix(c(1,.1,.1,.1,.1,.1,
.1,1,.1,.1,.1,.1,
.1,.1,1,.1,.1,.1,
.1,.1,.1,1,.8,.8,
.1,.1,.1,.8,1,.8,
.1,.1,.1,.8,.8,1),
ncol=6,nrow=6)
df1<-generate_correlation_matrix(cm,nrows=10000)
model2<-psych::fa(df1,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
plot_loadings(model=model2,matrix_type="pattern",base_size=10)
# Simulated data: all correlations are .01, so there is no factor structure to recover
cm<-matrix(c(1,.01,.01,.01,.01,.01,
.01,1,.01,.01,.01,.01,
.01,.01,1,.01,.01,.01,
.01,.01,.01,1,.01,.01,
.01,.01,.01,.01,1,.01,
.01,.01,.01,.01,.01,1),
ncol=6,nrow=6)
df1<-generate_correlation_matrix(cm,nrows=10000)
model3<-psych::fa(df1,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
plot_loadings(model=model3,matrix_type="pattern",base_size=10)
A scree plot shows eigenvalues (the amount of variance each factor explains) in descending order. It is used to decide how many factors to retain. The conventional rule is to keep factors whose eigenvalue exceeds 1 (Kaiser criterion) or to look for the “elbow” — the point where the curve flattens out.
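As a rough illustration of the Kaiser criterion, the sketch below computes the eigenvalues of the mtcars correlation matrix in base R and counts those above 1. It uses the full correlation matrix (the principal components variant); whether plot_scree uses the same eigenvalues is an assumption not confirmed here.

```r
# Eigenvalues of the correlation matrix, returned in descending order
eigenvalues <- eigen(cor(mtcars))$values

# Kaiser criterion: count eigenvalues greater than 1
n_kaiser <- sum(eigenvalues > 1)

# Share of total variance attributed to each eigenvalue
prop_var <- eigenvalues / sum(eigenvalues)

round(eigenvalues, 2)
n_kaiser
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue of 1 corresponds to the variance of a single standardized variable, which is the reasoning behind the cutoff.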
plot_scree plots eigenvalues alongside parallel analysis
results. Parallel analysis compares the observed eigenvalues against those
obtained from random data of the same size. Factors with eigenvalues above the parallel
analysis line are unlikely to be noise and should be retained.
plot_scree(df=mtcars,title="",base_size=15)
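The idea behind parallel analysis can be sketched in base R. This is a minimal version that assumes random normal data, 200 simulations, and the mean random eigenvalue as the comparison line; it is not the implementation used by plot_scree, which may use a different reference (for example a percentile or factor-based eigenvalues).

```r
set.seed(1)  # make the simulation reproducible

observed <- eigen(cor(mtcars))$values
n <- nrow(mtcars)
p <- ncol(mtcars)

# Eigenvalues from random normal datasets with the same dimensions (n x p)
n_sims <- 200
random <- replicate(n_sims, {
  noise <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(noise))$values
})
random_mean <- rowMeans(random)  # mean eigenvalue at each position

# Count the leading observed eigenvalues that exceed the random reference
above <- observed > random_mean
n_retain <- if (all(above)) p else which(!above)[1] - 1
n_retain
```

Only the leading run of eigenvalues above the reference line is counted, since a later eigenvalue exceeding the line after an earlier one has dropped below it is not usually treated as a factor.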
model_loadings returns the factor loading matrix as a
formatted table. The cut argument suppresses loadings below
a threshold so only meaningful loadings are shown — this makes factor
interpretation cleaner. The sort argument reorders
variables so that items loading on the same factor appear together.
Three calls demonstrate the options:
cut = NULL: all loadings shown, pattern matrix.
cut = 0.4: only loadings with an absolute value of at least 0.4 shown, structure matrix. This is a common threshold for “meaningful” loadings in applied research.
cut = 0.4, matrix_type = "all": both pattern and structure matrices stacked in one table, unsorted.
model<-psych::fa(mtcars,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
model_loadings(model=model,cut=NULL,matrix_type="pattern")
##       Matrix variable   PA1  PA2
## qsec Pattern     qsec -0.92
## hp   Pattern       hp  0.89
## carb Pattern     carb  0.85
## vs   Pattern       vs  -0.8
## cyl  Pattern      cyl
## mpg  Pattern      mpg
## am   Pattern       am       0.93
## gear Pattern     gear       0.93
## drat Pattern     drat       0.78
## wt   Pattern       wt
## disp Pattern     disp
model_loadings(model=model,cut=0.4,matrix_type="structure")
##         Matrix variable   PA1   PA2
## hp   Structure       hp  0.93
## cyl  Structure      cyl  0.85 -0.68
## vs   Structure       vs -0.83
## qsec Structure     qsec -0.82
## carb Structure     carb  0.78
## mpg  Structure      mpg -0.76  0.72
## am   Structure       am        0.89
## gear Structure     gear        0.87
## drat Structure     drat        0.83
## wt   Structure       wt  0.61 -0.81
## disp Structure     disp  0.75 -0.77
model_loadings(model=model,cut=0.4,matrix_type="all",sort=FALSE)
##       Matrix variable   PA1   PA2
## 1    Pattern      mpg -0.61  0.55
## 2    Pattern      cyl  0.71 -0.48
## 3    Pattern     disp  0.58  -0.6
## 4    Pattern       hp  0.89
## 5    Pattern     drat        0.78
## 6    Pattern       wt  0.42  -0.7
## 7    Pattern     qsec -0.92
## 8    Pattern       vs  -0.8
## 9    Pattern       am        0.93
## 10   Pattern     gear        0.93
## 11   Pattern     carb  0.85
## 12 Structure      mpg -0.76  0.72
## 13 Structure      cyl  0.85 -0.68
## 14 Structure     disp  0.75 -0.77
## 15 Structure       hp  0.93
## 16 Structure     drat        0.83
## 17 Structure       wt  0.61 -0.81
## 18 Structure     qsec -0.82
## 19 Structure       vs -0.83
## 20 Structure       am        0.89
## 21 Structure     gear        0.87
## 22 Structure     carb  0.78
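The effect of a cut threshold and of sorting can be approximated by hand. The sketch below is an illustration only, not the code behind model_loadings: the small loading matrix is hypothetical (the values 0.05 and 0.10 are invented placeholders for suppressed loadings), and the sort rule, grouping by the factor with the largest absolute loading, is an assumption.

```r
# Hypothetical loading matrix; 0.05 and 0.10 are invented placeholder values
loadings_mat <- matrix(
  c(-0.61,  0.55,
     0.89,  0.05,
     0.10,  0.78,
     0.42, -0.70),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("mpg", "hp", "drat", "wt"), c("PA1", "PA2"))
)

# Cut: blank out loadings whose absolute value is below the threshold
cut_value <- 0.4
shown <- loadings_mat
shown[abs(shown) < cut_value] <- NA

# Sort: group variables by their dominant factor, strongest loading first
primary  <- apply(abs(loadings_mat), 1, which.max)
strength <- apply(abs(loadings_mat), 1, max)
sorted <- shown[order(primary, -strength), ]
sorted
```

Comparing on absolute values keeps strong negative loadings such as wt on PA2 visible, which matches the negative entries that survive cut = 0.4 in the output above.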
Residuals in EFA are the differences between the observed correlation matrix and the correlation matrix reproduced by the factor model. A good-fitting model produces small residuals.
compute_residual_stats returns summary statistics of
these residuals (mean, RMSR — root mean square of residuals, and the
proportion of residuals above a given threshold). A low RMSR (typically
< 0.05) and few large residuals indicate that the factor model
adequately reproduces the observed correlations.
model<-psych::fa(mtcars,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
compute_residual_stats(model)
##                       residual_statistics       value critical                              formula
## 1              Root Mean Squared Residual  0.04419293       NA              sqrt(mean(residuals^2))
## 2     Number of absolute residuals > 0.05 13.00000000       NA                  abs(residuals)>0.05
## 3 Proportion of absolute residuals > 0.05  0.23636364      0.5 numberLargeResiduals/nrow(residuals)
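The statistics in this table can be worked through on a small example. The sketch below is a base R illustration under stated assumptions: the observed matrix is the two-block correlation matrix used elsewhere in this vignette, the loadings are hypothetical, and the factors are deliberately treated as uncorrelated so that the model misses the between-block correlations of .1. It uses the unique off-diagonal residuals, which appears consistent with the counts reported above but is not confirmed as the rwf implementation.

```r
# Observed correlations: r = .8 within each block of three, r = .1 between blocks
block <- matrix(0.8, 3, 3)
cross <- matrix(0.1, 3, 3)
observed <- rbind(cbind(block, cross), cbind(cross, block))
diag(observed) <- 1

# Hypothetical loadings for two uncorrelated factors (assumption for illustration)
lambda <- cbind(c(rep(sqrt(0.8), 3), rep(0, 3)),
                c(rep(0, 3), rep(sqrt(0.8), 3)))

# Correlations reproduced by the model, and the resulting residuals
reproduced <- lambda %*% t(lambda)
resid_mat <- observed - reproduced
res <- resid_mat[upper.tri(resid_mat)]  # unique off-diagonal residuals

rmsr <- sqrt(mean(res^2))            # root mean squared residual
n_large <- sum(abs(res) > 0.05)      # number of large residuals
prop_large <- n_large / length(res)  # proportion of large residuals
c(rmsr = rmsr, n_large = n_large, prop_large = prop_large)
```

Here the nine between-block residuals all equal .1, so the orthogonal model fails the 0.05 guideline, whereas allowing the factors to correlate would absorb those residuals.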
report_efa bundles all the above steps into a single
call — it fits the EFA model, extracts loadings, computes residual
statistics, and produces plots. It returns a structured list suitable
for inclusion in a report.
The example uses a simulated dataset with a known two-factor
structure (the same 6×6 correlation matrix used in the Plot Loadings
section), so the output should cleanly recover the two factors. The
computation chunk uses results='hide' to suppress
intermediate output; the formatted result is printed in the next
chunk.
cm<-matrix(c(1,.8,.8,.1,.1,.1,
.8,1,.8,.1,.1,.1,
.8,.8,1,.1,.1,.1,
.1,.1,.1,1,.8,.8,
.1,.1,.1,.8,1,.8,
.1,.1,.1,.8,.8,1),
ncol=6,nrow=6)
df1<-generate_correlation_matrix(cm,nrows=10000)
model<-psych::fa(df1,nfactors=2,rotate="oblimin",fm="pa",oblique.scores=TRUE)
result<-report_efa(model=model,df=df1)
result
## $correlations
##                        type            X1            X2            X3            X4            X5            X6
## X1  reproduced correlations  7.985130e-01  8.032711e-01  8.004885e-01  9.997577e-02  1.037210e-01  9.918369e-02
## X2  reproduced correlations  8.032711e-01  8.080860e-01  8.052810e-01  9.581687e-02  9.960742e-02  9.505945e-02
## X3  reproduced correlations  8.004885e-01  8.052810e-01  8.024868e-01  9.644434e-02  1.002171e-01  9.568160e-02
## X4  reproduced correlations  9.997577e-02  9.581687e-02  9.644434e-02  8.047802e-01  8.014203e-01  7.981180e-01
## X5  reproduced correlations  1.037210e-01  9.960742e-02  1.002171e-01  8.014203e-01  7.980965e-01  7.947862e-01
## X6  reproduced correlations  9.918369e-02  9.505945e-02  9.568160e-02  7.981180e-01  7.947862e-01  7.915111e-01
## X11   observed correlations  1.000000e+00  8.033186e-01  8.004536e-01  9.937601e-02  1.044944e-01  9.901257e-02
## X21   observed correlations  8.033186e-01  1.000000e+00  8.053887e-01  9.466142e-02  9.906744e-02  9.676602e-02
## X31   observed correlations  8.004536e-01  8.053887e-01  1.000000e+00  9.820125e-02  9.998851e-02  9.413973e-02
## X41   observed correlations  9.937601e-02  9.466142e-02  9.820125e-02  1.000000e+00  8.015439e-01  7.981726e-01
## X51   observed correlations  1.044944e-01  9.906744e-02  9.998851e-02  8.015439e-01  1.000000e+00  7.947313e-01
## X61   observed correlations  9.901257e-02  9.676602e-02  9.413973e-02  7.981726e-01  7.947313e-01  1.000000e+00
## X12   residual correlations  2.014870e-01  4.755595e-05 -3.486313e-05 -5.997661e-04  7.733228e-04 -1.711178e-04
## X22   residual correlations  4.755595e-05  1.919140e-01  1.077420e-04 -1.155448e-03 -5.399855e-04  1.706564e-03
## X32   residual correlations -3.486313e-05  1.077420e-04  1.975132e-01  1.756915e-03 -2.286182e-04 -1.541872e-03
## X42   residual correlations -5.997661e-04 -1.155448e-03  1.756915e-03  1.952198e-01  1.235887e-04  5.453037e-05
## X52   residual correlations  7.733228e-04 -5.399855e-04 -2.286182e-04  1.235887e-04  2.019035e-01 -5.485587e-05
## X62   residual correlations -1.711178e-04  1.706564e-03 -1.541872e-03  5.453037e-05 -5.485587e-05  2.084889e-01
##
## $npobs
##       X1    X2    X3    X4    X5    X6
## X1 10000 10000 10000 10000 10000 10000
## X2 10000 10000 10000 10000 10000 10000
## X3 10000 10000 10000 10000 10000 10000
## X4 10000 10000 10000 10000 10000 10000
## X5 10000 10000 10000 10000 10000 10000
## X6 10000 10000 10000 10000 10000 10000
##
## $residual_stats
##                       residual_statistics        value critical                              formula
## 1              Root Mean Squared Residual 0.0008594091       NA              sqrt(mean(residuals^2))
## 2     Number of absolute residuals > 0.05 0.0000000000       NA                  abs(residuals)>0.05
## 3 Proportion of absolute residuals > 0.05 0.0000000000      0.5 numberLargeResiduals/nrow(residuals)
##
## $determinant_test
##   determinant above_critical
## 1  0.01054455           TRUE
##
## $bartlett_test
##   x_squared[bartlett] df[bartlett] p[bartlett]
## 1            45504.01           15           0
##
## $kmo_test
##    Overall_MSA       MSA Kaiser_1974
## X1   0.7686563 0.7727995          NA
## X2   0.7686563 0.7652963          NA
## X3   0.7686563 0.7693330          NA
## X4   0.7686563 0.7633766          NA
## X5   0.7686563 0.7685973          NA
## X6   0.7686563 0.7726685          NA
##
## $loadings
##       Matrix variable       PA2       PA1               type row.names.model.Vaccounted.
## 1    Pattern       X2 0.9000000 0.0000000
## 2    Pattern       X3 0.9000000 0.0000000
## 3    Pattern       X1 0.8900000 0.0000000
## 4    Pattern       X4 0.0000000 0.9000000
## 5    Pattern       X5 0.0000000 0.8900000
## 6    Pattern       X6 0.0000000 0.8900000
## 7  Structure       X2 0.9000000 0.1100000
## 8  Structure       X3 0.9000000 0.1100000
## 9  Structure       X1 0.8900000 0.1100000
## 10 Structure       X4 0.1100000 0.9000000
## 11 Structure       X5 0.1100000 0.8900000
## 12 Structure       X6 0.1100000 0.8900000
## 13                    2.4090813 2.3943923 variance accounted                 SS loadings
## 14                    0.4015135 0.3990654 variance accounted              Proportion Var
## 15                    0.4015135 0.8005789 variance accounted              Cumulative Var
## 16                    0.5015290 0.4984710 variance accounted        Proportion Explained
## 17                    0.5015290 1.0000000 variance accounted       Cumulative Proportion
##
## $instruction_loading_critical_values
##    sample critical_loading
## 1      50             0.75
## 2      60             0.70
## 3      70             0.65
## 4      85             0.60
## 5     100             0.55
## 6     120             0.50
## 7     150             0.45
## 8     200             0.40
## 9     250             0.35
## 10    350             0.30
##
## $weights
##            PA2         PA1
## X1  0.33922216 -0.03717142
## X2  0.35853173 -0.04132905
## X3  0.34722926 -0.03994639
## X4 -0.04165732  0.36164385
## X5 -0.03829471  0.34794398
## X6 -0.03882476  0.33612103
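The Bartlett statistic in the report can be checked against the reported determinant using the standard formula for Bartlett's test of sphericity. The sketch below plugs in the determinant, sample size, and variable count shown in the output; that report_efa uses exactly this formula is an assumption, although the result should land close to the reported x_squared of 45504.01 with 15 degrees of freedom.

```r
det_r <- 0.01054455  # determinant reported in $determinant_test
n <- 10000           # observations per $npobs
p <- 6               # number of variables

# Bartlett's test of sphericity: chi-square statistic and degrees of freedom
chi_sq <- -(n - 1 - (2 * p + 5) / 6) * log(det_r)
df <- p * (p - 1) / 2

# Upper-tail p-value; with a statistic this large it is effectively zero
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round(chi_sq, 2)
df
```

A significant result indicates that the correlation matrix differs from an identity matrix, which is a precondition for factor analysis rather than evidence for any particular number of factors.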