Step 1: Generate a Reliability Coefficient for a Set of Items

Using the SPSS Anxiety Questionnaire, from Field, Ch. 17

library(psych)
library(tidyverse)

Registered S3 methods overwritten by 'dbplyr':
  method         from
  print.tbl_lazy     
  print.tbl_sql      
── Attaching packages ─────────────────────────────── tidyverse 1.3.0 ──
✓ ggplot2 3.3.3     ✓ purrr   0.3.4
✓ tibble  3.0.6     ✓ dplyr   1.0.4
✓ tidyr   1.1.2     ✓ stringr 1.4.0
✓ readr   1.4.0     ✓ forcats 0.5.1
── Conflicts ────────────────────────────────── tidyverse_conflicts() ──
x ggplot2::%+%()   masks psych::%+%()
x ggplot2::alpha() masks psych::alpha()
x dplyr::filter()  masks stats::filter()
x dplyr::lag()     masks stats::lag()

library(corrr)

library(haven)
saq <- read_dta("saq.dta")

Explore your data with glimpse

glimpse(saq)

Rows: 2,571
Columns: 23
$ Q01 <dbl+lbl> 2, 1, 2, 3, 2, 2, 2, 2, 3, 2, 2, 2, 3, 2, 2, 3, 1, 2, …
$ Q02 <dbl+lbl> 1, 1, 3, 1, 1, 1, 3, 2, 3, 4, 1, 1, 1, 2, 2, 1, 2, 2, …
$ Q03 <dbl+lbl> 4, 4, 2, 1, 3, 3, 3, 3, 1, 4, 5, 3, 3, 1, 3, 2, 5, 3, …
$ Q04 <dbl+lbl> 2, 3, 2, 4, 2, 2, 2, 2, 4, 3, 2, 3, 4, 2, 4, 2, 2, 3, …
$ Q05 <dbl+lbl> 2, 2, 4, 3, 2, 4, 2, 2, 5, 2, 2, 4, 3, 2, 2, 2, 1, 3, …
$ Q06 <dbl+lbl> 2, 2, 1, 3, 3, 4, 2, 2, 3, 1, 1, 3, 2, 2, 2, 2, 1, 4, …
$ Q07 <dbl+lbl> 3, 2, 2, 4, 3, 4, 2, 2, 5, 2, 2, 3, 3, 3, 3, 2, 1, 3, …
$ Q08 <dbl+lbl> 1, 2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 1, 3, 2, 2, 2, 1, 2, …
$ Q09 <dbl+lbl> 1, 5, 2, 2, 4, 4, 3, 4, 3, 3, 5, 3, 2, 2, 2, 2, 4, 5, …
$ Q10 <dbl+lbl> 2, 2, 2, 4, 2, 3, 2, 2, 3, 2, 2, 2, 3, 3, 3, 3, 1, 2, …
$ Q11 <dbl+lbl> 1, 2, 3, 2, 2, 2, 2, 2, 5, 2, 1, 2, 3, 2, 2, 2, 1, 3, …
$ Q12 <dbl+lbl> 2, 3, 3, 2, 3, 4, 2, 3, 5, 3, 3, 3, 4, 4, 3, 3, 2, 3, …
$ Q13 <dbl+lbl> 2, 1, 2, 2, 3, 3, 2, 2, 5, 2, 1, 2, 4, 2, 2, 2, 1, 3, …
$ Q14 <dbl+lbl> 2, 3, 4, 3, 2, 3, 2, 2, 5, 1, 2, 2, 4, 4, 3, 3, 1, 3, …
$ Q15 <dbl+lbl> 2, 4, 2, 3, 2, 5, 2, 3, 5, 2, 1, 3, 4, 4, 3, 2, 1, 4, …
$ Q16 <dbl+lbl> 3, 3, 3, 3, 2, 2, 2, 2, 5, 3, 2, 3, 4, 4, 4, 3, 2, 3, …
$ Q17 <dbl+lbl> 1, 2, 2, 2, 2, 3, 2, 2, 5, 2, 2, 2, 3, 2, 2, 2, 2, 2, …
$ Q18 <dbl+lbl> 2, 2, 3, 4, 3, 5, 2, 2, 5, 2, 2, 2, 3, 4, 3, 3, 1, 2, …
$ Q19 <dbl+lbl> 3, 3, 1, 2, 3, 1, 3, 4, 2, 3, 5, 3, 2, 1, 3, 2, 4, 2, …
$ Q20 <dbl+lbl> 2, 4, 4, 4, 4, 5, 2, 3, 5, 3, 3, 4, 4, 5, 4, 3, 2, 3, …
$ Q21 <dbl+lbl> 2, 4, 3, 4, 2, 3, 2, 2, 5, 2, 2, 3, 4, 5, 4, 2, 1, 3, …
$ Q22 <dbl+lbl> 2, 4, 2, 4, 4, 1, 4, 4, 3, 4, 5, 4, 3, 3, 4, 3, 4, 3, …
$ Q23 <dbl+lbl> 5, 2, 2, 3, 4, 4, 4, 4, 3, 4, 5, 4, 4, 1, 4, 4, 4, 4, …

`summary` gives a compact overview of each item: minimum, quartiles, median, mean, and maximum.

summary(saq)

      Q01             Q02             Q03             Q04       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:1.000   1st Qu.:2.000   1st Qu.:2.000  
 Median :2.000   Median :1.000   Median :3.000   Median :3.000  
 Mean   :2.374   Mean   :1.623   Mean   :2.585   Mean   :2.786  
 3rd Qu.:3.000   3rd Qu.:2.000   3rd Qu.:3.000   3rd Qu.:3.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
      Q05             Q06             Q07             Q08       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:1.000   1st Qu.:2.000   1st Qu.:2.000  
 Median :3.000   Median :2.000   Median :3.000   Median :2.000  
 Mean   :2.722   Mean   :2.227   Mean   :2.924   Mean   :2.237  
 3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:3.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
      Q09             Q10             Q11             Q12       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:3.000  
 Median :3.000   Median :2.000   Median :2.000   Median :3.000  
 Mean   :2.846   Mean   :2.281   Mean   :2.255   Mean   :3.159  
 3rd Qu.:4.000   3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:4.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
      Q13             Q14             Q15             Q16       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000  
 Median :2.000   Median :3.000   Median :3.000   Median :3.000  
 Mean   :2.449   Mean   :2.876   Mean   :2.766   Mean   :2.879  
 3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:3.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
      Q17             Q18             Q19             Q20       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:1.000   1st Qu.:3.000  
 Median :2.000   Median :2.000   Median :2.000   Median :4.000  
 Mean   :2.467   Mean   :2.569   Mean   :2.292   Mean   :3.624  
 3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:4.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
      Q21             Q22             Q23       
 Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:3.000  
 Median :3.000   Median :3.000   Median :4.000  
 Mean   :3.171   Mean   :2.888   Mean   :3.434  
 3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000

Now, create a correlation matrix for all of your indicators:

corr_matrix <- correlate(saq)


Correlation method: 'pearson'
Missing treated using: 'pairwise.complete.obs'

Q03 is not scored in the same direction as the other items, so we need to reverse-code it:

glimpse(saq$Q03)

 dbl+lbl [1:2571] 4, 4, 2, 1, 3, 3, 3, 3, 1, 4, 5, 3, 3, 1, 3, 2, 5...
 @ label       : chr "Standard deviations excite me"
 @ format.stata: chr "%12.0g"
 @ labels      : Named num [1:5] 1 2 3 4 5
  ..- attr(*, "names")= chr [1:5] "Strongly agree" "Agree" "Neither" "Disagree" ...

First, use `mutate` to create the reversed item (on a 1-5 scale, 6 minus the response), then use `select` to drop the original.

saq.clean <- saq %>%
  mutate(Q03R = 6 - Q03) %>%
  select(-Q03)

Double check that the reverse coding worked:

glimpse(saq.clean$Q03R)

 num [1:2571] 2 2 4 5 3 3 3 3 5 2 ...
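The same reversal can be written for any response range as (minimum + maximum) minus the response. The sketch below is illustrative only: `reverse_code` is a hypothetical helper, not part of the original notebook or of `psych`, and the toy vector simply reuses the first few Q03 values shown above.

```r
# Hypothetical helper (an assumption, not from the notebook):
# reverse-code a Likert item given the scale's minimum and maximum.
reverse_code <- function(x, min_val = 1, max_val = 5) {
  (min_val + max_val) - x
}

# Toy responses on a 1-5 scale (first values of Q03, plus a 5)
q <- c(4, 4, 2, 1, 3, 5)
q_rev <- reverse_code(q)
print(q_rev)

# Reversing twice should return the original responses
print(reverse_code(q_rev))
```

Writing the range explicitly makes it harder to reuse `6 - x` by accident on an item scored 0-4 or 1-7.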

Reliability 101: Calculating Coefficient Alpha

Let’s start with a subscale: select the questions about computers.

computers <- saq.clean %>%
  select(Q06, Q07, Q10, Q13, Q14, Q15, Q18)

Check the correlations:

correlate(computers)


Correlation method: 'pearson'
Missing treated using: 'pairwise.complete.obs'

term <chr>	Q06 <dbl>	Q07 <dbl>	Q10 <dbl>	Q13 <dbl>	Q14 <dbl>	Q15 <dbl>	Q18 <dbl>
Q06	NA	0.5135805	0.3222302	0.4664049	0.4022441	0.3598931	0.5133216
Q07	0.5135805	NA	0.2837230	0.4421193	0.4407028	0.3913668	0.5008668
Q10	0.3222302	0.2837230	NA	0.3019671	0.2546873	0.2952344	0.2925030
Q13	0.4664049	0.4421193	0.3019671	NA	0.4497863	0.3421970	0.5329371
Q14	0.4022441	0.4407028	0.2546873	0.4497863	NA	0.3801148	0.4983061
Q15	0.3598931	0.3913668	0.2952344	0.3421970	0.3801148	NA	0.3428704
Q18	0.5133216	0.5008668	0.2925030	0.5329371	0.4983061	0.3428704	NA

How reliable would these items be as a scale? Get a Cronbach’s alpha for this measure:

psych::alpha(computers)


Reliability analysis   
Call: psych::alpha(x = computers)

	raw_alpha <dbl>	std.alpha <dbl>	G6(smc) <dbl>	average_r <dbl>	S/N <dbl>	ase <dbl>	mean <dbl>	sd <dbl>	median_r <dbl>
	0.8233595	0.8214131	0.8051715	0.3965265	4.599515	0.005216248	2.584597	0.7099879	0.3913668


 lower alpha upper     95% confidence boundaries
0.81 0.82 0.83 

 Reliability if an item is dropped:

	raw_alpha <dbl>	std.alpha <dbl>	G6(smc) <dbl>	average_r <dbl>	S/N <dbl>	alpha se <dbl>	var.r <dbl>	med.r <dbl>
Q06	0.7905934	0.7885424	0.7651928	0.3832922	3.729080	0.006290270	0.008077541	0.3801148
Q07	0.7904747	0.7887922	0.7657251	0.3836465	3.734673	0.006293846	0.007926764	0.3598931
Q10	0.8239024	0.8240881	0.8011692	0.4384474	4.684663	0.005342093	0.004291948	0.4421193
Q13	0.7936766	0.7905203	0.7677567	0.3861097	3.773733	0.006167027	0.008064726	0.3801148
Q14	0.7979749	0.7955608	0.7730056	0.3934144	3.891431	0.006035109	0.008508588	0.3598931
Q15	0.8118706	0.8093487	0.7875040	0.4143587	4.245179	0.005603716	0.009452473	0.4421193
Q18	0.7855331	0.7836346	0.7578478	0.3764168	3.621811	0.006427460	0.005826593	0.3801148


 Item statistics

	n <dbl>	raw.r <dbl>	std.r <dbl>	r.cor <dbl>	r.drop <dbl>	mean <dbl>	sd <dbl>
Q06	2571	0.7482507	0.7356097	0.6818773	0.6187265	2.227149	1.1220023
Q07	2571	0.7463776	0.7345169	0.6799128	0.6190323	2.923765	1.1023600
Q10	2571	0.5429157	0.5655016	0.4392362	0.3999218	2.280825	0.8771293
Q13	2571	0.7203114	0.7269201	0.6690493	0.6067436	2.449242	0.9485040
Q14	2571	0.7030734	0.7043912	0.6361547	0.5768249	2.876313	0.9985724
Q15	2571	0.6376030	0.6397954	0.5410779	0.4912675	2.766239	1.0093935
Q18	2571	0.7619867	0.7568146	0.7168204	0.6473895	2.568650	1.0531837


Non missing response frequency for each item
       1    2    3    4    5 miss
Q06 0.27 0.44 0.13 0.10 0.06    0
Q07 0.07 0.34 0.26 0.24 0.09    0
Q10 0.14 0.57 0.18 0.10 0.02    0
Q13 0.12 0.48 0.25 0.12 0.03    0
Q14 0.06 0.31 0.38 0.18 0.07    0
Q15 0.07 0.39 0.30 0.18 0.06    0
Q18 0.14 0.37 0.31 0.12 0.06    0
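To see where the numbers in this output come from, here is a minimal base R sketch of the two usual alpha formulas. Both function names are hypothetical helpers, not `psych` functions, and the raw-alpha version assumes complete data (the items above have no missing responses). Raw alpha uses item and total-score variances; standardized alpha depends only on the number of items k and the average inter-item correlation.

```r
# Hypothetical helper: raw Cronbach's alpha from an items matrix or data frame.
# alpha = k / (k - 1) * (1 - sum(item variances) / variance of the total score)
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_vars <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Hypothetical helper: standardized alpha from k and the mean inter-item r.
# alpha_std = k * r_bar / (1 + (k - 1) * r_bar)
std_alpha <- function(k, mean_r) {
  (k * mean_r) / (1 + (k - 1) * mean_r)
}

# Using the values printed above: 7 items, average_r = 0.3965265
alpha_7 <- std_alpha(7, 0.3965265)
print(round(alpha_7, 4))

# Sanity check on toy data: perfectly redundant items give alpha = 1
x <- c(1, 2, 3, 4, 5)
print(cronbach_alpha(cbind(x, x, x)))
```

The standardized value computed this way should agree with the `std.alpha` column reported by `psych::alpha` to about four decimal places.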

What happens when we add an item that doesn’t fit?

computers2 <- saq.clean %>%
  select(Q06, Q07, Q10, Q13, Q14, Q15, Q18, Q23)

correlate(computers2)


Correlation method: 'pearson'
Missing treated using: 'pairwise.complete.obs'

term <chr>	Q06 <dbl>	Q07 <dbl>	Q10 <dbl>	Q13 <dbl>	Q14 <dbl>	Q15 <dbl>	Q18 <dbl>
Q06	NA	0.51358048	0.32223023	0.46640487	0.40224407	0.35989309	0.51332164
Q07	0.51358048	NA	0.28372299	0.44211926	0.44070276	0.39136675	0.50086685
Q10	0.32223023	0.28372299	NA	0.30196707	0.25468730	0.29523438	0.29250304
Q13	0.46640487	0.44211926	0.30196707	NA	0.44978632	0.34219704	0.53293713
Q14	0.40224407	0.44070276	0.25468730	0.44978632	NA	0.38011484	0.49830615
Q15	0.35989309	0.39136675	0.29523438	0.34219704	0.38011484	NA	0.34287045
Q18	0.51332164	0.50086685	0.29250304	0.53293713	0.49830615	0.34287045	NA
Q23	-0.06868743	-0.07029016	-0.06191796	-0.05298304	-0.04847418	-0.06200665	-0.08041698

psych::alpha(computers2)

Some items were negatively correlated with the total scale and probably 
should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option

Some items ( Q23 ) were negatively correlated with the total scale and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option
Reliability analysis   
Call: psych::alpha(x = computers2)

	raw_alpha <dbl>	std.alpha <dbl>	G6(smc) <dbl>	average_r <dbl>	S/N <dbl>	ase <dbl>	mean <dbl>	sd <dbl>	median_r <dbl>
	0.7583306	0.7581306	0.7644954	0.28151	3.134463	0.006841916	2.69083	0.6229974	0.3425337


 lower alpha upper     95% confidence boundaries
0.74 0.76 0.77 

 Reliability if an item is dropped:

	raw_alpha <dbl>	std.alpha <dbl>	G6(smc) <dbl>	average_r <dbl>	S/N <dbl>	alpha se <dbl>	var.r <dbl>	med.r <dbl>
Q06	0.7043739	0.7064843	0.7127982	0.2558711	2.406973	0.008540944	0.048307569	0.3019671
Q07	0.7047206	0.7068428	0.7133920	0.2562005	2.411139	0.008526461	0.048217587	0.3222302
Q10	0.7448782	0.7454375	0.7525410	0.2949454	2.928309	0.007329939	0.057095429	0.3913668
Q13	0.7100036	0.7078575	0.7150631	0.2571358	2.422987	0.008306891	0.049338756	0.3222302
Q14	0.7137442	0.7132105	0.7207455	0.2621387	2.486878	0.008200353	0.051215179	0.3222302
Q15	0.7296619	0.7291328	0.7372036	0.2777434	2.691846	0.007721724	0.055643945	0.3222302
Q18	0.7007186	0.7016947	0.7052388	0.2515187	2.352271	0.008627160	0.045045792	0.3222302
Q23	0.8233595	0.8214131	0.8051715	0.3965265	4.599515	0.005216248	0.007741658	0.3913668


 Item statistics

	n <dbl>	raw.r <dbl>	std.r <dbl>	r.cor <dbl>	r.drop <dbl>	mean <dbl>	sd <dbl>
Q06	2571	0.7317550	0.7198083	0.6820689	0.59657109	2.227149	1.1220023
Q07	2571	0.7295515	0.7183892	0.6797031	0.59655956	2.923765	1.1023600
Q10	2571	0.5284170	0.5514846	0.4353581	0.38339470	2.280825	0.8771293
Q13	2571	0.7071832	0.7143603	0.6725937	0.59016289	2.449242	0.9485040
Q14	2571	0.6909381	0.6928088	0.6399751	0.56152727	2.876313	0.9985724
Q15	2571	0.6228186	0.6255873	0.5395326	0.47324134	2.766239	1.0093935
Q18	2571	0.7429958	0.7385577	0.7152359	0.62201277	2.568650	1.0531837
Q23	2571	0.1181643	0.1138946	-0.1020857	-0.09151165	3.434461	1.0437337


Non missing response frequency for each item
       1    2    3    4    5 miss
Q06 0.27 0.44 0.13 0.10 0.06    0
Q07 0.07 0.34 0.26 0.24 0.09    0
Q10 0.14 0.57 0.18 0.10 0.02    0
Q13 0.12 0.48 0.25 0.12 0.03    0
Q14 0.06 0.31 0.38 0.18 0.07    0
Q15 0.07 0.39 0.30 0.18 0.06    0
Q18 0.14 0.37 0.31 0.12 0.06    0
Q23 0.06 0.12 0.27 0.42 0.12    0
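The `r.drop` column is what flags Q23 here: it is the correlation between an item and the scale formed from the remaining items. A rough base R sketch of that idea follows, using simulated data rather than the SAQ. The helper `item_rest_cor` and the toy item names are hypothetical, and `psych::alpha` may differ in details such as missing-data handling.

```r
# Hypothetical helper: correlate each item with the sum of the other items
item_rest_cor <- function(items) {
  items <- as.matrix(items)
  out <- sapply(seq_len(ncol(items)), function(j) {
    rest <- rowSums(items[, -j, drop = FALSE])
    cor(items[, j], rest)
  })
  names(out) <- colnames(items)
  out
}

# Toy data: four items share a common factor, one item is pure noise
set.seed(2024)
n <- 500
f <- rnorm(n)
toy <- data.frame(
  a = f + rnorm(n, sd = 0.5),
  b = f + rnorm(n, sd = 0.5),
  c = f + rnorm(n, sd = 0.5),
  d = f + rnorm(n, sd = 0.5),
  misfit = rnorm(n)
)

r_drop <- item_rest_cor(toy)
print(round(r_drop, 2))
```

An item whose item-rest correlation sits near zero (or below), like `misfit` in the toy data or Q23 above, contributes mostly noise to the total, which is why alpha rises when it is dropped.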

Reliability 102: Creating a scale score using alpha

Now we can create a new variable, `computers_var`, constructed by taking the mean of our 7 computer items.

my.keys.list <- list(computers_var = c("Q06","Q07","Q10", "Q13","Q14", "Q15", "Q18"))
                     
my.scales <- scoreItems(my.keys.list, saq.clean, impute = "none")

print(my.scales, short = FALSE)

Call: scoreItems(keys = my.keys.list, items = saq.clean, impute = "none")

(Standardized) Alpha:
      computers_var
alpha          0.82

Standard errors of unstandardized Alpha:
      computers_var
ASE          0.0094

Standardized Alpha of observed scales:
     computers_var
[1,]          0.82

Average item correlation:
          computers_var
average.r           0.4

Median item correlation:
computers_var 
         0.39 

 Guttman 6* reliability: 
         computers_var
Lambda.6          0.81

Signal/Noise based upon av.r : 
             computers_var
Signal/Noise           4.7

Scale intercorrelations corrected for attenuation 
 raw correlations below the diagonal, alpha on the diagonal 
 corrected correlations above the diagonal:

Note that these are the correlations of the complete scales based on the correlation matrix,
 not the observed scales based on the raw items.
              computers_var
computers_var          0.82

Item by scale correlations:
 corrected for item overlap and scale reliability
    computers_var
Q06          0.68
Q07          0.68
Q10          0.44
Q13          0.67
Q14          0.64
Q15          0.54
Q18          0.72

Non missing response frequency for each item
       1    2    3    4    5 miss
Q06 0.27 0.44 0.13 0.10 0.06    0
Q07 0.07 0.34 0.26 0.24 0.09    0
Q10 0.14 0.57 0.18 0.10 0.02    0
Q13 0.12 0.48 0.25 0.12 0.03    0
Q14 0.06 0.31 0.38 0.18 0.07    0
Q15 0.07 0.39 0.30 0.18 0.06    0
Q18 0.14 0.37 0.31 0.12 0.06    0

Save scored scales as new variables

my.scores <- as_tibble(my.scales$scores)

Attach New Variable to Old Data Set (SAQ)

saq.clean1 <- bind_cols(saq.clean, my.scores)
glimpse(saq.clean1)
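As a cross-check on what the scale score is, the sketch below computes a mean score by hand with `rowMeans`. It assumes `scoreItems` is left at its default of averaging items (`totals = FALSE`) and that no items are reverse-keyed. The column name `computers_mean` is hypothetical, and the toy data reuse the first three respondents' answers to three of the items.

```r
# Toy data: first three respondents on three of the computer items
items <- data.frame(
  Q06 = c(2, 2, 1),
  Q07 = c(3, 2, 2),
  Q10 = c(2, 2, 2)
)

# Scale score = mean of the items in the key, row by row
scale_items <- c("Q06", "Q07", "Q10")
items$computers_mean <- rowMeans(items[, scale_items], na.rm = TRUE)
print(items)
```

With `na.rm = TRUE`, a respondent who skips an item is scored on the items they did answer, which is worth keeping in mind when data are incomplete.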

Performing an EFA

Now, let’s look at the larger SAQ survey: how many distinct factors does it measure? We can use a scree plot with parallel analysis to help decide how many components to retain.

Scree plot

fa.parallel(saq.clean, fm = "pa", fa = "pc")

Parallel analysis suggests that the number of factors =  NA  and the number of components =  4

From this plot, we can see that the first component has by far the largest eigenvalue. The parallel analysis also suggests retaining 4 components.
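The logic behind parallel analysis can be sketched in a few lines of base R: compare the eigenvalues of the observed correlation matrix with eigenvalues obtained from random, uncorrelated data of the same size. This is a simplified illustration on simulated data, not the algorithm `fa.parallel` implements. The 95th-percentile threshold and the number of simulations are assumptions chosen for the example.

```r
# Simplified parallel analysis for principal components (base R only)
set.seed(123)
n <- 500   # respondents
p <- 6     # items

# Toy data: six items driven by one common factor plus noise
f <- rnorm(n)
X <- sapply(seq_len(p), function(j) 0.7 * f + rnorm(n, sd = 0.7))

# Eigenvalues of the observed correlation matrix
obs_eig <- eigen(cor(X))$values

# Eigenvalues from random, uncorrelated data of the same dimensions
n_sims <- 100
sim_eig <- replicate(n_sims, eigen(cor(matrix(rnorm(n * p), n, p)))$values)
threshold <- apply(sim_eig, 1, quantile, probs = 0.95)

# Retain components until the first one that fails to beat its threshold
beats <- obs_eig > threshold
n_keep <- if (all(beats)) p else which(!beats)[1] - 1
print(round(rbind(observed = obs_eig, random_95 = threshold), 2))
print(n_keep)
```

A component is retained only if it explains more variance than a component of the same rank would by chance alone.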

We now proceed to construct a PCA model, with a single principal component.

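Basic code for the analysis: note the function `principal` for a PCA. `psych` works from a correlation matrix; here we pass the item responses as a matrix, and `principal` computes the correlations itself when it is given raw data rather than a correlation matrix.

saqcor <- as.matrix(saq.clean)
pcaModel1 <- principal(saqcor, rotate = "none")
pcaModel1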

Principal Components Analysis
Call: principal(r = saqcor, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix

	PC1 <S3: AsIs>	h2 <dbl>	u2 <dbl>	com <dbl>
Q01	0.59	0.34348421	0.6565158	1
Q02	-0.30	0.09155895	0.9084411	1
Q04	0.63	0.40254127	0.5974587	1
Q05	0.56	0.30863215	0.6913678	1
Q06	0.56	0.31568021	0.6843198	1
Q07	0.69	0.46947768	0.5305223	1
Q08	0.55	0.30129735	0.6987027	1
Q09	-0.28	0.08056908	0.9194309	1
Q10	0.44	0.19103889	0.8089611	1
Q11	0.65	0.42571726	0.5742827	1


                PC1
SS loadings    7.29
Proportion Var 0.32

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.07 
 with the empirical chi square  6633.48  with prob <  0 

Fit based upon off diagonal values = 0.94

The SS loadings value is the eigenvalue of the single principal component, 7.29. Dividing by the 23 items gives the proportion of variance explained, 7.29 / 23 ≈ 32%, which leaves 68% of the variance unexplained. We thus fit a second model with 4 components, as suggested by the parallel analysis.

pcaModel2 <- principal(saqcor, nfactors = 4, rotate = "none")
pcaModel2

Principal Components Analysis
Call: principal(r = saqcor, nfactors = 4, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix

	PC1 <S3: AsIs>	PC2 <S3: AsIs>	PC3 <S3: AsIs>	PC4 <S3: AsIs>	h2 <dbl>	u2 <dbl>	com <dbl>
Q01	0.59	0.18	-0.22	0.12	0.4346477	0.5653523	1.557831
Q02	-0.30	0.55	0.15	0.01	0.4137525	0.5862475	1.725000
Q04	0.63	0.14	-0.15	0.15	0.4685890	0.5314110	1.342992
Q05	0.56	0.10	-0.07	0.14	0.3430498	0.6569502	1.229209
Q06	0.56	0.10	0.57	-0.05	0.6539317	0.3460683	2.072765
Q07	0.69	0.04	0.25	0.10	0.5452943	0.4547057	1.324076
Q08	0.55	0.40	-0.32	-0.42	0.7394635	0.2605365	3.471975
Q09	-0.28	0.63	-0.01	0.10	0.4844805	0.5155195	1.456238
Q10	0.44	0.03	0.36	-0.10	0.3347726	0.6652274	2.075668
Q11	0.65	0.25	-0.21	-0.40	0.6896049	0.3103951	2.239247


                       PC1  PC2  PC3  PC4
SS loadings           7.29 1.74 1.32 1.23
Proportion Var        0.32 0.08 0.06 0.05
Cumulative Var        0.32 0.39 0.45 0.50
Proportion Explained  0.63 0.15 0.11 0.11
Cumulative Proportion 0.63 0.78 0.89 1.00

Mean item complexity =  1.8
Test of the hypothesis that 4 components are sufficient.

The root mean square of the residuals (RMSR) is  0.06 
 with the empirical chi square  4006.15  with prob <  0 

Fit based upon off diagonal values = 0.96

print(pcaModel2$loadings)


Loadings:
     PC1    PC2    PC3    PC4   
Q01   0.586  0.175 -0.215  0.119
Q02  -0.303  0.548  0.146       
Q04   0.634  0.144 -0.149  0.153
Q05   0.556  0.101         0.137
Q06   0.562         0.571       
Q07   0.685         0.252  0.103
Q08   0.549  0.401 -0.323 -0.417
Q09  -0.284  0.627         0.103
Q10   0.437         0.363 -0.103
Q11   0.652  0.245 -0.209 -0.400
Q12   0.669                0.248
Q13   0.673         0.278       
Q14   0.656         0.198  0.135
Q15   0.593         0.117 -0.113
Q16   0.679        -0.138       
Q17   0.643  0.330 -0.210 -0.342
Q18   0.701         0.298  0.125
Q19  -0.427  0.390              
Q20   0.436 -0.205 -0.404  0.297
Q21   0.658        -0.187  0.282
Q22  -0.302  0.465 -0.116  0.378
Q23  -0.144  0.366         0.507
Q03R  0.629 -0.290 -0.213       

                 PC1   PC2   PC3   PC4
SS loadings    7.290 1.739 1.317 1.227
Proportion Var 0.317 0.076 0.057 0.053
Cumulative Var 0.317 0.393 0.450 0.503

The eigenvalues for the components are λ1 = 7.29 (as in the one-component model), λ2 = 1.74, λ3 = 1.32, and λ4 = 1.23. Together these components account for 50% of the variance in the items. Blanks in this table are loadings with absolute value below .10; they are small, not necessarily 0.
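The link between loadings, eigenvalues, and variance explained can be checked directly with `eigen()`. The sketch below uses simulated data rather than the SAQ and is meant only to illustrate the arithmetic. Unrotated PCA loadings are eigenvectors scaled by the square root of their eigenvalues, so each column's sum of squared loadings equals its eigenvalue, and dividing by the number of items gives the proportion of variance.

```r
# Toy data: five variables, two of them strongly related
set.seed(42)
X <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
X[, 2] <- X[, 1] + rnorm(200, sd = 0.5)

R <- cor(X)
e <- eigen(R)

# Unrotated component loadings: eigenvectors scaled by sqrt(eigenvalue)
loadings <- e$vectors %*% diag(sqrt(e$values))

# "SS loadings" for each component equals its eigenvalue
ss_loadings <- colSums(loadings^2)

# Proportion of variance = eigenvalue / number of variables
prop_var <- e$values / ncol(X)

# Communality (h2) for a 2-component solution: row sums of squared loadings
h2 <- rowSums(loadings[, 1:2]^2)

print(round(rbind(eigenvalue = e$values, ss_loadings, prop_var), 3))
print(round(h2, 3))
```

The same arithmetic applied to the output above gives (7.29 + 1.74 + 1.32 + 1.23) / 23 ≈ 0.50 for the four retained components.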

Orthogonal Rotation (here, varimax)

varimaxModel <- fa(r=saqcor, nfactors = 4, rotate = "varimax", fm = "pa")
print(varimaxModel)

Factor Analysis using method =  pa
Call: fa(r = saqcor, nfactors = 4, rotate = "varimax", fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix

	PA1 <S3: AsIs>	PA3 <S3: AsIs>	PA4 <S3: AsIs>	PA2 <S3: AsIs>	h2 <dbl>	u2 <dbl>	com <dbl>
Q01	0.50	0.22	0.27	0.00	0.3729330	0.6270670	1.930587
Q02	-0.21	-0.03	0.01	-0.46	0.2604337	0.7395663	1.400413
Q04	0.53	0.28	0.25	0.03	0.4193328	0.5806672	2.013873
Q05	0.44	0.27	0.19	0.05	0.2985570	0.7014430	2.102467
Q06	0.05	0.75	0.12	0.10	0.5931994	0.4068006	1.097851
Q07	0.36	0.56	0.16	0.13	0.4885395	0.5114605	2.051839
Q08	0.22	0.15	0.76	0.00	0.6455682	0.3544318	1.246963
Q09	-0.13	-0.07	0.06	-0.56	0.3384921	0.6615079	1.170776
Q10	0.14	0.38	0.14	0.12	0.1971635	0.8028365	1.788870
Q11	0.24	0.27	0.69	0.17	0.6294611	0.3705389	1.701855


                       PA1  PA3  PA4  PA2
SS loadings           3.03 2.85 1.99 1.44
Proportion Var        0.13 0.12 0.09 0.06
Cumulative Var        0.13 0.26 0.34 0.40
Proportion Explained  0.33 0.31 0.21 0.15
Cumulative Proportion 0.33 0.63 0.85 1.00

Mean item complexity =  1.8
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are  253  and the objective function was  7.55 with Chi Square of  19334.49
The degrees of freedom for the model are 167  and the objective function was  0.46 

The root mean square of the residuals (RMSR) is  0.03 
The df corrected root mean square of the residuals is  0.03 

The harmonic number of observations is  2571 with the empirical chi square  880.48  with prob <  2.3e-97 
The total number of observations was  2571  with Likelihood Chi Square =  1166.49  with prob <  2.1e-149 

Tucker Lewis Index of factoring reliability =  0.921
RMSEA index =  0.048  and the 90 % confidence intervals are  0.046 0.051
BIC =  -144.8
Fit based upon off diagonal values = 0.99
Measures of factor score adequacy             
                                                   PA1  PA3  PA4  PA2
Correlation of (regression) scores with factors   0.83 0.86 0.86 0.77
Multiple R square of scores with factors          0.69 0.73 0.74 0.59
Minimum correlation of possible factor scores     0.37 0.46 0.49 0.19

Examine pattern of factor loadings

print(varimaxModel$loadings)


Loadings:
     PA1    PA3    PA4    PA2   
Q01   0.504  0.218  0.266       
Q02  -0.209               -0.464
Q04   0.527  0.281  0.248       
Q05   0.436  0.268  0.187       
Q06          0.752  0.124       
Q07   0.364  0.559  0.161  0.132
Q08   0.218  0.149  0.759       
Q09  -0.133               -0.559
Q10   0.142  0.380  0.135  0.121
Q11   0.237  0.267  0.688  0.170
Q12   0.510  0.398  0.108  0.153
Q13   0.287  0.564  0.228  0.144
Q14   0.388  0.485  0.148  0.130
Q15   0.276  0.377  0.251  0.200
Q16   0.543  0.279  0.245  0.156
Q17   0.295  0.274  0.641       
Q18   0.366  0.612  0.136  0.130
Q19  -0.282 -0.146        -0.375
Q20   0.465                0.204
Q21   0.594  0.264  0.149  0.150
Q22         -0.158        -0.465
Q23                       -0.329
Q03R  0.505  0.181  0.159  0.399

                 PA1   PA3   PA4   PA2
SS loadings    3.034 2.854 1.986 1.435
Proportion Var 0.132 0.124 0.086 0.062
Cumulative Var 0.132 0.256 0.342 0.405
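An orthogonal rotation redistributes variance across factors without changing how much of each item is explained. The base R sketch below illustrates this with `stats::varimax` on simulated data. It rotates principal-component loadings for simplicity, which is an assumption of the example and not what `fa(..., fm = "pa")` does internally.

```r
# Toy data: two uncorrelated factors, three items each
set.seed(1)
n <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
noise <- function() rnorm(n, sd = 0.6)
X <- cbind(f1 + noise(), f1 + noise(), f1 + noise(),
           f2 + noise(), f2 + noise(), f2 + noise())

# Unrotated loadings on the first two principal components
e <- eigen(cor(X))
A <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

# Varimax rotation (orthogonal)
rot <- varimax(A)
A_rot <- unclass(rot$loadings)

# Communalities before and after rotation
h2_before <- rowSums(A^2)
h2_after <- rowSums(A_rot^2)
print(round(cbind(h2_before, h2_after), 3))

# The rotation matrix is orthogonal: t(T) %*% T is the identity
print(round(crossprod(rot$rotmat), 3))
```

The same property is visible in the `fa` output in this document: the `h2` column is identical for the varimax and promax solutions, while the individual loadings differ.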

Oblique Rotation (here, promax)

promaxModel <- fa(r=saqcor, nfactors = 4, rotate = "promax", fm = "pa")
print(promaxModel)

Factor Analysis using method =  pa
Call: fa(r = saqcor, nfactors = 4, rotate = "promax", fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix

	PA1 <S3: AsIs>	PA3 <S3: AsIs>	PA4 <S3: AsIs>	PA2 <S3: AsIs>	h2 <dbl>	u2 <dbl>	com <dbl>
Q01	0.56	0.01	0.15	0.14	0.3729330	0.6270670	1.266840
Q02	-0.20	0.09	0.07	0.45	0.2604337	0.7395663	1.535971
Q04	0.57	0.09	0.10	0.13	0.4193328	0.5806672	1.222447
Q05	0.45	0.12	0.05	0.09	0.2985570	0.7014430	1.253429
Q06	-0.28	0.96	-0.06	0.00	0.5931994	0.4068006	1.171760
Q07	0.23	0.55	-0.04	0.01	0.4885395	0.5114605	1.358244
Q08	0.07	-0.12	0.84	0.05	0.6455682	0.3544318	1.062517
Q09	-0.07	0.00	0.12	0.56	0.3384921	0.6615079	1.119974
Q10	0.00	0.41	0.03	-0.05	0.1971635	0.8028365	1.043854
Q11	0.04	0.03	0.72	-0.11	0.6294611	0.3705389	1.061639


                       PA1  PA3  PA4  PA2
SS loadings           3.44 2.80 1.85 1.22
Proportion Var        0.15 0.12 0.08 0.05
Cumulative Var        0.15 0.27 0.35 0.40
Proportion Explained  0.37 0.30 0.20 0.13
Cumulative Proportion 0.37 0.67 0.87 1.00

 With factor correlations of 
      PA1   PA3   PA4   PA2
PA1  1.00  0.68  0.55 -0.44
PA3  0.68  1.00  0.58 -0.37
PA4  0.55  0.58  1.00 -0.20
PA2 -0.44 -0.37 -0.20  1.00

Mean item complexity =  1.3
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are  253  and the objective function was  7.55 with Chi Square of  19334.49
The degrees of freedom for the model are 167  and the objective function was  0.46 

The root mean square of the residuals (RMSR) is  0.03 
The df corrected root mean square of the residuals is  0.03 

The harmonic number of observations is  2571 with the empirical chi square  880.48  with prob <  2.3e-97 
The total number of observations was  2571  with Likelihood Chi Square =  1166.49  with prob <  2.1e-149 

Tucker Lewis Index of factoring reliability =  0.921
RMSEA index =  0.048  and the 90 % confidence intervals are  0.046 0.051
BIC =  -144.8
Fit based upon off diagonal values = 0.99
Measures of factor score adequacy             
                                                   PA1  PA3  PA4  PA2
Correlation of (regression) scores with factors   0.93 0.93 0.91 0.81
Multiple R square of scores with factors          0.86 0.86 0.83 0.66
Minimum correlation of possible factor scores     0.72 0.73 0.67 0.31

Examine pattern of factor loadings

print(promaxModel$loadings)


Loadings:
     PA1    PA3    PA4    PA2   
Q01   0.559         0.146  0.139
Q02  -0.202                0.451
Q04   0.566         0.102  0.129
Q05   0.454  0.123              
Q06  -0.276  0.960              
Q07   0.232  0.550              
Q08         -0.122  0.839       
Q09                 0.116  0.565
Q10          0.405              
Q11                 0.717 -0.115
Q12   0.508  0.283              
Q13   0.104  0.566              
Q14   0.297  0.444              
Q15   0.146  0.302  0.143 -0.109
Q16   0.568                     
Q17   0.147         0.643       
Q18   0.219  0.630              
Q19  -0.258                0.323
Q20   0.582 -0.193        -0.107
Q21   0.669                     
Q22   0.112 -0.126         0.480
Q23   0.139                0.363
Q03R  0.530               -0.293

                 PA1   PA3   PA4   PA2
SS loadings    2.957 2.588 1.753 1.162
Proportion Var 0.129 0.113 0.076 0.051
Cumulative Var 0.129 0.241 0.317 0.368
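With an oblique rotation the factors are allowed to correlate, so the loadings printed here are pattern coefficients and the factor correlation matrix has to be read alongside them. The sketch below shows that relationship with base R's `stats::promax` on simulated data. It is an illustration under stated assumptions: `psych`'s promax implementation may differ in normalization, and principal-component loadings stand in for principal-axis loadings.

```r
# Toy data: two correlated factors, three items each
set.seed(7)
n <- 300
f1 <- rnorm(n)
f2 <- 0.5 * f1 + sqrt(0.75) * rnorm(n)
noise <- function() rnorm(n, sd = 0.6)
X <- cbind(f1 + noise(), f1 + noise(), f1 + noise(),
           f2 + noise(), f2 + noise(), f2 + noise())

# Unrotated loadings on the first two principal components
e <- eigen(cor(X))
A <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

# Promax rotation (oblique)
pm <- promax(A, m = 4)
P <- unclass(pm$loadings)          # pattern matrix

# Factor correlations implied by the rotation matrix
T_inv <- solve(pm$rotmat)
Phi <- T_inv %*% t(T_inv)

# Structure matrix: correlations between items and factors
S <- P %*% Phi

print(round(Phi, 2))
print(round(cbind(P, S), 2))
```

The pattern matrix together with the factor correlations reproduces the same common variance as the unrotated solution, which is why fit statistics do not change with rotation.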

Create a path diagram for your factor model:

fa.diagram(promaxModel)

Compare the rotated and unrotated solutions:

pcaModel2$loadings


Loadings:
     PC1    PC2    PC3    PC4   
Q01   0.586  0.175 -0.215  0.119
Q02  -0.303  0.548  0.146       
Q04   0.634  0.144 -0.149  0.153
Q05   0.556  0.101         0.137
Q06   0.562         0.571       
Q07   0.685         0.252  0.103
Q08   0.549  0.401 -0.323 -0.417
Q09  -0.284  0.627         0.103
Q10   0.437         0.363 -0.103
Q11   0.652  0.245 -0.209 -0.400
Q12   0.669                0.248
Q13   0.673         0.278       
Q14   0.656         0.198  0.135
Q15   0.593         0.117 -0.113
Q16   0.679        -0.138       
Q17   0.643  0.330 -0.210 -0.342
Q18   0.701         0.298  0.125
Q19  -0.427  0.390              
Q20   0.436 -0.205 -0.404  0.297
Q21   0.658        -0.187  0.282
Q22  -0.302  0.465 -0.116  0.378
Q23  -0.144  0.366         0.507
Q03R  0.629 -0.290 -0.213       

                 PC1   PC2   PC3   PC4
SS loadings    7.290 1.739 1.317 1.227
Proportion Var 0.317 0.076 0.057 0.053
Cumulative Var 0.317 0.393 0.450 0.503

promaxModel$loadings


Loadings:
     PA1    PA3    PA4    PA2   
Q01   0.559         0.146  0.139
Q02  -0.202                0.451
Q04   0.566         0.102  0.129
Q05   0.454  0.123              
Q06  -0.276  0.960              
Q07   0.232  0.550              
Q08         -0.122  0.839       
Q09                 0.116  0.565
Q10          0.405              
Q11                 0.717 -0.115
Q12   0.508  0.283              
Q13   0.104  0.566              
Q14   0.297  0.444              
Q15   0.146  0.302  0.143 -0.109
Q16   0.568                     
Q17   0.147         0.643       
Q18   0.219  0.630              
Q19  -0.258                0.323
Q20   0.582 -0.193        -0.107
Q21   0.669                     
Q22   0.112 -0.126         0.480
Q23   0.139                0.363
Q03R  0.530               -0.293

                 PA1   PA3   PA4   PA2
SS loadings    2.957 2.588 1.753 1.162
Proportion Var 0.129 0.113 0.076 0.051
Cumulative Var 0.129 0.241 0.317 0.368

Obtain the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy:

KMO(saq.clean)

Kaiser-Meyer-Olkin factor adequacy
Call: KMO(r = saq.clean)
Overall MSA =  0.93
MSA for each item = 
 Q01  Q02  Q04  Q05  Q06  Q07  Q08  Q09  Q10  Q11  Q12  Q13  Q14  Q15  Q16  Q17  Q18 
0.93 0.87 0.96 0.96 0.89 0.94 0.87 0.83 0.95 0.91 0.95 0.95 0.97 0.94 0.93 0.93 0.95 
 Q19  Q20  Q21  Q22  Q23 Q03R 
0.94 0.89 0.93 0.88 0.77 0.95
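The overall MSA compares the squared correlations among items with the squared partial correlations (each pair controlling for all other items). Values near 1 mean the correlations are largely shared across items, which is what a factor analysis needs. The base R sketch below implements that standard definition on simulated data. `kmo_overall` is a hypothetical helper, not the `psych::KMO` function used above.

```r
# Hypothetical helper: overall KMO / MSA from a correlation matrix
kmo_overall <- function(R) {
  R_inv <- solve(R)
  d <- sqrt(diag(R_inv))
  partial <- -R_inv / outer(d, d)      # partial correlations (off-diagonal)
  off <- row(R) != col(R)              # ignore the diagonal
  r2 <- sum(R[off]^2)
  q2 <- sum(partial[off]^2)
  r2 / (r2 + q2)
}

# Toy data: six items sharing one common factor
set.seed(99)
n <- 400
f <- rnorm(n)
X <- sapply(1:6, function(j) f + rnorm(n, sd = 0.8))

kmo_toy <- kmo_overall(cor(X))
print(round(kmo_toy, 2))

# With only two variables the partial correlation equals the correlation,
# so the measure is exactly 0.5
print(kmo_overall(matrix(c(1, 0.5, 0.5, 1), nrow = 2)))
```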

Predict factor scores for each observation:

factor_scores <- as_tibble(predict.psych(promaxModel, data = saq.clean))
glimpse(factor_scores)

Rows: 2,571
Columns: 4
$ PA1 <dbl> -1.13517005, -0.42378942, 0.09458159, 0.70889630, -0.75319639, 0.29313526…
$ PA3 <dbl> -0.656584615, -0.569931941, -0.337382551, 0.585575018, -0.017007515, 1.43…
$ PA4 <dbl> -1.53009408, -0.38298198, -0.03500232, -0.23971412, -0.42446084, 0.041457…
$ PA2 <dbl> -0.10056536, 0.59965388, -0.63252646, -0.45316706, 0.56126494, -0.5954397…

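Correlate your factor scores and compare them with the factor correlations reported by `fa` above. The pattern is the same, but the estimated scores correlate somewhat more strongly than the factors themselves (for example, 0.78 versus 0.68 for PA1 and PA3):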

correlate(factor_scores)


Correlation method: 'pearson'
Missing treated using: 'pairwise.complete.obs'

term <chr>	PA1 <dbl>	PA3 <dbl>	PA4 <dbl>	PA2 <dbl>
PA1	NA	0.7835073	0.6531389	-0.5679174
PA3	0.7835073	NA	0.6601635	-0.4759607
PA4	0.6531389	0.6601635	NA	-0.2751292
PA2	-0.5679174	-0.4759607	-0.2751292	NA

KEtNTykgdGVzdCBmb3Igc2FtcGxpbmcgYWRlcXVhY3k6CmBgYHtyfQpLTU8oc2FxLmNsZWFuKQpgYGAKCiMjIFByZWRpY3QgZmFjdG9yIHNjb3JlcyBmb3IgZWFjaCBvYnNlcnZhdGlvbjoKYGBge3J9CmZhY3Rvcl9zY29yZXMgPC1hc190aWJibGUocHJlZGljdC5wc3ljaChwcm9tYXhNb2RlbCwgZGF0YT1zYXEuY2xlYW4pKQpnbGltcHNlKGZhY3Rvcl9zY29yZXMpCmBgYAoKIyMgQ29ycmVsYXRlIHlvdXIgZmFjdG9yIHNjb3JlcyAoc2hvdWxkIGJlIHNhbWUgYXMgcHJldmlvdXMgcmVzdWx0cyk6CmBgYHtyfQpjb3JyZWxhdGUoZmFjdG9yX3Njb3JlcykKYGBgCgo=

Multivariate Statistics Module 9: Intro to Exploratory Factor Analysis

Dr. Broda

Step 1: Generate a Reliability Coefficient for a Set of Items

Using the SPSS Anxiety Questionnaire, from Field, Ch. 17

Explore your data with glimpse

`summary` provides a nice, compact coverage of mean, median, etc.

Now, create a correlation matrix for all of your indicators:

Q03 is not scaled in the same direction as the other items, so we need to reverse-code it:

First, `mutate` and create the reverse item, then use `select` to drop the original

Double check that the reverse coding worked:

Reliability 101: Calculating Coefficient Alpha

Let’s start with a subscale - select the questions about computers

Check the correlations:

How reliable would these items be as a scale? Get a Cronbach’s alpha for this measure:

What happens when we add an item that doesn’t fit?

Reliability 102: Creating a scale score using alpha

Save scored scales as new variables

Attach New Variable to Old Data Set (SAQ)

Performing an EFA

We now proceed to construct a PCA model, with a single principal component.

Orthogonal Rotation (also called varimax)

Examine pattern of factor loadings

Oblique Rotation (also called promax)

Examine pattern of factor loadings

Create a path diagram for your factor model:

Compare the rotated and unrotated solutions:

Obtain Kaiser-Meyer-Olkin (KMO) test for sampling adequacy:

Predict factor scores for each observation:

Correlate your factor scores (should be the same as the previous results):
