Factor analysis can best be understood as a latent variable modeling paradigm in which a set of observed variables are the indicators of a latent variable. In this schema, the latent variable (e.g. intelligence) is of primary interest, but cannot be directly observed. However, it is theorized that the latent variable has a direct influence on each of the observed indicators (e.g. items on a scale, subscales in a battery of measures), so that they can in turn be used to gain insights into the latent variable. This idea is at the core of educational and psychological measurement of abilities.

library(foreign)
To begin, let us consider an example in which a researcher has collected data on achievement goal orientation using the 12-item Likert achievement goal scale. Each item has seven options ranging from “not at all like me” to “very true of me”.

## ITEMS

The items appear below.

- AGS1 = My goal is to completely master the material presented in my classes. (MAP)
- AGS2 = I want to avoid learning less than it is possible to learn. (MAV)
- AGS3 = It is important for me to do better than other students. (PAP)
- AGS4 = I want to avoid performing poorly compared to others. (PAV)
- AGS5 = I want to learn as much as possible. (MAP)
- AGS6 = It is important for me to avoid an incomplete understanding of the course material. (MAV)
- AGS7 = It is important for me to understand the content of my courses as thoroughly as possible. (MAP)
- AGS8 = My goal is to avoid performing worse than other students. (PAV)
- AGS9 = I want to do well compared to other students. (PAP)
- AGS10 = It is important for me to avoid doing poorly compared to other students. (PAV)
- AGS11 = My goal is to perform better than the other students. (PAP)
- AGS12 = My goal is to avoid learning less than I possibly could. (MAV)
The researcher would like to investigate the latent structure of achievement goal orientation, using the responses to these 12 items from 430 college students. The theory underlying the AGS states that there are four distinct latent traits: mastery approach (MAP), mastery avoidant (MAV), performance approach (PAP), and performance avoidant (PAV).
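A four-factor exploratory model of this kind can be fit with `factanal`. The sketch below is only an illustration on simulated data: the loading of 0.8, the error standard deviation of 0.6, the uncorrelated traits, and the object names `sim`, `lambda`, and `fit` are all assumptions, not the researcher's data or code. Only the item-to-subscale mapping and the sample size come from the text, so the numbers it prints will not match the output reported for the real responses.

```r
# Minimal sketch on SIMULATED data (a hypothetical stand-in for the AGS responses).
set.seed(123)
n <- 430  # sample size quoted in the text

# Item-to-subscale mapping taken from the item list.
subscale <- c(ags1 = "MAP", ags2 = "MAV", ags3 = "PAP", ags4 = "PAV",
              ags5 = "MAP", ags6 = "MAV", ags7 = "MAP", ags8 = "PAV",
              ags9 = "PAP", ags10 = "PAV", ags11 = "PAP", ags12 = "MAV")
traits <- c("MAP", "MAV", "PAP", "PAV")

# Assumed simple structure: each item loads 0.8 on its own trait and 0 elsewhere.
lambda <- outer(subscale, traits, FUN = "==") * 0.8
dimnames(lambda) <- list(names(subscale), traits)

scores <- matrix(rnorm(n * 4), nrow = n, ncol = 4)              # latent trait scores
noise  <- matrix(rnorm(n * 12, sd = 0.6), nrow = n, ncol = 12)  # unique (error) part
sim    <- scores %*% t(lambda) + noise
colnames(sim) <- names(subscale)

# Maximum likelihood factor analysis with an oblique (promax) rotation.
fit <- factanal(sim, factors = 4, rotation = "promax")
print(fit, cutoff = 0.3)  # hide loadings smaller than 0.3 in absolute value
```

With real data, `sim` would be replaced by the data frame or matrix holding the 12 item responses.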
##
## Call:
## factanal(x = ~ags1 + ags2 + ags3 + ags4 + ags5 + ags6 + ags7 + ags8 + ags9 + ags10 + ags11 + ags12, factors = 4, rotation = "promax")
##
## Uniquenesses:
## ags1 ags2 ags3 ags4 ags5 ags6 ags7 ags8 ags9 ags10 ags11 ags12
## 0.510 0.402 0.323 0.390 0.543 0.384 0.101 0.281 0.275 0.136 0.005 0.226
##
## Loadings:
## Factor1 Factor2 Factor3 Factor4
## ags1 0.656
## ags2 0.768
## ags3 0.804 0.126
## ags4 0.777 -0.112
## ags5 0.532 0.201
## ags6 0.789
## ags7 1.020 -0.118
## ags8 0.801 -0.102 0.108 0.181
## ags9 0.849
## ags10 0.933
## ags11 0.762 0.573
## ags12 0.921
##
## Factor1 Factor2 Factor3 Factor4
## SS loadings 4.069 2.396 1.525 0.397
## Proportion Var 0.339 0.200 0.127 0.033
## Cumulative Var 0.339 0.539 0.666 0.699
##
## Factor Correlations:
## Factor1 Factor2 Factor3 Factor4
## Factor1 1.0000 0.0706 -0.09984 -0.09467
## Factor2 0.0706 1.0000 0.09368 -0.64631
## Factor3 -0.0998 0.0937 1.00000 -0.00975
## Factor4 -0.0947 -0.6463 -0.00975 1.00000
##
## Test of the hypothesis that 4 factors are sufficient.
## The chi square statistic is 57.74 on 24 degrees of freedom.
## The p-value is 0.000132
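The uniquenesses report the proportion of each item's variance that is not shared with the retained factors. For example, approximately 51.0% of the variance in item ags1 (uniqueness 0.510) is not associated with the four retained factors. Subtracting each uniqueness from 1 gives the communalities: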
## ags1 ags2 ags3 ags4 ags5 ags6 ags7
## 0.4896984 0.5976648 0.6772644 0.6095133 0.4571493 0.6156570 0.8990315
## ags8 ags9 ags10 ags11 ags12
## 0.7187991 0.7250718 0.8641310 0.9950000 0.7744962
Thus, approximately 49% of the variance in ags1 is associated with the four factors, compared to 99.5% of the variance in ags11.
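The relationship between the two sets of numbers can be checked by hand. The sketch below retypes the rounded uniquenesses from the four-factor `factanal` output and subtracts them from 1. The vector names `u` and `communality` are illustrative, and because the inputs are rounded to three decimals the results agree with the printed communalities only to about three decimals.

```r
# Uniquenesses copied (rounded) from the four-factor factanal output.
u <- c(ags1 = 0.510, ags2 = 0.402, ags3 = 0.323, ags4 = 0.390,
       ags5 = 0.543, ags6 = 0.384, ags7 = 0.101, ags8 = 0.281,
       ags9 = 0.275, ags10 = 0.136, ags11 = 0.005, ags12 = 0.226)

# Communality = share of item variance associated with the retained factors.
communality <- 1 - u
print(round(communality, 3))

# Items with the most and least variance accounted for by the factors.
best  <- names(which.max(communality))
worst <- names(which.min(communality))
cat("Highest communality:", best, "\n")
cat("Lowest communality: ", worst, "\n")
```

With a fitted model object, the same quantity would be `1 - fit$uniquenesses`, where `fit` is a hypothetical name for the `factanal` result.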
Finally, the chi-square goodness of fit test is the last portion of the output from factanal. For this model, the chi-square was 57.74 with 24 degrees of freedom, and a p-value of 0.000132. This p-value is well below our α of 0.05, leading us to reject the null hypothesis that four factors are sufficient to account for the data.
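Both the degrees of freedom and the p-value can be reproduced from quantities in the output. The sketch below assumes the usual degrees-of-freedom formula for an exploratory factor model with p items and k factors, ((p - k)^2 - (p + k)) / 2, and uses the statistic printed in the output (57.74). The names `efa_df`, `df4`, and `p_value` are illustrative.

```r
# Degrees of freedom of an exploratory factor model with p items and k factors.
efa_df <- function(p, k) ((p - k)^2 - (p + k)) / 2

p   <- 12                 # number of AGS items
df4 <- efa_df(p, k = 4)   # four-factor model
cat("df for 4 factors:", df4, "\n")

# Upper-tail chi-square probability for the statistic printed by factanal.
chisq_stat <- 57.74
p_value <- pchisq(chisq_stat, df = df4, lower.tail = FALSE)
cat("p-value:", signif(p_value, 3), "\n")

# Decision at alpha = 0.05: reject "4 factors are sufficient" when p < alpha.
alpha <- 0.05
cat("Reject H0:", p_value < alpha, "\n")
```

The same function gives the degrees of freedom of smaller models, for example `efa_df(12, 3)` and `efa_df(12, 2)`.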
## Loading required package: MASS
## Loading required package: boot
##
## Attaching package: 'boot'
## The following object is masked from 'package:psych':
##
## logit
## Loading required package: lattice
##
## Attaching package: 'lattice'
## The following object is masked from 'package:boot':
##
## melanoma
##
## Attaching package: 'nFactors'
## The following object is masked from 'package:lattice':
##
## parallel
##
## Call:
## factanal(x = ~ags1 + ags2 + ags3 + ags4 + ags5 + ags6 + ags7 + ags8 + ags9 + ags10 + ags11 + ags12, factors = 3, rotation = "promax")
##
## Uniquenesses:
## ags1 ags2 ags3 ags4 ags5 ags6 ags7 ags8 ags9 ags10 ags11 ags12
## 0.513 0.410 0.307 0.433 0.543 0.384 0.103 0.281 0.270 0.187 0.295 0.217
##
## Loadings:
## Factor1 Factor2 Factor3
## ags1 0.648
## ags2 0.756
## ags3 0.836
## ags4 0.746
## ags5 0.531 0.201
## ags6 0.793
## ags7 1.018 -0.116
## ags8 0.838 -0.118 0.119
## ags9 0.852
## ags10 0.899
## ags11 0.838
## ags12 0.926
##
## Factor1 Factor2 Factor3
## SS loadings 4.199 2.404 1.519
## Proportion Var 0.350 0.200 0.127
## Cumulative Var 0.350 0.550 0.677
##
## Factor Correlations:
## Factor1 Factor2 Factor3
## Factor1 1.0000 0.0594 0.0928
## Factor2 0.0594 1.0000 0.6424
## Factor3 0.0928 0.6424 1.0000
##
## Test of the hypothesis that 3 factors are sufficient.
## The chi square statistic is 113.42 on 33 degrees of freedom.
## The p-value is 9.39e-11
##
## Call:
## factanal(x = ~ags1 + ags2 + ags3 + ags4 + ags5 + ags6 + ags7 + ags8 + ags9 + ags10 + ags11 + ags12, factors = 2, rotation = "promax")
##
## Uniquenesses:
## ags1 ags2 ags3 ags4 ags5 ags6 ags7 ags8 ags9 ags10 ags11 ags12
## 0.476 0.698 0.315 0.432 0.524 0.385 0.213 0.292 0.273 0.187 0.297 0.678
##
## Loadings:
## Factor1 Factor2
## ags1 0.724
## ags2 0.548
## ags3 0.831
## ags4 0.748
## ags5 0.692
## ags6 0.781
## ags7 0.889
## ags8 0.844
## ags9 0.848
## ags10 0.900
## ags11 0.841
## ags12 0.561
##
## Factor1 Factor2
## SS loadings 4.205 3.031
## Proportion Var 0.350 0.253
## Cumulative Var 0.350 0.603
##
## Factor Correlations:
## Factor1 Factor2
## Factor1 1.0000 0.0869
## Factor2 0.0869 1.0000
##
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 261.85 on 43 degrees of freedom.
## The p-value is 3.7e-33
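The "Proportion Var" and "Cumulative Var" rows in these summaries follow directly from the "SS loadings" row: each sum of squared loadings is divided by the number of items (12). As a small worked check, the sketch below redoes that arithmetic for the two-factor solution using the rounded values printed above. The object names are illustrative.

```r
# Sums of squared loadings copied from the two-factor factanal output.
ss_loadings <- c(Factor1 = 4.205, Factor2 = 3.031)
n_items <- 12

# Each standardized item contributes one unit of variance, so divide by 12.
prop_var <- ss_loadings / n_items
cum_var  <- cumsum(prop_var)

summary_table <- rbind("SS loadings"    = ss_loadings,
                       "Proportion Var" = prop_var,
                       "Cumulative Var" = cum_var)
print(round(summary_table, 3))
```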
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate =
## rotate, : A Heywood case was detected. Examine the loadings carefully.
## Factor Analysis using method = pa
## Call: fa(r = performance.data, nfactors = 4, rotate = "promax", residuals = TRUE,
## SMC = TRUE, fm = "pa")
##
## Warning: A Heywood case was detected.
## Standardized loadings (pattern matrix) based upon correlation matrix
## PA1 PA2 PA3 PA4 h2 u2 com
## ags1 0.02 0.71 0.05 0.08 0.54 0.46 1.0
## ags2 -0.03 0.00 0.79 -0.04 0.63 0.37 1.0
## ags3 0.84 0.04 -0.11 0.01 0.70 0.30 1.0
## ags4 0.71 -0.04 0.08 -0.28 0.67 0.33 1.3
## ags5 -0.04 0.56 0.18 -0.02 0.48 0.52 1.2
## ags6 0.04 0.74 0.01 -0.02 0.57 0.43 1.0
## ags7 -0.01 1.00 -0.11 -0.01 0.88 0.12 1.0
## ags8 0.88 -0.08 0.09 0.25 0.76 0.24 1.2
## ags9 0.84 0.07 -0.05 -0.05 0.73 0.27 1.0
## ags10 0.89 0.01 0.00 -0.08 0.82 0.18 1.0
## ags11 0.91 -0.01 0.00 0.32 0.81 0.19 1.2
## ags12 0.01 -0.04 0.88 0.01 0.74 0.26 1.0
##
## PA1 PA2 PA3 PA4
## SS loadings 4.26 2.38 1.46 0.22
## Proportion Var 0.36 0.20 0.12 0.02
## Cumulative Var 0.36 0.55 0.68 0.69
## Proportion Explained 0.51 0.29 0.18 0.03
## Cumulative Proportion 0.51 0.80 0.97 1.00
##
## With factor correlations of
## PA1 PA2 PA3 PA4
## PA1 1.00 0.06 0.10 -0.19
## PA2 0.06 1.00 0.64 -0.13
## PA3 0.10 0.64 1.00 -0.03
## PA4 -0.19 -0.13 -0.03 1.00
##
## Mean item complexity = 1.1
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 66 and the objective function was 7.87 with Chi Square of 2504.87
## The degrees of freedom for the model are 24 and the objective function was 0.22
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.03
##
## The harmonic number of observations is 324 with the empirical chi square 14.8 with prob < 0.93
## The total number of observations was 324 with MLE Chi Square = 68.07 with prob < 4.3e-06
##
## Tucker Lewis Index of factoring reliability = 0.95
## RMSEA index = 0.077 and the 90 % confidence intervals are 0.055 0.097
## BIC = -70.67
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## PA1 PA2 PA3 PA4
## Correlation of scores with factors 0.97 0.96 0.92 0.72
## Multiple R square of scores with factors 0.95 0.92 0.84 0.51
## Minimum correlation of possible factor scores 0.89 0.84 0.68 0.02
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate =
## rotate, : A Heywood case was detected. Examine the loadings carefully.
## Factor Analysis using method = pa
## Call: fa(r = performance.data, nfactors = 3, rotate = "promax", residuals = TRUE,
## SMC = TRUE, fm = "pa")
##
## Warning: A Heywood case was detected.
## Standardized loadings (pattern matrix) based upon correlation matrix
## PA1 PA2 PA3 h2 u2 com
## ags1 0.00 0.68 0.06 0.52 0.48 1.0
## ags2 -0.02 0.01 0.78 0.62 0.38 1.0
## ags3 0.85 0.05 -0.11 0.71 0.29 1.0
## ags4 0.74 0.03 0.04 0.56 0.44 1.0
## ags5 -0.04 0.56 0.18 0.48 0.52 1.2
## ags6 0.04 0.75 0.01 0.57 0.43 1.0
## ags7 -0.01 1.01 -0.11 0.88 0.12 1.0
## ags8 0.83 -0.12 0.12 0.71 0.29 1.1
## ags9 0.85 0.09 -0.06 0.73 0.27 1.0
## ags10 0.90 0.03 -0.01 0.82 0.18 1.0
## ags11 0.84 -0.08 0.04 0.71 0.29 1.0
## ags12 0.01 -0.04 0.89 0.74 0.26 1.0
##
## PA1 PA2 PA3
## SS loadings 4.20 2.38 1.47
## Proportion Var 0.35 0.20 0.12
## Cumulative Var 0.35 0.55 0.67
## Proportion Explained 0.52 0.30 0.18
## Cumulative Proportion 0.52 0.82 1.00
##
## With factor correlations of
## PA1 PA2 PA3
## PA1 1.00 0.06 0.10
## PA2 0.06 1.00 0.64
## PA3 0.10 0.64 1.00
##
## Mean item complexity = 1
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 66 and the objective function was 7.87 with Chi Square of 2504.87
## The degrees of freedom for the model are 33 and the objective function was 0.37
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.04
##
## The harmonic number of observations is 324 with the empirical chi square 26.39 with prob < 0.79
## The total number of observations was 324 with MLE Chi Square = 117.76 with prob < 1.9e-11
##
## Tucker Lewis Index of factoring reliability = 0.93
## RMSEA index = 0.091 and the 90 % confidence intervals are 0.072 0.107
## BIC = -73
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## PA1 PA2 PA3
## Correlation of scores with factors 0.97 0.96 0.92
## Multiple R square of scores with factors 0.94 0.92 0.84
## Minimum correlation of possible factor scores 0.88 0.85 0.68
## Factor Analysis using method = pa
## Call: fa(r = performance.data, nfactors = 2, rotate = "promax", residuals = TRUE,
## SMC = TRUE, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
## PA1 PA2 h2 u2 com
## ags1 -0.01 0.72 0.52 0.48 1
## ags2 0.01 0.62 0.38 0.62 1
## ags3 0.84 -0.05 0.70 0.30 1
## ags4 0.74 0.06 0.56 0.44 1
## ags5 -0.05 0.71 0.50 0.50 1
## ags6 0.03 0.73 0.54 0.46 1
## ags7 -0.03 0.85 0.71 0.29 1
## ags8 0.84 -0.03 0.70 0.30 1
## ags9 0.85 0.03 0.72 0.28 1
## ags10 0.90 0.02 0.82 0.18 1
## ags11 0.84 -0.05 0.71 0.29 1
## ags12 0.05 0.63 0.41 0.59 1
##
## PA1 PA2
## SS loadings 4.20 3.06
## Proportion Var 0.35 0.26
## Cumulative Var 0.35 0.61
## Proportion Explained 0.58 0.42
## Cumulative Proportion 0.58 1.00
##
## With factor correlations of
## PA1 PA2
## PA1 1.00 0.09
## PA2 0.09 1.00
##
## Mean item complexity = 1
## Test of the hypothesis that 2 factors are sufficient.
##
## The degrees of freedom for the null model are 66 and the objective function was 7.87 with Chi Square of 2504.87
## The degrees of freedom for the model are 43 and the objective function was 0.86
##
## The root mean square of the residuals (RMSR) is 0.05
## The df corrected root mean square of the residuals is 0.06
##
## The harmonic number of observations is 324 with the empirical chi square 117.56 with prob < 7.5e-09
## The total number of observations was 324 with MLE Chi Square = 272.45 with prob < 4.1e-35
##
## Tucker Lewis Index of factoring reliability = 0.855
## RMSEA index = 0.13 and the 90 % confidence intervals are 0.114 0.143
## BIC = 23.88
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy
## PA1 PA2
## Correlation of scores with factors 0.97 0.94
## Multiple R square of scores with factors 0.94 0.88
## Minimum correlation of possible factor scores 0.88 0.75
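Two of the fit indices reported by `fa` can be recomputed from the chi-square values in the three outputs, which makes it easier to compare the solutions side by side. The sketch below uses the textbook definitions BIC = chi-square - df * log(N) and TLI = (chi0/df0 - chi/df) / (chi0/df0 - 1). It is an assumption, not something the output confirms, that `fa` computes them exactly this way, although these formulas reproduce the printed values to rounding. The data frame name `fits` is illustrative.

```r
# Values copied from the three principal-axis fa() outputs.
n_obs      <- 324       # total number of observations reported by fa()
chisq_null <- 2504.87   # chi-square of the null model
df_null    <- 66

fits <- data.frame(
  factors = c(2, 3, 4),
  chisq   = c(272.45, 117.76, 68.07),  # "MLE Chi Square" of each model
  df      = c(43, 33, 24)
)

# BIC: chi-square penalized by model degrees of freedom (lower is better).
fits$BIC <- fits$chisq - fits$df * log(n_obs)

# Tucker-Lewis index: improvement over the null model per degree of freedom.
null_ratio <- chisq_null / df_null
fits$TLI <- (null_ratio - fits$chisq / fits$df) / (null_ratio - 1)

print(round(fits, 3))
```

On these numbers the three-factor solution has the lowest BIC and the four-factor solution the highest TLI, so the two indices do not point to the same model.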
##
## Loadings:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## ags1 -0.335 -0.381 -0.268 -0.551 0.135
## ags2 -0.523 0.434 0.200 0.695
## ags3 -0.403 -0.162 0.284 -0.564 0.618
## ags4 -0.279 0.516 0.135 -0.225 -0.188 0.640
## ags5 -0.284 -0.222 -0.123 -0.132 -0.412 -0.185 -0.117
## ags6 -0.338 -0.410 0.653 0.142
## ags7 -0.337 -0.400 0.169
## ags8 -0.482 0.205 -0.497 0.438 0.393
## ags9 -0.333 -0.111 0.228 -0.236 -0.701
## ags10 -0.431 0.236 0.439 -0.146 -0.123
## ags11 -0.466 -0.419 -0.486 0.109 -0.540 0.180
## ags12 -0.538 0.439 -0.108 -0.677
## Comp.10 Comp.11 Comp.12
## ags1 0.565 0.134
## ags2
## ags3 -0.123
## ags4 0.359
## ags5 -0.711 -0.215 0.259
## ags6 0.498
## ags7 -0.143 -0.813
## ags8 -0.275 0.208
## ags9 0.513
## ags10 0.197 -0.689
## ags11 0.103
## ags12 0.157
##
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.083 0.083 0.083 0.083 0.083 0.083 0.083 0.083
## Cumulative Var 0.083 0.167 0.250 0.333 0.417 0.500 0.583 0.667
## Comp.9 Comp.10 Comp.11 Comp.12
## SS loadings 1.000 1.000 1.000 1.000
## Proportion Var 0.083 0.083 0.083 0.083
## Cumulative Var 0.750 0.833 0.917 1.000
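The "SS loadings" rows above are all exactly 1 because `princomp` stores its loadings as unit-length eigenvectors, so that summary does not say how much variance each component explains. The variance explained comes from the squared component standard deviations instead. The sketch below shows this with the built-in `USArrests` data, used purely as a stand-in because the AGS responses are not available here. The object names are illustrative.

```r
# Principal components of a correlation matrix (stand-in data, not the AGS items).
pca <- princomp(USArrests, cor = TRUE)

# Loadings are unit-length eigenvectors, so every column sums to 1 when squared.
ss_loadings <- colSums(unclass(pca$loadings)^2)
print(round(ss_loadings, 3))

# Variance explained comes from the eigenvalues (squared standard deviations).
eigenvalues <- pca$sdev^2
prop_var    <- eigenvalues / sum(eigenvalues)

variance_table <- rbind("Eigenvalue"     = eigenvalues,
                        "Proportion Var" = prop_var,
                        "Cumulative Var" = cumsum(prop_var))
print(round(variance_table, 3))
```

Calling `summary()` on the `princomp` object reports the same proportions.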