Set up environment

library(psych)
library(lavaan)
## This is lavaan 0.6-15
## lavaan is FREE software! Please report any bugs.
## 
## Attaching package: 'lavaan'
## The following object is masked from 'package:psych':
## 
##     cor2cov

Import data and set up correlation matrix with variable names

CorrMat <- matrix(scan(file='child_obs.dat'),ncol=8)

colnames(CorrMat) <- c("cry","fear","soc","activ","verbal","ease","socinv","expl")
rownames(CorrMat) <- c("cry","fear","soc","activ","verbal","ease","socinv","expl")
CorrMat
##          cry  fear   soc activ verbal  ease socinv  expl
## cry     1.00  0.71  0.52  0.63  -0.05  0.00  -0.05 -0.04
## fear    0.71  1.00  0.50  0.55  -0.04 -0.01  -0.07 -0.07
## soc     0.52  0.50  1.00  0.48  -0.14 -0.14  -0.05 -0.18
## activ   0.63  0.55  0.48  1.00  -0.03  0.02   0.00  0.03
## verbal -0.05 -0.04 -0.14 -0.03   1.00  0.66   0.40  0.37
## ease    0.00 -0.01 -0.14  0.02   0.66  1.00   0.42  0.55
## socinv -0.05 -0.07 -0.05  0.00   0.40  0.42   1.00  0.53
## expl   -0.04 -0.07 -0.18  0.03   0.37  0.55   0.53  1.00

Let’s examine the correlation matrix. Based on the pattern of correlations, what do you think will happen when we extract factors? How many factors do you think we’ll be able to extract based only on the patterns in the correlation matrix?

Extracting Factors

Let’s use base R and the psych package to decide how many factors to extract from our correlation matrix. We will use several methods: eigenvalues and eigenvectors, a scree plot, parallel analysis, and the minimum average partial (MAP) test.

#eigenvalues
eigen(CorrMat)
## eigen() decomposition
## $values
## [1] 2.8198845 2.3766540 0.7507196 0.6066356 0.4590880 0.4277279 0.2853157
## [8] 0.2739745
## 
## $vectors
##            [,1]      [,2]        [,3]        [,4]         [,5]       [,6]
## [1,] -0.4521422 0.2903285 -0.06214806  0.19392934  0.259405970 -0.0279186
## [2,] -0.4410647 0.2672811 -0.14042664  0.14646920  0.568429409  0.1444129
## [3,] -0.4280631 0.1513488  0.15178101 -0.66625332 -0.320472813  0.4677588
## [4,] -0.3998511 0.2993952  0.08605121  0.23719307 -0.581094047 -0.5452557
## [5,]  0.2601046 0.4155417 -0.55611565 -0.30901941  0.001160685 -0.2408235
## [6,]  0.2584873 0.4748983 -0.35306557  0.05924143 -0.164277521  0.2776815
## [7,]  0.2342350 0.3973416  0.58271245 -0.38073760  0.344654050 -0.3774149
## [8,]  0.2618668 0.4190972  0.41533358  0.44364455 -0.153129257  0.4297888
##             [,7]        [,8]
## [1,]  0.35053664  0.69199634
## [2,] -0.29371965 -0.51245636
## [3,] -0.06717184  0.03018887
## [4,] -0.05892183 -0.21993265
## [5,] -0.47685822  0.26367022
## [6,]  0.61268959 -0.31624102
## [7,]  0.18213273 -0.09131238
## [8,] -0.38329410  0.17714258
#scree plot
scree(CorrMat, hline=TRUE, factors=FALSE)

#parallel analysis
fa.parallel(CorrMat,n.obs=453)

## Parallel analysis suggests that the number of factors =  2  and the number of components =  2
# minimum average partial
nfactors(CorrMat,n.obs=453)$map

## [1] 0.14260361 0.06232114 0.10484149 0.17446582 0.27767585 0.46714201 1.00000000
## [8]         NA
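
To read the MAP output, look for the position of the smallest value: that position is the suggested number of factors. A minimal base-R sketch, with the values copied from the output above (the names `map_values` and `n_factors_map` are illustrative, not part of psych):

```r
# MAP values copied from the nfactors() output above (the 8th entry is NA)
map_values <- c(0.14260361, 0.06232114, 0.10484149, 0.17446582,
                0.27767585, 0.46714201, 1.00000000, NA)

# which.min() skips NA and returns the index of the smallest value
n_factors_map <- which.min(map_values)
n_factors_map
```

Here the smallest value sits in the second position, so MAP points to 2 factors.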

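Eigenvectors. Each column of $vectors is an eigenvector with unit length, so its squared elements sum to 1, not to the eigenvalue. If you multiply a column by the square root of its eigenvalue to turn it into loadings, the squared loadings in that column sum to the corresponding eigenvalue in $values. Only 2 eigenvalues are greater than 1 (2.82 and 2.38); therefore, the correlation matrix supports extracting 2 factors.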
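
The relationship between eigenvectors, loadings, and eigenvalues can be checked directly. The sketch below uses base R only and assumes the correlation matrix is typed in from the printed output (rounded to 2 decimals) rather than read from `child_obs.dat`; the names `vars`, `Rmat`, and `pc_loadings` are illustrative.

```r
# Correlation matrix typed in from the printed CorrMat output (2 decimals)
vars <- c("cry", "fear", "soc", "activ", "verbal", "ease", "socinv", "expl")
Rmat <- matrix(c(
   1.00,  0.71,  0.52,  0.63, -0.05,  0.00, -0.05, -0.04,
   0.71,  1.00,  0.50,  0.55, -0.04, -0.01, -0.07, -0.07,
   0.52,  0.50,  1.00,  0.48, -0.14, -0.14, -0.05, -0.18,
   0.63,  0.55,  0.48,  1.00, -0.03,  0.02,  0.00,  0.03,
  -0.05, -0.04, -0.14, -0.03,  1.00,  0.66,  0.40,  0.37,
   0.00, -0.01, -0.14,  0.02,  0.66,  1.00,  0.42,  0.55,
  -0.05, -0.07, -0.05,  0.00,  0.40,  0.42,  1.00,  0.53,
  -0.04, -0.07, -0.18,  0.03,  0.37,  0.55,  0.53,  1.00
), ncol = 8, byrow = TRUE, dimnames = list(vars, vars))

ev <- eigen(Rmat)

# Each eigenvector has unit length: its squared elements sum to 1
colSums(ev$vectors^2)

# Rescale each eigenvector by sqrt(eigenvalue) to get component loadings
pc_loadings <- ev$vectors %*% diag(sqrt(ev$values))

# Column sums of squared loadings reproduce the eigenvalues
colSums(pc_loadings^2)

# The eigenvalues sum to the number of variables (the trace of Rmat)
sum(ev$values)

# Kaiser criterion: count the eigenvalues greater than 1
n_kaiser <- sum(ev$values > 1)
n_kaiser
```

These are principal-component loadings, not the principal-axis loadings that `fa()` reports later, so the numbers will not match the EFA output exactly.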

Parallel analysis and the scree plot suggest the same thing, and the MAP values reach their minimum (0.062) at 2 factors.
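
To see the idea behind parallel analysis, the sketch below compares the observed eigenvalues with eigenvalues from random, uncorrelated data of the same size. It is a simplified component-based version in base R, not a reimplementation of `fa.parallel()`; the seed, the number of simulations, and the 95th-percentile cutoff are assumptions chosen for illustration.

```r
set.seed(123)   # arbitrary seed, for reproducibility only

n_obs  <- 453   # sample size used in the calls above
n_vars <- 8     # number of observed variables
n_sims <- 200   # number of random data sets (an arbitrary choice)

# Observed eigenvalues, copied from the eigen() output above
observed <- c(2.8198845, 2.3766540, 0.7507196, 0.6066356,
              0.4590880, 0.4277279, 0.2853157, 0.2739745)

# Eigenvalues of correlation matrices computed from uncorrelated normal data;
# the result is an n_vars x n_sims matrix (one column per simulated data set)
random_eigs <- replicate(n_sims, {
  x <- matrix(rnorm(n_obs * n_vars), nrow = n_obs)
  eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values
})

# 95th percentile of the random eigenvalues at each position
threshold <- apply(random_eigs, 1, quantile, probs = 0.95)

# Keep components until the first observed eigenvalue fails to beat its threshold
n_keep <- which(observed <= threshold)[1] - 1
n_keep
```

Only the first two observed eigenvalues exceed what random data would produce, which agrees with the 2 components reported by `fa.parallel()`.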

FACTOR ANALYSIS USING PSYCH PACKAGE

EFA with no rotation

fa(CorrMat,n.obs=453, nfactors=2, rotate="none", fm="pa")
## Factor Analysis using method =  pa
## Call: fa(r = CorrMat, nfactors = 2, n.obs = 453, rotate = "none", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          PA1  PA2   h2   u2 com
## cry     0.79 0.37 0.76 0.24 1.4
## fear    0.73 0.32 0.63 0.37 1.4
## soc     0.63 0.13 0.42 0.58 1.1
## activ   0.64 0.35 0.53 0.47 1.5
## verbal -0.34 0.60 0.47 0.53 1.6
## ease   -0.36 0.75 0.70 0.30 1.4
## socinv -0.29 0.52 0.36 0.64 1.6
## expl   -0.34 0.59 0.46 0.54 1.6
## 
##                        PA1  PA2
## SS loadings           2.40 1.92
## Proportion Var        0.30 0.24
## Cumulative Var        0.30 0.54
## Proportion Explained  0.55 0.45
## Cumulative Proportion 0.55 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 2 factors are sufficient.
## 
## df null model =  28  with the objective function =  3.06 with Chi Square =  1372.78
## df of  the model are 13  and the objective function was  0.21 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic n.obs is  453 with the empirical chi square  46.72  with prob <  1.1e-05 
## The total n.obs was  453  with Likelihood Chi Square =  93.39  with prob <  3.1e-14 
## 
## Tucker Lewis Index of factoring reliability =  0.871
## RMSEA index =  0.117  and the 90 % confidence intervals are  0.095 0.14
## BIC =  13.88
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    PA1  PA2
## Correlation of (regression) scores with factors   0.93 0.91
## Multiple R square of scores with factors          0.87 0.82
## Minimum correlation of possible factor scores     0.73 0.65

What do you notice?

EFA using varimax rotation

# h2 is the communality: the share of each item's variance explained by the common factors
# rotation should bring the loadings closer to simple structure

fa(CorrMat,n.obs=453, nfactors=2, rotate="varimax", fm="pa")
## Factor Analysis using method =  pa
## Call: fa(r = CorrMat, nfactors = 2, n.obs = 453, rotate = "varimax", 
##     fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          PA1   PA2   h2   u2 com
## cry     0.87  0.00 0.76 0.24 1.0
## fear    0.79 -0.02 0.63 0.37 1.0
## soc     0.63 -0.15 0.42 0.58 1.1
## activ   0.72  0.04 0.53 0.47 1.0
## verbal -0.05  0.68 0.47 0.53 1.0
## ease   -0.01  0.84 0.70 0.30 1.0
## socinv -0.04  0.60 0.36 0.64 1.0
## expl   -0.05  0.68 0.46 0.54 1.0
## 
##                        PA1  PA2
## SS loadings           2.31 2.01
## Proportion Var        0.29 0.25
## Cumulative Var        0.29 0.54
## Proportion Explained  0.53 0.47
## Cumulative Proportion 0.53 1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## df null model =  28  with the objective function =  3.06 with Chi Square =  1372.78
## df of  the model are 13  and the objective function was  0.21 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic n.obs is  453 with the empirical chi square  46.72  with prob <  1.1e-05 
## The total n.obs was  453  with Likelihood Chi Square =  93.39  with prob <  3.1e-14 
## 
## Tucker Lewis Index of factoring reliability =  0.871
## RMSEA index =  0.117  and the 90 % confidence intervals are  0.095 0.14
## BIC =  13.88
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    PA1  PA2
## Correlation of (regression) scores with factors   0.93 0.91
## Multiple R square of scores with factors          0.87 0.82
## Minimum correlation of possible factor scores     0.73 0.64

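What do you notice? After varimax rotation, each item loads strongly on one factor, and its loading on the other factor is close to zero or slightly negative.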
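
An orthogonal rotation such as varimax only redistributes variance between the factors; it does not change the communalities. The sketch below illustrates this with `stats::varimax()` applied to the unrotated loadings copied from the output above (rounded to 2 decimals, so the rotated values may differ slightly from the `fa()` output, and the sign or order of columns is not guaranteed to match). The names `unrotated`, `rot`, and `rotated` are illustrative.

```r
vars <- c("cry", "fear", "soc", "activ", "verbal", "ease", "socinv", "expl")

# Unrotated loadings copied from the rotate = "none" output (2 decimals)
unrotated <- matrix(c(
   0.79, 0.37,
   0.73, 0.32,
   0.63, 0.13,
   0.64, 0.35,
  -0.34, 0.60,
  -0.36, 0.75,
  -0.29, 0.52,
  -0.34, 0.59
), ncol = 2, byrow = TRUE, dimnames = list(vars, c("PA1", "PA2")))

# stats::varimax() applies Kaiser normalization by default
rot     <- varimax(unrotated)
rotated <- unclass(rot$loadings)   # drop the "loadings" class, keep the matrix

round(rotated, 2)

# The rotation matrix is orthogonal: its cross-product is the identity
round(crossprod(rot$rotmat), 10)

# Communalities (row sums of squared loadings) are the same before and after
cbind(before = rowSums(unrotated^2), after = rowSums(rotated^2))
```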

EFA using oblimin rotation

fa(CorrMat,n.obs=453, nfactors=2, rotate="oblimin", fm="pa")
## Loading required namespace: GPArotation
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : I
## am sorry, to do these rotations requires the GPArotation package to be
## installed
## Factor Analysis using method =  pa
## Call: fa(r = CorrMat, nfactors = 2, n.obs = 453, rotate = "oblimin", 
##     fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          PA1  PA2   h2   u2 com
## cry     0.79 0.37 0.76 0.24 1.4
## fear    0.73 0.32 0.63 0.37 1.4
## soc     0.63 0.13 0.42 0.58 1.1
## activ   0.64 0.35 0.53 0.47 1.5
## verbal -0.34 0.60 0.47 0.53 1.6
## ease   -0.36 0.75 0.70 0.30 1.4
## socinv -0.29 0.52 0.36 0.64 1.6
## expl   -0.34 0.59 0.46 0.54 1.6
## 
##                        PA1  PA2
## SS loadings           2.40 1.92
## Proportion Var        0.30 0.24
## Cumulative Var        0.30 0.54
## Proportion Explained  0.55 0.45
## Cumulative Proportion 0.55 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 2 factors are sufficient.
## 
## df null model =  28  with the objective function =  3.06 with Chi Square =  1372.78
## df of  the model are 13  and the objective function was  0.21 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic n.obs is  453 with the empirical chi square  46.72  with prob <  1.1e-05 
## The total n.obs was  453  with Likelihood Chi Square =  93.39  with prob <  3.1e-14 
## 
## Tucker Lewis Index of factoring reliability =  0.871
## RMSEA index =  0.117  and the 90 % confidence intervals are  0.095 0.14
## BIC =  13.88
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    PA1  PA2
## Correlation of (regression) scores with factors   0.93 0.91
## Multiple R square of scores with factors          0.87 0.82
## Minimum correlation of possible factor scores     0.73 0.65

What do you notice?

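The output is identical to the initial unrotated solution. As the warning above says, the oblimin rotation requires the GPArotation package, which was not installed, so no rotation was actually applied. Install GPArotation and rerun the model to get the oblique solution. In general, start with an orthogonal rotation first.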
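
One way to avoid silently getting an unrotated solution is to check for the package before fitting. A minimal base-R sketch (the variable name `has_gpa` and the messages are illustrative):

```r
# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
has_gpa <- requireNamespace("GPArotation", quietly = TRUE)

if (has_gpa) {
  message("GPArotation is available; rotate = \"oblimin\" can be applied.")
} else {
  message("GPArotation is missing; run install.packages(\"GPArotation\") ",
          "and rerun the oblimin model.")
}
```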

EFA using lavaan

efa_model2 <- "
  efa('block1')*F1 =~ cry+fear+soc+activ+verbal+ease+socinv+expl
  efa('block1')*F2 =~ cry+fear+soc+activ+verbal+ease+socinv+expl
  "
efa_sat2 <- lavaan(
    model              = efa_model2,
    sample.cov         = CorrMat,
    sample.nobs        = 453,
    sample.cov.rescale = FALSE,
    auto.var           = TRUE,
    auto.efa           = TRUE)

summary(efa_sat2, fit.measures = TRUE)
## lavaan 0.6.15 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        23
## 
##   Rotation method                       GEOMIN OBLIQUE
##   Geomin epsilon                                 0.001
##   Rotation algorithm (rstarts)                GPA (30)
##   Standardized metric                             TRUE
##   Row weights                                     None
## 
##   Number of observations                           453
## 
## Model Test User Model:
##                                                       
##   Test statistic                                88.514
##   Degrees of freedom                                13
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              1386.552
##   Degrees of freedom                                28
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.944
##   Tucker-Lewis Index (TLI)                       0.880
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4493.214
##   Loglikelihood unrestricted model (H1)      -4448.957
##                                                       
##   Akaike (AIC)                                9032.429
##   Bayesian (BIC)                              9127.094
##   Sample-size adjusted Bayesian (SABIC)       9054.100
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.113
##   90 Percent confidence interval - lower         0.092
##   90 Percent confidence interval - upper         0.136
##   P-value H_0: RMSEA <= 0.050                    0.000
##   P-value H_0: RMSEA >= 0.080                    0.994
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.041
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   F1 =~ block1                                        
##     cry               0.877    0.040   21.744    0.000
##     fear              0.801    0.042   19.199    0.000
##     soc               0.608    0.045   13.633    0.000
##     activ             0.718    0.043   16.574    0.000
##     verbal           -0.015    0.024   -0.614    0.539
##     ease              0.043    0.030    1.468    0.142
##     socinv           -0.021    0.039   -0.546    0.585
##     expl             -0.020    0.034   -0.601    0.548
##   F2 =~ block1                                        
##     cry               0.007    0.017    0.382    0.703
##     fear             -0.011    0.023   -0.463    0.643
##     soc              -0.153    0.042   -3.645    0.000
##     activ             0.038    0.036    1.049    0.294
##     verbal            0.727    0.045   16.181    0.000
##     ease              0.872    0.043   20.073    0.000
##     socinv            0.543    0.047   11.461    0.000
##     expl              0.636    0.046   13.833    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .cry               0.231    0.033    7.095    0.000
##    .fear              0.358    0.034   10.420    0.000
##    .soc               0.597    0.044   13.512    0.000
##    .activ             0.487    0.039   12.505    0.000
##    .verbal            0.470    0.042   11.096    0.000
##    .ease              0.243    0.042    5.721    0.000
##    .socinv            0.703    0.051   13.783    0.000
##    .expl              0.593    0.046   12.869    0.000
##     F1                1.000                           
##     F2                1.000
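
The lavaan residual variances can be compared with the psych communalities. The sketch below assumes that, because the model was fitted to a correlation matrix, each item's model-implied variance is 1, so the communality is 1 minus the residual variance. The values are copied from the two outputs above; the names `resid_var`, `h2_lavaan`, `h2_psych`, and `comparison` are illustrative.

```r
vars <- c("cry", "fear", "soc", "activ", "verbal", "ease", "socinv", "expl")

# Residual variances copied from the lavaan "Variances" output
resid_var <- c(0.231, 0.358, 0.597, 0.487, 0.470, 0.243, 0.703, 0.593)

# Communality = 1 - residual variance (assumes unit item variances)
h2_lavaan <- 1 - resid_var

# h2 column copied from the psych principal-axis output
h2_psych <- c(0.76, 0.63, 0.42, 0.53, 0.47, 0.70, 0.36, 0.46)

comparison <- data.frame(
  item      = vars,
  h2_lavaan = h2_lavaan,
  h2_psych  = h2_psych,
  diff      = round(h2_lavaan - h2_psych, 3)
)
comparison
```

Small differences are expected because lavaan uses maximum likelihood here while the psych runs used principal axis factoring (`fm="pa"`).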