Investigating the constructs that could answer the question “What influences love at first sight?” Read about the experiment: https://www.kaggle.com/annavictoria/speed-dating-experiment
dating <-read.csv("Speed Dating Data.csv")
# Choosing the variables we think belong to latent factors
dating1<- dating[c("imprace","imprelig", "date", "go_out", "sports",
"tvsports", "exercise", "dining" , "museums", "art",
"hiking", "gaming", "clubbing",
"reading", "tv", "theater", "movies", "concerts",
"music", "shopping", "yoga", "exphappy" , "attr1_1",
"sinc1_1", "intel1_1", "fun1_1", "amb1_1",
"shar1_1", "attr2_1", "sinc2_1", "intel2_1",
"fun2_1", "amb2_1", "shar2_1", "attr3_1", "sinc3_1",
"intel3_1", "fun3_1", "amb3_1")]
dating1 <- as.data.frame(dating1)
dating12 <- na.omit(dating1) # complete cases only, as missing values can be a problem for EFA; note that the calls below still use dating1
dim(dating1)
## [1] 8378 39
#summary(dating1)
So, we have more than 8,000 cases and 39 variables in which to look for the latent factors (constructs) that can help explain the correlations between those variables. First of all, let’s check the variables’ types.
sapply(dating1, typeof)
## imprace imprelig date go_out sports tvsports exercise
## "integer" "integer" "integer" "integer" "integer" "integer" "integer"
## dining museums art hiking gaming clubbing reading
## "integer" "integer" "integer" "integer" "integer" "integer" "integer"
## tv theater movies concerts music shopping yoga
## "integer" "integer" "integer" "integer" "integer" "integer" "integer"
## exphappy attr1_1 sinc1_1 intel1_1 fun1_1 amb1_1 shar1_1
## "integer" "double" "double" "double" "double" "double" "double"
## attr2_1 sinc2_1 intel2_1 fun2_1 amb2_1 shar2_1 attr3_1
## "double" "double" "double" "double" "double" "double" "integer"
## sinc3_1 intel3_1 fun3_1 amb3_1
## "integer" "integer" "integer" "integer"
It looks like all of the variables contain numeric values, so I can proceed to EFA.
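Before moving on, it can be worth confirming the numeric check and counting missing values programmatically. The snippet below is a minimal sketch on a made-up data frame (`toy` and its values are assumptions for illustration, not the speed dating data); the same calls could be pointed at dating1.

```r
# Hypothetical toy data with a few missing values (not the real dataset)
toy <- data.frame(
  sports  = c(7L, 3L, NA, 9L, 5L, 2L),
  art     = c(5L, NA, 8L, 6L, 4L, 7L),
  attr1_1 = c(20, 35.5, 15, NA, 25, 30)
)

# "integer" and "double" columns are both numeric, which is what EFA needs
all_numeric <- all(sapply(toy, is.numeric))

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy))

# Rows that would survive na.omit()
n_complete <- sum(complete.cases(toy))

all_numeric
missing_per_column
n_complete
```

Comparing `n_complete` with `nrow(toy)` shows how many cases listwise deletion would drop.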
First of all, let’s see how many factors the eigenvalues suggest extracting, using a scree plot with parallel analysis:
library(polycor)
library(psych)
dat.cor <- hetcor(dating1)
dat.cor<- dat.cor$correlations
fa.parallel(dating1,
fa="both",
n.iter=100)
## Parallel analysis suggests that the number of factors = 14 and the number of components = 13
The two blue lines on the plot show the observed eigenvalues (one line for factors, one for principal components), while the red dotted lines show eigenvalues obtained from random or simulated data. The logic is that the number of points on a blue line lying above the corresponding red line is the number of factors or components suggested for extraction. In this case it is not exactly clear which red line should be considered for factors. However, the software reports that there should be 14 factors, so the line of interest is probably the lowest one.
Just to be sure, I will run another type of scree plot:
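The comparison behind parallel analysis can be sketched in a few lines of base R. This is a simplified, components-only version on simulated data (the `toy` data set, the two-factor structure, and the use of the mean random eigenvalue are all assumptions for illustration); it is not a reimplementation of fa.parallel.

```r
set.seed(1)
n <- 500

# Simulated data with two underlying factors (hypothetical, for illustration)
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- data.frame(
  a1 = 0.8 * f1 + 0.6 * rnorm(n),
  a2 = 0.8 * f1 + 0.6 * rnorm(n),
  a3 = 0.8 * f1 + 0.6 * rnorm(n),
  b1 = 0.8 * f2 + 0.6 * rnorm(n),
  b2 = 0.8 * f2 + 0.6 * rnorm(n),
  b3 = 0.8 * f2 + 0.6 * rnorm(n)
)

# Observed eigenvalues of the correlation matrix (the "blue line")
obs_eig <- eigen(cor(toy), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from pure-noise data of the same size (the "red line")
sim_eig <- replicate(100, {
  noise <- matrix(rnorm(n * ncol(toy)), nrow = n)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})
rand_eig <- rowMeans(sim_eig)

# Count leading eigenvalues that stay above their random counterpart
above <- obs_eig > rand_eig
n_suggested <- if (all(above)) length(above) else which(!above)[1] - 1

round(rbind(observed = obs_eig, random = rand_eig), 2)
n_suggested
```

Only eigenvalues that beat what pure noise would produce are counted, which is why the suggested number depends on which reference line is used.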
scree(dating1)
This scree plot suggests that 5-6 factors and 13 principal components should be considered. While the latter matches what the software told us for the previous plot, the number of factors to extract is different. Let’s try both options and see which number of factors explains more variance.
First, with the maximum number of factors, 14, and no rotation:
fa_1 <- fa(dating1, nfactors=14, rotate="none", fm="ml")
print(fa_1$loadings, cutoff = 0, digits = 2)
##
## Loadings:
## ML11 ML1 ML3 ML2 ML14 ML12 ML7 ML4 ML10 ML6 ML13
## imprace 0.03 -0.03 0.05 0.01 -0.03 0.22 0.10 0.08 -0.01 0.02 0.27
## imprelig -0.10 -0.02 0.17 0.12 0.08 0.13 0.02 0.14 -0.03 0.02 0.19
## date -0.09 0.05 0.20 0.06 -0.24 -0.04 -0.07 -0.05 0.06 0.10 -0.04
## go_out 0.03 0.03 0.03 0.11 -0.26 -0.07 -0.13 -0.06 0.07 0.09 0.03
## sports -0.04 0.01 -0.25 -0.05 0.32 0.05 0.06 -0.08 0.12 -0.09 -0.07
## tvsports 0.01 0.04 -0.21 0.04 0.12 0.29 0.03 -0.08 0.12 0.03 0.10
## exercise 0.01 -0.10 -0.02 -0.04 0.26 0.07 0.13 0.01 0.06 -0.01 0.08
## dining 0.40 -0.02 0.11 0.01 0.19 0.15 0.06 0.10 -0.05 0.04 0.10
## museums 0.86 0.02 0.20 0.06 -0.01 -0.19 -0.04 0.09 -0.20 0.09 0.13
## art 0.84 0.00 0.16 0.05 -0.02 -0.17 -0.01 0.07 -0.15 0.08 0.06
## hiking 0.21 0.07 0.04 -0.03 0.10 -0.13 -0.02 -0.05 0.08 0.02 -0.17
## gaming 0.03 0.12 -0.18 0.00 0.00 0.27 0.06 -0.07 0.06 -0.04 0.02
## clubbing 0.18 -0.01 -0.01 -0.08 0.09 0.11 0.09 -0.02 0.04 0.08 -0.05
## reading 0.24 0.02 0.14 0.04 0.06 -0.10 -0.07 0.10 -0.14 0.03 0.06
## tv 0.08 0.05 0.09 0.06 -0.23 0.64 0.00 -0.01 0.09 0.08 0.49
## theater 0.54 0.03 0.27 0.07 -0.14 0.17 -0.05 0.11 -0.07 0.14 0.05
## movies 0.38 0.08 0.15 0.03 -0.18 0.37 -0.08 0.07 -0.08 0.12 0.00
## concerts 0.56 0.03 0.09 0.09 -0.09 0.39 -0.08 0.07 0.02 0.00 -0.53
## music 0.44 0.03 0.08 0.01 0.04 0.39 -0.02 0.06 0.06 -0.02 -0.43
## shopping 0.28 -0.09 0.14 -0.03 -0.03 0.42 0.13 0.07 0.06 0.15 0.24
## yoga 0.30 0.02 0.12 0.04 0.11 0.05 0.02 0.07 0.02 0.04 -0.09
## exphappy 0.13 0.13 -0.18 -0.10 0.14 0.11 0.00 -0.07 0.00 -0.05 -0.07
## attr1_1 0.02 -0.67 -0.58 -0.11 -0.01 -0.01 0.19 -0.22 0.11 -0.28 0.00
## sinc1_1 0.02 0.29 0.15 0.15 0.00 -0.01 -0.56 -0.14 0.39 0.46 0.00
## intel1_1 -0.09 0.03 0.13 0.02 0.00 0.00 -0.29 0.36 -0.81 0.15 -0.01
## fun1_1 0.00 0.21 0.14 -0.75 0.00 0.00 0.02 -0.01 0.06 -0.07 0.00
## amb1_1 -0.01 0.31 0.38 0.14 -0.01 -0.01 0.62 0.27 0.14 0.45 0.00
## shar1_1 0.01 0.46 0.36 0.62 0.00 0.00 -0.01 -0.04 0.08 -0.48 0.00
## attr2_1 0.00 -0.91 0.23 0.15 0.00 0.00 -0.13 0.07 0.04 0.10 0.00
## sinc2_1 0.00 0.66 -0.37 0.09 0.00 0.00 -0.02 -0.57 -0.03 0.21 0.00
## intel2_1 0.00 0.56 -0.52 -0.16 0.00 0.00 -0.14 0.56 0.06 -0.10 0.00
## fun2_1 0.00 0.18 0.50 -0.63 0.00 0.00 -0.02 -0.18 0.01 -0.19 0.00
## amb2_1 0.04 0.37 -0.32 0.03 -0.01 0.01 0.38 0.11 -0.04 -0.05 0.00
## shar2_1 -0.03 0.47 0.16 0.29 -0.01 0.00 0.11 -0.07 -0.06 -0.17 0.00
## attr3_1 0.13 -0.06 -0.13 -0.06 0.54 0.12 0.25 -0.01 -0.05 0.01 0.09
## sinc3_1 0.14 0.08 0.10 0.10 0.32 0.15 -0.18 0.01 0.15 0.15 0.06
## intel3_1 0.05 0.06 -0.09 0.06 0.51 0.09 0.08 0.08 -0.14 -0.08 0.12
## fun3_1 0.22 -0.04 -0.09 -0.12 0.51 0.24 0.19 0.06 0.15 0.06 0.05
## amb3_1 0.11 -0.01 -0.07 0.04 0.38 0.21 0.39 0.12 0.03 0.12 0.03
## ML9 ML5 ML8
## imprace 0.03 0.02 0.06
## imprelig 0.03 -0.01 0.02
## date 0.01 -0.04 -0.07
## go_out -0.06 0.06 -0.06
## sports 0.02 0.00 -0.03
## tvsports -0.03 -0.12 0.02
## exercise 0.05 0.04 0.04
## dining 0.11 0.07 0.11
## museums 0.09 0.02 0.01
## art 0.12 -0.01 0.07
## hiking 0.05 -0.01 -0.01
## gaming -0.03 0.04 0.07
## clubbing 0.00 0.00 0.10
## reading 0.05 0.06 -0.05
## tv 0.03 -0.04 0.04
## theater 0.17 0.06 0.07
## movies 0.07 0.01 0.00
## concerts 0.08 -0.01 0.03
## music 0.08 -0.02 -0.03
## shopping 0.09 0.00 0.13
## yoga 0.11 -0.05 0.04
## exphappy 0.02 -0.03 0.04
## attr1_1 0.05 0.14 0.02
## sinc1_1 -0.02 0.17 -0.36
## intel1_1 -0.06 0.13 0.08
## fun1_1 0.02 -0.60 0.02
## amb1_1 0.05 0.12 0.19
## shar1_1 -0.08 -0.14 0.11
## attr2_1 -0.03 -0.15 0.20
## sinc2_1 -0.02 -0.06 0.22
## intel2_1 0.02 0.13 0.16
## fun2_1 -0.04 0.50 0.06
## amb2_1 -0.49 -0.12 -0.57
## shar2_1 0.69 -0.07 -0.35
## attr3_1 0.17 0.06 0.03
## sinc3_1 0.06 0.06 -0.13
## intel3_1 -0.02 0.04 -0.05
## fun3_1 0.06 -0.23 0.07
## amb3_1 0.04 0.04 0.10
##
## ML11 ML1 ML3 ML2 ML14 ML12 ML7 ML4 ML10 ML6 ML13 ML9
## SS loadings 3.03 2.92 2.05 1.65 1.59 1.57 1.41 1.10 1.10 1.05 1.00 0.90
## Proportion Var 0.08 0.07 0.05 0.04 0.04 0.04 0.04 0.03 0.03 0.03 0.03 0.02
## Cumulative Var 0.08 0.15 0.21 0.25 0.29 0.33 0.36 0.39 0.42 0.45 0.47 0.50
## ML5 ML8
## SS loadings 0.87 0.85
## Proportion Var 0.02 0.02
## Cumulative Var 0.52 0.54
fa.diagram(fa_1)
The resulting diagram suggests that 14 factors is probably too many, as the results become conceptually uninterpretable. Additionally, a number of variables fail to reach a loading of |0.4| (considered the minimum desired value for attributing a variable to one of the discovered factors) on any of the factors.
Let’s briefly take a look at the goodness-of-fit measures of this factor model:
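The |0.4| rule can be applied mechanically to a loadings matrix. The sketch below uses a small subset of the values printed above (five variables and four of the fourteen factors, copied by hand into a hypothetical `loadings_subset` matrix), so the counts refer to this subset only.

```r
# A few rows and columns copied from the 14-factor output above
loadings_subset <- matrix(
  c( 0.86,  0.02,  0.09,  0.01,   # museums
     0.00, -0.91, -0.03,  0.20,   # attr2_1
     0.04,  0.37, -0.49, -0.57,   # amb2_1
    -0.03,  0.47,  0.69, -0.35,   # shar2_1
     0.03, -0.03,  0.03,  0.06),  # imprace
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("museums", "attr2_1", "amb2_1", "shar2_1", "imprace"),
    c("ML11", "ML1", "ML9", "ML8")
  )
)

cutoff <- 0.4

# Number of factors on which each variable has a salient loading
n_salient <- rowSums(abs(loadings_subset) >= cutoff)

no_factor    <- names(n_salient)[n_salient == 0]  # not attributable to any factor
cross_loaded <- names(n_salient)[n_salient > 1]   # attributable to several factors

n_salient
no_factor
cross_loaded
```

Variables in `no_factor` or `cross_loaded` are the ones that make a solution hard to interpret.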
fa_1$TLI
## [1] 0.7182358
fa_1$RMSEA[1]
## RMSEA
## 0.0821538
The first one is the Tucker-Lewis Index (TLI), which is not acceptable in this case, as it falls below 0.9. The second one is the Root Mean Square Error of Approximation (RMSEA), which is considered adequate when it is no higher than 0.08; our value is pretty close to that, but not quite there yet. Thus, this factor model is considered a poor fit.
Then, let’s take 5 factors, as suggested by the second scree plot, and redo the factor analysis. Again, with no rotation.
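For reference, both indices can be computed from chi-square statistics using the common textbook formulas. The function below is a sketch with hypothetical input numbers (`fit_indices` and its arguments are names made up here); psych may compute these values with slightly different conventions, so the results need not match fa() output exactly.

```r
# Textbook TLI and RMSEA from model and null-model chi-square statistics
fit_indices <- function(chisq_model, df_model, chisq_null, df_null, n_obs) {
  ratio_null  <- chisq_null / df_null
  ratio_model <- chisq_model / df_model
  # TLI: relative improvement of the chi-square/df ratio over the null model
  tli <- (ratio_null - ratio_model) / (ratio_null - 1)
  # RMSEA: misfit per degree of freedom, scaled by sample size
  rmsea <- sqrt(max(chisq_model - df_model, 0) / (df_model * (n_obs - 1)))
  c(TLI = tli, RMSEA = rmsea)
}

# Hypothetical values, not taken from the speed dating models
example_fit <- fit_indices(chisq_model = 100, df_model = 25,
                           chisq_null = 2000, df_null = 45, n_obs = 500)
round(example_fit, 3)
```

A higher TLI and a lower RMSEA both point to better fit, which is why the two are read together.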
fa_2 <- fa(dating1, nfactors=5, rotate="none", fm="ml")
print(fa_2$loadings, cutoff = 0, digits = 2)
##
## Loadings:
## ML3 ML4 ML2 ML1 ML5
## imprace 0.05 0.08 -0.04 -0.03 0.24
## imprelig -0.01 -0.05 -0.04 -0.07 0.28
## date -0.04 -0.36 -0.05 0.01 0.08
## go_out 0.04 -0.31 0.02 -0.01 -0.03
## sports -0.15 0.41 0.14 0.03 -0.11
## tvsports -0.08 0.29 0.19 -0.07 0.13
## exercise -0.02 0.32 -0.07 -0.04 0.06
## dining 0.43 0.25 -0.10 0.01 0.21
## museums 0.91 -0.02 -0.08 -0.01 -0.13
## art 0.91 0.02 -0.07 -0.03 -0.16
## hiking 0.19 0.04 0.00 0.07 -0.11
## gaming -0.04 0.21 0.14 0.09 0.11
## clubbing 0.15 0.24 -0.04 0.01 0.09
## reading 0.30 -0.12 -0.05 0.02 -0.02
## tv 0.09 0.04 0.00 -0.01 0.50
## theater 0.63 -0.10 -0.13 0.00 0.28
## movies 0.42 -0.07 -0.03 0.04 0.33
## concerts 0.48 0.04 -0.01 -0.03 0.17
## music 0.36 0.15 -0.01 0.01 0.20
## shopping 0.30 0.24 -0.15 -0.07 0.43
## yoga 0.34 0.08 -0.03 -0.03 0.12
## exphappy 0.06 0.26 0.13 0.12 -0.04
## attr1_1 -0.25 0.47 -0.14 -0.36 -0.39
## sinc1_1 0.06 -0.37 0.10 0.15 0.18
## intel1_1 0.13 -0.26 -0.07 -0.01 -0.01
## fun1_1 -0.04 0.13 -0.09 0.30 -0.02
## amb1_1 0.18 -0.05 0.06 0.16 0.41
## shar1_1 0.11 -0.32 0.26 0.10 0.22
## attr2_1 0.00 0.00 -0.60 -0.79 0.00
## sinc2_1 -0.03 -0.01 0.57 0.36 -0.01
## intel2_1 -0.01 0.10 0.52 0.33 -0.06
## fun2_1 0.00 0.00 -0.62 0.78 0.00
## amb2_1 -0.07 0.10 0.59 0.22 -0.11
## shar2_1 0.15 -0.19 0.41 0.27 0.19
## attr3_1 0.11 0.53 0.04 0.01 0.07
## sinc3_1 0.16 0.03 0.02 0.01 0.26
## intel3_1 0.06 0.31 0.13 0.03 0.08
## fun3_1 0.16 0.57 0.06 -0.09 0.22
## amb3_1 0.11 0.46 0.08 -0.05 0.26
##
## ML3 ML4 ML2 ML1 ML5
## SS loadings 3.41 2.45 2.18 1.94 1.64
## Proportion Var 0.09 0.06 0.06 0.05 0.04
## Cumulative Var 0.09 0.15 0.21 0.26 0.30
fa.diagram(fa_2)
The diagram looks less messy now, yet the factor loadings are still insufficient for most of the variables.
fa_2$TLI
## [1] 0.2667007
fa_2$RMSEA[1]
## RMSEA
## 0.1325313
However, the fit measures worsened: the TLI fell even further below the minimum acceptable value, while the RMSEA increased.
Let’s try 5 factors with rotation:
fa_3 <- fa(dating1, nfactors=5, rotate="varimax", fm="ml")
print(fa_3$loadings, cutoff = 0, digits = 2)
##
## Loadings:
## ML3 ML2 ML4 ML5 ML1
## imprace 0.03 -0.04 0.09 0.24 0.00
## imprelig -0.04 -0.06 -0.03 0.28 -0.05
## date -0.06 0.00 -0.36 0.09 0.00
## go_out 0.03 0.03 -0.30 -0.02 -0.06
## sports -0.13 0.09 0.42 -0.15 0.00
## tvsports -0.10 0.10 0.32 0.10 -0.12
## exercise 0.00 -0.10 0.31 0.05 0.04
## dining 0.42 -0.07 0.23 0.25 0.05
## museums 0.92 -0.02 -0.07 -0.03 -0.02
## art 0.92 -0.03 -0.02 -0.06 -0.04
## hiking 0.20 0.04 0.03 -0.09 0.05
## gaming -0.05 0.15 0.23 0.09 0.02
## clubbing 0.16 -0.03 0.23 0.10 0.05
## reading 0.30 -0.01 -0.13 0.02 0.02
## tv 0.04 0.01 0.06 0.50 -0.02
## theater 0.60 -0.05 -0.12 0.35 0.02
## movies 0.38 0.04 -0.07 0.38 0.01
## concerts 0.46 0.00 0.04 0.21 -0.05
## music 0.35 0.01 0.15 0.23 0.01
## shopping 0.27 -0.15 0.24 0.45 0.02
## yoga 0.33 -0.02 0.07 0.15 -0.02
## exphappy 0.07 0.16 0.27 -0.05 0.06
## attr1_1 -0.19 -0.38 0.45 -0.43 -0.15
## sinc1_1 0.02 0.20 -0.36 0.19 0.02
## intel1_1 0.13 -0.04 -0.27 0.02 -0.01
## fun1_1 -0.02 0.07 0.11 -0.03 0.32
## amb1_1 0.13 0.17 -0.03 0.42 0.08
## shar1_1 0.05 0.31 -0.28 0.23 -0.10
## attr2_1 0.02 -0.94 -0.04 0.03 -0.34
## sinc2_1 -0.06 0.67 0.04 -0.04 0.00
## intel2_1 -0.03 0.61 0.15 -0.09 0.01
## fun2_1 0.06 -0.09 -0.12 0.03 0.98
## amb2_1 -0.09 0.59 0.16 -0.14 -0.11
## shar2_1 0.09 0.52 -0.14 0.19 -0.03
## attr3_1 0.12 0.01 0.53 0.06 0.04
## sinc3_1 0.13 0.04 0.04 0.27 -0.01
## intel3_1 0.05 0.11 0.32 0.06 -0.01
## fun3_1 0.15 -0.03 0.58 0.21 -0.06
## amb3_1 0.10 0.02 0.47 0.25 -0.05
##
## ML3 ML2 ML4 ML5 ML1
## SS loadings 3.26 2.77 2.49 1.82 1.29
## Proportion Var 0.08 0.07 0.06 0.05 0.03
## Cumulative Var 0.08 0.15 0.22 0.26 0.30
fa.diagram(fa_3)
This model uses “Varimax” rotation, which assumes that the factors are orthogonal, i.e. uncorrelated with each other. The diagram looks a bit different, while the conclusion about factor loadings still applies.
fa_3$TLI
## [1] 0.2667007
fa_3$RMSEA[1]
## RMSEA
## 0.1325313
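The TLI and RMSEA values did not change at all. This is expected: rotation only redistributes the loadings across the factors and leaves the overall fit of the model untouched, so it can help interpretation but not fit.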
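That rotation leaves fit unchanged can be checked on a small example. The sketch below uses base R's factanal() on simulated data (the `toy` data set and its two-factor structure are assumptions for illustration) rather than psych::fa(), and compares an unrotated with a varimax-rotated solution.

```r
set.seed(7)
n <- 500

# Simulated data with two uncorrelated factors (hypothetical)
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- data.frame(
  a1 = 0.8 * f1 + 0.6 * rnorm(n),
  a2 = 0.8 * f1 + 0.6 * rnorm(n),
  a3 = 0.8 * f1 + 0.6 * rnorm(n),
  b1 = 0.8 * f2 + 0.6 * rnorm(n),
  b2 = 0.8 * f2 + 0.6 * rnorm(n),
  b3 = 0.8 * f2 + 0.6 * rnorm(n)
)

fit_none <- factanal(toy, factors = 2, rotation = "none")
fit_vmx  <- factanal(toy, factors = 2, rotation = "varimax")

# Communalities: variance of each variable explained by all factors together
comm_none <- rowSums(unclass(fit_none$loadings)^2)
comm_vmx  <- rowSums(unclass(fit_vmx$loadings)^2)

same_communalities <- isTRUE(all.equal(comm_none, comm_vmx))
same_chisq <- isTRUE(all.equal(fit_none$STATISTIC, fit_vmx$STATISTIC))

same_communalities
same_chisq
```

The individual loadings differ between the two fits, but the communalities and the chi-square statistic do not.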
Let’s do it with “Oblimin” rotation assuming that factors are correlated with each other.
fa_4 <- fa(dating1, nfactors=5, rotate="oblimin", fm="ml")
print(fa_4$loadings, cutoff = 0, digits = 2)
##
## Loadings:
## ML2 ML3 ML4 ML5 ML1
## imprace -0.07 -0.07 0.20 0.18 0.00
## imprelig -0.13 -0.14 0.10 0.27 -0.04
## date -0.09 -0.06 -0.27 0.25 0.00
## go_out -0.02 0.06 -0.28 0.13 -0.06
## sports 0.18 -0.10 0.29 -0.31 -0.02
## tvsports 0.11 -0.14 0.31 -0.03 -0.15
## exercise -0.04 -0.05 0.30 -0.11 0.06
## dining -0.05 0.31 0.36 0.11 0.07
## museums 0.01 0.93 -0.03 -0.01 0.00
## art 0.01 0.94 0.00 -0.06 -0.02
## hiking 0.08 0.23 0.00 -0.09 0.04
## gaming 0.17 -0.10 0.24 0.00 -0.01
## clubbing 0.00 0.10 0.27 -0.02 0.05
## reading -0.02 0.30 -0.09 0.07 0.02
## tv -0.08 -0.15 0.31 0.44 -0.03
## theater -0.12 0.47 0.11 0.37 0.04
## movies -0.04 0.24 0.15 0.39 0.00
## concerts -0.03 0.38 0.16 0.19 -0.05
## music 0.01 0.25 0.26 0.14 0.01
## shopping -0.18 0.08 0.46 0.29 0.05
## yoga -0.03 0.27 0.16 0.10 -0.01
## exphappy 0.22 0.07 0.21 -0.15 0.03
## attr1_1 -0.22 -0.06 0.18 -0.65 -0.08
## sinc1_1 0.09 -0.02 -0.23 0.37 -0.01
## intel1_1 -0.09 0.14 -0.22 0.13 0.00
## fun1_1 0.14 -0.03 0.10 -0.09 0.30
## amb1_1 0.09 -0.02 0.19 0.43 0.05
## shar1_1 0.19 0.00 -0.15 0.40 -0.16
## attr2_1 -0.96 0.00 -0.01 -0.06 -0.16
## sinc2_1 0.66 -0.02 -0.01 0.04 -0.13
## intel2_1 0.63 0.01 0.06 -0.06 -0.10
## fun2_1 0.02 -0.01 -0.01 -0.01 1.00
## amb2_1 0.61 -0.02 0.03 -0.11 -0.23
## shar2_1 0.44 0.05 -0.05 0.32 -0.13
## attr3_1 0.10 0.06 0.51 -0.19 0.03
## sinc3_1 0.00 0.03 0.17 0.24 -0.02
## intel3_1 0.15 0.01 0.31 -0.07 -0.04
## fun3_1 0.03 0.04 0.63 -0.06 -0.06
## amb3_1 0.05 -0.02 0.54 0.03 -0.05
##
## ML2 ML3 ML4 ML5 ML1
## SS loadings 2.72 2.70 2.52 2.13 1.30
## Proportion Var 0.07 0.07 0.06 0.05 0.03
## Cumulative Var 0.07 0.14 0.20 0.26 0.29
fa.diagram(fa_4)
fa_4$TLI
## [1] 0.2667007
fa_4$RMSEA[1]
## RMSEA
## 0.1325313
The factor loadings have changed, now reaching |0.96| and even 1.00 in some cases. Nevertheless, the measures of fit are still the same, indicating a bad model fit.
Finally, I will fit the model one last time with 6 factors, as suggested by the second scree plot, and “Oblimin” rotation, as I believe there should be some correlation between the factors describing what attracts people to a potential partner on a first date.
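Since an oblique rotation allows the factors to correlate, the estimated factor correlations are worth a look alongside the loadings. The sketch below is only an illustration under stated assumptions: it uses base R's factanal() with promax rotation (a different oblique rotation, swapped in for oblimin because it ships with base R) on simulated data whose two factors are built to correlate at about 0.5.

```r
set.seed(42)
n <- 1000

# Two simulated factors with a true correlation of about 0.5 (hypothetical)
f1 <- rnorm(n)
f2 <- 0.5 * f1 + sqrt(1 - 0.5^2) * rnorm(n)
toy <- data.frame(
  a1 = 0.8 * f1 + 0.6 * rnorm(n),
  a2 = 0.8 * f1 + 0.6 * rnorm(n),
  a3 = 0.8 * f1 + 0.6 * rnorm(n),
  b1 = 0.8 * f2 + 0.6 * rnorm(n),
  b2 = 0.8 * f2 + 0.6 * rnorm(n),
  b3 = 0.8 * f2 + 0.6 * rnorm(n)
)

fit_oblique <- factanal(toy, factors = 2, rotation = "promax")

# Factor correlation matrix recovered from the rotation matrix
t_inv <- solve(fit_oblique$rotmat)
factor_cor <- t_inv %*% t(t_inv)

round(factor_cor, 2)
```

Off-diagonal values close to zero would suggest that an orthogonal rotation describes the data just as well.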
fa6 <- fa(dating1, nfactors=6, rotate="oblimin", fm="ml")
print(fa6$loadings, cutoff = 0, digits = 2)
##
## Loadings:
## ML5 ML4 ML6 ML1 ML2 ML3
## imprace -0.01 -0.06 0.19 0.04 -0.03 0.01
## imprelig -0.12 -0.15 0.17 0.22 -0.08 -0.04
## date -0.03 -0.10 -0.28 0.21 0.08 0.00
## go_out 0.11 0.00 -0.33 0.04 0.07 -0.05
## sports -0.14 0.16 0.24 -0.17 0.10 -0.03
## tvsports -0.11 0.09 0.23 -0.01 0.12 -0.15
## exercise -0.06 -0.07 0.29 -0.05 0.05 0.05
## dining 0.34 -0.05 0.35 0.03 -0.03 0.07
## museums 0.93 0.02 -0.03 -0.02 -0.04 0.00
## art 0.93 0.02 -0.01 -0.05 0.02 -0.02
## hiking 0.21 0.05 -0.02 0.01 0.11 0.04
## gaming -0.05 0.19 0.16 -0.09 0.04 -0.01
## clubbing 0.11 -0.01 0.23 -0.01 0.05 0.05
## reading 0.29 -0.01 -0.03 0.05 -0.12 0.03
## tv -0.01 -0.07 0.22 0.18 0.08 -0.02
## theater 0.56 -0.10 0.08 0.16 -0.01 0.05
## movies 0.34 -0.01 0.10 0.15 -0.04 0.01
## concerts 0.45 -0.01 0.09 0.05 0.06 -0.04
## music 0.32 0.00 0.20 0.06 0.09 0.01
## shopping 0.18 -0.18 0.39 0.10 0.05 0.05
## yoga 0.28 -0.05 0.16 0.11 0.06 -0.02
## exphappy 0.06 0.22 0.17 -0.12 0.01 0.02
## attr1_1 -0.01 -0.07 0.02 -0.93 0.17 -0.04
## sinc1_1 0.01 0.01 -0.25 0.49 0.24 -0.03
## intel1_1 0.02 -0.02 -0.03 0.07 -0.98 -0.01
## fun1_1 -0.08 0.05 0.11 0.20 0.17 0.28
## amb1_1 -0.01 -0.03 0.28 0.56 0.12 0.03
## shar1_1 0.03 0.11 -0.11 0.49 0.18 -0.17
## attr2_1 -0.01 -0.95 0.00 -0.07 -0.03 -0.17
## sinc2_1 0.00 0.65 -0.08 0.05 0.13 -0.13
## intel2_1 -0.02 0.65 0.10 -0.07 -0.22 -0.10
## fun2_1 0.00 0.03 -0.02 0.00 0.01 1.00
## amb2_1 -0.05 0.60 0.03 -0.05 0.04 -0.23
## shar2_1 0.11 0.41 -0.04 0.25 0.11 -0.12
## attr3_1 0.04 0.09 0.53 -0.17 -0.03 0.03
## sinc3_1 0.07 -0.05 0.18 0.23 0.09 -0.02
## intel3_1 -0.03 0.14 0.40 -0.04 -0.16 -0.04
## fun3_1 0.03 -0.03 0.64 0.05 0.15 -0.07
## amb3_1 -0.02 0.03 0.58 0.01 0.01 -0.06
##
## ML5 ML4 ML6 ML1 ML2 ML3
## SS loadings 2.93 2.55 2.39 2.13 1.36 1.28
## Proportion Var 0.08 0.07 0.06 0.05 0.03 0.03
## Cumulative Var 0.08 0.14 0.20 0.26 0.29 0.32
fa.diagram(fa6)
fa6$TLI
## [1] 0.2733481
fa6$RMSEA[1]
## RMSEA
## 0.1319293
Okaaay, the situation did not change much, although the TLI value increased a little. But this is still a really bad FA model. Perhaps the model with 14 factors, though uninterpretable, was the closest to ideal judging by the fit indicators (TLI, RMSEA).
So, let’s give it a shot and try to label the factors resulting from the last model.
ML5 is mostly represented by variables such as museums, art, theater, and concerts. It seems appropriate to label this factor “Cultural interests”.
ML4 is represented by attr2_1, sinc2_1, intel2_1, shar2_1, and amb2_1 and can be labeled “Personal characteristics_2”.
ML6 is mostly covered by amb3_1, fun3_1, intel3_1, and attr3_1, which calls for the label “Personal characteristics_3”.
ML1, mostly covered by attr1_1, sinc1_1, amb1_1, and shar1_1, and ML2, mostly represented by intel1_1 alone, could both be labeled “Personal characteristics_1”. Yet since they somehow ended up in separate factors, this becomes quite confusing. So ML2 can be called “Intelligence_1”, I guess.
Lastly, ML3 is represented by fun2_1 and so, by analogy with the previous one, can be labeled “Sense of humor_2”. The numbers in each label are taken from the variable names, including those indicating the wave of the survey.
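The grouping used for labeling can also be derived from the loadings directly. The sketch below copies a handful of rows from the 6-factor output above into a hypothetical `loadings_6` matrix and assigns each variable to the factor with its largest absolute loading, provided that loading reaches |0.4|.

```r
# A few rows copied from the 6-factor oblimin output above
loadings_6 <- matrix(
  c( 0.93,  0.02, -0.03, -0.02, -0.04,  0.00,   # museums
     0.56, -0.10,  0.08,  0.16, -0.01,  0.05,   # theater
    -0.01, -0.95,  0.00, -0.07, -0.03, -0.17,   # attr2_1
     0.00,  0.65, -0.08,  0.05,  0.13, -0.13,   # sinc2_1
     0.03, -0.03,  0.64,  0.05,  0.15, -0.07,   # fun3_1
    -0.01, -0.07,  0.02, -0.93,  0.17, -0.04,   # attr1_1
     0.02, -0.02, -0.03,  0.07, -0.98, -0.01,   # intel1_1
     0.00,  0.03, -0.02,  0.00,  0.01,  1.00,   # fun2_1
    -0.01, -0.06,  0.19,  0.04, -0.03,  0.01),  # imprace
  ncol = 6, byrow = TRUE,
  dimnames = list(
    c("museums", "theater", "attr2_1", "sinc2_1", "fun3_1",
      "attr1_1", "intel1_1", "fun2_1", "imprace"),
    c("ML5", "ML4", "ML6", "ML1", "ML2", "ML3")
  )
)

# Factor with the largest absolute loading for each variable
dominant <- colnames(loadings_6)[max.col(abs(loadings_6), ties.method = "first")]

# Leave variables unassigned when even their best loading is below 0.4
strongest <- apply(abs(loadings_6), 1, max)
dominant[strongest < 0.4] <- NA

# Variables grouped by factor (unassigned variables are dropped)
groups <- split(rownames(loadings_6), dominant)
groups
```

Each element of `groups` lists the variables that would have to share a label for that factor.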
Still, it’s a mess :)
Anyways, that’s all for EFA Part 1. Thanks for your attention!