Speed Dating dataset (Kaggle): “What influences love at first sight?” Read about the experiment: https://www.kaggle.com/annavictoria/speed-dating-experiment
dating <- read.csv("Speed Dating Data.csv")
#names(dating)
# Select the variables we expect to load on common factors.
dating1 <- dating[c("imprace","imprelig", "date", "go_out", "sports",
"tvsports", "exercise", "dining" , "museums", "art",
"hiking", "gaming", "clubbing",
"reading", "tv", "theater", "movies", "concerts",
"music", "shopping", "yoga", "exphappy" , "attr1_1",
"sinc1_1", "intel1_1", "fun1_1", "amb1_1",
"shar1_1", "attr2_1", "sinc2_1", "intel2_1",
"fun2_1", "amb2_1", "shar2_1", "attr3_1", "sinc3_1",
"intel3_1", "fun3_1", "amb3_1")]
dating1 <- as.data.frame(dating1)
#dim(dating1)
# summary(dating1)
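The question we want to answer: are there latent factors that explain the correlations among the observed variables?
NB: If you have ordinal or binary variables, there are two ways to handle them. One is to compute a heterogeneous correlation matrix with hetcor() from polycor: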
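library(polycor)
# Heterogeneous correlation matrix: Pearson, polyserial or polychoric,
# chosen for each pair of variables according to their types
dat.cor <- hetcor(dating1)
dat.cor <- dat.cor$correlations
# Listwise deletion: keep only complete rows for the analysis below
dating12 <- na.omit(dating1)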
library(psych)
##
## Attaching package: 'psych'
## The following object is masked from 'package:polycor':
##
## polyserial
fa.parallel(dating12, fa="both", n.iter=100)
## Parallel analysis suggests that the number of factors = 15 and the number of components = 13
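How many factors should be extracted? Interpret the parallel analysis scree plot.
Parallel analysis compares the eigenvalues of the actual data with the eigenvalues obtained from random (simulated or resampled) data of the same size. The suggested number of factors is the number of actual-data eigenvalues that lie above the corresponding random-data eigenvalues; here that is 15 factors (and 13 components). The higher a point sits on the plot, the larger its eigenvalue, that is, the more variance of the observed variables the factor accounts for.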
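The comparison behind parallel analysis can be sketched in base R. This is only an illustration on simulated data, not what fa.parallel() does internally: it uses principal-component eigenvalues only, and the data set `x`, the 95% quantile threshold and all variable names are assumptions made for the example.

```r
# Minimal sketch of Horn's parallel analysis (component version), base R only.
# All data are simulated; `x` stands in for a real data set.
set.seed(42)
n <- 500   # observations
p <- 6     # observed variables

# Hypothetical data: two latent factors with three noisy indicators each
f1 <- rnorm(n)
f2 <- rnorm(n)
x <- cbind(
  a1 = f1 + rnorm(n, sd = 0.6),
  a2 = f1 + rnorm(n, sd = 0.6),
  a3 = f1 + rnorm(n, sd = 0.6),
  b1 = f2 + rnorm(n, sd = 0.6),
  b2 = f2 + rnorm(n, sd = 0.6),
  b3 = f2 + rnorm(n, sd = 0.6)
)

# Eigenvalues of the observed correlation matrix (sorted in decreasing order)
actual <- eigen(cor(x), only.values = TRUE)$values

# Eigenvalues of correlation matrices of pure-noise data of the same shape
n_iter <- 100
random <- replicate(n_iter, {
  noise <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(noise), only.values = TRUE)$values
})
# One threshold per eigenvalue position: 95% quantile over the iterations
threshold <- apply(random, 1, quantile, probs = 0.95)

# Retain the leading eigenvalues that exceed their random counterpart
above <- actual > threshold
n_keep <- if (all(above)) p else which(!above)[1] - 1

print(round(rbind(actual, threshold), 2))
n_keep
```

With two clearly separated latent factors in the simulated data, only the first two eigenvalues should clear the random threshold. On real data the boundary is usually less sharp, which is one reason to treat the suggested number as an upper bound rather than a fixed answer.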
Let’s first try the maximum suggested number of factors.
fa(dating12, nfactors=15, rotate="none", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 15, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 ML11 ML10 ML12 ML13 ML4 ML5 ML3 ML14 ML15
## imprace -0.06 0.01 -0.03 0.14 0.16 -0.08 0.05 0.02 0.07 0.29 -0.09
## imprelig 0.01 0.19 -0.09 0.04 0.10 0.01 0.14 -0.02 0.08 0.21 0.01
## date 0.04 0.24 -0.09 -0.04 -0.06 -0.14 -0.02 0.03 -0.08 -0.06 -0.19
## go_out 0.06 0.12 0.03 -0.01 -0.09 -0.18 -0.05 -0.11 -0.14 -0.02 -0.19
## sports 0.02 -0.23 -0.04 -0.03 0.20 0.72 -0.14 -0.06 0.00 0.11 -0.29
## tvsports 0.06 -0.14 -0.01 0.04 0.35 0.31 -0.18 -0.12 -0.03 0.26 -0.30
## exercise -0.11 -0.04 -0.05 0.12 0.12 0.39 -0.01 0.03 0.04 0.19 -0.09
## dining -0.06 0.06 0.25 0.31 0.14 0.03 0.10 0.03 0.07 0.12 0.23
## museums -0.03 0.18 0.80 0.43 -0.15 0.03 0.12 0.03 0.04 0.08 -0.04
## art -0.04 0.13 0.73 0.43 -0.10 0.01 0.07 0.03 0.06 0.00 0.00
## hiking 0.08 0.04 0.20 0.11 -0.02 0.25 -0.03 0.03 -0.06 -0.16 -0.09
## gaming 0.12 -0.17 0.00 0.05 0.28 0.00 -0.10 -0.04 0.02 0.08 -0.09
## clubbing -0.06 -0.07 0.06 0.19 0.13 0.07 -0.03 0.05 -0.02 -0.01 0.06
## reading 0.02 0.14 0.24 0.08 -0.09 -0.01 0.16 0.02 0.02 0.03 0.12
## tv 0.06 0.11 -0.01 0.14 0.51 -0.38 -0.02 -0.01 -0.02 0.54 -0.13
## theater -0.01 0.24 0.39 0.37 0.16 -0.22 0.14 0.03 -0.01 0.01 0.01
## movies 0.03 0.14 0.27 0.20 0.36 -0.26 0.08 0.01 -0.05 0.02 -0.02
## concerts 0.00 0.12 0.42 0.24 0.54 -0.04 0.04 -0.11 0.04 -0.49 -0.05
## music -0.01 0.06 0.29 0.21 0.50 0.01 0.03 -0.04 0.03 -0.36 0.10
## shopping -0.14 0.08 0.09 0.35 0.35 -0.20 0.05 0.06 -0.01 0.27 0.09
## yoga 0.00 0.12 0.20 0.20 0.10 0.06 0.06 0.00 0.05 -0.08 0.11
## exphappy 0.12 -0.20 0.13 0.01 0.17 0.18 -0.07 0.03 0.00 -0.01 0.01
## attr1_1 -0.64 -0.55 0.03 -0.03 0.00 0.00 -0.41 -0.17 0.06 0.00 0.00
## sinc1_1 0.41 0.43 0.02 -0.04 0.00 0.00 -0.04 -0.24 -0.74 0.00 0.00
## intel1_1 -0.02 0.07 0.08 -0.30 0.03 0.01 0.53 0.06 0.03 0.01 0.01
## fun1_1 0.05 -0.19 0.14 -0.24 0.00 0.00 0.18 0.58 -0.03 0.00 0.00
## amb1_1 0.25 0.21 -0.38 0.74 -0.04 0.00 0.21 0.20 0.10 -0.03 -0.02
## shar1_1 0.50 0.49 0.03 -0.04 0.00 0.00 -0.12 -0.22 0.61 0.00 0.00
## attr2_1 -0.87 0.38 -0.01 0.00 0.00 0.00 0.02 -0.20 -0.04 0.00 0.00
## sinc2_1 0.68 -0.25 0.00 0.01 0.00 0.00 -0.55 0.06 -0.17 0.00 0.00
## intel2_1 0.50 -0.61 0.00 0.02 0.00 0.00 0.48 -0.32 0.01 0.00 0.00
## fun2_1 0.05 0.05 0.01 0.02 0.01 0.00 0.21 0.73 -0.14 0.00 0.00
## amb2_1 0.36 -0.34 0.04 -0.02 0.01 -0.01 -0.03 -0.06 0.11 0.00 0.00
## shar2_1 0.53 0.26 -0.01 -0.02 0.00 0.00 -0.12 0.06 0.32 0.00 0.00
## attr3_1 -0.08 -0.20 0.03 0.22 0.12 0.29 -0.07 0.08 0.09 0.17 0.45
## sinc3_1 0.12 0.20 0.09 0.09 0.16 0.17 0.02 -0.12 -0.21 0.11 0.24
## intel3_1 0.05 -0.10 0.04 0.01 0.09 0.27 0.03 -0.04 0.14 0.20 0.41
## fun3_1 -0.08 -0.14 0.09 0.23 0.25 0.26 -0.02 0.03 0.04 0.15 0.44
## amb3_1 -0.05 -0.14 -0.11 0.39 0.20 0.21 0.00 0.01 0.13 0.11 0.28
## ML9 ML7 ML8 ML6 h2 u2 com
## imprace 0.01 0.02 0.02 -0.01 0.155 0.8451 2.9
## imprelig -0.01 0.01 0.05 -0.02 0.130 0.8702 4.4
## date 0.04 -0.03 0.01 0.02 0.144 0.8561 3.9
## go_out -0.07 -0.04 -0.04 0.00 0.138 0.8623 5.9
## sports 0.07 -0.02 -0.06 0.10 0.751 0.2491 2.0
## tvsports 0.15 -0.05 0.03 -0.04 0.478 0.5217 5.7
## exercise 0.04 0.01 -0.02 0.05 0.248 0.7518 2.5
## dining -0.04 0.13 0.02 -0.02 0.298 0.7025 4.9
## museums -0.11 0.10 0.13 -0.04 0.938 0.0616 2.0
## art -0.04 0.14 0.11 -0.05 0.790 0.2102 2.0
## hiking 0.06 0.03 -0.04 0.04 0.165 0.8350 4.5
## gaming 0.00 -0.01 -0.07 -0.04 0.156 0.8445 3.3
## clubbing 0.06 0.02 0.00 -0.06 0.085 0.9147 4.3
## reading -0.13 0.05 0.06 0.02 0.156 0.8440 5.0
## tv 0.09 0.01 0.01 -0.03 0.756 0.2445 3.3
## theater -0.04 0.18 0.06 -0.01 0.483 0.5174 4.6
## movies -0.02 0.08 0.08 -0.02 0.353 0.6471 4.3
## concerts 0.03 0.08 -0.02 0.01 0.810 0.1901 3.7
## music 0.08 0.04 -0.01 0.08 0.537 0.4629 3.3
## shopping 0.08 0.07 0.04 -0.04 0.419 0.5807 4.8
## yoga 0.08 0.08 0.04 0.01 0.147 0.8532 5.4
## exphappy 0.04 0.04 -0.02 -0.01 0.141 0.8585 4.9
## attr1_1 -0.06 0.05 -0.16 0.22 0.995 0.0050 3.4
## sinc1_1 0.00 -0.04 -0.10 0.17 0.995 0.0050 2.7
## intel1_1 -0.54 0.20 0.38 -0.34 0.979 0.0211 4.6
## fun1_1 0.69 -0.10 0.03 -0.06 0.978 0.0217 2.8
## amb1_1 0.02 -0.10 0.18 -0.09 0.943 0.0572 2.7
## shar1_1 0.01 -0.07 -0.24 -0.10 0.995 0.0051 3.8
## attr2_1 0.09 0.08 0.04 -0.18 0.995 0.0050 1.6
## sinc2_1 0.00 0.12 0.06 -0.34 0.995 0.0050 3.0
## intel2_1 0.07 0.14 -0.11 -0.03 0.995 0.0050 3.7
## fun2_1 -0.21 0.05 -0.57 0.11 0.986 0.0137 2.5
## amb2_1 -0.09 -0.79 0.24 0.19 0.986 0.0135 2.3
## shar2_1 0.03 0.35 0.27 0.57 0.992 0.0085 4.4
## attr3_1 -0.01 0.09 0.10 0.10 0.472 0.5276 4.1
## sinc3_1 0.02 0.02 -0.02 0.10 0.266 0.7345 7.7
## intel3_1 -0.11 -0.04 0.08 0.03 0.343 0.6571 3.4
## fun3_1 0.32 -0.04 0.08 0.00 0.546 0.4542 4.9
## amb3_1 0.01 -0.04 0.13 -0.01 0.398 0.6022 4.6
##
## ML1 ML2 ML11 ML10 ML12 ML13 ML4 ML5 ML3 ML14
## SS loadings 2.90 2.16 2.12 1.99 1.64 1.52 1.32 1.28 1.25 1.15
## Proportion Var 0.07 0.06 0.05 0.05 0.04 0.04 0.03 0.03 0.03 0.03
## Cumulative Var 0.07 0.13 0.18 0.24 0.28 0.32 0.35 0.38 0.42 0.44
## Proportion Explained 0.13 0.10 0.10 0.09 0.07 0.07 0.06 0.06 0.06 0.05
## Cumulative Proportion 0.13 0.23 0.32 0.41 0.49 0.56 0.62 0.67 0.73 0.78
## ML15 ML9 ML7 ML8 ML6
## SS loadings 1.11 1.07 0.98 0.85 0.79
## Proportion Var 0.03 0.03 0.03 0.02 0.02
## Cumulative Var 0.47 0.50 0.53 0.55 0.57
## Proportion Explained 0.05 0.05 0.04 0.04 0.04
## Cumulative Proportion 0.83 0.88 0.93 0.96 1.00
##
## Mean item complexity = 3.8
## Test of the hypothesis that 15 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 261 and the objective function was 1.4
##
## The root mean square of the residuals (RMSR) is 0.03
## The df corrected root mean square of the residuals is 0.05
##
## The harmonic number of observations is 8191 with the empirical chi square 11661.7 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 11459.12 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.778
## RMSEA index = 0.072 and the 90 % confidence intervals are 0.071 0.074
## BIC = 9107.3
## Fit based upon off diagonal values = 0.95
## Measures of factor score adequacy
## ML1 ML2 ML11 ML10 ML12
## Correlation of (regression) scores with factors 1 1.00 0.97 0.98 0.91
## Multiple R square of scores with factors 1 1.00 0.95 0.96 0.83
## Minimum correlation of possible factor scores 1 0.99 0.90 0.92 0.67
## ML13 ML4 ML5 ML3 ML14
## Correlation of (regression) scores with factors 0.9 1.00 1.00 1.00 0.88
## Multiple R square of scores with factors 0.8 0.99 0.99 1.00 0.78
## Minimum correlation of possible factor scores 0.6 0.99 0.98 0.99 0.57
## ML15 ML9 ML7 ML8 ML6
## Correlation of (regression) scores with factors 0.83 0.99 0.99 0.99 0.99
## Multiple R square of scores with factors 0.69 0.98 0.99 0.99 0.99
## Minimum correlation of possible factor scores 0.37 0.96 0.97 0.97 0.98
We start with the variables that have the highest communality (h2), i.e. the share of a variable's variance explained by the factors. Here these are attr1_1, attr2_1, sinc1_1, sinc2_1, shar1_1 and intel2_1, all with h2 = 0.995 (shar2_1 follows at 0.992). The model itself is not very good. First, there are too many factors, which makes it hard to interpret. Second, the factors explain little: Proportion Var is low for every factor (at most 0.07), and all 15 factors together account for only 57% of the total variance. As for Proportion Explained, the gap between the first and the last factor is modest (0.13 vs. 0.04), which might look acceptable, but the plots below show that some factors are tied to a single variable only, which is bad. RMSR is 0.03, which is actually not bad. This was the model without rotation; let’s try another one. First, let’s reduce the number of factors.
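The summary columns discussed here can be recomputed by hand, which makes clear what they measure. The sketch below uses only numbers copied from the printed 15-factor output (the loadings of `sports` and the SS loadings row). Because those printed values are rounded to two decimals, the results agree with the output only approximately. The complexity formula is Hoffman's index, which the psych documentation names as the source of the `com` column.

```r
# Loadings of `sports` on the 15 unrotated factors, copied from the output
sports <- c(0.02, -0.23, -0.04, -0.03, 0.20, 0.72, -0.14, -0.06, 0.00, 0.11,
            -0.29, 0.07, -0.02, -0.06, 0.10)

h2  <- sum(sports^2)                      # communality: variance explained by the factors
u2  <- 1 - h2                             # uniqueness: variance left unexplained
com <- sum(sports^2)^2 / sum(sports^4)    # Hoffman's complexity index

# SS loadings of the 15 factors, copied from the output
ss <- c(2.90, 2.16, 2.12, 1.99, 1.64, 1.52, 1.32, 1.28, 1.25, 1.15,
        1.11, 1.07, 0.98, 0.85, 0.79)
n_vars <- 39                              # number of observed variables in dating12

prop_var       <- ss / n_vars             # "Proportion Var": share of total variance
cum_var        <- cumsum(prop_var)        # "Cumulative Var"
prop_explained <- ss / sum(ss)            # "Proportion Explained": share of explained variance

round(c(h2 = h2, u2 = u2, com = com), 2)
round(rbind(prop_var, cum_var, prop_explained), 2)
```

Proportion Var divides by the number of variables, so it shows how much of the total variance a factor captures. Proportion Explained divides by the variance captured by all extracted factors, so it always sums to 1 and says nothing about how good the model is overall.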
factor.plot(fa(dating12, nfactors=15, rotate="none", fm="ml"))
## Warning in plot.xy(xy.coords(x, y), type = type, ...): unimplemented pch
## value '26'
fa.diagram(fa(dating12, nfactors=15, rotate="none", fm="ml"))
fa(dating12, nfactors=5, rotate="none", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 5, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML4 ML2 ML1 ML5 h2 u2 com
## imprace 0.06 0.09 -0.04 -0.03 0.24 0.073 0.9269 1.5
## imprelig 0.01 -0.05 -0.04 -0.05 0.30 0.096 0.9041 1.2
## date -0.06 -0.35 -0.06 -0.01 0.08 0.138 0.8619 1.2
## go_out 0.02 -0.31 0.01 -0.03 -0.03 0.097 0.9033 1.0
## sports -0.14 0.40 0.14 0.04 -0.12 0.214 0.7857 1.7
## tvsports -0.07 0.29 0.19 -0.07 0.11 0.143 0.8566 2.3
## exercise 0.00 0.31 -0.07 -0.03 0.06 0.105 0.8951 1.2
## dining 0.41 0.25 -0.10 -0.02 0.21 0.292 0.7077 2.4
## museums 0.91 -0.04 -0.09 -0.06 -0.13 0.853 0.1467 1.1
## art 0.90 0.01 -0.08 -0.07 -0.15 0.843 0.1574 1.1
## hiking 0.21 0.03 0.01 0.07 -0.09 0.060 0.9401 1.7
## gaming -0.02 0.21 0.14 0.09 0.11 0.085 0.9146 2.8
## clubbing 0.12 0.26 -0.04 -0.01 0.08 0.091 0.9089 1.7
## reading 0.30 -0.13 -0.06 0.01 -0.01 0.108 0.8917 1.5
## tv 0.09 0.06 0.00 -0.02 0.50 0.262 0.7383 1.1
## theater 0.61 -0.09 -0.14 -0.04 0.29 0.482 0.5176 1.6
## movies 0.38 -0.04 -0.04 -0.02 0.32 0.252 0.7481 2.0
## concerts 0.45 0.06 -0.02 -0.08 0.16 0.233 0.7671 1.4
## music 0.32 0.18 -0.02 -0.03 0.19 0.171 0.8292 2.3
## shopping 0.28 0.26 -0.16 -0.09 0.42 0.353 0.6467 3.0
## yoga 0.32 0.08 -0.03 -0.05 0.12 0.127 0.8726 1.5
## exphappy 0.07 0.26 0.13 0.12 -0.05 0.109 0.8912 2.2
## attr1_1 -0.21 0.47 -0.14 -0.31 -0.42 0.555 0.4454 3.4
## sinc1_1 0.02 -0.37 0.09 0.12 0.19 0.196 0.8040 1.9
## intel1_1 0.12 -0.26 -0.08 -0.04 -0.02 0.089 0.9106 1.7
## fun1_1 -0.05 0.15 -0.10 0.28 -0.04 0.114 0.8858 2.0
## amb1_1 0.18 -0.04 0.05 0.16 0.42 0.239 0.7606 1.7
## shar1_1 0.10 -0.31 0.26 0.09 0.24 0.245 0.7552 3.3
## attr2_1 -0.01 0.00 -0.61 -0.79 0.00 0.995 0.0049 1.9
## sinc2_1 -0.02 -0.02 0.57 0.36 -0.01 0.452 0.5481 1.7
## intel2_1 -0.01 0.10 0.52 0.32 -0.07 0.391 0.6086 1.8
## fun2_1 0.00 0.00 -0.63 0.77 0.00 0.995 0.0050 1.9
## amb2_1 -0.07 0.10 0.59 0.21 -0.11 0.419 0.5811 1.4
## shar2_1 0.15 -0.19 0.41 0.26 0.21 0.342 0.6576 3.1
## attr3_1 0.12 0.53 0.04 0.01 0.06 0.302 0.6978 1.2
## sinc3_1 0.14 0.03 0.02 0.00 0.25 0.087 0.9133 1.6
## intel3_1 0.06 0.31 0.12 0.02 0.07 0.117 0.8835 1.5
## fun3_1 0.16 0.58 0.06 -0.09 0.21 0.411 0.5893 1.5
## amb3_1 0.11 0.46 0.08 -0.05 0.25 0.296 0.7040 1.8
##
## ML3 ML4 ML2 ML1 ML5
## SS loadings 3.20 2.48 2.21 1.86 1.68
## Proportion Var 0.08 0.06 0.06 0.05 0.04
## Cumulative Var 0.08 0.15 0.20 0.25 0.29
## Proportion Explained 0.28 0.22 0.19 0.16 0.15
## Cumulative Proportion 0.28 0.50 0.69 0.85 1.00
##
## Mean item complexity = 1.8
## Test of the hypothesis that 5 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 556 and the objective function was 9.91
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8191 with the empirical chi square 56272.09 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 80980.15 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.251
## RMSEA index = 0.133 and the 90 % confidence intervals are 0.132 0.134
## BIC = 75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy
## ML3 ML4 ML2 ML1 ML5
## Correlation of (regression) scores with factors 0.96 0.88 1.00 1.00 0.85
## Multiple R square of scores with factors 0.93 0.78 0.99 1.00 0.73
## Minimum correlation of possible factor scores 0.86 0.56 0.99 0.99 0.45
factor.plot(fa(dating12, nfactors=5, rotate="none", fm="ml"))
fa.diagram(fa(dating12, nfactors=5, rotate="none", fm="ml"))
#OR
#fa1 <- fa(dating12, nfactors=5, rotate="none", fm="ml")
#print(fa1$loadings,cutoff = 0.3)
-> Not good. Next model. Now with rotation.
fa(dating12, nfactors=5, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML2 ML4 ML5 ML1 h2 u2 com
## imprace 0.03 -0.05 0.09 0.25 0.00 0.073 0.9269 1.4
## imprelig -0.03 -0.05 -0.04 0.30 -0.04 0.096 0.9041 1.2
## date -0.07 -0.03 -0.35 0.09 -0.02 0.138 0.8619 1.2
## go_out 0.02 0.02 -0.30 -0.01 -0.06 0.097 0.9033 1.1
## sports -0.13 0.09 0.41 -0.16 0.01 0.214 0.7857 1.6
## tvsports -0.10 0.09 0.32 0.08 -0.13 0.143 0.8566 1.8
## exercise 0.00 -0.10 0.30 0.05 0.04 0.105 0.8951 1.3
## dining 0.39 -0.08 0.25 0.26 0.04 0.292 0.7077 2.7
## museums 0.92 -0.04 -0.04 0.00 -0.04 0.853 0.1467 1.0
## art 0.92 -0.04 0.01 -0.03 -0.05 0.843 0.1574 1.0
## hiking 0.22 0.06 0.03 -0.07 0.06 0.060 0.9401 1.5
## gaming -0.05 0.15 0.23 0.09 0.02 0.085 0.9146 2.2
## clubbing 0.12 -0.05 0.26 0.09 0.03 0.091 0.9089 1.8
## reading 0.30 -0.01 -0.14 0.04 0.02 0.108 0.8917 1.5
## tv 0.02 0.01 0.08 0.50 -0.03 0.262 0.7383 1.1
## theater 0.57 -0.08 -0.09 0.37 0.00 0.482 0.5176 1.8
## movies 0.34 0.00 -0.03 0.37 -0.02 0.252 0.7481 2.0
## concerts 0.42 -0.03 0.07 0.21 -0.07 0.233 0.7671 1.6
## music 0.30 -0.02 0.19 0.22 -0.02 0.171 0.8292 2.6
## shopping 0.24 -0.17 0.26 0.45 0.01 0.353 0.6467 2.5
## yoga 0.30 -0.03 0.09 0.16 -0.03 0.127 0.8726 1.8
## exphappy 0.07 0.16 0.27 -0.05 0.06 0.109 0.8912 1.9
## attr1_1 -0.13 -0.35 0.44 -0.45 -0.12 0.555 0.4454 3.2
## sinc1_1 -0.01 0.18 -0.35 0.20 0.01 0.196 0.8040 2.2
## intel1_1 0.13 -0.06 -0.26 0.01 -0.02 0.089 0.9106 1.6
## fun1_1 -0.04 0.06 0.12 -0.05 0.30 0.114 0.8858 1.5
## amb1_1 0.11 0.16 -0.02 0.44 0.08 0.239 0.7606 1.5
## shar1_1 0.04 0.31 -0.27 0.25 -0.11 0.245 0.7552 3.3
## attr2_1 0.05 -0.94 -0.05 0.02 -0.32 0.995 0.0049 1.2
## sinc2_1 -0.07 0.67 0.04 -0.03 -0.01 0.452 0.5481 1.0
## intel2_1 -0.04 0.60 0.15 -0.10 0.00 0.391 0.6086 1.2
## fun2_1 0.04 -0.10 -0.11 0.04 0.99 0.995 0.0050 1.1
## amb2_1 -0.10 0.59 0.16 -0.15 -0.13 0.419 0.5811 1.5
## shar2_1 0.09 0.52 -0.14 0.22 -0.04 0.342 0.6576 1.6
## attr3_1 0.11 0.01 0.53 0.06 0.04 0.302 0.6978 1.1
## sinc3_1 0.11 0.03 0.05 0.27 -0.02 0.087 0.9133 1.4
## intel3_1 0.04 0.09 0.32 0.06 -0.02 0.117 0.8835 1.3
## fun3_1 0.13 -0.03 0.59 0.20 -0.06 0.411 0.5893 1.4
## amb3_1 0.08 0.02 0.48 0.24 -0.05 0.296 0.7040 1.6
##
## ML3 ML2 ML4 ML5 ML1
## SS loadings 3.01 2.73 2.52 1.90 1.27
## Proportion Var 0.08 0.07 0.06 0.05 0.03
## Cumulative Var 0.08 0.15 0.21 0.26 0.29
## Proportion Explained 0.26 0.24 0.22 0.17 0.11
## Cumulative Proportion 0.26 0.50 0.72 0.89 1.00
##
## Mean item complexity = 1.6
## Test of the hypothesis that 5 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 556 and the objective function was 9.91
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8191 with the empirical chi square 56272.09 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 80980.15 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.251
## RMSEA index = 0.133 and the 90 % confidence intervals are 0.132 0.134
## BIC = 75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy
## ML3 ML2 ML4 ML5 ML1
## Correlation of (regression) scores with factors 0.96 1.00 0.88 0.85 1.00
## Multiple R square of scores with factors 0.93 0.99 0.78 0.73 0.99
## Minimum correlation of possible factor scores 0.85 0.98 0.57 0.46 0.99
factor.plot(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))
fa.diagram(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))
Here we see that:
library(GPArotation)
fa(dating12, nfactors=5, rotate="oblimin", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 5, rotate = "oblimin", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML3 ML4 ML5 ML1 h2 u2 com
## imprace -0.08 -0.05 0.21 0.18 0.01 0.073 0.9269 2.4
## imprelig -0.12 -0.12 0.11 0.29 -0.03 0.096 0.9041 2.0
## date -0.11 -0.07 -0.27 0.23 -0.01 0.138 0.8619 2.5
## go_out -0.03 0.05 -0.28 0.13 -0.06 0.097 0.9033 1.6
## sports 0.18 -0.10 0.27 -0.31 -0.02 0.214 0.7857 2.9
## tvsports 0.11 -0.13 0.31 -0.04 -0.15 0.143 0.8566 2.2
## exercise -0.05 -0.04 0.30 -0.10 0.06 0.105 0.8951 1.4
## dining -0.06 0.29 0.37 0.11 0.06 0.292 0.7077 2.2
## museums 0.01 0.93 -0.03 -0.01 0.00 0.853 0.1467 1.0
## art 0.01 0.93 0.00 -0.06 -0.02 0.843 0.1574 1.0
## hiking 0.09 0.24 0.00 -0.08 0.05 0.060 0.9401 1.6
## gaming 0.17 -0.08 0.24 0.00 -0.01 0.085 0.9146 2.1
## clubbing -0.01 0.07 0.28 -0.05 0.04 0.091 0.9089 1.2
## reading -0.02 0.29 -0.10 0.09 0.03 0.108 0.8917 1.4
## tv -0.08 -0.14 0.32 0.44 -0.03 0.262 0.7383 2.1
## theater -0.13 0.46 0.12 0.36 0.04 0.482 0.5176 2.2
## movies -0.06 0.23 0.16 0.35 -0.01 0.252 0.7481 2.3
## concerts -0.04 0.36 0.17 0.15 -0.05 0.233 0.7671 1.9
## music -0.01 0.22 0.28 0.11 -0.01 0.171 0.8292 2.3
## shopping -0.19 0.07 0.47 0.27 0.05 0.353 0.6467 2.1
## yoga -0.03 0.25 0.16 0.10 -0.01 0.127 0.8726 2.1
## exphappy 0.22 0.07 0.21 -0.15 0.03 0.109 0.8912 3.0
## attr1_1 -0.20 -0.04 0.17 -0.65 -0.07 0.555 0.4454 1.4
## sinc1_1 0.08 -0.04 -0.22 0.36 -0.02 0.196 0.8040 1.8
## intel1_1 -0.10 0.14 -0.23 0.11 -0.01 0.089 0.9106 2.7
## fun1_1 0.13 -0.05 0.10 -0.11 0.29 0.114 0.8858 2.1
## amb1_1 0.08 -0.01 0.20 0.43 0.06 0.239 0.7606 1.5
## shar1_1 0.19 0.01 -0.13 0.40 -0.16 0.245 0.7552 2.0
## attr2_1 -0.97 0.00 -0.01 -0.07 -0.16 0.995 0.0049 1.1
## sinc2_1 0.66 -0.02 -0.01 0.05 -0.13 0.452 0.5481 1.1
## intel2_1 0.62 0.01 0.06 -0.08 -0.11 0.391 0.6086 1.1
## fun2_1 0.02 -0.01 -0.01 -0.01 1.00 0.995 0.0050 1.0
## amb2_1 0.60 -0.03 0.03 -0.12 -0.23 0.419 0.5811 1.4
## shar2_1 0.43 0.06 -0.04 0.34 -0.12 0.342 0.6576 2.1
## attr3_1 0.10 0.06 0.51 -0.19 0.03 0.302 0.6978 1.4
## sinc3_1 -0.01 0.03 0.17 0.23 -0.02 0.087 0.9133 1.9
## intel3_1 0.14 0.01 0.31 -0.07 -0.04 0.117 0.8835 1.5
## fun3_1 0.03 0.03 0.62 -0.08 -0.06 0.411 0.5893 1.1
## amb3_1 0.05 -0.03 0.54 0.02 -0.05 0.296 0.7040 1.0
##
## ML2 ML3 ML4 ML5 ML1
## SS loadings 2.69 2.72 2.59 2.16 1.28
## Proportion Var 0.07 0.07 0.07 0.06 0.03
## Cumulative Var 0.07 0.14 0.21 0.26 0.29
## Proportion Explained 0.24 0.24 0.23 0.19 0.11
## Cumulative Proportion 0.24 0.47 0.70 0.89 1.00
##
## With factor correlations of
## ML2 ML3 ML4 ML5 ML1
## ML2 1.00 -0.13 -0.06 0.09 0.04
## ML3 -0.13 1.00 0.18 0.22 0.02
## ML4 -0.06 0.18 1.00 -0.03 -0.06
## ML5 0.09 0.22 -0.03 1.00 0.09
## ML1 0.04 0.02 -0.06 0.09 1.00
##
## Mean item complexity = 1.8
## Test of the hypothesis that 5 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 556 and the objective function was 9.91
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8191 with the empirical chi square 56272.09 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 80980.15 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.251
## RMSEA index = 0.133 and the 90 % confidence intervals are 0.132 0.134
## BIC = 75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy
## ML2 ML3 ML4 ML5 ML1
## Correlation of (regression) scores with factors 1.00 0.96 0.88 0.88 1.00
## Multiple R square of scores with factors 0.99 0.93 0.78 0.77 0.99
## Minimum correlation of possible factor scores 0.99 0.85 0.56 0.54 0.99
factor.plot(fa(dating12, nfactors=5, rotate="oblimin", fm="ml"))
fa.diagram(fa(dating12, nfactors=5, rotate="oblimin", fm="ml"))
Again, ML1 looks like a useless factor that explains only one variable… In general the results are nearly the same as in the 3rd model: every Proportion Var is still below 0.1 and Proportion Explained is almost unchanged (I still find the first 4 factors good and the last one superfluous, but it persists even if I reduce the number of factors), and RMSR is 0.07, above 0.05. And for some reason I don’t see any arrows showing the correlations between factors…
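A possible reason for the missing arrows, offered as an assumption about the plotting defaults rather than a confirmed diagnosis: fa.diagram() has a cut argument (0.3 by default, as far as I know) and seems to draw a curve between two factors only when their absolute correlation exceeds it. The sketch below re-enters the factor correlation matrix printed above by hand and compares it with that threshold; phi, cut and off_diag are illustrative names.

```r
# Factor correlations copied from the oblimin output above (rounded to 2 digits).
factors <- c("ML2", "ML3", "ML4", "ML5", "ML1")
phi <- matrix(c(
   1.00, -0.13, -0.06,  0.09,  0.04,
  -0.13,  1.00,  0.18,  0.22,  0.02,
  -0.06,  0.18,  1.00, -0.03, -0.06,
   0.09,  0.22, -0.03,  1.00,  0.09,
   0.04,  0.02, -0.06,  0.09,  1.00
), nrow = 5, byrow = TRUE, dimnames = list(factors, factors))

cut <- 0.3  # ASSUMPTION: default threshold of fa.diagram()

# Absolute correlations below the diagonal, one value per factor pair.
off_diag <- abs(phi[lower.tri(phi)])
largest <- max(off_diag)
arrows_expected <- sum(off_diag > cut)

print(largest)          # largest absolute factor correlation
print(arrows_expected)  # number of factor pairs above the threshold
```

If this reading is right, calling fa.diagram() on the stored oblimin model with a lower cut (for example cut = 0.2) should make the arrow between ML3 and ML5 appear, at the cost of also drawing more of the weak loadings.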
In my opinion the 3rd model is the best one here. It looks cleaner and is more interpretable than the first (of course, as it has far fewer factors), its Proportion Explained is better than in the 2nd model, and I see no reason to prefer the 4th model to the 3rd, because it brought no significant improvement.
I still don’t like the presence of ML1, but reducing the number of factors does not improve the situation. On the contrary, it gets worse: RMSR rises, the gap in Proportion Explained between factors widens, and Proportion Var does not improve.
Increasing the number of factors didn’t help either. So I guess 5 is the optimal number of factors, and varimax is the rotation to use.
fa(dating12, nfactors=4, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 4, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML2 ML4 ML1 h2 u2 com
## imprace 0.07 -0.06 0.10 0.01 0.0177 0.9823 2.4
## imprelig 0.02 -0.06 0.00 -0.02 0.0048 0.9952 1.3
## date -0.07 -0.03 -0.34 -0.02 0.1204 0.8796 1.1
## go_out 0.01 0.03 -0.32 -0.07 0.1086 0.8914 1.1
## sports -0.13 0.09 0.41 0.01 0.1897 0.8103 1.3
## tvsports -0.08 0.08 0.33 -0.12 0.1360 0.8640 1.5
## exercise 0.02 -0.11 0.31 0.05 0.1129 0.8871 1.3
## dining 0.44 -0.09 0.25 0.05 0.2640 0.7360 1.7
## museums 0.90 -0.02 -0.10 -0.04 0.8206 0.1794 1.0
## art 0.89 -0.03 -0.05 -0.06 0.7944 0.2056 1.0
## hiking 0.21 0.06 0.02 0.05 0.0515 0.9485 1.3
## gaming -0.02 0.15 0.22 0.02 0.0706 0.9294 1.8
## clubbing 0.14 -0.05 0.25 0.04 0.0869 0.9131 1.7
## reading 0.30 0.00 -0.14 0.02 0.1072 0.8928 1.4
## tv 0.09 -0.01 0.09 -0.01 0.0170 0.9830 2.0
## theater 0.62 -0.08 -0.10 0.01 0.3972 0.6028 1.1
## movies 0.39 -0.01 -0.04 -0.01 0.1523 0.8477 1.0
## concerts 0.46 -0.03 0.05 -0.07 0.2178 0.7822 1.1
## music 0.34 -0.02 0.18 -0.01 0.1486 0.8514 1.5
## shopping 0.31 -0.18 0.25 0.03 0.1898 0.8102 2.6
## yoga 0.33 -0.04 0.09 -0.02 0.1204 0.8796 1.2
## exphappy 0.07 0.16 0.26 0.05 0.0994 0.9006 1.9
## attr1_1 -0.17 -0.34 0.34 -0.13 0.2739 0.7261 2.7
## sinc1_1 0.00 0.18 -0.31 0.01 0.1255 0.8745 1.6
## intel1_1 0.12 -0.06 -0.24 -0.02 0.0769 0.9231 1.6
## fun1_1 -0.05 0.06 0.13 0.30 0.1156 0.8844 1.5
## amb1_1 0.17 0.14 0.04 0.09 0.0598 0.9402 2.7
## shar1_1 0.07 0.29 -0.21 -0.11 0.1478 0.8522 2.3
## attr2_1 0.06 -0.95 -0.05 -0.29 0.9951 0.0049 1.2
## sinc2_1 -0.08 0.67 0.04 -0.03 0.4524 0.5476 1.0
## intel2_1 -0.05 0.60 0.15 -0.02 0.3827 0.6173 1.1
## fun2_1 0.04 -0.07 -0.12 0.99 0.9950 0.0050 1.0
## amb2_1 -0.13 0.58 0.16 -0.15 0.4025 0.5975 1.4
## shar2_1 0.11 0.51 -0.10 -0.05 0.2813 0.7187 1.2
## attr3_1 0.15 0.00 0.55 0.05 0.3248 0.6752 1.2
## sinc3_1 0.15 0.02 0.09 -0.01 0.0315 0.9685 1.7
## intel3_1 0.06 0.09 0.35 -0.01 0.1344 0.8656 1.2
## fun3_1 0.18 -0.05 0.63 -0.04 0.4287 0.5713 1.2
## amb3_1 0.13 0.00 0.50 -0.03 0.2694 0.7306 1.1
##
## ML3 ML2 ML4 ML1
## SS loadings 3.27 2.72 2.47 1.26
## Proportion Var 0.08 0.07 0.06 0.03
## Cumulative Var 0.08 0.15 0.22 0.25
## Proportion Explained 0.34 0.28 0.25 0.13
## Cumulative Proportion 0.34 0.62 0.87 1.00
##
## Mean item complexity = 1.5
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 591 and the objective function was 10.99
##
## The root mean square of the residuals (RMSR) is 0.08
## The df corrected root mean square of the residuals is 0.09
##
## The harmonic number of observations is 8191 with the empirical chi square 76549.99 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 89841.06 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.219
## RMSEA index = 0.136 and the 90 % confidence intervals are 0.135 0.137
## BIC = 84515.68
## Fit based upon off diagonal values = 0.69
## Measures of factor score adequacy
## ML3 ML2 ML4 ML1
## Correlation of (regression) scores with factors 0.96 1.00 0.88 1.00
## Multiple R square of scores with factors 0.91 0.99 0.77 0.99
## Minimum correlation of possible factor scores 0.83 0.98 0.54 0.98
factor.plot(fa(dating12, nfactors=4, rotate="varimax", fm="ml"))
fa.diagram(fa(dating12, nfactors=4, rotate="varimax", fm="ml"))
fa(dating12, nfactors=6, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 6, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML5 ML3 ML6 ML4 ML2 ML1 h2 u2 com
## imprace 0.05 -0.05 0.18 0.06 -0.01 0.00 0.040 0.9603 1.6
## imprelig -0.01 -0.09 0.10 0.23 -0.12 -0.02 0.083 0.9169 2.4
## date -0.04 -0.08 -0.33 0.14 0.05 0.01 0.138 0.8615 1.6
## go_out 0.04 0.00 -0.33 -0.01 0.05 -0.04 0.115 0.8852 1.1
## sports -0.14 0.14 0.28 -0.10 0.14 -0.02 0.144 0.8562 2.9
## tvsports -0.08 0.11 0.24 0.04 0.16 -0.14 0.121 0.8788 3.3
## exercise 0.00 -0.08 0.30 0.01 0.06 0.03 0.098 0.9021 1.2
## dining 0.40 -0.07 0.34 0.06 -0.06 0.04 0.289 0.7114 2.1
## museums 0.91 -0.02 -0.03 -0.05 -0.09 -0.03 0.848 0.1517 1.0
## art 0.91 -0.02 -0.01 -0.07 -0.03 -0.04 0.832 0.1679 1.0
## hiking 0.22 0.06 -0.02 0.03 0.09 0.06 0.065 0.9346 1.7
## gaming -0.04 0.17 0.18 -0.03 0.06 0.01 0.069 0.9309 2.4
## clubbing 0.12 -0.03 0.24 0.01 0.06 0.03 0.079 0.9206 1.7
## reading 0.29 -0.02 -0.06 0.03 -0.17 0.02 0.117 0.8826 1.8
## tv 0.07 -0.03 0.17 0.20 0.06 0.00 0.075 0.9254 2.4
## theater 0.60 -0.10 0.03 0.12 -0.07 0.03 0.387 0.6133 1.2
## movies 0.36 -0.02 0.06 0.10 -0.06 0.00 0.150 0.8497 1.3
## concerts 0.45 -0.03 0.08 0.03 0.04 -0.06 0.211 0.7889 1.1
## music 0.32 -0.01 0.20 0.06 0.08 -0.01 0.152 0.8480 1.9
## shopping 0.27 -0.17 0.35 0.13 0.03 0.03 0.247 0.7527 2.8
## yoga 0.32 -0.04 0.13 0.11 0.02 -0.02 0.133 0.8673 1.7
## exphappy 0.06 0.19 0.19 -0.06 0.02 0.03 0.084 0.9156 2.5
## attr1_1 -0.15 -0.20 0.27 -0.85 0.34 -0.15 0.995 0.0050 1.9
## sinc1_1 0.03 0.09 -0.36 0.41 0.14 0.05 0.332 0.6682 2.4
## intel1_1 0.06 -0.07 -0.09 -0.02 -0.99 -0.07 0.995 0.0050 1.0
## fun1_1 -0.04 0.05 0.07 0.19 0.13 0.30 0.152 0.8479 2.3
## amb1_1 0.15 0.08 0.14 0.55 0.02 0.11 0.368 0.6317 1.4
## shar1_1 0.08 0.22 -0.22 0.45 0.10 -0.07 0.316 0.6837 2.2
## attr2_1 0.07 -0.94 0.03 -0.11 0.01 -0.30 0.995 0.0049 1.2
## sinc2_1 -0.07 0.66 -0.08 0.09 0.12 -0.02 0.469 0.5314 1.2
## intel2_1 -0.08 0.62 0.11 -0.03 -0.20 -0.04 0.446 0.5543 1.3
## fun2_1 0.03 -0.08 -0.05 -0.03 -0.07 0.99 0.995 0.0050 1.0
## amb2_1 -0.12 0.60 0.06 0.00 0.07 -0.15 0.407 0.5927 1.3
## shar2_1 0.11 0.47 -0.10 0.26 0.06 -0.02 0.306 0.6936 1.8
## attr3_1 0.10 0.07 0.56 -0.07 0.02 0.01 0.334 0.6664 1.1
## sinc3_1 0.13 0.00 0.12 0.23 0.04 0.00 0.085 0.9145 2.3
## intel3_1 0.02 0.12 0.39 0.02 -0.14 -0.04 0.186 0.8142 1.5
## fun3_1 0.14 0.00 0.62 0.14 0.16 -0.07 0.452 0.5477 1.4
## amb3_1 0.08 0.04 0.57 0.10 0.04 -0.06 0.346 0.6542 1.1
##
## ML5 ML3 ML6 ML4 ML2 ML1
## SS loadings 3.12 2.60 2.45 1.80 1.40 1.28
## Proportion Var 0.08 0.07 0.06 0.05 0.04 0.03
## Cumulative Var 0.08 0.15 0.21 0.26 0.29 0.32
## Proportion Explained 0.25 0.21 0.19 0.14 0.11 0.10
## Cumulative Proportion 0.25 0.45 0.65 0.79 0.90 1.00
##
## Mean item complexity = 1.7
## Test of the hypothesis that 6 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 522 and the objective function was 9.22
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8191 with the empirical chi square 57758.47 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 75386.35 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.258
## RMSEA index = 0.132 and the 90 % confidence intervals are 0.132 0.133
## BIC = 70682.71
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy
## ML5 ML3 ML6 ML4 ML2
## Correlation of (regression) scores with factors 0.96 1.00 0.88 0.99 1.00
## Multiple R square of scores with factors 0.92 0.99 0.78 0.97 0.99
## Minimum correlation of possible factor scores 0.85 0.99 0.56 0.94 0.98
## ML1
## Correlation of (regression) scores with factors 1.00
## Multiple R square of scores with factors 0.99
## Minimum correlation of possible factor scores 0.99
factor.plot(fa(dating12, nfactors=6, rotate="varimax", fm="ml"))
fa.diagram(fa(dating12, nfactors=6, rotate="varimax", fm="ml"))
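To make the comparison between the 4-, 5- and 6-factor varimax models easier to see, their fit statistics can be put side by side. The numbers below are simply copied from the printouts above; fit_summary and best_by_bic are illustrative names. With stored model objects the same values should be available as elements such as $rms, $TLI, $RMSEA and $BIC, but that is an assumption I have not checked here.

```r
# Fit statistics transcribed from the fa() printouts above (fm = "ml", varimax).
fit_summary <- data.frame(
  nfactors = c(4, 5, 6),
  RMSR     = c(0.08, 0.07, 0.07),
  TLI      = c(0.219, 0.251, 0.258),
  RMSEA    = c(0.136, 0.133, 0.132),
  BIC      = c(84515.68, 75970.16, 70682.71)
)

# Change in each index when one more factor is added (NA for the first row).
fit_summary$delta_BIC <- c(NA, diff(fit_summary$BIC))
fit_summary$delta_TLI <- c(NA, diff(fit_summary$TLI))

print(fit_summary)

# Number of factors with the lowest BIC among the three models.
best_by_bic <- fit_summary$nfactors[which.min(fit_summary$BIC)]
print(best_by_bic)
```

Going from 4 to 5 factors lowers BIC by about 8,500 and from 5 to 6 by about 5,300, while RMSR stops improving at 5 factors. By BIC alone the 6-factor model would come out ahead, so the preference for 5 factors is mostly an interpretability argument.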
Now let’s move on and try to explain what our factors mean.
fa(dating12, nfactors=5, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML2 ML4 ML5 ML1 h2 u2 com
## imprace 0.03 -0.05 0.09 0.25 0.00 0.073 0.9269 1.4
## imprelig -0.03 -0.05 -0.04 0.30 -0.04 0.096 0.9041 1.2
## date -0.07 -0.03 -0.35 0.09 -0.02 0.138 0.8619 1.2
## go_out 0.02 0.02 -0.30 -0.01 -0.06 0.097 0.9033 1.1
## sports -0.13 0.09 0.41 -0.16 0.01 0.214 0.7857 1.6
## tvsports -0.10 0.09 0.32 0.08 -0.13 0.143 0.8566 1.8
## exercise 0.00 -0.10 0.30 0.05 0.04 0.105 0.8951 1.3
## dining 0.39 -0.08 0.25 0.26 0.04 0.292 0.7077 2.7
## museums 0.92 -0.04 -0.04 0.00 -0.04 0.853 0.1467 1.0
## art 0.92 -0.04 0.01 -0.03 -0.05 0.843 0.1574 1.0
## hiking 0.22 0.06 0.03 -0.07 0.06 0.060 0.9401 1.5
## gaming -0.05 0.15 0.23 0.09 0.02 0.085 0.9146 2.2
## clubbing 0.12 -0.05 0.26 0.09 0.03 0.091 0.9089 1.8
## reading 0.30 -0.01 -0.14 0.04 0.02 0.108 0.8917 1.5
## tv 0.02 0.01 0.08 0.50 -0.03 0.262 0.7383 1.1
## theater 0.57 -0.08 -0.09 0.37 0.00 0.482 0.5176 1.8
## movies 0.34 0.00 -0.03 0.37 -0.02 0.252 0.7481 2.0
## concerts 0.42 -0.03 0.07 0.21 -0.07 0.233 0.7671 1.6
## music 0.30 -0.02 0.19 0.22 -0.02 0.171 0.8292 2.6
## shopping 0.24 -0.17 0.26 0.45 0.01 0.353 0.6467 2.5
## yoga 0.30 -0.03 0.09 0.16 -0.03 0.127 0.8726 1.8
## exphappy 0.07 0.16 0.27 -0.05 0.06 0.109 0.8912 1.9
## attr1_1 -0.13 -0.35 0.44 -0.45 -0.12 0.555 0.4454 3.2
## sinc1_1 -0.01 0.18 -0.35 0.20 0.01 0.196 0.8040 2.2
## intel1_1 0.13 -0.06 -0.26 0.01 -0.02 0.089 0.9106 1.6
## fun1_1 -0.04 0.06 0.12 -0.05 0.30 0.114 0.8858 1.5
## amb1_1 0.11 0.16 -0.02 0.44 0.08 0.239 0.7606 1.5
## shar1_1 0.04 0.31 -0.27 0.25 -0.11 0.245 0.7552 3.3
## attr2_1 0.05 -0.94 -0.05 0.02 -0.32 0.995 0.0049 1.2
## sinc2_1 -0.07 0.67 0.04 -0.03 -0.01 0.452 0.5481 1.0
## intel2_1 -0.04 0.60 0.15 -0.10 0.00 0.391 0.6086 1.2
## fun2_1 0.04 -0.10 -0.11 0.04 0.99 0.995 0.0050 1.1
## amb2_1 -0.10 0.59 0.16 -0.15 -0.13 0.419 0.5811 1.5
## shar2_1 0.09 0.52 -0.14 0.22 -0.04 0.342 0.6576 1.6
## attr3_1 0.11 0.01 0.53 0.06 0.04 0.302 0.6978 1.1
## sinc3_1 0.11 0.03 0.05 0.27 -0.02 0.087 0.9133 1.4
## intel3_1 0.04 0.09 0.32 0.06 -0.02 0.117 0.8835 1.3
## fun3_1 0.13 -0.03 0.59 0.20 -0.06 0.411 0.5893 1.4
## amb3_1 0.08 0.02 0.48 0.24 -0.05 0.296 0.7040 1.6
##
## ML3 ML2 ML4 ML5 ML1
## SS loadings 3.01 2.73 2.52 1.90 1.27
## Proportion Var 0.08 0.07 0.06 0.05 0.03
## Cumulative Var 0.08 0.15 0.21 0.26 0.29
## Proportion Explained 0.26 0.24 0.22 0.17 0.11
## Cumulative Proportion 0.26 0.50 0.72 0.89 1.00
##
## Mean item complexity = 1.6
## Test of the hypothesis that 5 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 556 and the objective function was 9.91
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8191 with the empirical chi square 56272.09 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 80980.15 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.251
## RMSEA index = 0.133 and the 90 % confidence intervals are 0.132 0.134
## BIC = 75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy
## ML3 ML2 ML4 ML5 ML1
## Correlation of (regression) scores with factors 0.96 1.00 0.88 0.85 1.00
## Multiple R square of scores with factors 0.93 0.99 0.78 0.73 0.99
## Minimum correlation of possible factor scores 0.85 0.98 0.57 0.46 0.99
factor.plot(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))
fa.diagram(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))
The first factor, ML3, seems to gather a respondent’s hobbies: museums, art, theater, concerts, dining, music, yoga, reading, hiking. Interestingly, most of these hobbies are rather “intellectual”. Things like shopping and clubbing fell to other factors.
The second one, ML2, collects the answers to the question “What do you think the opposite sex looks for in a date?”: attractive, sincere, intelligent, ambitious, has shared interests/hobbies. Somewhat unexpectedly, shar1_1 is here too (the answer “has shared interests/hobbies” to the similar question “What do you look for in the opposite sex?”).
Next, ML4: this factor explains a mix of answers to the question “How do you think you measure up?” (fun, attractive, ambitious, intelligent), active hobbies (sports, exercise, tvsports) and how often the respondent goes on dates. Maybe this is the part that explains how active people evaluate themselves and their dating activity?
Likes TV, shopping and movies; looks for ambition in a partner, while attractiveness loads negatively (attr1_1 = -0.45): that is the character described by the 4th factor, ML5.
The last factor, ML1, explains only 2 variables (according to the graph): it unites people who look for a fun person on a date and expect that the opposite sex is looking for fun, too. A closer look at the table shows that attr2_1 (the expectation that the opposite sex looks for attractiveness) also loads on this factor, but negatively (-0.32).
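The h2, u2 and com columns help to see why some variables, such as attr1_1 and shar1_1, are hard to assign to a single factor. As a small worked example, the sketch below recomputes them for three variables from the rounded varimax loadings printed above. It assumes that com is Hofmann’s complexity index, (sum of a^2)^2 / sum of a^4, and that h2 is the sum of squared loadings (which holds for an orthogonal rotation); loadings_5f is an illustrative name.

```r
# Varimax loadings copied from the 5-factor table above (rounded to 2 digits).
loadings_5f <- rbind(
  attr1_1 = c(-0.13, -0.35,  0.44, -0.45, -0.12),
  shar1_1 = c( 0.04,  0.31, -0.27,  0.25, -0.11),
  museums = c( 0.92, -0.04, -0.04,  0.00, -0.04)
)
colnames(loadings_5f) <- c("ML3", "ML2", "ML4", "ML5", "ML1")

# Communality: sum of squared loadings across factors (orthogonal rotation).
h2 <- rowSums(loadings_5f^2)

# ASSUMPTION: complexity is Hofmann's index, (sum a^2)^2 / sum a^4.
com <- rowSums(loadings_5f^2)^2 / rowSums(loadings_5f^4)

print(round(cbind(h2 = h2, u2 = 1 - h2, com = com), 2))
```

Because the inputs are already rounded, the results only approximately match the printed columns. A complexity near 1 (museums) means the variable belongs to essentially one factor, while a value above 3 (attr1_1, shar1_1) means its variance is spread over about three factors.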
Following a friend’s advice, I’ve also decided to try fewer factors. Here’s the version with 3 factors:
Here the factors simply split into one explaining hobbies and another explaining what the respondent expects the opposite sex to look for on a date.
fa(dating12, nfactors=3, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating12, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML2 ML1 h2 u2 com
## imprace 0.07 -0.04 0.01 0.0061 0.9939 1.7
## imprelig 0.02 -0.06 0.02 0.0047 0.9953 1.3
## date -0.05 -0.06 -0.03 0.0070 0.9930 2.3
## go_out 0.02 0.00 0.03 0.0014 0.9986 1.9
## sports -0.15 0.11 0.04 0.0379 0.9621 2.0
## tvsports -0.10 0.10 0.16 0.0444 0.9556 2.4
## exercise 0.01 -0.08 -0.02 0.0061 0.9939 1.1
## dining 0.42 -0.04 -0.02 0.1805 0.8195 1.0
## museums 0.90 0.02 0.05 0.8206 0.1794 1.0
## art 0.89 0.02 0.07 0.8022 0.1978 1.0
## hiking 0.21 0.08 -0.04 0.0514 0.9486 1.4
## gaming -0.04 0.17 0.01 0.0295 0.9705 1.1
## clubbing 0.13 -0.02 -0.01 0.0170 0.9830 1.1
## reading 0.30 0.01 -0.03 0.0917 0.9083 1.0
## tv 0.08 0.00 0.02 0.0075 0.9925 1.2
## theater 0.62 -0.05 -0.01 0.3881 0.6119 1.0
## movies 0.39 0.01 0.02 0.1490 0.8510 1.0
## concerts 0.45 0.00 0.08 0.2124 0.7876 1.1
## music 0.33 0.01 0.04 0.1085 0.8915 1.0
## shopping 0.30 -0.14 0.00 0.1078 0.8922 1.4
## yoga 0.33 -0.01 0.04 0.1075 0.8925 1.0
## exphappy 0.05 0.19 -0.02 0.0372 0.9628 1.1
## attr1_1 -0.16 -0.32 0.16 0.1540 0.8460 2.0
## sinc1_1 0.00 0.15 -0.04 0.0247 0.9753 1.2
## intel1_1 0.13 -0.07 0.00 0.0224 0.9776 1.6
## fun1_1 -0.05 0.08 -0.29 0.0907 0.9093 1.2
## amb1_1 0.15 0.16 -0.08 0.0554 0.9446 2.5
## shar1_1 0.06 0.28 0.09 0.0869 0.9131 1.3
## attr2_1 0.11 -0.95 0.28 0.9951 0.0049 1.2
## sinc2_1 -0.12 0.66 0.04 0.4518 0.5482 1.1
## intel2_1 -0.09 0.60 0.05 0.3750 0.6250 1.1
## fun2_1 0.07 -0.05 -0.99 0.9950 0.0050 1.0
## amb2_1 -0.17 0.58 0.17 0.3958 0.6042 1.3
## shar2_1 0.08 0.50 0.04 0.2594 0.7406 1.1
## attr3_1 0.11 0.06 0.02 0.0167 0.9833 1.6
## sinc3_1 0.14 0.04 0.02 0.0215 0.9785 1.2
## intel3_1 0.04 0.12 0.06 0.0185 0.9815 1.6
## fun3_1 0.14 0.01 0.12 0.0348 0.9652 2.0
## amb3_1 0.10 0.05 0.09 0.0213 0.9787 2.5
##
## ML3 ML2 ML1
## SS loadings 3.25 2.69 1.30
## Proportion Var 0.08 0.07 0.03
## Cumulative Var 0.08 0.15 0.19
## Proportion Explained 0.45 0.37 0.18
## Cumulative Proportion 0.45 0.82 1.00
##
## Mean item complexity = 1.4
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 741 and the objective function was 17.61 with Chi Square of 143979.6
## The degrees of freedom for the model are 627 and the objective function was 12.44
##
## The root mean square of the residuals (RMSR) is 0.1
## The df corrected root mean square of the residuals is 0.11
##
## The harmonic number of observations is 8191 with the empirical chi square 123002.3 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 101687.7 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.166
## RMSEA index = 0.14 and the 90 % confidence intervals are 0.14 0.141
## BIC = 96037.94
## Fit based upon off diagonal values = 0.51
## Measures of factor score adequacy
## ML3 ML2 ML1
## Correlation of (regression) scores with factors 0.96 1.00 1.00
## Multiple R square of scores with factors 0.92 0.99 0.99
## Minimum correlation of possible factor scores 0.83 0.99 0.99
factor.plot(fa(dating12, nfactors=3, rotate="varimax", fm="ml"))
fa.diagram(fa(dating12, nfactors=3, rotate="varimax", fm="ml"))
I still don’t like a factor that describes only 1-2 variables, so I will delete these variables (fun1_1, fun2_1) from the dataset. Let’s see what happens.
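Dropping two columns does not require retyping the whole variable list. A minimal sketch on a toy data frame (toy and its values are made up for illustration; in this analysis the same pattern would be applied to dating1):

```r
# Toy stand-in for dating1; the values are invented for illustration only.
toy <- data.frame(
  attr1_1 = c(20, 30, NA),
  fun1_1  = c(15, 10, 20),
  fun2_1  = c(25, 20, 15),
  amb1_1  = c(10, 5, 10)
)

drop_vars <- c("fun1_1", "fun2_1")

# Keep every column whose name is not in drop_vars, preserving the original order.
toy2 <- toy[, setdiff(names(toy), drop_vars), drop = FALSE]

# Listwise deletion of rows with missing values, as na.omit() is used elsewhere here.
toy22 <- na.omit(toy2)

print(names(toy22))
print(nrow(toy22))
```

This keeps the variable list in one place, so a typo in a retyped list cannot silently change which columns enter the model.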
# once again
dating2<- dating[c("imprace","imprelig", "date", "go_out", "sports",
"tvsports", "exercise", "dining" , "museums", "art",
"hiking", "gaming", "clubbing",
"reading", "tv", "theater", "movies", "concerts",
"music", "shopping", "yoga", "exphappy" , "attr1_1",
"sinc1_1", "intel1_1", "amb1_1",
"shar1_1", "attr2_1", "sinc2_1", "intel2_1",
"amb2_1", "shar2_1", "attr3_1", "sinc3_1",
"intel3_1", "fun3_1", "amb3_1")]
dating2 <- as.data.frame(dating2)
dating22 <- na.omit(dating2)
# and model:
fa(dating22, nfactors=5, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating22, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML3 ML1 ML4 ML5 ML2 h2 u2 com
## imprace -0.01 -0.04 0.09 0.24 0.04 0.066 0.934 1.4
## imprelig -0.04 -0.10 0.07 0.09 0.24 0.083 0.917 1.9
## date -0.06 -0.06 -0.34 0.00 0.11 0.138 0.862 1.4
## go_out 0.03 0.00 -0.35 0.02 -0.03 0.127 0.873 1.0
## sports -0.14 0.13 0.33 -0.03 -0.14 0.162 0.838 2.1
## tvsports -0.15 0.08 0.16 0.25 -0.05 0.120 0.880 2.9
## exercise -0.02 -0.07 0.32 0.06 -0.01 0.109 0.891 1.2
## dining 0.34 -0.06 0.26 0.30 0.08 0.284 0.716 3.1
## museums 0.92 -0.06 0.00 0.13 0.01 0.860 0.140 1.0
## art 0.90 -0.06 0.02 0.15 -0.03 0.830 0.170 1.1
## hiking 0.23 0.07 0.04 -0.04 0.01 0.059 0.941 1.3
## gaming -0.10 0.19 0.06 0.27 -0.07 0.126 0.874 2.5
## clubbing 0.07 -0.02 0.17 0.21 -0.02 0.079 0.921 2.2
## reading 0.32 -0.03 -0.01 -0.07 0.11 0.119 0.881 1.3
## tv -0.09 0.01 -0.08 0.62 0.12 0.418 0.582 1.2
## theater 0.51 -0.08 -0.12 0.43 0.13 0.486 0.514 2.3
## movies 0.26 0.00 -0.16 0.51 0.09 0.360 0.640 1.8
## concerts 0.36 -0.03 -0.06 0.40 0.00 0.292 0.708 2.1
## music 0.23 0.00 0.08 0.37 0.01 0.199 0.801 1.8
## shopping 0.13 -0.14 0.15 0.58 0.08 0.399 0.601 1.4
## yoga 0.28 -0.05 0.12 0.16 0.10 0.129 0.871 2.4
## exphappy 0.04 0.20 0.16 0.10 -0.07 0.082 0.918 2.9
## attr1_1 -0.15 -0.21 0.25 0.03 -0.93 0.995 0.005 1.3
## sinc1_1 -0.01 0.11 -0.35 0.05 0.35 0.256 0.744 2.2
## intel1_1 0.18 -0.14 -0.10 -0.22 0.32 0.211 0.789 3.1
## amb1_1 0.09 0.10 0.14 0.14 0.52 0.331 0.669 1.5
## shar1_1 0.06 0.18 -0.16 0.00 0.39 0.214 0.786 1.8
## attr2_1 0.03 -0.99 0.00 0.07 -0.14 0.995 0.005 1.1
## sinc2_1 -0.07 0.64 -0.07 0.03 0.04 0.416 0.584 1.1
## intel2_1 -0.02 0.57 0.09 -0.07 0.02 0.343 0.657 1.1
## amb2_1 -0.08 0.53 0.11 -0.11 -0.04 0.314 0.686 1.2
## shar2_1 0.09 0.44 -0.07 0.04 0.22 0.258 0.742 1.7
## attr3_1 0.09 0.05 0.62 0.08 -0.06 0.404 0.596 1.1
## sinc3_1 0.09 0.00 0.12 0.14 0.20 0.083 0.917 3.0
## intel3_1 0.04 0.08 0.47 -0.05 0.07 0.233 0.767 1.2
## fun3_1 0.07 -0.02 0.58 0.26 0.06 0.416 0.584 1.4
## amb3_1 0.03 0.02 0.54 0.22 0.07 0.349 0.651 1.4
##
## ML3 ML1 ML4 ML5 ML2
## SS loadings 2.70 2.47 2.23 2.12 1.82
## Proportion Var 0.07 0.07 0.06 0.06 0.05
## Cumulative Var 0.07 0.14 0.20 0.26 0.31
## Proportion Explained 0.24 0.22 0.20 0.19 0.16
## Cumulative Proportion 0.24 0.46 0.65 0.84 1.00
##
## Mean item complexity = 1.7
## Test of the hypothesis that 5 factors are sufficient.
##
## The degrees of freedom for the null model are 666 and the objective function was 12.76 with Chi Square of 104370.9
## The degrees of freedom for the model are 491 and the objective function was 4.94
##
## The root mean square of the residuals (RMSR) is 0.06
## The df corrected root mean square of the residuals is 0.07
##
## The harmonic number of observations is 8191 with the empirical chi square 44458.21 with prob < 0
## The total number of observations was 8191 with Likelihood Chi Square = 40406.21 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.478
## RMSEA index = 0.1 and the 90 % confidence intervals are 0.099 0.1
## BIC = 35981.91
## Fit based upon off diagonal values = 0.81
## Measures of factor score adequacy
## ML3 ML1 ML4 ML5 ML2
## Correlation of (regression) scores with factors 0.96 1.00 0.87 0.86 0.99
## Multiple R square of scores with factors 0.91 0.99 0.75 0.73 0.97
## Minimum correlation of possible factor scores 0.83 0.99 0.51 0.47 0.95
factor.plot(fa(dating22, nfactors=5, rotate="varimax", fm="ml"))
fa.diagram(fa(dating22, nfactors=5, rotate="varimax", fm="ml"))
THAT LOOKS MUCH BETTER!
However, I decided to go further and clean the set even more: I deleted all the variables that were not used in the models with 3 and 4 factors. Then I constructed models on that set with 3, 4 and 5 factors and chose the best option.
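One way to make "not used" concrete is to flag every variable whose largest absolute loading stays below a cutoff. The sketch below is only an illustration: `weak_items()` and the 0.3 cutoff are my own assumptions, not necessarily the rule applied above, and the rows are a few loadings copied from the five-factor output printed just above (the selection in the text was based on the 3- and 4-factor models).

```r
# Hypothetical helper: names of variables whose strongest loading is below `cutoff`.
weak_items <- function(loadings, cutoff = 0.3) {
  L <- unclass(loadings)               # plain matrix; also strips a "loadings" class
  strongest <- apply(abs(L), 1, max)   # largest absolute loading per variable
  names(strongest)[strongest < cutoff]
}

# A few rows copied from the five-factor output above (one column per factor).
L5 <- rbind(
  amb1_1   = c(0.09, 0.10,  0.14,  0.14,  0.52),
  shar1_1  = c(0.06, 0.18, -0.16,  0.00,  0.39),
  sinc3_1  = c(0.09, 0.00,  0.12,  0.14,  0.20),
  attr3_1  = c(0.09, 0.05,  0.62,  0.08, -0.06),
  intel3_1 = c(0.04, 0.08,  0.47, -0.05,  0.07)
)

weak_items(L5)
# -> "sinc3_1"
```

On a fitted model the same helper could be given the `$loadings` element of the `fa()` result.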
# let's clean it up last time
dating3<- dating[c("imprelig", "date", "go_out", "sports", "tvsports", "exercise", "dining" , "museums", "art", "tv", "theater", "movies", "concerts", "music", "shopping", "yoga", "attr1_1", "sinc1_1", "amb1_1", "attr2_1", "sinc2_1", "intel2_1", "amb2_1", "shar2_1", "attr3_1", "intel3_1", "fun3_1", "amb3_1")]
dating3 <- as.data.frame(dating3)
dating33 <- na.omit(dating3)
# model with 5 factors:
fa(dating33, nfactors=5, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating33, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML4 ML3 ML5 h2 u2 com
## imprelig -0.01 -0.08 0.02 -0.05 0.28 0.088 0.912 1.3
## date -0.07 -0.05 -0.36 -0.01 0.10 0.147 0.853 1.3
## go_out 0.04 0.00 -0.34 -0.01 -0.02 0.119 0.881 1.0
## sports -0.15 0.12 0.35 0.04 -0.23 0.219 0.781 2.5
## tvsports -0.13 0.06 0.24 0.12 -0.05 0.095 0.905 2.4
## exercise 0.00 -0.06 0.33 -0.02 0.03 0.114 0.886 1.1
## dining 0.38 -0.05 0.26 0.15 0.23 0.289 0.711 2.9
## museums 0.92 -0.03 -0.01 0.13 -0.01 0.873 0.127 1.0
## art 0.90 -0.04 0.02 0.17 -0.04 0.838 0.162 1.1
## tv 0.01 -0.01 0.03 0.16 0.39 0.177 0.823 1.4
## theater 0.54 -0.08 -0.10 0.30 0.31 0.490 0.510 2.4
## movies 0.27 -0.02 -0.08 0.39 0.26 0.299 0.701 2.7
## concerts 0.27 -0.02 -0.04 0.82 -0.03 0.752 0.248 1.2
## music 0.17 0.01 0.12 0.72 0.01 0.567 0.433 1.2
## shopping 0.21 -0.15 0.23 0.22 0.39 0.315 0.685 3.4
## yoga 0.27 -0.04 0.10 0.19 0.14 0.142 0.858 2.8
## attr1_1 -0.15 -0.33 0.36 -0.01 -0.54 0.552 0.448 2.7
## sinc1_1 -0.01 0.16 -0.40 0.07 0.18 0.226 0.774 1.8
## amb1_1 0.12 0.15 0.06 -0.03 0.58 0.382 0.618 1.2
## attr2_1 0.05 -1.00 0.00 0.04 -0.03 0.995 0.005 1.0
## sinc2_1 -0.06 0.63 -0.04 -0.02 -0.03 0.408 0.592 1.0
## intel2_1 -0.06 0.58 0.10 0.02 -0.08 0.356 0.644 1.1
## amb2_1 -0.09 0.53 0.13 -0.06 -0.10 0.315 0.685 1.3
## shar2_1 0.08 0.46 -0.10 0.05 0.21 0.273 0.727 1.6
## attr3_1 0.09 0.03 0.61 -0.01 0.05 0.382 0.618 1.1
## intel3_1 0.04 0.10 0.43 -0.07 0.05 0.201 0.799 1.2
## fun3_1 0.09 -0.03 0.58 0.14 0.15 0.393 0.607 1.3
## amb3_1 0.05 0.01 0.54 0.08 0.25 0.360 0.640 1.5
##
## ML2 ML1 ML4 ML3 ML5
## SS loadings 2.51 2.45 2.22 1.69 1.50
## Proportion Var 0.09 0.09 0.08 0.06 0.05
## Cumulative Var 0.09 0.18 0.26 0.32 0.37
## Proportion Explained 0.24 0.24 0.21 0.16 0.14
## Cumulative Proportion 0.24 0.48 0.69 0.86 1.00
##
## Mean item complexity = 1.7
## Test of the hypothesis that 5 factors are sufficient.
##
## The degrees of freedom for the null model are 378 and the objective function was 9.48 with Chi Square of 77958.42
## The degrees of freedom for the model are 248 and the objective function was 2.48
##
## The root mean square of the residuals (RMSR) is 0.06
## The df corrected root mean square of the residuals is 0.07
##
## The harmonic number of observations is 8235 with the empirical chi square 19097.03 with prob < 0
## The total number of observations was 8235 with Likelihood Chi Square = 20383.11 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.604
## RMSEA index = 0.099 and the 90 % confidence intervals are 0.098 0.1
## BIC = 18147.1
## Fit based upon off diagonal values = 0.9
## Measures of factor score adequacy
## ML2 ML1 ML4 ML3 ML5
## Correlation of (regression) scores with factors 0.96 1.00 0.88 0.89 0.84
## Multiple R square of scores with factors 0.92 0.99 0.77 0.79 0.70
## Minimum correlation of possible factor scores 0.83 0.99 0.53 0.59 0.40
#factor.plot(fa(dating33, nfactors=5, rotate="varimax", fm="ml"))
fa.diagram(fa(dating33, nfactors=5, rotate="varimax", fm="ml"))
# model with 4 factors:
fa(dating33, nfactors=4, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating33, nfactors = 4, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 ML4 h2 u2 com
## imprelig -0.01 -0.06 0.01 0.20 0.044 0.956 1.2
## date -0.07 -0.04 -0.37 0.07 0.147 0.853 1.2
## go_out 0.02 0.00 -0.35 -0.01 0.125 0.875 1.0
## sports -0.15 0.10 0.39 -0.15 0.207 0.793 1.8
## tvsports -0.09 0.05 0.26 0.09 0.085 0.915 1.6
## exercise 0.01 -0.06 0.33 0.03 0.114 0.886 1.1
## dining 0.44 -0.03 0.23 0.22 0.293 0.707 2.1
## museums 0.90 -0.03 -0.05 -0.13 0.840 0.160 1.1
## art 0.92 -0.03 -0.01 -0.16 0.865 0.135 1.1
## tv 0.11 0.02 -0.01 0.54 0.304 0.696 1.1
## theater 0.63 -0.05 -0.15 0.34 0.530 0.470 1.7
## movies 0.39 0.01 -0.11 0.42 0.340 0.660 2.1
## concerts 0.46 -0.03 -0.01 0.21 0.256 0.744 1.4
## music 0.35 0.00 0.13 0.23 0.191 0.809 2.1
## shopping 0.32 -0.12 0.19 0.49 0.389 0.611 2.2
## yoga 0.33 -0.03 0.09 0.13 0.134 0.866 1.5
## attr1_1 -0.17 -0.37 0.36 -0.29 0.376 0.624 3.3
## sinc1_1 0.01 0.17 -0.39 0.16 0.209 0.791 1.7
## amb1_1 0.16 0.18 0.02 0.33 0.171 0.829 2.0
## attr2_1 0.07 -0.99 -0.02 0.04 0.995 0.005 1.0
## sinc2_1 -0.07 0.63 -0.03 -0.03 0.406 0.594 1.0
## intel2_1 -0.06 0.57 0.13 -0.07 0.349 0.651 1.2
## amb2_1 -0.12 0.51 0.14 -0.13 0.316 0.684 1.4
## shar2_1 0.10 0.47 -0.10 0.12 0.260 0.740 1.3
## attr3_1 0.12 0.03 0.59 0.02 0.366 0.634 1.1
## intel3_1 0.04 0.09 0.42 -0.02 0.188 0.812 1.1
## fun3_1 0.17 -0.03 0.58 0.18 0.397 0.603 1.4
## amb3_1 0.11 0.02 0.51 0.21 0.314 0.686 1.4
##
## ML2 ML1 ML3 ML4
## SS loadings 3.13 2.45 2.20 1.43
## Proportion Var 0.11 0.09 0.08 0.05
## Cumulative Var 0.11 0.20 0.28 0.33
## Proportion Explained 0.34 0.27 0.24 0.16
## Cumulative Proportion 0.34 0.61 0.84 1.00
##
## Mean item complexity = 1.5
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 378 and the objective function was 9.48 with Chi Square of 77958.42
## The degrees of freedom for the model are 272 and the objective function was 3.09
##
## The root mean square of the residuals (RMSR) is 0.06
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8235 with the empirical chi square 25624.7 with prob < 0
## The total number of observations was 8235 with Likelihood Chi Square = 25387.3 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.55
## RMSEA index = 0.106 and the 90 % confidence intervals are 0.105 0.107
## BIC = 22934.91
## Fit based upon off diagonal values = 0.86
## Measures of factor score adequacy
## ML2 ML1 ML3 ML4
## Correlation of (regression) scores with factors 0.97 1.00 0.87 0.84
## Multiple R square of scores with factors 0.93 0.99 0.76 0.70
## Minimum correlation of possible factor scores 0.86 0.99 0.51 0.41
#factor.plot(fa(dating33, nfactors=4, rotate="varimax", fm="ml"))
fa.diagram(fa(dating33, nfactors=4, rotate="varimax", fm="ml"))
# model with 3 factors:
fa(dating33, nfactors=3, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating33, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 h2 u2 com
## imprelig 0.00 -0.07 0.04 0.007 0.993 1.6
## date -0.09 -0.04 -0.35 0.131 0.869 1.2
## go_out 0.00 0.01 -0.36 0.127 0.873 1.0
## sports -0.13 0.09 0.37 0.164 0.836 1.4
## tvsports -0.08 0.04 0.27 0.079 0.921 1.2
## exercise 0.03 -0.07 0.33 0.115 0.885 1.1
## dining 0.45 -0.03 0.23 0.258 0.742 1.5
## museums 0.90 0.01 -0.12 0.819 0.181 1.0
## art 0.90 0.00 -0.09 0.812 0.188 1.0
## tv 0.10 0.00 0.05 0.013 0.987 1.4
## theater 0.62 -0.04 -0.13 0.398 0.602 1.1
## movies 0.39 0.01 -0.07 0.154 0.846 1.1
## concerts 0.46 -0.02 -0.01 0.215 0.785 1.0
## music 0.36 0.00 0.14 0.148 0.852 1.3
## shopping 0.33 -0.14 0.21 0.169 0.831 2.1
## yoga 0.34 -0.02 0.09 0.123 0.877 1.2
## attr1_1 -0.14 -0.37 0.29 0.236 0.764 2.2
## sinc1_1 -0.02 0.18 -0.35 0.151 0.849 1.5
## amb1_1 0.15 0.17 0.08 0.059 0.941 2.4
## attr2_1 0.10 -0.99 -0.05 0.995 0.005 1.0
## sinc2_1 -0.10 0.63 -0.01 0.406 0.594 1.0
## intel2_1 -0.07 0.57 0.14 0.345 0.655 1.1
## amb2_1 -0.13 0.51 0.14 0.299 0.701 1.3
## shar2_1 0.08 0.47 -0.06 0.234 0.766 1.1
## attr3_1 0.15 0.02 0.59 0.372 0.628 1.1
## intel3_1 0.06 0.08 0.43 0.194 0.806 1.1
## fun3_1 0.20 -0.05 0.59 0.393 0.607 1.2
## amb3_1 0.14 0.00 0.53 0.297 0.703 1.1
##
## ML2 ML1 ML3
## SS loadings 3.13 2.43 2.15
## Proportion Var 0.11 0.09 0.08
## Cumulative Var 0.11 0.20 0.28
## Proportion Explained 0.41 0.32 0.28
## Cumulative Proportion 0.41 0.72 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 378 and the objective function was 9.48 with Chi Square of 77958.42
## The degrees of freedom for the model are 297 and the objective function was 3.99
##
## The root mean square of the residuals (RMSR) is 0.08
## The df corrected root mean square of the residuals is 0.09
##
## The harmonic number of observations is 8235 with the empirical chi square 40758.25 with prob < 0
## The total number of observations was 8235 with Likelihood Chi Square = 32769.1 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.467
## RMSEA index = 0.115 and the 90 % confidence intervals are 0.114 0.116
## BIC = 30091.31
## Fit based upon off diagonal values = 0.78
## Measures of factor score adequacy
## ML2 ML1 ML3
## Correlation of (regression) scores with factors 0.96 1.00 0.87
## Multiple R square of scores with factors 0.91 0.99 0.75
## Minimum correlation of possible factor scores 0.83 0.99 0.50
#factor.plot(fa(dating33, nfactors=3, rotate="varimax", fm="ml"))
fa.diagram(fa(dating33, nfactors=3, rotate="varimax", fm="ml"))
Well, it’s much better in my opinion. I’ll stay with the model with 3 factors: there the proportion of variance of the first factor gets to 0.11, which is not so bad; the proportion explained per factor is not really good, but a bit higher than in the two other models (it can be seen especially well from the diagram), although these shares always sum to 1, so fewer factors will always get bigger ones. Cumulative variance is lower than in the other models (0.28 against 0.33 and 0.37), but I still like it more.
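To keep the three candidate models side by side, the fit indices printed in the outputs above can be put into one small table. This is just a sketch: the numbers are copied by hand from the `fa()` printouts for `dating33`, and the data frame `fit` is a name I made up.

```r
# Fit indices copied from the fa() outputs above (dating33, varimax, ml).
fit <- data.frame(
  factors = c(5, 4, 3),
  TLI     = c(0.604, 0.550, 0.467),
  RMSEA   = c(0.099, 0.106, 0.115),
  RMSR    = c(0.06, 0.06, 0.08),
  BIC     = c(18147.10, 22934.91, 30091.31)
)

# Lower BIC, RMSEA and RMSR are better; higher TLI is better.
fit[order(fit$BIC), ]

best_by_bic <- fit$factors[which.min(fit$BIC)]
best_by_tli <- fit$factors[which.max(fit$TLI)]
c(best_by_bic = best_by_bic, best_by_tli = best_by_tli)
```

On these indices the 5-factor model comes out ahead, and none of the three reaches the commonly quoted rules of thumb (TLI around 0.9 or higher, RMSEA around 0.08 or lower). So staying with 3 factors is a choice for simpler, more readable factors rather than for the best statistical fit.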
Last but not least, I’ve cleaned it again. In the final model we have better cumulative variance (0.32) and proportion variance (from 0.09 to 0.13). Proportion explained varies from 0.27 to 0.41 (taking into account that with three factors an even split would be 0.33, it’s quite a good result!). RMSR is still not the best, but I guess we’ve already got used to that.
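For reference, these variance rows are simple arithmetic on the "SS loadings" line of the final model's output: proportion variance divides each factor's SS loadings by the number of variables, and proportion explained divides it by the total SS loadings of the extracted factors. A small sketch, with the numbers copied from that output and variable names of my own choosing:

```r
# SS loadings of the final 3-factor model (copied from its fa() output).
ss_loadings <- c(ML2 = 3.03, ML1 = 2.37, ML3 = 1.97)
n_items <- 23   # number of variables in dating4

prop_var       <- ss_loadings / n_items            # "Proportion Var"
cum_var        <- cumsum(prop_var)                 # "Cumulative Var"
prop_explained <- ss_loadings / sum(ss_loadings)   # "Proportion Explained"

round(rbind(prop_var, cum_var, prop_explained), 2)
```

Rounded to two digits this gives 0.13, 0.10, 0.09 for proportion variance, 0.32 for the final cumulative variance and 0.41, 0.32, 0.27 for proportion explained, matching the printed table.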
# last time
dating4<- dating[c( "date", "go_out", "sports", "exercise", "dining" , "museums", "art", "theater", "movies", "concerts", "music", "shopping", "yoga", "attr1_1", "attr2_1", "sinc2_1", "intel2_1", "amb2_1", "shar2_1", "attr3_1", "intel3_1", "fun3_1", "amb3_1")]
dating4 <- as.data.frame(dating4)
dating44 <- na.omit(dating4)
# model with 3 factors:
fa(dating44, nfactors=3, rotate="varimax", fm="ml")
## Factor Analysis using method = ml
## Call: fa(r = dating44, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 h2 u2 com
## date -0.07 -0.04 -0.35 0.13 0.869 1.1
## go_out 0.02 0.01 -0.35 0.12 0.876 1.0
## sports -0.14 0.09 0.33 0.14 0.863 1.5
## exercise 0.01 -0.07 0.31 0.10 0.898 1.1
## dining 0.43 -0.04 0.27 0.26 0.741 1.7
## museums 0.91 0.01 -0.07 0.83 0.168 1.0
## art 0.91 0.00 -0.04 0.82 0.178 1.0
## theater 0.61 -0.05 -0.08 0.39 0.615 1.0
## movies 0.38 0.00 -0.04 0.15 0.854 1.0
## concerts 0.46 -0.02 0.03 0.21 0.788 1.0
## music 0.35 0.00 0.17 0.15 0.850 1.5
## shopping 0.30 -0.14 0.23 0.16 0.837 2.3
## yoga 0.33 -0.03 0.13 0.12 0.876 1.3
## attr1_1 -0.15 -0.36 0.25 0.21 0.788 2.2
## attr2_1 0.10 -0.99 -0.04 1.00 0.005 1.0
## sinc2_1 -0.09 0.63 -0.01 0.40 0.596 1.0
## intel2_1 -0.07 0.57 0.12 0.34 0.657 1.1
## amb2_1 -0.13 0.51 0.11 0.29 0.705 1.2
## shar2_1 0.08 0.47 -0.05 0.23 0.769 1.1
## attr3_1 0.11 0.02 0.62 0.40 0.605 1.1
## intel3_1 0.04 0.09 0.44 0.20 0.796 1.1
## fun3_1 0.16 -0.05 0.62 0.42 0.585 1.1
## amb3_1 0.10 0.00 0.53 0.29 0.714 1.1
##
## ML2 ML1 ML3
## SS loadings 3.03 2.37 1.97
## Proportion Var 0.13 0.10 0.09
## Cumulative Var 0.13 0.23 0.32
## Proportion Explained 0.41 0.32 0.27
## Cumulative Proportion 0.41 0.73 1.00
##
## Mean item complexity = 1.2
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 253 and the objective function was 7.49 with Chi Square of 61673.47
## The degrees of freedom for the model are 187 and the objective function was 2.26
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.08
##
## The harmonic number of observations is 8245 with the empirical chi square 19574.7 with prob < 0
## The total number of observations was 8245 with Likelihood Chi Square = 18584.33 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.595
## RMSEA index = 0.109 and the 90 % confidence intervals are 0.108 0.111
## BIC = 16898.08
## Fit based upon off diagonal values = 0.87
## Measures of factor score adequacy
## ML2 ML1 ML3
## Correlation of (regression) scores with factors 0.96 1.00 0.86
## Multiple R square of scores with factors 0.92 0.99 0.74
## Minimum correlation of possible factor scores 0.84 0.99 0.48
factor.plot(fa(dating44, nfactors=3, rotate="varimax", fm="ml"))
fa.diagram(fa(dating44, nfactors=3, rotate="varimax", fm="ml"))
And a quick look at our factors:
The first factor, ML2, collects hobbies, mostly artistic and cultural ones (museums, art, theater, concerts, dining, movies, music). Yoga and shopping also go there.
Then ML1, as in the previous models, holds the answers to the question “What do you think the opposite sex looks for in a date?”: attractive, sincere, intelligent, ambitious, has shared interests/hobbies. attr1_1 is here, too (it is the “attractiveness” answer to the related question “What do you look for in the opposite sex?”).
The last one, ML3, is a mix of answers to the question “How do you think you measure up?” (attractive, fun, ambitious, intelligent) and some active pursuits (sports, exercise, going out and dates).
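Let’s test the reliability of the scale ML3: dining, museums, art, theater, movies, concerts, music, yoga. All variables should point in the same direction (all positive or all negative), so any item that runs the other way has to be reversed first.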
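Reversing an item just means flipping it around the middle of its scale, so that high becomes low and the other way round. A minimal sketch, assuming an item answered on a 1 to 7 scale (as `go_out` and `date` appear to be, judging by the frequency tables further down); `reverse_score()` is a made-up helper, not a psych function:

```r
# Hypothetical helper: flip an item around the midpoint of its scale.
reverse_score <- function(x, lo = min(x, na.rm = TRUE), hi = max(x, na.rm = TRUE)) {
  (hi + lo) - x
}

go_out_toy <- c(1, 2, 2, 3, 7)             # toy answers on an assumed 1..7 scale
reverse_score(go_out_toy, lo = 1, hi = 7)
# -> 7 6 6 5 1
```

Passing `lo` and `hi` explicitly is safer than relying on the observed minimum and maximum, because a sample may not use the whole scale.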
Internal consistency: Cronbach’s alpha
ML3<- as.data.frame(dating12[c("dining", "museums",
"art", "theater",
"movies", "concerts",
"music", "yoga")])
alpha(ML3)
##
## Reliability analysis
## Call: alpha(x = ML3)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.79 0.8 0.83 0.33 3.9 0.0035 6.9 1.3 0.29
##
## lower alpha upper 95% confidence boundaries
## 0.78 0.79 0.8
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## dining 0.78 0.79 0.83 0.35 3.8 0.0037 0.030 0.29
## museums 0.74 0.75 0.76 0.30 3.0 0.0043 0.017 0.29
## art 0.74 0.75 0.76 0.30 3.0 0.0044 0.018 0.25
## theater 0.75 0.76 0.79 0.31 3.1 0.0042 0.028 0.29
## movies 0.77 0.78 0.81 0.34 3.6 0.0038 0.030 0.29
## concerts 0.76 0.76 0.78 0.31 3.2 0.0041 0.028 0.29
## music 0.78 0.78 0.80 0.34 3.6 0.0037 0.025 0.29
## yoga 0.81 0.80 0.84 0.37 4.1 0.0032 0.027 0.33
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## dining 8191 0.51 0.53 0.41 0.37 7.8 1.7
## museums 8191 0.76 0.75 0.77 0.66 7.0 2.0
## art 8191 0.77 0.76 0.78 0.66 6.7 2.2
## theater 8191 0.72 0.72 0.67 0.59 6.8 2.2
## movies 8191 0.57 0.60 0.51 0.45 7.9 1.7
## concerts 8191 0.69 0.70 0.66 0.56 6.9 2.1
## music 8191 0.56 0.59 0.52 0.43 7.9 1.7
## yoga 8191 0.54 0.48 0.34 0.32 4.4 2.7
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 7 8 9 10 miss
## dining 0.00 0.00 0.01 0.02 0.08 0.09 0.18 0.23 0.20 0.19 0
## museums 0.01 0.01 0.05 0.06 0.10 0.11 0.22 0.19 0.15 0.10 0
## art 0.01 0.02 0.07 0.06 0.12 0.11 0.16 0.21 0.11 0.11 0
## theater 0.02 0.02 0.05 0.06 0.12 0.11 0.20 0.16 0.15 0.11 0
## movies 0.00 0.01 0.01 0.02 0.04 0.07 0.18 0.25 0.23 0.18 0
## concerts 0.01 0.03 0.05 0.06 0.11 0.14 0.18 0.18 0.14 0.10 0
## music 0.00 0.00 0.00 0.03 0.07 0.09 0.19 0.20 0.20 0.21 0
## yoga 0.19 0.14 0.13 0.09 0.10 0.10 0.10 0.06 0.05 0.04 0
A Cronbach’s alpha above 0.7 is usually taken to indicate good reliability. We have 0.79, which is quite good.
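To see where these numbers come from, here is a small sketch of both versions of alpha written out by hand. Raw alpha uses the item variances and the variance of the sum score; standardized alpha only needs the number of items and their average correlation. The function names and the toy data are my own assumptions; the 8 items and `average_r` = 0.33 are taken from the output above.

```r
# Raw Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the total score).
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  (k / (k - 1)) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
}

# Standardized alpha from the number of items k and the average inter-item correlation.
std_alpha_from_r <- function(k, r_bar) {
  k * r_bar / (1 + (k - 1) * r_bar)
}

# Check against the output above: 8 items, average_r = 0.33.
round(std_alpha_from_r(8, 0.33), 2)
# -> 0.8

# Toy data: five noisy measurements of one simulated trait.
set.seed(1)
trait <- rnorm(500)
toy <- sapply(1:5, function(i) trait + rnorm(500))
toy_alpha <- cronbach_alpha(toy)
```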
The task is:
ML2<- as.data.frame(dating44[c("dining", "museums", "art", "theater", "movies", "concerts", "music", "yoga", "shopping")])
alpha(ML2, check.keys=TRUE)
##
## Reliability analysis
## Call: alpha(x = ML2, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.8 0.81 0.84 0.32 4.2 0.0034 6.8 1.3 0.28
##
## lower alpha upper 95% confidence boundaries
## 0.79 0.8 0.8
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## dining 0.78 0.80 0.83 0.33 3.9 0.0037 0.026 0.28
## museums 0.76 0.77 0.78 0.29 3.3 0.0041 0.014 0.26
## art 0.75 0.77 0.78 0.29 3.3 0.0041 0.015 0.25
## theater 0.76 0.77 0.81 0.30 3.4 0.0040 0.023 0.25
## movies 0.78 0.79 0.82 0.32 3.8 0.0037 0.025 0.26
## concerts 0.77 0.78 0.80 0.30 3.5 0.0039 0.022 0.26
## music 0.78 0.79 0.81 0.32 3.8 0.0036 0.022 0.29
## yoga 0.80 0.81 0.84 0.35 4.2 0.0033 0.023 0.30
## shopping 0.79 0.80 0.83 0.33 4.0 0.0034 0.025 0.29
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## dining 8245 0.56 0.57 0.48 0.44 7.8 1.8
## museums 8245 0.73 0.74 0.76 0.64 7.0 2.0
## art 8245 0.74 0.74 0.76 0.64 6.7 2.2
## theater 8245 0.71 0.72 0.68 0.60 6.8 2.2
## movies 8245 0.56 0.60 0.52 0.46 7.9 1.7
## concerts 8245 0.67 0.68 0.65 0.55 6.8 2.1
## music 8245 0.56 0.58 0.53 0.44 7.9 1.8
## yoga 8245 0.52 0.48 0.35 0.33 4.4 2.7
## shopping 8245 0.56 0.53 0.43 0.38 5.6 2.6
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 7 8 9 10 miss
## dining 0.00 0.00 0.01 0.02 0.08 0.09 0.18 0.23 0.20 0.18 0
## museums 0.01 0.01 0.05 0.06 0.10 0.11 0.22 0.19 0.15 0.10 0
## art 0.01 0.03 0.07 0.06 0.12 0.11 0.16 0.21 0.11 0.12 0
## theater 0.02 0.02 0.05 0.06 0.12 0.11 0.20 0.16 0.15 0.11 0
## movies 0.00 0.01 0.01 0.02 0.04 0.07 0.18 0.25 0.23 0.18 0
## concerts 0.01 0.03 0.05 0.06 0.11 0.14 0.19 0.18 0.14 0.10 0
## music 0.00 0.00 0.01 0.03 0.07 0.09 0.19 0.20 0.20 0.22 0
## yoga 0.19 0.15 0.13 0.09 0.10 0.10 0.10 0.06 0.05 0.04 0
## shopping 0.06 0.11 0.07 0.09 0.14 0.12 0.14 0.10 0.10 0.07 0
ML1<- as.data.frame(dating44[c("attr1_1", "attr2_1", "sinc2_1", "intel2_1", "amb2_1", "shar2_1")])
alpha(ML1, check.keys=TRUE)
## Warning in alpha(ML1, check.keys = TRUE): Some items were negatively correlated with total scale and were automatically reversed.
## This is indicated by a negative sign for the variable name.
##
## Reliability analysis
## Call: alpha(x = ML1, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.69 0.69 0.79 0.27 2.3 0.004 33 6.1 0.23
##
## lower alpha upper 95% confidence boundaries
## 0.68 0.69 0.7
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## attr1_1- 0.70 0.69 0.80 0.31 2.26 0.0030 0.049 0.24
## attr2_1- 0.46 0.48 0.45 0.16 0.93 0.0090 0.010 0.18
## sinc2_1 0.64 0.65 0.71 0.27 1.86 0.0048 0.044 0.29
## intel2_1 0.66 0.67 0.73 0.29 2.03 0.0046 0.041 0.26
## amb2_1 0.67 0.69 0.74 0.31 2.22 0.0044 0.039 0.28
## shar2_1 0.67 0.69 0.74 0.31 2.21 0.0045 0.042 0.24
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## attr1_1- 8245 0.59 0.53 0.35 0.31 78 12.0
## attr2_1- 8245 0.94 0.94 1.01 0.83 70 16.0
## sinc2_1 8245 0.63 0.64 0.58 0.49 13 7.0
## intel2_1 8245 0.56 0.59 0.51 0.42 14 6.2
## amb2_1 8245 0.51 0.54 0.45 0.35 12 6.9
## shar2_1 8245 0.52 0.54 0.45 0.38 12 6.2
ML3<- as.data.frame(dating44[c("attr3_1", "intel3_1", "fun3_1", "amb3_1", "date", "go_out", "sports", "exercise")])
alpha(ML3, check.keys=TRUE)
## Warning in alpha(ML3, check.keys = TRUE): Some items were negatively correlated with total scale and were automatically reversed.
## This is indicated by a negative sign for the variable name.
##
## Reliability analysis
## Call: alpha(x = ML3, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.63 0.67 0.67 0.2 2 0.0059 7.3 0.93 0.16
##
## lower alpha upper 95% confidence boundaries
## 0.62 0.63 0.65
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## attr3_1 0.58 0.61 0.60 0.18 1.5 0.0067 0.0095 0.15
## intel3_1 0.61 0.64 0.64 0.20 1.8 0.0063 0.0105 0.16
## fun3_1 0.58 0.62 0.61 0.19 1.6 0.0066 0.0100 0.15
## amb3_1 0.59 0.63 0.63 0.19 1.7 0.0065 0.0119 0.16
## date- 0.62 0.65 0.65 0.21 1.9 0.0061 0.0124 0.16
## go_out- 0.62 0.65 0.65 0.21 1.9 0.0062 0.0118 0.16
## sports 0.61 0.65 0.64 0.21 1.9 0.0064 0.0113 0.18
## exercise 0.61 0.66 0.64 0.22 1.9 0.0065 0.0100 0.18
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## attr3_1 8245 0.59 0.65 0.60 0.45 7.1 1.4
## intel3_1 8245 0.45 0.54 0.44 0.33 8.4 1.1
## fun3_1 8245 0.57 0.62 0.55 0.41 7.7 1.6
## amb3_1 8245 0.57 0.59 0.50 0.38 7.6 1.8
## date- 8245 0.44 0.50 0.38 0.27 6.0 1.4
## go_out- 8245 0.42 0.50 0.38 0.28 8.8 1.1
## sports 8245 0.63 0.50 0.38 0.34 6.4 2.6
## exercise 8245 0.61 0.49 0.37 0.34 6.2 2.4
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 7 8 9 10 miss
## attr3_1 0.00 0.00 0.02 0.03 0.08 0.13 0.35 0.27 0.09 0.03 0
## intel3_1 0.00 0.00 0.00 0.00 0.01 0.02 0.14 0.35 0.32 0.16 0
## fun3_1 0.00 0.01 0.01 0.01 0.04 0.11 0.20 0.28 0.22 0.11 0
## amb3_1 0.00 0.01 0.02 0.03 0.08 0.08 0.20 0.25 0.20 0.14 0
## date 0.01 0.04 0.09 0.25 0.19 0.25 0.17 0.00 0.00 0.00 0
## go_out 0.31 0.36 0.24 0.05 0.02 0.01 0.00 0.00 0.00 0.00 0
## sports 0.04 0.06 0.08 0.07 0.10 0.09 0.14 0.16 0.13 0.13 0
## exercise 0.04 0.06 0.07 0.07 0.13 0.14 0.15 0.16 0.11 0.08 0
As I don’t have many factors, only 3, I ran the analysis on all of them.
We see that the first factor (ML2) is the best: its standardized alpha is 0.81, which is very nice. The second factor (ML1) has std.alpha 0.69, just under the usual 0.7 mark but still pretty good, and the last one (ML3) has std.alpha 0.67.
Then we look at the reliability if an item is dropped.
Let’s focus on our first factor (cultural hobbies): its standardized alpha is 0.81, keep that in mind. If we look at std.alpha with each item dropped, in general it decreases. Only for yoga and shopping does it barely move (0.81 and 0.80), which could be expected: they are not the most “cultural” hobbies, so they add little to the scale. The difference is small in any case.
Now let’s have a look at the next factor, ML1, which is about the expectations of the opposite sex (what they look for): as we remember, the standardized alpha there is smaller but still OK, 0.69. Dropping any single item gives the same or a lower std.alpha (it falls to 0.48 without attr2_1), so no item is hurting the scale. Great!
And the last factor, ML3, about the respondents themselves (how they rate themselves and how active their lives are): it has the lowest std.alpha, 0.67 (close to the result for the previous factor, but still the worst of the three). As for the reliability if an item is dropped: every item gives a lower std.alpha when dropped. Very nice.
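The "reliability if an item is dropped" table is nothing more than alpha recomputed with each item left out in turn. A sketch on simulated data (four items that measure one trait plus one pure-noise item), using raw alpha rather than the std.alpha discussed above; both helper functions are hypothetical names of mine:

```r
# Raw Cronbach's alpha for a matrix or data frame of items.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  (k / (k - 1)) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
}

# Alpha recomputed with each item removed in turn.
alpha_if_dropped <- function(items) {
  items <- as.matrix(items)
  sapply(colnames(items), function(nm) {
    cronbach_alpha(items[, colnames(items) != nm, drop = FALSE])
  })
}

set.seed(7)
n <- 1000
trait <- rnorm(n)
toy <- cbind(
  i1 = trait + rnorm(n),
  i2 = trait + rnorm(n),
  i3 = trait + rnorm(n),
  i4 = trait + rnorm(n),
  noise = rnorm(n)          # unrelated to the trait
)

overall <- cronbach_alpha(toy)
dropped <- alpha_if_dropped(toy)
round(c(overall = overall, dropped), 2)
```

Reading the result works the same way as above: if alpha goes up when an item is removed (here the `noise` item), that item is weakening the scale; if it goes down, the item is pulling its weight.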
All in all, I think the results are not so bad. The factors are easy to interpret, the variables within each do not contradict each other, and even the explained variance of the model is not that bad. Perfect! Now let’s save it in a new data frame (spoiler: it will be called “dating_fa”).
If you need to save the factor scores and use them in later analysis, for example as predictors in a regression, you can do it like this.
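# 5-factor model on the first cleaned set, keeping the factor scores
fa1<-fa(dating12, nfactors=5, rotate="varimax", fm="ml", scores=TRUE)
load <- fa1$loadings[,1:2] # loadings on the first two factors
plot(load) # set up plot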
fascores<-as.data.frame(fa1$scores)
datingfa<-cbind(dating12,fascores)
#names(datingfa)
# final 3-factor model, keeping the factor scores
fa2<-fa(dating44, nfactors=3, rotate="varimax", fm="ml", scores=TRUE)
load <- fa2$loadings[,1:2] # loadings on the first two factors
plot(load)
fascores<-as.data.frame(fa2$scores)
dating_fa<-cbind(dating44,fascores) # now we have all our factor scores in a data frame, hurray!
#names(dating_fa)
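As a follow-up, this is roughly what "include in a regression" could look like. So that it runs without the Kaggle file, the sketch uses simulated data and base R's `factanal()` instead of `psych::fa()`. The variables `a1` to `b3` and `outcome` are all invented for the illustration; with the real data the score columns of `dating_fa` would play the role of `Factor1` and `Factor2`.

```r
set.seed(42)
n <- 400
f1 <- rnorm(n)   # two simulated latent traits
f2 <- rnorm(n)

# Six observed items, three per trait (all invented).
toy <- data.frame(
  a1 = f1 + rnorm(n, sd = 0.6),
  a2 = f1 + rnorm(n, sd = 0.6),
  a3 = f1 + rnorm(n, sd = 0.6),
  b1 = f2 + rnorm(n, sd = 0.6),
  b2 = f2 + rnorm(n, sd = 0.6),
  b3 = f2 + rnorm(n, sd = 0.6)
)
outcome <- 0.8 * f1 + rnorm(n)   # hypothetical outcome driven by the first trait

# Factor analysis with regression scores (stats::factanal, not psych::fa).
fit <- factanal(toy, factors = 2, rotation = "varimax", scores = "regression")
toy_fa <- cbind(toy, as.data.frame(fit$scores), outcome = outcome)

# Use the saved factor scores as predictors.
model <- lm(outcome ~ Factor1 + Factor2, data = toy_fa)
summary(model)$r.squared
```

Keep in mind that `na.omit()` dropped rows before the factor analysis, so any outcome taken from the original `dating` data frame would have to be matched to the remaining rows (for example by row names) before fitting such a model.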
~ done :)