Data and question

Speed Dating dataset (Kaggle): “What influences love at first sight?” Read about the experiment: https://www.kaggle.com/annavictoria/speed-dating-experiment

dating <- read.csv("Speed Dating Data.csv")
#names(dating)
# Select the variables that we expect to load on common factors.
dating1 <- dating[c("imprace","imprelig", "date", "go_out", "sports", 
                   "tvsports", "exercise",  "dining" , "museums",  "art",  
                   "hiking", "gaming",  "clubbing",  
                   "reading", "tv",  "theater", "movies",  "concerts",   
                   "music",   "shopping",   "yoga", "exphappy" , "attr1_1",
                   "sinc1_1",   "intel1_1", "fun1_1",   "amb1_1",   
                   "shar1_1", "attr2_1", "sinc2_1",   "intel2_1",   
                   "fun2_1",   "amb2_1",   "shar2_1",   "attr3_1",   "sinc3_1",
                   "intel3_1",   "fun3_1",   "amb3_1")]
dating1 <- as.data.frame(dating1)
#dim(dating1)                
# summary(dating1)

Research question: are there latent factors that explain the correlations among the observed variables?

Part 1

NB:

If you have ordinal or binary variables, there are two ways to handle them:

  • create a correlation matrix in advance with the hetcor function, which selects the appropriate type of correlation for each pair of variables;
  • or pass cor = "mixed" to fa.
library(polycor)
dat.cor <- hetcor(dating1)
dat.cor<- dat.cor$correlations
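To make the first option concrete, here is a minimal sketch of handing a precomputed correlation matrix to a factor analysis. It uses simulated data (the names sim, f1, f2 and the loading 0.8 are made up for illustration) and base R's factanal instead of psych's fa, so that it runs without extra packages. fa takes a correlation matrix in a similar way through its r and n.obs arguments.

```r
set.seed(2)

# Simulated data, NOT the dating data: 6 variables driven by 2 latent factors
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
sim <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x3 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x6 = 0.8 * f2 + rnorm(n, sd = 0.6)
)

# Step 1: build the correlation matrix up front.
# For ordinal or binary columns this is where hetcor() would replace cor().
R <- cor(sim)

# Step 2: factor-analyse the matrix; n.obs is needed for the fit statistics
fit <- factanal(covmat = R, factors = 2, n.obs = n, rotation = "none")

# Show only loadings with absolute value above 0.3
print(fit$loadings, cutoff = 0.3)
```

The point of the two-step form is that the choice of correlation type is made once, explicitly, and the factor analysis then works on whatever matrix it is given.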
  • Check the type of each variable.
  • Deal with missing values (NA) before calling fa; here rows with any NA are simply dropped.
dating12 <- na.omit(dating1)
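The two checks above can be sketched on a small made-up data frame (toy and its values are hypothetical, used only so that the snippet runs on its own). The same three calls apply to dating1 unchanged.

```r
# Toy data frame standing in for dating1 (values are invented for illustration)
toy <- data.frame(
  sports  = c(7, 3, NA, 9, 5),
  museums = c(8, NA, 6, 7, 4),
  date    = c(1, 4, 6, 2, 7)
)

# 1. Storage type of each column
col_types <- sapply(toy, class)
print(col_types)

# 2. Number of missing values per column
na_per_col <- colSums(is.na(toy))
print(na_per_col)

# 3. Rows lost by listwise deletion with na.omit()
toy_complete <- na.omit(toy)
rows_dropped <- nrow(toy) - nrow(toy_complete)
print(rows_dropped)
```

Counting the dropped rows before going further is worthwhile: na.omit removes a whole row if even one variable is missing, so a few sparse columns can cost a lot of observations.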
library(psych)
## 
## Attaching package: 'psych'
## The following object is masked from 'package:polycor':
## 
##     polyserial
fa.parallel(dating12, fa="both", n.iter=100) 

## Parallel analysis suggests that the number of factors =  15  and the number of components =  13

How many factors should be extracted? Interpret the Parallel Analysis scree plot.

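Parallel analysis compares the eigenvalues of the actual data with eigenvalues obtained from random data of the same size. The suggested number of factors is the number of actual-data points that lie above the corresponding random-data line. Note that the black horizontal line, drawn at an eigenvalue of 1, corresponds to the Kaiser criterion and not to the parallel-analysis cutoff. Here parallel analysis suggests 15 factors. The higher a point sits, the larger its eigenvalue, that is, the more of the variables’ variance that factor accounts for.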
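The idea behind parallel analysis can be sketched in base R. This is a simplified, components-only version on simulated data (sim, f1, f2, n_iter and the 0.95 quantile are choices made for this illustration), not a reproduction of what fa.parallel does internally.

```r
set.seed(1)

# Simulated data, NOT the dating data: 6 variables driven by 2 latent factors
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
sim <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x3 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x6 = 0.8 * f2 + rnorm(n, sd = 0.6)
)

# Eigenvalues of the observed correlation matrix
obs_eig <- eigen(cor(sim), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues of uncorrelated random data of the same shape, repeated n_iter times
n_iter <- 100
rand_eig <- replicate(n_iter, {
  r <- matrix(rnorm(n * ncol(sim)), nrow = n)
  eigen(cor(r), symmetric = TRUE, only.values = TRUE)$values
})

# Reference line: 95th percentile of the random eigenvalues at each position
rand_ref <- apply(rand_eig, 1, quantile, probs = 0.95)

# Keep the leading components whose eigenvalue beats the random reference
n_suggested <- sum(cumprod(obs_eig > rand_ref))
print(n_suggested)
```

With two simulated factors the rule recovers two components. On real data with many weakly correlated variables and a large sample, as here, the same rule can suggest a large number of factors, because even small eigenvalues exceed what pure noise produces.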

Let’s first try the maximum suggested number of factors.

[1] No rotation

fa(dating12, nfactors=15, rotate="none", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 15, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML1   ML2  ML11  ML10  ML12  ML13   ML4   ML5   ML3  ML14  ML15
## imprace  -0.06  0.01 -0.03  0.14  0.16 -0.08  0.05  0.02  0.07  0.29 -0.09
## imprelig  0.01  0.19 -0.09  0.04  0.10  0.01  0.14 -0.02  0.08  0.21  0.01
## date      0.04  0.24 -0.09 -0.04 -0.06 -0.14 -0.02  0.03 -0.08 -0.06 -0.19
## go_out    0.06  0.12  0.03 -0.01 -0.09 -0.18 -0.05 -0.11 -0.14 -0.02 -0.19
## sports    0.02 -0.23 -0.04 -0.03  0.20  0.72 -0.14 -0.06  0.00  0.11 -0.29
## tvsports  0.06 -0.14 -0.01  0.04  0.35  0.31 -0.18 -0.12 -0.03  0.26 -0.30
## exercise -0.11 -0.04 -0.05  0.12  0.12  0.39 -0.01  0.03  0.04  0.19 -0.09
## dining   -0.06  0.06  0.25  0.31  0.14  0.03  0.10  0.03  0.07  0.12  0.23
## museums  -0.03  0.18  0.80  0.43 -0.15  0.03  0.12  0.03  0.04  0.08 -0.04
## art      -0.04  0.13  0.73  0.43 -0.10  0.01  0.07  0.03  0.06  0.00  0.00
## hiking    0.08  0.04  0.20  0.11 -0.02  0.25 -0.03  0.03 -0.06 -0.16 -0.09
## gaming    0.12 -0.17  0.00  0.05  0.28  0.00 -0.10 -0.04  0.02  0.08 -0.09
## clubbing -0.06 -0.07  0.06  0.19  0.13  0.07 -0.03  0.05 -0.02 -0.01  0.06
## reading   0.02  0.14  0.24  0.08 -0.09 -0.01  0.16  0.02  0.02  0.03  0.12
## tv        0.06  0.11 -0.01  0.14  0.51 -0.38 -0.02 -0.01 -0.02  0.54 -0.13
## theater  -0.01  0.24  0.39  0.37  0.16 -0.22  0.14  0.03 -0.01  0.01  0.01
## movies    0.03  0.14  0.27  0.20  0.36 -0.26  0.08  0.01 -0.05  0.02 -0.02
## concerts  0.00  0.12  0.42  0.24  0.54 -0.04  0.04 -0.11  0.04 -0.49 -0.05
## music    -0.01  0.06  0.29  0.21  0.50  0.01  0.03 -0.04  0.03 -0.36  0.10
## shopping -0.14  0.08  0.09  0.35  0.35 -0.20  0.05  0.06 -0.01  0.27  0.09
## yoga      0.00  0.12  0.20  0.20  0.10  0.06  0.06  0.00  0.05 -0.08  0.11
## exphappy  0.12 -0.20  0.13  0.01  0.17  0.18 -0.07  0.03  0.00 -0.01  0.01
## attr1_1  -0.64 -0.55  0.03 -0.03  0.00  0.00 -0.41 -0.17  0.06  0.00  0.00
## sinc1_1   0.41  0.43  0.02 -0.04  0.00  0.00 -0.04 -0.24 -0.74  0.00  0.00
## intel1_1 -0.02  0.07  0.08 -0.30  0.03  0.01  0.53  0.06  0.03  0.01  0.01
## fun1_1    0.05 -0.19  0.14 -0.24  0.00  0.00  0.18  0.58 -0.03  0.00  0.00
## amb1_1    0.25  0.21 -0.38  0.74 -0.04  0.00  0.21  0.20  0.10 -0.03 -0.02
## shar1_1   0.50  0.49  0.03 -0.04  0.00  0.00 -0.12 -0.22  0.61  0.00  0.00
## attr2_1  -0.87  0.38 -0.01  0.00  0.00  0.00  0.02 -0.20 -0.04  0.00  0.00
## sinc2_1   0.68 -0.25  0.00  0.01  0.00  0.00 -0.55  0.06 -0.17  0.00  0.00
## intel2_1  0.50 -0.61  0.00  0.02  0.00  0.00  0.48 -0.32  0.01  0.00  0.00
## fun2_1    0.05  0.05  0.01  0.02  0.01  0.00  0.21  0.73 -0.14  0.00  0.00
## amb2_1    0.36 -0.34  0.04 -0.02  0.01 -0.01 -0.03 -0.06  0.11  0.00  0.00
## shar2_1   0.53  0.26 -0.01 -0.02  0.00  0.00 -0.12  0.06  0.32  0.00  0.00
## attr3_1  -0.08 -0.20  0.03  0.22  0.12  0.29 -0.07  0.08  0.09  0.17  0.45
## sinc3_1   0.12  0.20  0.09  0.09  0.16  0.17  0.02 -0.12 -0.21  0.11  0.24
## intel3_1  0.05 -0.10  0.04  0.01  0.09  0.27  0.03 -0.04  0.14  0.20  0.41
## fun3_1   -0.08 -0.14  0.09  0.23  0.25  0.26 -0.02  0.03  0.04  0.15  0.44
## amb3_1   -0.05 -0.14 -0.11  0.39  0.20  0.21  0.00  0.01  0.13  0.11  0.28
##            ML9   ML7   ML8   ML6    h2     u2 com
## imprace   0.01  0.02  0.02 -0.01 0.155 0.8451 2.9
## imprelig -0.01  0.01  0.05 -0.02 0.130 0.8702 4.4
## date      0.04 -0.03  0.01  0.02 0.144 0.8561 3.9
## go_out   -0.07 -0.04 -0.04  0.00 0.138 0.8623 5.9
## sports    0.07 -0.02 -0.06  0.10 0.751 0.2491 2.0
## tvsports  0.15 -0.05  0.03 -0.04 0.478 0.5217 5.7
## exercise  0.04  0.01 -0.02  0.05 0.248 0.7518 2.5
## dining   -0.04  0.13  0.02 -0.02 0.298 0.7025 4.9
## museums  -0.11  0.10  0.13 -0.04 0.938 0.0616 2.0
## art      -0.04  0.14  0.11 -0.05 0.790 0.2102 2.0
## hiking    0.06  0.03 -0.04  0.04 0.165 0.8350 4.5
## gaming    0.00 -0.01 -0.07 -0.04 0.156 0.8445 3.3
## clubbing  0.06  0.02  0.00 -0.06 0.085 0.9147 4.3
## reading  -0.13  0.05  0.06  0.02 0.156 0.8440 5.0
## tv        0.09  0.01  0.01 -0.03 0.756 0.2445 3.3
## theater  -0.04  0.18  0.06 -0.01 0.483 0.5174 4.6
## movies   -0.02  0.08  0.08 -0.02 0.353 0.6471 4.3
## concerts  0.03  0.08 -0.02  0.01 0.810 0.1901 3.7
## music     0.08  0.04 -0.01  0.08 0.537 0.4629 3.3
## shopping  0.08  0.07  0.04 -0.04 0.419 0.5807 4.8
## yoga      0.08  0.08  0.04  0.01 0.147 0.8532 5.4
## exphappy  0.04  0.04 -0.02 -0.01 0.141 0.8585 4.9
## attr1_1  -0.06  0.05 -0.16  0.22 0.995 0.0050 3.4
## sinc1_1   0.00 -0.04 -0.10  0.17 0.995 0.0050 2.7
## intel1_1 -0.54  0.20  0.38 -0.34 0.979 0.0211 4.6
## fun1_1    0.69 -0.10  0.03 -0.06 0.978 0.0217 2.8
## amb1_1    0.02 -0.10  0.18 -0.09 0.943 0.0572 2.7
## shar1_1   0.01 -0.07 -0.24 -0.10 0.995 0.0051 3.8
## attr2_1   0.09  0.08  0.04 -0.18 0.995 0.0050 1.6
## sinc2_1   0.00  0.12  0.06 -0.34 0.995 0.0050 3.0
## intel2_1  0.07  0.14 -0.11 -0.03 0.995 0.0050 3.7
## fun2_1   -0.21  0.05 -0.57  0.11 0.986 0.0137 2.5
## amb2_1   -0.09 -0.79  0.24  0.19 0.986 0.0135 2.3
## shar2_1   0.03  0.35  0.27  0.57 0.992 0.0085 4.4
## attr3_1  -0.01  0.09  0.10  0.10 0.472 0.5276 4.1
## sinc3_1   0.02  0.02 -0.02  0.10 0.266 0.7345 7.7
## intel3_1 -0.11 -0.04  0.08  0.03 0.343 0.6571 3.4
## fun3_1    0.32 -0.04  0.08  0.00 0.546 0.4542 4.9
## amb3_1    0.01 -0.04  0.13 -0.01 0.398 0.6022 4.6
## 
##                        ML1  ML2 ML11 ML10 ML12 ML13  ML4  ML5  ML3 ML14
## SS loadings           2.90 2.16 2.12 1.99 1.64 1.52 1.32 1.28 1.25 1.15
## Proportion Var        0.07 0.06 0.05 0.05 0.04 0.04 0.03 0.03 0.03 0.03
## Cumulative Var        0.07 0.13 0.18 0.24 0.28 0.32 0.35 0.38 0.42 0.44
## Proportion Explained  0.13 0.10 0.10 0.09 0.07 0.07 0.06 0.06 0.06 0.05
## Cumulative Proportion 0.13 0.23 0.32 0.41 0.49 0.56 0.62 0.67 0.73 0.78
##                       ML15  ML9  ML7  ML8  ML6
## SS loadings           1.11 1.07 0.98 0.85 0.79
## Proportion Var        0.03 0.03 0.03 0.02 0.02
## Cumulative Var        0.47 0.50 0.53 0.55 0.57
## Proportion Explained  0.05 0.05 0.04 0.04 0.04
## Cumulative Proportion 0.83 0.88 0.93 0.96 1.00
## 
## Mean item complexity =  3.8
## Test of the hypothesis that 15 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 261  and the objective function was  1.4 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.05 
## 
## The harmonic number of observations is  8191 with the empirical chi square  11661.7  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  11459.12  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.778
## RMSEA index =  0.072  and the 90 % confidence intervals are  0.071 0.074
## BIC =  9107.3
## Fit based upon off diagonal values = 0.95
## Measures of factor score adequacy             
##                                                   ML1  ML2 ML11 ML10 ML12
## Correlation of (regression) scores with factors     1 1.00 0.97 0.98 0.91
## Multiple R square of scores with factors            1 1.00 0.95 0.96 0.83
## Minimum correlation of possible factor scores       1 0.99 0.90 0.92 0.67
##                                                   ML13  ML4  ML5  ML3 ML14
## Correlation of (regression) scores with factors    0.9 1.00 1.00 1.00 0.88
## Multiple R square of scores with factors           0.8 0.99 0.99 1.00 0.78
## Minimum correlation of possible factor scores      0.6 0.99 0.98 0.99 0.57
##                                                   ML15  ML9  ML7  ML8  ML6
## Correlation of (regression) scores with factors   0.83 0.99 0.99 0.99 0.99
## Multiple R square of scores with factors          0.69 0.98 0.99 0.99 0.99
## Minimum correlation of possible factor scores     0.37 0.96 0.97 0.97 0.98

We start with the variables that have the highest communality (h2). Here these are attr1_1, attr2_1, sinc1_1, sinc2_1, shar1_1 and intel2_1, each with h2 = 0.995 (shar2_1 follows closely at 0.992). The model itself is not very good. First, there are too many factors, which makes it hard to interpret. Second, the factors explain little: Proportion Var is low for every factor (at most 0.07), and all 15 factors together account for only 57% of the total variance. As for Proportion Explained, the gap between the first and the last factor is modest (0.13 versus 0.04), which might look acceptable, but the diagram below shows that some factors explain only one variable, which is a bad sign. RMSR is 0.03, which is actually fine. In any case, this is the model without rotation, so let’s try another one. First, let’s reduce the number of factors.
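To see where these numbers come from, the summary rows can be recomputed by hand. The sketch below copies the SS loadings and the attr2_1 loadings from the output above (rounded to two decimals, so results match the printout only approximately); the object names are made up for this illustration.

```r
# SS loadings of the 15 unrotated factors, copied from the output above
ss <- c(2.90, 2.16, 2.12, 1.99, 1.64, 1.52, 1.32, 1.28, 1.25, 1.15,
        1.11, 1.07, 0.98, 0.85, 0.79)
n_vars <- 39  # number of observed variables in the analysis

prop_var  <- ss / n_vars       # "Proportion Var": share of total variance per factor
cum_var   <- cumsum(prop_var)  # "Cumulative Var"
prop_expl <- ss / sum(ss)      # "Proportion Explained": share of the explained variance

print(round(prop_var[1], 2))          # first factor, compare with 0.07 in the output
print(round(cum_var[15], 2))          # all factors, compare with 0.57
print(round(prop_expl[c(1, 15)], 2))  # first and last, compare with 0.13 and 0.04

# Communality of attr2_1: sum of its squared loadings across the 15 factors
attr2_1 <- c(-0.87, 0.38, -0.01, 0, 0, 0, 0.02, -0.20, -0.04, 0, 0,
             0.09, 0.08, 0.04, -0.18)
h2 <- sum(attr2_1^2)  # close to the printed 0.995, up to rounding of the loadings
u2 <- 1 - h2          # uniqueness: the part of the variance the factors leave unexplained
print(c(h2 = h2, u2 = u2))
```

Proportion Var divides by the number of variables, while Proportion Explained divides by the variance the factors capture in total, which is why the second is always the larger of the two.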

factor.plot(fa(dating12, nfactors=15, rotate="none", fm="ml"))
## Warning in plot.xy(xy.coords(x, y), type = type, ...): unimplemented pch
## value '26'

fa.diagram(fa(dating12, nfactors=15, rotate="none", fm="ml"))

[2] No rotation, fewer factors

fa(dating12, nfactors=5, rotate="none", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 5, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3   ML4   ML2   ML1   ML5    h2     u2 com
## imprace   0.06  0.09 -0.04 -0.03  0.24 0.073 0.9269 1.5
## imprelig  0.01 -0.05 -0.04 -0.05  0.30 0.096 0.9041 1.2
## date     -0.06 -0.35 -0.06 -0.01  0.08 0.138 0.8619 1.2
## go_out    0.02 -0.31  0.01 -0.03 -0.03 0.097 0.9033 1.0
## sports   -0.14  0.40  0.14  0.04 -0.12 0.214 0.7857 1.7
## tvsports -0.07  0.29  0.19 -0.07  0.11 0.143 0.8566 2.3
## exercise  0.00  0.31 -0.07 -0.03  0.06 0.105 0.8951 1.2
## dining    0.41  0.25 -0.10 -0.02  0.21 0.292 0.7077 2.4
## museums   0.91 -0.04 -0.09 -0.06 -0.13 0.853 0.1467 1.1
## art       0.90  0.01 -0.08 -0.07 -0.15 0.843 0.1574 1.1
## hiking    0.21  0.03  0.01  0.07 -0.09 0.060 0.9401 1.7
## gaming   -0.02  0.21  0.14  0.09  0.11 0.085 0.9146 2.8
## clubbing  0.12  0.26 -0.04 -0.01  0.08 0.091 0.9089 1.7
## reading   0.30 -0.13 -0.06  0.01 -0.01 0.108 0.8917 1.5
## tv        0.09  0.06  0.00 -0.02  0.50 0.262 0.7383 1.1
## theater   0.61 -0.09 -0.14 -0.04  0.29 0.482 0.5176 1.6
## movies    0.38 -0.04 -0.04 -0.02  0.32 0.252 0.7481 2.0
## concerts  0.45  0.06 -0.02 -0.08  0.16 0.233 0.7671 1.4
## music     0.32  0.18 -0.02 -0.03  0.19 0.171 0.8292 2.3
## shopping  0.28  0.26 -0.16 -0.09  0.42 0.353 0.6467 3.0
## yoga      0.32  0.08 -0.03 -0.05  0.12 0.127 0.8726 1.5
## exphappy  0.07  0.26  0.13  0.12 -0.05 0.109 0.8912 2.2
## attr1_1  -0.21  0.47 -0.14 -0.31 -0.42 0.555 0.4454 3.4
## sinc1_1   0.02 -0.37  0.09  0.12  0.19 0.196 0.8040 1.9
## intel1_1  0.12 -0.26 -0.08 -0.04 -0.02 0.089 0.9106 1.7
## fun1_1   -0.05  0.15 -0.10  0.28 -0.04 0.114 0.8858 2.0
## amb1_1    0.18 -0.04  0.05  0.16  0.42 0.239 0.7606 1.7
## shar1_1   0.10 -0.31  0.26  0.09  0.24 0.245 0.7552 3.3
## attr2_1  -0.01  0.00 -0.61 -0.79  0.00 0.995 0.0049 1.9
## sinc2_1  -0.02 -0.02  0.57  0.36 -0.01 0.452 0.5481 1.7
## intel2_1 -0.01  0.10  0.52  0.32 -0.07 0.391 0.6086 1.8
## fun2_1    0.00  0.00 -0.63  0.77  0.00 0.995 0.0050 1.9
## amb2_1   -0.07  0.10  0.59  0.21 -0.11 0.419 0.5811 1.4
## shar2_1   0.15 -0.19  0.41  0.26  0.21 0.342 0.6576 3.1
## attr3_1   0.12  0.53  0.04  0.01  0.06 0.302 0.6978 1.2
## sinc3_1   0.14  0.03  0.02  0.00  0.25 0.087 0.9133 1.6
## intel3_1  0.06  0.31  0.12  0.02  0.07 0.117 0.8835 1.5
## fun3_1    0.16  0.58  0.06 -0.09  0.21 0.411 0.5893 1.5
## amb3_1    0.11  0.46  0.08 -0.05  0.25 0.296 0.7040 1.8
## 
##                        ML3  ML4  ML2  ML1  ML5
## SS loadings           3.20 2.48 2.21 1.86 1.68
## Proportion Var        0.08 0.06 0.06 0.05 0.04
## Cumulative Var        0.08 0.15 0.20 0.25 0.29
## Proportion Explained  0.28 0.22 0.19 0.16 0.15
## Cumulative Proportion 0.28 0.50 0.69 0.85 1.00
## 
## Mean item complexity =  1.8
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 556  and the objective function was  9.91 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8191 with the empirical chi square  56272.09  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  80980.15  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.251
## RMSEA index =  0.133  and the 90 % confidence intervals are  0.132 0.134
## BIC =  75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy             
##                                                    ML3  ML4  ML2  ML1  ML5
## Correlation of (regression) scores with factors   0.96 0.88 1.00 1.00 0.85
## Multiple R square of scores with factors          0.93 0.78 0.99 1.00 0.73
## Minimum correlation of possible factor scores     0.86 0.56 0.99 0.99 0.45
factor.plot(fa(dating12, nfactors=5, rotate="none", fm="ml"))

fa.diagram(fa(dating12, nfactors=5, rotate="none", fm="ml"))

#OR 
#fa1 <- fa(dating12, nfactors=5, rotate="none", fm="ml") 
#print(fa1$loadings,cutoff = 0.3)
  • Cumulative Var is low: 0.29.
  • RMSR = 0.07 (should be close to 0).
  • RMSEA index = 0.133 (<.08 acceptable, <.05 excellent).
  • Tucker Lewis Index = 0.251 (>.90 acceptable, >.95 excellent).
  • Overall, the fit of this model is poor.
  • In addition, Proportion Var is low for every factor, and Proportion Explained is unequal across factors.
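
The Proportion Var, Cumulative Var and Proportion Explained rows can be reproduced by hand from SS loadings, which helps to see why they are low: each SS loading is divided by the number of observed variables (39 here). A minimal sketch, assuming the rounded SS loadings printed above; the object names are only illustrative:

```r
# SS loadings copied from the unrotated 5-factor output above (rounded to 2 dp).
ss_loadings <- c(ML3 = 3.20, ML4 = 2.48, ML2 = 2.21, ML1 = 1.86, ML5 = 1.68)
n_vars <- 39  # number of observed variables in dating12

prop_var  <- ss_loadings / n_vars            # "Proportion Var"
cum_var   <- cumsum(prop_var)                # "Cumulative Var"
prop_expl <- ss_loadings / sum(ss_loadings)  # "Proportion Explained"

print(round(rbind(prop_var, cum_var, prop_expl), 2))
```

Because the denominator is the total number of variables, a factor needs an SS loading of about 3.9 to reach a Proportion Var of 0.1 in this dataset.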

-> Not good. Next model. Now with rotation.

[3] rotation varimax

fa(dating12, nfactors=5, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3   ML2   ML4   ML5   ML1    h2     u2 com
## imprace   0.03 -0.05  0.09  0.25  0.00 0.073 0.9269 1.4
## imprelig -0.03 -0.05 -0.04  0.30 -0.04 0.096 0.9041 1.2
## date     -0.07 -0.03 -0.35  0.09 -0.02 0.138 0.8619 1.2
## go_out    0.02  0.02 -0.30 -0.01 -0.06 0.097 0.9033 1.1
## sports   -0.13  0.09  0.41 -0.16  0.01 0.214 0.7857 1.6
## tvsports -0.10  0.09  0.32  0.08 -0.13 0.143 0.8566 1.8
## exercise  0.00 -0.10  0.30  0.05  0.04 0.105 0.8951 1.3
## dining    0.39 -0.08  0.25  0.26  0.04 0.292 0.7077 2.7
## museums   0.92 -0.04 -0.04  0.00 -0.04 0.853 0.1467 1.0
## art       0.92 -0.04  0.01 -0.03 -0.05 0.843 0.1574 1.0
## hiking    0.22  0.06  0.03 -0.07  0.06 0.060 0.9401 1.5
## gaming   -0.05  0.15  0.23  0.09  0.02 0.085 0.9146 2.2
## clubbing  0.12 -0.05  0.26  0.09  0.03 0.091 0.9089 1.8
## reading   0.30 -0.01 -0.14  0.04  0.02 0.108 0.8917 1.5
## tv        0.02  0.01  0.08  0.50 -0.03 0.262 0.7383 1.1
## theater   0.57 -0.08 -0.09  0.37  0.00 0.482 0.5176 1.8
## movies    0.34  0.00 -0.03  0.37 -0.02 0.252 0.7481 2.0
## concerts  0.42 -0.03  0.07  0.21 -0.07 0.233 0.7671 1.6
## music     0.30 -0.02  0.19  0.22 -0.02 0.171 0.8292 2.6
## shopping  0.24 -0.17  0.26  0.45  0.01 0.353 0.6467 2.5
## yoga      0.30 -0.03  0.09  0.16 -0.03 0.127 0.8726 1.8
## exphappy  0.07  0.16  0.27 -0.05  0.06 0.109 0.8912 1.9
## attr1_1  -0.13 -0.35  0.44 -0.45 -0.12 0.555 0.4454 3.2
## sinc1_1  -0.01  0.18 -0.35  0.20  0.01 0.196 0.8040 2.2
## intel1_1  0.13 -0.06 -0.26  0.01 -0.02 0.089 0.9106 1.6
## fun1_1   -0.04  0.06  0.12 -0.05  0.30 0.114 0.8858 1.5
## amb1_1    0.11  0.16 -0.02  0.44  0.08 0.239 0.7606 1.5
## shar1_1   0.04  0.31 -0.27  0.25 -0.11 0.245 0.7552 3.3
## attr2_1   0.05 -0.94 -0.05  0.02 -0.32 0.995 0.0049 1.2
## sinc2_1  -0.07  0.67  0.04 -0.03 -0.01 0.452 0.5481 1.0
## intel2_1 -0.04  0.60  0.15 -0.10  0.00 0.391 0.6086 1.2
## fun2_1    0.04 -0.10 -0.11  0.04  0.99 0.995 0.0050 1.1
## amb2_1   -0.10  0.59  0.16 -0.15 -0.13 0.419 0.5811 1.5
## shar2_1   0.09  0.52 -0.14  0.22 -0.04 0.342 0.6576 1.6
## attr3_1   0.11  0.01  0.53  0.06  0.04 0.302 0.6978 1.1
## sinc3_1   0.11  0.03  0.05  0.27 -0.02 0.087 0.9133 1.4
## intel3_1  0.04  0.09  0.32  0.06 -0.02 0.117 0.8835 1.3
## fun3_1    0.13 -0.03  0.59  0.20 -0.06 0.411 0.5893 1.4
## amb3_1    0.08  0.02  0.48  0.24 -0.05 0.296 0.7040 1.6
## 
##                        ML3  ML2  ML4  ML5  ML1
## SS loadings           3.01 2.73 2.52 1.90 1.27
## Proportion Var        0.08 0.07 0.06 0.05 0.03
## Cumulative Var        0.08 0.15 0.21 0.26 0.29
## Proportion Explained  0.26 0.24 0.22 0.17 0.11
## Cumulative Proportion 0.26 0.50 0.72 0.89 1.00
## 
## Mean item complexity =  1.6
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 556  and the objective function was  9.91 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8191 with the empirical chi square  56272.09  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  80980.15  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.251
## RMSEA index =  0.133  and the 90 % confidence intervals are  0.132 0.134
## BIC =  75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML4  ML5  ML1
## Correlation of (regression) scores with factors   0.96 1.00 0.88 0.85 1.00
## Multiple R square of scores with factors          0.93 0.99 0.78 0.73 0.99
## Minimum correlation of possible factor scores     0.85 0.98 0.57 0.46 0.99
factor.plot(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))

fa.diagram(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))

Here we see that:

  • Proportion Var is still rather low (<0.1) for every factor;
  • Proportion Explained is a bit more even, but still doubtful. The graph looks a bit better, but I would say that the ML1 factor is redundant here and could be dropped. The loadings table confirms that ML1 is a really weak factor: only fun2_1 (0.99), attr2_1 (-0.32) and fun1_1 (0.30) reach an absolute loading of 0.3;
  • RMSR is 0.07, which is not good (it should be below 0.05). It is the same as in the unrotated model, because rotation does not change the fit.
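
The identical h2, u2 and fit statistics in models [2] and [3] are expected: an orthogonal rotation only redistributes the loadings across factors. A small sketch of this, assuming base R's factanal and the built-in ability.cov data as a stand-in (it does not use the dating data or psych::fa):

```r
# Stand-in example: base-R factanal() on the built-in ability.cov data.
fit_none <- factanal(factors = 2, covmat = ability.cov, rotation = "none")
fit_vmx  <- factanal(factors = 2, covmat = ability.cov, rotation = "varimax")

# Communality (h2) of each variable = row sum of squared loadings.
h2_none <- rowSums(unclass(loadings(fit_none))^2)
h2_vmx  <- rowSums(unclass(loadings(fit_vmx))^2)

# Rotation leaves communalities and uniquenesses unchanged ...
print(all.equal(h2_none, h2_vmx))
print(all.equal(fit_none$uniquenesses, fit_vmx$uniquenesses))

# ... and only changes how the explained variance is split across factors.
print(colSums(unclass(loadings(fit_none))^2))
print(colSums(unclass(loadings(fit_vmx))^2))
```

So rotation can make the factors easier to interpret, but it cannot improve RMSR, RMSEA or the Tucker Lewis Index.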

[4] rotation oblimin

library(GPArotation)
fa(dating12, nfactors=5, rotate="oblimin", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 5, rotate = "oblimin", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML2   ML3   ML4   ML5   ML1    h2     u2 com
## imprace  -0.08 -0.05  0.21  0.18  0.01 0.073 0.9269 2.4
## imprelig -0.12 -0.12  0.11  0.29 -0.03 0.096 0.9041 2.0
## date     -0.11 -0.07 -0.27  0.23 -0.01 0.138 0.8619 2.5
## go_out   -0.03  0.05 -0.28  0.13 -0.06 0.097 0.9033 1.6
## sports    0.18 -0.10  0.27 -0.31 -0.02 0.214 0.7857 2.9
## tvsports  0.11 -0.13  0.31 -0.04 -0.15 0.143 0.8566 2.2
## exercise -0.05 -0.04  0.30 -0.10  0.06 0.105 0.8951 1.4
## dining   -0.06  0.29  0.37  0.11  0.06 0.292 0.7077 2.2
## museums   0.01  0.93 -0.03 -0.01  0.00 0.853 0.1467 1.0
## art       0.01  0.93  0.00 -0.06 -0.02 0.843 0.1574 1.0
## hiking    0.09  0.24  0.00 -0.08  0.05 0.060 0.9401 1.6
## gaming    0.17 -0.08  0.24  0.00 -0.01 0.085 0.9146 2.1
## clubbing -0.01  0.07  0.28 -0.05  0.04 0.091 0.9089 1.2
## reading  -0.02  0.29 -0.10  0.09  0.03 0.108 0.8917 1.4
## tv       -0.08 -0.14  0.32  0.44 -0.03 0.262 0.7383 2.1
## theater  -0.13  0.46  0.12  0.36  0.04 0.482 0.5176 2.2
## movies   -0.06  0.23  0.16  0.35 -0.01 0.252 0.7481 2.3
## concerts -0.04  0.36  0.17  0.15 -0.05 0.233 0.7671 1.9
## music    -0.01  0.22  0.28  0.11 -0.01 0.171 0.8292 2.3
## shopping -0.19  0.07  0.47  0.27  0.05 0.353 0.6467 2.1
## yoga     -0.03  0.25  0.16  0.10 -0.01 0.127 0.8726 2.1
## exphappy  0.22  0.07  0.21 -0.15  0.03 0.109 0.8912 3.0
## attr1_1  -0.20 -0.04  0.17 -0.65 -0.07 0.555 0.4454 1.4
## sinc1_1   0.08 -0.04 -0.22  0.36 -0.02 0.196 0.8040 1.8
## intel1_1 -0.10  0.14 -0.23  0.11 -0.01 0.089 0.9106 2.7
## fun1_1    0.13 -0.05  0.10 -0.11  0.29 0.114 0.8858 2.1
## amb1_1    0.08 -0.01  0.20  0.43  0.06 0.239 0.7606 1.5
## shar1_1   0.19  0.01 -0.13  0.40 -0.16 0.245 0.7552 2.0
## attr2_1  -0.97  0.00 -0.01 -0.07 -0.16 0.995 0.0049 1.1
## sinc2_1   0.66 -0.02 -0.01  0.05 -0.13 0.452 0.5481 1.1
## intel2_1  0.62  0.01  0.06 -0.08 -0.11 0.391 0.6086 1.1
## fun2_1    0.02 -0.01 -0.01 -0.01  1.00 0.995 0.0050 1.0
## amb2_1    0.60 -0.03  0.03 -0.12 -0.23 0.419 0.5811 1.4
## shar2_1   0.43  0.06 -0.04  0.34 -0.12 0.342 0.6576 2.1
## attr3_1   0.10  0.06  0.51 -0.19  0.03 0.302 0.6978 1.4
## sinc3_1  -0.01  0.03  0.17  0.23 -0.02 0.087 0.9133 1.9
## intel3_1  0.14  0.01  0.31 -0.07 -0.04 0.117 0.8835 1.5
## fun3_1    0.03  0.03  0.62 -0.08 -0.06 0.411 0.5893 1.1
## amb3_1    0.05 -0.03  0.54  0.02 -0.05 0.296 0.7040 1.0
## 
##                        ML2  ML3  ML4  ML5  ML1
## SS loadings           2.69 2.72 2.59 2.16 1.28
## Proportion Var        0.07 0.07 0.07 0.06 0.03
## Cumulative Var        0.07 0.14 0.21 0.26 0.29
## Proportion Explained  0.24 0.24 0.23 0.19 0.11
## Cumulative Proportion 0.24 0.47 0.70 0.89 1.00
## 
##  With factor correlations of 
##       ML2   ML3   ML4   ML5   ML1
## ML2  1.00 -0.13 -0.06  0.09  0.04
## ML3 -0.13  1.00  0.18  0.22  0.02
## ML4 -0.06  0.18  1.00 -0.03 -0.06
## ML5  0.09  0.22 -0.03  1.00  0.09
## ML1  0.04  0.02 -0.06  0.09  1.00
## 
## Mean item complexity =  1.8
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 556  and the objective function was  9.91 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8191 with the empirical chi square  56272.09  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  80980.15  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.251
## RMSEA index =  0.133  and the 90 % confidence intervals are  0.132 0.134
## BIC =  75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy             
##                                                    ML2  ML3  ML4  ML5  ML1
## Correlation of (regression) scores with factors   1.00 0.96 0.88 0.88 1.00
## Multiple R square of scores with factors          0.99 0.93 0.78 0.77 0.99
## Minimum correlation of possible factor scores     0.99 0.85 0.56 0.54 0.99
factor.plot(fa(dating12, nfactors=5, rotate="oblimin", fm="ml"))

fa.diagram(fa(dating12, nfactors=5, rotate="oblimin", fm="ml"))

Again, ML1 looks like a useless factor that explains only one variable. In general, the results are nearly the same as in the 3rd model: Proportion Var is still below 0.1 for every factor, Proportion Explained is nearly the same (I still find the first four factors good and the last one redundant, but it persists even if I reduce the number of factors), and RMSR is 0.07 (> 0.05). For some reason I don’t see any arrows showing the correlations between factors. A likely reason is that all factor correlations are small (the largest is 0.22) and fall below the default cut of 0.3 that fa.diagram uses when drawing.
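
To check how strongly the oblimin factors are actually correlated, the factor correlation matrix from the output can be inspected directly. A minimal sketch, with the values typed in from the table above; the threshold of 0.3 is assumed to match the default `cut` argument of fa.diagram:

```r
# Factor correlations copied from the oblimin output above.
f <- c("ML2", "ML3", "ML4", "ML5", "ML1")
phi <- matrix(c( 1.00, -0.13, -0.06,  0.09,  0.04,
                -0.13,  1.00,  0.18,  0.22,  0.02,
                -0.06,  0.18,  1.00, -0.03, -0.06,
                 0.09,  0.22, -0.03,  1.00,  0.09,
                 0.04,  0.02, -0.06,  0.09,  1.00),
              nrow = 5, byrow = TRUE, dimnames = list(f, f))

off_diag <- abs(phi[upper.tri(phi)])  # the 10 distinct factor correlations
cut <- 0.3                            # assumed drawing threshold

print(max(off_diag))        # largest absolute factor correlation
print(any(off_diag > cut))  # does any correlation pass the threshold?
```

With correlations this small, the oblique solution is close to an orthogonal one, which is consistent with the oblimin and varimax results looking so similar.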

To sum up

In my opinion, the 3rd model is the best here. It looks neater and is more interpretable than the first (of course, as there are far fewer factors), its Proportion Explained is better than in the 2nd model, and I see no reason to prefer the 4th model to the 3rd, because there were no significant improvements.

I still don’t like the presence of ML1, but reducing the number of factors does not improve the situation. On the contrary, it gets worse: higher RMSR, larger differences in Proportion Explained between factors, and no improvement in Proportion Var.

Increasing the number of factors didn’t help either. So I guess 5 is the optimal number of factors, and varimax is the better rotation.

fa(dating12, nfactors=4, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 4, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3   ML2   ML4   ML1     h2     u2 com
## imprace   0.07 -0.06  0.10  0.01 0.0177 0.9823 2.4
## imprelig  0.02 -0.06  0.00 -0.02 0.0048 0.9952 1.3
## date     -0.07 -0.03 -0.34 -0.02 0.1204 0.8796 1.1
## go_out    0.01  0.03 -0.32 -0.07 0.1086 0.8914 1.1
## sports   -0.13  0.09  0.41  0.01 0.1897 0.8103 1.3
## tvsports -0.08  0.08  0.33 -0.12 0.1360 0.8640 1.5
## exercise  0.02 -0.11  0.31  0.05 0.1129 0.8871 1.3
## dining    0.44 -0.09  0.25  0.05 0.2640 0.7360 1.7
## museums   0.90 -0.02 -0.10 -0.04 0.8206 0.1794 1.0
## art       0.89 -0.03 -0.05 -0.06 0.7944 0.2056 1.0
## hiking    0.21  0.06  0.02  0.05 0.0515 0.9485 1.3
## gaming   -0.02  0.15  0.22  0.02 0.0706 0.9294 1.8
## clubbing  0.14 -0.05  0.25  0.04 0.0869 0.9131 1.7
## reading   0.30  0.00 -0.14  0.02 0.1072 0.8928 1.4
## tv        0.09 -0.01  0.09 -0.01 0.0170 0.9830 2.0
## theater   0.62 -0.08 -0.10  0.01 0.3972 0.6028 1.1
## movies    0.39 -0.01 -0.04 -0.01 0.1523 0.8477 1.0
## concerts  0.46 -0.03  0.05 -0.07 0.2178 0.7822 1.1
## music     0.34 -0.02  0.18 -0.01 0.1486 0.8514 1.5
## shopping  0.31 -0.18  0.25  0.03 0.1898 0.8102 2.6
## yoga      0.33 -0.04  0.09 -0.02 0.1204 0.8796 1.2
## exphappy  0.07  0.16  0.26  0.05 0.0994 0.9006 1.9
## attr1_1  -0.17 -0.34  0.34 -0.13 0.2739 0.7261 2.7
## sinc1_1   0.00  0.18 -0.31  0.01 0.1255 0.8745 1.6
## intel1_1  0.12 -0.06 -0.24 -0.02 0.0769 0.9231 1.6
## fun1_1   -0.05  0.06  0.13  0.30 0.1156 0.8844 1.5
## amb1_1    0.17  0.14  0.04  0.09 0.0598 0.9402 2.7
## shar1_1   0.07  0.29 -0.21 -0.11 0.1478 0.8522 2.3
## attr2_1   0.06 -0.95 -0.05 -0.29 0.9951 0.0049 1.2
## sinc2_1  -0.08  0.67  0.04 -0.03 0.4524 0.5476 1.0
## intel2_1 -0.05  0.60  0.15 -0.02 0.3827 0.6173 1.1
## fun2_1    0.04 -0.07 -0.12  0.99 0.9950 0.0050 1.0
## amb2_1   -0.13  0.58  0.16 -0.15 0.4025 0.5975 1.4
## shar2_1   0.11  0.51 -0.10 -0.05 0.2813 0.7187 1.2
## attr3_1   0.15  0.00  0.55  0.05 0.3248 0.6752 1.2
## sinc3_1   0.15  0.02  0.09 -0.01 0.0315 0.9685 1.7
## intel3_1  0.06  0.09  0.35 -0.01 0.1344 0.8656 1.2
## fun3_1    0.18 -0.05  0.63 -0.04 0.4287 0.5713 1.2
## amb3_1    0.13  0.00  0.50 -0.03 0.2694 0.7306 1.1
## 
##                        ML3  ML2  ML4  ML1
## SS loadings           3.27 2.72 2.47 1.26
## Proportion Var        0.08 0.07 0.06 0.03
## Cumulative Var        0.08 0.15 0.22 0.25
## Proportion Explained  0.34 0.28 0.25 0.13
## Cumulative Proportion 0.34 0.62 0.87 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 591  and the objective function was  10.99 
## 
## The root mean square of the residuals (RMSR) is  0.08 
## The df corrected root mean square of the residuals is  0.09 
## 
## The harmonic number of observations is  8191 with the empirical chi square  76549.99  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  89841.06  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.219
## RMSEA index =  0.136  and the 90 % confidence intervals are  0.135 0.137
## BIC =  84515.68
## Fit based upon off diagonal values = 0.69
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML4  ML1
## Correlation of (regression) scores with factors   0.96 1.00 0.88 1.00
## Multiple R square of scores with factors          0.91 0.99 0.77 0.99
## Minimum correlation of possible factor scores     0.83 0.98 0.54 0.98
factor.plot(fa(dating12, nfactors=4, rotate="varimax", fm="ml"))

fa.diagram(fa(dating12, nfactors=4, rotate="varimax", fm="ml"))

fa(dating12, nfactors=6, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 6, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML5   ML3   ML6   ML4   ML2   ML1    h2     u2 com
## imprace   0.05 -0.05  0.18  0.06 -0.01  0.00 0.040 0.9603 1.6
## imprelig -0.01 -0.09  0.10  0.23 -0.12 -0.02 0.083 0.9169 2.4
## date     -0.04 -0.08 -0.33  0.14  0.05  0.01 0.138 0.8615 1.6
## go_out    0.04  0.00 -0.33 -0.01  0.05 -0.04 0.115 0.8852 1.1
## sports   -0.14  0.14  0.28 -0.10  0.14 -0.02 0.144 0.8562 2.9
## tvsports -0.08  0.11  0.24  0.04  0.16 -0.14 0.121 0.8788 3.3
## exercise  0.00 -0.08  0.30  0.01  0.06  0.03 0.098 0.9021 1.2
## dining    0.40 -0.07  0.34  0.06 -0.06  0.04 0.289 0.7114 2.1
## museums   0.91 -0.02 -0.03 -0.05 -0.09 -0.03 0.848 0.1517 1.0
## art       0.91 -0.02 -0.01 -0.07 -0.03 -0.04 0.832 0.1679 1.0
## hiking    0.22  0.06 -0.02  0.03  0.09  0.06 0.065 0.9346 1.7
## gaming   -0.04  0.17  0.18 -0.03  0.06  0.01 0.069 0.9309 2.4
## clubbing  0.12 -0.03  0.24  0.01  0.06  0.03 0.079 0.9206 1.7
## reading   0.29 -0.02 -0.06  0.03 -0.17  0.02 0.117 0.8826 1.8
## tv        0.07 -0.03  0.17  0.20  0.06  0.00 0.075 0.9254 2.4
## theater   0.60 -0.10  0.03  0.12 -0.07  0.03 0.387 0.6133 1.2
## movies    0.36 -0.02  0.06  0.10 -0.06  0.00 0.150 0.8497 1.3
## concerts  0.45 -0.03  0.08  0.03  0.04 -0.06 0.211 0.7889 1.1
## music     0.32 -0.01  0.20  0.06  0.08 -0.01 0.152 0.8480 1.9
## shopping  0.27 -0.17  0.35  0.13  0.03  0.03 0.247 0.7527 2.8
## yoga      0.32 -0.04  0.13  0.11  0.02 -0.02 0.133 0.8673 1.7
## exphappy  0.06  0.19  0.19 -0.06  0.02  0.03 0.084 0.9156 2.5
## attr1_1  -0.15 -0.20  0.27 -0.85  0.34 -0.15 0.995 0.0050 1.9
## sinc1_1   0.03  0.09 -0.36  0.41  0.14  0.05 0.332 0.6682 2.4
## intel1_1  0.06 -0.07 -0.09 -0.02 -0.99 -0.07 0.995 0.0050 1.0
## fun1_1   -0.04  0.05  0.07  0.19  0.13  0.30 0.152 0.8479 2.3
## amb1_1    0.15  0.08  0.14  0.55  0.02  0.11 0.368 0.6317 1.4
## shar1_1   0.08  0.22 -0.22  0.45  0.10 -0.07 0.316 0.6837 2.2
## attr2_1   0.07 -0.94  0.03 -0.11  0.01 -0.30 0.995 0.0049 1.2
## sinc2_1  -0.07  0.66 -0.08  0.09  0.12 -0.02 0.469 0.5314 1.2
## intel2_1 -0.08  0.62  0.11 -0.03 -0.20 -0.04 0.446 0.5543 1.3
## fun2_1    0.03 -0.08 -0.05 -0.03 -0.07  0.99 0.995 0.0050 1.0
## amb2_1   -0.12  0.60  0.06  0.00  0.07 -0.15 0.407 0.5927 1.3
## shar2_1   0.11  0.47 -0.10  0.26  0.06 -0.02 0.306 0.6936 1.8
## attr3_1   0.10  0.07  0.56 -0.07  0.02  0.01 0.334 0.6664 1.1
## sinc3_1   0.13  0.00  0.12  0.23  0.04  0.00 0.085 0.9145 2.3
## intel3_1  0.02  0.12  0.39  0.02 -0.14 -0.04 0.186 0.8142 1.5
## fun3_1    0.14  0.00  0.62  0.14  0.16 -0.07 0.452 0.5477 1.4
## amb3_1    0.08  0.04  0.57  0.10  0.04 -0.06 0.346 0.6542 1.1
## 
##                        ML5  ML3  ML6  ML4  ML2  ML1
## SS loadings           3.12 2.60 2.45 1.80 1.40 1.28
## Proportion Var        0.08 0.07 0.06 0.05 0.04 0.03
## Cumulative Var        0.08 0.15 0.21 0.26 0.29 0.32
## Proportion Explained  0.25 0.21 0.19 0.14 0.11 0.10
## Cumulative Proportion 0.25 0.45 0.65 0.79 0.90 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 6 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 522  and the objective function was  9.22 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8191 with the empirical chi square  57758.47  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  75386.35  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.258
## RMSEA index =  0.132  and the 90 % confidence intervals are  0.132 0.133
## BIC =  70682.71
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy             
##                                                    ML5  ML3  ML6  ML4  ML2
## Correlation of (regression) scores with factors   0.96 1.00 0.88 0.99 1.00
## Multiple R square of scores with factors          0.92 0.99 0.78 0.97 0.99
## Minimum correlation of possible factor scores     0.85 0.99 0.56 0.94 0.98
##                                                    ML1
## Correlation of (regression) scores with factors   1.00
## Multiple R square of scores with factors          0.99
## Minimum correlation of possible factor scores     0.99
factor.plot(fa(dating12, nfactors=6, rotate="varimax", fm="ml"))

fa.diagram(fa(dating12, nfactors=6, rotate="varimax", fm="ml"))

Final interpretation

Now let’s move on and try to explain what our factors mean.

fa(dating12, nfactors=5, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3   ML2   ML4   ML5   ML1    h2     u2 com
## imprace   0.03 -0.05  0.09  0.25  0.00 0.073 0.9269 1.4
## imprelig -0.03 -0.05 -0.04  0.30 -0.04 0.096 0.9041 1.2
## date     -0.07 -0.03 -0.35  0.09 -0.02 0.138 0.8619 1.2
## go_out    0.02  0.02 -0.30 -0.01 -0.06 0.097 0.9033 1.1
## sports   -0.13  0.09  0.41 -0.16  0.01 0.214 0.7857 1.6
## tvsports -0.10  0.09  0.32  0.08 -0.13 0.143 0.8566 1.8
## exercise  0.00 -0.10  0.30  0.05  0.04 0.105 0.8951 1.3
## dining    0.39 -0.08  0.25  0.26  0.04 0.292 0.7077 2.7
## museums   0.92 -0.04 -0.04  0.00 -0.04 0.853 0.1467 1.0
## art       0.92 -0.04  0.01 -0.03 -0.05 0.843 0.1574 1.0
## hiking    0.22  0.06  0.03 -0.07  0.06 0.060 0.9401 1.5
## gaming   -0.05  0.15  0.23  0.09  0.02 0.085 0.9146 2.2
## clubbing  0.12 -0.05  0.26  0.09  0.03 0.091 0.9089 1.8
## reading   0.30 -0.01 -0.14  0.04  0.02 0.108 0.8917 1.5
## tv        0.02  0.01  0.08  0.50 -0.03 0.262 0.7383 1.1
## theater   0.57 -0.08 -0.09  0.37  0.00 0.482 0.5176 1.8
## movies    0.34  0.00 -0.03  0.37 -0.02 0.252 0.7481 2.0
## concerts  0.42 -0.03  0.07  0.21 -0.07 0.233 0.7671 1.6
## music     0.30 -0.02  0.19  0.22 -0.02 0.171 0.8292 2.6
## shopping  0.24 -0.17  0.26  0.45  0.01 0.353 0.6467 2.5
## yoga      0.30 -0.03  0.09  0.16 -0.03 0.127 0.8726 1.8
## exphappy  0.07  0.16  0.27 -0.05  0.06 0.109 0.8912 1.9
## attr1_1  -0.13 -0.35  0.44 -0.45 -0.12 0.555 0.4454 3.2
## sinc1_1  -0.01  0.18 -0.35  0.20  0.01 0.196 0.8040 2.2
## intel1_1  0.13 -0.06 -0.26  0.01 -0.02 0.089 0.9106 1.6
## fun1_1   -0.04  0.06  0.12 -0.05  0.30 0.114 0.8858 1.5
## amb1_1    0.11  0.16 -0.02  0.44  0.08 0.239 0.7606 1.5
## shar1_1   0.04  0.31 -0.27  0.25 -0.11 0.245 0.7552 3.3
## attr2_1   0.05 -0.94 -0.05  0.02 -0.32 0.995 0.0049 1.2
## sinc2_1  -0.07  0.67  0.04 -0.03 -0.01 0.452 0.5481 1.0
## intel2_1 -0.04  0.60  0.15 -0.10  0.00 0.391 0.6086 1.2
## fun2_1    0.04 -0.10 -0.11  0.04  0.99 0.995 0.0050 1.1
## amb2_1   -0.10  0.59  0.16 -0.15 -0.13 0.419 0.5811 1.5
## shar2_1   0.09  0.52 -0.14  0.22 -0.04 0.342 0.6576 1.6
## attr3_1   0.11  0.01  0.53  0.06  0.04 0.302 0.6978 1.1
## sinc3_1   0.11  0.03  0.05  0.27 -0.02 0.087 0.9133 1.4
## intel3_1  0.04  0.09  0.32  0.06 -0.02 0.117 0.8835 1.3
## fun3_1    0.13 -0.03  0.59  0.20 -0.06 0.411 0.5893 1.4
## amb3_1    0.08  0.02  0.48  0.24 -0.05 0.296 0.7040 1.6
## 
##                        ML3  ML2  ML4  ML5  ML1
## SS loadings           3.01 2.73 2.52 1.90 1.27
## Proportion Var        0.08 0.07 0.06 0.05 0.03
## Cumulative Var        0.08 0.15 0.21 0.26 0.29
## Proportion Explained  0.26 0.24 0.22 0.17 0.11
## Cumulative Proportion 0.26 0.50 0.72 0.89 1.00
## 
## Mean item complexity =  1.6
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 556  and the objective function was  9.91 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8191 with the empirical chi square  56272.09  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  80980.15  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.251
## RMSEA index =  0.133  and the 90 % confidence intervals are  0.132 0.134
## BIC =  75970.16
## Fit based upon off diagonal values = 0.77
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML4  ML5  ML1
## Correlation of (regression) scores with factors   0.96 1.00 0.88 0.85 1.00
## Multiple R square of scores with factors          0.93 0.99 0.78 0.73 0.99
## Minimum correlation of possible factor scores     0.85 0.98 0.57 0.46 0.99
factor.plot(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))

fa.diagram(fa(dating12, nfactors=5, rotate="varimax", fm="ml"))

  • The first factor, ML3, gathers the respondent’s hobbies: museums, art, theater, concerts, dining, music, yoga, reading, hiking. Interestingly, most of these hobbies are rather “intellectual”; things like shopping and clubbing fell into other factors.

  • The second, ML2, collects answers to the question “What do you think the opposite sex looks for in a date?”: attractive, sincere, intelligent, ambitious, has shared interests/hobbies. Note that attr2_1 loads negatively (-0.94), while the others load positively. Unexpectedly, shar1_1 is here too (the answer “has shared interests/hobbies” to the closed question “What do you look for in the opposite sex?”).

  • The next, ML4, explains a mix of: answers to the question “How do you think you measure up?” (fun, attractive, ambitious, intelligent), active hobbies (sports, exercise, tvsports) and how often the respondent goes on dates. Maybe this factor captures how active people evaluate themselves and their dating activity?

  • The 4th factor, ML5, describes a person who likes tv, shopping and movies and looks for ambition in a partner. attr1_1 also loads here, but negatively (-0.45), which suggests that attractiveness matters less to this person.

  • The last factor, ML1, explains only 2 variables (according to the graph): it unites people who look for a fun person on a date and expect that the opposite sex is looking for fun, too. A closer look at the table shows that attr2_1 (the expectation that the opposite sex looks for attractiveness) is also related to this factor, with a negative loading (-0.32).
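
The reading of the table used above (which variables belong to which factor, and with which sign) can be made explicit in code. A minimal sketch on a few rows typed in from the varimax loadings table; the cutoff of 0.3 and the object names are assumptions for illustration:

```r
# A few rows of the varimax loadings table above (an excerpt, not the full matrix).
L <- matrix(c( 0.92, -0.04, -0.04,  0.00, -0.04,   # museums
               0.02,  0.01,  0.08,  0.50, -0.03,   # tv
              -0.13, -0.35,  0.44, -0.45, -0.12,   # attr1_1
               0.05, -0.94, -0.05,  0.02, -0.32,   # attr2_1
               0.04, -0.10, -0.11,  0.04,  0.99,   # fun2_1
               0.13, -0.03,  0.59,  0.20, -0.06),  # fun3_1
            ncol = 5, byrow = TRUE,
            dimnames = list(c("museums", "tv", "attr1_1", "attr2_1", "fun2_1", "fun3_1"),
                            c("ML3", "ML2", "ML4", "ML5", "ML1")))

salient <- abs(L) >= 0.3  # assumed cutoff for a "salient" loading

# For each variable, list every factor it loads on, together with the sign.
for (v in rownames(L)) {
  hits  <- colnames(L)[salient[v, ]]
  signs <- ifelse(L[v, hits] > 0, "+", "-")
  cat(v, ":", paste0(hits, " (", signs, ")", collapse = ", "), "\n")
}

# Factor with the largest absolute loading for each variable.
dominant <- colnames(L)[apply(abs(L), 1, which.max)]
print(setNames(dominant, rownames(L)))
```

Variables such as attr1_1 that pass the cutoff on several factors are the cross-loading items that make a factor harder to name.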

More experiments

Following a friend’s advice, I also decided to try fewer factors. Here is the version with 3 factors:

  • RMSR is worse: 0.1 (versus 0.07 with 5 factors);
  • Proportion Explained: one factor is still rather weak (0.18), but the split is a bit better than in the previous models;
  • Proportion Var stays the same: 0.08 and 0.07 for the first two factors, 0.03 for the last one;
  • and this last factor again looks like an appendix on the graph :)

Here the factors simply split into one that explains hobbies and another that explains what the respondent expects the opposite sex to look for on a date.

fa(dating12, nfactors=3, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating12, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3   ML2   ML1     h2     u2 com
## imprace   0.07 -0.04  0.01 0.0061 0.9939 1.7
## imprelig  0.02 -0.06  0.02 0.0047 0.9953 1.3
## date     -0.05 -0.06 -0.03 0.0070 0.9930 2.3
## go_out    0.02  0.00  0.03 0.0014 0.9986 1.9
## sports   -0.15  0.11  0.04 0.0379 0.9621 2.0
## tvsports -0.10  0.10  0.16 0.0444 0.9556 2.4
## exercise  0.01 -0.08 -0.02 0.0061 0.9939 1.1
## dining    0.42 -0.04 -0.02 0.1805 0.8195 1.0
## museums   0.90  0.02  0.05 0.8206 0.1794 1.0
## art       0.89  0.02  0.07 0.8022 0.1978 1.0
## hiking    0.21  0.08 -0.04 0.0514 0.9486 1.4
## gaming   -0.04  0.17  0.01 0.0295 0.9705 1.1
## clubbing  0.13 -0.02 -0.01 0.0170 0.9830 1.1
## reading   0.30  0.01 -0.03 0.0917 0.9083 1.0
## tv        0.08  0.00  0.02 0.0075 0.9925 1.2
## theater   0.62 -0.05 -0.01 0.3881 0.6119 1.0
## movies    0.39  0.01  0.02 0.1490 0.8510 1.0
## concerts  0.45  0.00  0.08 0.2124 0.7876 1.1
## music     0.33  0.01  0.04 0.1085 0.8915 1.0
## shopping  0.30 -0.14  0.00 0.1078 0.8922 1.4
## yoga      0.33 -0.01  0.04 0.1075 0.8925 1.0
## exphappy  0.05  0.19 -0.02 0.0372 0.9628 1.1
## attr1_1  -0.16 -0.32  0.16 0.1540 0.8460 2.0
## sinc1_1   0.00  0.15 -0.04 0.0247 0.9753 1.2
## intel1_1  0.13 -0.07  0.00 0.0224 0.9776 1.6
## fun1_1   -0.05  0.08 -0.29 0.0907 0.9093 1.2
## amb1_1    0.15  0.16 -0.08 0.0554 0.9446 2.5
## shar1_1   0.06  0.28  0.09 0.0869 0.9131 1.3
## attr2_1   0.11 -0.95  0.28 0.9951 0.0049 1.2
## sinc2_1  -0.12  0.66  0.04 0.4518 0.5482 1.1
## intel2_1 -0.09  0.60  0.05 0.3750 0.6250 1.1
## fun2_1    0.07 -0.05 -0.99 0.9950 0.0050 1.0
## amb2_1   -0.17  0.58  0.17 0.3958 0.6042 1.3
## shar2_1   0.08  0.50  0.04 0.2594 0.7406 1.1
## attr3_1   0.11  0.06  0.02 0.0167 0.9833 1.6
## sinc3_1   0.14  0.04  0.02 0.0215 0.9785 1.2
## intel3_1  0.04  0.12  0.06 0.0185 0.9815 1.6
## fun3_1    0.14  0.01  0.12 0.0348 0.9652 2.0
## amb3_1    0.10  0.05  0.09 0.0213 0.9787 2.5
## 
##                        ML3  ML2  ML1
## SS loadings           3.25 2.69 1.30
## Proportion Var        0.08 0.07 0.03
## Cumulative Var        0.08 0.15 0.19
## Proportion Explained  0.45 0.37 0.18
## Cumulative Proportion 0.45 0.82 1.00
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  741  and the objective function was  17.61 with Chi Square of  143979.6
## The degrees of freedom for the model are 627  and the objective function was  12.44 
## 
## The root mean square of the residuals (RMSR) is  0.1 
## The df corrected root mean square of the residuals is  0.11 
## 
## The harmonic number of observations is  8191 with the empirical chi square  123002.3  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  101687.7  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.166
## RMSEA index =  0.14  and the 90 % confidence intervals are  0.14 0.141
## BIC =  96037.94
## Fit based upon off diagonal values = 0.51
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML1
## Correlation of (regression) scores with factors   0.96 1.00 1.00
## Multiple R square of scores with factors          0.92 0.99 0.99
## Minimum correlation of possible factor scores     0.83 0.99 0.99
factor.plot(fa(dating12, nfactors=3, rotate="varimax", fm="ml"))

fa.diagram(fa(dating12, nfactors=3, rotate="varimax", fm="ml"))

I still don’t like a factor that describes only 1-2 variables, so I will delete these variables (fun1_1, fun2_1) from the dataset. Let’s see what happens.
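Retyping the whole column list is error-prone, so here is a minimal sketch of dropping columns by name instead. The data frame `toy` and its values are made up for illustration; only the two dropped names come from the text.

```r
# Toy stand-in for the survey data; only the column names matter here.
toy <- data.frame(attr1_1 = 1:3, fun1_1 = 4:6, fun2_1 = 7:9, amb1_1 = 10:12)

# Columns to remove (the two named in the text).
drop_vars <- c("fun1_1", "fun2_1")

# Keep every column whose name is not in drop_vars.
keep_vars <- setdiff(names(toy), drop_vars)
toy_reduced <- toy[, keep_vars, drop = FALSE]

names(toy_reduced)
```

The same pattern should work on the real data frame, assuming its column names match the ones used in the selection below.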

# once again
dating2<- dating[c("imprace","imprelig", "date", "go_out", "sports", 
                   "tvsports", "exercise",  "dining" , "museums",  "art",  
                   "hiking", "gaming",  "clubbing",  
                   "reading", "tv",  "theater", "movies",  "concerts",   
                   "music",   "shopping",   "yoga", "exphappy" , "attr1_1",
                   "sinc1_1",   "intel1_1",   "amb1_1",   
                   "shar1_1", "attr2_1", "sinc2_1",   "intel2_1",   
                   "amb2_1",   "shar2_1",   "attr3_1",   "sinc3_1",
                   "intel3_1",   "fun3_1",   "amb3_1")]
dating2 <- as.data.frame(dating2)
dating22 <- na.omit(dating2)

# and model:

fa(dating22, nfactors=5, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating22, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3   ML1   ML4   ML5   ML2    h2    u2 com
## imprace  -0.01 -0.04  0.09  0.24  0.04 0.066 0.934 1.4
## imprelig -0.04 -0.10  0.07  0.09  0.24 0.083 0.917 1.9
## date     -0.06 -0.06 -0.34  0.00  0.11 0.138 0.862 1.4
## go_out    0.03  0.00 -0.35  0.02 -0.03 0.127 0.873 1.0
## sports   -0.14  0.13  0.33 -0.03 -0.14 0.162 0.838 2.1
## tvsports -0.15  0.08  0.16  0.25 -0.05 0.120 0.880 2.9
## exercise -0.02 -0.07  0.32  0.06 -0.01 0.109 0.891 1.2
## dining    0.34 -0.06  0.26  0.30  0.08 0.284 0.716 3.1
## museums   0.92 -0.06  0.00  0.13  0.01 0.860 0.140 1.0
## art       0.90 -0.06  0.02  0.15 -0.03 0.830 0.170 1.1
## hiking    0.23  0.07  0.04 -0.04  0.01 0.059 0.941 1.3
## gaming   -0.10  0.19  0.06  0.27 -0.07 0.126 0.874 2.5
## clubbing  0.07 -0.02  0.17  0.21 -0.02 0.079 0.921 2.2
## reading   0.32 -0.03 -0.01 -0.07  0.11 0.119 0.881 1.3
## tv       -0.09  0.01 -0.08  0.62  0.12 0.418 0.582 1.2
## theater   0.51 -0.08 -0.12  0.43  0.13 0.486 0.514 2.3
## movies    0.26  0.00 -0.16  0.51  0.09 0.360 0.640 1.8
## concerts  0.36 -0.03 -0.06  0.40  0.00 0.292 0.708 2.1
## music     0.23  0.00  0.08  0.37  0.01 0.199 0.801 1.8
## shopping  0.13 -0.14  0.15  0.58  0.08 0.399 0.601 1.4
## yoga      0.28 -0.05  0.12  0.16  0.10 0.129 0.871 2.4
## exphappy  0.04  0.20  0.16  0.10 -0.07 0.082 0.918 2.9
## attr1_1  -0.15 -0.21  0.25  0.03 -0.93 0.995 0.005 1.3
## sinc1_1  -0.01  0.11 -0.35  0.05  0.35 0.256 0.744 2.2
## intel1_1  0.18 -0.14 -0.10 -0.22  0.32 0.211 0.789 3.1
## amb1_1    0.09  0.10  0.14  0.14  0.52 0.331 0.669 1.5
## shar1_1   0.06  0.18 -0.16  0.00  0.39 0.214 0.786 1.8
## attr2_1   0.03 -0.99  0.00  0.07 -0.14 0.995 0.005 1.1
## sinc2_1  -0.07  0.64 -0.07  0.03  0.04 0.416 0.584 1.1
## intel2_1 -0.02  0.57  0.09 -0.07  0.02 0.343 0.657 1.1
## amb2_1   -0.08  0.53  0.11 -0.11 -0.04 0.314 0.686 1.2
## shar2_1   0.09  0.44 -0.07  0.04  0.22 0.258 0.742 1.7
## attr3_1   0.09  0.05  0.62  0.08 -0.06 0.404 0.596 1.1
## sinc3_1   0.09  0.00  0.12  0.14  0.20 0.083 0.917 3.0
## intel3_1  0.04  0.08  0.47 -0.05  0.07 0.233 0.767 1.2
## fun3_1    0.07 -0.02  0.58  0.26  0.06 0.416 0.584 1.4
## amb3_1    0.03  0.02  0.54  0.22  0.07 0.349 0.651 1.4
## 
##                        ML3  ML1  ML4  ML5  ML2
## SS loadings           2.70 2.47 2.23 2.12 1.82
## Proportion Var        0.07 0.07 0.06 0.06 0.05
## Cumulative Var        0.07 0.14 0.20 0.26 0.31
## Proportion Explained  0.24 0.22 0.20 0.19 0.16
## Cumulative Proportion 0.24 0.46 0.65 0.84 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  666  and the objective function was  12.76 with Chi Square of  104370.9
## The degrees of freedom for the model are 491  and the objective function was  4.94 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.07 
## 
## The harmonic number of observations is  8191 with the empirical chi square  44458.21  with prob <  0 
## The total number of observations was  8191  with Likelihood Chi Square =  40406.21  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.478
## RMSEA index =  0.1  and the 90 % confidence intervals are  0.099 0.1
## BIC =  35981.91
## Fit based upon off diagonal values = 0.81
## Measures of factor score adequacy             
##                                                    ML3  ML1  ML4  ML5  ML2
## Correlation of (regression) scores with factors   0.96 1.00 0.87 0.86 0.99
## Multiple R square of scores with factors          0.91 0.99 0.75 0.73 0.97
## Minimum correlation of possible factor scores     0.83 0.99 0.51 0.47 0.95
factor.plot(fa(dating22, nfactors=5, rotate="varimax", fm="ml"))

fa.diagram(fa(dating22, nfactors=5, rotate="varimax", fm="ml"))

THAT LOOKS MUCH BETTER!

However, I’ve decided to go further and clean the set even more: I’ve deleted all the variables that were not used in the models with 3 and 4 factors. Then I fitted models with 3, 4 and 5 factors on that set and chose the best option.
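A possible way to spot such unused variables is to flag every variable whose largest absolute loading stays below a cutoff. The sketch below does this on a few rows copied from the 5-factor output above. The cutoff of 0.3 is an assumption (it matches the default `cut` of `fa.diagram`, as far as I know), and the selection in the text was based on the 3- and 4-factor models, so this is only an illustration of the idea.

```r
# A few rows of the 5-factor loading matrix printed above (dating22).
loadings <- matrix(
  c(-0.01, -0.04, 0.09,  0.24,  0.04,   # imprace
     0.23,  0.07, 0.04, -0.04,  0.01,   # hiking
     0.92, -0.06, 0.00,  0.13,  0.01,   # museums
     0.04,  0.20, 0.16,  0.10, -0.07,   # exphappy
     0.03, -0.99, 0.00,  0.07, -0.14),  # attr2_1
  ncol = 5, byrow = TRUE,
  dimnames = list(c("imprace", "hiking", "museums", "exphappy", "attr2_1"),
                  c("ML3", "ML1", "ML4", "ML5", "ML2"))
)

cutoff <- 0.3  # assumed threshold for a salient loading

# Largest absolute loading of each variable across all factors.
max_abs <- apply(abs(loadings), 1, max)

# Variables that do not load saliently on any factor.
weak_vars <- names(max_abs)[max_abs < cutoff]
weak_vars
```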

# let's clean it up once more
dating3<- dating[c("imprelig", "date", "go_out", "sports",  "tvsports", "exercise",  "dining" , "museums",  "art", "tv", "theater", "movies",  "concerts",  "music",   "shopping",   "yoga",  "attr1_1",  "sinc1_1",   "amb1_1", "attr2_1", "sinc2_1", "intel2_1",  "amb2_1",   "shar2_1",   "attr3_1", "intel3_1",   "fun3_1",   "amb3_1")]
dating3 <- as.data.frame(dating3)
dating33 <- na.omit(dating3)

# model with 5 factors:
fa(dating33, nfactors=5, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating33, nfactors = 5, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML2   ML1   ML4   ML3   ML5    h2    u2 com
## imprelig -0.01 -0.08  0.02 -0.05  0.28 0.088 0.912 1.3
## date     -0.07 -0.05 -0.36 -0.01  0.10 0.147 0.853 1.3
## go_out    0.04  0.00 -0.34 -0.01 -0.02 0.119 0.881 1.0
## sports   -0.15  0.12  0.35  0.04 -0.23 0.219 0.781 2.5
## tvsports -0.13  0.06  0.24  0.12 -0.05 0.095 0.905 2.4
## exercise  0.00 -0.06  0.33 -0.02  0.03 0.114 0.886 1.1
## dining    0.38 -0.05  0.26  0.15  0.23 0.289 0.711 2.9
## museums   0.92 -0.03 -0.01  0.13 -0.01 0.873 0.127 1.0
## art       0.90 -0.04  0.02  0.17 -0.04 0.838 0.162 1.1
## tv        0.01 -0.01  0.03  0.16  0.39 0.177 0.823 1.4
## theater   0.54 -0.08 -0.10  0.30  0.31 0.490 0.510 2.4
## movies    0.27 -0.02 -0.08  0.39  0.26 0.299 0.701 2.7
## concerts  0.27 -0.02 -0.04  0.82 -0.03 0.752 0.248 1.2
## music     0.17  0.01  0.12  0.72  0.01 0.567 0.433 1.2
## shopping  0.21 -0.15  0.23  0.22  0.39 0.315 0.685 3.4
## yoga      0.27 -0.04  0.10  0.19  0.14 0.142 0.858 2.8
## attr1_1  -0.15 -0.33  0.36 -0.01 -0.54 0.552 0.448 2.7
## sinc1_1  -0.01  0.16 -0.40  0.07  0.18 0.226 0.774 1.8
## amb1_1    0.12  0.15  0.06 -0.03  0.58 0.382 0.618 1.2
## attr2_1   0.05 -1.00  0.00  0.04 -0.03 0.995 0.005 1.0
## sinc2_1  -0.06  0.63 -0.04 -0.02 -0.03 0.408 0.592 1.0
## intel2_1 -0.06  0.58  0.10  0.02 -0.08 0.356 0.644 1.1
## amb2_1   -0.09  0.53  0.13 -0.06 -0.10 0.315 0.685 1.3
## shar2_1   0.08  0.46 -0.10  0.05  0.21 0.273 0.727 1.6
## attr3_1   0.09  0.03  0.61 -0.01  0.05 0.382 0.618 1.1
## intel3_1  0.04  0.10  0.43 -0.07  0.05 0.201 0.799 1.2
## fun3_1    0.09 -0.03  0.58  0.14  0.15 0.393 0.607 1.3
## amb3_1    0.05  0.01  0.54  0.08  0.25 0.360 0.640 1.5
## 
##                        ML2  ML1  ML4  ML3  ML5
## SS loadings           2.51 2.45 2.22 1.69 1.50
## Proportion Var        0.09 0.09 0.08 0.06 0.05
## Cumulative Var        0.09 0.18 0.26 0.32 0.37
## Proportion Explained  0.24 0.24 0.21 0.16 0.14
## Cumulative Proportion 0.24 0.48 0.69 0.86 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  378  and the objective function was  9.48 with Chi Square of  77958.42
## The degrees of freedom for the model are 248  and the objective function was  2.48 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.07 
## 
## The harmonic number of observations is  8235 with the empirical chi square  19097.03  with prob <  0 
## The total number of observations was  8235  with Likelihood Chi Square =  20383.11  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.604
## RMSEA index =  0.099  and the 90 % confidence intervals are  0.098 0.1
## BIC =  18147.1
## Fit based upon off diagonal values = 0.9
## Measures of factor score adequacy             
##                                                    ML2  ML1  ML4  ML3  ML5
## Correlation of (regression) scores with factors   0.96 1.00 0.88 0.89 0.84
## Multiple R square of scores with factors          0.92 0.99 0.77 0.79 0.70
## Minimum correlation of possible factor scores     0.83 0.99 0.53 0.59 0.40
#factor.plot(fa(dating33, nfactors=5, rotate="varimax", fm="ml"))
fa.diagram(fa(dating33, nfactors=5, rotate="varimax", fm="ml"))

# model with 4 factors:
fa(dating33, nfactors=4, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating33, nfactors = 4, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML2   ML1   ML3   ML4    h2    u2 com
## imprelig -0.01 -0.06  0.01  0.20 0.044 0.956 1.2
## date     -0.07 -0.04 -0.37  0.07 0.147 0.853 1.2
## go_out    0.02  0.00 -0.35 -0.01 0.125 0.875 1.0
## sports   -0.15  0.10  0.39 -0.15 0.207 0.793 1.8
## tvsports -0.09  0.05  0.26  0.09 0.085 0.915 1.6
## exercise  0.01 -0.06  0.33  0.03 0.114 0.886 1.1
## dining    0.44 -0.03  0.23  0.22 0.293 0.707 2.1
## museums   0.90 -0.03 -0.05 -0.13 0.840 0.160 1.1
## art       0.92 -0.03 -0.01 -0.16 0.865 0.135 1.1
## tv        0.11  0.02 -0.01  0.54 0.304 0.696 1.1
## theater   0.63 -0.05 -0.15  0.34 0.530 0.470 1.7
## movies    0.39  0.01 -0.11  0.42 0.340 0.660 2.1
## concerts  0.46 -0.03 -0.01  0.21 0.256 0.744 1.4
## music     0.35  0.00  0.13  0.23 0.191 0.809 2.1
## shopping  0.32 -0.12  0.19  0.49 0.389 0.611 2.2
## yoga      0.33 -0.03  0.09  0.13 0.134 0.866 1.5
## attr1_1  -0.17 -0.37  0.36 -0.29 0.376 0.624 3.3
## sinc1_1   0.01  0.17 -0.39  0.16 0.209 0.791 1.7
## amb1_1    0.16  0.18  0.02  0.33 0.171 0.829 2.0
## attr2_1   0.07 -0.99 -0.02  0.04 0.995 0.005 1.0
## sinc2_1  -0.07  0.63 -0.03 -0.03 0.406 0.594 1.0
## intel2_1 -0.06  0.57  0.13 -0.07 0.349 0.651 1.2
## amb2_1   -0.12  0.51  0.14 -0.13 0.316 0.684 1.4
## shar2_1   0.10  0.47 -0.10  0.12 0.260 0.740 1.3
## attr3_1   0.12  0.03  0.59  0.02 0.366 0.634 1.1
## intel3_1  0.04  0.09  0.42 -0.02 0.188 0.812 1.1
## fun3_1    0.17 -0.03  0.58  0.18 0.397 0.603 1.4
## amb3_1    0.11  0.02  0.51  0.21 0.314 0.686 1.4
## 
##                        ML2  ML1  ML3  ML4
## SS loadings           3.13 2.45 2.20 1.43
## Proportion Var        0.11 0.09 0.08 0.05
## Cumulative Var        0.11 0.20 0.28 0.33
## Proportion Explained  0.34 0.27 0.24 0.16
## Cumulative Proportion 0.34 0.61 0.84 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  378  and the objective function was  9.48 with Chi Square of  77958.42
## The degrees of freedom for the model are 272  and the objective function was  3.09 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8235 with the empirical chi square  25624.7  with prob <  0 
## The total number of observations was  8235  with Likelihood Chi Square =  25387.3  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.55
## RMSEA index =  0.106  and the 90 % confidence intervals are  0.105 0.107
## BIC =  22934.91
## Fit based upon off diagonal values = 0.86
## Measures of factor score adequacy             
##                                                    ML2  ML1  ML3  ML4
## Correlation of (regression) scores with factors   0.97 1.00 0.87 0.84
## Multiple R square of scores with factors          0.93 0.99 0.76 0.70
## Minimum correlation of possible factor scores     0.86 0.99 0.51 0.41
#factor.plot(fa(dating33, nfactors=4, rotate="varimax", fm="ml"))
fa.diagram(fa(dating33, nfactors=4, rotate="varimax", fm="ml"))

# model with 3 factors:
fa(dating33, nfactors=3, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating33, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML2   ML1   ML3    h2    u2 com
## imprelig  0.00 -0.07  0.04 0.007 0.993 1.6
## date     -0.09 -0.04 -0.35 0.131 0.869 1.2
## go_out    0.00  0.01 -0.36 0.127 0.873 1.0
## sports   -0.13  0.09  0.37 0.164 0.836 1.4
## tvsports -0.08  0.04  0.27 0.079 0.921 1.2
## exercise  0.03 -0.07  0.33 0.115 0.885 1.1
## dining    0.45 -0.03  0.23 0.258 0.742 1.5
## museums   0.90  0.01 -0.12 0.819 0.181 1.0
## art       0.90  0.00 -0.09 0.812 0.188 1.0
## tv        0.10  0.00  0.05 0.013 0.987 1.4
## theater   0.62 -0.04 -0.13 0.398 0.602 1.1
## movies    0.39  0.01 -0.07 0.154 0.846 1.1
## concerts  0.46 -0.02 -0.01 0.215 0.785 1.0
## music     0.36  0.00  0.14 0.148 0.852 1.3
## shopping  0.33 -0.14  0.21 0.169 0.831 2.1
## yoga      0.34 -0.02  0.09 0.123 0.877 1.2
## attr1_1  -0.14 -0.37  0.29 0.236 0.764 2.2
## sinc1_1  -0.02  0.18 -0.35 0.151 0.849 1.5
## amb1_1    0.15  0.17  0.08 0.059 0.941 2.4
## attr2_1   0.10 -0.99 -0.05 0.995 0.005 1.0
## sinc2_1  -0.10  0.63 -0.01 0.406 0.594 1.0
## intel2_1 -0.07  0.57  0.14 0.345 0.655 1.1
## amb2_1   -0.13  0.51  0.14 0.299 0.701 1.3
## shar2_1   0.08  0.47 -0.06 0.234 0.766 1.1
## attr3_1   0.15  0.02  0.59 0.372 0.628 1.1
## intel3_1  0.06  0.08  0.43 0.194 0.806 1.1
## fun3_1    0.20 -0.05  0.59 0.393 0.607 1.2
## amb3_1    0.14  0.00  0.53 0.297 0.703 1.1
## 
##                        ML2  ML1  ML3
## SS loadings           3.13 2.43 2.15
## Proportion Var        0.11 0.09 0.08
## Cumulative Var        0.11 0.20 0.28
## Proportion Explained  0.41 0.32 0.28
## Cumulative Proportion 0.41 0.72 1.00
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  378  and the objective function was  9.48 with Chi Square of  77958.42
## The degrees of freedom for the model are 297  and the objective function was  3.99 
## 
## The root mean square of the residuals (RMSR) is  0.08 
## The df corrected root mean square of the residuals is  0.09 
## 
## The harmonic number of observations is  8235 with the empirical chi square  40758.25  with prob <  0 
## The total number of observations was  8235  with Likelihood Chi Square =  32769.1  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.467
## RMSEA index =  0.115  and the 90 % confidence intervals are  0.114 0.116
## BIC =  30091.31
## Fit based upon off diagonal values = 0.78
## Measures of factor score adequacy             
##                                                    ML2  ML1  ML3
## Correlation of (regression) scores with factors   0.96 1.00 0.87
## Multiple R square of scores with factors          0.91 0.99 0.75
## Minimum correlation of possible factor scores     0.83 0.99 0.50
#factor.plot(fa(dating33, nfactors=3, rotate="varimax", fm="ml"))
fa.diagram(fa(dating33, nfactors=3, rotate="varimax", fm="ml"))

Well, it’s much better in my opinion. I’ll stay with the model with 3 factors: there the proportion of variance reaches 0.11 for the first factor, which is not so bad, and the proportion explained is not really good, but a bit better than in the two other models (this is especially clear from the diagram). Cumulative variance is lower than in the other models (0.28 against 0.33 and 0.37), but I still like it more.
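To compare the three models side by side, the indices from the printouts above can be collected into one small table. This is only a bookkeeping sketch: the numbers are copied by hand from the output for `dating33`, and the column names are my own.

```r
# Fit summary copied from the three fa() printouts for dating33.
fit <- data.frame(
  nfactors        = c(5, 4, 3),
  RMSR            = c(0.06, 0.06, 0.08),
  TLI             = c(0.604, 0.55, 0.467),
  RMSEA           = c(0.099, 0.106, 0.115),
  BIC             = c(18147.1, 22934.91, 30091.31),
  cum_var         = c(0.37, 0.33, 0.28),
  mean_complexity = c(1.7, 1.5, 1.3)
)

# Show the models from fewest to most factors.
fit[order(fit$nfactors), ]
```

On these printed values the fit indices improve as factors are added, while the mean item complexity is lowest for 3 factors, so the choice is a trade-off between fit and a simpler structure.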

Last but not least, I’ve cleaned it again. In the final model we have better cumulative variance (0.32) and proportion variance (from 0.09 to 0.13). Proportion explained varies from 0.27 to 0.41 (given that with three factors a perfectly even split would be 0.33, that is quite a good result!). RMSR is still not the best, but I guess we’ve already got used to that.
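For reference, the variance rows of the printout follow directly from the SS loadings. The sketch below redoes the arithmetic for the final model, using the SS loadings printed below (3.03, 2.37, 1.97) and the 23 variables in `dating4`. The variable names in the code are mine.

```r
# SS loadings of the final 3-factor model, copied from its printout.
ss <- c(ML2 = 3.03, ML1 = 2.37, ML3 = 1.97)
n_vars <- 23  # number of variables in dating4

# Proportion Var: share of the total variance of all variables.
prop_var <- ss / n_vars

# Cumulative Var: running total of Proportion Var.
cum_var <- cumsum(ss) / n_vars

# Proportion Explained: share of the variance captured by the factors.
prop_explained <- ss / sum(ss)

round(rbind(prop_var, cum_var, prop_explained), 2)
```

So "Proportion Var" is measured against all 23 variables, while "Proportion Explained" only divides the common variance among the three factors.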

# last time
dating4<- dating[c( "date", "go_out", "sports", "exercise",  "dining" , "museums",  "art",  "theater", "movies",  "concerts",  "music",   "shopping",   "yoga",  "attr1_1", "attr2_1", "sinc2_1", "intel2_1",  "amb2_1",   "shar2_1",   "attr3_1", "intel3_1",   "fun3_1",   "amb3_1")]
dating4 <- as.data.frame(dating4)
dating44 <- na.omit(dating4)

# model with 3 factors:
fa(dating44, nfactors=3, rotate="varimax", fm="ml") 
## Factor Analysis using method =  ml
## Call: fa(r = dating44, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML2   ML1   ML3   h2    u2 com
## date     -0.07 -0.04 -0.35 0.13 0.869 1.1
## go_out    0.02  0.01 -0.35 0.12 0.876 1.0
## sports   -0.14  0.09  0.33 0.14 0.863 1.5
## exercise  0.01 -0.07  0.31 0.10 0.898 1.1
## dining    0.43 -0.04  0.27 0.26 0.741 1.7
## museums   0.91  0.01 -0.07 0.83 0.168 1.0
## art       0.91  0.00 -0.04 0.82 0.178 1.0
## theater   0.61 -0.05 -0.08 0.39 0.615 1.0
## movies    0.38  0.00 -0.04 0.15 0.854 1.0
## concerts  0.46 -0.02  0.03 0.21 0.788 1.0
## music     0.35  0.00  0.17 0.15 0.850 1.5
## shopping  0.30 -0.14  0.23 0.16 0.837 2.3
## yoga      0.33 -0.03  0.13 0.12 0.876 1.3
## attr1_1  -0.15 -0.36  0.25 0.21 0.788 2.2
## attr2_1   0.10 -0.99 -0.04 1.00 0.005 1.0
## sinc2_1  -0.09  0.63 -0.01 0.40 0.596 1.0
## intel2_1 -0.07  0.57  0.12 0.34 0.657 1.1
## amb2_1   -0.13  0.51  0.11 0.29 0.705 1.2
## shar2_1   0.08  0.47 -0.05 0.23 0.769 1.1
## attr3_1   0.11  0.02  0.62 0.40 0.605 1.1
## intel3_1  0.04  0.09  0.44 0.20 0.796 1.1
## fun3_1    0.16 -0.05  0.62 0.42 0.585 1.1
## amb3_1    0.10  0.00  0.53 0.29 0.714 1.1
## 
##                        ML2  ML1  ML3
## SS loadings           3.03 2.37 1.97
## Proportion Var        0.13 0.10 0.09
## Cumulative Var        0.13 0.23 0.32
## Proportion Explained  0.41 0.32 0.27
## Cumulative Proportion 0.41 0.73 1.00
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  253  and the objective function was  7.49 with Chi Square of  61673.47
## The degrees of freedom for the model are 187  and the objective function was  2.26 
## 
## The root mean square of the residuals (RMSR) is  0.07 
## The df corrected root mean square of the residuals is  0.08 
## 
## The harmonic number of observations is  8245 with the empirical chi square  19574.7  with prob <  0 
## The total number of observations was  8245  with Likelihood Chi Square =  18584.33  with prob <  0 
## 
## Tucker Lewis Index of factoring reliability =  0.595
## RMSEA index =  0.109  and the 90 % confidence intervals are  0.108 0.111
## BIC =  16898.08
## Fit based upon off diagonal values = 0.87
## Measures of factor score adequacy             
##                                                    ML2  ML1  ML3
## Correlation of (regression) scores with factors   0.96 1.00 0.86
## Multiple R square of scores with factors          0.92 0.99 0.74
## Minimum correlation of possible factor scores     0.84 0.99 0.48
factor.plot(fa(dating44, nfactors=3, rotate="varimax", fm="ml"))

fa.diagram(fa(dating44, nfactors=3, rotate="varimax", fm="ml"))

And a quick look at our factors:

  • the first factor, ML2, collects hobbies: artistic and cultural ones in general (museums, theater, concerts, dining, music). Yoga and shopping also go there.

  • then ML1: as in the previous models, it holds the answers to the question “What do you think the opposite sex looks for in a date?”: attractive, sincere, intelligent, ambitious, has shared interests/hobbies. attr1_1 is here too (it is “attractiveness” from the closely related question “What do you look for in the opposite sex?”).

  • the last one, ML3, is a mix of answers to the question “How do you think you measure up?” (attractive, fun, ambitious, intelligent) and some active pastimes (sports, exercise, going out and dates).
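The grouping above can also be read off the loading matrix mechanically: give each variable to the factor with its largest absolute loading and record the sign. Here is a minimal sketch on six rows copied from the final printout. It ignores cross-loadings, so it is a reading aid rather than a rule.

```r
# Six rows of the final 3-factor loading matrix (dating44).
L <- matrix(
  c( 0.91,  0.01, -0.07,   # museums
     0.30, -0.14,  0.23,   # shopping
     0.10, -0.99, -0.04,   # attr2_1
    -0.09,  0.63, -0.01,   # sinc2_1
     0.16, -0.05,  0.62,   # fun3_1
    -0.07, -0.04, -0.35),  # date
  ncol = 3, byrow = TRUE,
  dimnames = list(c("museums", "shopping", "attr2_1", "sinc2_1", "fun3_1", "date"),
                  c("ML2", "ML1", "ML3"))
)

# Column index of the largest absolute loading for each variable.
dominant <- apply(abs(L), 1, which.max)

assignment <- data.frame(
  variable = rownames(L),
  factor   = colnames(L)[dominant],
  sign     = sign(L[cbind(seq_len(nrow(L)), dominant)]),
  stringsAsFactors = FALSE
)
assignment
```

The negative signs (attr2_1 on ML1, date on ML3) mark the items that run against the rest of their factor, which matters for the reliability analysis in the next part.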

Part 2

Let’s test the reliability of the scale ML3: dining, museums, art, theater, movies, concerts, music, yoga. All variables should point in the same direction (+ or -), so any that do not should be reverse-scored first.
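Reverse-scoring an item means mirroring it around the middle of its scale. Here is a minimal sketch, assuming a 1 to 10 response scale like the interest items here. The helper name `reverse_item` is hypothetical, not a psych function.

```r
# Mirror an item around the midpoint of its scale: lo becomes hi and vice versa.
reverse_item <- function(x, lo = min(x, na.rm = TRUE), hi = max(x, na.rm = TRUE)) {
  (lo + hi) - x
}

x <- c(1, 3, 7, 10)
reverse_item(x, lo = 1, hi = 10)  # 10 8 4 1
```

In the analysis below this is not done by hand: `alpha(..., check.keys=TRUE)` reverses negatively keyed items automatically and marks them with a minus sign after the name.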

Internal consistency: Cronbach’s alpha

ML3<- as.data.frame(dating12[c("dining", "museums", 
                   "art", "theater", 
                   "movies", "concerts", 
                   "music", "yoga")])
alpha(ML3)
## 
## Reliability analysis   
## Call: alpha(x = ML3)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean  sd median_r
##       0.79       0.8    0.83      0.33 3.9 0.0035  6.9 1.3     0.29
## 
##  lower alpha upper     95% confidence boundaries
## 0.78 0.79 0.8 
## 
##  Reliability if an item is dropped:
##          raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## dining        0.78      0.79    0.83      0.35 3.8   0.0037 0.030  0.29
## museums       0.74      0.75    0.76      0.30 3.0   0.0043 0.017  0.29
## art           0.74      0.75    0.76      0.30 3.0   0.0044 0.018  0.25
## theater       0.75      0.76    0.79      0.31 3.1   0.0042 0.028  0.29
## movies        0.77      0.78    0.81      0.34 3.6   0.0038 0.030  0.29
## concerts      0.76      0.76    0.78      0.31 3.2   0.0041 0.028  0.29
## music         0.78      0.78    0.80      0.34 3.6   0.0037 0.025  0.29
## yoga          0.81      0.80    0.84      0.37 4.1   0.0032 0.027  0.33
## 
##  Item statistics 
##             n raw.r std.r r.cor r.drop mean  sd
## dining   8191  0.51  0.53  0.41   0.37  7.8 1.7
## museums  8191  0.76  0.75  0.77   0.66  7.0 2.0
## art      8191  0.77  0.76  0.78   0.66  6.7 2.2
## theater  8191  0.72  0.72  0.67   0.59  6.8 2.2
## movies   8191  0.57  0.60  0.51   0.45  7.9 1.7
## concerts 8191  0.69  0.70  0.66   0.56  6.9 2.1
## music    8191  0.56  0.59  0.52   0.43  7.9 1.7
## yoga     8191  0.54  0.48  0.34   0.32  4.4 2.7
## 
## Non missing response frequency for each item
##             1    2    3    4    5    6    7    8    9   10 miss
## dining   0.00 0.00 0.01 0.02 0.08 0.09 0.18 0.23 0.20 0.19    0
## museums  0.01 0.01 0.05 0.06 0.10 0.11 0.22 0.19 0.15 0.10    0
## art      0.01 0.02 0.07 0.06 0.12 0.11 0.16 0.21 0.11 0.11    0
## theater  0.02 0.02 0.05 0.06 0.12 0.11 0.20 0.16 0.15 0.11    0
## movies   0.00 0.01 0.01 0.02 0.04 0.07 0.18 0.25 0.23 0.18    0
## concerts 0.01 0.03 0.05 0.06 0.11 0.14 0.18 0.18 0.14 0.10    0
## music    0.00 0.00 0.00 0.03 0.07 0.09 0.19 0.20 0.20 0.21    0
## yoga     0.19 0.14 0.13 0.09 0.10 0.10 0.10 0.06 0.05 0.04    0

A Cronbach’s alpha above 0.7 indicates good reliability. We have 0.79, which is quite good.
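To see what raw alpha measures, it can be computed by hand from the item variances and the variance of the sum score: alpha = k/(k-1) * (1 - sum of item variances / variance of the total). The sketch below uses simulated items, not the dating data, and the function name `cronbach_alpha` is my own.

```r
# Raw Cronbach's alpha from item variances and the variance of the sum score.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                    # number of items
  item_vars <- apply(items, 2, var)   # variance of each item
  total_var <- var(rowSums(items))    # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Simulated scale: three noisy measurements of the same latent trait.
set.seed(1)
trait <- rnorm(200)
toy_items <- data.frame(
  a = trait + rnorm(200),
  b = trait + rnorm(200),
  c = trait + rnorm(200)
)
toy_alpha <- cronbach_alpha(toy_items)
toy_alpha
```

On complete data this should agree with the raw_alpha reported by `alpha()`. The more the items move together, the larger the variance of the sum relative to the item variances, and the closer alpha gets to 1.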

The task is:

  • Repeat Reliability analysis for any 2 factors (for your final fa model).
  • Interpret the results based on “Reliability if an item is dropped (raw_alpha)”.
ML2<- as.data.frame(dating44[c("dining", "museums", "art", "theater", "movies", "concerts", "music", "yoga", "shopping")])
alpha(ML2, check.keys=TRUE)
## 
## Reliability analysis   
## Call: alpha(x = ML2, check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean  sd median_r
##        0.8      0.81    0.84      0.32 4.2 0.0034  6.8 1.3     0.28
## 
##  lower alpha upper     95% confidence boundaries
## 0.79 0.8 0.8 
## 
##  Reliability if an item is dropped:
##          raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## dining        0.78      0.80    0.83      0.33 3.9   0.0037 0.026  0.28
## museums       0.76      0.77    0.78      0.29 3.3   0.0041 0.014  0.26
## art           0.75      0.77    0.78      0.29 3.3   0.0041 0.015  0.25
## theater       0.76      0.77    0.81      0.30 3.4   0.0040 0.023  0.25
## movies        0.78      0.79    0.82      0.32 3.8   0.0037 0.025  0.26
## concerts      0.77      0.78    0.80      0.30 3.5   0.0039 0.022  0.26
## music         0.78      0.79    0.81      0.32 3.8   0.0036 0.022  0.29
## yoga          0.80      0.81    0.84      0.35 4.2   0.0033 0.023  0.30
## shopping      0.79      0.80    0.83      0.33 4.0   0.0034 0.025  0.29
## 
##  Item statistics 
##             n raw.r std.r r.cor r.drop mean  sd
## dining   8245  0.56  0.57  0.48   0.44  7.8 1.8
## museums  8245  0.73  0.74  0.76   0.64  7.0 2.0
## art      8245  0.74  0.74  0.76   0.64  6.7 2.2
## theater  8245  0.71  0.72  0.68   0.60  6.8 2.2
## movies   8245  0.56  0.60  0.52   0.46  7.9 1.7
## concerts 8245  0.67  0.68  0.65   0.55  6.8 2.1
## music    8245  0.56  0.58  0.53   0.44  7.9 1.8
## yoga     8245  0.52  0.48  0.35   0.33  4.4 2.7
## shopping 8245  0.56  0.53  0.43   0.38  5.6 2.6
## 
## Non missing response frequency for each item
##             1    2    3    4    5    6    7    8    9   10 miss
## dining   0.00 0.00 0.01 0.02 0.08 0.09 0.18 0.23 0.20 0.18    0
## museums  0.01 0.01 0.05 0.06 0.10 0.11 0.22 0.19 0.15 0.10    0
## art      0.01 0.03 0.07 0.06 0.12 0.11 0.16 0.21 0.11 0.12    0
## theater  0.02 0.02 0.05 0.06 0.12 0.11 0.20 0.16 0.15 0.11    0
## movies   0.00 0.01 0.01 0.02 0.04 0.07 0.18 0.25 0.23 0.18    0
## concerts 0.01 0.03 0.05 0.06 0.11 0.14 0.19 0.18 0.14 0.10    0
## music    0.00 0.00 0.01 0.03 0.07 0.09 0.19 0.20 0.20 0.22    0
## yoga     0.19 0.15 0.13 0.09 0.10 0.10 0.10 0.06 0.05 0.04    0
## shopping 0.06 0.11 0.07 0.09 0.14 0.12 0.14 0.10 0.10 0.07    0
ML1<- as.data.frame(dating44[c("attr1_1", "attr2_1", "sinc2_1", "intel2_1",  "amb2_1",   "shar2_1")])
alpha(ML1, check.keys=TRUE)
## Warning in alpha(ML1, check.keys = TRUE): Some items were negatively correlated with total scale and were automatically reversed.
##  This is indicated by a negative sign for the variable name.
## 
## Reliability analysis   
## Call: alpha(x = ML1, check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.69      0.69    0.79      0.27 2.3 0.004   33 6.1     0.23
## 
##  lower alpha upper     95% confidence boundaries
## 0.68 0.69 0.7 
## 
##  Reliability if an item is dropped:
##          raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r med.r
## attr1_1-      0.70      0.69    0.80      0.31 2.26   0.0030 0.049  0.24
## attr2_1-      0.46      0.48    0.45      0.16 0.93   0.0090 0.010  0.18
## sinc2_1       0.64      0.65    0.71      0.27 1.86   0.0048 0.044  0.29
## intel2_1      0.66      0.67    0.73      0.29 2.03   0.0046 0.041  0.26
## amb2_1        0.67      0.69    0.74      0.31 2.22   0.0044 0.039  0.28
## shar2_1       0.67      0.69    0.74      0.31 2.21   0.0045 0.042  0.24
## 
##  Item statistics 
##             n raw.r std.r r.cor r.drop mean   sd
## attr1_1- 8245  0.59  0.53  0.35   0.31   78 12.0
## attr2_1- 8245  0.94  0.94  1.01   0.83   70 16.0
## sinc2_1  8245  0.63  0.64  0.58   0.49   13  7.0
## intel2_1 8245  0.56  0.59  0.51   0.42   14  6.2
## amb2_1   8245  0.51  0.54  0.45   0.35   12  6.9
## shar2_1  8245  0.52  0.54  0.45   0.38   12  6.2
ML3<- as.data.frame(dating44[c("attr3_1", "intel3_1", "fun3_1", "amb3_1", "date", "go_out", "sports", "exercise")])
alpha(ML3, check.keys=TRUE)
## Warning in alpha(ML3, check.keys = TRUE): Some items were negatively correlated with total scale and were automatically reversed.
##  This is indicated by a negative sign for the variable name.
## 
## Reliability analysis   
## Call: alpha(x = ML3, check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean   sd median_r
##       0.63      0.67    0.67       0.2   2 0.0059  7.3 0.93     0.16
## 
##  lower alpha upper     95% confidence boundaries
## 0.62 0.63 0.65 
## 
##  Reliability if an item is dropped:
##          raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## attr3_1       0.58      0.61    0.60      0.18 1.5   0.0067 0.0095  0.15
## intel3_1      0.61      0.64    0.64      0.20 1.8   0.0063 0.0105  0.16
## fun3_1        0.58      0.62    0.61      0.19 1.6   0.0066 0.0100  0.15
## amb3_1        0.59      0.63    0.63      0.19 1.7   0.0065 0.0119  0.16
## date-         0.62      0.65    0.65      0.21 1.9   0.0061 0.0124  0.16
## go_out-       0.62      0.65    0.65      0.21 1.9   0.0062 0.0118  0.16
## sports        0.61      0.65    0.64      0.21 1.9   0.0064 0.0113  0.18
## exercise      0.61      0.66    0.64      0.22 1.9   0.0065 0.0100  0.18
## 
##  Item statistics 
##             n raw.r std.r r.cor r.drop mean  sd
## attr3_1  8245  0.59  0.65  0.60   0.45  7.1 1.4
## intel3_1 8245  0.45  0.54  0.44   0.33  8.4 1.1
## fun3_1   8245  0.57  0.62  0.55   0.41  7.7 1.6
## amb3_1   8245  0.57  0.59  0.50   0.38  7.6 1.8
## date-    8245  0.44  0.50  0.38   0.27  6.0 1.4
## go_out-  8245  0.42  0.50  0.38   0.28  8.8 1.1
## sports   8245  0.63  0.50  0.38   0.34  6.4 2.6
## exercise 8245  0.61  0.49  0.37   0.34  6.2 2.4
## 
## Non missing response frequency for each item
##             1    2    3    4    5    6    7    8    9   10 miss
## attr3_1  0.00 0.00 0.02 0.03 0.08 0.13 0.35 0.27 0.09 0.03    0
## intel3_1 0.00 0.00 0.00 0.00 0.01 0.02 0.14 0.35 0.32 0.16    0
## fun3_1   0.00 0.01 0.01 0.01 0.04 0.11 0.20 0.28 0.22 0.11    0
## amb3_1   0.00 0.01 0.02 0.03 0.08 0.08 0.20 0.25 0.20 0.14    0
## date     0.01 0.04 0.09 0.25 0.19 0.25 0.17 0.00 0.00 0.00    0
## go_out   0.31 0.36 0.24 0.05 0.02 0.01 0.00 0.00 0.00 0.00    0
## sports   0.04 0.06 0.08 0.07 0.10 0.09 0.14 0.16 0.13 0.13    0
## exercise 0.04 0.06 0.07 0.07 0.13 0.14 0.15 0.16 0.11 0.08    0

As I don’t have many factors, only 3, I’ve run the analysis on all of them.

We see that the first factor (ML2) is the best: its standardized alpha is 0.81 (raw alpha 0.8), which is very nice. The second factor (ML1) has std.alpha 0.69, still pretty good, and the last one (ML3) has std.alpha 0.67.

Then we look at the reliability if an item is dropped.

Let’s focus on our first factor (about cultural hobbies): its standardized alpha is 0.81, keep that in mind. If we look at std.alpha when each variable is dropped, we notice that in general it decreases. Only for yoga (0.81) and shopping (0.80) does it stay practically unchanged, which could be expected (not the most “cultural” hobbies), so these two items add the least to the scale.

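Now let’s have a look at the next factor, ML1, which is about the expectations of the opposite sex (what they look for): as we remember, the standardized alpha there is smaller but still OK, 0.69. None of the variables raises std.alpha when dropped: it stays at 0.69 without attr1_1, amb2_1 or shar2_1 and falls to 0.48 without attr2_1 (raw_alpha moves from 0.69 to 0.70 without attr1_1, a negligible change). Great!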

And the last factor, ML3, about the respondents themselves (how they rate themselves and their active lives): it has the lowest std.alpha, 0.67 (close to the previous factor, but still the worst among all three). As for the reliability if an item is dropped: every variable gives a lower std.alpha when dropped. Very nice.
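The same check can be done on raw_alpha, as the task asks, by subtracting the overall alpha from each "if dropped" value. Here is a small sketch with the ML1 numbers copied from the printout above. Keep in mind that the printed values are rounded to two decimals, so a difference of 0.01 is within rounding.

```r
# raw_alpha of the ML1 scale and raw_alpha if each item is dropped (from the printout).
overall_raw_alpha <- 0.69
alpha_if_dropped <- c(attr1_1 = 0.70, attr2_1 = 0.46, sinc2_1 = 0.64,
                      intel2_1 = 0.66, amb2_1 = 0.67, shar2_1 = 0.67)

# Positive change: the scale would be at least as reliable without the item.
change <- alpha_if_dropped - overall_raw_alpha
round(change, 2)

# Items whose removal would not lower raw_alpha.
not_helping <- names(change)[change >= 0]
not_helping
```

Only attr1_1 is flagged, and only by 0.01, which fits its position as the one item borrowed from a different question.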

All in all, I think the results are not so bad. The factors have sensible interpretations, the variables within each factor do not contradict each other, and even the model’s explained variance is not that bad. Perfect! Now let’s save it in a new data frame (spoiler: it will be called “dating_fa”).

Part 3

Save the factor scores if you need them for future analysis, for example to include them in a regression.

fa1 <- fa(dating12, nfactors=5, rotate="varimax", fm="ml", scores=TRUE)
load1 <- fa1$loadings[,1:2]   # loadings on the first two factors
plot(load1) # set up plot 

fascores <- as.data.frame(fa1$scores)   # one column of scores per factor
datingfa <- cbind(dating12, fascores)
#names(datingfa)
fa2 <- fa(dating44, nfactors=3, rotate="varimax", fm="ml", scores=TRUE)
load2 <- fa2$loadings[,1:2]   # loadings on the first two factors
plot(load2)

fascores <- as.data.frame(fa2$scores)
dating_fa <- cbind(dating44, fascores) # now we have all our factor scores in a data frame, hurray!
#names(dating_fa)
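As a sketch of the regression step mentioned above, the saved scores can go into `lm()` like any other predictors. Everything below is simulated: the score columns only borrow the names ML1, ML2, ML3, and `outcome` is a hypothetical response, not a variable from the dataset.

```r
# Simulated stand-in for a data frame that holds three factor-score columns.
set.seed(42)
n <- 100
scores <- data.frame(ML1 = rnorm(n), ML2 = rnorm(n), ML3 = rnorm(n))

# Hypothetical outcome that depends on ML2 plus noise.
scores$outcome <- 0.5 * scores$ML2 + rnorm(n)

# Factor scores used as ordinary predictors.
reg <- lm(outcome ~ ML1 + ML2 + ML3, data = scores)
summary(reg)$coefficients
```

With the real data, the outcome would need to come from the same rows that survived `na.omit`, so that it lines up with the scores.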

~ done :)