The data file praktikum5.Rdata contains a table named omadused. Its variables are the self-ratings of 159 people on 16 traits; the table also includes the respondent's age (vanus).

1. Use Bartlett's test and the determinant of the correlation matrix to check whether the data set has a problem with too many overly weak or overly strong associations between variables. (Because we want to include only the trait ratings in the factor analysis, not the respondent's age, it makes sense to leave age out. To do this, pass the functions the table without its 17th variable, omadused[,-17], instead of the whole table omadused.)

library(psych)
omadused2 <- omadused[,-17] # drops the 17th column (vanus) from the data set
cortest.bartlett(omadused2)
## R was not square, finding R from data
## $chisq
## [1] 1118.432
## 
## $p.value
## [1] 1.409687e-161
## 
## $df
## [1] 120
ommatrix <- round(cor(omadused2, use="complete"), 2)
#ommatrix

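# checking for multicollinearity
# (note: the determinant below is computed from the correlation matrix rounded to 2 decimals;
# det(cor(omadused2, use="complete")) would avoid the rounding error):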
det(ommatrix)
## [1] 0.0005473827
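
The Bartlett statistic reported above can be reproduced from the determinant: the test compares the correlation matrix R with an identity matrix through chi-square = -(n - 1 - (2p + 5)/6) * ln|R|, with p(p - 1)/2 degrees of freedom. The following base-R sketch is a minimal illustration under stated assumptions: the function name bartlett_sphericity and the simulated data are made up for the example and are not part of the practicum. A determinant close to 0 points to multicollinearity; a commonly cited rule of thumb treats values above 0.00001 as acceptable, which the determinant of about 0.00055 shown above satisfies.

```r
# Hypothetical helper: Bartlett's test of sphericity from a correlation
# matrix R and the number of observations n.
bartlett_sphericity <- function(R, n) {
  p <- ncol(R)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Simulated example data (not the practicum data):
# 5 variables that share one common source of variation
set.seed(1)
common <- rnorm(200)
X <- sapply(1:5, function(i) common + rnorm(200))
R <- cor(X)

det(R)   # between 0 and 1; values near 0 signal very strong linear dependence
res <- bartlett_sphericity(R, n = nrow(X))
res$chisq
res$df
res$p.value
```

With the practicum data the same function could be called as bartlett_sphericity(cor(omadused2, use = "complete"), n = 159), which should come close to the cortest.bartlett result.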

2. Try to determine a reasonable number of factors. If the number of factors is not known in advance, a good start is to fit a model with a larger number of factors. Look at the eigenvalues and plot them. How many factors do the Kaiser and Cattell criteria suggest? How many does parallel analysis suggest? If there is no clear-cut answer, fit a couple of different models and look at their factor loading tables. Which model would be the easiest to interpret?

# the eigenvalues can be extracted from a factor model by appending a dollar sign and values to the model name:
# we fit the model with the fa function from the psych package:
om.m1 <- fa(omadused2, nfactors=4, rotate ="varimax", fm="ml", scores =TRUE)
om.m1$values
##  [1]  4.4378721520  2.3860286868  1.2675813028  0.6210781403  0.2650901093
##  [6]  0.1797381203  0.1300574331  0.0970379557  0.0557144924  0.0118856390
## [11]  0.0002710486 -0.0163518911 -0.0673082745 -0.1484060147 -0.2035540077
## [16] -0.3231930546
plot(om.m1$values, type ="b")

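fa.parallel(omadused) # note: this call uses the full table, so vanus is included; fa.parallel(omadused2) would use only the 16 trait ratings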

## Parallel analysis suggests that the number of factors =  4  and the number of components =  3
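
The Kaiser criterion is usually applied to the eigenvalues of the full correlation matrix. In psych these should be available in the model's e.values element, whereas values holds the eigenvalues of the common factor solution, which is why some of the values printed above are negative. The sketch below, in base R on simulated data, counts eigenvalues above 1 and runs a simplified parallel analysis by comparing them with eigenvalues obtained from random normal data of the same size. The data, the seed and the 95th-percentile threshold are assumptions made for illustration; fa.parallel follows its own, more elaborate procedure.

```r
set.seed(42)
n <- 300
p <- 8
# Simulated data with two underlying factors (hypothetical, not the practicum data)
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- cbind(sapply(1:4, function(i) 0.8 * f1 + rnorm(n, sd = 0.6)),
           sapply(1:4, function(i) 0.8 * f2 + rnorm(n, sd = 0.6)))

# Eigenvalues of the full correlation matrix
ev <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
n_kaiser <- sum(ev > 1)   # Kaiser criterion: eigenvalues greater than 1

# Simplified parallel analysis: eigenvalues of uncorrelated random data
n_sim <- 200
rand_ev <- replicate(n_sim, {
  Z <- matrix(rnorm(n * p), n, p)
  eigen(cor(Z), symmetric = TRUE, only.values = TRUE)$values
})
threshold <- apply(rand_ev, 1, quantile, probs = 0.95)
# Keep components up to the first one that does not exceed its random threshold
n_parallel <- sum(cumprod(ev > threshold))

# Scree plot for the Cattell criterion, with both reference lines
plot(ev, type = "b", xlab = "Component", ylab = "Eigenvalue")
lines(threshold, lty = 2)   # parallel-analysis threshold
abline(h = 1, lty = 3)      # Kaiser cut-off
```

On the scree plot, the Cattell criterion looks for the point where the curve bends and flattens out; the components before the bend are retained.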

3. Are there variables that should be dropped (no strong loading on any factor, roughly equal loadings on more than one factor, communality clearly lower than that of the other variables)?

print.psych(om.m1, cut=0.3, sort=TRUE)
## Factor Analysis using method =  ml
## Call: fa(r = omadused2, nfactors = 4, rotate = "varimax", scores = TRUE, 
##     fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##                 item   ML1   ML2   ML3   ML4   h2   u2 com
## j_rjekindlusetu    9  0.79                   0.69 0.31 1.2
## p_simatu          11  0.77                   0.61 0.39 1.1
## vastutustundetu    4  0.70             -0.39 0.68 0.32 1.8
## hooletu            6  0.58             -0.57 0.72 0.28 2.4
## eraldihoidev       8        0.85             0.73 0.27 1.0
## seltskondlik      13       -0.75  0.37       0.73 0.27 1.6
## loid              10  0.41  0.62             0.56 0.44 1.8
## vaikne             2        0.57             0.36 0.64 1.2
## kartlik           16        0.41 -0.39       0.43 0.57 3.1
## salatseja          3        0.36             0.15 0.85 1.4
## h_irimatu          7              0.70       0.51 0.49 1.1
## muretu            15              0.65       0.49 0.51 1.3
## pingevaba          1              0.55       0.35 0.65 1.3
## enesekindel       12 -0.37 -0.35  0.49       0.54 0.46 3.1
## korralik          14 -0.32              0.83 0.81 0.19 1.4
## ettevaatlik        5                    0.40 0.32 0.68 2.9
## 
##                        ML1  ML2  ML3  ML4
## SS loadings           2.69 2.67 1.87 1.45
## Proportion Var        0.17 0.17 0.12 0.09
## Cumulative Var        0.17 0.34 0.45 0.54
## Proportion Explained  0.31 0.31 0.22 0.17
## Cumulative Proportion 0.31 0.62 0.83 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  120  and the objective function was  7.37 with Chi Square of  1118.43
## The degrees of freedom for the model are 62  and the objective function was  0.75 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.05 
## 
## The harmonic number of observations is  158 with the empirical chi square  51.97  with prob <  0.81 
## The total number of observations was  159  with MLE Chi Square =  111.53  with prob <  0.00012 
## 
## Tucker Lewis Index of factoring reliability =  0.902
## RMSEA index =  0.076  and the 90 % confidence intervals are  0.049 0.092
## BIC =  -202.74
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                 ML1  ML2  ML3  ML4
## Correlation of scores with factors             0.90 0.92 0.88 0.88
## Multiple R square of scores with factors       0.82 0.85 0.77 0.78
## Minimum correlation of possible factor scores  0.64 0.71 0.54 0.56

4. Try both orthogonal and oblique rotation. What are the correlations between factors under oblique rotation? Based on them, would an obliquely rotated factor solution be justified here?

# unrotated:
fa.2 <- fa(omadused2, nfactors=3, rotate ="none")
fa.2$loadings
## 
## Loadings:
##                 MR1    MR2    MR3   
## pingevaba       -0.291  0.342  0.392
## vaikne           0.383 -0.429  0.138
## salatseja        0.274 -0.116  0.241
## vastutustundetu  0.692  0.470       
## ettevaatlik            -0.529       
## hooletu          0.701  0.423       
## h_irimatu       -0.224  0.236  0.634
## eraldihoidev     0.596 -0.512  0.364
## j_rjekindlusetu  0.684  0.363 -0.174
## loid             0.699 -0.139  0.173
## p_simatu         0.537  0.440 -0.121
## enesekindel     -0.643  0.137  0.335
## seltskondlik    -0.695  0.484       
## korralik        -0.548 -0.369  0.109
## muretu                  0.402  0.542
## kartlik          0.486 -0.311 -0.190
## 
##                  MR1   MR2   MR3
## SS loadings    4.367 2.304 1.301
## Proportion Var 0.273 0.144 0.081
## Cumulative Var 0.273 0.417 0.498
fa.3 <- fa(omadused2, nfactors=3, rotate ="oblimin") # oblique rotation
## Loading required namespace: GPArotation
fa.3$loadings
## 
## Loadings:
##                 MR1    MR2    MR3   
## pingevaba              -0.186  0.524
## vaikne                  0.598       
## salatseja               0.368  0.164
## vastutustundetu  0.823         0.107
## ettevaatlik     -0.433  0.378 -0.153
## hooletu          0.791         0.104
## h_irimatu                      0.719
## eraldihoidev            0.901  0.121
## j_rjekindlusetu  0.772        -0.146
## loid             0.334  0.573       
## p_simatu         0.725 -0.108       
## enesekindel     -0.383 -0.255  0.451
## seltskondlik           -0.730  0.235
## korralik        -0.675              
## muretu           0.278         0.640
## kartlik          0.117  0.381 -0.343
## 
##                  MR1   MR2   MR3
## SS loadings    3.447 2.587 1.696
## Proportion Var 0.215 0.162 0.106
## Cumulative Var 0.215 0.377 0.483
fa.4 <- fa(omadused2, nfactors=3, rotate ="varimax") # orthogonal rotation
fa.4$loadings
## 
## Loadings:
##                 MR1    MR2    MR3   
## pingevaba              -0.208  0.558
## vaikne                  0.570 -0.154
## salatseja               0.359       
## vastutustundetu  0.814  0.184       
## ettevaatlik     -0.379  0.304 -0.212
## hooletu          0.788  0.224       
## h_irimatu                      0.708
## eraldihoidev            0.864       
## j_rjekindlusetu  0.765  0.137 -0.164
## loid             0.394  0.614       
## p_simatu         0.703              
## enesekindel     -0.418 -0.337  0.506
## seltskondlik    -0.180 -0.740  0.371
## korralik        -0.661              
## muretu           0.264         0.623
## kartlik          0.168  0.410 -0.415
## 
##                  MR1   MR2   MR3
## SS loadings    3.422 2.654 1.896
## Proportion Var 0.214 0.166 0.119
## Cumulative Var 0.214 0.380 0.498
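
The loadings tables above do not show the factor correlations that the exercise asks about. In psych, an obliquely rotated solution stores them in the Phi element, so fa.3$Phi (or the full printout of fa.3) should display them. As a self-contained illustration of the idea, the following sketch uses base R's factanal with promax rotation on the built-in ability.cov data instead of the practicum data. Deriving the correlations from rotmat mirrors what the factanal print method does and is an assumption about that object's structure. The 0.3 cut-off is a common rule of thumb, not something stated in the practicum.

```r
# Built-in example data: covariance matrix of 6 ability tests (datasets::ability.cov)
fit_oblique <- factanal(factors = 2, covmat = ability.cov, rotation = "promax")

# Factor correlation matrix implied by the oblique rotation matrix
tmat <- solve(fit_oblique$rotmat)
Phi <- tmat %*% t(tmat)
round(Phi, 2)

# Largest absolute correlation between two different factors
max_r <- max(abs(Phi[upper.tri(Phi)]))
max_r > 0.3   # TRUE would suggest that an oblique rotation is worth keeping
```

If the factor correlations are all close to zero, the oblique and orthogonal solutions are nearly the same and the simpler varimax solution can be kept.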

5. Once you have chosen the best model, try to interpret the factors and give them names.

6. How large a share of the total variance in the data set do these factors account for?

# The column named h2 contains the communalities, which show how large a share of a variable's variance the factors jointly account for.
print.psych(fa.4, cut=0.3, sort=TRUE)
## Factor Analysis using method =  minres
## Call: fa(r = omadused2, nfactors = 3, rotate = "varimax")
## Standardized loadings (pattern matrix) based upon correlation matrix
##                 item   MR1   MR2   MR3   h2   u2 com
## vastutustundetu    4  0.81             0.70 0.30 1.1
## hooletu            6  0.79             0.68 0.32 1.2
## j_rjekindlusetu    9  0.76             0.63 0.37 1.2
## p_simatu          11  0.70             0.50 0.50 1.0
## korralik          14 -0.66             0.45 0.55 1.1
## ettevaatlik        5 -0.38  0.30       0.28 0.72 2.5
## eraldihoidev       8        0.86       0.75 0.25 1.0
## seltskondlik      13       -0.74  0.37 0.72 0.28 1.6
## loid              10  0.39  0.61       0.54 0.46 1.7
## vaikne             2        0.57       0.35 0.65 1.2
## salatseja          3        0.36       0.15 0.85 1.3
## h_irimatu          7              0.71 0.51 0.49 1.0
## muretu            15              0.62 0.46 0.54 1.4
## pingevaba          1              0.56 0.36 0.64 1.3
## enesekindel       12 -0.42 -0.34  0.51 0.54 0.46 2.7
## kartlik           16        0.41 -0.42 0.37 0.63 2.3
## 
##                        MR1  MR2  MR3
## SS loadings           3.42 2.65 1.90
## Proportion Var        0.21 0.17 0.12
## Cumulative Var        0.21 0.38 0.50
## Proportion Explained  0.43 0.33 0.24
## Cumulative Proportion 0.43 0.76 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  120  and the objective function was  7.37 with Chi Square of  1118.43
## The degrees of freedom for the model are 75  and the objective function was  1.15 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  158 with the empirical chi square  93.39  with prob <  0.074 
## The total number of observations was  159  with MLE Chi Square =  172.47  with prob <  1.2e-09 
## 
## Tucker Lewis Index of factoring reliability =  0.841
## RMSEA index =  0.095  and the 90 % confidence intervals are  0.073 0.108
## BIC =  -207.7
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                 MR1  MR2  MR3
## Correlation of scores with factors             0.94 0.92 0.87
## Multiple R square of scores with factors       0.89 0.85 0.77
## Minimum correlation of possible factor scores  0.77 0.71 0.53
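
The share of total variance explained can be read from the Cumulative Var row, and it equals the mean communality, because each standardized variable contributes one unit of variance. The sketch below recomputes it from the h2 column printed above; the values are copied from that output, where they are rounded to two decimals, so the result is approximate.

```r
# Communalities (h2) copied from the three-factor varimax output above
h2 <- c(vastutustundetu = 0.70, hooletu = 0.68, j_rjekindlusetu = 0.63,
        p_simatu = 0.50, korralik = 0.45, ettevaatlik = 0.28,
        eraldihoidev = 0.75, seltskondlik = 0.72, loid = 0.54,
        vaikne = 0.35, salatseja = 0.15, h_irimatu = 0.51,
        muretu = 0.46, pingevaba = 0.36, enesekindel = 0.54, kartlik = 0.37)

# Each standardized variable has variance 1,
# so the total variance equals the number of variables
prop_total <- sum(h2) / length(h2)
round(prop_total, 2)

# The three variables with the lowest communality (candidates for removal)
sort(h2)[1:3]
```

The result is about 0.50, matching the Cumulative Var row: the three factors account for roughly half of the total variance.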

7. Compute the factor scores. Use a regression model to examine whether the scores of any factor are related to the respondent's age.

skoorid <- data.frame(fa.4$scores)
head(skoorid)
##          MR1        MR2        MR3
## 1 -0.7911694 -1.2322377  1.1714490
## 2 -0.9479709 -0.4897219  0.2554737
## 3         NA         NA         NA
## 4 -1.4045513 -1.1050959 -0.0692634
## 5 -0.8802008 -0.2791943 -0.1733208
## 6 -1.0893579 -0.3528082 -0.8548060
names(omadused)
##  [1] "pingevaba"       "vaikne"          "salatseja"      
##  [4] "vastutustundetu" "ettevaatlik"     "hooletu"        
##  [7] "h_irimatu"       "eraldihoidev"    "j_rjekindlusetu"
## [10] "loid"            "p_simatu"        "enesekindel"    
## [13] "seltskondlik"    "korralik"        "muretu"         
## [16] "kartlik"         "vanus"
summary(lm(skoorid$MR1~ omadused$vanus)) # vanus is available in the original data set (omadused)
## 
## Call:
## lm(formula = skoorid$MR1 ~ omadused$vanus)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.91677 -0.70232 -0.06253  0.56043  2.87461 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     0.968732   0.352099   2.751  0.00666 **
## omadused$vanus -0.021153   0.007542  -2.805  0.00569 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9267 on 152 degrees of freedom
##   (5 observations deleted due to missingness)
## Multiple R-squared:  0.04921,    Adjusted R-squared:  0.04295 
## F-statistic: 7.866 on 1 and 152 DF,  p-value: 0.005694
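
The output above covers only MR1; the same regression can be repeated for the other factors. In the model shown, each additional year of age is associated with an MR1 score about 0.02 lower, and age accounts for roughly 5% of the variance in the scores. The sketch below loops over all score columns. It uses simulated scores and ages, which are hypothetical stand-ins for skoorid and omadused$vanus, so that it runs on its own.

```r
set.seed(7)
n <- 120
age <- round(runif(n, 18, 70))                      # hypothetical ages
scores <- data.frame(MR1 = rnorm(n) - 0.02 * age,   # simulated: MR1 declines with age
                     MR2 = rnorm(n),
                     MR3 = rnorm(n))

# Fit one simple regression per factor and keep the coefficient row for age
age_effects <- t(sapply(scores, function(s) {
  coef(summary(lm(s ~ age)))["age", ]
}))
round(age_effects, 3)   # columns: Estimate, Std. Error, t value, Pr(>|t|)
```

For the practicum data, scores would be replaced by skoorid and age by omadused$vanus; rows with missing values are dropped by lm automatically.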

8. Compute Cronbach's alphas for the subscales corresponding to the factors. Can their values be considered satisfactory?

library(dplyr)
alpha(select(omadused2, vastutustundetu, hooletu, j_rjekindlusetu, p_simatu, korralik, ettevaatlik),check.keys=TRUE)
## 
## Reliability analysis   
## Call: alpha(x = select(omadused2, vastutustundetu, hooletu, j_rjekindlusetu, 
##     p_simatu, korralik, ettevaatlik), check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd
##       0.82      0.84    0.84      0.46 5.1 0.041  2.9 0.91
## 
##  lower alpha upper     95% confidence boundaries
## 0.74 0.82 0.9 
## 
##  Reliability if an item is dropped:
##                 raw_alpha std.alpha G6(smc) average_r S/N alpha se
## vastutustundetu      0.77      0.78    0.78      0.41 3.5    0.053
## hooletu              0.77      0.78    0.78      0.42 3.6    0.052
## j_rjekindlusetu      0.78      0.80    0.80      0.44 4.0    0.051
## p_simatu             0.79      0.81    0.81      0.46 4.2    0.049
## korralik-            0.79      0.80    0.80      0.45 4.1    0.050
## ettevaatlik-         0.86      0.87    0.86      0.57 6.5    0.042
## 
##  Item statistics 
##                   n raw.r std.r r.cor r.drop mean  sd
## vastutustundetu 158  0.83  0.84  0.82   0.74  2.3 1.1
## hooletu         158  0.81  0.83  0.81   0.72  2.8 1.1
## j_rjekindlusetu 158  0.77  0.78  0.74   0.65  2.8 1.2
## p_simatu        158  0.75  0.74  0.68   0.60  3.2 1.3
## korralik-       158  0.76  0.76  0.70   0.63  2.8 1.3
## ettevaatlik-    158  0.53  0.50  0.33   0.30  3.7 1.4
## 
## Non missing response frequency for each item
##                    1    2    3    4    5    6 miss
## vastutustundetu 0.28 0.37 0.25 0.05 0.03 0.03 0.01
## hooletu         0.09 0.33 0.34 0.16 0.06 0.02 0.01
## j_rjekindlusetu 0.14 0.29 0.34 0.16 0.05 0.02 0.01
## p_simatu        0.10 0.22 0.30 0.24 0.06 0.08 0.01
## korralik        0.04 0.03 0.18 0.34 0.22 0.19 0.01
## ettevaatlik     0.13 0.22 0.18 0.30 0.11 0.07 0.01
alpha(select(omadused2, eraldihoidev, seltskondlik, loid, vaikne, kartlik, salatseja),check.keys=TRUE)
## 
## Reliability analysis   
## Call: alpha(x = select(omadused2, eraldihoidev, seltskondlik, loid, 
##     vaikne, kartlik, salatseja), check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd
##        0.8      0.79    0.78      0.39 3.8 0.044  2.8 0.86
## 
##  lower alpha upper     95% confidence boundaries
## 0.71 0.8 0.88 
## 
##  Reliability if an item is dropped:
##               raw_alpha std.alpha G6(smc) average_r S/N alpha se
## eraldihoidev       0.73      0.73    0.71      0.35 2.7    0.057
## seltskondlik-      0.73      0.73    0.70      0.35 2.7    0.057
## loid               0.75      0.74    0.73      0.37 2.9    0.054
## vaikne             0.77      0.77    0.75      0.40 3.3    0.052
## kartlik            0.79      0.78    0.76      0.42 3.6    0.050
## salatseja          0.81      0.81    0.79      0.46 4.3    0.047
## 
##  Item statistics 
##                 n raw.r std.r r.cor r.drop mean  sd
## eraldihoidev  158  0.81  0.80  0.77   0.69  2.7 1.3
## seltskondlik- 158  0.82  0.81  0.79   0.70  2.8 1.3
## loid          158  0.74  0.75  0.69   0.62  2.6 1.1
## vaikne        158  0.70  0.68  0.58   0.53  3.1 1.3
## kartlik       158  0.60  0.63  0.52   0.46  3.0 1.0
## salatseja     158  0.53  0.54  0.37   0.33  2.6 1.2
## 
## Non missing response frequency for each item
##                 1    2    3    4    5    6 miss
## eraldihoidev 0.22 0.27 0.29 0.13 0.06 0.04 0.01
## seltskondlik 0.01 0.12 0.18 0.22 0.27 0.19 0.01
## loid         0.18 0.25 0.36 0.15 0.05 0.01 0.01
## vaikne       0.13 0.22 0.32 0.18 0.11 0.05 0.01
## kartlik      0.06 0.25 0.39 0.24 0.05 0.01 0.01
## salatseja    0.18 0.28 0.36 0.11 0.04 0.03 0.01
alpha(select(omadused2, h_irimatu, muretu, pingevaba, enesekindel),check.keys=TRUE)
## 
## Reliability analysis   
## Call: alpha(x = select(omadused2, h_irimatu, muretu, pingevaba, enesekindel), 
##     check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd
##       0.67      0.68    0.62      0.35 2.1 0.071  3.5 0.84
## 
##  lower alpha upper     95% confidence boundaries
## 0.54 0.67 0.81 
## 
##  Reliability if an item is dropped:
##             raw_alpha std.alpha G6(smc) average_r S/N alpha se
## h_irimatu        0.56      0.56    0.47      0.30 1.3    0.096
## muretu           0.64      0.65    0.55      0.38 1.8    0.088
## pingevaba        0.57      0.58    0.50      0.31 1.4    0.095
## enesekindel      0.65      0.66    0.57      0.39 1.9    0.087
## 
##  Item statistics 
##               n raw.r std.r r.cor r.drop mean  sd
## h_irimatu   157  0.74  0.76  0.65   0.54  3.8 1.1
## muretu      158  0.68  0.68  0.51   0.40  2.8 1.2
## pingevaba   156  0.75  0.75  0.62   0.51  3.5 1.2
## enesekindel 157  0.68  0.67  0.48   0.39  3.8 1.3
## 
## Non missing response frequency for each item
##                1    2    3    4    5    6 miss
## h_irimatu   0.01 0.08 0.29 0.37 0.20 0.06 0.01
## muretu      0.13 0.30 0.34 0.13 0.09 0.02 0.01
## pingevaba   0.06 0.15 0.31 0.28 0.16 0.04 0.02
## enesekindel 0.04 0.09 0.29 0.25 0.24 0.08 0.01
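
For reference, raw alpha is k/(k - 1) * (1 - sum of item variances / variance of the sum score), where k is the number of items. Negatively keyed items have to be reversed first, which check.keys=TRUE does automatically (the reversed items are marked with a trailing minus sign in the output above). The following base-R sketch implements this formula on simulated items; cronbach_alpha and the data are hypothetical and introduced only for illustration. A widely used convention, which is an assumption here and not stated in the practicum, regards alpha of about 0.7 or higher as satisfactory; by that standard the first two subscales (0.82 and 0.80) pass and the third (0.67) falls slightly short.

```r
# Hypothetical helper: raw Cronbach's alpha, using complete cases only
cronbach_alpha <- function(items) {
  items <- as.matrix(na.omit(items))
  k <- ncol(items)
  item_var <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  k / (k - 1) * (1 - sum(item_var) / total_var)
}

# Simulated items driven by one trait (not the practicum data)
set.seed(3)
trait <- rnorm(150)
items <- data.frame(a = trait + rnorm(150),
                    b = trait + rnorm(150),
                    c = trait + rnorm(150),
                    d = -trait + rnorm(150))   # negatively keyed item

# Reverse the negatively keyed item before computing alpha.
# Negation is enough here because alpha depends only on variances and covariances;
# on a 1 to 6 rating scale the usual reversal would be 7 - x.
items$d <- -items$d
cronbach_alpha(items)
```

The "Reliability if an item is dropped" tables above serve the same purpose as recomputing this function without one column: in the first subscale, dropping ettevaatlik would raise raw alpha from 0.82 to 0.86.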