Session Q (December 11) Task Check and Weekly Assignment

Introduction to Factor Analysis (2)

To Do
□ Try out the various options for factor analysis.

Assignment
□ Run a factor analysis on the test data, rotate the factor axes, and then name the factors.
Report the factor loadings together with the factor names.

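Five school subjects, as last time
Load and preprocess the usual file. Skip this step if you have already done it.
Note: if sample is already in your Workspace, the step has already been done.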
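The "already in the Workspace" check can be sketched in code. One caveat: base R has a function called sample(), so exists("sample") alone is always TRUE and cannot tell you whether the data frame is loaded. The helper name has_sample_df below is hypothetical (it is not part of the handout), and the sketch assumes the CSV sits in the working directory.

```r
# Hypothetical helper (not from the handout): TRUE only when `env` itself
# holds a data frame called `sample`. inherits = FALSE stops the lookup
# from falling through to the base R function sample().
has_sample_df <- function(env = globalenv()) {
  is.data.frame(get0("sample", envir = env, inherits = FALSE))
}

# Read and preprocess only when the data frame is missing and the file is there.
if (!has_sample_df() && file.exists("sample(mac).csv")) {
  sample <- read.csv("sample(mac).csv", na.strings = "*")
  sample$sex <- factor(sample$sex, labels = c("male", "female"))
}
```

Run at the top level of a script or the console, this leaves an existing sample untouched and loads the file otherwise.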

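# "*" marks a missing value in the CSV, so read it as NA.
sample <- read.csv("sample(mac).csv", na.strings = "*")
# Label the two sex codes, in sorted order, as male and female.
sample$sex <- factor(sample$sex, labels = c("male", "female"))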

Let's make a subset that contains only the columns used in the factor analysis.

fac_data <- subset(sample, select = c("kokugo", "sansuu", "rika", "eigo", "syakai"))
plot(fac_data)

[Figure: scatterplot matrix of the five subject scores, from plot(fac_data)]

As we saw before, a factor analysis is run like this.

library(psych)
fa(fac_data, nfactors = 2)
## Loading required package: GPArotation
## Factor Analysis using method =  minres
## Call: fa(r = fac_data, nfactors = 2)
## Standardized loadings (pattern matrix) based upon correlation matrix
##          MR2   MR1   h2     u2
## kokugo  0.99 -0.04 0.99 0.0082
## sansuu  0.03  1.00 1.00 0.0050
## rika   -0.10  0.69 0.49 0.5090
## eigo    0.93 -0.01 0.86 0.1388
## syakai  0.68  0.16 0.47 0.5314
## 
##                        MR2  MR1
## SS loadings           2.31 1.50
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.61 0.39
## Cumulative Proportion 0.61 1.00
## 
##  With factor correlations of 
##       MR2   MR1
## MR2  1.00 -0.08
## MR1 -0.08  1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.08 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.12 
## The number of observations was  100  with Chi Square =  7.36  with prob <  0.0067 
## 
## Tucker Lewis Index of factoring reliability =  0.791
## RMSEA index =  0.259  and the 90 % confidence intervals are  0.107 0.436
## BIC =  2.76
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                 MR2  MR1
## Correlation of scores with factors             1.00 1.00
## Multiple R square of scores with factors       0.99 1.00
## Minimum correlation of possible factor scores  0.98 0.99
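The degrees of freedom and the probability in the "Test of the hypothesis that 2 factors are sufficient" block can be recomputed by hand. The sketch below uses the standard degrees-of-freedom formulas for an exploratory factor model with p observed variables and m factors. The variable names are illustrative, and the chi-square value is copied from the output above.

```r
p <- 5  # observed variables (the five subjects)
m <- 2  # extracted factors

# Null model: all p(p - 1)/2 correlations are tested against zero.
df_null <- p * (p - 1) / 2               # 10

# Factor model: degrees of freedom left after fitting m factors.
df_model <- ((p - m)^2 - (p + m)) / 2    # 1

# Upper-tail probability of the reported chi-square statistic.
chi_sq <- 7.36
p_value <- pchisq(chi_sq, df = df_model, lower.tail = FALSE)
round(p_value, 4)
```

The rounded value should agree with the "prob <" figure in the output. A small probability here means the hypothesis that two factors are sufficient is rejected for these data, so the fit should be read with some caution.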

Adding an estimation option

fa(fac_data, nfactors = 2, fm = "gls")
## Factor Analysis using method =  gls
## Call: fa(r = fac_data, nfactors = 2, fm = "gls")
## Standardized loadings (pattern matrix) based upon correlation matrix
##         GLS1  GLS2   h2     u2
## kokugo  0.99 -0.05 0.98 0.0162
## sansuu  0.03  1.00 1.00 0.0021
## rika   -0.11  0.69 0.50 0.5022
## eigo    0.93  0.01 0.86 0.1422
## syakai  0.69  0.12 0.47 0.5277
## 
##                       GLS1 GLS2
## SS loadings           2.32 1.49
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.61 0.39
## Cumulative Proportion 0.61 1.00
## 
##  With factor correlations of 
##       GLS1  GLS2
## GLS1  1.00 -0.07
## GLS2 -0.07  1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.09 
## 
## The root mean square of the residuals (RMSR) is  0.02 
## The df corrected root mean square of the residuals is  0.1 
## The number of observations was  100  with Chi Square =  8.47  with prob <  0.0036 
## 
## Tucker Lewis Index of factoring reliability =  0.755
## RMSEA index =  0.281  and the 90 % confidence intervals are  0.127 0.455
## BIC =  3.86
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                GLS1 GLS2
## Correlation of scores with factors             0.99 1.00
## Multiple R square of scores with factors       0.99 1.01
## Minimum correlation of possible factor scores  0.97 1.01
## 
##  WARNING, the factor score fit indices suggest that the solution is degenerate. Try a different method of factor extraction.
## Warning: the factor score fit indices suggest that the solution is
## degenerate
fa(fac_data, nfactors = 2, fm = "ml")
## Factor Analysis using method =  ml
## Call: fa(r = fac_data, nfactors = 2, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          ML1   ML2   h2    u2
## kokugo  0.99 -0.04 1.00 0.005
## sansuu  0.03  1.00 1.00 0.005
## rika   -0.10  0.69 0.49 0.509
## eigo    0.93 -0.01 0.86 0.141
## syakai  0.68  0.16 0.47 0.533
## 
##                        ML1  ML2
## SS loadings           2.31 1.50
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.61 0.39
## Cumulative Proportion 0.61 1.00
## 
##  With factor correlations of 
##       ML1   ML2
## ML1  1.00 -0.08
## ML2 -0.08  1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.08 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.12 
## The number of observations was  100  with Chi Square =  7.33  with prob <  0.0068 
## 
## Tucker Lewis Index of factoring reliability =  0.792
## RMSEA index =  0.259  and the 90 % confidence intervals are  0.106 0.435
## BIC =  2.72
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                 ML1  ML2
## Correlation of scores with factors             1.00 1.00
## Multiple R square of scores with factors       1.00 1.00
## Minimum correlation of possible factor scores  0.99 0.99
fa(fac_data, nfactors = 2, fm = "pa")
## maximum iteration exceeded
## Factor Analysis using method =  pa
## Call: fa(r = fac_data, nfactors = 2, fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          PA1   PA2   h2       u2
## kokugo  1.00 -0.04 1.00  6.9e-05
## sansuu  0.02  1.06 1.13 -1.3e-01
## rika   -0.12  0.64 0.43  5.7e-01
## eigo    0.92  0.01 0.84  1.6e-01
## syakai  0.68  0.12 0.46  5.4e-01
## 
##                        PA1  PA2
## SS loadings           2.31 1.56
## Proportion Var        0.46 0.31
## Cumulative Var        0.46 0.77
## Proportion Explained  0.60 0.40
## Cumulative Proportion 0.60 1.00
## 
##  With factor correlations of 
##       PA1   PA2
## PA1  1.00 -0.06
## PA2 -0.06  1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.08 
## 
## The root mean square of the residuals (RMSR) is  0.02 
## The df corrected root mean square of the residuals is  0.09 
## The number of observations was  100  with Chi Square =  7.71  with prob <  0.0055 
## 
## Tucker Lewis Index of factoring reliability =  0.78
## RMSEA index =  0.266  and the 90 % confidence intervals are  0.113 0.442
## BIC =  3.1
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                PA1  PA2
## Correlation of scores with factors               1 1.08
## Multiple R square of scores with factors         1 1.16
## Minimum correlation of possible factor scores    1 1.33
## 
##  WARNING, the factor score fit indices suggest that the solution is degenerate. Try a different method of factor extraction.
## Warning: the factor score fit indices suggest that the solution is
## degenerate
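In the fm = "pa" output, sansuu has a communality (h2) above 1 and a negative uniqueness (u2). A negative unique variance is impossible, and this kind of improper solution is often called a Heywood case. That is one reason to prefer another extraction method here. A minimal sketch for flagging such variables follows, with the u2 values typed in from the output above. With a fitted object the same check could presumably be run on its uniquenesses element.

```r
# Uniquenesses (u2) copied from the fm = "pa" output above.
u2_pa <- c(kokugo = 6.9e-05, sansuu = -1.3e-01, rika = 5.7e-01,
           eigo = 1.6e-01, syakai = 5.4e-01)

# A negative uniqueness marks an improper solution for that variable.
improper <- names(u2_pa)[u2_pa < 0]
improper
```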

Adding a rotation option

fa(fac_data, nfactors = 2, fm = "ml", rotate = "promax")
## Factor Analysis using method =  ml
## Call: fa(r = fac_data, nfactors = 2, rotate = "promax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          ML2   ML1   h2    u2
## kokugo  0.99 -0.05 1.00 0.005
## sansuu  0.01  1.00 1.00 0.005
## rika   -0.12  0.68 0.49 0.509
## eigo    0.93 -0.02 0.86 0.141
## syakai  0.67  0.16 0.47 0.533
## 
##                        ML2  ML1
## SS loadings           2.31 1.49
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.61 0.39
## Cumulative Proportion 0.61 1.00
## 
##  With factor correlations of 
##       ML2   ML1
## ML2  1.00 -0.05
## ML1 -0.05  1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.08 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.12 
## The number of observations was  100  with Chi Square =  7.33  with prob <  0.0068 
## 
## Tucker Lewis Index of factoring reliability =  0.792
## RMSEA index =  0.259  and the 90 % confidence intervals are  0.106 0.435
## BIC =  2.72
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                 ML2  ML1
## Correlation of scores with factors             1.00 1.00
## Multiple R square of scores with factors       1.00 1.00
## Minimum correlation of possible factor scores  0.99 0.99
fa(fac_data, nfactors = 2, fm = "ml", rotate = "varimax")
## Factor Analysis using method =  ml
## Call: fa(r = fac_data, nfactors = 2, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          ML2   ML1   h2    u2
## kokugo  0.99 -0.14 1.00 0.005
## sansuu  0.05  1.00 1.00 0.005
## rika   -0.08  0.70 0.49 0.509
## eigo    0.92 -0.10 0.86 0.141
## syakai  0.68  0.09 0.47 0.533
## 
##                        ML2  ML1
## SS loadings           2.29 1.52
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.60 0.40
## Cumulative Proportion 0.60 1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.08 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.12 
## The number of observations was  100  with Chi Square =  7.33  with prob <  0.0068 
## 
## Tucker Lewis Index of factoring reliability =  0.792
## RMSEA index =  0.259  and the 90 % confidence intervals are  0.106 0.435
## BIC =  2.72
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                 ML2  ML1
## Correlation of scores with factors             1.00 1.00
## Multiple R square of scores with factors       1.00 1.00
## Minimum correlation of possible factor scores  0.99 0.99
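Varimax is an orthogonal rotation, so the varimax output has no "With factor correlations" block, whereas the promax output (an oblique rotation) does. An orthogonal rotation moves the axes but leaves each variable's communality unchanged. The sketch below illustrates this with varimax() from base R's stats package rather than psych, applied to the two-decimal loadings typed in from the output above. Because of that rounding, the communalities differ slightly from the h2 column.

```r
# Loadings copied (rounded to two decimals) from the varimax output above.
L <- matrix(c( 0.99, -0.14,
               0.05,  1.00,
              -0.08,  0.70,
               0.92, -0.10,
               0.68,  0.09),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("kokugo", "sansuu", "rika", "eigo", "syakai"),
                            c("ML2", "ML1")))

# stats::varimax() returns the rotated loadings and the rotation matrix.
rot <- varimax(L)
L_rot <- unclass(rot$loadings)

# Communality = row sum of squared loadings, before and after rotation.
h2_before <- rowSums(L^2)
h2_after <- rowSums(L_rot^2)
all.equal(h2_before, h2_after)

# For an orthogonal rotation, t(T) %*% T is the identity matrix.
round(crossprod(rot$rotmat), 10)
```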

Tidying the output for reporting

result.fa <- fa(fac_data, nfactors = 2, fm = "ml", rotate = "varimax")
print(result.fa, digits = 3, sort = TRUE)
## Factor Analysis using method =  ml
## Call: fa(r = fac_data, nfactors = 2, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        item    ML2    ML1    h2    u2
## kokugo    1  0.987 -0.144 0.995 0.005
## eigo      4  0.921 -0.105 0.859 0.141
## syakai    5  0.677  0.095 0.467 0.533
## sansuu    2  0.053  0.996 0.995 0.005
## rika      3 -0.085  0.696 0.491 0.509
## 
##                         ML2   ML1
## SS loadings           2.290 1.517
## Proportion Var        0.458 0.303
## Cumulative Var        0.458 0.761
## Proportion Explained  0.602 0.398
## Cumulative Proportion 0.602 1.000
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.309 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.077 
## 
## The root mean square of the residuals (RMSR) is  0.026 
## The df corrected root mean square of the residuals is  0.118 
## The number of observations was  100  with Chi Square =  7.328  with prob <  0.00679 
## 
## Tucker Lewis Index of factoring reliability =  0.7925
## RMSEA index =  0.2587  and the 90 % confidence intervals are  0.1061 0.4352
## BIC =  2.723
## Fit based upon off diagonal values = 0.994
## Measures of factor score adequacy             
##                                                  ML2   ML1
## Correlation of scores with factors             0.998 0.998
## Multiple R square of scores with factors       0.995 0.995
## Minimum correlation of possible factor scores  0.990 0.990
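For an orthogonal solution like this one, the h2, u2, SS loadings, and Proportion Var figures can all be recovered from the loadings alone. The following sketch retypes the three-decimal loadings from the output above. The object names are illustrative, and small differences in the last digit come from rounding.

```r
# Sorted varimax loadings copied from the printed output above.
loadings <- matrix(c( 0.987, -0.144,
                      0.921, -0.105,
                      0.677,  0.095,
                      0.053,  0.996,
                     -0.085,  0.696),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(c("kokugo", "eigo", "syakai", "sansuu", "rika"),
                                   c("ML2", "ML1")))

h2 <- rowSums(loadings^2)        # communality: variance explained per variable
u2 <- 1 - h2                     # uniqueness: what the factors leave unexplained
ss <- colSums(loadings^2)        # SS loadings: variance explained per factor
prop_var <- ss / nrow(loadings)  # Proportion Var: SS loadings / number of variables

round(cbind(h2, u2), 3)
round(rbind(ss, prop_var), 3)
```

When naming the factors, the sorted table makes the pattern easy to read: kokugo, eigo, and syakai load mainly on ML2, while sansuu and rika load mainly on ML1.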

Computing factor scores

result.fa <- fa(fac_data, nfactors = 2, fm = "ml", rotate = "varimax", scores = TRUE)
result.fa$scores
##          ML2      ML1
## 1   -2.28782  0.52135
## 2    1.12618 -1.49112
## 3   -1.06278  0.47028
## 4   -0.50931  0.60941
## 5    0.74822  0.20653
## 6         NA       NA
## 7         NA       NA
## 8    0.42126 -1.28433
## 9   -1.12771  0.64371
## 10  -0.66403  2.12267
## 11   0.59562 -0.45290
## 12        NA       NA
## 13  -0.77437  0.29396
## 14   0.02907  1.41534
## 15   0.35936  1.07128
## 16   0.73045  1.04833
## 17   0.13312 -0.60439
## 18  -1.13081 -2.20637
## 19   1.65254 -1.18254
## 20   0.64307  1.04547
## 21   0.30235  1.22896
## 22   0.83031  0.04123
## 23   0.63148 -1.29447
## 24  -0.55542  0.27383
## 25  -0.53745  1.94844
## 26   0.91697 -0.81596
## 27  -1.07856 -0.35986
## 28  -1.26689  0.98694
## 29   0.55806 -1.12788
## 30   0.70725  0.37142
## 31   2.01168 -0.37008
## 32   1.26971 -0.49705
## 33   1.27007 -0.15337
## 34   0.05609 -0.43770
## 35   0.17384  0.89896
## 36   0.33584 -0.61067
## 37   0.10210  0.41413
## 38  -0.87525  0.46217
## 39  -1.45305  0.99816
## 40   0.83226  0.19536
## 41   0.50208  0.55683
## 42   0.58750 -0.11684
## 43   0.08960 -1.26746
## 44  -0.88458 -0.04719
## 45   1.12299 -0.48915
## 46  -0.22051  0.25278
## 47  -0.95712 -1.38464
## 48  -0.79556 -0.37866
## 49  -0.41955 -0.40051
## 50  -0.37485 -0.23528
## 51  -0.30681 -0.23504
## 52  -1.03326  0.46792
## 53   0.12013 -1.09653
## 54   1.30800  0.84119
## 55   1.04386 -0.64571
## 56   1.80835  0.98677
## 57  -2.03813 -0.31921
## 58  -0.11617  0.58135
## 59   0.96493 -0.48143
## 60  -1.24427 -1.87774
## 61  -0.45939  0.93894
## 62  -1.40031  0.81961
## 63  -1.62075  0.16412
## 64  -0.80704  0.62569
## 65  -0.58404  0.44376
## 66  -1.92619 -0.14980
## 67  -0.46867  0.11211
## 68  -0.58661 -1.56832
## 69   0.34264  0.89019
## 70  -0.45484 -1.24249
## 71   0.10153  2.41090
## 72  -0.31991 -0.40537
## 73   1.23657 -0.31912
## 74  -0.20601 -1.25083
## 75  -0.21821 -0.57870
## 76  -0.61668 -1.22526
## 77   1.92370  0.14825
## 78   0.84758  0.20493
## 79   2.13928 -1.03818
## 80  -0.34168  1.59565
## 81   1.75009  1.15797
## 82  -1.67828 -1.83542
## 83   0.66776  1.21894
## 84   1.10882  1.36359
## 85   1.89139  0.47861
## 86  -0.09089 -0.41141
## 87  -0.55280 -0.89432
## 88   0.37035 -1.95369
## 89  -0.42903 -0.57637
## 90   0.57338 -1.12336
## 91  -1.16208  0.80913
## 92  -0.64795  0.44817
## 93  -1.30917  0.32045
## 94  -0.20495  0.58666
## 95   0.15207  0.90547
## 96  -0.11280 -0.92334
## 97  -0.14831  0.92208
## 98   2.20477  0.96461
## 99  -1.33279  1.83117
## 100 -0.81582 -1.21169
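Rows 6, 7, and 12 have NA scores. A likely reason, stated here as an assumption rather than something the handout confirms, is that these students have a missing value ("*" in the CSV) in at least one subject, so no score can be computed for them. The sketch below shows how such rows can be located with complete.cases(). The toy data frame is made up for illustration. With the real data the same call would be which(!complete.cases(fac_data)).

```r
# Made-up scores for four students; NA stands for a missing test result.
toy <- data.frame(kokugo = c(62, NA, 71, 55),
                  sansuu = c(48, 66, NA, 59))

# Row numbers of students with at least one missing value.
incomplete_rows <- which(!complete.cases(toy))
incomplete_rows
```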

Rechecking the basic properties of the factor scores

summary(result.fa$scores)
##       ML2               ML1        
##  Min.   :-2.2878   Min.   :-2.206  
##  1st Qu.:-0.6640   1st Qu.:-0.604  
##  Median :-0.1128   Median : 0.148  
##  Mean   :-0.0094   Mean   : 0.018  
##  3rd Qu.: 0.6678   3rd Qu.: 0.809  
##  Max.   : 2.2048   Max.   : 2.411  
##  NA's   :3         NA's   :3
hist(result.fa$scores[, 1])

[Figure: histogram of the first column of factor scores (ML2)]

hist(result.fa$scores[, 2])

[Figure: histogram of the second column of factor scores (ML1)]

plot(result.fa$scores[, 1], result.fa$scores[, 2])

[Figure: scatterplot of ML1 scores (vertical axis) against ML2 scores (horizontal axis)]

cor(result.fa$scores[, 1], result.fa$scores[, 2], use = "complete.obs")
## [1] -0.007582

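That's all for today.

For the assignment, use the vocational aptitude test data at http://www1.doshisha.ac.jp/~mjin/data/ to run a factor analysis (specify both the extraction method and the rotation), and name the factors.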