Session 8 (December 4): Task Check and Weekly Assignment

Introduction to Factor Analysis (1)

To Do
□ Try running a factor analysis

Assignment
□ How many factors are appropriate for the test data? Decide on a number and run the analysis.

As usual, read in the file and preprocess it. Skip this step if you have already done it.
Note: if sample is already in your Workspace, the step is already done.

sample <- read.csv("sample(mac).csv", na.strings = "*")
sample$sex <- factor(sample$sex, labels = c("male", "female"))
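To see what na.strings = "*" and the labels argument do, here is a minimal sketch on a made-up three-row CSV. The inline data and the assumption that sex is coded 1/2 are illustrative only; they are not taken from sample(mac).csv.

```r
# Toy CSV (hypothetical data, not the course file); "*" marks a missing score.
csv_text <- "sex,kokugo\n1,62\n2,*\n1,48\n"
toy <- read.csv(text = csv_text, na.strings = "*")

# Because "*" is read as NA, kokugo stays numeric instead of becoming text.
str(toy)

# factor() assigns the labels to the sorted unique codes:
# 1 -> "male", 2 -> "female" (assuming sex is coded 1/2).
toy$sex <- factor(toy$sex, labels = c("male", "female"))
toy
```

Without na.strings, the "*" entry would force the whole column to be read as character, and cor() would later fail on it.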

Let's make a subset containing only the variables used in the factor analysis.

fac_data <- subset(sample, select = c("kokugo", "sansuu", "rika", "eigo", "syakai"))
plot(fac_data)

[Figure: scatterplot matrix of the five variables in fac_data]

Factor analysis starts from the correlation matrix.

fac_cor <- cor(fac_data, use = "complete.obs")
fac_cor
##          kokugo   sansuu    rika     eigo  syakai
## kokugo  1.00000 -0.09332 -0.1814  0.92632  0.6514
## sansuu -0.09332  1.00000  0.6715 -0.05901  0.1287
## rika   -0.18145  0.67145  1.0000 -0.10476 -0.1026
## eigo    0.92632 -0.05901 -0.1048  1.00000  0.6087
## syakai  0.65137  0.12866 -0.1026  0.60869  1.0000
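The use = "complete.obs" argument matters because the data contain missing values. A minimal sketch of what it does, on a small made-up data frame (the toy values are an assumption for illustration only):

```r
# Toy data (hypothetical): the second row has a missing value.
toy <- data.frame(a = c(1, 2, 3, 4, 5),
                  b = c(2, NA, 3, 5, 4),
                  c = c(5, 3, 4, 1, 2))

# use = "complete.obs" drops every row that contains an NA before computing.
r_complete <- cor(toy, use = "complete.obs")

# This should match removing the incomplete rows by hand.
r_manual <- cor(na.omit(toy))
all.equal(r_complete, r_manual)
```

With the default use = "everything", any pair involving a column with an NA would come back as NA instead.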

An eigenvalue decomposition of the correlation matrix gives a rough guide to an appropriate number of factors.

eigen(fac_cor)
## $values
## [1] 2.51464 1.64970 0.50586 0.26116 0.06864
## 
## $vectors
##         [,1]     [,2]    [,3]    [,4]     [,5]
## [1,]  0.6015 -0.06416 -0.2828 -0.1582  0.72737
## [2,] -0.1026 -0.70841  0.2737 -0.6423 -0.01094
## [3,] -0.1918 -0.66555 -0.4132  0.5874  0.06698
## [4,]  0.5854 -0.10657 -0.4049 -0.1369 -0.68066
## [5,]  0.4982 -0.19933  0.7145  0.4456 -0.05497
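One common way to read these eigenvalues is the rule of thumb that keeps as many factors as there are eigenvalues greater than 1 (often called the Kaiser criterion; the notes do not name a specific rule, so treat this as one possible reading). The sketch below retypes the correlation matrix from the printed output, so its values are rounded and the results agree with the output above only approximately.

```r
# Correlation matrix retyped from the cor() output above (rounded values).
vars <- c("kokugo", "sansuu", "rika", "eigo", "syakai")
R <- matrix(c(
   1.00000, -0.09332, -0.18145,  0.92632,  0.65137,
  -0.09332,  1.00000,  0.67145, -0.05901,  0.12866,
  -0.18145,  0.67145,  1.00000, -0.10476, -0.10260,
   0.92632, -0.05901, -0.10476,  1.00000,  0.60869,
   0.65137,  0.12866, -0.10260,  0.60869,  1.00000
), nrow = 5, byrow = TRUE, dimnames = list(vars, vars))

ev <- eigen(R, symmetric = TRUE)$values

# Rule of thumb: count the eigenvalues greater than 1.
n_kaiser <- sum(ev > 1)

# The eigenvalues of a correlation matrix sum to the number of variables,
# so ev / length(ev) is the share of total variance per component.
prop <- ev / length(ev)
round(cbind(eigenvalue = ev, proportion = prop, cumulative = cumsum(prop)), 3)
n_kaiser
```

Of the eigenvalues printed above (2.51, 1.65, 0.51, 0.26, 0.07), two exceed 1, which points to a two-factor solution. This is only a guide; a scree plot or other criteria may suggest a different number.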

The raw output is hard to read, though, so let's use a package to view it more clearly.

library(psych)
VSS.scree(fac_data)

[Figure: scree plot produced by VSS.scree(fac_data)]

Once the number of factors is decided, pass it to fa() and the factor analysis is done.

fa(fac_data, nfactors = 2)
## Loading required package: GPArotation
## Factor Analysis using method =  minres
## Call: fa(r = fac_data, nfactors = 2)
## Standardized loadings (pattern matrix) based upon correlation matrix
##          MR2   MR1   h2     u2
## kokugo  0.99 -0.04 0.99 0.0082
## sansuu  0.03  1.00 1.00 0.0050
## rika   -0.10  0.69 0.49 0.5090
## eigo    0.93 -0.01 0.86 0.1388
## syakai  0.68  0.16 0.47 0.5314
## 
##                        MR2  MR1
## SS loadings           2.31 1.50
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.61 0.39
## Cumulative Proportion 0.61 1.00
## 
##  With factor correlations of 
##       MR2   MR1
## MR2  1.00 -0.08
## MR1 -0.08  1.00
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.31 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.08 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.12 
## The number of observations was  100  with Chi Square =  7.36  with prob <  0.0067 
## 
## Tucker Lewis Index of factoring reliability =  0.791
## RMSEA index =  0.259  and the 90 % confidence intervals are  0.107 0.436
## BIC =  2.76
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                 MR2  MR1
## Correlation of scores with factors             1.00 1.00
## Multiple R square of scores with factors       0.99 1.00
## Minimum correlation of possible factor scores  0.98 0.99

Let's tweak the call a little to make the output easier to read.

fa.result <- fa(fac_data, nfactors = 2)
print(fa.result, sort = TRUE, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = fac_data, nfactors = 2)
## Standardized loadings (pattern matrix) based upon correlation matrix
##        item    MR2    MR1    h2      u2
## kokugo    1  0.992 -0.044 0.992 0.00818
## eigo      4  0.927 -0.011 0.861 0.13878
## syakai    5  0.677  0.163 0.469 0.53143
## sansuu    2  0.029  0.999 0.995 0.00500
## rika      3 -0.102  0.686 0.491 0.50900
## 
##                         MR2   MR1
## SS loadings           2.312 1.496
## Proportion Var        0.462 0.299
## Cumulative Var        0.462 0.762
## Proportion Explained  0.607 0.393
## Cumulative Proportion 0.607 1.000
## 
##  With factor correlations of 
##        MR2    MR1
## MR2  1.000 -0.077
## MR1 -0.077  1.000
## 
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  10  and the objective function was  3.309 with Chi Square of  319.4
## The degrees of freedom for the model are 1  and the objective function was  0.077 
## 
## The root mean square of the residuals (RMSR) is  0.026 
## The df corrected root mean square of the residuals is  0.118 
## The number of observations was  100  with Chi Square =  7.361  with prob <  0.00667 
## 
## Tucker Lewis Index of factoring reliability =  0.7914
## RMSEA index =  0.2593  and the 90 % confidence intervals are  0.1068 0.4358
## BIC =  2.756
## Fit based upon off diagonal values = 0.994
## Measures of factor score adequacy             
##                                                  MR2   MR1
## Correlation of scores with factors             0.996 0.998
## Multiple R square of scores with factors       0.992 0.995
## Minimum correlation of possible factor scores  0.984 0.990
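The h2 (communality) and u2 (uniqueness) columns can be checked by hand. Because the factors here are allowed to correlate, the communality is the diagonal of L Phi L', where L is the pattern matrix and Phi is the factor correlation matrix. The sketch below retypes the rounded values from the printed output, so it should reproduce the reported h2 only to about two or three decimals.

```r
# Pattern loadings and factor correlation retyped from the printed output above.
vars <- c("kokugo", "eigo", "syakai", "sansuu", "rika")
L <- matrix(c( 0.992, -0.044,
               0.927, -0.011,
               0.677,  0.163,
               0.029,  0.999,
              -0.102,  0.686),
            ncol = 2, byrow = TRUE,
            dimnames = list(vars, c("MR2", "MR1")))
Phi <- matrix(c(1, -0.077, -0.077, 1), nrow = 2)

# For correlated factors, the communality is the diagonal of L Phi L'.
h2 <- diag(L %*% Phi %*% t(L))

# Uniqueness: the share of each variable's variance not explained by the factors.
u2 <- 1 - h2
round(cbind(h2, u2), 3)
```

A u2 close to 0, as for kokugo and sansuu, means the factors account for almost all of that variable's variance, which is worth keeping in mind when judging the solution.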

That's all for today.

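Assignment: using the vocational aptitude test data available at http://www1.doshisha.ac.jp/~mjin/data/, decide on the number of factors and carry the analysis through to running the factor analysis.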