Session 8 (December 4): Task Check and Weekly Assignment
To Do
□ Try running a factor analysis
Assignment
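□ How many factors are appropriate for the test data? Decide on a number and run the analysis.
First, the usual file import and preprocessing. Skip this step if you have already done it.
Note: if sample is already in your Workspace, this step is done.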
sample <- read.csv("sample(mac).csv", na.strings = "*")
sample$sex <- factor(sample$sex, labels = c("male", "female"))
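The two preprocessing lines above can be tried on a tiny inline table to see what they do. This is only a sketch: the column values and the coding of sex as 1 = male, 2 = female are assumptions for illustration, not taken from the course file.

```r
# Hypothetical miniature of the CSV; the real file's columns and codes may differ.
csv_text <- "sex,kokugo,sansuu
1,62,*
2,*,48
2,71,55"

demo <- read.csv(text = csv_text, na.strings = "*")

# "*" has become NA, so the score columns are read as numbers, not as text.
str(demo)

# Assumption: sex is coded 1 = male, 2 = female. factor() attaches the labels
# in the sorted order of the distinct values it finds.
demo$sex <- factor(demo$sex, labels = c("male", "female"))
table(demo$sex)
```

Without na.strings = "*", a column containing an asterisk would be read as character data and could not be passed to cor() later on.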
Let's make a subset that contains only the variables used in the factor analysis.
fac_data <- subset(sample, select = c("kokugo", "sansuu", "rika", "eigo", "syakai"))
plot(fac_data)
Factor analysis starts from the correlation matrix.
fac_cor <- cor(fac_data, use = "complete.obs")
fac_cor
##          kokugo   sansuu    rika     eigo  syakai
## kokugo  1.00000 -0.09332 -0.1814  0.92632  0.6514
## sansuu -0.09332  1.00000  0.6715 -0.05901  0.1287
## rika   -0.18145  0.67145  1.0000 -0.10476 -0.1026
## eigo    0.92632 -0.05901 -0.1048  1.00000  0.6087
## syakai  0.65137  0.12866 -0.1026  0.60869  1.0000
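The use = "complete.obs" argument in the cor() call above decides how rows with missing scores are treated. A minimal sketch on made-up numbers (the toy data frame is hypothetical, not the course data) shows that it amounts to listwise deletion:

```r
# Hypothetical scores with missing values (not the course data).
toy <- data.frame(
  a = c(1, 2, 3, 4, NA, 6),
  b = c(2, 1, 4, 3, 6, NA),
  c = c(6, 5, NA, 3, 2, 1)
)

# "complete.obs" drops every row that has at least one NA (listwise deletion),
r_complete <- cor(toy, use = "complete.obs")

# so it matches the correlation of the rows that survive na.omit().
r_manual <- cor(na.omit(toy))
all.equal(r_complete, r_manual)

# "pairwise.complete.obs" instead uses, for each pair of variables, every row
# where both are present, so different entries can rest on different rows.
r_pairwise <- cor(toy, use = "pairwise.complete.obs")

nrow(na.omit(toy))  # number of rows that "complete.obs" actually uses
```

Listwise deletion keeps the matrix internally consistent, which is usually what a factor analysis wants, at the cost of discarding partially observed cases.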
An eigendecomposition of the correlation matrix gives a rough guide to the appropriate number of factors.
eigen(fac_cor)
## $values
## [1] 2.51464 1.64970 0.50586 0.26116 0.06864
##
## $vectors
##         [,1]     [,2]    [,3]    [,4]     [,5]
## [1,]  0.6015 -0.06416 -0.2828 -0.1582  0.72737
## [2,] -0.1026 -0.70841  0.2737 -0.6423 -0.01094
## [3,] -0.1918 -0.66555 -0.4132  0.5874  0.06698
## [4,]  0.5854 -0.10657 -0.4049 -0.1369 -0.68066
## [5,]  0.4982 -0.19933  0.7145  0.4456 -0.05497
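One common way to read the eigenvalues above is the eigenvalue-greater-than-one rule of thumb (often called the Kaiser criterion). That rule is an assumption brought in here for illustration, since the handout does not name a specific criterion. The sketch rebuilds the correlation matrix from the rounded values printed earlier so that it runs on its own:

```r
vars <- c("kokugo", "sansuu", "rika", "eigo", "syakai")

# Correlation matrix retyped from the printed output (rounded values).
fac_cor <- matrix(c(
   1.00000, -0.09332, -0.18145,  0.92632,  0.65137,
  -0.09332,  1.00000,  0.67145, -0.05901,  0.12866,
  -0.18145,  0.67145,  1.00000, -0.10476, -0.10260,
   0.92632, -0.05901, -0.10476,  1.00000,  0.60869,
   0.65137,  0.12866, -0.10260,  0.60869,  1.00000
), nrow = 5, byrow = TRUE, dimnames = list(vars, vars))

ev <- eigen(fac_cor, symmetric = TRUE)$values

# Rule of thumb: keep as many factors as there are eigenvalues above 1,
# i.e. components that carry more variance than one standardized variable.
n_kaiser <- sum(ev > 1)
n_kaiser

# Share of the total variance (the eigenvalues sum to the number of variables).
round(cumsum(ev / sum(ev)), 3)
```

Here two eigenvalues exceed 1, and together they account for roughly 83% of the total variance, which is in line with the two-factor solution used later in this session.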
Raw numbers like these are hard to read, so let's use a package to get a clearer picture.
library(psych)
VSS.scree(fac_data)
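A scree plot is easier to judge against a reference for what pure noise would produce. The following is a minimal base-R sketch of that idea (a simple form of parallel analysis on correlation-matrix eigenvalues), not what VSS.scree itself computes. It assumes the eigenvalues printed earlier and the 100 observations reported by fa(); the seed and the 500 replications are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_obs  <- 100  # number of observations reported in the fa() output
n_vars <- 5
observed <- c(2.51464, 1.64970, 0.50586, 0.26116, 0.06864)  # from eigen(fac_cor)

# Eigenvalues of correlation matrices computed from pure-noise data
# of the same size as the real data set.
random_ev <- replicate(500, {
  noise <- matrix(rnorm(n_obs * n_vars), nrow = n_obs)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})
reference <- rowMeans(random_ev)

# Keep the leading eigenvalues for as long as they exceed the noise reference.
exceeds  <- observed > reference
n_retain <- if (all(exceeds)) n_vars else which(!exceeds)[1] - 1

round(rbind(observed, reference), 3)
n_retain
```

The psych package also provides fa.parallel() for this kind of comparison on the raw data; the hand-rolled version here is only meant to show the logic.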
Once the number of factors is decided, pass it to fa() and the factor analysis is done.
fa(fac_data, nfactors = 2)
## Loading required package: GPArotation
## Factor Analysis using method = minres
## Call: fa(r = fac_data, nfactors = 2)
## Standardized loadings (pattern matrix) based upon correlation matrix
##          MR2   MR1   h2     u2
## kokugo  0.99 -0.04 0.99 0.0082
## sansuu  0.03  1.00 1.00 0.0050
## rika   -0.10  0.69 0.49 0.5090
## eigo    0.93 -0.01 0.86 0.1388
## syakai  0.68  0.16 0.47 0.5314
##
##                        MR2  MR1
## SS loadings           2.31 1.50
## Proportion Var        0.46 0.30
## Cumulative Var        0.46 0.76
## Proportion Explained  0.61 0.39
## Cumulative Proportion 0.61 1.00
##
## With factor correlations of
##       MR2   MR1
## MR2  1.00 -0.08
## MR1 -0.08  1.00
##
## Test of the hypothesis that 2 factors are sufficient.
##
## The degrees of freedom for the null model are 10 and the objective function was 3.31 with Chi Square of 319.4
## The degrees of freedom for the model are 1 and the objective function was 0.08
##
## The root mean square of the residuals (RMSR) is 0.03
## The df corrected root mean square of the residuals is 0.12
## The number of observations was 100 with Chi Square = 7.36 with prob < 0.0067
##
## Tucker Lewis Index of factoring reliability = 0.791
## RMSEA index = 0.259 and the 90 % confidence intervals are 0.107 0.436
## BIC = 2.76
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
##                                                 MR2  MR1
## Correlation of scores with factors             1.00 1.00
## Multiple R square of scores with factors       0.99 1.00
## Minimum correlation of possible factor scores  0.98 0.99
Let's tweak the call a little to make the output easier to read.
fa.result <- fa(fac_data, nfactors = 2)
print(fa.result, sort = TRUE, digits = 3)
## Factor Analysis using method = minres
## Call: fa(r = fac_data, nfactors = 2)
## Standardized loadings (pattern matrix) based upon correlation matrix
##        item    MR2    MR1    h2      u2
## kokugo    1  0.992 -0.044 0.992 0.00818
## eigo      4  0.927 -0.011 0.861 0.13878
## syakai    5  0.677  0.163 0.469 0.53143
## sansuu    2  0.029  0.999 0.995 0.00500
## rika      3 -0.102  0.686 0.491 0.50900
##
##                         MR2   MR1
## SS loadings           2.312 1.496
## Proportion Var        0.462 0.299
## Cumulative Var        0.462 0.762
## Proportion Explained  0.607 0.393
## Cumulative Proportion 0.607 1.000
##
## With factor correlations of
##        MR2    MR1
## MR2  1.000 -0.077
## MR1 -0.077  1.000
##
## Test of the hypothesis that 2 factors are sufficient.
##
## The degrees of freedom for the null model are 10 and the objective function was 3.309 with Chi Square of 319.4
## The degrees of freedom for the model are 1 and the objective function was 0.077
##
## The root mean square of the residuals (RMSR) is 0.026
## The df corrected root mean square of the residuals is 0.118
## The number of observations was 100 with Chi Square = 7.361 with prob < 0.00667
##
## Tucker Lewis Index of factoring reliability = 0.7914
## RMSEA index = 0.2593 and the 90 % confidence intervals are 0.1068 0.4358
## BIC = 2.756
## Fit based upon off diagonal values = 0.994
## Measures of factor score adequacy
##                                                  MR2   MR1
## Correlation of scores with factors             0.996 0.998
## Multiple R square of scores with factors       0.992 0.995
## Minimum correlation of possible factor scores  0.984 0.990
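To see where the h2 (communality) and u2 (uniqueness) columns in the output above come from, they can be recomputed from the printed loadings and factor correlations. This sketch assumes the usual relation for correlated factors, h2 = diag(L Phi L') and u2 = 1 - h2, and uses the three-digit values retyped from the output, so small rounding differences are expected.

```r
vars <- c("kokugo", "eigo", "syakai", "sansuu", "rika")

# Pattern loadings retyped from the sorted three-digit output.
loadings <- matrix(c(
   0.992, -0.044,
   0.927, -0.011,
   0.677,  0.163,
   0.029,  0.999,
  -0.102,  0.686
), ncol = 2, byrow = TRUE, dimnames = list(vars, c("MR2", "MR1")))

# Factor correlation matrix from "With factor correlations of".
phi <- matrix(c(1, -0.077, -0.077, 1), nrow = 2)

# Communality: the share of each variable's variance reproduced by the factors.
h2 <- diag(loadings %*% phi %*% t(loadings))
# Uniqueness: whatever is left over.
u2 <- 1 - h2
round(cbind(h2, u2), 3)

# "Proportion Var" is "SS loadings" divided by the number of variables.
prop_var <- c(MR2 = 2.312, MR1 = 1.496) / nrow(loadings)
round(prop_var, 3)
```

A uniqueness close to zero, as for kokugo and sansuu here, means the two factors reproduce almost all of that variable's variance, which is worth keeping in mind when judging the solution.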
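That's all for today.
The assignment is to take the occupational aptitude test data available at http://www1.doshisha.ac.jp/~mjin/data/, decide on the number of factors, and run the factor analysis.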