library(psych)
library(moments)
library(lavaan)
library(reshape2)
library(tidyverse)
library(GPArotation)

Load the data and name it dta

data("bfi")
dta <- bfi

Inspect the data structure

str(bfi)
## 'data.frame':    2800 obs. of  28 variables:
##  $ A1       : int  2 2 5 4 2 6 2 4 4 2 ...
##  $ A2       : int  4 4 4 4 3 6 5 3 3 5 ...
##  $ A3       : int  3 5 5 6 3 5 5 1 6 6 ...
##  $ A4       : int  4 2 4 5 4 6 3 5 3 6 ...
##  $ A5       : int  4 5 4 5 5 5 5 1 3 5 ...
##  $ C1       : int  2 5 4 4 4 6 5 3 6 6 ...
##  $ C2       : int  3 4 5 4 4 6 4 2 6 5 ...
##  $ C3       : int  3 4 4 3 5 6 4 4 3 6 ...
##  $ C4       : int  4 3 2 5 3 1 2 2 4 2 ...
##  $ C5       : int  4 4 5 5 2 3 3 4 5 1 ...
##  $ E1       : int  3 1 2 5 2 2 4 3 5 2 ...
##  $ E2       : int  3 1 4 3 2 1 3 6 3 2 ...
##  $ E3       : int  3 6 4 4 5 6 4 4 NA 4 ...
##  $ E4       : int  4 4 4 4 4 5 5 2 4 5 ...
##  $ E5       : int  4 3 5 4 5 6 5 1 3 5 ...
##  $ N1       : int  3 3 4 2 2 3 1 6 5 5 ...
##  $ N2       : int  4 3 5 5 3 5 2 3 5 5 ...
##  $ N3       : int  2 3 4 2 4 2 2 2 2 5 ...
##  $ N4       : int  2 5 2 4 4 2 1 6 3 2 ...
##  $ N5       : int  3 5 3 1 3 3 1 4 3 4 ...
##  $ O1       : int  3 4 4 3 3 4 5 3 6 5 ...
##  $ O2       : int  6 2 2 3 3 3 2 2 6 1 ...
##  $ O3       : int  3 4 5 4 4 5 5 4 6 5 ...
##  $ O4       : int  4 3 5 3 3 6 6 5 6 5 ...
##  $ O5       : int  3 3 2 5 3 1 1 3 1 2 ...
##  $ gender   : int  1 2 2 2 1 2 1 1 1 2 ...
##  $ education: int  NA NA NA NA NA 3 NA 2 1 NA ...
##  $ age      : int  16 18 17 17 17 21 18 19 19 17 ...

Keep variables 1 to 25 of the data (items 1 to 25)

dta <- dta[, -c(26:28)]

Reverse-score the negatively keyed items

dta[, c("A1", "C4", "C5", "E1", "E2", "O2", "O5")] <- 7 - dta[, c("A1", "C4", "C5", "E1", "E2", "O2", "O5")]
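As a quick check on the reverse-scoring rule, the sketch below applies it to a made-up vector. It assumes the items use a 1 to 6 response scale, which is why the constant is 7 (minimum plus maximum). `reverse_item` is a hypothetical helper written for this illustration, not a psych function.

```r
# Hypothetical helper: reverse-score an item on a scale running from lo to hi.
reverse_item <- function(x, lo = 1, hi = 6) {
  (lo + hi) - x  # with lo = 1 and hi = 6 this is the same as 7 - x
}

x <- c(1, 2, 3, 4, 5, 6, NA)  # made-up responses, including one missing value
reversed <- reverse_item(x)   # missing values stay missing
print(reversed)
```

The lowest response becomes the highest and vice versa, so a reversed item points in the same direction as the other items on its scale.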

Drop rows with missing data

dta <- na.omit(dta)
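`na.omit` performs listwise deletion: a respondent is removed if any item is missing. In the bfi data this reduces the 2800 respondents to the 2436 complete cases reported in the factor analysis output. The sketch below shows the same behavior on a made-up data frame; the values are illustrative only.

```r
# Made-up data frame with three items and scattered missing values.
toy <- data.frame(
  A1 = c(2, NA, 5, 4),
  A2 = c(4, 4, NA, 4),
  A3 = c(3, 5, 5, 6)
)

complete <- na.omit(toy)                  # keep only rows with no NA in any column
n_dropped <- nrow(toy) - nrow(complete)   # number of respondents removed
print(complete)
print(n_dropped)
```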

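Parallel analysis of the number of factors underlying the 25 items

Retain components whose eigenvalue is greater than 1. A component with an eigenvalue below 1 explains less variance than a single standardized item, so keeping it does not help reduce the number of variables. Parallel analysis applies a stricter version of this idea by comparing each observed eigenvalue with the eigenvalue obtained from random data of the same size.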

fa.parallel(dta[, 1:25], fa = "pc", show.legend = FALSE)

## Parallel analysis suggests that the number of factors =  NA  and the number of components =  5

Parallel analysis suggests five components, so the scale is treated as having five factors.
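The logic behind parallel analysis can be sketched in base R. This is a simplified illustration, not a reproduction of what `fa.parallel` does internally. It simulates made-up data with two underlying components, then counts how many leading observed eigenvalues exceed the mean eigenvalues from random normal data of the same size. The data, the seed, and the helper name `parallel_sketch` are all assumptions.

```r
set.seed(1)

# Made-up data: 300 respondents, 6 items, two blocks of three correlated items.
n <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- cbind(
  f1 + rnorm(n, sd = 0.7), f1 + rnorm(n, sd = 0.7), f1 + rnorm(n, sd = 0.7),
  f2 + rnorm(n, sd = 0.7), f2 + rnorm(n, sd = 0.7), f2 + rnorm(n, sd = 0.7)
)

# Hypothetical helper: count leading components that beat random data.
parallel_sketch <- function(x, n_iter = 50) {
  obs <- eigen(cor(x), only.values = TRUE)$values
  rand <- replicate(n_iter, {
    noise <- matrix(rnorm(nrow(x) * ncol(x)), nrow = nrow(x))
    eigen(cor(noise), only.values = TRUE)$values
  })
  ref <- rowMeans(rand)        # mean random eigenvalue at each position
  sum(cumprod(obs > ref))      # length of the leading run where observed > random
}

n_comp <- parallel_sketch(toy)
print(n_comp)
```

Counting only the leading run stops at the first component that fails to beat random data, which mirrors how the scree comparison is usually read.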

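Exploratory factor analysis

Exploratory factor analysis reveals the sub-constructs that the scale reflects.

Factor loading: the strength of the relationship between an item and a factor (sub-construct).

The higher an item's loading, the better that item reflects the sub-construct, so items with higher loadings are the more important indicators.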

print.psych(fa(dta[, 1:25], nfactors = 5, fm = "pa", rotate = "promax"), cut = .3)
## Factor Analysis using method =  pa
## Call: fa(r = dta[, 1:25], nfactors = 5, rotate = "promax", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##      PA2   PA1   PA3   PA5   PA4   h2   u2 com
## A1                    0.46       0.20 0.80 1.4
## A2                    0.61       0.46 0.54 1.1
## A3                    0.62       0.54 0.46 1.3
## A4                    0.41       0.30 0.70 2.0
## A5        0.33        0.49       0.47 0.53 1.8
## C1              0.57             0.35 0.65 1.2
## C2              0.70             0.45 0.55 1.2
## C3              0.60             0.32 0.68 1.1
## C4              0.65             0.48 0.52 1.2
## C5              0.56             0.44 0.56 1.4
## E1        0.64                   0.35 0.65 1.1
## E2        0.71                   0.55 0.45 1.1
## E3        0.55                   0.44 0.56 1.6
## E4        0.66                   0.54 0.46 1.3
## E5        0.50                   0.41 0.59 1.8
## N1  0.84                         0.68 0.32 1.3
## N2  0.79                         0.61 0.39 1.2
## N3  0.74                         0.54 0.46 1.0
## N4  0.53 -0.31                   0.51 0.49 1.8
## N5  0.53                         0.35 0.65 1.5
## O1                          0.49 0.32 0.68 1.3
## O2                          0.48 0.27 0.73 1.5
## O3                          0.58 0.47 0.53 1.5
## O4                          0.37 0.25 0.75 2.7
## O5                          0.54 0.30 0.70 1.1
## 
##                        PA2  PA1  PA3  PA5  PA4
## SS loadings           2.69 2.59 2.02 1.79 1.50
## Proportion Var        0.11 0.10 0.08 0.07 0.06
## Cumulative Var        0.11 0.21 0.29 0.36 0.42
## Proportion Explained  0.25 0.24 0.19 0.17 0.14
## Cumulative Proportion 0.25 0.50 0.69 0.86 1.00
## 
##  With factor correlations of 
##       PA2   PA1   PA3   PA5  PA4
## PA2  1.00 -0.26 -0.22 -0.01 0.04
## PA1 -0.26  1.00  0.40  0.35 0.14
## PA3 -0.22  0.40  1.00  0.24 0.19
## PA5 -0.01  0.35  0.24  1.00 0.16
## PA4  0.04  0.14  0.19  0.16 1.00
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 5 factors are sufficient.
## 
## The degrees of freedom for the null model are  300  and the objective function was  7.48 with Chi Square of  18146.07
## The degrees of freedom for the model are 185  and the objective function was  0.64 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic number of observations is  2436 with the empirical chi square  1131.91  with prob <  1.1e-135 
## The total number of observations was  2436  with Likelihood Chi Square =  1538.69  with prob <  8e-212 
## 
## Tucker Lewis Index of factoring reliability =  0.877
## RMSEA index =  0.055  and the 90 % confidence intervals are  0.052 0.057
## BIC =  96.03
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    PA2  PA1  PA3  PA5  PA4
## Correlation of (regression) scores with factors   0.93 0.91 0.89 0.87 0.84
## Multiple R square of scores with factors          0.86 0.83 0.79 0.75 0.70
## Minimum correlation of possible factor scores     0.72 0.66 0.58 0.51 0.40

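cut = .3 suppresses factor loadings whose absolute value is below .3 in the printed output.

The pattern matrix above shows the following:

For PA1, Extraversion, the three highest loadings are E2 = 0.71, E4 = 0.66, and E1 = 0.64, so these items best represent this dimension.

For PA2, Neuroticism, the three highest loadings are N1 = 0.84, N2 = 0.79, and N3 = 0.74, so these items best represent this dimension.

For PA3, Conscientiousness, the three highest loadings are C2 = 0.70, C4 = 0.65, and C3 = 0.60, so these items best represent this dimension.

For PA4, Openness, the three highest loadings are O3 = 0.58, O5 = 0.54, and O1 = 0.49, so these items best represent this dimension.

For PA5, Agreeableness, the three highest loadings are A3 = 0.62, A2 = 0.61, and A5 = 0.49, so these items best represent this dimension.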
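The ranking of items within each factor can be reproduced with a few lines of base R. The sketch below retypes the primary loadings from the pattern matrix by hand (the two cross-loadings are left out) and picks the three largest per factor. The object names `loadings_tbl` and `top3` are illustrative, not part of the original analysis.

```r
# Primary loadings copied from the pattern matrix above (cross-loadings omitted).
loadings_tbl <- data.frame(
  item = c(paste0("A", 1:5), paste0("C", 1:5), paste0("E", 1:5),
           paste0("N", 1:5), paste0("O", 1:5)),
  fac = rep(c("PA5", "PA3", "PA1", "PA2", "PA4"), each = 5),
  loading = c(0.46, 0.61, 0.62, 0.41, 0.49,
              0.57, 0.70, 0.60, 0.65, 0.56,
              0.64, 0.71, 0.55, 0.66, 0.50,
              0.84, 0.79, 0.74, 0.53, 0.53,
              0.49, 0.48, 0.58, 0.37, 0.54),
  stringsAsFactors = FALSE
)

# For each factor, keep the three items with the largest loadings.
top3 <- lapply(split(loadings_tbl, loadings_tbl$fac), function(d) {
  d$item[order(d$loading, decreasing = TRUE)][1:3]
})
print(top3)
```

With a fitted model, the same ranking could be applied to the loadings matrix directly instead of to hand-copied values.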