dta_sleep <- subset(dta, select = c(Sleep1, Sleep2, Sleep3, Sleep4, Sleep5))
head(dta_sleep)
##   Sleep1 Sleep2 Sleep3 Sleep4 Sleep5
## 1      1      2      0      1      2
## 2      2      2      1      0      1
## 3      1      0      2      1      1
## 4      2      1      2      2      2
## 5      1      1      0      1      1
## 6      1      1      2      0      1
library(pacman)
p_load("psych")
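Using the psych package:
The KMO measure and Bartlett's test of sphericity are used to judge whether the data are suitable for factor analysis. KMO values above 0.6 are acceptable, and a Bartlett p < 0.01 indicates that the data are suitable for factor analysis.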
KMO(dta_sleep)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = dta_sleep)
## Overall MSA = 0.65
## MSA for each item =
## Sleep1 Sleep2 Sleep3 Sleep4 Sleep5
##   0.63   0.71   0.61   0.66   0.64
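As a rough illustration of what the MSA values measure, the sketch below computes the overall and per-item KMO in base R from the correlation matrix and the partial correlations. The data are simulated and purely hypothetical (they stand in for `dta_sleep`, which is not reproduced here), so the numbers will not match the output above.

```r
# Hypothetical stand-in for dta_sleep: five 0-2 items driven by one latent trait
set.seed(1)
n <- 647
latent <- rnorm(n)
items <- sapply(1:5, function(j) {
  as.integer(cut(0.6 * latent + rnorm(n), c(-Inf, -0.5, 0.5, Inf))) - 1L
})
colnames(items) <- paste0("Sleep", 1:5)

R <- cor(items)            # zero-order correlations
Q <- -cov2cor(solve(R))    # partial correlations, each pair given all other items
diag(R) <- 0               # only off-diagonal elements enter the sums
diag(Q) <- 0

# KMO: squared correlations relative to squared correlations plus squared partials
msa_overall <- sum(R^2) / (sum(R^2) + sum(Q^2))
msa_items   <- colSums(R^2) / (colSums(R^2) + colSums(Q^2))
round(msa_overall, 2)
round(msa_items, 2)
```

Values near 1 mean the partial correlations are small relative to the raw correlations, which is the situation in which a common factor is plausible.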
cortest.bartlett(dta_sleep)
## R was not square, finding R from data
## $chisq
## [1] 296.117
##
## $p.value
## [1] 1.028829e-57
##
## $df
## [1] 10
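The Bartlett statistic can also be computed by hand, which shows where the 10 degrees of freedom come from (p(p - 1)/2 with p = 5 items). This is a minimal sketch on simulated, hypothetical data standing in for `dta_sleep`, so the chi-square value will differ from the output above.

```r
# Hypothetical stand-in for dta_sleep: five 0-2 items driven by one latent trait
set.seed(1)
n <- 647
latent <- rnorm(n)
items <- sapply(1:5, function(j) {
  as.integer(cut(0.6 * latent + rnorm(n), c(-Inf, -0.5, 0.5, Inf))) - 1L
})

R <- cor(items)
p <- ncol(R)

# Bartlett's test of sphericity: H0 is that R is an identity matrix
chisq   <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df      <- p * (p - 1) / 2
p_value <- pchisq(chisq, df, lower.tail = FALSE)

c(chisq = chisq, df = df, p.value = p_value)
```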
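Start with one factor. Notes: the multiple R square of the factor scores with the factor is 0.61, while the factor itself accounts for 23% of the total item variance (Proportion Var). h2 is the proportion of each item's variance explained by the factor; Sleep1 (satisfaction) is highest at 38%. u2 is the proportion of variance that cannot be explained; for Sleep3 (timing: more than half of the sleep time falling between 2 and 4 a.m.) as much as 91% is unexplained. Timing does not appear to be a good item for this factor.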
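# fm: extraction method ("pa" = principal axis); nfactors: number of factors to extract; rotate: rotation method ("varimax" = maximum variance); cut: hide loadings below this value
print.psych(fa(dta_sleep, fm = "pa", nfactors = 1, rotate = "varimax"), cut = .3)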
## Factor Analysis using method = pa
## Call: fa(r = dta_sleep, nfactors = 1, rotate = "varimax", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##         PA1    h2   u2 com
## Sleep1 0.61 0.378 0.62   1
## Sleep2 0.38 0.141 0.86   1
## Sleep3      0.087 0.91   1
## Sleep4 0.48 0.234 0.77   1
## Sleep5 0.55 0.299 0.70   1
##
## PA1
## SS loadings 1.14
## Proportion Var 0.23
##
## Mean item complexity = 1
## Test of the hypothesis that 1 factor is sufficient.
##
## df null model = 10 with the objective function = 0.46 with Chi Square = 296.12
## df of the model are 5 and the objective function was 0.07
##
## The root mean square of the residuals (RMSR) is 0.07
## The df corrected root mean square of the residuals is 0.1
##
## The harmonic n.obs is 647 with the empirical chi square 59.73 with prob < 1.4e-11
## The total n.obs was 647 with Likelihood Chi Square = 47.39 with prob < 4.7e-09
##
## Tucker Lewis Index of factoring reliability = 0.703
## RMSEA index = 0.114 and the 90 % confidence intervals are 0.086 0.145
## BIC = 15.03
## Fit based upon off diagonal values = 0.91
## Measures of factor score adequacy
## PA1
## Correlation of (regression) scores with factors 0.78
## Multiple R square of scores with factors 0.61
## Minimum correlation of possible factor scores 0.23
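The relationships among h2, u2, SS loadings, and Proportion Var in the one-factor output can be checked with a few lines of arithmetic. The sketch below only reuses the h2 values printed above. It also shows why the Sleep3 loading is blank: its implied magnitude is about 0.29, just under `cut = .3`.

```r
# Communalities (h2) copied from the one-factor output above
h2 <- c(Sleep1 = 0.378, Sleep2 = 0.141, Sleep3 = 0.087,
        Sleep4 = 0.234, Sleep5 = 0.299)

u2          <- 1 - h2                    # uniqueness: variance the factor does not explain
ss_loadings <- sum(h2)                   # "SS loadings"
prop_var    <- ss_loadings / length(h2)  # "Proportion Var"
abs_loading <- sqrt(h2)                  # loading magnitude (valid for one factor only)

round(u2, 2)
round(ss_loadings, 2)   # 1.14
round(prop_var, 2)      # 0.23
round(abs_loading, 2)   # Sleep3 is 0.29, below the 0.3 display cutoff
```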
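Try two factors. Notes: Sleep2 (alertness) drops out, with no loading above the 0.3 cutoff (h2 = 0.12); among the items that still load, Sleep3 (timing) has the lowest explained variance (h2 = 0.14).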
print.psych(fa(dta_sleep, fm = "pa", nfactors = 2, rotate = "varimax"), cut = .3)
## maximum iteration exceeded
## Factor Analysis using method = pa
## Call: fa(r = dta_sleep, nfactors = 2, rotate = "varimax", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
## PA1 PA2 h2 u2 com
## Sleep1 0.82 0.69 0.31 1.0
## Sleep2 0.12 0.88 1.8
## Sleep3 0.36 0.14 0.86 1.1
## Sleep4 0.76 0.61 0.39 1.1
## Sleep5 0.48 0.26 0.74 1.3
##
## PA1 PA2
## SS loadings 1.03 0.78
## Proportion Var 0.21 0.16
## Cumulative Var 0.21 0.36
## Proportion Explained 0.57 0.43
## Cumulative Proportion 0.57 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 10 with the objective function = 0.46 with Chi Square = 296.12
## df of the model are 1 and the objective function was 0.01
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.06
##
## The harmonic n.obs is 647 with the empirical chi square 3.98 with prob < 0.046
## The total n.obs was 647 with Likelihood Chi Square = 3.43 with prob < 0.064
##
## Tucker Lewis Index of factoring reliability = 0.915
## RMSEA index = 0.061 and the 90 % confidence intervals are 0 0.138
## BIC = -3.04
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy
## PA1 PA2
## Correlation of (regression) scores with factors 0.84 0.78
## Multiple R square of scores with factors 0.71 0.60
## Minimum correlation of possible factor scores 0.41 0.21
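Parallel analysis can be used to decide how many factors to extract. A component is retained when its eigenvalue in the real data is larger than the mean of the corresponding eigenvalues from random data matrices. With fa = "pc" only principal components are evaluated, and the plot suggests one.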
fa.parallel(dta_sleep, fa = "pc", show.legend = TRUE, n.iter = 100)
## Parallel analysis suggests that the number of factors = NA and the number of components = 1
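The comparison behind parallel analysis can be sketched in base R. This is a simplified illustration on simulated, hypothetical data (continuous items driven by one latent trait), not a reimplementation of `fa.parallel`, which differs in details such as resampling and the comparison quantile.

```r
set.seed(42)
n <- 647
p <- 5

# Hypothetical data: five items sharing one latent trait
latent <- rnorm(n)
items  <- sapply(seq_len(p), function(j) 0.6 * latent + rnorm(n))

# Eigenvalues of the observed correlation matrix
obs_eig <- eigen(cor(items), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from random normal data of the same size, repeated 100 times
n_iter <- 100
rand_eig <- replicate(n_iter, {
  noise <- matrix(rnorm(n * p), nrow = n)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})
mean_rand <- rowMeans(rand_eig)   # mean random eigenvalue at each position

# Keep leading components until the observed eigenvalue stops exceeding the random mean
below <- which(obs_eig <= mean_rand)
n_components <- if (length(below) == 0) p else below[1] - 1

round(rbind(observed = obs_eig, random_mean = mean_rand), 2)
n_components
```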
Using the lavaan package. Reference: Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL http://www.jstatsoft.org/v48/i02/
library(pacman)
p_load("lavaan", "semPlot")
sleep.model <- 'sleep=~Sleep1+Sleep2+Sleep3+Sleep4+Sleep5'
fit <- cfa(sleep.model, data=dta_sleep)
Interpretation: the CFI of 0.856 does not exceed 0.9, while the SRMR of 0.057 is below 0.08 and therefore acceptable.
sleep.model <- 'sleep=~Sleep1+Sleep2+Sleep3+Sleep4+Sleep5'
cfafit <- cfa(model = sleep.model, data = dta_sleep)
summary(cfafit,
        fit.measures = TRUE,
        standardized = TRUE)
## lavaan 0.6.15 ended normally after 32 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 10
##
## Number of observations 647
##
## Model Test User Model:
##
## Test statistic 46.388
## Degrees of freedom 5
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 297.728
## Degrees of freedom 10
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.856
## Tucker-Lewis Index (TLI) 0.712
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3294.993
## Loglikelihood unrestricted model (H1) -3271.799
##
## Akaike (AIC) 6609.985
## Bayesian (BIC) 6654.709
## Sample-size adjusted Bayesian (SABIC) 6622.959
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.113
## 90 Percent confidence interval - lower 0.085
## 90 Percent confidence interval - upper 0.144
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 0.971
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.057
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## sleep =~
## Sleep1 1.000 0.403 0.646
## Sleep2 0.611 0.095 6.416 0.000 0.246 0.371
## Sleep3 0.507 0.102 4.964 0.000 0.204 0.267
## Sleep4 0.870 0.122 7.159 0.000 0.351 0.439
## Sleep5 0.919 0.118 7.811 0.000 0.371 0.574
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Sleep1 0.227 0.023 9.759 0.000 0.227 0.583
## .Sleep2 0.381 0.023 16.228 0.000 0.381 0.863
## .Sleep3 0.544 0.032 17.152 0.000 0.544 0.929
## .Sleep4 0.516 0.034 15.304 0.000 0.516 0.807
## .Sleep5 0.280 0.023 12.118 0.000 0.280 0.671
## sleep 0.163 0.026 6.173 0.000 1.000 1.000
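The CFI, TLI, and RMSEA reported in the summary can be reproduced from the two chi-square tests, which makes clear what each index compares. The sketch below uses the standard textbook formulas and only the values printed above. The RMSEA here divides by N (some texts use N - 1; both round to 0.113 for these data).

```r
# Values copied from the lavaan summary above
chisq_m <- 46.388;  df_m <- 5    # user model
chisq_b <- 297.728; df_b <- 10   # baseline (independence) model
n_obs   <- 647

# CFI: 1 minus the ratio of model misfit to baseline misfit
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

# TLI: same comparison, based on chi-square / df ratios
tli <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)

# RMSEA: misfit per degree of freedom and per observation
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n_obs))

round(c(CFI = cfi, TLI = tli, RMSEA = rmsea), 3)   # 0.856, 0.712, 0.113
```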
dev.new()
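semPaths(cfafit, what = "std", # show standardized estimates; use what = "par" for the raw estimates
         rotation = 2,         # place the latent variable on the left and the observed variables on the right
         edge.color = "black",
         esize = 0.5,          # thickness of the arrows
         edge.label.cex = 1,   # font size of all edge labels
         exoVar = FALSE)       # do not show the variances of exogenous variables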
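Conclusion:
Overall, our data suggest that a single factor is appropriate, and the CFA results fit only marginally (SRMR 0.057 is acceptable, but CFI 0.856 and RMSEA 0.113 are not). In the path diagram, the number on each arrow is the standardized coefficient; the larger the coefficient, the darker the line and its label. The residual variances of Sleep3 (Timing) and Sleep4 (Efficiency) are nearly transparent. The better items are Sleep1 (Satisfaction) and Sleep5 (Duration).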