EFAshiny : R Code Demo

Overview

This tutorial demonstrates how to perform the fundamental analyses in EFAshiny using R code. By following each line, you can learn basic R programming and build automated processing scripts for future analyses. You can also use the code from EFAshiny within a script pipeline without launching the app. However, we still suggest using the EFAshiny app :)) You can also look at our tutorial for the EFAshiny GUI to compare it with the code as you write. Have fun with EFAshiny and R!

Preparation

First, load all the packages and functions used by EFAshiny into the R session. If any of these packages are not installed, use install.packages("packagename") to install them.

require(ggplot2);require(psych);require(corrplot);require(reshape2);require(moments);require(gridExtra)
require(qgraph);require(bootnet);require(igraph);require(ggcorrplot);require(RCurl)
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/my_summary.R")
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/faplot.R")
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/bootEGA.R")
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/bargraph.R")
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/stackbar.R")
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/printLoadings.R")
source("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/inst/efas/functions/theme_default.R")
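
If you are not sure which of the packages above are already installed, a small base R check can list the missing ones before you call install.packages. This is only a sketch: the helper name missing_packages is our own and is not part of EFAshiny.

```r
# Hypothetical helper (not part of EFAshiny): return the packages that are not installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE/FALSE without attaching the package
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("ggplot2", "psych", "corrplot", "reshape2", "moments", "gridExtra",
            "qgraph", "bootnet", "igraph", "ggcorrplot", "RCurl")
to_install <- missing_packages(needed)
to_install

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

The install line is left commented out so that running the sketch never changes your library by accident.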

Data Input

Read the demonstration data RSE into R and assign it to an object called dta. Then use head to inspect the first rows. See our tutorial for a description of the data.

dta <- read.csv(text=getURL("https://raw.githubusercontent.com/PsyChiLin/EFAshiny/master/RSE/RSE.csv"))
head(dta)

##   Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
## 1  4  4  1  4  1  3  3  2  1   1
## 2  1  2  3  2  2  2  1  3  4   4
## 3  3  3  2  3  1  2  3  4  4   4
## 4  3  3  3  4  2  2  1  4  4   4
## 5  2  2  3  2  2  1  2  4  4   4
## 6  3  3  3  2  3  2  2  3  3   2
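
Before running any analysis, it can help to check the dimensions, column types, and missing values of the data. The sketch below does this on the six rows printed above, typed in as CSV text so that it runs without a network connection. The object name demo is our own; with the full data you would run the same checks on dta.

```r
# The six rows printed by head(dta), typed in as CSV text for illustration only.
csv_text <- "Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9,Q10
4,4,1,4,1,3,3,2,1,1
1,2,3,2,2,2,1,3,4,4
3,3,2,3,1,2,3,4,4,4
3,3,3,4,2,2,1,4,4,4
2,2,3,2,2,1,2,4,4,4
3,3,3,2,3,2,2,3,3,2"
demo <- read.csv(text = csv_text)

dim(demo)             # number of rows and columns
sapply(demo, class)   # every item should be numeric or integer
anyNA(demo)           # TRUE would signal missing responses
range(unlist(demo))   # smallest and largest observed response
```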

Data Summary

Several kinds of data summaries can be performed in EFAshiny. We demonstrate them using basic code. The ggplot2 package is useful for plotting. We use it for the histograms and the density plots; the correlation matrix is drawn with corrplot and with ggcorrplot, which builds on ggplot2. You can also check the documentation of ggplot2.

Numeric Statistics

NumericStatistic <- apply(dta,2,my_summary)
row.names(NumericStatistic) <- c("Mean","SD","Skewness","Kurtosis","Median","MAD")
NumericStatistic <- as.data.frame(t(NumericStatistic))
NumericStatistic <- round(NumericStatistic,3)
NumericStatistic

##      Mean    SD Skewness Kurtosis Median   MAD
## Q1  2.930 0.878   -0.489    2.543      3 1.483
## Q2  3.027 0.794   -0.661    3.222      3 0.000
## Q3  2.387 0.943    0.118    2.116      2 1.483
## Q4  2.926 0.796   -0.474    2.898      3 0.000
## Q5  2.441 0.968    0.036    2.028      2 1.483
## Q6  2.398 0.889    0.073    2.266      2 1.483
## Q7  2.285 0.908    0.101    2.137      2 1.483
## Q8  2.754 0.945   -0.332    2.222      3 1.483
## Q9  2.848 0.964   -0.534    2.367      3 1.483
## Q10 2.695 1.056   -0.270    1.866      3 1.483
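
The function my_summary is sourced from the EFAshiny repository, so its body is not shown here. As a rough guide to what the six columns mean, the base R sketch below computes the same kinds of statistics. It is an assumption about the definitions, not a copy of my_summary: skewness and kurtosis are the moment-based versions (a normal distribution has kurtosis 3, which fits the values near 2 to 3 in the table), MAD is the scaled median absolute deviation returned by mad(), and dropping missing values is our own choice.

```r
# Hypothetical re-implementation; the real my_summary.R may differ in detail.
summary_sketch <- function(x) {
  x <- x[!is.na(x)]            # assumption: ignore missing responses
  m  <- mean(x)
  m2 <- mean((x - m)^2)        # second central moment
  m3 <- mean((x - m)^3)        # third central moment
  m4 <- mean((x - m)^4)        # fourth central moment
  c(Mean     = m,
    SD       = sd(x),
    Skewness = m3 / m2^1.5,    # moment-based skewness
    Kurtosis = m4 / m2^2,      # moment-based (non-excess) kurtosis
    Median   = median(x),
    MAD      = mad(x))         # median absolute deviation, scaled by 1.4826
}

round(summary_sketch(c(1, 2, 3, 4, 5)), 3)
```

To apply it to every item, you could use apply(dta, 2, summary_sketch) in the same way as the code above.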

Histogram

dta_long <- melt(dta)
colnames(dta_long) <- c("Item", "Response")
Histogram <- ggplot(dta_long, aes(x = Response, fill = Item))+
        geom_histogram(bins = 10)+
        facet_wrap(~Item)+
        theme_default()
Histogram
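
The call to melt turns the wide data (one column per item) into long data (one row per response), which is the layout that facet_wrap needs. The base R sketch below shows the same reshaping on a toy data frame holding the first three responses to Q1 and Q2. The object names wide and long are our own.

```r
# Toy wide data: first three responses to Q1 and Q2 from the output above.
wide <- data.frame(Q1 = c(4, 1, 3), Q2 = c(4, 2, 3))

# One row per response: the item name is repeated for each row of the wide data.
long <- data.frame(
  Item     = factor(rep(names(wide), each = nrow(wide)), levels = names(wide)),
  Response = unlist(wide, use.names = FALSE)
)
long
```

Keeping Item as a factor with the original column order means the facets appear in the order Q1, Q2, and so on, rather than alphabetically.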

Density Plot

DensityPlot <- ggplot(dta_long, aes(x = Response, fill = Item))+
        geom_density()+
        facet_wrap(~Item)+
        theme_default()
DensityPlot

Correlation Matrix

CorMat <- cor(as.matrix(dta))
corrplot(CorMat,order="hclust",type="upper",method="ellipse",
         tl.pos = "lt",mar = c(2,2,2,2))
corrplot(CorMat,order="hclust",type="lower",method="number",
         diag=FALSE,tl.pos="n", cl.pos="n",add=TRUE,mar = c(2,2,2,2))
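
Alongside the plot, it is sometimes handy to list the strongest item pairs as a table. The sketch below does this in base R for any correlation matrix. The helper name top_correlations and the small matrix are made up for illustration; with the real data you would pass CorMat.

```r
# Hypothetical helper: list the n item pairs with the largest absolute correlation.
top_correlations <- function(cormat, n = 3) {
  idx <- which(upper.tri(cormat), arr.ind = TRUE)   # each pair once, no diagonal
  pairs <- data.frame(
    item1 = rownames(cormat)[idx[, "row"]],
    item2 = colnames(cormat)[idx[, "col"]],
    r     = cormat[idx],
    stringsAsFactors = FALSE
  )
  pairs <- pairs[order(abs(pairs$r), decreasing = TRUE), ]
  rownames(pairs) <- NULL
  head(pairs, n)
}

# Made-up 3 x 3 correlation matrix, for illustration only.
toy <- matrix(c( 1.0, 0.2, -0.8,
                 0.2, 1.0,  0.5,
                -0.8, 0.5,  1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
top_correlations(toy, n = 2)
```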

ggcorrplot

ggcorrplot(CorMat, hc.order = T,type = "lower", lab = TRUE,
           colors = c("#E46726", "white", "#6D9EC2"))

Factor Retention

We provide several factor retention methods in EFAshiny. Those analyses can also be performed using R code. You can adopt these methods for your own use.

Scree Plot and Parallel Analysis

PA <- faplot(CorMat,n.obs = 256, quant = 0.95)

## Parallel analysis suggests that the number of factors =  NA  and the number of components =  2

PA[[1]]
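
To see the idea behind parallel analysis, the base R sketch below compares the eigenvalues of an observed correlation matrix with the 95th percentile of eigenvalues obtained from random normal data of the same size, and keeps the leading components that exceed that threshold. This is a minimal component-based version written for illustration. It is not the code inside faplot, and the function name parallel_components and the simulated two-factor data are our own assumptions.

```r
# Hypothetical helper: component-based parallel analysis in base R.
parallel_components <- function(x, n_sim = 200, quant = 0.95) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  observed <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values
  # Eigenvalues of correlation matrices from uncorrelated random data (p x n_sim)
  simulated <- replicate(n_sim, {
    z <- matrix(rnorm(n * p), nrow = n, ncol = p)
    eigen(cor(z), symmetric = TRUE, only.values = TRUE)$values
  })
  threshold <- apply(simulated, 1, quantile, probs = quant)
  exceeds <- observed > threshold
  # Count leading eigenvalues above the threshold, stopping at the first failure
  n_keep <- if (all(exceeds)) p else which(!exceeds)[1] - 1
  list(observed = observed, threshold = threshold, n_components = n_keep)
}

# Simulated data: 256 cases, 10 items, two uncorrelated factors (illustration only).
set.seed(1)
n <- 256
scores <- matrix(rnorm(n * 2), nrow = n, ncol = 2)
loadings <- rbind(matrix(c(0.8, 0), nrow = 5, ncol = 2, byrow = TRUE),
                  matrix(c(0, 0.8), nrow = 5, ncol = 2, byrow = TRUE))
items <- scores %*% t(loadings) + matrix(rnorm(n * 10, sd = 0.6), nrow = n, ncol = 10)

pa_sketch <- parallel_components(items)
pa_sketch$n_components
```

Raising quant makes the rule more conservative, because the observed eigenvalues must beat a higher percentile of the random ones.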

Numeric Rules

NumericRule <- VSS(CorMat,n = 4, plot = F, n.obs = 256)
temp1 <- data.frame(nFactor = row.names(NumericRule$vss.stats), 
                     VSS1 = NumericRule$cfit.1, VSS2 = NumericRule$cfit.2, 
                     MAP = NumericRule$map)
temp2 <- NumericRule$vss.stats[,c(6:8,11)]
NumericRule <- cbind(temp1,temp2)
NumericRule

##   nFactor      VSS1      VSS2        MAP      RMSEA       BIC      SABIC
## 1       1 0.8914042 0.0000000 0.05447774 0.18020765 124.59929 235.558803
## 2       2 0.6280592 0.9415872 0.04162043 0.10067140 -52.99140  29.435667
## 3       3 0.4763088 0.7915412 0.07233850 0.07679094 -55.83651   1.228384
## 4       4 0.4340840 0.7117347 0.09725540 0.03740394 -46.47254 -11.599548
##         SRMR
## 1 0.09574323
## 2 0.03536388
## 3 0.02436892
## 4 0.01113139
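
Each criterion in this table is read in its own direction: VSS fit is better when larger, while MAP, BIC, and SABIC are better when smaller. The sketch below retypes part of the printed table and picks the preferred number of factors for each criterion. The object names rules and suggested are our own.

```r
# Values copied from the NumericRule table printed above.
rules <- data.frame(
  nFactor = 1:4,
  VSS1  = c(0.8914042, 0.6280592, 0.4763088, 0.4340840),
  VSS2  = c(0.0000000, 0.9415872, 0.7915412, 0.7117347),
  MAP   = c(0.05447774, 0.04162043, 0.07233850, 0.09725540),
  BIC   = c(124.59929, -52.99140, -55.83651, -46.47254),
  SABIC = c(235.558803, 29.435667, 1.228384, -11.599548)
)

suggested <- c(
  VSS1  = rules$nFactor[which.max(rules$VSS1)],   # larger is better
  VSS2  = rules$nFactor[which.max(rules$VSS2)],   # larger is better
  MAP   = rules$nFactor[which.min(rules$MAP)],    # smaller is better
  BIC   = rules$nFactor[which.min(rules$BIC)],    # smaller is better
  SABIC = rules$nFactor[which.min(rules$SABIC)]   # smaller is better
)
suggested
```

For these values the criteria do not agree (VSS1 picks 1, VSS2 and MAP pick 2, BIC picks 3, SABIC picks 4), which is why it is worth weighing several retention methods together rather than relying on one.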

Exploratory Graph Analysis (EGA)

EGArst <- bootEGA(data = dta, n = 10, medianStructure = TRUE, plot.MedianStructure = TRUE, 
                  ncores = 4, layout = "spring")

## model set to 'GGM'

## Estimating sample network...

## Estimating Network. Using package::function:
##   - qgraph::EBICglasso for EBIC model selection
##     - using glasso::glasso
##   - qgraph::cor_auto for correlation computation
##     - using lavaan::lavCor

## Variables detected as ordinal: Q1; Q2; Q3; Q4; Q5; Q6; Q7; Q8; Q9; Q10

## Bootstrapping...

## Computing statistics...

plot(EGArst$plot)

Extraction and Rotation

The key analyses in EFA are factor extraction and rotation. We show how to perform them using the psych package. With these lines, you can obtain the basic results of EFA. See the code in the EFAshiny server and the psych package documentation for details.

EFArst <- fa(CorMat,2,n.obs=256, rotate = "promax",fm = "pa", n.iter = 200)

## Loading required namespace: GPArotation

EFArst

## Factor Analysis with confidence intervals using method = fa(r = CorMat, nfactors = 2, n.obs = 256, n.iter = 200, rotate = "promax", 
##     fm = "pa")
## Factor Analysis using method =  pa
## Call: fa(r = CorMat, nfactors = 2, n.obs = 256, n.iter = 200, rotate = "promax", 
##     fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       PA1   PA2   h2   u2 com
## Q1   0.88  0.09 0.67 0.33 1.0
## Q2   0.87  0.12 0.62 0.38 1.0
## Q3  -0.23  0.59 0.58 0.42 1.3
## Q4   0.77  0.08 0.52 0.48 1.0
## Q5  -0.39  0.37 0.48 0.52 2.0
## Q6   0.69 -0.14 0.63 0.37 1.1
## Q7   0.70 -0.13 0.62 0.38 1.1
## Q8  -0.23  0.41 0.35 0.65 1.6
## Q9   0.20  0.95 0.68 0.32 1.1
## Q10  0.11  0.96 0.79 0.21 1.0
## 
##                        PA1  PA2
## SS loadings           3.40 2.55
## Proportion Var        0.34 0.25
## Cumulative Var        0.34 0.60
## Proportion Explained  0.57 0.43
## Cumulative Proportion 0.57 1.00
## 
##  With factor correlations of 
##       PA1   PA2
## PA1  1.00 -0.67
## PA2 -0.67  1.00
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  45  and the objective function was  5.73 with Chi Square of  1436.37
## The degrees of freedom for the model are 26  and the objective function was  0.37 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.05 
## 
## The harmonic number of observations is  256 with the empirical chi square  28.82  with prob <  0.32 
## The total number of observations was  256  with Likelihood Chi Square =  91.2  with prob <  3.6e-09 
## 
## Tucker Lewis Index of factoring reliability =  0.918
## RMSEA index =  0.101  and the 90 % confidence intervals are  0.078 0.122
## BIC =  -52.97
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    PA1  PA2
## Correlation of (regression) scores with factors   0.95 0.95
## Multiple R square of scores with factors          0.90 0.90
## Minimum correlation of possible factor scores     0.81 0.81
## 
##  Coefficients and bootstrapped confidence intervals 
##       low   PA1 upper   low   PA2 upper
## Q1   0.78  0.88  0.96 -0.03  0.09  0.22
## Q2   0.78  0.87  0.94  0.00  0.12  0.25
## Q3  -0.37 -0.23 -0.10  0.45  0.59  0.74
## Q4   0.66  0.77  0.87 -0.05  0.08  0.19
## Q5  -0.53 -0.39 -0.24  0.23  0.37  0.52
## Q6   0.54  0.69  0.83 -0.29 -0.14 -0.01
## Q7   0.57  0.70  0.82 -0.26 -0.13 -0.02
## Q8  -0.38 -0.23 -0.09  0.25  0.41  0.57
## Q9   0.07  0.20  0.30  0.85  0.95  1.02
## Q10  0.01  0.11  0.18  0.86  0.96  1.04
## 
##  Interfactor correlations and bootstrapped confidence intervals 
##         lower estimate upper
## PA1-PA2 -0.71    -0.67 -0.61
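
Because promax is an oblique rotation, the communality (h2) of an item is not simply the sum of its squared loadings; the factor correlation also enters. With pattern loadings L and factor correlation matrix Phi, the communalities are the diagonal of L Phi L', and the uniqueness (u2) is 1 minus h2. The sketch below checks this using the rounded values printed above, so small differences in the second decimal are expected. The object names are our own.

```r
# Pattern loadings and factor correlation copied (rounded) from the output above.
L <- matrix(c( 0.88,  0.09,
               0.87,  0.12,
              -0.23,  0.59,
               0.77,  0.08,
              -0.39,  0.37,
               0.69, -0.14,
               0.70, -0.13,
              -0.23,  0.41,
               0.20,  0.95,
               0.11,  0.96),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("Q", 1:10), c("PA1", "PA2")))
Phi <- matrix(c(1, -0.67, -0.67, 1), nrow = 2)

# Row-wise version of diag(L %*% Phi %*% t(L))
h2 <- rowSums((L %*% Phi) * L)
printed_h2 <- c(0.67, 0.62, 0.58, 0.52, 0.48, 0.63, 0.62, 0.35, 0.68, 0.79)

round(cbind(h2 = h2, u2 = 1 - h2, printed_h2 = printed_h2), 2)
```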

Visualization

By passing the EFA results from the analyses above into well-established functions, you can easily visualize them. EFAshiny, of course, does this for you automatically.

Diagram

fa.diagram(EFArst,simple = T,cut = 0.33,
           sort = T,errors = T,e.size = 0.05)
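
With simple = T, the diagram draws only the largest loading of each item, and cut hides loadings whose absolute value is not above the cutoff. The sketch below reproduces that selection as a table, using the rounded pattern loadings from the EFA output. It only illustrates the selection rule and is not the code inside fa.diagram; the object names are our own.

```r
# Pattern loadings copied (rounded) from the EFA output above.
L <- matrix(c( 0.88,  0.09,
               0.87,  0.12,
              -0.23,  0.59,
               0.77,  0.08,
              -0.39,  0.37,
               0.69, -0.14,
               0.70, -0.13,
              -0.23,  0.41,
               0.20,  0.95,
               0.11,  0.96),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("Q", 1:10), c("PA1", "PA2")))
cut_off <- 0.33

primary   <- apply(abs(L), 1, which.max)            # column of the largest |loading|
strongest <- L[cbind(seq_len(nrow(L)), primary)]    # that loading, sign kept

assignment <- data.frame(
  Item    = rownames(L),
  Factor  = colnames(L)[primary],
  Loading = strongest,
  Shown   = abs(strongest) > cut_off,               # would survive the cutoff
  stringsAsFactors = FALSE
)
assignment
```

Note that Q5 loads almost equally on both factors (-0.39 and 0.37), so its assignment to PA1 rests on a small difference.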

Bootstrapping Factor Loadings

order <- rev(row.names(as.data.frame(printLoadings(EFArst$cis$means,sort = T,cutoff = 0)))) # item order for the plots, taken from the sorted loadings

## 
## Loadings:
##     PA1   PA2  
## Q1   0.87  0.09
## Q2   0.86  0.12
## Q4   0.76  0.07
## Q7   0.69 -0.14
## Q6   0.68 -0.15
## Q5  -0.39  0.38
## Q10  0.09  0.95
## Q9   0.19  0.94
## Q3  -0.23  0.60
## Q8  -0.23  0.41

bargraph(EFArst,order = order,nf = 2,highcol = "firebrick",lowcol = "chartreuse4",ci = T)

Factor Loadings and Correlation Matrix

stackbar(CorMat,EFArst,order = order,highcol = "firebrick",lowcol = "chartreuse4")