library(rio)
library(psych)
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
library(corpcor)
library(GPArotation)
GeoScience <- read.csv("/Users/Lorraine/Desktop/Geoscience.csv")
Geo <- GeoScience[, 1:13]
#Scree plot
pc1 <- principal(Geo, nfactors = 13, rotate = "none")
plot(pc1$values, type = "b")
#Parallel analysis
fa.parallel(Geo, fm = "pa", fa = "fa", n.iter = 500)
## Parallel analysis suggests that the number of factors = 3 and the number of components = NA
The scree plot from the unrotated 13-component solution suggested that we could retain the first 4 or 5 factors, since the slope is relatively steep up to that point. However, the parallel analysis suggested retaining 3 factors, because only the first 3 factors explain more variance than the corresponding factors from the simulated data. Therefore, I am going to retain 3 factors.
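The retention rule used by parallel analysis can be illustrated with a minimal base-R sketch. This is not what `fa.parallel()` does internally: it is a simplified version that uses eigenvalues of the full correlation matrix rather than the reduced (common-factor) matrix, and it runs on a hypothetical toy data set (`toy`) rather than the Geoscience data. The 95th-percentile threshold and the 200 replications are assumptions chosen for illustration.

```r
# Minimal sketch of the idea behind parallel analysis (base R only).
# Simplification: eigenvalues come from the full correlation matrix,
# not the reduced matrix analysed when fa = "fa" is requested.
set.seed(1)

# Hypothetical stand-in data: 137 respondents, 13 items, 3 latent factors.
n <- 137
p <- 13
loadings <- matrix(0, nrow = p, ncol = 3)
loadings[1:5, 1]   <- 0.7
loadings[6:9, 2]   <- 0.7
loadings[10:13, 3] <- 0.7
scores <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
toy <- scores %*% t(loadings) + matrix(rnorm(n * p, sd = 0.7), nrow = n, ncol = p)

# Eigenvalues of the observed correlation matrix.
obs_eig <- eigen(cor(toy), only.values = TRUE)$values

# Eigenvalues from uncorrelated random normal data of the same shape.
sim_eig <- replicate(200, {
  eigen(cor(matrix(rnorm(n * p), nrow = n, ncol = p)), only.values = TRUE)$values
})
threshold <- apply(sim_eig, 1, quantile, probs = 0.95)

# Retain leading dimensions for as long as observed exceeds simulated.
exceeds <- obs_eig > threshold
n_retain <- if (all(exceeds)) p else which(!exceeds)[1] - 1
n_retain
```

The scree plot answers the same question by eye, while this comparison gives an explicit benchmark for each eigenvalue, which is why the two can disagree.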
FAOB <- fa(Geo, fm="pa", nfactors = 3, rotate = "promax")
print.psych(FAOB, cut = .3, sort = TRUE)
## Factor Analysis using method = pa
## Call: fa(r = Geo, nfactors = 3, rotate = "promax", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        item  PA1  PA2  PA3     h2   u2 com
## item07    7 0.83           0.7113 0.29 1.2
## item06    6 0.83           0.5865 0.41 1.2
## item03    3 0.68           0.5195 0.48 1.4
## item08    8 0.43           0.2420 0.76 1.3
## item12   12                0.0525 0.95 1.9
## item09    9      0.83      0.7100 0.29 1.1
## item10   10      0.74      0.5205 0.48 1.1
## item11   11      0.55      0.2715 0.73 1.1
## item13   13      0.39      0.2419 0.76 1.6
## item02    2           0.71 0.5614 0.44 1.1
## item01    1           0.55 0.3363 0.66 1.3
## item04    4           0.35 0.1752 0.82 1.4
## item05    5                0.0089 0.99 1.3
##
## PA1 PA2 PA3
## SS loadings 2.04 1.74 1.16
## Proportion Var 0.16 0.13 0.09
## Cumulative Var 0.16 0.29 0.38
## Proportion Explained 0.41 0.35 0.23
## Cumulative Proportion 0.41 0.77 1.00
##
## With factor correlations of
## PA1 PA2 PA3
## PA1 1.00 0.34 0.23
## PA2 0.34 1.00 0.14
## PA3 0.23 0.14 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 78 and the objective function was 3.27 with Chi Square of 427.43
## The degrees of freedom for the model are 42 and the objective function was 0.61
##
## The root mean square of the residuals (RMSR) is 0.05
## The df corrected root mean square of the residuals is 0.07
##
## The harmonic number of observations is 137 with the empirical chi square 61.69 with prob < 0.025
## The total number of observations was 137 with Likelihood Chi Square = 78.4 with prob < 0.00056
##
## Tucker Lewis Index of factoring reliability = 0.803
## RMSEA index = 0.084 and the 90 % confidence intervals are 0.052 0.107
## BIC = -128.24
## Fit based upon off diagonal values = 0.94
## Measures of factor score adequacy
## PA1 PA2 PA3
## Correlation of (regression) scores with factors 0.92 0.90 0.83
## Multiple R square of scores with factors 0.84 0.81 0.69
## Minimum correlation of possible factor scores 0.68 0.63 0.38
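The loading table shows that items 3, 6, 7, and 8 loaded on Factor 1; items 9, 10, 11, and 13 loaded on Factor 2; and items 1, 2, and 4 loaded on Factor 3. Items 5 and 12 did not load on any factor: neither has a loading above the .30 cutoff, and their communalities are very low (h2 = 0.0089 and 0.0525).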
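The item-to-factor assignment can be checked programmatically. The sketch below is an illustration only: it rebuilds the printed pattern matrix by hand (the matrix `L` is a hypothetical stand-in for the fitted loadings), with `NA` wherever the printout suppressed a loading below `cut = .3`, since those values are not shown in the output.

```r
# Salient loadings copied from the printed pattern matrix; NA marks
# loadings that the printout suppressed because they were below cut = .3.
L <- matrix(NA_real_, nrow = 13, ncol = 3,
            dimnames = list(sprintf("item%02d", 1:13), c("PA1", "PA2", "PA3")))
L[c("item07", "item06", "item03", "item08"), "PA1"] <- c(0.83, 0.83, 0.68, 0.43)
L[c("item09", "item10", "item11", "item13"), "PA2"] <- c(0.83, 0.74, 0.55, 0.39)
L[c("item02", "item01", "item04"), "PA3"]           <- c(0.71, 0.55, 0.35)

cutoff <- 0.3
salient <- !is.na(L) & abs(L) >= cutoff

# Factor(s) on which each item has a salient loading (NA if none).
assigned <- apply(salient, 1, function(row) {
  hit <- colnames(L)[row]
  if (length(hit) == 0) NA_character_ else paste(hit, collapse = "+")
})

split(names(assigned), assigned)   # items grouped by factor
names(assigned)[is.na(assigned)]   # items with no salient loading
```

An item that cross-loaded would show up under a combined label such as `PA1+PA2`; none do here, which is consistent with the low mean item complexity in the output.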
I would label Factor 1 “Career interests in science” since all 4 of its items capture career attitudes toward science or geoscience. Factor 2 should be labeled “Outdoor Enjoyment” since the items loading on it measure individuals’ interest in outdoor activities. Factor 3 should be labeled “Knowledge in science” because all items loading on this factor capture people’s skills and knowledge in science.
##      PA1  PA2  PA3
## PA1 1.00 0.34 0.23
## PA2 0.34 1.00 0.14
## PA3 0.23 0.14 1.00
The correlations among the factors are not strong: Factor 1 and Factor 2 are moderately correlated (r = .34), while the correlations between Factor 1 and Factor 3 (r = .23) and between Factor 2 and Factor 3 (r = .14) are weak.
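The verbal labels for the factor correlations can be made explicit with a small sketch. The matrix `phi` below is typed in from the output (in a live session it should be available as `FAOB$Phi`). The cutoffs of .30 and .50 are an assumed rule of thumb, not something the analysis itself specifies.

```r
# Factor correlation matrix copied from the fa() output.
phi <- matrix(c(1.00, 0.34, 0.23,
                0.34, 1.00, 0.14,
                0.23, 0.14, 1.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("PA1", "PA2", "PA3"), c("PA1", "PA2", "PA3")))

# Unique factor pairs (upper triangle only).
idx <- which(upper.tri(phi), arr.ind = TRUE)
pairs <- data.frame(
  pair = paste(rownames(phi)[idx[, "row"]], colnames(phi)[idx[, "col"]], sep = "-"),
  r    = phi[upper.tri(phi)]
)

# Assumed cutoffs: |r| < .30 weak, .30 to < .50 moderate, >= .50 strong.
pairs$strength <- cut(abs(pairs$r), breaks = c(0, 0.30, 0.50, 1),
                      labels = c("weak", "moderate", "strong"),
                      right = FALSE, include.lowest = TRUE)
pairs
```

Because none of the correlations is zero, an oblique rotation such as promax remains a reasonable choice even though the factors are only modestly related.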
Items best represent factor 1: item3, item6, item7
Items best represent factor 2: item9, item10, item11
Items best represent factor 3: item1, item2, item4
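One way to arrive at these item choices is to rank the salient loadings within each factor and keep the largest three. The sketch below assumes that "best represent" means "highest pattern loading", and it uses a hand-typed data frame (`salient_loadings`, a hypothetical name) containing only the loadings shown in the printed output.

```r
# Salient loadings (>= .3) copied from the printed pattern matrix.
salient_loadings <- data.frame(
  item    = c(7, 6, 3, 8, 9, 10, 11, 13, 2, 1, 4),
  factor  = rep(c("PA1", "PA2", "PA3"), times = c(4, 4, 3)),
  loading = c(0.83, 0.83, 0.68, 0.43, 0.83, 0.74, 0.55, 0.39, 0.71, 0.55, 0.35)
)

# Keep the three largest absolute loadings within each factor.
top3 <- do.call(rbind, lapply(split(salient_loadings, salient_loadings$factor),
                              function(d) head(d[order(-abs(d$loading)), ], 3)))
top3
```

Ranking by loading alone ignores item content, so the result is best read as a check on the choices listed above rather than a replacement for judgment.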