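library(psych)   # cortest.bartlett(), KMO(), principal()
library(lavaan)  # cfa()
library(knitr)   # kable()
stressdf <- read.csv("stress.csv")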
First, I am going to make a vector with the names of all the questions I will be dealing with. It only includes the questions that start with ‘c’.
questions <- paste0("c", 1:48)  # "c1", "c2", ..., "c48"
Now any time I need to refer to all the questions, all I have to enter is ‘stressdf[questions]’.
Next, I have to make a correlation matrix of all the items.
questions.cor <- cor(stressdf[questions])
Then I have to run Bartlett’s test of sphericity, which tests whether the correlation matrix is an identity matrix. The output should be significant, meaning there is some correlation between the variables/items in the survey. The second argument is the sample size (n), so I will input 310.
cortest.bartlett(questions.cor,n=310)
## $chisq
## [1] 5475.709
##
## $p.value
## [1] 0
##
## $df
## [1] 1128
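To make the numbers above less of a black box, here is a minimal base-R sketch of the usual chi-square approximation behind Bartlett’s test. The function name `bartlett_sphericity` and the simulated `toy` data are assumptions made up for illustration (they are not the stress survey), and the sketch is not guaranteed to match `cortest.bartlett()` in every detail.

```r
# Bartlett's test of sphericity from a correlation matrix R and sample size n.
# Hypothetical helper for illustration only.
bartlett_sphericity <- function(R, n) {
  p <- ncol(R)
  # log-determinant of R (determinant() avoids underflow in det())
  logdet <- as.numeric(determinant(R, logarithm = TRUE)$modulus)
  chisq <- -((n - 1) - (2 * p + 5) / 6) * logdet
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Simulated stand-in data: 6 items that share one common source of variance
set.seed(1)
common <- rnorm(310)
toy <- sapply(1:6, function(i) common + rnorm(310))

res <- bartlett_sphericity(cor(toy), n = 310)
res$df          # 6 * 5 / 2 = 15
df_48 <- 48 * 47 / 2
df_48           # 1128, the df reported above for 48 items
```

The degrees of freedom depend only on the number of items, which is why 48 items give 1128.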
Because the test is significant (p-value < 0.05), we move on to find the KMO measure of sampling adequacy.
KMO(questions.cor)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = questions.cor)
## Overall MSA = 0.84
## MSA for each item =
## c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16
## 0.79 0.84 0.86 0.87 0.71 0.82 0.78 0.76 0.82 0.84 0.81 0.80 0.86 0.84 0.88 0.88
## c17 c18 c19 c20 c21 c22 c23 c24 c25 c26 c27 c28 c29 c30 c31 c32
## 0.90 0.83 0.86 0.82 0.88 0.83 0.82 0.88 0.84 0.87 0.87 0.89 0.87 0.86 0.87 0.74
## c33 c34 c35 c36 c37 c38 c39 c40 c41 c42 c43 c44 c45 c46 c47 c48
## 0.79 0.87 0.86 0.85 0.85 0.83 0.84 0.86 0.85 0.86 0.85 0.77 0.84 0.84 0.90 0.79
The overall MSA is 0.8438605, putting it into the “great” category.
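For intuition about what the overall MSA measures, here is a rough base-R sketch: it compares the squared correlations with the squared partial correlations, using off-diagonal entries only. The helper name `kmo_overall` and the simulated data are assumptions for illustration, not part of the analysis above.

```r
# Overall KMO / MSA from a correlation matrix (hypothetical helper).
kmo_overall <- function(R) {
  # Rescaled inverse of R; its off-diagonal entries are the partial
  # correlations up to sign, and the sign drops out once squared.
  Q <- cov2cor(solve(R))
  off <- row(R) != col(R)
  r2 <- sum(R[off]^2)
  q2 <- sum(Q[off]^2)
  r2 / (r2 + q2)
}

# Simulated stand-in data: 6 items that share one common source of variance
set.seed(1)
common <- rnorm(310)
toy <- sapply(1:6, function(i) common + rnorm(310))

msa <- kmo_overall(cor(toy))
msa  # always between 0 and 1; closer to 1 means small partial correlations
```

When the partial correlations are small relative to the raw correlations, the ratio approaches 1, which is the situation in which factoring the items makes sense.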
I am setting the number of factors equal to the number of items in the data frame, so that the eigenvalues of every possible factor can be plotted.
efa <- principal(stressdf[questions],nfactors=(ncol(stressdf[questions])))
plot(efa$values,type = "b",
main="Scree plot",
xlab="Factors",
ylab="Eigenvalue")
From the scree plot, I think there are 5 factors.
Next, I make a model with 5 factors, sort the factor loadings, and hide any loading smaller than .35.
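As a rough illustration of what the scree plot shows, the sketch below computes the eigenvalues of a correlation matrix in base R. The data are simulated and hypothetical, and I am assuming that `efa$values` holds these same eigenvalues for the real items.

```r
# Simulated stand-in data: 8 items, two of the pairs made to correlate
set.seed(2)
toy <- matrix(rnorm(310 * 8), ncol = 8)
toy[, 2] <- toy[, 1] + rnorm(310, sd = 0.5)
toy[, 4] <- toy[, 3] + rnorm(310, sd = 0.5)

# Eigenvalues of the correlation matrix, returned in decreasing order
ev <- eigen(cor(toy), symmetric = TRUE, only.values = TRUE)$values

sum(ev)            # equals the number of items (trace of a correlation matrix)
prop_var <- ev / length(ev)
round(prop_var, 2) # share of total variance carried by each component
```

Because the eigenvalues sum to the number of items, each one divided by that number is the proportion of variance for an unrotated component, and the "elbow" in the plot marks where that share stops dropping quickly.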
efa.model <- principal(stressdf[questions],nfactors=5,rotate="oblimin")
print.psych(efa.model,cut=.35,sort=TRUE)
## Principal Components Analysis
## Call: principal(r = stressdf[questions], nfactors = 5, rotate = "oblimin")
## Standardized loadings (pattern matrix) based upon correlation matrix
## item TC2 TC3 TC4 TC5 TC1 h2 u2 com
## c42 42 0.72 0.56 0.44 1.2
## c21 21 0.71 0.50 0.50 1.1
## c47 47 0.69 0.48 0.52 1.1
## c41 41 0.66 0.47 0.53 1.3
## c27 27 0.60 0.40 0.60 1.2
## c43 43 0.60 0.40 0.60 1.1
## c39 39 0.59 0.41 0.59 1.4
## c26 26 0.58 0.36 0.64 1.2
## c24 24 0.58 0.42 0.58 1.6
## c36 36 0.58 0.36 0.64 1.1
## c2 2 0.57 0.38 0.62 1.3
## c46 46 0.54 0.33 0.67 1.5
## c6 6 0.52 0.34 0.66 1.8
## c10 10 0.47 0.31 0.69 1.8
## c1 1 0.47 0.25 0.75 1.2
## c15 15 0.45 0.27 0.73 1.5
## c14 14 0.77 0.61 0.39 1.0
## c19 19 0.70 0.61 0.39 1.2
## c30 30 0.70 0.54 0.46 1.1
## c38 38 0.66 0.48 0.52 1.1
## c13 13 0.58 0.39 0.61 1.2
## c28 28 0.52 0.45 0.55 1.6
## c45 45 0.48 0.32 0.68 1.4
## c31 31 0.78 0.65 0.35 1.1
## c29 29 0.76 0.63 0.37 1.1
## c37 37 0.73 0.58 0.42 1.0
## c4 4 0.63 0.49 0.51 1.2
## c32 32 0.62 0.38 0.62 1.6
## c35 35 0.59 0.48 0.52 1.9
## c3 3 0.54 0.50 0.50 1.9
## c23 23 0.43 0.46 0.54 3.3
## c9 9 0.35 0.65 4.0
## c12 12 0.74 0.58 0.42 1.0
## c18 18 0.72 0.58 0.42 1.0
## c48 48 0.60 0.43 0.57 1.2
## c20 20 0.59 0.42 0.58 1.1
## c11 11 0.54 0.40 0.60 1.8
## c40 40 0.51 0.36 0.64 1.3
## c44 44 0.24 0.76 3.2
## c22 22 0.62 0.50 0.50 1.3
## c16 16 0.60 0.49 0.51 1.4
## c34 34 0.59 0.46 0.54 1.3
## c17 17 0.55 0.46 0.54 1.6
## c25 25 0.55 0.49 0.51 1.8
## c5 5 0.48 0.26 0.74 1.4
## c33 33 0.46 0.42 0.58 2.5
## c7 7 0.41 0.24 0.76 1.5
## c8 8 0.37 0.31 0.69 2.4
##
## TC2 TC3 TC4 TC5 TC1
## SS loadings 5.80 4.08 4.06 3.42 3.45
## Proportion Var 0.12 0.08 0.08 0.07 0.07
## Cumulative Var 0.12 0.21 0.29 0.36 0.43
## Proportion Explained 0.28 0.20 0.19 0.16 0.17
## Cumulative Proportion 0.28 0.47 0.67 0.83 1.00
##
## With component correlations of
## TC2 TC3 TC4 TC5 TC1
## TC2 1.00 0.00 0.14 -0.04 -0.02
## TC3 0.00 1.00 0.05 0.02 0.24
## TC4 0.14 0.05 1.00 0.33 0.14
## TC5 -0.04 0.02 0.33 1.00 0.20
## TC1 -0.02 0.24 0.14 0.20 1.00
##
## Mean item complexity = 1.5
## Test of the hypothesis that 5 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0.05
## with the empirical chi square 1730.51 with prob < 2.8e-55
##
## Fit based upon off diagonal values = 0.93
I am taking out items 9 and 44, because neither of them loads above the .35 cutoff on any factor.
questions2 <- setdiff(questions, c("c9", "c44"))  # all items except c9 and c44
efa.model2 <- principal(stressdf[questions2],nfactors=5,rotate="oblimin")
print.psych(efa.model2,cut=.35,sort=TRUE)
## Principal Components Analysis
## Call: principal(r = stressdf[questions2], nfactors = 5, rotate = "oblimin")
## Standardized loadings (pattern matrix) based upon correlation matrix
## item TC2 TC3 TC4 TC1 TC5 h2 u2 com
## c42 41 0.72 0.57 0.43 1.2
## c21 20 0.71 0.50 0.50 1.1
## c47 45 0.69 0.50 0.50 1.1
## c41 40 0.65 0.45 0.55 1.2
## c27 26 0.60 0.39 0.61 1.2
## c43 42 0.59 0.40 0.60 1.1
## c39 38 0.59 0.41 0.59 1.4
## c24 23 0.59 0.43 0.57 1.7
## c26 25 0.58 0.37 0.63 1.3
## c36 35 0.58 0.36 0.64 1.1
## c2 2 0.57 0.37 0.63 1.2
## c46 44 0.53 0.35 0.65 1.7
## c6 6 0.51 0.37 0.63 2.2
## c10 9 0.48 0.31 0.69 1.7
## c1 1 0.48 0.26 0.74 1.3
## c15 14 0.45 0.28 0.72 1.7
## c14 13 0.79 0.62 0.38 1.0
## c30 29 0.70 0.54 0.46 1.1
## c19 18 0.70 0.61 0.39 1.2
## c38 37 0.61 0.45 0.55 1.2
## c13 12 0.59 0.40 0.60 1.2
## c28 27 0.52 0.45 0.55 1.6
## c45 43 0.44 0.31 0.69 1.8
## c8 8 0.37 0.30 0.70 2.5
## c31 30 0.78 0.65 0.35 1.1
## c29 28 0.77 0.65 0.35 1.1
## c37 36 0.74 0.60 0.40 1.0
## c4 4 0.63 0.49 0.51 1.2
## c32 31 0.62 0.38 0.62 1.6
## c35 34 0.59 0.47 0.53 1.9
## c3 3 0.55 0.50 0.50 1.9
## c23 22 0.43 0.48 0.52 3.3
## c16 15 0.61 0.49 0.51 1.3
## c22 21 0.61 0.50 0.50 1.3
## c34 33 0.58 0.46 0.54 1.3
## c25 24 0.56 0.49 0.51 1.7
## c17 16 0.53 0.45 0.55 1.7
## c5 5 0.47 0.25 0.75 1.4
## c33 32 0.44 0.41 0.59 2.7
## c7 7 0.37 0.24 0.76 1.8
## c12 11 0.76 0.61 0.39 1.0
## c18 17 0.74 0.61 0.39 1.0
## c20 19 0.60 0.42 0.58 1.1
## c48 46 0.59 0.43 0.57 1.3
## c11 10 0.54 0.39 0.61 1.7
## c40 39 0.48 0.35 0.65 1.5
##
## TC2 TC3 TC4 TC1 TC5
## SS loadings 5.79 4.07 3.90 3.38 3.19
## Proportion Var 0.13 0.09 0.08 0.07 0.07
## Cumulative Var 0.13 0.21 0.30 0.37 0.44
## Proportion Explained 0.28 0.20 0.19 0.17 0.16
## Cumulative Proportion 0.28 0.49 0.68 0.84 1.00
##
## With component correlations of
## TC2 TC3 TC4 TC1 TC5
## TC2 1.00 0.00 0.14 -0.02 -0.02
## TC3 0.00 1.00 0.05 0.23 0.02
## TC4 0.14 0.05 1.00 0.14 0.32
## TC1 -0.02 0.23 0.14 1.00 0.20
## TC5 -0.02 0.02 0.32 0.20 1.00
##
## Mean item complexity = 1.5
## Test of the hypothesis that 5 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0.05
## with the empirical chi square 1577.99 with prob < 3.7e-51
##
## Fit based upon off diagonal values = 0.93
The items that loaded on the first factor included “Make an extra effort to get things done” and “Adjust my priorities”. These items all sound like attempts to fix or solve the situation, so I call factor 1:
Solution Response
Factor 2 has items like “Become very tense” and “Worry about what I am going to do”. These items sound like the situation really affected the respondents and changed their emotions, so I call factor 2:
Neurotic Response
Factor 3 has items like “Visit a friend” and “Go to a party”, which are very outward and social actions after a stressful situation. I call factor 3:
Social Response
Factor 4 has items such as “Go out for a snack or meal” and “Buy myself something”, both of which are very ‘self-care’ actions, so I call factor 4:
Self-Care Response
Finally, factor 5 has items like “Blame myself for not knowing what to do” and “Focus on my general inadequacies”, which are very negative reactions directed towards the self. I call factor 5:
Self-Punishing Response
Side note: items 9 and 44 did not load on any factor when the cutoff was set to .35. Item 9’s highest loading is .31 on factor 3 (TC4) and item 44’s highest loading is .32 on factor 4 (TC5).
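The cutoff check described in the side note can be sketched in base R as follows. The loading matrix is made up for illustration: only the .31 (item 9 on TC4) and .32 (item 44 on TC5) values come from the text, and every other number is a hypothetical placeholder.

```r
# Hypothetical pattern-matrix rows (columns are three of the components)
load <- rbind(
  c42 = c(TC2 = 0.72, TC4 = 0.05, TC5 = 0.10),
  c9  = c(TC2 = 0.20, TC4 = 0.31, TC5 = 0.25),
  c44 = c(TC2 = 0.10, TC4 = 0.15, TC5 = 0.32),
  c12 = c(TC2 = 0.02, TC4 = 0.08, TC5 = 0.74)
)

cut <- 0.35
# Largest absolute loading of each item across all components
max_abs <- apply(abs(load), 1, max)

# Items that never reach the cutoff on any component
dropped <- names(max_abs)[max_abs < cut]
dropped  # "c9" "c44"
```

An item that stays below the cutoff everywhere shows up as a blank row in the sorted, cut output, which is how items 9 and 44 stood out.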
kable(efa.model2$r.scores)
| | TC2 | TC3 | TC4 | TC1 | TC5 |
|---|---|---|---|---|---|
| TC2 | 1.0000000 | 0.0009570 | 0.1430818 | -0.0232162 | -0.0242650 |
| TC3 | 0.0009570 | 1.0000000 | 0.0500892 | 0.2281678 | 0.0221495 |
| TC4 | 0.1430818 | 0.0500892 | 1.0000000 | 0.1408011 | 0.3243891 |
| TC1 | -0.0232162 | 0.2281678 | 0.1408011 | 1.0000000 | 0.1963883 |
| TC5 | -0.0242650 | 0.0221495 | 0.3243891 | 0.1963883 | 1.0000000 |
As seen in the previous output, the correlation between TC4 and TC5 was 0.32 (0.33 in the first model). There were also other correlations that were not negligible (.23, .20, etc.), meaning that it was a good idea to use the oblimin rotation, because it allows some correlation between factors.
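To see at a glance which components are most related, the component correlations can be ranked by absolute size. This is only a sketch: the matrix is typed in by hand from the rounded second-model output, and the object names are assumptions for illustration.

```r
comp <- c("TC2", "TC3", "TC4", "TC1", "TC5")
# Component correlations of the second model, rounded to two decimals
phi <- matrix(c(
   1.00, 0.00, 0.14, -0.02, -0.02,
   0.00, 1.00, 0.05,  0.23,  0.02,
   0.14, 0.05, 1.00,  0.14,  0.32,
  -0.02, 0.23, 0.14,  1.00,  0.20,
  -0.02, 0.02, 0.32,  0.20,  1.00
), nrow = 5, byrow = TRUE, dimnames = list(comp, comp))

# One row per unique pair of components (upper triangle only)
idx <- which(upper.tri(phi), arr.ind = TRUE)
pairs <- data.frame(a = comp[idx[, 1]],
                    b = comp[idx[, 2]],
                    r = phi[upper.tri(phi)])

# Largest correlations first
ranked <- pairs[order(-abs(pairs$r)), ]
head(ranked, 3)
```

If every entry were close to zero, an orthogonal rotation would give nearly the same picture; the few larger entries are what make the oblique rotation worthwhile here.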
As a rule of thumb, the sample size should be at least 5 times the number of items. With 48 items that means at least 240 respondents, and this data set has 310, so it clears that threshold.
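The sample-size check is simple arithmetic, sketched here in R. The variable names are made up for illustration; the 5-to-1 ratio is the rule of thumb stated above.

```r
n_items <- 48    # number of 'c' questions
n_obs   <- 310   # sample size used in Bartlett's test

required <- 5 * n_items
required            # 240
n_obs >= required   # TRUE
n_obs / n_items     # respondents per item, roughly 6.5
```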
The ranges of item loadings are pretty similar between the EFA and the CFA. Sometimes an individual item loads more strongly on its factor in the EFA, and sometimes in the CFA.
I was expecting to see much stronger loadings in the CFA, but I guess that is not always the case.
Factor 1: EFA [.45, .72] CFA [.44, .71]
Factor 2: EFA [.48, .77] CFA [.52, .79]
Factor 3: EFA [.43, .78] CFA [.35, .79]
Factor 4: EFA [.51, .74] CFA [.46, .79]
Factor 5: EFA [.37, .62] CFA [.37, .67]
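Ranges like these can be read off by hand, but a short sketch makes the step reproducible. The values below are the standardized CFA loadings (the `Std.all` column) for F1 and F4, typed in from the lavaan summary; the object names are assumptions for illustration.

```r
# Standardized loadings (Std.all) copied from the CFA summary
std_all <- list(
  F1 = c(0.708, 0.677, 0.662, 0.618, 0.564, 0.578, 0.543, 0.537,
         0.562, 0.531, 0.511, 0.499, 0.469, 0.475, 0.440),
  F4 = c(0.791, 0.793, 0.477, 0.584, 0.456, 0.466)
)

# One row per factor with its smallest and largest loading
rng <- t(sapply(std_all, range))
colnames(rng) <- c("min", "max")
rng
```

The same two lines work for the EFA if the loadings of each factor's items are collected into a list in the same way.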
cfa.model <- 'F1=~c42+c21+c47+c41+c27+c43+c39+c26+c24+c36+c2+c46+c6+c10+c1
F2=~c14+c19+c30+c38+c13+c28+c45
F3=~c31+c29+c37+c4+c32+c35+c3+c23+c9
F4=~c12+c18+c48+c20+c11+c40
F5=~c22+c16+c34+c17+c25+c5+c33+c7+c8'
fit.cfa <- cfa(cfa.model,stressdf,std.lv=TRUE)
summary(fit.cfa, fit.measures=TRUE, standardized=TRUE)
## lavaan 0.6-10 ended normally after 21 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 102
##
## Number of observations 310
##
## Model Test User Model:
##
## Test statistic 1854.329
## Degrees of freedom 979
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 5587.496
## Degrees of freedom 1035
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.808
## Tucker-Lewis Index (TLI) 0.797
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -20289.720
## Loglikelihood unrestricted model (H1) -19362.556
##
## Akaike (AIC) 40783.440
## Bayesian (BIC) 41164.571
## Sample-size adjusted Bayesian (BIC) 40841.065
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.054
## 90 Percent confidence interval - lower 0.050
## 90 Percent confidence interval - upper 0.057
## P-value RMSEA <= 0.05 0.052
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.073
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 =~
## c42 0.656 0.048 13.641 0.000 0.656 0.708
## c21 0.656 0.051 12.863 0.000 0.656 0.677
## c47 0.685 0.055 12.480 0.000 0.685 0.662
## c41 0.558 0.049 11.434 0.000 0.558 0.618
## c27 0.504 0.049 10.224 0.000 0.504 0.564
## c43 0.568 0.054 10.527 0.000 0.568 0.578
## c39 0.510 0.052 9.785 0.000 0.510 0.543
## c26 0.529 0.055 9.649 0.000 0.529 0.537
## c24 0.511 0.050 10.194 0.000 0.511 0.562
## c36 0.563 0.059 9.511 0.000 0.563 0.531
## c2 0.481 0.053 9.107 0.000 0.481 0.511
## c46 0.502 0.057 8.857 0.000 0.502 0.499
## c6 0.427 0.052 8.243 0.000 0.427 0.469
## c10 0.576 0.069 8.378 0.000 0.576 0.475
## c1 0.445 0.058 7.672 0.000 0.445 0.440
## F2 =~
## c14 0.825 0.064 12.935 0.000 0.825 0.688
## c19 0.995 0.064 15.568 0.000 0.995 0.788
## c30 0.750 0.063 11.969 0.000 0.750 0.648
## c38 0.822 0.066 12.478 0.000 0.822 0.669
## c13 0.698 0.072 9.703 0.000 0.698 0.546
## c28 0.655 0.059 11.155 0.000 0.655 0.612
## c45 0.635 0.070 9.064 0.000 0.635 0.515
## F3 =~
## c31 0.921 0.064 14.487 0.000 0.921 0.743
## c29 0.933 0.059 15.827 0.000 0.933 0.791
## c37 0.906 0.064 14.208 0.000 0.906 0.733
## c4 0.820 0.066 12.387 0.000 0.820 0.661
## c32 0.470 0.079 5.982 0.000 0.470 0.352
## c35 0.496 0.062 8.069 0.000 0.496 0.462
## c3 0.791 0.073 10.789 0.000 0.791 0.592
## c23 0.801 0.082 9.786 0.000 0.801 0.546
## c9 0.534 0.077 6.965 0.000 0.534 0.405
## F4 =~
## c12 1.042 0.068 15.273 0.000 1.042 0.791
## c18 1.013 0.066 15.332 0.000 1.013 0.793
## c48 0.713 0.087 8.198 0.000 0.713 0.477
## c20 0.779 0.075 10.387 0.000 0.779 0.584
## c11 0.611 0.078 7.789 0.000 0.611 0.456
## c40 0.583 0.073 7.975 0.000 0.583 0.466
## F5 =~
## c22 0.788 0.064 12.272 0.000 0.788 0.669
## c16 0.638 0.071 9.045 0.000 0.638 0.521
## c34 0.685 0.061 11.248 0.000 0.685 0.625
## c17 0.808 0.072 11.263 0.000 0.808 0.625
## c25 0.669 0.063 10.637 0.000 0.669 0.597
## c5 0.436 0.070 6.242 0.000 0.436 0.374
## c33 0.598 0.073 8.225 0.000 0.598 0.480
## c7 0.487 0.063 7.702 0.000 0.487 0.453
## c8 0.587 0.067 8.725 0.000 0.587 0.505
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 ~~
## F2 -0.055 0.066 -0.834 0.404 -0.055 -0.055
## F3 0.151 0.064 2.365 0.018 0.151 0.151
## F4 0.031 0.067 0.465 0.642 0.031 0.031
## F5 0.008 0.068 0.112 0.911 0.008 0.008
## F2 ~~
## F3 0.109 0.066 1.661 0.097 0.109 0.109
## F4 0.174 0.066 2.619 0.009 0.174 0.174
## F5 0.697 0.042 16.447 0.000 0.697 0.697
## F3 ~~
## F4 0.571 0.049 11.612 0.000 0.571 0.571
## F5 0.255 0.065 3.955 0.000 0.255 0.255
## F4 ~~
## F5 0.376 0.062 6.073 0.000 0.376 0.376
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .c42 0.429 0.040 10.757 0.000 0.429 0.499
## .c21 0.508 0.046 11.022 0.000 0.508 0.542
## .c47 0.604 0.054 11.137 0.000 0.604 0.562
## .c41 0.504 0.044 11.410 0.000 0.504 0.618
## .c27 0.545 0.047 11.665 0.000 0.545 0.682
## .c43 0.644 0.055 11.607 0.000 0.644 0.666
## .c39 0.620 0.053 11.744 0.000 0.620 0.705
## .c26 0.691 0.059 11.768 0.000 0.691 0.712
## .c24 0.564 0.048 11.671 0.000 0.564 0.684
## .c36 0.808 0.069 11.791 0.000 0.808 0.719
## .c2 0.653 0.055 11.855 0.000 0.653 0.739
## .c46 0.759 0.064 11.892 0.000 0.759 0.751
## .c6 0.647 0.054 11.977 0.000 0.647 0.780
## .c10 1.136 0.095 11.959 0.000 1.136 0.774
## .c1 0.828 0.069 12.047 0.000 0.828 0.807
## .c14 0.758 0.072 10.589 0.000 0.758 0.527
## .c19 0.606 0.067 9.055 0.000 0.606 0.380
## .c30 0.780 0.071 10.956 0.000 0.780 0.581
## .c38 0.834 0.077 10.772 0.000 0.834 0.553
## .c13 1.149 0.099 11.577 0.000 1.149 0.702
## .c28 0.716 0.064 11.212 0.000 0.716 0.625
## .c45 1.117 0.095 11.709 0.000 1.117 0.735
## .c31 0.689 0.068 10.160 0.000 0.689 0.448
## .c29 0.522 0.056 9.343 0.000 0.522 0.374
## .c37 0.709 0.069 10.299 0.000 0.709 0.463
## .c4 0.868 0.079 11.017 0.000 0.868 0.563
## .c32 1.567 0.129 12.192 0.000 1.567 0.876
## .c35 0.908 0.076 11.952 0.000 0.908 0.787
## .c3 1.158 0.101 11.456 0.000 1.158 0.649
## .c23 1.512 0.130 11.669 0.000 1.512 0.702
## .c9 1.455 0.120 12.091 0.000 1.455 0.836
## .c12 0.650 0.079 8.233 0.000 0.650 0.374
## .c18 0.605 0.074 8.172 0.000 0.605 0.371
## .c48 1.727 0.147 11.740 0.000 1.727 0.772
## .c20 1.171 0.105 11.190 0.000 1.171 0.659
## .c11 1.424 0.120 11.819 0.000 1.424 0.792
## .c40 1.229 0.104 11.784 0.000 1.229 0.783
## .c22 0.765 0.074 10.405 0.000 0.765 0.552
## .c16 1.093 0.095 11.517 0.000 1.093 0.728
## .c34 0.732 0.068 10.840 0.000 0.732 0.610
## .c17 1.015 0.094 10.835 0.000 1.015 0.609
## .c25 0.807 0.073 11.059 0.000 0.807 0.643
## .c5 1.171 0.097 12.045 0.000 1.171 0.860
## .c33 1.194 0.102 11.702 0.000 1.194 0.770
## .c7 0.921 0.078 11.806 0.000 0.921 0.795
## .c8 1.005 0.087 11.593 0.000 1.005 0.745
## F1 1.000 1.000 1.000
## F2 1.000 1.000 1.000
## F3 1.000 1.000 1.000
## F4 1.000 1.000 1.000
## F5 1.000 1.000 1.000
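The parameter count and degrees of freedom in the summary can be checked with a little arithmetic. This sketch assumes the model as specified above: 46 indicators, 5 factors with variances fixed to 1 by `std.lv=TRUE`, and no cross-loadings or correlated residuals. The variable names are made up for illustration.

```r
p <- 15 + 7 + 9 + 6 + 9         # indicators on F1..F5 = 46
k <- 5                          # latent factors

moments <- p * (p + 1) / 2      # unique variances and covariances: 1081
n_par <- p +                    # one free loading per indicator
         p +                    # one residual variance per indicator
         choose(k, 2)           # covariances among the 5 factors: 10
n_par                           # 102, "Number of model parameters"

df_user <- moments - n_par
df_user                         # 979, df of the user model

df_base <- p * (p - 1) / 2
df_base                         # 1035, df of the baseline model
```

The baseline model only estimates the 46 variances, so its degrees of freedom equal the number of covariances among the indicators.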