This document describes the Weighted Likelihood Estimation (WLE) analysis of the Reasoning factor, and its results. Reasoning is the second of the 16 personality factors in the data set available here:
https://openpsychometrics.org/_rawdata/
Initially, there were 4071 observations. The following cases were removed as threats to validity:
- 12 persons who skipped all questions
- 18 persons who gave the exact same Likert response to all 163 psychological items
- 1 person who skipped all questions bar one
- 577 persons who spent less than 500 seconds on the entire questionnaire (163 psychological questions + age + gender + self-reported accuracy = 166 questions; at 3 seconds per question the total would be 498 seconds, hence a lower limit of 500 seconds)
Analysis was therefore undertaken on the remaining 3463 cases.
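The case count and the time threshold can be checked with simple arithmetic. The sketch below uses only the figures reported above; the variable names are illustrative.

```python
# Exclusion counts as reported in the text above.
initial_n = 4071
exclusions = {
    "skipped all questions": 12,
    "same response to all 163 items": 18,
    "skipped all questions bar one": 1,
    "under 500 seconds in total": 577,
}
remaining_n = initial_n - sum(exclusions.values())

# Time threshold: 3 seconds per question across all 166 questions,
# rounded up to a lower limit of 500 seconds.
n_questions = 163 + 3  # psychological items + age + gender + self-reported accuracy
seconds_at_3_per_question = 3 * n_questions

print(remaining_n)                # 3463
print(seconds_at_3_per_question)  # 498
```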
The Cronbach’s alpha reliability is as follows:
## [1] 0.7666735
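For reference, Cronbach's alpha is k/(k-1) multiplied by one minus the ratio of summed item variances to the variance of the total score. The sketch below is a minimal illustration on a small made-up response matrix (`toy`), not the survey data, so it does not reproduce the value above; it also assumes complete responses with no missing values.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the total (sum) score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 respondents, 3 items scored 0-4.
toy = [[3, 4, 3], [2, 2, 1], [4, 4, 4], [1, 2, 2], [3, 3, 4]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```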
The 13 Reasoning items were as follows:
B1 “I make insightful remarks”
B2 “I know the answers to many questions”
B3 “I tend to analyze things”
B4 “I use my brain”
B5 “I learn quickly”
B6 “I counter others’ arguments”
B7 “I reflect on things before acting”
B8 “I weigh the pros against the cons”
B9 “I consider myself an average person” [Reversed]
B10 “I get confused easily” [Reversed]
B11 “I know that I am not a special person” [Reversed]
B12 “I have a poor vocabulary” [Reversed]
B13 “I skip difficult words while reading” [Reversed]
The 13 discrimination indices are as follows, respectively:
## [1] 0.4484044 0.4579474 0.4245839 0.5295567 0.4997765 0.3134875 0.2966078
## [8] 0.3344143 0.3744755 0.4285430 0.2819597 0.4971966 0.3490237
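The report does not state which CTT discrimination index was used. One common choice is the item-rest correlation, that is, the Pearson correlation between an item and the sum of the remaining items. The sketch below assumes that definition and runs on made-up toy data, so the function names and the data are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(rows):
    """Correlate each item with the total score of all other items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]  # total score excluding item j
        result.append(pearson(item, rest))
    return result


# Hypothetical toy data: 5 respondents, 3 items scored 0-4.
toy = [[3, 4, 3], [2, 2, 1], [4, 4, 4], [1, 2, 2], [3, 3, 4]]
discrimination = item_rest_correlations(toy)
print([round(d, 2) for d in discrimination])
```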
The frequencies of the raw responses were as follows:
##     B1   B2   B3   B4   B5   B6   B7   B8   B9  B10  B11  B12  B13
## 0   89   87   22   14   35   93  128   56  322  167  301   74  171
## 1  294  507  130   47  204  475  524  287 1259  793  822  247  479
## 2  827  870  295  209  477  872  706  630  510  640  671  411  369
## 3 1670 1437 1700 1523 1705 1502 1553 1665  948 1392 1034 1367 1254
## 4  563  550 1305 1659 1028  495  534  809  411  459  626 1351 1179
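The frequency table can be summarised by the modal (most frequent) category and the number of valid responses per item. The sketch below copies the counts from the table above; the variable names are illustrative.

```python
items = ["B1", "B2", "B3", "B4", "B5", "B6", "B7",
         "B8", "B9", "B10", "B11", "B12", "B13"]

# Response counts per category (rows) and item (columns), from the table above.
freq = {
    0: [89, 87, 22, 14, 35, 93, 128, 56, 322, 167, 301, 74, 171],
    1: [294, 507, 130, 47, 204, 475, 524, 287, 1259, 793, 822, 247, 479],
    2: [827, 870, 295, 209, 477, 872, 706, 630, 510, 640, 671, 411, 369],
    3: [1670, 1437, 1700, 1523, 1705, 1502, 1553, 1665, 948, 1392, 1034, 1367, 1254],
    4: [563, 550, 1305, 1659, 1028, 495, 534, 809, 411, 459, 626, 1351, 1179],
}

# Most frequent category for each item.
modal = {item: max(freq, key=lambda c: freq[c][j]) for j, item in enumerate(items)}

# Number of valid (non-missing) responses for each item.
totals = {item: sum(freq[c][j] for c in freq) for j, item in enumerate(items)}

for item in items:
    print(item, "modal category:", modal[item], "valid responses:", totals[item])
```

Item totals below 3463 presumably reflect skipped items, although the report does not say so explicitly.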
The item-category parameter estimates (xsi) from the TAM model were as follows:
xsi.index xsi.label est
1 1 B1_Cat1 -1.7534
2 2 B1_Cat2 -1.3583
3 3 B1_Cat3 -0.7699
4 4 B1_Cat4 1.3231
5 5 B2_Cat1 -2.2804
6 6 B2_Cat2 -0.8246
7 7 B2_Cat3 -0.5313
8 8 B2_Cat4 1.2299
9 9 B3_Cat1 -2.4798
10 10 B3_Cat2 -1.2840
11 11 B3_Cat3 -1.9632
12 12 B3_Cat4 0.3445
13 13 B4_Cat1 -1.9836
14 14 B4_Cat2 -2.0159
15 15 B4_Cat3 -2.2522
16 16 B4_Cat4 -0.0589
17 17 B5_Cat1 -2.4076
18 18 B5_Cat2 -1.2601
19 19 B5_Cat3 -1.4331
20 20 B5_Cat4 0.6396
21 21 B6_Cat1 -2.1490
22 22 B6_Cat2 -0.8923
23 23 B6_Cat3 -0.5723
24 24 B6_Cat4 1.3829
25 25 B7_Cat1 -1.9191
26 26 B7_Cat2 -0.5794
27 27 B7_Cat3 -0.8180
28 28 B7_Cat4 1.3353
29 29 B8_Cat1 -2.2289
30 30 B8_Cat2 -1.1484
31 31 B8_Cat3 -1.0819
32 32 B8_Cat4 0.9068
33 33 B9_Cat1 -1.7272
34 34 B9_Cat2 0.7700
35 35 B9_Cat3 -0.5027
36 36 B9_Cat4 1.2429
37 37 B10_Cat1 -2.0175
38 38 B10_Cat2 -0.0173
39 39 B10_Cat3 -0.7573
40 40 B10_Cat4 1.4248
41 41 B11_Cat1 -1.4272
42 42 B11_Cat2 0.0008
43 43 B11_Cat3 -0.3935
44 44 B11_Cat4 0.8212
45 45 B12_Cat1 -1.8325
46 46 B12_Cat2 -0.9170
47 47 B12_Cat3 -1.3719
48 48 B12_Cat4 0.1197
49 49 B13_Cat1 -1.5732
50 50 B13_Cat2 -0.0731
51 51 B13_Cat3 -1.3280
52 52 B13_Cat4 0.2265
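To show how such estimates translate into response probabilities, the sketch below assumes a partial credit model in which each `xsi` value is a step parameter, so that the probability of category k is proportional to exp(k·θ minus the sum of the first k steps). The report does not state the model explicitly, so this parameterisation is an assumption, and the function names are illustrative.

```python
from math import exp


def pcm_probabilities(theta, steps):
    """Category probabilities (0..m) under an assumed partial credit model."""
    logits = [0.0]  # category 0 is the reference category
    cumulative = 0.0
    for k, step in enumerate(steps, start=1):
        cumulative += step
        logits.append(k * theta - cumulative)
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, steps):
    """Expected item score at ability theta."""
    return sum(k * p for k, p in enumerate(pcm_probabilities(theta, steps)))


# Step estimates for item B1, copied from the table above.
b1_steps = [-1.7534, -1.3583, -0.7699, 1.3231]

probs = pcm_probabilities(0.0, b1_steps)
print([round(p, 3) for p in probs])
print(round(expected_score(-1.0, b1_steps), 2),
      round(expected_score(1.0, b1_steps), 2))
```

Under this assumption the expected score rises with θ, which is the pattern an expected score curve displays.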
……………………………..
Regression Coefficients
     [,1]
[1,]    0
Variance:
       [,1]
[1,] 0.3626
EAP Reliability:
[1] 0.78
IRT Ability results using the TAM package were as follows: M = 0.01521347, SD = 0.7510011
The expected score curves for all items were produced with the following output:
## Iteration in WLE/MLE estimation 1 | Maximal change 1.1779
## Iteration in WLE/MLE estimation 2 | Maximal change 0.3136
## Iteration in WLE/MLE estimation 3 | Maximal change 0.0567
## Iteration in WLE/MLE estimation 4 | Maximal change 0.0227
## Iteration in WLE/MLE estimation 5 | Maximal change 0.0089
## Iteration in WLE/MLE estimation 6 | Maximal change 0.0035
## Iteration in WLE/MLE estimation 7 | Maximal change 0.0014
## Iteration in WLE/MLE estimation 8 | Maximal change 5e-04
## Iteration in WLE/MLE estimation 9 | Maximal change 2e-04
## Iteration in WLE/MLE estimation 10 | Maximal change 1e-04
## ----
## WLE Reliability = 0.788
## ....................................................
## Plots exported in png format into folder:
## /Users/matthewcourtney/Desktop/Wu.Course/Day 8/Plots
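One common definition of WLE reliability is one minus the ratio of the mean squared standard error to the variance of the WLE ability estimates. The sketch below assumes that definition and uses made-up ability estimates and standard errors, so it illustrates the formula only and does not reproduce the 0.788 reported above.

```python
from statistics import mean, variance


def wle_reliability(theta, se):
    """1 - (mean error variance) / (observed variance of ability estimates)."""
    error_variance = mean([s ** 2 for s in se])
    return 1 - error_variance / variance(theta)


# Hypothetical toy values: five ability estimates with equal standard errors.
toy_theta = [-1.0, -0.5, 0.0, 0.5, 1.0]
toy_se = [0.4, 0.4, 0.4, 0.4, 0.4]

reliability = wle_reliability(toy_theta, toy_se)
print(round(reliability, 3))
```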
Item characteristic curves by item were produced likewise; the console output and export folder were identical to those shown above.
The Thurstonian Thresholds were as follows:
##          Cat1       Cat2        Cat3       Cat4
## B1  -2.188385 -1.3309021 -0.49026489 1.43417358
## B2  -2.478607 -1.0637512 -0.24618530 1.38198853
## B3  -2.767181 -1.7560730 -1.29702759 0.44577026
## B4  -2.639740 -2.1012268 -1.65188599 0.04953003
## B5  -2.680389 -1.5740662 -0.96432495 0.75924683
## B6  -2.382477 -1.0917664 -0.26742554 1.51052856
## B7  -2.153046 -0.9408875 -0.32968140 1.44680786
## B8  -2.507172 -1.3854675 -0.69387817 1.03317261
## B9  -1.819061 -0.1053772  0.25094604 1.44131470
## B10 -2.155426 -0.6241150 -0.12496948 1.54019165
## B11 -1.650970 -0.4617004  0.02481079 1.09048462
## B12 -2.183075 -1.3014221 -0.85848999 0.33444214
## B13 -1.828583 -0.8403625 -0.54830933 0.46920776
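A Thurstonian threshold for category k is the ability at which the probability of responding in category k or higher equals 0.5. The sketch below finds that point by bisection, assuming a partial credit model with the reported `xsi` values as step parameters. That parameterisation is an assumption, not something the report states, and the function names are illustrative. Under it, the thresholds computed for B1 agree closely with the B1 row above.

```python
from math import exp


def pcm_probabilities(theta, steps):
    """Category probabilities (0..m) under an assumed partial credit model."""
    logits = [0.0]
    cumulative = 0.0
    for k, step in enumerate(steps, start=1):
        cumulative += step
        logits.append(k * theta - cumulative)
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def thurstonian_threshold(steps, k, lo=-10.0, hi=10.0, tol=1e-8):
    """Ability at which P(X >= k) = 0.5, located by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # P(X >= k) increases with ability, so move the lower bound up
        # while the cumulative probability is still below one half.
        if sum(pcm_probabilities(mid, steps)[k:]) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Step estimates for item B1, copied from the xsi table earlier in the report.
b1_steps = [-1.7534, -1.3583, -0.7699, 1.3231]
b1_thresholds = [thurstonian_threshold(b1_steps, k) for k in range(1, 5)]
print([round(t, 3) for t in b1_thresholds])
```

Because the cumulative probability P(X >= k) always decreases as k increases, thresholds defined this way are ordered within an item by construction, even when the step parameters themselves are not.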
Finally, the Wright Map was produced from the same Thurstonian thresholds shown above.
The CTT (Cronbach's alpha) and IRT (WLE) reliability estimates were both good, at .77 and .79 respectively (EAP reliability = .78).
CTT discrimination indices were between .28 and .53; therefore each item contributed substantively and positively to the overall score on the test.
For 11 of the 13 items, the most common response category was 3 on the 0-4 scale. The exceptions were B4, where category 4 was most common, and B9, where category 1 was most common.
Across all 13 items, the expected score curves were well aligned: as self-reported reasoning ability increased, the expected item score (0-4) of respondents also increased. Likewise, the item characteristic curves suggested a logical ordering of the probability curves for each of the five response categories. All Thurstonian thresholds were also ordered.
The Wright map provided much insight. Strong agreement with “I counter others’ arguments” and strong disagreement with “I get confused easily” tended to identify those with the highest self-reported reasoning. However, strong disagreement with “I tend to analyze things”, “I use my brain”, and “I learn quickly” tended to identify those respondents with the lowest self-reported reasoning.
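The items named here can be recovered directly from the reported Thurstonian thresholds by ranking the highest category (Cat4) and lowest category (Cat1) thresholds. The sketch below copies those two columns from the threshold table; the variable names are illustrative.

```python
# (Cat1, Cat4) Thurstonian thresholds per item, copied from the table above.
thresholds = {
    "B1": (-2.188385, 1.43417358),
    "B2": (-2.478607, 1.38198853),
    "B3": (-2.767181, 0.44577026),
    "B4": (-2.639740, 0.04953003),
    "B5": (-2.680389, 0.75924683),
    "B6": (-2.382477, 1.51052856),
    "B7": (-2.153046, 1.44680786),
    "B8": (-2.507172, 1.03317261),
    "B9": (-1.819061, 1.44131470),
    "B10": (-2.155426, 1.54019165),
    "B11": (-1.650970, 1.09048462),
    "B12": (-2.183075, 0.33444214),
    "B13": (-1.828583, 0.46920776),
}

# Items whose top category sits highest on the ability scale.
highest_cat4 = sorted(thresholds, key=lambda i: thresholds[i][1], reverse=True)[:2]

# Items whose bottom category boundary sits lowest on the ability scale.
lowest_cat1 = sorted(thresholds, key=lambda i: thresholds[i][0])[:3]

print(highest_cat4)  # ['B10', 'B6']
print(lowest_cat1)   # ['B3', 'B5', 'B4']
```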
Compared with Confirmatory Factor Analysis (CFA), IRT analysis provides a better diagnostic tool for understanding indicative behaviours that might be associated with increased capacity in some latent trait. Whilst never being confused and often countering arguments might be associated with the highest levels of reasoning, never tending to analyze things or learn quickly may be especially indicative of persons at the lower end of the reasoning spectrum.
Given the Likert design of the questions (as opposed to a selection of indicative behaviours for each question), a CFA approach may have been envisaged by the original researchers. Although a standard CFA would not have readily identified which item categories were associated with the extremes of reasoning ability, factor scores generated from the analysis would have taken account of the extent to which each item discriminates, making factor scores a useful means of measuring the latent trait (by accounting for systemic measurement error).
To sum up, when the purpose of an instrument is to provide diagnostic feedback for clinical or educational intervention, IRT analysis is useful. However, when the purpose of the instrument is to provide a broad gauge of the existence of some latent trait, whilst theoretically accounting for errors associated with the items themselves, CFA is useful. A two-parameter IRT model might also be considered, as this model accounts for item discrimination.