The first step in data analysis is to review the data, usually through tables and visual inspection. The following table shows, for each ASQ:SE (36 months) item in the Brazilian data, the percentage and count of each response value (0, 5, or 10).
| key | 0 | 5 | 10 | Total |
|---|---|---|---|---|
| q_1 | 90.7% (7235) | 8.5% (675) | 0.9% (69) | 100.0% (7979) |
| q_10 | 93.6% (7472) | 5.9% (468) | 0.5% (39) | 100.0% (7979) |
| q_11 | 75.7% (6038) | 22.2% (1774) | 2.1% (167) | 100.0% (7979) |
| q_12 | 58.5% (4669) | 22.0% (1753) | 19.5% (1557) | 100.0% (7979) |
| q_13 | 88.7% (7074) | 9.2% (733) | 2.2% (172) | 100.0% (7979) |
| q_14 | 81.4% (6493) | 14.9% (1190) | 3.7% (296) | 100.0% (7979) |
| q_15 | 91.2% (7278) | 3.2% (257) | 5.6% (444) | 100.0% (7979) |
| q_16 | 93.1% (7431) | 5.2% (411) | 1.7% (137) | 100.0% (7979) |
| q_17 | 86.5% (6901) | 9.9% (790) | 3.6% (288) | 100.0% (7979) |
| q_18 | 84.3% (6723) | 13.6% (1085) | 2.1% (171) | 100.0% (7979) |
| q_19 | 71.4% (5695) | 20.2% (1610) | 8.4% (674) | 100.0% (7979) |
| q_2 | 86.3% (6887) | 11.1% (883) | 2.6% (209) | 100.0% (7979) |
| q_20 | 56.6% (4516) | 29.2% (2326) | 14.2% (1137) | 100.0% (7979) |
| q_21 | 94.9% (7571) | 2.5% (198) | 2.6% (210) | 100.0% (7979) |
| q_22 | 96.7% (7719) | 2.0% (160) | 1.3% (100) | 100.0% (7979) |
| q_23 | 68.4% (5460) | 19.6% (1567) | 11.9% (952) | 100.0% (7979) |
| q_24 | 83.4% (6657) | 11.8% (942) | 4.8% (380) | 100.0% (7979) |
| q_25 | 66.2% (5283) | 18.3% (1461) | 15.5% (1235) | 100.0% (7979) |
| q_26 | 91.7% (7316) | 5.4% (434) | 2.9% (229) | 100.0% (7979) |
| q_27 | 94.5% (7539) | 4.9% (391) | 0.6% (49) | 100.0% (7979) |
| q_28 | 94.0% (7499) | 4.7% (376) | 1.3% (104) | 100.0% (7979) |
| q_29 | 75.7% (6042) | 18.4% (1466) | 5.9% (471) | 100.0% (7979) |
| q_3 | 86.0% (6861) | 10.8% (858) | 3.3% (260) | 100.0% (7979) |
| q_30 | 88.2% (7038) | 8.0% (635) | 3.8% (306) | 100.0% (7979) |
| q_4 | 59.7% (4767) | 26.2% (2089) | 14.1% (1123) | 100.0% (7979) |
| q_5 | 83.9% (6693) | 12.0% (955) | 4.1% (331) | 100.0% (7979) |
| q_6 | 77.1% (6152) | 15.0% (1198) | 7.9% (629) | 100.0% (7979) |
| q_7 | 76.8% (6128) | 18.5% (1474) | 4.7% (377) | 100.0% (7979) |
| q_8 | 85.5% (6826) | 12.1% (968) | 2.3% (185) | 100.0% (7979) |
| q_9 | 91.0% (7260) | 7.7% (613) | 1.3% (106) | 100.0% (7979) |
| Total | 82.4% (197223) | 12.4% (29740) | 5.2% (12407) | 100.0% (239370) |
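A table like the one above can be built by counting each response value per item. The following is a minimal sketch in plain Python, not the code used for the published analysis; the `responses` toy data and the `frequency_row` helper are hypothetical names introduced only for illustration.

```python
from collections import Counter

# Hypothetical toy data: one dict per child, item name -> raw ASQ:SE response (0, 5 or 10).
responses = [
    {"q_1": 0, "q_2": 5},
    {"q_1": 0, "q_2": 0},
    {"q_1": 10, "q_2": 0},
    {"q_1": 0, "q_2": 5},
]

def frequency_row(rows, item, categories=(0, 5, 10)):
    """Return {category: (percent, count)} for one item, plus a "Total" entry."""
    counts = Counter(row[item] for row in rows)
    n = sum(counts[c] for c in categories)
    table = {c: (round(100 * counts[c] / n, 1), counts[c]) for c in categories}
    table["Total"] = (100.0, n)
    return table

for item in ("q_1", "q_2"):
    print(item, frequency_row(responses, item))
```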
These results are published, and the full results are available at https://onlinelibrary.wiley.com/doi/abs/10.1111/cch.12649. The summary statistics in the table below simply confirm that the data are the same.
| average | sd | q95 | q99 | n |
|---|---|---|---|---|
| 34.19 | 28.48 | 90 | 130 | 7979 |
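The columns above (mean, standard deviation, 95th and 99th percentiles, sample size) can be computed as sketched below. This assumes the summary refers to each child's total score and that quantiles use linear interpolation; `toy_scores`, `quantile_type7`, and `summarize` are hypothetical names, and the numbers are made up.

```python
import math
import statistics

def quantile_type7(values, p):
    """Linear-interpolation quantile (the default 'type 7' rule in R and NumPy)."""
    xs = sorted(values)
    h = (len(xs) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

def summarize(total_scores):
    """Summary statistics with the same columns as the table above."""
    return {
        "average": statistics.mean(total_scores),
        "sd": statistics.stdev(total_scores),  # sample SD (n - 1 denominator)
        "q95": quantile_type7(total_scores, 0.95),
        "q99": quantile_type7(total_scores, 0.99),
        "n": len(total_scores),
    }

# Hypothetical total scores (sum of the raw item responses per child).
toy_scores = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 100]
print(summarize(toy_scores))
```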
I’ll keep only 500 observations to avoid computational difficulties. This subsampling departs slightly from the previously published analysis, but the results are virtually identical.
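Drawing such a subsample can be sketched as follows. The seed value and the `subsample` helper are assumptions made for illustration; the post does not say how the 500 observations were chosen.

```python
import random

def subsample(rows, size=500, seed=123):
    """Draw `size` rows without replacement; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, size)

# Hypothetical stand-in for the 7979 respondents: here just their row indices.
all_rows = list(range(7979))
kept = subsample(all_rows)
print(len(kept), len(set(kept)))
```

Fixing the seed matters here because later results (fit statistics, item parameters) depend on which respondents are kept.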
Within the IRT framework, the relationship between a respondent’s performance and the characteristics underlying the items is described by a monotonically increasing function called the item characteristic curve (ICC). The raw ASQ:SE data are coded to identify social and emotional difficulties, so the higher the score, the greater the probability of risk. I’ll recode all values before proceeding with the IRT analysis.
After recoding, 10 becomes 0, 5 becomes 1, and 0 becomes 2.
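The recoding is a simple lookup. A minimal sketch, with a hypothetical `recode_row` helper and a made-up respondent:

```python
# Reverse-coding map taken from the text: raw 10 -> 0, raw 5 -> 1, raw 0 -> 2.
RECODE = {10: 0, 5: 1, 0: 2}

def recode_row(row):
    """Recode every item of one respondent; unexpected values raise KeyError."""
    return {item: RECODE[value] for item, value in row.items()}

# Hypothetical respondent with three items.
raw = {"q_1": 0, "q_2": 5, "q_3": 10}
print(recode_row(raw))  # {'q_1': 2, 'q_2': 1, 'q_3': 0}
```

Using a dictionary lookup rather than arithmetic means that any value outside 0, 5, and 10 fails loudly instead of being recoded silently.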
Some previous work suggests that the variability in the ASQ:SE data can be explained by two factors. The following graph points in the same direction. Therefore, I’ll rely on the previously published analysis to assign each item to its factor.
Note: parallel analysis suggests 8 factors.
Model fit needs to be checked; CFI, TLI, and RMSEA are the main statistics for that.
| | M2 | df | p | RMSEA | RMSEA_5 | RMSEA_95 | SRMSR | TLI | CFI |
|---|---|---|---|---|---|---|---|---|---|
| stats | 734.17 | 374 | 0 | 0.04 | 0.04 | 0.05 | 0.08 | 0.91 | 0.92 |
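As a quick sanity check, the RMSEA point estimate can be recomputed from M2 and its degrees of freedom. The sketch below uses one common definition, sqrt(max(M2 - df, 0) / (df (N - 1))). Whether the software behind the table uses N - 1 or N is an assumption here, although with N = 500 the difference is negligible.

```python
import math

def rmsea(m2, df, n):
    """RMSEA point estimate from a fit statistic, its degrees of freedom and sample size."""
    return math.sqrt(max(m2 - df, 0.0) / (df * (n - 1)))

# M2 and df come from the table above; n = 500 is the subsample size stated earlier.
value = rmsea(734.17, 374, 500)
print(round(value, 2))  # 0.04
```

The result agrees with the reported RMSEA of 0.04 after rounding.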
Once the fit is acceptable, I’ll check the results for each item.
| item | a1 | a2 | d1 | d2 |
|---|---|---|---|---|
| q_1 | 0.00 | 2.02 | 6.87 | 3.63 |
| q_2 | 0.00 | 1.20 | 4.23 | 2.35 |
| q_3 | 0.00 | 1.87 | 4.76 | 2.92 |
| q_4 | 0.39 | 0.00 | 1.65 | 0.19 |
| q_5 | 2.02 | 0.00 | 4.51 | 2.53 |
| q_6 | 0.60 | 0.00 | 2.61 | 1.15 |
| q_7 | 2.11 | 0.00 | 4.46 | 2.11 |
| q_8 | 3.23 | 0.00 | 7.15 | 4.19 |
| q_9 | 0.00 | 1.70 | 5.37 | 3.20 |
| q_10 | 0.00 | 1.65 | 7.49 | 3.56 |
| q_11 | 2.26 | 0.00 | 5.23 | 2.09 |
| q_12 | 0.93 | 0.00 | 1.48 | 0.27 |
| q_13 | 1.88 | 0.00 | 5.45 | 3.21 |
| q_14 | 0.78 | 0.00 | 3.69 | 1.70 |
| q_15 | 0.70 | 0.00 | 3.40 | 2.56 |
| q_16 | 0.71 | 0.00 | 4.11 | 2.55 |
| q_17 | 0.00 | 3.99 | 8.01 | 5.35 |
| q_18 | 2.40 | 0.00 | 5.88 | 2.96 |
| q_19 | 1.76 | 0.00 | 2.99 | 1.30 |
| q_20 | 0.60 | 0.00 | 2.00 | 0.30 |
| q_21 | 1.68 | 0.00 | 4.88 | 4.12 |
| q_22 | 1.37 | 0.00 | 5.96 | 4.94 |
| q_23 | 0.75 | 0.00 | 2.31 | 1.07 |
| q_24 | 1.41 | 0.00 | 3.48 | 2.00 |
| q_25 | 0.00 | 2.19 | 3.14 | 1.43 |
| q_26 | 0.00 | 3.42 | 7.18 | 5.70 |
| q_27 | 0.00 | 1.26 | 5.83 | 3.46 |
| q_28 | 0.00 | 0.99 | 4.58 | 3.16 |
| q_29 | 1.59 | 0.00 | 3.94 | 1.48 |
| q_30 | 0.81 | 0.00 | 3.41 | 2.15 |
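To see what these parameters imply, the a columns can be read as slopes on the two factors and the d columns as intercepts. The sketch below assumes a graded response model in slope-intercept form, where P(y ≥ k | θ) = logistic(a1·θ1 + a2·θ2 + d_k). The post does not name the item model, so this is an assumption, and `category_probs` is a hypothetical helper.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(a, d, theta):
    """Category probabilities for one item under an assumed graded response model.

    a: slopes (a1, a2); d: intercepts (d1, d2), decreasing; theta: latent scores.
    """
    linear = sum(ai * ti for ai, ti in zip(a, theta))
    # Cumulative probabilities P(y >= k) for k = 0, 1, 2, then a closing 0.
    cumulative = [1.0] + [logistic(linear + dk) for dk in d] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(d) + 1)]

# q_1 parameters from the table above, evaluated at theta = (0, 0).
probs = category_probs(a=(0.00, 2.02), d=(6.87, 3.63), theta=(0.0, 0.0))
print([round(p, 3) for p in probs])
```

Under this reading, a respondent at the average of both factors almost always falls in recoded category 2 (raw response 0) on q_1, which is in line with the large share of 0 responses for that item in the frequency table.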
Next are the test information curves for both domains (factors).
The latent score can be obtained through expected a posteriori (EAP) estimation for each sum score.
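The idea behind EAP for sum scores can be sketched in a simplified setting. The code below assumes a single latent dimension, a standard normal prior on a fixed grid, graded-response items, and made-up item parameters. It builds the likelihood of each sum score with the Lord-Wingersky recursion and then takes the posterior mean. The model in this post has two factors, so this is an illustration of the principle rather than a reproduction of the analysis.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(a, d, theta):
    """Graded-response category probabilities for one item (slope a, intercepts d)."""
    cum = [1.0] + [logistic(a * theta + dk) for dk in d] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(d) + 1)]

def sum_score_likelihood(items, theta):
    """P(sum score = s | theta) for all s, built item by item (Lord-Wingersky recursion)."""
    dist = [1.0]  # before any item, the sum is 0 with probability 1
    for a, d in items:
        probs = category_probs(a, d, theta)
        new = [0.0] * (len(dist) + len(probs) - 1)
        for s, ps in enumerate(dist):
            for k, pk in enumerate(probs):
                new[s + k] += ps * pk
        dist = new
    return dist

def eap_by_sum_score(items, n_nodes=61, lo=-6.0, hi=6.0):
    """EAP of theta for each sum score, using a standard normal prior on a grid."""
    step = (hi - lo) / (n_nodes - 1)
    nodes = [lo + i * step for i in range(n_nodes)]
    prior = [math.exp(-0.5 * t * t) for t in nodes]  # unnormalised N(0, 1) weights
    likes = [sum_score_likelihood(items, t) for t in nodes]
    eaps = []
    for s in range(len(likes[0])):
        num = sum(t * w * l[s] for t, w, l in zip(nodes, prior, likes))
        den = sum(w * l[s] for w, l in zip(prior, likes))
        eaps.append(num / den)
    return eaps

# Hypothetical unidimensional items: (slope, (d1, d2)).
toy_items = [(2.0, (4.5, 2.5)), (1.5, (3.0, 1.3)), (0.8, (2.3, 1.1))]
for score, eap in enumerate(eap_by_sum_score(toy_items)):
    print(score, round(eap, 2))
```

Each possible sum score maps to one latent estimate, which is what makes a simple score-to-latent conversion table possible.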
That said, the distribution of the latent and “observed” ability can be checked.
The correlation between the latent (IRT-derived) scores and the traditional sum scores can also be computed.
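The comparison between the two kinds of scores comes down to a correlation coefficient. A minimal sketch of the Pearson correlation, using made-up paired scores (`sum_scores` and `latent_scores` are hypothetical and do not come from the ASQ:SE data):

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired scores: recoded sum scores and IRT-based latent scores.
sum_scores = [12, 25, 31, 40, 47, 55]
latent_scores = [-1.9, -1.1, -0.6, 0.1, 0.7, 1.6]
print(round(pearson(sum_scores, latent_scores), 3))
```

Because the relationship between sum scores and latent scores is monotone but not necessarily linear, a rank-based coefficient such as Spearman's could be reported alongside Pearson's.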
With that reported, I would argue CTT scores and IRT scores are (very much) related and, for pragmatic reasons, the summed score works well.