Data checking

The first step in data analysis is to review the data, usually with tables and visual inspection. The following table shows, for each ASQ:SE (36 months) item in the Brazilian data, the percentage and count (in parentheses) of each response value (0, 5, or 10).

key 0 5 10 Total
q_1 90.7% (7235) 8.5% (675) 0.9% (69) 100.0% (7979)
q_10 93.6% (7472) 5.9% (468) 0.5% (39) 100.0% (7979)
q_11 75.7% (6038) 22.2% (1774) 2.1% (167) 100.0% (7979)
q_12 58.5% (4669) 22.0% (1753) 19.5% (1557) 100.0% (7979)
q_13 88.7% (7074) 9.2% (733) 2.2% (172) 100.0% (7979)
q_14 81.4% (6493) 14.9% (1190) 3.7% (296) 100.0% (7979)
q_15 91.2% (7278) 3.2% (257) 5.6% (444) 100.0% (7979)
q_16 93.1% (7431) 5.2% (411) 1.7% (137) 100.0% (7979)
q_17 86.5% (6901) 9.9% (790) 3.6% (288) 100.0% (7979)
q_18 84.3% (6723) 13.6% (1085) 2.1% (171) 100.0% (7979)
q_19 71.4% (5695) 20.2% (1610) 8.4% (674) 100.0% (7979)
q_2 86.3% (6887) 11.1% (883) 2.6% (209) 100.0% (7979)
q_20 56.6% (4516) 29.2% (2326) 14.2% (1137) 100.0% (7979)
q_21 94.9% (7571) 2.5% (198) 2.6% (210) 100.0% (7979)
q_22 96.7% (7719) 2.0% (160) 1.3% (100) 100.0% (7979)
q_23 68.4% (5460) 19.6% (1567) 11.9% (952) 100.0% (7979)
q_24 83.4% (6657) 11.8% (942) 4.8% (380) 100.0% (7979)
q_25 66.2% (5283) 18.3% (1461) 15.5% (1235) 100.0% (7979)
q_26 91.7% (7316) 5.4% (434) 2.9% (229) 100.0% (7979)
q_27 94.5% (7539) 4.9% (391) 0.6% (49) 100.0% (7979)
q_28 94.0% (7499) 4.7% (376) 1.3% (104) 100.0% (7979)
q_29 75.7% (6042) 18.4% (1466) 5.9% (471) 100.0% (7979)
q_3 86.0% (6861) 10.8% (858) 3.3% (260) 100.0% (7979)
q_30 88.2% (7038) 8.0% (635) 3.8% (306) 100.0% (7979)
q_4 59.7% (4767) 26.2% (2089) 14.1% (1123) 100.0% (7979)
q_5 83.9% (6693) 12.0% (955) 4.1% (331) 100.0% (7979)
q_6 77.1% (6152) 15.0% (1198) 7.9% (629) 100.0% (7979)
q_7 76.8% (6128) 18.5% (1474) 4.7% (377) 100.0% (7979)
q_8 85.5% (6826) 12.1% (968) 2.3% (185) 100.0% (7979)
q_9 91.0% (7260) 7.7% (613) 1.3% (106) 100.0% (7979)
Total 82.4% (197223) 12.4% (29740) 5.2% (12407) 100.0% (239370)
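
Each cell in the table pairs a row percentage with its count. As a minimal sketch of that arithmetic (the helper name `format_cell` is made up for illustration and is not the author's code), the q_1 row can be reproduced from its counts:

```python
def format_cell(count, row_total):
    """Return a cell formatted like the table above, e.g. '90.7% (7235)'."""
    pct = 100 * count / row_total
    return f"{pct:.1f}% ({count})"

# Counts for q_1 taken from the table, keyed by response value (0, 5, 10).
q1_counts = {0: 7235, 5: 675, 10: 69}
q1_total = sum(q1_counts.values())  # number of respondents in the row

q1_cells = {score: format_cell(n, q1_total) for score, n in q1_counts.items()}
print(q1_cells)
```

The same check can be repeated for any other row by swapping in its three counts.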

Comparing the descriptives with the published version

These results have been published, and the full results are available at https://onlinelibrary.wiley.com/doi/abs/10.1111/cch.12649. The table below summarizes the total score and confirms that the data are the same as in the published version.

average sd q95 q99 n
34.19 28.48 90 130 7979
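
The average in this table can be cross-checked against the "Total" row of the item table, assuming it is the mean of the summed score over the 30 items. The sketch below is only an arithmetic check built from the printed counts, not the code used to produce the table:

```python
# Counts from the "Total" row of the item table, keyed by raw response value.
total_counts = {0: 197223, 5: 29740, 10: 12407}
n_respondents = 7979

# Adding up every item score across all respondents and dividing by the
# number of respondents gives the mean total score per respondent.
grand_sum = sum(score * count for score, count in total_counts.items())
mean_total = grand_sum / n_respondents
print(round(mean_total, 2))
```

The standard deviation and the quantiles cannot be recovered this way, because they depend on how responses combine within each respondent.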

IRT analysis

I’ll keep only 500 observations to avoid computational difficulties. This subsample differs slightly from the one used in the published analysis, but the results are virtually identical.
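
A minimal sketch of such a subsample, assuming a simple random draw without replacement. The seed value and the variable names are hypothetical, and the sampling scheme actually used is not stated in the text:

```python
import random

N_TOTAL = 7979  # respondents in the full data set
N_KEEP = 500    # size of the subsample used for the IRT analysis
SEED = 123      # hypothetical seed, only here to make the draw repeatable

rng = random.Random(SEED)
# Draw row indices without replacement, then sort them for readability.
kept_rows = sorted(rng.sample(range(N_TOTAL), N_KEEP))
print(kept_rows[:5])
```

Fixing the seed makes the subsample reproducible. A different seed gives a different set of 500 rows.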

Within the IRT framework, the relationship between a respondent’s performance and the characteristics underlying the items is described by a monotonically increasing function called the item characteristic curve (ICC). The raw ASQ:SE data are coded to identify social and emotional difficulties, so the higher the score, the greater the probability of risk. I’ll therefore recode all values before proceeding with the IRT analysis.

Now 10 becomes 0, 5 becomes 1, and 0 becomes 2.
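
The recoding is a simple reversal of the three response values. As an illustrative sketch (the names `RECODE` and `recode_responses` are made up here):

```python
# Raw ASQ:SE response -> recoded category.
# After recoding, a higher value means fewer reported difficulties.
RECODE = {10: 0, 5: 1, 0: 2}

def recode_responses(raw):
    """Recode a list of raw item responses; unexpected values raise KeyError."""
    return [RECODE[value] for value in raw]

example = recode_responses([0, 5, 10, 0])
print(example)
```

Raising an error on unexpected values is a deliberate choice in this sketch, so that a miscoded response cannot slip through silently.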

Previous work suggests that the variability in the ASQ:SE data can be explained by two factors. The following graph points in the same direction. Therefore, I’ll rely on the previously published analysis to assign each item to its factor.

## 
## Note: parallel analysis suggests 8 factors.

The model fit needs to be checked; CFI, TLI, and RMSEA are the main statistics for that.

M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR TLI CFI
stats 734.17 374 0 0.04 0.04 0.05 0.08 0.91 0.92
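
The reported RMSEA can be approximated from M2 and its degrees of freedom. The sketch below assumes a common textbook formula, sqrt(max(M2 - df, 0) / (df * (N - 1))), and N = 500 from the subsample. The software that produced the table may use a slightly different denominator (for example N instead of N - 1), which does not change the rounded value here:

```python
import math

def rmsea(stat, df, n):
    """RMSEA from a chi-square-type fit statistic (common textbook formula)."""
    return math.sqrt(max(stat - df, 0.0) / (df * (n - 1)))

# M2 and df from the fit table; n is the subsample size (an assumption).
value = rmsea(stat=734.17, df=374, n=500)
print(round(value, 2))
```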

With the fit acceptable, I’ll check the results for each item.

a1 a2 d1 d2
q_1 0.00 2.02 6.87 3.63
q_2 0.00 1.20 4.23 2.35
q_3 0.00 1.87 4.76 2.92
q_4 0.39 0.00 1.65 0.19
q_5 2.02 0.00 4.51 2.53
q_6 0.60 0.00 2.61 1.15
q_7 2.11 0.00 4.46 2.11
q_8 3.23 0.00 7.15 4.19
q_9 0.00 1.70 5.37 3.20
q_10 0.00 1.65 7.49 3.56
q_11 2.26 0.00 5.23 2.09
q_12 0.93 0.00 1.48 0.27
q_13 1.88 0.00 5.45 3.21
q_14 0.78 0.00 3.69 1.70
q_15 0.70 0.00 3.40 2.56
q_16 0.71 0.00 4.11 2.55
q_17 0.00 3.99 8.01 5.35
q_18 2.40 0.00 5.88 2.96
q_19 1.76 0.00 2.99 1.30
q_20 0.60 0.00 2.00 0.30
q_21 1.68 0.00 4.88 4.12
q_22 1.37 0.00 5.96 4.94
q_23 0.75 0.00 2.31 1.07
q_24 1.41 0.00 3.48 2.00
q_25 0.00 2.19 3.14 1.43
q_26 0.00 3.42 7.18 5.70
q_27 0.00 1.26 5.83 3.46
q_28 0.00 0.99 4.58 3.16
q_29 1.59 0.00 3.94 1.48
q_30 0.81 0.00 3.41 2.15
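
To read these parameters, it helps to turn them into category probabilities. The sketch below assumes a graded response model in slope-intercept form, where P(response >= k) = logistic(a * theta + d_k). The text does not name the model, so this is an assumption, and the function names are illustrative only:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(theta, a, d):
    """Category probabilities for one item in slope-intercept form.

    theta: latent trait value on the factor the item loads on
    a:     slope (a1 or a2, whichever is non-zero for the item)
    d:     intercepts (d1, d2), one per boundary between adjacent categories
    """
    # P(response >= k) for k = 1..K, padded with 1 (k = 0) and 0 (k = K + 1).
    cumulative = [1.0] + [logistic(a * theta + dk) for dk in d] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(d) + 1)]

# q_4 from the table: a1 = 0.39, d1 = 1.65, d2 = 0.19, evaluated at theta = 0.
probs_q4 = category_probs(theta=0.0, a=0.39, d=(1.65, 0.19))
print([round(p, 3) for p in probs_q4])
```

Under this reading, larger intercepts mean that the higher recoded categories are reached at lower trait levels, and a larger slope means the item separates respondents more sharply.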

Next come the test information curves for both domains (factors).

The latent score can be obtained as the expected a posteriori (EAP) estimate for each sum score.
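
The idea behind an EAP per sum score can be sketched in a deliberately simplified setting: a single factor, a standard normal prior, a fixed quadrature grid, and only the first three items that load on the second factor (q_1, q_2, q_3). The two-factor model used above and the routine that produced the actual scores may differ, and all function names are hypothetical:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(theta, a, d):
    """Graded-response category probabilities (slope-intercept form, assumed)."""
    cumulative = [1.0] + [logistic(a * theta + dk) for dk in d] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(d) + 1)]

def sum_score_likelihood(theta, items):
    """P(sum score = s | theta) for s = 0..max, built up item by item."""
    dist = [1.0]  # before any item, the sum is 0 with probability 1
    for a, d in items:
        probs = category_probs(theta, a, d)
        new = [0.0] * (len(dist) + len(probs) - 1)
        for s, p_s in enumerate(dist):
            for k, p_k in enumerate(probs):
                new[s + k] += p_s * p_k
        dist = new
    return dist

def eap_by_sum_score(items, lo=-6.0, hi=6.0, n_nodes=121):
    """Posterior mean of theta for every possible sum score."""
    step = (hi - lo) / (n_nodes - 1)
    nodes = [lo + i * step for i in range(n_nodes)]
    prior = [math.exp(-0.5 * t * t) for t in nodes]  # unnormalised N(0, 1)
    like = [sum_score_likelihood(t, items) for t in nodes]
    eaps = []
    for s in range(len(like[0])):
        weights = [w * l[s] for w, l in zip(prior, like)]
        eaps.append(sum(t * w for t, w in zip(nodes, weights)) / sum(weights))
    return eaps

# (slope, (d1, d2)) for q_1, q_2, q_3, taken from the item table.
items = [(2.02, (6.87, 3.63)), (1.20, (4.23, 2.35)), (1.87, (4.76, 2.92))]
eaps = eap_by_sum_score(items)
print([round(e, 2) for e in eaps])
```

With three items scored 0 to 2, there are seven possible sum scores, so the function returns seven EAP values, one per score.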

That said, the distribution of the latent / “observed” ability can be checked.

The correlation between the latent (IRT-derived) scores and the traditional sum scores can also be computed.
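
A minimal sketch of that computation, assuming a Pearson correlation (the text does not say which coefficient was used). The inputs below are placeholder numbers chosen so that the result is known in advance. They are not ASQ:SE data:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder inputs with an exact linear relation, for illustration only.
r_up = pearson([1, 2, 3, 4], [3, 5, 7, 9])
r_down = pearson([1, 2, 3], [3, 2, 1])
print(r_up, r_down)
```

In the real analysis, the sign of the coefficient depends on whether the sum score is built from the raw responses or from the recoded ones, since the recoding reverses the direction of the scale.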

Given these results, I would argue that CTT scores and IRT scores are very closely related and that, for pragmatic reasons, the summed score works well.