Multi-dimensional IRT Models

Here we attempt to determine whether the latent space of the data is multidimensional via exploratory multi-factor models. We fit exploratory 2- through 8-factor 2PL models, and then compare them.
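As a reminder of what these models assume, here is a minimal sketch of the multidimensional 2PL response function in slope-intercept form. We assume the fitted models use this parameterization (one slope per factor plus an intercept per item); the function name and the example item parameters are hypothetical and not taken from the fitted models.

```python
import math

def m2pl_prob(a, d, theta):
    """Probability that a child with latent traits `theta` produces an item
    with slopes `a` (one per factor) and intercept `d`."""
    if len(a) != len(theta):
        raise ValueError("a and theta must have the same length")
    logit = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical 2-factor item; these numbers are made up for illustration.
slopes, intercept = [1.2, 0.5], -1.0
p_avg = m2pl_prob(slopes, intercept, theta=[0.0, 0.0])   # child at the mean of both factors
p_high = m2pl_prob(slopes, intercept, theta=[1.0, 1.0])  # child one SD above on both factors
print(round(p_avg, 3), round(p_high, 3))
```

With a single factor this reduces to the ordinary 2PL, which is why the 1-factor model is the baseline in the comparisons.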

Below is a table of sequential model comparisons from the ordinary (unidimensional) 2PL up to the 8-factor exploratory model. AIC (lower is better) is still decreasing at 8 factors, whereas BIC is lowest for the 5-factor model and increases from 6 factors onward. Before we try to understand the higher-dimensional models, we look more closely at the 2-factor model.

Comparison of the unidimensional 2PL and the 2- through 8-factor exploratory models.
Model	AIC	BIC	logLik	df
2PL	571770.0	579159.8	-284525.0	NaN
2-factor	551749.9	562829.2	-273836.0	679
3-factor	542328.8	557092.1	-268447.4	678
4-factor	536212.0	554653.8	-264712.0	677
5-factor	532266.2	554381.2	-262063.1	676
6-factor	530139.1	555921.9	-260324.6	675
7-factor	529576.3	559021.3	-259369.2	674
8-factor	528517.3	561619.2	-258166.7	673
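The selection logic in the table can be checked with a few lines. The sketch below copies AIC, BIC, and logLik from the table, picks the model minimizing each criterion, and uses the standard definition AIC = 2k - 2 logLik to recover the implied number of free parameters k. The function and variable names are our own.

```python
# (model, AIC, BIC, logLik) copied from the comparison table above.
fits = [
    ("2PL",      571770.0, 579159.8, -284525.0),
    ("2-factor", 551749.9, 562829.2, -273836.0),
    ("3-factor", 542328.8, 557092.1, -268447.4),
    ("4-factor", 536212.0, 554653.8, -264712.0),
    ("5-factor", 532266.2, 554381.2, -262063.1),
    ("6-factor", 530139.1, 555921.9, -260324.6),
    ("7-factor", 529576.3, 559021.3, -259369.2),
    ("8-factor", 528517.3, 561619.2, -258166.7),
]

def implied_n_params(aic, loglik):
    # AIC = 2k - 2*logLik, so k = (AIC + 2*logLik) / 2
    return (aic + 2.0 * loglik) / 2.0

best_aic = min(fits, key=lambda f: f[1])[0]
best_bic = min(fits, key=lambda f: f[2])[0]
k_by_model = {name: round(implied_n_params(aic, ll)) for name, aic, _, ll in fits}

print("lowest AIC:", best_aic)
print("lowest BIC:", best_bic)
print(k_by_model)
```

The step in the implied parameter count from one model to the next should equal the df column (for example, 679 between the 2PL and the 2-factor model), which is a quick sanity check on the table.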

First we look at the structure of the loadings: how much variance is accounted for by each dimension?

Proportion of variance per factor (sorted) in exploratory multidimensional models.
Model	F1	F2	F3	F4	F5	F6	F7	F8
1-d
2-d	0.47	0.38
3-d	0.39	0.22	0.21
4-d	0.38	0.22	0.15	0.03
5-d	0.35	0.21	0.17	0.04	0.03
6-d	0.35	0.19	0.19	0.03	0.02	0.01
7-d	0.33	0.19	0.16	0.02	0.02	0.01	0.01
8-d	0.33	0.17	0.16	0.03	0.02	0.02	0.01	0.01

After varimax rotation, most of the variance is captured (even in higher-dimensional models) by the first two or three dimensions: in the table above, F1 and F2 together account for 0.50 to 0.85, F3 adds 0.15 to 0.21, and each factor beyond the third accounts for at most 0.04. We examine the 2- and 3-factor models to try to understand the structure of these factors in terms of the CDI categories.
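For readers unfamiliar with the "proportion of variance" summary, here is a small sketch of one common way to compute it: the sum of squared standardized loadings on a factor divided by the number of items. We assume the table above was produced this way; the toy loading matrix is made up and is not from the fitted models.

```python
def proportion_var(loadings):
    """loadings: list of rows, one per item, each a list of standardized
    factor loadings. Returns one proportion per factor."""
    n_items = len(loadings)
    n_factors = len(loadings[0])
    return [
        sum(row[k] ** 2 for row in loadings) / n_items
        for k in range(n_factors)
    ]

# Hypothetical rotated loadings for three items on two factors.
toy_loadings = [
    [-0.7, 0.6],
    [-0.6, 0.7],
    [-0.8, 0.5],
]
props = sorted(proportion_var(toy_loadings), reverse=True)  # sorted as in the table
print([round(p, 2) for p in props])
```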

What CDI categories do the factors load on? We inspect the average factor loading for each category for the 2-dimensional and 3-dimensional models.
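The per-category averages are a plain group-and-mean over items. A minimal sketch, assuming each item comes with its CDI category and its loading on each factor (the function name and the three toy items are hypothetical):

```python
from collections import defaultdict

def mean_loadings_by_category(items):
    """items: iterable of (category, [loading_F1, loading_F2, ...]) pairs.
    Returns {category: [mean_F1, mean_F2, ...]}."""
    sums = {}
    counts = defaultdict(int)
    for category, loadings in items:
        if category not in sums:
            sums[category] = [0.0] * len(loadings)
        for k, value in enumerate(loadings):
            sums[category][k] += value
        counts[category] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

# Made-up item loadings, for illustration only.
toy_items = [
    ("animals", [-0.60, 0.66]),
    ("animals", [-0.64, 0.62]),
    ("sounds",  [-0.55, 0.53]),
]
means = mean_loadings_by_category(toy_items)
# Sort by mean F1 loading, most negative first, as in the tables below.
for category, values in sorted(means.items(), key=lambda kv: kv[1][0]):
    print(category, [round(v, 2) for v in values])
```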

2-factor Model

Mean loadings on CDI category in 2-factor model
category	F1	F2
states	-0.74	0.58
food_drink	-0.72	0.56
body_parts	-0.72	0.58
clothing	-0.71	0.58
outside	-0.70	0.59
action_words	-0.70	0.58
places	-0.69	0.61
locations	-0.69	0.60
descriptive_words	-0.68	0.61
furniture_rooms	-0.68	0.58
vehicles	-0.68	0.59
pronouns	-0.67	0.61
toys	-0.67	0.63
people	-0.67	0.63
connecting_words	-0.66	0.60
question_words	-0.66	0.64
prepositions	-0.66	0.58
games_routines	-0.65	0.62
animals	-0.62	0.64
time_words	-0.62	0.63
household	-0.61	0.69
quantifiers	-0.60	0.68
sounds	-0.55	0.53

Let’s plot F1 vs. F2 for the 2-factor model and label the extremes.

3-factor Model

Mean loadings on CDI category in 3-factor model
category	F1	F2	F3
food_drink	-0.66	0.40	0.44
body_parts	-0.66	0.41	0.46
clothing	-0.65	0.41	0.44
outside	-0.64	0.45	0.42
action_words	-0.64	0.42	0.43
states	-0.63	0.32	0.57
connecting_words	-0.63	0.46	0.36
locations	-0.63	0.45	0.42
places	-0.62	0.44	0.47
furniture_rooms	-0.61	0.42	0.44
vehicles	-0.61	0.43	0.45
people	-0.61	0.47	0.45
descriptive_words	-0.61	0.43	0.48
prepositions	-0.60	0.46	0.39
pronouns	-0.60	0.44	0.46
toys	-0.60	0.46	0.47
games_routines	-0.58	0.47	0.43
question_words	-0.56	0.44	0.52
time_words	-0.55	0.46	0.46
animals	-0.54	0.47	0.48
quantifiers	-0.54	0.57	0.40
household	-0.53	0.51	0.51
sounds	-0.50	0.44	0.31

Let’s plot F2 vs. F3 for the 3-factor model and label the extremes.

Next we will attempt to understand the 7-factor exploratory model through clustering analyses.

Clustering the Items

We attempt to understand the factors by clustering items’ factor loadings, and then look at acquisition curves (or item difficulties?) for each cluster. We’ll first fit a Gaussian finite mixture model with mclust and plot the solution with t-SNE, and then move on to k-means and hierarchical clustering.

cluster	a1	a2	a3	a4	a5	a6	a7	d	N
4	-2.93	-0.19	0.34	-0.86	-0.22	-0.76	0.34	-3.23	156
3	-2.88	-0.39	-0.09	0.52	0.18	-1.05	0.61	-2.00	211
2	-2.44	-0.08	1.13	-0.35	-0.40	-0.11	0.54	-3.35	126
1	-2.19	-0.62	0.07	0.44	0.05	-0.60	-0.01	-1.07	187

Mclust finds 4 clusters. Below is the number of words from each CDI category in each cluster (1-4).

category	1	2	3	4
furniture_rooms	1	.	31	.
household	2	.	48	.
places	2	.	19	1
body_parts	4	.	23	.
toys	4	.	14	.
clothing	7	.	21	.
vehicles	10	.	4	.
sounds	12	.	.	.
people	19	.	10	.
games_routines	24	.	.	1
animals	43	.	.	.
food_drink	59	.	9	.
connecting_words	.	6	.	.
descriptive_words	.	18	.	45
locations	.	9	.	4
prepositions	.	15	.	.
pronouns	.	43	.	.
quantifiers	.	15	.	.
question_words	.	7	.	.
states	.	1	.	2
time_words	.	12	.	.
outside	.	.	31	.
.	.	.	1	.
action_words	.	.	.	103

Cluster 1 has many (early-learned?) nouns (food & drink, animals, games/routines, sounds, vehicles), with the rest picked up in cluster 3 (furniture/rooms, household, places, body parts, toys, clothing).

Do the words in these clusters vary systematically in their difficulty? As shown below, the items in cluster 1 are much easier, followed by those in cluster 3 (the rest of the nouns). Words in clusters 2 (grammatical) and 4 (verbs/adjectives) tend to be much harder.

Clustered Items Compared to CDI Categories

We take the 7-factor loadings of each item and conduct a k-means clustering with \(k = 22\), the same as the number of CDI categories (e.g., quantifiers, locations, toys, clothing, etc.). We use the adjusted Rand index to compare the clusters to the items' category assignments (0 = chance-level agreement, 1 = perfect agreement), and find a value of 0.011. Next we will use the gap statistic to choose the best \(k\), plot the solution, and again examine the adjusted Rand index against the CDI categories.
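To make the agreement measure concrete, here is a self-contained sketch of the adjusted Rand index computed from the contingency table of two labelings. It follows the standard pair-counting formula; it is not the code used for the analysis, and the toy labels are made up.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs of items that fall in the same cell of the contingency table.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)   # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:               # degenerate case, e.g. one cluster each
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

# Toy example: same grouping under different label names, then a crossed grouping.
ari_same = adjusted_rand_index([0, 0, 1, 1], ["x", "x", "y", "y"])
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(ari_same, ari_crossed)
```

Because the index is corrected for chance, a value near 0 (like the one reported here) means the clusters recover essentially none of the CDI category structure, and negative values are possible.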

##   cluster size ave.sil.width
## 1       1   93          0.19
## 2       2   82          0.14
## 3       3   83          0.19
## 4       4   59          0.29
## 5       5   90          0.39
## 6       6   92          0.21
## 7       7  106          0.19
## 8       8   75          0.28

According to the gap statistic, the optimal number of clusters is \(k = 8\). The adjusted Rand index of this clustering solution compared to the 22 CDI categories is 0.011.
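The `ave.sil.width` column in the output above is presumably the mean silhouette width per cluster. As a sketch of what that quantity measures (not the code that produced the output), the silhouette width of an item compares its mean distance to its own cluster, a, with its mean distance to the nearest other cluster, b, as (b - a) / max(a, b). The toy points and labels are made up.

```python
import math

def silhouette_widths(points, labels):
    """Silhouette width per point (Euclidean distance); needs >= 2 clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette widths need at least two clusters")
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:                      # singleton cluster: width defined as 0
            widths.append(0.0)
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

def mean_width_by_cluster(widths, labels):
    totals, counts = {}, {}
    for w, lab in zip(widths, labels):
        totals[lab] = totals.get(lab, 0.0) + w
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: totals[lab] / counts[lab] for lab in totals}

# Two well-separated toy clusters in 2-D.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [1, 1, 2, 2]
widths = silhouette_widths(toy_points, toy_labels)
print(mean_width_by_cluster(widths, toy_labels))
```

Widths near 1 indicate tight, well-separated clusters; the values of roughly 0.14 to 0.39 reported above suggest considerable overlap between clusters.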

Hierarchical Clustering

Bifactor Models

Since the factors load on different lexical classes and CDI categories, and more than 6 factors appear justified (at least by AIC), let’s try bifactor models in which each item loads on a general factor plus one specific factor, defined either by 1) lexical class (nouns, verbs, adjectives, function words, other) or 2) CDI category (22 levels, e.g. quantifiers, locations, animals, people, sounds, etc.).
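A bifactor structure can be written down as a pattern of free loadings: every item loads on the general factor and on exactly one specific factor. The sketch below builds that pattern from a vector of group labels; the helper name and the five toy items are hypothetical, and the actual models were fit with IRT software rather than this code.

```python
def bifactor_pattern(item_groups):
    """item_groups: one group label per item (lexical class or CDI category).
    Returns (groups, pattern), where pattern[i] is a 0/1 row with column 0 for
    the general factor and columns 1.. for the specific factors."""
    groups = sorted(set(item_groups))
    column = {g: 1 + i for i, g in enumerate(groups)}
    pattern = []
    for g in item_groups:
        row = [0] * (1 + len(groups))
        row[0] = 1          # general factor: free for every item
        row[column[g]] = 1  # the item's own specific factor
        pattern.append(row)
    return groups, pattern

# Made-up items labeled by lexical class.
toy_classes = ["nouns", "nouns", "verbs", "function_words", "verbs"]
groups, pattern = bifactor_pattern(toy_classes)
print(groups)
for row in pattern:
    print(row)
```

The two models compared below differ only in which labels are supplied: 5 lexical classes versus 22 CDI categories.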

Comparing the two bifactor models, the category model is preferred by both AIC and BIC, but it still does not fit as well as the 2-factor exploratory model (compare log-likelihoods: -278893.0 vs. -273836.0).

Comparison of lexical class and category bifactor models.
Model	AIC	BIC	logLik	df
Lexical Class	563429.1	574513.8	-279674.6	.
Category	561866.0	572950.7	-278893.0	0

Further analysis is needed to understand the multidimensional structure of the CDI data, but it is intriguing that nouns, which represent the bulk of the items on the CDI, seem to hang together, and separately from other parts of speech.

Understanding the Multidimensionality of Spanish CDI Data

George

2020-12-17