DLL-ES IRT Analysis and Recommendations

Goals

The goals are 1) to create a word list that is informative about both English and Spanish vocabulary size and 2) to ensure that there are sufficient doublets to estimate lexical overlap. On an IRT view, we can’t perfectly assess goal 2 (at least not without better bilingual CDI data), but we can assess goal 1: that is, we can look at whether the reduced word list is a good sub-test for the full CDI in each language. Our original concern was that the current DLL-ES test might not perform well for older or high-ability kids due to the lack of abstract words, and we can test this formally.

The DLL lists are meant to be used together with the original English and Spanish MCDI short forms.

## [1] "water (beverage)"

## [1] "rain"

##  [1] "moo"         "truck"       "bottle"      "glass"       "light"      
##  [6] "cloud"       "hose"        "sun"         "night night" "jump"       
## [11] "look"        "sit"         "big"         "then"        "dirty"      
## [16] "today"

Notes: the DLL lists chicken, whereas Wordbank has both chicken (food) and chicken (animal). The same holds for water (Wordbank includes both the beverage and non-beverage senses). We will use all related Wordbank items in our analysis. Many items listed as plurals on the DLL are also singular in Wordbank (e.g., ears/orejas, fingers/dedos).

English DLL items not in our wordbank IRT model: one, two, three, family, drum, good morning, also, and many. Spanish DLL items not in our wordbank IRT model: tostada, algunos, alimentar, sonreír, no hay más (although we have no and no hay, as well as más), and lastimado (but we have lastimar(se)).

DLL2-Spanish has “esta noche” translated as “at night”, but it should be “tonight”. The DLL also has “puede”, a conjugated form of the verb poder (which is in Wordbank). Wordbank and the DLL also often disagree about number: e.g., escalera and pierna on the DLL correspond to escaleras and piernas in Wordbank. (See also brazos and manos.)

Does the DLL short form recover full form scores?

English DLL Level 1

Using data from 3717 English-speaking children 12-18 months of age from Wordbank, we test how well sumscores from the DLL-ES1 short English form + CDI:WG short form predict children’s production scores from the full CDI (WG/WS). The left panel shows full CDI scores vs. the DLL-ES1 short + CDI:WG short score, and the right panel shows the full CDI scores vs. just the CDI:WG short form score.

Overall, the correlation of children’s CDI:WG short + DLL scores and their full CDI production scores is quite high (\(r=0.98\)), but as shown above, for small vocabulary sizes the DLL score overestimates full CDI:WG production scores, while for higher full CDI:WG scores the DLL underestimates vocab size (dotted line has slope \(=163 / 395\)). However, the CDI:WG short form alone (right panel) shows a similar (and more extreme) overestimation for small vocabulary sizes.
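The comparison described above can be sketched on simulated data. This is a minimal illustration, not the analysis code behind the figures: the response matrix `resp`, the index vector `short_idx`, and the choice of "easiest items" for the short form are all hypothetical, and the dotted line is assumed to be the ratio of short-form to full-form item counts (163 / 395).

```r
set.seed(1)
n_child <- 200; n_full <- 395; n_short <- 163  # item counts taken from the slope quoted above

# Simulated responses (hypothetical data, not Wordbank)
theta <- rnorm(n_child)                 # child ability
d     <- rnorm(n_full, mean = -1.5)     # item easiness
p     <- plogis(outer(theta, d, "+"))   # P(child produces item), children x items
resp  <- matrix(rbinom(length(p), 1, p), nrow = n_child)

# Assumption for illustration: the short form consists of the easiest items
short_idx <- order(d, decreasing = TRUE)[seq_len(n_short)]

full_score  <- rowSums(resp)
short_score <- rowSums(resp[, short_idx])

# Correlation between short-form and full-form sumscores
r <- cor(short_score, full_score)

# Short score expected if the short form were an unbiased sample of the full form
expected_short <- full_score * n_short / n_full
mean_bias <- mean(short_score - expected_short)  # > 0 means the short form overestimates
```

A short form built from easier-than-average items can correlate very highly with the full score and still sit above the proportional line, which is the pattern described for the DLL and CDI short forms.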

English DLL Level 2

Using data from 6411 English-speaking children 16-30 months of age from Wordbank, we test how well sumscores from the DLL-ES2 short English form + CDI:WS short form (A) predict children’s full production scores from the CDI:WS. The left panel shows full CDI scores vs. the DLL-ES2 short + CDI:WS short form (A) score, and the right panel shows the full CDI scores vs. just the CDI:WS short form (A) score.

Overall, the correlation of children’s CDI:WS short + DLL2 scores and their full CDI production scores is quite high (\(r=0.98\)), but as shown above, the DLL2 again mostly overestimates production scores on the full CDI (dotted line has slope \(=166 / 680\)). In comparison, the CDI:WS short form (A) score only overestimates full CDI scores for smaller vocabulary sizes (<400).

Spanish DLL Level 1

Now we look at overestimation for Spanish DLLs + CDI short forms.

Using Wordbank data from 731 Spanish-speaking children aged 12-18 months, we test how well sumscores from the DLL-ES1 short Spanish form correlate with children’s full CDI:WG production scores.

As for English, the correlation of Spanish-speaking children’s DLL scores and their full CDI:WG production scores is quite high (\(r=0.99\)), but as shown above, their DLL score overestimates the production score on the full CDI at smaller vocabulary sizes (dotted line has slope \(=61 / 428\)). Do note that few children in this dataset have large productive vocabularies.

Spanish DLL Level 2

Recommendations

Overall, it seems that many of the items on the DLL are somewhat easier than average, and thus these forms tend to overestimate children’s full CDI scores (indeed, for items on the DLL1 English short form, the average easiness is -0.89, while the mean easiness of items not on the DLL is -1.85). This is also true of the CDI:WG short English form: the average easiness is -0.19 and the average ease of items not on the WG short form is -1.98. The CDI:WS short English form (A) is less biased towards easy items: average easiness is -1.29 vs. -1.82 for items not on the short WS. The histograms below show the distribution of easiness parameters for English (left) and Spanish (right) CDI words. Solid lines show the average ease of DLL items (DLL 1 = red, DLL 2 = orange), and dashed lines show the average of non-DLL items.

Spanish DLL1 items have an average ease of -0.96, while other items on the full CDI have a mean ease of -2.02. The Spanish DLL2 shows the least bias: items on it have an average ease of -1.75, while other CDI items have a mean of -1.94.

We recommend bringing the overall mean estimated IRT difficulty of the words selected for the DLLs closer to the mean difficulty of the words on the rest of the CDI.
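The on-list vs. off-list comparison of mean easiness can be sketched as follows. The data frame `items` and its `on_dll` membership flags are made up for illustration; only the easiness values are borrowed from the English parameters reported in the doublet tables of this report.

```r
# Hypothetical item table: one row per CDI item
items <- data.frame(
  definition = c("dog", "hat", "no", "think", "last", "after"),
  d          = c(2.49, 1.37, 2.57, -5.86, -6.69, -5.37),  # English easiness values
  on_dll     = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),  # illustrative membership only
  stringsAsFactors = FALSE
)

mean_on  <- mean(items$d[items$on_dll])    # mean easiness of items on the list
mean_off <- mean(items$d[!items$on_dll])   # mean easiness of the remaining items
gap      <- mean_on - mean_off             # > 0: the list is biased towards easy items
```

A positive `gap` corresponds to the situation described above, where list items are easier than the rest of the CDI.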

To start, we examine IRT easiness parameters for the doublets on the existing DLL lists, looking for items with a large mismatch between their English and Spanish ease.

Do doublets have similar difficulties?

We want to assess whether doublet items have similar difficulty (operationalized by their IRT parameters) in English and in Spanish. For example, if “perro” were for some reason much more difficult than “dog”, you wouldn’t want to include it, because it wouldn’t be a good item for estimating vocabulary overlap!
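To see what an easiness mismatch means in practice, here is a small sketch that assumes a 2PL model in slope-intercept form, P = logistic(a1 * theta + d); the report does not state the parameterization, so treat that form and the helper `p_produce` as assumptions. The parameter values are those reported for sombrero and hat in the DLL1 doublet table.

```r
# 2PL in slope-intercept form (assumed parameterization):
#   P(produce | theta) = logistic(a1 * theta + d)
p_produce <- function(theta, a1, d) plogis(a1 * theta + d)

theta <- c(-1, 0, 1)  # low, average, and high ability

p_sombrero <- p_produce(theta, a1 = 3.28, d = -2.14)  # Spanish parameters for "sombrero"
p_hat      <- p_produce(theta, a1 = 3.15, d =  1.37)  # English parameters for "hat"

# Side-by-side production probabilities at each ability level
round(cbind(theta, p_sombrero, p_hat), 2)
```

Under this assumed form, a child of average ability is far more likely to produce "hat" than "sombrero", so the pair would be a poor doublet for estimating overlap.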

I’ve entered the English translation-equivalent item for each Spanish DLL1 item. The table below shows the parameters for these items (en_d = English easiness, sp_d = Spanish easiness), ordered from most to least discrepant by squared easiness difference (d_diff_sq). The _a1 columns show item discriminations (slopes), and sp_en_d_diff is Spanish minus English easiness. Clearly some of these items have quite different difficulty levels, e.g. hat is much easier than sombrero (and somewhat easier than gorra). We may want to pick a criterion for the maximum allowable discrepancy, and try to find items that are more equivalent.

word	translation	sp_a1	sp_d	en_a1	en_d	sp_en_d_diff	d_diff_sq
sombrero	hat	3.28	-2.14	3.15	1.37	-3.51	12.33
otro/otra vez	other	2.34	-0.84	3.09	-4.05	3.22	10.34
plato	dish	4.57	0.17	2.99	-2.90	3.06	9.39
nana	babysitter	0.97	-1.36	2.41	-4.31	2.94	8.67
acabar	finish	2.51	-0.93	3.13	-3.68	2.75	7.55
mesa	table	4.61	0.34	5.01	-2.22	2.56	6.55
afuera	outside	3.53	-1.82	2.61	0.44	-2.26	5.10
gorra	hat	3.07	-0.85	3.15	1.37	-2.22	4.92
pájaro	bird	2.77	-0.23	2.51	1.78	-2.01	4.05
niña	girl	2.35	0.45	2.79	-1.40	1.85	3.42
nariz	nose	4.20	0.45	3.42	2.18	-1.73	2.99
taza	cup	3.53	-0.84	3.39	0.83	-1.67	2.79
suave	soft	3.32	-4.34	3.28	-2.74	-1.60	2.55
cantar	sing	3.07	-0.79	4.05	-2.37	1.58	2.51
casa	house	4.22	0.55	3.31	-1.03	1.58	2.49
cobija	blanket	4.49	-1.29	3.26	0.29	-1.57	2.48
pato	duck	2.50	0.16	2.27	1.66	-1.50	2.26
sofá	couch	3.00	-3.68	3.78	-2.22	-1.46	2.13
dulce	candy	3.13	0.63	2.66	-0.75	1.38	1.91
yo	I	2.25	0.23	2.11	-1.11	1.34	1.79
ayudar	help	4.23	-2.32	3.36	-1.07	-1.25	1.55
cabeza	head	4.55	0.77	4.14	-0.43	1.19	1.43
jugo	juice	2.88	0.35	2.42	1.51	-1.16	1.34
radio	radio	3.89	-1.92	2.83	-3.03	1.11	1.24
cereal	cereal	2.37	-1.67	3.13	-0.57	-1.10	1.21
empujar	push	3.92	-3.13	3.33	-2.03	-1.09	1.20
tutú	choo choo	1.11	-0.52	2.00	0.55	-1.07	1.14
casa	home	4.22	0.55	3.34	-0.49	1.04	1.09
oscuro	dark	3.48	-3.75	3.39	-2.76	-1.00	0.99
besar	kiss	3.33	-1.15	3.42	-0.17	-0.98	0.97
baño	bath	4.61	0.50	2.81	1.45	-0.96	0.92
me	me	1.75	-1.31	2.01	-0.37	-0.95	0.89
piedra	rock	4.04	-1.49	3.16	-0.55	-0.94	0.89
silla	chair	4.92	0.91	4.39	0.11	0.81	0.65
patear	kick	3.52	-2.69	3.35	-1.93	-0.76	0.57
no	no	1.50	1.81	2.09	2.57	-0.76	0.57
saltar	jump	3.20	-2.14	3.87	-1.40	-0.74	0.55
noche	night	2.95	-2.26	2.59	-1.54	-0.73	0.53
hola	hi	1.94	1.23	1.45	1.85	-0.62	0.38
esperar(se)	wait	3.60	-3.34	3.02	-2.72	-0.61	0.38
planta	plant	3.89	-2.22	3.12	-2.80	0.58	0.34
romper	break	3.80	-2.11	3.98	-2.61	0.50	0.25
cuchara	spoon	4.17	0.04	3.42	0.54	-0.50	0.25
rápido (descriptive)	fast	3.64	-2.92	4.13	-3.37	0.45	0.20
brincar	jump	3.90	-1.02	3.87	-1.40	0.39	0.15
lluvia	rain	3.60	-1.35	4.29	-0.96	-0.38	0.15
galleta	cookie	2.95	1.21	2.92	1.56	-0.36	0.13
luna	moon	3.02	-0.19	2.43	0.09	-0.27	0.08
lejos	away	2.88	-2.79	2.98	-3.05	0.26	0.07
león	lion	3.09	-1.28	3.32	-1.12	-0.16	0.03
lámpara	lamp	4.22	-3.39	2.94	-3.28	-0.11	0.01
ratón	mouse	3.33	-1.10	3.28	-1.13	0.03	0.00
carreola	stroller	2.41	-1.85	2.61	-1.84	-0.01	0.00
muñeca	doll	2.59	-0.30	2.11	-0.30	0.00	0.00
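The discrepancy columns in the table above can be reproduced from the easiness columns, as in this sketch on four rows copied from the table. The cutoff `max_sq` is a hypothetical criterion chosen only for illustration; the report does not fix one. Small differences from the printed values (e.g. 12.32 vs. 12.33) come from the table being computed on unrounded parameters.

```r
params <- data.frame(
  word        = c("silla", "sombrero", "luna", "mesa"),
  translation = c("chair", "hat", "moon", "table"),
  sp_d        = c(0.91, -2.14, -0.19,  0.34),
  en_d        = c(0.11,  1.37,  0.09, -2.22),
  stringsAsFactors = FALSE
)

params$sp_en_d_diff <- params$sp_d - params$en_d   # Spanish minus English easiness
params$d_diff_sq    <- params$sp_en_d_diff^2       # squared discrepancy

# Order from most to least discrepant
params <- params[order(-params$d_diff_sq), ]

# Hypothetical criterion: flag doublets whose squared discrepancy exceeds max_sq
max_sq  <- 2
flagged <- params$word[params$d_diff_sq > max_sq]
```

With this cutoff, sombrero/hat and mesa/table are flagged while silla/chair and luna/moon pass.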

DLL Level 2

word	translation	sp_a1	sp_d	en_a1	en_d	sp_en_d_diff	d_diff_sq
sombrero	hat	3.28	-2.14	3.15	1.37	-3.51	12.33
banco (outside)	bench	2.94	-2.18	3.59	-5.43	3.25	10.57
mucho	much	3.33	-1.80	3.10	-5.04	3.24	10.48
acabar	finish	2.51	-0.93	3.13	-3.68	2.75	7.55
plato	plate	4.57	0.17	4.19	-2.16	2.33	5.42
gorra	hat	3.07	-0.85	3.15	1.37	-2.22	4.92
barco	boat	3.41	-2.01	2.90	0.19	-2.20	4.84
pájaro	bird	2.77	-0.23	2.51	1.78	-2.01	4.05
caber	fit	3.76	-3.43	4.21	-5.30	1.88	3.52
tirar	dump	3.82	-2.11	2.81	-3.89	1.78	3.15
persona	person	3.19	-3.76	3.08	-5.51	1.75	3.05
abajo	down	3.46	-1.17	2.22	0.46	-1.63	2.66
feliz	happy	3.08	-3.12	2.86	-1.53	-1.59	2.52
columpio	swing (object)	3.80	-1.96	3.21	-0.49	-1.47	2.16
dulce	candy	3.13	0.63	2.66	-0.75	1.38	1.91
nosotros	us	3.51	-4.73	3.31	-6.09	1.36	1.86
amiga	friend	2.84	-1.99	3.50	-3.35	1.36	1.85
después	after	2.93	-4.03	3.88	-5.37	1.34	1.78
día	day	2.70	-2.47	3.09	-3.76	1.29	1.66
basura	trash	4.54	-0.03	2.20	-1.28	1.25	1.56
escuchar	listen	2.97	-2.99	3.70	-4.23	1.24	1.54
una	a	2.32	-1.10	2.25	-2.33	1.23	1.52
peine	comb	3.94	-0.68	2.92	-1.90	1.23	1.50
perro	dog	1.83	1.28	1.93	2.49	-1.21	1.47
bee/mee	baa baa	0.83	-0.30	1.17	0.90	-1.20	1.45
jugo	juice	2.88	0.35	2.42	1.51	-1.16	1.34
gustar	like	4.32	-2.70	4.56	-3.76	1.06	1.12
estrella	star	3.85	-1.78	2.82	-0.76	-1.01	1.03
oso	bear	2.96	-0.09	2.48	0.81	-0.91	0.83
ellos	them	3.42	-4.51	3.26	-5.40	0.89	0.80
hola	hello	1.94	1.23	1.41	0.35	0.87	0.76
horno	oven	3.65	-4.32	3.69	-3.51	-0.81	0.66
noche	night	2.95	-2.26	2.59	-1.54	-0.73	0.53
miau	meow	1.54	0.91	2.01	1.61	-0.70	0.49
collar	necklace	3.37	-2.99	3.02	-2.29	-0.70	0.49
terminar	finish	3.35	-3.05	3.13	-3.68	0.63	0.40
hola	hi	1.94	1.23	1.45	1.85	-0.62	0.38
frío	cold	3.63	-0.72	3.45	-0.11	-0.60	0.37
un	a	2.42	-1.81	2.25	-2.33	0.52	0.27
toalla	towel	4.89	-2.02	4.12	-1.51	-0.52	0.27
escuela	school	3.56	-0.88	2.87	-1.39	0.51	0.27
rápido (descriptive)	fast	3.64	-2.92	4.13	-3.37	0.45	0.20
lluvia	rain	3.60	-1.35	4.29	-0.96	-0.38	0.15
caballo	horse	2.99	0.65	3.34	0.32	0.34	0.11
último	last	4.55	-6.41	4.06	-6.69	0.28	0.08
escaleras	stairs	4.20	-1.84	3.28	-2.12	0.28	0.08
llevar(se)	carry	4.34	-3.36	4.09	-3.10	-0.26	0.07
avión	airplane	2.81	0.24	2.97	0.48	-0.24	0.06
pensar	think	3.96	-5.63	3.87	-5.86	0.22	0.05
cielo	sky	3.87	-1.37	2.88	-1.59	0.22	0.05
bandera	flag	3.48	-2.62	2.71	-2.42	-0.21	0.04
gracias	thank you	2.27	1.35	2.04	1.17	0.18	0.03
recámara	bedroom	3.72	-3.19	4.14	-3.03	-0.17	0.03
calcetín	sock	3.26	0.16	2.01	0.23	-0.07	0.01
piernas	leg	4.49	-1.64	4.86	-1.69	0.05	0.00

Recommended Item Swaps

We will use Wordbank’s unilemmas to find translation-equivalent pairs that have smaller d_diff_sq values than current DLL items. We first get the English / Spanish unilemmas from wordbank (both WS and WG), and below simply show the Spanish vs. English easiness parameters.

We work with the 381 unilemmas that match both our English and Spanish IRT parameters. For each DLL list we simply swap the N=15 items with the largest easiness difference for items with minimal easiness difference, attempting to find replacement items of the same lexical class. We report the original DLL list’s easiness SSE, as well as the improvement in easiness SSE after each swap is made.
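The swap procedure described above can be sketched as a simple greedy loop. This is not the `improve_DLL_list` function used in this report, whose internals are not shown; `swap_sketch`, the column names, the lexical class labels, and the pool membership are all hypothetical, and only the easiness values are taken from the doublet tables.

```r
swap_sketch <- function(dll, pool, n_swaps) {
  # dll, pool: data frames with columns word, lexical_class, sp_d, en_d
  dll$sq  <- (dll$sp_d - dll$en_d)^2
  pool$sq <- (pool$sp_d - pool$en_d)^2
  sse_before <- sum(dll$sq)

  # Visit the n_swaps most discrepant list items, worst first
  worst <- order(-dll$sq)[seq_len(min(n_swaps, nrow(dll)))]
  for (i in worst) {
    # Candidate replacements must share the item's lexical class
    cand <- which(pool$lexical_class == dll$lexical_class[i])
    if (length(cand) == 0) next
    best <- cand[which.min(pool$sq[cand])]
    if (pool$sq[best] >= dll$sq[i]) next      # swap would not reduce the SSE
    dll[i, names(dll)] <- pool[best, names(dll)]
    pool <- pool[-best, ]                     # each pool item can be used once
  }
  list(new_list = dll, sse_before = sse_before, sse_after = sum(dll$sq))
}

dll <- data.frame(word = c("hat", "finish", "moon"),
                  lexical_class = c("noun", "verb", "noun"),
                  sp_d = c(-2.14, -0.93, -0.19),
                  en_d = c( 1.37, -3.68,  0.09),
                  stringsAsFactors = FALSE)
pool <- data.frame(word = c("doll", "mouse", "break"),
                   lexical_class = c("noun", "noun", "verb"),
                   sp_d = c(-0.30, -1.10, -2.11),
                   en_d = c(-0.30, -1.13, -2.61),
                   stringsAsFactors = FALSE)

res <- swap_sketch(dll, pool, n_swaps = 2)
```

Because each swap is chosen independently, a greedy loop like this minimizes the easiness SSE but does nothing to control the mean easiness of the resulting list; any such effect is a by-product.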

English Level 1 DLL

new_dll1ENshort <- improve_DLL_list(dll1ENshort, dict, Nswaps=Nswaps, language="english")$new_list

## [1] "Original list item easiness SSE: 150.24"
## [1] "Mean Spanish item easiness: -0.8"
## [1] "Mean English item easiness: -1.34"
## [1] "Selecting from 291 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'water (beverage)' with 'doll' (SSE improvement = 11.46)"
## [1] "Replacing 'none' with 'these' (SSE improvement = 12.63)"
## [1] "Replacing 'on' with 'all' (SSE improvement = 15.45)"
## [1] "Replacing 'her' with 'that' (SSE improvement = 11.25)"
## [1] "Replacing 'street' with 'stroller' (SSE improvement = 17.35)"
## [1] "Replacing 'meat' with 'mouse' (SSE improvement = 4.52)"
## [1] "Replacing 'glass' with 'toy (object)' (SSE improvement = 5.32)"
## [1] "Replacing 'truck' with 'sock' (SSE improvement = 5.75)"
## [1] "Replacing 'bathroom' with 'stove' (SSE improvement = 5.82)"
## [1] "Replacing 'there' with 'same' (SSE improvement = 2.75)"
## [1] "Replacing 'chicken (food)' with 'leg' (SSE improvement = 2.99)"
## [1] "Replacing 'his' with 'why' (SSE improvement = 5.98)"
## [1] "Replacing 'sun' with 'diaper' (SSE improvement = 1.69)"
## [1] "Replacing 'sleep' with 'blow' (SSE improvement = 1.71)"
## [1] "Replacing 'hot' with 'full' (SSE improvement = 1.64)"
## [1] "New list item easiness SSE: 43.93"
## [1] "Mean Spanish item easiness: -1.09"
## [1] "Mean English item easiness: -1.33"

write.csv(new_dll1ENshort, file="DLL/new_DLL-ES1-short-English.csv")

Swapping only 15 items reduced the total SSE by more than 50% (from 150.24 to 43.93). Although it was not optimized for, the mean ease of the Spanish items also decreased (from -0.80 to -1.09), while the mean ease of the English items was essentially unchanged (-1.34 vs. -1.33), so the two language means are now closer together. We now do the same for the other DLL forms before determining whether the DLL overestimation has decreased.

English Level 2 DLL

new_dll2ENshort <- improve_DLL_list(dll2ENshort, dict, Nswaps=Nswaps, language="english")$new_list

## [1] "Original list item easiness SSE: 106.84"
## [1] "Mean Spanish item easiness: -1.02"
## [1] "Mean English item easiness: -1.14"
## [1] "Selecting from 319 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'street' with 'doll' (SSE improvement = 17.35)"
## [1] "Replacing 'glass' with 'stroller' (SSE improvement = 5.33)"
## [1] "Replacing 'truck' with 'sink' (SSE improvement = 5.76)"
## [1] "Replacing 'bathroom' with 'mouse' (SSE improvement = 5.82)"
## [1] "Replacing 'his' with 'where (question)' (SSE improvement = 6.1)"
## [1] "Replacing 'good night' with 'wanna' (SSE improvement = 6.64)"
## [1] "Replacing 'today' with 'shh' (SSE improvement = 7.38)"
## [1] "Replacing 'boy' with 'yum yum' (SSE improvement = 2.05)"
## [1] "Replacing 'fall' with 'blow' (SSE improvement = 2.08)"
## [1] "Replacing 'broken' with 'full' (SSE improvement = 2.01)"
## [1] "Replacing 'out' with 'these' (SSE improvement = 2.1)"
## [1] "Replacing 'belly button' with 'toy (object)' (SSE improvement = 2.11)"
## [1] "Replacing 'big' with 'sick' (SSE improvement = 2.3)"
## [1] "Replacing 'soap' with 'stove' (SSE improvement = 2.35)"
## [1] "Replacing 'moo' with 'baby' (SSE improvement = 2.82)"
## [1] "New list item easiness SSE: 34.65"
## [1] "Mean Spanish item easiness: -1.14"
## [1] "Mean English item easiness: -1.15"

write.csv(new_dll2ENshort, file="DLL/new_DLL-ES2-short-English.csv")

Spanish Level 1 DLL

new_dll1SPshort <- improve_DLL_list(dll1SPshort %>% select(-translation), dict, Nswaps=Nswaps, language="spanish")$new_list

## [1] "Original list item easiness SSE: 114.15"
## [1] "Mean Spanish item easiness: -0.78"
## [1] "Mean English item easiness: -0.74"
## [1] "Selecting from 328 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'finish' with 'blow' (SSE improvement = 7.55)"
## [1] "Replacing 'babysitter' with 'cockadoodledoo' (SSE improvement = 8.66)"
## [1] "Replacing 'water (beverage)' with 'sink' (SSE improvement = 11.46)"
## [1] "Replacing 'good night' with 'wanna' (SSE improvement = 6.64)"
## [1] "Replacing 'hat' with 'toy (object)' (SSE improvement = 12.33)"
## [1] "Replacing 'bird' with 'stove' (SSE improvement = 4.05)"
## [1] "Replacing 'plate' with 'train' (SSE improvement = 5.42)"
## [1] "Replacing 'bathroom' with 'diaper' (SSE improvement = 5.81)"
## [1] "Replacing 'girl' with 'shh' (SSE improvement = 3.39)"
## [1] "Replacing 'table' with 'hair' (SSE improvement = 6.54)"
## [1] "Replacing 'out' with 'where (question)' (SSE improvement = 2.1)"
## [1] "Replacing 'couch' with 'watch (object)' (SSE improvement = 2.12)"
## [1] "Replacing 'duck' with 'grapes' (SSE improvement = 2.24)"
## [1] "Replacing 'blanket' with 'sweater' (SSE improvement = 2.47)"
## [1] "Replacing 'candy' with 'bedroom' (SSE improvement = 1.88)"
## [1] "New list item easiness SSE: 31.49"
## [1] "Mean Spanish item easiness: -1"
## [1] "Mean English item easiness: -0.83"

write.csv(new_dll1SPshort, file="DLL/new_DLL-ES1-short-Spanish.csv")

Note that the output above gives an original mean ease of -0.74 for the English items on this list and -0.78 for the Spanish items; after the substitutions both means are lower (-0.83 and -1.00, respectively).

Spanish Level 2 DLL

new_dll2SPshort <- improve_DLL_list(dll2SPshort %>% select(-translation), dict, Nswaps=Nswaps, language="spanish")$new_list

## [1] "Original list item easiness SSE: 73.77"
## [1] "Mean Spanish item easiness: -1.27"
## [1] "Mean English item easiness: -1.28"
## [1] "Selecting from 330 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'finish' with 'blow' (SSE improvement = 7.55)"
## [1] "Replacing 'hat' with 'doll' (SSE improvement = 12.33)"
## [1] "Replacing 'bird' with 'stroller' (SSE improvement = 4.05)"
## [1] "Replacing 'boat' with 'sink' (SSE improvement = 4.84)"
## [1] "Replacing 'plate' with 'mouse' (SSE improvement = 5.42)"
## [1] "Replacing 'a lot' with 'where (question)' (SSE improvement = 6.85)"
## [1] "Replacing 'swing (object)' with 'toy (object)' (SSE improvement = 2.15)"
## [1] "Replacing 'candy' with 'stove' (SSE improvement = 1.9)"
## [1] "Replacing 'down' with 'these' (SSE improvement = 2.66)"
## [1] "Replacing 'person' with 'cockadoodledoo' (SSE improvement = 3.04)"
## [1] "Replacing 'after' with 'shh' (SSE improvement = 1.76)"
## [1] "Replacing 'they' with 'all' (SSE improvement = 3.67)"
## [1] "Replacing 'juice' with 'train' (SSE improvement = 1.33)"
## [1] "Replacing 'baa baa' with 'yum yum' (SSE improvement = 1.44)"
## [1] "Replacing 'dog' with 'diaper' (SSE improvement = 1.46)"
## [1] "New list item easiness SSE: 13.32"
## [1] "Mean Spanish item easiness: -1.2"
## [1] "Mean English item easiness: -1.22"

write.csv(new_dll2SPshort, file="DLL/new_DLL-ES2-short-Spanish.csv")

As with the previous list, the mean ease of the English items (-1.22) remains close to that of the Spanish items (-1.20) after the substitutions.

Overestimation in new DLL lists

The new DLL1 English short form (left) certainly looks like it’s overestimating less (compare to just the CDI:WG short form). We quantify the reliability (ICC1) and the overall root mean squared error (RMSE) of the new DLL form + CDI short form, just the CDI short form, and of the old DLL form.

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00608386 (tol = 0.002, component 1)


The table below shows that the reliability of the original (old) DLLs and the (new) DLLs with swaps is the same for extrapolating to children’s full CDI scores (column: DLL.vs..full.ICC1), and higher than the reliability of extrapolating full CDI scores from just the appropriate short CDI form (column: CDI.short.vs..full.ICC1). The RMSE of the new DLL forms improved for the DLL2 EN short form (42.22 vs. 43.45) and the DLL1 SP short form (11.81 vs. 12.22), but was marginally worse for the DLL1 EN short form (11.42 vs. 10.93) and the DLL2 SP short form (49.65 vs. 48.27). For all DLL forms (new and old), the RMSE is better than that achieved by extrapolating from the CDI short form (column: CDI.short.vs..full.RMSE).

DLL	DLL.vs..full.ICC1	CDI.short.vs..full.ICC1	DLL.vs..full.RMSE	CDI.short.vs..full.RMSE
Old DLL1 EN short	0.97	0.92	10.93	19.97
New DLL1 EN short	0.97	0.92	11.42	19.97
Old DLL2 EN short	0.97	0.96	43.45	46.30
New DLL2 EN short	0.97	0.96	42.22	46.30
Old DLL1 SP short	0.98	0.97	12.22	18.67
New DLL1 SP short	0.98	0.97	11.81	18.67
Old DLL2 SP short	0.96	0.94	48.27	62.35
New DLL2 SP short	0.96	0.94	49.65	62.35
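For readers who want to compute comparable metrics, here is a minimal sketch of ICC(1) and RMSE on simulated scores. It is not the code that produced the table above: the one-way ANOVA formula for ICC(1), the helper names `icc1` and `rmse`, and the rescaling of short-form scores by the item-count ratio are all assumptions made for illustration.

```r
icc1 <- function(x, y) {
  # x, y: two measurements of the same children on the same scale
  n <- length(x); k <- 2
  child_mean <- (x + y) / 2
  grand <- mean(c(x, y))
  msb <- k * sum((child_mean - grand)^2) / (n - 1)                     # between-child mean square
  msw <- sum((x - child_mean)^2 + (y - child_mean)^2) / (n * (k - 1))  # within-child mean square
  (msb - msw) / (msb + (k - 1) * msw)
}

rmse <- function(x, y) sqrt(mean((x - y)^2))

set.seed(2)
full  <- round(runif(300, 0, 395))   # simulated full-form scores
# Simulated short-form scores: proportional to the full score plus noise, clipped to 0..163
short <- pmin(163, pmax(0, round(full * 163 / 395 + rnorm(300, sd = 5))))

est_full <- short * 395 / 163        # extrapolate the short score to the full-form scale
icc_val  <- icc1(est_full, full)
rmse_val <- rmse(est_full, full)
```

ICC(1) penalizes systematic offsets as well as noise, so a form that correlates highly with the full CDI but consistently overestimates it will score lower on ICC(1) than on a plain correlation.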

Summary

For each of the DLL lists, swapping the 15 items with the largest discrepancy between English and Spanish easiness for items of minimal discrepancy within the same lexical class substantially reduced the total easiness SSE. The swaps also yielded mean item easiness values that are similar across the two languages and tend to lie closer to the mean of the full CDI in each language, which should generally help when extrapolating to children’s full CDI scores. Because these swaps are chosen algorithmically, we recommend that the DLL team review each of the swaps above to determine whether any key words have been removed or any undesirable words added.