Goals

The goals are 1) to create a word list that is informative about both English and Spanish vocabulary size and 2) to ensure that there are sufficient doublets to estimate lexical overlap. On an IRT view, we can’t perfectly assess goal 2 (at least not without better bilingual CDI data), but we can assess goal 1: we can look at whether the reduced word list is a good sub-test for the full CDI in each language. Our original concern was that the current DLL-ES test might not perform well for older or high-ability kids due to the lack of abstract words, and we can test this formally.

The DLL lists are meant to be used together with the original English and Spanish MCDI short forms.

## [1] "water (beverage)"
## [1] "rain"
##  [1] "moo"         "truck"       "bottle"      "glass"       "light"      
##  [6] "cloud"       "hose"        "sun"         "night night" "jump"       
## [11] "look"        "sit"         "big"         "then"        "dirty"      
## [16] "today"

Notes: the DLL lists chicken, whereas Wordbank has both chicken (food) and chicken (animal). Similarly, Wordbank includes both water (beverage) and water (not beverage). We will use all related Wordbank items in our analysis. Many words are also listed as plurals on the DLL but as singulars in Wordbank (e.g., ears/orejas, fingers/dedos).

English DLL items not in our wordbank IRT model: one, two, three, family, drum, good morning, also, and many. Spanish DLL items not in our wordbank IRT model: tostada, algunos, alimentar, sonreír, no hay más (although we have no and no hay, as well as más), and lastimado (but we have lastimar(se)).

DLL2-Spanish has “esta noche” translated as “at night”, but it should be “tonight”. The DLL also has “puede”, a conjugated form of the verb poder (whose infinitive is in Wordbank). Wordbank and the DLL also often disagree about number: e.g., escalera and pierna on the DLL correspond to escaleras and piernas in Wordbank. (See also brazos and manos.)

Does the DLL short form recover full form scores?

English DLL Level 1

Using data from 3717 English-speaking children 12-18 months of age from Wordbank, we test how well sumscores from the DLL-ES1 short English form + CDI:WG short form predict children’s production scores from the full CDI (WG/WS). The left panel shows full CDI scores vs. the DLL-ES1 short + CDI:WG short score, and the right panel shows the full CDI scores vs. just the CDI:WG short form score.

Overall, the correlation of children’s CDI:WG short + DLL scores and their full CDI production scores is quite high (\(r=0.98\)), but as shown above, for small vocabulary sizes the DLL score overestimates full CDI:WG production scores, while for higher full CDI:WG scores the DLL underestimates vocab size (dotted line has slope \(=163 / 395\)). However, the CDI:WG short form alone (right panel) shows a similar (and more extreme) overestimation for small vocabulary sizes.
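The comparison above can be sketched on simulated data (not Wordbank data). This is a minimal illustration, assuming a 0/1 child-by-item production matrix and assuming the dotted-line slope is the ratio of short-form to full-form item counts; the matrix `produces`, the index vector `short_items`, and all simulated parameters are hypothetical.

```r
# Simulated stand-in for Wordbank data: 200 children x 395 items (1 = produces).
set.seed(1)
n_children <- 200
n_items <- 395
ability <- rnorm(n_children)                 # hypothetical child abilities
easiness <- rnorm(n_items, mean = -1.5)      # hypothetical item easiness
p <- plogis(outer(ability, easiness, "+"))   # probability of producing each item
produces <- matrix(rbinom(length(p), 1, p), nrow = n_children)

# Hypothetical short list: the 163 easiest items, mimicking an easy-biased form.
short_items <- order(easiness, decreasing = TRUE)[1:163]

full_score <- rowSums(produces)
short_score <- rowSums(produces[, short_items])

# Extrapolate to the full form by inverting the assumed slope (163 / 395).
predicted_full <- short_score * n_items / length(short_items)

r <- cor(short_score, full_score)
bias <- mean(predicted_full - full_score)    # > 0 means overestimation
cat("r =", round(r, 2), " mean bias =", round(bias, 1), "\n")
```

Because the hypothetical short list is drawn from the easiest items, the extrapolated score sits above the full score on average even though the correlation is high, which is the pattern described above.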

English DLL Level 2

Using data from 6411 English-speaking children 16-30 months of age from Wordbank, we test how well sumscores from the DLL-ES2 short English form + CDI:WS short form (A) predict children’s full production scores from the CDI:WS. The left panel shows full CDI scores vs. the DLL-ES2 short + CDI:WS short form (A) score, and the right panel shows the full CDI scores vs. just the CDI:WS short form (A) score.

Overall, the correlation of children’s CDI:WS short + DLL2 scores and their full CDI production scores is quite high (\(r=0.98\)), but as shown above, the DLL2 again mostly overestimates production scores on the full CDI (dotted line has slope \(=166 / 680\)). In comparison, the CDI:WS short form (A) score only overestimates full CDI scores for smaller vocabulary sizes (<400).

Spanish DLL Level 1

Now we look at overestimation for Spanish DLLs + CDI short forms.

Using Wordbank data from 731 Spanish-speaking children aged 12-18 months, we test how well sumscores from the DLL-ES1 short Spanish form correlate with children’s full CDI:WG production scores.

As for English, the correlation of Spanish-speaking children’s DLL scores and their full CDI:WG production scores is quite high (\(r=0.99\)), but as shown above, their DLL score overestimates the production score on the full CDI at smaller vocabulary sizes (dotted line has slope \(=61 / 428\)). Do note that few children in this dataset have large productive vocabularies.

Spanish DLL Level 2

Recommendations

Overall, it seems that many of the items on the DLL are somewhat easier than average, and thus these forms tend to overestimate children’s full CDI scores (indeed, for items on the DLL1 English short form, the average easiness is -0.89, while the mean easiness of items not on the DLL is -1.85). This is also true of the CDI:WG short English form: the average easiness is -0.19 and the average ease of items not on the WG short form is -1.98. The CDI:WS short English form (A) is less biased towards easy items: average easiness is -1.29 vs. -1.82 for items not on the short WS. The histograms below show the distribution of easiness parameters for English (left) and Spanish (right) CDI words. Solid lines show the average ease of DLL items (DLL 1 = red, DLL 2 = orange), and dashed lines show the average of non-DLL items.

Spanish DLL1 items have an average ease of -0.96, while other items on the full CDI have a mean ease of -2.02. The Spanish DLL2 shows the least bias: items on it have an average ease of -1.75, while other CDI items have a mean of -1.94.

We recommend bringing the overall mean estimated IRT difficulty of the words selected for the DLLs closer to the mean difficulty of the words on the rest of the CDI.
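As a rough illustration of the quantity this recommendation targets, the sketch below compares the mean easiness of items on a list with the mean of the remaining items. The item names, easiness values, and the `on_list` flag are invented for illustration and are not taken from the fitted IRT models.

```r
# Hypothetical item table: IRT easiness (d) per CDI item, plus list membership.
items <- data.frame(
  word    = paste0("item", 1:6),
  d       = c(0.5, -0.4, -1.0, -1.5, -2.0, -2.5),
  on_list = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Mean easiness for items on the list vs. the remaining items.
mean_ease <- tapply(items$d, items$on_list, mean)

# A positive gap means the list is easier than the rest of the form.
gap <- unname(mean_ease["TRUE"] - mean_ease["FALSE"])
print(round(mean_ease, 2))
cat("gap =", round(gap, 2), "\n")
```

A list whose gap is near zero would be expected to extrapolate to the full form with less systematic over- or underestimation.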

To start, we examine IRT easiness parameters for the doublets on the existing DLL lists, looking for items with a large mismatch between their English and Spanish ease.

Do doublets have similar difficulties?

We want to assess whether doublet items have similar difficulty (operationalized by their IRT parameters) in English and in Spanish. For example, if “perro” were for some reason much more difficult than “dog”, then you wouldn’t want to include it, because it wouldn’t be a good item for estimating vocabulary overlap!
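To make the role of the easiness parameter concrete, here is a small sketch assuming the common slope-intercept form of the 2PL model, in which the probability of producing a word is logistic(a1 * theta + d). The report does not state the exact parameterization, so treat this as an assumption; `p_produce` and the ability values in `theta` are hypothetical. The item parameters are those reported for hat and sombrero in the DLL1 doublet table in this section.

```r
# Assumed slope-intercept 2PL: P(produce | theta) = logistic(a1 * theta + d).
p_produce <- function(theta, a1, d) plogis(a1 * theta + d)

theta <- c(-1, 0, 1)  # hypothetical child abilities

# Reported parameters: hat (English) vs. sombrero (Spanish).
p_hat      <- p_produce(theta, a1 = 3.15, d = 1.37)
p_sombrero <- p_produce(theta, a1 = 3.28, d = -2.14)

print(round(rbind(hat = p_hat, sombrero = p_sombrero), 2))
```

Under this assumption, a child of a given ability is far more likely to produce hat than sombrero. The comparison is only indicative if the English and Spanish models were fit separately, since their ability scales are then not guaranteed to be aligned.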

I’ve entered the English translation-equivalent item for the Spanish DLL1 items. The parameters for these items are shown below (en_d = English easiness, sp_d = Spanish easiness), ordered from most to least discrepant (by squared difference in easiness). The _a1 columns show item discriminations (slopes), and sp_en_d_diff is simply Spanish minus English easiness. Clearly some of these items have quite different difficulty levels, e.g. hat is much easier than sombrero (and somewhat easier than gorra). We may want to pick a criterion for the maximum allowable discrepancy, and try to find items that are more equivalent.

word translation sp_a1 sp_d en_a1 en_d sp_en_d_diff d_diff_sq
sombrero hat 3.28 -2.14 3.15 1.37 -3.51 12.33
otro/otra vez other 2.34 -0.84 3.09 -4.05 3.22 10.34
plato dish 4.57 0.17 2.99 -2.90 3.06 9.39
nana babysitter 0.97 -1.36 2.41 -4.31 2.94 8.67
acabar finish 2.51 -0.93 3.13 -3.68 2.75 7.55
mesa table 4.61 0.34 5.01 -2.22 2.56 6.55
afuera outside 3.53 -1.82 2.61 0.44 -2.26 5.10
gorra hat 3.07 -0.85 3.15 1.37 -2.22 4.92
pájaro bird 2.77 -0.23 2.51 1.78 -2.01 4.05
niña girl 2.35 0.45 2.79 -1.40 1.85 3.42
nariz nose 4.20 0.45 3.42 2.18 -1.73 2.99
taza cup 3.53 -0.84 3.39 0.83 -1.67 2.79
suave soft 3.32 -4.34 3.28 -2.74 -1.60 2.55
cantar sing 3.07 -0.79 4.05 -2.37 1.58 2.51
casa house 4.22 0.55 3.31 -1.03 1.58 2.49
cobija blanket 4.49 -1.29 3.26 0.29 -1.57 2.48
pato duck 2.50 0.16 2.27 1.66 -1.50 2.26
sofá couch 3.00 -3.68 3.78 -2.22 -1.46 2.13
dulce candy 3.13 0.63 2.66 -0.75 1.38 1.91
yo I 2.25 0.23 2.11 -1.11 1.34 1.79
ayudar help 4.23 -2.32 3.36 -1.07 -1.25 1.55
cabeza head 4.55 0.77 4.14 -0.43 1.19 1.43
jugo juice 2.88 0.35 2.42 1.51 -1.16 1.34
radio radio 3.89 -1.92 2.83 -3.03 1.11 1.24
cereal cereal 2.37 -1.67 3.13 -0.57 -1.10 1.21
empujar push 3.92 -3.13 3.33 -2.03 -1.09 1.20
tutú choo choo 1.11 -0.52 2.00 0.55 -1.07 1.14
casa home 4.22 0.55 3.34 -0.49 1.04 1.09
oscuro dark 3.48 -3.75 3.39 -2.76 -1.00 0.99
besar kiss 3.33 -1.15 3.42 -0.17 -0.98 0.97
baño bath 4.61 0.50 2.81 1.45 -0.96 0.92
me me 1.75 -1.31 2.01 -0.37 -0.95 0.89
piedra rock 4.04 -1.49 3.16 -0.55 -0.94 0.89
silla chair 4.92 0.91 4.39 0.11 0.81 0.65
patear kick 3.52 -2.69 3.35 -1.93 -0.76 0.57
no no 1.50 1.81 2.09 2.57 -0.76 0.57
saltar jump 3.20 -2.14 3.87 -1.40 -0.74 0.55
noche night 2.95 -2.26 2.59 -1.54 -0.73 0.53
hola hi 1.94 1.23 1.45 1.85 -0.62 0.38
esperar(se) wait 3.60 -3.34 3.02 -2.72 -0.61 0.38
planta plant 3.89 -2.22 3.12 -2.80 0.58 0.34
romper break 3.80 -2.11 3.98 -2.61 0.50 0.25
cuchara spoon 4.17 0.04 3.42 0.54 -0.50 0.25
rápido (descriptive) fast 3.64 -2.92 4.13 -3.37 0.45 0.20
brincar jump 3.90 -1.02 3.87 -1.40 0.39 0.15
lluvia rain 3.60 -1.35 4.29 -0.96 -0.38 0.15
galleta cookie 2.95 1.21 2.92 1.56 -0.36 0.13
luna moon 3.02 -0.19 2.43 0.09 -0.27 0.08
lejos away 2.88 -2.79 2.98 -3.05 0.26 0.07
león lion 3.09 -1.28 3.32 -1.12 -0.16 0.03
lámpara lamp 4.22 -3.39 2.94 -3.28 -0.11 0.01
ratón mouse 3.33 -1.10 3.28 -1.13 0.03 0.00
carreola stroller 2.41 -1.85 2.61 -1.84 -0.01 0.00
muñeca doll 2.59 -0.30 2.11 -0.30 0.00 0.00
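The discrepancy columns in the table above can be recomputed from the two easiness columns. The sketch below does this for four rows copied from the table and applies a cutoff on the squared difference; the cutoff `max_sq_diff` is a hypothetical value, not one chosen in this report. Small differences from the table (e.g. 12.32 vs. 12.33) arise because the table was computed from unrounded parameters.

```r
# Four doublets copied from the DLL1 table (easiness rounded to 2 decimals).
doublets <- data.frame(
  word        = c("luna", "sombrero", "silla", "plato"),
  translation = c("moon", "hat", "chair", "dish"),
  sp_d        = c(-0.19, -2.14, 0.91, 0.17),
  en_d        = c(0.09, 1.37, 0.11, -2.90),
  stringsAsFactors = FALSE
)

# Spanish minus English easiness, and its square.
doublets$sp_en_d_diff <- doublets$sp_d - doublets$en_d
doublets$d_diff_sq <- doublets$sp_en_d_diff^2

# Order from most to least discrepant, as in the table.
doublets <- doublets[order(doublets$d_diff_sq, decreasing = TRUE), ]

# Hypothetical criterion for the maximum allowable squared discrepancy.
max_sq_diff <- 1
keep <- doublets$translation[doublets$d_diff_sq <= max_sq_diff]
print(doublets)
print(keep)
```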

DLL Level 2

word translation sp_a1 sp_d en_a1 en_d sp_en_d_diff d_diff_sq
sombrero hat 3.28 -2.14 3.15 1.37 -3.51 12.33
banco (outside) bench 2.94 -2.18 3.59 -5.43 3.25 10.57
mucho much 3.33 -1.80 3.10 -5.04 3.24 10.48
acabar finish 2.51 -0.93 3.13 -3.68 2.75 7.55
plato plate 4.57 0.17 4.19 -2.16 2.33 5.42
gorra hat 3.07 -0.85 3.15 1.37 -2.22 4.92
barco boat 3.41 -2.01 2.90 0.19 -2.20 4.84
pájaro bird 2.77 -0.23 2.51 1.78 -2.01 4.05
caber fit 3.76 -3.43 4.21 -5.30 1.88 3.52
tirar dump 3.82 -2.11 2.81 -3.89 1.78 3.15
persona person 3.19 -3.76 3.08 -5.51 1.75 3.05
abajo down 3.46 -1.17 2.22 0.46 -1.63 2.66
feliz happy 3.08 -3.12 2.86 -1.53 -1.59 2.52
columpio swing (object) 3.80 -1.96 3.21 -0.49 -1.47 2.16
dulce candy 3.13 0.63 2.66 -0.75 1.38 1.91
nosotros us 3.51 -4.73 3.31 -6.09 1.36 1.86
amiga friend 2.84 -1.99 3.50 -3.35 1.36 1.85
después after 2.93 -4.03 3.88 -5.37 1.34 1.78
día day 2.70 -2.47 3.09 -3.76 1.29 1.66
basura trash 4.54 -0.03 2.20 -1.28 1.25 1.56
escuchar listen 2.97 -2.99 3.70 -4.23 1.24 1.54
una a 2.32 -1.10 2.25 -2.33 1.23 1.52
peine comb 3.94 -0.68 2.92 -1.90 1.23 1.50
perro dog 1.83 1.28 1.93 2.49 -1.21 1.47
bee/mee baa baa 0.83 -0.30 1.17 0.90 -1.20 1.45
jugo juice 2.88 0.35 2.42 1.51 -1.16 1.34
gustar like 4.32 -2.70 4.56 -3.76 1.06 1.12
estrella star 3.85 -1.78 2.82 -0.76 -1.01 1.03
oso bear 2.96 -0.09 2.48 0.81 -0.91 0.83
ellos them 3.42 -4.51 3.26 -5.40 0.89 0.80
hola hello 1.94 1.23 1.41 0.35 0.87 0.76
horno oven 3.65 -4.32 3.69 -3.51 -0.81 0.66
noche night 2.95 -2.26 2.59 -1.54 -0.73 0.53
miau meow 1.54 0.91 2.01 1.61 -0.70 0.49
collar necklace 3.37 -2.99 3.02 -2.29 -0.70 0.49
terminar finish 3.35 -3.05 3.13 -3.68 0.63 0.40
hola hi 1.94 1.23 1.45 1.85 -0.62 0.38
frío cold 3.63 -0.72 3.45 -0.11 -0.60 0.37
un a 2.42 -1.81 2.25 -2.33 0.52 0.27
toalla towel 4.89 -2.02 4.12 -1.51 -0.52 0.27
escuela school 3.56 -0.88 2.87 -1.39 0.51 0.27
rápido (descriptive) fast 3.64 -2.92 4.13 -3.37 0.45 0.20
lluvia rain 3.60 -1.35 4.29 -0.96 -0.38 0.15
caballo horse 2.99 0.65 3.34 0.32 0.34 0.11
último last 4.55 -6.41 4.06 -6.69 0.28 0.08
escaleras stairs 4.20 -1.84 3.28 -2.12 0.28 0.08
llevar(se) carry 4.34 -3.36 4.09 -3.10 -0.26 0.07
avión airplane 2.81 0.24 2.97 0.48 -0.24 0.06
pensar think 3.96 -5.63 3.87 -5.86 0.22 0.05
cielo sky 3.87 -1.37 2.88 -1.59 0.22 0.05
bandera flag 3.48 -2.62 2.71 -2.42 -0.21 0.04
gracias thank you 2.27 1.35 2.04 1.17 0.18 0.03
recámara bedroom 3.72 -3.19 4.14 -3.03 -0.17 0.03
calcetín sock 3.26 0.16 2.01 0.23 -0.07 0.01
piernas leg 4.49 -1.64 4.86 -1.69 0.05 0.00

English Level 1 DLL

new_dll1ENshort <- improve_DLL_list(dll1ENshort, dict, Nswaps=Nswaps, language="english")$new_list
## [1] "Original list item easiness SSE: 150.24"
## [1] "Mean Spanish item easiness: -0.8"
## [1] "Mean English item easiness: -1.34"
## [1] "Selecting from 291 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'water (beverage)' with 'doll' (SSE improvement = 11.46)"
## [1] "Replacing 'none' with 'these' (SSE improvement = 12.63)"
## [1] "Replacing 'on' with 'all' (SSE improvement = 15.45)"
## [1] "Replacing 'her' with 'that' (SSE improvement = 11.25)"
## [1] "Replacing 'street' with 'stroller' (SSE improvement = 17.35)"
## [1] "Replacing 'meat' with 'mouse' (SSE improvement = 4.52)"
## [1] "Replacing 'glass' with 'toy (object)' (SSE improvement = 5.32)"
## [1] "Replacing 'truck' with 'sock' (SSE improvement = 5.75)"
## [1] "Replacing 'bathroom' with 'stove' (SSE improvement = 5.82)"
## [1] "Replacing 'there' with 'same' (SSE improvement = 2.75)"
## [1] "Replacing 'chicken (food)' with 'leg' (SSE improvement = 2.99)"
## [1] "Replacing 'his' with 'why' (SSE improvement = 5.98)"
## [1] "Replacing 'sun' with 'diaper' (SSE improvement = 1.69)"
## [1] "Replacing 'sleep' with 'blow' (SSE improvement = 1.71)"
## [1] "Replacing 'hot' with 'full' (SSE improvement = 1.64)"
## [1] "New list item easiness SSE: 43.93"
## [1] "Mean Spanish item easiness: -1.09"
## [1] "Mean English item easiness: -1.33"
write.csv(new_dll1ENshort, file="DLL/new_DLL-ES1-short-English.csv")

Swapping only 15 items reduced the total SSE by about 70% (from 150.24 to 43.93). Although it was not optimized for, the mean Spanish item easiness also decreased (from -0.80 to -1.09), bringing it closer to the mean English item easiness, which was essentially unchanged (-1.34 vs. -1.33). We now do the same for the other DLL forms before determining whether the DLL overestimation has decreased.
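The source of improve_DLL_list() is not shown in this report, so the following is only a guess at the kind of procedure the log output suggests: a greedy loop that repeatedly replaces the most discrepant list item with the least discrepant unused candidate from the same lexical class. The function `greedy_swap_sketch`, the column names, and the `class` labels are all hypothetical; the easiness values are rounded figures from the doublet tables above.

```r
# Hypothetical greedy swap; NOT the authors' improve_DLL_list().
greedy_swap_sketch <- function(current, pool, n_swaps) {
  sq <- function(df) (df$sp_d - df$en_d)^2   # squared easiness discrepancy
  for (i in seq_len(n_swaps)) {
    worst <- which.max(sq(current))           # most discrepant list item
    same_class <- which(pool$class == current$class[worst])
    if (length(same_class) == 0) break        # sketch: stop if no candidate
    best <- same_class[which.min(sq(pool[same_class, ]))]
    gain <- sq(current)[worst] - sq(pool)[best]
    if (gain <= 0) break                      # no further improvement
    message(sprintf("Replacing '%s' with '%s' (SSE improvement = %.2f)",
                    current$word[worst], pool$word[best], gain))
    current[worst, ] <- pool[best, ]
    pool <- pool[-best, ]                     # each candidate is used once
  }
  current
}

current <- data.frame(word = c("hat", "dish", "moon"),
                      sp_d = c(-2.14, 0.17, -0.19),
                      en_d = c(1.37, -2.90, 0.09),
                      class = "nouns", stringsAsFactors = FALSE)
pool <- data.frame(word = c("doll", "mouse"),
                   sp_d = c(-0.30, -1.10),
                   en_d = c(-0.30, -1.13),
                   class = "nouns", stringsAsFactors = FALSE)

new_list <- greedy_swap_sketch(current, pool, n_swaps = 2)
sse_before <- sum((current$sp_d - current$en_d)^2)
sse_after <- sum((new_list$sp_d - new_list$en_d)^2)
cat("SSE before:", round(sse_before, 2), " after:", round(sse_after, 2), "\n")
```

A greedy rule like this minimizes discrepancy only; it does not control the mean easiness of the list, which is why any change in mean easiness is a side effect rather than a target.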

English Level 2 DLL

new_dll2ENshort <- improve_DLL_list(dll2ENshort, dict, Nswaps=Nswaps, language="english")$new_list
## [1] "Original list item easiness SSE: 106.84"
## [1] "Mean Spanish item easiness: -1.02"
## [1] "Mean English item easiness: -1.14"
## [1] "Selecting from 319 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'street' with 'doll' (SSE improvement = 17.35)"
## [1] "Replacing 'glass' with 'stroller' (SSE improvement = 5.33)"
## [1] "Replacing 'truck' with 'sink' (SSE improvement = 5.76)"
## [1] "Replacing 'bathroom' with 'mouse' (SSE improvement = 5.82)"
## [1] "Replacing 'his' with 'where (question)' (SSE improvement = 6.1)"
## [1] "Replacing 'good night' with 'wanna' (SSE improvement = 6.64)"
## [1] "Replacing 'today' with 'shh' (SSE improvement = 7.38)"
## [1] "Replacing 'boy' with 'yum yum' (SSE improvement = 2.05)"
## [1] "Replacing 'fall' with 'blow' (SSE improvement = 2.08)"
## [1] "Replacing 'broken' with 'full' (SSE improvement = 2.01)"
## [1] "Replacing 'out' with 'these' (SSE improvement = 2.1)"
## [1] "Replacing 'belly button' with 'toy (object)' (SSE improvement = 2.11)"
## [1] "Replacing 'big' with 'sick' (SSE improvement = 2.3)"
## [1] "Replacing 'soap' with 'stove' (SSE improvement = 2.35)"
## [1] "Replacing 'moo' with 'baby' (SSE improvement = 2.82)"
## [1] "New list item easiness SSE: 34.65"
## [1] "Mean Spanish item easiness: -1.14"
## [1] "Mean English item easiness: -1.15"
write.csv(new_dll2ENshort, file="DLL/new_DLL-ES2-short-English.csv")

Spanish Level 1 DLL

new_dll1SPshort <- improve_DLL_list(dll1SPshort %>% select(-translation), dict, Nswaps=Nswaps, language="spanish")$new_list
## [1] "Original list item easiness SSE: 114.15"
## [1] "Mean Spanish item easiness: -0.78"
## [1] "Mean English item easiness: -0.74"
## [1] "Selecting from 328 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'finish' with 'blow' (SSE improvement = 7.55)"
## [1] "Replacing 'babysitter' with 'cockadoodledoo' (SSE improvement = 8.66)"
## [1] "Replacing 'water (beverage)' with 'sink' (SSE improvement = 11.46)"
## [1] "Replacing 'good night' with 'wanna' (SSE improvement = 6.64)"
## [1] "Replacing 'hat' with 'toy (object)' (SSE improvement = 12.33)"
## [1] "Replacing 'bird' with 'stove' (SSE improvement = 4.05)"
## [1] "Replacing 'plate' with 'train' (SSE improvement = 5.42)"
## [1] "Replacing 'bathroom' with 'diaper' (SSE improvement = 5.81)"
## [1] "Replacing 'girl' with 'shh' (SSE improvement = 3.39)"
## [1] "Replacing 'table' with 'hair' (SSE improvement = 6.54)"
## [1] "Replacing 'out' with 'where (question)' (SSE improvement = 2.1)"
## [1] "Replacing 'couch' with 'watch (object)' (SSE improvement = 2.12)"
## [1] "Replacing 'duck' with 'grapes' (SSE improvement = 2.24)"
## [1] "Replacing 'blanket' with 'sweater' (SSE improvement = 2.47)"
## [1] "Replacing 'candy' with 'bedroom' (SSE improvement = 1.88)"
## [1] "New list item easiness SSE: 31.49"
## [1] "Mean Spanish item easiness: -1"
## [1] "Mean English item easiness: -0.83"
write.csv(new_dll1SPshort, file="DLL/new_DLL-ES1-short-Spanish.csv")

Note that the mean ease of the English and Spanish items on this list was originally similar (-0.74 vs. -0.78). After the substitutions, both means are lower (-0.83 and -1.00) and thus closer to the mean ease of the remaining CDI items, although the gap between the two languages is slightly larger.

Spanish Level 2 DLL

new_dll2SPshort <- improve_DLL_list(dll2SPshort %>% select(-translation), dict, Nswaps=Nswaps, language="spanish")$new_list
## [1] "Original list item easiness SSE: 73.77"
## [1] "Mean Spanish item easiness: -1.27"
## [1] "Mean English item easiness: -1.28"
## [1] "Selecting from 330 words on both Eng/Sp CDIs that are not on the DLL."
## [1] "Replacing 'finish' with 'blow' (SSE improvement = 7.55)"
## [1] "Replacing 'hat' with 'doll' (SSE improvement = 12.33)"
## [1] "Replacing 'bird' with 'stroller' (SSE improvement = 4.05)"
## [1] "Replacing 'boat' with 'sink' (SSE improvement = 4.84)"
## [1] "Replacing 'plate' with 'mouse' (SSE improvement = 5.42)"
## [1] "Replacing 'a lot' with 'where (question)' (SSE improvement = 6.85)"
## [1] "Replacing 'swing (object)' with 'toy (object)' (SSE improvement = 2.15)"
## [1] "Replacing 'candy' with 'stove' (SSE improvement = 1.9)"
## [1] "Replacing 'down' with 'these' (SSE improvement = 2.66)"
## [1] "Replacing 'person' with 'cockadoodledoo' (SSE improvement = 3.04)"
## [1] "Replacing 'after' with 'shh' (SSE improvement = 1.76)"
## [1] "Replacing 'they' with 'all' (SSE improvement = 3.67)"
## [1] "Replacing 'juice' with 'train' (SSE improvement = 1.33)"
## [1] "Replacing 'baa baa' with 'yum yum' (SSE improvement = 1.44)"
## [1] "Replacing 'dog' with 'diaper' (SSE improvement = 1.46)"
## [1] "New list item easiness SSE: 13.32"
## [1] "Mean Spanish item easiness: -1.2"
## [1] "Mean English item easiness: -1.22"
write.csv(new_dll2SPshort, file="DLL/new_DLL-ES2-short-Spanish.csv")

As with the previous lists, after the substitutions the mean ease of the English items remains close to that of the Spanish items (-1.22 vs. -1.20).

Overestimation in new DLL lists

The new DLL1 English short form (left) certainly looks like it’s overestimating less (compare to just the CDI:WG short form). We quantify the reliability (ICC1) and the overall root mean squared error (RMSE) of the new DLL form + CDI short form, just the CDI short form, and of the old DLL form.

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00608386 (tol = 0.002, component 1)
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

The table below shows that the reliability of the original (old) DLLs and the (new) DLLs with swaps is the same for extrapolating to children’s full CDI scores (column: DLL.vs..full.ICC1), and higher than the reliability of extrapolating full CDI scores from just the appropriate short CDI form (column: CDI.short.vs..full.ICC1). The RMSE of the new DLL forms improved over the old forms for the DLL2 EN short form (42.22 vs. 43.45) and the DLL1 SP short form (11.81 vs. 12.22), but was marginally worse for the DLL1 EN short form (11.42 vs. 10.93) and the DLL2 SP short form (49.65 vs. 48.27). For all DLL forms (new and old), the RMSE is better than that achieved by extrapolating from the CDI short form (column: CDI.short.vs..full.RMSE).

DLL DLL.vs..full.ICC1 CDI.short.vs..full.ICC1 DLL.vs..full.RMSE CDI.short.vs..full.RMSE
Old DLL1 EN short 0.97 0.92 10.93 19.97
New DLL1 EN short 0.97 0.92 11.42 19.97
Old DLL2 EN short 0.97 0.96 43.45 46.30
New DLL2 EN short 0.97 0.96 42.22 46.30
Old DLL1 SP short 0.98 0.97 12.22 18.67
New DLL1 SP short 0.98 0.97 11.81 18.67
Old DLL2 SP short 0.96 0.94 48.27 62.35
New DLL2 SP short 0.96 0.94 49.65 62.35
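The report does not show how the ICC1 and RMSE columns were computed. As one plausible reading, the sketch below uses the standard one-way ANOVA formula for ICC(1), treating the extrapolated score and the full score as two ratings of each child, together with a plain RMSE. The helper names `icc1` and `rmse` and the example scores are hypothetical.

```r
# One-way random-effects ICC(1) for two "ratings" (x, y) of the same children.
icc1 <- function(x, y) {
  scores <- cbind(x, y)
  n <- nrow(scores)
  k <- ncol(scores)
  child_means <- rowMeans(scores)
  # Between-child and within-child mean squares.
  msb <- k * sum((child_means - mean(scores))^2) / (n - 1)
  msw <- sum((scores - child_means)^2) / (n * (k - 1))
  (msb - msw) / (msb + (k - 1) * msw)
}

# Root mean squared error between predicted and observed scores.
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

# Hypothetical full CDI scores and scores extrapolated from a short form.
full_cdi  <- c(10, 50, 120, 300)
predicted <- c(20, 60, 125, 290)

cat("ICC1 =", round(icc1(predicted, full_cdi), 3),
    " RMSE =", round(rmse(predicted, full_cdi), 2), "\n")
```

The two measures answer different questions: ICC(1) reflects agreement relative to the spread between children, while RMSE is on the scale of words, so a form can keep the same ICC1 while its RMSE shifts slightly, as in the table above.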

Summary

For each of the DLL lists, we swapped the 15 items with the largest discrepancy between English and Spanish easiness for items of minimal discrepancy within the same lexical class. This substantially reduced the total easiness SSE and generally yielded mean item easiness values that are more similar across the two languages and closer to the means of each language, and thus generally better for extrapolating to children’s full CDI score. Because these swaps are chosen algorithmically, we recommend that the DLL team review each of the swaps above and determine whether any key words have been removed or any undesirable words have been added.