Autistic vs Non-autistic VSOA

Chris Cox

Setup

Required Packages

library(dplyr)
library(purrr)
library(tidyr)
library(knitr)
library(kableExtra)
library(ggplot2)

Load metadata

new_guids <- readRDS("data/autistic-guis-added-20250507.rds")
d_meta <- readRDS("data/cdi-metadata.rds")
d_cdi <- readRDS("data/asd_na-osg-2025-05-20.rds") |>
    mutate(
        dataset = if_else(subjectkey %in% new_guids, 2, 1),
        dataset = factor(dataset, 1:2, c("old", "new"))
    ) |>
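    # Attach the word label for each CDI item; the join key is stated
    # explicitly rather than inferred from shared column names.
    left_join(
        d_meta |> select(num_item_id, word),
        by = "num_item_id"
    )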

Load VSOA

d_vsoa <- readRDS("data/vsoa-autistic-nonautistic-ndar-id-fix-remodel-v2.rds")
d_vsoa_diff <- readRDS("data/vsoa-autistic-nonautistic-diff-ndar-id-fix-remodel-v2.rds")
d_vsoa_diff_old <- readRDS("data/vsoa-autistic-nonautistic-diff-old.rds")
    #rename(vsoa_NA = na, vsoa_ASD = asd, vsoa_diff = diff,
    #       ci_l_diff = ci_l, ci_u_diff = ci_u)

Trim VSOA

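# Clamp VSOA estimates below 1 up to 1, set estimates above 680 to missing,
# recompute the group difference from the trimmed values, and drop
# incomplete rows.
trim_vsoa <- function(.data) {
    .data |>
        mutate(
            vsoa_NA = if_else(vsoa_NA <   1,  1, vsoa_NA),
            vsoa_NA = if_else(vsoa_NA > 680, NA_real_, vsoa_NA),
            vsoa_ASD = if_else(vsoa_ASD <   1,  1, vsoa_ASD),
            vsoa_ASD = if_else(vsoa_ASD > 680, NA_real_, vsoa_ASD),
            vsoa_diff = vsoa_ASD - vsoa_NA
        ) |>
        # Note: drop_na() with no arguments drops rows with NA in ANY column.
        drop_na()
}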
d_vsoa_diff_trimmed <- trim_vsoa(d_vsoa_diff)
d_vsoa_diff_old_trimmed <- trim_vsoa(d_vsoa_diff_old)
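
As a rough illustration of what the trimming step does, here is a minimal sketch on made-up values. The tibble `toy` and its numbers are hypothetical and are not taken from the fitted models; only the column names and the thresholds (1 and 680) follow the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical VSOA estimates (NOT real model output).
toy <- tibble(
    word     = c("a", "b", "c", "d"),
    vsoa_NA  = c(-40, 120, 700, 300),
    vsoa_ASD = c( 15, 200, 450, 690)
)

toy_trimmed <- toy |>
    mutate(
        # Raise estimates below 1 to the floor of 1 ...
        across(c(vsoa_NA, vsoa_ASD), \(x) if_else(x < 1, 1, x)),
        # ... and mark estimates above 680 as missing.
        across(c(vsoa_NA, vsoa_ASD), \(x) if_else(x > 680, NA_real_, x)),
        # Recompute the difference from the trimmed values.
        vsoa_diff = vsoa_ASD - vsoa_NA
    ) |>
    drop_na()

toy_trimmed
```

Word "a" is kept with its non-autistic estimate raised to 1, while "c" and "d" are dropped because one of their estimates exceeds 680.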

Prepare participant characteristics

d_ppt <- d_cdi |>
    select(subjectkey, nproduced, sex, group, interview_age) |>
    distinct()

Load and merge AoA

d_aoa <- readRDS("data/CDI_English (American)_WS_14May2025_AoA.rds")
d_vsoa_aoa <- left_join(
    d_vsoa |> filter(group != "ASD-NA"),
    d_aoa |> select(num_item_id, aoa = aoa_glm), by = "num_item_id"
) |>
    mutate(
        group = factor(group, c("ASD", "NA"), c("autistic", "not aut."))
    ) |>
    relocate(aoa, .before = vsoa)

Trim AoA (and paired VSOA)

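# Clamp low values to the floor (VSOA: 1, AoA: 12), set values above the
# ceiling (VSOA: 680, AoA: 30) to missing, and drop incomplete rows.
d_vsoa_aoa_trimmed <- d_vsoa_aoa |>
    mutate(
        vsoa = if_else(vsoa <   1,  1, vsoa),
        vsoa = if_else(vsoa > 680, NA_real_, vsoa),
        aoa  = if_else( aoa <  12, 12,  aoa),
        aoa  = if_else( aoa >  30, NA_real_,  aoa)
    ) |>
    drop_na()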

Participant Characteristics

Vocabulary size and interview age

          Productive Vocabulary        Interview Age (months)
group     mean     median  SD          mean    median  SD
autistic  197.480  141     195.710     47.727  39      24.719
not aut.  200.919  139     178.371     20.266  18      4.971
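
The code behind this table is not shown in the rendered output. A summary of this shape could be produced along the following lines. This is a sketch on hypothetical participants: `toy_ppt` and its values are invented, and only the column names are borrowed from `d_ppt`.

```r
library(dplyr)

# Hypothetical participants; only the column names are taken from d_ppt.
toy_ppt <- tibble(
    subjectkey    = paste0("S", 1:5),
    group         = c("ASD", "ASD", "ASD", "NA", "NA"),
    nproduced     = c(10, 20, 30, 100, 200),
    interview_age = c(30, 40, 50, 18, 20)
)

# Mean, median, and SD of each measure within each group.
ppt_summary <- toy_ppt |>
    group_by(group) |>
    summarize(
        across(
            c(nproduced, interview_age),
            list(mean = mean, median = median, SD = sd),
            .names = "{.col}_{.fn}"
        )
    )

ppt_summary
```

Here the group label "NA" is an ordinary string ("non-autistic"), not a missing value.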

Gender and Group

group male female total ratio (M:F)
autistic 573 148 721 3.872
not aut. 1695 441 2136 3.844
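
The ratio column is the number of males divided by the number of females within each group (for example, 573 / 148 ≈ 3.872). A minimal sketch of that tabulation follows, on hypothetical data. The coding of `sex` as "male"/"female" is an assumption, and the real coding in `d_ppt` may differ.

```r
library(dplyr)
library(tidyr)

# Hypothetical participants; the "male"/"female" coding is an assumption.
toy_ppt <- tibble(
    subjectkey = paste0("S", 1:8),
    group = c("ASD", "ASD", "ASD", "ASD", "NA", "NA", "NA", "NA"),
    sex   = c("male", "male", "male", "female",
              "male", "male", "female", "female")
)

sex_by_group <- toy_ppt |>
    count(group, sex) |>
    # One row per group, one column per sex.
    pivot_wider(names_from = sex, values_from = n, values_fill = 0) |>
    mutate(total = male + female, ratio = male / female)

sex_by_group
```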

Correlation Analysis

Helper functions

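# Tabulate Pearson, Spearman, and Kendall correlations between .x and .y,
# computed separately within the grouping variables passed through `...`.
vsoa_corr_tbl <- function(.data, .x, .y, ..., digits = 3) {
    cor_methods <- c("pearson", "spearman", "kendall")
    cor_methods |>
        set_names() |>
        map(function(method) {
            .data |>
                ungroup() |>
                group_by(...) |>
                summarize(
                    r = cor({{ .x }}, {{ .y }}, method = method,
                            use = "complete.obs"),
                    .groups = "drop"
                )
        }) |>
        # Stack the per-method results, then spread methods into columns.
        list_rbind(names_to = "method") |>
        pivot_wider(names_from = method, values_from = r) |>
        kable(digits = digits)
}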

vsoa_corr_fig <- function(.data, .x, .y, ...) {
    p <- ggplot(.data, aes(x = {{ .x }}, y = {{ .y }})) +
        geom_point() +
        geom_smooth(method = lm) +
        theme_bw(base_size = 18) +
        facet_wrap(vars(...))
    
    return(p)
}
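
To make the table helper's core computation concrete, here is a minimal standalone sketch of the same pattern (one correlation per method, per group, spread into columns) on toy data. The tibble `toy` and its values are hypothetical. Group `g1` is perfectly monotonic but nonlinear, so its rank correlations equal 1 while Pearson falls below 1.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical data: g1 is monotonic but nonlinear; g2 has two swapped pairs.
toy <- tibble(
    group = rep(c("g1", "g2"), each = 5),
    x = rep(1:5, times = 2),
    y = c((1:5)^3, 2, 1, 4, 3, 5)
)

cor_methods <- c("pearson", "spearman", "kendall")

toy_cor <- cor_methods |>
    set_names() |>
    map(function(m) {
        toy |>
            group_by(group) |>
            summarize(r = cor(x, y, method = m), .groups = "drop")
    }) |>
    list_rbind(names_to = "method") |>
    pivot_wider(names_from = method, values_from = r)

toy_cor
```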

VSOA and AoA Tbl. (raw)

Number of words contributing to correlations

group n
autistic 598
not aut. 598

Correlations

group pearson spearman kendall
autistic 0.891 0.910 0.774
not aut. 0.967 0.972 0.890

VSOA and AoA Fig. (raw)

VSOA and AoA Tbl. (trimmed)

Number of words contributing to correlations

group n
autistic 598
not aut. 598

Correlations

group pearson spearman kendall
autistic 0.891 0.910 0.774
not aut. 0.966 0.972 0.890

VSOA and AoA Fig. (trimmed)

Autistic and non-autistic Tbl. (raw)

Number of words contributing to correlations

group n
autistic 598
not aut. 598

Correlations

pearson spearman kendall
0.9 0.917 0.759

Autistic and non-autistic Fig. (raw)

Autistic and non-autistic Tbl. (trimmed)

n
1196

pearson spearman kendall
0.916 0.912 0.749

Autistic and non-autistic Fig. (trimmed)

Inspecting VSOA estimates

How many words have extreme VSOA?

  • Low: \(\textrm{VSOA} \le 1\)
  • High: \(\textrm{VSOA} \ge \max \textrm{vocab size}\)

# Largest observed productive vocabulary; VSOAs at or above it count as "high".
max_vocab <- max(d_ppt$nproduced)

d_vsoa |>
    filter(group != "ASD-NA") |>
    mutate(
        # Rows matching neither condition get NA and are dropped below.
        extreme = case_when(
            vsoa <= 1 ~ "low",
            vsoa >= max_vocab ~ "high"
        ),
        extreme = factor(extreme, c("low", "high"))
    ) |>
    drop_na(extreme) |>
    count(group, extreme) |>
    pivot_wider(id_cols = group, names_from = extreme,
                values_from = n, values_fill = 0) |>
    kable()

How many words have extreme VSOA?

group low high
NA 12 14
ASD 0 1

Least-produced words (for autistic only; counts)

          Autistic    Not Autistic
word      New   Old   New   Old
about     43    31    64    64
downtown  40    30    92    92
myself    49    27    139   139
their     37    25    64    64
tights    38    31    85    85
vagina*   41    26    170   170
walker    29    19    98    98

Least-produced words (for autistic only; proportions)

          Autistic    Not Autistic
word      New   Old   New   Old
about     0.09  0.13  0.04  0.04
downtown  0.09  0.13  0.05  0.05
myself    0.11  0.11  0.08  0.08
their     0.08  0.11  0.04  0.04
tights    0.08  0.13  0.05  0.05
vagina*   0.09  0.11  0.10  0.10
walker    0.06  0.08  0.06  0.06

Least-produced words (for non-autistic only; counts)

          Autistic    Not Autistic
word      New   Old   New   Old
beside    53    33    61    61
country   39    32    30    30
hate      40    34    61    61
if        59    36    27    27
were      40    32    48    48
which     48    36    44    44

Least-produced words (for non-autistic only; proportions)

          Autistic    Not Autistic
word      New   Old   New   Old
beside    0.11  0.14  0.04  0.04
country   0.08  0.14  0.02  0.02
hate      0.09  0.14  0.04  0.04
if        0.13  0.15  0.02  0.02
were      0.09  0.14  0.03  0.03
which     0.10  0.15  0.03  0.03

Words with largest VSOA differences

Autistic children learn earlier

num_item_id word vsoa_diff vsoa_NA vsoa_ASD
643 each -255.1 870.4 615.3
83 play dough -221.7 441.8 220.1
43 penguin -203.9 443.0 239.1
563 white -201.9 426.9 225.0
678 if -198.3 777.2 578.9
293 washing machine -194.5 469.6 275.1
513 brown -194.4 431.9 237.5

Words with largest VSOA differences

Autistic children learn later

num_item_id word vsoa_diff vsoa_NA vsoa_ASD
366 mommy* 757.1 -740.3 16.8
356 daddy* 679.6 -670.1 9.5
9 uh oh 346.2 -258.7 87.5
10 vroom 238.2 -31.7 206.5
12 yum yum 235.6 -58.6 177.0
4 grrr 207.5 -45.0 162.5
158 bib 207.2 258.7 465.9
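
Because `vsoa_diff = vsoa_ASD - vsoa_NA`, negative differences mean the autistic group's VSOA is smaller (learned earlier) and positive differences mean it is larger (learned later). The code that built these tables is not shown. One way such "top n" lists could be extracted is sketched below on hypothetical values, where `toy_diff` and its numbers are invented.

```r
library(dplyr)

# Hypothetical VSOA estimates (NOT real model output).
toy_diff <- tibble(
    word     = c("w1", "w2", "w3", "w4", "w5"),
    vsoa_NA  = c(400,  50, 300,  10, 200),
    vsoa_ASD = c(200, 250, 310, 150, 190)
) |>
    mutate(vsoa_diff = vsoa_ASD - vsoa_NA)

# Most negative differences: autistic children learn earlier.
earlier <- toy_diff |> slice_min(vsoa_diff, n = 2)

# Most positive differences: autistic children learn later.
later <- toy_diff |> slice_max(vsoa_diff, n = 2)

earlier
later
```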

Words with largest VSOA differences (trimmed)

Autistic children learn earlier

num_item_id word vsoa_diff vsoa_NA vsoa_ASD
83 play dough -222 442 220
43 penguin -204 443 239
563 white -202 427 225
293 washing machine -195 470 275
513 brown -194 432 237
510 black -192 391 198
374 teacher -185 492 307

Words with largest VSOA differences (trimmed)

Autistic children learn later

num_item_id word vsoa_diff vsoa_NA vsoa_ASD
158 bib 207 259 466
10 vroom 205 1 206
12 yum yum 176 1 177
4 grrr 161 1 162
203 owie/boo boo 159 108 267
213 bottle 156 57 213
36 kitty 151 53 204

Significantly different VSOAs by group

By lexical class (counts)

New VSOAs

Produced earlier by
lexical_class autistic not aut. ratio
total 164 91 1.80
nouns 76 43 1.77
verbs 32 5 6.40
other 22 28 0.79
adjectives 19 7 2.71
function_words 15 8 1.88

Old VSOAs

Produced earlier by
lexical_class autistic not aut. ratio
total 80 66 1.21
nouns 34 26 1.31
verbs 26 14 1.86
other 16 19 0.84
adjectives 4 2 2.00
function_words 0 5 0.00
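
In these tables, ratio is the autistic count divided by the non-autistic count (for example, 164 / 91 ≈ 1.80). The rendered output does not show how "significantly different" was determined. The sketch below assumes, purely for illustration, that a word counts as significant when the confidence interval on `vsoa_diff` excludes zero. The column names `ci_l_diff` and `ci_u_diff` are borrowed from the commented-out `rename()` in the loading step, and all values are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical differences and confidence bounds (NOT real model output).
toy <- tibble(
    lexical_class = c("nouns", "nouns", "nouns", "verbs", "verbs", "verbs"),
    vsoa_diff     = c(-50,  40,   5, -30, -20,  60),
    ci_l_diff     = c(-80,  10, -10, -60, -45,  20),
    ci_u_diff     = c(-20,  70,  20,  -5,  -2, 100)
)

earlier_by <- toy |>
    # ASSUMPTION: "significant" means the interval excludes zero.
    filter(ci_l_diff > 0 | ci_u_diff < 0) |>
    # Negative difference: the autistic group's VSOA is smaller (earlier).
    mutate(earlier = if_else(vsoa_diff < 0, "autistic", "not aut.")) |>
    count(lexical_class, earlier) |>
    pivot_wider(names_from = earlier, values_from = n, values_fill = 0) |>
    mutate(ratio = autistic / `not aut.`)

earlier_by
```

A class with no words favoring the non-autistic group yields a ratio of Inf, as in the "toys" row of the noun-category tables.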

By lexical class (proportions)

New VSOAs

Produced earlier by
lexical_class autistic not aut. ratio
verbs 0.31 0.05 6.40
adjectives 0.30 0.11 2.71
nouns 0.24 0.14 1.77
total 0.24 0.13 1.80
other 0.22 0.28 0.79
function_words 0.15 0.08 1.88

Old VSOAs

Produced earlier by
lexical_class autistic not aut. ratio
verbs 0.25 0.14 1.86
other 0.16 0.19 0.84
total 0.12 0.10 1.21
nouns 0.11 0.08 1.31
adjectives 0.06 0.03 2.00
function_words 0.00 0.05 0.00

By noun category (counts)

New VSOAs

Produced earlier by
category autistic not aut. ratio
total 76 43 1.77
animals 21 4 5.25
food_drink 18 12 1.50
furniture_rooms 9 3 3.00
vehicles 7 1 7.00
outside 6 1 6.00
toys 5 3 1.67
clothing 5 6 0.83
body_parts 4 4 1.00
household 1 9 0.11

Old VSOAs

Produced earlier by
category autistic not aut. ratio
total 34 26 1.31
animals 10 2 5.00
food_drink 5 7 0.71
body_parts 4 3 1.33
furniture_rooms 4 2 2.00
vehicles 3 1 3.00
outside 3 3 1.00
clothing 2 3 0.67
household 2 5 0.40
toys 1 0 Inf

By noun category (proportions)

New VSOAs

Produced earlier by
category autistic not aut. ratio
vehicles 0.50 0.07 7.00
animals 0.49 0.09 5.25
toys 0.28 0.17 1.67
furniture_rooms 0.27 0.09 3.00
food_drink 0.26 0.18 1.50
total 0.24 0.14 1.77
outside 0.19 0.03 6.00
clothing 0.18 0.21 0.83
body_parts 0.15 0.15 1.00
household 0.02 0.18 0.11

Old VSOAs

Produced earlier by
category autistic not aut. ratio
animals 0.23 0.05 5.00
vehicles 0.21 0.07 3.00
body_parts 0.15 0.11 1.33
furniture_rooms 0.12 0.06 2.00
total 0.11 0.08 1.31
outside 0.10 0.10 1.00
food_drink 0.07 0.10 0.71
clothing 0.07 0.11 0.67
toys 0.06 0.00 Inf
household 0.04 0.10 0.40