Most of the code here is borrowed from https://github.com/mikabr/aoa-prediction
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1.9000 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
here() starts at /Users/visuallearninglab/Documents/babyview-object
library(knitr)library(cowplot)
Attaching package: 'cowplot'
The following object is masked from 'package:lubridate':
stamp
library(grid)library(ggthemes)
Attaching package: 'ggthemes'
The following object is masked from 'package:cowplot':
theme_map
library(lmerTest)
Loading required package: lme4
Loading required package: Matrix
Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':
expand, pack, unpack
Attaching package: 'lmerTest'
The following object is masked from 'package:lme4':
lmer
The following object is masked from 'package:stats':
step
library(mirt)
Loading required package: stats4
Loading required package: lattice
Attaching package: 'mirt'
The following object is masked from 'package:lme4':
fixef
Rows: 304 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): class_name
dbl (8): n_frame_video_subject_age_detections, n_unique_videos, total_frame_...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 1077 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): item_definition, measure
dbl (3): intercept, slope, aoa
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# based on this model just going to join on any row that contains the object even if that means matching with multiple words, since that's what it looks like is being done with MLU etc.longer_cdi_objects <- object_counts_cleaned |>select(object, visual_frequency, log_proportion, AoA) |>rowwise() |>mutate(matching =list( eng_joined |>filter(str_starts(uni_lemma, fixed(paste0(object, " ")))) |>filter(lexical_classes =="nouns") )) |>unnest(matching) |>rename(original_object=object) |>filter(uni_lemma !="ice cream") |>rename(object=uni_lemma) |>filter(!is.na(language))shorter_cdi_objects <- object_counts_cleaned |>select(object, visual_frequency, log_proportion, AoA) |>mutate(original_object=object) |>left_join(eng_joined, by=c("object"="uni_lemma")) |># getting rid of words that do not have a matchfilter(!is.na(language))eng_vars <-bind_rows(longer_cdi_objects, shorter_cdi_objects) |># re-calculating visual frequency after getting rid of all of the CDI words that don't exist in the modelmutate(visual_frequency =scale(log_proportion))
And now looking at whether we missed any objects that were detected:
anti_join(object_counts_cleaned |>distinct(object), eng_vars |>distinct(original_object), by =c("object"="original_object"))
Joining with `by = join_by(object)`
Joining with `by = join_by(object)`
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `across(pred, ~if_else(is.na(.), .fitted, .))`.
Caused by warning:
! Using an external vector in selections was deprecated in tidyselect 1.1.0.
ℹ Please use `all_of()` or `any_of()` instead.
# Was:
data %>% select(pred)
# Now:
data %>% select(all_of(pred))
See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
Warning: The `guide` argument in `scale_*()` cannot be `FALSE`. This was deprecated in
ggplot2 3.3.4.
ℹ Please use "none" instead.
Estimates of coefficients in predicting words’ developmental trajectories for English comprehension and production data. Larger coefficient values indicate a greater effect of the predictor on acquisition: positive main effects indicate that words with higher values of the predictor tend to be understood/produced by more children, while negative main effects indicate that words with lower values of the predictor tend to be understood/produced by more children; positive age interactions indicate that the predictor’s effect increases with age, while negative age interactions indicate the predictor’s effect decreases with age. Line ranges indicates 95% confidence intervals; filled in points indicate coefficients for which \(p < 0.05\).
This isn’t exactly what I expected so I’m going to try out different models quickly by collapsing our code into helper funcs ## 4d. Testing out different models by collapsing into functions
Fitting model for produces...
Fitting model for understands...
Warning: `cols` is now required when using `unnest()`.
ℹ Please use `cols = c(results)`.
plot_effects_data(original_model_measure_coefs, plot_title="Original model data", curr_predictors=original_predictors)
Adding missing grouping variables: `effect`
Warning: `cols` is now required when using `unnest()`.
ℹ Please use `cols = c(data)`.
5. Correlation with other predictors
# Compute correlation matrix# Compute correlation matrixcor_matrix <-cor(model_data |>select(-object, -num_phons, -valence, -arousal), use ="pairwise.complete.obs")# Convert correlation matrix to long formatcor_melted <- reshape::melt(cor_matrix)
Warning in type.convert.default(X[[i]], ...): 'as.is' should be specified by
the caller; using TRUE
Warning in type.convert.default(X[[i]], ...): 'as.is' should be specified by
the caller; using TRUE
# Reverse order of labelscor_melted$X1 <-factor(cor_melted$X1, levels =rev(unique(cor_melted$X1)))cor_melted$X2 <-factor(cor_melted$X2, levels =rev(unique(cor_melted$X2)))# Define color mapping for axis labelsaxis_label_colors <-ifelse(grepl("visual", levels(cor_melted$X1)), "red", "black")# Plot heatmap with updated label ordering and coloringggplot(cor_melted, aes(X1, X2, fill = value)) +geom_tile() +scale_fill_viridis(option ="mako", direction =1) +theme_minimal() +geom_text(aes(label =round(value, 2)), color ="white", size =3) +# Label each squarelabs(title ="Heatmap of Predictor Correlations",fill ="Correlation") +theme(axis.text.x =element_text(angle =45, hjust =1, color = axis_label_colors),axis.text.y =element_text(color = axis_label_colors))
Warning: Vectorized input to `element_text()` is not officially supported.
ℹ Results may be unexpected or may change in future versions of ggplot2.
Warning: Vectorized input to `element_text()` is not officially supported.
ℹ Results may be unexpected or may change in future versions of ggplot2.