ch7

Exercise: fit a generalized linear mixed model to the Stiller, Goodman, and Frank (2015) data

Download data

library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ readr     2.1.5
✔ ggplot2   3.5.1     ✔ stringr   1.5.1
✔ lubridate 1.9.3     ✔ tibble    3.2.1
✔ purrr     1.0.2     ✔ tidyr     1.3.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
sgf <- read_csv("scales_orig_data.csv") |>
  mutate(age_group = cut(age, 2:5, include.lowest = TRUE), 
         condition = condition |>
           fct_recode("Control" = "No Label", "Experimental" = "Label"))
Rows: 780 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (7): id, age.grp, trial, encoding, dob, date, condition
dbl (2): correct, age

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
tail(sgf)
# A tibble: 6 × 10
  id     age.grp trial  encoding correct dob     date    age condition age_group
  <chr>  <chr>   <chr>  <chr>      <dbl> <chr>   <chr> <dbl> <fct>     <fct>    
1 MSCH84 2       pasta  none           1 8/12/09 6/11…  2.83 Control   [2,3]    
2 MSCH84 2       beds   none           0 8/12/09 6/11…  2.83 Control   [2,3]    
3 MSCH85 2       faces  none           0 10/3/09 6/11…  2.69 Control   [2,3]    
4 MSCH85 2       houses none           0 10/3/09 6/11…  2.69 Control   [2,3]    
5 MSCH85 2       pasta  none           0 10/3/09 6/11…  2.69 Control   [2,3]    
6 MSCH85 2       beds   none           0 10/3/09 6/11…  2.69 Control   [2,3]    

Preprocessing

# Center age at the sample mean; as.numeric() drops the one-column matrix that scale() returns.
sgf$age_centered <- as.numeric(scale(sgf$age, center = TRUE, scale = FALSE))
# Make "Control" the reference (first) level of condition.
sgf$condition <- fct_relevel(sgf$condition, "Control")
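
These two steps fix what the intercept will mean under R's default treatment coding: the log-odds of a correct response for a child of average age in the Control condition. The sketch below checks both steps on a small toy data frame. The values and the name `toy` are made up for illustration and are not taken from the SGF data, and base R's `relevel()` stands in for `fct_relevel()` so the block needs no packages.

```r
# Toy data with hypothetical values (not from the SGF dataset).
# "Experimental" is deliberately listed as the first factor level.
toy <- data.frame(
  age = c(2.5, 3.0, 4.5),
  condition = factor(c("Experimental", "Control", "Experimental"),
                     levels = c("Experimental", "Control"))
)

# Center age: subtract the sample mean, keep a plain numeric vector.
toy$age_centered <- as.numeric(scale(toy$age, center = TRUE, scale = FALSE))

# Move "Control" to the front so it becomes the reference level.
toy$condition <- relevel(toy$condition, ref = "Control")

mean(toy$age_centered)   # should be zero up to floating-point error
levels(toy$condition)    # "Control" should now come first
```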

Modeling

Formulation:

\[ \operatorname{logit}\left(E[Y_{it} \mid r_i, \epsilon_t]\right) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1}X_{i2} + r_i + \epsilon_t \]

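Here $Y_{it}$ is whether participant $i$ answered trial $t$ correctly (1) or not (0), $X_{i1}$ indicates whether the child is in the experimental (1) or control (0) condition, and $X_{i2}$ is the child's centered age. $r_i$ and $\epsilon_t$ are random intercepts for participant and trial, matching (1|id) and (1|trial) in the model formula below.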
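
To make the formulation concrete, the data-generating process it describes can be simulated in base R. This is a minimal sketch: the parameter values, the number of children, and the names `beta`, `sd_id`, `sd_trial`, `kids`, `items`, and `sim` are assumptions chosen for illustration, not estimates from the SGF data. Only the four trial names come from the dataset.

```r
set.seed(1)

# Hypothetical parameter values (assumptions, not fitted estimates).
beta     <- c(b0 = -1.5, b1 = 2.0, b2 = -0.4, b3 = 0.9)
sd_id    <- 0.25   # SD of the participant random intercept r_i
sd_trial <- 0.25   # SD of the trial random intercept epsilon_t

n_id   <- 40
trials <- c("faces", "houses", "pasta", "beds")

# One row per child: condition indicator, centered age, random intercept.
kids <- data.frame(
  id = seq_len(n_id),
  x1 = rep(0:1, length.out = n_id),   # 0 = control, 1 = experimental
  x2 = runif(n_id, -1.5, 1.5),        # centered age in years
  r  = rnorm(n_id, 0, sd_id)
)

# One row per trial, each with its own random intercept.
items <- data.frame(trial = trials, e = rnorm(length(trials), 0, sd_trial))

# merge() with no shared columns gives every child x trial combination.
sim <- merge(kids, items)

# Linear predictor on the logit scale, then Bernoulli responses.
eta <- with(sim, beta[["b0"]] + beta[["b1"]] * x1 + beta[["b2"]] * x2 +
                 beta[["b3"]] * x1 * x2 + r + e)
sim$p       <- plogis(eta)
sim$correct <- rbinom(nrow(sim), size = 1, prob = sim$p)

head(sim)
```

Fitting the same model formula to `sim` (with `x1` and `x2` in place of `condition` and `age_centered`) is one way to check that the fitting procedure recovers known parameters.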

library(lme4)
Loading required package: Matrix

Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':

    expand, pack, unpack
mod <- glmer(correct ~ age_centered * condition + (1|id) + (1|trial), family = "binomial", data = sgf)

Check the result

summary(mod)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: correct ~ age_centered * condition + (1 | id) + (1 | trial)
   Data: sgf

     AIC      BIC   logLik deviance df.resid 
   657.3    683.6   -322.7    645.3      582 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.4205 -0.5690 -0.3759  0.6519  2.5432 

Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 0.05185  0.2277  
 trial  (Intercept) 0.07482  0.2735  
Number of obs: 588, groups:  id, 147; trial, 4

Fixed effects:
                                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)                         -1.4581     0.2156  -6.762 1.36e-11 ***
age_centered                        -0.3764     0.1892  -1.989 0.046658 *  
conditionExperimental                2.2612     0.2246  10.069  < 2e-16 ***
age_centered:conditionExperimental   0.9240     0.2567   3.600 0.000318 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) ag_cnt cndtnE
age_centerd  0.180              
cndtnExprmn -0.617 -0.185       
ag_cntrd:cE -0.155 -0.743  0.208
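
The fixed effects are on the log-odds scale, so it can help to convert them. The sketch below copies the four estimates from the summary above by hand into a vector `b` (a name introduced here for illustration) and computes the implied probability of a correct response at the mean age in each condition, and the age slope within each condition. It assumes both random intercepts are zero, so these are values for a typical child and trial rather than population averages.

```r
# Fixed-effect estimates copied from the summary() output above.
b <- c(intercept = -1.4581, age = -0.3764, cond = 2.2612, age_x_cond = 0.9240)

# Probability of a correct response at the mean age (age_centered = 0).
p_control      <- plogis(b[["intercept"]])
p_experimental <- plogis(b[["intercept"]] + b[["cond"]])

# Change in log-odds per additional year of age, within each condition.
slope_control      <- b[["age"]]
slope_experimental <- b[["age"]] + b[["age_x_cond"]]

# The same slopes expressed as odds ratios per year.
or_per_year <- exp(c(control = slope_control, experimental = slope_experimental))

round(c(control = p_control, experimental = p_experimental), 3)
round(or_per_year, 2)
```

At the mean age this gives a probability of about 0.19 in the Control condition and about 0.69 in the Experimental condition. The age slope is negative in Control and positive in Experimental (-0.3764 + 0.9240 = 0.5476), which is what the significant interaction term reflects.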