library(wordbankr) 
## Warning: replacing previous import 'vctrs::data_frame' by 'tibble::data_frame'
## when loading 'dplyr'
library(tidyverse) 
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.3
## ✓ tibble  3.1.4     ✓ dplyr   1.0.1
## ✓ tidyr   1.0.0     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.4.0
## Warning: package 'ggplot2' was built under R version 3.6.2
## Warning: package 'tibble' was built under R version 3.6.2
## Warning: package 'dplyr' was built under R version 3.6.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(lme4)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
library(lmerTest)
## Warning: package 'lmerTest' was built under R version 3.6.2
## 
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
## 
##     lmer
## The following object is masked from 'package:stats':
## 
##     step

0.1 Research Question

Is there a relationship between children's sociodemographic characteristics and the likelihood that they produce gestures between 8 and 18 months of age?

0.1.1 Step 1. Null logistic regression model, without random effects.

A completely null model, with no predictors and no random effects. (This model assumes that the observations are independent, which is not true here, because each child contributes many observations.)

## 
## Call:
## glm(formula = outcome ~ 1, family = binomial(link = "logit"), 
##     data = data_hlm, na.action = na.omit)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.114  -1.114  -1.114   1.242   1.242  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.151109   0.007842  -19.27   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 90312  on 65414  degrees of freedom
## Residual deviance: 90312  on 65414  degrees of freedom
## AIC: 90314
## 
## Number of Fisher Scoring iterations: 3

The negative intercept means that, in this particular sample, a gesture is more likely not to be produced than to be produced.

## 
##     0     1 
## 35174 30241
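
As a quick check, the intercept of an intercept-only logistic regression is simply the log-odds of the outcome, so it can be recovered from the counts in the table above. The sketch below rebuilds a 0/1 vector from those counts and refits the null model with base R. The names `outcome_vec` and `null_fit` are hypothetical (they are not part of the original analysis), and it assumes that `1` codes a produced gesture.

```r
# Counts copied from the frequency table above.
# Assumption: 1 = gesture produced, 0 = gesture not produced.
n0 <- 35174
n1 <- 30241

# Log-odds and proportion of produced gestures, computed by hand.
log_odds <- log(n1 / n0)
p_hat <- n1 / (n0 + n1)

# Rebuild a 0/1 outcome vector with the same counts and refit the null model.
outcome_vec <- rep(c(0, 1), times = c(n0, n1))
null_fit <- glm(outcome_vec ~ 1, family = binomial(link = "logit"))

# The fitted intercept should agree with the hand-computed log-odds,
# and plogis() maps it back to the observed proportion.
print(log_odds)
print(unname(coef(null_fit)))
print(plogis(log_odds))
print(p_hat)
```

Because the log-odds are below zero, the implied proportion of produced gestures is below one half, which is what the negative intercept expresses.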

0.2 Step 2. Hierarchical linear model: adding random intercepts.

The data have a hierarchical structure: gesture production is nested within groups (children), that is, each child contributes repeated gesture observations.

In sum, the random intercept model allows the probability of producing a gesture to vary across children.

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: outcome ~ 1 + (1 | data_id)
##    Data: data_hlm
## 
##      AIC      BIC   logLik deviance df.resid 
##  79769.9  79788.0 -39882.9  79765.9    65413 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0860 -0.7656 -0.3683  0.8246  3.7946 
## 
## Random effects:
##  Groups  Name        Variance Std.Dev.
##  data_id (Intercept) 1.179    1.086   
## Number of obs: 65415, groups:  data_id, 1044
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.20237    0.03488  -5.802 6.54e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
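
To make the random intercept output more concrete, the estimates above can be moved to the probability scale. The sketch below uses only the reported intercept and the reported child-level variance and standard deviation. The intraclass correlation follows the latent-variable convention (level 1 variance fixed at pi^2 / 3 for a logit link), which is one common choice and an assumption here, not something the original analysis reports. All variable names are hypothetical.

```r
# Values copied from the Step 2 model summary above.
b0 <- -0.20237       # fixed intercept (log-odds for a typical child)
var_child <- 1.179   # variance of the child random intercepts
sd_child <- 1.086    # standard deviation of the child random intercepts

# Probability of producing a gesture for a child with random effect 0.
p_typical <- plogis(b0)

# Approximate range covering the middle 95% of children,
# assuming normally distributed random intercepts.
p_range <- plogis(b0 + c(-1.96, 1.96) * sd_child)

# Latent-variable intraclass correlation for a logit model (assumed convention).
icc <- var_child / (var_child + pi^2 / 3)

print(p_typical)
print(p_range)
print(icc)
```

Read this way, the intraclass correlation is the share of the latent-scale variance attributable to differences between children, which is why the independence assumption of Step 1 does not hold.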

0.2.1 Confidence intervals

## Computing profile confidence intervals ...
##                  2.5 %     97.5 %
## .sig01       1.0344021  1.1405077
## (Intercept) -0.2708193 -0.1340593
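
The profile intervals above are on the log-odds scale (for the intercept) and the standard deviation scale (for `.sig01`). As a rough sketch, they can be transformed to the probability and variance scales with base R, since both transformations are monotonic. The variable names below are hypothetical.

```r
# Profile confidence limits copied from the output above.
ci_intercept <- c(lower = -0.2708193, upper = -0.1340593)
ci_sd_child <- c(lower = 1.0344021, upper = 1.1405077)

# Interval for the typical child's probability of producing a gesture.
ci_prob <- plogis(ci_intercept)

# Interval for the child-level variance (square of the standard deviation).
ci_var_child <- ci_sd_child^2

print(ci_prob)
print(ci_var_child)
```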

0.3 Step 3. Random intercepts for children and type of gesture

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: outcome ~ 1 + (1 | data_id) + (1 | type)
##    Data: data_hlm
## 
##      AIC      BIC   logLik deviance df.resid 
##  72304.2  72331.4 -36149.1  72298.2    65412 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -8.9373 -0.6668 -0.2588  0.7154  5.3823 
## 
## Random effects:
##  Groups  Name        Variance Std.Dev.
##  data_id (Intercept) 1.5393   1.2407  
##  type    (Intercept) 0.7822   0.8844  
## Number of obs: 65415, groups:  data_id, 1044; type, 5
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -0.1027     0.3943  -0.261    0.794
## Computing profile confidence intervals ...
##                  2.5 %    97.5 %
## .sig01       1.1826765 1.3027949
## .sig02       0.5284464 1.9276817
## (Intercept) -1.0560033 0.8507078
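
One way to judge whether the random intercept for gesture type is worth adding is to compare this model with the Step 2 model. The sketch below does so using only the log-likelihoods and AIC values printed in the two summaries. With the fitted model objects available, `anova()` on the two models would report the same comparison. The variable names are hypothetical, and the chi-square p-value should be read with caution: a variance is tested on the boundary of its parameter space, so the naive test tends to be conservative.

```r
# Log-likelihoods and AIC values copied from the Step 2 and Step 3 summaries.
loglik_child <- -39882.9        # outcome ~ 1 + (1 | data_id)
loglik_child_type <- -36149.1   # outcome ~ 1 + (1 | data_id) + (1 | type)
aic_child <- 79769.9
aic_child_type <- 72304.2

# Likelihood ratio statistic; the models differ by one variance parameter.
lr_stat <- 2 * (loglik_child_type - loglik_child)
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Drop in AIC when the gesture type intercept is added.
aic_drop <- aic_child - aic_child_type

print(lr_stat)
print(p_value)
print(aic_drop)
```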

0.4 Step 4. Random intercepts for children and type of gesture, plus Level 2 effects

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: outcome ~ 1 + age + mom_ed_num + minority + (1 | data_id) + (1 |  
##     type)
##    Data: data_hlm
## 
##      AIC      BIC   logLik deviance df.resid 
##  71420.5  71475.0 -35704.2  71408.5    65409 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -8.1358 -0.6643 -0.2561  0.7089  5.4561 
## 
## Random effects:
##  Groups  Name        Variance Std.Dev.
##  data_id (Intercept) 0.5591   0.7477  
##  type    (Intercept) 0.7808   0.8836  
## Number of obs: 65415, groups:  data_id, 1044; type, 5
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -4.362895   0.423075 -10.312   <2e-16 ***
## age          0.339032   0.009181  36.926   <2e-16 ***
## mom_ed_num  -0.038202   0.017890  -2.135   0.0327 *  
## minority     0.043974   0.057189   0.769   0.4419    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr) age    mm_d_n
## age        -0.266              
## mom_ed_num -0.206 -0.036       
## minority   -0.040 -0.109 -0.116
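
The fixed effects above are on the log-odds scale. A minimal sketch of how they could be read as odds ratios with Wald 95% intervals follows, using only the estimates and standard errors printed in the summary. The variable names are hypothetical, and the interpretation of `age` assumes it is measured in months, as the 8 to 18 month range in the research question suggests. The last lines compare the child-level variance with the Step 3 value to show how much of it the Level 2 predictors account for.

```r
# Estimates and standard errors copied from the Step 4 summary above.
est <- c(age = 0.339032, mom_ed_num = -0.038202, minority = 0.043974)
se <- c(age = 0.009181, mom_ed_num = 0.017890, minority = 0.057189)

# Odds ratios with Wald 95% confidence intervals.
z_crit <- qnorm(0.975)
odds_ratio <- exp(est)
ci_low <- exp(est - z_crit * se)
ci_high <- exp(est + z_crit * se)
print(round(cbind(odds_ratio, ci_low, ci_high), 3))

# Proportional reduction in child-level variance relative to Step 3.
var_child_step3 <- 1.5393
var_child_step4 <- 0.5591
var_reduction <- (var_child_step3 - var_child_step4) / var_child_step3
print(var_reduction)
```

An odds ratio above 1 for `age` means the odds of producing a gesture rise with each additional unit of age. An interval that contains 1, as for `minority`, is consistent with no effect.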
