FIML in lme4 and lavaan

To the best of my knowledge, FIML is ML.

Fit a growth curve model in lme4

library(lme4)
library(lavaan)
library(tidyr)
data("sleepstudy")

This is the lmer() help page example model. set REML = FALSE and control = lmerControl(optimizer = "nlminbwrap") to match lavaan output.

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, 
            REML = FALSE, 
            control = lmerControl(optimizer = "nlminbwrap"))

Extract coefficients and variances to compare to lavaan output. The as.data.frame() method when called on a VarCorr object converts the correlation between random effects to covariance.

fixef(fm1)

## (Intercept)        Days 
##   251.40510    10.46729

as.data.frame(VarCorr(fm1))[,1:4]

##        grp        var1 var2      vcov
## 1  Subject (Intercept) <NA> 565.51552
## 2  Subject        Days <NA>  32.68220
## 3  Subject (Intercept) Days  11.05542
## 4 Residual        <NA> <NA> 654.94102

Fit a growth curve model in lavaan

Need to reshape data to wide format.

ss_wide <- pivot_wider(sleepstudy, 
                       names_from = Days, 
                       values_from = Reaction)
names(ss_wide)[2:11] <- paste0("Reaction_",names(ss_wide)[2:11])

Next specify model. What a pain! Source: Michael Clark

lgc_model = '
intercept =~  1*Reaction_0 + 1*Reaction_1 + 1*Reaction_2 + 
              1*Reaction_3 + 1*Reaction_4 + 1*Reaction_5 + 
              1*Reaction_6 + 1*Reaction_7 + 1*Reaction_8 + 
              1*Reaction_9
              
Reactions =~  0*Reaction_0 + 1*Reaction_1 + 2*Reaction_2 + 
              3*Reaction_3 + 4*Reaction_4 + 5*Reaction_5 + 
              6*Reaction_6 + 7*Reaction_7 + 8*Reaction_8 + 
              9*Reaction_9

Reaction_0 ~~ resid*Reaction_0    # same residual variance for each time point
Reaction_1 ~~ resid*Reaction_1
Reaction_2 ~~ resid*Reaction_2
Reaction_3 ~~ resid*Reaction_3
Reaction_4 ~~ resid*Reaction_4
Reaction_5 ~~ resid*Reaction_5
Reaction_6 ~~ resid*Reaction_6
Reaction_7 ~~ resid*Reaction_7
Reaction_8 ~~ resid*Reaction_8
Reaction_9 ~~ resid*Reaction_9
'

Now fit lavaan model using the growth() function.

lgc_fit <- growth(lgc_model, data = ss_wide)
summary(lgc_fit)

## lavaan 0.6-18 ended normally after 98 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        15
##   Number of equality constraints                     9
## 
##   Number of observations                            18
## 
## Model Test User Model:
##                                                       
##   Test statistic                               141.141
##   Degrees of freedom                                59
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept =~                                        
##     Reaction_0        1.000                           
##     Reaction_1        1.000                           
##     Reaction_2        1.000                           
##     Reaction_3        1.000                           
##     Reaction_4        1.000                           
##     Reaction_5        1.000                           
##     Reaction_6        1.000                           
##     Reaction_7        1.000                           
##     Reaction_8        1.000                           
##     Reaction_9        1.000                           
##   Reactions =~                                        
##     Reaction_0        0.000                           
##     Reaction_1        1.000                           
##     Reaction_2        2.000                           
##     Reaction_3        3.000                           
##     Reaction_4        4.000                           
##     Reaction_5        5.000                           
##     Reaction_6        6.000                           
##     Reaction_7        7.000                           
##     Reaction_8        8.000                           
##     Reaction_9        9.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept ~~                                        
##     Reactions        11.055   42.876    0.258    0.797
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intercept       251.405    6.632   37.906    0.000
##     Reactions        10.467    1.502    6.968    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Rectn_0 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_1 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_2 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_3 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_4 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_5 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_6 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_7 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_8 (resd)  654.941   77.186    8.485    0.000
##    .Rectn_9 (resd)  654.941   77.186    8.485    0.000
##     intrcpt         565.516  265.266    2.132    0.033
##     Reactns          32.682   13.573    2.408    0.016

lme4 output again:

fixef(fm1)

## (Intercept)        Days 
##   251.40510    10.46729

as.data.frame(VarCorr(fm1))[,1:4]

##        grp        var1 var2      vcov
## 1  Subject (Intercept) <NA> 565.51552
## 2  Subject        Days <NA>  32.68220
## 3  Subject (Intercept) Days  11.05542
## 4 Residual        <NA> <NA> 654.94102

The Intercepts in lavaan output are the fixed effects in the lme4 output.
The Variances in lavaan output are the random effects in the lme4 output.
The Covariance in lavaan output is the covariance between random effects in lme4 output.

Add missingness to dependent variable

Randomly drop 5 responses. Now have 5 NAs for Reaction, the dependent variable.

set.seed(1)
drop <- sample(nrow(sleepstudy), size = 5)
sleepstudy$Reaction[drop] <- NA
summary(sleepstudy)

##     Reaction          Days        Subject   
##  Min.   :194.3   Min.   :0.0   308    : 10  
##  1st Qu.:254.8   1st Qu.:2.0   309    : 10  
##  Median :287.7   Median :4.5   310    : 10  
##  Mean   :298.4   Mean   :4.5   330    : 10  
##  3rd Qu.:337.2   3rd Qu.:7.0   331    : 10  
##  Max.   :466.4   Max.   :9.0   332    : 10  
##  NA's   :5                     (Other):120

Now update lme4 model with new data and extract coefficients.

fm2 <- update(fm1, data = sleepstudy)
fixef(fm2)

## (Intercept)        Days 
##    251.1472     10.5400

as.data.frame(VarCorr(fm2))[,1:4]

##        grp        var1 var2      vcov
## 1  Subject (Intercept) <NA> 559.71188
## 2  Subject        Days <NA>  32.76343
## 3  Subject (Intercept) Days  11.34585
## 4 Residual        <NA> <NA> 655.39412

Now reshape updated data into wide format and re-fit model in lavaan using FIML.

# reshape data with missingness
ss_wide <- pivot_wider(sleepstudy, 
                       names_from = Days, 
                       values_from = Reaction)
names(ss_wide)[2:11] <- paste0("Reaction_",names(ss_wide)[2:11])

# fit using fiml
lgc_fit2 <- growth(lgc_model, data = ss_wide, missing = "fiml")
summary(lgc_fit2)

## lavaan 0.6-18 ended normally after 98 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        15
##   Number of equality constraints                     9
## 
##   Number of observations                            18
##   Number of missing patterns                         5
## 
## Model Test User Model:
##                                                       
##   Test statistic                               141.133
##   Degrees of freedom                                59
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept =~                                        
##     Reaction_0        1.000                           
##     Reaction_1        1.000                           
##     Reaction_2        1.000                           
##     Reaction_3        1.000                           
##     Reaction_4        1.000                           
##     Reaction_5        1.000                           
##     Reaction_6        1.000                           
##     Reaction_7        1.000                           
##     Reaction_8        1.000                           
##     Reaction_9        1.000                           
##   Reactions =~                                        
##     Reaction_0        0.000                           
##     Reaction_1        1.000                           
##     Reaction_2        2.000                           
##     Reaction_3        3.000                           
##     Reaction_4        4.000                           
##     Reaction_5        5.000                           
##     Reaction_6        6.000                           
##     Reaction_7        7.000                           
##     Reaction_8        8.000                           
##     Reaction_9        9.000                           
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept ~~                                        
##     Reactions        11.347   42.941    0.264    0.792
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intercept       251.147    6.636   37.845    0.000
##     Reactions        10.540    1.509    6.987    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Rectn_0 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_1 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_2 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_3 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_4 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_5 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_6 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_7 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_8 (resd)  655.394   78.559    8.343    0.000
##    .Rectn_9 (resd)  655.394   78.559    8.343    0.000
##     intrcpt         559.715  264.515    2.116    0.034
##     Reactns          32.763   13.626    2.405    0.016

Notice the lavaan output again matches the lme4 output. FIML in this case is simply ML.

add covariate

Now add a binary time-invariant covariate and refit lme 4 model.

set.seed(8)
id <- sample(levels(sleepstudy$Subject), size = 9)
sleepstudy$X <- ifelse(sleepstudy$Subject %in% id, 1, 0)
fm3 <- update(fm1, . ~ . + X, data = sleepstudy)
fixef(fm3)

## (Intercept)        Days           X 
##  255.131080   10.541746   -7.992342

as.data.frame(VarCorr(fm3))[,1:4]

##        grp        var1 var2      vcov
## 1  Subject (Intercept) <NA> 538.03719
## 2  Subject        Days <NA>  32.77377
## 3  Subject (Intercept) Days  18.38357
## 4 Residual        <NA> <NA> 655.30978

Fit the same model using lavaan with missing = "fiml". Need to reshape data and update the model syntax. Notice we need to explicitly add the covariance specification intercept ~~ Reactions to get the covariance between random effects.

# reshape data
ss_wide <- pivot_wider(sleepstudy, 
                       names_from = Days, 
                       values_from = Reaction)
names(ss_wide)[3:12] <- paste0("Reaction_",names(ss_wide)[3:12])

# specify model
lgc_model2 = '
intercept =~  1*Reaction_0 + 1*Reaction_1 + 1*Reaction_2 + 
              1*Reaction_3 + 1*Reaction_4 + 1*Reaction_5 + 
              1*Reaction_6 + 1*Reaction_7 + 1*Reaction_8 + 
              1*Reaction_9
              
Reactions =~  0*Reaction_0 + 1*Reaction_1 + 2*Reaction_2 + 
              3*Reaction_3 + 4*Reaction_4 + 5*Reaction_5 + 
              6*Reaction_6 + 7*Reaction_7 + 8*Reaction_8 + 
              9*Reaction_9

# covariate
intercept ~ X

Reaction_0 ~~ resid*Reaction_0    # same residual variance for each time point
Reaction_1 ~~ resid*Reaction_1
Reaction_2 ~~ resid*Reaction_2
Reaction_3 ~~ resid*Reaction_3
Reaction_4 ~~ resid*Reaction_4
Reaction_5 ~~ resid*Reaction_5
Reaction_6 ~~ resid*Reaction_6
Reaction_7 ~~ resid*Reaction_7
Reaction_8 ~~ resid*Reaction_8
Reaction_9 ~~ resid*Reaction_9
intercept ~~ Reactions
'

# fit lavaan model
lgc_fit <- growth(lgc_model2, data = ss_wide, missing = "fiml")
summary(lgc_fit)

## lavaan 0.6-18 ended normally after 103 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        16
##   Number of equality constraints                     9
## 
##   Number of observations                            18
##   Number of missing patterns                         5
## 
## Model Test User Model:
##                                                       
##   Test statistic                               159.681
##   Degrees of freedom                                68
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept =~                                        
##     Reaction_0        1.000                           
##     Reaction_1        1.000                           
##     Reaction_2        1.000                           
##     Reaction_3        1.000                           
##     Reaction_4        1.000                           
##     Reaction_5        1.000                           
##     Reaction_6        1.000                           
##     Reaction_7        1.000                           
##     Reaction_8        1.000                           
##     Reaction_9        1.000                           
##   Reactions =~                                        
##     Reaction_0        0.000                           
##     Reaction_1        1.000                           
##     Reaction_2        2.000                           
##     Reaction_3        3.000                           
##     Reaction_4        4.000                           
##     Reaction_5        5.000                           
##     Reaction_6        6.000                           
##     Reaction_7        7.000                           
##     Reaction_8        8.000                           
##     Reaction_9        9.000                           
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept ~                                         
##     X                -7.992   13.547   -0.590    0.555
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .intercept ~~                                        
##     Reactions        18.384   43.816    0.420    0.675
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .intercept       255.131    9.405   27.126    0.000
##     Reactions        10.542    1.509    6.987    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Rectn_0 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_1 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_2 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_3 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_4 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_5 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_6 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_7 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_8 (resd)  655.310   78.539    8.344    0.000
##    .Rectn_9 (resd)  655.310   78.539    8.344    0.000
##    .intrcpt         538.036  257.313    2.091    0.037
##     Reactns          32.773   13.629    2.405    0.016

The covariate coefficient is listed under Regressions in lavaan. Once again the lme4 and lavaan outputs match.

add missingness on covariate

Now randomly set three of the covariates to missing and update the lme4 model with the new data.

set.seed(3)
dropx <- sample(levels(sleepstudy$Subject), size = 3)
sleepstudy$X <- ifelse(sleepstudy$Subject %in% dropx, NA, sleepstudy$Subject)
summary(sleepstudy)

##     Reaction          Days        Subject          X       
##  Min.   :194.3   Min.   :0.0   308    : 10   Min.   : 1.0  
##  1st Qu.:254.8   1st Qu.:2.0   309    : 10   1st Qu.: 4.0  
##  Median :287.7   Median :4.5   310    : 10   Median :10.0  
##  Mean   :298.4   Mean   :4.5   330    : 10   Mean   : 9.8  
##  3rd Qu.:337.2   3rd Qu.:7.0   331    : 10   3rd Qu.:15.0  
##  Max.   :466.4   Max.   :9.0   332    : 10   Max.   :18.0  
##  NA's   :5                     (Other):120   NA's   :30

fm4 <- update(fm3, data = sleepstudy)
fixef(fm4)

## (Intercept)        Days           X 
##  237.894633   10.371152    1.153681

as.data.frame(VarCorr(fm4))[,1:4]

##        grp        var1 var2      vcov
## 1  Subject (Intercept) <NA> 517.11505
## 2  Subject        Days <NA>  33.27205
## 3  Subject (Intercept) Days  27.61064
## 4 Residual        <NA> <NA> 690.95847

Now fit same model in lavaan model using FIML. Need to reshape data again. Notice the three subjects are dropped who had a missing covariate.

# reshape data
ss_wide <- pivot_wider(sleepstudy, 
                       names_from = Days, 
                       values_from = Reaction)
names(ss_wide)[3:12] <- paste0("Reaction_",names(ss_wide)[3:12])

# fit lavaan model
lgc_fit <- growth(lgc_model2, data = ss_wide, missing = "fiml")

## Warning: lavaan->lav_data_full():  
##    3 cases were deleted due to missing values in exogenous variable(s), while 
##    fixed.x = TRUE.

summary(lgc_fit)

## lavaan 0.6-18 ended normally after 101 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        16
##   Number of equality constraints                     9
## 
##                                                   Used       Total
##   Number of observations                            15          18
##   Number of missing patterns                         3            
## 
## Model Test User Model:
##                                                       
##   Test statistic                               163.707
##   Degrees of freedom                                68
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept =~                                        
##     Reaction_0        1.000                           
##     Reaction_1        1.000                           
##     Reaction_2        1.000                           
##     Reaction_3        1.000                           
##     Reaction_4        1.000                           
##     Reaction_5        1.000                           
##     Reaction_6        1.000                           
##     Reaction_7        1.000                           
##     Reaction_8        1.000                           
##     Reaction_9        1.000                           
##   Reactions =~                                        
##     Reaction_0        0.000                           
##     Reaction_1        1.000                           
##     Reaction_2        2.000                           
##     Reaction_3        3.000                           
##     Reaction_4        4.000                           
##     Reaction_5        5.000                           
##     Reaction_6        6.000                           
##     Reaction_7        7.000                           
##     Reaction_8        8.000                           
##     Reaction_9        9.000                           
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   intercept ~                                         
##     X                 1.154    1.330    0.867    0.386
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .intercept ~~                                        
##     Reactions        27.611   47.229    0.585    0.559
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .intercept       237.895   14.801   16.073    0.000
##     Reactions        10.371    1.671    6.208    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Rectn_0 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_1 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_2 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_3 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_4 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_5 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_6 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_7 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_8 (resd)  690.959   90.234    7.657    0.000
##    .Rectn_9 (resd)  690.959   90.234    7.657    0.000
##    .intrcpt         517.118  277.876    1.861    0.063
##     Reactns          33.272   15.264    2.180    0.029

Again, lme4 and lavaan outputs match. FIML indeed appears to be ML.