Several authors have found that fitting complex models to large observational data sets can take a very long time using the lme4 package and often produces warnings about failure to converge to the optimum.
I have often responded that it is comparatively straightforward to fit such models using the MixedModels package for Julia. The cost of doing things in this way is installing and learning to use Julia.
Recently the R package JuliaCall has been added to CRAN. This package allows an R user to call Julia functions from within R.
Before installing the JuliaCall package one must install Julia itself. The easiest way to do that is from the download site https://julialang.org/downloads/. An alternative is to install JuliaPro from https://juliacomputing.com
You may wish to start Julia once just to install some packages. This can be done from R through JuliaCall but I’m not sure how easy it is to try to install the RCall package through JuliaCall. I think it is easiest to start Julia and run
Pkg.add("RCall")
Pkg.add("MixedModels")
Then install JuliaCall in R and load it and lme4
library(JuliaCall)
library(lme4)
## Loading required package: Matrix
The MixedModels package for Julia contains several example data sets in .RData format. These can be attached to the R session
attach("~/.julia/v0.6/MixedModels/test/dat.rda")
## The following objects are masked from package:lme4:
##
## Arabidopsis, cake, cbpp, Dyestuff, Dyestuff2, grouseticks,
## InstEval, Pastes, Penicillin, sleepstudy, VerbAgg
One of these example data sets is kb07 from a 2007 paper by Kronmueller and Barr (thanks to Dale Barr for providing these data).
str(kb07)
## 'data.frame': 1790 obs. of 10 variables:
## $ G: Factor w/ 56 levels "30","31","34",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ H: Factor w/ 32 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Y: num 2267 3856 1567 1732 2660 ...
## $ S: num 1 -1 -1 1 1 -1 -1 1 1 -1 ...
## $ T: num -1 1 -1 1 -1 1 -1 1 -1 1 ...
## $ U: num 1 -1 -1 -1 -1 1 1 1 1 -1 ...
## $ V: num -1 -1 1 1 -1 -1 1 1 -1 -1 ...
## $ W: num 1 1 1 -1 -1 -1 -1 1 1 1 ...
## $ X: num -1 -1 1 -1 1 1 -1 1 -1 -1 ...
## $ Z: num -1 1 -1 -1 1 -1 1 1 -1 1 ...
The names of the columns have been shrunk to single letters to cut down on typing. The two grouping factors, G and H, are “Subject” and “Item”, respectively. The covariates S, T, and U are two-level factors in a \(2^3\) design coded in the -1, +1 encoding favored by statisticians. The other covariates, 'V, W, X and Z are twp- and three-factor interactions.
lmerFitting the “maximal” model, as recommended by Barr et al. (2013), with lmer does take a relatively long time.
system.time(fm1 <- lmer(Y ~ 1+S+T+U+V+W+X+Z + (1+S+T+U+V+W+X+Z|G) + (1+S+T+U+V+W+X+Z|H),
kb07, REML=FALSE))
## Warning in commonArgs(par, fn, control, environment()): maxfun < 10 *
## length(par)^2 is not recommended.
## Warning in optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), :
## convergence code 1 from bobyqa: bobyqa -- maximum number of function
## evaluations exceeded
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.00277275 (tol =
## 0.002, component 1)
## user system elapsed
## 696.208 0.061 696.283
print(summary(fm1), corr=FALSE)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: Y ~ 1 + S + T + U + V + W + X + Z + (1 + S + T + U + V + W +
## X + Z | G) + (1 + S + T + U + V + W + X + Z | H)
## Data: kb07
##
## AIC BIC logLik deviance df.resid
## 28748.3 29193.0 -14293.2 28586.3 1709
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0332 -0.5773 -0.1437 0.3964 4.7091
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## G (Intercept) 90764 301.27
## S 5182 71.99 -0.43
## T 5543 74.45 -0.47 0.07
## U 7587 87.10 0.21 -0.20 0.41
## V 8829 93.96 0.20 -0.76 -0.54 -0.20
## W 1822 42.68 0.47 -0.53 -0.11 -0.44 0.28
## X 7422 86.15 -0.10 0.13 -0.05 -0.86 -0.06
## Z 3802 61.66 -0.48 0.41 -0.39 -0.09 0.18
## H (Intercept) 129829 360.32
## S 1855 43.07 -0.34
## T 62395 249.79 -0.68 -0.45
## U 2948 54.30 0.20 -0.03 -0.18
## V 1043 32.30 0.57 -0.76 0.02 0.01
## W 1622 40.27 0.28 -0.02 -0.27 0.44 -0.21
## X 4700 68.56 0.08 -0.23 0.21 -0.13 -0.26
## Z 4820 69.43 0.04 -0.48 0.32 -0.69 0.65
## Residual 399613 632.15
##
##
##
##
##
##
##
## 0.70
## -0.78 -0.39
##
##
##
##
##
##
## 0.02
## -0.68 -0.10
##
## Number of obs: 1790, groups: G, 56; H, 32
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2180.627 76.820 28.386
## S -66.990 19.333 -3.465
## T -333.881 47.667 -7.005
## U 78.987 21.234 3.720
## V 22.152 20.336 1.089
## W -18.924 17.507 -1.081
## X 5.262 22.421 0.235
## Z -23.951 21.019 -1.139
## convergence code: 1
## Model failed to converge with max|grad| = 0.00277275 (tol = 0.002, component 1)
## maxfun < 10 * length(par)^2 is not recommended.
The reason this takes a long time is because there are 72 covariance parameters to optimize, and each evaluation of the objective function requires solving a large penalized least squares problem for the fixed-effects parameters and the random effects. This is a huge number of parameters for a model like this. Almost inevitably convergence will be to singular covariance matrices for the random effects, as it is in this case, complicating matters further.
Let me be clear that I do not advocate fitting such models - I am simply showing how it can be done if you or the referees on your paper think you should.
Begin with julia_setup. This can take a while the first time you do this because it must initialize the Julia system and load some packages.
system.time(j <- julia_setup())
## Julia version 0.6.2 at location /usr/local/src/julia-d386e40c17/bin will be used.
## Julia initiation...
## Finish Julia initiation.
## Loading setup script for JuliaCall...
## Finish loading setup script for JuliaCall.
## user system elapsed
## 27.525 2.033 27.373
Load the MixedModels package.
j$library("MixedModels")
and copy the formula and data set from R to Julia
j$assign("form", formula(fm1))
j$assign("kb07", kb07)
Then fit the model
j$eval("@elapsed fm1 = fit(LinearMixedModel, form, kb07)")
## [1] 22.90206
j$eval("fm1")
## Julia Object of type MixedModels.LinearMixedModel{Float64}.
## Linear mixed model fit by maximum likelihood
## Formula: Y ~ 1 + S + T + U + V + W + X + Z + ((1 + S + T + U + V + W + X + Z) | G) + ((1 + S + T + U + V + W + X + Z) | H)
## logLik -2 logLik AIC BIC
## -1.4293159×10⁴ 2.8586318×10⁴ 2.8748318×10⁴ 2.91930056×10⁴
##
## Variance components:
## Column Variance Std.Dev. Corr.
## G (Intercept) 90715.0184 301.189340
## S 5180.3901 71.974927 -0.43
## T 5543.1348 74.452232 -0.47 0.08
## U 7584.8816 87.091226 0.21 -0.20 0.41
## V 8832.9843 93.983958 0.20 -0.76 -0.54 -0.20
## W 1821.9809 42.684668 0.47 -0.53 -0.11 -0.44 0.28
## X 7417.1453 86.122850 -0.10 0.13 -0.05 -0.86 -0.06 0.70
## Z 3801.0318 61.652509 -0.47 0.41 -0.39 -0.09 0.18 -0.78 -0.39
## H (Intercept) 129690.6871 360.125932
## S 1856.9765 43.092650 -0.34
## T 62370.7020 249.741270 -0.68 -0.45
## U 2950.1553 54.315332 0.20 -0.03 -0.18
## V 1042.0598 32.280950 0.57 -0.76 0.02 0.02
## W 1620.5108 40.255569 0.28 -0.03 -0.27 0.44 -0.21
## X 4703.6870 68.583431 0.08 -0.24 0.21 -0.13 -0.26 0.02
## Z 4821.2335 69.435103 0.04 -0.47 0.32 -0.68 0.65 -0.69 -0.10
## Residual 399627.0136 632.160592
## Number of obs: 1790; levels of grouping factors: 56, 32
##
## Fixed-effects parameters:
## Estimate Std.Error z value P(>|z|)
## (Intercept) 2180.63 76.7856 28.3989 <1e-99
## S -66.99 19.3346 -3.46478 0.0005
## T -333.881 47.6587 -7.00566 <1e-11
## U 78.987 21.235 3.71967 0.0002
## V 22.1518 20.3368 1.08925 0.2760
## W -18.9243 17.5061 -1.08101 0.2797
## X 5.26182 22.4216 0.234677 0.8145
## Z -23.951 21.0197 -1.13946 0.2545
I think of fitting a model in Julia using JuliaCall as “pushing” the data to Julia. An alternative, if you are willing to work in the Julia REPL (Read-Eval-Print-Loop) is to “pull” the data and formula using RCall. Here’s the same example in the Julia REPL.
julia> using MixedModels, RCall
R> attach("~/.julia/v0.6/MixedModels/test/dat.rda")
R> form <- Y ~ 1+S+T+U+V+W+X+Z + (1+S+T+U+V+W+X+Z|G) + (1+S+T+U+V+W+X+Z|H)
julia> fm1 = fit(LinearMixedModel, rcopy(R"form"), rcopy(R"kb07"))
Linear mixed model fit by maximum likelihood
Formula: Y ~ 1 + S + T + U + V + W + X + Z + ((1 + S + T + U + V + W + X + Z) | G) + ((1 + S + T + U + V + W + X + Z) | H)
logLik -2 logLik AIC BIC
-1.42931611×10⁴ 2.85863221×10⁴ 2.87483221×10⁴ 2.91930097×10⁴
Variance components:
Column Variance Std.Dev. Corr.
G (Intercept) 90797.8433 301.326805
S 5186.2898 72.015900 -0.43
T 5545.5545 74.468480 -0.47 0.08
U 7590.2433 87.122003 0.21 -0.20 0.41
V 8839.5027 94.018630 0.20 -0.76 -0.54 -0.20
W 1822.8053 42.694324 0.47 -0.53 -0.11 -0.44 0.29
X 7417.7267 86.126226 -0.10 0.13 -0.05 -0.86 -0.06 0.70
Z 3800.9670 61.651983 -0.48 0.41 -0.39 -0.09 0.18 -0.78 -0.39
H (Intercept) 129801.4428 360.279673
S 1855.0633 43.070446 -0.34
T 62410.5894 249.821115 -0.68 -0.45
U 2957.5947 54.383773 0.20 -0.03 -0.18
V 1038.0660 32.219031 0.57 -0.75 0.02 0.01
W 1608.1394 40.101613 0.28 -0.05 -0.27 0.44 -0.20
X 4698.6232 68.546504 0.08 -0.25 0.21 -0.13 -0.26 0.01
Z 4836.0681 69.541844 0.04 -0.46 0.32 -0.68 0.65 -0.69 -0.10
Residual 399601.5053 632.140416
Number of obs: 1790; levels of grouping factors: 56, 32
Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 2180.63 76.8177 28.387 <1e-99
S -66.9899 19.3354 -3.46463 0.0005
T -333.881 47.6721 -7.0037 <1e-11
U 78.9869 21.2424 3.71837 0.0002
V 22.1517 20.3362 1.08927 0.2760
W -18.9244 17.4951 -1.0817 0.2794
X 5.26191 22.418 0.234719 0.8144
Z -23.9509 21.0303 -1.13888 0.2548
julia> @time fit(LinearMixedModel, rcopy(R"form"), rcopy(R"kb07"));
## Error: Error happens in Julia.
## REvalError:
3.571324 seconds (1.22 M allocations: 74.571 MiB, 3.83% gc time)
julia> fm1.optsum
Initial parameter vector: [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0 … 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 0.0, 1.0]
Initial objective value: 30014.369768606295
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [0.0, -Inf, -Inf, -Inf, -Inf, -Inf, -Inf, -Inf, 0.0, -Inf … 0.0, -Inf, -Inf,
-Inf, 0.0, -Inf, -Inf, 0.0, -Inf, 0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10 … 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10, 1.0e-10]
initial_step: [0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.75, 1.0 … 0.75, 1.0, 1.0, 1.0, 0.75, 1.0, 1.0, 0.75, 1.0, 0.75]
maxfeval: -1
Function evaluations: 2018
Final parameter vector: [0.476677, -0.0494935, -0.0558944, 0.0295759, 0.0292159, 0.0319406, -0.0139343,
-0.0463468, 0.102611, -0.0170349 … 0.0265041, -0.0488983, -0.0548502, 0.0604581, 0.00382623, -0.0032048, -0.00157491, 0.0, 8.21222e-5, 0.0]
Final objective value: 28586.322101083984
Return code: FTOL_REACHED
julia> versioninfo()
Julia Version 0.6.2
Commit d386e40 (2017-12-13 18:08 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
WORD_SIZE: 64
BLAS: libmkl_rt
LAPACK: libmkl_rt
LIBM: libopenlibm
LLVM: libLLVM-3.9.1 (ORCJIT, ivybridge)
One thing to notice in this transcript is that the prompt changes from julia> to R> and back. Once the RCall package is loaded you can switch to an R REPL (i.e. the R> prompt) by typing $ at the beginning of a line. Backspace at the beginning of a line takes you back to the julia> prompt.
This fit is faster than the previous Julia fit because this version of Julia uses BLAS from Intel’s Math Kernel Library (MKL). Also, the first time that you fit a model of this form several Julia functions (well, methods actually) need to be compiled. The second and subsequent fits are faster.