I find simulations with set parameters often the best way to gain understanding of statistical power in a specific testing design. My main aim with this syntax is to generate data following a random slope, random intercept model. The response variable (‘resp’ in short) varies between individuals (‘ID’) both in intercept as its linear relation to some environmental variable (‘env’). This typically occurs if individuals differ in their mean expression of a trait but also differ in how strongly they respond to the environment (e.g. temperature).
Below syntax uses the simulate command in package lme4 (Bates et al. 2015). In contrast to info on how to simulate from a fitted mixed model (‘simulate.merMod’), the lme4 vignette only offers basic info on how to parametrize simulations for a non-fitted model (‘simulate.formula’). Below thread on the R-sig-Mixed forum ended up crucial for my understanding of how theta should be fitted.
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q3/022378.html
Parameters that can be changed in below code are the number of ID’s scored, number of replicates per individuals, random variance in intercepts, random variance in slopes, random covariance between intercept and slope, residual variance and fixed effects.
The syntax ends with a couple of comparisons between set and achieved parameters and a plot showing variation in individual reaction norms.
require(lme4)
## Loading required package: lme4
## Loading required package: Matrix
require(ggplot2)
## Loading required package: ggplot2
require(ggthemes)
## Loading required package: ggthemes
form<-as.formula(c("~env+(env|ID)"))
Env is also fitted as a fixed effect to account for mean plasticity at the group level.
In this model there are four (co)variances fitted
V.ind0 = variance in reaction norm intercepts between individuals
V.inde = variance in reaction norm slopes between individuals
COVind0-ind1e = covariance between reaction norm slopes and intercepts
V.err0 = residual variance
The number of individuals sampled:
N.ind<-40
The number of repeat obs per individual:
N.obs<-15
set.seed(25)
simdat <-data.frame(ID=factor(rep(1:N.ind,each=N.obs)),env=runif(N.ind*N.obs,0,1))
beta<-c(7,2)
names(beta)<-c("(Intercept)","env")
In this example the fixed effects are: Beta intercept=7 Population level slope for environment=2
V.ind0 <- 3
V.inde <- 5
V.err0 <- 3
COVind0.ind1e <- 0.7*(sqrt(V.ind0*V.inde))
COVind0.ind1e with slope-intercept correlation set to 0.7
The variance covariance matrix
vcov<-matrix(c(V.ind0,COVind0.ind1e,COVind0.ind1e,V.inde), 2, 2)
Calculate theta
theta<-c((chol(vcov)/sqrt(V.err0))[1,1],
(chol(vcov)/sqrt(V.err0))[1,2],
(chol(vcov)/sqrt(V.err0))[2,2])
names(theta)<-c("ID.(Intercept)","ID.env.(Intercept)","ID.env")
set.seed(25)
response<-simulate(form,newdata=simdat,family=gaussian,
newparams=list(theta = theta,beta = beta, sigma = sqrt(V.err0)))
simdat$resp<-as.vector(response[,1])
model1<-lmer(resp~env+(env|ID),data=simdat)
Do simulated betas correspond to set beta parameters?
summary(model1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp ~ env + (env | ID)
## Data: simdat
##
## REML criterion at convergence: 2507.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.86745 -0.64037 -0.01198 0.67059 2.70785
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## ID (Intercept) 3.432 1.853
## env 4.623 2.150 0.76
## Residual 2.933 1.713
## Number of obs: 600, groups: ID, 40
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 6.7063 0.3261 20.564
## env 1.8034 0.4216 4.277
##
## Correlation of Fixed Effects:
## (Intr)
## env 0.328
beta
## (Intercept) env
## 7 2
Do simulated variances correspond to set variance parameters?
as.data.frame(VarCorr(model1))
## grp var1 var2 vcov sdcor
## 1 ID (Intercept) <NA> 3.432159 1.8526086
## 2 ID env <NA> 4.623206 2.1501643
## 3 ID (Intercept) env 3.046502 0.7647969
## 4 Residual <NA> <NA> 2.932810 1.7125448
V.ind0
## [1] 3
V.inde
## [1] 5
COVind0.ind1e
## [1] 2.711088
V.err0
## [1] 3
getME(model1,"theta")
## ID.(Intercept) ID.env.(Intercept) ID.env
## 1.0817869 0.9602312 0.8089070
theta
## ID.(Intercept) ID.env.(Intercept) ID.env
## 1.0000000 0.9036961 0.9219544
For clarity 10 random individuals are shown in black, others in grey
First extract random effects for plotting
fittednorms<-ranef(model1)$ID[,1:2]
fittednorms[,1]<-fittednorms[,1]+getME(model1,"beta")[1]
fittednorms[,2]<-fittednorms[,2]+getME(model1,"beta")[2]
colnames(fittednorms)<-c("intercept","slope")
Randomly select 10 individuals to be plotted in black
set.seed(25)
rand10<-sample(1:N.ind, 10, replace = FALSE)
plot1<-ggplot(simdat,aes(env, resp))+
geom_rangeframe(data=data.frame(x=c(0, 1), y=c(0, 20)), aes(x, y))+
theme_tufte(base_size = 25)+
geom_point(colour=NA)+
geom_segment(aes(x = 0, y = intercept,xend=1, yend=intercept+slope ), data=fittednorms,colour = "gray85", size = 1)+
geom_segment(aes(x = 0, y = intercept,xend=1, yend=intercept+slope ), data=fittednorms[rand10,],colour = "black", size = 1)+
labs(x = "Environment", y="Response",title="slope-intercept correlation=.7")+
theme(axis.title.x=element_text(vjust=-1),axis.title.y=element_text(vjust=1))+
scale_x_continuous(breaks=seq(0,1,0.2))+
scale_y_continuous(breaks=seq(0,20,4))
plot1
Ref
Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.