Simulating Power for a Multi-Level Model

This document presents my attempt to conduct a formal power analysis for a multi-level model using code provided in the supplementary material to the paper “Power Struggles” by Sean P. Lane & Erin P. Hennes in 2017 in the Journal of Social and Personal Relationships (JSPR).

A general point from the manuscript: power in multi-level modeling depends on more parameters than in standard regression models, and researchers may not have intuitions about typical sizes for some of these effects. Therefore Lane & Hennes suggest collecting pilot data to have an idea of what values to simulate.

Although there are formulae that can be used to directly estimate power in a multi-level model (and L&H point to some resources for this), they opt for the flexible approach of simulating data and testing the power implied by the data generating model.

The Research Scenario

In the example used in L&H, participants’ anxiety was being predicted by the degree of social support they received from others. Measures of anxiety and social support (and other variables) were taken repeatedly at different points in time.

My example is a proposed study of interpersonal perception. Personality characteristics of participants will be measured, and participants will be watching videos of several social targets. My colleague and I have hypothesized that specific personality factors will influence judgments of the social targets. We will use a multi-level analysis to control for nesting of social targets in participants.

Generating the Data

simdat.m1 <- function(J,K) { # this function will generate 1 data set
  target <- rep(seq(1,K,length=K)) # K represents measures per person
  person <- rep(1:J, each=K) # J represents people recruited
  
  # generating values for personality data (level 2)
  # assuming a known correlation structure for the factors
  r12 = 0.25 # from prior research
  r13 = 0.32
  r14 = 0.01
  r15 = 0.25
  r16 = 0.46
  r17 = 0.2 # made up
  r18 = 0.2 # made up
  r23 = 0.18
  r24 = -0.18
  r25 = 0.30
  r26 = 0.17
  r27 = 0.2 # made up
  r28 = 0.2 # made up
  r34 = 0.13
  r35 = 0.25
  r36 = 0.1
  r37 = 0.2 # made up
  r38 = 0.2 # made up
  r45 = 0.002
  r46 = 0.01
  r47 = 0.2 # made up
  r48 = 0.2 # made up
  r56 = 0.12
  r57 = 0.2 # made up
  r58 = 0.2 # made up
  r67 = 0.2 # made up
  r68 = 0.2 # made up
  r78 = 0.2 # made up
  
  pDat = mvrnorm(n=J, # simulating data for J people
                 mu=rep(3.5, times=8), # mean score is 3.5 for all factors
                 Sigma=matrix(c(1, r12, r13, r14, r15, r16, r17, r18, # cor matrix
                                r12, 1, r23, r24, r25, r26, r27, r28,
                                r13, r23, 1, r34, r35, r36, r37, r38,
                                r14, r24, r34, 1, r45, r46, r47, r48,
                                r15, r25, r35, r45, 1, r56, r57, r58,
                                r16, r25, r36, r46, r56, 1, r67, r68,
                                r17, r27, r37, r47, r57, r67, 1, r78,
                                r18, r28, r38, r48, r58, r68, r78, 1),
                              nrow=8),
                 empirical=TRUE)
  
  # creat a long version of the personality data
  pDat.long <- as.data.frame(pDat[rep(seq_len(nrow(pDat)), each=K),]) # repeating personality K times
  names(pDat.long) <- paste("p",1:8,sep="")
  
  # fixed effects
  b0 <- 0.7 # true intercept
  b1 <- 0.1 # true effect of pers1
  b2 <- 0.1 # true effect of pers2
  b3 <- 0.1 # true effect of pers3
  b4 <- 0.1 # true effect of pers4
  b5 <- 0.1 # true effect of pers5
  b6 <- 0.1 # true effect of pers6
  b7 <- 0.1 # true effect of pers7
  b8 <- 0.1 # true effect of pers8
  
  # random effects
  vsub.b0 <- 0.01 # true between person variance in the intercept
  
  # combining fixed and random effects
  b0.int <- rnorm(J, b0, sqrt(vsub.b0)) # generating a random intercept for each participant
  vresid <- 0.13 # true within person variance in outcome
  
  # generating the outcome, based on the specified model
  y <- rnorm(J*K, b0.int[person] +
               b1*pDat.long[,1] +
               b2*pDat.long[,2] +
               b3*pDat.long[,3] +
               b4*pDat.long[,4] +
               b5*pDat.long[,5] +
               b6*pDat.long[,6] +
               b7*pDat.long[,7] +
               b8*pDat.long[,8],
             sqrt(vresid))
  
  # create the simulated data frame
  sdat <- as.data.frame(cbind(person,target,y,pDat.long))
  
  # return the simulated data frame
  return(sdat)
}

Example of One Data Set

To give a sense of the parameter estimates coming from this model, I generated the model summary for one simulation.

simDat <- simdat.m1(100,6) # input people, then measures
lme.power <- lmer(y ~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + (1|person), data=simDat)
summary(lme.power)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
##   to degrees of freedom [lmerMod]
## Formula: y ~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + (1 | person)
##    Data: simDat
## 
## REML criterion at convergence: 562.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -3.00931 -0.66624 -0.00236  0.64622  2.81156 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  person   (Intercept) 0.01725  0.1313  
##  Residual             0.12555  0.3543  
## Number of obs: 600, groups:  person, 100
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  0.67641    0.13627 91.00000   4.964 3.21e-06 ***
## p1           0.09650    0.02377 91.00000   4.060 0.000104 ***
## p2           0.08231    0.02202 91.00000   3.738 0.000324 ***
## p3           0.10262    0.02160 91.00000   4.751 7.51e-06 ***
## p4           0.08999    0.02122 91.00000   4.242 5.33e-05 ***
## p5           0.10218    0.02147 91.00000   4.759 7.29e-06 ***
## p6           0.12773    0.02253 91.00000   5.668 1.68e-07 ***
## p7           0.12543    0.02131 91.00000   5.885 6.57e-08 ***
## p8           0.08342    0.02131 91.00000   3.914 0.000175 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##    (Intr) p1     p2     p3     p4     p5     p6     p7    
## p1 -0.081                                                 
## p2 -0.315 -0.101                                          
## p3 -0.184 -0.250 -0.070                                   
## p4 -0.444  0.022  0.256 -0.109                            
## p5 -0.210 -0.116 -0.201 -0.138  0.018                     
## p6 -0.251 -0.417 -0.035  0.086  0.022  0.024              
## p7 -0.134 -0.037 -0.144 -0.084 -0.203 -0.096 -0.115       
## p8 -0.134 -0.037 -0.144 -0.084 -0.203 -0.096 -0.115 -0.061

Simulate!

The simulation function generates power estimates for all of the fixed effects in my model.

# create a new function for conducting the power analysis
power.m1 <- function(J,K,sims=100) { # defaults to 100 simulations (for speed)
  signifs <- matrix(NA, sims, 9) # records if an effect is significant
  
  for (s in 1:sims) { # loop through all simulations
    simd <- simdat.m1(J,K) # generate the data
    lme.power <- lmer(y ~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + (1|person), data=simd) # estimate the model
    est <- fixef(lme.power) # extract b's
    se <- se.fixef(lme.power) # extract SE's for b's
    signifs[s,] <- abs(est)/se>2 # if Z > 2, return significance as true
  }
  power <- colMeans(signifs)
  names <- names(fixef(lme.power))
  return(as.data.frame(cbind(names, power)))
}

pow.out <- power.m1(100,6,1000)
print(pow.out)
##         names power
## 1 (Intercept)     1
## 2          p1 0.994
## 3          p2 0.998
## 4          p3 0.997
## 5          p4     1
## 6          p5 0.999
## 7          p6 0.999
## 8          p7 0.999
## 9          p8 0.997