Bliese, P.D., Maltarich, M.A. & Hendricks, J.L. Back to Basics with Mixed-Effects Models: Nine Take-Away Points. J Bus Psychol 33, 1–23 (2018).
https://doi.org/10.1007/s10869-017-9491-z
##Simulation 1: Independent Data
##Set up Independent data
set.seed(12532) #使用set.seed生成隨機序列數據,()內設定起始值
#建立X和Y向量
Y<-rnorm(1000) #用rnormY隨機生成一組1000筆數據Y,Y是常態分配,平均為0,標準差為1
X<-.30*Y+sqrt(1-.30^2)*rnorm(1000) #sqrt是square root
#X和Y相關係數
cor(X,Y)
## [1] 0.2937485
#建立group ID向量,ID從1到100,重複10次,平均一組有10個成員
G.ID<-rep(1:100,10)
#隨機分派Group ID給observations(X,Y)
NO.GRP<-data.frame(G.ID=G.ID,Y=Y,X=X)
head(NO.GRP)
## G.ID Y X
## 1 1 -0.3562218 -0.2533692
## 2 2 0.3926757 1.9169494
## 3 3 -0.4866951 -0.3444651
## 4 4 0.2367052 -1.6619983
## 5 5 0.5555600 1.3272490
## 6 6 0.1011640 2.1926501
#Calculate group means for X.
TDAT<-aggregate(NO.GRP[,c("G.ID","X")],by=list(NO.GRP$G.ID),mean) #list()取得G.ID變量
#Assign means back to members as X.G
NO.GRP<-merge(NO.GRP,TDAT[,c("G.ID","X")],by="G.ID", #merge()將兩個data.frame 透過by="G.ID"進行合併
suffixes=c("",".G")) #The default is c(".x",".y")
#These data have no group-level properties.
#NO.GRP[1:15,]
knitr::kable(head(NO.GRP,15))
G.ID | Y | X | X.G |
---|---|---|---|
1 | -0.3562218 | -0.2533692 | -0.1805595 |
1 | -0.4222437 | -1.1081410 | -0.1805595 |
1 | -0.4751256 | 0.1228866 | -0.1805595 |
1 | -0.0754015 | -0.2089118 | -0.1805595 |
1 | 0.0401238 | -0.5647489 | -0.1805595 |
1 | -0.0425110 | 0.3182596 | -0.1805595 |
1 | 1.8467950 | 0.9231918 | -0.1805595 |
1 | 0.5551599 | -0.5044792 | -0.1805595 |
1 | 1.6542047 | 0.7838111 | -0.1805595 |
1 | -2.3206311 | -1.3140941 | -0.1805595 |
2 | -0.8275048 | -0.6610696 | -0.2058514 |
2 | 0.3926757 | 1.9169494 | -0.2058514 |
2 | 0.3734971 | -0.8080668 | -0.2058514 |
2 | -2.4918987 | -2.2593381 | -0.2058514 |
2 | -1.4049858 | 1.5874561 | -0.2058514 |
#install packages
pacman::p_load(nlme,multilevel)
##ICC(1) estimate via mixed-effects
tmod<-lme(fixed=Y~1,random=~1|G.ID,data=NO.GRP)
VarCorr(tmod)
## G.ID = pdLogChol(1)
## Variance StdDev
## (Intercept) 4.299649e-09 6.557171e-05
## Residual 9.986574e-01 9.993285e-01
#組內相關係數ICC(1)
4.299649e-09/(4.299649e-09+9.986574e-01)
## [1] 4.305429e-09
# ICC(1) estimate via ANOVA
tmod<-aov(Y~as.factor(G.ID),data=NO.GRP)
ICC1(tmod)
## [1] -0.01560787
ICC(1)用來描述組內資料的相似性或資料的不獨立性。以上用兩種方式計算組內相關係數ICC(1)皆趨近於0,表示隨機抽取任一群組內的兩名員工,其response variable完全無共通性。
將完全獨立的資料應用在OLS和Mixed effect model,結果會如何?
#Table 2 (upper panel) presents the results from regressing Y on X and X.G in OLS regression in simulated data with near-zero ICC(1)
tmod<-lm(Y~X+X.G, data=NO.GRP)
knitr::kable(summary(tmod)$coef)
Estimate | Std. Error | t value | Pr(>|t|) | |
---|---|---|---|---|
(Intercept) | 0.0615821 | 0.0302628 | 2.034908 | 0.0421243 |
X | 0.3019515 | 0.0316102 | 9.552340 | 0.0000000 |
X.G | -0.1098059 | 0.0981465 | -1.118796 | 0.2634968 |
#Table 2 (lower panel) presents the results from regressing Y on X and X.G in mixed-effects random intercept model in simulated data with near-zero ICC(1)
tmod<-lme(fixed=Y~X+X.G,random=~1|G.ID,data=NO.GRP)
#summary(tmod)$tTable
knitr::kable(summary(tmod)$tTable)
Value | Std.Error | DF | t-value | p-value | |
---|---|---|---|---|---|
(Intercept) | 0.0615821 | 0.0302629 | 899 | 2.034908 | 0.0421531 |
X | 0.3019515 | 0.0316102 | 899 | 9.552340 | 0.0000000 |
X.G | -0.1098059 | 0.0981465 | 98 | -1.118796 | 0.2659622 |
兩種模型所估計的參數與Std.Error都相同。
Take-away point 1:即使沒有預期或明顯的總體層次效果,混合效果模型仍可被用於巢套資料。
#Re-run simulation with data that returns a slightly positive ICC(1)
set.seed(125321) #變更起始值
Y<-rnorm(1000)
X<-.30*Y+sqrt(1-.30^2)*rnorm(1000)
cor(X,Y)
## [1] 0.2827705
G.ID<-rep(1:100,10)
NO.GRP<-data.frame(G.ID=G.ID,Y=Y,X=X)
# Calculate group means for X. Assign means back to members as X.G
TDAT<-aggregate(NO.GRP[,c("G.ID","X")],list(NO.GRP$G.ID),mean)
NO.GRP<-merge(NO.GRP,TDAT[,c("G.ID","X")],by="G.ID",
suffixes=c("",".G"))
knitr::kable(head(NO.GRP,15))
G.ID | Y | X | X.G |
---|---|---|---|
1 | -1.9711215 | 0.5262381 | 0.1933068 |
1 | -0.2086705 | -1.2067246 | 0.1933068 |
1 | 0.1279318 | 0.7974741 | 0.1933068 |
1 | 0.9715064 | -0.2736225 | 0.1933068 |
1 | 0.9996284 | 0.1510868 | 0.1933068 |
1 | -0.1924392 | 0.7063790 | 0.1933068 |
1 | 1.0780661 | -0.9318504 | 0.1933068 |
1 | -0.7626443 | 0.9025238 | 0.1933068 |
1 | 0.1578146 | 0.3128734 | 0.1933068 |
1 | 2.8152661 | 0.9486898 | 0.1933068 |
2 | 0.6172546 | 1.0773918 | -0.1148806 |
2 | 0.1150379 | 0.9117392 | -0.1148806 |
2 | -0.6256317 | -0.8983085 | -0.1148806 |
2 | 0.8985839 | 2.0294684 | -0.1148806 |
2 | -1.1983031 | -0.8236887 | -0.1148806 |
# ICC(1) estimate via mixed-effects
tmod<-lme(fixed=Y~1,random=~1|G.ID,data=NO.GRP)
VarCorr(tmod)
## G.ID = pdLogChol(1)
## Variance StdDev
## (Intercept) 0.01292908 0.1137061
## Residual 1.00240352 1.0012010
#ICC(1)
(0.01292908)/(0.01292908+1.00240352)
## [1] 0.01273384
# ICC(1) estimate via ANOVA
tmod<-aov(Y~as.factor(G.ID),data=NO.GRP)
summary(tmod)
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(G.ID) 99 112.0 1.132 1.129 0.194
## Residuals 900 902.2 1.002
ICC1(tmod)
## [1] 0.01273379
ICC(1)=0.013,表示任兩個成員之間的相似性有1.3%可歸因於群組的影響。 但F-test 檢定結果指出ICC(1)=0.013不具有統計上的顯著性(p = .194),因此一般情況下仍被視為independent data。
忽略資料的非獨立性,將它視作完全獨立的資料應用在OLS和Mixed effect model,結果會如何?
#Table 3 (upper panel) presents the results of regressing Y on X and X.G in OLS in simulated data with small positive ICC(1)
tmod<-lm(Y~X+X.G, data=NO.GRP)
knitr::kable(summary(tmod)$coef)
Estimate | Std. Error | t value | Pr(>|t|) | |
---|---|---|---|---|
(Intercept) | 0.0148230 | 0.0307618 | 0.4818625 | 0.6300094 |
X | 0.2826789 | 0.0328511 | 8.6048573 | 0.0000000 |
X.G | 0.0947241 | 0.1080628 | 0.8765653 | 0.3809340 |
#Table 3 (lower panel) presents the results of regressing Y on X and X.G in mixed-effects random intercept model in simulated data with small positive ICC(1)
tmod<-lme(fixed=Y~X+X.G, random=~1|G.ID,data=NO.GRP)
knitr::kable(summary(tmod)$tTable)
Value | Std.Error | DF | t-value | p-value | |
---|---|---|---|---|---|
(Intercept) | 0.0148230 | 0.0320474 | 899 | 0.4625321 | 0.6438117 |
X | 0.2826789 | 0.0326979 | 899 | 8.6451599 | 0.0000000 |
X.G | 0.0947241 | 0.1121245 | 98 | 0.8448117 | 0.4002738 |
兩模型所估計的參數相同,但是在OLS中,X的標準誤略大於Mixed-effects model,會增加型二誤差的風險;而X.G的標準誤略小於Mixed-effects model,會增加型一誤差的風險。這就是本文要告訴我們的第二個重點。
Take-away point 2:Nested data沒有選擇使用mixed-effect model,group-level effects會增加型一錯誤,lower-level effects會增加型二錯誤的風險。
接下來要討論平減(centering)
# Create a group-mean centered variable (W.X)
NO.GRP$W.X<-NO.GRP$X-NO.GRP$X.G
#Table 4 (upper panel) Results of regressing Y on W.X and X.G in OLS (upper panel) in simulated data with small positive ICC(1)
tmod<-lm(Y~W.X+X.G, data=NO.GRP)
knitr::kable(summary(tmod)$coef)
Estimate | Std. Error | t value | Pr(>|t|) | |
---|---|---|---|---|
(Intercept) | 0.0148230 | 0.0307618 | 0.4818625 | 0.6300094 |
W.X | 0.2826789 | 0.0328511 | 8.6048573 | 0.0000000 |
X.G | 0.3774030 | 0.1029484 | 3.6659436 | 0.0002594 |
#Table 4 (lower panel) Results of regressing Y on W.X and X.G in mixed-effects random intercept model in simulated data with small positive ICC(1)
tmod<-lme(fixed=Y~W.X+X.G,random=~1|G.ID,data=NO.GRP)
knitr::kable(summary(tmod)$tTable)
Value | Std.Error | DF | t-value | p-value | |
---|---|---|---|---|---|
(Intercept) | 0.0148230 | 0.0320474 | 899 | 0.4625321 | 0.6438117 |
W.X | 0.2826789 | 0.0326979 | 899 | 8.6451599 | 0.0000000 |
X.G | 0.3774030 | 0.1072509 | 98 | 3.5188807 | 0.0006589 |
表4呈現兩模型level-1變數經過centering後的結果,參數估計結果相同。 我們再和表3沒有centering相比,X和W.X參數估計結果也相同,但是X.G參數卻變大了,此現象稱為“emergent effect”,又稱作contextual effect 或incremental effect (詳見Hofmann and Gavin 1998)
Interpreting Level-2 Coefficients
# create a dataset that contains 100 (X.G, Y.G) pairs instead of 1000 (X, Y) pairs
TDAT<-aggregate(NO.GRP[,c("Y","X")],list(NO.GRP$G.ID),mean)
names(TDAT)<-c("G.ID","Y.G","X.G")
knitr::kable(head(TDAT,15))
G.ID | Y.G | X.G |
---|---|---|
1 | 0.3015338 | 0.1933068 |
2 | 0.1172007 | -0.1148806 |
3 | 0.6550346 | 0.6917088 |
4 | 0.0919974 | -0.4618666 |
5 | 0.2392360 | 0.6067226 |
6 | 0.6261253 | 0.0598669 |
7 | -0.2403001 | -0.0295834 |
8 | -0.5648905 | -0.2815494 |
9 | 0.1233002 | -0.1471086 |
10 | -0.4606494 | 0.0345138 |
11 | 0.2488465 | -0.0363647 |
12 | 0.2269388 | 0.0661628 |
13 | 0.0349877 | -0.1330073 |
14 | 0.1542718 | -0.0606252 |
15 | -0.9466745 | -0.1054040 |
#Table 5 (results from 100 group means) Results of OLS group-mean regression of Y.G on X.G in simulated data with small positive ICC(1)
tmod<-lm(Y.G~X.G, data=TDAT)
knitr::kable(summary(tmod)$coef)
Estimate | Std. Error | t value | Pr(>|t|) | |
---|---|---|---|---|
(Intercept) | 0.014823 | 0.0320474 | 0.4625321 | 0.6447251 |
X.G | 0.377403 | 0.1072509 | 3.5188807 | 0.0006589 |
#Add another group-level predictor, Z.G, with no level-1 analogue
set.seed(125321)
Y<-rnorm(1000)
X<-.30*Y+sqrt(1-.30^2)*rnorm(1000)
G.ID<-rep(1:100,10)
NO.GRP<-data.frame(G.ID=G.ID,Y=Y,X=X)
NO.GRP<-NO.GRP[order(NO.GRP$G.ID),]
NO.GRP$Z.G<-rep(rnorm(100),each=10) #新增group-level變數Z.G
knitr::kable(head(NO.GRP,10))
G.ID | Y | X | Z.G | |
---|---|---|---|---|
1 | 1 | -1.9711215 | 0.5262381 | -0.6305105 |
101 | 1 | -0.7626443 | 0.9025238 | -0.6305105 |
201 | 1 | 0.1578146 | 0.3128734 | -0.6305105 |
301 | 1 | -0.2086705 | -1.2067246 | -0.6305105 |
401 | 1 | 2.8152661 | 0.9486898 | -0.6305105 |
501 | 1 | 0.9715064 | -0.2736225 | -0.6305105 |
601 | 1 | 1.0780661 | -0.9318504 | -0.6305105 |
701 | 1 | 0.1279318 | 0.7974741 | -0.6305105 |
801 | 1 | 0.9996284 | 0.1510868 | -0.6305105 |
901 | 1 | -0.1924392 | 0.7063790 | -0.6305105 |
#mixed effect model
tmod<-lme(fixed=Y~Z.G, random=~1|G.ID, data=NO.GRP) #Y~Z.G
knitr::kable(summary(tmod)$tTable)
Value | Std.Error | DF | t-value | p-value | |
---|---|---|---|---|---|
(Intercept) | 0.0272597 | 0.0337761 | 900 | 0.8070708 | 0.4198389 |
Z.G | 0.0174623 | 0.0366474 | 98 | 0.4764947 | 0.6347827 |
#OLS regression
TDAT<-aggregate(NO.GRP[,c("Y","Z.G")],list(NO.GRP$G.ID),mean)
names(TDAT)<-c("G.ID","Y.G","Z.G")
tmod<-lm(Y.G~Z.G, data=TDAT) #Y.G~Z.G
knitr::kable(summary(tmod)$coef)
Estimate | Std. Error | t value | Pr(>|t|) | |
---|---|---|---|---|
(Intercept) | 0.0272597 | 0.0337761 | 0.8070711 | 0.4215800 |
Z.G | 0.0174623 | 0.0366474 | 0.4764948 | 0.6347826 |
Take-away point 5: 在mixed-effects models,高層次的變數Z.G可預測低層次結果變數Y的組平均Y.G。