Bliese, P.D., Maltarich, M.A. & Hendricks, J.L. Back to Basics with Mixed-Effects Models: Nine Take-Away Points. J Bus Psychol 33, 1–23 (2018).

https://doi.org/10.1007/s10869-017-9491-z

1 Data Management

##Simulation 2: Pure Group Data

Simulate pure group-level data: This simulation relies on the sim.icc function in the multilevel library.

library(multilevel)

## Loading required package: nlme

## Loading required package: MASS

set.seed(632342) #設定起始值
ALL.GRP<-sim.icc(gsize=10,ngrp=100,icc1=.15,nitems=2) #這個資料集有100組，每一組10人，未設定 item.cor，表示items with no level-1 correlation
names(ALL.GRP)<-c("G.ID","Y","X")

sim.icc()可模擬具有或不具有Level-1相關性的2-Level ICC(1)值。nitems=2是vector的數量。若設定 item.cor=.3，表示有.30 level-1 item correlations。

knitr::kable(head(ALL.GRP,15))

G.ID	Y	X
1	-0.3501393	-0.8712853
1	-2.3714188	-0.7405538
1	-1.3145565	-0.7956528
1	-1.7994279	-0.6557583
1	0.5989924	-0.2615682
1	-1.2710712	-1.6651974
1	0.2234514	-2.1611040
1	-1.5340462	-0.1659889
1	0.1501798	-1.3895897
1	-1.4292006	-2.8472639
2	-1.9448274	-1.3905018
2	-2.1964239	-1.7641183
2	-0.6863093	-0.4302189
2	-0.6104307	-2.7132627
2	-4.2151097	-1.6249252

想像我們有100個地點，溫度從冷到熱都不同。每個地點都隨機分派10個小組成員，每個成員都估計氣溫(Y)。隔天，再隨機分派另一組1000名參與者，將這些參與者以10個為一組，隨機分配到100個地點(溫度未變)，每個成員都估計氣溫(X)。

因此(X,Y)是不同人不同日期對相同group-level室溫所估計的溫度。

# covariance theorem decomposition results
with(ALL.GRP, waba(x=X, y=Y, grpid=G.ID))

## $Cov.Theorem
##     RawCorr     EtaBx     EtaBy     CorrB     EtaWx     EtaWy       CorrW
## 1 0.1474048 0.4892028 0.4856851 0.6101503 0.8721701 0.8741338 0.003192502
## 
## $n.obs
## [1] 1000
## 
## $n.grps
## [1] 100

原始相關性為0.1474048，組間(組均值)相關性為0.6101503，組內(組均值平減)相關性為0.003192502。

# Randomly sort X on a group-by-group basis into X2 (footnote5)
set.seed(467432)
ALL.GRP$X2<-unlist(tapply(ALL.GRP$X,ALL.GRP$G.ID,sample)) #建立新變量X2
ALL.GRP[1:10,] #observe that X2 values are X values in new order

##    G.ID          Y          X         X2
## 1     1 -0.3501393 -0.8712853 -0.2615682
## 2     1 -2.3714188 -0.7405538 -0.7405538
## 3     1 -1.3145565 -0.7956528 -0.6557583
## 4     1 -1.7994279 -0.6557583 -0.1659889
## 5     1  0.5989924 -0.2615682 -2.8472639
## 6     1 -1.2710712 -1.6651974 -0.7956528
## 7     1  0.2234514 -2.1611040 -1.6651974
## 8     1 -1.5340462 -0.1659889 -0.8712853
## 9     1  0.1501798 -1.3895897 -1.3895897
## 10    1 -1.4292006 -2.8472639 -2.1611040

with(ALL.GRP,waba(Y,X2,G.ID))

## $Cov.Theorem
##     RawCorr     EtaBx     EtaBy     CorrB     EtaWx     EtaWy      CorrW
## 1 0.1815365 0.4856851 0.4892028 0.6101503 0.8741338 0.8721701 0.04796176
## 
## $n.obs
## [1] 1000
## 
## $n.grps
## [1] 100

X2變數按組對X進行隨機排序，然後重新估計變異數矩陣。結果如預期般，Y與組中的不同X response 一致，不會對結果產生有顯著的影響。

#Estimate the ICC(1) and ICC(2) values for the Y variable using
#the ANOVA method
tmod<-aov(Y~as.factor(G.ID),data=ALL.GRP)
ICC1(tmod)

## [1] 0.1530071

ICC2(tmod)

## [1] 0.643681

我們人工刻意設定Y的ICC(1)為0.1530071。這裡不同的是加大了完全忽略非獨立性的風險，因為Y對X回歸是基於原始相關係數0.147來估計係數，該係數可能被誤解為個體層次的effect。

# Calculate group means for X. Assign means back to members as X.G
TDAT<-aggregate(ALL.GRP[,c("G.ID","X")],list(ALL.GRP$G.ID),mean)
ALL.GRP<-merge(ALL.GRP,TDAT[,c("G.ID","X")],
by="G.ID",suffixes=c("",".G"))

2 Table 6

#Table 6 (upper panel) Results of regressing Y on X and X.G in OLS (upper panel) in simulated data with substantial positive ICC(1)
tmod<-lm(Y~X+X.G,data=ALL.GRP)
knitr::kable(summary(tmod)$coef)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	0.0504433	0.0331980	1.5194691	0.1289615
X	0.0032969	0.0357350	0.0922609	0.9265093
X.G	0.6208774	0.0730473	8.4996580	0.0000000

#Table 6 (lower panel) Results of regressing Y on X and X.G in mixed-effects random intercept model in simulated data with substantial positive ICC(1)
tmod<-lme(fixed=Y~X+X.G, random=~1|G.ID, data=ALL.GRP)
knitr::kable(summary(tmod)$tTable)

	Value	Std.Error	DF	t-value	p-value
(Intercept)	0.0504433	0.0426620	899	1.1823945	0.2373618
X	0.0032969	0.0344428	899	0.0957223	0.9237625
X.G	0.6208774	0.0888218	98	6.9901437	0.0000000

兩模型估計的參數相同，但standard errors不同。在OLS模型中，低估了group-level variable X.G 的standard errors(=0.073)而產生了膨脹的t值(=8.50)。在mixed effects model中，X.G的standard errors較大(=0.089)，但正確，其t值為6.99；level-1 variable X的standard errors較小，降低導致型二誤差的風險(Take-away point 2)。

在Table 6中，與組平均相關的總斜率為0.621+0.003=0.624。接下來我們將X(level-1變數)進行平減來確認總關係0.624。

3 Take-away point 6

#Create a group-mean centered variable (W.X)
ALL.GRP$W.X<-ALL.GRP$X-ALL.GRP$X.G

#Estimate total effect of X.G with a group-mean centered X variable

tmod<-lme(fixed=Y~W.X+X.G, random=~1|G.ID, data=ALL.GRP)
knitr::kable(summary(tmod)$tTable)

	Value	Std.Error	DF	t-value	p-value
(Intercept)	0.0504433	0.0426620	899	1.1823941	0.2373620
W.X	0.0032969	0.0344428	899	0.0957223	0.9237625
X.G	0.6241743	0.0818720	98	7.6237862	0.0000000

以上證明emergent effect。我們介紹Take-away point 6：在mixed-effect model中，emergent effect 確認了level-1變數與其組平均值的不同，平減後的組平均值代表跨層次意義上的重要改變。

4 Group-Mean Reliability

在研究實務上，在探討預測變量的結構具有多層次屬性結構時，研究人員經常會報告group mean reliability，也就是ICC(2)。

# Estimate ICC(1) and ICC(2) values for both X and Y using ANOVA method
tmod<-aov(Y~as.factor(G.ID),data=ALL.GRP)
ICC1(tmod)

## [1] 0.1530071

ICC2(tmod)

## [1] 0.643681

tmod<-aov(X~as.factor(G.ID),data=ALL.GRP)
ICC1(tmod)

## [1] 0.1568375

ICC2(tmod)

## [1] 0.6503632

接下來要展示不同group sizes對ICC(2)值的影響，及檢測emergent effect效果的能力。

set.seed(632342)
ALL.GRP<-sim.icc(gsize=2,ngrp=100,icc1=.15,nitems=2) # 在sim.icc函數中將gsize替換為2或100。
names(ALL.GRP)<-c("G.ID","Y","X")
ALL.GRP[1:15,]

##    G.ID          Y           X
## 1     1 -0.1093418  0.12424801
## 2     1 -2.1306214 -3.36765654
## 3     2 -1.0169082 -1.36755166
## 4     2 -1.5017796 -1.51738016
## 5     3  0.9077974 -0.23947720
## 6     3 -0.9622663  0.58838004
## 7     4  0.5395617  0.27097823
## 8     4 -1.2179360  0.09219963
## 9     5  0.5633326 -0.32990359
## 10    5 -1.0160478 -1.20213434
## 11    6 -1.5917934 -0.67586873
## 12    6 -1.8433899 -0.91484613
## 13    7 -0.2631602 -1.18739589
## 14    7 -0.1872816 -1.13403820
## 15    8 -3.7387863  0.52169297

with(ALL.GRP,waba(x=X, y=Y, grpid=G.ID))

## $Cov.Theorem
##     RawCorr     EtaBx     EtaBy     CorrB     EtaWx    EtaWy       CorrW
## 1 0.1224128 0.7756321 0.7742287 0.2391332 0.6311853 0.632906 -0.05304541
## 
## $n.obs
## [1] 200
## 
## $n.grps
## [1] 100

tmod<-aov(Y~as.factor(G.ID),data=ALL.GRP)
ICC1(tmod)

## [1] 0.2036817

ICC2(tmod)

## [1] 0.3384311

tmod<-aov(X~as.factor(G.ID),data=ALL.GRP)
ICC1(tmod)

## [1] 0.208023

ICC2(tmod)

## [1] 0.3444024

TDAT<-aggregate(ALL.GRP[,c("G.ID","X")],list(ALL.GRP$G.ID),mean)
ALL.GRP<-merge(ALL.GRP,TDAT[,c("G.ID","X")], by="G.ID",suffixes=c("",".G"))

tmod<-lme(fixed=Y~X+X.G, random=~1|G.ID, data=ALL.GRP)
knitr::kable(summary(tmod)$tTable)

	Value	Std.Error	DF	t-value	p-value
(Intercept)	0.1134809	0.0864520	99	1.3126456	0.1923373
X	-0.0526323	0.0995807	99	-0.5285392	0.5983080
X.G	0.2888299	0.1389321	98	2.0789288	0.0402359

Tutorial：Mixed-effect model (Simulation 2)

Ching-Fang Wu

Mon Dec 14 05:08:45 2020

1 Data Management

2 Table 6

3 Take-away point 6

4 Group-Mean Reliability