These data are a subset of the NELS-88 data (National Education Longitudinal Study of 1988). The data set contains information on students’ performance on a math test and 14 other variables.
In this section, we load the data set, select variables of interest, and examine the first 6 lines of the data frame object.
# load the data from the package
data(school23, package="influence.ME")
# save a copy with only selected variables
dta <- school23[, c("school.ID", "SES", "mean.SES", "math")]
# show first 6 lines
head(dta)
school.ID SES mean.SES math
1 6053 0.85 0.699773 50
2 6053 0.43 0.699773 43
3 6053 -0.59 0.699773 50
4 6053 1.02 0.699773 49
5 6053 0.84 0.699773 62
6 6053 1.32 0.699773 43
We draw a scatter diagram of the math scores against mean school SES and add the regression line.
# ggplot2 provides ggplot() and the layers used in the plots
library(ggplot2)
ggplot(dta, aes(mean.SES, math))+
geom_point(alpha=.5)+
stat_smooth(method='lm', formula=y~x, se=TRUE)+
labs(x="Mean school SES",
y="Math score")+
theme_minimal()
We draw a scatter diagram of the math scores against individual SES and add the regression line by school.
ggplot(dta, aes(SES, math, group=school.ID))+
geom_point(alpha=.5)+
stat_smooth(method='lm', formula=y~x, se=FALSE,
col=1, size=rel(.5))+
labs(x="SES",
y="Math score")+
theme_minimal()
- Math scores differ with students' socioeconomic status (SES): the higher the SES, the higher the math score.
m0 <- lm(math ~ mean.SES, data=dta)
summary(m0)
Call:
lm(formula = math ~ mean.SES, data = dta)
Residuals:
Min 1Q Median 3Q Max
-23.174 -7.384 0.165 7.577 23.005
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 51.733 0.416 124 <2e-16
mean.SES 8.076 0.671 12 <2e-16
Residual standard error: 9.47 on 517 degrees of freedom
Multiple R-squared: 0.219, Adjusted R-squared: 0.218
F-statistic: 145 on 1 and 517 DF, p-value: <2e-16
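As a rough illustration of what the fitted line implies, the sketch below plugs the two coefficients printed above into the regression equation for three values of mean school SES. The values -1, 0, and 1 are chosen only for illustration and are not taken from the data.

```r
# Coefficients copied from the summary(m0) output
b0 <- 51.733   # intercept: expected math score when mean.SES = 0
b1 <- 8.076    # slope: change in math score per unit of mean.SES

# Illustrative values of mean school SES (an assumption, not observed schools)
mean_ses <- c(-1, 0, 1)

# Predicted math score on the fitted line
pred <- b0 + b1 * mean_ses
data.frame(mean_ses, pred)
```

With the model object available, `predict(m0, newdata = data.frame(mean.SES = mean_ses))` should give the same values up to rounding of the printed coefficients.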
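- Within most schools, differences in students' SES (centered at the school mean) have no significant effect on math scores. Of the 18 schools displayed, only five (6327, 7930, 24371, 24725, and 47583) have slopes that are significant at the 0.05 level.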
m1 <- nlme::lmList(math ~ I(SES - mean.SES) | school.ID, data=dta)
summary(m1)
Call:
Model: math ~ I(SES - mean.SES) | school.ID
Data: dta
Coefficients:
(Intercept)
Estimate Std. Error t value Pr(>|t|)
6053 56.3182 1.29799 43.3889 5.19617e-167
6327 57.3750 3.04405 18.8482 1.60537e-59
6467 56.6000 3.85045 14.6996 1.47370e-40
7194 48.4583 1.75748 27.5726 1.73816e-100
7472 45.7391 1.79528 25.4774 9.01320e-91
7474 53.9412 2.08820 25.8314 2.00944e-92
7801 50.0909 1.83563 27.2881 3.54682e-99
7829 42.1500 1.92523 21.8935 6.82718e-74
7930 53.2500 1.75748 30.2990 7.23805e-113
24371 48.3500 1.92523 25.1139 4.51450e-89
24725 43.5455 1.83563 23.7223 1.55935e-82
25456 49.8636 1.83563 27.1643 1.32091e-98
25642 46.4000 1.92523 24.1011 2.56608e-84
26537 56.3750 2.15247 26.1909 4.26485e-94
46417 55.6957 1.79528 31.0233 4.24448e-116
47583 51.0500 1.92523 26.5164 1.31353e-95
54344 40.9474 1.97524 20.7303 2.17688e-68
62821 62.8209 1.05186 59.7234 1.94763e-222
[ reached getOption("max.print") -- omitted 5 rows ]
I(SES - mean.SES)
Estimate Std. Error t value Pr(>|t|)
6053 0.1515565 2.22314 0.0681722 0.94567734
6327 17.8574119 6.10499 2.9250526 0.00360947
6467 7.7373472 5.23465 1.4781028 0.14004584
7194 2.1019533 4.89361 0.4295305 0.66773279
7472 3.7020715 3.63722 1.0178307 0.30927876
7474 5.6797646 3.51626 1.6152861 0.10691555
7801 -1.2092983 3.49518 -0.3459905 0.72950370
7829 -1.0588653 3.11374 -0.3400626 0.73396038
7930 5.4743511 2.19085 2.4987335 0.01280196
24371 6.2254948 2.12279 2.9327005 0.00352332
24725 6.8469661 2.51820 2.7189945 0.00678882
25456 4.7741287 3.47844 1.3724921 0.17056074
25642 0.0793474 2.69810 0.0294087 0.97655107
26537 2.4715001 4.91554 0.5027936 0.61534339
46417 5.0081506 3.10761 1.6115760 0.10772127
47583 7.8025194 2.92725 2.6654820 0.00795080
54344 3.5927497 3.06511 1.1721445 0.24172909
62821 -1.2538648 2.31096 -0.5425741 0.58767859
[ reached getOption("max.print") -- omitted 5 rows ]
Residual standard error: 8.60987 on 473 degrees of freedom
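The predictor `I(SES - mean.SES)` is a student's SES expressed as a deviation from the school mean. The sketch below shows the idea on a small made-up data frame; the school labels and SES values are hypothetical, and it is an assumption that `mean.SES` in `school23` corresponds to a school-level average of this kind.

```r
# Toy data (hypothetical values, not taken from school23)
toy <- data.frame(
  school.ID = c("A", "A", "A", "B", "B", "B"),
  SES       = c(-0.5, 0.0, 0.5, 0.5, 1.0, 1.5)
)

# School mean of SES, repeated on every row of that school
toy$mean.SES <- ave(toy$SES, toy$school.ID)

# Within-school deviation: the quantity used as I(SES - mean.SES)
toy$SES.within <- toy$SES - toy$mean.SES
toy
```

After centering, both toy schools have deviations of -0.5, 0, and 0.5, so the slope in each school describes only within-school differences in SES.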
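- For each one-unit increase in school mean SES, measured relative to the overall mean of school SES, the math score increases by 7.16 points. The model takes this effect to be the same across schools.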
dta$gm <- mean(dta$mean.SES)
m2 <- lme4::lmer(math ~ I(mean.SES-gm) + (1 | school.ID), data=dta)
print(summary(m2), corr=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ I(mean.SES - gm) + (1 | school.ID)
Data: dta
REML criterion at convergence: 3779.1
Scaled residuals:
Min 1Q Median 3Q Max
-2.6463 -0.7322 -0.0112 0.7252 2.6772
Random effects:
Groups Name Variance Std.Dev.
school.ID (Intercept) 9.89 3.15
Residual 81.36 9.02
Number of obs: 519, groups: school.ID, 23
Fixed effects:
Estimate Std. Error t value
(Intercept) 51.480 0.795 64.75
I(mean.SES - gm) 7.163 1.397 5.13
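One way to read the random-effects table is as a split of the unexplained variance into a between-school part and a within-school part. The sketch below computes that share from the two variances printed above; it is a residual intraclass correlation (conditional on school mean SES), which is an added illustration rather than something reported in the output.

```r
# Variance components copied from the summary(m2) output
var_school <- 9.89    # school.ID (Intercept) variance
var_resid  <- 81.36   # Residual variance

# Share of the remaining variance that lies between schools
icc <- var_school / (var_school + var_resid)
round(icc, 3)
```

With the model object available, the same variances can be read from `as.data.frame(lme4::VarCorr(m2))` instead of being typed in by hand.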
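- For each one-unit increase in a student's SES relative to the school mean SES, the math score increases by 3.88 points. The model takes this effect to be the same across schools.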
m3 <- lme4::lmer(math ~ I(SES - mean.SES) + (1 | school.ID), data=dta)
print(summary(m3), corr=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ I(SES - mean.SES) + (1 | school.ID)
Data: dta
REML criterion at convergence: 3758.7
Scaled residuals:
Min 1Q Median 3Q Max
-2.6666 -0.7611 -0.0381 0.7621 2.6289
Random effects:
Groups Name Variance Std.Dev.
school.ID (Intercept) 26.3 5.13
Residual 75.2 8.67
Number of obs: 519, groups: school.ID, 23
Fixed effects:
Estimate Std. Error t value
(Intercept) 50.76 1.15 44.16
I(SES - mean.SES) 3.88 0.61 6.37
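Written out, the model fitted as m3 can be sketched as a random-intercept model for student i in school j. The symbols below (gamma, u, e, tau, sigma) are notation introduced here for illustration, not names used in the output.

```latex
\text{math}_{ij} = \gamma_0 + \gamma_1\,(\text{SES}_{ij} - \overline{\text{SES}}_{j}) + u_j + e_{ij},
\qquad u_j \sim N(0, \tau^2), \quad e_{ij} \sim N(0, \sigma^2)
```

Matching this to the output above gives approximately \gamma_0 = 50.76, \gamma_1 = 3.88, \tau^2 = 26.3, and \sigma^2 = 75.2. The single slope \gamma_1 is shared by all schools, while u_j shifts each school's intercept.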
Interpret the following plot.
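Math scores differ with SES. Within a school, the fitted lines are parallel: the model constrains the effect of individual SES on math scores to be the same in every school. Schools with different mean SES differ in their students' math scores, and students with the same SES have different predicted math scores depending on the school they attend.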
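# lme4 provides fortify.merMod(); magrittr provides the %>% pipe
library(lme4)
library(magrittr)
# add fitted values (.fitted) from m3 to the data, then plot them by school
fortify.merMod(m3) %>%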
ggplot() +
aes(SES, .fitted, group=school.ID)+
geom_point(aes(SES, math), alpha=.5)+
stat_smooth(method='lm', formula=y~x, se=FALSE,
col=1, size=rel(.5))+
labs(x="SES",
y="Math score")+
theme_minimal()
Kreft, I., & De Leeuw, J. (1998). Introducing Multilevel Modeling. Sage Publications.