1 Longitudinal Data Set

1.1 Induction

A person-level data: each person has one record and multiple variables
A person-period data: each person has multiple records.

Our objective is to seek a model that embodies two types of research questions:

level-1 questions about within-person change
level-2 questions about between-person differences in change.

The level-1 component of the multilevel model, also known as the individual growth model, represents the change we expect each member of the population to experience during the time period under study. (each person has a model across the time period).

\(y_{ij} = \pi_{0i} + \pi_{1i}x_{ij} + \epsilon_{ij}\), where i and j represent individuals and occasions/time.

The above model assumes that a straight line adequately represents each person’s true change over time and any deviations from linearity observed in sample data result from random measurement error (\(\epsilon_{ij}\)).

From real data analysis, we could say the stochastic part of level-1 model (\(\epsilon_{ij}\)) is not just measurement errors, which violates the independence of residuals.

Generally, we assume \(\epsilon_{ij}\) ~ \(N(0, \sigma^2)\). This assumption may be less credible in longitudinal data. For level-1 model, residuals may be autocorrealted and the variance may not equal across time. Because the same person is measured on several occasions, any unexplained person-specific time-invariant effect in the residuals will create a correlation across occasions.

Level-2 model codifies the relationship between inter-individual differences and time-invariant characteristics of the individual; which simultaneously account for both the general patterns (between group differences in intercepts and slopes) and inter-individual patterns within groups.

Here, we assume z is binary variable and is a predictor for the individual change.

Level-2 sub-model is

\(\pi_{0i} = \gamma_{00} + \gamma_{01}z_i+\xi_{0i}\), \(\pi_{1i} = \gamma_{10} + \gamma_{11}z_i+\xi_{1i}\).

We call \(\gamma_{00}\), \(\gamma_{01}\), \(\gamma_{10}\) and \(\gamma_{11}\) as fixed effects.

For z=0, we have \(\pi_{0i} = \gamma_{00} +\xi_{0i}\) and \(\pi_{1i} = \gamma_{10}+\xi_{1i}\). \(\gamma_{00}\) represents the average true initial status and \(\gamma_{10}\) is the average true rate of change.

The multilevel model (z=0) is for addressing: what is the average true change in the population with z=0?

For z=1, we have \(\pi_{0i}=\gamma_{00} + \gamma_{01}+\xi_{0i}\), \(\pi_{1i} = \gamma_{10} + \gamma_{11}+\xi_{1i}\). \(\gamma_{01}\) represents the hypothesized difference in average true initial status between groups; \(\gamma_{11}\) is the difference in average true annual rate of change.

The multilevel model (z=1) is for addressing: what is the difference in the average of trajectory of the true change associate with z=1?

We permit the level-2 residuals to be correlated.

\(\xi_{0i}\) and \(\xi_{1i}\) are the deviations of the individual growth parameters from their population average, their population covariance summarizes the association between the true individual intercepts and slopes. The population covariance of the level-2 residuals \(\sigma_{01}\), summarizes the magnitude and direction of the association between true initial status and true annual rate of change, controlling for the z.

\(\sigma_{01}\) addresses: controlling for z=1,are true initial status and true rate of change related?

We assume \(\xi_{0i}\) and \(\xi_{1i}\) are bivariate normal with mean 0 and unknown variance.

1.2 Intraclass Correlation

The reason we cannot use ANOVA on multilevel data is the residuals are NOT independent.

ICC (Intraclass Correlation \(\rho\)): measure of the proportion of variation in the outcome variable that occurs between groups versus the total variation present.

\(\rho=\frac{\tau^2}{\tau^2 + \sigma^2}\), where \(\tau^2\) is the population variance between clusters and \(\sigma^2\) is the population variance within the cluster. Higher ICC indicates the total variation in the outcome is highly associated with the cluster memberships (a relatively strong relationship among the scores for 2 individuals from the same cluster). Another way to say this is individuals from the same cluster are more alike on the measured variable.

\(\sigma^2\) is the weighted average of within-cluster variances. \(\tau^2\) could also be viewed as the impact of the cluster on the dependent variable; and testing it for significance is equivalent to testing the cluster has no impact on the dependent variable.

Larger ICC values are indicative of a greater impact of clustering. Thus, as the ICC increases, we must be more cognizant of employing multilevel modeling strategies in data analysis.

In practice, if we find that \(\tau^2\) is not 0, we must be careful in describing the relationship between the independent and dependent variables, as it is not the same for all clusters.

1.3 Centering

Why? - reduce collinearity caused by including an interaction term in a model;

1.4 MLE and REML

A variant MLE known as restricted MLE has proven more accurate than MLE for estimating variance parameters.

As the number of level-2 cluster increases, the difference in value for MLE and REML estimates become very small.

1.5 Assumptions of Multilevel Model

level-2 residuals are independent between clusters;
level-2 intercepts and slopes are assumed to be independent of the level-1 residuals;
level-1 residuals are normally distributed and have constant variance;
level-2 intercepts and slopes have a multivariate normal distribution with constant covariance matrix.

2 2-level Models

Some notations:

basic lm: \(y=\beta_0 + \beta_1x+\epsilon\)
group-specific intercepts and slopes: \(y_{ij}=\beta_{0j} + \beta_{1j}x + \epsilon_{ij}\)
predicting the outcome only from the intercept: \(y_{ij}=\beta_{0j}+ \epsilon_{ij}\)
allowing the intercept differ across clusters, leads to the random intercept: \(\beta_{0j}=\gamma_{00} + U_{0j}\), \(\gamma_{00}\) is the average or general intercept value that holds across cluster and \(U_{0j}\) is group-specific effect on the intercept. \(\gamma_{00}\) is fixed since it remains constant across all cluster and \(U_{0j}\) is random because it varies from cluster to cluster.

Level-1 model: \(y_{ij}=\beta_{0j} + \beta_{1j}x + \epsilon_{ij}\); Level-2 model: \(\beta_{0j}=\gamma_{00} + U_{0j}\), \(\beta_{1j}=\gamma_{10}\)

2.1 Achieve Data

Random effect: school
Fixed effect: vocabulary scores

2.1.1 Model 1

No independent variable.

m1 <- lme(fixed = geread ~ 1, random = ~1|school, data = Achieve)
summary(m1)

## Linear mixed-effects model fit by REML
##   Data: Achieve 
##        AIC      BIC    logLik
##   46274.31 46296.03 -23134.15
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept) Residual
## StdDev:   0.6257119  2.24611
## 
## Fixed effects:  geread ~ 1 
##                Value  Std.Error    DF t-value p-value
## (Intercept) 4.306753 0.05497501 10160 78.3402       0
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -2.3229469 -0.6377948 -0.2137753  0.2849664  3.8811630 
## 
## Number of Observations: 10320
## Number of Groups: 160

The above result shows the ICC is 0.6257/(0.6257+2.2461)=0.218, which means the correlation of reading test scores among students within the same school is 0.22.

2.1.2 Model 2

m2 <- lme(fixed = geread ~ gevocab, random = ~1|school, data = Achieve)
summary(m2)

## Linear mixed-effects model fit by REML
##   Data: Achieve 
##       AIC      BIC   logLik
##   43145.2 43174.17 -21568.6
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept) Residual
## StdDev:   0.3158785  1.94074
## 
## Fixed effects:  geread ~ gevocab 
##                 Value  Std.Error    DF  t-value p-value
## (Intercept) 2.0233559 0.04930868 10159 41.03447       0
## gevocab     0.5128977 0.00837268 10159 61.25850       0
##  Correlation: 
##         (Intr)
## gevocab -0.758
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.0822506 -0.5734728 -0.2103488  0.3206692  4.4334337 
## 
## Number of Observations: 10320
## Number of Groups: 160

The above code random=~1|school indicates “1” stands for the intercept, and only random intercept model will used and that the random intercept varies within school. Each school will have a model geread ~ gevocab.

model 2 is better than model 1 based on AIC, BIC and LogLik. To estimate the proportion of variance in the outcome variable accounted for at each level of model, for level 1 model, we have the following:

\(R^2=1-\frac{0.3158785 +1.94074}{0.6257119+2.24611}=0.214\), which tells us that model 1 explains around 21% of the variance in the reading score.

For level-2 model, we have \(R_2^2=1-\frac{0.3158785 +1.94074/B}{0.6257119+2.24611/B}\), where B=10320(#. of rows)/160(unique schools)=64.5, which leads to \(R_2^2\) is 47.6%.

2.1.3 Model 3

The impact of independent variable on the dependent is allowed to vary across the Level 2 effects. In the current example, this would mean that we allow the impact of gevocab on geread to vary from one school to another.

m3 <- lme(fixed = geread ~ gevocab, random=~gevocab|school, data = Achieve)
summary(m3)

## Linear mixed-effects model fit by REML
##   Data: Achieve 
##        AIC     BIC    logLik
##   43004.85 43048.3 -21496.43
## 
## Random effects:
##  Formula: ~gevocab | school
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 0.5316531 (Intr)
## gevocab     0.1389443 -0.859
## Residual    1.9146626       
## 
## Fixed effects:  geread ~ gevocab 
##                 Value  Std.Error    DF  t-value p-value
## (Intercept) 2.0057064 0.06108786 10159 32.83314       0
## gevocab     0.5203554 0.01441548 10159 36.09699       0
##  Correlation: 
##         (Intr)
## gevocab -0.866
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.7102091 -0.5674433 -0.2074358  0.3176380  4.6774569 
## 
## Number of Observations: 10320
## Number of Groups: 160

intervals(m3)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                 lower      est.     upper
## (Intercept) 1.8859621 2.0057064 2.1254506
## gevocab     0.4920982 0.5203554 0.5486126
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: school 
##                               lower       est.      upper
## sd((Intercept))           0.4250700  0.5316531  0.6649611
## sd(gevocab)               0.1153701  0.1389443  0.1673356
## cor((Intercept),gevocab) -0.9178709 -0.8585096 -0.7615768
## 
##  Within-group standard error:
##    lower     est.    upper 
## 1.888327 1.914663 1.941365

The result shows that gevocab is significantly related to geread across schools. The estimated coefficient of gecovab is 0.52, which can be interpreted as the average impact of the predictor on the outcome across schools. 0.1389443 represents \(\tau^2_1\) and reflects the variation in coefficients across schools. This tells how the relationship between geread and gevocab varies from school to school. As before, we have \(\sigma^2=1.91466\) and \(\tau^2=0.53165\). Taken together, the largest source of random variation in geread is variation among students within schools, with lesser variation from differences in the conditional mean (intercept) and coefficient for gevocab across shcools.

2.1.4 Model 4

Adding age: if we like to see whether the age of a student impacts reading performance and want to know this effect to vary from one school to another.

m4 <- lme(fixed = geread ~ gevocab+age, random=~gevocab+age|school, data = Achieve)
summary(m4)

## Linear mixed-effects model fit by REML
##   Data: Achieve 
##        AIC      BIC    logLik
##   43015.77 43088.18 -21497.88
## 
## Random effects:
##  Formula: ~gevocab + age | school
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev      Corr         
## (Intercept) 0.492487773 (Intr) gevocb
## gevocab     0.137976606 -0.073       
## age         0.006387997 -0.649 -0.601
## Residual    1.914030473              
## 
## Fixed effects:  geread ~ gevocab + age 
##                  Value Std.Error    DF  t-value p-value
## (Intercept)  2.9614055 0.4151887 10158  7.13267  0.0000
## gevocab      0.5191496 0.0143563 10158 36.16175  0.0000
## age         -0.0088390 0.0038396 10158 -2.30208  0.0214
##  Correlation: 
##         (Intr) gevocb
## gevocab -0.095       
## age     -0.989 -0.032
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.6805531 -0.5687008 -0.2091071  0.3180583  4.6850735 
## 
## Number of Observations: 10320
## Number of Groups: 160

intervals(m4, which = "fixed")

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                   lower         est.        upper
## (Intercept)  2.14755358  2.961405543  3.775257505
## gevocab      0.49100837  0.519149589  0.547290809
## age         -0.01636526 -0.008838966 -0.001312672
## attr(,"label")
## [1] "Fixed effects:"

Age is also significantly associated with geread. Age has negative effect on gevocab. 0.006387997 indicates that the relationship of vocabulary on reading varies more across schools than dose the impact of age.

3 Three or more levels

We add a new layer classroom: we may assume that at least a portion of a students’ performance on a reading test is due to the classroom in which he or she learns (the quality of the teacher, the presence of distuptive students, etc…).

3.1 Simple Three-level Model

3.1.1 Model 1

The nested structure in nlme is defined as A/B where A is the higher data unit (school) and B is the lower unit (classroom).

m1 <- lme(fixed = geread ~ 1,random = ~1|school/class, data = dat)
summary(m1)

## Linear mixed-effects model fit by REML
##   Data: dat 
##     AIC      BIC logLik
##   46154 46182.97 -23073
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept)
## StdDev:   0.5583923
## 
##  Formula: ~1 | class %in% school
##         (Intercept) Residual
## StdDev:   0.5221676 2.201589
## 
## Fixed effects:  geread ~ 1 
##               Value  Std.Error   DF  t-value p-value
## (Intercept) 4.30806 0.05499164 9752 78.34027       0
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -2.3052003 -0.6289603 -0.2093700  0.3049085  3.8673234 
## 
## Number of Observations: 10320
## Number of Groups: 
##            school class %in% school 
##               160               568

intervals(m1)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                lower    est.    upper
## (Intercept) 4.200265 4.30806 4.415855
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: school 
##                     lower      est.     upper
## sd((Intercept)) 0.4702517 0.5583923 0.6630533
##   Level: class 
##                     lower      est.     upper
## sd((Intercept)) 0.4545912 0.5221676 0.5997895
## 
##  Within-group standard error:
##    lower     est.    upper 
## 2.170908 2.201589 2.232704

~1|school: model the intercept to vary across schools. Reading scores vary across schools.
~1|class %in% school: model the intercept to vary across classrooms within schools. Reading scores vary across classrooms.

3.1.2 Model 2

Adding more independent variables.

m2 <- lme(fixed = geread ~ gevocab + clenroll+cenroll, random = ~1|school/class, data = dat)
summary(m2)

## Linear mixed-effects model fit by REML
##   Data: dat 
##        AIC      BIC    logLik
##   43144.87 43195.56 -21565.43
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept)
## StdDev:   0.2766194
## 
##  Formula: ~1 | class %in% school
##         (Intercept) Residual
## StdDev:   0.3007871 1.922991
## 
## Fixed effects:  geread ~ gevocab + clenroll + cenroll 
##                  Value  Std.Error   DF  t-value p-value
## (Intercept)  1.6751266 0.20809605 9751  8.04978  0.0000
## gevocab      0.5075566 0.00842654 9751 60.23313  0.0000
## clenroll     0.0189860 0.00955860  407  1.98628  0.0477
## cenroll     -0.0000037 0.00000364  158 -1.02193  0.3084
##  Correlation: 
##          (Intr) gevocb clnrll
## gevocab  -0.124              
## clenroll -0.961 -0.062       
## cenroll  -0.134  0.025 -0.007
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.2211629 -0.5672782 -0.2079045  0.3183508  4.4736276 
## 
## Number of Observations: 10320
## Number of Groups: 
##            school class %in% school 
##               160               568

intervals(m2)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                     lower          est.        upper
## (Intercept)  1.267215e+00  1.675127e+00 2.083038e+00
## gevocab      4.910389e-01  5.075566e-01 5.240744e-01
## clenroll     1.956547e-04  1.898604e-02 3.777642e-02
## cenroll     -1.091387e-05 -3.721429e-06 3.471016e-06
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: school 
##                     lower      est.     upper
## sd((Intercept)) 0.2173971 0.2766194 0.3519749
##   Level: class 
##                     lower      est.     upper
## sd((Intercept)) 0.2409209 0.3007871 0.3755294
## 
##  Within-group standard error:
##    lower     est.    upper 
## 1.896210 1.922991 1.950151

From the AIC/BIC, this model is better than the previous one;
Though the estimates for random intercepts of classroom nested in school and school decreased in value from the null model, suggesting that when we account for the three fixed effects, some of the mean differences between schools and between classrooms are accounted for.
\(R^2=1 - \frac{1.922991+0.3007871}{2.201589+0.5221697}=0.184\): adding the three variables explains approximately 18% of the variance in the reading score above and beyond the null model.

3.1.3 Model 3

Adding an interaction term.

m3 <- lme(fixed = geread ~ gevocab + clenroll+cenroll+gevocab*cenroll, random = ~1|school/class, data = dat)
summary(m3)

## Linear mixed-effects model fit by REML
##   Data: dat 
##        AIC      BIC    logLik
##   43167.75 43225.69 -21575.88
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept)
## StdDev:   0.2740961
## 
##  Formula: ~1 | class %in% school
##         (Intercept) Residual
## StdDev:   0.2975919 1.923059
## 
## Fixed effects:  geread ~ gevocab + clenroll + cenroll + gevocab * cenroll 
##                      Value  Std.Error   DF  t-value p-value
## (Intercept)      1.7515430 0.20999286 9750  8.34096  0.0000
## gevocab          0.4899998 0.01168332 9750 41.94013  0.0000
## clenroll         0.0188007 0.00951172  407  1.97659  0.0488
## cenroll         -0.0000132 0.00000563  158 -2.33721  0.0207
## gevocab:cenroll  0.0000023 0.00000107 9750  2.18957  0.0286
##  Correlation: 
##                 (Intr) gevocb clnrll cenrll
## gevocab         -0.203                     
## clenroll        -0.949 -0.041              
## cenroll         -0.212  0.542  0.000       
## gevocab:cenroll  0.166 -0.693 -0.007 -0.766
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.1901563 -0.5682666 -0.2060729  0.3183307  4.4723839 
## 
## Number of Observations: 10320
## Number of Groups: 
##            school class %in% school 
##               160               568

intervals(m3)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                         lower          est.         upper
## (Intercept)      1.339913e+00  1.751543e+00  2.163172e+00
## gevocab          4.670981e-01  4.899998e-01  5.129015e-01
## clenroll         1.025107e-04  1.880075e-02  3.749899e-02
## cenroll         -2.427125e-05 -1.315470e-05 -2.038145e-06
## gevocab:cenroll  2.450820e-07  2.339608e-06  4.434134e-06
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: school 
##                     lower      est.    upper
## sd((Intercept)) 0.2149691 0.2740961 0.349486
##   Level: class 
##                     lower      est.     upper
## sd((Intercept)) 0.2375104 0.2975919 0.3728718
## 
##  Within-group standard error:
##    lower     est.    upper 
## 1.896272 1.923059 1.950225

The statistically significant interaction term indicates that the impact of student vocabulary score on reading achievement is dependent to some degree on the size of the school.
From the AIC/BIC, we can see that including the interaction term does not provide us a better fitting model. Hence, we should decide whether the primary goal is to develop an optimally fitting model or explore relationships in the data.
\(R^2=1 - \frac{1.923+0.0.2976}{2.201589+0.5221697}=0.185\): which is very similar to the model without the interaction term.

3.1.4 Model with More than 3 levels

corp is like district.

m4 <- lme(fixed = geread ~ 1, random = ~1|corp/school/class, data = dat)
summary(m4)

## Linear mixed-effects model fit by REML
##   Data: dat 
##        AIC      BIC    logLik
##   46113.22 46149.43 -23051.61
## 
## Random effects:
##  Formula: ~1 | corp
##         (Intercept)
## StdDev:   0.4209979
## 
##  Formula: ~1 | school %in% corp
##         (Intercept)
## StdDev:    0.295833
## 
##  Formula: ~1 | class %in% school %in% corp
##         (Intercept) Residual
## StdDev:   0.5247746 2.201587
## 
## Fixed effects:  geread ~ 1 
##               Value  Std.Error   DF  t-value p-value
## (Intercept) 4.32583 0.07197848 9752 60.09894       0
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -2.2995234 -0.6304742 -0.2130696  0.3028624  3.9448255 
## 
## Number of Observations: 10320
## Number of Groups: 
##                        corp            school %in% corp 
##                          59                         160 
## class %in% school %in% corp 
##                         568

intervals(m4)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                lower    est.    upper
## (Intercept) 4.184738 4.32583 4.466923
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: corp 
##                    lower      est.     upper
## sd((Intercept)) 0.321723 0.4209979 0.5509065
##   Level: school 
##                     lower     est.     upper
## sd((Intercept)) 0.2003532 0.295833 0.4368144
##   Level: class 
##                     lower      est.     upper
## sd((Intercept)) 0.4578135 0.5247746 0.6015295
## 
##  Within-group standard error:
##    lower     est.    upper 
## 2.170912 2.201587 2.232695

3.1.5 Random coefficient model with more than 3 levels

m5 <- lme(fixed = geread ~ gevocab + gender, random = ~gender|school/class, data = dat)
summary(m5)

## Linear mixed-effects model fit by REML
##   Data: dat 
##        AIC      BIC    logLik
##   43127.93 43200.35 -21553.97
## 
## Random effects:
##  Formula: ~gender | school
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 0.2207298 (Intr)
## gender      0.1101379 -0.019
## 
##  Formula: ~gender | class %in% school
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev      Corr  
## (Intercept) 0.302941367 (Intr)
## gender      0.004914493 -0.022
## Residual    1.922517135       
## 
## Fixed effects:  geread ~ gevocab + gender 
##                 Value  Std.Error   DF  t-value p-value
## (Intercept) 2.0150114 0.07570370 9750 26.61708  0.0000
## gevocab     0.5091256 0.00840838 9750 60.54976  0.0000
## gender      0.0175517 0.03929581 9750  0.44666  0.6551
##  Correlation: 
##         (Intr) gevocb
## gevocab -0.527       
## gender  -0.758  0.039
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.2117003 -0.5676457 -0.2072293  0.3161412  4.4474380 
## 
## Number of Observations: 10320
## Number of Groups: 
##            school class %in% school 
##               160               568

If we like to hypothesize that the relationship of gender and reading varies across classrooms but not across schools. We have the following model.

m6 <- lme(fixed = geread ~ gevocab + gender, random = list(school=~1, class=~gender), data = dat)
summary(m6)

## Linear mixed-effects model fit by REML
##   Data: dat 
##        AIC      BIC    logLik
##   43125.18 43183.11 -21554.59
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept)
## StdDev:   0.2737238
## 
##  Formula: ~gender | class %in% school
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 0.3623693 (Intr)
## gender      0.1651168 -0.563
## Residual    1.9215122       
## 
## Fixed effects:  geread ~ gevocab + gender 
##                 Value  Std.Error   DF  t-value p-value
## (Intercept) 2.0128841 0.07726503 9750 26.05168  0.0000
## gevocab     0.5090473 0.00841459 9750 60.49582  0.0000
## gender      0.0190565 0.03880626 9750  0.49107  0.6234
##  Correlation: 
##         (Intr) gevocb
## gevocab -0.516       
## gender  -0.768  0.039
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.2117225 -0.5676842 -0.2072084  0.3182780  4.4324392 
## 
## Number of Observations: 10320
## Number of Groups: 
##            school class %in% school 
##               160               568

in the random list, the first grouping variable should be the higher level unit (school);

4 Chapter 5 Longitudinal Data Using Multilevel Models

Level 1 model: \(Y_{it}=\pi_{0i} + \pi_{1i}T_{it}+ \pi_{2i}X_{it}+\epsilon_{it}\) \(\pi_{0i} = \beta_{00} + \beta_{01}Z_i+ r_{0i}\)

Level 2 model: \(\pi_{1i} = \beta_{10} + r{1i}\) \(\pi_{2i} = \beta_{20} + r_{2i}\)

\(Y_{it}\) is the outcome variable for individual i at time t;
\(T_{it}\) is the time predictor variable: the only one to make multilevel model longitudinal; it is often worthwhile to rescale it so that the first measurement occasion is the zero point;
\(X_{it}\) is the time-varying predictor;
\(Z_{it}\) is the time-invariant predictor: measured at only one point and the value does not change across measurement occaions (ex. gender).

Now, we have time predictor, time-varying predictor and time-invariant predictors.

Time-varying predictors should appear at level 1 since they are associated with specific measurements, and time-invariant predictors should appear at level 2 or higher because they are associated with individuals across all measurement conditions.

4.1 Example: LANG

4.1.1 Background

total language achievement scores measured over 6 occasions (outcome variable);
4 language subtest scores

Using stack function in R to transform data to person-period format, shows as the following code:

Lang <- read.csv("Language.csv", header = TRUE)
LangPP <- data.frame(ID=Lang$ID, school = Lang$school, Process = Lang$Process, Application = Lang$Application, Grammar = Lang$Grammar, Mechanics = Lang$Goal4RitScoref08, stack(Lang, select = LangScore1:LangScore6))
names(LangPP)[c(7)] <- "Language"

The above code creates a new time variable named “ind”.

Change time variable as continuous numeric variable as follows:

LangPP$Time <- car::Recode(LangPP$ind, '"LangScore1"=0; "LangScore2"=1; "LangScore3"=2; "LangScore4"=3; "LangScore5"=4; "LangScore6"=5;', as.factor=FALSE)

head(LangPP)

##   ID school Process Application Grammar Mechanics Language        ind Time
## 1  1      4     231         217     226       222      224 LangScore1    0
## 2  2      4     213         209     204       220      211 LangScore1    0
## 3  3      4     226         227     220       222      223 LangScore1    0
## 4  4      4     211         223     214       203      213 LangScore1    0
## 5  5      4     218         209     219       217      216 LangScore1    0
## 6  6      4     183         180     188       180      183 LangScore1    0

4.1.2 Fitting Model with nmle

4.1.2.1 M1

m1 <- lme(fixed = Language ~ Time, random = ~1|ID, data = LangPP)
summary(m1)

## Linear mixed-effects model fit by REML
##   Data: LangPP 
##        AIC      BIC    logLik
##   135173.6 135204.9 -67582.82
## 
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    15.23617 7.526427
## 
## Fixed effects:  Language ~ Time 
##                 Value  Std.Error    DF  t-value p-value
## (Intercept) 197.21573 0.29356329 15189 671.7997       0
## Time          3.24619 0.03264194 15189  99.4483       0
##  Correlation: 
##      (Intr)
## Time -0.278
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -6.40395610 -0.49863770  0.03877858  0.56691711  4.86362966 
## 
## Number of Observations: 18228
## Number of Groups: 3038

4.1.2.2 M2

Adding predictors to the model is the same way as previous stated, whether they are time varying or time invariant.

nd <- LangPP[complete.cases(LangPP), ]
m2 <- lme(fixed = Language ~ Time + Grammar, random = ~1|ID, data = nd)
summary(m2)

## Linear mixed-effects model fit by REML
##   Data: nd 
##        AIC      BIC    logLik
##   130031.1 130070.2 -65010.56
## 
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:     6.01123 7.526131
## 
## Fixed effects:  Language ~ Time + Grammar 
##                Value Std.Error    DF   t-value p-value
## (Intercept) 73.70355 1.0913675 15179  67.53321       0
## Time         3.24548 0.0326514 15179  99.39795       0
## Grammar      0.63089 0.0055231  3034 114.22629       0
##  Correlation: 
##         (Intr) Time  
## Time    -0.075       
## Grammar -0.991  0.000
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -6.62860687 -0.52600469  0.03741845  0.57880493  4.67614317 
## 
## Number of Observations: 18216
## Number of Groups: 3036

Grammar is also statistically significantly related to language test scores, meaning that measurement occasions revealing higher Grammar scores also demonstrated higher Language scores.

4.1.2.3 M3

To allow the growth rate to vary randomly across individuals:

m3 <- lme(fixed = Language ~ Time + Grammar, random = ~Time|ID, data = nd)
summary(m3)

## Linear mixed-effects model fit by REML
##   Data: nd 
##        AIC      BIC    logLik
##   128617.1 128671.7 -64301.54
## 
## Random effects:
##  Formula: ~Time | ID
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 4.645126 (Intr)
## Time        1.792940 0.026 
## Residual    6.737546       
## 
## Fixed effects:  Language ~ Time + Grammar 
##                Value Std.Error    DF   t-value p-value
## (Intercept) 54.47082 1.0025930 15179  54.32994       0
## Time         3.24548 0.0437406 15179  74.19834       0
## Grammar      0.72912 0.0050825  3034 143.45691       0
##  Correlation: 
##         (Intr) Time  
## Time    -0.047       
## Grammar -0.993  0.000
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -5.316263678 -0.523957279 -0.001213543  0.536332069  5.032113994 
## 
## Number of Observations: 18216
## Number of Groups: 3036

intervals(m3)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                  lower       est.      upper
## (Intercept) 52.5056194 54.4708223 56.4360252
## Time         3.1597459  3.2454828  3.3312197
## Grammar      0.7191527  0.7291182  0.7390837
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: ID 
##                            lower       est.      upper
## sd((Intercept))       4.43512829 4.64512564 4.86506607
## sd(Time)              1.72057864 1.79294012 1.86834487
## cor((Intercept),Time) 0.01112055 0.02613533 0.04113833
## 
##  Within-group standard error:
##    lower     est.    upper 
## 6.655244 6.737546 6.820867

Time is significant, indicates that growth rate in language over the 6 time points do differ across individuals in the sample.

4.1.2.4 M4

Adding a third level of data.

m4 <- lme(fixed = Language ~ Time + Grammar, random = ~1|school/ID, data = nd)
summary(m4)

## Linear mixed-effects model fit by REML
##   Data: nd 
##      AIC      BIC logLik
##   129906 129952.9 -64947
## 
## Random effects:
##  Formula: ~1 | school
##         (Intercept)
## StdDev:    1.864035
## 
##  Formula: ~1 | ID %in% school
##         (Intercept) Residual
## StdDev:    5.781975 7.526131
## 
## Fixed effects:  Language ~ Time + Grammar 
##                Value Std.Error    DF   t-value p-value
## (Intercept) 74.44782 1.2133745 15179  61.35601       0
## Time         3.24548 0.0326514 15179  99.39795       0
## Grammar      0.62608 0.0059030  3000 106.06152       0
##  Correlation: 
##         (Intr) Time  
## Time    -0.067       
## Grammar -0.951  0.000
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -6.66711402 -0.52788280  0.03525493  0.58085294  4.67186329 
## 
## Number of Observations: 18216
## Number of Groups: 
##         school ID %in% school 
##             35           3036

Comparing with and without school with ANOVA

anova(m2,m4)

##    Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## m2     1  5 130031.1 130070.2 -65010.56                        
## m4     2  6 129906.0 129952.9 -64947.00 1 vs 2 127.1166  <.0001

Including school leads to better model fit.

4.2 Changing Covariance Structure

\(\bullet\) The corAR1 error structure is a first-order autoregressive error structure for situations when time is measured in fixed intervals \(\bullet\) The corCAR1 error structure is a first-order autoregressive error structure for situations when time is measured in varying intervals \(\bullet\) The corARMA error structure is an error structure that incorporates both an autoregressive component, and a moving average process

If we believe that the relationship among the longitudinal measures follow an autoregressive process we should use the following code:

m5 <- lme(fixed = Language ~ Time, random = ~1|ID, correlation = corAR1(), data = nd)
summary(m5)

## Linear mixed-effects model fit by REML
##   Data: nd 
##        AIC    BIC    logLik
##   134812.9 134852 -67401.47
## 
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    15.12726 7.826729
## 
## Correlation Structure: AR(1)
##  Formula: ~1 | ID 
##  Parameter estimate(s):
##      Phi 
## 0.187954 
## Fixed effects:  Language ~ Time 
##                 Value  Std.Error    DF  t-value p-value
## (Intercept) 196.93995 0.29731496 15179 662.3950       0
## Time          3.32366 0.03678136 15179  90.3625       0
##  Correlation: 
##      (Intr)
## Time -0.309
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -6.04160440 -0.47534079  0.05309946  0.56304378  4.68692543 
## 
## Number of Observations: 18216
## Number of Groups: 3036

intervals(m5)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                 lower       est.      upper
## (Intercept) 196.35718 196.939953 197.522726
## Time          3.25156   3.323656   3.395752
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: ID 
##                    lower     est.    upper
## sd((Intercept)) 14.72836 15.12726 15.53697
## 
##  Correlation structure:
##         lower     est.     upper
## Phi 0.1646125 0.187954 0.2110851
## attr(,"label")
## [1] "Correlation structure:"
## 
##  Within-group standard error:
##    lower     est.    upper 
## 7.718620 7.826729 7.936352

The point estimate for the autocorrelation is 0.188 with 95% CI (0.165, 0.211), meaning this autocorrelation is significantly different from 0. This result indicates a positive relationship between adjacent pairs of scores so that a relatively higher score at one point in time is associated with a relatively higher socre at the next point in time.

When adding an error structure to a random coefficients model, the random coefficients structure must be specified in the syntax for specifying the correlation as follows:

m6 <- lme(fixed = Language ~ Time, random = ~Time|ID, correlation = corAR1(form = ~Time|ID), data = nd)
summary(m6)

## Linear mixed-effects model fit by REML
##   Data: nd 
##        AIC      BIC   logLik
##   133477.4 133532.1 -66731.7
## 
## Random effects:
##  Formula: ~Time | ID
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 18.405189 (Intr)
## Time         1.876833 -0.729
## Residual     6.567955       
## 
## Correlation Structure: AR(1)
##  Formula: ~Time | ID 
##  Parameter estimate(s):
##         Phi 
## -0.08333152 
## Fixed effects:  Language ~ Time 
##                 Value Std.Error    DF  t-value p-value
## (Intercept) 197.34089 0.3439154 15179 573.8064       0
## Time          3.21335 0.0436252 15179  73.6582       0
##  Correlation: 
##      (Intr)
## Time -0.677
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -5.79781447 -0.49041475  0.01559593  0.52065744  5.17467096 
## 
## Number of Observations: 18216
## Number of Groups: 3036

intervals(m6)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                  lower       est.      upper
## (Intercept) 196.666771 197.340887 198.015003
## Time          3.127841   3.213352   3.298863
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: ID 
##                            lower       est.      upper
## sd((Intercept))       17.9215268 18.4051892 18.9019047
## sd(Time)               1.7972087  1.8768326  1.9599842
## cor((Intercept),Time) -0.7527637 -0.7290973 -0.7035486
## 
##  Correlation structure:
##          lower        est.      upper
## Phi -0.1136616 -0.08333152 -0.0528463
## attr(,"label")
## [1] "Correlation structure:"
## 
##  Within-group standard error:
##    lower     est.    upper 
## 6.473317 6.567955 6.663977

4.3 Benefits of Multilevel Modeling

allows the simultaneous modeling of both intra-individual change (how an individual change over time) and inter-individual change (differences in temporal change across individuals).
Utilize available data from incomplete observations, not reducing sample size dramatically as other approaches and not requiring complex techniques for handling missing data.

5 Plots

5.1 dotplot

Using package lattice.

dat <- Achieve[Achieve$corp == 940 & Achieve$school==767, ]
dotplot(class ~ geread, data = dat, jitter.y=TRUE)

We could re-order the classroom with ascending order of geread.

dotplot(reorder(class, geread)~geread, data = dat)

Classroom 3 is more homogeneous (smaller variance) and lower performing (smaller mean) than classroom 2 and 4.

One method for visualizing such large and complex data would be to focus on a higher level of data aggregation.

Achieve <- cbind(Achieve, Classroom_unique = paste(Achieve$corp, Achieve$school, Achieve$class, sep=""))
Achieve.class <- aggregate(Achieve, by = list(Achieve$Classroom_unique), FUN = mean)

dotplot(reorder(corp, geread)~geread, data = Achieve.class, jitter.y=TRUE)

The above aggregating is just for plot, NOT for modeling the data.

Mixed Effect Model

Yi Huang

22 March, 2022