Hypotheses

The reading score gap is conceptualised as the difference between first-language English-speakers (E1; English is L1) and second-/additional-language English-speakers (EL; English is L2). It is operationalised as the regression coefficient for EL, which considers balanced bilinguals as E1.

Level 1 Hypotheses

  • H\(_1\): First-language English-speakers have a significantly higher reading score than ELLs (fixed effect of variable EL).
  • H\(_2\): The reading score gap is bigger for boys than for girls (interaction effect of Sex and EL).

Level 2 Hypotheses

  • H\(_3\): The reading score gap is bigger in schools with more ELs (fixed effect of pEL).
  • H\(_4\): The reading score gap is lower in schools who offer parent education programmes in English (fixed effect of ParentProgram).

Cross-level Interaction Hypothesis

  • H\(_5\): The offering of ESL programmes reduces the reading score gap between ELs and native English speakers; in other words, schools with a reading specialist experience a smaller difference between ELs and E1 students (interaction effect between EL and ReadingSpecialist).

Design

The study design is a multilevel design with 2 levels. Students (level 1) are nested within schools (level 2).

Sample Description

The reduced eclsk dataset contains data from 18174 children (9288 boys and 8847 girls) in 5th grade, attending 2975 different schools. After excluding X children due to missing data (specify), the final sample comprises 8067 children, 4082 boys and 3985 girls from 1920 different schools. At the time of reading testing, they were aged between 9.5558333 and 13.1258333 months (M = 11.0964148, SD = 0.367585),

Results

Descriptive Results

## # A tibble: 2 x 3
##      EL     n `Mean Reading Score`
##   <dbl> <int>                <dbl>
## 1     0  6758                 1.51
## 2     1  1309                 1.34

Model Building

First, I built an intercept-only model with random intercepts: m0. This model serves as the benchmark model, against which more complex models can be evaluated. \[ ReadingScore_{ij} = \gamma_{00} + u_{0j} + e_{ij} \]

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: ReadingScore ~ (1 | School)
##    Data: data
## 
##      AIC      BIC   logLik deviance df.resid 
##     5608     5629    -2801     5602     8064 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.5707 -0.6222  0.0112  0.6859  2.5805 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  School   (Intercept) 0.02456  0.1567  
##  Residual             0.10208  0.3195  
## Number of obs: 8067, groups:  School, 1920
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept) 1.453e+00  5.737e-03 1.163e+03   253.2   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The intra-class coefficient of m0, \(\rho\) = 0.1939095, tell us that 19.39 percent of the total variance observed is explained by level two groupings, i.e. by differences between schools.

Next, I built a model containing all individual-level (level 1) predictors, m1. This model includes EL (binary variable indicating whether or not the student is an EL), Sex, and ESL (a binary variable indicating whether or not the student takes part in an ESL program).

\[ ReadingScore_{ij} = \gamma_{00} + \gamma_{10}EL_{ij} + \gamma_{20}Sex_{ij} + \gamma_{30}ESL_{ij} + u_{0j} + e_{ij} \]

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: ReadingScore ~ EL + Sex + ESL + (1 | School)
##    Data: data
## 
##      AIC      BIC   logLik deviance df.resid 
##   5360.3   5402.2  -2674.2   5348.3     7949 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6176 -0.6192  0.0101  0.6826  2.4526 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  School   (Intercept) 0.01933  0.1390  
##  Residual             0.10191  0.3192  
## Number of obs: 7955, groups:  School, 1905
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)  1.468e+00  6.921e-03  2.184e+03 212.161  < 2e-16 ***
## EL          -1.493e-01  1.224e-02  7.360e+03 -12.202  < 2e-16 ***
## Sex          2.420e-02  7.505e-03  7.633e+03   3.225  0.00127 ** 
## ESL          3.948e-02  1.371e-02  7.893e+03   2.879  0.00400 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr) EL     Sex   
## EL  -0.229              
## Sex -0.525  0.007       
## ESL -0.119 -0.394 -0.026

Given some missing data for ESL, models m0 and m1 cannot be compared using the anova function, but other indicators suggest that m1 is the preferable model here.

Model m1a is a refined version of model m1, with the interaction effect of _EL*Sex_ (testing H\(_2\)). \[ ReadingScore_{ij} = \gamma_{00} + \gamma_{10}EL_{ij} + \gamma_{20}Sex_{ij} + \gamma_{30}ESL_{ij} + \gamma_{30}EL_{ij}*Sex_{ij} + u_{0j} + e_{ij} \]

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: ReadingScore ~ EL * Sex + ESL + (1 | School)
##    Data: data
## 
##      AIC      BIC   logLik deviance df.resid 
##   5362.3   5411.2  -2674.2   5348.3     7948 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6169 -0.6189  0.0100  0.6828  2.4518 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  School   (Intercept) 0.01933  0.1390  
##  Residual             0.10191  0.3192  
## Number of obs: 7955, groups:  School, 1905
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)  1.469e+00  7.102e-03  2.394e+03 206.810  < 2e-16 ***
## EL          -1.509e-01  1.582e-02  7.866e+03  -9.537  < 2e-16 ***
## Sex          2.370e-02  8.175e-03  7.558e+03   2.899  0.00375 ** 
## ESL          3.944e-02  1.372e-02  7.893e+03   2.875  0.00405 ** 
## EL:Sex       3.149e-03  2.057e-02  7.851e+03   0.153  0.87830    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##        (Intr) EL     Sex    ESL   
## EL     -0.315                     
## Sex    -0.559  0.256              
## ESL    -0.120 -0.291 -0.015       
## EL:Sex  0.224 -0.634 -0.397 -0.022

Model m1a does not fit better compared to m1, with \(\chi^2\)(1) = 0.0234464, p = 0.8783018 and the interaction effect _EL*Sex_ is non-signifiant. Thus, I continued with model m1.

## # Comparison of Model Performance Indices
## 
## Model |            Type |     AIC |     BIC | R2_conditional | R2_marginal |  ICC | RMSE
## ----------------------------------------------------------------------------------------
## m0    | lmerModLmerTest | 5608.01 | 5628.99 |           0.19 |        0.00 | 0.19 | 0.30
## m1    | lmerModLmerTest | 5360.34 | 5402.23 |           0.18 |        0.02 | 0.16 | 0.31

In a next step, I added all school-level (level 2) estimators, pEL and ParentProgram, resulting in m2 (with both level 1 and 2 predictors).

\[ ReadingScore_{ij} = \gamma_{00} + \gamma_{10}EL_{ij} + \gamma_{20}Sex_{ij} + \gamma_{30}ESL_{ij} + \gamma_{01}pEL_j + \gamma_{02}ParentProgram_j + \gamma_{03}ReadingSpecialist_j + u_{0j} + e_{ij} \]

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: 
## ReadingScore ~ EL + Sex + ESL + pEL + ParentProgram + ReadingSpecialist +  
##     (1 | School)
##    Data: data
## 
## REML criterion at convergence: 5046.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6380 -0.6253  0.0138  0.6938  2.5357 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  School   (Intercept) 0.01552  0.1246  
##  Residual             0.10255  0.3202  
## Number of obs: 7557, groups:  School, 1801
## 
## Fixed effects:
##                     Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)        1.453e+00  3.124e-02  1.204e+03  46.496  < 2e-16 ***
## EL                -1.173e-01  1.348e-02  7.538e+03  -8.705  < 2e-16 ***
## Sex                2.745e-02  7.686e-03  7.318e+03   3.572 0.000356 ***
## ESL                4.744e-02  1.409e-02  7.445e+03   3.367 0.000763 ***
## pEL               -2.601e-03  3.374e-04  1.574e+03  -7.710 2.22e-14 ***
## ParentProgram      2.401e-02  1.559e-02  1.191e+03   1.540 0.123821    
## ReadingSpecialist  7.352e-03  1.227e-02  1.083e+03   0.599 0.549034    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) EL     Sex    ESL    pEL    PrntPr
## EL          -0.045                                   
## Sex         -0.088  0.007                            
## ESL         -0.017 -0.361 -0.026                     
## pEL         -0.450 -0.324 -0.019 -0.035              
## ParentPrgrm -0.966  0.039 -0.032 -0.002  0.368       
## RedngSpclst -0.058  0.006 -0.006 -0.032 -0.070 -0.031

Significance testing showed a significant effect of pEL, but no main effects for ParentProgram or ReadingSpecialist, hence I removed the latter two predictors from the model, resulting in model m2a.

\[ ReadingScore_{ij} = \gamma_{00} + \gamma_{10}EL_{ij} + \gamma_{20}Sex_{ij} + \gamma_{30}ESL_{ij} + \gamma_{01}pEL\_School_j + u_{0j} + e_{ij} \]

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: ReadingScore ~ EL + Sex + ESL + pEL + (1 | School)
##    Data: data
## 
## REML criterion at convergence: 5189.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6419 -0.6244  0.0135  0.6934  2.5470 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  School   (Intercept) 0.01648  0.1284  
##  Residual             0.10211  0.3195  
## Number of obs: 7786, groups:  School, 1853
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)  1.500e+00  7.635e-03  1.530e+03 196.498  < 2e-16 ***
## EL          -1.128e-01  1.318e-02  7.776e+03  -8.559  < 2e-16 ***
## Sex          2.742e-02  7.563e-03  7.524e+03   3.626 0.000290 ***
## ESL          4.730e-02  1.379e-02  7.694e+03   3.429 0.000609 ***
## pEL         -2.829e-03  3.102e-04  1.745e+03  -9.123  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr) EL     Sex    ESL   
## EL  -0.028                     
## Sex -0.478  0.010              
## ESL -0.088 -0.357 -0.026       
## pEL -0.447 -0.362 -0.008 -0.039
## # Comparison of Model Performance Indices
## 
## Model |            Type |     AIC |     BIC | R2_conditional | R2_marginal |  ICC | RMSE
## ----------------------------------------------------------------------------------------
## m1    | lmerModLmerTest | 5360.34 | 5402.23 |           0.18 |        0.02 | 0.16 | 0.31
## m2a   | lmerModLmerTest | 5203.06 | 5251.79 |           0.18 |        0.05 | 0.14 | 0.31

Next, testing random slopes models, variable by variable. Models m3a, m3b, and m3c test random slopes for EL, Sex, and ESL, respectively.

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: ReadingScore ~ EL + Sex + ESL + pEL + (1 + EL | School)
##    Data: data
## 
## REML criterion at convergence: 5187.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6476 -0.6237  0.0121  0.6915  2.5712 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  School   (Intercept) 0.017459 0.13213       
##           EL          0.001924 0.04386  -0.50
##  Residual             0.101901 0.31922       
## Number of obs: 7786, groups:  School, 1853
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)  1.500e+00  7.688e-03  1.392e+03 195.068  < 2e-16 ***
## EL          -1.122e-01  1.322e-02  6.325e+02  -8.484  < 2e-16 ***
## Sex          2.725e-02  7.560e-03  7.511e+03   3.604 0.000315 ***
## ESL          4.845e-02  1.374e-02  4.710e+03   3.526 0.000427 ***
## pEL         -2.823e-03  3.089e-04  1.493e+03  -9.139  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr) EL     Sex    ESL   
## EL  -0.040                     
## Sex -0.474  0.011              
## ESL -0.091 -0.359 -0.026       
## pEL -0.442 -0.379 -0.009 -0.031
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: ReadingScore ~ EL + Sex + ESL + pEL + (1 + Sex | School)
##    Data: data
## 
## REML criterion at convergence: 5171.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6909 -0.6265  0.0227  0.6850  2.5934 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  School   (Intercept) 0.022798 0.15099       
##           Sex         0.001979 0.04448  -1.00
##  Residual             0.101561 0.31869       
## Number of obs: 7786, groups:  School, 1853
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)  1.497e+00  8.034e-03  1.056e+03 186.396  < 2e-16 ***
## EL          -1.125e-01  1.315e-02  7.755e+03  -8.557  < 2e-16 ***
## Sex          3.276e-02  7.653e-03  4.501e+03   4.280  1.9e-05 ***
## ESL          4.709e-02  1.374e-02  7.598e+03   3.427 0.000613 ***
## pEL         -2.820e-03  3.079e-04  1.750e+03  -9.159  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr) EL     Sex    ESL   
## EL  -0.028                     
## Sex -0.562  0.012              
## ESL -0.084 -0.357 -0.021       
## pEL -0.427 -0.364  0.007 -0.042
## convergence code: 0
## boundary (singular) fit: see ?isSingular
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: ReadingScore ~ EL + Sex + ESL + pEL + (1 + ESL | School)
##    Data: data
## 
## REML criterion at convergence: 5176.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6822 -0.6220  0.0109  0.6839  2.5084 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  School   (Intercept) 0.01716  0.1310        
##           ESL         0.02203  0.1484   -0.30
##  Residual             0.10042  0.3169        
## Number of obs: 7786, groups:  School, 1853
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)  1.500e+00  7.663e-03  1.385e+03 195.797  < 2e-16 ***
## EL          -1.133e-01  1.330e-02  6.929e+03  -8.520  < 2e-16 ***
## Sex          2.725e-02  7.546e-03  7.472e+03   3.611 0.000307 ***
## ESL          4.497e-02  1.518e-02  5.785e+02   2.962 0.003178 ** 
## pEL         -2.846e-03  3.115e-04  1.716e+03  -9.137  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr) EL     Sex    ESL   
## EL  -0.027                     
## Sex -0.475  0.009              
## ESL -0.107 -0.333 -0.022       
## pEL -0.445 -0.363 -0.008 -0.047
## # Comparison of Model Performance Indices
## 
## Model |            Type |     AIC |     BIC | R2_conditional | R2_marginal |  ICC | RMSE |   BF
## -----------------------------------------------------------------------------------------------
## m2a   | lmerModLmerTest | 5203.06 | 5251.79 |           0.18 |        0.05 | 0.14 | 0.31 |     
## m3a   | lmerModLmerTest | 5205.51 | 5268.15 |           0.18 |        0.05 | 0.14 | 0.31 | 0.00
## m3b   | lmerModLmerTest | 5189.93 | 5252.57 |           0.18 |        0.05 | 0.14 | 0.31 | 0.68
## m3c   | lmerModLmerTest | 5194.21 | 5256.85 |           0.19 |        0.05 | 0.15 | 0.30 | 0.08

Models with random slopes for variables, EL and ESL, showed to perform worse than odel m2a without random slopes. Only model m3b with a random slope for Sex performs better than model m2a, with \(\chi^2\)(2) = 17.15915, p = 1.8790482^{-4}.

Hence, I proceeded with model m3a, containing a random slope for Sex. The final model looks as follows:

\[ ReadingScore_{ij} = \gamma_{00} + \gamma_{10}EL_{ij} + \gamma_{20}Sex_{ij} + \gamma_{30}ESL_{ij} + \gamma_{01}pEL\_School_j + u_{2j}Sex_{2ij} + u_{0j} + e_{ij} \]

Evaluation of Hypotheses

Level 1 Hypotheses

  • H\(_1\): First-language English-speakers have a significantly higher reading score than ELs (fixed effect of variable EL). => EL remains a significant predictor throughout.

  • H\(_2\): The reading score gap is bigger for boys than for girls (interaction effect of Sex and EL). => The interaction effect _Sex*EL_ is not significant (see model m1a), hence I cannot conclude that the reading score gap is beigger for boys than for girls. Girls do, however, generally outperform boys in reading, as inidcated by the significant main effect of Sex.

Level 2 Hypotheses

  • H\(_3\): The reading score gap is bigger in schools with more ELs (fixed effect of pEL). => Model m2a provides evidence for this hypotheses.
  • H\(_4\): The reading score gap is lower in schools who offer parent education programmes in English (fixed effect of ParentProgram). => ParentProgram had no significant effect when entered into model m2, hence there is no evidence for a positive effect on students’ reading score through schools offering parent programs.

Cross-level Interaction Hypothesis

  • H\(_5\): The offering of ESL programmes reduces the reading score gap between ELs and native English speakers; in other words, schools with a reading specialist experience a smaller difference between ELs and E1 students (interaction effect between EL and ReadingSpecialist). => Could not be tested, given that ReadingSpecialist was not a significant predictor on level 2.