The dataset is a reduced version of the Trends in International Mathematics and Science Study (TIMSS). TIMSS and TIMSS Advanced are sponsored by the International Association for the Evaluation of Educational Achievement (IEA) and conducted in the United States by the National Center for Education Statistics (NCES). TIMSS provides reliable and timely trend data on the mathematics and science achievement of U.S. students compared to that of students in other countries. TIMSS data have been collected from students at grades 4 and 8 every 4 years since 1995, with the United States participating in every administration of TIMSS.

1. List all the variables at level 1 and level 2

Table 1 presents descriptive information for the variables included in the analysis. The dataset includes a total of 7097 students, or micro-units (level 1), nested in 146 schools, or macro-units (level 2). Variables included at level 1 are: grade, gender, science, and math (the outcome).

Variables for level 2 are: shortages (school shortages of instructional materials).

Table 1. Descriptive statistics TIMSS

| Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|
| grade [factor] | 1. 3; 2. 4 | 2335 (32.9%); 4762 (67.1%) | 0 (0.0%) |
| gender [factor] | 1. boy; 2. girl | 3497 (49.3%); 3600 (50.7%) | 0 (0.0%) |
| science [numeric] | Mean (sd): 150.4 (9.9); min ≤ med ≤ max: 103.4 ≤ 150 ≤ 185; Q1 - Q3: 144.3 - 156.8 | 186 distinct values | 0 (0.0%) |
| math [numeric] | Mean (sd): 151.1 (10.1); min ≤ med ≤ max: 104.3 ≤ 150.9 ≤ 189; Q1 - Q3: 144.3 - 157.3 | 195 distinct values | 0 (0.0%) |
| shortages [integer] | Mean (sd): 0.6 (0.9); min ≤ med ≤ max: 0 ≤ 0 ≤ 3; Q1 - Q3: 0 - 1 | 0: 4087 (57.6%); 1: 1879 (26.5%); 2: 790 (11.1%); 3: 341 (4.8%) | 0 (0.0%) |

Generated by summarytools 1.0.0 (R version 4.1.2)
2021-11-26

2. Hierarchical Linear Models

2.1 Null model (model5.0)

First, we estimate the empty or null model, which does not contain any explanatory variables and in which student performance in mathematics (\(Y_{ij}\)) is the sum of a general mean (\(\gamma_{00}\)), a random effect at the group (school) level (\(U_{0j}\)), and a random effect at the individual level (\(R_{ij}\)):

Level 1 \[ \begin{aligned} Y_{ij} = \beta_{0} + R_{ij} \end{aligned} \] Level 2 \[ \begin{aligned} \beta_{0} = \gamma_{00} + U_{0j} \end{aligned} \]

Full Model \[ \begin{aligned} Y_{ij} = \gamma_{00} + U_{0j} + R_{ij} \end{aligned} \]

The random effects \(U_{0j}\) and \(R_{ij}\) are assumed to be independent, with mean 0 and variances var(\(R_{ij}\)) = \(\sigma^2\) and var(\(U_{0j}\)) = \(\tau_{0}^2\), respectively. The total variance of \(Y\) can then be decomposed as the sum of the group and individual variances:

\[ \begin{aligned} var(Y_{ij}) = var(U_{0j}) + var(R_{ij}) = \tau_{0}^2 + \sigma^2 \end{aligned} \]

According to the results (see Table 2), the general mean of mathematics performance in TIMSS is 150.869, while \(\sigma^2\) = 82.02 and \(\tau_{0}^2\) = 19.866. Furthermore, 19.498% of the total variance in mathematics performance lies between schools (level 2 or macro-level); this share is the intraclass correlation (ICC). Also, the 95% confidence interval for the standard deviation of the random intercept is (3.939, 5.078), which does not include \(0\), so the random intercept between schools is statistically significant (Table 3).
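As a quick arithmetic check, the ICC quoted above can be reproduced from the two variance components reported for the null model. The sketch below uses Python purely as a calculator (the analysis itself appears to have been run in R); the variable names are illustrative and not taken from the original code.

```python
# Variance components reported for the null model (model 5.0).
sigma2 = 82.020   # within-school (residual) variance
tau00 = 19.866    # between-school (random intercept) variance

# Intraclass correlation: share of the total variance that lies between schools.
icc = tau00 / (tau00 + sigma2)

print(f"ICC = {icc:.5f} ({100 * icc:.3f}%)")
```

The result agrees with the 19.498% quoted in the text and with the ICC of 0.195 in Table 2.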

Table 2. Null Model (Model 5.0). Mathematics Performance

| Predictors | Estimates | std.Error | p-value |
|---|---|---|---|
| Intercept | 150.87 *** | 0.39 | <0.001 |
| Random Effects | | | |
| σ2 | 82.020 | | |
| τ00 idschool | 19.866 | | |
| ICC | 0.195 | | |
| N idschool | 146 | | |
| Observations | 7097 | | |
| Marginal R2 / Conditional R2 | 0.000 / 0.195 | | |

* p<0.05   ** p<0.01   *** p<0.001

Table 3. Confidence Interval for Null Model (Model 5.0)

| Parameter | 2.5 % | 97.5 % |
|---|---|---|
| .sig01 | 3.939060 | 5.077678 |
| .sigma | 8.907981 | 9.209159 |
| (Intercept) | 150.105334 | 151.631466 |

2.2 Random intercept model plus level 1 fixed effect (model 5.1)

A second random intercept model for estimating mathematics performance in TIMSS was run, in which students are nested in schools and grade, gender, and science performance are included as explanatory variables. This model can also be expressed in two separate levels as:

Level 1 \[ \begin{aligned} Y_{ij} = \beta_{0} + \beta_{1}gender + \beta_{2}grade + \beta_{3}science + R_{ij} \end{aligned} \]

\[ \begin{aligned} \beta_{1} = \gamma_{10} \end{aligned} \]

\[ \begin{aligned} \beta_{2} = \gamma_{20} \end{aligned} \]

\[ \begin{aligned} \beta_{3} = \gamma_{30} \end{aligned} \]

Level 2 \[ \begin{aligned} \beta_{0} = \gamma_{00} + U_{0j} \end{aligned} \]

Full Model \[ \begin{aligned} Y_{ij} = \gamma_{00} + \gamma_{10}gender_{i} + \gamma_{20}grade_{i} + \gamma_{30}science_{i} + U_{0j} + R_{ij} \end{aligned} \]

Table 3 presents the estimates of the random intercept model (model 5.1) with student-level fixed effects (grade, gender, and science performance). Additionally, Table 4 shows the estimated confidence intervals. The results indicate that both the random and the fixed effects are statistically significant.

Table 3. Comparison Null Model (5.0) vs. Random Effect Model (5.1)

| Predictors | Null Model (5.0) Estimates | std.Error | p-value | Random Effect (5.1) Estimates | std.Error | p-value |
|---|---|---|---|---|---|---|
| Intercept | 150.87 *** | 0.39 | <0.001 | 63.38 *** | 1.47 | <0.001 |
| Gender (Girl) | | | | 0.58 *** | 0.17 | 0.001 |
| Grade (4th grade) | | | | 3.61 *** | 0.19 | <0.001 |
| Science performance | | | | 0.57 *** | 0.01 | <0.001 |
| Random Effects | | | | | | |
| σ2 | 82.020 | | | 51.600 | | |
| τ00 idschool | 19.866 | | | 5.953 | | |
| ICC | 0.195 | | | 0.103 | | |
| N idschool | 146 | | | 146 | | |
| Observations | 7097 | | | 7097 | | |
| Marginal R2 / Conditional R2 | 0.000 / 0.195 | | | 0.397 / 0.459 | | |

* p<0.05   ** p<0.01   *** p<0.001

Table 4. Confidence Interval for Random effects and fixed effect (Model 5.1)

| Parameter | 2.5 % | 97.5 % |
|---|---|---|
| .sig01 | 2.1187112 | 2.8200105 |
| .sigma | 7.0654774 | 7.3045052 |
| (Intercept) | 60.4621952 | 66.3002954 |
| gendergirl | 0.2427455 | 0.9178991 |
| grade4 | 3.2330679 | 3.9832321 |
| science | 0.5459489 | 0.5847957 |

a. What is \(\tau_{00}^{2}\), the variance of the random intercept? And what is \(\sigma^{2}\), the variance of the residual? And compute the residual ICC for this model

The variance of the random intercept, \(\tau_{00}^{2}\) = 5.953, represents the variability in mathematics performance between schools. This random effect is statistically significant, suggesting that mean mathematics performance differs across schools.

The variance of the residual, \(\sigma^{2}\) = 51.6, i.e. the variability within schools, decreased relative to the null model (from 82.02 to 51.6), which means that gender, grade, and science performance explain part of the variability of mathematics performance within schools. The residual ICC for this model is 5.953 / (5.953 + 51.6) = 0.103. Adding these variables therefore lowers the ICC by 47.18\(\%\) in relative terms compared with the null model (Model 5.0), from 0.195 to 0.103.
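The residual ICC and the relative drop in ICC can be recomputed from the variance components of models 5.0 and 5.1. The sketch below is only an arithmetic check in Python (the helper `icc` and the variable names are illustrative). It also shows that the 47.18% figure corresponds to ICC values rounded to three decimals, and it adds the proportional reduction of each variance component relative to the null model.

```python
# Variance components for the null model (5.0) and model 5.1.
sigma2_null, tau00_null = 82.020, 19.866
sigma2_m1, tau00_m1 = 51.600, 5.953

def icc(tau00, sigma2):
    """Share of the (residual) variance that lies between schools."""
    return tau00 / (tau00 + sigma2)

icc_null = icc(tau00_null, sigma2_null)
icc_m1 = icc(tau00_m1, sigma2_m1)

# Relative drop in ICC, from unrounded and from 3-decimal rounded values.
drop_exact = (icc_null - icc_m1) / icc_null
drop_rounded = (round(icc_null, 3) - round(icc_m1, 3)) / round(icc_null, 3)

# Proportional reduction of each variance component relative to the null model.
reduction_within = (sigma2_null - sigma2_m1) / sigma2_null
reduction_between = (tau00_null - tau00_m1) / tau00_null

print(f"residual ICC (5.1): {icc_m1:.3f}")
print(f"relative ICC drop: {drop_exact:.2%} (unrounded), {drop_rounded:.2%} (rounded)")
print(f"variance reduction: within {reduction_within:.1%}, between {reduction_between:.1%}")
```

With unrounded ICCs the relative drop comes out slightly below 47%, so the two figures differ only by rounding.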

b. What is the effect of being a female student on the math score in this dataset?

According to the results, gender is a statistically significant predictor of mathematics performance \((p-value < 0.001)\). Holding grade and science performance constant, girls are predicted to score 0.58 points higher in mathematics than boys.

2.2 Random intercept model plus level 1 and 2 fixed effect (model 5.2)

An additional random intercept model (model 5.2) for estimating mathematics performance in TIMSS was run. This model includes grade, gender, standardized science performance (z-score, cscience; see Table 5), and an interaction term between gender and standardized science performance (\(\gamma_{11}cscience*gender_{i}\)) as explanatory variables at level 1, while school shortages of instructional materials is included as an explanatory variable at the school level.

Table 5. Descriptive statistics Standardized Normal Science Performance - TIMSS

| Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|
| cscience [matrix, array] | Mean (sd): 0 (1); min ≤ med ≤ max: -4.7 ≤ 0 ≤ 3.5; Q1 - Q3: -0.6 - 0.6 | 186 distinct values | 0 (0.0%) |
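The standardization behind cscience can be illustrated with a minimal sketch. It assumes the usual z-score, (score - mean) / sd, with the sample standard deviation, which is consistent with the mean of 0 and sd of 1 in Table 5 (and with what R's `scale()` would return, although the original code is not shown). The scores below are hypothetical values for illustration, not TIMSS data.

```python
from statistics import mean, stdev

# Hypothetical science scores, for illustration only (not TIMSS data).
science = [138.2, 144.3, 150.0, 151.7, 156.8, 163.4]

m = mean(science)
s = stdev(science)  # sample standard deviation (n - 1 denominator)

# z-score: centre on the mean, then divide by the standard deviation.
cscience = [(x - m) / s for x in science]

print([round(z, 2) for z in cscience])
```

After this transformation a one-unit change in cscience corresponds to one standard deviation of the science score, and the model intercept refers to a student with an average science score.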


Model 5.2 can also be expressed in two separate levels as:

Level 1 \[ \begin{aligned} Y_{ij} = \beta_{0} + \beta_{1}gender + \beta_{2}grade + \beta_{3}cscience + R_{ij} \end{aligned} \]

\[ \begin{aligned} \beta_{1} = \gamma_{10} + \gamma_{11}cscience \end{aligned} \]

\[ \begin{aligned} \beta_{2} = \gamma_{20} \end{aligned} \]

\[ \begin{aligned} \beta_{3} = \gamma_{30} \end{aligned} \]

Level 2 \[ \begin{aligned} \beta_{0} = \gamma_{00} + \gamma_{01}shortages + U_{0j} \end{aligned} \]

Full Model \[ \begin{aligned} Y_{ij} = \gamma_{00} + (\gamma_{10} + \gamma_{11}cscience)*gender_{i} + \gamma_{20}grade_{i} + \gamma_{30}cscience_{i} + \gamma_{01}shortages + U_{0j} + R_{ij} \end{aligned} \]

\[ \begin{aligned} Y_{ij} = \gamma_{00} + \gamma_{10}gender_{i} + \gamma_{11}cscience*gender_{i} + \gamma_{20}grade_{i} + \gamma_{30}cscience_{i} + \gamma_{01}shortages + U_{0j} + R_{ij} \end{aligned} \]

Table 6 presents the estimates of the random intercept model (model 5.2) described above. Furthermore, Table 7 shows the estimated confidence intervals for all random, fixed, and interaction effects.

Table 6. Comparison Null Model (5.0) vs. Random Effect Model (5.1 and 5.2)

| Predictors | Null Model (5.0) Estimates | std.Error | p-value | Model 5.1 Estimates | std.Error | p-value | Model 5.2 Estimates | std.Error | p-value |
|---|---|---|---|---|---|---|---|---|---|
| Intercept | 150.87 *** | 0.39 | <0.001 | 63.38 *** | 1.47 | <0.001 | 148.78 *** | 0.31 | <0.001 |
| Gender (Girl) | | | | 0.58 *** | 0.17 | 0.001 | 0.58 *** | 0.17 | 0.001 |
| Grade (4th grade) | | | | 3.61 *** | 0.19 | <0.001 | 3.61 *** | 0.19 | <0.001 |
| Science performance | | | | 0.57 *** | 0.01 | <0.001 | | | |
| Science performance (Z-score) | | | | | | | 5.49 *** | 0.12 | <0.001 |
| Shortages | | | | | | | -0.59 * | 0.26 | 0.023 |
| Gender (Girl) * Science (Z-score) | | | | | | | 0.27 | 0.17 | 0.124 |
| Random Effects | | | | | | | | | |
| σ2 | 82.020 | | | 51.600 | | | 51.587 | | |
| τ00 idschool | 19.866 | | | 5.953 | | | 5.681 | | |
| ICC | 0.195 | | | 0.103 | | | 0.099 | | |
| N idschool | 146 | | | 146 | | | 146 | | |
| Observations | 7097 | | | 7097 | | | 7097 | | |
| Marginal R2 / Conditional R2 | 0.000 / 0.195 | | | 0.397 / 0.459 | | | 0.403 / 0.462 | | |

* p<0.05   ** p<0.01   *** p<0.001

Table 7. Confidence Interval for Random effects and fixed and interaction effect (Model 5.2)

| Parameter | 2.5 % | 97.5 % |
|---|---|---|
| .sig01 | 2.0667467 | 2.7580642 |
| .sigma | 7.0645427 | 7.3035461 |
| (Intercept) | 148.1707274 | 149.3870606 |
| gendergirl | 0.2425514 | 0.9175888 |
| grade4 | 3.2324254 | 3.9822586 |
| cscience | 5.2503297 | 5.7346572 |
| shortages | -1.1063639 | -0.0782220 |
| gendergirl:cscience | -0.0735812 | 0.6071602 |

a. What is the effect of shortage on math scores?

The effect of shortages of instructional materials on mathematics performance is negative and statistically significant \((p-value =\) 0.023 \()\). Each additional level of shortages (the variable ranges from 0 to 3) predicts a reduction of 0.592 points in student mathematics performance, so attending a school at the highest level of shortages (3) predicts a reduction of 1.777 points relative to a school with no shortages.
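Because shortages enters the model linearly, the predicted change at each level is the coefficient multiplied by the level. The sketch below works this out in Python from the rounded estimate quoted in the text (the names `gamma_01` and `effects` are illustrative).

```python
# Fixed-effect estimate for shortages in model 5.2 (rounded value quoted in the text).
gamma_01 = -0.592

# Predicted change in the math score relative to a school with no shortages (level 0).
effects = {level: gamma_01 * level for level in range(1, 4)}

for level, change in effects.items():
    print(f"shortages = {level}: {change:+.3f} points")
```

At level 3 the rounded coefficient gives -1.776, which differs from the 1.777 quoted in the text only by rounding of the estimate.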

b. What is the interaction effect? Is it significant? Report the p-value

The interaction effect between gender and standardized science performance is positive (0.267) and non-significant (p-value = 0.124).

c. Try to visualize the interaction effect.

Figure 1 presents the interaction effect. The figure is consistent with the model estimate: the effect of the science score (z-score) as a predictor of mathematics performance does not differ significantly by gender.
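The two lines of such an interaction plot can be reproduced from the fixed effects of model 5.2. The sketch below assumes the reference categories implied by the coefficient names (boy, grade 3), a school with no shortages, and a school random effect of zero; the function `predict_math` is an illustrative helper, not part of the original analysis.

```python
# Fixed effects for model 5.2 (Table 6; interaction value as quoted in the text).
intercept = 148.78      # boy, grade 3, cscience = 0, shortages = 0
b_girl = 0.58           # gender (girl)
b_cscience = 5.49       # science z-score slope for boys
b_interaction = 0.267   # additional science slope for girls

def predict_math(cscience, girl):
    """Predicted math score for a grade 3 student in a school with no shortages."""
    slope = b_cscience + b_interaction * girl
    return intercept + b_girl * girl + slope * cscience

slope_boys = b_cscience
slope_girls = b_cscience + b_interaction

# Columns: science z-score, predicted score for boys, predicted score for girls.
for z in (-1, 0, 1):
    print(z, round(predict_math(z, girl=0), 2), round(predict_math(z, girl=1), 2))
```

The slopes for boys (5.49) and girls (about 5.76) are close, so the two lines are nearly parallel, in line with the non-significant interaction term.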

2.3 Random intercept model plus cross-level interaction (model 5.3)

Model 5.3 includes grade, gender, standardized scores in science (z-scores), and shortages as fixed predictors of mathematics performance. Also, the model includes a random slope effect for science as a predictor on mathematics performance (\(U_{3j}*cscience_{i}\)). This model can be defined using the following form:

Level 1 \[ \begin{aligned} Y_{ij} = \beta_{0} + \beta_{1}gender + \beta_{2}grade + \beta_{3}cscience + R_{ij} \end{aligned} \]

\[ \begin{aligned} \beta_{1} = \gamma_{10} \end{aligned} \]

\[ \begin{aligned} \beta_{2} = \gamma_{20} \end{aligned} \]

\[ \begin{aligned} \beta_{3} = \gamma_{30}+ U_{3j} \end{aligned} \]

Level 2 \[ \begin{aligned} \beta_{0} = \gamma_{00} + \gamma_{01}shortages + U_{0j} \end{aligned} \]

Full Model \[ \begin{aligned} Y_{ij} = \gamma_{00} + \gamma_{10}gender_{i} + \gamma_{20}grade_{i} + \gamma_{30}cscience_{i} + U_{3j}*cscience_{i} + \gamma_{01}shortages + U_{0j} + R_{ij} \end{aligned} \]

Table 8 presents the results for random intercept and random slope model that includes explanatory variables for both level 1 (students) and level 2 (schools).

Table 8. Comparison Hierarchical Models

| Predictors | Model 5.1 Estimates | std.Error | p-value | Model 5.2 Estimates | std.Error | p-value | Model 5.3 Estimates | std.Error | p-value |
|---|---|---|---|---|---|---|---|---|---|
| Intercept | 63.38 *** | 1.47 | <0.001 | 148.78 *** | 0.31 | <0.001 | 148.88 *** | 0.31 | <0.001 |
| Gender (Girl) | 0.58 *** | 0.17 | 0.001 | 0.58 *** | 0.17 | 0.001 | 0.55 ** | 0.17 | 0.001 |
| Grade (4th grade) | 3.61 *** | 0.19 | <0.001 | 3.61 *** | 0.19 | <0.001 | 3.61 *** | 0.19 | <0.001 |
| Science performance | 0.57 *** | 0.01 | <0.001 | | | | | | |
| Science performance (Z-score) | | | | 5.49 *** | 0.12 | <0.001 | 5.60 *** | 0.12 | <0.001 |
| Shortages | | | | -0.59 * | 0.26 | 0.023 | -0.59 * | 0.26 | 0.021 |
| Gender (Girl) * Science (Z-score) | | | | 0.27 | 0.17 | 0.124 | | | |
| Random Effects | | | | | | | | | |
| σ2 | 51.600 | | | 51.587 | | | 51.046 | | |
| τ00 idschool | 5.953 | | | 5.681 | | | 5.496 | | |
| τ11 idschool.cscience | | | | | | | 0.697 | | |
| ρ01 idschool | | | | | | | -0.263 | | |
| N idschool | 146 | | | 146 | | | 146 | | |
| Observations | 7097 | | | 7097 | | | 7097 | | |
| Marginal R2 / Conditional R2 | 0.397 / 0.459 | | | 0.403 / 0.462 | | | 0.402 / 0.467 | | |

* p<0.05   ** p<0.01   *** p<0.001

a. What is the variance for the random slope effect?

The variances of the random intercept (\(\tau_{00}\)) and the random slope (\(\tau_{11}\)) are 5.496 and 0.697, respectively.
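With a random slope, the between-school variance is no longer a single number: for a given science z-score it equals \(\tau_{00} + 2\tau_{01}cscience + \tau_{11}cscience^{2}\), where \(\tau_{01}\) is the intercept-slope covariance. The sketch below derives \(\tau_{01}\) from the correlation \(\rho_{01}\) in Table 8 and evaluates this variance function at a few values; it is an illustrative calculation in Python, and the function name is hypothetical.

```python
import math

# Random-effect estimates for model 5.3 (Table 8).
tau00 = 5.496   # random intercept variance
tau11 = 0.697   # random slope variance for cscience
rho01 = -0.263  # intercept-slope correlation

# Covariance implied by the correlation and the two variances.
tau01 = rho01 * math.sqrt(tau00 * tau11)

def between_school_variance(cscience):
    """Variance of U0j + U3j * cscience at a given science z-score."""
    return tau00 + 2 * tau01 * cscience + tau11 * cscience ** 2

for z in (-1, 0, 1):
    print(f"cscience = {z:+d}: {between_school_variance(z):.3f}")
```

Because the estimated correlation is negative, schools differ more from one another for students with below-average science scores than for students with above-average scores.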

b. Test if the random slope of cscience is significant?

Table 9 shows the estimated confidence intervals for all effects. The 95% confidence interval for the standard deviation of the random slope (\(.sig03\)) is (0.567, 1.106), which does not include \(0\), so we conclude that the random slope of science performance (z-score) on mathematics scores is statistically significant.

Table 9. Confidence Interval for Model 5.3

| Parameter | 2.5 % | 97.5 % |
|---|---|---|
| .sig01 | 2.0268927 | 2.7191530 |
| .sig02 | -0.5394733 | 0.0470499 |
| .sig03 | 0.5672824 | 1.1056351 |
| .sigma | 7.0264498 | 7.2661412 |
| (Intercept) | 148.2754619 | 149.4848407 |
| gendergirl | 0.2176294 | 0.8920900 |
| grade4 | 3.2390221 | 3.9889513 |
| cscience | 5.3594600 | 5.8406667 |
| shortages | -1.1009506 | -0.0868340 |
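The "does the interval exclude zero" reading of Table 9 can be written as a small check. The sketch below assumes the usual lme4 labelling for a model with one random intercept and one random slope (.sig01 = intercept SD, .sig02 = intercept-slope correlation, .sig03 = slope SD); this labelling is an assumption, although it is consistent with the estimates in Table 8. Limits are rounded to three decimals, and `excludes_zero` is an illustrative helper.

```python
# 95% confidence limits from Table 9 (model 5.3), rounded to three decimals.
intervals = {
    ".sig01": (2.027, 2.719),    # assumed: SD of the random intercept
    ".sig02": (-0.539, 0.047),   # assumed: intercept-slope correlation
    ".sig03": (0.567, 1.106),    # assumed: SD of the random slope for cscience
}

def excludes_zero(lower, upper):
    """True when zero lies outside the closed interval [lower, upper]."""
    return lower > 0 or upper < 0

for name, (lower, upper) in intervals.items():
    print(name, excludes_zero(lower, upper))

# Squaring the SD limits gives the corresponding interval on the variance scale.
slope_var_interval = tuple(round(b ** 2, 3) for b in intervals[".sig03"])
print("slope variance interval:", slope_var_interval)
```

Under that labelling, the intervals for both standard deviations exclude zero, while the interval for the intercept-slope correlation includes it. The squared limits for .sig03 also bracket the slope variance of 0.697 reported in Table 8.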