library(tidyverse) # for lots of things
setwd("D:\\r\\hlm") # Setup working directory
df <- read.csv("D:\\r\\hlm\\hw3.csv") # Load the data
# Check nuts and bolts
dim(df)
## [1] 80 4
head(df,3)
## classroomid time ontaskbehavior group
## 1 1 0 5 0
## 2 1 1 12 0
## 3 1 2 13 0
In the SPSS file, you will find information for the following:
Time. In this practice dataset, time is the most granular level of analysis. Each observation in the time column represents the on-task behavior score for a specific time within a classroom.
Classroom. The on-task behavior score for each and every time point is nested within each classroom.
First, let’s generate a graph for all different time points, as follows:
ggplot(df, aes(x = time, y = ontaskbehavior, group = interaction(classroomid, group), color = as.factor(group))) +
geom_line() +
geom_point() +
scale_color_manual(values = c("blue", "green"), labels = c("Reading and Writing", "Mathematics and Problem Solving")) +
scale_x_continuous(breaks = 0:3, labels = c("Day One", "Day Two", "Day Three", "Day Four")) +
labs(title = "Comparison of On-Task Behavior Over Time by Group",
x = "Time (Days)", y = "On-Task Behavior Score",
color = "Group") +
theme_minimal()
Now, let’s take a look at the comparison between the two classrooms, taking the average of different classrooms in each classroom type:
# Aggregating data
aggregated_data <- df %>%
group_by(time, group) %>%
dplyr::summarize(average_ontaskbehavior = mean(ontaskbehavior))
## `summarise()` has grouped output by 'time'. You can override using the
## `.groups` argument.
# Create the plot
ggplot(aggregated_data, aes(x = time, y = average_ontaskbehavior, group = as.factor(group), color = as.factor(group))) +
geom_line() +
geom_point() +
scale_color_manual(values = c("blue", "green"), labels = c("Reading and Writing", "Mathematics and Problem Solving")) +
scale_x_continuous(breaks = 0:3, labels = c("Day One", "Day Two", "Day Three", "Day Four")) +
labs(title = "Average On-Task Behavior Over Time by Group",
x = "Time (Days)", y = "Average On-Task Behavior Score",
color = "Group") +
theme_minimal()
From both plots, we can see that for both types of classrooms, the average on-task behavior score goes up as time goes by. However, we noticed two difference:
The mathematics and problem solving group has a higher average on-task behavior score at each time point than the reading and writing group.
The rate of increase of behavior score seems to be faster for the reading and writing group from day 1 to day 2, and from day 3 to day 4. The difference in rate of increase for the two groups is indistinguishable from day 2 to day 3.
No. We can see from plot 1 than within each group, different classrooms have quite different initial status points.
No. Initial starting point is higher for the mathematical and problem solving group, as you can see from both plots.
No. We can see from plot 1 that within each group, classrooms varies in growth trajectories.
Yes, as I have stated in Q3. The rate of increase is faster for the reading and writing group from day 1 to day 2, and from day 3 to day 4. The difference in rate of increase for the two groups is indistinguishable from day 2 to day 3.
Yes. To answer this question, we need an unconditional model with random intercept:
library(lmerTest) # for HLM
lapply(df, class) # Check data format for each column
## $classroomid
## [1] "integer"
##
## $time
## [1] "integer"
##
## $ontaskbehavior
## [1] "integer"
##
## $group
## [1] "integer"
# Change columns into proper data format to feed into HLM models
df$time <- as.numeric(df$time) # time as numeric
df$group <- as.factor(df$group) # group as factor
# Unconditional model
uncon_model <- lmerTest::lmer(ontaskbehavior ~ 1 + (1 | classroomid),
data = df, REML = FALSE)
summary(uncon_model)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: ontaskbehavior ~ 1 + (1 | classroomid)
## Data: df
##
## AIC BIC logLik deviance df.resid
## 433.9 441.0 -213.9 427.9 77
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.01401 -0.61634 -0.06546 0.58085 3.06471
##
## Random effects:
## Groups Name Variance Std.Dev.
## classroomid (Intercept) 3.399 1.844
## Residual 9.925 3.150
## Number of obs: 80, groups: classroomid, 20
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 9.4750 0.5422 20.0000 17.47 1.4e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the model output, we can see that the random effect intercept has a variance of 3.399, which is much larger than 0 (the scale is 0 to 20 since we are talking about the behavior score). This means that the initial status of the behavior score is quite different among classrooms (Level 2).
Yes. To answer this question, we fit a mixed effect model with both a fixed effect estimate for time, and a random slope estimate for time:
uncon_model_growth_rate <- lmerTest::lmer(ontaskbehavior ~ time + (time | classroomid),
data = df, REML = FALSE)
summary(uncon_model_growth_rate)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: ontaskbehavior ~ time + (time | classroomid)
## Data: df
##
## AIC BIC logLik deviance df.resid
## 344.6 358.8 -166.3 332.6 74
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.89209 -0.46781 -0.04962 0.43420 3.07415
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## classroomid (Intercept) 7.7805 2.7894
## time 0.5721 0.7564 -0.56
## Residual 1.4100 1.1874
## Number of obs: 80, groups: classroomid, 20
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 6.2800 0.6621 20.0002 9.485 7.64e-09 ***
## time 2.1300 0.2067 19.9998 10.307 1.90e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## time -0.586
From the model output, we can see that the variance is 0.572 for the growth rate. The variance component is not extremely large, but is still substantial, considering that it is a measure of variability in the rate of change of the on-task behavior score across classrooms.
Yes. To answer this question, we need to fit a fixed effect model with a fixed effect estimate for classroom type (group).
model_with_group <- lmerTest::lmer(ontaskbehavior ~ group + (1 | classroomid),
data = df, REML = FALSE)
summary(model_with_group)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: ontaskbehavior ~ group + (1 | classroomid)
## Data: df
##
## AIC BIC logLik deviance df.resid
## 425.4 434.9 -208.7 417.4 76
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.6838 -0.7824 0.0397 0.7027 3.3949
##
## Random effects:
## Groups Name Variance Std.Dev.
## classroomid (Intercept) 0.9969 0.9984
## Residual 9.9250 3.1504
## Number of obs: 80, groups: classroomid, 20
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 7.9250 0.5898 20.0000 13.438 1.8e-11 ***
## group1 3.1000 0.8340 20.0000 3.717 0.00136 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## group1 -0.707
Recall that our unconditional model has an intercept variance of 3.399. In this model, our variance component is reduced to 0.997. The means that the type of classrooms accounts for a significant proportion (about 70.671%) of the variance in the initial status of the behavior score.
No. To answer this question, the model should include group as a predictor for both the intercept and the slope of time-behavior score:
model_growth_rate <- lmer(ontaskbehavior ~ time * group + (time | classroomid),
data = df, REML = FALSE)
summary(model_growth_rate)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: ontaskbehavior ~ time * group + (time | classroomid)
## Data: df
##
## AIC BIC logLik deviance df.resid
## 335.7 354.7 -159.8 319.7 72
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.88370 -0.51014 -0.06529 0.43689 3.06869
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## classroomid (Intercept) 3.6596 1.9130
## time 0.4697 0.6853 -0.40
## Residual 1.4100 1.1874
## Number of obs: 80, groups: classroomid, 20
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 4.2500 0.6817 20.0001 6.235 4.34e-06 ***
## time 2.4500 0.2742 20.0006 8.936 2.02e-08 ***
## group1 4.0600 0.9640 20.0001 4.212 0.000429 ***
## time:group1 -0.6400 0.3877 20.0006 -1.651 0.114431
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) time group1
## time -0.510
## group1 -0.707 0.361
## time:group1 0.361 -0.707 -0.510
From the model output, we can see that the interaction fixed effects between time and the type of classroom (group) is insignificant (p < .05) with a small effect size (-.6400). This suggests that the linear growth rate has no statistically significant difference between the two types of classrooms.
Recall that in the unconditional model, the variance of the random intercept (initial status) was 3.399, indicating substantial variability in initial behavior scores across classrooms. In contrast, inclusion of the “group” variable reduced the variance of the random intercept to 0.997. This significant reduction suggests that the type of classroom explains a considerable portion of the variance in initial scores across classrooms.
On the other hand, both the unconditional model and the conditional model with classroom type as a predictor has a residual variance of 9.925, indicating that classroom type does not explain the variability of scores within each classroom.
First, our unconditional model
ontaskbehavior ~ 1 + (1 | classroomid)
, as presented in
question a, has AIC/BIC = 433.9/441.0. This is a baseline
model.
Second, we included time
as a predictor
ontaskbehavior ~ time + (time | classroomid)
, and got
AIC/BIC = 344.6/358.8. Both of them are much lower than the baseline
model, indicating substantially improved goodness-of-fit. In addition,
the significant time
coefficient (beta = 2.130, p <
0.001) suggests that on-task behavior scores change significantly over
time.
Third, we included group
as a predictor
ontaskbehavior ~ group + (1 | classroomid)
. This model fit
with AIC/BIC = 425.4/434.9, indicating the model is a better fit to the
data than the baseline model, but not as well as the model with time as
a predictor. Additionally, The significant group
fixed
effects coefficient (beta = 3.100, p < 0.01) indicates that on-task
behavior scores differ significantly between different types of
classrooms.
Finally, we fit a model with both time
and
group
as predictors while also estimating their interaction
effects
ontaskbehavior ~ time * group + (time | classroomid)
. This
model achieve an AIC/BIC = 335.7/354.7, providing the best fit among the
four models. The significant coefficients for time and classroom types
remain, but the interaction between time and the classroom type is not
significant at p < .05 level, indicating that the rate of change over
time does not differ significantly between different types of
classrooms.
In summary, both time and classroom type are important predictors for the behavior score, but time is a particularly strong predictor.
First, our unconditional model
ontaskbehavior ~ 1 + (1 | classroomid)
, the intercept
variance is 3.709, indicating classrooms differ a lot in their average
on-task behavior scores. The residual variance is 9.925, indicating that
there is also substantial variability within each classroom.
Second, we included time
as a predictor
ontaskbehavior ~ time + (time | classroomid)
. the intercept
variance (7.781) increases almost by half. This is expected because
including time
as a predictor allows our model to freely
estimates the between-classroom variability of the growth rate over
time. Therefore, the substantial increase of the intercept variance
suggests that there’s substantial heterogeneity among classrooms both in
terms of their initial status and in their rates of change over
time.
Meanwhile, the residual variance is reduced from 9.925 to 1.410. This tells us that by modeling the change over time differently for each classroom, our model explains more of the within-classroom variability that was left as unexplained noise (residual) in the first model. In other words, the behavior score of students within classrooms also varies a lot over time.
Third, we included group
as a predictor
ontaskbehavior ~ group + (1 | classroomid)
. The intercept
variance is reduced from 3.709 to 0.997, which means that classroom type
explains much of the between-classroom variability in initial scores.The
residual variance is still 9.925, meaning classroom type does not
explain within-classroom variability in behavior scores.
Finally, we fit a model with both time
and
group
as predictors while also estimating their interaction
effects
ontaskbehavior ~ time * group + (time | classroomid)
. the
intercept variance is slightly increased to 3.660. This denotes that the
variability in initial status is not heavily influenced by the two
predictors time and classroom type.
The random slope for time has a variance of 0.470, indicating that there is significant variability in how behavior score changes over time across different classrooms. Also, the residual variance is substantially reduced to 1.410. This means that these two predictors explain a good portion of within-classroom variability in scores.
In summary, the conditional models, particularly by including time as a predictor, provide a better explanation of the variability in on-task behavior scores both within and between classrooms compared to the unconditional models.
Note: I also tried to fit a model that estimates the random effects for classroom types:
# group as a random effect
# model_group_random <- lmer(ontaskbehavior ~ time * group + (time + group | classroomid), data = df, REML = FALSE)
#summary(model_group_random)
But this model is nearly unidentified. Statistically, I think it’s related to we have a simple dataset with n = 80, which does not work with a relatively complicated model. Empirically, I believe the classroom type should be viewed as a treatment variable instead of a variable that identifies a random sample.