MLM - Ex1 - Model Comparison

מגיש: אלדד אביב

ת.ז: 206836165

Pre Process

Load Libraries

library(tidyverse)
library(ggplot2)
library(lmerTest)
library(performance)
library(parameters)

Load Data

dataset2 <- read.csv("Example_Couples_relational_Satisfaction.csv")

Display Data

head(dataset2)
##       Duration      M_sat     F_sat ID
## 1   0.06043444 17.2340096 18.525058  1
## 2  -7.54308310 10.4868291  4.091697  2
## 3 -11.25961816 11.4645249 12.398660  3
## 4   6.94572548 -0.5525516 14.531549  4
## 5   4.90349783 12.6976443 -1.882210  5
## 6  14.12341031 -8.6538947 -7.157227  6

It seems like the data is in short format, let’s adjust it to long pivot:

Convert Data Into Long Format

long_data <- dataset2 |>
  pivot_longer(
    cols = contains("sat"), # Define Columns to pivot
    names_to = "Gender", # Define new column name
    values_to = "sat" # Define converted values name
  )

head(long_data)
## # A tibble: 6 × 4
##   Duration    ID Gender   sat
##      <dbl> <int> <chr>  <dbl>
## 1   0.0604     1 M_sat  17.2 
## 2   0.0604     1 F_sat  18.5 
## 3  -7.54       2 M_sat  10.5 
## 4  -7.54       2 F_sat   4.09
## 5 -11.3        3 M_sat  11.5 
## 6 -11.3        3 F_sat  12.4

Part 1

Model A: [ sat ~ 1 ]

# FIXED EFFECT: Intercept

model.lm.empty <- lm(sat ~ 1,
                     data = long_data)
summary(model.lm.empty)
## 
## Call:
## lm(formula = sat ~ 1, data = long_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -28.186  -6.364  -0.506   7.684  22.821 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   1.1401     0.6755   1.688    0.093 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.553 on 199 degrees of freedom

Model A: Report

  • The empty model predict the mean value of the satisfaction score for all observations: 1.14 (p = 0.093)

Model B: [ sat ~ 1 + (1|ID) ]

model.r_intercepts <- lmer(sat ~ 1 # Fixed effect: overall grand mean of satisfaction
                           + (1 | ID), # Random effect: allows different baseline by ID
                           data = long_data)

model_parameters(model.r_intercepts, ci_method = "S")
## # Fixed Effects
## 
## Parameter   | Coefficient |   SE |        95% CI | t(99.00) |     p
## -------------------------------------------------------------------
## (Intercept) |        1.14 | 0.83 | [-0.51, 2.79] |     1.37 | 0.173
## 
## # Random Effects
## 
## Parameter          | Coefficient |   SE |       95% CI
## ------------------------------------------------------
## SD (Intercept: ID) |        6.84 | 0.75 | [5.51, 8.49]
## SD (Residual)      |        6.68 | 0.47 | [5.82, 7.68]
Model B: Report
  • Fixed effect of sat: 1.14(same as null model)
  • SD of sat between couples: 6.84
rAnova
# ranova checks the significance of including random intercepts in the model
ranova(model.r_intercepts)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## sat ~ (1 | ID)
##          npar  logLik    AIC    LRT Df Pr(>Chisq)    
## <none>      3 -718.99 1444.0                         
## (1 | ID)    2 -734.13 1472.2 30.268  1  3.763e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • p = 3.763e-08: indicates that the random effect significantly improves the model fit

  • AIC = 1444(full model) < AIC = 1472(reduced model): It appears that the model with the random intercept provides a better fit to the data, as indicated by the lower AIC value.

sjPlot::tab_model(model.lm.empty, model.r_intercepts,
                  df.method= "satterthwaite")
## `ci_method` must be one of "residual", "wald", "normal", "profile",
##   "boot", "uniroot", "betwithin" or "ml1". Using "wald" now.
  sat sat
Predictors Estimates CI p Estimates CI p
(Intercept) 1.14 -0.19 – 2.47 0.093 1.14 -0.51 – 2.79 0.173
Random Effects
σ2   44.67
τ00   46.82 ID
ICC   0.51
N   100 ID
Observations 200 200
R2 / R2 adjusted 0.000 / 0.000 0.000 / 0.512
  • ICC = 0.512 indicates that 51.2% of the variance in satisfaction is explained by differences in the baseline satisfaction levels among couples.
  • R2 conditional = 0.512 > R2 Marginal = 0 suggesting variance explained better by a fixed effect + random effect compare to a fixed effect alone.

Part 2

Model C: [ sat ~ Gender ]

# FIXED EFFECT: Gender

model.lm.fixed.G <- lm(sat ~ Gender,
                     data = long_data)
summary(model.lm.fixed.G)
## 
## Call:
## lm(formula = sat ~ Gender, data = long_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -29.1050  -6.9338  -0.6524   7.7012  21.9017 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.2211     0.9532   0.232    0.817
## GenderM_sat   1.8380     1.3481   1.363    0.174
## 
## Residual standard error: 9.532 on 198 degrees of freedom
## Multiple R-squared:  0.009302,   Adjusted R-squared:  0.004298 
## F-statistic: 1.859 on 1 and 198 DF,  p-value: 0.1743

Predicting sat with gender exclusively yielded a non-significant coefficient of 1.838 (p = 0.174)

Model D: [ sat ~ (Gender + (1|ID) ]

# FIXED EFFECT: Gender, intercept
# RANDOM EFFECT: Intercept
model.f.Gender.r.intercept <- lmer(sat ~ Gender + (1 |ID),
                   data = long_data)
model_parameters(model.f.Gender.r.intercept, ci_method = "S")
## # Fixed Effects
## 
## Parameter      | Coefficient |   SE |        95% CI |    t |     df |     p
## ---------------------------------------------------------------------------
## (Intercept)    |        0.22 | 0.95 | [-1.66, 2.10] | 0.23 | 155.58 | 0.817
## Gender [M_sat] |        1.84 | 0.93 | [-0.01, 3.69] | 1.97 |  99.00 | 0.051
## 
## # Random Effects
## 
## Parameter          | Coefficient |   SE |       95% CI
## ------------------------------------------------------
## SD (Intercept: ID) |        6.89 | 0.75 | [5.57, 8.52]
## SD (Residual)      |        6.59 | 0.47 | [5.73, 7.57]
  • Gender[M_sat] Coefficient = 1.84 represents the difference in mean satisfaction levels between males (M_sat) and female (F_sat)
  • p = 0.51

Model E: [ sat ~ G * Duration + (1|ID) ]

# FIXED EFFECT: Gender*Duration, intercept
# RANDOM EFFECT: intercept(ID)
model.maximal <- lmer(sat ~ Gender * Duration + (1 | ID),
                      data = long_data)
model_parameters(model.maximal, ci_method = "S")
## # Fixed Effects
## 
## Parameter                 | Coefficient |   SE |         95% CI |     t |     df |      p
## -----------------------------------------------------------------------------------------
## (Intercept)               |        0.18 | 0.90 | [-1.59,  1.95] |  0.20 | 159.43 | 0.840 
## Gender [M_sat]            |        1.86 | 0.91 | [ 0.04,  3.67] |  2.03 |  98.00 | 0.045 
## Duration                  |        0.45 | 0.10 | [ 0.26,  0.64] |  4.70 | 159.43 | < .001
## Gender [M_sat] × Duration |       -0.21 | 0.10 | [-0.41, -0.02] | -2.18 |  98.00 | 0.031 
## 
## # Random Effects
## 
## Parameter          | Coefficient |   SE |       95% CI
## ------------------------------------------------------
## SD (Intercept: ID) |        6.20 | 0.72 | [4.93, 7.80]
## SD (Residual)      |        6.47 | 0.46 | [5.62, 7.44]
ranova(model.f.Gender.r.intercept)
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## sat ~ Gender + (1 | ID)
##          npar  logLik    AIC    LRT Df Pr(>Chisq)    
## <none>      4 -716.22 1440.4                         
## (1 | ID)    3 -731.98 1470.0 31.522  1  1.972e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Likelihood Ratio Test (LRT)

  • p = 1.972e-08 Indicates that the random effect significantly improves the model fit.
  • AIC = 1440.4 (full model) < AIC = 1470.0 (reduced model) Suggests that the model with the random intercept provides a better fit to the data, as indicated by the lower AIC value.
r2(model.f.Gender.r.intercept)
## # R2 for Mixed Models
## 
##   Conditional R2: 0.527
##      Marginal R2: 0.009
  • R² conditional = 0.527 > R² marginal = 0.009 Suggests that the variance is better explained by a combination of fixed effects (gender) and random effects (between-couple variability) compared to fixed effects alone.
sjPlot::tab_model(model.lm.fixed.G, model.f.Gender.r.intercept, model.maximal)
  sat sat sat
Predictors Estimates CI p Estimates CI p Estimates CI p
(Intercept) 0.22 -1.66 – 2.10 0.817 0.22 -1.66 – 2.10 0.817 0.18 -1.59 – 1.95 0.839
Gender [M_sat] 1.84 -0.82 – 4.50 0.174 1.84 0.00 – 3.68 0.050 1.86 0.05 – 3.66 0.044
Duration 0.45 0.26 – 0.64 <0.001
Gender [M_sat] × Duration -0.21 -0.41 – -0.02 0.030
Random Effects
σ2   43.41 41.82
τ00   47.45 ID 38.44 ID
ICC   0.52 0.48
N   100 ID 100 ID
Observations 200 200 200
R2 / R2 adjusted 0.009 / 0.004 0.009 / 0.527 0.132 / 0.548

Fixed Effects

  • Gender: Males have significantly higher satisfaction levels compared to females (Estimate = 1.86, 95% CI = [0.05, 3.66], p = 0.044).

  • Duration: Longer relationship duration is significantly associated with higher satisfaction levels (Estimate = 0.45, 95% CI = [0.26, 0.64], p < 0.001).

  • Gender × Duration: The positive effect of duration on satisfaction is significantly smaller for males compared to females (Estimate = -0.21, 95% CI = [-0.41, -0.02], p = 0.030).

Random Effects

  • Between-couple variability (ID): There is significant variability in baseline satisfaction levels between couples (Variance = 38.44).

  • Residual variance: Indicates the variance in satisfaction that is not explained by the fixed effects (Variance = 41.82).

  • ICC: Approximately 48% of the variance in satisfaction is attributable to differences between couples (ICC = 0.48).

Compare Models

LRT_CD <- ranova(model.f.Gender.r.intercept)
LRT_DE <- ranova(model.maximal)
# C vs D
LRT_CD
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## sat ~ Gender + (1 | ID)
##          npar  logLik    AIC    LRT Df Pr(>Chisq)    
## <none>      4 -716.22 1440.4                         
## (1 | ID)    3 -731.98 1470.0 31.522  1  1.972e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • LRT_CD = 31.522, p = 1.972e-08 Indicates that including the random intercept for couple/ID significantly improves the model fit, suggesting substantial between-couple variability in satisfaction.
# D vs E
LRT_DE
## ANOVA-like table for random-effects: Single term deletions
## 
## Model:
## sat ~ Gender + Duration + (1 | ID) + Gender:Duration
##          npar  logLik    AIC    LRT Df Pr(>Chisq)    
## <none>      6 -708.79 1429.6                         
## (1 | ID)    5 -721.55 1453.1 25.531  1  4.354e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • LRT_DE = 25.531, p = 4.354e-07 Indicates that the interaction between gender and duration significantly improves the model fit, suggesting that the way gender and duration interact to predict satisfaction varies across couples.