In this exercise, we are going to replicate the SPSS MIXED analysis in R.

Load libraries

library(haven)
library(dplyr)
library(nlme)

Load SPSS data file

dyad_df <- read_sav("sample data_two_level.sav")

Let’s first run this analysis using linear regression (ignoring nonindependence).

lm_mod <- lm(awish ~ sloppy_roommateC + first_similarNC + genderN,
                  data = dyad_df)

summary(lm_mod)
## 
## Call:
## lm(formula = awish ~ sloppy_roommateC + first_similarNC + genderN, 
##     data = dyad_df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7736 -0.5838 -0.2863  0.0111  4.5374 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1.53715    0.15437   9.958  < 2e-16 ***
## sloppy_roommateC -0.01357    0.12951  -0.105 0.916718    
## first_similarNC  -0.29747    0.07912  -3.759 0.000269 ***
## genderN          -0.06684    0.15503  -0.431 0.667181    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.303 on 115 degrees of freedom
##   (17 observations deleted due to missingness)
## Multiple R-squared:  0.1171, Adjusted R-squared:  0.09405 
## F-statistic: 5.083 on 3 and 115 DF,  p-value: 0.00243

MINI-CHALLENGE: How would you interpret the linear model results?


Here is the original SPSS code

MIXED
awish  WITH sloppy_roommateC first_similarNC genderN
 /FIXED = sloppy_roommateC first_similarNC genderN
/PRINT = SOLUTION TESTCOV
/Repeated day | SUBJECT(participant_ID) COVTYPE(CSH).

Let’s replicate it in R

How are we accounting for within-person effects?

When you use the /REPEATED part in the MIXED command, you’re telling SPSS, “Hey, my data has measurements that are related to each other (like several measurements from the same person).”The COVTYPE(CSH) option is SPSS’s way of saying, “Not only are these measurements related, but they also might have different levels of variability.”

In the gls model, Participant_ID can be included as part of a correlation structure, but not as a random effect. The correlation argument in gls (e.g., corCompSymm(form = ~ 1 | Participant_ID)) specifies a correlation structure for the residuals. This means that observations (e.g., awish) from the same participant are assumed to be correlated. We also add the “weights” line to tell R that each day should have its own variability.

# optional
ctrl <- lmeControl(msMaxIter=10000,
                   MaxIter=100000,
                   msMaxEval=10000,
                   returnObject=TRUE,
                   niterEM=10000,
                   nlmStepMax=1000)

# Treating day as distinguishing factor
model_dist <- gls(awish ~ sloppy_roommateC + first_similarNC + genderN, # DV ~ fixed effects
                  data = dyad_df, # specify dataframe
                  method = "REML", 
                  na.action=na.exclude, # ignore NA values
                  control = ctrl, # reference ctrl we created
                  correlation = corCompSymm(form = ~ 1 | Participant_ID), # tells R this is within-person
                  weights = varIdent(form = ~1 | day)) # gives each 'day' its own variance

# print all results
summary(model_dist)
## Generalized least squares fit by REML
##   Model: awish ~ sloppy_roommateC + first_similarNC + genderN 
##   Data: dyad_df 
##        AIC      BIC    logLik
##   383.9418 403.1564 -184.9709
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | Participant_ID 
##  Parameter estimate(s):
##       Rho 
## 0.6972369 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | day 
##  Parameter estimates:
##       10       11 
## 1.000000 1.222602 
## 
## Coefficients:
##                       Value  Std.Error   t-value p-value
## (Intercept)       1.5005809 0.19227253  7.804448  0.0000
## sloppy_roommateC  0.0145184 0.12453186  0.116584  0.9074
## first_similarNC  -0.2679739 0.09691191 -2.765128  0.0066
## genderN          -0.0601296 0.19253516 -0.312305  0.7554
## 
##  Correlation: 
##                  (Intr) slpp_C frs_NC
## sloppy_roommateC -0.025              
## first_similarNC  -0.076  0.161       
## genderN           0.636  0.021 -0.086
## 
## Standardized residuals:
##          Min           Q1          Med           Q3          Max 
## -1.355403234 -0.448201059 -0.221400515  0.005400028  3.949516410 
## 
## Residual standard error: 1.18154 
## Degrees of freedom: 119 total; 115 residual

Now, let’s compare these results to SPSS

SPSS: Omnibus Test

R: Omnibus Test

anova(model_dist, type = "marginal")
## Denom. DF: 115 
##                  numDF  F-value p-value
## (Intercept)          1 60.90940  <.0001
## sloppy_roommateC     1  0.01359  0.9074
## first_similarNC      1  7.64594  0.0066
## genderN              1  0.09753  0.7554

SPSS: Fixed Effects

R: Fixed Effects

# View fixed effects
options(scipen=999) # optional: don't put p-values in scientific notation
as.data.frame(summary(model_dist)$tTable)
##                        Value  Std.Error    t-value              p-value
## (Intercept)       1.50058088 0.19227253  7.8044475 0.000000000003021373
## sloppy_roommateC  0.01451838 0.12453186  0.1165837 0.907393316095992164
## first_similarNC  -0.26797388 0.09691191 -2.7651285 0.006630062223051679
## genderN          -0.06012962 0.19253516 -0.3123046 0.755374589197473578

Confidence intervals

intervals(model_dist)
## Approximate 95% confidence intervals
## 
##  Coefficients:
##                       lower        est.       upper
## (Intercept)       1.1197260  1.50058088  1.88143576
## sloppy_roommateC -0.2321553  0.01451838  0.26119202
## first_similarNC  -0.4599377 -0.26797388 -0.07601004
## genderN          -0.4415047 -0.06012962  0.32124547
## 
##  Correlation structure:
##         lower      est.     upper
## Rho 0.5301396 0.6972369 0.8122025
## 
##  Variance function:
##        lower     est.    upper
## 11 0.9991561 1.222602 1.496018
## 
##  Residual standard error:
##     lower      est.     upper 
## 0.9730747 1.1815398 1.4346652

SPSS: Covariance Parameters

R: Covariance Parameters

getVarCov(model_dist)
## Marginal variance covariance matrix
##       [,1]   [,2]
## [1,] 1.396 1.1900
## [2,] 1.190 2.0867
##   Standard Deviations: 1.1815 1.4446

What if we want to add interactions?

# Treating day as distinguishing factor
model_dist_int <- gls(awish ~ sloppy_roommateC + first_similarNC + genderN + dayR +
                      dayR:sloppy_roommateC + dayR:first_similarNC + dayR:genderN, # specify interactions with :
                  data = dyad_df, # specify dataframe
                  method = "REML", 
                  na.action=na.exclude, # ignore NA values
                  control = ctrl, # reference ctrl we created
                  correlation = corCompSymm(form = ~ 1 | Participant_ID), # tells R this is within-person
                  weights = varIdent(form = ~1 | day)) # gives each 'day' its own variance

# print all results
summary(model_dist_int)
## Generalized least squares fit by REML
##   Model: awish ~ sloppy_roommateC + first_similarNC + genderN + dayR +      dayR:sloppy_roommateC + dayR:first_similarNC + dayR:genderN 
##   Data: dyad_df 
##        AIC      BIC    logLik
##   400.7978 430.6027 -189.3989
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | Participant_ID 
##  Parameter estimate(s):
##       Rho 
## 0.6923991 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | day 
##  Parameter estimates:
##      10      11 
## 1.00000 1.23134 
## 
## Coefficients:
##                            Value  Std.Error   t-value p-value
## (Intercept)            1.5507918 0.19485538  7.958681  0.0000
## sloppy_roommateC       0.0001695 0.12598015  0.001345  0.9989
## first_similarNC       -0.3021855 0.09904984 -3.050843  0.0029
## genderN               -0.0537230 0.19557733 -0.274689  0.7841
## dayR                   0.1347902 0.08982004  1.500670  0.1363
## sloppy_roommateC:dayR -0.0303575 0.08001806 -0.379383  0.7051
## first_similarNC:dayR  -0.0787840 0.04647043 -1.695359  0.0928
## genderN:dayR           0.0438726 0.09106250  0.481786  0.6309
## 
##  Correlation: 
##                       (Intr) slpp_C frs_NC gendrN dayR   sl_C:R f_NC:R
## sloppy_roommateC      -0.025                                          
## first_similarNC       -0.059  0.154                                   
## genderN                0.625  0.055 -0.079                            
## dayR                   0.211  0.010  0.020  0.115                     
## sloppy_roommateC:dayR  0.009  0.083 -0.007  0.044 -0.030              
## first_similarNC:dayR   0.022  0.013  0.241 -0.007 -0.090  0.202       
## genderN:dayR           0.111  0.138  0.002  0.223  0.623  0.068 -0.105
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -1.49489622 -0.45615068 -0.21865985  0.01630524  4.15457370 
## 
## Residual standard error: 1.167647 
## Degrees of freedom: 119 total; 111 residual