Basics

library(tidyverse)

## Warning: package 'ggplot2' was built under R version 4.3.3

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Practice_long <- read_csv("Practice_long.csv")

## Rows: 170 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (9): Dyad, Gender, No, comm_freq, criticism1, criticism2, criticism3, re...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

** This is a course material for PSY6111-01-00. Any questions or comments regarding the material should be addressed to the course instructor, Jeong Eun Cheon (email: cheonje@yonsei.ac.kr) **

Introduction

Let’s first go over the basics of linear regression before we delve into more sophisticated models, such as dyadic data analysis and Actor-Partner Interdependence Models (APIM). We will start by examining how individual-level data can be analyzed using simple and multiple regression. Then, we will extend these concepts to dyadic data using random effects and distinguishable dyad modeling.

Basic Linear Regression

In this section, we will fit some basic linear regression models using individual-level data. Even though the data is nested (i.e., dyads are present), we will temporarily ignore this and treat it as if it were individual-level data.

# View the names of the variables in the dataset
names(Practice_long)

## [1] "Dyad"       "Gender"     "No"         "comm_freq"  "criticism1"
## [6] "criticism2" "criticism3" "rel_sat"    "criticism"

Fitting a Simple Intercept-Only Model

This model estimates the mean of rel_sat (relationship satisfaction) without any predictors.

# Fit a simple linear regression model with no predictors
a <- lm(formula = rel_sat ~ 1 , data = Practice_long)
summary(a)

## 
## Call:
## lm(formula = rel_sat ~ 1, data = Practice_long)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3332 -1.4105  0.0297  1.2168  4.1679 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.7145     0.1302    43.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.697 on 169 degrees of freedom

# Describe the distribution of relationship satisfaction
psych::describe(Practice_long$rel_sat)

##    vars   n mean  sd median trimmed  mad  min  max range skew kurtosis   se
## X1    1 170 5.71 1.7   5.74    5.67 1.99 2.38 9.88   7.5 0.16     -0.7 0.13

Adding a Predictor: Criticism

Next, we add criticism as a predictor to see how it affects relationship satisfaction.

# Fit a linear regression model with criticism as the predictor
b <- lm(formula = rel_sat ~ 1 + criticism , data = Practice_long)
summary(b)

## 
## Call:
## lm(formula = rel_sat ~ 1 + criticism, data = Practice_long)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3311 -1.4044  0.0612  1.2268  4.1769 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.61220    0.52073  10.777   <2e-16 ***
## criticism    0.02518    0.12413   0.203    0.839    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.702 on 168 degrees of freedom
## Multiple R-squared:  0.0002449,  Adjusted R-squared:  -0.005706 
## F-statistic: 0.04115 on 1 and 168 DF,  p-value: 0.8395

# For each 1 unit change in criticism, relationship satisfaction goes up by 0.025

Analyzing Gender Differences

In this model, we examine whether there are differences in relationship satisfaction between men and women.

# Fit a linear regression model with Gender as the predictor
c <- lm(formula = rel_sat ~ Gender , data = Practice_long)
summary(c)

## 
## Call:
## lm(formula = rel_sat ~ Gender, data = Practice_long)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8476 -1.2116  0.0842  1.1355  4.4840 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.2000     0.1759  29.559  < 2e-16 ***
## Gender        1.0289     0.2488   4.136 5.58e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.622 on 168 degrees of freedom
## Multiple R-squared:  0.0924, Adjusted R-squared:  0.08699 
## F-statistic:  17.1 on 1 and 168 DF,  p-value: 5.583e-05

# Calculate the difference in relationship satisfaction between men and women
t.test(rel_sat ~ Gender, Practice_long)

## 
##  Welch Two Sample t-test
## 
## data:  rel_sat by Gender
## t = -4.1356, df = 161.95, p-value = 5.673e-05
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -1.5201451 -0.5375828
## sample estimates:
## mean in group 0 mean in group 1 
##        5.200030        6.228894

# Difference in men and women's relationship satisfaction is approximately 1.029
6.229 - 5.200

## [1] 1.029

Interaction Between Criticism and Gender

We now investigate whether the relationship between criticism and relationship satisfaction is different for men and women by adding an interaction term.

# Fit a linear regression model with an interaction between criticism and Gender
d <- lm(formula = rel_sat ~ criticism*Gender , data = Practice_long)
summary(d)

## 
## Call:
## lm(formula = rel_sat ~ criticism * Gender, data = Practice_long)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.866 -1.198  0.036  1.137  4.617 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        5.8668     0.7373   7.957 2.62e-13 ***
## criticism         -0.1587     0.1706  -0.931   0.3533    
## Gender            -0.9080     0.9948  -0.913   0.3627    
## criticism:Gender   0.4826     0.2369   2.037   0.0432 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.609 on 166 degrees of freedom
## Multiple R-squared:  0.1176, Adjusted R-squared:  0.1017 
## F-statistic: 7.377 on 3 and 166 DF,  p-value: 0.0001138

# For men, the relationship between criticism and relationship satisfaction is -0.159.
# For women, it is -0.159 + 0.483 = 0.324.
# So the relationship is negative for men and positive for women.

# Test for the significance of the interaction
library(interactions)
sim_slopes(model = d, pred = criticism, modx = Gender)

## JOHNSON-NEYMAN INTERVAL 
## 
## When Gender is OUTSIDE the interval [-6.19, 1.00], the slope of criticism
## is p < .05.
## 
## Note: The range of observed values of Gender is [0.00, 1.00]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of criticism when Gender = 0.00 (0): 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -0.16   0.17    -0.93   0.35
## 
## Slope of criticism when Gender = 1.00 (1): 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.32   0.16     1.97   0.05

Switching Reference Groups

In regression models, the reference group is the category used as the baseline when estimating effects in a categorical variable. By default, in R, the category coded as 0 is used as the reference group. In this case, Gender is being used as a predictor (0 = men and 1 = women), so the reference group in the model would be men.

However, sometimes we want to change the reference group to focus on the effects for women. To do this, we create a new variable called women where the value is 1 for women and 0 for men (i.e., reversing the reference group).

# Change the reference group to focus on women
Practice_long <- Practice_long %>% mutate(women = ifelse(Gender == 0, 1, 0))

# Fit the interaction model with women as the focus
e <- lm(formula = rel_sat ~ criticism*women , data = Practice_long)
summary(e)

## 
## Call:
## lm(formula = rel_sat ~ criticism * women, data = Practice_long)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.866 -1.198  0.036  1.137  4.617 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       4.9588     0.6679   7.425 5.61e-12 ***
## criticism         0.3239     0.1644   1.970   0.0505 .  
## women             0.9080     0.9948   0.913   0.3627    
## criticism:women  -0.4826     0.2369  -2.037   0.0432 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.609 on 166 degrees of freedom
## Multiple R-squared:  0.1176, Adjusted R-squared:  0.1017 
## F-statistic: 7.377 on 3 and 166 DF,  p-value: 0.0001138

What this does: By changing the reference group, the model now estimates the effects of criticism on relationship satisfaction (rel_sat) for women (when women = 1) and compares it to men.

Why do this: This approach allows us to interpret the effect of criticism on rel_sat more clearly for women (since the coefficient will now reflect the effect for women directly). The interaction term will show how the relationship between criticism and satisfaction differs between men and women.

Moving to Dyadic Data: APIM and Random Intercepts

In dyadic data, fitting a random intercept model allows us to account for between-dyad variability. This means that each dyad (or couple) can have its own level of relationship satisfaction (intercept), reflecting the differences between dyads in their overall levels of satisfaction.

library(nlme)

## 
## Attaching package: 'nlme'

## The following object is masked from 'package:dplyr':
## 
##     collapse

# Fit a model with random intercepts for each dyad
APIM_mlm <- lme(rel_sat ~ 1,
                data = Practice_long,
                random = ~ 1 | Dyad,
                na.action = na.omit)

summary(APIM_mlm)

## Linear mixed-effects model fit by REML
##   Data: Practice_long 
##        AIC      BIC    logLik
##   669.5704 678.9601 -331.7852
## 
## Random effects:
##  Formula: ~1 | Dyad
##          (Intercept) Residual
## StdDev: 0.0001701197 1.697392
## 
## Fixed effects:  rel_sat ~ 1 
##                Value Std.Error DF  t-value p-value
## (Intercept) 5.714462  0.130184 85 43.89527       0
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.96368957 -0.83099092  0.01752416  0.71687973  2.45549684 
## 
## Number of Observations: 170
## Number of Groups: 85

# Calculate the intraclass correlation (ICC)
0.0001701/(0.0001701 + 1.697)

## [1] 0.0001002257

Intraclass Correlation Coefficient (ICC): The ICC indicates how much of the variability in relationship satisfaction is due to between-dyad differences. A higher ICC means that the dyads (couples) themselves account for a large portion of the variability in satisfaction.

In our current example, we have a near-zero ICC. This occurs because the data was simulated or generated, and in this case, the variability between dyads is minimal. A near-zero ICC suggests that the dyads do not account for much of the variation in relationship satisfaction, meaning that there isn’t much difference between dyads.

When you encounter a near-zero ICC, it implies that there is little to no between-dyad variability, and the structure of the data may not require the additional complexity of a multilevel model. In such cases, running a normal regression without accounting for the nested structure (random intercepts) may be more appropriate.

However, for practice purposes, let’s proceed with adding a predictor to the model and continue with dyadic analysis.

Adding A Predictor to the Model

After fitting a simple random intercept model, we can now add a predictor, such as criticism, to examine how it affects relationship satisfaction. This allows us to model the relationship between an individual’s criticism level and their satisfaction, while still accounting for the fact that dyads may have different baseline satisfaction levels.

# Fit a linear mixed-effects model with random intercepts for dyads
mlm <- lme(rel_sat ~ criticism,
           data = Practice_long,
           random = ~ 1 | Dyad,
           na.action = na.omit)
summary(mlm)

## Linear mixed-effects model fit by REML
##   Data: Practice_long 
##       AIC      BIC    logLik
##   673.867 686.3628 -332.9335
## 
## Random effects:
##  Formula: ~1 | Dyad
##          (Intercept) Residual
## StdDev: 0.0002042523 1.702228
## 
## Fixed effects:  rel_sat ~ criticism 
##                Value Std.Error DF   t-value p-value
## (Intercept) 5.612197 0.5207336 84 10.777482  0.0000
## criticism   0.025182 0.1241306 84  0.202866  0.8397
##  Correlation: 
##           (Intr)
## criticism -0.968
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.95691168 -0.82502170  0.03595135  0.72071427  2.45381308 
## 
## Number of Observations: 170
## Number of Groups: 85

Reshaping Data for APIM (Person-Period Format)

To properly model both actor and partner effects in dyadic data, we need to reshape the data into person-period format. In this format, each row represents one person in the dyad (e.g., partner A or partner B), and each variable is separated into its actor and partner version (e.g., criticism_A for the actor and criticism_P for the partner).

Reshaping the data this way is crucial for implementing the Actor-Partner Interdependence Model (APIM), which allows us to model how an individual’s behavior affects both their own satisfaction (actor effect) and their partner’s satisfaction (partner effect).

# Create the female (Gender == 1) and male (Gender == 0) datasets
female_data <- Practice_long %>%
  mutate(partnum = 1) %>%
  gather(variable, value, -Dyad, -Gender) %>%
  mutate(Gender = ifelse(Gender == 1, "A", "P")) %>%
  unite(var_gender, variable, Gender) %>%
  spread(var_gender, value)

male_data <- Practice_long %>%
  mutate(partnum = 2) %>%
  gather(variable, value, -Dyad, -Gender) %>%
  mutate(Gender = ifelse(Gender == 1, "P", "A")) %>%
  unite(var_gender, variable, Gender) %>%
  spread(var_gender, value)

# Combine female and male data
dyad_pairwise <- bind_rows(female_data, male_data) %>%
  arrange(Dyad) %>%
  select(Dyad, everything())

APIM: Actor and Partner Effects

Once the data is reshaped, we can fit the APIM to model both actor and partner effects. In this model, we include the criticism levels of both the actor (criticism_A) and the partner (criticism_P) to see how both variables influence the actor’s relationship satisfaction.

# Fit an APIM model with random intercepts for each dyad
APIM_mlm <- lme(rel_sat_A ~ criticism_A + criticism_P,
                data = dyad_pairwise,
                random = ~ 1 | Dyad,
                na.action = na.omit)

summary(APIM_mlm)

## Linear mixed-effects model fit by REML
##   Data: dyad_pairwise 
##        AIC      BIC    logLik
##   678.1732 693.7632 -334.0866
## 
## Random effects:
##  Formula: ~1 | Dyad
##          (Intercept) Residual
## StdDev: 0.0001829362 1.707211
## 
## Fixed effects:  rel_sat_A ~ criticism_A + criticism_P 
##                Value Std.Error DF  t-value p-value
## (Intercept) 5.544689 0.7030019 84 7.887161  0.0000
## criticism_A 0.023900 0.1248142 83 0.191486  0.8486
## criticism_P 0.017905 0.1248142 83 0.143454  0.8863
##  Correlation: 
##             (Intr) crtc_A
## criticism_A -0.669       
## criticism_P -0.669 -0.072
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.95296202 -0.83591391  0.04307722  0.71946769  2.46046546 
## 
## Number of Observations: 170
## Number of Groups: 85

Actor Effect: How the individual’s own criticism level (criticism_A) affects their own relationship satisfaction (rel_sat_A).

Partner Effect: How the partner’s criticism level (criticism_P) affects the individual’s relationship satisfaction.

Distinguishable Dyads

In some dyadic analyses, such as male-female couples, we treat dyads as distinguishable because the two members differ in important ways (e.g., by gender). We can include interaction terms to model these differences explicitly, allowing us to estimate separate effects for men and women.

# Fit an APIM model with distinguishable dyads by gender
APIM_mlm <- lme(rel_sat_A ~ (criticism_A + criticism_P)*women_A,
                data = dyad_pairwise,
                random = ~ 1 | Dyad,
                na.action = na.omit)

summary(APIM_mlm)

## Linear mixed-effects model fit by REML
##   Data: dyad_pairwise 
##        AIC      BIC    logLik
##   665.9444 690.7433 -324.9722
## 
## Random effects:
##  Formula: ~1 | Dyad
##         (Intercept) Residual
## StdDev:   0.3327061 1.582815
## 
## Fixed effects:  rel_sat_A ~ (criticism_A + criticism_P) * women_A 
##                         Value Std.Error DF   t-value p-value
## (Intercept)          5.250601 0.9444245 84  5.559577  0.0000
## criticism_A          0.330466 0.1659384 80  1.991502  0.0498
## criticism_P         -0.075661 0.1721829 80 -0.439424  0.6615
## women_A              0.734271 1.3070548 80  0.561775  0.5758
## criticism_A:women_A -0.486059 0.2395868 80 -2.028737  0.0458
## criticism_P:women_A  0.042175 0.2395868 80  0.176033  0.8607
##  Correlation: 
##                     (Intr) crtc_A crtc_P womn_A c_A:_A
## criticism_A         -0.620                            
## criticism_P         -0.703 -0.091                     
## women_A             -0.692  0.429  0.487              
## criticism_A:women_A  0.408 -0.695  0.093 -0.647       
## criticism_P:women_A  0.487  0.095 -0.721 -0.647 -0.132
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -2.362098558 -0.684791816  0.008982524  0.676471353  2.749830068 
## 
## Number of Observations: 170
## Number of Groups: 85

APIM_mlm <- lme(rel_sat_A ~ (criticism_A + criticism_P)*women_A,
                data = dyad_pairwise,
                random = ~ 1 | Dyad,
                correlation = corCompSymm(form=~1|Dyad), # Further model within-dyad correlation
               weights = varIdent(form=~1|women_A), # Allow different residual variance for men and women
                na.action = na.omit)

summary(APIM_mlm)

## Linear mixed-effects model fit by REML
##   Data: dyad_pairwise 
##        AIC      BIC    logLik
##   667.2405 698.2392 -323.6203
## 
## Random effects:
##  Formula: ~1 | Dyad
##         (Intercept) Residual
## StdDev:   0.5481133 1.669218
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | Dyad 
##  Parameter estimate(s):
##         Rho 
## -0.08368695 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | women_A 
##  Parameter estimates:
##        0        1 
## 1.000000 0.813705 
## Fixed effects:  rel_sat_A ~ (criticism_A + criticism_P) * women_A 
##                         Value Std.Error DF   t-value p-value
## (Intercept)          5.250601 1.0258813 84  5.118137  0.0000
## criticism_A          0.330466 0.1802506 80  1.833373  0.0705
## criticism_P         -0.075661 0.1870338 80 -0.404533  0.6869
## women_A              0.734271 1.3070552 80  0.561775  0.5758
## criticism_A:women_A -0.486059 0.2387926 80 -2.035485  0.0451
## criticism_P:women_A  0.042175 0.2403784 80  0.175454  0.8612
##  Correlation: 
##                     (Intr) crtc_A crtc_P womn_A c_A:_A
## criticism_A         -0.620                            
## criticism_P         -0.703 -0.091                     
## women_A             -0.757  0.469  0.532              
## criticism_A:women_A  0.448 -0.757  0.097 -0.639       
## criticism_P:women_A  0.531  0.097 -0.781 -0.654 -0.132
## 
## Standardized Within-Group Residuals:
##           Min            Q1           Med            Q3           Max 
## -2.1625481369 -0.6763683411 -0.0008082596  0.6527694227  2.8030965875 
## 
## Number of Observations: 170
## Number of Groups: 85

Intro to Multilevel Modeling

Jeong Eun Cheon, Department of Psychology, Yonsei University