Interpreting main effects with and without an interaction term

Without an interaction term

\[conflict = b_0 + b_1sleep_1 + b_2stress_2 + \epsilon\]

\(b_0\): the intercept, i.e., the predicted outcome when all predictors are 0.

\(b_1\): the slope (main effect) of sleep, holding stress constant.

\(b_2\): the slope (main effect) of stress, holding sleep constant.

Here, only the intercept is interpreted with the predictors at zero. Each slope is the same at every value of the other predictor.
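
To see why a slope in this model does not depend on the other predictor, compare the predicted values at two sleep values one unit apart while holding stress fixed. This is only a worked restatement of the equation above:

\[(b_0 + b_1(sleep_1 + 1) + b_2stress_2) - (b_0 + b_1sleep_1 + b_2stress_2) = b_1\]

The stress term cancels, so the difference is \(b_1\) whatever value stress takes.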


With an interaction

\[conflict = b_0 + b_1sleep_1 + b_2stress_2 + b_3(sleep_1 \times stress_2) + \epsilon\]

\[conflict = b_0 + b_2stress_2 + (b_1 + b_3stress_2)sleep_1 + \epsilon\]

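  • With an interaction term, the slope of \(sleep_1\) is now \(b_1 + b_3stress_2\), so it depends on the value of \(stress_2\). The “main effects” are now interpreted as simple effects: each is the effect of one predictor when the other predictor is 0.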
    • If \(stress_2 = 0\), then the slope of \(sleep_1\) is \(b_1\).
    • If \(stress_2 = 1\), then the slope of \(sleep_1\) is \(b_1 + b_3\).
    • So, \(b_3\) is the change in the slope of \(sleep_1\) for each 1-unit increase in \(stress_2\) (it can be negative as well as positive).
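
The simple-slope logic above can be checked numerically. The following is a minimal sketch on simulated data, not the lab dataset: the data frame `sim`, the helper `sleep_slope()`, and the generating coefficients are all made up for illustration.

```r
# Simulated data; every number below is an assumption chosen for illustration.
set.seed(1)
n <- 200
sleep  <- rbinom(n, 1, 0.5)   # binary predictor (0/1)
stress <- runif(n, 0, 6)      # continuous predictor
conflict <- 25 - 8 * sleep + 2 * stress - 1.5 * sleep * stress + rnorm(n, sd = 5)
sim <- data.frame(sleep, stress, conflict)

fit <- lm(conflict ~ sleep * stress, data = sim)
b <- coef(fit)

# Simple slope of sleep at a given stress value: b1 + b3 * stress
sleep_slope <- function(stress_value) {
  unname(b["sleep"] + b["sleep:stress"] * stress_value)
}

slope_at_0 <- sleep_slope(0)  # equals b1
slope_at_1 <- sleep_slope(1)  # equals b1 + b3

# Cross-check with predictions: sleep = 1 minus sleep = 0, both at stress = 1
grid <- data.frame(sleep = c(0, 1), stress = c(1, 1))
pred_diff_at_1 <- unname(diff(predict(fit, newdata = grid)))

c(slope_at_0 = slope_at_0, slope_at_1 = slope_at_1, pred_diff_at_1 = pred_diff_at_1)
```

The difference between `slope_at_1` and `slope_at_0` is the interaction coefficient, and `pred_diff_at_1` matches `slope_at_1` because both describe the effect of sleep when stress is 1.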

The issue with analyzing subsets

Here is a dataset looking at the effect of sleep condition (sleep deprivation = 0, normal sleep = 1) and gender (female = 0, male = 1) on relationship conflict.

Let’s say we are only interested in the sleep-deprived group and want to test whether sleep-deprived men and women reported different levels of relationship conflict.

We can take three approaches:

  1. Planned contrasts
  2. Pairwise comparisons
  3. Creating a subset with only the sleep-deprived participants

df <- read.csv("/Users/kareenadelrosario/Downloads/simulated_sleep_stress_conflict_binary_sleep_gender.csv")

head(df)
##   sleep   stress gender  conflict
## 1     0 2.876944      1 46.564295
## 2     0 4.369032      1 24.113348
## 3     1 4.485928      0 17.399893
## 4     1 3.796584      0 23.426948
## 5     1 4.758071      1  9.074021
## 6     0 5.606076      0 19.340843

Planned contrasts (e.g., dummy coding)

model_interaction <- lm(conflict ~ factor(sleep) * factor(gender), data = df)

summary(model_interaction)
## 
## Call:
## lm(formula = conflict ~ factor(sleep) * factor(gender), data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -26.4239  -6.6321   0.2746   6.5583  26.0689 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     24.8043     2.0575  12.056  < 2e-16 ***
## factor(sleep)1                  -7.9739     2.9097  -2.740  0.00732 ** 
## factor(gender)1                  1.0422     3.2891   0.317  0.75204    
## factor(sleep)1:factor(gender)1  -0.7322     4.4284  -0.165  0.86901    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.89 on 96 degrees of freedom
## Multiple R-squared:  0.1298, Adjusted R-squared:  0.1026 
## F-statistic: 4.771 on 3 and 96 DF,  p-value: 0.003827
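
With dummy coding, each coefficient in this output is a comparison against the reference cell (sleep = 0, gender = 0). As a rough sketch, the four cell means can be rebuilt from the printed estimates. The variable names are hypothetical and the numbers are copied from the output above.

```r
# Coefficients copied from the summary(model_interaction) output
b0       <- 24.8043   # (Intercept): sleep-deprived women (sleep = 0, gender = 0)
b_sleep  <- -7.9739   # factor(sleep)1
b_gender <-  1.0422   # factor(gender)1
b_int    <- -0.7322   # factor(sleep)1:factor(gender)1

cell_means <- c(
  deprived_female = b0,
  deprived_male   = b0 + b_gender,
  normal_female   = b0 + b_sleep,
  normal_male     = b0 + b_sleep + b_gender + b_int
)
round(cell_means, 2)

# Gender difference (male - female) within each sleep condition
gender_diff_deprived <- cell_means[["deprived_male"]] - cell_means[["deprived_female"]]
gender_diff_normal   <- cell_means[["normal_male"]] - cell_means[["normal_female"]]
```

Because sleep deprivation is the reference level, `gender_diff_deprived` is just the `factor(gender)1` coefficient, so that row of the summary already tests the comparison of interest. In the normal sleep group the difference is `b_gender + b_int`.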

Pairwise comparisons

emm <- emmeans::emmeans(model_interaction, specs = c("gender", "sleep"))
pairs(emm)
##  contrast                        estimate   SE df t.ratio p.value
##  gender0 sleep0 - gender1 sleep0    -1.04 3.29 96  -0.317  0.9889
##  gender0 sleep0 - gender0 sleep1     7.97 2.91 96   2.740  0.0362
##  gender0 sleep0 - gender1 sleep1     7.66 2.97 96   2.585  0.0539
##  gender1 sleep0 - gender0 sleep1     9.02 3.29 96   2.741  0.0361
##  gender1 sleep0 - gender1 sleep1     8.71 3.34 96   2.608  0.0508
##  gender0 sleep1 - gender1 sleep1    -0.31 2.97 96  -0.105  0.9996
## 
## P value adjustment: tukey method for comparing a family of 4 estimates
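
The first row is the same comparison as the `factor(gender)1` coefficient, but its p-value is much larger because it has been adjusted for a family of four means. The sketch below reproduces both p-values approximately from the printed t ratio. It assumes the Tukey adjustment refers sqrt(2) times |t| to the studentized range distribution, which is my reading of the adjustment and is not stated in the lab.

```r
# t ratio and residual df for "gender0 sleep0 - gender1 sleep0", copied from the output
t_ratio  <- -0.317
df_resid <- 96
n_means  <- 4   # size of the Tukey family (four cell means)

# Unadjusted two-sided p-value, as reported for a single planned comparison
p_unadjusted <- 2 * pt(-abs(t_ratio), df = df_resid)

# Tukey-adjusted p-value (assumption: studentized range statistic = sqrt(2) * |t|)
p_tukey <- ptukey(sqrt(2) * abs(t_ratio), nmeans = n_means, df = df_resid,
                  lower.tail = FALSE)

round(c(unadjusted = p_unadjusted, tukey = p_tukey), 3)
```

If only the gender comparison within each sleep condition is of interest, the family can probably be narrowed, for example with `pairs(emm, by = "sleep")`, so that fewer comparisons are adjusted for.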

Creating a subset with only the sleep-deprived participants

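The residual degrees of freedom are cut roughly in half, from 96 in the full model to 44 here, because the subset model is fit on only the 46 sleep-deprived participants.

Note that the F-statistic for the model is no longer significant (F(1, 44) = 0.12, p = .73), meaning that the variance explained by gender does not outweigh the unexplained variance. In other words, the model fit is poor.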

library(dplyr)  # provides the %>% pipe

# Keep only the sleep-deprived participants (sleep == 0)
df_filter <- df %>%
  dplyr::filter(sleep == 0)

model_filter <- lm(conflict ~ factor(gender), data = df_filter)

summary(model_filter)
## 
## Call:
## lm(formula = conflict ~ factor(gender), data = df_filter)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -26.4239  -5.8695   0.2168   4.3641  22.5556 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       24.804      1.890  13.122   <2e-16 ***
## factor(gender)1    1.042      3.022   0.345    0.732    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10 on 44 degrees of freedom
## Multiple R-squared:  0.002696,   Adjusted R-squared:  -0.01997 
## F-statistic: 0.1189 on 1 and 44 DF,  p-value: 0.7318
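
The gender estimate here (1.042) is the same as the `factor(gender)1` estimate in the full interaction model. What changes is the standard error and the degrees of freedom used to test it. The sketch below lines the two tests up side by side. The variable names are hypothetical and the values are copied from the two `summary()` outputs.

```r
# Values copied from the two summary() outputs
estimate <- 1.0422                 # male - female, sleep-deprived group
se_full  <- 3.2891; df_full <- 96  # full interaction model (n = 100)
se_sub   <- 3.022;  df_sub  <- 44  # subset model (n = 46)

t_full <- estimate / se_full
t_sub  <- estimate / se_sub

# Two-sided 5% critical values: fewer df means a higher bar for significance
crit_full <- qt(0.975, df = df_full)
crit_sub  <- qt(0.975, df = df_sub)

round(c(t_full = t_full, t_sub = t_sub, crit_full = crit_full, crit_sub = crit_sub), 3)
```

The subset model estimates the error variance from the sleep-deprived group alone, so its standard error can differ from the full model's in either direction, while its critical value is always larger because it has fewer residual degrees of freedom.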