Introduction

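In this report we compare two survey‑weighted wage models for Illinois in 2000: Model 1, which includes only the education × race interaction, and Model 2, which includes both main effects and the interaction.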

A final section then illustrates the very same logic on a real‑world birthweight ∼ smoking × race example.

Model 1: Interaction‑Only

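library(survey)      # svyglm()
library(broom)       # tidy()
library(dplyr)       # %>% and filter()
library(kableExtra)  # kbl(), kable_styling()
library(emmeans)     # emmeans()

mod2000 <- svyglm(
  incwage_inflation ~ education_attainment:race_ethnicity
                     + uhrswork + wkswork2
                     + ind + occ + age + chicago_dummy,
  design = design2000
)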




# Filter tidy model output (remove ind and occ terms)
model_tidy_filtered <- tidy(mod2000, conf.int = TRUE) %>%
  filter(!grepl("^ind", term), !grepl("^occ", term))

# Create a model equation string
model_equation <- "incwage_inflation ~ education_attainment:race_ethnicity + uhrswork + wkswork2 + ind + occ + age + chicago_dummy"

# Create table with equation as header
kbl(model_tidy_filtered,
    caption = paste("Regression Model Results:\n", model_equation),
    digits = 3,
    booktabs = TRUE,
    col.names = c("Term", "Estimate", "Std. Error", "Statistic", "P-value", "Conf. Low", "Conf. High")) %>%
  kable_styling(full_width = FALSE, position = "center")
Regression Model Results: incwage_inflation ~ education_attainment:race_ethnicity + uhrswork + wkswork2 + ind + occ + age + chicago_dummy

| Term | Estimate | Std. Error | Statistic | P-value | Conf. Low | Conf. High |
|---|---|---|---|---|---|---|
| (Intercept) | 24894223555867.996 | 157418434806062.156 | 0.158 | 0.874 | -283642297184131.250 | 333430744295867.250 |
| uhrswork | 1230.764 | 31.891 | 38.592 | 0.000 | 1168.258 | 1293.270 |
| age | 900.623 | 15.238 | 59.103 | 0.000 | 870.756 | 930.490 |
| chicago_dummy1 | -2459.179 | 500.431 | -4.914 | 0.000 | -3440.013 | -1478.345 |
| education_attainmentLess than High School:race_ethnicityWhite (non-Hispanic or Latino) | -24894223477285.125 | 157415906522578.531 | -0.158 | 0.874 | -333425788839660.188 | 283637341885089.938 |
| education_attainmentHigh School Diploma:race_ethnicityWhite (non-Hispanic or Latino) | -24894223472238.305 | 157426320013545.750 | -0.158 | 0.874 | -333446199038003.688 | 283657752093527.062 |
| education_attainmentSome College:race_ethnicityWhite (non-Hispanic or Latino) | -24894223466668.320 | 157427085714059.062 | -0.158 | 0.874 | -333447699787873.188 | 283659252854536.562 |
| education_attainmentBachelor’s Degree:race_ethnicityWhite (non-Hispanic or Latino) | -24894223441582.031 | 157419309464055.656 | -0.158 | 0.874 | -333432458491182.250 | 283644011608018.125 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityWhite (non-Hispanic or Latino) | -24894223413543.625 | 157418461729768.938 | -0.158 | 0.874 | -333430796923390.500 | 283642350096303.250 |
| education_attainmentLess than High School:race_ethnicityHispanic or Latino | -24894223478645.219 | 157416433497210.000 | -0.158 | 0.874 | -333426821699208.125 | 283638374741917.625 |
| education_attainmentHigh School Diploma:race_ethnicityHispanic or Latino | -24894223472768.824 | 157447277589078.938 | -0.158 | 0.874 | -333487275405772.375 | 283698828460234.750 |
| education_attainmentSome College:race_ethnicityHispanic or Latino | -24894223467725.809 | 157427349782227.375 | -0.158 | 0.874 | -333448217356482.312 | 283659770421030.688 |
| education_attainmentBachelor’s Degree:race_ethnicityHispanic or Latino | -24894223458217.449 | 157418030016239.281 | -0.158 | 0.874 | -333429950819450.562 | 283641503903015.688 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityHispanic or Latino | -24894223443730.332 | 157417914640856.812 | -0.158 | 0.874 | -333429724671860.688 | 283641277784400.062 |
| education_attainmentLess than High School:race_ethnicityBlack (non-Hispanic or Latino) | -24894223470886.422 | 157424464650228.656 | -0.158 | 0.874 | -333442562567115.875 | 283654115625343.000 |
| education_attainmentHigh School Diploma:race_ethnicityBlack (non-Hispanic or Latino) | -24894223471490.832 | 157419713415945.250 | -0.158 | 0.874 | -333433250257527.188 | 283644803314545.562 |
| education_attainmentSome College:race_ethnicityBlack (non-Hispanic or Latino) | -24894223466551.203 | 157424209972735.875 | -0.158 | 0.874 | -333442063400737.562 | 283653616467635.188 |
| education_attainmentBachelor’s Degree:race_ethnicityBlack (non-Hispanic or Latino) | -24894223455880.551 | 157437316621185.094 | -0.158 | 0.874 | -333467752120335.875 | 283679305208574.750 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityBlack (non-Hispanic or Latino) | -24894223441036.793 | 157419309464055.656 | -0.158 | 0.874 | -333432458490637.000 | 283644011608563.375 |
| education_attainmentLess than High School:race_ethnicityOther (non-Hispanic or Latino) | -24894223482969.691 | 157416206904170.812 | -0.158 | 0.874 | -333426377586374.250 | 283637930620434.875 |
| education_attainmentHigh School Diploma:race_ethnicityOther (non-Hispanic or Latino) | -24894223475445.223 | 157418016978118.844 | -0.158 | 0.874 | -333429925282261.438 | 283641478331370.938 |
| education_attainmentSome College:race_ethnicityOther (non-Hispanic or Latino) | -24894223471128.570 | 157417517908790.250 | -0.158 | 0.874 | -333428947113510.312 | 283640500171253.188 |
| education_attainmentBachelor’s Degree:race_ethnicityOther (non-Hispanic or Latino) | -24894223455165.410 | 157437391450233.188 | -0.158 | 0.874 | -333467898782838.250 | 283679451872507.375 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityOther (non-Hispanic or Latino) | -24894223440831.969 | 157416577445372.844 | -0.158 | 0.874 | -333427103796491.625 | 283638656914827.625 |

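When we exclude the main effects, the model does not crash, but its estimates are unusable. The intercept and the 20 education × race cell indicators are perfectly collinear (the 20 indicators sum to the intercept column), so these coefficients come out at about ±2.49 × 10^13 with standard errors near 1.57 × 10^14 and p = 0.874 throughout, while the controls (uhrswork, age, chicago_dummy) are estimated normally. How can we prevent this? The simplest fix is to include the main effects, as in Model 2 below.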
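The collinearity behind the Model 1 output can be reproduced on a toy design. The sketch below is only an illustration: the factors `edu` and `race` and their levels are made up, and it uses base R's `model.matrix()` and `qr()` rather than the survey design. It counts the columns of the design matrix and its rank for three formulas; a rank below the column count means at least one coefficient cannot be estimated.

```r
# Toy balanced design: 3 education levels x 2 race levels (hypothetical labels)
d <- expand.grid(
  edu  = factor(c("e1", "e2", "e3")),
  race = factor(c("r1", "r2"))
)
d <- rbind(d, d)  # two observations per cell

# Interaction only, with an intercept: one indicator per cell plus the intercept
X_int_only <- model.matrix(~ edu:race, data = d)
# Main effects plus interaction: treatment contrasts, no redundancy
X_full     <- model.matrix(~ edu * race, data = d)
# One combined cell factor entered as a main effect: also no redundancy
X_cell     <- model.matrix(~ interaction(edu, race), data = d)

rank_of <- function(X) qr(X)$rank

summary_tab <- data.frame(
  formula = c("~ edu:race", "~ edu * race", "~ interaction(edu, race)"),
  columns = c(ncol(X_int_only), ncol(X_full), ncol(X_cell)),
  rank    = c(rank_of(X_int_only), rank_of(X_full), rank_of(X_cell))
)
print(summary_tab)
```

Only the interaction-only formula has more columns than its rank. An ordinary `lm()` fit would typically report an `NA` coefficient in that situation; in the survey-weighted fit above the redundancy instead appears as enormous, offsetting estimates.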

Model 2: Main Effects + Interaction

# Model 2: Education * Race interaction
mod2000b <- svyglm(
  incwage_inflation ~ education_attainment * race_ethnicity +
                     uhrswork + wkswork2 +
                     ind + occ + age + chicago_dummy,
  design = design2000
)

# Filter tidy model output (remove ind and occ terms)
model2_tidy_filtered <- tidy(mod2000b, conf.int = TRUE) %>%
  filter(!grepl("^ind", term), !grepl("^occ", term))

# Create model equation string
model2_equation <- "incwage_inflation ~ education_attainment * race_ethnicity + uhrswork + wkswork2 + ind + occ + age + chicago_dummy"

# Display regression results table
kbl(model2_tidy_filtered,
    caption = paste("Model 2 Results:\n", model2_equation),
    digits = 3,
    booktabs = TRUE,
    col.names = c("Term", "Estimate", "Std. Error", "Statistic", "P-value", "Conf. Low", "Conf. High")) %>%
  kable_styling(full_width = FALSE, position = "center")
Model 2 Results: incwage_inflation ~ education_attainment * race_ethnicity + uhrswork + wkswork2 + ind + occ + age + chicago_dummy

| Term | Estimate | Std. Error | Statistic | P-value | Conf. Low | Conf. High |
|---|---|---|---|---|---|---|
| (Intercept) | 78553.604 | 5970.008 | 13.158 | 0.000 | 66852.526 | 90254.682 |
| education_attainmentHigh School Diploma | 5047.160 | 592.361 | 8.520 | 0.000 | 3886.147 | 6208.173 |
| education_attainmentSome College | 10617.119 | 615.753 | 17.243 | 0.000 | 9410.258 | 11823.981 |
| education_attainmentBachelor’s Degree | 35703.208 | 838.982 | 42.555 | 0.000 | 34058.823 | 37347.593 |
| education_attainmentMaster’s Degree or Higher | 63741.123 | 1334.627 | 47.759 | 0.000 | 61125.285 | 66356.962 |
| race_ethnicityHispanic or Latino | -1360.121 | 718.242 | -1.894 | 0.058 | -2767.859 | 47.616 |
| race_ethnicityBlack (non-Hispanic or Latino) | 6398.267 | 1271.590 | 5.032 | 0.000 | 3905.979 | 8890.554 |
| race_ethnicityOther (non-Hispanic or Latino) | -5684.876 | 1297.491 | -4.381 | 0.000 | -8227.929 | -3141.824 |
| uhrswork | 1230.812 | 31.961 | 38.510 | 0.000 | 1168.170 | 1293.455 |
| age | 900.622 | 15.235 | 59.115 | 0.000 | 870.761 | 930.482 |
| chicago_dummy1 | -2458.981 | 500.471 | -4.913 | 0.000 | -3439.893 | -1478.069 |
| education_attainmentHigh School Diploma:race_ethnicityHispanic or Latino | 829.021 | 924.551 | 0.897 | 0.370 | -983.078 | 2641.119 |
| education_attainmentSome College:race_ethnicityHispanic or Latino | 301.862 | 1028.685 | 0.293 | 0.769 | -1714.338 | 2318.062 |
| education_attainmentBachelor’s Degree:race_ethnicityHispanic or Latino | -15275.258 | 1888.524 | -8.088 | 0.000 | -18976.722 | -11573.795 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityHispanic or Latino | -28826.559 | 4609.893 | -6.253 | 0.000 | -37861.845 | -19791.274 |
| education_attainmentHigh School Diploma:race_ethnicityBlack (non-Hispanic or Latino) | -5651.528 | 1366.415 | -4.136 | 0.000 | -8329.671 | -2973.386 |
| education_attainmentSome College:race_ethnicityBlack (non-Hispanic or Latino) | -6281.893 | 1383.763 | -4.540 | 0.000 | -8994.036 | -3569.750 |
| education_attainmentBachelor’s Degree:race_ethnicityBlack (non-Hispanic or Latino) | -20696.615 | 1747.421 | -11.844 | 0.000 | -24121.520 | -17271.709 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityBlack (non-Hispanic or Latino) | -33890.941 | 3072.765 | -11.029 | 0.000 | -39913.489 | -27868.393 |
| education_attainmentHigh School Diploma:race_ethnicityOther (non-Hispanic or Latino) | 2477.261 | 1829.004 | 1.354 | 0.176 | -1107.545 | 6062.068 |
| education_attainmentSome College:race_ethnicityOther (non-Hispanic or Latino) | 1224.228 | 1604.974 | 0.763 | 0.446 | -1921.484 | 4369.939 |
| education_attainmentBachelor’s Degree:race_ethnicityOther (non-Hispanic or Latino) | -7898.584 | 1977.978 | -3.993 | 0.000 | -11775.376 | -4021.792 |
| education_attainmentMaster’s Degree or Higher:race_ethnicityOther (non-Hispanic or Latino) | -21603.137 | 3187.711 | -6.777 | 0.000 | -27850.978 | -15355.297 |

Interpretation of Main Effects (Conditional on Reference Levels)

Note: When an interaction is in the model, each main‐effect coefficient represents the effect only at the reference level of the other variable.
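One way to see why is to write Model 2 out with dummy variables. The notation here is introduced only for this illustration and is not taken from the model output: \(E_e\) and \(R_r\) are indicators for the non-reference education and race levels, and \(\beta\), \(\gamma\), \(\delta\) are the education, race, and interaction coefficients.

\[
\widehat{\text{income}} = \beta_0 + \sum_{e} \beta_e E_e + \sum_{r} \gamma_r R_r + \sum_{e}\sum_{r} \delta_{er} E_e R_r + \text{controls}
\]

For Whites every \(R_r\) is 0, so the premium for education level \(e\) is just \(\beta_e\). For race group \(r\) the same premium is \(\beta_e + \delta_{er}\). Likewise \(\gamma_r\) is the race gap only where every \(E_e\) is 0, that is, at Less than HS.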

  • Intercept
    • Value: 78,553.60
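    • Meaning: Predicted income for White individuals with Less than HS when every control is at its baseline (0 usual hours, age 0, non‑Chicago, and the reference categories of the remaining controls). No worker has these values, so the intercept is an extrapolation; it differs from the 57,074.71 reported for this group in the emmeans table, which evaluates the controls at typical values instead.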
  • High School Diploma (education_attainmentHigh School Diploma)
    • Estimate: +5,047.16 (p < .001)

    • Meaning: Among Whites, earning a HS diploma adds $5,047.16 versus Less than HS.

    • Check:

      62,121.87 - 57,074.71 = 5,047.16
  • Some College (education_attainmentSome College)
    • Estimate: +10,617.12 (p < .001)

    • Meaning: Among Whites, Some College adds $10,617.12 versus Less than HS.

    • Check:

      67,691.83 - 57,074.71 = 10,617.12
  • Bachelor’s Degree (education_attainmentBachelor's Degree)
    • Estimate: +35,703.21 (p < .001)

    • Meaning: Among Whites, a Bachelor’s adds $35,703.21 versus Less than HS.

    • Check:

      92,777.92 - 57,074.71 = 35,703.21
  • Master’s Degree or Higher (education_attainmentMaster's Degree or Higher)
    • Estimate: +63,741.12 (p < .001)

    • Meaning: Among Whites, a Master’s+ adds $63,741.12 versus Less than HS.

    • Check:

      120,815.83 - 57,074.71 = 63,741.12
  • Hispanic or Latino Main Effect (race_ethnicityHispanic or Latino)
    • Estimate: –1,360.12 (p = .058)

    • Meaning: Among those with Less than HS, Hispanics are predicted to earn $1,360.12 less than Whites; this gap is not statistically significant at the 5% level.

    • Check:

      55,714.59 - 57,074.71 = -1,360.12
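The checks above can be repeated in a few lines of R. This is a minimal sketch that copies the estimated marginal means by hand from the emmeans table in this report (the vector names are made up for the illustration). It also covers the Black and Other main effects, which follow the same logic and should reproduce the coefficients 6,398.27 and -5,684.88 from the Model 2 table.

```r
# Estimated marginal means copied from the emmeans table in this report
white <- c(lths = 57074.71, hs = 62121.87, some_college = 67691.83,
           bachelors = 92777.92, masters_plus = 120815.83)
lths  <- c(white = 57074.71, hispanic = 55714.59,
           black = 63472.98, other = 51389.83)

# Education main effects: premium over Less than HS among Whites
edu_premium_white <- round(white[-1] - white["lths"], 2)

# Race main effects: gap relative to Whites among those with Less than HS
race_gap_lths <- round(lths[-1] - lths["white"], 2)

print(edu_premium_white)
print(race_gap_lths)
```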

Interpretation of Interaction with Hispanic or Latino

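Each interaction term shows how the return to an education level differs for Hispanic workers relative to White workers. Equivalently, it is how much the Hispanic-White gap at that education level differs from the gap at Less than HS.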

  • HS Diploma × Hispanic (…High School Diploma:race_ethnicityHispanic or Latino)
    • Estimate: +829.02 (n.s.)

    • Meaning: The Hispanic HS premium is $829 above the White HS premium.

    • Check:

      (61,590.77 - 55,714.59) - (62,121.87 - 57,074.71)
      = 5,876.18 - 5,047.16 = 829.02
  • Some College × Hispanic
    • Estimate: +301.86 (n.s.)
    • Meaning: Hispanic Some College premium is $301.86 above the White Some College premium.
    • Check: analogous to above.
  • Bachelor’s × Hispanic
    • Estimate: –15,275.26 (p < .001)
    • Meaning: Hispanic Bachelor’s return is $15,275.26 below the White Bachelor’s return.
  • Master’s+ × Hispanic
    • Estimate: –28,826.56 (p < .001)

    • Meaning: Hispanic Master’s+ return is $28,826.56 below the White Master’s+ return.

    • Check:

      (90,629.15 - 55,714.59) - (120,815.83 - 57,074.71)
      = 34,914.56 - 63,741.12 = -28,826.56
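The same difference-in-differences check can be run for all four Hispanic interaction terms at once, including the two whose arithmetic is not written out above. The sketch below is an illustration only: the cell means are typed in from the emmeans table in this report, and the object names are arbitrary.

```r
edu <- c("Less than HS", "HS Diploma", "Some College", "Bachelor's", "Master's+")

# Estimated marginal means copied from the emmeans table, in the order of `edu`
white    <- c(57074.71, 62121.87, 67691.83, 92777.92, 120815.83)
hispanic <- c(55714.59, 61590.77, 66633.57, 76142.54, 90629.15)

# Education premium within each group, relative to Less than HS
premium_white <- white - white[1]
premium_hisp  <- hispanic - hispanic[1]

# Interaction = Hispanic premium minus White premium
did <- round(premium_hisp - premium_white, 2)
names(did) <- edu
print(did[-1])  # drop the reference level, which is 0 by construction
```

The four values should agree with the interaction estimates in the Model 2 table up to rounding.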

Take‑away:
- The education main effects show the return to each credential for Whites (reference race).
- The race main effect shows the race gap at Less than HS (reference education).
- The interaction terms quantify how the Hispanic return to each credential differs from the White return, i.e. the difference in education premiums between the two groups.
- All checks use the predicted means shown in the emmeans post‑hoc table below.
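To recover a group-specific premium from the coefficient table, add the relevant interaction to the education main effect. As a worked illustration with the Model 2 estimates, the Hispanic premiums over Less than HS are:

\[
\begin{aligned}
\text{Bachelor's, Hispanic:}\quad & 35\,703.21 - 15\,275.26 = 20\,427.95 \\
\text{Master's+, Hispanic:}\quad & 63\,741.12 - 28\,826.56 = 34\,914.56
\end{aligned}
\]

These agree with the differences between the corresponding emmeans cells: 76,142.54 - 55,714.59 = 20,427.95 and 90,629.15 - 55,714.59 = 34,914.56.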

Estimated Marginal Means & Plot (Income by Education and Race Category to confirm the results shown above)

# emmeans post-hoc table for Model 2
emm2000b <- emmeans(
  mod2000b,
  ~ education_attainment * race_ethnicity,
  nuisance = c("ind", "occ", "age", "uhrswork", "wkswork2", "chicago_dummy")
)

tidy(emm2000b, conf.int = TRUE) %>%
  kable(
    caption = "Model 2: Predicted Income by Education and Race (2000) Post-Hoc Results (emmeans)",
    digits = 2,
    format.args = list(big.mark = ",")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Model 2: Predicted Income by Education and Race (2000) Post-Hoc Results (emmeans)

| education_attainment | race_ethnicity | estimate | std.error | df | conf.low | conf.high | statistic | p.value |
|---|---|---|---|---|---|---|---|---|
| Less than High School | White (non-Hispanic or Latino) | 57,074.71 | 748.33 | 181,458 | 55,608.00 | 58,541.42 | 76.27 | 0 |
| High School Diploma | White (non-Hispanic or Latino) | 62,121.87 | 575.64 | 181,458 | 60,993.62 | 63,250.12 | 107.92 | 0 |
| Some College | White (non-Hispanic or Latino) | 67,691.83 | 574.50 | 181,458 | 66,565.83 | 68,817.83 | 117.83 | 0 |
| Bachelor’s Degree | White (non-Hispanic or Latino) | 92,777.92 | 738.84 | 181,458 | 91,329.80 | 94,226.04 | 125.57 | 0 |
| Master’s Degree or Higher | White (non-Hispanic or Latino) | 120,815.83 | 1,226.07 | 181,458 | 118,412.76 | 123,218.91 | 98.54 | 0 |
| Less than High School | Hispanic or Latino | 55,714.59 | 675.74 | 181,458 | 54,390.16 | 57,039.02 | 82.45 | 0 |
| High School Diploma | Hispanic or Latino | 61,590.77 | 787.51 | 181,458 | 60,047.26 | 63,134.28 | 78.21 | 0 |
| Some College | Hispanic or Latino | 66,633.57 | 886.77 | 181,458 | 64,895.52 | 68,371.63 | 75.14 | 0 |
| Bachelor’s Degree | Hispanic or Latino | 76,142.54 | 1,749.97 | 181,458 | 72,712.63 | 79,572.44 | 43.51 | 0 |
| Master’s Degree or Higher | Hispanic or Latino | 90,629.15 | 4,448.81 | 181,458 | 81,909.58 | 99,348.73 | 20.37 | 0 |
| Less than High School | Black (non-Hispanic or Latino) | 63,472.98 | 1,248.94 | 181,458 | 61,025.09 | 65,920.86 | 50.82 | 0 |
| High School Diploma | Black (non-Hispanic or Latino) | 62,868.61 | 696.08 | 181,458 | 61,504.31 | 64,232.91 | 90.32 | 0 |
| Some College | Black (non-Hispanic or Latino) | 67,808.20 | 705.81 | 181,458 | 66,424.83 | 69,191.57 | 96.07 | 0 |
| Bachelor’s Degree | Black (non-Hispanic or Latino) | 78,479.57 | 1,184.30 | 181,458 | 76,158.38 | 80,800.76 | 66.27 | 0 |
| Master’s Degree or Higher | Black (non-Hispanic or Latino) | 93,323.16 | 2,650.64 | 181,458 | 88,127.97 | 98,518.35 | 35.21 | 0 |
| Less than High School | Other (non-Hispanic or Latino) | 51,389.83 | 1,281.76 | 181,458 | 48,877.62 | 53,902.05 | 40.09 | 0 |
| High School Diploma | Other (non-Hispanic or Latino) | 58,914.26 | 1,378.02 | 181,458 | 56,213.37 | 61,615.15 | 42.75 | 0 |
| Some College | Other (non-Hispanic or Latino) | 63,231.18 | 1,051.22 | 181,458 | 61,170.82 | 65,291.54 | 60.15 | 0 |
| Bachelor’s Degree | Other (non-Hispanic or Latino) | 79,194.46 | 1,511.72 | 181,458 | 76,231.53 | 82,157.39 | 52.39 | 0 |
| Master’s Degree or Higher | Other (non-Hispanic or Latino) | 93,527.82 | 2,722.86 | 181,458 | 88,191.08 | 98,864.56 | 34.35 | 0 |

Plot

library(plotly)

# Build the plotting data from the emmeans object.
# as.data.frame() keeps the factor columns and adds emmean, SE, df,
# lower.CL and upper.CL.
plot_df <- as.data.frame(emm2000b)
# 1) Subset to just White & Hispanic or Latino
plot_df_sub <- plot_df %>%
  filter(race_ethnicity %in% c(
    "White (non-Hispanic or Latino)",
    "Hispanic or Latino"
  ))

# 2) Interactive Plotly chart
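plot_ly(
  data = plot_df_sub,
  x = ~education_attainment,
  y = ~emmean,
  color = ~race_ethnicity,
  # brewer.pal() needs n >= 3, so request 3 colours and keep the first 2
  colors = RColorBrewer::brewer.pal(3, "Set1")[1:2],
  type = 'scatter',
  mode = 'lines+markers',
  error_y = list(
    type       = "data",
    symmetric  = FALSE,
    array      = ~upper.CL - emmean,   # distance up to the upper limit
    arrayminus = ~emmean - lower.CL,   # distance down to the lower limit
    thickness  = 1.5,
    width      = 5
  ),
  # hovertemplate cannot do arithmetic such as y - error, so the whole
  # hover label, including the CI limits, is built in R instead
  text = ~paste0(
    "<b>", race_ethnicity, "</b><br>",
    "Education: ", education_attainment, "<br>",
    "Predicted Income: $",
    formatC(emmean, format = "f", digits = 0, big.mark = ","), "<br>",
    "95% CI: [ ",
    formatC(lower.CL, format = "f", digits = 0, big.mark = ","), ", ",
    formatC(upper.CL, format = "f", digits = 0, big.mark = ","), " ]"
  ),
  hoverinfo = "text"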
) %>%
  layout(
    title = list(
      text = "<b>Interactive Predicted Income by Education & Race</b>",
      font = list(size = 20)
    ),
    xaxis = list(
      title     = "Education Level",
      tickangle = -45,
      tickfont  = list(size = 12),
      titlefont = list(size = 14)
    ),
    yaxis = list(
      title     = "Predicted Inflation‑Adjusted Income",
      tickfont  = list(size = 12),
      titlefont = list(size = 14)
    ),
    legend = list(
      title       = list(text = "<b>Race/Ethnicity</b>"),
      orientation = "h",
      x           = 0.3,
      y           = -0.2
    )
  )

Real‑World Example: Birthweight ~ Smoking * Race

Source: University of Zurich, Interpreting Interactions: https://www.ebpi.uzh.ch/dam/jcr%3A5764104b-a3b3-451d-828d-34bed6c804fb/InteractionsStataR20170622.pdf

library(MASS)
data(birthwt)
birthwt$smoke <- factor(birthwt$smoke, 0:1, c("non-smoker", "smoker"))
birthwt$race <- factor(birthwt$race, 1:3, c("white", "black", "other"))
birthwt$nonwhite <- birthwt$race != "white"
birthwt$nonwhite <- factor(as.numeric(birthwt$nonwhite), 0:1, c("white", "nonwhite"))
head(birthwt[, c("bwt", "low", "smoke", "nonwhite", "age", "lwt")])
##     bwt low      smoke nonwhite age lwt
## 85 2523   0 non-smoker nonwhite  19 182
## 86 2551   0 non-smoker nonwhite  33 155
## 87 2557   0     smoker    white  20 105
## 88 2594   0     smoker    white  21 108
## 89 2600   0     smoker    white  18 107
## 91 2622   0 non-smoker nonwhite  21 124
# Fit the model
m3 <- lm(bwt ~ smoke * nonwhite, data = birthwt)

# 1) Tidy the m3 output (with confidence intervals)
model3_tidy <- tidy(m3, conf.int = TRUE)

# 2) (Optional) If you wanted to remove any terms, you'd filter here.
#    But in this simple model we'll keep all four terms.
#    e.g. model3_tidy <- model3_tidy %>% filter(term != "(Intercept)")

# 3) Display via kable
model3_tidy %>%
  kbl(
    caption   = "Model Results:\n bwt ~ smoke * nonwhite",
    digits    = 3,
    booktabs  = TRUE,
    col.names = c(
      "Term", "Estimate", "Std. Error", "t value",
      "P‑value", "Conf. Low", "Conf. High"
    )
  ) %>%
  kable_styling(full_width = FALSE, position = "center")
Model Results: bwt ~ smoke * nonwhite

| Term | Estimate | Std. Error | t value | P‑value | Conf. Low | Conf. High |
|---|---|---|---|---|---|---|
| (Intercept) | 3428.750 | 102.726 | 33.378 | 0.000 | 3226.086 | 3631.414 |
| smokesmoker | -601.904 | 139.577 | -4.312 | 0.000 | -877.270 | -326.537 |
| nonwhitenonwhite | -604.243 | 130.737 | -4.622 | 0.000 | -862.170 | -346.316 |
| smokesmoker:nonwhitenonwhite | 419.488 | 217.086 | 1.932 | 0.055 | -8.795 | 847.770 |

Reference group: White non‑smokers

  • Intercept ((Intercept) = 3 428.7 g)
    Estimated mean birthweight for white mothers who do not smoke (all dummies = 0).

  • Main effect of smoking (smokesmoker = –601.9 g)
    The effect of smoking among white mothers: smokers’ babies weigh on average 601.9 g less than their non‑smoking counterparts.

  • Main effect of non‑white (nonwhitenonwhite = –604.2 g)
    The effect of non‑white race among non‑smokers: non‑white non‑smokers’ babies weigh on average 604.2 g less than those of white non‑smokers.

  • Interaction (smokesmoker:nonwhitenonwhite = +419.5 g)
    The “extra” adjustment that applies only when both conditions hold. For non‑white smokers, the two main‑effect penalties (–601.9 g for smoking, –604.2 g for non‑white) are partially offset by +419.5 g, yielding a predicted mean of
    \[ 3\,428.7 - 601.9 - 604.2 + 419.5 = 2\,642.1\;\text{g}. \]
    With p = 0.055, the interaction falls just short of significance at the 5% level. The example confirms that every coefficient in a model with an interaction is interpreted relative to the reference levels: the smoking effect is “for whites,” the race effect is “for non‑smokers,” and the interaction is the additional departure for non‑white smokers.
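The other three cells follow from the same coefficients. As a worked illustration using the three-decimal estimates from the model table, the last line also gives the smoking effect among non-white mothers, which is the smoking main effect plus the interaction:

\[
\begin{aligned}
\text{white, non-smoker:}\quad & 3\,428.750 \\
\text{white, smoker:}\quad & 3\,428.750 - 601.904 = 2\,826.846 \\
\text{non-white, non-smoker:}\quad & 3\,428.750 - 604.243 = 2\,824.507 \\
\text{smoking effect, non-white:}\quad & -601.904 + 419.488 = -182.416
\end{aligned}
\]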

We verify these cell means using the emmeans package:

# install.packages("emmeans")  # if you haven’t already
library(emmeans)

# Compute estimated marginal means for each smoke × nonwhite cell
emm1 <- emmeans(
  m3,
  ~ smoke * nonwhite
)

# View the table of EMMs with standard errors and 95% CIs

tidy(emm1, conf.int = TRUE) %>%
  kable(
    caption = "Birthweight model (m3): Predicted BWT by Smoker and Race",
    digits = 2,
    format.args = list(big.mark = ",")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Birthweight model (m3): Predicted BWT by Smoker and Race

| smoke | nonwhite | estimate | std.error | df | conf.low | conf.high | statistic | p.value |
|---|---|---|---|---|---|---|---|---|
| non-smoker | white | 3,428.75 | 102.73 | 185 | 3,226.09 | 3,631.41 | 33.38 | 0 |
| smoker | white | 2,826.85 | 94.49 | 185 | 2,640.42 | 3,013.27 | 29.92 | 0 |
| non-smoker | nonwhite | 2,824.51 | 80.87 | 185 | 2,664.97 | 2,984.05 | 34.93 | 0 |
| smoker | nonwhite | 2,642.09 | 145.28 | 185 | 2,355.48 | 2,928.70 | 18.19 | 0 |
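Because this model has a separate parameter for every smoke × race cell, the same numbers can also be recovered from raw cell means without emmeans. The sketch below is one way to check that, using only base R and the `MASS` data; it refits the model on a fresh copy of the data, and the object names `bw`, `fit`, and `cell` are chosen just for this illustration.

```r
# Start from an untouched copy of the data shipped with MASS
bw <- MASS::birthwt
bw$smoke    <- factor(bw$smoke, 0:1, c("non-smoker", "smoker"))
bw$nonwhite <- factor(as.numeric(bw$race != 1), 0:1, c("white", "nonwhite"))

fit <- lm(bwt ~ smoke * nonwhite, data = bw)

# Raw mean birthweight in each smoke x race cell
cell <- with(bw, tapply(bwt, list(smoke, nonwhite), mean))
print(round(cell, 2))

# Smoking effect within each race group, and their difference
smoke_white    <- cell["smoker", "white"]    - cell["non-smoker", "white"]
smoke_nonwhite <- cell["smoker", "nonwhite"] - cell["non-smoker", "nonwhite"]
did <- smoke_nonwhite - smoke_white

# Side by side: interaction coefficient vs. difference-in-differences
print(c(coefficient = unname(coef(fit)["smokesmoker:nonwhitenonwhite"]),
        did         = unname(did)))
```

The two printed values should coincide, and `cell` should match the estimate column of the emmeans table above.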