First Difference

A difference in the expected value of the outcome between two different sets of covariate values.
Ex. \(E[Y_i|D = 1,X = x] − E[Y_i|D = 0,X = x]\)

Why useful? Comparing outcomes between two groups.
Ex. The difference in the expected income of men and women at fixed levels of education and age.

How: Change the value of only one variable while holding the others constant.
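A minimal R sketch of this idea, using simulated data and hypothetical variable names (not the wage data discussed below): fit a linear model, then predict the outcome twice, changing only D while holding X fixed.

# Minimal sketch: first difference from a fitted linear model (simulated data)
set.seed(1)
n <- 500
D <- rbinom(n, 1, 0.5)              # binary group indicator
X <- rnorm(n, mean = 12, sd = 2)    # continuous covariate
Y <- 1 + 0.5*D + 0.3*X + rnorm(n)   # outcome
fit.fd <- lm(Y ~ D + X)

# E[Y | D = 1, X = 12] - E[Y | D = 0, X = 12]: change only D, hold X constant
predict(fit.fd, newdata = data.frame(D = 1, X = 12)) -
  predict(fit.fd, newdata = data.frame(D = 0, X = 12))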

Interaction term (cross term; multiplicative term)

A new term formed by multiplying two variables together. Ex. D*X

Why useful? Examining the effect of the interaction of two variables on the outcome, in addition to their marginal (standalone) effects.

Interacting a dummy variable (a binary 0/1 variable) with a continuous variable allows the slope of that continuous variable to differ across the two values the dummy variable takes on.

Motivating example:
How does income differ between genders across levels of education? (Here, education is called a moderator.)

\(Wage_i = \alpha + \beta_1 sex + \beta_2 educ + \beta_3 (sex*educ) + \epsilon_i\)
sex: dummy variable, 1 (female), 0 (male)
educ: years of education

\(\overline{Wage}_f = \alpha + \beta_1 + \beta_2 educ + \beta_3 educ = (\alpha + \beta_1) + (\beta_2 + \beta_3)educ\)

\(\overline{Wage}_m = \alpha + \beta_2 educ\)

This allows examining the impact of sex on wage at every level of education.

\(\beta_1\) represents the wage premium (or penalty) for females relative to males when both have zero years of education.

\(\beta_3\) represents the additional effect of education for females relative to males. If \(\beta_3 \gt 0\), the increase in wage for a one-year increase in education is greater for females than for males.
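A minimal sketch of this model on simulated data (the generating coefficient values are made up for illustration), showing how the fitted coefficients map onto the male and female intercepts and slopes:

# Simulated wage data (made-up generating values, for illustration only)
set.seed(2)
n <- 1000
sex  <- rbinom(n, 1, 0.5)                 # 1 = female, 0 = male
educ <- sample(8:20, n, replace = TRUE)   # years of education
wage <- 10 + 2*sex + 1.5*educ + 0.5*sex*educ + rnorm(n, sd = 3)

fit.wage <- lm(wage ~ sex*educ)   # expands to sex + educ + sex:educ
b <- coef(fit.wage)

# Male line:   intercept = alpha,          slope = beta_2
# Female line: intercept = alpha + beta_1, slope = beta_2 + beta_3
c(male_slope   = unname(b["educ"]),
  female_slope = unname(b["educ"] + b["sex:educ"]))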

Combining the First Difference technique with an interaction term

Model
\(Y_i=\beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3(D_i*X_i) + \epsilon_i\), where \(E[\epsilon_i|X_i, D_i]=0\).

\(D_i\): dummy variable (0, 1)
\(X_i\): continuous variable

Procedures
1. Create an interaction/cross term, sex*educ.
2. Get the first difference in wages:
FD: E[wage | female, educ] - E[wage | male, educ] → a vector of wage differences between females and males at every level of the education variable.

\(FD=\beta_1 + \beta_3 educ\) → the wage difference between females and males at each level of education.

Marginal effect
The change in Y for a unit change in D, holding X fixed, is:
\(\frac {\partial Y}{\partial D} = \beta_1 + \beta_3X\)
This marginal effect varies linearly with X whenever \(\beta_3 \neq 0\).
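Continuing the simulated wage sketch above (this reuses the hypothetical fit.wage object), a minimal sketch of evaluating \(\beta_1 + \beta_3 X\) over a range of education values:

# First difference / marginal effect of the dummy, beta_1 + beta_3 * educ
b <- coef(fit.wage)                 # from lm(wage ~ sex*educ) above
educ.grid <- 8:20
fd <- b["sex"] + b["sex:educ"] * educ.grid
data.frame(educ = educ.grid, first.difference = unname(fd))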

Example

Questions: What is the effect of having daughters on the progressive voting tendency of judges? How does it depend on the partisanship of the judge?

Outcome:
progressive: 1 if the judge votes in a feminist direction in more than 50% of cases, 0 otherwise

Covariates:
republican: 1 if appointed by Republican president, 0 otherwise
child: Number of children the judge has
girls: Number of daughters the judge has

Model: Logistic Regression Model
- \(Progressive_i \thicksim Bern(p_i)\)
- \(p_i = logit^{-1}(\beta_0 + \beta_1 republican_i + \beta_2 girls_i + \beta_3 child_i + \beta_4 (republican_i * girls_i))\)

QOI
The effect of having a daughter on the progressive voting tendency of judges appointed by Republican presidents [FD 1]
vs.
The effect of having a daughter on the progressive voting tendency of judges appointed by Democratic presidents [FD 2]

FD 1: E[progressive | Rep, girls = 1, child = 1] - E[progressive | Rep, girls = 0, child = 1]

FD 2: E[progressive | Dem, girls = 1, child = 1] - E[progressive | Dem, girls = 0, child = 1]

Step 1. Estimate the model

library(mvtnorm)
library(ggplot2)
library(tidyverse)

# Read data: keep rows where child > 0
data <- read_csv('./data/dbj.csv')[, -1] %>% subset(., child > 0)
# Create a binary progressive indicator from the progressive.vote share.
data$progressive <- ifelse(data$progressive.vote > 0.5, 1, 0)

fit <- glm(progressive ~ republican*girls + child, data=data, family=binomial(link=logit))
broom::tidy(fit)
# A tibble: 5 x 5
  term             estimate std.error statistic p.value
  <chr>               <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)        -0.574     0.431    -1.33  0.183  
2 republican         -1.46      0.563    -2.59  0.00967
3 girls              -0.182     0.246    -0.739 0.460  
4 child               0.172     0.155     1.11  0.268  
5 republican:girls    0.416     0.312     1.33  0.183  

Step 2. Simulate coefficients (draw coefficient vectors from a multivariate normal distribution centered at the point estimates, with the estimated variance-covariance matrix)

sim <- 10000

sim_coef <- function (fit, nsim=sim){
  rmvnorm(nsim, mean=fit$coefficients, sigma=vcov(fit))
}

sim.coef <- sim_coef(fit, sim)
dim(sim.coef)
[1] 10000     5

Step 3. Write the first-difference function

# inverse logit function g^-1 (x*beta) = p.
# x*beta is expressed as x here.
# returns probability p_i
invlogit <- function(x) {exp(x) / (1 + exp(x))}

first_diff <- function (republican=1, girls, child, sim.coef){
  # covariate profile (same order as the fitted coefficients)
  sim.data <- c(1, republican, girls, child, republican*girls)
  # systematic component: linear predictor X*beta
  # sim.coef is a 10000 x 5 matrix of simulated coefficients
  X_beta <- t(sim.data %*% t(sim.coef))
  # expected probability p_i when the judge has `girls` daughters
  exp.val.1 <-invlogit(X_beta) 
  
  girls_0 <- girls-1
  sim.data.0 <- c(1, republican, girls_0, child, republican*girls_0)
  X_beta.0 <- t(sim.data.0 %*% t(sim.coef))
  exp.val.0 <-invlogit(X_beta.0)
  
  diff <- exp.val.1 - exp.val.0
  
  diff.summary <- c(republican=republican, girls=girls, child=child,
                    est.prob= mean(diff), quantile(diff, c(0.025, 0.975)) )
  
  return(diff.summary)
}

Step 4. Compute the first differences to get estimates and confidence intervals.

# For Republicans
rep.fd <- first_diff(republican = 1, girls = 1, child = 1, sim.coef=sim.coef)
# For Democrats
dem.fd <- first_diff(republican = 0, girls = 1, child = 1, sim.coef=sim.coef)
# store the outcomes
outcomes <- as.data.frame(rbind(rep.fd, dem.fd))
outcomes %>% knitr::kable()
|       | republican| girls| child|   est.prob|       2.5%|     97.5%|
|:------|----------:|-----:|-----:|----------:|----------:|---------:|
|rep.fd |          1|     1|     1|  0.0276450| -0.0441538| 0.0960506|
|dem.fd |          0|     1|     1| -0.0418176| -0.1567749| 0.0672079|

Step 5. Visualize the results

prob.hat.rep <- outcomes$est.prob[1]
prob.hat.dem <- outcomes$est.prob[2]

ggplot(outcomes, aes(y = factor(republican)))+
  geom_point(aes(x=est.prob), size=2)+
  geom_errorbarh(aes(xmin = `2.5%`, xmax = `97.5%`), height = .1) +
  geom_vline(xintercept=0, linetype="dashed", color="gray") +
  labs(x="First Difference in progressive voting probability", y="Political affiliation")+
  annotate(geom = 'text', x=0.03, y = 2.15, label = 'Republican', col = 'red') +
  annotate(geom = 'text', x=0.03, y = 1.15, label = 'Democrat', col = 'Navy') +
  theme_bw() +
  theme(panel.grid = element_blank())

Interpretation of the results

Among judges appointed by Republican presidents, judges who have one daughter and no other children are estimated to be more likely to vote progressively than judges who have only one son; the estimated first difference in probability is about 0.028.

Among judges appointed by Democratic presidents, the estimated first difference is about -0.042; that is, judges who have one daughter and no other children are estimated to be less likely to vote progressively than judges who have only one son.

I cannot tell from these estimates how significant the difference between Republican- and Democratic-appointed judges is, but the analysis suggests that there is a difference.