Difference-in-Difference

2022-08-10

1. Introduction

2. DiD Theory

3. Replication: Study by David Card and Alan B. Krueger about the effect of a raise in minimum wages on employment.

1. Introduction

In economics, researchers often use natural or quasi-experimental settings. These make it possible to study a given change in the environment that splits the population at hand into a treatment group and a control group.

When can we use DiD?

We want to evaluate a program or treatment.
We have treatment and control groups.
We observe them before and after.

BUT:

Treatment is not random.
Other things were happening while the program was in effect.
We can’t control for all the potential confounders.

Key Assumption

Trend in control group approximates what would have happened in the treatment group in the absence of the treatment.

2. Difference-in-Difference theory

Objectives

Identify a quasi-experimental technique appropriate for estimating a treatment effect in a given situation.
Apply the DiD method to estimate a treatment effect, and evaluate its validity.

Quasi-experiment: a situation where you, as the researcher, did not assign people to treatment or control. The context isolates the pathway between treatment and outcome.

External validity 👍

with-without + before-after

The DiD approach includes a before-after comparison for a treatment and control group. This is a combination of:

A cross-sectional comparison (= compare a sample that was treated to a non-treated control group)
A before-after comparison (= compare treatment group with itself, before and after the treatment)

DiD Model

Regression: $Y_{i} = \beta_{0} + \beta_{1}T_{i} + \beta_{2}P_{i} + \beta_{3}T_{i}P_{i} + u_{i}$

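$Y_{i} =$ outcome of observation $i$
$T_{i} =$ 1 if observation $i$ belongs to the treatment group, 0 otherwise
$P_{i} =$ 1 if observation $i$ is from the period after the event, 0 otherwise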

DiD Estimator

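$\hat{\beta_{3}} = (\overline{Y}^{TG,AT} - \overline{Y}^{TG,BT}) - (\overline{Y}^{CG,AT} - \overline{Y}^{CG,BT})$

TG = treatment group, CG = control group, BT = before treatment, AT = after treatment.

In terms of the model coefficients, the same double difference of the expected cell means isolates $\beta_{3}$:

$((\beta_{0}+\beta_{1}+\beta_{2}+\beta_{3}) - (\beta_{0}+\beta_{1})) - ((\beta_{0}+\beta_{2})-(\beta_{0})) = \beta_{3}$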

Parallel Trend Assumption

To obtain an unbiased estimate of the treatment effect one needs to make a parallel trend assumption. That means without the change in the environment, the change in the outcome variable would have been the same for the two groups (counterfactual outcome).
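
The assumption can also be stated formally. The potential-outcomes notation below is an addition for illustration and is not used elsewhere in these slides: $Y^{0}$ denotes the outcome a unit would have had without the treatment, and $T$ is the treatment group dummy from the DiD model.

```latex
E\left[\,Y^{0}_{AT} - Y^{0}_{BT} \mid T = 1\,\right]
  = E\left[\,Y^{0}_{AT} - Y^{0}_{BT} \mid T = 0\,\right]
```

The left-hand side is never observed for the treated group after the event, which is why the assumption cannot be tested directly and can only be made plausible.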

Validity of the DiD estimator

The validity of the DiD approach is closely related to the similarity of the treatment and control group. Hence, some plausibility checks should be conducted:

Compute Placebo-DiD for periods without a change in the environment.
For (longer) time series: check and demonstrate the parallel time trends.
Use an alternative control group (if available): the estimate should be the same.
Replace Y by an alternative outcome which is definitely independent of the treatment (the DiD estimator should be 0).
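
The first check in the list above can be sketched as follows. This is a minimal illustration on simulated data, not part of the replication: the data frame toy, its columns and all numbers are invented, and the treatment is assumed to start in period 3.

```r
# Simulated panel (invented numbers): 200 units, 3 periods,
# treatment starts in period 3 for the first half of the units.
set.seed(1)
n <- 200
toy <- expand.grid(id = 1:n, period = 1:3)
toy$treated <- as.integer(toy$id <= n / 2)
# Common trend of +1 per period, level gap of 2, true effect of 3 in period 3
toy$y <- 10 + 2 * toy$treated + 1 * toy$period +
  3 * toy$treated * (toy$period == 3) + rnorm(nrow(toy))

# Placebo DiD: pretend the treatment happened between periods 1 and 2
pre <- subset(toy, period %in% 1:2)
pre$fake_post <- as.integer(pre$period == 2)
placebo_est <- coef(lm(y ~ treated * fake_post, data = pre))[["treated:fake_post"]]

# Actual DiD: period 2 (before) vs. period 3 (after)
act <- subset(toy, period %in% 2:3)
act$post <- as.integer(act$period == 3)
actual_est <- coef(lm(y ~ treated * post, data = act))[["treated:post"]]

round(c(placebo = placebo_est, actual = actual_est), 2)
```

If the groups share a common trend before the event, the placebo estimate should be close to zero, while the actual estimate should be close to the simulated effect of 3.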

2.1 Common problems

Ashenfelter Dip

Earnings of participants often fall just before they enter a training program, which complicates the measurement of the treatment effect.

- Over-estimation of the treatment effect: if the dip is temporary, earnings would have recovered even without the program.

- The treatment under study is specific to a particular target group. (The effect of a treatment measured among adults over 50 years of age will not necessarily carry over to people under 18.)
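
The over-estimation can be illustrated with a small numeric sketch. All numbers are invented: both groups are assumed to earn 100 in normal times, the true program effect is 5, and participants suffer a temporary dip of 20 just before entering the program.

```r
# Invented earnings, for illustration only
true_effect <- 5
dip <- 20  # temporary pre-program drop among participants

treated_before <- 100 - dip          # measured during the dip
treated_after  <- 100 + true_effect  # dip has recovered, plus program effect
control_before <- 100
control_after  <- 100

did_est <- (treated_after - treated_before) - (control_after - control_before)
bias <- did_est - true_effect

c(did_estimate = did_est, true_effect = true_effect, bias = bias)
```

In this stylised case the DiD estimate is 25 rather than 5: the whole dip is wrongly attributed to the program.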

Estimate depending on functional form

A functional form refers to the algebraic form of a relationship between a dependent variable and regressors or explanatory variables.

- Consider the difference in logs (which compares relative rather than absolute changes) and check whether the estimate changes.
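
A small sketch of why the functional form matters. The cell means below are invented: both groups double their outcome between the two periods, so trends are parallel in logs but not in levels. The helper did() is hypothetical and only computes the double difference of four cell means.

```r
# Invented cell means: both groups double their outcome
means <- data.frame(
  group  = c("control", "control", "treated", "treated"),
  period = c("before", "after", "before", "after"),
  y      = c(10, 20, 20, 40)
)

# Hypothetical helper: double difference of the four cell means
did <- function(y, group, period) {
  cell <- function(g, p) y[group == g & period == p]
  (cell("treated", "after") - cell("treated", "before")) -
    (cell("control", "after") - cell("control", "before"))
}

did_levels <- did(means$y, means$group, means$period)
did_logs   <- did(log(means$y), means$group, means$period)

c(levels = did_levels, logs = did_logs)
```

In levels the estimate is 10, in logs it is zero (up to floating point error), although the data are the same. Which specification is appropriate depends on whether parallel trends are more plausible in absolute or in relative terms.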

Long-term effects vs. reliability

Parallel trends are more plausible over a short time period than over a long one.
But from a policy point of view, the long-term effect is usually of interest.

Heterogeneous effects

The DiD estimator can also be applied if both groups are affected by a policy change but with different doses.
But: the DiD estimator may be misleading if the intensity of the response differs between the two groups.

3. Replication

In this section I replicate a study by David Card and Alan B. Krueger on the effect of a raise in minimum wages on employment.

Conventional economic theory suggests that in a labour market with perfect competition an increase in the minimum wage leads to an increase in unemployment.

In April 1992, the U.S. state of New Jersey (NJ) raised the minimum wage from $4.25 to $5.05. Card and Krueger (1994) use a DiD approach and find no evidence that this increase reduced employment in fast food restaurants; their estimates instead point to an increase in employment in NJ relative to the control group.

The control group in their setting is the neighbouring U.S. state of Pennsylvania (PA), which was not subject to this policy change. The authors surveyed a representative sample of fast food restaurants in NJ and PA before and after the raise of the minimum wage. This setting can be regarded as quasi-experimental, as the two states are not identical in many aspects and the legislative procedure to raise the minimum wage was not initiated at random.

DiD Regression Model

$emp_{it} = \beta_{0} + \beta_{1}NJ_{i} + \beta_{2}POST_{t} + \beta_{3}NJ_{i}POST_{t} + u_{it}$

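$i$: denotes a fast food restaurant
$t$: denotes time
$emp_{it}$: number of employees in restaurant $i$ at time $t$
$NJ_{i}$: 1 if restaurant $i$ is located in NJ, 0 if it is located in PA
$POST_{t}$: 1 for the second survey wave (November 1992), 0 for the first wave (February 1992)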

$emp^{PA,feb} = \beta_{0} + u$
$emp^{PA,nov} = \beta_{0} + \beta_{2} + u$
$emp^{NJ,feb} = \beta_{0} + \beta_{1} + u$
$emp^{NJ,nov} = \beta_{0} + \beta_{1} + \beta_{2} + \beta_{3} + u$

Difference-in-Difference Estimator:

$\hat{\beta_{3}} = (\overline{emp}^{NJ,nov} - \overline{emp}^{NJ,feb}) - (\overline{emp}^{PA,nov} - \overline{emp}^{PA,feb})$

$counterfactual = emp^{NJ,feb} - (emp^{PA,feb} - emp^{PA,nov})$

$counterfactual = (\beta_{0} + \beta_{1}) - ((\beta_{0}) - (\beta_{0} + \beta_{2}))$

$counterfactual = \beta_{0} + \beta_{1} + \beta_{2}$
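
As a quick plausibility check, the counterfactual formula can be evaluated with the rounded emptot means reported in section 3.2. The variable names below are chosen for this sketch only.

```r
# Rounded emptot means from the descriptive statistics (section 3.2)
nj_feb <- 20.4; pa_feb <- 23.3
nj_nov <- 21.0; pa_nov <- 21.2

# NJ in November if it had followed the same change as PA
nj_counterfactual_nov <- nj_feb - (pa_feb - pa_nov)

# Distance between actual and counterfactual NJ employment
effect <- nj_nov - nj_counterfactual_nov

round(c(counterfactual = nj_counterfactual_nov, effect = effect), 1)
```

This gives a counterfactual of about 18.3 FTE and an effect of about 2.7 FTE. The small gap to the 2.75 obtained from the unrounded data is due to rounding of the inputs.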

The following R packages are required:

library(dplyr)
library(readr)
library(ggplot2)
library(tidyr)
library(sjlabelled)
library(ggrepel)
library(scales)
library(ggpubr)
library(plm)
library(lmtest)
library(stringi)
library(tidyverse)

3.1 The Dataset

Generate a variable that measures employment. According to the paper, the full-time equivalents (FTE, emptot) consist of full-time employees (empft), managers (nmgrs) and part-time employees (emppt), where the latter are weighted by a factor of 0.5. I also compute the share of full-time employees in total FTE (pct_fte).

card_krueger_1994_mod <- card_krueger_1994 %>%
  mutate(emptot = empft + nmgrs + 0.5 * emppt,
         pct_fte = empft / emptot * 100)

3.2 Descriptive statistics

Distribution of restaurants

Table 2 in the paper shows extensive descriptive statistics of the dataset. Some of them are replicated in this section to check that the data were read and processed without errors.

##         state
## chain    New Jersey Pennsylvania
##   bk     41.1%      44.3%       
##   kfc    20.5%      15.2%       
##   roys   24.8%      21.5%       
##   wendys 13.6%      19.0%
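
A table of column percentages like the one above can be produced in base R along the following lines. This is only a sketch on a tiny invented data frame with the columns state and chain; the code behind the slide may differ.

```r
# Tiny invented example with the same two columns as the real data
toy <- data.frame(
  state = c("New Jersey", "New Jersey", "New Jersey", "New Jersey",
            "Pennsylvania", "Pennsylvania"),
  chain = c("bk", "bk", "kfc", "roys", "bk", "wendys")
)

# Share of each chain within a state (columns sum to 100)
shares <- prop.table(table(chain = toy$chain, state = toy$state), margin = 2) * 100

round(shares, 1)
```

Setting margin = 2 normalises within each state, so the percentages are comparable even though the two states have different numbers of restaurants.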

Pre-treatment means

Next, I am adding the mean values of certain variables of the first wave of the survey grouped by each state …

## # A tibble: 4 × 3
##   variable `New Jersey` Pennsylvania
##   <chr>           <dbl>        <dbl>
## 1 emptot          20.4         23.3 
## 2 pct_fte         32.8         35.0 
## 3 wage_st          4.61         4.63
## 4 hrsopen         14.4         14.5

Post-treatment means

… as well as the mean values of the second wave of the survey. My calculations are in line with the numbers published in the paper.

## # A tibble: 4 × 3
##   variable `New Jersey` Pennsylvania
##   <chr>           <dbl>        <dbl>
## 1 emptot          21.0         21.2 
## 2 pct_fte         35.9         30.4 
## 3 wage_st          5.08         4.62
## 4 hrsopen         14.4         14.7

3.3 Figure 1

In this section I reproduce Figure 1 of the study. It shows the distribution of the fast food restaurants’ wages in the two states, NJ and PA, before and after the treatment.

3.4 Calculating the treatment effect

First differences

Before calculating the DiD estimator with OLS, I derive it by differencing the mean values of employment (emptot) between the groups. This is easily done with functions group_by() and summarise(). We obtain four groups with distinct mean values.

differences <- card_krueger_1994_mod %>%
  group_by(observation, state) %>%
  summarise(emptot = mean(emptot, na.rm = TRUE))

# Treatment group (NJ) before treatment
njfeb <- differences[1,3]

# Control group (PA) before treatment
pafeb <- differences[2,3]

# Treatment group (NJ) after treatment
njnov <- differences[3,3]

# Control group (PA) after treatment
panov <- differences[4,3]

ATE

The Average Treatment Effect (ATE) in this setting can be determined in two ways:

Take the change from February to November within each state, then subtract the change in PA from the change in NJ:

(njnov-njfeb)-(panov-pafeb)

##     emptot
## 1 2.753606

Take the difference between NJ and PA within each month, then subtract the February difference from the November difference:

(njnov-panov)-(njfeb-pafeb)

##     emptot
## 1 2.753606

Digression: counterfactual outcome

First, I use the mean values of variable emptot calculated in the previous step for NJ and PA in February and November. Additionally, we require the outcome of NJ if the treatment (raise of the minimum wage) had not happened. This is called the counterfactual outcome (nj_counterfactual). The DiD assumption states that, in the absence of the treatment, the trends of treatment and control group would have been identical. Hence, without the treatment, employment (emptot) in NJ would have declined from February to November by the same amount as in PA.

# Calculate counterfactual outcome
nj_counterfactual <- tibble(
  observation = c("February 1992","November 1992"), 
  state = c("New Jersey (Counterfactual)","New Jersey (Counterfactual)"),
  emptot = as.numeric(c(njfeb, njfeb-(pafeb-panov)))) 
# Data points for treatment event
intervention <- tibble(
    observation = c("Intervention", "Intervention", "Intervention"),
    state = c("New Jersey", "Pennsylvania", "New Jersey (Counterfactual)"),
    emptot = c(19.35, 22.3, 19.35)) 
# Combine data
did_plotdata <- bind_rows(differences, nj_counterfactual, intervention)

Line Plot

In November 1992, the distance between the actual and counterfactual employment (emptot) of NJ identifies the causal effect of an increase in minimum wages on employment.

3.5 Calculating the DiD estimator

Dummy variable creation

With linear regression, this result can be obtained very easily. First, we need to create two dummy variables. One indicates the start of the treatment (time) and is equal to zero before the treatment and equal to one after the treatment. The other variable separates the observations into a treatment and control group (treated). This dummy variable is equal to one for fast food restaurants located in NJ and equal to zero for fast food restaurants located in PA.

card_krueger_1994_mod <- mutate(card_krueger_1994_mod,
                                time = ifelse(observation == "November 1992", 1, 0),
                                treated = ifelse(state == "New Jersey", 1, 0)
                                )

Estimation via lm()

The DiD estimator is an interaction between both dummy variables. This interaction can be specified with the : operator in the formula of function lm() in addition to the individual dummy variables.

Hint: Another possibility is to only specify time*treated in the formula which adds the individual dummy variables automatically.
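
A minimal sketch of both formula variants on a small invented data set (the data frame toy and its values are made up for illustration). On the replication data the same formula would be used with data = card_krueger_1994_mod.

```r
# Toy data (invented): 2 groups x 2 periods, three restaurants per cell
toy <- data.frame(
  treated = rep(c(0, 0, 1, 1), each = 3),
  time    = rep(c(0, 1, 0, 1), each = 3),
  emptot  = c(23, 24, 22,  21, 22, 20,  20, 21, 19,  21, 22, 20)
)

# Explicit main effects plus the interaction term
did_lm <- lm(emptot ~ time + treated + time:treated, data = toy)

# Shorthand: time*treated expands to the same three terms
did_lm_short <- lm(emptot ~ time * treated, data = toy)

# DiD estimate = coefficient of the interaction
coef(did_lm)[["time:treated"]]
```

The cell means in the toy data are 23 and 21 for the control group and 20 and 21 for the treated group, so the interaction coefficient equals (21 - 20) - (21 - 23) = 3 in both specifications.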

The coefficient of time:treated is the difference-in-differences estimator. The treatment (raise of the minimum wage) leads on average to an increase of employment (emptot) in NJ by 2.75 FTE.

Fixed Effects

Balanced sample: In R we can obtain this result with a fixed effects model, which is also called a within estimator. I am using R package plm to run this regression with function plm() and argument model = "within". Beforehand, the data has to be declared as a panel with function pdata.frame(). Variable sheet uniquely identifies each fast food restaurant. Additionally, we need the function coeftest() from R package lmtest in order to obtain standard errors clustered by sheet. The time-invariant dummy treated is absorbed by the restaurant fixed effects and therefore does not appear in the output.

# Declare as panel data
panel <- pdata.frame(card_krueger_1994_mod, "sheet")

# Within model
did.reg <- plm(emptot ~ time + treated + time:treated, 
               data = panel, model = "within")

# obtain clustered standard errors
coeftest(did.reg, vcov = function(x) 
  vcovHC(x, cluster = "group", type = "HC1"))

## 
## t test of coefficients:
## 
##               Estimate Std. Error t value Pr(>|t|)  
## time2          -2.2833     1.2465 -1.8319  0.06775 .
## time2:treated   2.7500     1.3359  2.0585  0.04022 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
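
With only two periods and a balanced panel, the within estimator gives the same coefficient as a regression of each restaurant's change in employment on the treatment dummy. The sketch below illustrates this with base R on an invented panel. It does not use plm, and it does not reproduce the clustered standard errors.

```r
# Toy balanced panel (invented): 6 restaurants, each observed twice
toy <- data.frame(
  sheet   = rep(1:6, each = 2),
  time    = rep(c(0, 1), times = 6),
  treated = rep(c(0, 0, 0, 1, 1, 1), each = 2),
  emptot  = c(23, 21, 25, 22, 20, 19,  18, 20, 22, 23, 20, 20)
)

# One row per restaurant: change in employment (after minus before)
before <- toy[toy$time == 0, ]
after  <- toy[toy$time == 1, ]
stopifnot(identical(before$sheet, after$sheet))  # rows are aligned
changes <- data.frame(
  sheet    = before$sheet,
  treated  = before$treated,
  d_emptot = after$emptot - before$emptot
)

# First-difference regression: slope on treated is the DiD estimate
fd_est <- coef(lm(d_emptot ~ treated, data = changes))[["treated"]]

# Same number from the pooled interaction model
pooled_est <- coef(lm(emptot ~ time * treated, data = toy))[["time:treated"]]

c(first_difference = fd_est, pooled = pooled_est)
```

Differencing removes everything that is constant within a restaurant, which is the same idea as the fixed effects in the within model.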

Bibliography

Baker, A. (September 25, 2019). Difference-in-Differences Methodology. Retrieved on August 3 2022, from https://andrewcbaker.netlify.app/2019/09/25/difference-in-differences-methodology/
Leppert, P. (September 18, 2020). R Tutorial: Difference-in-Differences. Retrieved on August 3 2022, from https://rpubs.com/phle/r_tutorial_difference_in_differences
Ashenfelter, O. (2007). Orley Ashenfelter, Distinguished Fellow 2007. Retrieved on August 3 2022, from https://www.aeaweb.org/about-aea/honors-awards/distinguished-fellows/orley-ashenfelter
