Using Panel Data: Difference-in-Differences Impact of Minimum Wages

Part 1: Reading and Questions

Based on the seminal paper by David Card and Alan Krueger: Card, D. and Krueger, A. B. (1994). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania. AER 84(4): 772-793.

a. What is the causal link the paper is trying to reveal?

Answer: On April 1, 1992, the minimum wage in New Jersey (NJ) increased from $4.25 to $5.05 per hour. The paper seeks to reveal the causal link between the minimum wage and employment outcomes at the establishment level in the fast-food industry. Specifically, it tests \(H_0: \beta_{minimumWageChange} = 0\) against \(H_a: \beta_{minimumWageChange} \neq 0\).

b. What would be the ideal experiment to test this causal link?

Answer: Any discussion of an ideal experiment has to start from the Fundamental Problem of Causal Inference, which states that we cannot observe the counterfactual. To test the hypothesis above, we would survey a fast-food establishment in NJ, call it X, in February 1992 and record its employment: \(RealEmploy_{Feb}=5\). After the law takes effect in April, we would return to X in November and record its employment again: \(RealEmploy_{Nov}=10\). Now imagine a hypothetical world in which the law did not change in NJ in April. In that world, the second survey wave would visit X in November and record its employment: \(HypotheticalEmploy_{Nov}=8\).

Here, the effect of the minimum-wage policy change on X is: \([RealEmploy_{Nov} - RealEmploy_{Feb}]-[HypotheticalEmploy_{Nov} - RealEmploy_{Feb}]\) = \([10-5] - [8-5]\) = \(5-3\) = \(2\).
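The arithmetic above can be sketched in R. The numbers are the illustrative values from the text, not data, and the variable names are made up for this sketch.

```r
# Illustrative (hypothetical) employment levels for X, taken from the text above
real_feb <- 5            # observed before the law
real_nov <- 10           # observed after the law
hypothetical_nov <- 8    # counterfactual: November employment without the law

# Change with the law minus change without the law
effect <- (real_nov - real_feb) - (hypothetical_nov - real_feb)
effect                   # 2

# The February baseline cancels, so the effect is also the November gap
real_nov - hypothetical_nov
```

The second expression shows that the whole difficulty lies in `hypothetical_nov`, the one quantity that can never be surveyed.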

But can we observe this hypothetical employment for the same X? If we could, this would be the ideal experiment. The answer, however, is “no”.

Therefore, we need a location that is as similar as possible but where the law was not implemented, so that the intervention occurs as if at random (mimicking the features of an experiment; this is also called a quasi-experiment). In that scenario, we can compare employment levels with those of nearby establishments and calculate the effect. This requires a geographically close location (such as eastern Pennsylvania (PA), selected by the authors), which helps ensure comparability when testing the hypothesis above.

c. What is the identification strategy?

Answer: The identification strategy in this article is a natural experiment. Because the minimum-wage law was implemented in NJ but not in eastern PA, and was outside the researchers' control, comparing employment patterns in geographically close locations (e.g., a McDonald's just on one side of the state border and a Burger King just on the other side) can approximate experimental conditions. Also, because the data were collected over a short time frame, there is less chance of spillovers (labor markets often take longer to adjust).

Note that the fast-food industry makes a good case because it employs a large number of low-wage workers, the group for whom the effect of raising the minimum wage from $4.25 to $5.05 per hour should be visible (higher-wage workers such as supervisors are generally not affected by this policy).

d. What are the assumptions / threats to this identification strategy?

Answer:

With reference to the data, there are several threats to this identification strategy:

  • The authors assume that seasonal employment patterns were similar in the two states. If they were not, NJ and eastern PA cannot be compared.
  • The law was approved in 1990. Nothing rules out the possibility that some adjustment had already happened before the baseline survey.
  • Computing the difference-in-differences requires us to assume that, in the absence of the minimum-wage law, the employment trend in NJ would have been the same as in eastern PA. We can never verify that this is true (although we can argue for it logically).
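The last assumption above is usually called the parallel-trends assumption. As a sketch, in notation that is introduced here and not taken from the paper (\(Y^{0}\) denotes FTE employment in the absence of the minimum-wage increase), it can be written as:

\[
E\left[Y^{0}_{NJ,\,Nov} - Y^{0}_{NJ,\,Feb}\right] = E\left[Y^{0}_{PA,\,Nov} - Y^{0}_{PA,\,Feb}\right]
\]

The right-hand side is observable because PA did not receive the increase. The left-hand side contains \(Y^{0}_{NJ,\,Nov}\), which is never observed, so the equality cannot be tested directly.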

Part 2: Replication Analysis

a. Load data from Card and Krueger AER 1994

#install.packages("readr")

setwd("~/OneDrive - University of Georgia/4th Sem PhD/Adv Econometric Applications_Filipski/Assignments/HW5")
library(readr)
CKdata <- read_csv("CardKrueger1994_fastfood.csv")

b. Verify that the data is correct. Reproduce the percentages of Burger King, KFC, Roy Rogers, and Wendy's stores, as well as the Full-Time Equivalent (FTE) employment means in the two waves.

Answer:

#computing means of some variables to reproduce Table 2 (partial):
library(magrittr)
library(dplyr)
Means_Table2 <- CKdata %>% group_by(state) %>% 
  summarize(bk = round(sum(bk)/n()*100, digits=1), 
            kfc = round(sum(kfc)/n()*100, digits=1), 
            roys = round(sum(roys)/n()*100, digits=1),
            wendys = round(sum(wendys)/n()*100, digits=1), 
            FTE1 = round( mean(emptot, na.rm=TRUE), digits=1),
            FTE2 = round( mean(emptot2, na.rm=TRUE), digits=1))

#transpose the mean table so that it looks like Table 2 in paper:
Means_Table2 <- subset(Means_Table2, select = -c(state))
Means_Table2_transp <- as.matrix(t(Means_Table2))
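#group_by() sorts by state code, so the first column is PA (state = 0) and the second is NJ (state = 1)
colnames(Means_Table2_transp) <- c("PA", "NJ")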
rownames(Means_Table2_transp) <- c("a. Burger King", "b. KFC", "c. Roy Rogers", "d. Wendy's", "Wave 1: FTE Employment", "Wave 2: FTE Employment")
Means_Table2_transp
##                          PA   NJ
## a. Burger King         44.3 41.1
## b. KFC                 15.2 20.5
## c. Roy Rogers          21.5 24.8
## d. Wendy's             19.0 13.6
## Wave 1: FTE Employment 23.3 20.4
## Wave 2: FTE Employment 21.2 21.0

These values are exactly the same as those in Table 2 of the paper, so the data are correct.

c. Use a “first-differenced” OLS to obtain their Diff-in-diff estimator

Answer:

library(stargazer)
#first difference for the employment
CKdata$diff_employ <- CKdata$emptot2-CKdata$emptot

#first differenced OLS
fd_ols <- lm(diff_employ ~ state, data = CKdata)

stargazer(fd_ols,
           type="text",
           align=TRUE,
           no.space=TRUE,
           column.labels=c("First differenced OLS"),
           covariate.labels = "Policy Effect",
           title="First Differenced OLS")
## 
## First Differenced OLS
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                             diff_employ        
##                        First differenced OLS   
## -----------------------------------------------
## Policy Effect                 2.750**          
##                               (1.154)          
## Constant                     -2.283**          
##                               (1.036)          
## -----------------------------------------------
## Observations                    384            
## R2                             0.015           
## Adjusted R2                    0.012           
## Residual Std. Error      8.968 (df = 382)      
## F Statistic            5.675** (df = 1; 382)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

Compared to the result in Table 3, the point estimate is almost the same (2.750 here versus 2.760 in Table 3); however, the standard error (1.154 here) differs noticeably from the one reported in the paper.
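One possible reason for the gap in standard errors, offered here as an assumption rather than something the source confirms, is that `lm()` assumes a common error variance in both states, while a difference in means can also be given a standard error that lets each state have its own variance. The sketch below uses simulated data (the group sizes, means, and spreads are made up) to show that the first-differenced OLS slope equals the difference in mean changes, while the two standard errors need not agree.

```r
set.seed(1)
# Simulated (not real) changes in FTE employment; all parameters are hypothetical
n_pa <- 80
n_nj <- 300
sim <- data.frame(
  state = c(rep(0, n_pa), rep(1, n_nj)),
  diff_employ = c(rnorm(n_pa, mean = -2, sd = 11),
                  rnorm(n_nj, mean = 0.5, sd = 8))
)

fit <- lm(diff_employ ~ state, data = sim)

# Slope = difference in mean changes; intercept = mean change in the control group
mean_change <- tapply(sim$diff_employ, sim$state, mean)
slope_by_hand <- unname(mean_change["1"] - mean_change["0"])

# Standard error of a difference in means without assuming equal variances
group_var <- tapply(sim$diff_employ, sim$state, var)
se_unequal <- sqrt(group_var[["0"]] / n_pa + group_var[["1"]] / n_nj)

ols_se <- summary(fit)$coefficients["state", "Std. Error"]
c(ols_slope = unname(coef(fit)["state"]), by_hand = slope_by_hand)
c(ols_se = ols_se, unequal_var_se = se_unequal)
```

With unequal group sizes and unequal spreads, the two standard errors differ even though the point estimates are identical.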

Part 3: Alternative ways of running DiD

d. What would be the equation of a standard “difference in difference” regression?

Answer: Standard DID equation can be written as:

\(\text{EmploymentOutcome}_{it} = \beta + \alpha \times Treatment_i + \tau \times After_t + \gamma \, (Treatment_i \times After_t) + \epsilon_{it}\)

where \(\gamma\) represents the treatment effect, i.e., the effect of the minimum-wage law.
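To see why \(\gamma\) is the difference-in-differences, take the expected outcome \(Y\) in each of the four groups, assuming \(E[\epsilon_{it} \mid Treatment_i, After_t] = 0\) (an assumption added here for the sketch):

\[
\begin{aligned}
E[Y \mid Treatment=0,\ After=0] &= \beta \\
E[Y \mid Treatment=1,\ After=0] &= \beta + \alpha \\
E[Y \mid Treatment=0,\ After=1] &= \beta + \tau \\
E[Y \mid Treatment=1,\ After=1] &= \beta + \alpha + \tau + \gamma
\end{aligned}
\]

The change over time in the treated group is \(\tau + \gamma\), the change in the control group is \(\tau\), and their difference is \((\tau + \gamma) - \tau = \gamma\).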

e. Compute the difference-in-differences estimator “by hand”.

Answer: Using the values from the table in Part 2 (b):

\(\text{Diff-in-Diff Estimate} = [FTE_{NJ,\,wave2} - FTE_{PA,\,wave2}]-[FTE_{NJ,\,wave1} - FTE_{PA,\,wave1}]\)

\(=[21.0-21.2]-[20.4-23.3]\)

\(= -0.2 +2.9\)

\(=2.7\)

Note that, up to rounding, the estimate is the same as the one in Table 3 (Column 3, Row 4).
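The same calculation can be sketched in R. The matrix below simply retypes the rounded FTE means from the table in Part 2 (b); the object names are made up for this sketch.

```r
# Rounded FTE means from the table in Part 2 (b)
fte <- matrix(c(23.3, 20.4,
                21.2, 21.0),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("wave1", "wave2"), c("PA", "NJ")))

# (NJ - PA) after, minus (NJ - PA) before
did_by_hand <- (fte["wave2", "NJ"] - fte["wave2", "PA"]) -
               (fte["wave1", "NJ"] - fte["wave1", "PA"])

# Equivalent ordering: change in NJ minus change in PA
did_changes <- (fte["wave2", "NJ"] - fte["wave1", "NJ"]) -
               (fte["wave2", "PA"] - fte["wave1", "PA"])

round(c(did_by_hand, did_changes), 1)   # 2.7 2.7
```

Because the inputs are rounded to one decimal place, the result can only be expected to match the regression estimates to about one decimal place.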

f. Run the regression you wrote up in part d. Comment on the result

Answer: Reference for the code: Oscar Torres-Reyna (2011)

library(stargazer)
#reshape() below is the base R (stats) function, so the reshape2 package is not needed
#copy the wave-1 variables to names ending in "1" so both waves follow the pattern <name><wave>
CKdata$emptot1 <- CKdata$emptot
CKdata$wage_st1 <- CKdata$wage_st

CKdata$diff_employ <- NULL
#our data is wide currently (one row per store), need to reshape to long (one row per store and wave)
CKdata_long <- reshape(CKdata,
                          idvar= c("state","id"),
                          varying = c("emptot1", "emptot2", "wage_st1", "wage_st2"),
                          sep = "",
                          timevar = "wave",
                          times = 1:2,          #wave takes the values 1 and 2, matching the variable suffixes
                          direction = "long")

#state=treatment (NJ is treated) for our case
#wave2=1 if it is second wave (represents "after"), 0 otherwise
CKdata_long$wave2 <- ifelse(CKdata_long$wave == 2, 1, 0)
CKdata_long$state_wave2 <- CKdata_long$state * CKdata_long$wave2         #interaction term

regTable <- lm(emptot ~ state+ wave2+ state_wave2, data = CKdata_long)
stargazer(regTable,
           type="text",         #type="html" or "latex" did not work for me
           align=TRUE,
           no.space=TRUE,
           column.labels=c("Diff-in-Diff OLS"),
           covariate.labels = c("treatment (state)", "after (wave2)", "treatment*after (state*wave2)" ),
           title="TABLE: DIFF-IN-DIFF USING ORDINARY LEAST SQUARES")
## 
## TABLE: DIFF-IN-DIFF USING ORDINARY LEAST SQUARES
## =========================================================
##                                   Dependent variable:    
##                               ---------------------------
##                                         emptot           
##                                    Diff-in-Diff OLS      
## ---------------------------------------------------------
## treatment (state)                      -2.892**          
##                                         (1.194)          
## after (wave2)                           -2.166           
##                                         (1.516)          
## treatment*after (state*wave2)            2.754           
##                                         (1.688)          
## Constant                               23.331***         
##                                         (1.072)          
## ---------------------------------------------------------
## Observations                              794            
## R2                                       0.007           
## Adjusted R2                              0.004           
## Residual Std. Error                9.406 (df = 790)      
## F Statistic                       1.964 (df = 3; 790)    
## =========================================================
## Note:                         *p<0.1; **p<0.05; ***p<0.01

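Here, the coefficient \(\gamma\) on the interaction term represents the treatment effect, i.e., the effect of the minimum-wage law. The point estimate (2.754) is almost equal to the first-differenced estimate (2.750), but the SE is larger (1.688 versus 1.154), so the estimated effect of the minimum-wage law is not statistically significant in this specification. The pooled regression uses all 794 store-wave observations and treats the two observations from each store as independent, whereas the first-differenced regression uses the 384 stores observed in both waves and differences out time-invariant store characteristics.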
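The contrast between the two specifications can be illustrated on simulated data. This is a minimal sketch under stated assumptions: the panel is balanced, each store has a persistent store-level component, and every number and variable name below is hypothetical rather than taken from the Card and Krueger data.

```r
set.seed(42)
n <- 200                                        # hypothetical number of stores
store <- data.frame(id = 1:n,
                    state = rep(c(0, 1), each = n / 2),
                    effect = rnorm(n, sd = 8))  # persistent store-level component

# Two waves per store; the true interaction effect is set to 2 for illustration
long <- rbind(transform(store, wave2 = 0), transform(store, wave2 = 1))
long$emp <- 20 - 3 * long$state - 2 * long$wave2 +
  2 * long$state * long$wave2 + long$effect + rnorm(2 * n, sd = 3)

# Pooled OLS with an interaction term (standard DiD regression)
pooled <- lm(emp ~ state * wave2, data = long)

# First-differenced OLS on the same stores
wide <- merge(long[long$wave2 == 0, c("id", "state", "emp")],
              long[long$wave2 == 1, c("id", "emp")],
              by = "id", suffixes = c("_1", "_2"))
wide$diff_emp <- wide$emp_2 - wide$emp_1
fd <- lm(diff_emp ~ state, data = wide)

# Columns: point estimate and standard error
rbind(pooled     = summary(pooled)$coefficients["state:wave2", 1:2],
      first_diff = summary(fd)$coefficients["state", 1:2])
```

On a balanced panel the two point estimates coincide exactly. In this simulation the pooled standard error is larger because the persistent store component stays in the residual, while first differencing removes it. In the actual data the pooled sample is unbalanced, which is consistent with the small gap between 2.754 and 2.750.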