Using Panel Data: Difference-in-Differences Impact of Minimum Wages

Part 1: Reading and questions

a. What is the causal link the paper is trying to reveal?

The paper tries to identify the causal effect of an increase in the minimum wage on employment.

b. What would be the ideal experiment to test this causal link?

The ideal experiment would randomly assign employers in the U.S. to a treatment group that faces an increase in the minimum wage and a control group that faces no change, and then compare employment across the two groups.

c. What is the identification strategy?

The identification strategy compares employment growth at fast-food restaurants in NJ, where the minimum wage rose, with employment growth at fast-food restaurants in eastern PA, where it did not; the two areas have similar seasonal patterns of employment. Alternatively, the authors compare employment changes at NJ stores that were initially paying higher wages (above the new minimum wage) with the changes at lower-wage stores.

d. What are the assumptions / threats to this identification strategy?

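One assumption is that fast-food stores in eastern PA form a natural comparison group for restaurants in NJ, on the grounds that seasonal patterns of employment are similar in the two areas. Put differently, the strategy assumes that, absent the minimum-wage increase, employment in NJ stores would have followed the same trend as employment in PA stores (the parallel-trends assumption). Anything other than the minimum wage that affected the two areas differently over the same period would threaten this strategy.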

Part 2: Replication Analysis

a. Load the data

b. Verify that the data is correct:

data <- read.csv("CardKrueger1994_fastfood.csv")
library(dplyr)

# Compute the mean of each variable by state, then transpose so that each column is a state
fastf <- t(data %>% group_by(state) %>%
             summarise(bk = mean(bk),
                       kfc = mean(kfc),
                       wendys = mean(wendys),
                       roys = mean(roys),
                       emptot = mean(emptot, na.rm = TRUE),
                       emptot2 = mean(emptot2, na.rm = TRUE)))
colnames(fastf) <- c("PA", "NJ")     
rownames(fastf) <- c("state","Burger King", "KFC","Wendy's","Roy Rogers","FTE Wave 1","FTE Wave 2")
fastf <- round(fastf[-1,], 2)
fastf
##                PA    NJ
## Burger King  0.44  0.41
## KFC          0.15  0.21
## Wendy's      0.19  0.14
## Roy Rogers   0.22  0.25
## FTE Wave 1  23.33 20.44
## FTE Wave 2  21.17 21.03
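
The two FTE rows already contain a difference-in-differences estimate. As a rough check, the sketch below recomputes it from the rounded means printed above. The object names (`fte`, `change`, `did_means`) are illustrative and not part of the original script. Because the means are rounded and each wave drops its own missing values, the result need not match a regression on `demp` exactly.

```r
# Rounded FTE means copied by hand from the table above (not recomputed from the CSV)
fte <- matrix(c(23.33, 20.44,
                21.17, 21.03),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Wave 1", "Wave 2"), c("PA", "NJ")))

# Change over time within each state: wave 2 minus wave 1
change <- fte["Wave 2", ] - fte["Wave 1", ]

# Difference-in-differences: change in NJ minus change in PA
did_means <- unname(change["NJ"] - change["PA"])
round(did_means, 2)
```

Average FTE employment rose by about 0.59 in NJ and fell by about 2.16 in PA, so the difference between the two changes is roughly 2.75.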

c. Use OLS to get DID estimator

y1 <- lm(demp ~ state, data = data)
# Create OLS table 
library(stargazer)
stargazer(y1, type = "text", title = "OLS",
          align = TRUE, keep.stat = c("n","adj.rsq"), omit = "Constant",
          dep.var.labels = c("Difference, NJ - PA"), covariate.labels = c("State"))
## 
## OLS
## ========================================
##                  Dependent variable:    
##              ---------------------------
##                  Difference, NJ - PA    
## ----------------------------------------
## State                  2.750**          
##                        (1.154)          
##                                         
## ----------------------------------------
## Observations             384            
## Adjusted R2             0.012           
## ========================================
## Note:        *p<0.1; **p<0.05; ***p<0.01

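The coefficient on the state variable is 2.75, which is significant at the 5% level: between the two waves, FTE employment per store in NJ rose by about 2.75 relative to PA. This is very close to the estimate reported in Table 3 of the paper.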
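
Regressing the change in employment on a state dummy works because, with a single binary regressor, the OLS slope equals the difference in the group means of the outcome. The sketch below illustrates this on a small made-up data frame. The object `toy` and its numbers are hypothetical and are not the Card and Krueger sample, and the sketch assumes that `demp` is defined as `emptot2 - emptot`. It also shows that `lm()` drops rows with a missing outcome by default, which is why the regression can use fewer observations than there are stores in the file.

```r
# Hypothetical toy data: state = 0 for PA, 1 for NJ (made-up numbers)
toy <- data.frame(
  state   = c(0, 0, 0, 1, 1, 1, 1),
  emptot  = c(24, 22, 23, 20, 21, 19, 18),
  emptot2 = c(22, 21, NA, 21, 22, 20, NA)
)

# Assumed definition of the outcome: change in FTE employment between waves
toy$demp <- toy$emptot2 - toy$emptot

# First-difference regression, same form as the model above
fit_fd <- lm(demp ~ state, data = toy)
slope  <- unname(coef(fit_fd)["state"])

# Difference in mean changes, computed by hand on the non-missing rows
by_hand <- mean(toy$demp[toy$state == 1], na.rm = TRUE) -
           mean(toy$demp[toy$state == 0], na.rm = TRUE)

c(slope = slope, by_hand = by_hand, n_used = nobs(fit_fd))
```

The slope and the hand-computed difference agree, and `nobs()` reports only the stores observed in both waves.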

d. What would be the equation of a standard “difference in difference” regression?

The equation is defined as follows for the ith observation:

\[ Y_{i,s,t} = \beta_{0} + \beta_{1}\,\text{state}_{s} + \beta_{2}\,\text{time}_{t} + \beta_{3}\,(\text{state}_{s} \times \text{time}_{t}) + \epsilon_{i,s,t} \]
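
In this specification \(\beta_{3}\), the coefficient on the interaction, is the difference-in-differences estimate: the change over time in the treated state minus the change over time in the control state. A minimal sketch of how this regression could be estimated follows. It stacks the two waves into long format and fits the interaction model on hypothetical toy data; `toy`, `long`, and `emp` are illustrative names and the numbers are made up.

```r
# Hypothetical toy data: state = 0 for PA, 1 for NJ (made-up numbers)
toy <- data.frame(
  state   = c(0, 0, 0, 1, 1, 1),
  emptot  = c(24, 22, 23, 20, 21, 19),
  emptot2 = c(22, 21, 20, 21, 22, 20)
)

# Stack the two waves: time = 0 for wave 1, time = 1 for wave 2
long <- rbind(
  data.frame(state = toy$state, time = 0, emp = toy$emptot),
  data.frame(state = toy$state, time = 1, emp = toy$emptot2)
)

# state * time expands to state + time + state:time
fit_did <- lm(emp ~ state * time, data = long)

# beta_3: coefficient on the interaction term
beta3 <- unname(coef(fit_did)["state:time"])
beta3
```

In a balanced panel the interaction coefficient coincides with the slope from the first-difference regression of the change in employment on the state dummy. The standard errors from this pooled regression ignore that each store appears twice, so they will generally differ from those of the first-difference version.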