Question 1

Calculating the d-i-d estimate manually

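library(dplyr)

# Mean FTE employment for each state (nj) and period (d) cell
did1 <- did %>%
  group_by(nj, d) %>%
  summarise(fte_avg = mean(fte, na.rm = TRUE))
# (NJ post - NJ pre) - (PA post - PA pre)
(21.02743-20.43941)-(21.16558-23.33117)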
## [1] 2.75361
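
The same arithmetic can be written so that each term is named instead of typed in by hand. The sketch below is only illustrative: it re-enters the four cell means from the calculation above into a small data frame, and the mapping of each mean to a state and period is an assumption inferred from the regression coefficients in Question 2. The names `cell_means` and `cell()` are hypothetical helpers, not part of the original analysis.

```r
# Cell means copied from the manual calculation above.
# The state/period labels are inferred, not taken from the raw data.
cell_means <- data.frame(
  nj      = c(0, 0, 1, 1),   # 0 = PA, 1 = NJ
  d       = c(0, 1, 0, 1),   # 0 = pre-treatment, 1 = post-treatment
  fte_avg = c(23.33117, 21.16558, 20.43941, 21.02743)
)

# Hypothetical helper: look up the mean for one state/period cell
cell <- function(state, period) {
  cell_means$fte_avg[cell_means$nj == state & cell_means$d == period]
}

change_nj <- cell(1, 1) - cell(1, 0)   # NJ post minus NJ pre
change_pa <- cell(0, 1) - cell(0, 0)   # PA post minus PA pre
did_estimate <- change_nj - change_pa  # difference-in-differences
print(did_estimate)
```

Naming the two within-state changes makes it easier to see that the estimate is driven mostly by the decline in Pennsylvania rather than by a large increase in New Jersey.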

Question 2

Estimating the d-i-d model via regression

lm1 <- lm(fte ~ nj*d, data = did)
summary(lm1)
## 
## Call:
## lm(formula = fte ~ nj * d, data = did)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.166  -6.439  -1.027   4.473  64.561 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   23.331      1.072  21.767   <2e-16 ***
## nj            -2.892      1.194  -2.423   0.0156 *  
## d             -2.166      1.516  -1.429   0.1535    
## nj:d           2.754      1.688   1.631   0.1033    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.406 on 790 degrees of freedom
##   (26 observations deleted due to missingness)
## Multiple R-squared:  0.007401,   Adjusted R-squared:  0.003632 
## F-statistic: 1.964 on 3 and 790 DF,  p-value: 0.118

Interpretation: The coefficient of interest is the interaction term nj:d, which is the difference-in-differences estimate: the change in average full-time-equivalent (FTE) employment at fast-food establishments in New Jersey from before to after the minimum-wage law was passed, minus the corresponding change in Pennsylvania, where the law was not passed. The estimate of 2.754 matches the manual calculation in Question 1 and suggests that, on average, employment per store in New Jersey rose by about 2.75 FTE relative to Pennsylvania. However, with a p-value of 0.103 the coefficient is not statistically significant at conventional levels, so we cannot conclude that the rise in the minimum wage had a causal effect on employment.
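
The agreement between the manual calculation and the interaction coefficient is not a coincidence: with two dummies and their interaction the model is saturated, so the fitted values are the four cell means. A minimal sketch on simulated data (the data frame `sim` and all its values are made up for illustration and are not the homework data) shows the two numbers coincide:

```r
set.seed(1)

# Simulated 2x2 design: 50 observations in each state/period cell
sim <- expand.grid(nj = 0:1, d = 0:1)[rep(1:4, each = 50), ]
sim$fte <- 20 - 3 * sim$nj - 2 * sim$d + 2.5 * sim$nj * sim$d +
  rnorm(nrow(sim), sd = 9)

# Manual d-i-d from the four cell means (rows = nj, columns = d)
cell <- tapply(sim$fte, list(sim$nj, sim$d), mean)
manual <- unname((cell["1", "1"] - cell["1", "0"]) -
                 (cell["0", "1"] - cell["0", "0"]))

# Regression d-i-d: coefficient on the interaction term
fit <- lm(fte ~ nj * d, data = sim)
reg <- unname(coef(fit)["nj:d"])

print(c(manual = manual, regression = reg))
```

The regression adds what the manual calculation cannot provide, namely a standard error and p-value for the estimate.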


Question 3

Robustness Check on Missing Values

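# Assumption: did3 starts as a copy of did, so the original data stay unchanged
did3 <- did
# Set missing post-treatment FTE values to zero
did3[did3$d == 1 & is.na(did3$fte), "fte"] <- 0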

lm1 <- lm(fte ~ nj*d, data = did3)
summary(lm1)
## 
## Call:
## lm(formula = fte ~ nj * d, data = did3)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.630  -6.439  -0.939   4.769  64.561 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   23.331      1.107  21.075   <2e-16 ***
## nj            -2.892      1.233  -2.346   0.0192 *  
## d             -2.701      1.556  -1.736   0.0829 .  
## nj:d           2.527      1.732   1.459   0.1449    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.714 on 804 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.008062,   Adjusted R-squared:  0.004361 
## F-statistic: 2.178 on 3 and 804 DF,  p-value: 0.08917

For this robustness check, missing post-treatment values of fte are set to zero, which treats those stores as having no employees after the law was passed, and the same difference-in-differences regression is re-estimated for fast-food restaurants close to the border of NJ and PA, two bordering states. The results do not change qualitatively. The interaction coefficient nj:d still measures the change in mean employment in New Jersey, where the minimum-wage law was passed, relative to the change in mean employment in Pennsylvania, where it was not. The coefficient falls only slightly, from 2.75 to 2.53, and remains statistically insignificant (p = 0.145). We therefore still find no evidence that the rise in the minimum wage reduced employment, although an insignificant estimate does not by itself prove that there is no causal effect. The nj dummy coefficient measures the extent to which pre-treatment employment in New Jersey differs from the base category (Pennsylvania, where the law was not passed).
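
One way to make this kind of comparison easier to read is to put the interaction row from both fits side by side. The sketch below uses simulated data with artificially injected missing values; the names `toy`, `toy_zero`, `fit_drop` and `fit_zero` are hypothetical, and the numbers it prints say nothing about the real data set.

```r
set.seed(42)

# Simulated 2x2 design with some outcome values set to missing
toy <- expand.grid(nj = 0:1, d = 0:1)[rep(1:4, each = 50), ]
toy$fte <- 20 - 3 * toy$nj - 2 * toy$d + 2.5 * toy$nj * toy$d +
  rnorm(nrow(toy), sd = 9)
toy$fte[sample(nrow(toy), 15)] <- NA

# Robustness variant: missing post-treatment outcomes become zero
toy_zero <- toy
toy_zero$fte[toy_zero$d == 1 & is.na(toy_zero$fte)] <- 0

fit_drop <- lm(fte ~ nj * d, data = toy)       # lm drops rows with NA
fit_zero <- lm(fte ~ nj * d, data = toy_zero)  # post-period NAs kept as 0

# Interaction row (estimate, SE, t, p) from each specification
comparison <- rbind(
  listwise_deletion = summary(fit_drop)$coefficients["nj:d", ],
  zero_imputed      = summary(fit_zero)$coefficients["nj:d", ]
)
print(comparison)
print(c(n_drop = nobs(fit_drop), n_zero = nobs(fit_zero)))
```

Reporting the number of observations used by each fit alongside the coefficients makes clear how many rows the imputation brought back into the sample.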


Question 4

Graphical Representation of the d-i-d Regression Model

library(ggplot2)

# Label the states for the plot legend
did1 <- mutate(did1, State = ifelse(nj==1, "NJ", "PA"))

# Plot average FTE per store by period, one line per state
ggplot(did1, aes(factor(d), fte_avg, group = State))+
  geom_point(aes(color=State))+
  geom_line(aes(color=State))+
  scale_x_discrete(labels= c('Pre-Treatment', 'Post-Treatment'))+
  labs(x="", y="Full-Time Equivalent Employment", title="Average Employment Per Store")+
  theme_light()