Question 1

Using a manual difference-in-differences (DID) calculation, we find that the coefficient of interest is 2.75. This is the estimated effect of the policy on full-time equivalent (FTE) employment.

#remove missing values
library(dplyr)
cards_new <- filter(cards, !is.na(fte))

#manual DID
cards_did <- group_by(cards_new, nj, d )
summarise(cards_did, 
          cards_g = mean(fte))
## # A tibble: 4 x 3
## # Groups:   nj [2]
##      nj     d cards_g
##   <dbl> <dbl>   <dbl>
## 1     0     0    23.3
## 2     0     1    21.2
## 3     1     0    20.4
## 4     1     1    21.0
#calculate the coefficient of interest
#(named did_manual to avoid masking base R's c() function)
did_manual <- (21.02743 - 20.43941) - (21.16558 - 23.33117)
did_manual
## [1] 2.75361
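
The same calculation can be written so that each group mean is looked up by its nj and d values instead of being typed into one expression. This is a minimal sketch in base R: the four means are copied from the calculation above, and the helper name cell_mean is hypothetical.

```r
# Group means of fte, copied from the manual calculation above
means <- data.frame(
  nj  = c(0, 0, 1, 1),
  d   = c(0, 1, 0, 1),
  fte = c(23.33117, 21.16558, 20.43941, 21.02743)
)

# Hypothetical helper: return the mean of fte for one (nj, d) cell
cell_mean <- function(nj_val, d_val) {
  means$fte[means$nj == nj_val & means$d == d_val]
}

change_nj1 <- cell_mean(1, 1) - cell_mean(1, 0)  # change from d = 0 to d = 1 when nj = 1
change_nj0 <- cell_mean(0, 1) - cell_mean(0, 0)  # change from d = 0 to d = 1 when nj = 0

# Difference-in-differences: change for nj = 1 minus change for nj = 0
did_estimate <- change_nj1 - change_nj0
did_estimate
```

Naming the two within-group changes separately makes it easier to see that the estimate is driven mostly by the fall in the nj = 0 group rather than by a rise in the nj = 1 group.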

Question 2

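The coefficient of interest is the difference-in-differences estimator, which is the coefficient on the interaction term nj:d. It equals 2.75, matching the manual calculation in Question 1, and it is not statistically significant at the 10% level (p = 0.103). The estimated effect on FTE employment is therefore positive but not statistically distinguishable from zero.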

lm1 <- lm(fte~ nj*d, data=cards_new)
summary(lm1)
## 
## Call:
## lm(formula = fte ~ nj * d, data = cards_new)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.166  -6.439  -1.027   4.473  64.561 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   23.331      1.072  21.767   <2e-16 ***
## nj            -2.892      1.194  -2.423   0.0156 *  
## d             -2.166      1.516  -1.429   0.1535    
## nj:d           2.754      1.688   1.631   0.1033    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.406 on 790 degrees of freedom
## Multiple R-squared:  0.007401,   Adjusted R-squared:  0.003632 
## F-statistic: 1.964 on 3 and 790 DF,  p-value: 0.118
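
To see why the interaction coefficient reproduces the manual estimate, the regression can be written out in terms of the four group means. The notation here is an assumption introduced for illustration: \bar{y}_{jt} denotes the mean of fte for nj = j and d = t, using the group means from Question 1.

```latex
\begin{align*}
\text{fte}_i &= \beta_0 + \beta_1\,\text{nj}_i + \beta_2\,\text{d}_i
               + \beta_3\,(\text{nj}_i \times \text{d}_i) + \varepsilon_i \\[4pt]
\hat\beta_0 &= \bar{y}_{00} = 23.33117 \\
\hat\beta_1 &= \bar{y}_{10} - \bar{y}_{00} = 20.43941 - 23.33117 = -2.89176 \\
\hat\beta_2 &= \bar{y}_{01} - \bar{y}_{00} = 21.16558 - 23.33117 = -2.16559 \\
\hat\beta_3 &= (\bar{y}_{11} - \bar{y}_{10}) - (\bar{y}_{01} - \bar{y}_{00}) \\
            &= (21.02743 - 20.43941) - (21.16558 - 23.33117) = 2.75361
\end{align*}
```

Rounded to three decimals, these values agree with the estimates 23.331, -2.892, -2.166, and 2.754 in the output above.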

Question 3

The answer does not change qualitatively. Setting the missing values to 0 adds 26 observations to the regression and lowers the interaction coefficient slightly, from 2.75 to 2.55 (p = 0.148). The difference-in-differences estimate is still not statistically significant at the 10% level, so the result changes only quantitatively.

Solution:
First, we set all NA values to 0. Then we rerun the regression on the full data. The code and output are below.

cards[is.na(cards)] <- 0

lm2 <- lm(fte~ nj*d, data=cards)
summary(lm2)
## 
## Call:
## lm(formula = fte ~ nj * d, data = cards)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -22.741  -6.322  -0.697   4.885  65.178 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   22.741      1.121  20.289   <2e-16 ***
## nj            -2.919      1.247  -2.340   0.0195 *  
## d             -2.111      1.585  -1.332   0.1834    
## nj:d           2.554      1.764   1.448   0.1481    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.962 on 816 degrees of freedom
## Multiple R-squared:  0.006773,   Adjusted R-squared:  0.003121 
## F-statistic: 1.855 on 3 and 816 DF,  p-value: 0.1358
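
Note that the replacement step above sets missing values to 0 in every column of the data frame, not only in fte. The sketch below contrasts this with a replacement restricted to fte. It uses a small made-up data frame rather than the cards data, and the column name wage is hypothetical.

```r
# Toy data (made-up values, not the cards data) with NAs in two columns
toy <- data.frame(fte = c(20, NA, 25), wage = c(4.25, 5.05, NA))

# Replace NAs in every column, as in the step above
all_cols <- toy
all_cols[is.na(all_cols)] <- 0

# Replace NAs in fte only, leaving the other columns untouched
fte_only <- toy
fte_only$fte[is.na(fte_only$fte)] <- 0

all_cols
fte_only
```

Restricting the replacement to the outcome variable keeps missing values in other columns visible, which matters if those columns are used later.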

Graphical Representation of the Difference-in-Differences Effect of the Minimum Wage



Question 4

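#plot average FTE employment by state and period
#(suminc is not created in the code shown; it must already hold the group means)
suminc %>%
  ggplot(aes(treatment, fte, color = State, group = State)) +
  geom_line(linewidth = 1) +  #linewidth replaces size for lines in ggplot2 >= 3.4.0
  ggtitle("Average Employment Per Store") +
  xlab("") +
  ylab("Full-time equivalent employment")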
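
The plotting code relies on a data frame suminc that is not defined in the code shown. One possible way to build it, offered only as a sketch, is a four-row table of group means with the column names the plot expects (State, treatment, fte). The structure and the labels below are assumptions; the means are the ones from Question 1.

```r
# Assumed structure of suminc: one row per (nj, d) cell.
# Column names match the plotting code; the labels are illustrative only.
suminc <- data.frame(
  State     = c("nj = 0", "nj = 0", "nj = 1", "nj = 1"),
  treatment = factor(c("d = 0", "d = 1", "d = 0", "d = 1"),
                     levels = c("d = 0", "d = 1")),
  fte       = c(23.33117, 21.16558, 20.43941, 21.02743)
)
suminc
```

Storing treatment as a factor with explicit levels fixes the left-to-right order of the periods on the x-axis.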