PROBLEM 1. Canonical Difference-in-Differences.

i. Without covariates.

#install.packages("wooldridge")

library(wooldridge)

## Warning: package 'wooldridge' was built under R version 4.4.3

library(lmtest)

## Warning: package 'lmtest' was built under R version 4.4.1

## Loading required package: zoo

## Warning: package 'zoo' was built under R version 4.4.1

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:data.table':
## 
##     yearmon, yearqtr

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

library(multiwayvcov)

## Warning: package 'multiwayvcov' was built under R version 4.4.1

# Load data (pass it explicitly via data = instead of attach())
data(kielmc)

# 2x2 Difference-in-Differences regression (without covariates)
did_model <- lm(rprice ~ nearinc * y81, data = kielmc)

# Cluster-robust standard errors at the cbd level
cl_vcov <- cluster.vcov(did_model, cluster = ~cbd)

# Get robust coefficient test results
coeftest(did_model, vcov = cl_vcov)

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)  82517.2     3221.9 25.6112 < 2.2e-16 ***
## nearinc     -18824.4     7796.3 -2.4145 0.0163216 *  
## y81          18790.3     5154.9  3.6452 0.0003122 ***
## nearinc:y81 -11863.9     6621.8 -1.7916 0.0741446 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The coefficient on nearinc:y81 is the DiD estimate: after the garbage incinerator was built, prices of houses near it fell by about $11,864 relative to houses farther away. The estimate is significant at the 10% level but not at the 5% level (p = 0.074), so without covariates the evidence is only marginal.
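The interaction coefficient can be read as a difference of four group means. The sketch below rebuilds those means from the coefficients printed above and checks that the double difference reproduces the nearinc:y81 estimate. The variable names are illustrative and not part of the original analysis; the coefficient values are copied by hand from the output.

```r
# Coefficients copied from the coeftest() output above
b0     <- 82517.2    # mean price, far from incinerator, 1978
b_near <- -18824.4   # near vs. far gap in 1978
b_y81  <- 18790.3    # 1978 -> 1981 change for far houses
b_int  <- -11863.9   # nearinc:y81 (DiD estimate)

# Implied mean real prices in each of the four cells
far_78  <- b0
far_81  <- b0 + b_y81
near_78 <- b0 + b_near
near_81 <- b0 + b_near + b_y81 + b_int

# Change for near houses minus change for far houses
did_by_hand <- (near_81 - near_78) - (far_81 - far_78)
print(did_by_hand)
```

Because the regression is saturated in the two dummies, this double difference and the interaction coefficient are the same quantity by construction.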

ii. With covariates.

Re-estimate DiD with covariates: rooms, baths, age, agesq, larea, lland, clustering SEs at the cbd level.

library(wooldridge)
library(lmtest)
library(multiwayvcov)


data(kielmc)

did_cov1 <- lm(rprice ~ nearinc * y81 + rooms + baths + age + agesq + larea + lland, data = kielmc)

cl_vcov1 <- cluster.vcov(did_cov1, kielmc$cbd)


coeftest(did_cov1, vcov = cl_vcov1)

## 
## t test of coefficients:
## 
##                Estimate  Std. Error t value  Pr(>|t|)    
## (Intercept) -2.7973e+05  9.3485e+04 -2.9922 0.0029916 ** 
## nearinc      1.5555e+04  1.0225e+04  1.5213 0.1292058    
## y81          1.6673e+04  3.7149e+03  4.4881 1.013e-05 ***
## rooms        2.9790e+03  1.4014e+03  2.1257 0.0343132 *  
## baths        7.4582e+03  2.5953e+03  2.8738 0.0043353 ** 
## age         -6.1625e+02  1.2845e+02 -4.7976 2.495e-06 ***
## agesq        3.0526e+00  7.7572e-01  3.9352 0.0001026 ***
## larea        3.2815e+04  8.4878e+03  3.8662 0.0001346 ***
## lland        7.1943e+03  4.6602e+03  1.5438 0.1236639    
## nearinc:y81 -1.7546e+04  5.8216e+03 -3.0139 0.0027912 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Houses near the incinerator experienced a significant price decrease of about $17,546 after its construction, relative to farther houses.

Compared to the model without covariates, this result is larger in magnitude and more precisely estimated, indicating that some confounding may have biased the original estimate toward zero.

Re-estimate using a different set of covariates: rooms, baths, age, area, and land.

did_cov2 <- lm(rprice ~ nearinc * y81 + rooms + baths + age + area + land, data = kielmc)


cl_vcov2 <- cluster.vcov(did_cov2, kielmc$cbd)

coeftest(did_cov2, vcov = cl_vcov2)

## 
## t test of coefficients:
## 
##                Estimate  Std. Error t value  Pr(>|t|)    
## (Intercept) -1.5131e+04  1.1348e+04 -1.3333 0.1834009    
## nearinc      4.8773e+03  7.4656e+03  0.6533 0.5140455    
## y81          1.3793e+04  3.8399e+03  3.5920 0.0003812 ***
## rooms        4.2119e+03  1.7315e+03  2.4325 0.0155564 *  
## baths        1.1103e+04  2.5041e+03  4.4340 1.283e-05 ***
## age         -1.9056e+02  5.3182e+01 -3.5832 0.0003937 ***
## area         1.7937e+01  5.6531e+00  3.1730 0.0016590 ** 
## land         1.1802e-01  1.1140e-01  1.0595 0.2902133    
## nearinc:y81 -1.1692e+04  4.8847e+03 -2.3937 0.0172703 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This result still indicates a statistically significant drop in house prices near the incinerator after its construction, but the estimated effect is smaller (about -$11,692) compared to the previous model with agesq, larea, and lland (which had -$17,546).

Estimate the treatment effect using IPW with covariates rooms, baths, age, agesq, larea, and lland, with a clustered bootstrap.

library(causalweight)

## Warning: package 'causalweight' was built under R version 4.4.3

## Loading required package: ranger

## Warning: package 'ranger' was built under R version 4.4.3

treat <- kielmc$nearinc
post <- kielmc$y81
y <- kielmc$rprice
X <- kielmc[, c("rooms", "baths", "age", "agesq", "larea", "lland")]
X <- as.data.frame(lapply(X, as.numeric)) 
X_matrix <- as.matrix(X)
# Run IPW DiD estimator
ipw_model1 <- suppressWarnings( didweight(
    y = y,
    d = treat,          
    t = post,          
    x = X_matrix,              
    cluster = kielmc$cbd,
    boot = 1000        
))

# Show results
ipw_model1

## $effect
## [1] -18380.37
## 
## $se
## [1] 22175.04
## 
## $pvalue
## [1] 0.4071744
## 
## $ntrimmed
## [1] 11

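Treatment effect: -18,380.37; standard error: 22,175.04; p-value: 0.4072. The standard error and p-value come from the bootstrap, so they change somewhat from run to run when no seed is set.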

Although the estimated treatment effect is negative, the standard error is large, and the p-value is high, suggesting no significant effect based on this specification.
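Bootstrap standard errors depend on the random draws, so two runs without a fixed seed will generally report different values. The following is a minimal base-R sketch of a cluster bootstrap with a fixed seed, using made-up toy data and a hypothetical helper `cluster_boot_se()`; it is not the routine used inside didweight(). Assuming didweight() draws its resamples with R's random number generator, calling set.seed() immediately before it should make its output reproducible as well.

```r
# Toy data (hypothetical): 6 clusters with 5 observations each
set.seed(1)
toy <- data.frame(cl = rep(1:6, each = 5), y = rnorm(30))

# Cluster bootstrap SE of the sample mean: resample whole clusters
cluster_boot_se <- function(data, B = 200, seed = 123) {
  set.seed(seed)                       # fixed seed -> reproducible draws
  ids <- unique(data$cl)
  boot_means <- replicate(B, {
    draw <- sample(ids, length(ids), replace = TRUE)
    resampled <- do.call(rbind, lapply(draw, function(g) data[data$cl == g, ]))
    mean(resampled$y)
  })
  sd(boot_means)
}

se_run1 <- cluster_boot_se(toy)
se_run2 <- cluster_boot_se(toy)
identical(se_run1, se_run2)
```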

Estimate the treatment effect using IPW with covariates rooms, baths, age, area, and land, with a clustered bootstrap.

library(causalweight)

treat <- kielmc$nearinc
post <- kielmc$y81
y <- kielmc$rprice


X_step4 <- kielmc[, c("rooms", "baths", "age", "area", "land")]
X_matrix4 <- as.matrix(X_step4)


ipw_model2 <- suppressWarnings(
  didweight(
  y = y,
  d = treat,
  t = post,
  x = X_matrix4,
  cluster = kielmc$cbd,
  boot = 1000
))

# Show results
ipw_model2

## $effect
## [1] -9243.685
## 
## $se
## [1] 11154.81
## 
## $pvalue
## [1] 0.4072896
## 
## $ntrimmed
## [1] 0

Treatment effect: −9,243.69; standard error: 11,154.81; p-value: 0.4073.

Compare

The largest estimate with the strongest statistical support comes from the first covariate-adjusted DiD model (rooms, baths, age, agesq, larea, lland): −$17,546, significant at the 1% level. The two IPW models point in the same direction (−$18,380 and −$9,244), but their bootstrap standard errors are larger than the estimates themselves and neither is statistically significant. Adding covariates does not move the estimate in a single direction: the specification with agesq, larea, and lland increases the magnitude relative to the unadjusted −$11,864, while the specification with age, area, and land in levels leaves it almost unchanged in DiD (−$11,692) and reduces it in IPW (−$9,244). Between the two covariate-adjusted DiD models, the more flexible one with quadratic and log-transformed terms gives the larger estimate and the larger t statistic in absolute value (−3.01 versus −2.39), although its standard error is somewhat larger (5,822 versus 4,885).

My preferred estimate is from the 2x2 DiD model with covariates rooms, baths, age, agesq, larea, and lland. It provides a statistically significant and economically meaningful estimate (−$17,546) together with the most flexible control for housing characteristics. The IPW estimates point in the same direction, but their high variance makes them less reliable in this setting.

iii. Two-Way Fixed Effects

i <- c(1, 1, 2, 2, 3, 3, 4, 4)
t <- c(1, 2, 1, 2, 1, 2, 1, 2)
P_t <- c(0, 1, 0, 1, 0, 1, 0, 1)
D_it <- c(0, 0, 0, 1, 0, 1, 0, 1)
D_i <- c(0, 0, 0, 1, 0, 1, 0, 1)
Y_it <- c(6, 7, 10, 11, 8, 10, 4, 12)

df <- data.frame(i = i, t = t, P_t = P_t, D_it = D_it, D_i = D_i, Y_it = Y_it)

library(fixest)

## Warning: package 'fixest' was built under R version 4.4.3

## 
## Attaching package: 'fixest'

## The following object is masked from 'package:scales':
## 
##     pvalue

# Estimate Model (1)
m1 <- feols(Y_it ~ D_it | i + t, data = df, cluster = ~i)

# Estimate Model (2)
m2 <- feols(Y_it ~ D_i * P_t | i + t, data = df, cluster = ~i)

## The variables 'P_t' and 'D_i:P_t' have been removed because of collinearity (see $collin.var).

# Estimate Model (3)
m3 <- feols(Y_it ~ D_i:P_t | i + t, data = df, cluster = ~i)

# Show results
etable(m1, m2, m3, cluster = "i")

##                            m1            m2            m3
## Dependent Var.:          Y_it          Y_it          Y_it
##                                                          
## D_it            2.667 (2.438)                            
## D_i                           2.667 (2.438)              
## D_i x P_t                                   2.667 (2.438)
## Fixed-Effects:  ------------- ------------- -------------
## i                         Yes           Yes           Yes
## t                         Yes           Yes           Yes
## _______________ _____________ _____________ _____________
## S.E.: Clustered         by: i         by: i         by: i
## Observations                8             8             8
## R2                    0.72436       0.72436       0.72436
## Within R2             0.15686       0.15686       0.15686
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

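The estimates from all three models are identical at 2.667, with the same clustered standard error of 2.438. In this balanced two-period panel, treatment switches on only in the second period and only for treated units, so D_it equals the treated-group indicator multiplied by P_t, and with unit and time fixed effects each specification recovers the same 2x2 DiD estimate of the average treatment effect on the treated (ATT). Note that D_i as coded above is identical to D_it rather than a time-invariant group indicator, which is why Model (2) reports its coefficient on D_i and drops P_t and D_i:P_t as collinear; with a time-invariant D_i, the main effect would be absorbed by the unit fixed effects and the interaction would carry the same estimate. None of the estimates is statistically significant, which is unsurprising with only four units and eight observations.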
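The common value of 2.667 can be checked by hand as the mean change for treated units minus the mean change for the control unit. The sketch below uses only base R and the toy data from this part; `ever_treated` is an illustrative name for a time-invariant group indicator and does not appear in the original code.

```r
# Toy panel from the text: 4 units, 2 periods
i    <- c(1, 1, 2, 2, 3, 3, 4, 4)
t    <- c(1, 2, 1, 2, 1, 2, 1, 2)
D_it <- c(0, 0, 0, 1, 0, 1, 0, 1)
Y_it <- c(6, 7, 10, 11, 8, 10, 4, 12)

# Time-invariant group indicator: 1 for units that are ever treated
ever_treated <- ave(D_it, i, FUN = max)

# Change in the outcome from period 1 to period 2, one value per unit
delta_y      <- Y_it[t == 2] - Y_it[t == 1]
treated_unit <- ever_treated[t == 2]

# 2x2 DiD: mean change of treated units minus mean change of control units
did_by_hand <- mean(delta_y[treated_unit == 1]) - mean(delta_y[treated_unit == 0])
print(did_by_hand)
```

The treated units change by 1, 2, and 8 (mean 11/3) and the control unit by 1, so the difference is 8/3, which matches the regression coefficient up to rounding.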

PROBLEM 2. Difference-in-Differences with Multiple Periods of Treatment

i. Descriptive county-level maps.

library(usmap)
library(dplyr)
library(ggplot2)

library(did)

## Warning: package 'did' was built under R version 4.4.3

data(mpdta)

county_info <- mpdta %>%
  group_by(countyreal) %>%
  summarise(
    treat = max(treat),
    first.treat = unique(first.treat)
  ) %>%
  rename(fips = countyreal)
county_info$fips <- sprintf("%05d", county_info$fips)

county_info <- county_info %>%
  mutate(fips = as.numeric(fips)) 


plot_usmap(data = county_info %>% mutate(treat = as.factor(treat)), 
           values = "treat", regions = "counties") +
  scale_fill_manual(values = c("0" = "grey85", "1" = "blue"), na.value = "white", 
                    name = "Treated") +
  labs(title = "Treated Counties") +
  theme(legend.position = "right")

plot_usmap(data = county_info %>% mutate(first.treat = as.numeric(first.treat)),
           values = "first.treat", regions = "counties") +
  scale_fill_viridis_c(name = "First Treat Year", na.value = "white") +
  labs(title = "First Treatment Year by County") +
  theme(legend.position = "right")

The first map (“Treated Counties”) displays which counties implemented the minimum wage policy at any point during the observed period (2003–2007). Blue counties were treated (i.e., introduced the policy), while grey ones remained untreated. We observe that treated counties are dispersed across the country, with notable clusters in states like Colorado, Minnesota, Florida, and parts of the Midwest and Southeast.

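The second map (“First Treatment Year by County”) adds temporal detail by coloring each county according to the year it first adopted the policy. On the default viridis scale, darker (purple) shades correspond to lower values and lighter (yellow) shades to higher values, so later adopters appear lighter. Never-treated counties are coded first.treat = 0 in mpdta, so unless they are recoded to NA they sit at the bottom of the color scale and compress the 2004–2007 range; counties that are absent from the data appear in white. The staggered pattern of adoption motivates methods that account for treatment-effect heterogeneity across cohorts and time.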
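One way to keep the color scale focused on the actual adoption years is to recode the never-treated value 0 to NA before plotting, so that those counties fall under na.value instead of anchoring the scale at zero. This is a small illustrative sketch on a made-up vector, not a change to the code above; `first_treat_plot` is a hypothetical name.

```r
# Hypothetical first-treatment years; 0 marks never-treated units
first_treat <- c(0, 2004, 2006, 2007, 0)

# Recode never-treated units to NA so they are drawn with na.value
first_treat_plot <- ifelse(first_treat == 0, NA_real_, first_treat)

# The color scale would now span only the adoption years
range(first_treat_plot, na.rm = TRUE)
```

The trade-off is that never-treated counties and counties missing from the data would then share the same fill, so a separate map of treatment status remains useful.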

Mean lemp by first.treat and year

library(dplyr)
library(ggplot2)
library(usmap)


lemp_summary <- mpdta %>%
  group_by(countyreal, year, first.treat) %>%
  summarise(mean_lemp = mean(lemp, na.rm = TRUE), .groups = "drop") %>%
  mutate(fips = as.numeric(countyreal))



treat_groups <- unique(lemp_summary$first.treat)


for (group in treat_groups) {
  

  df_group <- lemp_summary %>%
    filter(first.treat == group & !is.na(year))
  

  title_text <- paste("Mean lemp for First Treatment Year =", group)
  

  p <- plot_usmap(data = df_group, 
                  values = "mean_lemp", 
                  regions = "counties") +
    scale_fill_viridis_c(name = "Mean lemp", na.value = "white") +
    facet_wrap(~year, nrow = 2, ncol = 3) +  # 2-row, 3-column layout
    labs(title = title_text) +
    theme(legend.position = "right")
  

  print(p)
  

}

In the year-by-year maps of mean log teen employment (lemp) by treatment cohort (first.treat), two patterns stand out. First, in most treatment cohorts lemp is relatively stable before the treatment year and declines slightly or levels off afterwards; for example, counties first treated in 2006 or 2007 show modest reductions in lemp in the post-treatment years. Second, never-treated counties (first.treat = 0) show relatively higher and stable lemp across years. These visual patterns are suggestive of a modest negative effect of treatment on teen employment, but formal estimation is needed to confirm it. Overall, the maps show that treatment timing varied across space and that lemp evolved differently depending on treatment status and timing.

ii. Cohort-year-specific average treatment effects on the treated

Estimate the group-time average treatment effects on the treated

library(did)

att_reg <- att_gt(
  yname = "lemp",                 
  tname = "year",                 
  idname = "countyreal",          
  gname = "first.treat",          
  xformla = ~ lpop,               
  data = mpdta,
  est_method = "reg",             
  control_group = "nevertreated", 
  clustervars = "countyreal"      
)

summary(att_reg)

## 
## Call:
## att_gt(yname = "lemp", tname = "year", idname = "countyreal", 
##     gname = "first.treat", xformla = ~lpop, data = mpdta, control_group = "nevertreated", 
##     clustervars = "countyreal", est_method = "reg")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##   2004 2004  -0.0149     0.0212       -0.0715      0.0417  
##   2004 2005  -0.0770     0.0283       -0.1527     -0.0013 *
##   2004 2006  -0.1411     0.0365       -0.2387     -0.0435 *
##   2004 2007  -0.1075     0.0342       -0.1989     -0.0162 *
##   2006 2004  -0.0021     0.0208       -0.0576      0.0535  
##   2006 2005  -0.0070     0.0191       -0.0580      0.0441  
##   2006 2006   0.0008     0.0192       -0.0506      0.0521  
##   2006 2007  -0.0415     0.0205       -0.0964      0.0133  
##   2007 2004   0.0264     0.0142       -0.0116      0.0644  
##   2007 2005  -0.0048     0.0159       -0.0472      0.0377  
##   2007 2006  -0.0285     0.0167       -0.0731      0.0161  
##   2007 2007  -0.0288     0.0173       -0.0750      0.0174  
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.23116
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Outcome Regression

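There is strong evidence of negative treatment effects for the 2004 cohort in 2005–2007, while the on-impact effect in 2004 is small and insignificant. There is no clear evidence of effects for the later cohorts; for these cohorts, the rows dated before the first treatment year are pre-treatment (placebo) estimates, and all of their bands cover zero. These findings suggest that early adopters of the minimum wage experienced a more noticeable decline in teen employment, possibly because of larger policy changes or because more post-treatment years are available to detect an effect.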
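For reference, each row of the table estimates a group-time parameter. A sketch of its unconditional form with a never-treated comparison group and no anticipation, in the notation of Callaway and Sant'Anna (2021), is given below; here G = g denotes the cohort first treated in year g and C = 1 the never-treated counties. The estimates above additionally condition on lpop, so this expression should be read as the simplified case without covariates.

```latex
ATT(g,t) \;=\; \mathbb{E}\!\left[\,Y_t - Y_{g-1} \mid G = g\,\right]
        \;-\; \mathbb{E}\!\left[\,Y_t - Y_{g-1} \mid C = 1\,\right],
\qquad t \ge g .
```

The base period g − 1 is the last year before cohort g is treated, which is why a cohort's post-treatment rows all compare against the same pre-treatment year.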

IPW estimates

att_ipw <- att_gt(
  yname = "lemp",                 
  tname = "year",                 
  idname = "countyreal",          
  gname = "first.treat",          
  xformla = ~ lpop,               
  data = mpdta,
  est_method = "ipw",            
  control_group = "nevertreated",
  clustervars = "countyreal"      
)


summary(att_ipw)

## 
## Call:
## att_gt(yname = "lemp", tname = "year", idname = "countyreal", 
##     gname = "first.treat", xformla = ~lpop, data = mpdta, control_group = "nevertreated", 
##     clustervars = "countyreal", est_method = "ipw")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##   2004 2004  -0.0145     0.0243       -0.0804      0.0513  
##   2004 2005  -0.0764     0.0289       -0.1548      0.0019  
##   2004 2006  -0.1405     0.0369       -0.2405     -0.0404 *
##   2004 2007  -0.1069     0.0341       -0.1994     -0.0144 *
##   2006 2004  -0.0009     0.0228       -0.0627      0.0609  
##   2006 2005  -0.0064     0.0183       -0.0561      0.0433  
##   2006 2006   0.0012     0.0192       -0.0509      0.0533  
##   2006 2007  -0.0413     0.0204       -0.0965      0.0139  
##   2007 2004   0.0266     0.0146       -0.0130      0.0661  
##   2007 2005  -0.0047     0.0160       -0.0479      0.0386  
##   2007 2006  -0.0283     0.0179       -0.0767      0.0201  
##   2007 2007  -0.0289     0.0165       -0.0735      0.0157  
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.23604
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Inverse Probability Weighting

The 2004 cohort again shows the largest negative effects in 2005–2007:

2005: −0.0764 (not significant)

2006: −0.1405, significant

2007: −0.1069, significant

The pattern is very similar to the regression adjustment results:

The magnitudes are almost identical (e.g., −0.1405 vs −0.1411). Statistical significance appears only in post-treatment years for the 2004 cohort. Later cohorts (2006 and 2007) again show small, mostly insignificant effects.

Do results change compared to regression adjustment?

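Not really. The overall pattern is qualitatively the same: both methods find that early adopters (the 2004 group) experienced the most pronounced treatment effects, and the 2006 and 2007 cohorts show little evidence of a treatment effect. The one visible change is that the 2005 effect for the 2004 cohort (−0.0764) is no longer flagged, because its simultaneous band now just covers zero (upper bound 0.0019, versus −0.0013 under regression adjustment). The standard errors and bands differ slightly, but the substantive interpretation is unchanged.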
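The borderline 2005 estimate illustrates the difference between a pointwise interval and the simultaneous band reported by summary(). The sketch below backs out the critical value implied by the printed band and compares it with the usual 1.96; the inputs are the rounded numbers from the IPW table, so the result is approximate, and the variable names are illustrative.

```r
# 2004 cohort, year 2005, from the IPW output above (rounded values)
est        <- -0.0764
se         <- 0.0289
band_upper <- 0.0019

# Critical value implied by the simultaneous band
crit_simult <- (band_upper - est) / se

# Upper end of a conventional pointwise 95% interval
pointwise_upper <- est + qnorm(0.975) * se

print(c(crit_simult = crit_simult, pointwise_upper = pointwise_upper))
```

The implied critical value is well above 1.96, so an estimate can be significant in a pointwise sense while its simultaneous band still covers zero, which is what happens here.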

3. Doubly Robust (DR) estimation

att_dr <- att_gt(
  yname = "lemp",                 
  tname = "year",                 
  idname = "countyreal",          
  gname = "first.treat",          
  xformla = ~ lpop,               
  data = mpdta,
  est_method = "dr",             
  control_group = "nevertreated",
  clustervars = "countyreal"      
)


summary(att_dr)

## 
## Call:
## att_gt(yname = "lemp", tname = "year", idname = "countyreal", 
##     gname = "first.treat", xformla = ~lpop, data = mpdta, control_group = "nevertreated", 
##     clustervars = "countyreal", est_method = "dr")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##   2004 2004  -0.0145     0.0231       -0.0770      0.0480  
##   2004 2005  -0.0764     0.0296       -0.1566      0.0038  
##   2004 2006  -0.1404     0.0370       -0.2408     -0.0401 *
##   2004 2007  -0.1069     0.0350       -0.2017     -0.0121 *
##   2006 2004  -0.0005     0.0225       -0.0614      0.0605  
##   2006 2005  -0.0062     0.0183       -0.0558      0.0434  
##   2006 2006   0.0010     0.0195       -0.0518      0.0537  
##   2006 2007  -0.0413     0.0197       -0.0946      0.0120  
##   2007 2004   0.0267     0.0131       -0.0088      0.0623  
##   2007 2005  -0.0046     0.0158       -0.0473      0.0382  
##   2007 2006  -0.0284     0.0173       -0.0753      0.0184  
##   2007 2007  -0.0288     0.0166       -0.0739      0.0163  
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.23267
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Doubly Robust

Again, the 2004 cohort shows the strongest negative effects:

2006: −0.1404, statistically significant

2007: −0.1069, statistically significant

2005: −0.0764, borderline (not significant)

All other estimates (for 2006 and 2007 cohorts) remain small and insignificant, with confidence bands including zero. The DR estimates closely mirror the regression and IPW results, confirming robustness. The 2004 cohort consistently experiences significant drops in log teen employment post-treatment. Later adopters (2006, 2007) show little to no detectable effect, suggesting either smaller treatment impact or less post-treatment time to detect it.

4. Plotting the results using ggdid()

ggdid(att_dr)

This plot strongly reinforces the numeric results. The 2004 early adopters experienced the most pronounced drop in teen employment. Later cohorts show no detectable effect, possibly due to smaller effects or fewer post-treatment years.

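Do the earlier maps hint at these results? Yes: the maps suggested modest declines in lemp after treatment in treated cohorts, although they cannot establish the size or statistical significance of the effect.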

iii. Cohort-specific and aggregate average treatment effects on the treated

Aggregate ATT using aggte() from the did package

agg_dr <- aggte(att_dr, type = "group")

summary(agg_dr)

## 
## Call:
## aggte(MP = att_dr, type = "group")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on group/cohort aggregation:  
##      ATT    Std. Error     [ 95%  Conf. Int.]  
##  -0.0328        0.0135    -0.0594     -0.0063 *
## 
## 
## Group Effects:
##  Group Estimate Std. Error [95% Simult.  Conf. Band]  
##   2004  -0.0846     0.0271       -0.1446     -0.0246 *
##   2006  -0.0202     0.0180       -0.0599      0.0196  
##   2007  -0.0288     0.0163       -0.0649      0.0073  
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Doubly Robust

On average, the minimum wage policy reduced log teen employment by about 0.033, that is, teen employment fell by roughly 3.3% in treated counties. The effect is statistically significant at the 5% level (the 95% confidence interval, −0.0594 to −0.0063, excludes zero) and is consistent with the earlier findings.

The 2004 cohort drives the overall effect. Later adopters (2006, 2007) show smaller and statistically insignificant impacts, again consistent with the previous regression-adjusted, IPW, and DR analyses.
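The aggregated numbers can be approximately reproduced from the group-time table. In the sketch below, each group effect is the simple average of that cohort's post-treatment ATT(g,t) values, and the overall ATT is a cohort-size-weighted average of the group effects. The cohort sizes (20, 40, and 131 counties) are an assumption recalled from the mpdta data and should be verified, for example with table() on one year of the data; the inputs are rounded, so agreement is only to about three decimals.

```r
# Post-treatment ATT(g,t) copied from the doubly robust output above
att_2004 <- c(-0.0145, -0.0764, -0.1404, -0.1069)  # t = 2004, ..., 2007
att_2006 <- c(0.0010, -0.0413)                     # t = 2006, 2007
att_2007 <- c(-0.0288)                             # t = 2007

# Group effect = average of a cohort's post-treatment ATT(g,t)
group_effects <- c(mean(att_2004), mean(att_2006), mean(att_2007))

# ASSUMPTION: number of counties per cohort; check against mpdta
n_counties <- c(20, 40, 131)

# Overall ATT = cohort-size-weighted average of the group effects
overall_att <- sum(n_counties * group_effects) / sum(n_counties)

print(round(group_effects, 4))
print(round(overall_att, 4))
```

Because the 2007 cohort is by far the largest under this assumption, the overall ATT sits much closer to its group effect than to the large 2004 effect.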

Identifying and testing three other R packages

a. Estimate ATT using fixest + sunab()

Use the sunab() function to construct cohort × event-time interactions and estimate the average treatment effect with a two-way fixed effects model that is robust to heterogeneous treatment effects.

library(fixest)

fixest_att <- feols(
  lemp ~ sunab(first.treat, year) + lpop | countyreal + year,
  data = mpdta,
  cluster = ~countyreal
)

## The variable 'lpop' has been removed because of collinearity (see $collin.var).

summary(fixest_att)

## OLS estimation, Dep. Var.: lemp
## Observations: 2,500
## Fixed-effects: countyreal: 500,  year: 5
## Standard-errors: Clustered (countyreal) 
##           Estimate Std. Error   t value   Pr(>|t|)    
## year::-4  0.003306   0.024555  0.134651 0.89294248    
## year::-3  0.025022   0.018154  1.378283 0.16873349    
## year::-2  0.024459   0.014267  1.714383 0.08707936 .  
## year::0  -0.019932   0.011858 -1.680940 0.09340025 .  
## year::1  -0.050957   0.016871 -3.020469 0.00265324 ** 
## year::2  -0.137259   0.036589 -3.751317 0.00019655 ***
## year::3  -0.100811   0.034504 -2.921707 0.00363899 ** 
## ... 1 variable was removed because of collinearity (lpop)
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.123709     Adj. R2: 0.991529
##                  Within R2: 0.012404

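The results confirm the earlier findings: the estimated effect is negative from the adoption year onward, marginal at event time 0 (p = 0.093) and significant at the 1% level at event times 1 to 3. None of the pre-treatment coefficients is significant at the 5% level (event time −2 is marginal, p = 0.087), which lends support to the parallel trends assumption. Event time −1 is the omitted reference period, and lpop drops out because it does not vary within county.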

b. Estimate ATT using did2s

library(did2s)

## Warning: package 'did2s' was built under R version 4.4.3

## did2s (v1.0.2). For more information on the methodology, visit <https://www.kylebutts.github.io/did2s>
## 
## To cite did2s in publications use:
## 
##   Butts, Kyle (2021).  did2s: Two-Stage Difference-in-Differences
##   Following Gardner (2021). R package version 1.0.2.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {did2s: Two-Stage Difference-in-Differences Following Gardner (2021)},
##     author = {Kyle Butts},
##     year = {2021},
##     url = {https://github.com/kylebutts/did2s/},
##   }

mpdta2 <- mpdta %>%
  mutate(treat_dynamic = ifelse(year >= first.treat & first.treat != 0, 1, 0))

att_did2s <- did2s(
  data = mpdta2,
  yname = "lemp",
  first_stage = ~ 0 | countyreal + year,
  second_stage = ~ i(treat_dynamic, ref = FALSE),
  treatment = "treat_dynamic",
  cluster_var = "countyreal"
)

## Running Two-stage Difference-in-Differences
##  - first stage formula `~ 0 | countyreal + year`
##  - second stage formula `~ i(treat_dynamic, ref = FALSE)`
##  - The indicator variable that denotes when treatment is on is `treat_dynamic`
##  - Standard errors will be clustered by `countyreal`

summary(att_did2s)

## OLS estimation, Dep. Var.: lemp
## Observations: 2,500
## Standard-errors: Custom 
##                  Estimate Std. Error  t value   Pr(>|t|)    
## treat_dynamic::1 -0.04771   0.013478 -3.53973 0.00040785 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.127547   Adj. R2: 0.013792

The did2s estimate of the average treatment effect on the treated (ATT) is −0.0477, with a standard error of 0.0135. It is statistically significant at the 1% level (p < 0.001), indicating a negative and fairly precisely estimated effect of the minimum wage policy on log teen employment (lemp).

In other words, after removing county and year fixed effects estimated from untreated observations (the first stage above includes no covariates, so lpop is not conditioned on), the introduction of the minimum wage policy reduced teen employment by about 0.048 log points, or roughly 4.7%, on average in treated counties.
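The two-stage logic can be illustrated in base R on a small noiseless panel. This is a sketch of the idea only, with made-up data and a true effect of −1; it is not the did2s implementation, and in particular the naive second-stage standard errors would be wrong because they ignore the first-stage estimation, which is the part did2s corrects.

```r
# Hypothetical staggered panel: 4 units, 4 periods, true effect = -1
panel <- expand.grid(id = 1:4, t = 1:4)
first_treat <- c(0, 3, 3, 4)[panel$id]            # 0 = never treated
panel$D <- as.integer(first_treat != 0 & panel$t >= first_treat)
panel$y <- panel$id + 0.5 * panel$t - 1 * panel$D  # unit + period effects + effect

# Stage 1: estimate unit and period effects on untreated observations only
stage1 <- lm(y ~ factor(id) + factor(t), data = panel[panel$D == 0, ])

# Remove the estimated fixed effects from every observation
panel$y_resid <- panel$y - predict(stage1, newdata = panel)

# Stage 2: regress the residualized outcome on the treatment indicator
stage2 <- lm(y_resid ~ 0 + D, data = panel)
att_two_stage <- coef(stage2)[["D"]]
print(att_two_stage)
```

Estimating the fixed effects from untreated observations keeps already-treated units from contaminating the counterfactual, which is the main difference from a one-step two-way fixed effects regression.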

c. Estimate ATT using didimputation

# install.packages("didimputation")
library(didimputation)

## Warning: package 'didimputation' was built under R version 4.4.3

## 
## Attaching package: 'didimputation'

## The following objects are masked from 'package:did2s':
## 
##     df_het, df_hom

att_imputation <- did_imputation(
  data = mpdta,
  yname = "lemp",              
  gname = "first.treat",       
  tname = "year",              
  idname = "countyreal",       
  cluster_var = "countyreal" 
)


summary(att_imputation)

##      lhs                term              estimate          std.error      
##  Length:1           Length:1           Min.   :-0.04771   Min.   :0.01322  
##  Class :character   Class :character   1st Qu.:-0.04771   1st Qu.:0.01322  
##  Mode  :character   Mode  :character   Median :-0.04771   Median :0.01322  
##                                        Mean   :-0.04771   Mean   :0.01322  
##                                        3rd Qu.:-0.04771   3rd Qu.:0.01322  
##                                        Max.   :-0.04771   Max.   :0.01322  
##     conf.low          conf.high       
##  Min.   :-0.07363   Min.   :-0.02179  
##  1st Qu.:-0.07363   1st Qu.:-0.02179  
##  Median :-0.07363   Median :-0.02179  
##  Mean   :-0.07363   Mean   :-0.02179  
##  3rd Qu.:-0.07363   3rd Qu.:-0.02179  
##  Max.   :-0.07363   Max.   :-0.02179

The average treatment effect on the treated (ATT) is estimated at -0.0477, with a standard error of 0.0132. The 95% confidence interval ranges from -0.0736 to -0.0218, so the effect is statistically significant at conventional levels. This suggests that the minimum wage policy lowered log teen employment by about 0.048, or roughly 4.7%, on average in the treated counties. The point estimate coincides with the two-stage (did2s) estimate, and its sign and significance agree with the doubly robust group-time results, which strengthens the conclusion that the policy had a negative impact on teen employment in the treated counties.
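As a quick arithmetic check, the reported interval is the usual normal-approximation interval around the point estimate, and a coefficient on a log outcome converts to a percentage change through exp(b) − 1. The sketch below uses the rounded values printed above; the variable names are illustrative.

```r
# Values copied from the did_imputation() output above
est <- -0.04771
se  <- 0.01322

# Normal-approximation 95% confidence interval
ci_low  <- est - qnorm(0.975) * se
ci_high <- est + qnorm(0.975) * se

# Exact percentage change implied by a log-point effect
pct_change <- (exp(est) - 1) * 100

print(round(c(ci_low = ci_low, ci_high = ci_high, pct_change = pct_change), 4))
```

For effects this small the log-point value and the exact percentage change are close, which is why reading −0.048 as "about 4.7%" is a reasonable approximation.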

PS5

Mohan

2025-04-06