class: center, middle, inverse, title-slide

.title[
# Advanced quantitative data analysis
]
.subtitle[
## Difference in Difference II
]
.author[
### Mengni Chen
]
.institute[
### Department of Sociology, University of Copenhagen
]

---

<style type="text/css">
.remark-slide-content {
  font-size: 20px;
  padding: 20px 80px 20px 80px;
}
.remark-code, .remark-inline-code {
  background: #f0f0f0;
}
.remark-code {
  font-size: 14px;
}
</style>

#Let's get ready

```r
#install.packages("did")
library(tidyverse) # Add the tidyverse package to my current library.
library(haven) # Handle labelled data.
library(splitstackshape) # transform wide data (with stacked variables) to long data
library(plm) # linear models for panel data
library(did) # for difference in difference analysis
```

---

#Difference in Difference: Visualize the three ways of understanding

- Assumption
  - We assume that the trends of the dependent variable over time were identical in the treated and non-treated groups before the treatment took place.
  - We assume that the trends would have remained parallel if there had been no treatment.
- Three ways of understanding

<img src="https://github.com/fancycmn/slide12/blob/main/S12_Pic11.PNG?raw=true" width="100%" style="display: block; margin-left:0px;">

---

#Difference in Difference: application

- It can be used not only to look at the effect of life events on an individual's life satisfaction, mental health, salary, working hours, etc.
- It can also be used to evaluate the effect of policies.
  - For example, KU introduces a new one-year MA program. How do you evaluate the impact of this new program?
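
In the simplest case (one treated group, one control group, one period before and one after), the evaluation can be sketched as below. This is only an illustration on made-up numbers: the data frame `grads` and its columns `treated`, `post` and `salary` are hypothetical and do not come from the course data. It shows that the difference of the two before-after changes equals the interaction coefficient in a regression.

```r
# Hypothetical toy data (assumption): mean salary for two groups in two periods
grads <- data.frame(
  treated = c(0, 0, 1, 1),     # 1 = group that gets the new program
  post    = c(0, 1, 0, 1),     # 1 = period after the program starts
  salary  = c(30, 33, 31, 37)  # made-up group means
)

# Calculation 1: difference of the before-after changes
change_treated <- grads$salary[grads$treated == 1 & grads$post == 1] -
  grads$salary[grads$treated == 1 & grads$post == 0]  # 37 - 31 = 6
change_control <- grads$salary[grads$treated == 0 & grads$post == 1] -
  grads$salary[grads$treated == 0 & grads$post == 0]  # 33 - 30 = 3
did_by_hand <- change_treated - change_control         # 6 - 3 = 3

# Calculation 2: coefficient on the treated x post interaction
fit <- lm(salary ~ treated * post, data = grads)
did_by_lm <- unname(coef(fit)["treated:post"])

c(by_hand = did_by_hand, by_lm = did_by_lm)
```

Both calculations give 3 here. Reading this number as the effect of the program relies on the parallel-trends assumption, that is, on the control group's change of 3 being what the treated group would have experienced without the program.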
---

#Difference in difference: time-varying treatment

- Time-varying treatment, sometimes called staggered treatment
  - More than two groups: one control group, one group treated earlier, one group treated later
  - More than two periods
  - The treatment time is not the same for all members of the treated group
  - This is the so-called **staggered treatment**
- Example: KU introduces a new one-year MA program. How does this affect the salary of the graduates?
  - The program is rolled out to students at different times based on their departments:
      - One-year MA program in the Sociology Department starts in Sep 2025.
      - One-year MA program in the Psychology Department starts in Sep 2026.
      - One-year MA program in the Economics Department starts in Sep 2027.
  - Data: We have yearly data on graduates' employment and salary from January 2022 to December 2030.
  - Goal: To estimate the causal effect of the one-year MA program on graduates' salary.

---

# TWFE is problematic when the treatment is staggered

The TWFE model includes:

- Unit fixed effects `\(μ_{i}\)`: capture time-invariant characteristics of each unit (e.g., state, individual, etc.)
- Time fixed effects `\(λ_{t}\)`: capture time-varying factors common to all units (e.g., macroeconomic trends).
- The regression equation is: `\(Y_{i,t}=μ_{i}+λ_{t}+ β^{DD}Treatment_{i,t}+ ϵ_{i,t}\)`
  - `\(Treatment_{i,t}\)` is a binary indicator equal to 1 if individual `\(i\)` is treated at time `\(t\)`, and 0 otherwise.
  - `\(β^{DD}\)` is the average treatment effect on the treated (ATT).
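
To make the regression above concrete, here is a minimal sketch on simulated, noise-free data. Everything in it is an assumption made for illustration: the data frame `panel`, the variables `g`, `D` and `effect`, two groups treated at different times with no never-treated group, and an effect that grows with time since treatment. It estimates the TWFE coefficient with base R `lm()` and unit and time dummies rather than with `plm`.

```r
# Hypothetical noise-free panel (assumption): 4 units, 3 periods
panel <- expand.grid(id = 1:4, t = 1:3)
panel$g <- ifelse(panel$id <= 2, 2, 3)     # first treated period: early group = 2, late group = 3
panel$D <- as.numeric(panel$t >= panel$g)  # Treatment_{i,t}

# Assumed effect: 1 in the first treated period, 2 in the second
panel$effect <- ifelse(panel$D == 1, panel$t - panel$g + 1, 0)

# Outcome = constant + unit effect + time effect + treatment effect
panel$y <- 10 + panel$id + 0.5 * panel$t + panel$effect

# TWFE regression: treatment indicator plus unit and time dummies
twfe <- lm(y ~ D + factor(id) + factor(t), data = panel)
beta_dd <- unname(coef(twfe)["D"])

# True average effect over all treated observations
true_att <- mean(panel$effect[panel$D == 1])

round(c(beta_dd = beta_dd, true_att = true_att), 3)
```

In this constructed example every treated observation has an effect of at least 1 and the true average is about 1.333, yet the TWFE coefficient is 0.5. The reason is that the already-treated early group, whose outcome is still rising, serves as the control for the late group.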
- Using a two-way fixed effect (TWFE) model to estimate the average treatment effect on the treated (ATT) is problematic (Goodman-Bacon 2018)
  - Problem 1: `\(β^{DD}\)` becomes a hard-to-interpret weighted average, because of the weights it puts on the underlying comparisons
  - Problem 2: the treatment effects may be heterogeneous across groups and over time

---

# TWFE is problematic when the treatment is staggered

- Goodman-Bacon (2018) shows that `\(β^{DD}\)` is a [strange weighted average](https://www.youtube.com/watch?v=aUHCAG98G-o) of 2x2 comparisons (see the following graphs for different 2x2 comparisons).
  - Later treated groups serve as a control group for earlier treated groups
  - Earlier treated groups also serve as a control group for later treated groups
  - Heterogeneous treatment effects may lead to severe bias

<img src="https://d33wubrfki0l68.cloudfront.net/53a39a756721843acc5f97dc13f07b72991a38aa/716a0/post/2019-09-25-difference-in-differences-methodology_files/figure-html/unnamed-chunk-3-1.png" width="40%" style="position:absolute; left:50px; top:330px;">
<img src="https://d33wubrfki0l68.cloudfront.net/5ea38f7e27db439faeb5eafecb24d7a2d48fe581/7c116/post/2019-09-25-difference-in-differences-methodology_files/figure-html/unnamed-chunk-4-1.png" width="40%" style="position:absolute; right:150px; top:330px;">

---

#Difference in difference: time-varying treatment

- Using a two-way fixed effect (TWFE) model to estimate the average treatment effect on the treated (ATT) is problematic (Goodman-Bacon 2018)
  - It is difficult to interpret what the ATT from the two-way fixed effect model means
  - `\(β^{DD}\)` becomes a very strange weighted average, due to the strange weights (Goodman-Bacon 2018)

<img src="https://github.com/fancycmn/2024-Session13/blob/main/Figure1.JPG?raw=true" width="80%" style="display: block; margin-left:0px;">

---

# Two-way fixed effect is problematic

- `\(β^{DD}\)` from the two-way fixed effect estimate depends on many factors: variation in treatment time, group size, etc.
  - treated group vs untreated group
  - earlier treated vs later treated
  - later treated vs earlier treated
- Greater weight will be given to
  - Big groups (i.e. many observations)
  - Groups that are treated closer to the middle of the sample period
- Therefore, because of these weights, it is very difficult to explain what `\(β^{DD}\)` means

---

#Solutions: Callaway and Sant’Anna (2021)

- Give a group-period specific estimate
  - ATT(g,t): average treatment effect on the treated for group g at time t
- Use the "never treated" group as the comparison for every treated group and every period
- Summarize the ATT(g,t)s, using weights that are flexible and meaningful to you

---

#Using DID package

- [prepare the data](https://rpubs.com/fancycmn/1251748)
- three functions from the `did` package
  - estimate a group-period specific effect: `att_gt()`
  - plot the result: `ggdid()`
  - summarize the aggregate treatment effect: `aggte()`

---

#Estimate a group-period specific effect

```r
sixwaves_long4 <- sixwaves_long3 %>%
  group_by(id) %>%
  mutate(
    treatgroup = case_when(
      anchorwave %in% c(2:6) ~ anchorwave, # use anchorwave to define the different treated groups
      anchorwave == 99 ~ 0 # if not treated, define the treatgroup as 0, that is the control group
    )
  )
```

<img src="https://github.com/fancycmn/2024-Session13/blob/main/Figure2.JPG?raw=true" width="100%" style="display: block; margin-left:0px;">

---

#Estimate a group-period specific effect

```r
did <- att_gt(yname = "sat", # dependent variable
              tname = "wave", # time variable
              idname = "id", # id
              gname = "treatgroup", # identify five treatment groups
              xformla = ~ 1, # when you don't have any covariates to control, use "~ 1"; if yes, you can add covariates here by ~ x1+x2
              data = sixwaves_long4 # specify your data
              )
```

```
## Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
## Dropped 1844 observations while converting to balanced panel.
```

---

#Estimate a group-period specific effect

```r
summary(did)
```

```
## 
## Call:
## att_gt(yname = "sat", tname = "wave", idname = "id", gname = "treatgroup", 
##     xformla = ~1, data = sixwaves_long4)
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]
##      2    2   0.4211     0.2102      -0.1838       1.0261
##      2    3   0.5487     0.1890       0.0050       1.0925 *
##      2    4   0.3960     0.2137      -0.2191       1.0110
##      2    5   0.3775     0.2126      -0.2343       0.9893
##      2    6   0.3463     0.2313      -0.3191       1.0117
##      3    2   0.1969     0.2009      -0.3813       0.7751
##      3    3   0.3428     0.1971      -0.2242       0.9098
##      3    4  -0.0380     0.2866      -0.8626       0.7867
##      3    5  -0.1051     0.2239      -0.7492       0.5390
##      3    6   0.2281     0.2099      -0.3757       0.8320
##      4    2  -0.2392     0.2341      -0.9128       0.4344
##      4    3   0.3234     0.2081      -0.2754       0.9223
##      4    4   0.0800     0.2108      -0.5265       0.6865
##      4    5   0.1634     0.1605      -0.2984       0.6253
##      4    6   0.3281     0.1797      -0.1891       0.8452
##      5    2   0.2097     0.2408      -0.4831       0.9024
##      5    3  -0.2581     0.2160      -0.8795       0.3634
##      5    4   0.2928     0.2047      -0.2961       0.8817
##      5    5   0.2506     0.2130      -0.3623       0.8635
##      5    6   0.1241     0.2107      -0.4821       0.7302
##      6    2  -0.4166     0.2878      -1.2448       0.4116
##      6    3  -0.0192     0.2405      -0.7114       0.6729
##      6    4  -0.1032     0.3224      -1.0310       0.8245
##      6    5   0.1628     0.3322      -0.7931       1.1186
##      6    6   0.4390     0.2421      -0.2576       1.1356
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.3068
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Doubly Robust
```

---

#Plot group-period specific effect

```r
ggdid(did, ylim = c(-2, 2))
```

<img src="https://github.com/fancycmn/slide13/blob/main/S13_Pic6.png?raw=true" width="45%" style="display: block; margin-left:20px;">

---

#Summarize the ATT(g,t)s: get one overall ATT

```r
# "simple" (this just computes a weighted average of all group-time average treatment effects with weights proportional to group size)
agg.overall <- aggte(did, type = "simple") # summarize the overall effect
summary(agg.overall)
```

```
## 
## Call:
## aggte(MP = did, type = "simple")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
##     ATT    Std. Error     [ 95%  Conf. Int.]
##  0.2726        0.0976     0.0814      0.4638 *
## 
## 
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Doubly Robust
```

---

#Summarize the ATT(g,t)s: get a group-specific ATT

```r
# average treatment effects across different groups
agg.group <- aggte(did, type = "group") # summarize the effect by the group
summary(agg.group)
```

```
## 
## Call:
## aggte(MP = did, type = "group")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
## Overall summary of ATT's based on group/cohort aggregation:
##     ATT    Std. Error     [ 95%  Conf. Int.]
##  0.2678        0.0892      0.093      0.4426 *
## 
## 
## Group Effects:
##  Group Estimate Std. Error [95% Simult.  Conf. Band]
##      2   0.4179     0.1752       0.0012       0.8347 *
##      3   0.1070     0.1854      -0.3341       0.5480
##      4   0.1905     0.1473      -0.1599       0.5409
##      5   0.1873     0.1852      -0.2533       0.6280
##      6   0.4390     0.2441      -0.1418       1.0198
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Doubly Robust
```

---

#Summarize the ATT(g,t)s: plot a group-specific ATT

```r
ggdid(agg.group) # plot the effect by the group
```

<img src="https://github.com/fancycmn/slide13/blob/main/S13_Pic7.png?raw=true" width="45%" style="display: block; margin-left:20px;">

---

#Summarize the ATT(g,t)s: get a time-dynamic ATT

```r
agg.dynamic <- aggte(did, type = "dynamic") # summarize the time dynamic effect
summary(agg.dynamic)
```

```
## 
## Call:
## aggte(MP = did, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:
##     ATT    Std. Error     [ 95%  Conf. Int.]
##  0.2841          0.11     0.0684      0.4997 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]
##          -4  -0.4166     0.2780      -1.1785       0.3453
##          -3   0.1110     0.1618      -0.3323       0.5543
##          -2  -0.2111     0.1331      -0.5759       0.1537
##          -1   0.2526     0.1012      -0.0249       0.5300
##           0   0.3012     0.0890       0.0573       0.5452 *
##           1   0.2278     0.1023      -0.0524       0.5081
##           2   0.2309     0.1152      -0.0848       0.5465
##           3   0.3142     0.1554      -0.1118       0.7401
##           4   0.3463     0.2476      -0.3322       1.0249
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Never Treated,  Anticipation Periods:  0
## Estimation Method:  Doubly Robust
```

---

#Summarize the ATT(g,t)s: plot a time-dynamic ATT

```r
ggdid(agg.dynamic) # plot the time dynamic effect
```

<img src="https://github.com/fancycmn/slide13/blob/main/S13_Pic10.png?raw=true" width="45%" style="display: block; margin-left:20px;">

---

#Estimate a group-period specific effect: using an unbalanced panel

Sometimes the balanced panel leaves only a very small sample. In that case you can use the unbalanced-panel option. However, you should consider the selection issues that come with either balanced or unbalanced data.

```r
did_unbalanced <- att_gt(yname = "sat", # dependent variable
                         tname = "wave", # time variable
                         idname = "id", # id
                         gname = "treatgroup", # identify five treatment groups
                         xformla = ~ 1, # when you don't have any covariates to control, use "~ 1"; if yes, you can add covariates here by ~ x1+x2
                         data = sixwaves_long4, # specify your data
                         allow_unbalanced_panel = TRUE # use the unbalanced panel
                         )
```

---

#Take home

- What is staggered treatment
- What is the problem of using two-way fixed effect to estimate the ATT of staggered treatment
- Using `did` package to do staggered DID
  - `att_gt()`: estimate the group-time specific effect
  - `ggdid()`: plot the effect
  - `aggte()`: aggregate the group-time specific effect
- [weight explanation in Callaway and Sant’Anna (1:08:01)](https://www.youtube.com/watch?v=VLviaylakAo&t=4642s)

---
class: center, middle

#[Exercise](https://rpubs.com/fancycmn/1251752)