class: center, middle, inverse, title-slide

.title[
# Advanced quantitative data analysis
]
.subtitle[
## Difference in Difference II
]
.author[
### Mengni Chen
]
.institute[
### Department of Sociology, University of Copenhagen
]

---

<style type="text/css">
.remark-slide-content {
  font-size: 20px;
  padding: 20px 80px 20px 80px;
}
.remark-code, .remark-inline-code {
  background: #f0f0f0;
}
.remark-code {
  font-size: 14px;
}
</style>

#Let's get ready

``` r
#install.packages("did")
library(tidyverse) # recoding
library(haven) # import data
library(janitor) # tabulation
library(splitstackshape) # transform wide data to long data
library(plm) # panel data analysis
library(did) # difference in difference analysis
```

---
#Difference in Difference: Visualize the three ways of understanding

- Assumption
  - We assume that the trends of the dependent variable over time were identical in the treated and non-treated groups before the treatment takes place.
  - We assume that the trends would have remained parallel had there been no treatment.

- Three ways of understanding

<img src="https://github.com/fancycmn/slide12/blob/main/S12_Pic11.PNG?raw=true" width="100%" style="display: block; margin-left:0px;">

---
#Difference in Difference: application

- DiD can be used not only to study the effect of life events on individuals' life satisfaction, mental health, salary, working hours, etc., but also to evaluate the effect of policies.
- Example: KU introduces a new one-year MA program. How does this affect the salary of the graduates?
- The program is rolled out to students at different times based on their faculties:
  - One-year MA program in all departments within the Faculty of Science starts in Sep 2026.
  - One-year MA program in all departments within the Faculties of Law and Humanities starts in Sep 2027.
  - One-year MA program in all departments within the Faculties of Theology and Social Science starts in Sep 2028.
  - All other faculties will not have this reform.
- Data: We have yearly data on all departments' employment and salary of graduates from January 2022 to December 2030.
- Goal: To estimate the causal effect of the one-year MA program on graduates' salary.

---
#Difference in difference: time-varying treatment

- Time-varying treatment, sometimes called staggered treatment
- More than two groups: one control group, one group treated earlier, and one group treated later
- More than two periods
- The treatment time is not the same for all members of the treated group
- This is the so-called **staggered treatment**

---
# TWFE is problematic when the treatment is staggered

The TWFE model includes:
- Unit Fixed Effects `\(μ_{i}\)`: Captures time-invariant characteristics of each unit (e.g., state, individual, etc.)
- Time Fixed Effects `\(λ_{t}\)`: Captures time-varying factors common to all units (e.g., macroeconomic trends).
- The regression equation is: `\(Y_{i,t}=μ_{i}+λ_{t}+ β^{DD}Treatment_{i,t}+ ϵ_{i,t}\)`
  - `\(Treatment_{i,t}\)` is a binary indicator equal to 1 if unit `\(i\)` is treated at time `\(t\)`, and 0 otherwise.
  - `\(β^{DD}\)` is meant to capture the average treatment effect on the treated (ATT).
- Using a two-way fixed effect (TWFE) model to estimate the average treatment effect (ATT) is problematic (Goodman-Bacon 2018)
  - Problem 1: `\(β^{DD}\)` becomes a weighted average of many 2x2 comparisons, with weights that are hard to interpret
  - Problem 2: the treatment effects may be heterogeneous across groups and over time, which a single `\(β^{DD}\)` cannot show

---
# TWFE is problematic when the treatment is staggered

- Goodman-Bacon (2018) shows that `\(β^{DD}\)` is a [strange weighted average](https://www.youtube.com/watch?v=aUHCAG98G-o) of 2x2 comparisons (see the following graphs for different 2x2 comparisons).
- Later-treated groups become a control group for earlier-treated groups
- Earlier-treated groups also become a control group for later-treated groups
- Heterogeneous treatment effects may lead to severe bias

<img src="https://d33wubrfki0l68.cloudfront.net/53a39a756721843acc5f97dc13f07b72991a38aa/716a0/post/2019-09-25-difference-in-differences-methodology_files/figure-html/unnamed-chunk-3-1.png" width="40%" style="position:absolute; left:50px; top:330px;">
<img src="https://d33wubrfki0l68.cloudfront.net/5ea38f7e27db439faeb5eafecb24d7a2d48fe581/7c116/post/2019-09-25-difference-in-differences-methodology_files/figure-html/unnamed-chunk-4-1.png" width="40%" style="position:absolute; right:150px; top:330px;">

---
#Difference in difference: time-varying treatment

- Using a two-way fixed effect (TWFE) model to estimate the average treatment effect (ATT) is problematic (Goodman-Bacon 2018)
  - It is difficult to interpret what the ATT from the two-way fixed effect model means
  - `\(β^{DD}\)` becomes a weighted average whose weights are hard to interpret (Goodman-Bacon 2018)

<img src="https://github.com/fancycmn/2024-Session13/blob/main/Figure1.JPG?raw=true" width="80%" style="display: block; margin-left:0px;">

---
# Two-way fixed effect is problematic

- `\(β^{DD}\)` from the two-way fixed effect estimate depends on many factors (variation in treatment timing, group size, etc.) because it pools several comparisons:
  - treated group vs untreated group
  - earlier treated vs later treated
  - later treated vs earlier treated
- Greater weight will be given to
  - Big groups (i.e.
many observations)
  - Groups that are treated closer to the middle of the sample period
- Therefore, because of these weights, it is very difficult to explain what `\(β^{DD}\)` means

---
#Solutions: Callaway and Sant’Anna (2021)

- Give a group-period specific estimate
  - ATT(g,t): the average treatment effect for group g at time t
- Use the "never treated" group as the comparison for every treated group and every period
- Summarize the ATT(g,t)s using weights that are flexible and meaningful to you

---
#Using DID package

- [prepare the data](https://rpubs.com/fancycmn/1376008)
- three functions from the `did` package
  - estimate a group-period specific effect: `att_gt()`
  - plot the result: `ggdid()`
  - summarize the aggregate treatment effect: `aggte()`

---
#Create a variable which can show the staggered treatment

- create `anchorwave`

``` r
#creating anchorwave
sixwaves_long3 <- sixwaves_long2 %>%
  group_by(id) %>%
  mutate(
    # once we define sixwaves_long2a as a panel structure, wave becomes a factor;
    # convert it back to numeric via character so the wave values, not the factor codes, are kept
    wave = as.numeric(as.character(wave)),
    # identify at which wave the person gets a partner; for the rest, make it 99
    partnerwave = case_when(getpartner == 1 ~ wave,
                            TRUE ~ 99),
    anchorwave = min(partnerwave) # anchor the time of the event
  )
```

<img src="https://github.com/fancycmn/25-Session3/blob/main/S13-F1.png?raw=true" width="100%" style="display: block; margin-left:0px;">

---
#Create a variable which can show the staggered treatment

- based on `anchorwave`, create `treatgroup`

``` r
#creating treatgroup
sixwaves_long4 <- sixwaves_long3 %>%
  group_by(id) %>%
  mutate(
    treatgroup = case_when(
      anchorwave %in% c(2:6) ~ anchorwave, # use anchorwave to define the different treated groups
      anchorwave == 99 ~ 0) # if not treated, define the treatgroup as 0, that is the control group
  )
```

<img src="https://github.com/fancycmn/25-Session3/blob/main/S13-F2.png?raw=true" width="100%" style="display: block; margin-left:0px;">

---
#Estimate a group-period specific effect

``` r
did_result <- att_gt(yname = "sat", #dependent variable
                     tname = "wave", #time variable
                     idname = "id", #id
                     gname = "treatgroup", #identify five treatment groups
                     xformla = ~ 1, #when you don't have any covariates to control, use "~ 1"; if yes, you can add covariates here by ~ x1+x2
                     data = sixwaves_long4 #specify your data
                     )
```

```
## Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
## Dropped 1844 observations while converting to balanced panel.
```

---
#Estimate a group-period specific effect

``` r
summary(did_result)
```

```
## 
## Call:
## att_gt(yname = "sat", tname = "wave", idname = "id", gname = "treatgroup", 
##     xformla = ~1, data = sixwaves_long4)
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult. Conf. Band]
##      2    2   0.4211     0.1971      -0.1468      0.9891
##      2    3   0.5487     0.2030      -0.0361      1.1335
##      2    4   0.3960     0.2055      -0.1961      0.9880
##      2    5   0.3775     0.2147      -0.2409      0.9959
##      2    6   0.3463     0.2468      -0.3647      1.0573
##      3    2   0.1969     0.1960      -0.3677      0.7616
##      3    3   0.3428     0.2068      -0.2530      0.9386
##      3    4  -0.0380     0.2980      -0.8964      0.8204
##      3    5  -0.1051     0.2114      -0.7141      0.5039
##      3    6   0.2281     0.2000      -0.3481      0.8044
##      4    2  -0.2392     0.2318      -0.9071      0.4286
##      4    3   0.3234     0.2120      -0.2874      0.9343
##      4    4   0.0800     0.2073      -0.5171      0.6771
##      4    5   0.1634     0.1584      -0.2930      0.6199
##      4    6   0.3281     0.1855      -0.2062      0.8623
##      5    2   0.2097     0.2457      -0.4983      0.9176
##      5    3  -0.2581     0.2233      -0.9013      0.3852
##      5    4   0.2928     0.2083      -0.3074      0.8930
##      5    5   0.2506     0.2378      -0.4345      0.9357
##      5    6   0.1241     0.1854      -0.4100      0.6582
##      6    2  -0.4166     0.2892      -1.2498      0.4166
##      6    3  -0.0192     0.2305      -0.6831      0.6447
##      6    4  -0.1032     0.2877      -0.9320      0.7255
##      6    5   0.1628     0.3241      -0.7709      1.0964
##      6    6   0.4390     0.2572      -0.3021      1.1801
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption: 0.3068
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Plot group-period specific effect

``` r
ggdid(did_result, ylim=c(-2,2))
```

*Because the confidence bands of all pre-treatment estimates cover 0, there is no evidence against the parallel trends assumption.*

<img src="https://github.com/fancycmn/slide13/blob/main/S13_Pic6.png?raw=true" width="40%" style="display: block; margin-left:20px;">

---
#Estimate an overall effect

``` r
#"simple" (this just computes a weighted average of all group-time average treatment effects with weights proportional to group size)
overall_effect <- aggte(did_result, type = "simple")
#summarize the overall effect
summary(overall_effect)
```

```
## 
## Call:
## aggte(MP = did_result, type = "simple")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
##     ATT    Std. Error     [ 95% Conf. Int.]
##  0.2726        0.0937     0.089      0.4562 *
## 
## 
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Estimate group-specific effects

``` r
#average treatment effects across different groups
group_effect <- aggte(did_result, type = "group")
#summarize the effect by the group
summary(group_effect)
```

```
## 
## Call:
## aggte(MP = did_result, type = "group")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
## Overall summary of ATT's based on group/cohort aggregation:
##     ATT    Std. Error     [ 95% Conf. Int.]
##  0.2678         0.082    0.1071      0.4285 *
## 
## 
## Group Effects:
##  Group Estimate Std. Error [95% Simult. Conf. Band]
##      2   0.4179     0.1897      -0.0608      0.8967
##      3   0.1070     0.1996      -0.3968      0.6107
##      4   0.1905     0.1449      -0.1752      0.5562
##      5   0.1873     0.1839      -0.2769      0.6516
##      6   0.4390     0.2456      -0.1810      1.0590
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Estimate group-specific effects: visualization

``` r
ggdid(group_effect) #plot the effect by the group
```

<img src="https://github.com/fancycmn/slide13/blob/main/S13_Pic7.png?raw=true" width="45%" style="display: block; margin-left:20px;">

---
#Estimate time-dynamic effect

``` r
dynamic_effect <- aggte(did_result, type = "dynamic")
#summarize the time dynamic effect
summary(dynamic_effect)
```

```
## 
## Call:
## aggte(MP = did_result, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:
##     ATT    Std. Error     [ 95% Conf. Int.]
##  0.2841        0.1168    0.0551      0.5131 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult. Conf. Band]
##          -4  -0.4166     0.2639      -1.1515      0.3183
##          -3   0.1110     0.1633      -0.3439      0.5659
##          -2  -0.2111     0.1336      -0.5833      0.1610
##          -1   0.2526     0.1086      -0.0499      0.5551
##           0   0.3012     0.0882       0.0556      0.5469 *
##           1   0.2278     0.1056      -0.0662      0.5219
##           2   0.2309     0.1187      -0.0998      0.5615
##           3   0.3142     0.1477      -0.0973      0.7257
##           4   0.3463     0.2504      -0.3512      1.0439
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Estimate time-dynamic effect: visualization

``` r
ggdid(dynamic_effect) #plot the time dynamic effect
```

<img src="https://github.com/fancycmn/slide13/blob/main/S13_Pic10.png?raw=true" width="45%" style="display: block; margin-left:20px;">

---
#Estimate a group-period specific effect: using an unbalanced panel

Sometimes the balanced panel is very small, and as a result all estimates may turn out insignificant. In that case you can use the unbalanced-panel option. However, you should consider the selection issues that come with either balanced or unbalanced data.
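Before choosing between the two, it can help to count how many rows a balanced panel would keep. The following is a minimal sketch in base R, assuming that balancing means keeping only respondents observed in every wave; the data frame `toy` and its values are hypothetical and are not the course data.

``` r
# Toy long-format panel (hypothetical values, not sixwaves_long4):
# ids 1 and 2 are observed in all three waves, id 3 misses wave 3,
# and id 4 is observed in wave 2 only
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 4),
  wave = c(1, 2, 3, 1, 2, 3, 1, 2, 2)
)

n_waves      <- length(unique(toy$wave))  # number of waves in the data
obs_per_id   <- table(toy$id)             # observations per respondent
balanced_ids <- names(obs_per_id)[obs_per_id == n_waves]

# keep only respondents who are observed in every wave
toy_balanced <- toy[as.character(toy$id) %in% balanced_ids, ]

nrow(toy)           # rows in the unbalanced panel
nrow(toy_balanced)  # rows that remain after balancing
```

Comparing the two row counts, and who is dropped, gives a first impression of how selective the balanced sample is.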
``` r
did_unbalanced <- att_gt(yname = "sat", #dependent variable
                         tname = "wave", #time variable
                         idname = "id", #id
                         gname = "treatgroup", #identify five treatment groups
                         xformla = ~ 1, #when you don't have any covariates to control, use "~ 1"; if yes, you can add covariates here by ~ x1+x2
                         data = sixwaves_long4, #specify your data
                         allow_unbalanced_panel = TRUE #you can specify here to use the unbalanced panel
                         )
```

---
#Estimate using an unbalanced panel: group-period specific effect

``` r
summary(did_unbalanced)
```

```
## 
## Call:
## att_gt(yname = "sat", tname = "wave", idname = "id", gname = "treatgroup", 
##     xformla = ~1, data = sixwaves_long4, allow_unbalanced_panel = TRUE)
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult. Conf. Band]
##      2    2   0.3502     0.1000       0.0479      0.6526 *
##      2    3   0.4438     0.1260       0.0631      0.8246 *
##      2    4   0.3988     0.1448      -0.0390      0.8365
##      2    5   0.4052     0.1696      -0.1074      0.9179
##      2    6   0.3517     0.1994      -0.2509      0.9543
##      3    2   0.0900     0.1219      -0.2785      0.4584
##      3    3   0.2369     0.1140      -0.1076      0.5813
##      3    4   0.1393     0.1726      -0.3825      0.6611
##      3    5   0.2802     0.1662      -0.2220      0.7825
##      3    6   0.4630     0.1799      -0.0808      1.0068
##      4    2  -0.4068     0.1554      -0.8765      0.0629
##      4    3   0.3167     0.1464      -0.1257      0.7591
##      4    4   0.1567     0.1484      -0.2919      0.6053
##      4    5   0.3352     0.1702      -0.1793      0.8496
##      4    6   0.4294     0.1717      -0.0897      0.9484
##      5    2   0.0980     0.1577      -0.3787      0.5747
##      5    3  -0.0435     0.1687      -0.5533      0.4663
##      5    4   0.0403     0.1604      -0.4445      0.5250
##      5    5   0.6396     0.1626       0.1482      1.1310 *
##      5    6   0.4349     0.1827      -0.1174      0.9871
##      6    2  -0.4237     0.2196      -1.0874      0.2400
##      6    3   0.0736     0.2125      -0.5687      0.7160
##      6    4  -0.3023     0.2549      -1.0727      0.4680
##      6    5   0.5510     0.2812      -0.2988      1.4008
##      6    6   0.2768     0.2232      -0.3979      0.9515
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption: 0.06269
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Estimate using an unbalanced panel: overall effect

``` r
overall_effect_unbalanced <- aggte(did_unbalanced, type = "simple")
#summarize the overall effect
summary(overall_effect_unbalanced)
```

```
## 
## Call:
## aggte(MP = did_unbalanced, type = "simple")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
##     ATT    Std. Error     [ 95% Conf. Int.]
##  0.3587        0.0758    0.2102      0.5072 *
## 
## 
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Estimate using an unbalanced panel: group-specific effect

``` r
group_effect_unbalanced <- aggte(did_unbalanced, type = "group")
#summarize the effect by the group
summary(group_effect_unbalanced)
```

```
## 
## Call:
## aggte(MP = did_unbalanced, type = "group")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
## Overall summary of ATT's based on group/cohort aggregation:
##     ATT    Std. Error     [ 95% Conf. Int.]
##  0.3612        0.0647    0.2344      0.4881 *
## 
## 
## Group Effects:
##  Group Estimate Std. Error [95% Simult. Conf. Band]
##      2   0.3900     0.1244       0.0814      0.6986 *
##      3   0.2798     0.1178      -0.0123      0.5720
##      4   0.3071     0.1241      -0.0007      0.6149
##      5   0.5372     0.1483       0.1695      0.9049 *
##      6   0.2768     0.2321      -0.2989      0.8525
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Estimate using an unbalanced panel: time-dynamic effect

``` r
dynamic_effect_unbalanced <- aggte(did_unbalanced, type = "dynamic")
#summarize the time dynamic effect
summary(dynamic_effect_unbalanced)
```

```
## 
## Call:
## aggte(MP = did_unbalanced, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:
##     ATT    Std. Error     [ 95% Conf. Int.]
##  0.3628         0.084    0.1983      0.5274 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult. Conf. Band]
##          -4  -0.4237     0.2307      -1.0641      0.2167
##          -3   0.0904     0.1279      -0.2646      0.4454
##          -2  -0.2574     0.0950      -0.5212      0.0064
##          -1   0.1860     0.0750      -0.0221      0.3942
##           0   0.3228     0.0553       0.1692      0.4764 *
##           1   0.3436     0.0763       0.1317      0.5555 *
##           2   0.3689     0.1019       0.0859      0.6519 *
##           3   0.4272     0.1201       0.0939      0.7605 *
##           4   0.3517     0.1912      -0.1791      0.8825
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
```

---
#Take home

- What staggered treatment is
- What the problem is with using two-way fixed effects to estimate the ATT of a staggered treatment
- Using the `did` package to do staggered DID
  - `att_gt()`: estimate the group-time specific effect
  - `ggdid()`: plot the effect
  - `aggte()`: aggregate the group-time specific effect
- [weight explanation in Callaway and Sant’Anna (1:08:01)](https://www.youtube.com/watch?v=VLviaylakAo&t=4642s)

---
class: center, middle

#[Exercise](https://rpubs.com/fancycmn/1375260)
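
---
#Extra: one ATT(g,t) by hand

The group-time effect that `att_gt()` reports can be illustrated by hand in the simplest case. This is only a sketch, not the `did` implementation: it assumes no covariates, a never-treated comparison group, and the wave just before treatment (g - 1) as the base period. The data frame `toy`, its values, and the helper `att_gt_by_hand()` are hypothetical.

``` r
# Toy wide-format data (made-up values): outcome y at waves 1 to 3
# group = wave of first treatment (2), or 0 for the never treated
toy <- data.frame(
  id    = 1:6,
  group = c(2, 2, 2, 0, 0, 0),
  y1    = c(5, 6, 7, 5, 6, 7),
  y2    = c(7, 8, 9, 6, 7, 8),
  y3    = c(8, 9, 10, 6.5, 7.5, 8.5)
)

# ATT(g, t) for t >= g: change from wave g - 1 to wave t in group g,
# minus the same change among the never treated
att_gt_by_hand <- function(data, g, t) {
  change <- data[[paste0("y", t)]] - data[[paste0("y", g - 1)]]
  mean(change[data$group == g]) - mean(change[data$group == 0])
}

att_gt_by_hand(toy, g = 2, t = 2)
att_gt_by_hand(toy, g = 2, t = 3)
```

In this toy example, group 2 gains 2 points between wave 1 and wave 2 while the never treated gain 1, so ATT(2,2) is 1; by wave 3 the gains are 3 and 1.5, so ATT(2,3) is 1.5.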