class: center, middle, inverse, title-slide

.title[
# Evaluating the Minneapolis 2040 Comprehensive Plan with Synthetic Control Method
]
.author[
### Kim-Eng Ky
]
.institute[
### Federal Reserve Bank of Minneapolis
]
.date[
### December 2, 2022
]

---
class: center, middle

# Disclaimer

The views expressed here are the presenter's and not necessarily those of the Federal Reserve Bank of Minneapolis or the Federal Reserve System.

---

# Motivation

- The Minneapolis 2040 comprehensive plan went into effect in late 2019.

- "In 2040, all Minneapolis residents will be able to afford and access quality housing throughout the city."

- Minneapolis became the first major city in the U.S. to eliminate zoning regulations that ban the construction of duplexes and triplexes.

---

# Methodology

- 10 indicators grouped into three categories: More Housing, More Affordable Housing, and More Equitable Housing
- Synthetic Control Method (SCM) is applied to each indicator to estimate the treatment effect
- Permutation test to assess statistical significance

---

# Why synthetic control?

- Systematically selects comparison groups and their weights
- Can account for the effects of confounders changing over time by weighting the control groups to better match the treatment group before the intervention

--

But...

--

It's not protected from events that may only affect the treatment unit.
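--

For reference, the weighting idea in the bullets above can be written compactly. The notation is an assumption of this sketch (it follows the usual convention in Abadie, Diamond, and Hainmueller 2010, not anything shown on these slides): Minneapolis is unit 1, the control units are `\(j = 2, \dots, J+1\)`, and `\(Y_{jt}\)` is the outcome of unit `\(j\)` in year `\(t\)`.

$$\hat{Y}_{1t}^{N} = \sum_{j=2}^{J+1} w_j Y_{jt}, \qquad w_j \ge 0, \qquad \sum_{j=2}^{J+1} w_j = 1, \qquad \hat{\tau}_{1t} = Y_{1t} - \hat{Y}_{1t}^{N}$$

Here `\(\hat{Y}_{1t}^{N}\)` is the estimated outcome without the plan and `\(\hat{\tau}_{1t}\)` is the estimated effect in year `\(t\)`.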
---

# Assumptions

- Potential control units cannot have a similar policy change
- The policy of interest cannot affect the outcome in the control units, i.e., no spillover effects
- Counterfactual outcome can be approximated by a fixed combination of control units

---

# Synthetic Control Method (SCM)

(1) Identify predictors of the outcome variables

- Total population
- Homeownership rate
- Median housing cost
- Median household income
- Odd-year pre-treatment lagged values of the outcome

--

(2) Identify potential control units

- Total population between 150,000 and 2,000,000
- No similar policy in effect around the same time (e.g., Portland, OR)
- No cities that may experience spillover effects of the plan (e.g., St. Paul, MN)
- Principal cities only (e.g., excluding Irving, VA)
- Cities that are a census place during the whole study period (2010-current) (e.g., excluding Urban Honolulu census designated place, HI)

---

# Synthetic Control Method (SCM)

(3) Estimate predictor weights

- Equal weights
- Regression-based weights

--

(4) Estimate control weights

- Minimize the root mean squared prediction error over the pre-treatment period

---

# Assessing statistical significance

(5) Calculate p-value

- Construct a counterfactual for each control unit using all other control units (not including Minneapolis)
- Calculate the post-to-pre-treatment root mean squared error (RMSE) ratio ( `\(r_j\)` ) for Minneapolis ( `\(r_1\)` ) and all control units
- Calculate the p-value: `\(p = \frac{1}{J+1} \sum_{j=1}^{J+1}I_+(r_j - r_1)\)`, where `\(J\)` is the number of control units and `\(I_+(\cdot)\)` equals 1 for nonnegative arguments and 0 otherwise

---

# Implementation: Housing cost burden among extremely low-income renting households

---
class: inverse

# R package: `Synth`

"The R package `Synth` implements synthetic control methods for comparative case studies designed to estimate the causal effects of policy interventions and other events of interest (Abadie, Diamond, and Hainmueller 2010)."

```
install.packages("Synth")
library(Synth)
```

---

# Preparing the data: input data
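The input table itself is not reproduced here. As a rough sketch, assuming the column names passed to `dataprep()` in the next step, a long-format table with one row per city and year might look like the following. The object name `example_input`, the placeholder city names, and every number are invented for illustration.

```r
# Hypothetical long-format input: one row per city (place_id) and year.
# Column names mirror the dataprep() arguments; every number is invented.
example_input <- data.frame(
  place_id           = rep(1:3, each = 2),   # numeric unit identifier
  place_name         = rep(c("Minneapolis", "City A", "City B"), each = 2),
  year               = rep(c(2018, 2019), times = 3),
  value              = c(0.78, 0.77, 0.80, 0.81, 0.75, 0.76),  # outcome
  pct_owner_occupied = c(0.47, 0.47, 0.52, 0.53, 0.44, 0.44),
  total_population   = c(425000, 429000, 310000, 312000, 650000, 655000),
  housing_cost       = c(1100, 1150, 950, 980, 1300, 1340),
  household_income   = c(63000, 66000, 55000, 57000, 70000, 72000),
  stringsAsFactors   = FALSE
)

# Inspect the column types before handing the table to dataprep()
str(example_input)
```

The unit identifier is kept numeric and the unit name is kept as character, which is what `dataprep()` expects for `unit.variable` and `unit.names.variable`.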
---
class: inverse

# Preparing the data: required format

```
data_prep_out <- dataprep(
  foo = input_data,
  predictors = c("pct_owner_occupied", "total_population",
                 "housing_cost", "household_income"),
  predictors.op = "mean",
  dependent = "value",
  unit.variable = "place_id",
  time.variable = "year",
  special.predictors = list(
    list("value", 2019, "mean"),
    list("value", 2017, "mean"),
    list("value", 2015, "mean"),
    list("value", 2013, "mean"),
    list("value", 2011, "mean"),
    list("value", 2009, "mean")
  ),
  treatment.identifier = 1,
  controls.identifier = 2:127,
  time.predictors.prior = 2008:2019,
  time.optimize.ssr = 2008:2019,
  unit.names.variable = "place_name",
  time.plot = 2008:2020
)
```

---
class: inverse

# Estimating and visualizing the counterfactual

```
synth_out <- synth(data_prep_out, verbose = TRUE)
path.plot(synth.res = synth_out, dataprep.res = data_prep_out)
```

---
class: inverse

# Better visualization

```
library(data.table); library(ggplot2)
library(scales)  # provides percent() for the y-axis labels
# extract weights of each donor city
weights <- data.table(synth_out$solution.w,
                      place_id = rownames(synth_out$solution.w))
# calculate the counterfactual for Minneapolis
data_w_weights <- weights[input_data, on = .(place_id), nomatch = 0]
scm <- data_w_weights[, .(scm_value = sum(value * w.weight)), .(year)]
# add the actual value
mpls_outcome <- input_data[place == "2743000",
                           .(year, actual_value = value, place_name)]
scm <- scm[mpls_outcome, on = .(year)]
ggplot(scm, aes(x = year)) +
  geom_line(aes(y = scm_value, color = "Counterfactual"), size = 1) +
  geom_line(aes(y = actual_value, color = "Actual"), size = 1) +
  geom_vline(xintercept = 2019, linetype = "dashed") +
  scale_color_manual(values = c("#003b5c", "#cfb023")) +
  scale_x_continuous(breaks = 2008:2020) +
  scale_y_continuous(limits = c(.5, 1), labels = percent) +
  labs(x = NULL, y = "Housing cost burden", color = NULL,
       title = "Housing cost burden among extremely low-income renting households",
       subtitle = "Minneapolis and the comparison without the Minneapolis 2040 plan") +
  theme(legend.position = c(.2, .2))
```

---

# Better visualization

---
class: inverse

# Calculating p-value

```
library(stringr)  # provides str_detect()
# `placebos` holds one row per city and year with scm_value and actual_value
# 1. Construct the pre-treatment and post-treatment root mean square error
# 2. Calculate the ratio of the post-treatment RMSE to pre-treatment RMSE
# 3. Order the observations from lowest to highest based on the ratio and then
#    determine each city's percentile rank
placebos[, spe := (scm_value - actual_value) ^ 2]
rmse <- placebos[time_var <= treatment_period,
                 .(pre = sqrt(mean(spe)), pre_nobs = .N),
                 by = .(place_id, place_name)]
xpost <- placebos[time_var > treatment_period,
                  .(post = sqrt(mean(spe)), post_nobs = .N),
                  by = .(place_id, place_name)]
rmse <- rmse[xpost, on = .(place_id, place_name)]
rmse[, ratio := post / pre]
mpls_ratio <- rmse[str_detect(place_name, "Minneapolis"), ratio]
pval <- rmse[, mean(ratio >= mpls_ratio)]
```

---

# Putting it all together

https://minneapolisfed.shinyapps.io/Minneapolis-Indicators/

<img src="data:image/png;base64,#dashboard-landing-page.png" alt="landing-page" width="800"/>

---

# References

[1] [Alberto Abadie, Alexis Diamond & Jens Hainmueller (2010)](https://www.tandfonline.com/doi/abs/10.1198/jasa.2009.ap08746) Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program, Journal of the American Statistical Association, 105:490, 493-505, DOI: 10.1198/jasa.2009.ap08746

[2] [Robert McClelland and Sarah Gault (2017)](https://www.urban.org/sites/default/files/publication/89246/the_synthetic_control_method_as_a_tool_1.pdf) The Synthetic Control Method as a Tool to Understand State Policy, Urban Institute

[3] [Minneapolis 2040 Housing Indicators: Technical Appendix](https://www.minneapolisfed.org/~/media/assets/articles/2021/new-fed-tool-will-measure-zoning-reforms-impacts-on-housing-affordability-in-minneapolis/minneapolis-housing-indicators-technical-appendix.pdf)

---
class: inverse, middle, center

# Thank you!
Kim-Eng Ky

Twitter: @kykimeng

Email: Kim.Ky@mpls.frb.org

Slides: https://rpubs.com/kykimeng/rforgov
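---

# Appendix: toy p-value calculation

A minimal sketch of step (5) with made-up RMSE ratios. The unit labels and numbers are hypothetical and are not results from the analysis. The p-value is the share of all units, the treated unit included, whose post-to-pre-treatment RMSE ratio is at least as large as the treated unit's.

```r
# Hypothetical post/pre RMSE ratios; "mpls" stands in for the treated unit
ratios <- c(mpls = 4.0, city_a = 1.2, city_b = 0.8, city_c = 5.1, city_d = 2.0)

# Share of units with a ratio at least as large as the treated unit's
pval <- mean(ratios >= ratios[["mpls"]])
print(pval)
```

Two of the five units (the treated unit itself and `city_c`) meet the threshold, so the toy p-value is 2/5 = 0.4. Because the treated unit always counts toward the total, the smallest attainable p-value with `\(J\)` controls is `\(1/(J+1)\)`.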