#read in data
library(readr)
dlaw_event <- read_csv("C:/Users/rlutt/Downloads/dlaw_event.csv")
## Rows: 1617 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (11): stfips, year, _nfd, post, asmrs, pcinc, asmrh, cases, weight, copo...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
stfips: This is a time-invariant variable signifying the state FIPS code. There are multiple observations for each state, which vary by year. year: This variable signifies the year of a specific observation (each row). States are observed over multiple years.
*Note all other varies are nested within a state and a year.
_nfd: This variable is also a year. It represents the year a certain “treatment” was given to a state, which in this case was the implementation of no-fault divorce. timetoTreat: This variable is the difference between year observed and year of treatment (measured in years). A negative value signifies that the observation took place before the treatment and a positive value means it took place afterwards. asmrh: This variable measures homicide mortality in a given state, and it is varies by year. pcinc: This variable measures per captia income in a given state, and it varies by year. asmrs: This variable measures the female suicide rate in a given state, and it varies by year. cases: This variable measures the the Aid to Families with Dependent Children (AFDC) rate for a family of four. weight: This variable is a weight used for complex survey design. copop: This variable measures the state population in a given year.
First, I need to rearrange the data to make it fit for a conventional DiD approach. I need to recode certain observations into a treatment and control arm and time into a pre- treatment and post-treatment period.
#code NAs as 0
dlaw_event$`_nfd`[is.na(dlaw_event$`_nfd`)] <- 0
#code NAs in Timetotreat to point out control arm
dlaw_event$timeToTreat[is.na(dlaw_event$timeToTreat)] <- "M"
#create dummy variable signifying treatment status
dlaw_event$treated <-ifelse(dlaw_event$timeToTreat != "M", 1, 0)
#recode post (the time dummy variable) to include control group
#states began to implement no fault divorce laws in 1969 so that will be the year of the cut-off for pre vs. post in the control group
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
dlaw_event<-dlaw_event%>%
mutate(newpost = case_when(
treated== 0 & year>=1969 ~ 1, treated==1 & timeToTreat>=0 ~1, treated==1 & timeToTreat<0 ~0, treated==0 & year<=1968~0))
#create interaction term between the treated and time
dlaw_event$did<- dlaw_event$newpost * dlaw_event$treated
#generate model using this interaction term
DiDmodel <-lm(asmrs~ treated + newpost + did + asmrh + cases +pcinc, data = dlaw_event)
summary(DiDmodel)
##
## Call:
## lm(formula = asmrs ~ treated + newpost + did + asmrh + cases +
## pcinc, data = dlaw_event)
##
## Residuals:
## Min 1Q Median 3Q Max
## -47.117 -10.698 -3.129 7.568 124.239
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.575e+01 2.733e+00 20.396 < 2e-16 ***
## treated 1.483e+01 2.464e+00 6.018 2.18e-09 ***
## newpost 8.683e+00 2.576e+00 3.371 0.000766 ***
## did -8.140e+00 2.692e+00 -3.023 0.002539 **
## asmrh 2.660e+00 2.622e-01 10.142 < 2e-16 ***
## cases -4.056e+02 4.828e+01 -8.400 < 2e-16 ***
## pcinc -6.107e-04 9.104e-05 -6.707 2.73e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.04 on 1610 degrees of freedom
## Multiple R-squared: 0.1579, Adjusted R-squared: 0.1547
## F-statistic: 50.3 on 6 and 1610 DF, p-value: < 2.2e-16
#perform this with fixed effecs
library(plm)
## Warning: package 'plm' was built under R version 4.3.3
##
## Attaching package: 'plm'
## The following objects are masked from 'package:dplyr':
##
## between, lag, lead
modplm <- plm(asmrs ~ treated + newpost + did + asmrh + cases +pcinc, data=dlaw_event, index=c("stfips", "year"), model="within")
summary(modplm)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = asmrs ~ treated + newpost + did + asmrh + cases +
## pcinc, data = dlaw_event, model = "within", index = c("stfips",
## "year"))
##
## Balanced Panel: n = 49, T = 33, N = 1617
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -45.9715 -6.1871 -0.1671 6.0788 69.8164
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## newpost 1.1642e+01 1.7613e+00 6.6097 5.267e-11 ***
## did -9.0646e+00 1.7148e+00 -5.2861 1.426e-07 ***
## asmrh 1.9524e+00 3.1860e-01 6.1282 1.123e-09 ***
## cases 4.4868e+01 4.7348e+01 0.9476 0.3435
## pcinc -1.5980e-03 8.1493e-05 -19.6094 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 284760
## Residual Sum of Squares: 203740
## R-Squared: 0.28452
## Adj. R-Squared: 0.26026
## F-statistic: 124.31 on 5 and 1563 DF, p-value: < 2.22e-16
#event study approach
#use the provided code
library(fixest)
## Warning: package 'fixest' was built under R version 4.3.3
dlaw_event$timeToTreat[dlaw_event$timeToTreat == 'M'] <- 0
dlaw_event$timeToTreatn<-as.numeric(dlaw_event$timeToTreat)
mod_twfe = feols(asmrs ~ i(timeToTreatn, treated, ref = -1) + ## Our key interaction: time × treatment status
pcinc + asmrh + cases | ## Other controls
stfips + year, ## FEs
cluster = ~stfips, ## Clustered SEs
data = dlaw_event)
summary(mod_twfe)
## OLS estimation, Dep. Var.: asmrs
## Observations: 1,617
## Fixed-effects: stfips: 49, year: 33
## Standard-errors: Clustered (stfips)
## Estimate Std. Error t value Pr(>|t|)
## timeToTreatn::-21:treated -22.920727 4.011063 -5.714377 6.8263e-07 ***
## timeToTreatn::-20:treated -12.084179 10.996365 -1.098925 2.7728e-01
## timeToTreatn::-19:treated 8.842727 5.957829 1.484220 1.4429e-01
## timeToTreatn::-18:treated -0.515951 4.678946 -0.110271 9.1265e-01
## timeToTreatn::-17:treated -4.434874 6.210981 -0.714038 4.7866e-01
## timeToTreatn::-16:treated -1.022577 3.593627 -0.284553 7.7721e-01
## timeToTreatn::-15:treated 0.847757 4.195519 0.202062 8.4072e-01
## timeToTreatn::-14:treated 4.327995 5.218016 0.829433 4.1097e-01
## timeToTreatn::-13:treated -1.388569 4.634641 -0.299607 7.6577e-01
## timeToTreatn::-12:treated -0.043450 6.912751 -0.006286 9.9501e-01
## timeToTreatn::-11:treated -9.381948 3.980237 -2.357133 2.2543e-02 *
## timeToTreatn::-10:treated -1.150666 4.932033 -0.233305 8.1652e-01
## timeToTreatn::-9:treated -5.000702 3.587977 -1.393739 1.6982e-01
## timeToTreatn::-8:treated -2.737651 3.902927 -0.701435 4.8642e-01
## timeToTreatn::-7:treated -1.256434 4.340360 -0.289477 7.7346e-01
## timeToTreatn::-6:treated -0.750558 2.990802 -0.250956 8.0292e-01
## timeToTreatn::-5:treated -2.775423 2.620752 -1.059018 2.9489e-01
## timeToTreatn::-4:treated 0.228357 2.397350 0.095254 9.2451e-01
## timeToTreatn::-3:treated -2.312587 2.970068 -0.778631 4.4002e-01
## timeToTreatn::-2:treated -0.515740 2.514907 -0.205073 8.3838e-01
## timeToTreatn::0:treated 0.250747 2.722144 0.092114 9.2699e-01
## timeToTreatn::1:treated -1.619352 2.941537 -0.550512 5.8452e-01
## timeToTreatn::2:treated -1.687107 3.898178 -0.432794 6.6710e-01
## timeToTreatn::3:treated -0.744471 2.862572 -0.260071 7.9592e-01
## timeToTreatn::4:treated -2.956354 2.832628 -1.043679 3.0186e-01
## timeToTreatn::5:treated -2.377841 2.754740 -0.863182 3.9233e-01
## timeToTreatn::6:treated -3.311888 3.568157 -0.928179 3.5796e-01
## timeToTreatn::7:treated -5.136502 3.401946 -1.509872 1.3763e-01
## timeToTreatn::8:treated -6.991146 3.086374 -2.265165 2.8054e-02 *
## timeToTreatn::9:treated -4.823210 3.089481 -1.561172 1.2505e-01
## timeToTreatn::10:treated -8.814158 3.674600 -2.398672 2.0389e-02 *
## timeToTreatn::11:treated -7.273310 3.631759 -2.002696 5.0876e-02 .
## timeToTreatn::12:treated -6.151559 4.089512 -1.504228 1.3907e-01
## timeToTreatn::13:treated -8.276837 3.946249 -2.097393 4.1250e-02 *
## timeToTreatn::14:treated -6.593221 3.867273 -1.704876 9.4683e-02 .
## timeToTreatn::15:treated -7.850840 4.070836 -1.928557 5.9711e-02 .
## timeToTreatn::16:treated -7.234422 4.270836 -1.693912 9.6762e-02 .
## timeToTreatn::17:treated -8.516898 4.344278 -1.960486 5.5757e-02 .
## timeToTreatn::18:treated -9.991582 3.758781 -2.658198 1.0640e-02 *
## timeToTreatn::19:treated -11.536133 3.861769 -2.987267 4.4242e-03 **
## timeToTreatn::20:treated -9.219165 4.501869 -2.047853 4.6067e-02 *
## timeToTreatn::21:treated -10.790884 4.417864 -2.442557 1.8315e-02 *
## timeToTreatn::22:treated -10.654782 4.608349 -2.312061 2.5110e-02 *
## timeToTreatn::23:treated -12.086577 5.292140 -2.283873 2.6844e-02 *
## timeToTreatn::24:treated -10.677957 6.147523 -1.736953 8.8810e-02 .
## timeToTreatn::25:treated -10.267769 7.459044 -1.376553 1.7504e-01
## timeToTreatn::26:treated -16.692550 10.542337 -1.583382 1.1990e-01
## timeToTreatn::27:treated -0.434476 8.147106 -0.053329 9.5769e-01
## pcinc -0.001105 0.000407 -2.713385 9.2210e-03 **
## asmrh 1.080640 0.596888 1.810458 7.6487e-02 .
## cases -190.371617 134.499118 -1.415412 1.6340e-01
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 10.4 Adj. R2: 0.696608
## Within R2: 0.073118
#e.
iplot(mod_twfe,
xlab = 'Time to treatment',
main = 'Event study: Staggered treatment (TWFE)')
The results of the event study approach show the effect of the
treatment, the implementation of no-fault divorce on the female suicide
rate over time at the state-level. The coefficient of interest is
estimate for treatment status multiplied by time to treatment. A
negative value of time to treatment represents a time period before the
treatment, while a value of 0 is the year the treatment took place, and
a positive value represents the years prior. Both the table outlining
the model and the plot show a steady negative trend in the female
suicide rate beginning with no fault divorce implementation. Not all
these effects are significant, however. The significant values start at
8 years post treatment, and continue to be marginally or statistically
starting at 10 years post treatment.It is important to question to
relevance of values up to 10 years post treatment, treatment, however,
as there could be period effects affecting the suicide rate unrelated to
the treatment. Research has shown that the dissolution of the benefits
of a no fault divorce law take time to spread throughout society.
Therefore, the effects of this treatment are more prevalent
Next, potential robustness checks such as placebo tests and sensitivity analyses are also important. Including control variables that are shown to affect female suicide in the analysis is usually the first step to show the sensitivity of your results. One could also alter the sample of states included in the analysis to see if certain states are driving the overall effect. Additionally, one could change the control group to see if the strength of the effect holds for another control group, providing support that this control group matches the treatment group well enough. Lastly, one could replace the outcome with a placebo outcome that was not supposed to be affected by the treatment, as a way to rule out any period effects.
The paper discussed for this question is “The Effect of Maternal Stress on Birth Outcomes: Exploiting a Natural Experiment” by Torche, Demography (2011) 48: 1473-1491.
The endogenous issue that this study overcomes is the fact that the “treatment,” stress is not randomly assigned; it may be correlated with certain demographic variables, meaning there could be a pattern in the assignment of stress. Performing a study in which an exogenous force randomly distributes stress, overcomes this endogenous issue.
This study uses the occurrence of an earthquake that took place in Chile in 2005 as an exogenous force affecting the assignment of stress on Chilean pregnant women. In this case the zone of intensity (measured by county) in which a pregnant a woman resides is used to assess the dosage of the treatment. The higher the intensity, the higher the dosage of stress ranging from low to moderate intensity. This method is useful in establishing the causal relationship between maternal stress and the outcome, birth weight, because we are confident that the assignment of stress is not due to other characteristics.
In the diagrams below, the treatment is shown to be maternal stress, which is assigned to Chilean pregnant women who are given a “dosage” of stress based on the level of intensity the of the earthquake in their county of resident. The endogeneity concern shown are the potential confounder variables that could affect the distribution of the treatment or the outcome. The natural experiment overcomes this however, as I explained in part b. The second graphic describes how gestational age is mediates this relationship.
For this design to work the instrument be relevant, so the effect of the exogenous force on the treatment cannot be zero. Secondly, the instrument must be exclusionary, meaning that the effect of the exogenous force only influences birth weight through its affect on stress. With these assumptions, we can assume U, any confounders to be irrelevant, and thus not having any affect on stress or birth weight.
#endogeneity graphic
knitr::include_graphics("C:/Users/rlutt/OneDrive/Spring 2024/Causal inference/Diagram.jpg")
#mediator relationship
knitr::include_graphics("C:/Users/rlutt/OneDrive/Spring 2024/Causal inference/Diagram2.jpg")