1. Differences- in - Differences
#read in data

library(readr)
dlaw_event <- read_csv("C:/Users/rlutt/Downloads/dlaw_event.csv")
## Rows: 1617 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (11): stfips, year, _nfd, post, asmrs, pcinc, asmrh, cases, weight, copo...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
  1. List the time-invariant and time-varying variables:

stfips: This is a time-invariant variable signifying the state FIPS code. There are multiple observations for each state, which vary by year. year: This variable signifies the year of a specific observation (each row). States are observed over multiple years.

*Note all other varies are nested within a state and a year.

_nfd: This variable is also a year. It represents the year a certain “treatment” was given to a state, which in this case was the implementation of no-fault divorce. timetoTreat: This variable is the difference between year observed and year of treatment (measured in years). A negative value signifies that the observation took place before the treatment and a positive value means it took place afterwards. asmrh: This variable measures homicide mortality in a given state, and it is varies by year. pcinc: This variable measures per captia income in a given state, and it varies by year. asmrs: This variable measures the female suicide rate in a given state, and it varies by year. cases: This variable measures the the Aid to Families with Dependent Children (AFDC) rate for a family of four. weight: This variable is a weight used for complex survey design. copop: This variable measures the state population in a given year.

  1. Traditional DiD Analysis:

First, I need to rearrange the data to make it fit for a conventional DiD approach. I need to recode certain observations into a treatment and control arm and time into a pre- treatment and post-treatment period.

#code NAs as 0
dlaw_event$`_nfd`[is.na(dlaw_event$`_nfd`)] <- 0

#code NAs in Timetotreat to point out control arm
dlaw_event$timeToTreat[is.na(dlaw_event$timeToTreat)] <- "M"

#create dummy variable signifying treatment status 

dlaw_event$treated <-ifelse(dlaw_event$timeToTreat != "M", 1, 0)

#recode post (the time dummy variable) to include control group

#states began to implement  no fault divorce laws in 1969 so that will be the year of the cut-off for pre vs. post in the control group

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
   dlaw_event<-dlaw_event%>%
 mutate(newpost = case_when(
      treated== 0 & year>=1969  ~ 1, treated==1 & timeToTreat>=0 ~1, treated==1 & timeToTreat<0 ~0, treated==0 & year<=1968~0))

#create interaction term between the treated and time

dlaw_event$did<- dlaw_event$newpost * dlaw_event$treated

#generate model using this interaction term
  
DiDmodel <-lm(asmrs~ treated + newpost + did + asmrh + cases +pcinc, data = dlaw_event)

summary(DiDmodel)
## 
## Call:
## lm(formula = asmrs ~ treated + newpost + did + asmrh + cases + 
##     pcinc, data = dlaw_event)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.117 -10.698  -3.129   7.568 124.239 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.575e+01  2.733e+00  20.396  < 2e-16 ***
## treated      1.483e+01  2.464e+00   6.018 2.18e-09 ***
## newpost      8.683e+00  2.576e+00   3.371 0.000766 ***
## did         -8.140e+00  2.692e+00  -3.023 0.002539 ** 
## asmrh        2.660e+00  2.622e-01  10.142  < 2e-16 ***
## cases       -4.056e+02  4.828e+01  -8.400  < 2e-16 ***
## pcinc       -6.107e-04  9.104e-05  -6.707 2.73e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.04 on 1610 degrees of freedom
## Multiple R-squared:  0.1579, Adjusted R-squared:  0.1547 
## F-statistic:  50.3 on 6 and 1610 DF,  p-value: < 2.2e-16
#perform this with fixed effecs

library(plm)
## Warning: package 'plm' was built under R version 4.3.3
## 
## Attaching package: 'plm'
## The following objects are masked from 'package:dplyr':
## 
##     between, lag, lead
modplm <- plm(asmrs ~ treated + newpost + did + asmrh + cases +pcinc, data=dlaw_event, index=c("stfips", "year"), model="within")

summary(modplm)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = asmrs ~ treated + newpost + did + asmrh + cases + 
##     pcinc, data = dlaw_event, model = "within", index = c("stfips", 
##     "year"))
## 
## Balanced Panel: n = 49, T = 33, N = 1617
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -45.9715  -6.1871  -0.1671   6.0788  69.8164 
## 
## Coefficients:
##            Estimate  Std. Error  t-value  Pr(>|t|)    
## newpost  1.1642e+01  1.7613e+00   6.6097 5.267e-11 ***
## did     -9.0646e+00  1.7148e+00  -5.2861 1.426e-07 ***
## asmrh    1.9524e+00  3.1860e-01   6.1282 1.123e-09 ***
## cases    4.4868e+01  4.7348e+01   0.9476    0.3435    
## pcinc   -1.5980e-03  8.1493e-05 -19.6094 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    284760
## Residual Sum of Squares: 203740
## R-Squared:      0.28452
## Adj. R-Squared: 0.26026
## F-statistic: 124.31 on 5 and 1563 DF, p-value: < 2.22e-16
  1. The first model, the DiD without fixed effects for state and year, can be interpreted as: the effect of implementing no-fault divorce led to a significant (p<.01) 8% decrease in the female suicide rate compared to states that did not implement no-fault divorce. The second model, the DiD with fixed effects for state and year, can be interpreted as: the effect of implementing no-fault divorce led to a significant (p<.0o1) 9% decrease in the female suicide rate compared to states that did not implement no-fault divorce.
  1. Event Study
#event study approach

#use the provided code

library(fixest)
## Warning: package 'fixest' was built under R version 4.3.3
dlaw_event$timeToTreat[dlaw_event$timeToTreat == 'M'] <- 0

dlaw_event$timeToTreatn<-as.numeric(dlaw_event$timeToTreat)

mod_twfe = feols(asmrs ~ i(timeToTreatn, treated, ref = -1) + ## Our key interaction: time × treatment status
                   pcinc + asmrh + cases |                    ## Other controls
                   stfips + year,                             ## FEs
                 cluster = ~stfips,                          ## Clustered SEs
                 data = dlaw_event)
summary(mod_twfe)
## OLS estimation, Dep. Var.: asmrs
## Observations: 1,617 
## Fixed-effects: stfips: 49,  year: 33
## Standard-errors: Clustered (stfips) 
##                              Estimate Std. Error   t value   Pr(>|t|)    
## timeToTreatn::-21:treated  -22.920727   4.011063 -5.714377 6.8263e-07 ***
## timeToTreatn::-20:treated  -12.084179  10.996365 -1.098925 2.7728e-01    
## timeToTreatn::-19:treated    8.842727   5.957829  1.484220 1.4429e-01    
## timeToTreatn::-18:treated   -0.515951   4.678946 -0.110271 9.1265e-01    
## timeToTreatn::-17:treated   -4.434874   6.210981 -0.714038 4.7866e-01    
## timeToTreatn::-16:treated   -1.022577   3.593627 -0.284553 7.7721e-01    
## timeToTreatn::-15:treated    0.847757   4.195519  0.202062 8.4072e-01    
## timeToTreatn::-14:treated    4.327995   5.218016  0.829433 4.1097e-01    
## timeToTreatn::-13:treated   -1.388569   4.634641 -0.299607 7.6577e-01    
## timeToTreatn::-12:treated   -0.043450   6.912751 -0.006286 9.9501e-01    
## timeToTreatn::-11:treated   -9.381948   3.980237 -2.357133 2.2543e-02 *  
## timeToTreatn::-10:treated   -1.150666   4.932033 -0.233305 8.1652e-01    
## timeToTreatn::-9:treated    -5.000702   3.587977 -1.393739 1.6982e-01    
## timeToTreatn::-8:treated    -2.737651   3.902927 -0.701435 4.8642e-01    
## timeToTreatn::-7:treated    -1.256434   4.340360 -0.289477 7.7346e-01    
## timeToTreatn::-6:treated    -0.750558   2.990802 -0.250956 8.0292e-01    
## timeToTreatn::-5:treated    -2.775423   2.620752 -1.059018 2.9489e-01    
## timeToTreatn::-4:treated     0.228357   2.397350  0.095254 9.2451e-01    
## timeToTreatn::-3:treated    -2.312587   2.970068 -0.778631 4.4002e-01    
## timeToTreatn::-2:treated    -0.515740   2.514907 -0.205073 8.3838e-01    
## timeToTreatn::0:treated      0.250747   2.722144  0.092114 9.2699e-01    
## timeToTreatn::1:treated     -1.619352   2.941537 -0.550512 5.8452e-01    
## timeToTreatn::2:treated     -1.687107   3.898178 -0.432794 6.6710e-01    
## timeToTreatn::3:treated     -0.744471   2.862572 -0.260071 7.9592e-01    
## timeToTreatn::4:treated     -2.956354   2.832628 -1.043679 3.0186e-01    
## timeToTreatn::5:treated     -2.377841   2.754740 -0.863182 3.9233e-01    
## timeToTreatn::6:treated     -3.311888   3.568157 -0.928179 3.5796e-01    
## timeToTreatn::7:treated     -5.136502   3.401946 -1.509872 1.3763e-01    
## timeToTreatn::8:treated     -6.991146   3.086374 -2.265165 2.8054e-02 *  
## timeToTreatn::9:treated     -4.823210   3.089481 -1.561172 1.2505e-01    
## timeToTreatn::10:treated    -8.814158   3.674600 -2.398672 2.0389e-02 *  
## timeToTreatn::11:treated    -7.273310   3.631759 -2.002696 5.0876e-02 .  
## timeToTreatn::12:treated    -6.151559   4.089512 -1.504228 1.3907e-01    
## timeToTreatn::13:treated    -8.276837   3.946249 -2.097393 4.1250e-02 *  
## timeToTreatn::14:treated    -6.593221   3.867273 -1.704876 9.4683e-02 .  
## timeToTreatn::15:treated    -7.850840   4.070836 -1.928557 5.9711e-02 .  
## timeToTreatn::16:treated    -7.234422   4.270836 -1.693912 9.6762e-02 .  
## timeToTreatn::17:treated    -8.516898   4.344278 -1.960486 5.5757e-02 .  
## timeToTreatn::18:treated    -9.991582   3.758781 -2.658198 1.0640e-02 *  
## timeToTreatn::19:treated   -11.536133   3.861769 -2.987267 4.4242e-03 ** 
## timeToTreatn::20:treated    -9.219165   4.501869 -2.047853 4.6067e-02 *  
## timeToTreatn::21:treated   -10.790884   4.417864 -2.442557 1.8315e-02 *  
## timeToTreatn::22:treated   -10.654782   4.608349 -2.312061 2.5110e-02 *  
## timeToTreatn::23:treated   -12.086577   5.292140 -2.283873 2.6844e-02 *  
## timeToTreatn::24:treated   -10.677957   6.147523 -1.736953 8.8810e-02 .  
## timeToTreatn::25:treated   -10.267769   7.459044 -1.376553 1.7504e-01    
## timeToTreatn::26:treated   -16.692550  10.542337 -1.583382 1.1990e-01    
## timeToTreatn::27:treated    -0.434476   8.147106 -0.053329 9.5769e-01    
## pcinc                       -0.001105   0.000407 -2.713385 9.2210e-03 ** 
## asmrh                        1.080640   0.596888  1.810458 7.6487e-02 .  
## cases                     -190.371617 134.499118 -1.415412 1.6340e-01    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 10.4     Adj. R2: 0.696608
##              Within R2: 0.073118
#e.

iplot(mod_twfe, 
      xlab = 'Time to treatment',
      main = 'Event study: Staggered treatment (TWFE)')

The results of the event study approach show the effect of the treatment, the implementation of no-fault divorce on the female suicide rate over time at the state-level. The coefficient of interest is estimate for treatment status multiplied by time to treatment. A negative value of time to treatment represents a time period before the treatment, while a value of 0 is the year the treatment took place, and a positive value represents the years prior. Both the table outlining the model and the plot show a steady negative trend in the female suicide rate beginning with no fault divorce implementation. Not all these effects are significant, however. The significant values start at 8 years post treatment, and continue to be marginally or statistically starting at 10 years post treatment.It is important to question to relevance of values up to 10 years post treatment, treatment, however, as there could be period effects affecting the suicide rate unrelated to the treatment. Research has shown that the dissolution of the benefits of a no fault divorce law take time to spread throughout society. Therefore, the effects of this treatment are more prevalent

  1. It is important to discuss the assumptions that need to be made in this analysis. First, one must assume that the treatment and control arms have parallel trends. In plain terms, this means that whatever happened to the control arm over time is what would have happened to the treatment arm if the treatment hadn’t taken place. The event study design in this case employs fixed effects at the state level to control for any differences between states Second, one could also plot the control and treatment arms prior to the treatment as a way to compare trends before the implementation of no-fault divorce. Another assumption is that there are no compositional differences in the control and treatment arm that could affect the value of the outcome, female suicide. One could compare the demographic make-up of both the treatment and control arms to overcome this. As our unit of analysis is state, employing state-fixed effects also aids in keeping this assumption intact. One could also stratify the analysis based on a variable, such as age, for example, to show that certain age groups are not driving the effect on the outcome. A third assumption is that these effects are reliable over a long-time period. This assumption could threaten this study, as the study spans over 20 years prior to treatment of over 20 years post treatment. However, one could overcome this critique by showing that the effects of a policy such as no-fault divorce take a long time to percolate through society, meaning the effects should be studied over a long period of time. The last assumption is that the magnitude or direction of the effect is sensitive to the functional form of the outcome. This could take place if the the average values for the treatment and control arms are very different prior to the treatment. One way to keep this assumption intact is by presenting the summary statistics of the control and treatment arms of periods prior to treatment to prove they are not different.

Next, potential robustness checks such as placebo tests and sensitivity analyses are also important. Including control variables that are shown to affect female suicide in the analysis is usually the first step to show the sensitivity of your results. One could also alter the sample of states included in the analysis to see if certain states are driving the overall effect. Additionally, one could change the control group to see if the strength of the effect holds for another control group, providing support that this control group matches the treatment group well enough. Lastly, one could replace the outcome with a placebo outcome that was not supposed to be affected by the treatment, as a way to rule out any period effects.

  1. Paper Review

The paper discussed for this question is “The Effect of Maternal Stress on Birth Outcomes: Exploiting a Natural Experiment” by Torche, Demography (2011) 48: 1473-1491.

  1. The endogenous issue that this study overcomes is the fact that the “treatment,” stress is not randomly assigned; it may be correlated with certain demographic variables, meaning there could be a pattern in the assignment of stress. Performing a study in which an exogenous force randomly distributes stress, overcomes this endogenous issue.

  2. This study uses the occurrence of an earthquake that took place in Chile in 2005 as an exogenous force affecting the assignment of stress on Chilean pregnant women. In this case the zone of intensity (measured by county) in which a pregnant a woman resides is used to assess the dosage of the treatment. The higher the intensity, the higher the dosage of stress ranging from low to moderate intensity. This method is useful in establishing the causal relationship between maternal stress and the outcome, birth weight, because we are confident that the assignment of stress is not due to other characteristics.

  3. In the diagrams below, the treatment is shown to be maternal stress, which is assigned to Chilean pregnant women who are given a “dosage” of stress based on the level of intensity the of the earthquake in their county of resident. The endogeneity concern shown are the potential confounder variables that could affect the distribution of the treatment or the outcome. The natural experiment overcomes this however, as I explained in part b. The second graphic describes how gestational age is mediates this relationship.

For this design to work the instrument be relevant, so the effect of the exogenous force on the treatment cannot be zero. Secondly, the instrument must be exclusionary, meaning that the effect of the exogenous force only influences birth weight through its affect on stress. With these assumptions, we can assume U, any confounders to be irrelevant, and thus not having any affect on stress or birth weight.

#endogeneity graphic
knitr::include_graphics("C:/Users/rlutt/OneDrive/Spring 2024/Causal inference/Diagram.jpg")

#mediator relationship
knitr::include_graphics("C:/Users/rlutt/OneDrive/Spring 2024/Causal inference/Diagram2.jpg")

  1. The identification assumption, refers to the assumption that whatever happened to the control group over time is what would have happened to the treatment group if the exogenous force and resulting stress would not have occurred. The author addresses these concerns by first addressing the potential that the earthquake could have lead to spontaneous abortion, which would lead to a sample of positively selected births (in terms of health/ fetus ability to survive acute stress).The author provides data on spontaneous abortions/ miscarriages from the Chilean Ministry of Health in the areas affected by the earthquake, showing that the number is small and not that different than the average in the periods before, during and after the earthquake. The author also notes that if there were a substantial amount of unreported spontaneous abortions/ miscarriage, the analysis would underestimate the effect of negative stress. Next, the author shows that compositional changes in the population do not account for the change in birth weight by investigating out-migration from areas affected by the earthquake.The author accomplishes this by reporting on 2006 data from the Socioeconomic Characterization (CASEN) survey showing that the percent of women moving out of counties affected by the earthquake from 2002 to 2006 is small.