library(knitr)
library(rsconnect)
library(readxl)
library(tinytex)
## Warning: package 'tinytex' was built under R version 4.2.3
library(stargazer)
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

1. Read Data

data <- read.csv("C:/Users/schic/Econometrics Work/us_fred_coastal_us_states_avg_hpi_before_after_2005.csv")

Model:

\[ HPICHG_i = \beta_0 + \beta_1 TimePeriod_i + \beta_2 DisasterAffected_i + \beta_3 (TimePeriod_i \times DisasterAffected_i) + \epsilon_i \]

2. Create regression

model1 <- lm(HPI_CHG ~ Time_Period + Disaster_Affected + Time_Period*Disaster_Affected,data = data)
summary(model1)
## 
## Call:
## lm(formula = HPI_CHG ~ Time_Period + Disaster_Affected + Time_Period * 
##     Disaster_Affected, data = data)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.023081 -0.007610 -0.000171  0.004656  0.035981 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    0.037090   0.002819  13.157  < 2e-16 ***
## Time_Period                   -0.027847   0.003987  -6.985  1.2e-08 ***
## Disaster_Affected             -0.013944   0.006176  -2.258   0.0290 *  
## Time_Period:Disaster_Affected  0.019739   0.008734   2.260   0.0288 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01229 on 44 degrees of freedom
## Multiple R-squared:  0.5356, Adjusted R-squared:  0.504 
## F-statistic: 16.92 on 3 and 44 DF,  p-value: 1.882e-07
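A side note on the formula syntax: in R, `a * b` inside a formula already expands to `a + b + a:b`, so writing the main effects next to the `*` term is redundant but harmless. The sketch below illustrates this on simulated toy data; the data frame `toy` and the columns `period`, `affected`, and `y` are hypothetical and are not taken from the assignment data.

```r
# Toy data for illustration only; all names and numbers here are made up
set.seed(42)
toy <- data.frame(period   = rep(c(0, 1), each = 10),
                  affected = rep(c(0, 1), times = 10))
toy$y <- 0.04 - 0.03 * toy$period - 0.01 * toy$affected +
  0.02 * toy$period * toy$affected + rnorm(nrow(toy), sd = 0.01)

# Spelled out, in the same style as model1
long_form  <- lm(y ~ period + affected + period * affected, data = toy)
# Shorthand: period * affected expands to period + affected + period:affected
short_form <- lm(y ~ period * affected, data = toy)

# TRUE when both fits have the same coefficient names and values
print(isTRUE(all.equal(coef(long_form), coef(short_form))))
```

Either spelling should give the same fit, so the choice is a matter of readability.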

3. Interpretation

Treatment and Control Groups

The control group consists of states that were not affected by a natural disaster (\(DisasterAffected_i = 0\)). The treatment group consists of states that were affected by a natural disaster (\(DisasterAffected_i = 1\)), so the treatment is captured by the dummy variable \(DisasterAffected_i\). The dummy variable \(TimePeriod_i\) separates the periods before and after the treatment, so the control group shows how the outcome changes over time when no treatment occurs.

Interpretation of Diff-In-Diff:

By taking the difference between the two groups' changes in the outcome from before to after the treatment, we obtain an estimate of the net effect of the treatment. This works under the assumption that the two groups would have followed parallel trends in the absence of the treatment. In plain words, if the outcomes of both groups move together when the treatment is not present, any divergence after the treatment can be attributed to the treatment, so the diff-in-diff estimates the effect of the treatment on our target variable.
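The same logic can be written out with the conditional means implied by the model. This is a sketch that assumes the error term has mean zero within each group and period:

\[ \begin{aligned}
E[HPICHG_i \mid TimePeriod_i = 0, DisasterAffected_i = 0] &= \beta_0 \\
E[HPICHG_i \mid TimePeriod_i = 1, DisasterAffected_i = 0] &= \beta_0 + \beta_1 \\
E[HPICHG_i \mid TimePeriod_i = 0, DisasterAffected_i = 1] &= \beta_0 + \beta_2 \\
E[HPICHG_i \mid TimePeriod_i = 1, DisasterAffected_i = 1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3
\end{aligned} \]

\[ \left[ (\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_2) \right] - \left[ (\beta_0 + \beta_1) - \beta_0 \right] = \beta_3 \]

The change over time in the control group is \(\beta_1\), the change in the treatment group is \(\beta_1 + \beta_3\), and subtracting one from the other leaves only the interaction coefficient \(\beta_3\).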

Create 2x2 Matrix:

| | Time_Period = 0 | Time_Period = 1 |
|---|---|---|
| Disaster_Affected = 0 | \(\beta_0 = 0.0371\) | \(\beta_0 + \beta_1 = 0.009\) |
| Disaster_Affected = 1 | \(\beta_0 + \beta_2 = 0.0232\) | \(\beta_0 + \beta_1 + \beta_2 + \beta_3 = 0.0151\) |

Each cell is a fitted group mean, so the error term \(\epsilon_i\) does not appear.
# Create the 2x2 matrix of fitted group means
values <- c(0.0371, 0.009, 0.0232, 0.0151)
rows <- c("Treated = 0", "Treated = 1")
cols <- c("Time = 0", "Time = 1")

# matrix() fills column by column by default; byrow = TRUE makes each row
# hold one group's values for Time = 0 and Time = 1
Matrix1 <- matrix(values,
                  nrow = 2,
                  ncol = 2,
                  byrow = TRUE,
                  dimnames = list(rows, cols)
)

stargazer(Matrix1, type = "text")
## 
## =============================
##             Time = 0 Time = 1
## -----------------------------
## Treated = 0  0.037    0.009  
## Treated = 1  0.023    0.015  
## -----------------------------

Change in the treated group (Time = 1 minus Time = 0): 0.015 - 0.023 = -0.008

Change in the control group (Time = 1 minus Time = 0): 0.009 - 0.037 = -0.028

Diff-in-Diff: (-0.008) - (-0.028) = 0.020

The diff-in-diff is approximately equal to \(\beta_3 = 0.0197\), the coefficient on the interaction term in the regression. The small gap comes from rounding the group means to three decimals.
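As a cross-check that avoids rounding, the group means and the diff-in-diff can be rebuilt directly from the coefficient estimates. The sketch below hard-codes the four estimates reported in the summary(model1) output; the names `b0` to `b3`, `cell_means`, and `did` are introduced here only for illustration.

```r
# Coefficient estimates copied from the summary(model1) output
b0 <- 0.037090   # (Intercept)
b1 <- -0.027847  # Time_Period
b2 <- -0.013944  # Disaster_Affected
b3 <- 0.019739   # Time_Period:Disaster_Affected

# Fitted group means; byrow = TRUE so each row is one group
cell_means <- matrix(
  c(b0,      b0 + b1,
    b0 + b2, b0 + b1 + b2 + b3),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Disaster_Affected = 0", "Disaster_Affected = 1"),
                  c("Time_Period = 0", "Time_Period = 1"))
)

# Change over time within each group, then the difference of those changes
change_control <- cell_means[1, 2] - cell_means[1, 1]
change_treated <- cell_means[2, 2] - cell_means[2, 1]
did <- change_treated - change_control

print(round(cell_means, 4))
print(did)
```

Computed this way, `did` matches `b3` up to floating-point error, because every other coefficient cancels in the double difference.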

4. Threats to Identification

The main threat to identification in a diff-in-diff model is a violation of the parallel trends assumption. For the method to work, the treatment and control groups must follow the same trend prior to treatment, since this is what allows us to isolate the effect of the treatment on the treated group. The design also requires that both groups be observed over the same time periods before and after treatment takes place in the treatment group. Lastly, we must assume that future treatment does not affect outcomes in the past (no anticipation), since anticipation effects would undermine the parallel trends assumption.
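When several pre-treatment periods are available, one common way to probe parallel trends is to test whether the treated group has a different time trend before the treatment. The sketch below uses simulated data only, since it assumes a panel with multiple pre-treatment years, which the data file used here may not provide. The names `panel`, `trend`, `treated`, and `hpi_chg` are hypothetical.

```r
# Simulated illustration only: nothing here comes from the assignment data
set.seed(1)
n_states <- 20
years    <- 2000:2004                           # hypothetical pre-treatment years
panel <- expand.grid(state = 1:n_states, year = years)
panel$treated <- as.integer(panel$state <= 5)   # hypothetical treated states
panel$trend   <- panel$year - 2000

# Both groups share the same slope in trend, so parallel trends holds by construction
panel$hpi_chg <- 0.03 - 0.01 * panel$treated + 0.002 * panel$trend +
  rnorm(nrow(panel), sd = 0.005)

# Pre-period regression with a group-specific linear trend
pre_fit <- lm(hpi_chg ~ treated * trend, data = panel)

# The treated:trend row estimates the difference in pre-treatment slopes
pre_trend <- coef(summary(pre_fit))["treated:trend", ]
print(pre_trend)
```

A `treated:trend` estimate that is small relative to its standard error is consistent with parallel pre-trends, although it cannot prove that the trends would have stayed parallel after the treatment.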